The beta launch of the numeroom.com web site over a month ago finally allowed the debut of the various collaboration functions that are unique to the site. One of these services is real time implicit language translation. It allows free users of the service to have the text they type in a selected language implicitly translated to and from the languages used by the other users they are collaborating with, either in chat rooms or in private one-on-one IMs.
The translation on agent demand feature (as it is also called) makes chatting with a group of people typing in different languages easy and, from the business perspective, efficient. It was in fact one of the charter features of the collaboration API that I built into the AgilEntity platform, which is exposed through the numeroom.com website. To provide the service efficiently, I realized that the best approach was to distribute the necessary horsepower in multiple dimensions. Toward that end, I had to design a distributed chat algorithm that would work without requiring that every user in a chat room be logically associated with a single server hosting the conversation. Hosting the users in a conversation across multiple servers makes chatting significantly more efficient, since the memory requirements can be spread across different servers even when all the participants are logically in the "same" room. Another distribution strategy involved the translation function itself. In many-to-many language translation, translating each user's text into the language of every other user and then sending that text to each of those users' displays is incredibly computationally taxing. I had to design a way to eliminate this cost, and after a week or so of thinking about it I came up with the efficient "message board" solution. Rather than translate each message entered by a user and transmit the translations to every user interface, I could use the already distributed chat system to translate the message and write it to a corresponding language message board that is then viewable by all users who are typing in that language.
Thus only one translation is necessary per message per language. This decouples the act of translation from the act of viewing, which no longer requires active delivery of the message to the viewer but instead requires that the user actively go to view the message board. It significantly reduces the cost of real time translation, since the server never has to deliver a translated message to a client unless that client explicitly requests it by selecting the message board of the translated language. If no one is viewing a message board, no translation is performed to that language. More details of this algorithm are described in the patent application for the technique, which is available online.
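To make the "message board" idea concrete, here is a minimal single-process sketch in Python. It is not the AgilEntity implementation (which is distributed across servers and described in the patent application); the `Room` class, its method names and the injected `translate` callable are all hypothetical names chosen for illustration. It only shows the two properties described above: a message is translated at most once per language, and nothing is translated for a board that nobody views.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
# In a real system this would call out to a translation service.
Translator = Callable[[str, str, str], str]


class Room:
    """One chat room with a lazily filled message board per language."""

    def __init__(self, translate: Translator) -> None:
        self._translate = translate
        # Original messages in posting order: (sender, language, text).
        self._messages: List[Tuple[str, str, str]] = []
        # boards[lang][i] caches message i rendered in language lang.
        self._boards: Dict[str, Dict[int, str]] = defaultdict(dict)

    def post(self, sender: str, lang: str, text: str) -> None:
        """Store a message; no translation happens at posting time."""
        self._messages.append((sender, lang, text))
        # The sender's own language board gets the text untranslated.
        self._boards[lang][len(self._messages) - 1] = text

    def view(self, lang: str) -> List[str]:
        """Return the board for lang, translating only what is missing."""
        board = self._boards[lang]
        lines = []
        for index, (sender, source_lang, text) in enumerate(self._messages):
            if index not in board:
                # First viewer of this message in this language pays for
                # the translation; every later viewer reads the cache.
                board[index] = self._translate(text, source_lang, lang)
            lines.append(f"{sender}: {board[index]}")
        return lines
```

In this sketch, calling `room.view("en")` twice translates each foreign-language message into English only once, and a language whose board is never viewed costs nothing. Translating on first view is just one way to honor "no viewers, no translation"; a server could equally track which boards currently have viewers and translate at posting time for those languages only.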
The translated text itself comes from the distributed network of Google (I also had it working with babelfish and freetranslate, but settled on Google so I could use the free API rather than hack the public interface ;) ). Google uses statistical language translation technology to provide fairly accurate translations. For chatting the accuracy is excellent, as people tend to "speak" with very informal and basic syntax in a chat room. When I originally got the function working, I wanted full support for all the languages that Google made available, but I ran into a problem where languages that use non-Roman character sets would fail to display in the chat room. The cause of the problem was logic bugs in the code that I wrote to read and write text from URLs and to documents. As the date for my launch came round, I decided to launch the site with the limited Roman character support rather than try to add the additional languages, and that is what I did. Now that I finally have time to breathe after spending most of last month networking, testing ad campaigns and trying to hunt down angels and VCs for investment, I decided to add the non-Roman character support.
I did this last night. The site now has support for 30 different languages spanning over 5 character sets. The new languages added last night include:
Chinese (simplified)
Japanese
Korean
Tagalog (Filipino)
Vietnamese
Turkish
Hebrew
Arabic
Dutch
Russian
Finnish
Hindi
Romanian
Polish
Ukrainian
Indonesian
Persian (Farsi)
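For readers who hit the same kind of display failure with non-Roman text, the sketch below shows one common way such bugs are avoided. The post does not say exactly what the bugs in my URL and document code were, so this is an assumption about the usual cause (relying on a default character set instead of naming one), written in Python rather than the platform's own code. All function names here are hypothetical.

```python
from email.message import Message
from typing import Optional
from urllib.parse import quote


def encode_query_text(text: str) -> str:
    """Percent-encode text as UTF-8 bytes so it survives inside a URL."""
    return quote(text, safe="", encoding="utf-8")


def decode_body(raw: bytes, content_type: Optional[str] = None) -> str:
    """Decode response bytes using the declared charset, else UTF-8.

    content_type is the value of an HTTP Content-Type header, for
    example "text/html; charset=UTF-8". Falling back to UTF-8 when no
    charset is declared is an assumption of this sketch.
    """
    charset = None
    if content_type:
        header = Message()
        header["Content-Type"] = content_type
        charset = header.get_content_charset()
    return raw.decode(charset or "utf-8", errors="replace")


def save_text(path: str, text: str) -> None:
    """Write text with an explicit encoding instead of the OS default."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def load_text(path: str) -> str:
    """Read text back with the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

The point of the sketch is that every boundary where text becomes bytes (the URL, the response body, the document on disk) names its encoding explicitly, so Chinese, Hebrew or Hindi text takes the same path as Roman text.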
Google rushed to add Farsi support after the recent events in Iran, in order to facilitate communication through third party products that use their API, as numeroom.com does. So anyone in Iran can use a Numeroom to chat with non-Farsi speakers.
So now my original vision, to level the communication playing field once and for all by making language irrelevant, is coming to fruition! The language I most wanted to have was Romanian, which was the language that inspired the hunt for a solution. I am happy to say that bidirectional translation between Romanian and all the other languages is available. If you are unfamiliar with the numeroom.com site, you can create a free account to try out the many services and features offered for secure collaboration by clicking on the links below.
Sign up for free, create your first numeroom automatically!!
and then...
Video Tutorial: Try out the real time language translation
Video Tutorial: Invite your friends to chat (just an email is needed, no need for them to join)