
Numeroom: Non-Roman character support is here.

The beta launch of the numeroom.com web site over a month ago finally allowed the debut of the various collaboration functions that are uniquely provided by the site. One of these services is real-time implicit language translation. It allows free users of the service to implicitly translate the text they type in a selected language to and from the languages used by the other users they are collaborating with, either in chat rooms or in private one-on-one IMs.

The translation on agent demand feature (as it is also called) makes chatting with a group of people typing in different languages easy and efficient (from the business perspective). It was in fact one of the charter features of the collaboration API that I built into the AgilEntity platform that is exposed through the numeroom.com website. In order to provide the service, I realized that the best way to make it extremely efficient was to distribute the necessary horsepower in multiple dimensions.

Toward that end, I had to design a distributed chat algorithm that would work without requiring that each user in a chat room be logically associated with the single server hosting the conversation. Allowing the users in a conversation to be hosted across multiple servers enables significantly more efficient chatting, as the memory requirements can be spread across different servers even when all the participants of the conversation are logically in the "same" room.

Another distribution strategy involved the actual translation function. In many-to-many language translation, the need to translate each user's text into the language of every other user and then send that text to those users' displays makes the activity incredibly computationally taxing. I had to design a way to eliminate this computational complexity and, after a week or so of thinking about it, came up with the efficient "message board" solution. Rather than translate each message entered by a user and transmit those translations to every user interface, I could use the already distributed chat system to translate the message and write it to a corresponding language message board that is then viewable by all users typing in that language.

Thus only one translation is necessary per message per language. This decouples the act of translation from the act of viewing, which no longer requires active delivery of the message to the viewer but instead requires that the user actively go to view the message board. This significantly reduces the cost of real-time translation, since the server never has to deliver translated messages to clients that do not explicitly request them by selecting the message board of the translated language. If no one is viewing a message board, no translation is performed to that language. Some more details of this algorithm are described in the patent application for the technique, available online.
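The message board idea described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the AgilEntity code: the `ChatRoom` class, its method names, and the `fake_translate` stand-in are all hypothetical, and the real system spreads this work across multiple servers rather than keeping it in one object.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

# Hypothetical translator signature: (text, source_lang, target_lang) -> text
Translator = Callable[[str, str, str], str]


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real translation service; it only tags the text."""
    return f"[{source}->{target}] {text}"


class ChatRoom:
    """One message board per language; translation happens once per board."""

    def __init__(self, translate: Translator) -> None:
        self._translate = translate
        self._boards: Dict[str, List[str]] = defaultdict(list)
        self._viewers: Dict[str, Set[str]] = defaultdict(set)
        self.translation_calls = 0  # counts calls made to the translator

    def open_board(self, user: str, lang: str) -> None:
        """A user starts viewing the board for a language."""
        self._viewers[lang].add(user)

    def close_board(self, user: str, lang: str) -> None:
        """A user stops viewing the board for a language."""
        self._viewers[lang].discard(user)

    def post(self, text: str, source_lang: str) -> None:
        """Write the message to its own board, then translate it once for
        every other language whose board has at least one viewer."""
        self._boards[source_lang].append(text)
        for lang, viewers in self._viewers.items():
            if lang == source_lang or not viewers:
                continue  # nobody is watching this board: skip translation
            self.translation_calls += 1
            self._boards[lang].append(self._translate(text, source_lang, lang))

    def read_board(self, lang: str) -> List[str]:
        """Clients pull the board they selected; nothing is pushed to them."""
        return list(self._boards[lang])


room = ChatRoom(fake_translate)
room.open_board("ana", "ro")
room.open_board("ion", "ro")    # second Romanian viewer, same board
room.open_board("bob", "en")
room.open_board("yuki", "ja")
room.close_board("yuki", "ja")  # the Japanese board now has no viewers
room.post("hello", "en")
```

In this sketch the two Romanian viewers share a single translation, and the unwatched Japanese board costs nothing. One simplification to note: a user who opens a board late sees only messages translated after that point, which may or may not match how the real service behaves.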

The actual source of the translated text is the distributed network of Google (I also had it working with babelfish and freetranslate but settled on Google for the use of the free API rather than hacking the public interface ;) ). Google uses a statistical language translation technology to provide fairly accurate translations. For chatting the accuracy is excellent, as people tend to "speak" in a very informal and basic syntax in a chat room. When I originally got the function working, I wanted to have full support for all the languages that Google made available, but I ran into a problem where languages that used non-Roman character sets would fail to display in the chat room. The cause of the problem was logic bugs in the code that I wrote to read and write text from URLs and to documents. As the date for my launch came round, I decided to just launch the site with the limited Roman character support rather than try to add the additional languages, and that is what I did. Now that I finally have time to breathe after spending most of last month networking, testing ad campaigns and trying to hunt down angels and VCs for investment purposes, I decided to add the non-Roman character support.
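The exact bugs are not described here, but a very common cause of this kind of failure is decoding the bytes read from a URL with the wrong character encoding. The sketch below illustrates that general pitfall only; it is an assumption about the class of problem, not a description of the actual fix, and `decode_body` is a hypothetical helper.

```python
from typing import Optional


def decode_body(raw: bytes, charset: Optional[str] = None) -> str:
    """Decode raw response bytes with the declared charset,
    falling back to UTF-8 when none is declared (an assumption)."""
    return raw.decode(charset or "utf-8")


# Bytes as they might arrive from a translation service, encoded as UTF-8.
raw = "Привет".encode("utf-8")

# Wrong: treating every byte as one Latin-1 character garbles the text.
garbled = raw.decode("latin-1")

# Right: decode with the encoding the bytes were actually written in.
correct = decode_body(raw)
```

The same care applies when writing text out: opening files with an explicit `encoding="utf-8"` avoids depending on the platform default.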

I did this last night; the site now has support for 30 different languages spanning more than 5 character sets. The new languages added last night include:

Chinese (simplified)
Japanese
Korean
Tagalog (Filipino)
Vietnamese
Turkish
Hebrew
Arabic
Dutch
Russian
Finnish
Hindi
Romanian
Polish
Ukrainian
Indonesian
Persian (Farsi)

Google rushed to add the Farsi support after the recent events in Iran, in order to facilitate communication for third-party products that use their API, as numeroom.com does. So anyone in Iran can use a Numeroom to chat with non-Farsi speakers.

So now the original vision I had, to level the communication playing field once and for all by making language irrelevant, is coming to fruition! The language I most wanted to have was Romanian, which was the language that inspired the hunt for a solution. I am happy to say that bidirectional translation now exists between Romanian and all the other languages. If you are unfamiliar with the numeroom.com site, you can create a free account to try out the many services and features offered for secure collaboration by clicking on the links below.


Sign up for free, create your first numeroom automatically!!

and then...

Video Tutorial: Try out the real time language translation

Video Tutorial: Invite your friends to chat (just email is needed, no need for them to join)
