Numeroom: Non-Roman character support is here.

The beta launch of the numeroom.com web site over a month ago finally allowed the debut of the various collaboration functions that are unique to the site. One of these services is real-time implicit language translation. It lets free users of the service implicitly translate the text they type in a selected language to and from the languages used by the other users they are collaborating with, whether in chat rooms or in private one-on-one IMs.

The translation on agent demand feature (as it is also called) makes chatting with a group of people typing in different languages easy and efficient (from the business perspective). It was in fact one of the charter features of the collaboration API that I built into the AgilEntity platform, which is exposed through the numeroom.com website. To provide the service, I realized that the best way to make it extremely efficient was to distribute the necessary horsepower in multiple dimensions.

Toward that end, I had to design a distributed chat algorithm that would work without requiring every user in a chat room to be logically associated with the single server hosting the conversation. Hosting the users of one conversation across multiple servers makes chatting significantly more efficient, because the memory requirements can be spread across different servers even when all the participants are logically in the "same" room.

The other distribution strategy involved the translation function itself. In many-to-many language translation, each user's text must be translated into the language of every other user and then sent to those users' displays, which makes the activity incredibly computationally taxing. I had to design a way to eliminate this computational complexity, and after a week or so of thinking about it I came up with the efficient "message board" solution. Rather than translate each message entered by a user and transmit the translations to every user interface, I could use the already distributed chat system to translate the message and write it to a corresponding language message board, viewable by all users who are typing in that language.

Thus only one translation is necessary per message per language. This decouples the act of translating from the act of viewing: the message is no longer actively delivered to the viewer, and the user instead actively goes to view the message board. This significantly reduces the cost of real-time translation, since the server never has to deliver translated messages to clients that have not explicitly requested them by selecting the message board of the translated language. If no one is viewing a message board, no translation is performed to that language. More details of this algorithm are described in the patent application for the technique, which is available online.
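To make the "message board" idea concrete, here is a minimal single-process sketch in Python. It is not the AgilEntity implementation (which is distributed and described in the patent application); the class name `MessageBoardRoom`, its methods, and the `translate(text, source_lang, target_lang)` callback are all hypothetical names chosen for illustration. It only shows the two properties described above: one translation per message per language, and no translation at all for a board nobody is viewing.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set


class MessageBoardRoom:
    """One chat room with one lazily filled message board per language."""

    def __init__(self, translate: Callable[[str, str, str], str]) -> None:
        # translate(text, source_lang, target_lang) stands in for the real
        # translation backend; its signature is an assumption of this sketch.
        self.translate = translate
        self.boards: DefaultDict[str, List[str]] = defaultdict(list)
        self.viewers: DefaultDict[str, Set[str]] = defaultdict(set)
        self.translation_calls = 0  # counted only to make the cost visible

    def open_board(self, user: str, lang: str) -> None:
        """A user selects the board for a language and starts viewing it."""
        self.viewers[lang].add(user)

    def close_board(self, user: str, lang: str) -> None:
        """A user stops viewing the board for a language."""
        self.viewers[lang].discard(user)

    def post(self, text: str, source_lang: str) -> None:
        """Write a message once to every board that currently has viewers."""
        for lang, users in list(self.viewers.items()):
            if not users:
                continue  # nobody is viewing this board: skip the translation
            if lang == source_lang:
                self.boards[lang].append(text)
            else:
                self.translation_calls += 1
                self.boards[lang].append(self.translate(text, source_lang, lang))

    def read(self, lang: str, since: int = 0) -> List[str]:
        """Viewers pull messages from a board; nothing is pushed to them."""
        return list(self.boards[lang][since:])


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    # Placeholder translator so the sketch runs without any network access.
    return f"[{source_lang}->{target_lang}] {text}"


room = MessageBoardRoom(fake_translate)
room.open_board("ana", "ro")
room.open_board("ion", "ro")
room.open_board("sam", "en")
room.post("hello", "en")
```

In this run the two Romanian viewers share a single translation of "hello", and the Japanese board, which nobody has opened, costs nothing. With per-recipient delivery the same message would have been translated once for each Romanian reader.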

The translated text itself comes from Google's distributed network (I also had it working with babelfish and freetranslate but settled on Google so I could use the free API rather than hack the public interface ;) ). Google uses statistical language translation technology to provide fairly accurate translations. For chatting the accuracy is excellent, as people tend to "speak" in a very informal and basic syntax in a chat room. When I originally got the function working, I wanted full support for all the languages that Google made available, but I ran into a problem: languages that used non-Roman character sets would fail to display in the chat room. The cause was logical bugs in the code that I wrote to read and write text from URLs and to documents. As my launch date came round, I decided to launch the site with the limited Roman character support rather than try to add the additional languages, and that is what I did. Now that I finally have time to breathe, after spending most of last month networking, testing ad campaigns and trying to hunt down angels and VCs for investment, I decided to add the non-Roman character support.
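The post does not say what the bugs actually were. One common cause of this exact symptom, offered here only as an assumption, is decoding the bytes read from a URL with a single-byte default charset instead of the charset the server declares. The Python sketch below illustrates that failure mode and one way around it; the helper names `charset_from_content_type` and `decode_body` are hypothetical and the sample bytes are built locally rather than fetched.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the declared charset in lower case, or `default` if none is given.
    return msg.get_content_charset(default)


def decode_body(raw: bytes, content_type: str) -> str:
    """Decode a response body with its declared charset, not a platform default."""
    return raw.decode(charset_from_content_type(content_type))


# Simulated response body: Russian text encoded as UTF-8 (assumed for the demo).
original = "Здравствуйте"
raw = original.encode("utf-8")

good = decode_body(raw, "text/plain; charset=UTF-8")  # round-trips correctly
garbled = raw.decode("latin-1")  # wrong single-byte charset: one character per byte
```

The same care applies on the way out: when writing the text to a document, the file or stream should be opened with an explicit encoding (for example `open(path, "w", encoding="utf-8")`) so that the characters are not narrowed back to a Roman-only charset.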

I did this last night. The site now supports 30 different languages spanning over 5 character sets. The new languages added last night include:

Chinese (simplified)
Japanese
Korean
Tagalog (Filipino)
Vietnamese
Turkish
Hebrew
Arabic
Dutch
Russian
Finnish
Hindi
Romanian
Polish
Ukrainian
Indonesian
Persian (Farsi)

Google rushed to add Farsi support after the recent events in Iran, in order to facilitate communication for third-party products that use their API, as numeroom.com does. So anyone in Iran can use a Numeroom to chat with non-Farsi speakers.

So now my original vision, to level the communication playing field once and for all by making language irrelevant, is coming to fruition! The language I most wanted to have was Romanian, which was the language that inspired the hunt for a solution. I am happy to say that bidirectional translation between Romanian and all the other languages now exists. If you are unfamiliar with the numeroom.com site, you can create a free account to try out the many services and features it offers for secure collaboration by clicking on the links below.


Sign up for free and create your first numeroom automatically!

and then...

Video Tutorial: Try out the real time language translation

Video Tutorial: Invite your friends to chat (just an email address is needed; they do not need to join)
