If you're multilingual, you can easily translate between the languages in which you're fluent. As a writer, though, I often find it useful to have a translation resource, whether for effect or simply for clarification. Sometimes it's a matter of translating a single word, which is easy enough with any of the available language-to-language dictionaries.
But what about phrases, entire sentences, or even paragraphs? I've received a number of foreign-language press releases that I would like to be able to translate.
Well, it's hardly surprising that Google has built out its own Web-based service, Google Translate (currently in its Beta phase), which now supports 41 languages. Google has developed its own "statistical translation system" to translate between any pair of supported languages.
The statistical machine translation system, according to Google, differs from typical rules-based translation systems, which require extensive definition of vocabularies and grammar rules. Instead, the company says, its software is fed "billions of words of text, both monolingual text in the target language, and aligned text consisting of examples of human translations between the languages, and then apply statistical learning techniques to build a translation model."
Google says it has achieved "very good results" using this model.
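To make the "aligned text" idea a little more concrete, here is a minimal sketch of how word translations can be learned purely from example sentence pairs, with no dictionary or grammar rules. It uses a textbook technique (expectation-maximization in the style of IBM Model 1), not Google's actual system, which is not public and is certainly far more elaborate. The three-sentence corpus and the function names train_word_model and best_guess are my own inventions for illustration.

```python
from collections import defaultdict


def train_word_model(pairs, iterations=20):
    """Estimate P(target word | source word) from aligned sentence pairs.

    Toy IBM Model 1-style EM: an illustrative sketch, not Google's method.
    """
    corpus = [(src.lower().split(), tgt.lower().split()) for src, tgt in pairs]

    # Start with equal scores for every word pair seen in the same sentence pair.
    prob = defaultdict(dict)
    for src, tgt in corpus:
        for s in src:
            for t in tgt:
                prob[s][t] = 1.0

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        for src, tgt in corpus:
            for t in tgt:
                # E-step: share each target word's credit among the source
                # words of the sentence, in proportion to current scores.
                norm = sum(prob[s][t] for s in src)
                for s in src:
                    count[s][t] += prob[s][t] / norm
        # M-step: turn the collected credit into probabilities per source word.
        for s, row in count.items():
            total = sum(row.values())
            prob[s] = {t: c / total for t, c in row.items()}
    return dict(prob)


def best_guess(model, word):
    """Return the most probable target word for a single source word."""
    row = model[word.lower()]
    return max(row, key=row.get)


# Hypothetical, hand-made aligned corpus (Estonian -> English).
PAIRS = [
    ("mu arvuti", "my computer"),
    ("mu kodu", "my home"),
    ("sinu kodu", "your home"),
]

model = train_word_model(PAIRS)
for word in ("mu", "arvuti", "kodu", "sinu"):
    print(word, "->", best_guess(model, word))
```

Even with three sentence pairs, the statistics settle on mu as "my", arvuti as "computer", kodu as "home", and sinu as "your", simply because those pairings explain the examples best. The flip side is that a model like this knows nothing it hasn't seen in its examples.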
I'm not so sure a translation model that doesn't incorporate complex rules can produce consistent results.
Google has recently added Estonian to its list of supported languages, so I decided to try it out with some simple sentences.
Estonian: Sinine on sinu taevas, kallis Eesti kodumaa.
My translation: Blue is your sky, my dear Estonian homeland.
Google: The blue sky is yours, for my own baby.
Estonian: Ma tahaksin kodus olla, kus õunapuud õitsevad.
My translation: I would like to be home, where the apple trees blossom.
Google: I would like to be at home, where the trees are blossoming.
Estonian: Musta lehma saba on kirju lehma taga.
My translation: The black cow's tail is behind the (multi)colored cow.
Google: Black tail of a cow is a cow behind letters.
Estonian: Mu arvuti on parandusel.
My translation: My computer is being repaired.
Google: My computer is a correction.
Estonian: Mu arvuti on katki.
My translation: My computer is broken.
Google: My computer is broken.
Estonian: Mis kell me läheme?
My translation: What time are we leaving?
Google: What time are we going?
These are only a few examples, but this small sample suggests that the software handles the simplest sentences quite well, while more complex structures and homonyms (let alone idiomatic words and expressions) present a challenge.
That's to be expected, I think, but it speaks to a need for grammar and usage rules and dictionaries. Every language has its idiosyncrasies, and without the proper "knowledge base," it seems nearly impossible to provide accurate translations of anything but the most basic sentences, which isn't particularly useful.
That said, Google is aware of the challenge: "Automatic translation is very difficult, as the meaning of words depends on the context in which they're used. While we are working on the problem, it may be some time before anyone can offer human quality translations. In the interim, we hope you find the service we provide useful for most purposes."
Google also invites feedback on specific translations and lets users submit bilingual text to help enhance its algorithms.
So is this service useful? It can be, in a somewhat limited capacity, for now. But spoken language is more complex than any programming language. We'll see what happens, but Google certainly has its work cut out for it if it wants to make this a truly useful resource. That said, if anyone can accomplish it while providing an easy-to-use Web application, Google is probably the one.
By the way, going from English to Estonian seems better:
English: ITEXPO in Miami was a very successful event.
Estonian: ITEXPO Miami oli väga edukas sündmus.
English: I'm looking for a VoIP service provider.
Estonian: Otsin VoIP teenusepakkuja.
I'd be interested to hear what others have found when translating other languages.