With the explosion of digital content (a phenomenon often called the “data deluge”) and customers' growing expectations of quality translations in many languages, many translation companies are under great pressure from businesses to localise content, so that those businesses can use the Internet as a multilingual sales channel and maximise their online presence. Human translators alone can no longer meet the increasing translation needs of global businesses, so technology has to help in the process. Several attempts have been made over the last few decades to develop and improve reliable machine translation systems, often with doubtful results. Expectations ran very high in the 1950s, but the ALPAC report of 1966 all but killed research in machine translation for years. In the 21st century, technology has advanced to make use of massive amounts of data. Nowadays, it is statistics and not linguistics that drives the most popular engines. Machine translation is key to satisfying our hunger for more and more multilingual publications (website translations, eCommerce sites, documentation, business information, etc.). Translation has become a business enhancement tool that helps companies gain a competitive advantage.

Before we explain some of the advantages, it is important to understand what machine translation is and how it works. Machine translation (or MT for short) is a software process whereby text is translated by a computer program. The advantage of this process is that, once the system is running, translation needs no human intervention, whether the input is a single sentence or massive amounts of online data. A translation engine is created and configured either with training data (typically hundreds of thousands or even millions of previous translations) or with the help of dictionaries and rules. The translation itself takes place in seconds. In the first case, mathematical algorithms use statistics to estimate how likely it is that the new text is composed of smaller units already seen in previous translations. In the second case, computer scripts created by linguists relate the input in one language to the output in another, following a set of rules that map the structure of one language onto the other.
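To make the statistical idea more concrete, here is a minimal sketch in Python. It assumes a tiny, hand-made list of phrase pairs standing in for "previous translations"; the data and the function names `build_phrase_table` and `best_translation` are illustrative assumptions, not part of any real engine. It estimates how likely each target phrase is by simple relative frequency, which is far cruder than what a production system does.

```python
from collections import Counter, defaultdict

# Hypothetical "previous translations" at phrase level (English -> Spanish).
PHRASE_PAIRS = [
    ("thank you", "gracias"),
    ("thank you", "gracias"),
    ("thank you", "muchas gracias"),
    ("good night", "buenas noches"),
]

def build_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency over the pairs."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    table = {}
    for src, tgts in counts.items():
        total = sum(tgts.values())
        table[src] = {tgt: c / total for tgt, c in tgts.items()}
    return table

def best_translation(table, phrase):
    """Return (translation, probability) for the likeliest option, or None."""
    options = table.get(phrase)
    if not options:
        return None
    return max(options.items(), key=lambda kv: kv[1])

table = build_phrase_table(PHRASE_PAIRS)
print(best_translation(table, "thank you"))
```

In this toy data "gracias" appears in two of the three examples for "thank you", so it is chosen over "muchas gracias". A phrase that never appeared in the training pairs simply has no translation, which hints at why statistical engines need so much data.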

Types of machine translation

There are two general types of machine translation: rule-based systems and statistical systems. As explained above, a rule-based system uses a combination of language and grammar rules along with dictionaries. These systems tend to offer consistent but also literal translations. They have been in general use since the 1960s. The downside of these systems is that translations may sound too “stiff” or be simply incomprehensible, with no flow. This is because they reflect the structure of the source language. The vast majority of traditional rule-based systems were skilfully developed over the years by working out relationships and rules between two different languages. This tends to work fine when the languages are somewhat related, but results are not so good when they are not. Development times and costs are high.
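As a rough illustration of the rule-based approach, the sketch below combines a small bilingual dictionary with a single reordering rule (English adjective + noun becomes Spanish noun + adjective). The vocabulary, the part-of-speech tags and the function name `translate_rule_based` are assumptions made up for this example; real systems use thousands of rules and much richer dictionaries.

```python
# Toy rule-based translator: a dictionary lookup plus one reordering rule.
# Each entry maps an English word to (Spanish word, part-of-speech tag).
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "is": ("es", "VERB"),
    "fast": ("veloz", "ADJ"),
}

def translate_rule_based(sentence: str) -> str:
    tokens = sentence.lower().split()
    # Dictionary step: unknown words pass through unchanged, tagged UNK.
    tagged = [LEXICON.get(tok, (tok, "UNK")) for tok in tokens]
    out = []
    i = 0
    while i < len(tagged):
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1] == "NOUN"
        if tagged[i][1] == "ADJ" and next_is_noun:
            # Rule step: swap ADJ + NOUN into NOUN + ADJ.
            out.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            out.append(tagged[i][0])
            i += 1
    return " ".join(out)

print(translate_rule_based("The red car is fast"))  # el coche rojo es veloz
```

The output is consistent and predictable, but a word missing from the dictionary (for example "blue") is left untranslated, and every new language pair needs its own hand-written rules, which is where the development cost comes from.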

A statistical system, however, pays no attention to language rules.

Moses is the most widely used open-source translation engine

A statistical system, such as the one used by Google Translate, requires enormous amounts of parallel data (corpora). Google initially used corpora from the EU and the UN, along with other free online data it mined. After pruning and discarding the parts that were not fit for machine learning, Google created its famous translation offering. A statistical machine translation system can translate large amounts of data quickly and efficiently. It is the technology behind on-the-spot web page translation, translation on mobile devices, etc. This type of machine translation system also delivers translations that tend to read more fluently, because it analyses chunks of data in groups of 3, 4 or 5 words (also known as n-grams).
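For readers unfamiliar with n-grams, the short Python sketch below shows how a sentence is split into overlapping groups of n words and how those groups can be counted across a corpus. The helper names `ngrams` and `ngram_counts` and the sample sentences are illustrative assumptions; this is only the counting step, not a full language model.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_counts(sentences, n):
    """Count n-grams over a list of whitespace-tokenised sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(ngrams(sentence.lower().split(), n))
    return counts

corpus = [
    "the contract must be signed",
    "the contract must be renewed",
]
trigrams = ngram_counts(corpus, 3)
print(trigrams[("the", "contract", "must")])  # 2
```

Word sequences that appear often in the training data, such as "the contract must" here, are the ones a statistical engine prefers when it assembles its output, which helps explain why its translations tend to read more fluently.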

Nowadays, most systems make use of the best features of the above two methods. This is called hybrid machine translation. Hybrid systems were developed because, over the years, no single technique was able to provide a 100% satisfactory level of accuracy. Researchers began to ask whether the shortcomings of a statistical approach could be solved with some rules or linguistic input, and whether those of a rule-based approach could be solved with some statistical techniques. Research has reported that a hybrid approach can improve the accuracy of final translations in certain language combinations. Several popular machine translation systems employ hybrid methods. PangeaMT published two academic papers on hybridisation for the English-Japanese language pair with the Asian Association for Machine Translation, pointing to the fact that some hybridisation techniques improve translation results when language pairs are syntactically remote, like Japanese and English. The results were also presented at the Japan Translation event in 2011. Our company’s efforts on hybridisation received the acknowledgment of the European Union and we are now part of an EU research program on hybrid approaches to machine translation (EXPERT).

Organisations that operate in the international market and work on sensitive projects often choose some kind of machine translation API or connector to translate low-priority content automatically. Typically, this is content such as user comments or other user-generated content, which the organisation does not want to translate manually and which is not core to the company's documentation effort, yet can be useful to a community of users.
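The routing described above could be sketched as follows. This is only an outline under stated assumptions: the content-type labels in `LOW_PRIORITY`, the function `route_content` and the stand-in `fake_mt` are all hypothetical, and a real connector would call whatever API the chosen machine translation vendor actually provides.

```python
from typing import Callable, Dict, List, Tuple

# Assumed labels for content that is acceptable to translate automatically.
LOW_PRIORITY = {"user_comment", "forum_post", "review"}

def route_content(
    items: List[Dict[str, str]], mt: Callable[[str], str]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Send low-priority items through MT; keep the rest for human translators."""
    machine, human = [], []
    for item in items:
        if item["type"] in LOW_PRIORITY:
            machine.append({**item, "translation": mt(item["text"])})
        else:
            human.append(item)
    return machine, human

def fake_mt(text: str) -> str:
    """Placeholder for a real MT API call (hypothetical)."""
    return "[MT] " + text

items = [
    {"type": "user_comment", "text": "Great product!"},
    {"type": "legal_contract", "text": "Terms and conditions"},
]
machine, human = route_content(items, fake_mt)
print(len(machine), len(human))  # 1 1
```

Passing the translation function in as a parameter keeps the routing logic independent of any particular engine, so the same code could sit in front of a hosted, secured system or a test stub.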

If you are a company with a large amount of data to be translated in a short time, machine translation services should be high on your list. Free machine translation services can leak data, as reported by TCWorld, and the European Union is taking security measures in that respect. Therefore, a good machine translation system should be hosted and completely secured if it is to handle private data, for example data which may be sensitive from an industrial point of view or which belongs to intelligence services, the military, hospitals, insurance companies, etc. As a user, you must have complete peace of mind that your information is not disclosed to third parties. Sensitive content must be kept out of the hands of a careless or unscrupulous translator. This way, confidentiality is completely secured.

Large corporations, ecommerce stores, government agencies and content writers are benefiting from hiring and optimising machine translation services. There are many added advantages to using machine translation, including enhanced productivity and faster, cheaper translation. You need not rack your brains over whether the vendor has a good team of translators, nor worry about whether they have the expertise to understand your market. Your translation work can be completed easily and without hassle. These services are flexible enough to suit any type of project.

We hope you are now convinced why you should use machine translation services to get your translation job done. The only thing you need to consider is choosing a good machine translation service. You should ensure that the vendor uses the latest technology in its core work. There are plenty of vendors with an online presence, but Pangeanic also works on technology transfer via its division PangeaMT. You can visit the website to find out more and try the technology for your own purposes, and even build new business thanks to our API. Read the reviews from past clients and the presentations given with large corporations in the last few years. They will give you an insight into the customer service of your chosen vendor.

The popularity of machine translation services is increasing by leaps and bounds. Why wait any longer? Call a Pangeanic team member to see how we can best help you with machine translation when human translation cannot satisfy all your needs.
