Domain-specific MT system for Celsius Data

Celsius Data represents several Scandinavian medical databases and a clinical decision support system in the Baltics. It produces publications for the medical industry: doctors, pharmacists and nurses.

Because of the highly technical terminology, translations from English into Estonian were done in-house, which was time-consuming. General-purpose machine translation systems could not render medical and pharmacological terminology accurately, and data privacy and security concerns meant that the translations could not be assigned to a third party. Because Celsius Data produced medical content in Estonian, it needed a single system, English into Estonian, with highly accurate and consistent terminology across different areas of the life sciences. Celsius Data turned to Tilde and asked it to build a custom machine translation system. It took Tilde a few weeks to develop a machine translator based on one million units of parallel data, that is, sentences in both Estonian and English. The process involved training, tuning and retraining the system to make sure that the resulting machine translation provides precise and consistent terminology.



With the custom MT system, Celsius Data has significantly increased its translation productivity: in less than a year, the English-Estonian machine translator has become an important tool for daily operations. During this period, 3.5 million words have been translated with the machine translator, and the system has been used 248,389 times.

Elise Urva, Celsius Data: “Celsius offers a variety of medical databases for doctors and pharmacists. All translations are done in-house to ensure high quality, and we were looking for solutions to automate the process. The system has proven many times that it is remarkably accurate, and it has become a fundamental tool for the translation process.”