Enable your solutions to speak the languages of the Baltics


Can your products speak Latvian, Lithuanian, and Estonian? Enable your solutions to reach the Baltic market by integrating our language technology services for the Baltic languages. As the largest language technology company in the region, Tilde has spent years developing the world’s best language technology services for the languages of the Baltic states. By integrating these services, you'll be able to fully access this key region of northern Europe.

 

Machine translation services

Tilde offers the world's highest-quality MT systems for Latvian, Lithuanian, and Estonian, developed with the largest corpora of data. These systems are available for multiple domains, including the general domain.


Proofreading tools

Tilde has developed the market-leading proofreading systems for Latvian and Lithuanian, two of the most morphologically rich and complex languages in Europe. These solutions are based on years of research and innovation in proofreading tools.

  • Latvian spelling checker
    • Verifies the spelling of every word and offers to replace a misspelled word with the correct one. Automatically corrects words that are unambiguously misspelled. Tilde’s team constantly improves the spelling checker by including new lexical items and by adding new features (e.g., Intelligent AutoCorrect). The Latvian spelling checker now recognizes more than 22 million forms generated from more than 130 thousand lemmas.

  • Lithuanian spelling checker
    • Verifies the spelling of every word and offers to replace a misspelled word with the correct one. Automatically corrects words that are unambiguously misspelled. Tilde’s team constantly improves the spelling checker by including new lexical items and by adding new features (e.g., Intelligent AutoCorrect). The Lithuanian spelling checker now recognizes more than 22 million forms generated from more than 130 thousand lemmas.

  • Latvian grammar checker
    • Verifies sentence structure and punctuation. The grammar checker developed by Tilde is based on syntactic analysis of the text and offers corrections for the most common grammar mistakes. These include errors in word agreement, punctuation errors at the end of sentences, stylistic errors, and comma errors in insertions, participial phrases, coordinated sentence elements, and subordinate clauses.

      The grammar checker also finds long-distance syntactic errors between different parts of a sentence. In addition, it identifies calques, slang, and other undesirable words or constructions. This module also corrects simple errors such as extra spaces before or after punctuation marks and mismatched numbers of opening and closing brackets or quotation marks.

  • Lithuanian grammar checker
    • Verifies sentence structure and punctuation. The grammar checker developed by Tilde is based on syntactic analysis of the text and offers corrections for the most common grammar mistakes. These include errors in word agreement, punctuation errors at the end of sentences, stylistic errors, and comma errors in insertions, participial phrases, coordinated sentence elements, and subordinate clauses.

      The grammar checker also finds long-distance syntactic errors between different parts of a sentence. In addition, it identifies calques, slang, and other undesirable words or constructions. This module also corrects simple errors such as extra spaces before or after punctuation marks and mismatched numbers of opening and closing brackets or quotation marks.

  • Latvian hyphenator
    • The hyphenator marks every possible hyphenation point in the words of a text. It combines rules that define the usual hyphenation process with an exception list of words that cannot be hyphenated by rules alone.

  • Lithuanian hyphenator
    • The hyphenator marks every possible hyphenation point in the words of a text. It combines rules that define the usual hyphenation process with an exception list of words that cannot be hyphenated by rules alone.
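
The combination of rules and an exception list described for the hyphenators can be illustrated with a minimal Python sketch. This is not Tilde's implementation: the single break rule, the vowel set, the `EXCEPTIONS` entry, and the `hyphenate` function are all assumptions chosen for illustration, and real Latvian or Lithuanian hyphenation uses far richer rules.

```python
# Toy vowel set (assumption); real rules also handle diphthongs and digraphs.
VOWELS = set("aeiouāēīū")

# Hypothetical exception list: words whose break points the toy rule gets wrong,
# e.g. because a prefix boundary should be respected.
EXCEPTIONS = {"izēst": "iz-ēst"}


def hyphenate(word, exceptions=EXCEPTIONS):
    """Return the word with '-' at every hyphenation point the sketch finds."""
    lower = word.lower()
    # 1. The exception list takes priority over the rules.
    if lower in exceptions:
        return exceptions[lower]
    # 2. Toy rule: break before a consonant that is directly followed by a
    #    vowel, as long as the chunk before the break already contains a vowel.
    parts = []
    start = 0
    for i in range(1, len(word) - 1):
        is_break = (
            lower[i] not in VOWELS
            and lower[i + 1] in VOWELS
            and any(ch in VOWELS for ch in lower[start:i])
        )
        if is_break:
            parts.append(word[start:i])
            start = i
    parts.append(word[start:])
    return "-".join(parts)


print(hyphenate("māja"))   # handled by the rule
print(hyphenate("izēst"))  # handled by the exception list
```

Checking the exception list first keeps the rule set small: irregular words are listed explicitly and never complicate the general rules.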

Speech services

Speech is the next step in language technology. Though speech technology already exists for the world’s larger languages – such as English, Spanish, and German – smaller languages are underrepresented. Tilde is currently working on building speech technology services for Europe’s smaller languages like Latvian, Lithuanian, and Estonian. 

  • Latvian Automated Speech Recognition (ASR)
    • Tilde was the world’s first company to create ASR for Latvian. The ASR service is based on a large database of spoken Latvian. Since its completion, the service has been integrated into a mobile app that recognizes spoken numerals.

  • Latvian text-to-speech service
    • Speech Synthesis (Text-to-Speech, TTS) technology transforms the wording of an utterance into sounds that are output to the user.

      Tilde has worked on Speech Synthesis technology development since the late 1990s. Technology for pronouncing Latvian words and texts is also included in our product Tildes Birojs, the market-leading proofreading software for Latvian.

      In 2005, Tilde and the Latvian Society of the Blind started a project to address the needs of visually impaired people who use computers in Latvian. The architecture of the system covers the traditional TTS transformation: text normalization, grapheme-to-phoneme conversion, prosody generation, and waveform synthesis. The system is now available free of charge to all visually impaired people in Latvia.
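
The four stages of the traditional TTS transformation named above can be sketched as a chain of functions. This is a schematic outline under stated assumptions, not the architecture of Tilde's system: every function name, the `NUMBER_WORDS` table, the one-letter-per-phoneme mapping, the flat pitch, and the sine-tone output are placeholders that only show how data moves from one stage to the next.

```python
import math
import re

# Hypothetical number expansions; a real normalizer also covers dates,
# abbreviations, currencies, and so on.
NUMBER_WORDS = {"1": "viens", "2": "divi", "3": "trīs"}


def normalize_text(text):
    """Stage 1: lowercase, drop punctuation, expand known digits into words."""
    tokens = re.findall(r"\w+", text.lower())
    return [NUMBER_WORDS.get(tok, tok) for tok in tokens]


def grapheme_to_phoneme(words):
    """Stage 2: map letters to phonemes (placeholder: one letter, one phoneme)."""
    return [list(word) for word in words]


def generate_prosody(phoneme_words, pitch_hz=120.0, duration_ms=80):
    """Stage 3: attach pitch and duration (placeholder: flat pitch, fixed length)."""
    return [(p, pitch_hz, duration_ms) for word in phoneme_words for p in word]


def synthesize_waveform(prosody, sample_rate=16000):
    """Stage 4: render audio samples (placeholder: one sine tone per phoneme)."""
    samples = []
    for _phoneme, pitch_hz, duration_ms in prosody:
        n = sample_rate * duration_ms // 1000
        samples.extend(
            math.sin(2 * math.pi * pitch_hz * t / sample_rate) for t in range(n)
        )
    return samples


def text_to_speech(text):
    """Run the four stages in order and return raw samples in [-1.0, 1.0]."""
    words = normalize_text(text)
    phonemes = grapheme_to_phoneme(words)
    prosody = generate_prosody(phonemes)
    return synthesize_waveform(prosody)


print(normalize_text("Man ir 2 kaķi."))
print(len(text_to_speech("ir")), "samples")
```

Keeping each stage as a separate function mirrors the pipeline structure: any placeholder can be swapped for a real component without touching the others.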

Linguistic tools

Sophisticated linguistic components are the foundation of any language technology system, including MT. Tilde has developed these tools for multiple languages, combining the knowledge of its team of computational linguists, engineers, and developers.

  • Morphological analyzers and synthesizers
    • Languages: Latvian, Lithuanian

      Recognizes the morphology of words – particularly important for complex languages such as the Baltic languages. These tools are used in many practical applications, e.g., electronic dictionaries, grammar checkers, MT systems, and search engines developed by Tilde as well as licensed to other developers.

  • Tokenizers
    • Languages: Latvian, Lithuanian, Estonian

      A tokenizer recognizes where a word ends, thus dividing a sentence into tokens. Tilde has developed tokenizers for the Baltic languages.

  • Sentence breakers
    • Languages: Latvian, Lithuanian, Estonian

      Recognizes the end of a sentence, which can vary by language. Tilde has developed sentence breakers for the Baltic languages.

  • Lemmatizers
    • Languages: Latvian, Lithuanian

      Recognizes the base forms (lemmas) of words, which is essential in highly inflected languages. Tilde has developed lemmatizers for the Baltic languages.

  • Part-of-speech and morpho-syntactic taggers
    • Languages: Latvian, Lithuanian, Estonian

      Tags the part of speech of each word in a sentence. For morphologically rich languages like the Baltic languages, morpho-syntactic tagging is necessary as well. Tilde has built taggers for the Baltic languages.

  • Syntactic parsers
    • Languages: Latvian, Lithuanian

      Recognizes the syntax of each sentence and explains the relationships between the words in it. Tilde has built syntactic parsers for the Baltic languages.

  • Named entity recognizers
    • Languages: Latvian, Lithuanian

      Recognizes which words are named entities (persons, organizations, locations, etc.). Tilde has built named entity recognizers for the Baltic languages.
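
To make the tokenizer and sentence breaker entries in the list above more concrete, here is a minimal Python sketch of both tasks. It is an illustration only and does not reflect Tilde's tools: the `ABBREVIATIONS` set and the `tokenize` and `split_sentences` functions are assumptions, and production components handle many more cases (numbers, initials, quotations).

```python
import re

# Hypothetical abbreviation list: a full stop after these does not end a sentence.
ABBREVIATIONS = {"piem.", "u.c.", "utt."}


def tokenize(sentence):
    """Split a sentence into word and punctuation tokens.

    Listed abbreviations are kept whole, including their full stops.
    """
    tokens = []
    for chunk in sentence.split():
        if chunk.lower() in ABBREVIATIONS:
            tokens.append(chunk)
        else:
            # A token is either a run of word characters or one punctuation mark.
            tokens.extend(re.findall(r"\w+|[^\w\s]", chunk))
    return tokens


def split_sentences(text):
    """Break text after '.', '!' or '?' unless the chunk is a listed abbreviation."""
    sentences, current = [], []
    for chunk in text.split():
        current.append(chunk)
        if chunk[-1] in ".!?" and chunk.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without end punctuation
        sentences.append(" ".join(current))
    return sentences


text = "Es lasu grāmatas, piem. romānus. Tu arī?"
for sentence in split_sentences(text):
    print(tokenize(sentence))
```

The abbreviation list is where the language-specific knowledge lives in this sketch, which is one reason sentence boundaries can vary by language.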