Linguistic excellence

Term candidate extraction is a linguistic process: achieving good extraction quality requires a range of linguistic tools.

Tilde Terminology uses state-of-the-art term extraction methods based on linguistic analysis, statistics, and reference corpora to provide the best term identification and extraction services.

Different tools for different languages

Depending on the language-specific tools and resources available, the supported languages can be divided into four categories:

A level languages (English and Latvian) have the highest level of support in the Tilde Terminology platform for term tagging, term normalisation, and term translation equivalent look-up in the TaaS Statistical Data Base (SDB). The following linguistic tools are available for A level languages:

  • Part-of-speech (or morpho-syntactic) taggers trained on (high quality) human annotated training data
  • Lemmatisers, which allow performing better statistical analysis for term candidate extraction
  • Morphological analysers and synthesisers, which are required for term normalisation
  • Rule-based term normalisers, which allow reducing redundancy in the extracted term candidate lists

B level languages (Dutch, Estonian, French, German, Hungarian, Italian, Lithuanian and Spanish) have the highest level of support in the Tilde Terminology platform for term tagging; however, they do not have a term normalisation tool. Term translation equivalent look-up in the TaaS SDB is performed using lemmatised term forms instead of normalised term forms. The following linguistic tools are available for B level languages:

  • Part-of-speech (or morpho-syntactic) taggers trained on (high quality) human annotated training data
  • Lemmatisers, which allow performing better statistical analysis for term candidate extraction and provide basic support for redundancy reduction in the extracted term candidate lists

C level languages (Bulgarian, Croatian, Czech, Danish, Finnish, Greek, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovene and Swedish) have basic support in the Tilde Terminology platform for term tagging. They do not have a term normalisation tool, and term translation equivalent look-up in the TaaS SDB is performed using only term surface forms (the forms found in contexts) instead of normalised forms. The following linguistic tools are available for C level languages:

  • Part-of-speech taggers trained on (lower quality) automatically annotated training data

D level languages (Irish and Turkish) have no linguistic tool support in the Tilde Terminology platform for term tagging. They do not have a term normalisation tool, and term translation equivalent look-up in the TaaS SDB is performed using only term surface forms (the forms found in contexts) instead of normalised forms. Term candidate extraction for these languages is based on language-independent methods.
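
The four support levels described above can be summarised as a small look-up table. The following Python sketch is illustrative only: the names SupportLevel, LANGUAGES_BY_LEVEL, LOOKUP_FORM_BY_LEVEL, support_level and lookup_form are hypothetical and are not part of the Tilde Terminology platform or of any TaaS interface. Only the language lists and the term form used for SDB look-up are taken from the descriptions above.

```python
from enum import Enum


class SupportLevel(Enum):
    """Hypothetical labels for the four language categories."""
    A = "A"
    B = "B"
    C = "C"
    D = "D"


# Language lists as given in the category descriptions.
LANGUAGES_BY_LEVEL = {
    SupportLevel.A: ["English", "Latvian"],
    SupportLevel.B: ["Dutch", "Estonian", "French", "German",
                     "Hungarian", "Italian", "Lithuanian", "Spanish"],
    SupportLevel.C: ["Bulgarian", "Croatian", "Czech", "Danish", "Finnish",
                     "Greek", "Maltese", "Polish", "Portuguese", "Romanian",
                     "Russian", "Slovak", "Slovene", "Swedish"],
    SupportLevel.D: ["Irish", "Turkish"],
}

# Term form used for translation equivalent look-up at each level.
LOOKUP_FORM_BY_LEVEL = {
    SupportLevel.A: "normalised",
    SupportLevel.B: "lemmatised",
    SupportLevel.C: "surface",
    SupportLevel.D: "surface",
}

# Reverse index: language name -> support level.
LEVEL_BY_LANGUAGE = {
    language: level
    for level, languages in LANGUAGES_BY_LEVEL.items()
    for language in languages
}


def support_level(language: str) -> SupportLevel:
    """Return the support level for a language name (case-insensitive)."""
    key = language.strip().title()
    if key not in LEVEL_BY_LANGUAGE:
        raise ValueError(f"Unsupported language: {language!r}")
    return LEVEL_BY_LANGUAGE[key]


def lookup_form(language: str) -> str:
    """Return which term form is used for SDB look-up in this language."""
    return LOOKUP_FORM_BY_LEVEL[support_level(language)]
```

For example, under these assumptions lookup_form("German") gives "lemmatised", while lookup_form("Polish") and lookup_form("Irish") both give "surface". C and D level languages differ in tagging support, not in the term form used for look-up.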

Other Terminology Services features

Integration

Work with your terms in your translation environment tool, machine translation system, or content management system.

Largest Terminological resource

Secure storage

Cloud technology provides safe storage of your terminology, facilities for collaboration, and support for remote work.

Various formats

Support for term identification and extraction, with import and export options in the most popular document formats.