Data Library

Don’t have enough data? No problem. We can draw from our huge multilingual Data Library to improve your MT system’s capabilities.

Tilde Data Library is one of the world’s largest storehouses of multilingual data, with over 2.5 billion parallel sentences and 4 million terms in more than 125 languages.

How can Data Library help you?

Data is the driving force behind language technology. Language technologies like MT need data that is multilingual, plentiful, and reliable. Tilde provides one of the world’s largest repositories of multilingual data.

For clients who don’t have enough data to build a system, we can draw from this repository to boost the system’s capabilities. For users who wish to build their own system, the repository is available as a resource.

Represented domains include:

  • pharmaceutical

  • IT

  • legal

  • finance

Tilde Data Library includes:

  • 12.35 billion parallel sentences

  • 4 million terms

  • over 125 languages

  • sophisticated linguistic components