2. Points Illustrated
We consider both the end-user perspective and the content producer perspective.
From the user's perspective
- Interlingua-based translation: we translate meanings, rather than words
- Incremental parsing: the user is at every point guided by the list of possible next words
- Mixed input modalities: selection of words ("fridge magnets") combined with text input
- Quasi-incremental translation: since many basic types are also used as phrases, one can translate both single words and complete sentences, and get intermediate results along the way
- Disambiguation, esp. of politeness distinctions: if a phrase has many translations, each of them is shown and given an explanation (currently just in English, later in any source language)
- Fallback to statistical translation: currently just a link to Google Translate (forthcoming: tailor-made statistical models)
- Feedback from users: users are welcome to send comments, bug reports, and better translation suggestions
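The user-facing points above can be made concrete with a small sketch. It is not the Phrasebook implementation: the meanings, the language codes `Eng` and `Fre`, the table `LIN`, and the functions `parse`, `translate`, and `next_words` are all hypothetical names chosen for illustration, and a lookup table stands in for a real grammar and incremental parser. It shows translation through a shared meaning, next-word suggestions for a partial input, and how one source phrase can yield several translations that differ in politeness.

```python
# Toy interlingua: each meaning is a tuple, shared by all languages.
# All names and phrases here are illustrative assumptions, not Phrasebook data.
LIN = {
    "Eng": {
        ("HowAreYou", "Polite"): "how are you",
        ("HowAreYou", "Familiar"): "how are you",
        ("WhereIs", "Station"): "where is the station",
        ("WhereIs", "Hotel"): "where is the hotel",
    },
    "Fre": {
        ("HowAreYou", "Polite"): "comment allez-vous",
        ("HowAreYou", "Familiar"): "comment vas-tu",
        ("WhereIs", "Station"): "où est la gare",
        ("WhereIs", "Hotel"): "où est l'hôtel",
    },
}


def parse(lang, sentence):
    """Return every meaning whose linearization in `lang` is `sentence`."""
    return [m for m, s in LIN[lang].items() if s == sentence]


def translate(src, tgt, sentence):
    """Translate via meanings; an ambiguous source yields several results."""
    return [(m, LIN[tgt][m]) for m in parse(src, sentence)]


def next_words(lang, prefix):
    """List the words that can follow the word sequence `prefix` in `lang`."""
    n = len(prefix)
    found = set()
    for s in LIN[lang].values():
        words = s.split()
        if words[:n] == prefix and len(words) > n:
            found.add(words[n])
    return sorted(found)


if __name__ == "__main__":
    print(next_words("Eng", ["where", "is", "the"]))
    for meaning, text in translate("Eng", "Fre", "how are you"):
        # The meaning label plays the role of the explanation shown to the user.
        print(meaning[1], "->", text)
```

In this sketch the politeness label attached to each meaning serves as the explanation that distinguishes the alternative translations.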
From the programmer's perspective
- The use of resource grammars and functors: the translator was implemented on top of an earlier linguistic knowledge base, the GF Resource Grammar Library
- Example-based grammar writing and grammar induction from statistical models (Google Translate): many of the grammars were created semi-automatically by generalization from examples
- Compile-time transfer, especially in Action in Words: the structural differences between languages are treated at compile time, for maximal run-time efficiency
- The level of skills involved in grammar development: testing different configurations (see table below)
- Grammar testing: use of treebanks with guided random generation for initial evaluation and regression testing
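The grammar-testing point can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's test harness: the toy `linearize` function, the tree shape `(function, place)`, and the helpers `random_trees`, `build_treebank`, and `regressions` are hypothetical. Guidance is approximated here by restricting which functions the generator may pick. The idea is to generate trees, store their linearizations as a treebank once the output has been reviewed, and later report every tree whose linearization has changed.

```python
import random

# Hypothetical toy grammar: a tree is a (function, place) pair.
FUNS = ["WhereIs", "IsOpen"]
PLACES = {"Station": "the station", "Hotel": "the hotel", "Airport": "the airport"}


def linearize(tree):
    """Turn a tree into an English string (toy stand-in for a real grammar)."""
    fun, place = tree
    np = PLACES[place]
    if fun == "WhereIs":
        return f"where is {np}"
    if fun == "IsOpen":
        return f"is {np} open"
    raise ValueError(f"unknown function: {fun}")


def random_trees(n, seed, funs=FUNS):
    """Generate n random trees; `funs` restricts (guides) the choice."""
    rng = random.Random(seed)  # fixed seed makes the run reproducible
    places = sorted(PLACES)
    return [(rng.choice(funs), rng.choice(places)) for _ in range(n)]


def build_treebank(trees, lin=linearize):
    """Record the current linearization of each tree as the gold standard."""
    return {tree: lin(tree) for tree in trees}


def regressions(treebank, lin=linearize):
    """Return (tree, expected, actual) for every tree whose output changed."""
    return [
        (tree, expected, lin(tree))
        for tree, expected in treebank.items()
        if lin(tree) != expected
    ]


if __name__ == "__main__":
    bank = build_treebank(random_trees(5, seed=0, funs=["WhereIs"]))
    print(regressions(bank))  # empty list while the grammar is unchanged
```

An initial evaluation corresponds to reading through the generated treebank by hand; regression testing corresponds to rerunning `regressions` after each grammar change.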