Towards Paratext 8
Whilst most of the resources available to ICAP are focussed on delivering and supporting current technologies, BFBS LC is already researching new technologies which are unlikely to be released for some years. Systems which might make it into later releases of Paratext 7 include:

  • Word-Formation Analysis
    A research prototype exists which is able to automatically analyse affixal morphology for any affixal language. It requires nothing more than a sizable word list from which it can identify stem lemmata and their associated morpheme structures.
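
To illustrate how morpheme structure can be recovered from a bare word list, the following is a minimal sketch. It is not the research prototype's algorithm, which is not described here. It assumes a much simpler technique: an ending counts as a suffix when removing it from several listed words leaves another listed word. The function names `induce_suffixes` and `segment`, the parameters `max_len` and `min_count`, and the English sample words are all hypothetical.

```python
from collections import Counter


def induce_suffixes(words, max_len=3, min_count=2):
    """Return endings whose removal leaves another attested word.

    Assumption for this sketch: an ending is a suffix candidate only if
    at least `min_count` words in the list support it.
    """
    vocab = set(words)
    counts = Counter()
    for word in vocab:
        for k in range(1, max_len + 1):
            stem, suffix = word[:-k], word[-k:]
            if stem in vocab:
                counts[suffix] += 1
    return {suffix for suffix, n in counts.items() if n >= min_count}


def segment(word, vocab, suffixes):
    """Split a word into (stem, suffix), trying the shortest stem first.

    Returns (word, "") when no attested stem plus known suffix is found.
    """
    for k in range(1, len(word)):
        stem, suffix = word[:k], word[k:]
        if stem in vocab and suffix in suffixes:
            return stem, suffix
    return word, ""


# Hypothetical sample data, used only to exercise the sketch.
words = ["walk", "walks", "walked", "talk", "talks", "talked",
         "jump", "jumps", "jumped"]
suffixes = induce_suffixes(words)
print(sorted(suffixes))
print(segment("walked", set(words), suffixes))
```

A realistic system would also need to handle prefixes, stem alternations and stems that never occur unaffixed, none of which this sketch attempts.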

Looking further ahead towards Paratext 8, we would hope to see:

  • Automatic Hyphenation
    The current hyphenation system used by Paratext for vernacular languages is entirely syllable-based. It may be possible to improve the results of the automatic process by including information generated by Word-Form Analysis (WFA).
  • Further Development of WFA
    The restriction of WFA to affixal languages is a severe limitation for projects working with complex non-concatenative morphologies. BFBS LC is presently researching a system to handle both affixal and infixal structures.
  • Proper-Name Transliteration Prediction
    A common problem for many projects is the difficulty of deciding upon transliterations of proper names and then applying them consistently across the text as a whole. BFBS LC has a research project to create a system which can learn the preferred transliteration map between a model language and a target language, and apply that knowledge to predict consistent renderings for proper names and other terms which are transliterated rather than translated.
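
The idea of learning a transliteration map and applying it consistently can be sketched as follows. This is only an illustration under strong assumptions, not the system being researched: it aligns names character by character, so it can only learn from pairs of equal length, and it picks the most frequent target character for each source character. The names `learn_map` and `predict`, and the sample name pairs, are hypothetical.

```python
from collections import Counter, defaultdict


def learn_map(pairs):
    """Learn a character-to-character map from (model, target) name pairs.

    Assumption for this sketch: only pairs of equal length are used,
    because characters are aligned purely by position.
    """
    votes = defaultdict(Counter)
    for source, target in pairs:
        if len(source) != len(target):
            continue  # positional alignment cannot handle these pairs
        for s_char, t_char in zip(source, target):
            votes[s_char][t_char] += 1
    # Keep the most frequently observed rendering of each character.
    return {s: seen.most_common(1)[0][0] for s, seen in votes.items()}


def predict(name, mapping):
    """Render a name with the learned map; unseen characters pass through."""
    return "".join(mapping.get(ch, ch) for ch in name)


# Hypothetical, lower-cased sample pairs (model language, target language).
pairs = [("david", "dawid"), ("levi", "lewi"), ("eva", "ewa")]
mapping = learn_map(pairs)
print(predict("silvanus", mapping))
```

Because every name is rendered through the same learned map, the same source character always receives the same rendering, which is the consistency property described above. Real transliteration involves many-to-one and context-dependent correspondences that a positional character map cannot capture.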

Unfortunately, a lack of resources means that at present only further development of WFA is actively being considered.