Analysing non-concatenative morphologies
The breadth of languages with which the Bible Societies must work is probably greater than that of any other organisation. The lack of linguistic databases for most of these languages has encouraged Bible Society to begin developing systems that can automatically analyse some characteristics of natural language. A particular need is the ability to identify cognate word forms in a language with the minimum of supervision. Identifying close cognates improves the performance of key-term analysers and automatic back-translation and, once texts are complete, contributes to creating concordances and lemma-based search routines for these texts, which are increasingly being made available on the web. This paper considers the problems created for such processing by complex, non-concatenative morphologies.