This directory contains data files that serve as language models for language
recognition. They are in the format used by libtextcat and were obtained from
the current openoffice.org-common Ubuntu package. In an OpenOffice.org
installation, you might find them in a place like
    basis3.1/share/fingerprint/

Note that several files were removed due to incorrect encodings or other
problems. Here is a summary of some issues encountered:

Arabic (ar): No frequencies
Belarusian (be): No frequencies
Chinese (zh): No frequencies
Croatian (hr): Possibly broken
Esperanto (eo): Possibly broken
Japanese (ja): No frequencies
Latvian (lv): Possible orthography issues
Lithuanian (lt): Division signs
Middle Frisian (??): Applicable?
Mingo (??): What is this?
Polish (pl): Incorrect encoding?
Romanian (ro): Missing non-ASCII characters. Incorrect encoding?
Serbian@latin: Language code needs Cyrillic model as well
Turkish (tr): Missing non-ASCII characters.
Ukrainian (uk): No frequencies
Vietnamese (vi): Encoding issues in model
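
The "No frequencies" entries above could be checked with a small script. The
following is only a rough sketch: it assumes that each line of a model file
holds an n-gram, optionally followed by whitespace and an integer count. The
function names read_fingerprint and has_frequencies are made up for this
example and are not part of libtextcat.

```python
def read_fingerprint(lines):
    """Parse model lines into (ngram, frequency) pairs.

    Assumption: one n-gram per line, optionally followed by whitespace
    and an integer count. The frequency is None when no count is present
    or when the second field is not an integer.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(None, 1)
        ngram = parts[0]
        freq = None
        if len(parts) == 2:
            try:
                freq = int(parts[1].strip())
            except ValueError:
                freq = None
        entries.append((ngram, freq))
    return entries


def has_frequencies(entries):
    """Return True only if the model is non-empty and every entry has a count."""
    return bool(entries) and all(freq is not None for _, freq in entries)


# Example with in-memory lines; a real file could be read in the same way,
# e.g. opened with encoding="latin-1" so that arbitrary bytes still decode.
sample = ["_\t20326\n", "e\t6617\n"]
print(has_frequencies(read_fingerprint(sample)))
```

A model that lists only n-grams, one per line, would make has_frequencies
return False under these assumptions.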


Classified incorrectly:
Norwegian (no): Classified as Danish
