r8 - 26 Aug 2017 - 14:07:00 - RomanYangarber

Morphology Materials

Encoding Notes

Please note: the software has been tested to run on all department machines. If you run into encoding problems, follow the steps below to make sure that the encoding of your running environment matches the encoding of the input files provided on these pages.

Your working environment most likely uses UTF-8, whereas the files provided here assume Latin-1 (ISO-8859-1) encoding.

  • To make the PC-Kimmo / KGEN toolkit work in Latin-1 mode:
    • Set these environment variables (assuming sh or bash):
      • unset LC_ALL
      • unset LC_MONETARY
      • export LC_CTYPE=fi_FI.iso88591
    • start a new shell process -- export only affects NEW processes, because the internal structures of the OLD shell process were already set up according to the old environment variable settings.
    • set the X-terminal encoding to Western (ISO-8859-1)
      • from the menubar: Terminal -- Set Character Encoding -- Western (ISO-8859-1)
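
The environment setup above can be sketched as a short sh/bash snippet. This is only an illustration: it assumes the fi_FI.iso88591 locale is installed on your machine (`locale -a` lists the available ones), and the `sh -c` line is added here just to show that a newly started process picks up the exported value.

```shell
# Sketch for sh or bash: switch the character-type locale to Latin-1.
# Assumption: the fi_FI.iso88591 locale exists on this machine (check: locale -a).
# If it is missing, bash may print a setlocale warning when LC_CTYPE is set.
unset LC_ALL        # LC_ALL, when set, overrides LC_CTYPE, so clear it first
unset LC_MONETARY
export LC_CTYPE=fi_FI.iso88591

# A NEW process inherits the exported setting; this child shell reports it.
sh -c 'echo "child process sees LC_CTYPE=$LC_CTYPE"'
```

To work interactively, start a new shell (for example by running `bash`) after these lines and run the PC-Kimmo / KGEN tools from there.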

  • NB: the PC-Kimmo package does not work in UTF-8 mode. Making it work would require the following:
    • all attached files would need to be converted to UTF-8 encoding -- this is not hard, e.g., using:
      • recode latin1..utf8 ...
      • iconv -f latin1 -t utf8 ...
    • the kgen compiler would need to be recompiled for UTF-8, which would be a more serious change, since the C code currently assumes 8-bit characters.
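
The file conversion step mentioned above can be tried on a small sample before touching the real files. The sketch below is an illustration only: the file names are hypothetical, and the canonical charset names ISO-8859-1 and UTF-8 are used in place of the `latin1` and `utf8` aliases. Converting the files does not by itself make kgen work in UTF-8; the compiler issue described above remains.

```shell
# Sketch: convert a Latin-1 file to UTF-8 with iconv and compare sizes.
# The file names are hypothetical; substitute the files you want to convert.
tmpdir=$(mktemp -d)

# Write a Latin-1 sample: octal \344 is byte 0xE4, the letter a-umlaut.
printf 'p\344iv\344\n' > "$tmpdir/sample.latin1.txt"

# Convert from Latin-1 to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/sample.latin1.txt" > "$tmpdir/sample.utf8.txt"

# Each a-umlaut takes 1 byte in Latin-1 and 2 bytes in UTF-8,
# so the sample grows from 6 bytes to 8 bytes.
wc -c < "$tmpdir/sample.latin1.txt"
wc -c < "$tmpdir/sample.utf8.txt"
```

The size difference is the reason 8-bit C code cannot simply be fed UTF-8 input: one character no longer corresponds to one byte.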

It is advisable to use the XFST toolkit and materials instead, since they avoid all of these encoding problems and provide other advantages as well.

Materials

  • Finnish-all.tgz: Finnish files accompanying tutorial parts I and II. NB: LATIN-1 ENCODING

-- RomanYangarber - 02 Jul 2011

Topic attachments
| Attachment | Size | Date | Who | Comment |
| Finnish-all.tgz | 9.0 K | 21 Feb 2014 - 15:30 | RomanYangarber | Finnish files accompanying tutorial part I and II: NB: Latin-1 ENCODING |