| --- |
| layout: default |
| title: Liblouis User's and Programmer's Manual |
| --- |
| <!-- This manual is for liblouis (version 2.5.4, 3 March 2014), |
| a Braille Translation and Back-Translation Library derived from the |
| Linux screen reader BRLTTY. |
| |
| Copyright (C) 1999-2006 by the BRLTTY Team. |
| |
| Copyright (C) 2004-2007 ViewPlus Technologies, Inc. |
| www.viewplus.com. |
| |
| Copyright (C) 2007,2009 Abilitiessoft, Inc. |
| www.abilitiessoft.com. |
| |
| This file is free software; you can redistribute it and/or modify it |
| under the terms of the GNU Lesser (or library) General Public License |
| (LGPL) as published by the Free Software Foundation; either version 3, |
| or (at your option) any later version. |
| |
| This file is distributed in the hope that it will be useful, but |
| WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser (or Library) General Public License LGPL for more details. |
| |
| You should have received a copy of the GNU Lesser (or Library) General |
| Public License (LGPL) along with this program; see the file COPYING. |
| If not, write to the Free Software Foundation, 51 Franklin Street, |
| Fifth Floor, Boston, MA 02110-1301, USA. --> |
| <!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ --> |
| |
| <a name="SEC_Contents"></a> |
| <h2 class="contents-heading">Table of Contents</h2> |
| |
| <div class="contents"> |
| |
| <ul class="no-bullet"> |
| <li><a name="toc-Introduction-1" href="#Introduction">1 Introduction</a></li> |
| <li><a name="toc-Test-Programs-1" href="#Test-Programs">2 Test Programs</a> |
| <ul class="no-bullet"> |
| <li><a name="toc-lou_005fdebug-1" href="#lou_005fdebug">2.1 lou_debug</a></li> |
| <li><a name="toc-lou_005ftrace-1" href="#lou_005ftrace">2.2 lou_trace</a></li> |
| <li><a name="toc-lou_005fchecktable-1" href="#lou_005fchecktable">2.3 lou_checktable</a></li> |
| <li><a name="toc-lou_005fallround-1" href="#lou_005fallround">2.4 lou_allround</a></li> |
| <li><a name="toc-lou_005ftranslate-1" href="#lou_005ftranslate-_0028program_0029">2.5 lou_translate</a></li> |
| <li><a name="toc-lou_005fcheckhyphens-1" href="#lou_005fcheckhyphens">2.6 lou_checkhyphens</a></li> |
| </ul></li> |
| <li><a name="toc-How-to-Write-Translation-Tables-1" href="#How-to-Write-Translation-Tables">3 How to Write Translation Tables</a> |
| <ul class="no-bullet"> |
| <li><a name="toc-Hyphenation-Tables-1" href="#Hyphenation-Tables">3.1 Hyphenation Tables</a></li> |
| <li><a name="toc-Character_002dDefinition-Opcodes-1" href="#Character_002dDefinition-Opcodes">3.2 Character-Definition Opcodes</a></li> |
| <li><a name="toc-Braille-Indicator-Opcodes-1" href="#Braille-Indicator-Opcodes">3.3 Braille Indicator Opcodes</a></li> |
| <li><a name="toc-Emphasis-Opcodes-1" href="#Emphasis-Opcodes">3.4 Emphasis Opcodes</a></li> |
| <li><a name="toc-Special-Symbol-Opcodes-1" href="#Special-Symbol-Opcodes">3.5 Special Symbol Opcodes</a></li> |
| <li><a name="toc-Special-Processing-Opcodes-1" href="#Special-Processing-Opcodes">3.6 Special Processing Opcodes</a></li> |
| <li><a name="toc-Translation-Opcodes-1" href="#Translation-Opcodes">3.7 Translation Opcodes</a></li> |
| <li><a name="toc-Character_002dClass-Opcodes-1" href="#Character_002dClass-Opcodes">3.8 Character-Class Opcodes</a></li> |
| <li><a name="toc-Swap-Opcodes-1" href="#Swap-Opcodes">3.9 Swap Opcodes</a></li> |
| <li><a name="toc-The-Context-and-Multipass-Opcodes-1" href="#The-Context-and-Multipass-Opcodes">3.10 The Context and Multipass Opcodes</a></li> |
| <li><a name="toc-The-correct-Opcode-1" href="#The-correct-Opcode">3.11 The correct Opcode</a></li> |
| <li><a name="toc-Miscellaneous-Opcodes-1" href="#Miscellaneous-Opcodes">3.12 Miscellaneous Opcodes</a></li> |
| <li><a name="toc-Deprecated-Opcodes-1" href="#Deprecated-Opcodes">3.13 Deprecated Opcodes</a></li> |
| </ul></li> |
| <li><a name="toc-How-to-test-Translation-Tables-1" href="#How-to-test-Translation-Tables">4 How to test Translation Tables</a> |
| <ul class="no-bullet"> |
| <li><a name="toc-Translation-Table-Test-Harness-1" href="#Translation-Table-Test-Harness">4.1 Translation Table Test Harness</a></li> |
| <li><a name="toc-Translation-Table-Doctests-1" href="#Translation-Table-Doctests">4.2 Translation Table Doctests</a></li> |
| </ul></li> |
| <li><a name="toc-Notes-on-Back_002dTranslation-1" href="#Notes-on-Back_002dTranslation">5 Notes on Back-Translation</a></li> |
| <li><a name="toc-Programming-with-liblouis-1" href="#Programming-with-liblouis">6 Programming with liblouis</a> |
| <ul class="no-bullet"> |
| <li><a name="toc-License-1" href="#License">6.1 License</a></li> |
| <li><a name="toc-Overview-1" href="#Overview">6.2 Overview</a></li> |
| <li><a name="toc-Data-structure-of-liblouis-tables-1" href="#Data-structure-of-liblouis-tables">6.3 Data structure of liblouis tables</a></li> |
| <li><a name="toc-lou_005fversion-1" href="#lou_005fversion">6.4 lou_version</a></li> |
| <li><a name="toc-lou_005ftranslateString-1" href="#lou_005ftranslateString">6.5 lou_translateString</a></li> |
| <li><a name="toc-lou_005ftranslate-2" href="#lou_005ftranslate">6.6 lou_translate</a></li> |
| <li><a name="toc-lou_005fbackTranslateString-1" href="#lou_005fbackTranslateString">6.7 lou_backTranslateString</a></li> |
| <li><a name="toc-lou_005fbackTranslate-1" href="#lou_005fbackTranslate">6.8 lou_backTranslate</a></li> |
| <li><a name="toc-lou_005fhyphenate-1" href="#lou_005fhyphenate">6.9 lou_hyphenate</a></li> |
| <li><a name="toc-lou_005fcompileString-1" href="#lou_005fcompileString">6.10 lou_compileString</a></li> |
| <li><a name="toc-lou_005fdotsToChar-1" href="#lou_005fdotsToChar">6.11 lou_dotsToChar</a></li> |
| <li><a name="toc-lou_005fcharToDots-1" href="#lou_005fcharToDots">6.12 lou_charToDots</a></li> |
| <li><a name="toc-lou_005flogFile-1" href="#lou_005flogFile">6.13 lou_logFile</a></li> |
| <li><a name="toc-lou_005flogPrint-1" href="#lou_005flogPrint">6.14 lou_logPrint</a></li> |
| <li><a name="toc-lou_005flogEnd-1" href="#lou_005flogEnd">6.15 lou_logEnd</a></li> |
| <li><a name="toc-lou_005fsetDataPath-1" href="#lou_005fsetDataPath">6.16 lou_setDataPath</a></li> |
| <li><a name="toc-lou_005fgetDataPath-1" href="#lou_005fgetDataPath">6.17 lou_getDataPath</a></li> |
| <li><a name="toc-lou_005fgetTable-1" href="#lou_005fgetTable">6.18 lou_getTable</a></li> |
| <li><a name="toc-lou_005freadCharFromFile-1" href="#lou_005freadCharFromFile">6.19 lou_readCharFromFile</a></li> |
| <li><a name="toc-lou_005ffree-1" href="#lou_005ffree">6.20 lou_free</a></li> |
| <li><a name="toc-Python-bindings-1" href="#Python-bindings">6.21 Python bindings</a></li> |
| </ul></li> |
| <li><a name="toc-Opcode-Index-1" href="#Opcode-Index">Opcode Index</a></li> |
| <li><a name="toc-Function-Index-1" href="#Function-Index">Function Index</a></li> |
| <li><a name="toc-Program-Index-1" href="#Program-Index">Program Index</a></li> |
| </ul> |
| </div> |
| |
| <hr/> |
| <a name="Top"></a> |
| <a name="Liblouis-User_0027s-and-Programmer_0027s-Manual"></a> |
| |
| <p>This manual is for liblouis (version 2.5.4, 3 March 2014), |
| a Braille Translation and Back-Translation Library derived from the |
| Linux screen reader <acronym>BRLTTY</acronym>. |
| </p> |
| <p>Copyright © 1999-2006 by the BRLTTY Team. |
| </p> |
| <p>Copyright © 2004-2007 ViewPlus Technologies, Inc. |
| <a href="http://www.viewplus.com">www.viewplus.com</a>. |
| </p> |
| <p>Copyright © 2007,2009 Abilitiessoft, Inc. |
| <a href="http://www.abilitiessoft.com">www.abilitiessoft.com</a>. |
| </p> |
| <blockquote> |
| <p>This file is free software; you can redistribute it and/or modify it |
| under the terms of the GNU Lesser (or library) General Public License |
| (LGPL) as published by the Free Software Foundation; either version 3, |
| or (at your option) any later version. |
| </p> |
| <p>This file is distributed in the hope that it will be useful, but |
| WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser (or Library) General Public License LGPL for more details. |
| </p> |
| <p>You should have received a copy of the GNU Lesser (or Library) General |
| Public License (LGPL) along with this program; see the file COPYING. |
| If not, write to the Free Software Foundation, 51 Franklin Street, |
| Fifth Floor, Boston, MA 02110-1301, USA. |
| </p></blockquote> |
| |
| |
| |
| <hr> |
| <a name="Introduction"></a> |
| <a name="Introduction-1"></a> |
| <h2 class="chapter">1 Introduction</h2> |
| |
| <p>Liblouis is an open-source braille translator and back-translator |
| derived from the translation routines in the BRLTTY screen reader for |
| Linux. It has, however, gone far beyond these routines. It is named in |
| honor of Louis Braille. In Linux and Mac OSX it is a shared library, |
| and in Windows it is a DLL. For installation instructions see the |
| README file. Please report bugs and oddities to the maintainer, |
| <a href="mailto:john.boyer@abilitiessoft.com">john.boyer@abilitiessoft.com</a>. |
| </p> |
| <p>This documentation is derived from Chapter 7 of the BRLTTY manual, but |
| it has been extensively rewritten to cover new features. |
| </p> |
| <p>Please read the following copyright and warranty information. Note |
| that this information also applies to all source code, tables and |
| other files in this distribution of liblouis. It applies similarly to |
| the sister library liblouisxml. |
| </p> |
| <p>This file is maintained by John J. Boyer |
| <a href="mailto:john.boyer@abilitiessoft.com">john.boyer@abilitiessoft.com</a>. |
| </p> |
| <p>Persons who wish to program with liblouis but will not be writing |
| translation tables may want to skip ahead to <a href="#Programming-with-liblouis">Programming with liblouis</a>. |
| </p> |
| <hr> |
| <a name="Test-Programs"></a> |
| <a name="Test-Programs-1"></a> |
| <h2 class="chapter">2 Test Programs</h2> |
| |
| <p>A number of test programs are provided as part of the liblouis |
| package. They are intended for testing liblouis and for debugging |
| tables. None of them is suitable for braille transcription. An |
| application that can be used for transcription is <code>xml2brl</code>, |
| which is part of the liblouisxml package (see <a href="liblouisxml.html#Top">Introduction</a> in <cite>Liblouisxml User’s and Programmer’s Manual</cite>). The source |
| code of the test programs can be studied to learn how to use the |
| liblouis library and they can be used to perform the following |
| functions. |
| </p> |
| <a name="common-options"></a><p>All of these programs recognize the <samp>--help</samp> and |
| <samp>--version</samp> options. |
| </p> |
| <dl compact="compact"> |
| <dt><samp>--help</samp></dt> |
| <dt><samp>-h</samp></dt> |
| <dd><p>Print a usage message listing all available options, then exit |
| successfully. |
| </p> |
| </dd> |
| <dt><samp>--version</samp></dt> |
| <dt><samp>-v</samp></dt> |
| <dd><p>Print the version number, then exit successfully. |
| </p> |
| </dd> |
| </dl> |
| |
| |
| <hr> |
| <a name="lou_005fdebug"></a> |
| <a name="lou_005fdebug-1"></a> |
| <h3 class="section">2.1 lou_debug</h3> |
| <a name="index-lou_005fdebug"></a> |
| |
| <p>The <code>lou_debug</code> tool is intended for debugging liblouis |
| translation tables. The command line for <code>lou_debug</code> is: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_debug [OPTIONS] TABLE[,TABLE,...] |
| </pre></div> |
| |
| <p>The command line options that are accepted by <code>lou_debug</code> are |
| described in <a href="#common-options">common options</a>. |
| </p> |
| <p>The table (or comma-separated list of tables) is compiled. If no |
| errors are found a brief command summary is printed, then the prompt |
| ‘<samp>Command:</samp>’. You can then input one of the command letters and get |
| output, as described below. |
| </p> |
| <p>Most of the commands print information in the various arrays of |
| <code>TranslationTableHeader</code>. Since these arrays are pointers to |
| chains of hashed items, the commands first print the hash number, then |
| the first item, then the next item chained to it, and so on. After |
| each item there is a prompt indicated by ‘<samp>=></samp>’. You can then press |
| enter (<kbd><span class="key">RET</span></kbd>) to see the next item in the chain or the first |
| item in the next chain. Or you can press <kbd>h</kbd> (for next-(h)ash) to |
| skip to the next hash chain. You can also press <kbd>e</kbd> to exit the |
| command and go back to the ‘<samp>command:</samp>’ prompt. |
| </p> |
| <dl compact="compact"> |
| <dt><kbd>h</kbd></dt> |
| <dd><p>Brings up a screen of somewhat more extensive help. |
| </p> |
| </dd> |
| <dt><kbd>f</kbd></dt> |
| <dd><p>Display the first forward-translation rule in the first non-empty hash |
| bucket. The number of the bucket is displayed at the beginning of the |
| chain. Each rule is identified by the word ‘<samp>Rule:</samp>’. The fields |
| are displayed by phrases consisting of the name of the field, an equal |
| sign, and its value. The before and after fields are displayed only if |
| they are nonzero. Special opcodes such as the <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>) and |
| the multipass opcodes are shown with the code that instructs the |
| virtual machine that interprets them. If you want to see only the |
| rules for a particular character string you can type <kbd>p</kbd> at the |
| ‘<samp>command:</samp>’ prompt. This will take you to the ‘<samp>particular:</samp>’ |
| prompt, where you can press <kbd>f</kbd> and then type in the string. The |
| whole hash chain containing the string will be displayed. |
| </p> |
| </dd> |
| <dt><kbd>b</kbd></dt> |
| <dd><p>Display back-translation rules. This display is very similar to that |
| of forward translation rules except that the dot pattern is displayed |
| before the character string. |
| </p> |
| </dd> |
| <dt><kbd>c</kbd></dt> |
| <dd><p>Display character definitions, again within their hash chains. |
| </p> |
| </dd> |
| <dt><kbd>d</kbd></dt> |
| <dd><p>Displays single-cell dot definitions. If a character-definition opcode |
| gives a multi-cell dot pattern, it is displayed among the |
| back-translation rules. |
| </p> |
| </dd> |
| <dt><kbd>C</kbd></dt> |
| <dd><p>Display the character-to-dots map. This is set up by the |
| character-definition opcodes and can also be influenced by the |
| <code>display</code> opcode (see <a href="#display-opcode"><code>display</code></a>). |
| </p> |
| </dd> |
| <dt><kbd>D</kbd></dt> |
| <dd><p>Display the dot to character map, which shows which single-cell dot |
| patterns map to which characters. |
| </p> |
| </dd> |
| <dt><kbd>z</kbd></dt> |
| <dd><p>Show the multi-cell dot patterns which have been assigned to the |
| characters from 0 to 255 to comply with computer braille codes such as |
| a 6-dot code. Note that the character-definition opcodes should use |
| 8-dot computer braille. |
| </p> |
| </dd> |
| <dt><kbd>p</kbd></dt> |
| <dd><p>Bring up a secondary (‘<samp>particular:</samp>’) prompt from which you can |
| examine particular character strings, dot patterns, etc. The commands |
| (given in its own command summary) are very similar to those of the |
| main ‘<samp>command:</samp>’ prompt, but you can type a character string or |
| dot pattern. They include <kbd>h</kbd>, <kbd>f</kbd>, <kbd>b</kbd>, <kbd>c</kbd>, <kbd>d</kbd>, |
| <kbd>C</kbd>, <kbd>D</kbd>, <kbd>z</kbd> and <kbd>x</kbd> (to exit this prompt), but not |
| <kbd>p</kbd>, <kbd>i</kbd> and <kbd>m</kbd>. |
| </p> |
| </dd> |
| <dt><kbd>i</kbd></dt> |
| <dd><p>Show braille indicators. This shows the dot patterns for various |
| opcodes such as the <code>capsign</code> opcode (see <a href="#capsign-opcode"><code>capsign</code></a>) and the <code>numsign</code> opcode (see <a href="#numsign-opcode"><code>numsign</code></a>). |
| It also shows emphasis dot patterns, such as those for the |
| <code>italword</code> opcode, |
| the <code>firstletterbold</code> opcode (see <a href="#firstletterbold-opcode"><code>firstletterbold</code></a>), etc. If a given |
| opcode has not been used nothing is printed for it. |
| </p> |
| </dd> |
| <dt><kbd>m</kbd></dt> |
| <dd><p>Display various miscellaneous information about the table, such as the |
| number of passes, whether certain opcodes have been used, and whether |
| there is a hyphenation table. |
| </p> |
| </dd> |
| <dt><kbd>q</kbd></dt> |
| <dd><p>Exit the program. |
| </p></dd> |
| </dl> |
| |
| <hr> |
| <a name="lou_005ftrace"></a> |
| <a name="lou_005ftrace-1"></a> |
| <h3 class="section">2.2 lou_trace</h3> |
| <a name="index-lou_005ftrace"></a> |
| |
| <p>When working on translation tables it is sometimes useful to determine |
| what rules were applied when translating a string. <code>lou_trace</code> |
| helps with exactly that. It lists all the applied rules for a given |
| translation table and an input string. |
| </p> |
| <div class="example"> |
| <pre class="example">lou_trace [OPTIONS] TABLE[,TABLE,...] |
| </pre></div> |
| |
| <p><code>lou_trace</code> accepts all the standard options (see <a href="#common-options">common options</a>). Once started you can type an input string followed by |
| <kbd><span class="key">RET</span></kbd>. <code>lou_trace</code> will print the braille |
| translation followed by the list of rules that were applied to produce the |
| translation. A possible invocation is listed in the following example: |
| </p> |
| <div class="example"> |
| <pre class="example">$ lou_trace tables/en-us-g2.ctb |
| the u.s. postal service |
| ! u4s4 po/al s}vice |
| 1. largesign the 2346 |
| 2. repeated 0 |
| 3. lowercase u 136 |
| 4. punctuation . 46 |
| 5. context _$l["."]$l @256 |
| 6. lowercase s 234 |
| 7. postpunc . 256 |
| 8. repeated 0 |
| 9. begword post 1234-135-34 |
| 10. largesign a 1 |
| 11. lowercase l 123 |
| 12. repeated 0 |
| 13. lowercase s 234 |
| 14. always er 12456 |
| 15. lowercase v 1236 |
| 16. lowercase i 24 |
| 17. lowercase c 14 |
| 18. lowercase e 15 |
| 19. pass2 $s1-10 @0 |
| 20. pass2 $s1-10 @0 |
| 21. pass2 $s1-10 @0 |
| </pre></div> |
| |
| <hr> |
| <a name="lou_005fchecktable"></a> |
| <a name="lou_005fchecktable-1"></a> |
| <h3 class="section">2.3 lou_checktable</h3> |
| <a name="index-lou_005fchecktable"></a> |
| |
| <p>To use this program type the following: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_checktable [OPTIONS] TABLE |
| </pre></div> |
| |
| <p>Aside from the standard options (see <a href="#common-options">common options</a>) |
| <code>lou_checktable</code> also accepts the following options: |
| </p> |
| <dl compact="compact"> |
| <dt><samp>--quiet</samp></dt> |
| <dt><samp>-q</samp></dt> |
| <dd><p>Do not write to standard error if there are no errors. |
| </p> |
| </dd> |
| </dl> |
| |
| <p>If the table contains errors, appropriate messages will be displayed. |
| If there are no errors the message ‘<samp>no errors found.</samp>’ will be |
| shown. |
| </p> |
| <hr> |
| <a name="lou_005fallround"></a> |
| <a name="lou_005fallround-1"></a> |
| <h3 class="section">2.4 lou_allround</h3> |
| <a name="index-lou_005fallround"></a> |
| |
| <p>This program tests every capability of the liblouis library. It is |
| completely interactive. Invoke it as follows: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_allround [OPTIONS] |
| </pre></div> |
| |
| <p>The command line options that are accepted by <code>lou_allround</code> |
| are described in <a href="#common-options">common options</a>. |
| </p> |
| <p>You will see a few lines telling you how to use the program. Pressing |
| one of the letters in parentheses and then enter will take you to a |
| message asking for more information or for the answer to a yes/no |
| question. Typing the letter ‘<samp>r</samp>’ and then <tt class="key">RET</tt> will take you |
| to a screen where you can enter a line to be processed by the library |
| and then view the results. |
| </p> |
| <hr> |
| <a name="lou_005ftranslate-_0028program_0029"></a> |
| <a name="lou_005ftranslate-1"></a> |
| <h3 class="section">2.5 lou_translate</h3> |
| <a name="index-lou_005ftranslate-1"></a> |
| |
| <p>This program translates whatever is on the standard input unit and |
| prints it on the standard output unit. It is intended for large-scale |
| testing of the accuracy of translation and back-translation. The |
| command line for <code>lou_translate</code> is: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_translate [OPTIONS] TABLE[,TABLE,...] |
| </pre></div> |
| |
| <p>Aside from the standard options (see <a href="#common-options">common options</a>) this program |
| also accepts the following options: |
| </p> |
| <dl compact="compact"> |
| <dt><samp>--forward</samp></dt> |
| <dt><samp>-f</samp></dt> |
| <dd><p>Do a forward translation. |
| </p> |
| </dd> |
| <dt><samp>--backward</samp></dt> |
| <dt><samp>-b</samp></dt> |
| <dd><p>Do a backward translation. |
| </p> |
| </dd> |
| </dl> |
| |
| <p>To translate or back-translate a file, use a line like this: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_translate --forward en-us-g2.ctb &lt;liblouis.txt &gt;testtrans |
| </pre></div> |
| |
| <hr> |
| <a name="lou_005fcheckhyphens"></a> |
| <a name="lou_005fcheckhyphens-1"></a> |
| <h3 class="section">2.6 lou_checkhyphens</h3> |
| <a name="index-lou_005fcheckhyphens"></a> |
| |
| <p>This program checks the accuracy of hyphenation in Braille translation |
| for both translated and untranslated words. It is completely |
| interactive. Invoke it as follows: |
| </p> |
| <div class="example"> |
| <pre class="example">lou_checkhyphens [OPTIONS] |
| </pre></div> |
| |
| <p>The command line options that are accepted by |
| <code>lou_checkhyphens</code> are described in <a href="#common-options">common options</a>. |
| </p> |
| <p>You will see a few lines telling you how to use the program. |
| </p> |
| <hr> |
| <a name="How-to-Write-Translation-Tables"></a> |
| <a name="How-to-Write-Translation-Tables-1"></a> |
| <h2 class="chapter">3 How to Write Translation Tables</h2> |
| |
| <p>Many translation (contraction) tables have already been made up. They |
| are included in this distribution in the tables directory and should be |
| studied as part of the documentation. The most helpful (and normative) |
| are listed in the following table: |
| </p> |
| <dl compact="compact"> |
| <dt><samp>chardefs.cti</samp></dt> |
| <dd><p>Character definitions for U.S. tables |
| </p></dd> |
| <dt><samp>compress.ctb</samp></dt> |
| <dd><p>Remove excessive whitespace |
| </p></dd> |
| <dt><samp>en-us-g1.ctb</samp></dt> |
| <dd><p>Uncontracted American English |
| </p></dd> |
| <dt><samp>en-us-g2.ctb</samp></dt> |
| <dd><p>Contracted or Grade 2 American English |
| </p></dd> |
| <dt><samp>en-us-brf.dis</samp></dt> |
| <dd><p>Make liblouis output conform to BRF standard |
| </p></dd> |
| <dt><samp>en-us-comp8.ctb</samp></dt> |
| <dd><p>8-dot computer braille for use in coding examples |
| </p></dd> |
| <dt><samp>en-us-comp6.ctb</samp></dt> |
| <dd><p>6-dot computer braille |
| </p></dd> |
| <dt><samp>nemeth.ctb</samp></dt> |
| <dd><p>Nemeth Code translation for use with liblouisxml |
| </p></dd> |
| <dt><samp>nemeth_edit.ctb</samp></dt> |
| <dd><p>Fixes errors at the boundaries of math and text |
| </p> |
| </dd> |
| </dl> |
| |
| <p>The names used for files containing translation tables are completely |
| arbitrary. They are not interpreted in any way by the translator. |
| Contraction tables may be 8-bit ASCII files, UTF-8, 16-bit big-endian |
| Unicode files or 16-bit little-endian Unicode files. Blank lines are |
| ignored. Any leading and trailing whitespace (any number of blanks |
| and/or tabs) is ignored. Lines which begin with a number sign or hatch |
| mark (‘<samp>#</samp>’) are ignored, i.e. they are comments. If the number |
| sign is not the first non-blank character in the line, it is treated |
| as an ordinary character. If the first non-blank character is |
| less-than (‘<samp>&lt;</samp>’) the line is also treated as a comment. This makes |
| it possible to mark up tables as xhtml documents. Lines which are not |
| blank or comments define table entries. The general format of a table |
| entry is: |
| </p> |
| <div class="example"> |
| <pre class="example">opcode operands comments |
| </pre></div> |
| |
| <p>Table entries may not be split between lines. The opcode is a mnemonic |
| that specifies what the entry does. The operands may be character |
| sequences, braille dot patterns or occasionally something else. They |
| are described for each opcode; please see <a href="#Opcode-Index">Opcode Index</a>. With some |
| exceptions, opcodes expect a certain number of operands. Any text on |
| the line after the last operand is ignored, and may be a comment. A |
| few opcodes accept a variable number of operands. In this case a |
| number sign begins a comment unless it is preceded by a backslash |
| (‘<samp>\</samp>’). |
| </p> |
| <p>Here are some examples of table entries. |
| </p> |
| <div class="example"> |
| <pre class="example"># This is a comment. |
| always world 456-2456 A word and the dot pattern of its contraction |
| </pre></div> |
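<p>The line rules described above can be sketched in Python. This is only
an illustration of the rules as stated in this manual, not the parser
that liblouis itself uses; the function name
<code>classify_table_line</code> and its return shape are hypothetical.
It does not split operands from trailing comments, because the number
of operands depends on the opcode.
</p>

```python
def classify_table_line(line):
    """Classify one table line as 'blank', 'comment' or 'entry'.

    Returns a tuple (kind, opcode, rest). For blank lines and comments
    the opcode and rest are None. Hypothetical helper for illustration.
    """
    # Leading and trailing blanks and tabs are ignored.
    stripped = line.strip(" \t\r\n")
    if not stripped:
        return ("blank", None, None)
    # A line whose first non-blank character is '#' or '<' is a comment.
    if stripped[0] in "#<":
        return ("comment", None, None)
    # Otherwise the first whitespace-delimited word is the opcode and
    # everything after it holds the operands and any trailing comment.
    parts = stripped.split(None, 1)
    opcode = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return ("entry", opcode, rest)
```

<p>For the two example lines above, the first would be classified as a
comment and the second as an entry whose opcode is <code>always</code>.
</p>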
| |
| <p>Most opcodes have both a "characters" operand and a "dots" operand, |
| though some have only one and a few have other types. |
| </p> |
| <p>The characters operand consists of any combination of characters and |
| escape sequences preceded and followed by whitespace. Escape |
| sequences are used to represent difficult characters. They begin with |
| a backslash (‘<samp>\</samp>’). They are: |
| </p> |
| <dl compact="compact"> |
| <dt><kbd>\\</kbd></dt> |
| <dd><p>backslash |
| </p></dd> |
| <dt><kbd>\f</kbd></dt> |
| <dd><p>form feed |
| </p></dd> |
| <dt><kbd>\n</kbd></dt> |
| <dd><p>new line |
| </p></dd> |
| <dt><kbd>\r</kbd></dt> |
| <dd><p>carriage return |
| </p></dd> |
| <dt><kbd>\s</kbd></dt> |
| <dd><p>blank (space) |
| </p></dd> |
| <dt><kbd>\t</kbd></dt> |
| <dd><p>horizontal tab |
| </p></dd> |
| <dt><kbd>\v</kbd></dt> |
| <dd><p>vertical tab |
| </p></dd> |
| <dt><kbd>\e</kbd></dt> |
| <dd><p>"escape" character (hex 1b, dec 27) |
| </p></dd> |
| <dt><kbd>\xhhhh</kbd></dt> |
| <dd><p>4-digit hexadecimal value of a character |
| </p> |
| </dd> |
| </dl> |
| |
| <p>If liblouis has been compiled for 32-bit Unicode the following are |
| also recognized. |
| </p> |
| <dl compact="compact"> |
| <dt><kbd>\yhhhhh</kbd></dt> |
| <dd><p>5-digit (20 bit) character |
| </p></dd> |
| <dt><kbd>\zhhhhhhhh</kbd></dt> |
| <dd><p>Full 32-bit value. |
| </p> |
| </dd> |
| </dl> |
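<p>As a rough illustration of how these escape sequences expand, the
following Python sketch decodes a characters operand. It is not the
liblouis implementation. The name
<code>decode_characters_operand</code> is hypothetical, and the sketch
assumes that a doubled backslash stands for a literal backslash and that
the <code>\y</code> and <code>\z</code> forms are always available.
</p>

```python
import string

# Single-letter escapes listed in the manual, mapped to their characters.
SIMPLE_ESCAPES = {
    "\\": "\\",   # assumed: a doubled backslash is a literal backslash
    "f": "\f",
    "n": "\n",
    "r": "\r",
    "s": " ",
    "t": "\t",
    "v": "\v",
    "e": "\x1b",
}
# Hexadecimal escapes and the number of hex digits each one expects.
HEX_LENGTHS = {"x": 4, "y": 5, "z": 8}


def decode_characters_operand(operand):
    """Expand escape sequences in a characters operand (illustrative only)."""
    out = []
    i = 0
    while i < len(operand):
        ch = operand[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(operand):
            raise ValueError("dangling backslash at end of operand")
        kind = operand[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind in HEX_LENGTHS:
            width = HEX_LENGTHS[kind]
            digits = operand[i + 2:i + 2 + width]
            if len(digits) != width or any(d not in string.hexdigits for d in digits):
                raise ValueError("expected %d hex digits after \\%s" % (width, kind))
            out.append(chr(int(digits, 16)))
            i += 2 + width
        else:
            raise ValueError("unknown escape sequence \\%s" % kind)
    return "".join(out)
```

<p>Under these assumptions, the operand <code>a\sb</code> decodes to the
letter a, a blank and the letter b, and <code>\x00e9</code> decodes to
the single character U+00E9.
</p>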
| |
| <p>The dots operand is a braille dot pattern. The real braille dots, 1 |
| through 8, must be specified with their standard numbers. liblouis |
| recognizes "virtual dots," which are used for special purposes, such |
| as distinguishing accent marks. There are seven virtual dots. They are |
| specified by the number 9 and the letters ‘<samp>a</samp>’ through ‘<samp>f</samp>’. |
| For a multi-cell dot pattern, the cell specifications must be |
| separated from one another by a dash (‘<samp>-</samp>’). For example, the |
| contraction for the English word ‘<samp>lord</samp>’ (the letter ‘<samp>l</samp>’ |
| preceded by dot 5) would be specified as 5-123. A space may be |
| specified with the special dot number 0. |
| </p> |
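<p>The dots operand syntax can be illustrated with a short Python sketch
that turns a pattern such as 5-123 into one bitmask per cell. The
function name <code>parse_dots_operand</code> and the bit layout (dot 1
in bit 0, dot 9 in bit 8, virtual dots a through f in bits 9 to 14) are
assumptions made for this example and are not taken from the liblouis
source.
</p>

```python
# Hypothetical bit layout: real dots 1-8 in bits 0-7, virtual dot 9 in
# bit 8, and virtual dots a-f in bits 9-14.
DOT_BITS = {str(n): 1 << (n - 1) for n in range(1, 10)}
DOT_BITS.update({letter: 1 << (9 + i) for i, letter in enumerate("abcdef")})


def parse_dots_operand(operand):
    """Parse a dots operand into a list of per-cell bitmasks (illustrative)."""
    cells = []
    # Cells of a multi-cell pattern are separated by dashes.
    for cell in operand.split("-"):
        if cell == "":
            raise ValueError("empty cell in dot pattern %r" % operand)
        # The special dot number 0 stands for a space, i.e. no dots.
        if cell == "0":
            cells.append(0)
            continue
        mask = 0
        for ch in cell:
            if ch not in DOT_BITS:
                raise ValueError("invalid dot %r in pattern %r" % (ch, operand))
            mask |= DOT_BITS[ch]
        cells.append(mask)
    return cells
```

<p>With this layout, <code>parse_dots_operand("5-123")</code> yields two
cells: one with only dot 5 set and one with dots 1, 2 and 3 set.
</p>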
| <p>An opcode which is helpful in writing translation tables is |
| <code>include</code>. Its format is: |
| </p> |
| <div class="example"> |
| <pre class="example">include filename |
| </pre></div> |
| |
| <p>It reads the file indicated by <code>filename</code> and incorporates or |
| includes its entries into the table. Included files can include other |
| files, which can include other files, etc. For an example, see what |
| files are included by the entry <code>include en-us-g1.ctb</code> in the table |
| <samp>en-us-g2.ctb</samp>. If the included file is not in the same directory |
| as the main table, use a full path name for filename. Tables can also be |
| specified in a table list, in which the table names are separated by |
| commas and given as a single table name in calls to the translation |
| functions. |
| </p> |
| <p>The order of the various types of opcodes or table entries is |
| important. Character-definition opcodes should come first. However, if |
| the optional <code>display</code> opcode (see <a href="#display-opcode"><code>display</code></a>) is used it should precede |
| character-definition opcodes. Braille-indicator opcodes should come |
| next. Translation opcodes should follow. The <code>context</code> opcode (see <a href="#context-opcode"><code>context</code></a>) is a |
| translation opcode, even though it is considered along with the |
| multipass opcodes. These latter should follow the translation opcodes. |
| The <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>) can be used anywhere after the |
| character-definition opcodes, but it is probably a good idea to group |
| all <code>correct</code> opcodes together. The <code>include</code> opcode (see <a href="#include-opcode"><code>include</code></a>) can be |
| used anywhere, but the order of entries in the combined table must |
| conform to the order given above. Within each type of opcode, the |
| order of entries is generally unimportant. Thus the translation |
| entries can be grouped alphabetically or in any other order that is |
| convenient. Hyphenation tables may be specified either with an |
| <code>include</code> opcode or as part of a table list. They should come after |
| everything else. Character-definition opcodes are necessary for |
| hyphenation tables to work. |
| </p> |
| |
| <hr> |
| <a name="Hyphenation-Tables"></a> |
| <a name="Hyphenation-Tables-1"></a> |
| <h3 class="section">3.1 Hyphenation Tables</h3> |
| |
| <p>Hyphenation tables are necessary to make opcodes such as the |
| <code>nocross</code> opcode (see <a href="#nocross-opcode"><code>nocross</code></a>) function properly. There are no opcodes for |
| hyphenation table entries because these tables have a special format. |
| Therefore, they cannot be specified as part of an ordinary table. |
| Rather, they must be included using the <code>include</code> opcode (see <a href="#include-opcode"><code>include</code></a>) or as part |
| of a table list. The liblouis hyphenation algorithm was adopted from the |
| one used by OpenOffice. Note that hyphenation tables must follow |
| character definitions and should preferably be the last. For an example |
| of a hyphenation table, see <samp>hyph_en_US.dic</samp>. |
| </p> |
| <hr> |
| <a name="Character_002dDefinition-Opcodes"></a> |
| <a name="Character_002dDefinition-Opcodes-1"></a> |
| <h3 class="section">3.2 Character-Definition Opcodes</h3> |
| |
| <p>These opcodes are needed to define attributes such as digit, |
| punctuation, letter, etc. for all characters and their dot patterns. |
| liblouis has no built-in character definitions, but such definitions |
| are essential to the operation of the <code>context</code> opcode (see <a href="#context-opcode"><code>context</code></a>), the |
| <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>), the multipass opcodes and the back-translator. If |
| the dot pattern is a single cell, it is used to define the mapping |
| between dot patterns and characters, unless a <code>display</code> opcode (see <a href="#display-opcode"><code>display</code></a>) for |
| that character-dot-pattern pair has been used previously. If only a |
| single-cell dot pattern has been given for a character, that dot |
| pattern is defined with the character’s own attributes. If more than |
| one cell is given and some of them have not previously been defined as |
| single cells, the undefined cells are entered into the dots table with |
| the space attribute. This is done for backward compatibility with |
| old tables, but it may cause problems with the above opcodes or |
| back-translation. For this reason, every single-cell dot pattern |
| should be defined before it is used in a multi-cell character |
| representation. The best way to do this is to use the 8-dot computer |
| braille representation for the particular braille code. If a character |
| or dot pattern used in any rule, except those with the <code>display</code> |
| opcode, the <code>repeated</code> opcode (see <a href="#repeated-opcode"><code>repeated</code></a>) or the <code>replace</code> opcode (see <a href="#replace-opcode"><code>replace</code></a>), is not |
| defined by one of the character-definition opcodes, liblouis will give |
| an error message and refuse to continue until the problem is fixed. If |
| the translator or back-translator encounters an undefined character in |
| its input it produces a succinct error indication in its output, and |
| the character is treated as a space. |
| </p> |
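| <p>As a minimal sketch of the ordering recommended above, the fragment |
| below defines each single cell before a multi-cell pattern uses it. The |
| dot assignments are illustrative assumptions and are not quoted from any |
| particular shipped table: |
| </p> |
| <div class="example"> |
| <pre class="example"># Single cells first, so each cell takes the attributes of its own character. |
| sign ` 4 |
| punctuation : 25 |
| uplow Pp 1234 |
| # Cells 4, 25 and 1234 are now defined, so none of them falls back to |
| # the space attribute when the multi-cell pattern below is read. |
| sign % 4-25-1234 |
| </pre></div> |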
| <dl compact="compact"> |
| <dd><a name="index-space"></a> |
| <a name="space-opcode"></a></dd> |
| <dt><code>space character dots</code></dt> |
| <dd><p>Defines a character as a space and also defines the dot pattern as |
| such. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">space \s 0 \s is the escape sequence for blank; 0 means no dots. |
| </pre></div> |
| |
| <a name="index-punctuation"></a> |
| <a name="punctuation-opcode"></a></dd> |
| <dt><code>punctuation character dots</code></dt> |
| <dd><p>Associates a punctuation mark in the particular language with a |
| braille representation and defines the character and dot pattern as |
| punctuation. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">punctuation . 46 dot pattern for period in NAB computer braille |
| </pre></div> |
| |
| <a name="index-digit"></a> |
| <a name="digit-opcode"></a></dd> |
| <dt><code>digit character dots</code></dt> |
| <dd><p>Associates a digit with a dot pattern and defines the character as a |
| digit. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">digit 0 356 NAB computer braille |
| </pre></div> |
| |
| <a name="index-uplow"></a> |
| <a name="uplow-opcode"></a></dd> |
| <dt><code>uplow characters dots [,dots]</code></dt> |
| <dd><p>The characters operand must be a pair of letters, of which the first |
| is uppercase and the second lowercase. The first dots suboperand |
| indicates the dot pattern for the upper-case letter. It may have more |
| than one cell. The second dots suboperand must be separated from the |
| first by a comma and is optional, as indicated by the square brackets. |
| If present, it indicates the dot pattern for the lower-case letter. It |
| may also have more than one cell. If the second dots suboperand is not |
| present the first is used for the lower-case letter as well as the |
| upper-case letter. This opcode is needed because not all languages |
| follow a consistent pattern in assigning Unicode codes to upper and |
| lower case letters. It should be used even for languages that do. The |
| distinction is important in the forward translator. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">uplow Aa 17,1 |
| </pre></div> |
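| <p>When the second dots suboperand is omitted, both cases share the first |
| pattern. A minimal sketch, with a dot pattern chosen only for illustration: |
| </p> |
| <div class="example"> |
| <pre class="example"># No second suboperand: B and b are both represented by dots 12. |
| uplow Bb 12 |
| </pre></div> |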
| |
| <a name="index-grouping"></a> |
| <a name="grouping-opcode"></a></dd> |
| <dt><code>grouping name characters dots,dots</code></dt> |
| <dd><p>This opcode is used to indicate pairs of grouping symbols used in |
| processing mathematical expressions. These symbols are usually |
| generated by the MathML interpreter in liblouisxml. They are used in |
| multipass opcodes. The name operand must contain only letters, but |
| they may be upper- or lower-case. The characters operand must contain |
| exactly two Unicode characters. The dots operand must contain exactly |
| two braille cells, separated by a comma. Note that grouping dot |
| patterns also need to be declared with the <code>exactdots</code> opcode (see <a href="#exactdots-opcode"><code>exactdots</code></a>). The |
| characters may need to be declared with the <code>math</code> opcode (see <a href="#math-opcode"><code>math</code></a>). |
| </p> |
| <div class="example"> |
| <pre class="example">grouping mrow \x0001\x0002 1e,2e |
| grouping mfrac \x0003\x0004 3e,4e |
| </pre></div> |
| |
| <a name="index-letter"></a> |
| <a name="letter-opcode"></a></dd> |
| <dt><code>letter character dots</code></dt> |
| <dd><p>Associates a letter in the language with a braille representation and |
| defines the character as a letter. This is intended for letters which |
| are neither uppercase nor lowercase. |
| </p> |
| <a name="index-lowercase"></a> |
| <a name="lowercase-opcode"></a></dd> |
| <dt><code>lowercase character dots</code></dt> |
| <dd><p>Associates a character with a dot pattern and defines the character as |
| a lowercase letter. Both the character and the dot pattern have the |
| attributes lowercase and letter. |
| </p> |
| <a name="index-uppercase"></a> |
| <a name="uppercase-opcode"></a></dd> |
| <dt><code>uppercase character dots</code></dt> |
| <dd><p>Associates a character with a dot pattern and defines the character as |
| an uppercase letter. Both the character and the dot pattern have the |
| attributes uppercase and letter. <code>lowercase</code> and <code>uppercase</code> |
| should be used when a letter has only one case. Otherwise use the |
| <code>uplow</code> opcode (see <a href="#uplow-opcode"><code>uplow</code></a>). |
| </p> |
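| <p>The <code>letter</code>, <code>lowercase</code> and <code>uppercase</code> opcodes all take |
| the same two operands. The entries below are only an illustrative sketch; |
| the characters and dot patterns are assumptions chosen for the example and |
| are not quoted from a shipped table: |
| </p> |
| <div class="example"> |
| <pre class="example"># Hebrew alef has no case, so it is declared with letter. |
| letter \x05d0 1 |
| # German sharp s, treated here as a letter with a lowercase form only. |
| lowercase \x00df 2346 |
| </pre></div> |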
| <a name="index-litdigit"></a> |
| <a name="litdigit-opcode"></a></dd> |
| <dt><code>litdigit digit dots</code></dt> |
| <dd><p>Associates a digit with the dot pattern which should be used to |
| represent it in literary texts. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">litdigit 0 245 |
| litdigit 1 1 |
| </pre></div> |
| |
| <a name="index-sign"></a> |
| <a name="sign-opcode"></a></dd> |
| <dt><code>sign character dots</code></dt> |
| <dd><p>Associates a character with a dot pattern and defines both as a sign. |
| This opcode should be used for symbols such as the at sign (‘<samp>@</samp>’), |
| percent (‘<samp>%</samp>’), dollar sign (‘<samp>$</samp>’), etc. Do not use it to |
| define ordinary punctuation such as period and comma. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">sign % 4-25-1234 literary percent sign |
| </pre></div> |
| |
| <a name="index-math"></a> |
| <a name="math-opcode"></a></dd> |
| <dt><code>math character dots</code></dt> |
| <dd><p>Associates a character and a dot pattern and defines them as a |
| mathematical symbol. It should be used for less than (‘<samp>&lt;</samp>’), |
| greater than (‘<samp>&gt;</samp>’), equals (‘<samp>=</samp>’), plus (‘<samp>+</samp>’), etc. For |
| example: |
| </p> |
| <div class="example"> |
| <pre class="example">math + 346 plus |
| </pre></div> |
| |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Braille-Indicator-Opcodes"></a> |
| <a name="Braille-Indicator-Opcodes-1"></a> |
| <h3 class="section">3.3 Braille Indicator Opcodes</h3> |
| |
| <p>Braille indicators are dot patterns which are inserted into the |
| braille text to indicate such things as capitalization, italic type, |
| computer braille, etc. The opcodes which define them are followed only |
| by a dot pattern, which may be one or more cells. |
| </p> |
| <dl compact="compact"> |
| <dd><a name="index-capsign"></a> |
| <a name="capsign-opcode"></a></dd> |
| <dt><code>capsign dots</code></dt> |
| <dd><p>The dot pattern which indicates capitalization of a single letter. In |
| English, this is dot 6. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">capsign 6 |
| </pre></div> |
| |
| <a name="index-begcaps"></a> |
| <a name="begcaps-opcode"></a></dd> |
| <dt><code>begcaps dots</code></dt> |
| <dd><p>The dot pattern which begins a block of capital letters. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">begcaps 6-6 |
| </pre></div> |
| |
| <a name="index-endcaps"></a> |
| <a name="endcaps-opcode"></a></dd> |
| <dt><code>endcaps dots</code></dt> |
| <dd><p>The dot pattern which ends a block of capital letters within a word. |
| For example: |
| </p> |
| <div class="example"> |
| <pre class="example">endcaps 6-3 |
| </pre></div> |
| |
| <a name="index-letsign"></a> |
| <a name="letsign-opcode"></a></dd> |
| <dt><code>letsign dots</code></dt> |
| <dd><p>This indicator is needed in Grade 2 to show that a single letter is |
| not a contraction. It is also used when an abbreviation happens to be |
| a sequence of letters that is the same as a contraction. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">letsign 56 |
| </pre></div> |
| |
| <a name="index-noletsign"></a> |
| <a name="noletsign-opcode"></a></dd> |
| <dt><code>noletsign letters</code></dt> |
| <dd> |
| <p>The letters in the operand will not be preceded by a letter sign. |
| More than one <code>noletsign</code> opcode can be used. This is equivalent |
| to a single entry containing all the letters. In addition, if a single |
| letter, such as ‘<samp>a</samp>’ in English, is defined as a <code>word</code> |
| (see <a href="#word-opcode"><code>word</code></a>) or <code>largesign</code> |
| (see <a href="#largesign-opcode"><code>largesign</code></a>), it will be |
| treated as though it had also been specified in a <code>noletsign</code> |
| entry. |
| </p> |
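| <p>As a small sketch of the equivalence described above (the letters are |
| chosen only for illustration), the two entries below have the same effect |
| as the single entry <code>noletsign aio</code>: |
| </p> |
| <div class="example"> |
| <pre class="example">noletsign a |
| noletsign io |
| </pre></div> |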
| <a name="index-noletsignbefore"></a> |
| <a name="noletsignbefore-opcode"></a></dd> |
| <dt><code>noletsignbefore characters</code></dt> |
| <dd><p>If any of the characters precedes a single letter without a space, a |
| letter sign is not used. By default the characters apostrophe |
| (‘<samp>'</samp>’) and period (‘<samp>.</samp>’) have this property. Use of a |
| <code>noletsignbefore</code> entry cancels the defaults. If more than one |
| <code>noletsignbefore</code> entry is used, the characters in all entries |
| are combined. |
| </p> |
| <a name="index-noletsignafter"></a> |
| <a name="noletsignafter-opcode"></a></dd> |
| <dt><code>noletsignafter characters</code></dt> |
| <dd><p>If any of the characters follows a single letter without a space, a |
| letter sign is not used. By default the characters apostrophe |
| (‘<samp>'</samp>’) and period (‘<samp>.</samp>’) have this property. Use of a |
| <code>noletsignafter</code> entry cancels the defaults. If more than one |
| <code>noletsignafter</code> entry is used, the characters in all entries are |
| combined. |
| </p> |
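| <p>Because an explicit entry cancels the defaults, a table that wants to |
| keep apostrophe and period must restate them. The following sketch adds |
| the hyphen to both lists; the choice of the hyphen is an assumption made |
| for illustration: |
| </p> |
| <div class="example"> |
| <pre class="example"># Defaults restated, hyphen added, all in one entry. |
| noletsignbefore '.- |
| # The same list built from two entries, which are combined. |
| noletsignafter '. |
| noletsignafter - |
| </pre></div> |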
| <a name="index-numsign"></a> |
| <a name="numsign-opcode"></a></dd> |
| <dt><code>numsign dots</code></dt> |
| <dd><p>The translator inserts this indicator before numbers made up of digits |
| defined with the <code>litdigit</code> opcode (see <a href="#litdigit-opcode"><code>litdigit</code></a>) to show that they are a number |
| and not letters or some other symbols. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">numsign 3456 |
| </pre></div> |
| |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Emphasis-Opcodes"></a> |
| <a name="Emphasis-Opcodes-1"></a> |
| <h3 class="section">3.4 Emphasis Opcodes</h3> |
| |
| <p>These also define braille indicators, but they require more |
| explanation. There are four sets, for italic, bold, underline and |
| computer braille. In each of the first three sets there are seven |
| opcodes, for use before the first word of a phrase, for use before the |
| last word, for use after the last word, for use before the first |
| letter (or character) if emphasis starts in the middle of a word, for |
| use after the last letter (or character) if emphasis ends in the |
| middle of a word, before a single letter (or character), and to |
| specify the length of a phrase to which the first-word and |
| last-word-before indicators apply. This rather elaborate set of |
| emphasis opcodes was devised to try to meet all contingencies. It is |
| unlikely that a translation table will contain all of them. The |
| translator checks for their presence. If they are present, it first |
| looks to see if the single-letter indicator should be used. Then it |
| looks at the word (or phrase) indicators and finally at the |
| multi-letter indicators. |
| </p> |
| <p>The translator will apply up to two emphasis indicators to each phrase |
| or string of characters, depending on what the <code>typeform</code> |
| parameter in its calling sequence indicates (see <a href="#Programming-with-liblouis">Programming with liblouis</a>). |
| </p> |
| <p>For computer braille there are only two braille indicators, for the |
| beginning and end of a sequence of characters to be rendered in |
| computer braille. Such a sequence may also have other emphasis. The |
| computer braille indicators are applied not only when computer braille |
| is indicated in the <code>typeform</code> parameter, but also when a |
| sequence of characters is determined to be computer braille because it |
| contains a subsequence defined by the <code>compbrl</code> opcode (see <a href="#compbrl-opcode"><code>compbrl</code></a>) or the |
| <code>literal</code> opcode (see <a href="#literal-opcode"><code>literal</code></a>). |
| </p> |
| <p>Here are the various emphasis opcodes. |
| </p> |
| <dl compact="compact"> |
| <dd> |
| <a name="index-firstwordital"></a> |
| <a name="firstwordital-opcode"></a></dd> |
| <dt><code>firstwordital dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first word of an |
| italicized phrase that is longer than the value given in the |
| <code>lenitalphrase</code> opcode (see <a href="#lenitalphrase-opcode"><code>lenitalphrase</code></a>). For example: |
| </p> |
| <div class="example"> |
| <pre class="example">firstwordital 46-46 English indicator |
| </pre></div> |
| |
| <a name="index-lastworditalbefore"></a> |
| <a name="lastworditalbefore-opcode"></a></dd> |
| <dt><code>lastworditalbefore dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the last word of an |
| italicized phrase. In addition, if <code>firstwordital</code> is not used, |
| this braille indicator is doubled and placed before the first word. Do |
| not use <code>lastworditalbefore</code> and <code>lastworditalafter</code> in the |
| same table. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">lastworditalbefore 4-6 |
| </pre></div> |
| |
| <a name="index-lastworditalafter"></a> |
| <a name="lastworditalafter-opcode"></a></dd> |
| <dt><code>lastworditalafter dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last word of an |
| italicized phrase. Do not use <code>lastworditalbefore</code> and |
| <code>lastworditalafter</code> in the same table. See also the |
| <code>lenitalphrase</code> opcode (see <a href="#lenitalphrase-opcode"><code>lenitalphrase</code></a>) for more information. |
| </p> |
| <a name="index-firstletterital"></a> |
| <a name="firstletterital-opcode"></a></dd> |
| <dt><code>firstletterital dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first letter (or |
| character) if italicization begins in the middle of a word. |
| </p> |
| <a name="index-lastletterital"></a> |
| <a name="lastletterital-opcode"></a></dd> |
| <dt><code>lastletterital dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last letter (or |
| character) when italicization ends in the middle of a word. |
| </p> |
| <a name="index-singleletterital"></a> |
| <a name="singleletterital-opcode"></a></dd> |
| <dt><code>singleletterital dots</code></dt> |
| <dd><p>This braille indicator is used if only a single letter (or character) |
| is italicized. |
| </p> |
| <a name="index-lenitalphrase"></a> |
| <a name="lenitalphrase-opcode"></a></dd> |
| <dt><code>lenitalphrase number</code></dt> |
| <dd><p>If <code>lastworditalbefore</code> is used, an italicized phrase is checked |
| to see how many words it contains. If this number is less than or |
| equal to the number given in the <code>lenitalphrase</code> opcode, the |
| <code>lastworditalbefore</code> sign is placed in front of each word. If it |
| is greater, the <code>firstwordital</code> indicator is placed before the |
| first word and the <code>lastworditalbefore</code> indicator is placed before |
| the last word. Note that if the <code>firstwordital</code> opcode is not |
| used, its indicator is made up by doubling the dot pattern given in the |
| <code>lastworditalbefore</code> entry. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">lenitalphrase 4 |
| </pre></div> |
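| <p>Taken together, a table might declare its italic indicators as in the |
| sketch below. The first two dot patterns and the phrase length repeat the |
| examples given for the individual opcodes; the single-letter pattern is a |
| hypothetical placeholder: |
| </p> |
| <div class="example"> |
| <pre class="example">firstwordital 46-46 |
| lastworditalbefore 4-6 |
| # Hypothetical pattern, for illustration only. |
| singleletterital 46 |
| # Phrases of up to four words are marked word by word; longer phrases |
| # are marked at their first and last words. |
| lenitalphrase 4 |
| </pre></div> |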
| |
| <a name="index-firstwordbold"></a> |
| <a name="firstwordbold-opcode"></a></dd> |
| <dt><code>firstwordbold dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first word of a |
| bold phrase. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">firstwordbold 456-456 |
| </pre></div> |
| |
| <a name="index-lastwordboldbefore"></a> |
| <a name="lastwordboldbefore-opcode"></a></dd> |
| <dt><code>lastwordboldbefore dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the last word of a |
| bold phrase. In addition, if <code>firstwordbold</code> is not used, this |
| braille indicator is doubled and placed before the first word. Do not |
| use <code>lastwordboldbefore</code> and <code>lastwordboldafter</code> in the same |
| table. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">lastwordboldbefore 456 |
| </pre></div> |
| |
| <a name="index-lastwordboldafter"></a> |
| <a name="lastwordboldafter-opcode"></a></dd> |
| <dt><code>lastwordboldafter dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last word of a |
| bold phrase. Do not use <code>lastwordboldbefore</code> and |
| <code>lastwordboldafter</code> in the same table. |
| </p> |
| <a name="index-firstletterbold"></a> |
| <a name="firstletterbold-opcode"></a></dd> |
| <dt><code>firstletterbold dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first letter (or |
| character) if bold emphasis begins in the middle of a word. |
| </p> |
| <a name="index-lastletterbold"></a> |
| <a name="lastletterbold-opcode"></a></dd> |
| <dt><code>lastletterbold dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last letter (or |
| character) when bold emphasis ends in the middle of a word. |
| </p> |
| <a name="index-singleletterbold"></a> |
| <a name="singleletterbold-opcode"></a></dd> |
| <dt><code>singleletterbold dots</code></dt> |
| <dd><p>This braille indicator is used if only a single letter (or character) |
| is in boldface. |
| </p> |
| <a name="index-lenboldphrase"></a> |
| <a name="lenboldphrase-opcode"></a></dd> |
| <dt><code>lenboldphrase number</code></dt> |
| <dd><p>If <code>lastwordboldbefore</code> is used, a bold phrase is checked to see |
| how many words it contains. If this number is less than or equal to |
| the number given in the <code>lenboldphrase</code> opcode, the |
| <code>lastwordboldbefore</code> sign is placed in front of each word. If it |
| is greater, the <code>firstwordbold</code> indicator is placed before the |
| first word and the <code>lastwordboldbefore</code> indicator is placed before |
| the last word. Note that if the <code>firstwordbold</code> opcode is not |
| used, its indicator is made up by doubling the dot pattern given in the |
| <code>lastwordboldbefore</code> entry. |
| </p> |
| <a name="index-firstwordunder"></a> |
| <a name="firstwordunder-opcode"></a></dd> |
| <dt><code>firstwordunder dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first word of an |
| underlined phrase. |
| </p> |
| <a name="index-lastwordunderbefore"></a> |
| <a name="lastwordunderbefore-opcode"></a></dd> |
| <dt><code>lastwordunderbefore dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the last word of an |
| underlined phrase. In addition, if <code>firstwordunder</code> is not used, |
| this braille indicator is doubled and placed before the first word. |
| </p> |
| <a name="index-lastwordunderafter"></a> |
| <a name="lastwordunderafter-opcode"></a></dd> |
| <dt><code>lastwordunderafter dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last word of an |
| underlined phrase. |
| </p> |
| <a name="index-firstletterunder"></a> |
| <a name="firstletterunder-opcode"></a></dd> |
| <dt><code>firstletterunder dots</code></dt> |
| <dd><p>This is the braille indicator to be placed before the first letter (or |
| character) if underline emphasis begins in the middle of a word. |
| </p> |
| <a name="index-lastletterunder"></a> |
| <a name="lastletterunder-opcode"></a></dd> |
| <dt><code>lastletterunder dots</code></dt> |
| <dd><p>This is the braille indicator to be placed after the last letter (or |
| character) when underline emphasis ends in the middle of a word. |
| </p> |
| <a name="index-singleletterunder"></a> |
| <a name="singleletterunder-opcode"></a></dd> |
| <dt><code>singleletterunder dots</code></dt> |
| <dd><p>This braille indicator is used if only a single letter (or character) |
| is underlined. |
| </p> |
| <a name="index-lenunderphrase"></a> |
| <a name="lenunderphrase-opcode"></a></dd> |
| <dt><code>lenunderphrase number</code></dt> |
| <dd><p>If <code>lastwordunderbefore</code> is used, an underlined phrase is checked |
| to see how many words it contains. If this number is less than or |
| equal to the number given in the <code>lenunderphrase</code> opcode, the |
| <code>lastwordunderbefore</code> sign is placed in front of each word. If it |
| is greater, the <code>firstwordunder</code> indicator is placed before the |
| first word and the <code>lastwordunderbefore</code> indicator is placed |
| before the last word. Note that if the <code>firstwordunder</code> opcode is |
| not used, its indicator is made up by doubling the dot pattern given in |
| the <code>lastwordunderbefore</code> entry. |
| </p> |
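| <p>The doubling rule can be sketched as follows. The dot pattern and the |
| phrase length are hypothetical placeholders, not values from a shipped |
| table: |
| </p> |
| <div class="example"> |
| <pre class="example"># No firstwordunder entry is given, so for phrases longer than three |
| # words the first-word indicator is formed by doubling: 456-456. |
| lastwordunderbefore 456 |
| lenunderphrase 3 |
| </pre></div> |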
| <a name="index-begcomp"></a> |
| <a name="begcomp-opcode"></a></dd> |
| <dt><code>begcomp dots</code></dt> |
| <dd><p>This braille indicator is placed before a sequence of characters |
| translated in computer braille, whether this sequence is indicated in |
| the <code>typeform</code> parameter (see <a href="#Programming-with-liblouis">Programming with liblouis</a>) or |
| inferred because it contains a subsequence specified by the |
| <code>compbrl</code> opcode (see <a href="#compbrl-opcode"><code>compbrl</code></a>). |
| </p> |
| <a name="index-endcomp"></a> |
| <a name="endcomp-opcode"></a></dd> |
| <dt><code>endcomp dots</code></dt> |
| <dd><p>This braille indicator is placed after a sequence of characters |
| translated in computer braille, whether this sequence is indicated in |
| the <code>typeform</code> parameter (see <a href="#Programming-with-liblouis">Programming with liblouis</a>) or |
| inferred because it contains a subsequence specified by the |
| <code>compbrl</code> opcode (see <a href="#compbrl-opcode"><code>compbrl</code></a>). |
| </p> |
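| <p>A table would normally define the two computer braille indicators as a |
| pair. In the sketch below the dot patterns are assumptions used only for |
| illustration: |
| </p> |
| <div class="example"> |
| <pre class="example"># Placed before the computer braille sequence. |
| begcomp 456-346 |
| # Placed after the computer braille sequence. |
| endcomp 456-156 |
| </pre></div> |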
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Special-Symbol-Opcodes"></a> |
| <a name="Special-Symbol-Opcodes-1"></a> |
| <h3 class="section">3.5 Special Symbol Opcodes</h3> |
| |
| <p>These opcodes define certain symbols, such as the decimal point, which |
| require special treatment. |
| </p> |
| <dl compact="compact"> |
| <dd><a name="index-decpoint"></a> |
| <a name="decpoint-opcode"></a></dd> |
| <dt><code>decpoint character dots</code></dt> |
| <dd><p>This opcode defines the decimal point. The character operand must have |
| only one character. For example, in <samp>en-us-g1.ctb</samp> we have: |
| </p> |
| <div class="example"> |
| <pre class="example">decpoint . 46 |
| </pre></div> |
| |
| <a name="index-hyphen"></a> |
| <a name="hyphen-opcode"></a></dd> |
| <dt><code>hyphen character dots</code></dt> |
| <dd><p>This opcode defines the hyphen, that is, the character used in |
| compound words such as have-nots. The back-translator uses it to |
| determine the end of individual words. |
| </p> |
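| <p>For example, assuming the hyphen is represented by dots 36 (the pattern |
| used for hyphens in the <code>repeated</code> example in this manual): |
| </p> |
| <div class="example"> |
| <pre class="example">hyphen - 36 |
| </pre></div> |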
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Special-Processing-Opcodes"></a> |
| <a name="Special-Processing-Opcodes-1"></a> |
| <h3 class="section">3.6 Special Processing Opcodes</h3> |
| |
| <p>These opcodes cause special processing to be carried out. |
| </p> |
| <dl compact="compact"> |
| <dd><a name="index-capsnocont"></a> |
| <a name="capsnocont-opcode"></a></dd> |
| <dt><code>capsnocont</code></dt> |
| <dd><p>This opcode has no operands. If it is specified, words or parts of |
| words in all caps are not contracted. This is needed for languages |
| such as Norwegian. |
| </p> |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Translation-Opcodes"></a> |
| <a name="Translation-Opcodes-1"></a> |
| <h3 class="section">3.7 Translation Opcodes</h3> |
| |
| <p>These opcodes define the braille representations for character |
| sequences. Each of them defines an entry within the contraction table. |
| These entries may be defined in any order except, as noted below, when |
| they define alternate representations for the same character sequence. |
| </p> |
| <p>Each of these opcodes specifies a condition under which the |
| translation is legal, and each also has a characters operand and a |
| dots operand. The text being translated is processed strictly from |
| left to right, character by character, with the most eligible entry |
| for each position being used. If there is more than one eligible entry |
| for a given position in the text, then the one with the longest |
| character string is used. If there is more than one eligible entry for |
| the same character string, then the one defined first is tested for |
| legality first. (This is the only case in which the order of the |
| entries makes a difference.) |
| </p> |
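| <p>The two selection rules can be sketched with a few hypothetical |
| entries. The dot patterns follow common English braille usage but are |
| assumptions made for illustration, not quotations from a shipped table: |
| </p> |
| <div class="example"> |
| <pre class="example"># Both entries are eligible at the start of "there"; the entry with the |
| # longer character string is used, so "there" is not split into "the" + "re". |
| always the 2346 |
| always there 5-2346 |
| # Same character string, two conditions: the word entry is defined first, |
| # so it is tested first; the always entry is the fallback inside longer words. |
| word his 236 |
| always his 125-24-234 |
| </pre></div> |
| <p>If the last two entries were swapped, the <code>always</code> entry would be |
| tested first and, being legal everywhere, would always be chosen. |
| </p> |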
| <p>The characters operand is a sequence or string of characters preceded |
| and followed by whitespace. Each character can be entered in the |
| normal way, or it can be defined as a four-digit hexadecimal number |
| preceded by ‘<samp>\x</samp>’. |
| </p> |
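| <p>As a minimal sketch, the entry below mixes ordinary characters with a |
| hexadecimal escape for e with acute accent (U+00E9). The dot pattern is an |
| assumption chosen for illustration: |
| </p> |
| <div class="example"> |
| <pre class="example">always caf\x00e9 14-1-124-123456 |
| </pre></div> |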
| <p>The dots operand defines the braille representation for the characters |
| operand. It may also be specified as an equals sign (‘<samp>=</samp>’). This |
| means that the default representation for each character |
| (see <a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a>) within the sequence is to be |
| used. Note however that the ‘<samp>=</samp>’ shortcut for dot patterns is |
| deprecated. Dot patterns should be written out. Otherwise |
| back-translation may not be correct. |
| </p> |
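| <p>The sketch below shows the deprecated shortcut next to the written-out |
| form that replaces it. The two lines are alternatives, not entries for the |
| same table, and the written-out pattern assumes that the table's character |
| definitions assign the usual English braille cells to these letters: |
| </p> |
| <div class="example"> |
| <pre class="example"># Deprecated: relies on the default representation of each character. |
| syllable horse = |
| # Preferred: the dot pattern is written out in full. |
| syllable horse 125-135-1235-234-15 |
| </pre></div> |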
| <p>In what follows the word ‘<samp>characters</samp>’ means a sequence of one or |
| more consecutive letters between spaces and/or punctuation marks. |
| </p> |
| <dl compact="compact"> |
| <dd> |
| <a name="index-noback"></a> |
| <a name="noback-opcode"></a></dd> |
| <dt><code>noback opcode ...</code></dt> |
| <dd><p>This is an opcode prefix, that is to say, it modifies the operation of |
| the opcode that follows it on the same line. <code>noback</code> specifies that no |
| back-translation is to be done using this line. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">noback always ;\s; 0 |
| </pre></div> |
| |
| <a name="index-nofor"></a> |
| <a name="nofor-opcode"></a></dd> |
| <dt><code>nofor opcode ...</code></dt> |
| <dd><p>This is an opcode prefix which modifies the operation of the opcode |
| following it on the same line. <code>nofor</code> specifies that forward translation |
| is not to use the information on this line. |
| </p> |
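| <p>The two prefixes are often used as a pair so that each direction gets |
| its own entry. The following is a hypothetical sketch; the character |
| strings and the dot pattern are assumptions made for illustration: |
| </p> |
| <div class="example"> |
| <pre class="example"># Forward translation only: the typed sequence (c) becomes these cells. |
| noback always (c) 2356-14-2356 |
| # Back-translation only: the same cells are read back as the copyright sign. |
| nofor always \x00a9 2356-14-2356 |
| </pre></div> |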
| <a name="index-compbrl"></a> |
| <a name="compbrl-opcode"></a></dd> |
| <dt><code>compbrl characters</code></dt> |
| <dd><p>If the characters are found within a block of text surrounded by |
| whitespace, the entire block is translated in computer braille. If 8-dot |
| computer braille is enabled, the default braille representations defined by |
| the <a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a> are used. If 6-dot computer braille is |
| enabled, the dot patterns given in the <code>comp6</code> opcode (see <a href="#comp6-opcode"><code>comp6</code></a>) are used. |
| For example: |
| </p> |
| <div class="example"> |
| <pre class="example">compbrl www translate URLs in computer braille |
| </pre></div> |
| |
| <a name="index-comp6"></a> |
| <a name="comp6-opcode"></a></dd> |
| <dt><code>comp6 character dots</code></dt> |
| <dd><p>This opcode specifies the translation of characters in 6-dot computer |
| braille. It is necessary because the translation of a single character |
| may require more than one cell. The first operand must be a character |
| with a decimal representation from 0 to 255 inclusive. The second |
| operand may specify as many cells as necessary. The opcode name is something |
| of a misnomer, since any dots, not just dots 1 through 6, can be |
| specified. This even includes virtual dots. |
| </p> |
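| <p>For instance, a character that needs two cells in 6-dot computer |
| braille could be declared as below. The dot pattern is a hypothetical |
| placeholder, not a value from a shipped table: |
| </p> |
| <div class="example"> |
| <pre class="example"># Left brace rendered with two cells in 6-dot computer braille. |
| comp6 { 456-246 |
| </pre></div> |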
| <a name="index-nocont"></a> |
| <a name="nocont-opcode"></a></dd> |
| <dt><code>nocont characters</code></dt> |
| <dd><p>Like <code>compbrl</code>, except that the string is uncontracted. Rules |
| defined with the <code>prepunc</code> opcode (see <a href="#prepunc-opcode"><code>prepunc</code></a>) and the <code>postpunc</code> opcode (see <a href="#postpunc-opcode"><code>postpunc</code></a>) are still applied, |
| however. This is useful for specifying that foreign words should not |
| be contracted in an entire document. |
| </p> |
| <a name="index-replace"></a> |
| <a name="replace-opcode"></a></dd> |
| <dt><code>replace characters {characters}</code></dt> |
| <dd><p>Replace the first set of characters, no matter where they appear, with |
| the second. Note that the second operand is <em>NOT</em> a dot pattern. |
| It is also optional. If it is omitted the character(s) in the first |
| operand will be discarded. This is useful for ignoring characters. It |
| is possible that the "ignored" characters may still affect the |
| translation indirectly. Therefore, it is preferable to use the |
| <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>). |
| </p> |
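| <p>Both forms of the opcode are sketched below; the characters are chosen |
| only for illustration. Comments are placed on their own lines because, with |
| an optional second operand, trailing text on a one-operand entry could be |
| read as the replacement: |
| </p> |
| <div class="example"> |
| <pre class="example"># Second operand omitted: soft hyphens (U+00AD) are discarded. |
| replace \x00ad |
| # Right single quotation mark (U+2019) replaced by a plain apostrophe. |
| replace \x2019 ' |
| </pre></div> |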
| <a name="index-always"></a> |
| <a name="always-opcode"></a></dd> |
| <dt><code>always characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern no matter where they |
| appear. Do <em>NOT</em> use an entry such as <code>always a 1</code>. Use the |
| <code>uplow</code>, <code>letter</code>, etc. character definition opcodes |
| instead. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">always world 456-2456 unconditional translation |
| </pre></div> |
| |
| <a name="index-repeated"></a> |
| <a name="repeated-opcode"></a></dd> |
| <dt><code>repeated characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern no matter where they |
| appear. Ignore any consecutive repetitions of the same character |
| sequence. This is useful for shortening long strings of spaces or |
| hyphens or periods. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">repeated --- 36-36-36 shorten separator lines made with hyphens |
| </pre></div> |
| |
| <a name="index-repword"></a> |
| <a name="repword-opcode"></a></dd> |
| <dt><code>repword characters dots</code></dt> |
| <dd><p>When characters are encountered, check to see if the word before this |
| string matches the word after it. If so, replace characters with dots |
| and eliminate the second word, along with any identical word that follows |
| another occurrence of characters. This opcode is used in Malaysian |
| braille. In this case the rule is: |
| </p> |
| <div class="example"> |
| <pre class="example">repword - 123456 |
| </pre></div> |
| |
| <a name="index-largesign"></a> |
| <a name="largesign-opcode"></a></dd> |
| <dt><code>largesign characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern no matter where they |
| appear. In addition, if two words defined as large signs follow each |
| other, remove the space between them. For example, in |
| <samp>en-us-g2.ctb</samp> the words ‘<samp>and</samp>’ and ‘<samp>the</samp>’ are both |
| defined as large signs. Thus, in the phrase ‘<samp>the cat and the dog</samp>’ |
| the space would be deleted between ‘<samp>and</samp>’ and ‘<samp>the</samp>’, with the |
| result ‘<samp>the cat andthe dog</samp>’. Of course, ‘<samp>and</samp>’ and ‘<samp>the</samp>’ |
| would be properly contracted. The term <code>largesign</code> is a bit of |
| braille jargon that pleases braille experts. |
| </p> |
| <a name="index-word"></a> |
| <a name="word-opcode"></a></dd> |
| <dt><code>word characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are a word, that |
| is, are surrounded by whitespace and/or punctuation. |
| </p> |
| <a name="index-syllable"></a> |
| <a name="syllable-opcode"></a></dd> |
| <dt><code>syllable characters dots</code></dt> |
| <dd><p>As its name indicates, this opcode defines a "syllable" which must be |
| represented by exactly the dot patterns given. Contractions may not |
| cross the boundaries of this "syllable" either from left or right. The |
| character string defined by this opcode need not be a lexical |
| syllable, though it usually will be. The equal sign in the following |
| example means that the default representation for each character |
| within the sequence is to be used (see <a href="#Translation-Opcodes">Translation Opcodes</a>): |
| </p> |
| <div class="example"> |
| <pre class="example">syllable horse = sawhorse, horseradish |
| </pre></div> |
| |
| <a name="index-nocross"></a> |
| <a name="nocross-opcode"></a></dd> |
| <dt><code>nocross characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if the characters are all |
| in one syllable (do not cross a syllable boundary). For this opcode to |
| work, a hyphenation table must be included. If this is not done, |
| <code>nocross</code> behaves like the <code>always</code> opcode (see <a href="#always-opcode"><code>always</code></a>). For example, if |
| the English Grade 2 table is being used and the appropriate |
| hyphenation table has been included, <code>nocross sh 146</code> will cause |
| the ‘<samp>sh</samp>’ in ‘<samp>monkshood</samp>’ not to be contracted. |
| </p> |
| <a name="index-joinword"></a> |
| <a name="joinword-opcode"></a></dd> |
| <dt><code>joinword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are a word which |
| is followed by whitespace and a letter. In addition remove the |
| whitespace. For example, <samp>en-us-g2.ctb</samp> has <code>joinword to |
| 235</code>. This means that if the word ‘<samp>to</samp>’ is followed by another |
| word the contraction is to be used and the space is to be omitted. If |
| these conditions are not met, the word is translated according to any |
| other opcodes that may apply to it. |
| </p> |
| <a name="index-lowword"></a> |
| <a name="lowword-opcode"></a></dd> |
| <dt><code>lowword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are a word |
| preceded and followed by whitespace. No punctuation either before or |
| after the word is allowed. The term <code>lowword</code> derives from the |
| fact that in English these contractions are written in the lower part |
| of the cell. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">lowword were 2356 |
| </pre></div> |
| |
| <a name="index-contraction"></a> |
| <a name="contraction-opcode"></a></dd> |
| <dt><code>contraction characters</code></dt> |
| <dd><p>If you look at <samp>en-us-g2.ctb</samp> you will see that some words are |
| actually contracted into some of their own letters. A famous example |
| among braille transcribers is ‘<samp>also</samp>’, which is contracted as |
| ‘<samp>al</samp>’. But this is also the name of a person. To take another |
| example, ‘<samp>altogether</samp>’ is contracted as ‘<samp>alt</samp>’, but this is |
| the abbreviation for the alternate key on a computer keyboard. |
| Similarly ‘<samp>could</samp>’ is contracted into ‘<samp>cd</samp>’, but this is the |
| abbreviation for compact disk. To prevent confusion in such cases, the |
| letter sign (see the <a href="#letsign-opcode"><code>letsign</code></a> opcode) is placed before such letter |
| combinations when they actually are abbreviations, not contractions. |
| The <code>contraction</code> opcode tells the translator to do this. |
| </p> |
| <a name="index-sufword"></a> |
| <a name="sufword-opcode"></a></dd> |
| <dt><code>sufword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are either a word |
| or at the beginning of a word. |
| </p> |
| <a name="index-prfword"></a> |
| <a name="prfword-opcode"></a></dd> |
| <dt><code>prfword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are either a word |
| or at the end of a word. |
| </p> |
| <a name="index-begword"></a> |
| <a name="begword-opcode"></a></dd> |
| <dt><code>begword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are at the |
| beginning of a word. |
| </p> |
| <a name="index-begmidword"></a> |
| <a name="begmidword-opcode"></a></dd> |
| <dt><code>begmidword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are either at the |
| beginning or in the middle of a word. |
| </p> |
| <a name="index-midword"></a> |
| <a name="midword-opcode"></a></dd> |
| <dt><code>midword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are in the middle |
| of a word. |
| </p> |
| <a name="index-midendword"></a> |
| <a name="midendword-opcode"></a></dd> |
| <dt><code>midendword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are either in the |
| middle or at the end of a word. |
| </p> |
| <a name="index-endword"></a> |
| <a name="endword-opcode"></a></dd> |
| <dt><code>endword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are at the end of |
| a word. |
| </p> |
| <a name="index-partword"></a> |
| <a name="partword-opcode"></a></dd> |
| <dt><code>partword characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if the characters are |
| anywhere in a word, that is, if they are preceded or followed by a |
| letter. |
| </p> |
| <a name="index-exactdots"></a> |
| <a name="exactdots-opcode"></a></dd> |
| <dt><code>exactdots @dots</code></dt> |
| <dd><p>Note that the operand must begin with an at sign (‘<samp>@</samp>’). The dot |
| pattern following it is evaluated for validity. If it is valid, |
| whenever an at sign followed by this dot pattern appears in the source |
| document it is replaced by the characters corresponding to the dot |
| pattern in the output. This opcode is intended for use in liblouisxml |
| semantic-action files to specify exact dot patterns, as in |
| mathematical codes. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">exactdots @4-46-12356 |
| </pre></div> |
| <p>will produce the characters with these dot patterns in the output. |
| </p> |
| <a name="index-prepunc"></a> |
| <a name="prepunc-opcode"></a></dd> |
| <dt><code>prepunc characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are part of |
| punctuation at the beginning of a word. |
| </p> |
| <a name="index-postpunc"></a> |
| <a name="postpunc-opcode"></a></dd> |
| <dt><code>postpunc characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are part of |
| punctuation at the end of a word. |
| </p> |
| <a name="index-begnum"></a> |
| <a name="begnum-opcode"></a></dd> |
| <dt><code>begnum characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are at the |
| beginning of a number, that is, before all its digits. For example, in |
| <samp>en-us-g1.ctb</samp> we have <code>begnum # 4</code>. |
| </p> |
| <a name="index-midnum"></a> |
| <a name="midnum-opcode"></a></dd> |
| <dt><code>midnum characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are in the middle |
| of a number. For example, <samp>en-us-g1.ctb</samp> has <code>midnum . 46</code>. |
| This is because the decimal point has a different dot pattern than the |
| period. |
| </p> |
| <a name="index-endnum"></a> |
| <a name="endnum-opcode"></a></dd> |
| <dt><code>endnum characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern if they are at the end of |
| a number. For example, <samp>en-us-g1.ctb</samp> has <code>endnum th 1456</code>. |
| This handles things like ‘<samp>4th</samp>’. A letter sign is <em>NOT</em> |
| inserted. |
| </p> |
| <a name="index-joinnum"></a> |
| <a name="joinnum-opcode"></a></dd> |
| <dt><code>joinnum characters dots</code></dt> |
| <dd><p>Replace the characters with the dot pattern. In addition, if |
| whitespace and a number follow, omit the whitespace. This opcode can |
| be used to join currency symbols to numbers, for example: |
| </p> |
| <div class="example"> |
| <pre class="example">joinnum \x20AC 15 (EURO SIGN) |
| joinnum \x0024 145 (DOLLAR SIGN) |
| joinnum \x00A3 1234 (POUND SIGN) |
| joinnum \x00A5 13456 (YEN SIGN) |
| </pre></div> |
| |
| </dd> |
| </dl> |
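| <p>The positional opcodes above (<code>word</code>, <code>sufword</code>, |
| <code>prfword</code>, <code>begword</code>, <code>begmidword</code>, <code>midword</code>, |
| <code>midendword</code>, <code>endword</code> and <code>partword</code>) differ only in |
| where the matched characters sit relative to neighbouring letters. The |
| following Python sketch is not liblouis code. It is a simplified model |
| which assumes that a letter is anything for which <code>str.isalpha()</code> |
| is true and that the edges of the string count as word boundaries. The |
| function name <code>position_opcodes</code> is hypothetical. |
| </p> |

```python
def position_opcodes(text, start, length):
    """Return the positional opcodes whose condition holds for
    text[start:start + length].

    str.isalpha() stands in for the table's letter class, and string
    edges are treated as word boundaries (both are assumptions).
    """
    end = start + length
    letter_before = start > 0 and text[start - 1].isalpha()
    letter_after = end < len(text) and text[end].isalpha()

    if not letter_before and not letter_after:
        # A whole word: no letter on either side.
        return {"word", "sufword", "prfword"}
    if not letter_before:
        # Beginning of a longer word.
        return {"sufword", "begword", "begmidword", "partword"}
    if not letter_after:
        # End of a longer word.
        return {"prfword", "endword", "midendword", "partword"}
    # Letters on both sides: middle of a word.
    return {"begmidword", "midword", "midendword", "partword"}


if __name__ == "__main__":
    print(sorted(position_opcodes("the cat", 0, 3)))  # "the" as a word
    print(sorted(position_opcodes("other", 1, 2)))    # "th" inside "other"
```

| <p>Opcodes with extra conditions, such as <code>lowword</code> and |
| <code>joinword</code>, are deliberately left out of this model. |
| </p> |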
| |
| <hr> |
| <a name="Character_002dClass-Opcodes"></a> |
| <a name="Character_002dClass-Opcodes-1"></a> |
| <h3 class="section">3.8 Character-Class Opcodes</h3> |
| |
| <p>These opcodes define and use character classes. A character class |
| associates a set of characters with a name. The name then refers to |
| any character within the class. A character may belong to more than |
| one class. |
| </p> |
| <p>The basic character classes correspond to the character definition |
| opcodes, with the exception of the <code>uplow</code> opcode (see <a href="#uplow-opcode"><code>uplow</code></a>), which defines |
| characters belonging to the two classes <code>uppercase</code> and |
| <code>lowercase</code>. These classes are: |
| </p> |
| <dl compact="compact"> |
| <dt><code>space</code></dt> |
| <dd><p>Whitespace characters such as blank and tab |
| </p></dd> |
| <dt><code>digit</code></dt> |
| <dd><p>Numeric characters |
| </p></dd> |
| <dt><code>letter</code></dt> |
| <dd><p>Both uppercase and lowercase alphabetic characters |
| </p></dd> |
| <dt><code>lowercase</code></dt> |
| <dd><p>Lowercase alphabetic characters |
| </p></dd> |
| <dt><code>uppercase</code></dt> |
| <dd><p>Uppercase alphabetic characters |
| </p></dd> |
| <dt><code>punctuation</code></dt> |
| <dd><p>Punctuation marks |
| </p></dd> |
| <dt><code>sign</code></dt> |
| <dd><p>Signs such as percent (‘<samp>%</samp>’) |
| </p></dd> |
| <dt><code>math</code></dt> |
| <dd><p>Mathematical symbols |
| </p></dd> |
| <dt><code>litdigit</code></dt> |
| <dd><p>Literary digit |
| </p></dd> |
| <dt><code>undefined</code></dt> |
| <dd><p>Not properly defined |
| </p> |
| </dd> |
| </dl> |
| |
| <p>The opcodes which define and use character classes are shown below. |
| For examples see <samp>fr-abrege.ctb</samp>. |
| </p> |
| <dl compact="compact"> |
| <dd> |
| <a name="index-class"></a> |
| <a name="class-opcode"></a></dd> |
| <dt><code>class name characters</code></dt> |
| <dd><p>Define a new character class. The characters operand must be specified |
| as a string. A character class may not be used until it has been |
| defined. |
| </p> |
| <a name="index-after"></a> |
| <a name="after-opcode"></a></dd> |
| <dt><code>after class opcode ...</code></dt> |
| <dd><p>The specified opcode is further constrained in that the matched |
| character sequence must be immediately preceded by a character |
| belonging to the specified class. If this opcode is used more than |
| once on the same line then the union of the characters in all the |
| classes is used. |
| </p> |
| <a name="index-before"></a> |
| <a name="before-opcode"></a></dd> |
| <dt><code>before class opcode ...</code></dt> |
| <dd><p>The specified opcode is further constrained in that the matched |
| character sequence must be immediately followed by a character |
| belonging to the specified class. If this opcode is used more than |
| once on the same line then the union of the characters in all the |
| classes is used. |
| </p> |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Swap-Opcodes"></a> |
| <a name="Swap-Opcodes-1"></a> |
| <h3 class="section">3.9 Swap Opcodes</h3> |
| |
| <p>The swap opcodes are needed to tell the <code>context</code> opcode (see <a href="#context-opcode"><code>context</code></a>), the |
| <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>) and multipass opcodes which dot patterns to swap |
| for which characters. There are three: <code>swapcd</code>, <code>swapdd</code> |
| and <code>swapcc</code>. The first swaps dot patterns for characters. The |
| second swaps dot patterns for dot patterns and the third swaps |
| characters for characters. The first is used in the <code>context</code> |
| opcode and the second is used in the multipass opcodes. Dot patterns |
| are separated by commas and may contain more than one cell. |
| </p> |
| <dl compact="compact"> |
| <dd> |
| <a name="index-swapcd"></a> |
| <a name="swapcd-opcode"></a></dd> |
| <dt><code>swapcd name characters dots, dots, dots, ...</code></dt> |
| <dd><p>See above paragraph for explanation. For example: |
| </p> |
| <div class="example"> |
| <pre class="example">swapcd dropped 0123456789 356,2,23,... |
| </pre></div> |
| |
| <a name="index-swapdd"></a> |
| <a name="swapdd-opcode"></a></dd> |
| <dt><code>swapdd name dots, dots, dots ... dotpattern1, dotpattern2, dotpattern3, ...</code></dt> |
| <dd><p>The <code>swapdd</code> opcode defines substitutions for the multipass |
| opcodes. In the second operand the dot patterns must be single cells, |
| but in the third operand multi-cell dot patterns are allowed. This is |
| because multi-cell patterns in the second operand would lead to |
| ambiguities. |
| </p> |
| <a name="index-swapcc"></a> |
| <a name="swapcc-opcode"></a></dd> |
| <dt><code>swapcc name characters characters</code></dt> |
| <dd><p>The <code>swapcc</code> opcode swaps characters in its second operand for |
| characters in the corresponding places in its third operand. It is |
| intended for use with <code>correct</code> opcodes and can solve problems |
| such as formatting phone numbers. |
| </p> |
| </dd> |
| </dl> |
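| <p>The position-by-position correspondence described for the swap |
| opcodes can be pictured with a short Python sketch. This is an |
| illustration only, not how liblouis stores swap sets. The helper names |
| <code>make_swapcc</code> and <code>make_swapcd</code> and the sample operands are |
| hypothetical, and requiring operands of equal length is an assumption |
| of the sketch. |
| </p> |

```python
def make_swapcc(characters, replacements):
    """Map each character to the one at the same position in the
    replacement operand (characters for characters)."""
    if len(characters) != len(replacements):
        raise ValueError("operands must have the same length in this sketch")
    return str.maketrans(characters, replacements)


def make_swapcd(characters, dots):
    """Map each character to the comma-separated dot pattern at the
    same position (dot patterns for characters)."""
    patterns = dots.split(",")
    if len(characters) != len(patterns):
        raise ValueError("need exactly one dot pattern per character")
    return dict(zip(characters, patterns))


if __name__ == "__main__":
    # Hypothetical swap set: digits 0-9 become the letters j, a-i.
    digits_to_letters = make_swapcc("0123456789", "jabcdefghi")
    print("2014".translate(digits_to_letters))

    # Hypothetical two-entry set; each pattern may span several cells.
    print(make_swapcd("ab", "1,12"))
```

| <p>In a real table the mapping is applied only to the characters selected |
| by the test part of a <code>context</code>, <code>correct</code> or multipass entry, |
| not to the whole input as in this sketch. |
| </p> |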
| |
| <hr> |
| <a name="The-Context-and-Multipass-Opcodes"></a> |
| <a name="The-Context-and-Multipass-Opcodes-1"></a> |
| <h3 class="section">3.10 The Context and Multipass Opcodes</h3> |
| |
| <p>The <code>context</code> and multipass opcodes (<code>pass2</code>, <code>pass3</code> |
| and <code>pass4</code>) provide translation capabilities beyond those of the |
| basic translation opcodes (see <a href="#Translation-Opcodes">Translation Opcodes</a>) discussed |
| previously. The multipass opcodes cause additional passes to be made |
| over the string to be translated. The number after the word |
| <code>pass</code> indicates in which pass the entry is to be applied. If no |
| multipass opcodes are given, only the first translation pass is made. |
| The <code>context</code> opcode is basically a multipass opcode for the |
| first pass. It differs slightly from the multipass opcodes per se. The |
| format of all these opcodes is <code>opcode test action</code>. The specific |
| opcodes are invoked as follows: |
| </p> |
| <dl compact="compact"> |
| <dd><a name="context-opcode"></a><a name="index-context"></a> |
| <a name="index-pass2"></a> |
| <a name="index-pass3"></a> |
| <a name="index-pass4"></a> |
| </dd> |
| <dt><code>context test action</code></dt> |
| <dt><code>pass2 test action</code></dt> |
| <dt><code>pass3 test action</code></dt> |
| <dt><code>pass4 test action</code></dt> |
| </dl> |
| |
| <p>The <code>test</code> and <code>action</code> operands have suboperands. Each |
| suboperand begins with a non-alphanumeric character and ends when |
| another non-alphanumeric character is encountered. The suboperands and |
| their initial characters are as follows. |
| </p> |
| <dl compact="compact"> |
| <dt><kbd>" (double quote)</kbd></dt> |
| <dd><p>a string of characters. This string must be terminated by another |
| double quote. It may contain any characters. If a double quote is |
| needed within the string, it must be preceded by a backslash |
| (‘<samp>\</samp>’). If a space is needed, it must be represented by the escape |
| sequence \s. This suboperand is valid only in the test part of the |
| <code>context</code> opcode. |
| </p> |
| </dd> |
| <dt><kbd>@ (at sign)</kbd></dt> |
| <dd><p>a sequence of dot patterns. Cells are separated by hyphens as usual. |
| This suboperand is not valid in the test part of the context and |
| correct opcodes. |
| </p> |
| </dd> |
| <dt><kbd>` (accent mark)</kbd></dt> |
| <dd><p>If this is the beginning of the string being translated this suboperand |
| is true. It is valid only in the test part and must be the first thing |
| in this operand. |
| </p> |
| </dd> |
| <dt><kbd>~ (tilde)</kbd></dt> |
| <dd><p>If this is the end of the string being translated this suboperand is |
| true. It is valid only in the test part and must be the last thing in |
| this operand. |
| </p> |
| </dd> |
| <dt><kbd>$ (dollar sign)</kbd></dt> |
| <dd><p>a string of attributes, such as ‘<samp>d</samp>’ for digit, ‘<samp>l</samp>’ for |
| letter, etc. More than one attribute can be given. If you wish to |
| check characters with any attribute, use the letter ‘<samp>a</samp>’. Input |
| characters are checked to see if they have at least one of the |
| attributes. The attribute string can be followed by numbers specifying |
| how many characters are to be checked. If no numbers are given, 1 is |
| assumed. If two numbers separated by a hyphen are given, the input is |
| checked to make sure that at least the first number of characters with |
| the attributes are present, but no more than the second number. If |
| only one number is present, then exactly that many characters must |
| have the attributes. A period instead of the numbers indicates an |
| indefinite number of characters (for technical reasons the number of |
| characters that are actually matched is limited to 65535). |
| </p> |
| <p>This suboperand is valid in all test parts but not in action parts. |
| For the characters which can be used in attribute strings, see the |
| following table. |
| </p> |
| </dd> |
| <dt><kbd>! (exclamation point)</kbd></dt> |
| <dd><p>reverses the logical meaning of the suboperand which follows. For |
| example, !$d is true only if the character is <em>NOT</em> a digit. This |
| suboperand is valid in test parts only. |
| </p> |
| </dd> |
| <dt><kbd>% (percent sign)</kbd></dt> |
| <dd><p>the name of a class defined by the <code>class</code> opcode (see <a href="#class-opcode"><code>class</code></a>) or the name of a |
| swap set defined by the swap opcodes (see <a href="#Swap-Opcodes">Swap Opcodes</a>). Names may |
| contain only letters. The letters may be upper or |
| lower-case. The case matters. Class names may be used in test parts |
| only. Swap names are valid everywhere. |
| </p> |
| </dd> |
| <dt><kbd>{ (left brace)</kbd></dt> |
| <dd><p>Name: the name of a grouping pair. The left brace indicates that the |
| first (or left) member of the pair is to be used in matching. If this is |
| between replacement brackets it must be the only item. This is also |
| valid in the action part. |
| </p> |
| </dd> |
| <dt><kbd>} (right brace)</kbd></dt> |
| <dd><p>Name: the name of a grouping pair. The right brace indicates that the |
| second (or right) member is to be used in matching. See the remarks on |
| the left brace immediately above. |
| </p> |
| </dd> |
| <dt><kbd>/ (slash)</kbd></dt> |
| <dd><p>Search the input for the expression following the slash and return true |
| if found. This can be used to set a variable. |
| </p> |
| </dd> |
| <dt><kbd>_ (underscore)</kbd></dt> |
| <dd><p>Move backward. If a number follows, move backward that number of |
| characters. The program never moves backward beyond the beginning of |
| the input string. This suboperand is valid only in test parts. |
| </p> |
| </dd> |
| <dt><kbd>[ (left bracket)</kbd></dt> |
| <dd><p>start replacement here. This suboperand must always be paired with a |
| right bracket and is valid only in test parts. Multiple pairs of |
| square brackets in a single expression are not allowed. |
| </p> |
| </dd> |
| <dt><kbd>] (right bracket)</kbd></dt> |
| <dd><p>end replacement here. This suboperand must always be paired with a |
| left bracket and is valid only in test parts. |
| </p> |
| </dd> |
| <dt><kbd># (number sign or crosshatch)</kbd></dt> |
| <dd><p>test or set a variable. Variables are referred to by numbers 1 to 50, |
| for example, <code>#1</code>, <code>#2</code>, <code>#25</code>. Variables may be set by |
| one <code>context</code> or multipass opcode and tested by another. Thus, an |
| operation that occurs at one place in a translation can tell an |
| operation that occurs later about itself. This feature will be used in |
| math translation, and it may also help to alleviate the need for new |
| opcodes. This suboperand is valid everywhere. |
| </p> |
| <p>Variables are set in the action part. To set a variable use an |
| expression like <code>#1=1</code>, <code>#2=5</code>, etc. Variables are also |
| incremented and decremented in the action part with expressions like |
| <code>#1+</code>, <code>#3-</code>, etc. These operators increment or decrement |
| the variable by 1. |
| </p> |
| <p>Variables are tested in the test part with expressions like |
| <code>#1=2</code>, <code>#3<4</code>, <code>#5>6</code>, etc. |
| </p> |
| </dd> |
| <dt><kbd>* (asterisk)</kbd></dt> |
| <dd><p>Copy the characters or dot patterns in the input within the |
| replacement brackets into the output and discard anything else that |
| may match. This feature is used, for example, for handling numeric |
| subscripts in Nemeth. This suboperand is valid only in action parts. |
| </p> |
| </dd> |
| <dt><kbd>? (question mark)</kbd></dt> |
| <dd><p>Valid only in the action part. The characters to be replaced are |
| simply ignored. That is, they are replaced with nothing. If either |
| member of a grouping pair is in the replace brackets the other member at |
| the same level is also removed. |
| </p> |
| </dd> |
| </dl> |
| |
| <p>The characters which can be used in attribute strings are as follows: |
| </p> |
| <dl compact="compact"> |
| <dt><kbd>a</kbd></dt> |
| <dd><p>any attribute |
| </p></dd> |
| <dt><kbd>d</kbd></dt> |
| <dd><p>digit |
| </p></dd> |
| <dt><kbd>D</kbd></dt> |
| <dd><p>literary digit |
| </p></dd> |
| <dt><kbd>l</kbd></dt> |
| <dd><p>letter |
| </p></dd> |
| <dt><kbd>m</kbd></dt> |
| <dd><p>math |
| </p></dd> |
| <dt><kbd>p</kbd></dt> |
| <dd><p>punctuation |
| </p></dd> |
| <dt><kbd>S</kbd></dt> |
| <dd><p>sign |
| </p></dd> |
| <dt><kbd>s</kbd></dt> |
| <dd><p>space |
| </p></dd> |
| <dt><kbd>U</kbd></dt> |
| <dd><p>uppercase |
| </p></dd> |
| <dt><kbd>u</kbd></dt> |
| <dd><p>lowercase |
| </p></dd> |
| <dt><kbd>w</kbd></dt> |
| <dd><p>first user-defined class |
| </p></dd> |
| <dt><kbd>x</kbd></dt> |
| <dd><p>second user-defined class |
| </p></dd> |
| <dt><kbd>y</kbd></dt> |
| <dd><p>third user-defined class |
| </p></dd> |
| <dt><kbd>z</kbd></dt> |
| <dd><p>fourth user-defined class |
| </p></dd> |
| </dl> |
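| <p>The counting rules for the <kbd>$</kbd> suboperand (no number, one number, |
| a range, or a period) can be summarised in a small Python sketch. It is |
| a model of the description above, not the liblouis implementation. The |
| attribute predicates are stand-ins built on Python's own character |
| tests, only a few attribute letters are modelled, and treating the |
| period as "one or more" is an assumption. The function names are |
| hypothetical. |
| </p> |

```python
import re

MAX_RUN = 65535  # upper limit on matched characters mentioned in the text

# Stand-in attribute tests. A real table defines these classes through
# its character-definition opcodes, so these predicates are assumptions.
ATTRIBUTES = {
    "a": lambda ch: True,  # any attribute
    "d": str.isdigit,
    "l": str.isalpha,
    "s": str.isspace,
    "U": str.isupper,
    "u": str.islower,
}


def parse_attribute_suboperand(suboperand):
    """Split e.g. '$d2-4' into ('d', 2, 4)."""
    m = re.fullmatch(r"\$([A-Za-z]+)(?:(\d+)(?:-(\d+))?|(\.))?", suboperand)
    if m is None:
        raise ValueError("not an attribute suboperand: %r" % suboperand)
    attrs, low, high, dot = m.groups()
    if dot:
        return attrs, 1, MAX_RUN  # assumption: '.' means one or more
    if low is None:
        return attrs, 1, 1        # no number given: 1 is assumed
    low = int(low)
    return attrs, low, int(high) if high else low


def match_attributes(suboperand, text, pos=0):
    """Return how many characters are consumed at text[pos:], or None
    if fewer than the required minimum have one of the attributes."""
    attrs, low, high = parse_attribute_suboperand(suboperand)
    count = 0
    while (pos + count < len(text) and count < high
           and any(ATTRIBUTES[a](text[pos + count]) for a in attrs)):
        count += 1
    return count if count >= low else None


if __name__ == "__main__":
    print(match_attributes("$d2-4", "12345"))  # range: at most four digits
    print(match_attributes("$l.", "abc1"))     # period: run of letters
```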
| |
| <p>The following outlines the algorithm by which text is evaluated with |
| multipass expressions: |
| </p> |
| <p>Loop over context, pass2, pass3 and pass4 and do the following for each pass: |
| </p> |
| <ol> |
| <li> Match the text following the cursor against all expressions in the |
| current pass |
| </li><li> If there is no match: shift the cursor one position to the right and |
| continue the loop |
| </li><li> If there is a match: choose the longest match |
| </li><li> Do the replacement (everything between square brackets) |
| </li><li> Place the cursor after the replaced text |
| </li><li> Continue the loop |
| </li></ol> |
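| <p>The steps above can be sketched in Python as follows. This is a |
| deliberately reduced model: each rule is a literal string with a |
| literal replacement, and the whole match is replaced, whereas real |
| <code>test</code> and <code>action</code> operands are far richer and replace only |
| the bracketed part. The function names <code>run_pass</code> and |
| <code>run_all_passes</code> and the sample rules are hypothetical. |
| </p> |

```python
def run_pass(text, rules):
    """Apply one pass. `rules` maps a literal string to its replacement
    (a simplification of the test/action operands)."""
    out = []
    cursor = 0
    while cursor < len(text):
        # Match the text following the cursor against every rule.
        candidates = [src for src in rules
                      if src and text.startswith(src, cursor)]
        if not candidates:
            # No match: keep this character and move one position right.
            out.append(text[cursor])
            cursor += 1
            continue
        # Several matches: the longest one wins.
        longest = max(candidates, key=len)
        out.append(rules[longest])
        # Continue after the replaced text, so it is not rescanned
        # within the same pass.
        cursor += len(longest)
    return "".join(out)


def run_all_passes(text, passes):
    """Run the passes in order (context, pass2, pass3, pass4), each one
    working on the output of the previous one."""
    for rules in passes:
        text = run_pass(text, rules)
    return text


if __name__ == "__main__":
    passes = [
        {"cornf": "comf", "cornm": "comm"},  # stands in for the first pass
        {"co": "CO", "comm": "COMM"},        # stands in for pass2
    ]
    print(run_all_passes("cornmon cornfort", passes))
```

| <p>In the second pass of this example both <code>co</code> and <code>comm</code> match |
| at the start of the first word, and the longer rule is chosen. |
| </p> |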
| |
| <hr> |
| <a name="The-correct-Opcode"></a> |
| <a name="The-correct-Opcode-1"></a> |
| <h3 class="section">3.11 The correct Opcode</h3> |
| |
| <dl compact="compact"> |
| <dd><a name="index-correct"></a> |
| <a name="correct-opcode"></a></dd> |
| <dt><code>correct test action</code></dt> |
| <dd><p>Because some input (such as that from an OCR program) may contain |
| systematic errors, it is sometimes advantageous to use a |
| pre-translation pass to remove them. The errors and their corrections |
| are specified by the <code>correct</code> opcode. If there are no |
| <code>correct</code> opcodes in a table, the pre-translation pass is not |
| used. The format of the <code>correct</code> opcode is very similar to that |
| of the <code>context</code> opcode (see <a href="#context-opcode"><code>context</code></a>). The only difference is that in the action |
| part strings may be used and dot patterns may not be used. Some |
| examples of <code>correct</code> opcode entries are: |
| </p> |
| <div class="example"> |
| <pre class="example">correct "\\" ? Eliminate backslashes |
| correct "cornf" "comf" fix a common "scano" |
| correct "cornm" "comm" |
| correct "cornp" "comp" |
| correct "*" ? Get rid of stray asterisks |
| correct "|" ? ditto for vertical bars |
| correct "\s?" "?" drop space before question mark |
| </pre></div> |
| |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Miscellaneous-Opcodes"></a> |
| <a name="Miscellaneous-Opcodes-1"></a> |
| <h3 class="section">3.12 Miscellaneous Opcodes</h3> |
| |
| <dl compact="compact"> |
| <dd><a name="index-include"></a> |
| <a name="include-opcode"></a></dd> |
| <dt><code>include filename</code></dt> |
| <dd><p>Read the file indicated by <code>filename</code> and incorporate or include |
| its entries into the table. Included files can include other files, |
| which can include other files, etc. For an example, see what files are |
| included by the entry include <samp>en-us-g1.ctb</samp> in the table |
| <samp>en-us-g2.ctb</samp>. If the included file is not in the same directory |
| as the main table, use a full path name for filename. |
| </p> |
| <a name="index-locale"></a> |
| <a name="locale-opcode"></a></dd> |
| <dt><code>locale characters</code></dt> |
| <dd><p>Not implemented, but recognized and ignored for backward |
| compatibility. |
| </p> |
| <a name="index-undefined"></a> |
| <a name="undefined-opcode"></a></dd> |
| <dt><code>undefined dots</code></dt> |
| <dd><p>If this opcode is used in a table any characters which have not been |
| defined in the table but are encountered in the text will be replaced by |
| the dot pattern. If this opcode is not used, any undefined characters |
| are replaced by <code>'\xhhhh'</code>, where the h’s are hexadecimal digits. |
| </p> |
| <a name="index-display"></a> |
| <a name="display-opcode"></a></dd> |
| <dt><code>display character dots</code></dt> |
| <dd><p>Associates dot patterns with the characters which will be sent to a |
| braille embosser, display or screen font. The character must be in the |
| range 0-255 and the dots must specify a single cell. Here are some |
| examples: |
| </p> |
| <div class="example"> |
| <pre class="example"># When the character a is sent to the embosser or display, |
| # it will produce a dot 1. |
| display a 1 |
| </pre></div> |
| |
| <div class="example"> |
| <pre class="example"># When the character L is sent to the display or embosser |
| # it will produce dots 1-2-3. |
| display L 123 |
| </pre></div> |
| |
| <p>The <code>display</code> opcode is optional. It is used when the embosser or |
| display has a different mapping of characters to dot patterns than |
| that given in <a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a>. If used, display |
| entries must precede character-definition entries. |
| </p> |
| <p>A possible use case would be to define display opcodes so that the |
| result is Unicode braille for use on a display and a second set of |
| display opcodes (in a different file) to produce plain ASCII braille |
| for use with an embosser. |
| </p> |
| <a name="index-multind"></a> |
| <a name="multind-opcode"></a></dd> |
| <dt><code>multind dots opcode opcode ...</code></dt> |
| <dd><p>The <code>multind</code> opcode tells the back-translator that a sequence of |
| braille cells represents more than one braille indicator. For example, |
| in <samp>en-us-g1.ctb</samp> we have <code>multind 56-6 letsign capsign</code>. |
| The back-translator can generally handle single braille indicators, |
| but it cannot apply them when they immediately follow each other. It |
| recognizes the letter sign if it is followed by a letter and takes |
| appropriate action. It also recognizes the capital sign if it is |
| followed by a letter. But when there is a letter sign followed by a |
| capital sign it fails to recognize the letter sign unless the sequence |
| has been defined with <code>multind</code>. A <code>multind</code> entry may not |
| contain a comment because liblouis would attempt to interpret it as an |
| opcode. |
| </p> |
| </dd> |
| </dl> |
| |
| <hr> |
| <a name="Deprecated-Opcodes"></a> |
| <a name="Deprecated-Opcodes-1"></a> |
| <h3 class="section">3.13 Deprecated Opcodes</h3> |
| |
| <p>The following opcodes are an early attempt to handle emphasis. They |
| have been superseded by more specific opcodes, but are kept for |
| backward compatibility. |
| </p> |
| <dl compact="compact"> |
| <dd><a name="index-italsign"></a> |
| <a name="italsign-opcode"></a></dd> |
| <dt><code>italsign dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastworditalbefore</code> opcode (see <a href="#lastworditalbefore-opcode"><code>lastworditalbefore</code></a>) instead. |
| </p> |
| <a name="index-begital"></a> |
| <a name="begital-opcode"></a></dd> |
| <dt><code>begital dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>firstletterital</code> opcode (see <a href="#firstletterital-opcode"><code>firstletterital</code></a>) instead. |
| </p> |
| <a name="index-endital"></a> |
| <a name="endital-opcode"></a></dd> |
| <dt><code>endital dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastletterital</code> opcode (see <a href="#lastletterital-opcode"><code>lastletterital</code></a>) instead. |
| </p> |
| <a name="index-boldsign"></a> |
| <a name="boldsign-opcode"></a></dd> |
| <dt><code>boldsign dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastwordboldbefore</code> opcode (see <a href="#lastwordboldbefore-opcode"><code>lastwordboldbefore</code></a>) instead. |
| </p> |
| <a name="index-begbold"></a> |
| <a name="begbold-opcode"></a></dd> |
| <dt><code>begbold dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>firstletterbold</code> opcode (see <a href="#firstletterbold-opcode"><code>firstletterbold</code></a>) instead. |
| </p> |
| <a name="index-endbold"></a> |
| <a name="endbold-opcode"></a></dd> |
| <dt><code>endbold dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastletterbold</code> opcode (see <a href="#lastletterbold-opcode"><code>lastletterbold</code></a>) instead. |
| </p> |
| <a name="index-undersign"></a> |
| <a name="undersign-opcode"></a></dd> |
| <dt><code>undersign dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastwordunderbefore</code> opcode (see <a href="#lastwordunderbefore-opcode"><code>lastwordunderbefore</code></a>) instead. |
| </p> |
| <a name="index-begunder"></a> |
| <a name="begunder-opcode"></a></dd> |
| <dt><code>begunder dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>firstletterunder</code> opcode (see <a href="#firstletterunder-opcode"><code>firstletterunder</code></a>) instead. |
| </p> |
| <a name="index-endunder"></a> |
| <a name="endunder-opcode"></a></dd> |
| <dt><code>endunder dots</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>lastletterunder</code> opcode (see <a href="#lastletterunder-opcode"><code>lastletterunder</code></a>) instead. |
| </p> |
| <a name="index-literal"></a> |
| <a name="literal-opcode"></a></dd> |
| <dt><code>literal characters</code></dt> |
| <dd><p>This opcode is deprecated. Use the <code>compbrl</code> opcode (see <a href="#compbrl-opcode"><code>compbrl</code></a>) instead. |
| </p></dd> |
| </dl> |
| |
| <hr> |
| <a name="How-to-test-Translation-Tables"></a> |
| <a name="How-to-test-Translation-Tables-1"></a> |
| <h2 class="chapter">4 How to test Translation Tables</h2> |
| |
| <p>There are a number of automated tests for liblouis and they are |
| proving to be of tremendous value. When changing the code the |
| developers can run the tests to see if anything broke. |
| </p> |
| <p>For testing the translation tables there are basically two approaches: |
| there are the harness tests and the doctests. They were created at |
| roughly the same time using different technologies, have influenced |
| each other and have gone through improvements and technology changes. |
| For now they are both based on Python so you need to have that |
| installed. The philosophies of the two are slightly different: |
| </p> |
| <dl compact="compact"> |
| <dt>Harness tests</dt> |
| <dd><p>The harness tests are data driven: you supply the test data, that is, a |
| string to translate and the expected output. The data is in a standard |
| format, namely json. They work with both Python2 and Python3. However, |
| since the format is json it is conceivable that somebody would write |
| some C code which takes the data in the harness file and runs it through |
| liblouis, so they could also run without Python and without ucs4. |
| </p> |
| </dd> |
| <dt>Doctests</dt> |
| <dd><p>The doctests on the other hand are based on a technology used in Python |
| where you define your tests as if you were sitting at a terminal session |
| with a Python interpreter. So the tests look like you typed a command |
| and got some output, e.g. |
| </p> |
| <div class="example"> |
| <pre class="example">>>> translate(['table.ctb'], "Hello", mode=compbrlLeftCursor) |
| ("HELLO", [0,1,2,3], [0,1,2,3], 0) |
| </pre></div> |
| |
| <p>There is a convenience wrapper which hides away much of the complexity |
| of the above example so you can write tests like |
| </p> |
| <div class="example"> |
| <pre class="example">>>> t.braille('the cat sat on the mat') |
| u'! cat sat on ! mat' |
| </pre></div> |
| |
| <p>But essentially you are writing code, so the doctests allow you to do |
| more flexible tests that are much closer to the raw iron. For technical |
| reasons the doctests will probably only ever work in either Python2 or |
| Python3 but not both and they will never run from C. |
| </p></dd> |
| </dl> |
| |
| <p>To sum it up, the recommendation is that for normal table testing you |
| should use the test harness. It has a lot of momentum and the format |
| is a standard. If you want to be closer to the raw Python API of |
| liblouis, if you want to test some more intricate scenarios (involving |
| inpos, modes, etc) then the doctests are for you. |
| </p> |
| |
| <hr> |
| <a name="Translation-Table-Test-Harness"></a> |
| <a name="Translation-Table-Test-Harness-1"></a> |
| <h3 class="section">4.1 Translation Table Test Harness</h3> |
| |
| <p>Each harness file is a simple utf8 encoded json file, which has two entries. |
| </p><dl compact="compact"> |
| <dt><code>tables</code></dt> |
| <dd><p>A list containing table names, which the tests should be run against. |
| This is usually just one table, but for some situations more than one table is required. |
| </p></dd> |
| <dt><code>tests</code></dt> |
| <dd><p>A list of sections of tests, which should be processed independently. |
| Each test section is a dictionary of two items, <code>flags</code> and <code>data</code>. |
| </p></dd> |
| <dt><code>flags</code></dt> |
| <dd><p>The flags that apply for all the test cases in this section. |
| For example, they could all be forward translation tests, or they should all be run as computer braille tests. |
| </p> |
| </dd> |
| <dt><code>data</code></dt> |
| <dd><p>A list of test cases, each one containing the specific test data needed to perform a test. |
| </p></dd> |
| </dl> |
| |
| <p>These are the valid fields for the flags section: |
| </p><dl compact="compact"> |
| <dt><code>comment</code></dt> |
| <dd><p>A field describing the reason for the tests, the transformation rule or any useful info that might be needed in case the test breaks (optional). |
| </p></dd> |
| <dt><code>cursorPos</code></dt> |
| <dd><p>The position of the cursor within the given text (optional). Useful |
| when simulating screenreader interaction, to debug contraction and |
| cursor behaviour. |
| </p></dd> |
| <dt><code>mode</code></dt> |
| <dd><p>The liblouis translation mode that should be used for this test |
| (optional). If not defined defaults to 0. |
| </p></dd> |
| <dt><code>outputUniBrl</code></dt> |
| <dd><p>For a forward translation test, the output should be in unicode braille. |
| For a backward translation test, the input is in unicode braille. |
| </p></dd> |
| <dt><code>testmode</code></dt> |
| <dd><p>The optional testmode field declares which kind of test should be performed on the test data. |
| It can have three values: "translate" (the default if undeclared), "backtranslate" or "hyphenate". |
| </p></dd> |
| </dl> |
| |
| |
| <p>Each test case has the following entries: |
| </p> |
| <dl compact="compact"> |
| <dt><code>input</code></dt> |
| <dd><p>The unicode text to be tested (required). |
| </p></dd> |
| <dt><code>output</code></dt> |
| <dd><p>The expected braille output (required). The dots should be encoded in |
| the liblouis ascii-braille like encoding. |
| </p></dd> |
| <dt><code>brlCursorPos</code></dt> |
| <dd><p>The expected position of the braille cursor in the braille output |
| (optional). Useful when simulating screenreader interaction, to debug |
| contraction and cursor behaviour. |
| </p></dd> |
| </dl> |
| |
| <p>Variables defined in the flags section can be overridden by individual test cases, but if several tests need the same options, they should |
| ideally be split into their own section, complete with their own flags and data. |
| </p> |
| <p>For examples please see <samp>*_harness.txt</samp> in the |
| harness directory in the source distribution. |
| </p> |
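| <p>As a rough illustration of how the entries described above fit together, a |
| minimal harness file might look like the sketch below. The table name |
| <code>en-us-g2.ctb</code> is an assumption, and the expected output is borrowed |
| from the doctest example earlier in this chapter; neither has been checked |
| against a real test run. |
| </p> |

```json
{
  "tables": ["en-us-g2.ctb"],
  "tests": [
    {
      "flags": {
        "comment": "Illustrative only: contraction of 'the' in running text",
        "testmode": "translate"
      },
      "data": [
        {
          "input": "the cat sat on the mat",
          "output": "! cat sat on ! mat"
        }
      ]
    }
  ]
}
```

| <p>Tests that need different flags, for instance back-translation tests, would |
| go into a second dictionary in the <code>tests</code> list with its own |
| <code>flags</code> and <code>data</code>. |
| </p> |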
| <hr> |
| <a name="Translation-Table-Doctests"></a> |
| <a name="Translation-Table-Doctests-1"></a> |
| <h3 class="section">4.2 Translation Table Doctests</h3> |
| |
| <p>For examples on how to create doctests please see <samp>*_test.txt</samp> in |
| the doctest directory in the source distribution. |
| </p> |
| <hr> |
| <a name="Notes-on-Back_002dTranslation"></a> |
| <a name="Notes-on-Back_002dTranslation-1"></a> |
| <h2 class="chapter">5 Notes on Back-Translation</h2> |
| |
| <p>Back-translation is carried out by the function |
| <code>lou_backTranslateString</code>. Its calling sequence is described in |
| <a href="#Programming-with-liblouis">Programming with liblouis</a>. Tables containing no |
| <code>context</code> opcode (see <a href="#context-opcode"><code>context</code></a>), <code>correct</code> opcode (see <a href="#correct-opcode"><code>correct</code></a>) or multipass opcodes can be |
| used for both forward and backward translation. If these opcodes are |
| needed different tables will be required. |
| <code>lou_backTranslateString</code> first performs <code>pass4</code>, if |
| present, then <code>pass3</code>, then <code>pass2</code>, then the |
| backtranslation, then corrections. Note that this is exactly the |
| inverse of forward translation. |
| </p> |
| <hr> |
| <a name="Programming-with-liblouis"></a> |
| <a name="Programming-with-liblouis-1"></a> |
| <h2 class="chapter">6 Programming with liblouis</h2> |
| |
| |
| <hr> |
| <a name="License"></a> |
| <a name="License-1"></a> |
| <h3 class="section">6.1 License</h3> |
| |
| <p>Liblouis may contain code borrowed from the Linux screen reader BRLTTY, |
| Copyright © 1999-2006 by the BRLTTY Team. |
| </p> |
| <p>Copyright © 2004-2007 ViewPlus Technologies, Inc. |
| <a href="www.viewplus.com">www.viewplus.com</a>. |
| </p> |
| <p>Copyright © 2007,2009 Abilitiessoft, Inc. |
| <a href="www.abilitiessoft.com">www.abilitiessoft.com</a>. |
| </p> |
| <p>Liblouis is free software: you can redistribute it and/or modify it |
| under the terms of the GNU Lesser General Public License as published |
| by the Free Software Foundation, either version 3 of the License, or |
| (at your option) any later version. |
| </p> |
| <p>Liblouis is distributed in the hope that it will be useful, but |
| WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser General Public License for more details. |
| </p> |
| <p>You should have received a copy of the GNU Lesser General Public |
| License along with Liblouis. If not, see |
| <a href="http://www.gnu.org/licenses/">http://www.gnu.org/licenses/</a>. |
| </p> |
| <hr> |
| <a name="Overview"></a> |
| <a name="Overview-1"></a> |
| <h3 class="section">6.2 Overview</h3> |
| |
| <p>You use the liblouis library by calling the following functions, |
| <code>lou_translateString</code>, <code>lou_backTranslateString</code>, |
| <code>lou_logFile</code>, <code>lou_logPrint</code>, |
| <code>lou_logEnd</code>, <code>lou_getTable</code>, |
| <code>lou_translate</code>, <code>lou_backTranslate</code>, <code>lou_hyphenate</code>, |
| <code>lou_charToDots</code>, |
| <code>lou_dotsToChar</code>, |
| <code>lou_compileString</code>, <code>lou_readCharFromFile</code>, |
| <code>lou_version</code> and <code>lou_free</code>. |
| These are described below. The header file, <samp>liblouis.h</samp>, also |
| contains brief descriptions. Liblouis is written in straight C. It has |
| just three code modules, <samp>compileTranslationTable.c</samp>, |
| <samp>lou_translateString.c</samp> and <samp>lou_backTranslateString.c</samp>. In |
| addition, there are two header files, <samp>liblouis.h</samp>, which defines |
| the API, and <samp>louis.h</samp>, used only internally and by liblouisxml. |
| The latter includes |
| <samp>liblouis.h</samp>. |
| </p> |
| <p>Persons who wish to use liblouis from Python may want to skip ahead to |
| <a href="#Python-bindings">Python bindings</a>. |
| </p> |
| <p><samp>compileTranslationTable.c</samp> keeps track of all translation tables |
| which an application has used. It is called by the translation, |
| hyphenation and checking functions when they start. If a table has not |
| yet been compiled <samp>compileTranslationTable.c</samp> checks it for |
| correctness and compiles it into an efficient internal representation. |
| The main entry point is <code>lou_getTable</code>. Since it is the module |
| that keeps track of memory usage, it also contains the <code>lou_free</code> |
| function. In addition, it contains the <code>lou_logFile</code>, |
| <code>lou_logPrint</code> and <code>lou_logEnd</code> functions, plus some utility |
| functions which are |
| used by the other modules. |
| </p> |
| <p>By default, liblouis handles all characters internally as 16-bit |
| unsigned integers. It can be compiled for 32-bit characters as |
| explained below. The meanings of these integers are not hard-coded. |
| Rather they are defined by the character-definition opcodes. However, |
| the standard printable characters, from decimal 32 to 126 are |
| recognized for the purpose of processing the opcodes. Hence, the |
| following definition is included in <samp>liblouis.h</samp>. It is correct |
| for computers with at least 32-bit processors. |
| </p> |
| <div class="example"> |
| <pre class="example">#define widechar unsigned short int |
| </pre></div> |
| |
| <p>To make liblouis handle 32-bit Unicode simply remove the word |
| <code>short</code> in the above <code>define</code>. This will cause the translate and |
| back-translate functions to expect input in 32-bit form and to deliver |
| their output in this form. The input to the compiler (tables) is |
| unaffected except that two new escape sequences for 20-bit and 32-bit |
| characters are recognized. |
| </p> |
| <p>Here are the definitions of the liblouis functions and their |
| parameters. They are given in terms of 16-bit Unicode. If liblouis has |
| been compiled for 32-bit Unicode simply read 32 instead of 16. |
| </p> |
| <hr> |
| <a name="Data-structure-of-liblouis-tables"></a> |
| <a name="Data-structure-of-liblouis-tables-1"></a> |
| <h3 class="section">6.3 Data structure of liblouis tables</h3> |
| |
| <p>The data structure <code>TranslationTableHeader</code> is defined by a |
| <code>typedef</code> statement in <samp>louis.h</samp>. To find the beginning, |
| search for the word ‘<samp>header</samp>’. As its name implies, this is |
| actually the table header. Data are placed in the <code>ruleArea</code> |
| array, which is the last item defined in this structure. This array is |
| declared with a length of 1 and is expanded as needed. The table |
| header consists mostly of arrays of pointers of size <code>HASHNUM</code>. |
| These pointers are actually offsets into <code>ruleArea</code> and point to |
| chains of items which have been placed in the same hash bucket by a |
| simple hashing algorithm. <code>HASHNUM</code> should be a prime and is |
| currently 1123. The structure of the table was chosen to optimize |
| speed rather than memory usage. |
| </p> |
| <p>The first part of the table contains miscellaneous information, such |
| as the number of passes and whether various opcodes have been used. It |
| also contains the amount of memory allocated to the table and the |
| amount actually used. |
| </p> |
| <p>The next section contains pointers to various braille indicators and |
| begins with <code>capitalSign</code>. The rules pointed to contain the |
| dot pattern for the indicator and an opcode which is used by the |
| back-translator but does not appear in the list of opcodes. The |
| braille indicators also include various kinds of emphasis, such as |
| italic and bold and information about the length of emphasized |
| phrases. The latter is contained directly in the table item instead of |
| in a rule. |
| </p> |
| <p>After the braille indicators comes information about when a letter |
| sign should be used. |
| </p> |
| <p>Next is an array of size <code>HASHNUM</code> which points to character |
| definitions. These are created by the character-definition opcodes. |
| </p> |
| <p>Following this is a similar array pointing to definitions of |
| single-cell dot patterns. This is also created from the |
| character-definition opcodes. If a character definition contains a |
| multi-cell dot pattern this is compiled into ordinary forward and |
| backward rules. If such a multi-cell dot pattern contains a single |
| cell which has not previously been defined that cell is placed in this |
| array, but is given the attribute <code>space</code>. |
| </p> |
| <p>Next come arrays that map characters to single-cell dot patterns and |
| dots to characters. These are created from both character-definition |
| opcodes and display opcodes. |
| </p> |
| <p>Next is an array of size 256 which maps characters in this range to |
| dot patterns which may consist of multiple cells. It is used, for |
| example, to map ‘<samp>{</samp>’ to dots 456-246. These mappings are created |
| by the <code>compdots</code> |
| or the <code>comp6</code> opcode (see <a href="#comp6-opcode"><code>comp6</code></a>). |
| </p> |
| <p>Next are two small arrays that hold pointers to chains of rules |
| produced by the <code>swapcd</code> opcode (see <a href="#swapcd-opcode"><code>swapcd</code></a>) and the <code>swapdd</code> opcode (see <a href="#swapdd-opcode"><code>swapdd</code></a>) and by |
| some multipass, <code>context</code> and <code>correct</code> opcodes. |
| </p> |
| <p>Now we get to an array of size <code>HASHNUM</code> which points to chains |
| of rules for forward translation. |
| </p> |
| <p>Following this is a similar array for back-translation. |
| </p> |
| <p>Finally comes the <code>ruleArea</code>, an array of variable size to which |
| various structures are mapped and to which almost everything else |
| points. |
| </p> |
| <hr> |
| <a name="lou_005fversion"></a> |
| <a name="lou_005fversion-1"></a> |
| <h3 class="section">6.4 lou_version</h3> |
| <a name="index-lou_005fversion"></a> |
| |
| <div class="example"> |
| <pre class="example">char *lou_version () |
| </pre></div> |
| |
| <p>This function returns a pointer to a character string containing the |
| version of liblouis, plus other information, such as the release date |
| and perhaps notable changes. |
| </p> |
| <hr> |
| <a name="lou_005ftranslateString"></a> |
| <a name="lou_005ftranslateString-1"></a> |
| <h3 class="section">6.5 lou_translateString</h3> |
| <a name="index-lou_005ftranslateString"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_translateString ( |
| const char * tableList, |
| const widechar * inbuf, |
| int *inlen, |
| widechar *outbuf, |
| int *outlen, |
| char *typeform, |
| char *spacing, |
| int mode); |
| </pre></div> |
| |
| <p>This function takes a string of 16-bit Unicode characters in |
| <code>inbuf</code> and translates it into a string of 16-bit characters in |
| <code>outbuf</code>. Each 16-bit character produces a particular dot pattern |
| in one braille cell when sent to an embosser or braille display or to |
| a screen type font. Which 16-bit character represents which dot pattern |
| is indicated by the character-definition and display opcodes in the |
| translation table. |
| </p> |
| <a name="translation_002dtables"></a><p>The <code>tableList</code> parameter points to a list of translation tables |
| separated by commas. If only one table is given, no comma should be |
| used after it. It is these tables which control just how the |
| translation is made, whether in Grade 2, Grade 1, or something else. |
| </p> |
| <p>liblouis knows where to find all the tables that have been distributed |
| with it. So you can just give a table name such as <code>en-us-g2.ctb</code> |
| and liblouis will load it. You can also give a table name which |
| includes a path. If this is the first table in a list, all the tables |
| in the list must be on the same path. You can specify a path on which |
| liblouis will look for table names by setting the environment variable |
| <code>LOUIS_TABLEPATH</code>. This environment variable can contain one or |
| more paths separated by commas. On receiving a table name liblouis |
| first checks to see if it can be found on any of these paths. If not, |
| it then checks to see if it can be found in the current directory, or, |
| if the first (or only) name in a table list, if it contains a |
| path name, can be found on that path. If not, it checks to see if it |
| can be found on the path where the distributed tables have been |
| installed. If a table has already been loaded and compiled this |
| path-checking is skipped. |
| </p> |
| <p>The tables in a list are all compiled into the same internal table. |
| The list is then regarded as the name of this table. As explained in |
| <a href="#How-to-Write-Translation-Tables">How to Write Translation Tables</a>, each table is a file which may |
| be plain text, big-endian Unicode or little-endian Unicode. A table |
| (or list of tables) is compiled into an internal representation the |
| first time it is used. Liblouis keeps track of which tables have been |
| compiled. For this reason, it is essential to call the <code>lou_free</code> |
| function at the end of your application to avoid memory leaks. Do |
| <em>NOT</em> call <code>lou_free</code> after each translation. This will |
| force liblouis to compile the translation tables each time they are |
| used, leading to great inefficiency. |
| </p> |
| <p>Note that both the <code>*inlen</code> and <code>*outlen</code> parameters are |
| pointers to integers. When the function is called, these integers |
| contain the maximum input and output lengths, respectively. When it |
| returns, they are set to the actual lengths used. |
| </p> |
| <p>The <code>typeform</code> parameter is used to indicate italic type, |
| boldface type, computer braille, etc. It is a string of characters |
| with the same length as the input buffer pointed to by <code>*inbuf</code>. |
| However, it is used to pass back character-by-character results, so |
| enough space must be provided to match the <code>*outlen</code> parameter. |
| Each character indicates the typeform of the corresponding character |
| in the input buffer. The values are as follows: 0 plain-text; 1 |
| italic; 2 bold; 4 underline; 8 computer braille. These values can be |
| added for multiple emphasis. If this parameter is <code>NULL</code>, no |
| checking for type forms is done. In addition, if this parameter is not |
| <code>NULL</code>, it is set on return to have an 8 at every position |
| corresponding to a character in <code>outbuf</code> which was defined to |
| have a dot representation containing dot 7, dot 8 or both, and to 0 |
| otherwise. |
| </p> |
| <p>The <code>spacing</code> parameter is used to indicate differences in |
| spacing between the input string and the translated output string. It |
| is also of the same length as the string pointed to by <code>*inbuf</code>. |
| If this parameter is <code>NULL</code>, no spacing information is computed. |
| </p> |
| <p>The <code>mode</code> parameter specifies how the translation should be |
| done. The valid values of mode are listed in <samp>liblouis.h</samp>. They |
| are all powers of 2, so that a combined mode can be specified by |
| adding up different values. |
| </p> |
| <p>The function returns 1 if no errors were encountered and 0 if a |
| complete translation could not be done. |
| </p> |
| <hr> |
| <a name="lou_005ftranslate"></a> |
| <a name="lou_005ftranslate-2"></a> |
| <h3 class="section">6.6 lou_translate</h3> |
| <a name="index-lou_005ftranslate"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_translate ( |
| const char * tableList, |
| const widechar * const inbuf, |
| int *inlen, |
| widechar * outbuf, |
| int *outlen, |
| char *typeform, |
| char *spacing, |
| int *outputPos, |
| int *inputPos, |
| int *cursorPos, |
| int mode); |
| </pre></div> |
| |
| <p>This function adds the parameters <code>outputPos</code>, <code>inputPos</code> and |
| <code>cursorPos</code>, to facilitate use in screen reader programs. The |
| <code>outputPos</code> parameter must point to an array of integers with at |
| least <code>inlen</code> elements. On return, this array will contain the |
| position in <code>outbuf</code> corresponding to each input position. |
| Similarly, <code>inputPos</code> must point to an array of integers of at |
| least <code>outlen</code> elements. On return, this array will contain the |
| position in <code>inbuf</code> corresponding to each position in |
| <code>outbuf</code>. |
| <code>cursorPos</code> must point to an integer containing the position of the |
| cursor in the input. On return, it will contain the cursor position in |
| the output. Any parameter after <code>outlen</code> may be <code>NULL</code>. In |
| this case, the actions corresponding to it will not be carried out. The |
| <code>mode</code> parameter, however, must be present and must be an integer, |
| not a pointer to an integer. If the <code>compbrlAtCursor</code> bit is set in |
| the <code>mode</code> parameter the space-bounded characters containing the |
| cursor will be translated in computer braille. If the |
| <code>compbrlLeftCursor</code> bit is set only the characters to the left of |
| the cursor will be in computer braille. This bit overrides |
| <code>compbrlAtCursor</code>. |
| When the <code>dotsIO</code> bit is set, translation produces its output as dot patterns and |
| back-translation accepts its input as dot patterns. Note that the produced |
| dot patterns are affected if you have any <code>display</code> opcode (see <a href="#display-opcode"><code>display</code></a>) defined |
| in any of your tables. |
| The <code>ucBrl</code> (Unicode Braille) bit is used by <code>lou_charToDots</code> and <code>lou_translate</code>. It causes the dot |
| patterns to be Unicode Braille rather than the liblouis representation. |
| Note that you will not notice any change when setting <code>ucBrl</code> unless <code>dotsIO</code> is also set. |
| <code>lou_dotsToChar</code> and <code>lou_backTranslate</code> recognize Unicode |
| braille automatically. |
| </p> |
| <p>The <code>otherTrans</code> mode needs special description. If it is set |
| liblouis will attempt to call a wrapper for another translator. These |
| other translators are usually for Asian languages. The calling sequence |
| is the same as for liblouis itself except that the <code>tableList</code> |
| parameter gives the name of the other translator, possibly abbreviated, |
| followed by a colon, followed by whatever other information the other |
| translator needs. This is specific for each translator. If no such |
| information is needed the colon should be omitted. The result of calling |
| either the translate or back-translate functions with this mode bit set |
| will be the same as calling without it set. That is, the wrapper for the |
| other translator simulates a call to liblouis. Note that the wrappers |
| are not implemented at this time. Setting this mode bit will result in |
| failure (return value of 0). |
| </p> |
| <hr> |
| <a name="lou_005fbackTranslateString"></a> |
| <a name="lou_005fbackTranslateString-1"></a> |
| <h3 class="section">6.7 lou_backTranslateString</h3> |
| <a name="index-lou_005fbackTranslateString"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_backTranslateString ( |
| const char * tableList, |
| const widechar * inbuf, |
| int *inlen, |
| widechar *outbuf, |
| int *outlen, |
| char *typeform, |
| char *spacing, |
| int mode); |
| </pre></div> |
| |
| <p>This is exactly the opposite of <code>lou_translateString</code>. |
| <code>inbuf</code> is a string of 16-bit Unicode characters representing |
| braille. <code>outbuf</code> will contain a string of 16-bit Unicode |
| characters. <code>typeform</code> will indicate any emphasis found in the |
| input string, while <code>spacing</code> will indicate any differences in |
| spacing between the input and output strings. The <code>typeform</code> and |
| <code>spacing</code> parameters may be <code>NULL</code> if this information is |
| not needed. <code>mode</code> again specifies how the back-translation |
| should be done. |
| </p> |
| <hr> |
| <a name="lou_005fbackTranslate"></a> |
| <a name="lou_005fbackTranslate-1"></a> |
| <h3 class="section">6.8 lou_backTranslate</h3> |
| <a name="index-lou_005fbackTranslate"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_backTranslate ( |
| const char * tableList, |
| const widechar * inbuf, |
| int *inlen, |
| widechar * outbuf, |
| int *outlen, |
| char *typeform, |
| char *spacing, |
| int *outputPos, |
| int *inputPos, |
| int *cursorPos, |
| int mode); |
| </pre></div> |
| |
| <p>This function is exactly the inverse of <code>lou_translate</code>. |
| </p> |
| <hr> |
| <a name="lou_005fhyphenate"></a> |
| <a name="lou_005fhyphenate-1"></a> |
| <h3 class="section">6.9 lou_hyphenate</h3> |
| <a name="index-lou_005fhyphenate"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_hyphenate ( |
| const char *tableList, |
| const widechar *inbuf, |
| int inlen, |
| char *hyphens, |
| int mode); |
| </pre></div> |
| |
| <p>This function looks at the characters in <code>inbuf</code> and if it finds |
| a sequence of letters attempts to hyphenate it as a word. Note that |
| lou_hyphenate operates on single words only, and spaces or punctuation |
| marks between letters are not allowed. Leading and trailing |
| punctuation marks are ignored. The table named by the <code>tableList</code> |
| parameter must contain a hyphenation table. If it does not, the |
| function does nothing. <code>inlen</code> is the length of the character |
| string in <code>inbuf</code>. <code>hyphens</code> is an array of characters and |
| must be of size <code>inlen</code> + 1 (to account for the terminating NUL character). |
| If hyphenation is successful it will have a 1 at the beginning of each |
| syllable and a 0 elsewhere. If the <code>mode</code> parameter is 0 |
| <code>inbuf</code> is assumed to contain untranslated characters. Any |
| nonzero value means that <code>inbuf</code> contains a translation. In this |
| case, it is back-translated, hyphenation is performed, and it is |
| re-translated so that the hyphens can be placed correctly. The |
| <code>lou_translate</code> and <code>lou_backTranslate</code> functions are used |
| in this process. <code>lou_hyphenate</code> returns 1 if hyphenation was |
| successful and 0 otherwise. In the latter case, the contents of the |
| <code>hyphens</code> parameter are undefined. This function was provided for |
| use in liblouisxml. |
| </p> |
| <hr> |
| <a name="lou_005fcompileString"></a> |
| <a name="lou_005fcompileString-1"></a> |
| <h3 class="section">6.10 lou_compileString</h3> |
| <a name="index-lou_005fcompileString"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_compileString (const char *tableList, const char *inString) |
| </pre></div> |
| |
| <p>This function enables you to compile a table entry on the fly at |
| run-time. The new entry is added to <code>tableList</code> and remains in force |
| until <code>lou_free</code> is called. If <code>tableList</code> has not previously |
| been loaded it is loaded and compiled. <code>inString</code> contains the |
| table entry to be added. It may be anything valid. Error messages |
| will be produced if it is invalid. The function returns 1 on success and |
| 0 on failure. |
| </p> |
| <hr> |
| <a name="lou_005fdotsToChar"></a> |
| <a name="lou_005fdotsToChar-1"></a> |
| <h3 class="section">6.11 lou_dotsToChar</h3> |
| <a name="index-lou_005fdotsToChar"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_dotsToChar (const char *tableList, const widechar *inbuf, widechar |
| *outbuf, int length, int mode) |
| </pre></div> |
| |
| <p>This function takes a widechar string in <code>inbuf</code> consisting of dot |
| patterns and converts it to a widechar string in <code>outbuf</code> |
| consisting of characters according to the specifications in |
| <code>tableList</code>. <code>length</code> is the length of both <code>inbuf</code> and |
| <code>outbuf</code>. The dot patterns in <code>inbuf</code> can be in either |
| liblouis format or Unicode braille. The function returns 1 on success |
| and 0 on failure. |
| </p> |
| <hr> |
| <a name="lou_005fcharToDots"></a> |
| <a name="lou_005fcharToDots-1"></a> |
| <h3 class="section">6.12 lou_charToDots</h3> |
| <a name="index-lou_005fcharToDots"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_charToDots (const char *tableList, const widechar *inbuf, widechar |
| *outbuf, int length, int mode) |
| </pre></div> |
| |
| <p>This function is the inverse of <code>lou_dotsToChar</code>. It takes a |
| widechar string in <code>inbuf</code> consisting of characters and converts it |
| to a widechar string in <code>outbuf</code> consisting of dot patterns |
| according to the specifications in <code>tableList</code>. <code>length</code> is the |
| length of both <code>inbuf</code> and <code>outbuf</code>. The dot patterns in |
| <code>outbuf</code> are in liblouis format if the mode bit <code>ucBrl</code> is |
| not set and in Unicode format if it is set. The function returns 1 on |
| success and 0 on failure. |
| </p> |
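| <p>The Unicode format mentioned above refers to the Unicode Braille |
| Patterns block, in which a cell is the code point U+2800 with bit |
| <em>n</em>-1 set for each raised dot <em>n</em>. The following is a minimal |
| sketch of that encoding only. <code>dots_to_unicode</code> is a hypothetical |
| helper, not a liblouis function, and the liblouis internal dot format is |
| not shown. |
| </p> |

```c
#include <stdint.h>

/* Hypothetical helper, not part of liblouis: build the Unicode braille
 * code point for one cell from a string of dot numbers such as "125".
 * Characters other than '1'..'8' are ignored. */
uint32_t dots_to_unicode(const char *dots)
{
    uint32_t cell = 0x2800; /* U+2800 is the blank braille pattern */
    for (; *dots != '\0'; dots++) {
        if (*dots >= '1' && *dots <= '8')
            cell |= 1u << (*dots - '1'); /* dot n sets bit n-1 */
    }
    return cell;
}
```

| <p>For example, the dot string <code>"125"</code> yields the code point |
| U+2813. |
| </p> |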
| <hr> |
| <a name="lou_005flogFile"></a> |
| <a name="lou_005flogFile-1"></a> |
| <h3 class="section">6.13 lou_logFile</h3> |
| <a name="index-lou_005flogFile"></a> |
| |
| <div class="example"> |
| <pre class="example">void lou_logFile (char *fileName); |
| </pre></div> |
| |
| <p>This function is used when it is not convenient either to let messages |
| be printed on stderr or to use redirection, as when liblouis is used |
| in a GUI application or in liblouisxml. Any error messages generated |
| will be printed to the file given in this call. The entire path name of |
| the file must be given. |
| </p> |
| <hr> |
| <a name="lou_005flogPrint"></a> |
| <a name="lou_005flogPrint-1"></a> |
| <h3 class="section">6.14 lou_logPrint</h3> |
| <a name="index-lou_005flogPrint"></a> |
| |
| <div class="example"> |
| <pre class="example">void lou_logPrint (char *format, ...); |
| </pre></div> |
| |
| <p>This function is called like <code>printf</code>. It can be used by other |
| libraries to print messages to the file specified by the call to |
| <code>lou_logFile</code>. In particular, it is used by the companion |
| library liblouisxml. |
| </p> |
| <hr> |
| <a name="lou_005flogEnd"></a> |
| <a name="lou_005flogEnd-1"></a> |
| <h3 class="section">6.15 lou_logEnd</h3> |
| <a name="index-lou_005flogEnd"></a> |
| |
| <div class="example"> |
| <pre class="example">void lou_logEnd (); |
| </pre></div> |
| |
| <p>This function is used at the end of processing a document to close the |
| log file, so that it can be read by the rest of the program. |
| </p> |
| <hr> |
| <a name="lou_005fsetDataPath"></a> |
| <a name="lou_005fsetDataPath-1"></a> |
| <h3 class="section">6.16 lou_setDataPath</h3> |
| <a name="index-lou_005fsetDataPath"></a> |
| |
| <div class="example"> |
| <pre class="example">char * lou_setDataPath (char *path); |
| </pre></div> |
| |
| <p>This function tells liblouis and liblouisutdml where tables and |
| files are located. It thus makes them completely relocatable, even |
| on Linux. The <code>path</code> is the directory that contains the |
| subdirectories <code>liblouis/tables</code> and |
| <code>liblouisutdml/lbu_files</code>. The function returns a pointer to the |
| <code>path</code>. |
| </p> |
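| <p>As a rough illustration of the directory layout described above, the |
| sketch below joins a data path with the <code>liblouis/tables</code> |
| subdirectory. <code>build_tables_dir</code> is a hypothetical helper, not |
| part of the liblouis API; the <code>/</code> separator and the way the path |
| is composed are assumptions made for the example. |
| </p> |

```c
#include <stdio.h>

/* Hypothetical helper, not part of liblouis: write the table directory
 * for a given data path into buf. Assumes '/' as the path separator.
 * Returns 1 on success and 0 if buf is too small or formatting fails. */
int build_tables_dir(const char *dataPath, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s/liblouis/tables", dataPath);
    return n >= 0 && (size_t)n < size;
}
```

| <p>With the data path <code>/usr/share</code>, the buffer receives |
| <code>/usr/share/liblouis/tables</code>. |
| </p> |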
| <hr> |
| <a name="lou_005fgetDataPath"></a> |
| <a name="lou_005fgetDataPath-1"></a> |
| <h3 class="section">6.17 lou_getDataPath</h3> |
| <a name="index-lou_005fgetDataPath"></a> |
| |
| <div class="example"> |
| <pre class="example">char * lou_getDataPath (); |
| </pre></div> |
| |
| <p>This function returns a pointer to the path set by |
| <code>lou_setDataPath</code>. If no path has been set it returns <code>NULL</code>. |
| </p> |
| <hr> |
| <a name="lou_005fgetTable"></a> |
| <a name="lou_005fgetTable-1"></a> |
| <h3 class="section">6.18 lou_getTable</h3> |
| <a name="index-lou_005fgetTable"></a> |
| |
| <div class="example"> |
| <pre class="example">void *lou_getTable (char *tablelist); |
| </pre></div> |
| |
| <p><code>tablelist</code> is a list of names of table files separated by |
| commas, as explained previously |
| (see <a href="#translation_002dtables"><code>tableList</code> parameter in |
| <code>lou_translateString</code></a>). If no errors are found this function |
| returns a pointer to the compiled table. If errors are found messages |
| are printed to the log file, which is stderr unless a different |
| filename has been given using the <code>lou_logFile</code> function. |
| Errors result in a <code>NULL</code> pointer being returned. |
| </p> |
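| <p>To make the comma-separated format of <code>tablelist</code> concrete, |
| here is a minimal sketch that extracts one name from such a list. |
| <code>table_name_at</code> is a hypothetical helper, not a liblouis function, |
| and it does not reflect how liblouis itself parses the list. |
| </p> |

```c
#include <string.h>

/* Hypothetical helper, not part of liblouis: copy the index-th (0-based)
 * name from a comma-separated table list into out.
 * Returns 1 if the name exists and fits, 0 otherwise. */
int table_name_at(const char *tableList, int index, char *out, size_t size)
{
    const char *start = tableList;
    for (;;) {
        const char *end = strchr(start, ',');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (index == 0) {
            if (len >= size)
                return 0; /* out is too small */
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
        if (end == NULL)
            return 0; /* fewer names than requested */
        start = end + 1; /* continue after the comma */
        index--;
    }
}
```

| <p>Given the list <code>"first.ctb,second.ctb"</code> (file names chosen for |
| illustration), index 1 yields <code>second.ctb</code>. |
| </p> |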
| <hr> |
| <a name="lou_005freadCharFromFile"></a> |
| <a name="lou_005freadCharFromFile-1"></a> |
| <h3 class="section">6.19 lou_readCharFromFile</h3> |
| <a name="index-lou_005freadCharFromFile"></a> |
| |
| <div class="example"> |
| <pre class="example">int lou_readCharFromFile (const char *fileName, int *mode); |
| </pre></div> |
| |
| <p>This function is provided for situations where it is necessary to read |
| a file which may contain little-endian or big-endian 16-bit Unicode |
| characters or ASCII8 characters. The return value is a little-endian |
| character, encoded as an integer. The <code>fileName</code> parameter is the |
| name of the file to be read. The <code>mode</code> parameter is a pointer to |
| an integer which must be set to 1 on the first call. On subsequent calls |
| the function maintains this value itself. On end-of-file the function returns |
| <code>EOF</code>. |
| </p> |
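| <p>A file of 16-bit characters stores each character as two bytes whose |
| order depends on the byte order of the file. The sketch below shows how |
| such a pair is combined into a single value in either case. |
| <code>decode_16bit_unit</code> is a hypothetical helper for illustration; it |
| is not the liblouis implementation. |
| </p> |

```c
#include <stdint.h>

/* Hypothetical helper, not part of liblouis: combine two bytes, in the
 * order they appear in the file, into one 16-bit character value. */
uint16_t decode_16bit_unit(uint8_t first, uint8_t second, int big_endian)
{
    if (big_endian)
        return (uint16_t)((first << 8) | second); /* high byte comes first */
    return (uint16_t)((second << 8) | first);     /* low byte comes first */
}
```

| <p>The bytes 0x28 0x13 read as big-endian and the bytes 0x13 0x28 read as |
| little-endian both decode to 0x2813. |
| </p> |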
| <hr> |
| <a name="lou_005ffree"></a> |
| <a name="lou_005ffree-1"></a> |
| <h3 class="section">6.20 lou_free</h3> |
| <a name="index-lou_005ffree"></a> |
| |
| <div class="example"> |
| <pre class="example">void lou_free (); |
| </pre></div> |
| |
| <p>This function should be called at the end of the application to free |
| all memory allocated by liblouis. Failure to do so will result in |
| memory leaks. Do <em>NOT</em> call <code>lou_free</code> after each |
| translation. This will force liblouis to compile the translation |
| tables every time they are used, resulting in great inefficiency. |
| </p> |
| <hr> |
| <a name="Python-bindings"></a> |
| <a name="Python-bindings-1"></a> |
| <h3 class="section">6.21 Python bindings</h3> |
| |
| <p>There are Python bindings for <code>lou_translateString</code>, |
| <code>lou_translate</code> and <code>lou_version</code>. For installation |
| instructions see the <samp>README</samp> file in the <samp>python</samp> |
| directory. Usage information is included in the Python module itself. |
| </p> |
| |
| <hr> |
| <a name="Opcode-Index"></a> |
| <a name="Opcode-Index-1"></a> |
| <h2 class="unnumbered">Opcode Index</h2> |
| <table><tr><th valign="top">Jump to: </th><td><a class="summary-letter" href="#Opcode-Index_opcode_letter-A"><b>A</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-B"><b>B</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-C"><b>C</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-D"><b>D</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-E"><b>E</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-F"><b>F</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-G"><b>G</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-H"><b>H</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-I"><b>I</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-J"><b>J</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-L"><b>L</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-M"><b>M</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-N"><b>N</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-P"><b>P</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-R"><b>R</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-S"><b>S</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-U"><b>U</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-W"><b>W</b></a> |
| |
| </td></tr></table> |
| <table class="index-opcode" border="0"> |
| <tr><td></td><th align="left">Index Entry</th><td> </td><th align="left"> Section</th></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-A">A</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-after">after</a>:</td><td> </td><td valign="top"><a href="#Character_002dClass-Opcodes">Character-Class Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-always">always</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-B">B</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-before">before</a>:</td><td> </td><td valign="top"><a href="#Character_002dClass-Opcodes">Character-Class Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begbold">begbold</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begcaps">begcaps</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begcomp">begcomp</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begital">begital</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begmidword">begmidword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begnum">begnum</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begunder">begunder</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-begword">begword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-boldsign">boldsign</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-C">C</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-capsign">capsign</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-capsnocont">capsnocont</a>:</td><td> </td><td valign="top"><a href="#Special-Processing-Opcodes">Special Processing Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-class">class</a>:</td><td> </td><td valign="top"><a href="#Character_002dClass-Opcodes">Character-Class Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-comp6">comp6</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-compbrl">compbrl</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-context">context</a>:</td><td> </td><td valign="top"><a href="#The-Context-and-Multipass-Opcodes">The Context and Multipass Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-contraction">contraction</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-correct">correct</a>:</td><td> </td><td valign="top"><a href="#The-correct-Opcode">The correct Opcode</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-D">D</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-decpoint">decpoint</a>:</td><td> </td><td valign="top"><a href="#Special-Symbol-Opcodes">Special Symbol Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-digit">digit</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-display">display</a>:</td><td> </td><td valign="top"><a href="#Miscellaneous-Opcodes">Miscellaneous Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-E">E</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endbold">endbold</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endcaps">endcaps</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endcomp">endcomp</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endital">endital</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endnum">endnum</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endunder">endunder</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-endword">endword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-exactdots">exactdots</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-F">F</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstletterbold">firstletterbold</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstletterital">firstletterital</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstletterunder">firstletterunder</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstwordbold">firstwordbold</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstwordital">firstwordital</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-firstwordunder">firstwordunder</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-G">G</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-grouping">grouping</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-H">H</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-hyphen">hyphen</a>:</td><td> </td><td valign="top"><a href="#Special-Symbol-Opcodes">Special Symbol Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-I">I</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-include">include</a>:</td><td> </td><td valign="top"><a href="#Miscellaneous-Opcodes">Miscellaneous Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-italsign">italsign</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-J">J</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-joinnum">joinnum</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-joinword">joinword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-L">L</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-largesign">largesign</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastletterbold">lastletterbold</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastletterital">lastletterital</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastletterunder">lastletterunder</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastwordboldafter">lastwordboldafter</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastwordboldbefore">lastwordboldbefore</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastworditalafter">lastworditalafter</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastworditalbefore">lastworditalbefore</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastwordunderafter">lastwordunderafter</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lastwordunderbefore">lastwordunderbefore</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lenboldphrase">lenboldphrase</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lenitalphrase">lenitalphrase</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lenunderphrase">lenunderphrase</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-letsign">letsign</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-letter">letter</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-litdigit">litdigit</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-literal">literal</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-locale">locale</a>:</td><td> </td><td valign="top"><a href="#Miscellaneous-Opcodes">Miscellaneous Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lowercase">lowercase</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lowword">lowword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-M">M</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-math">math</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-midendword">midendword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-midnum">midnum</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-midword">midword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-multind">multind</a>:</td><td> </td><td valign="top"><a href="#Miscellaneous-Opcodes">Miscellaneous Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-N">N</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-noback">noback</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-nocont">nocont</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-nocross">nocross</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-nofor">nofor</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-noletsign">noletsign</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-noletsignafter">noletsignafter</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-noletsignbefore">noletsignbefore</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-numsign">numsign</a>:</td><td> </td><td valign="top"><a href="#Braille-Indicator-Opcodes">Braille Indicator Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-P">P</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-partword">partword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-pass2">pass2</a>:</td><td> </td><td valign="top"><a href="#The-Context-and-Multipass-Opcodes">The Context and Multipass Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-pass3">pass3</a>:</td><td> </td><td valign="top"><a href="#The-Context-and-Multipass-Opcodes">The Context and Multipass Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-pass4">pass4</a>:</td><td> </td><td valign="top"><a href="#The-Context-and-Multipass-Opcodes">The Context and Multipass Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-postpunc">postpunc</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-prepunc">prepunc</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-prfword">prfword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-punctuation">punctuation</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-R">R</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-repeated">repeated</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-replace">replace</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-repword">repword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-S">S</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-sign">sign</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-singleletterbold">singleletterbold</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-singleletterital">singleletterital</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-singleletterunder">singleletterunder</a>:</td><td> </td><td valign="top"><a href="#Emphasis-Opcodes">Emphasis Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-space">space</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-sufword">sufword</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-swapcc">swapcc</a>:</td><td> </td><td valign="top"><a href="#Swap-Opcodes">Swap Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-swapcd">swapcd</a>:</td><td> </td><td valign="top"><a href="#Swap-Opcodes">Swap Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-swapdd">swapdd</a>:</td><td> </td><td valign="top"><a href="#Swap-Opcodes">Swap Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-syllable">syllable</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-U">U</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-undefined">undefined</a>:</td><td> </td><td valign="top"><a href="#Miscellaneous-Opcodes">Miscellaneous Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-undersign">undersign</a>:</td><td> </td><td valign="top"><a href="#Deprecated-Opcodes">Deprecated Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-uplow">uplow</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-uppercase">uppercase</a>:</td><td> </td><td valign="top"><a href="#Character_002dDefinition-Opcodes">Character-Definition Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Opcode-Index_opcode_letter-W">W</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-word">word</a>:</td><td> </td><td valign="top"><a href="#Translation-Opcodes">Translation Opcodes</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| </table> |
| <table><tr><th valign="top">Jump to: </th><td><a class="summary-letter" href="#Opcode-Index_opcode_letter-A"><b>A</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-B"><b>B</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-C"><b>C</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-D"><b>D</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-E"><b>E</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-F"><b>F</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-G"><b>G</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-H"><b>H</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-I"><b>I</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-J"><b>J</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-L"><b>L</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-M"><b>M</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-N"><b>N</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-P"><b>P</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-R"><b>R</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-S"><b>S</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-U"><b>U</b></a> |
| |
| <a class="summary-letter" href="#Opcode-Index_opcode_letter-W"><b>W</b></a> |
| |
| </td></tr></table> |
| |
| <hr> |
| <a name="Function-Index"></a> |
| <a name="Function-Index-1"></a> |
| <h2 class="unnumbered">Function Index</h2> |
| <table><tr><th valign="top">Jump to: </th><td><a class="summary-letter" href="#Function-Index_fn_letter-L"><b>L</b></a> |
| |
| </td></tr></table> |
| <table class="index-fn" border="0"> |
| <tr><td></td><th align="left">Index Entry</th><td> </td><th align="left"> Section</th></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Function-Index_fn_letter-L">L</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fbackTranslate"><code>lou_backTranslate</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fbackTranslate">lou_backTranslate</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fbackTranslateString"><code>lou_backTranslateString</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fbackTranslateString">lou_backTranslateString</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fcharToDots"><code>lou_charToDots</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fcharToDots">lou_charToDots</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fcompileString"><code>lou_compileString</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fcompileString">lou_compileString</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fdotsToChar"><code>lou_dotsToChar</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fdotsToChar">lou_dotsToChar</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005ffree"><code>lou_free</code></a>:</td><td> </td><td valign="top"><a href="#lou_005ffree">lou_free</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fgetDataPath"><code>lou_getDataPath</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fgetDataPath">lou_getDataPath</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fgetTable"><code>lou_getTable</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fgetTable">lou_getTable</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fhyphenate"><code>lou_hyphenate</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fhyphenate">lou_hyphenate</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005flogEnd"><code>lou_logEnd</code></a>:</td><td> </td><td valign="top"><a href="#lou_005flogEnd">lou_logEnd</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005flogFile"><code>lou_logFile</code></a>:</td><td> </td><td valign="top"><a href="#lou_005flogFile">lou_logFile</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005flogPrint"><code>lou_logPrint</code></a>:</td><td> </td><td valign="top"><a href="#lou_005flogPrint">lou_logPrint</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005freadCharFromFile"><code>lou_readCharFromFile</code></a>:</td><td> </td><td valign="top"><a href="#lou_005freadCharFromFile">lou_readCharFromFile</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fsetDataPath"><code>lou_setDataPath</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fsetDataPath">lou_setDataPath</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005ftranslate"><code>lou_translate</code></a>:</td><td> </td><td valign="top"><a href="#lou_005ftranslate">lou_translate</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005ftranslateString"><code>lou_translateString</code></a>:</td><td> </td><td valign="top"><a href="#lou_005ftranslateString">lou_translateString</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fversion"><code>lou_version</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fversion">lou_version</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| </table> |
| <table><tr><th valign="top">Jump to: </th><td><a class="summary-letter" href="#Function-Index_fn_letter-L"><b>L</b></a> |
| |
| </td></tr></table> |
| |
| <hr> |
| <a name="Program-Index"></a> |
| <a name="Program-Index-1"></a> |
| <h2 class="unnumbered">Program Index</h2> |
| <table><tr><th valign="top">Jump to: </th><td><a class="summary-letter" href="#Program-Index_pg_letter-L"><b>L</b></a> |
| |
| </td></tr></table> |
| <table class="index-pg" border="0"> |
| <tr><td></td><th align="left">Index Entry</th><td> </td><th align="left"> Section</th></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| <tr><th><a name="Program-Index_pg_letter-L">L</a></th><td></td><td></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fallround"><code>lou_allround</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fallround">lou_allround</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fcheckhyphens"><code>lou_checkhyphens</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fcheckhyphens">lou_checkhyphens</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fchecktable"><code>lou_checktable</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fchecktable">lou_checktable</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005fdebug"><code>lou_debug</code></a>:</td><td> </td><td valign="top"><a href="#lou_005fdebug">lou_debug</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005ftrace"><code>lou_trace</code></a>:</td><td> </td><td valign="top"><a href="#lou_005ftrace">lou_trace</a></td></tr> |
| <tr><td></td><td valign="top"><a href="#index-lou_005ftranslate-1"><code>lou_translate</code></a>:</td><td> </td><td valign="top"><a href="#lou_005ftranslate-_0028program_0029">lou_translate (program)</a></td></tr> |
| <tr><td colspan="4"> <hr></td></tr> |
| </table> |