Re: Spell check, Dictionary creation

On Tuesday 18 August 2009 at 14:53 +0200, Silli, L. H. wrote:
> (1) Bug: There are only 7 public dictionaries for Amaya [1]. But there 
> are no Norwegian or Russian dictionaries, for example. As a workaround, 
> I would like to create a personal dictionary - or "source dictionary" - 
> according to the rules Irene described in a previous message [2] and 
> then to just rely on the personal dictionary. I would use an ispell or 
> OpenOffice dictionary as basis.

The currently available dictionaries are not complete either.
It is possible to have a personal dictionary for each language (for
example, Eperso for English).

> 
> However, the personal dictionary only works together with the 7 public 
> dictionaries, it seems. So if I have <body xml:lang="no">, then Amaya 
> does not use the personal dictionary. So this workaround is actually 
> closed, it seems. (I could fake and use xml:lang="en", I guess ... but 
> that seems meaningless.)


Amaya is able to work with several dictionaries. It selects the current
set of dictionaries according to the language of the current element.
This behaviour is the result of earlier research work.

Within a <p xml:lang="en">, it will use Eprinc.dic and Eperso.dic
(or the .DIC dictionaries if the .dic ones are not available).
Eprinc.dic is the main compressed English dictionary; Eprinc.DIC is the
main textual English dictionary. Eperso.dic (or Eperso.DIC) records
personal English words. Within a <p xml:lang="fr">, it will use
Fprinc.dic and Fperso.dic.
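
To illustrate the selection described above, here is a minimal Python
sketch (not Amaya's actual code). The function name, the PREFIXES table
and the directory argument are hypothetical; only the "E" and "F"
prefixes and the .dic-before-.DIC preference come from this message.

```python
from pathlib import Path

# Prefixes mentioned in this message; other languages are not covered.
PREFIXES = {"en": "E", "fr": "F"}


def dictionaries_for(lang, dict_dir):
    """Return the main and personal dictionary files found for a language."""
    prefix = PREFIXES.get(lang)
    if prefix is None:
        return []  # no known dictionary set for this language
    found = []
    for stem in (f"{prefix}princ", f"{prefix}perso"):
        # Prefer the compressed .dic file, fall back to the textual .DIC.
        for ext in (".dic", ".DIC"):
            candidate = Path(dict_dir) / (stem + ext)
            if candidate.is_file():
                found.append(candidate)
                break
    return found
```

With this sketch, xml:lang="no" yields an empty list, which matches the
behaviour reported in the quoted message.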

The format of a .DIC dictionary is very simple:
 o the first line gives the number of words and the number of
   characters (including the newline characters that follow the words).
 o each following line lists one word; the words are ordered by
   length, then alphabetically.
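
As an illustration, a word list in this layout could be produced with a
short Python sketch like the one below. It rests on assumptions not
confirmed here: the two header numbers are separated by one space, each
word is counted together with its trailing newline, and "alphabetic
order" is plain code-point order. The function name format_dic is
hypothetical.

```python
def format_dic(words):
    """Return the text of a .DIC-style word list for the given words."""
    # Drop blanks and duplicates, then sort by length, then alphabetically.
    unique = {w.strip() for w in words if w.strip()}
    ordered = sorted(unique, key=lambda w: (len(w), w))
    # Assumption: each word is counted together with its trailing newline.
    n_chars = sum(len(w) + 1 for w in ordered)
    # Assumption: the two header numbers are separated by one space.
    header = f"{len(ordered)} {n_chars}"
    return "\n".join([header, *ordered]) + "\n"


if __name__ == "__main__":
    print(format_dic(["hei", "og", "takk", "ja"]), end="")
```

The character encoding expected by Amaya is not stated in this message,
so the sketch works on strings and leaves the choice of encoding to
whoever writes the file.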

> Could this be changed so that the personal dictionary works with any 
> language? Or, at the very best, I would like to be able to create 
> (personal) dictionaries that are language specific.

When you ask Amaya to learn a new word, it inserts it in the
$AmayaHome/dictionary.DCT dictionary. That dictionary has exactly the
same format as .DIC dictionaries, and you can use that facility to
extend or create a dictionary.
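
Before sending a hand-edited dictionary.DCT or .DIC file to someone, it
may help to sanity-check it. The Python sketch below is one possible
check, under the same reading of the format as above (header with two
space-separated integers, character count including one newline per
word, code-point alphabetical order); check_dic is a hypothetical name,
not an Amaya tool.

```python
def check_dic(text):
    """Return a list of problems found in .DIC/.DCT-style text."""
    lines = text.splitlines()
    if not lines:
        return ["file is empty"]
    try:
        # The first line should hold exactly two integers.
        n_words, n_chars = (int(field) for field in lines[0].split())
    except ValueError:
        return ["first line is not two integers"]
    words = lines[1:]
    problems = []
    if len(words) != n_words:
        problems.append(f"header says {n_words} words, found {len(words)}")
    # Assumption: each word is counted together with its trailing newline.
    actual_chars = sum(len(w) + 1 for w in words)
    if actual_chars != n_chars:
        problems.append(
            f"header says {n_chars} characters, found {actual_chars}"
        )
    if words != sorted(words, key=lambda w: (len(w), w)):
        problems.append("words are not ordered by length, then alphabetically")
    return problems
```

An empty result only means the file is consistent with this reading of
the format; it does not prove that Amaya or diccompress will accept it.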

Suppose I want to extend the French dictionary:
 1) I download extenddic.tgz to get the source of the French
    dictionary: extenddic/dic/Fprinc.DIC
 2) I copy Fprinc.DIC to $AmayaHome/dictionary.DCT and I add
    some missing words with the learn function of Amaya.
 3) I use the diccompress program to generate a new compressed
    Fprinc.dic and install this new version in the Amaya/dicopar
    directory.

> 
> (2) How to create a compiled dictionary?
> 
> Based on the mentioned message from Irene [2], I downloaded the 
> "extenddic.tgz" package[3], and unpacked it. I understood that the 
> included compiler needed some extra include files, and so I also 
> downloaded the full source of Amaya 11.2, and dropped the content of the 
> extenddic package (the files "traiter.c" et al.) into the folder 
> "amaya-fullsrc-11.2/Amaya/thotlib/internals/h", as this seemed to be 
> where the include files was located  - I have no clue where else to put 
> them.
> 
> Then I ran
> 
>     ./compiler
> 
> from the commandline. Only to get a lot of error messages about other 
> lacking files as well. I located those files and placed a copy in the 
> same folder, only to get other error messages (such as 
> «/usr/lib/gcc/powerpc-apple-darwin9/4.0.1/include/varargs.h:4:2: error: 
> #error "GCC no longer implements <varargs.h>."» and «appstruct.h:22: 
> error: syntax error before ‘Proc’»

I have generated a new version of extenddic.tgz and I give some
instructions at http://www.w3.org/Amaya/User/Overview.html

I suggest you experiment with $AmayaHome/dictionary.DCT first.
If it works for you and you are not able to use diccompress, please send
me your dictionary and I will integrate it.

> 
> Perhaps there is a more complete (and perhaps updated) explanation of 
> how to create a dictionary? (I noticed that the "extenddic" package is 
> from March 2000.)  The basic info I need is a) where to place the 
> 'extenddic' files and b) the syntax of the command line command ...
> 
> If not, is there any way I could send a raw dictionary (and the personal 
> dictionary is one such example, I think) to someone so they can compile 
> and include the dictionary in Amaya?
> 
> [1] http://www.w3.org/Amaya/User/SourceDist.html
> [2] http://lists.w3.org/Archives/Public/www-amaya/2001AprJun/0050.html
> [3] http://www.w3.org/Amaya/Distribution/extenddic.tgz
-- 
Irene Vatton <Irene.Vatton@inria.fr>
INRIA

Received on Wednesday, 19 August 2009 13:40:23 UTC