
COES. General Information and Distribution

Santiago Rodríguez and Jesús Carretero
November 2010



1. What Is COES and Why Do I Want It?


This set of data files implements a Spanish (castellano) dictionary with approximately 54,000 roots and their derived forms. The number of roots increases every day, but new versions of COES are not released until they have been tested for correctness.

The current distribution of COES includes a speller for the Spanish language.

The COES tools must be used together with the international ispell program (version 3.1.13 or later) or with aspell.

Several releases of COES have been publicly distributed, on the following schedule:

Version Date
V 1.1 December 1994
V 1.2 January 1995
V 1.3 February 1995
V 1.4 April 1995
V 1.5 November 1996
V 1.6 April 1999
V 1.7 June 2001
V 1.8 March 2005
V 1.9 November 2005
V 1.10 May 2008
V 1.11 November 2010
COES PROfessional NON GPL version


2. Where can I get COES?


The latest COES release can be obtained from http://www.datsi.fi.upm.es/~coes/espa~nol-1.11.tar.gz. This package contains the affix file and the Spanish word list.

If you use aspell, two installable packages are available:

3. What are Ispell/Aspell and how to get them?


A special entry on ispell general information.

Aspell is another spell checker and more information can be accessed here.

4. How to install COES?


If you want to run the Spanish dictionary, you have to undefine the NO8BIT macro in the local.h configuration file of ispell.
The distribution is shipped as the file espa~nol-X.X.tar.gz, where X.X is the version number. To extract a file ending in `.tar.gz', use the command
gzip -d < espa~nol-X.X.tar.gz | tar xf -
where espa~nol-X.X.tar.gz is the name of the file.
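The extraction step can be wrapped in a small shell function. This is only a sketch: the function name extract_coes and its optional target directory argument are illustrative and are not part of the COES distribution.

```shell
# Hypothetical helper (not part of COES): unpack a .tar.gz archive with
# the same gzip | tar pipeline, into an optional target directory.
extract_coes() {
    archive=$1
    dest=${2:-.}
    [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # The archive is opened before the subshell changes directory,
    # so a relative archive path still works.
    gzip -d < "$archive" | (cd "$dest" && tar xf -)
}

# Usage (X.X is the version number, coes is an example directory):
#   extract_coes espa~nol-X.X.tar.gz coes
```

Extracting into a separate directory keeps the word lists and the Makefile apart from other files.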
The archive expands to the following files:
  • espa~nol.aff: Affix file.
  • espa~nol.words: Contains a list of words that appear in the official Spanish dictionary (Diccionario de la Real Academia Española de la Lengua, 21st edition).
  • espa~nol.nofl: Contains a list of words that do not appear in the official dictionary but are used in everyday Spanish and are "correct" words.
  • espa~nol.comp: Contains a list of words that do not appear in the official dictionary but are used in computer-related texts.
  • antiguas.words: Contains a list of words that appear in the official Spanish dictionary but are old and no longer in use.
  • espa~nol.words+: Contains the expanded list of words generated from the espa~nol.words and espa~nol.comp word files.
  • e~nes: Script that replaces 'n and 'N with ~n and ~N in espa~nol.aff, espa~nol.words and espa~nol.words+. Run it only if you use the second way of writing this letter. The script uses the sed utility and has been checked with GNU sed version 2.05. To run it, make sure that GNU sed is installed and type:
    make e~ne

  • Makefile: Makefile for building the hash file (espa~nol.hash) from the affix file and the espa~nol.words file.
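After unpacking, it can be useful to confirm that every file in the list above is present. The following is a minimal sketch. The function name missing_coes_files is hypothetical, and the file names are taken from the list above.

```shell
# Hypothetical helper (not part of COES): print the name of every
# distribution file that is missing from the given directory.
missing_coes_files() {
    dir=${1:-.}
    for f in 'espa~nol.aff' 'espa~nol.words' 'espa~nol.nofl' \
             'espa~nol.comp' 'antiguas.words' 'espa~nol.words+' \
             'e~nes' 'Makefile'
    do
        [ -e "$dir/$f" ] || echo "$f"
    done
}

# Usage: no output means that nothing is missing.
#   missing_coes_files .
```

The file names are quoted so that the shell never treats the ~ character specially.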

5. How to generate the dictionaries?


First, decide how to represent the letter ñ (eñe). There are two options: 'n 'N and ~n ~N. If you choose the second option, you must run the e~nes script. The script uses the sed utility and has been checked with GNU sed version 2.05. To run it, make sure that GNU sed is installed and type:
make e~ne
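To illustrate the kind of substitution involved, here is a sed filter that rewrites one coding as the other. It is a sketch only, not the e~nes script itself, which may work differently. The function name quote_to_tilde is hypothetical.

```shell
# Illustrative only; the real e~nes script may differ.
# Rewrite the 'n / 'N coding of the letter as ~n / ~N on standard input.
quote_to_tilde() {
    sed -e "s/'n/~n/g" -e "s/'N/~N/g"
}

# Usage: preview the change without touching the original file.
#   quote_to_tilde < espa~nol.words | head
```

The filter replaces every occurrence of the two-character code, so it is only suitable for files that use the quote exclusively as an accent marker.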

To generate the Spanish dictionary (espa~nol.hash file) type:
make

Building the hash file this way needs about 50 MB of paging space and 100 MB of temporary disk space. Make sure that the tmp partition (usually /usr/tmp) has enough free space. If it does not, set the TMPDIR environment variable to a path where 100 MB of temporary disk storage is available.
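The free-space check can be scripted before running make. The sketch below assumes a POSIX df. The function name has_free_mb and the fallback path /var/tmp/coes are illustrative only.

```shell
# Hypothetical helper: succeed if directory $1 has at least $2 MB free.
has_free_mb() {
    dir=$1
    need_mb=$2
    # df -P prints one POSIX-format line per file system; with -k the
    # fourth column is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((need_mb * 1024)) ]
}

# Usage: fall back to another directory when /usr/tmp is too small.
# /var/tmp/coes is only an example path.
#   has_free_mb /usr/tmp 100 || { TMPDIR=/var/tmp/coes; export TMPDIR; }
#   make
```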
To create espa~nol.hash from the expanded word list (espa~nol.words+) instead, type:
make build

This method needs much less temporary space.
The size of the Spanish dictionary (espa~nol.hash) is 4 MB. If yours is much bigger, the cause is probably the operating system's sort command (Solaris 2.7 has this problem). In that case we recommend installing the GNU textutils package and making sure that the sort command you use is the textutils one.
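A quick way to compare the generated file with the expected size is to print its size in megabytes. This is a minimal sketch, and the function name size_mb is hypothetical.

```shell
# Hypothetical helper: print the size of a file in megabytes, rounded up.
size_mb() {
    bytes=$(wc -c < "$1") || return 1
    echo $(( (bytes + 1048575) / 1048576 ))
}

# Usage: a result far above 4 suggests the sort problem described above.
#   size_mb espa~nol.hash
```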

6. Dictionary Installation?


To install the hash file, become root and type
make install

7. Which character maps are supported by COES?


Six different formats are supported by COES.
Default format: The accented characters are coded as follows:
Code Char
' a á
' e é
' i í
' o ó
' u ú
' n ñ
" u ü
' A Á
' E É
' I Í
' O Ó
' U Ú
' N Ñ
" U Ü

TeX format: The accented characters are coded as follows:
Code Char
\' a á
\' e é
\' {\i} í
\' o ó
\' u ú
\' n ñ
\" u ü
\' A Á
\' E É
\' {\I} Í
\' O Ó
\' U Ú
\' N Ñ
\" U Ü

plainTeX format: The accented characters are coded as follows:
Code Char
\' {a} á
\' {e} é
\' {\i} í
\' {o} ó
\' {u} ú
\' {n} ñ
\" {u} ü
\' {A} Á
\' {E} É
\' {\I} Í
\' {O} Ó
\' {U} Ú
\' {N} Ñ
\" {U} Ü

html format: The accented characters are coded as follows:
Code Char
&aacute; á
&eacute; é
&iacute; í
&oacute; ó
&uacute; ú
&Aacute; Á
&Eacute; É
&Iacute; Í
&Oacute; Ó
&Uacute; Ú
&ntilde; ñ
&Ntilde; Ñ
&uuml; ü
&Uuml; Ü
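The html table above can be read as a plain substitution list. As an illustration, the following sed filter turns each entity into the character it stands for. It is a sketch that is not part of COES. The function name html_decode_es is hypothetical, and the output is UTF-8 only if the script itself is saved as UTF-8.

```shell
# Illustrative filter (not part of COES): replace the HTML entities
# from the table above with the characters they stand for.
html_decode_es() {
    sed -e 's/&aacute;/á/g' -e 's/&eacute;/é/g' -e 's/&iacute;/í/g' \
        -e 's/&oacute;/ó/g' -e 's/&uacute;/ú/g' \
        -e 's/&Aacute;/Á/g' -e 's/&Eacute;/É/g' -e 's/&Iacute;/Í/g' \
        -e 's/&Oacute;/Ó/g' -e 's/&Uacute;/Ú/g' \
        -e 's/&ntilde;/ñ/g' -e 's/&Ntilde;/Ñ/g' \
        -e 's/&uuml;/ü/g'   -e 's/&Uuml;/Ü/g'
}

# Usage: pagina.html is an example file name.
#   html_decode_es < pagina.html
```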

latin1 format: The accented characters are coded as specified in the ISO 8859-1 character set.
msdos format: The accented characters are coded as specified in the MS-DOS extended ASCII code.
To run ispell with one of these formats, type:
ispell -T <formatter> -d espa~nol <file>
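When checking many files, the formatter can be chosen from the file extension. The sketch below is an assumption-laden example. The formatter names tex, html and latin1 are guessed spellings of the formats listed above, so check the affix file for the names it actually declares. The function name pick_formatter is hypothetical.

```shell
# Hypothetical helper: guess a formatter name from the file extension.
# The names tex, html and latin1 are assumed spellings of the formats
# listed above; check the affix file for the names it really declares.
pick_formatter() {
    case $1 in
        *.tex)         echo tex ;;
        *.html|*.htm)  echo html ;;
        *)             echo latin1 ;;
    esac
}

# Usage (carta.tex is an example file name):
#   ispell -T "$(pick_formatter carta.tex)" -d espa~nol carta.tex
```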

8. Is There a MSDOS dictionary?


The espa~nol.hash file is available for MS-DOS users at:

http://www.datsi.fi.upm.es/~coes/espa~nol.zip

9. Where to send bug reports?


Note that the affix list and the word list are still under development. If you find words that are missing from the word list, or words that should not be in it, please send a message to
espanol-bugs@datsi.fi.upm.es.
It is very important that you send us the words that are missing from the dictionary. An easy way to do this is to send the file .ispell_espa~nol, stored in each user's home directory, to the e-mail address above.
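Before sending the file, it may help to review it as a clean list. The following sketch assumes that the personal dictionary holds one word per line. The function name personal_words is hypothetical.

```shell
# Hypothetical helper: print the words of a personal dictionary,
# one per line, sorted and without duplicates or blank lines.
# Assumes the file holds one word per line.
personal_words() {
    file=${1:-"$HOME/.ispell_espa~nol"}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    grep -v '^[[:space:]]*$' "$file" | LC_ALL=C sort -u
}

# Usage: with no argument, the file in the home directory is read.
#   personal_words > palabras.txt
```

The resulting list can then be attached to the message by whatever mail program is in use.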

10. Who developed COES?

COES was developed at the Universidad Politécnica de Madrid. Prof. Jesús Carretero has since moved to the Universidad Carlos III de Madrid and continues to collaborate on the project. The postal addresses of both authors follow.
Santiago Rodríguez
Departamento de Arquitectura
y Tecnología de Sistemas Informáticos (DATSI)
Facultad de Informática.
Universidad Politécnica de Madrid
Campus de Montegancedo s/n.
28660 Boadilla del Monte, Madrid, España.
Email: srodri@fi.upm.es
Jesús Carretero
Universidad Carlos III de Madrid
Despacho 2.2.A.25
Edificio Sabatini
Campus de Leganés
Avda de la Universidad, 30
28911, Leganés, Madrid, España
Email: jesus.carretero@uc3m.es

11. Copyright


Copyright (c) 1994 1995 1996 1999 2001 2005 2008 2010 Santiago Rodríguez and Jesús Carretero

Two kinds of license are available for this package:
  • GNU. This package is distributed as free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation. This program is distributed in the hope that it will be useful but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
  • PRO. For applications where the GNU license is not usable, please contact the authors.
Updated on 22 November 2010 by Santiago Rodríguez and Jesús Carretero
Web made by Miguel Carretero