Hi Felipe,
There is a VALID_CHARS_FILE setting that you can change in default-params.
The default is admin/valid_chars.en, which contains the string
ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
You need to add the characters you want to this file, or create a new
one and tell default-params to look there.
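
As a rough illustration of the second option, here is a minimal Python sketch that builds a new valid-characters file from the default string plus some accented letters. The output file name, the choice of extra characters, the Latin-1 encoding, and the trailing newline are all assumptions for illustration; match them to your corpus and to the format of the existing admin/valid_chars.en.

```python
# Sketch: build an extended valid-chars file for Spanish text.
# Assumptions (not from the original message): the output file name,
# the extra characters chosen, Latin-1 as the corpus encoding, and
# the trailing newline at the end of the file.

DEFAULT_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~"
EXTRA_CHARS = "ÁÉÍÓÚÜÑáéíóúüñ"  # hypothetical choice for Spanish


def build_valid_chars(default=DEFAULT_CHARS, extra=EXTRA_CHARS):
    """Return the default characters followed by any extras not already present."""
    seen = set(default)
    result = default
    for ch in extra:
        if ch not in seen:
            result += ch
            seen.add(ch)
    return result


def write_valid_chars(path, chars, encoding="latin-1"):
    """Write the character string to path using the given encoding."""
    with open(path, "w", encoding=encoding) as f:
        f.write(chars + "\n")


if __name__ == "__main__":
    # "valid_chars.es" is a hypothetical file name.
    write_valid_chars("valid_chars.es", build_valid_chars())
```

After writing the file, VALID_CHARS_FILE in default-params would need to point at it. The file should be in the same encoding as the corpus, otherwise the accented letters will not match.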
Hope this helps.
Best wishes,
Dominic
On Jul 20, 2006, at 6:37 AM, Felipe Sánchez Martínez wrote:
> Hi all,
>
> I have just started working around infomap-nlp, and I found out that
> there is some kind of problem with non-ascii characters. When building a
> single model from a single file, the wordlist file containing the whole
> corpus with only one word per line shows things like the following:
>
> ...
> uni
> n
> ...
>
>
> while it should be just 'unión'. The system is splitting single words
> at non-ascii characters like 'ó'.
>
> How can I solve this problem?
>
> Thank you very much in advance.
>
> Cheers
> --
> Felipe Sánchez Martínez
> -------------------------------------------------------------------
> Departamento de Lenguajes E-mail: fsa...@dl...
> y Sistemas Informáticos Homepage: www.dlsi.ua.es/~fsanchez
> Universidad de Alicante Fax: +34 965 90 93 26
> E-03071 Alicante (Spain) Phone: +34 965 90 34 00, ext: 2038
>
> _______________________________________________
> infomap-nlp-users mailing list
> inf...@li...
> https://lists.sourceforge.net/lists/listinfo/infomap-nlp-users
>