Indexing UTF French characters

the spirit
2017-12-14
2017-12-18
  • the spirit

    the spirit - 2017-12-14

    Hi,

    In the file src/Indrilang.g we can find the line "charVocabulary = '\u0001'..'\u00ff'; // UTF-8 format". I take this line to mean that Indri can handle French because, for example, the character "è" is U+00E8.

    So my question: is Indri able to index French characters like é, è, à...?

    Regards.

     
  • David Fisher

    David Fisher - 2017-12-18

    If your input data is UTF-8 encoded, indri can index it. If your queries are UTF-8 encoded, indri can use them for search.
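
    As a rough illustration (not part of Indri itself), the Python sketch below shows how "è" (U+00E8) is represented in UTF-8 and how one might check, before indexing, that input bytes really are UTF-8. The helper names utf8_bytes, is_valid_utf8 and latin1_to_utf8 are hypothetical, and the assumption that legacy French data is in Latin-1 (ISO-8859-1) is only an example.

```python
def utf8_bytes(text: str) -> bytes:
    """Return the UTF-8 encoding of a string."""
    return text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed (hypothetically) to be Latin-1 as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    e_grave = "\u00e8"  # the character "è"

    # The code point is U+00E8, but UTF-8 stores it as two bytes.
    print(hex(ord(e_grave)))      # 0xe8
    print(utf8_bytes(e_grave))    # b'\xc3\xa8'

    # A lone 0xE8 byte (Latin-1 "è") is not valid UTF-8.
    print(is_valid_utf8(b"\xe8"))            # False
    print(is_valid_utf8(b"\xc3\xa8"))        # True

    # Converting the Latin-1 byte gives the UTF-8 form.
    print(latin1_to_utf8(b"\xe8"))           # b'\xc3\xa8'
```

    If a check like this fails on your documents or queries, they are probably in another encoding and would need converting to UTF-8 first.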

     
