#1126 Search fails when using non-ascii characters

Milestone: next release
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2013-03-29
Created: 2013-01-08
Creator: Miguel
Private: No

Hi,

Perhaps this is a known bug or "feature", but I have not found any information about it. I just want to report that searching the database for a word that includes non-ASCII characters fails to produce any result.
Example: Fåhraeus
In my database that should find at least 4 publications, but it finds none.
This is true whether I search with the word on its own or restrict the search to a particular field (in this case "author").

I'm on JabRef version 2.9.1 on OS X (10.6.8).

Cheers,

Miguel

Discussion

  • Oliver Kopp
    2013-01-08

    Please provide an example entry. I assume that the characters are LaTeX
    encoded.

     
  • Miguel
    2013-01-08

    Thank you for your prompt answer. It has already helped narrow down the problem.
    To summarize:

    1/ I have entries that were imported from PubMed by the program I previously used to manage my database (BibDesk). In these entries the characters are LaTeX-encoded. I am attaching one of them (Apcher2012aa). These entries ARE NOT FOUND when I issue any of these searches:

    Fåhraeus
    Fahraeus
    F{\aa}hraeus (this is an 'illegal search expression')
    author=Fåhraeus
    author=Fahraeus
    author=F{\aa}hraeus (again, this is an 'illegal search expression')

    2/ Then I have entries that I imported from PubMed directly in JabRef (versions 2.8.1 and 2.9; BTW, 2.9 is my current version, not 2.9.1, sorry). In these entries the characters appear in the database as non-ASCII ones, that is, they are not LaTeX-encoded. I am also attaching one of them (Candeias2006). These entries ARE CORRECTLY FOUND when using any of these searches:

    Fåhraeus
    author=Fåhraeus

    Conversely, they ARE NOT FOUND if I issue:

    Fahraeus
    author=Fahraeus

    In my JabRef settings I have Preferences:

    Language = English
    Default encoding = UTF8

    Please let me know if anything is unclear.
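The results listed above are what a plain substring comparison would give, because the three spellings are different character sequences. The following is a minimal Python sketch of that mismatch, not JabRef's actual code. The function names `latex_to_unicode`, `naive_match` and `decoded_match`, and the tiny escape table, are assumptions made for illustration. It also shows how decoding LaTeX escapes before comparing would let a Unicode query match both kinds of entry.

```python
# Hypothetical illustration; this is not how JabRef implements search.

# Tiny, incomplete table of LaTeX escapes (assumption: only a few common forms).
LATEX_TO_UNICODE = {
    r"{\aa}": "å",
    r"{\'e}": "é",
    r"{\`e}": "è",
}


def latex_to_unicode(text: str) -> str:
    """Replace the known LaTeX escapes in `text` with Unicode characters."""
    for escape, char in LATEX_TO_UNICODE.items():
        text = text.replace(escape, char)
    return text


def naive_match(query: str, field: str) -> bool:
    """Case-insensitive substring test with no decoding at all."""
    return query.lower() in field.lower()


def decoded_match(query: str, field: str) -> bool:
    """Decode LaTeX escapes on both sides, then run the same substring test."""
    return naive_match(latex_to_unicode(query), latex_to_unicode(field))


bibdesk_author = r"F{\aa}hraeus"  # LaTeX-encoded, like the BibDesk entries
jabref_author = "Fåhraeus"        # stored as Unicode, like the JabRef entries

print(naive_match("Fåhraeus", bibdesk_author))    # False: the sequences differ
print(decoded_match("Fåhraeus", bibdesk_author))  # True once escapes are decoded
```

Decoding alone would not make the unaccented query `Fahraeus` match either entry; that needs accent folding as a separate step.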

     
  • Miguel
    2013-01-08

    It did not accept my attachment, I'm trying again.

     
  • Miguel
    2013-01-23

    Hi,

    Just wanted to know if someone could reproduce the problem and/or suggest the best way to work around it.

    Cheers.

     
  • Igor
    2013-03-02

    What can be done to address the issue?
    I can think of 3 possible options:
    (i) allow searching with escaped chars

    (ii) 'remove' the accent and search for all words having an accented variant of that char
    For example:
    if I search: author=José
    JabRef searches: Jose, José, Josè, Jos{\'e}...?

    (iii) ONLY when the search uses an unaccented char, JabRef searches all possible ways to accent that char
    Example:
    if I search: author=José
    JabRef searches: José

    if I search: author=Jose
    JabRef searches: Jose, José, Josè, Jos{\'e}...?
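Options (ii) and (iii) can be sketched with Unicode normalization. This is an illustration in Python, not JabRef code. The function names `strip_accents`, `matches_option_ii` and `matches_option_iii` are hypothetical, and the sketch assumes any LaTeX-escaped forms such as `Jos{\'e}` have already been converted to Unicode.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after NFD decomposition, e.g. 'José' -> 'Jose'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches_option_ii(query: str, field: str) -> bool:
    """Option (ii): always ignore accents, on both the query and the field."""
    return strip_accents(query).lower() in strip_accents(field).lower()


def matches_option_iii(query: str, field: str) -> bool:
    """Option (iii): ignore accents only when the query itself has none."""
    query_nfc = unicodedata.normalize("NFC", query).lower()
    field_nfc = unicodedata.normalize("NFC", field).lower()
    if strip_accents(query_nfc) == query_nfc:
        # Unaccented query: compare against the accent-stripped field.
        return query_nfc in strip_accents(field_nfc)
    # Accented query: require an accent-sensitive match.
    return query_nfc in field_nfc


for name in ("Jose", "José", "Josè"):
    print(name, matches_option_iii("Jose", name), matches_option_iii("José", name))
```

Under option (iii) the query `Jose` matches all three spellings, while `José` matches only `José`. Under option (ii) either query matches all three.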

     
    • Miguel
      2013-03-02

      Hi Igor,

      I think option (iii) is the best suited to everyday cases (real differences and accounting for typos). So:

      if I search: author=José
      JabRef searches: José

      if I search: author=Jose
      JabRef searches: Jose, José, Josè, Jos{\'e}...

      The second search would catch typos or entries created before accents were permitted in, say, PubMed, while the first would be useful for people who have created or curated their entries manually, so they can be sure they are missing nothing. At the very least, the second type of search in this option should work. I could live with the first one failing to find anything, as long as I can retrieve the results I want among those of the second type.

      Cheers.