
#744 Publisher Directory doesn't handle accented publisher names

v1.0 (example)
closed-fixed
None
5
2020-01-04
2020-01-01
Ahasuerus
No

There is a discrepancy between the way the Author Directory page is built and the way the Publisher Directory page is built by the software. From a Community Portal discussion:

The Author Directory page is built using the first two letters of the "Directory Entry" field of author records. The Publisher Directory page is built using the first two letters of the "Publisher Name" field of publisher records.

There are cleanup reports for these two fields, but they look for somewhat different things. The cleanup report for the Directory Entry field looks for any letters outside of the A-Z range, so it catches characters like "Ü" and reports them as errors. The cleanup report for the Publisher Name field, on the other hand, looks for values that are not in the "Latin-1" character set, which includes most letters used by Western European languages, e.g. "Ü", "Û", "Ú" and "Ù". It flags a record as an error only when the publisher name includes a character outside of the Latin-1 set, e.g. "Ÿ" or "Š", and there is no transliterated value for the name.
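To make the difference between the two checks concrete, here is a minimal sketch in Python. The function names are hypothetical and are not taken from the actual cleanup reports, which may be implemented quite differently (for example, in SQL); the sketch only mirrors the two rules as described above and leaves out the transliteration condition.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)

def has_letter_outside_a_to_z(value):
    """Directory Entry style check (assumed): any letter that is not A-Z or a-z."""
    return any(ch.isalpha() and ch not in ASCII_LETTERS for ch in value)

def is_outside_latin1(value):
    """Publisher Name style check (assumed): any character not encodable as Latin-1."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return True
    return False
```

Under these assumptions a value starting with "Ü" fails the first check but passes the second, which is the gap described above.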

In most cases characters like "Ü" do not need a transliteration. A problem arises only when one of the first two characters, which are the ones used by the Publisher Directory page, is non-English but within the Latin-1 character set. Checking the database, I see that we have roughly 140 publishers with this problem.
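The condition for an affected publisher can be sketched as a small predicate. This is an illustration only: `affects_publisher_directory` is a hypothetical name, and the query actually used to count the roughly 140 records is not shown in the ticket.

```python
def affects_publisher_directory(publisher_name):
    """True if one of the first two characters is a non-ASCII Latin-1 letter.

    Assumption: "non-English but within Latin-1" is taken to mean a letter
    with a code point between 0x80 and 0xFF.
    """
    return any(
        ch.isalpha() and 0x80 <= ord(ch) <= 0xFF
        for ch in publisher_name[:2]
    )
```

A name such as "Éditions Denoël" matches because of its first letter, while the "ë" later in the name has no effect on the directory.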

I'll have to think about it and see if I can come up with a solution. It may be possible to change the way the Publisher Directory page is built to handle these non-English characters as if they were their English counterparts.
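One possible way to treat accented letters as their English counterparts is to strip diacritics before taking the first two letters. The sketch below is an assumption about how this could be done using Python's standard `unicodedata` module; it is not the code that was committed to biblio/directory.py, and `fold_to_ascii` and `directory_key` are hypothetical names.

```python
import unicodedata

def fold_to_ascii(text):
    """Drop combining marks after compatibility decomposition (NFKD)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def directory_key(publisher_name):
    """Two-letter, lowercase directory key with accents folded away (assumed format)."""
    return fold_to_ascii(publisher_name)[:2].lower()
```

With this approach "Über Verlag" would be filed under "ub" and "Éditions Denoël" under "ed". Letters that have no Unicode decomposition, such as "Ø" or "Æ", pass through unchanged and would need an explicit mapping table.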

Discussion

  • Ahasuerus

    Ahasuerus - 2020-01-04
    • status: open --> closed-fixed
    • assigned_to: Ahasuerus
     
  • Ahasuerus

    Ahasuerus - 2020-01-04

    Fixed in:

    biblio/directory.py
    common/SQLparsing.py
    

    Installed in SVN 492 on 2020-01-04. Closing the Bug.

     
