
#1410 Top forthcoming should exclude all generic author names

Approved
closed
9
2021-04-06
2021-03-20
No

(I already have a patch for this one, which I'll attach in a follow-up comment.)

Not sure how to "prove" this, short of running a server with the clock changed, but a few times, I've seen books by "uncredited" show up in the "Top Forthcoming Books" at the bottom of the front page, which seemed a bit odd. (From memory, I think they've often been Disney tie-ins.)

Anyway, I had a look at the code related to forthcoming books, and found that:

  1. Books by "unknown" are excluded; however, other "special" authors are not.
  2. Various reports also exclude unknown-type authors, although they pick up on different ones. (And use the author IDs rather than author_canonical strings, although it looks like all the queries use the authors table, so working off author name wouldn't involve additional joins.)

Naively, it feels like there should be a common list of "special" authors that can/should be ignored in certain circumstances, such as the forthcoming books list. This would make it easier to maintain if/when new ones need adding - I could easily imagine there are non-English language ones that aren't currently being handled.
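
As a rough illustration of the "common list" idea, here is a minimal sketch. Only "unknown" and "uncredited" come from this ticket; the helper names `is_special_author` and `exclude_special_authors`, and the (title, author) tuple shape, are assumptions made for the example rather than anything in the ISFDB code.

```python
# Shared list of generic author names. Only "unknown" and "uncredited" are
# taken from the ticket; a real list would presumably contain more entries.
SPECIAL_AUTHORS_TO_IGNORE = frozenset(["unknown", "uncredited"])


def is_special_author(author_canonical):
    """Return True if the canonical author name is a generic placeholder."""
    return author_canonical in SPECIAL_AUTHORS_TO_IGNORE


def exclude_special_authors(books):
    """Drop (title, author_canonical) pairs credited to a generic author.

    The tuple shape is hypothetical and chosen only for this example.
    """
    return [(title, author) for title, author in books
            if not is_special_author(author)]
```

With one shared constant, a report and the front page would both pick up a newly added name without separate edits.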

Discussion

  • ErsatzCulture

    ErsatzCulture - 2021-03-20

    OK, so here is a patch that implements (much of) this... there should be three files, but the file picker is being a bit awkward, so I might have to attach them to multiple comments.

    First off isfdb.py has SPECIAL_AUTHORS_TO_IGNORE defined at the end. This includes the names of the author IDs used by some of the reports - but I haven't altered those reports to make use of this.

    Secondly, mod/marque.py has had an overhaul to break out the code that queries for the top authors, and now makes use of SPECIAL_AUTHORS_TO_IGNORE. The query has been redone to use a cursor rather than db.query(), because I couldn't find any examples of the latter that showed whether it supports parameterized SQL queries, which are much safer to use w.r.t. quoting, SQL injection etc. After a cursory search, I found one use of cursors in the ISFDB code, so hopefully they are deemed OK to use?
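
    For readers unfamiliar with the cursor approach, the sketch below shows the general technique of passing an ignore list to a parameterized query. It is not the patch itself: it uses the standard library's sqlite3 so that it runs standalone, and the two-table schema, the function names and the sample rows are all hypothetical. A MySQL driver would use its own placeholder style instead of `?`.

```python
import sqlite3

# Illustrative subset; per the ticket the real list is defined in isfdb.py.
SPECIAL_AUTHORS_TO_IGNORE = ("unknown", "uncredited")


def make_demo_db():
    """Build a tiny in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (author_id INTEGER PRIMARY KEY,
                              author_canonical TEXT);
        CREATE TABLE pubs (pub_id INTEGER PRIMARY KEY, pub_title TEXT,
                           pub_date TEXT, author_id INTEGER);
    """)
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(1, "unknown"), (2, "uncredited"), (3, "Jane Example")])
    conn.executemany("INSERT INTO pubs VALUES (?, ?, ?, ?)",
                     [(1, "Mystery Anthology", "2021-05-01", 1),
                      (2, "Tie-in Storybook", "2021-05-02", 2),
                      (3, "A Credited Novel", "2021-05-03", 3)])
    return conn


def forthcoming_titles(conn, ignored=SPECIAL_AUTHORS_TO_IGNORE, limit=10):
    """Return (title, author) rows whose author is not in the ignore list."""
    # One placeholder per ignored name; the names themselves are never
    # interpolated into the SQL text, so quoting is left to the driver.
    placeholders = ", ".join("?" for _ in ignored)
    sql = ("SELECT p.pub_title, a.author_canonical "
           "FROM pubs p JOIN authors a ON a.author_id = p.author_id "
           "WHERE a.author_canonical NOT IN (" + placeholders + ") "
           "ORDER BY p.pub_date, p.pub_title LIMIT ?")
    cur = conn.cursor()
    cur.execute(sql, tuple(ignored) + (limit,))
    return cur.fetchall()
```

    Only the placeholder characters are built by string concatenation; every value reaches the database as a bound parameter, which is the property the comment above is after.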

    Finally, there are some perfunctory tests for this code. These should go in a new mod/tests subdirectory (which may need an empty __init__.py file as well), and can then be run in the manner documented in the header comment.
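
    The attached test file is not reproduced in this thread. As a rough idea of what a perfunctory test of this kind could look like, here is a sketch using the standard unittest module; `filter_authors` is a hypothetical stand-in for the code under test, and `run_tests` is just a convenience wrapper for the example.

```python
import unittest

# Stand-in for the shared constant; only these two names come from the ticket.
SPECIAL_AUTHORS_TO_IGNORE = ("unknown", "uncredited")


def filter_authors(names, ignored=SPECIAL_AUTHORS_TO_IGNORE):
    """Hypothetical function under test: drop generic author names."""
    return [name for name in names if name not in ignored]


class TestFilterAuthors(unittest.TestCase):
    def test_generic_names_are_removed(self):
        self.assertEqual(
            filter_authors(["unknown", "Jane Example", "uncredited"]),
            ["Jane Example"])

    def test_ordinary_names_are_kept_in_order(self):
        names = ["Jane Example", "John Sample"]
        self.assertEqual(filter_authors(names), names)

    def test_empty_input(self):
        self.assertEqual(filter_authors([]), [])


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFilterAuthors)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

    Calling `run_tests()` from a REPL runs the three cases; saved as a module, the same class can also be picked up by `python -m unittest`.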

     
  • ErsatzCulture

    ErsatzCulture - 2021-03-20

    Here's the test file - I couldn't get the file picker to let me select files from different directories.

    FWIW, I also documented how you might be able to use this code in a Python REPL session here: http://www.isfdb.org/wiki/index.php/User:ErsatzCulture/RunningScriptsStandalone. This isn't massively useful, especially if you're not using a CLI-friendly environment (i.e. Windows ;-)

     
  • Ahasuerus

    Ahasuerus - 2021-04-06

    Ticket moved from /p/isfdb/bugs/767/

     
  • Ahasuerus

    Ahasuerus - 2021-04-06
    • summary: Top forthcoming does not exclude all "generic" authors --> Top forthcoming should exclude all generic author names
    • Group: v1.0 (example) --> Approved
     
  • Ahasuerus

    Ahasuerus - 2021-04-06

    Moved this ticket from the "Bugs" queue to the "Feature Requests" queue.

     
  • Ahasuerus

    Ahasuerus - 2021-04-06
    • status: open --> closed
    • assigned_to: ErsatzCulture
     
  • Ahasuerus

    Ahasuerus - 2021-04-06

    Functional change implemented in:

    common/isfdb.py
    mod/marque.py
    

    Installed in SVN 615 on 2021-04-06. Kept the db.query logic (as opposed to cursors) for consistency's sake. Closing the FR.

     
