#822 Create a cleanup report to find pub-reference title mismatches

Approved
closed
None
5
2016-09-30
2015-08-30
Ahasuerus
No

Create a cleanup report to find mismatches between a publication's title and the title of its reference title. The exact algorithm remains to be determined, but note that "TITLE TITLE: ..." and "TITLE TITLE (...)" are valid variations for publication titles. "SERIES: TITLE" and "SERIES: TITLE: SUBTITLE" are also common, although perhaps suboptimal.
Proposed algorithm:
1. Find all pubs whose "reference" title doesn't contain the pub's title
2. Calculate the two strings' similarity
3. Add the pub to the cleanup report if the calculated similarity value is less than 50%
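
The three steps above could be sketched roughly as follows. This is only an illustration, not the code that was installed: the ticket does not say how similarity is calculated, so the sketch assumes Python's difflib.SequenceMatcher ratio, and the function names similarity and is_mismatch are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity ratio for two strings, ignoring case.

    Assumption: difflib's ratio stands in for the unspecified
    similarity measure mentioned in step 2.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_mismatch(pub_title, ref_title, threshold=0.5):
    """Return True if the pub should appear on the cleanup report."""
    # Step 1: skip pubs whose reference title contains the pub title.
    if pub_title.lower() in ref_title.lower():
        return False
    # Steps 2 and 3: flag the pub when similarity is below 50%.
    return similarity(pub_title, ref_title) < threshold
```

With this sketch, is_mismatch("Foundation", "I, Robot") flags the pair, while is_mismatch("Dune", "Dune Messiah") does not, because the reference title contains the pub title.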

Discussion

  • Ahasuerus

    Ahasuerus - 2015-09-11
    • status: open --> open-accepted
    • assigned_to: Ahasuerus
     
  • Ahasuerus

    Ahasuerus - 2015-09-11

    Original version (CHAPBOOKs only) implemented in:

    mod/cleanup_report.py 1.63
    mod/common.py 1.67
    nightly/nightly_update.py 1.128
    

    Installed in r2015-132 on 2015-09-10. Keeping open.

     
  • Ahasuerus

    Ahasuerus - 2015-09-16

    Part 2 - ignore punctuation. Implemented in:

    edit/cleanup_report.py 1.2
    nightly/nightly_update.py 1.129
    

    Installed in r2015-137 on 2015-09-16. Keeping the FR open.

     
  • Ahasuerus

    Ahasuerus - 2015-09-19

    Part 3 - Add OMNIBUSES, delete exclamation points from the list of ignored punctuation characters. Implemented in:

    edit/cleanup_report.py 1.3
    nightly/nightly_update.py 1.130
    

    Installed in r2015-141 on 2015-09-18. Keeping the FR open.
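
    Editorial illustration of Parts 2 and 3, not the installed code: one way to ignore punctuation while keeping exclamation points significant. The ticket does not list the ignored characters, so this sketch assumes all ASCII punctuation except "!", and the name normalize is hypothetical.

```python
import string

# Assumption: every ASCII punctuation character except "!" is ignored.
IGNORED_PUNCTUATION = string.punctuation.replace("!", "")
_STRIP_TABLE = str.maketrans("", "", IGNORED_PUNCTUATION)


def normalize(title):
    """Drop ignored punctuation, lowercase, and collapse whitespace."""
    return " ".join(title.translate(_STRIP_TABLE).lower().split())
```

    Both titles would be passed through normalize before the comparison, so "Who Goes There?" and "Who Goes There" compare as equal, while "Help!" and "Help" remain different.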

     
  • Ahasuerus

    Ahasuerus - 2015-09-19

    Part 4 - Updated the message displayed at the top of the page to mention OMNIBUSES. Implemented in edit/cleanup_report.py 1.4, installed in r2015-142 on 2015-09-18.

     
  • Ahasuerus

    Ahasuerus - 2015-09-22

    Part 5 - Added the ability to ignore pubs in edit/cleanup_report.py 1.6. Installed in r2015-145 on 2015-09-21.

     
  • Ahasuerus

    Ahasuerus - 2016-09-30
    • status: open-accepted --> closed
     
  • Ahasuerus

    Ahasuerus - 2016-09-30

    Part 6 - Added the rest of the publication types in:

    edit/cleanup_report.py 1.61
    nightly/nightly_update.py 1.170
    

    Installed in r2016-170 on 2016-09-29. Closing.

     
