ISFDB Bibliographic Tools / Feature Requests / #788 Create a cleanup report to find 'Suspected Duplicate Authors'

Ahasuerus - 2015-02-19

Implemented in:

mod/cleanup_report.py 1.26 mod/common.py 1.50 nightly/nightly_update.py 1.93 scripts/add_cleanup_id_2.sql 1.1

Installed in r2015-055 on 2015-02-19. The current version covers authors whose names start with the letters 'I', 'O', 'Q', 'U', 'V', 'X', 'Y', and 'Z', and it compares names using the Hamming distance. In the future we will add the remaining letters of the alphabet and consider implementing other, more time-consuming algorithms such as Jaro distance.
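
For readers unfamiliar with the approach, here is a minimal sketch of a Hamming-distance check restricted to a set of starting letters. It is not the code in mod/cleanup_report.py: the function names, the threshold of one differing character, and the sample names are all assumptions made for illustration. Hamming distance is only defined for strings of equal length, so the sketch compares names only within the same first letter and length.

```python
from collections import defaultdict
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ch1 != ch2 for ch1, ch2 in zip(a, b))


def suspected_duplicates(names, letters, max_distance=1):
    """Return sorted pairs of names that start with one of `letters`
    and differ in at most `max_distance` positions (assumed threshold)."""
    # Bucket by (first letter, length): only names of the same length
    # can be compared, and each letter is handled independently.
    buckets = defaultdict(list)
    for name in names:
        if name and name[0] in letters:
            buckets[(name[0], len(name))].append(name)

    pairs = []
    for bucket in buckets.values():
        # Pairwise comparison is quadratic in the bucket size.
        for first, second in combinations(sorted(bucket), 2):
            if hamming_distance(first, second) <= max_distance:
                pairs.append((first, second))
    return sorted(pairs)


# Illustrative names only, not taken from the database.
sample = ["Zane Grey", "Zane Gray", "Yoko Ono", "Zack Grey", "Isaac Asimov"]
print(suspected_duplicates(sample, letters="IOQUVXYZ"))
```

Because every pair within a bucket is compared, the cost grows quadratically with the number of names per letter, which is one plausible reason to start with the less common letters.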

Ahasuerus - 2015-02-19


Anonymous - 2015-02-26

Added the letter 'F' in:

mod/cleanup_report.py 1.28 nightly/nightly_update.py 1.95

Installed in r2015-058 on 2015-02-25.

Ahasuerus - 2015-02-26

Added the letter 'W' in:

mod/cleanup_report.py 1.29 nightly/nightly_update.py 1.96

Installed in r2015-059 on 2015-02-26.

Ahasuerus - 2015-03-03

Added 'K' and 'H' and moved the report to a separate weekly run:

mod/cleanup_report.py 1.30 nightly/nightly_update.py 1.97

Installed in r2015-061 on 2015-03-02.

Ahasuerus - 2015-03-03

Corrected a bug in nightly/nightly_update.py 1.98. Installed in r2015-062 on 2015-03-02.


Ahasuerus - 2015-03-04

Corrected a bug with weekly processing colliding with nightly processing in nightly/nightly_update.py 1.99. Installed in r2015-064 on 2015-03-04.


Ahasuerus - 2015-03-11

Added the letter 'T' in:

mod/cleanup_report.py 1.31 nightly/nightly_update.py 1.100

Installed in r2015-067 on 2015-03-10.

Ahasuerus - 2015-03-13

Added the letter 'G' in:

mod/cleanup_report.py 1.32 nightly/nightly_update.py 1.101

Installed in r2015-069 on 2015-03-12.

Ahasuerus - 2015-03-16

Added 'E' in:

mod/cleanup_report.py 1.36 nightly/nightly_update.py 1.105

Installed in r2015-076 on 2015-03-16.

Ahasuerus - 2015-04-14

Added 'B' in:

mod/cleanup_report.py 1.38 nightly/nightly_update.py 1.107

Installed in r2015-083 on 2015-04-14.

Anonymous - 2015-04-27

Added 'P' in:

mod/cleanup_report.py 1.41 nightly/nightly_update.py 1.110

Installed in r2015-087 on 2015-04-26.

Ahasuerus - 2015-05-05

Added 'L' in:

mod/cleanup_report.py 1.43 nightly/nightly_update.py 1.112

Installed in r2015-090 on 2015-05-05.

Ahasuerus - 2015-07-14

Added 'S' in:

mod/cleanup_report.py 1.49 nightly/nightly_update.py 1.117

Installed in r2015-101 on 2015-07-14.

Ahasuerus - 2015-08-04

Added 'D' in:

mod/cleanup_report.py 1.51 nightly/nightly_update.py 1.118

Installed in r2015-106 on 2015-08-04.

Ahasuerus - 2015-09-10

status: open --> open-accepted

Ahasuerus - 2015-09-26

Added 'A' in:

edit/cleanup_report.py 1.7 nightly/nightly_update.py 1.132

Installed in r2015-146 on 2015-09-26.

Ahasuerus - 2015-10-15

Added 'R' in:

edit/cleanup_report.py 1.10 nightly/nightly_update.py 1.135

Installed in r2015-171 on 2015-10-15. 'J' and 'M' are still outstanding.

Ahasuerus - 2015-12-05

Added 'M' in:

edit/cleanup_report.py 1.17 nightly/nightly_update.py 1.141

Installed in r2015-246 on 2015-12-05. 'J' is still outstanding.

Ahasuerus - 2016-01-17

'J' added in:

edit/cleanup_report.py 1.19 nightly/nightly_update.py 1.143

Installed in r2016-002 on 2016-01-16. Keeping the FR open since we may want to add suspected duplicate authors whose names differ only in the first letter. Preliminary logic:

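-- Pair each author whose name starts with 'Z' with every other author
-- whose name is identical from the second character onward.
select a1.author_id, a1.author_canonical, a2.author_canonical
from authors a1, authors a2
where substr(a1.author_canonical,1,1)='Z'
and a1.author_id != a2.author_id
and substr(a1.author_canonical,2,999)=substr(a2.author_canonical,2,999);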

We may also want to check authors whose names start with non-alphabetic characters such as an apostrophe.
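
The preliminary logic above could also be sketched in Python by grouping names on everything after the first character, which avoids the self-join. This is only an illustration under stated assumptions: the function name and the sample rows are hypothetical, and unlike the query, which looks at 'Z' only, the sketch treats every leading character (including non-alphabetic ones) the same way.

```python
from collections import defaultdict


def differ_only_in_first_char(authors):
    """authors: iterable of (author_id, canonical_name) tuples.
    Returns groups of authors whose names match from the second
    character onward but start with different characters."""
    by_tail = defaultdict(list)
    for author_id, name in authors:
        if len(name) > 1:
            # Key on the name minus its first character.
            by_tail[name[1:]].append((author_id, name))

    groups = []
    for tail in sorted(by_tail):
        group = sorted(by_tail[tail], key=lambda row: row[1])
        # Keep only groups with at least two distinct leading characters.
        if len({name[0] for _, name in group}) > 1:
            groups.append(group)
    return groups


# Hypothetical rows, not real database records.
sample = [
    (1, "Zelazny"),
    (2, "Selazny"),
    (3, "'elazny"),
    (4, "Asimov"),
]
for group in differ_only_in_first_char(sample):
    print(group)
```

A single pass over the names builds the groups, so the cost is roughly linear in the number of authors rather than quadratic.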
