Hi!
When selecting many books by the same author, they will
sometimes have the same 8-character filename. If files
with the same names already exist in the download directory
when I press download, then I would want to be asked
- Replace: Yes to all / No to all - but BEFORE the download
starts. If I choose No to all, I would expect all
collisions to be resolved and removed from the list of
books to download before actually downloading. If there
are multiple files in the list window in PyGets that all
have the same filename, then I would rather have them
renamed. This should be done in a standard way so we can
check for collisions using the new unique filenames
(which might also exist in the destination if they have
been downloaded before). Actually, I would want
filenames to be unique and generated in a documented way even if
I only download one book, since I might overwrite it by
accident at a later time if it is not renamed in a consistent
way.
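To make the request concrete, here is a rough sketch of a check that runs before any download starts. It is only an illustration: the function names and the (etext number, filename) tuples are assumptions, not PyGets's actual code.

```python
import os
from collections import Counter

def find_collisions(planned, dest_dir):
    """Split out the planned downloads that would collide.

    planned is a list of (etext_number, filename) tuples (a hypothetical
    structure chosen for this sketch).
    Returns (on_disk, in_batch):
      on_disk  - items whose filename already exists in dest_dir
      in_batch - items that share a filename with another planned item
    """
    existing = set(os.listdir(dest_dir))
    counts = Counter(filename for _, filename in planned)
    on_disk = [item for item in planned if item[1] in existing]
    in_batch = [item for item in planned if counts[item[1]] > 1]
    return on_disk, in_batch

def apply_no_to_all(planned, on_disk):
    """'No to all': drop every item that would overwrite an existing file."""
    skip = set(on_disk)
    return [item for item in planned if item not in skip]
```

The point is that both lists are known before the first byte is fetched, so the Yes to all / No to all question can be asked once, up front.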
TTFN Alf
Thank you for pointing out the possibility of filename
collisions. I agree that collisions should be detected
earlier during multiple file downloads, and the issue boils
down to deciding what the "right" corrective action should be.
After analyzing the e-text database contents, I have found
that filename collisions are due to one of three conditions:
there are 9 cases of multiple entries (and multiple etext
numbers) referring to the same etext, there are 4 etexts that
contain multiple titles for which separate entries exist in
the contents list, and there are 14 different etexts that
have the same base filename. I've attached a text file
showing which etexts and titles are affected.
In the first two cases, the etext being downloaded is the
same file as the one already on the computer, and the proper
response should probably be a message to that effect. In the
third case, an actual filename collision exists between
different etexts. Project Gutenberg doesn't mind the
identical names because they store etexts that were released
in different years in different directories (e.g. etext99
holds files released in 1999).
Any automatic renaming in the consistent manner you wish
would probably involve appending a secondary identifier to
the filename, such as the etext number or year of release.
That means a filter for identifying which files might need
this treatment would have to be included in the download
process. Fortunately, the number of cases appears to be
reasonably small, and the filter may turn out to simply be a
list of filenames to watch out for.
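A minimal sketch of what such a filter plus renaming might look like, assuming the watch list is a set of base filenames and the etext number is used as the secondary identifier. The function name, the underscore separator, and the watch list contents are all hypothetical choices, not a settled design.

```python
import os

def unique_name(filename, etext_number, watch_list):
    """Append the etext number to filenames whose base name is on the watch list.

    watch_list is a set of base filenames (without extension) known to be
    shared by different etexts. Names not on the list are returned unchanged.
    """
    base, ext = os.path.splitext(filename)
    if base not in watch_list:
        return filename
    # Hypothetical naming scheme: <base>_<etext number><ext>
    return f"{base}_{etext_number}{ext}"
```

Note that the renamed file no longer fits the 8-character base name, so the scheme would need to be documented so users can predict the result.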
It may also be better for users if the program simply asked
what alternative filename to use for an etext causing a
filename collision. Each user may have a different idea of
what works best for them. Based on the number of problem
titles, the probability of having collisions appears small
and the burden on the user should not be excessive.
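As a rough illustration of the ask-the-user alternative, the sketch below keeps prompting until the chosen name no longer collides. The function name and prompt wording are assumptions, and the `ask` parameter is there only so a GUI dialog (or a test) can stand in for `input`.

```python
import os

def resolve_by_asking(filename, dest_dir, ask=input):
    """Ask for an alternative filename while the current one collides.

    Returns the name to save under, or None if the user gives a blank
    answer (treated here as 'skip this etext').
    """
    name = filename
    while os.path.exists(os.path.join(dest_dir, name)):
        answer = ask(f"'{name}' already exists. Save as (blank to skip): ").strip()
        if not answer:
            return None
        name = answer
    return name
```

If the original name does not collide, the user is never prompted.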
Your suggestion will be put on my list of things to do.
Thanks for the feedback.
--Gary
Duplicate filenames