#8 file sorting

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-09-06
Created: 2006-05-26
Creator: Dennis Lim
Private: No

If we have files that are numbered like this:
file1, file2, ... file10, file11
the sorting algorithm uses only ASCII comparison,
so the order becomes:
file1, file10, file11, file2, ...

Currently, cbz creators have to be careful to
number their files with a leading 0, i.e. file01, file02,
file03, etc. However, this isn't always done.

Therefore, a more robust sorting algorithm would detect
numerical substrings and use a different comparison
for them.
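
A minimal sketch of the idea, written in Python purely for illustration (it is not the Comical patch, and the name natural_key is hypothetical): split each file name into text and digit runs, then compare the digit runs as integers.

```python
import re

def natural_key(name):
    """Build a sort key in which runs of digits compare as integers."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so parts alternate: text, digits, text, digits, ...
    parts = re.split(r"(\d+)", name)
    # Odd positions are always digit runs, even positions are always text,
    # so two keys never compare an int against a str at the same index.
    return [int(part) if index % 2 else part for index, part in enumerate(parts)]

names = ["file1", "file10", "file11", "file2"]
plain_order = sorted(names)                     # character-by-character comparison
natural_order = sorted(names, key=natural_key)  # digit runs compared numerically
```

With this key, plain_order stays file1, file10, file11, file2, while natural_order becomes file1, file2, file10, file11.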

I'll try to code it up when I have time.
In the meantime, here is a bug report in case anyone
else wants to fix it before me.

Discussion

  • Dennis Lim
    2006-05-26

    Logged In: YES
    user_id=117202

    Patch submitted to fix this. Please refer to the patch page.

     
  • sourcebunny
    2006-06-05

    Logged In: YES
    user_id=1533972

    I don't see this as a bug but more as a feature request. It
    would only encourage dumb and naive numbering.

     
  • Dennis Lim
    2006-06-06

    Logged In: YES
    user_id=117202

    It's only natural to number files as 1, 2, 3, ..., 10
    instead of 01, 02, 03, ..., 10. Naive perhaps, but not
    dumb. The leading 0 is only a requirement due to a common
    limitation in how most applications implement sorting. Human
    beings can easily 'sort' the list correctly, but programs
    cannot. We just need to build some more 'intelligence'
    into the programs.

    I would go as far as to agree that this is a feature request
    and not a bug. However, some features are commonsense enough
    that I would consider their absence a bug.

     
  • sourcebunny
    2006-06-06

    Logged In: YES
    user_id=1533972

    I disagree. Your sorting could break other sorting
    of images. How does your patch deal with comics with 100+ pages?

    filename10 comes before filename2. If you wanted to extract
    the files, you would also end up in an unwanted situation where
    most 'proper' operating systems sort the files differently
    than Comical does.

    Besides, it just isn't a bug. That kind of sorting is
    intentional.

     
  • Dennis Lim
    2006-06-07

    Logged In: YES
    user_id=117202

    I guess we'll just agree to disagree.

    If you could give me examples where the algorithm would
    'break other sorting', I'll make the necessary adjustments.

    As for 100+ images, it handles those as well. Currently it
    groups consecutive digits together and converts them to an int
    before sorting, i.e. images named page01, page02, etc. are
    sorted exactly as before, and images named page1, page2,
    ... page10 are sorted as I've described (i.e. the
    'correct' way, IMHO). Images with 100+ numbers behave
    like the 10+ case. The only way it'll break is if you try
    to sort >65535 images. (I didn't check for overflow.)
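
The overflow concern above can be avoided by never converting the digits to a fixed-width integer. The following is a sketch of that alternative, in Python for illustration only; it is not how the patch works, and the names digit_run_key and overflow_free_key are hypothetical. Each digit run is compared by its length after stripping leading zeros, then character by character.

```python
import re

def digit_run_key(run):
    """Order a run of ASCII digits numerically without converting it to an int."""
    stripped = run.lstrip("0")
    # A longer run (ignoring leading zeros) is always the larger number;
    # runs of equal length compare correctly character by character.
    return (len(stripped), stripped)

def overflow_free_key(name):
    """Sort key: text parts compare as text, digit runs via digit_run_key."""
    # Restricting the pattern to ASCII digits keeps the string comparison valid.
    parts = re.split(r"([0-9]+)", name)
    return [digit_run_key(part) if index % 2 else part
            for index, part in enumerate(parts)]

pages = ["page100", "page2", "page65536", "page010", "page1"]
ordered = sorted(pages, key=overflow_free_key)
```

Here ordered comes out as page1, page2, page010, page100, page65536, and zero-padded names such as page01, page02 keep their usual order.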

    This method of 'alphanumeric' sorting has been seriously
    discussed in a reputable magazine, Dr. Dobb's Journal:
    http://www.ddj.com/showArticle.jhtml;jsessionid=XAWZF2F1HF2BSQSNDBCSKH0CJUMEKJVN?articleID=184404294
    Windows XP Explorer sorts in this manner. You may not prefer
    it, but some people do consider it a feature.

    Yes, it might encourage 'naive' cbr/cbz creators to create
    something that doesn't work in other viewers, i.e. this
    'feature' only works in Comical. But that might mean that
    more people switch over and start using Comical.
    Besides, it's not just hypothetical: there are already
    comics out there which have pages named as I have described.
    That is why I created the report and patch in the first place.

    I admit that the patch has a weakness regarding localization
    and other languages. However, this is an existing problem in
    Comical and I'm not making it any worse. I'll get around to
    fixing it, but I need time to study the localization issues.

     
  • Logged In: NO

    Hi guys,
    Running under Linux, I hit a case-sensitivity issue:
    the file numbering is okay, but I guess the files were created on a Windows box, where file name case is not significant.

    When sorted on Windows, I suppose the ordering is fine,
    but I get the following (which is a real pain when reading the comics!):
    UCXM006-00.JPG
    UCXM006-01.JPG
    UCXM006-02.JPG
    UCXM006-03.JPG
    UCXM006-04.JPG
    UCXM006-06.JPG
    UCXM006-07.JPG
    ucxm006-05.jpg

    Would it be possible to provide an option for case-insensitive ordering?
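
One way such an option could work, sketched in Python for illustration only (this is not Comical code): fold the case of each name when building the sort key, so the lone lower-case file falls into place. The file names are the ones from the listing above.

```python
names = [
    "UCXM006-00.JPG",
    "UCXM006-01.JPG",
    "UCXM006-02.JPG",
    "UCXM006-03.JPG",
    "UCXM006-04.JPG",
    "UCXM006-06.JPG",
    "UCXM006-07.JPG",
    "ucxm006-05.jpg",
]

# Case-sensitive comparison puts every upper-case name before the lower-case one.
case_sensitive = sorted(names)

# str.casefold removes case distinctions before comparing; the names
# themselves are left unchanged in the result.
case_insensitive = sorted(names, key=str.casefold)
```

With the folded key, ucxm006-05.jpg lands between UCXM006-04.JPG and UCXM006-06.JPG. The same folding could be applied to the text parts of a numeric-aware sort key.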