cpaldjvu vs. csepdjvu

Forum: DjVuLibre Development

Creator: Charles Hyder

Created: 2010-03-26

Updated: 2012-11-08

Charles Hyder - 2010-03-26

As the source code for cpaldjvu mentions, one day it should be rewritten as a
preprocessor to csepdjvu. Given that, one would expect the two tools to do much
the same thing. Indeed, inspecting the code shows a lot of cut-and-paste. So I
decided to test them head to head. Here are the results:

http://ifile.it/73dj0gl

While cpaldjvu performs very well, csepdjvu produces a file six times as
large. I thought at first that csepdjvu must be missing a call to
tune_jb2image_lossless(), but it's there. By the way, the verbose modes of the
two tools report different numbers of matched shapes, which is odd as well.
Any ideas?

Leon Bottou - 2010-03-26

I could not see your code because the URL you gave sent me to an upload page…

The size difference is probably because cpaldjvu picks the dominant color and
makes it the background, while csepdjvu does not do this by default.
Other than that, the two programs should do pretty much the same thing.
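
To illustrate what "picking the dominant color" means, here is a minimal Python sketch. It is not cpaldjvu's actual code; the function names and the flat pixel list are assumptions made up for the example.

```python
from collections import Counter

def dominant_color(pixels):
    """Return the most frequent color in a list of (r, g, b) tuples."""
    counts = Counter(pixels)
    color, _ = counts.most_common(1)[0]
    return color

def split_layers(pixels):
    """Treat the dominant color as background; everything else is foreground.

    Background pixels become None in the foreground list, so only the
    remaining pixels would have to be encoded as foreground shapes.
    """
    bg = dominant_color(pixels)
    foreground = [None if p == bg else p for p in pixels]
    return bg, foreground
```

With this kind of split, a mostly white page yields white as the background and only the ink as foreground.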

The situation is more complicated in the case of JB2 because csepdjvu needs
improvements:
The connected component analysis must be enhanced to record which components
touch another component.
Then the tunejb2 function can be lossy for all the components that touch no
other component.
That way you optimize the letters without destroying the line art.
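
The idea above can be sketched roughly as follows. This is a hypothetical Python illustration, not csepdjvu code: it assumes a small grid of color indices, 4-connectivity, and components made of a single color. The names label_components and isolated_components are invented for the example.

```python
from collections import deque

def label_components(grid, background=0):
    """Label 4-connected same-color regions and record which ones touch.

    grid is a list of rows of color indices. Returns (labels, touching),
    where touching maps each component label to the set of other
    component labels it shares an edge with.
    """
    h, w = len(grid), len(grid[0])
    labels = [[-1] * w for _ in range(h)]
    touching = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] == background or labels[y][x] != -1:
                continue
            # Flood-fill one component of uniform color.
            labels[y][x] = next_label
            touching[next_label] = set()
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and grid[ny][nx] == grid[cy][cx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    # Second pass: record pairs of distinct components that share an edge.
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            if a == -1:
                continue
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w:
                    b = labels[ny][nx]
                    if b != -1 and b != a:
                        touching[a].add(b)
                        touching[b].add(a)
    return labels, touching

def isolated_components(touching):
    """Components that touch no other component: candidates for lossy tuning."""
    return sorted(label for label, others in touching.items() if not others)
```

Only the labels returned by isolated_components would be handed to a lossy matcher; components that touch a neighbor (typical of line art) would stay lossless.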

Charles Hyder - 2010-03-26

Strange. The link I posted works for me. Here's another upload, to Rapidshare
this time:

http://rapidshare.com/files/368354490/cpaldjvu_vs_csepdjvu.zip.html

Regarding background color: the background in the test image has been
completely removed, so this can't be it. The INFO shows that it's the JB2
foreground layer that is poorly compressed.

Leon Bottou - 2010-03-26

Got the files.
In the sep file, you should use the transparent color index 0xfff when you
want to encode something as the background color.
Otherwise the foreground encodes both the letters (which match well) and the
complement of the letters (which never matches properly…)

- L.
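
For reference, a minimal Python sketch of what using the transparent index looks like when writing the runs of the foreground layer. It assumes the Color RLE layout as I read it in man csepdjvu: each run is a four-byte big-endian integer whose upper twelve bits hold the palette index and whose lower twenty bits hold the run length, with index 0xfff reserved for transparent runs. The helper names and the background_index parameter are hypothetical; the "R6" header and the palette are left out.

```python
import struct

TRANSPARENT = 0xFFF  # color index reserved for transparent runs

def pack_run(color_index, length):
    """Pack one run: upper 12 bits color index, lower 20 bits run length."""
    if not 0 <= color_index <= 0xFFF:
        raise ValueError("color index must fit in 12 bits")
    if not 0 <= length < (1 << 20):
        raise ValueError("run length must fit in 20 bits")
    return struct.pack(">I", (color_index << 20) | length)

def encode_row(row, background_index):
    """Encode one row of palette indices as runs.

    Pixels equal to background_index are written as transparent runs
    instead of as an ordinary foreground color.
    """
    out = bytearray()
    i = 0
    while i < len(row):
        j = i
        while j < len(row) and row[j] == row[i]:
            j += 1
        index = TRANSPARENT if row[i] == background_index else row[i]
        out += pack_run(index, j - i)
        i = j
    return bytes(out)
```

Writing the background pixels with their own palette index instead of 0xfff is what makes the foreground mask contain the complement of the letters.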

Charles Hyder - 2010-03-26

Thanks a lot! I simply forgot about this part, which is mentioned explicitly
in man csepdjvu. Everything works fine now.
