Menu

#99 New Lossless Fast Pseudo-Quantization algorithm

None
closed
None
5
2015-03-15
2014-12-02
Anonymous
No

Hi Hervé,

I've implemented a new Quantizer class LFPQuantizer, which offers very fast conversion from 24- and 23-bit images to 8-bit palette images. However, it is not a real quantization algorithm. It's lossless, that is, it just collects all RGB(A) colors as they come and adds them to the palette (if they are not already there). If there are more than PaletteSize colors in the true-color image, the "quantization" is stopped and NULL is returned. Sounds like this is not a big thing, but wait...

Basically, this is more a conversion than a quantization. However, it is a very fast conversion. For typical images, that are images with few colors (e. g. maps for GIS), the LFPQuantizer is between 3 and 6 times faster than WuQuantizer (which, in addition, may produce wrong results).

The algorithm uses a hash map, where the color (stored as an unsigned int) is the key. However, since MSVC's STL hash map <hash_map> is even slower than the "regular" map (red-black tree), I started to think about an alternative. Although boost's hash map is said being much faster, it is not an option because of the additional dependencies.</hash_map>

So, I took my Algorithms in C++ book, mixed this with some Java JDK HashMap class code and implemented an open addressing hash table in plain C (more or less). I can guarantee that there are at most 256 entries in the table, which is great for an "open addressing" (with linear probing) hash table. By sizing the table to hold 512 entries, the load factor is bounded to 0.5 (less, if there are less colors in the image).

Why that all? Oh, yes, there are use-cases. I need to keep a lot of these images in memory. Actually, my images are 8 bpp with about 10 to 256 colors. Having them in the new lossless WebP (weppy) format on disk saves about 30% space compared to PNG files. However, these are all 24 bit... That's not good for keeping a lot of them in memory. Loading them as 24 bpp and converting to 8 bpp sounds reasonable. Since I know that these are actually "palletized" images, I can use the LFPQuantizer with no worry.

Also, one could give this algorithm a try, before (in case it fails due to the image having too many different colors) actually using the NN or Wu's algorithm. The LFP algorithm is that fast, that it may be worth it (especially if the processed images are known to have not too many colors, like drawings, for example).

Additionally, since it makes sense to me, I tried to add support for 32-bit images in FreeImage_ColorQuantize[Ex]. I was also able to modify class WuQuantizer accordingly. This class and my new class LFPQuantizer support both 24- and 32-bit images. However, I was not able to add 32-bit support to NNQuantizer. Maybe you could give it a try...

I will now provide some modified source files and some new ones. In these, 32-bit support (in addition to 24-bit support, of course) is implemented in FreeImage_ColorQuantizeEx, but explicitly disabled for FIQ_NNQUANT (... if (bpp == 32) return NULL; ). You could either

  1. leave it as is (32-bit images only work with FIQ_WUQUANT and FIQ_LFPQUANT)
  2. add support for 32-bit in NNQuantizer
    or
  3. remove all the 32-bit support from FreeImage_ColorQuantizeEx and WuQuantizer

What's not in the zip, you should add new quantizer constant FIQ_LFPQUANT = 2 to enum FREE_IMAGE_QUANTIZE.

So, could you please add this to FreeImage?

Carsten

1 Attachments

Discussion

  • Hervé Drolon

    Hervé Drolon - 2014-12-04

    Hi Carsten,

    Thanks for this cleverly designed algorithm. I will add it to FreeImage before the next release.

    One minor modification I will add :
    Do not declare 'MapEntry m_map[MAP_SIZE]' on the stack, better allocate it dynamically in the constructor and free it in the destructor. 4 KB is large enough to cause problems on some processors or some old compilers.

    Another suggestion if you agree :
    You currently calculate the palette indexes and fill the 8-bit dib into a single loop.
    You could instead :

    • first fill the hash map and count the number of colors
    • if #colors < 256, allocate and fill the 8-bit dib using a new loop (that means calculating the hash again)
    • if #colors >= 256, default to the NN quantizer (or to the Wu quantizer)

    However, this may be unneeded overhead for your use-cases. I let you choose the best option and will follow your choice.

    About the NN quantizer and 32-bit support, I will have a look at this.

    Best regards,
    Hervé

     
  • Carsten Klein

    Carsten Klein - 2014-12-05

    Hi Herve,

    first, thank you for the flowers... :-)

    My intention was to explicitly use the stack, since many people are saying that stack memory and stack allocation is so much faster than heap memory. However, it turned out, that there is not performance gain from using stack memory. So, please declare:

    /** The hash table. */
    MapEntry *m_map;
    

    Here is the new constructor and destructor:

    LFPQuantizer::LFPQuantizer(unsigned PaletteSize) :
            m_size(0), m_limit(PaletteSize), m_index(0) {
        m_map = new MapEntry[MAP_SIZE];
        memset(m_map, 0xFF, MAP_SIZE * sizeof(MapEntry));
    }
    
    LFPQuantizer::~LFPQuantizer() {
        delete[] m_map;
    }
    

    Now, to your suggestions, which I'll break into two parts:

    • Automatically delegating to NN or Wu quantizer

    I need to know, whether the conversion was successful, that is, the image has no more than 256 colors. The new algorithm is said being lossless. Since both NN and Wu are not lossless if there are more than 256 colors, automatically delegating to a lossy implementation is not a good idea. For my use case, I'll keep the image at 24 bits, if it has too many different colors (I must not change the image information).

    So, this decision should be up to the user of the function. Testing for NULL and possibly calling FreeImage_ColorQuantize again with FIQ_WUQUANT seems acceptable for me.

    • Collect colors and build the destination image in two different loops

    For this, I've implemented your suggestions (as fast as possible; I even implemented a modified but faster GetIndexForColor2 method that does not care for adding the color if needed). However, it turned out that the main time is spent on collection the colors. That's also why the custom hash table implementation speeds up things so much.

    If I comment out the lines needed for the destination image:

    for (unsigned y = 0; y < height - 1; ++y) {
        //BYTE *dst_line = dst_bits + y * dst_pitch;
        const BYTE *src_line = src_bits + y * src_pitch;
        for (unsigned x = 0; x < width; ++x) {
            const unsigned color = *((unsigned *) src_line) & 0x00FFFFFF;
            if (color != last_color) {
                last_color = color;
                last_index = GetIndexForColor(color);
                if (last_index == -1) {
                    FreeImage_Unload(dib8);
                    return NULL;
                }
            }
            //dst_line[x] = last_index;
            src_line += 3;
        }
    }
    

    there is no difference in performance. The only thing we could save some time, is not to allocate the destination image. Then, if there are two many colors and the process fails, we save about 20% time compared to my original algorithm. However, if the process succeeds, the second loop brings in about 63% of extra time. For an image (7363 x 8545 pixels) I get these timings on my (old and slow) box:

    375ms 80% (-20%) New algorithm: Process failed (too many colors)

    470ms 100% Original algorithm: Process failed/succeeded (no matter)

    765ms 163% (+63%) New algorithm: Process succeeded (2nd loop required)

    Since the LFPQuantizer is intended for images known to have few different colors only, we should always expect the process to succeed. If so, we should optimize the algorithm accordingly. So I do not recommend to add another loop to the algorithm.

    However, we could add another function which is kind of FreeImage_CountColorsUsed. We could use nearly the same algorithm and also the same hash table, if the number of colors counted does not exceed 256. Here is what I have in mind:

    int FreeImage_CountColorsUsed(FIBITMAP *dib, int limit = 0)
    

    If limit <= 0, there is no limit. If the limit is <= 256, we could use the fast hash table, otherwise we could use std::map (which is still fast, remember, with std::map, my algorithm performs exactly like Wu's algorithm).

    If a limit is given and exceeded, we exit the loop and return -1, which means, more than limit colors. One could easily use this function (which should need 20% less time than the LFP algorithm) to determine, whether using LFPQuantizer is an option for a certain image or not. However, the total time will still be much longer than just using LFPQuantizer. You'll only save time, if the image has more than 256 colors.

    Another idea is to introduce another quantizer FIQ_LFPQUANT_PESSIMISTIC, which could implement your suggestions (could easily be done by deriving a class from LFPQuantizer). However, the name seems to be too long and likely users get confused by this all... Additionally, the color counting function could be implemented for nearly all image types and bit depths, which is another benefit.

    I'm curious about your comments.

    Please note, that I'd like to have patch tickets #97 and #98 in the next release as well. In my e-mail, I've announced, that I'm working on some new functions (and features), that I need in production code. Tickets #97 to #99 are all part of it.

    Still having two more functions under development, which I will provide in (likely) patch tickets #100 and #101. Maybe there will be another (last one) for FreeImage_CountColorsUsed.

    Regards,
    Carsten

     
  • Carsten Klein

    Carsten Klein - 2014-12-08

    Hi Hervé,

    there still have been some bugs in the above LFPQuantizer.zip. I will now provide a new one, hopefully with no more bugs. In this new version, the MapEntry m_map[256] has already been replaced by a dynamically allocated array (as discussed above).

    File Change
    LFPQuantizer.cpp Corrected implementation for 24-bit images.
    Quantizers.h Removed '#include <map>'

    Carsten

     
  • Hervé Drolon

    Hervé Drolon - 2015-02-19
    • status: open --> pending
    • assigned_to: Hervé Drolon
    • Group: -->
     
  • Hervé Drolon

    Hervé Drolon - 2015-02-19

    Hi Carsten,

    I've added your last code to the CVS (and also updated the FreeImage doc for the next release).

    Hervé

     
  • Anonymous

    Anonymous - 2015-02-20

    Hi Hervé,

    thank you for adding the LFPQUANT algorithm. However, I just found a bug in my code in method LFPQuantizer::AddReservePalette. If the reserve palette contains duplicate colors, this method ends up (or better: never ends) in an infinite loop! I've missed, that the continue statement is located in an inner loop; actually I need to continue the outer loop (the for loop) if the reserve palette's color is already contained in the map.

    For performance reasons, it's likely best to add an ugly goto statement here. Could you please fix this accordingly?

    void LFPQuantizer::AddReservePalette(const void *palette, unsigned size) {
        if (size > MAX_SIZE) {
            size = MAX_SIZE;
        }
        unsigned *ppal = (unsigned *) palette;
        const unsigned offset = m_limit - size;
        for (unsigned i = 0; i < size; ++i) {
            const unsigned color = *ppal++;
            const unsigned index = i + offset;
            unsigned bucket = hash(color) & (MAP_SIZE - 1);
            while (m_map[bucket].color != EMPTY_BUCKET) {
                if (m_map[bucket].color == color) {
                    goto outer_continue;
                }
                bucket = (bucket + 1) % MAP_SIZE;
            }
            m_map[bucket].color = color;
            m_map[bucket].index = index;
        outer_continue: ;
        }
        m_size += size;
    }
    

    This prevents entering an infinite loop, if the reserve palette contains duplicate colors. Of course, this should never be the case for a real-world reserve palette, since palette entries are limited and expensive.

    However, there are still some issues with duplicate reserve colors in this algorithm. Since the newly created palette is kept in the hash map only (which cannot contain duplicates), duplicate colors in the reserve palette are never written to the result image's palette (the palette will contain the entry's corresponding gray value). These palette entries are actually wasted (not usable by the algorithm), since the non-reserve size of the palette is actually determined by the reserve size.

    So, with duplicate colors in the reserve palette, one can actually create unused palette entries in the result image. This could be a feature :-)

    Carsten

     
  • Hervé Drolon

    Hervé Drolon - 2015-02-21

    Hi Carsten,

    I've added your fix for infinite loop, but removed the goto statement.

    Hervé

     
  • Hervé Drolon

    Hervé Drolon - 2015-03-15

    fixed in release 3.17.0

     
  • Hervé Drolon

    Hervé Drolon - 2015-03-15
    • status: pending --> closed
     

Anonymous
Anonymous

Add attachments
Cancel





MongoDB Logo MongoDB