When I first tried to speed up my software with multiple threads in C++, I was pleased to find that I could keep all cores 100% busy. The only downside was that, instead of running faster, my program took much longer than the single-threaded version. So I wrote this thread pool. With it, I now get a performance increase proportional to the number of processors. I would be happy if others found it useful too.
See the [README] page for more information.