The ulttiny library is small, easy to use, and very fast. It is optimized for workloads in which tasks and task groups are created dynamically and concurrently.
This project is a userland library implementation of the Task Queue facility in the OpenSolaris kernel that simplifies thread management. Task Queues are somewhat similar to dispatch queues in Grand Central Dispatch but are more flexible.