Cactus Needle
26M-parameter function-calling model that runs on incredibly small devices
...It is based on a Simple Attention Network architecture and was distilled from a much larger model to focus on fast, compact tool-use behavior. The project provides open weights, training details, dataset generation resources, and a playground for testing the model with custom tools. Needle is optimized for single-shot function calling rather than broad conversational ability, so its core use case is selecting the right tool and producing structured arguments. It can be fine-tuned locally, including on consumer machines, which makes it useful for experimentation with small personalized agents. ...
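To make "selecting the right tool and producing structured arguments" concrete, here is a minimal sketch of the host-side half of a single-shot function call: the application takes one structured output from the model, validates it against a tool registry, and runs the matching function. Everything in it is an assumption made for illustration: the tool names, the `dispatch` helper, and the JSON shape with `name` and `arguments` keys are hypothetical and are not taken from Needle's documentation, and the model output is a hard-coded string rather than a real inference call.

```python
import json

# Hypothetical tools; a real application would register its own.
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"

def get_weather(city: str) -> str:
    return f"weather requested for {city}"

# Illustrative registry: tool name -> handler plus expected argument types.
TOOLS = {
    "set_timer": {"handler": set_timer, "params": {"minutes": int}},
    "get_weather": {"handler": get_weather, "params": {"city": str}},
}

def dispatch(raw_output: str) -> str:
    """Validate one model-emitted call and run the matching tool."""
    call = json.loads(raw_output)
    if not isinstance(call, dict):
        raise ValueError("expected a JSON object describing one call")
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    spec = TOOLS[name]["params"]
    # Require exactly the declared arguments, no more and no fewer.
    if not isinstance(args, dict) or set(args) != set(spec):
        raise ValueError(f"arguments for {name!r} must be {sorted(spec)}")
    for key, expected in spec.items():
        if not isinstance(args[key], expected):
            raise TypeError(f"{key!r} must be {expected.__name__}")
    return TOOLS[name]["handler"](**args)

# Assumed output format, standing in for what a model might emit
# for the request "set a timer for ten minutes".
example_output = '{"name": "set_timer", "arguments": {"minutes": 10}}'
result = dispatch(example_output)
print(result)
```

Validating the tool name and argument types before calling anything is a deliberate choice here: a very small model can occasionally emit a malformed or unknown call, and rejecting it at this boundary is cheaper than handling a failure inside the tool.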