| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| smallcode-Linux-X64.tar.gz | 2026-05-29 | 11.2 MB | |
| smallcode-macOS-ARM64.tar.gz | 2026-05-29 | 11.1 MB | |
| smallcode-Windows-X64.tar.gz | 2026-05-29 | 10.9 MB | |
| README.md | 2026-05-29 | 1.8 kB | |
| v1.4.0 source code.tar.gz | 2026-05-29 | 11.6 MB | |
| v1.4.0 source code.zip | 2026-05-29 | 11.7 MB | |
| Totals: 6 Items | | 56.6 MB | 2 |
[1.4.0] - 2026-05-29
feat: RAG coding harness with Python scraper + hybrid search (#64)
Adds an opt-in retrieval harness that scrapes snippet-sized code chunks
from curated repositories and serves them via hybrid (keyword + vector)
search. New surface: `bin/rag-index.js` plus the `smallcode-rag-index` bin
and the `rag:index` npm script (builds `.smallcode/rag/index.json` from
chunks scraped by `scripts/rag_scraper.py`), `src/rag/` (`index_store.js`,
`retriever.js`, `curated_repos.json`), docs in `docs/rag-harness.md`, and
`test/rag.test.js`. No new npm dependencies: the scraper runs via Python.
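As a rough illustration of what hybrid (keyword + vector) search means here, the sketch below blends a token-overlap score with cosine similarity. It is not the code in `src/rag/retriever.js`: the chunk shape (`id`, `text`, `vector`), the `alpha` weight, the `topK` option, and the toy two-dimensional vectors are all assumptions made for the example.

```javascript
// Hypothetical sketch of hybrid scoring; NOT the code in src/rag/retriever.js.
// Chunk shape, alpha, topK and the toy vectors are assumptions.

// Lowercase and split into identifier-like tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// Keyword score: fraction of distinct query tokens that appear in the chunk.
function keywordScore(query, text) {
  const queryTokens = new Set(tokenize(query));
  if (queryTokens.size === 0) return 0;
  const chunkTokens = new Set(tokenize(text));
  let hits = 0;
  for (const token of queryTokens) {
    if (chunkTokens.has(token)) hits += 1;
  }
  return hits / queryTokens.size;
}

// Vector score: cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Blend both scores with weight alpha and return the topK chunks.
function hybridSearch(query, queryVector, chunks, { alpha = 0.5, topK = 3 } = {}) {
  return chunks
    .map((chunk) => ({
      id: chunk.id,
      score: alpha * keywordScore(query, chunk.text)
        + (1 - alpha) * cosine(queryVector, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const chunks = [
  { id: 'debounce', text: 'function debounce(fn, wait) { /* ... */ }', vector: [1, 0] },
  { id: 'throttle', text: 'function throttle(fn, limit) { /* ... */ }', vector: [0, 1] },
];
const results = hybridSearch('debounce function', [1, 0], chunks);
console.log(results.map((r) => r.id));
// → [ 'debounce', 'throttle' ]
```

A weighted sum is only one way to combine the two signals; the actual combination used by the harness is described in `docs/rag-harness.md`.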
fix: strict chat templates reject mid-conversation system messages (#62)
Qwen3 / Qwen3.5 chat templates (and other strict templates) under
llama.cpp `--jinja` raise `System message must be at the beginning.`, and
llama.cpp returns HTTP 400. This happens only when tools are present, since
that is when llama.cpp compiles the template to build a tool-call grammar.
SmallCode injects system-role content mid-conversation (clarifier, plan
request, planner injection, path-validation warnings, skill activation,
compaction summaries), producing a messages array with system entries
at positions other than 0.
- New `src/session/message_normalizer.js#consolidateSystemMessages()` collapses all system-role messages into a single leading system message (preserving order, de-duplicating identical blocks) and emits only non-system turns after it.
- Applied in both request builders (`bin/smallcode.js` and `bin/model_client.js` `chatCompletion`) right before the body is sent, so it catches stray system messages regardless of which path injected them. Verified end-to-end against a Qwen3 model: every tool-bearing request now carries exactly one system message at index 0.
- Test coverage: `test/message_normalizer.test.js` (9 cases).
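
The consolidation described above could look roughly like the following. This is a minimal sketch written from the description only, not the shipped `consolidateSystemMessages()`: it assumes string `content`, joins the system blocks with a blank line, and drops empty system blocks, none of which is confirmed by the release notes.

```javascript
// Hypothetical sketch; NOT the code in src/session/message_normalizer.js.
// Assumptions: content is a string, blocks are joined with a blank line,
// and empty system blocks are dropped.
function consolidateSystemMessages(messages) {
  const seen = new Set();
  const systemBlocks = [];
  const rest = [];

  for (const msg of messages) {
    if (msg.role === 'system') {
      const text = String(msg.content ?? '').trim();
      // Skip empty blocks and exact duplicates, keeping first-seen order.
      if (text && !seen.has(text)) {
        seen.add(text);
        systemBlocks.push(text);
      }
    } else {
      // Non-system turns keep their relative order.
      rest.push(msg);
    }
  }

  if (systemBlocks.length === 0) return rest;
  return [{ role: 'system', content: systemBlocks.join('\n\n') }, ...rest];
}

const normalized = consolidateSystemMessages([
  { role: 'system', content: 'You are a coding assistant.' },
  { role: 'user', content: 'Refactor utils.js' },
  { role: 'system', content: 'Plan: 1) read file 2) edit' },
  { role: 'assistant', content: 'Reading utils.js' },
  { role: 'system', content: 'You are a coding assistant.' },
]);
console.log(normalized.map((m) => m.role));
// → [ 'system', 'user', 'assistant' ]
```

Running the pass immediately before the request body is sent, as the fix does, means a strict template sees at most one system message, at index 0, whichever code path injected the others.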