| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| 0.2.4 source code.tar.gz | 2025-07-08 | 8.3 MB | |
| 0.2.4 source code.zip | 2025-07-08 | 8.4 MB | |
| README.md | 2025-07-08 | 1.6 kB | |
| Totals: 3 Items | | 16.7 MB | 0 |
# Guidance 0.2.4
Better sampling, better metrics, llama-cpp-python fixes (please update to the latest!), and countless visualization fixes.
## Added
- Allow changing `sampling_params` (`top_p`, `top_k`, `min_p`, `repetition_penalty`) on the fly via `Model.with_sampling_params(...)` (see the sketch after this list)
- Add `top_k` tokens back into vis after their temporary removal in the previous refactor
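A minimal sketch of the on-the-fly sampling change, assuming a local `LlamaCpp` backend and that `with_sampling_params` accepts the parameters above as keyword arguments (the release notes only confirm the method name and parameter list, so check the API for the exact signature):

```python
from guidance import gen
from guidance.models import LlamaCpp

# Placeholder path; any llama-cpp-compatible GGUF model works here.
lm = LlamaCpp("path/to/model.gguf")

# Derive a handle with different sampling behaviour without reloading weights.
# The keyword-argument form here is an assumption, not confirmed API.
creative = lm.with_sampling_params(
    top_p=0.95, top_k=50, min_p=0.05, repetition_penalty=1.1
)

result = creative + "Write a one-line tagline for a chess club: " + gen(
    "tagline", max_tokens=20
)
print(result["tagline"])
```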
## Removed
- `Model.token_count` removed in favor of the (currently private) `Model._get_usage().output_tokens`
## Changed
- Bookkeeping of metrics such as `input_tokens`, `output_tokens`, `ff_tokens`, `token_savings`, and `avg_latency_ms` has been added to State and is now accessible via the (private for now) `Model._get_usage()`. This replaces bookkeeping that was previously attached to `Engine` instances.
- Factory function `create_azure_openai_model()` for accessing models hosted in Azure AI (see the sketch after this list)
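A sketch tying the two Changed items together; the factory function's import path and parameter names are assumptions, and `_get_usage()` is private in 0.2.4, so treat this as illustrative rather than stable API:

```python
from guidance import gen
from guidance.models import create_azure_openai_model  # import path assumed

# Parameter names below are assumptions for illustration only.
lm = create_azure_openai_model(
    model_name="gpt-4o-mini",
    azure_endpoint="https://example.openai.azure.com",
    api_key="<your-key>",
)

lm += "Q: What is 2 + 2?\nA: " + gen("answer", max_tokens=5)

# Metrics now live on State rather than on Engine instances.
usage = lm._get_usage()  # private for now, per the note above
print(usage.output_tokens)  # replaces the removed Model.token_count
print(usage.input_tokens, usage.ff_tokens, usage.token_savings, usage.avg_latency_ms)
```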
## Fixed
- Intermittent double widget render fixed.
- Widget not always completing its run fixed.
- Widget backtracking bug fixed.
- Widget now always shows both inputs and outputs; previously it would sometimes fail to.
- `TraceHandler` forests stripped of extra trace nodes, which sometimes caused render glitches.
- Widget latency displays now render.
- Widget early race condition resolved (the widget was sometimes ready only after the backend had started firing messages).
- Various linting and build improvements.
- Tokens generated with OpenAI now correctly tagged as generated for vis.
- Fix compatibility with `llama-cpp-python` 0.3.12, bumping the dependency from 0.3.9 to 0.3.12 (first contrib: @jovemexausto).
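If you pin dependencies yourself, a quick runtime check along these lines (a generic sketch, not part of guidance) confirms the bumped minimum is met:

```python
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("llama-cpp-python"))
# guidance 0.2.4 requires llama-cpp-python of at least 0.3.12 (see entry above).
if installed < Version("0.3.12"):
    raise RuntimeError(
        f"llama-cpp-python {installed} found; guidance 0.2.4 needs >= 0.3.12"
    )
```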