| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-12 | 35.4 kB | |
| Tunix v0.1.6 -- Agentic RL _ VLM source code.tar.gz | 2026-03-12 | 28.7 MB | |
| Tunix v0.1.6 -- Agentic RL _ VLM source code.zip | 2026-03-12 | 29.0 MB | |
| Totals: 3 Items | | 57.7 MB | 0 |
Highlights
- supports Agentic RL training, see https://github.com/google/tunix/tree/main/examples/agentic/gemma_grpo_demo_nb.py
- supports VLM training, see https://github.com/google/tunix/blob/main/examples/sft/vlm_training.py

    :::python
    from tunix import AgenticGRPOConfig
    from tunix import AgenticGRPOLearner

    agentic_grpo_config = AgenticGRPOConfig(
        num_generations=NUM_GENERATIONS,
        num_iterations=NUM_ITERATIONS,
        max_response_length=MAX_RESPONSE_LENGTH,
        beta=BETA,
        epsilon=EPSILON,
        system_prompt=SWE_SYSTEM_PROMPT,
        max_concurrency=1,
        epsilon_high=0.28,
        off_policy_steps=0,
    )

    agentic_grpo_learner = AgenticGRPOLearner(
        rl_cluster=rl_cluster,
        reward_fns=reward_fns,
        agent_class=MyAgentClass,
        agent_kwargs={},
        env_class=MyEnv,
        env_kwargs={"max_steps": MAX_STEPS},
        algo_config=agentic_grpo_config,
        chat_parser=chat_parser,
    )

    agentic_grpo_learner.train(train_dataset=train_dataset)

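
For readers unfamiliar with the `epsilon` and `epsilon_high` fields in the configuration above, the sketch below shows how asymmetric clipping bounds are commonly applied in GRPO-style objectives. It is a minimal, pure-Python illustration and not Tunix's implementation: the helper name `clipped_surrogate`, its default values, and the reading of `epsilon` as the lower bound and `epsilon_high` as the upper bound are all assumptions made for this example.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantage,
                      epsilon=0.2, epsilon_high=0.28):
    """Per-token clipped policy-gradient objective (to be maximized).

    Hypothetical helper, not a Tunix API: the probability ratio is
    clipped to [1 - epsilon, 1 + epsilon_high] before it is multiplied
    by the advantage.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon_high)
    # Keep the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)


# Ratio 1.5 with a positive advantage: the gain is capped by the upper bound.
print(clipped_surrogate(math.log(1.5), 0.0, advantage=1.0))
# Ratio 0.5 with a negative advantage: the penalty is set by the lower bound.
print(clipped_surrogate(math.log(0.5), 0.0, advantage=-1.0))
```

Under this reading, a larger `epsilon_high` than `epsilon` lets the policy raise the probability of well-rewarded tokens more than it can lower the probability of poorly rewarded ones in a single update.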
What's Changed
- Developing for v0.1.6 now. by @wang2yn84 in https://github.com/google/tunix/pull/785
- Fix the vllm server mode not finish issue. by @wang2yn84 in https://github.com/google/tunix/pull/784
- [Tunix] Update Dockerfile and deepscaler trainer script to separate trainer model and ref model. by @copybara-service[bot] in https://github.com/google/tunix/pull/725
- Add Tunix RL GRPO examples for Gemma3. by @copybara-service[bot] in https://github.com/google/tunix/pull/788
- [Tunix] change model implementation to be pytree compatible. by @copybara-service[bot] in https://github.com/google/tunix/pull/782
- Fix TPU nightly regression workflow to use vLLM container and add new tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/754
- [Tunix] Update sharding configuration for attention weights. by @copybara-service[bot] in https://github.com/google/tunix/pull/759
- [Tunix] Add gcsfs to TPU nightly regression dependencies. by @copybara-service[bot] in https://github.com/google/tunix/pull/790
- Adding back test_logprobs_extraction_with_missing_token. by @wang2yn84 in https://github.com/google/tunix/pull/789
- feat:add device indexes for sglang jax by @pathfinder-pf in https://github.com/google/tunix/pull/786
- Fix the rendering issue in Example gallery document. by @rajasekharporeddy in https://github.com/google/tunix/pull/799
- [Tunix] Remove the version pin for SGLang. by @copybara-service[bot] in https://github.com/google/tunix/pull/798
- [Fixes 794] fix transformers=4.57.1 to solve issue42369 in transformers and use c… by @aolemila in https://github.com/google/tunix/pull/795
- Refactor gemma3 modelConfig to explicitly include all models by @copybara-service[bot] in https://github.com/google/tunix/pull/792
- [Tunix] Fix nightly regression: remove unnecessary --root-dir argument from TPU nightly regression script. Fix the MATH500 eval script. by @copybara-service[bot] in https://github.com/google/tunix/pull/796
- use naming utils in tunix cli by @copybara-service[bot] in https://github.com/google/tunix/pull/736
- [Tunix] Remove GitHub Actions replacement in copybara. Relying on a more generic google3 replacement rule by @copybara-service[bot] in https://github.com/google/tunix/pull/803
- reduce safetensor loading time by @keshavb96 in https://github.com/google/tunix/pull/760
- [Tunix] Remove env_utils.fs_open from safetensors_loader. fsspec object doesn't have fileno. 3P test is broken: https://github.com/google/tunix/actions/runs/19689186862/job/56403241781?pr=744 by @copybara-service[bot] in https://github.com/google/tunix/pull/804
- [Tunix] Pass HF_TOKEN to TPU nightly regression tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/805
- [Tunix] Follow up of cl/836961494. It was out of sync with github PR. by @copybara-service[bot] in https://github.com/google/tunix/pull/807
- [Tunix] Pin the vLLM TPU Docker image to a specific nightly build version for the TPU tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/808
- [Tunix] Update tunix nightly regression workflow schedule. Change the cron schedule from 2 AM UTC to 10 AM UTC. by @copybara-service[bot] in https://github.com/google/tunix/pull/806
- Centralize Flax sharding setup in env_utils by @copybara-service[bot] in https://github.com/google/tunix/pull/797
- Fix gemma3 grpo shell scripts by @sizhit2 in https://github.com/google/tunix/pull/791
- [Tunix] Fix GRPO script. by @copybara-service[bot] in https://github.com/google/tunix/pull/811
- rename all model configs to use "p" instead of "_" for float values by @copybara-service[bot] in https://github.com/google/tunix/pull/740
- [Tunix] Move model alignment tests from CPU to TPU run dev workflow. by @copybara-service[bot] in https://github.com/google/tunix/pull/818
- handle the situation when lora_config is not provided by @Hanjun-Dai in https://github.com/google/tunix/pull/813
- checkpoint_options->checkpointing_options in cli/config.py by @Hanjun-Dai in https://github.com/google/tunix/pull/814
- [Tunix] Remove EOS token appending to the prompt in vLLM and SGLang sampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/827
- Fix bos duplication by @Hanjun-Dai in https://github.com/google/tunix/pull/822
- remove extra flax sharding check by @copybara-service[bot] in https://github.com/google/tunix/pull/817
- Remove irrelevant text in GRPO example by @copybara-service[bot] in https://github.com/google/tunix/pull/823
- [TUNIX] Switch to absl.logging in the tunix util file for scripts. by @copybara-service[bot] in https://github.com/google/tunix/pull/831
- renaming Transformer to Gemma for gemma model by @copybara-service[bot] in https://github.com/google/tunix/pull/819
- Expand model tests and fix gemma from_params parsing by @copybara-service[bot] in https://github.com/google/tunix/pull/828
- add missing refactoring to model test by @copybara-service[bot] in https://github.com/google/tunix/pull/835
- allow user to config project name and run name in wandb by @Hanjun-Dai in https://github.com/google/tunix/pull/836
- fix the issue when eager mode jax is triggered in undesired places by @Hanjun-Dai in https://github.com/google/tunix/pull/837
- make TFDS download flag a configurable option by @copybara-service[bot] in https://github.com/google/tunix/pull/763
- Fix llama RL verl script by @copybara-service[bot] in https://github.com/google/tunix/pull/839
- Fix ref model compute_logps input sharding issue by @copybara-service[bot] in https://github.com/google/tunix/pull/846
- Improves the GRPO script to be more configurable. by @wang2yn84 in https://github.com/google/tunix/pull/840
- remove unused fn by @copybara-service[bot] in https://github.com/google/tunix/pull/847
- Add support for Dr. GRPO by @copybara-service[bot] in https://github.com/google/tunix/pull/681
- [Tunix] Update parallel sizes to use ROLLOUT_MESH in grpo_demo. by @copybara-service[bot] in https://github.com/google/tunix/pull/851
- Fix typo and citation formatting by @selamw1 in https://github.com/google/tunix/pull/865
- [Bug] Fix/sglang jax support pathways by @aolemila in https://github.com/google/tunix/pull/860
- [Tunix] Add number of batches argument and reduce nightly regression run time. by @copybara-service[bot] in https://github.com/google/tunix/pull/866
- Add AgenticRLLearner base class. by @copybara-service[bot] in https://github.com/google/tunix/pull/829
- Add XM launch for tunix cli by @copybara-service[bot] in https://github.com/google/tunix/pull/848
- update OSS readme by @copybara-service[bot] in https://github.com/google/tunix/pull/863
- use env_utils in config_test by @copybara-service[bot] in https://github.com/google/tunix/pull/872
- check integer type by @copybara-service[bot] in https://github.com/google/tunix/pull/877
- Add smoke shell scripts to nightly run by @copybara-service[bot] in https://github.com/google/tunix/pull/855
- use np instead jnp to compute rewards in agentic framework by @copybara-service[bot] in https://github.com/google/tunix/pull/881
- [Tunix] Support pre-resharding pytrees with different meshes. by @copybara-service[bot] in https://github.com/google/tunix/pull/882
- Fix the `ValueError` while loading the Gemma model in `logit_distillation.ipynb` by @rajasekharporeddy in https://github.com/google/tunix/pull/870
- change `qwen3_30b` more specific to `qwen3_30b_a3b` by @copybara-service[bot] in https://github.com/google/tunix/pull/880
- add qwen4b model config which uses tie embedding by @Hanjun-Dai in https://github.com/google/tunix/pull/858
- Add codewiki link by @copybara-service[bot] in https://github.com/google/tunix/pull/886
- [Script] merge grpo_demo_sglang_jax_rollout.py into grpo_demo_llama3_qwen2.py by @aolemila in https://github.com/google/tunix/pull/868
- Adding support for gemma-X-, llama-X- naming similar to HF by @copybara-service[bot] in https://github.com/google/tunix/pull/876
- enforce rollout tokens to be in RAM by @copybara-service[bot] in https://github.com/google/tunix/pull/889
- Adding Automodel interface to Tunix by @copybara-service[bot] in https://github.com/google/tunix/pull/862
- allow users to import reward module/fn outside tunix folder by @Hanjun-Dai in https://github.com/google/tunix/pull/852
- Fix breaking config test by @copybara-service[bot] in https://github.com/google/tunix/pull/901
- use np instead of jnp for reward fn and GRPO group adv by @copybara-service[bot] in https://github.com/google/tunix/pull/891
- change `qwen3_4b_2507` model config added to match HF model ids by @copybara-service[bot] in https://github.com/google/tunix/pull/897
- [Tunix] Remove EOS token appending to the prompt in vLLM and SGLang sampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/900
- use model_path instead of model_id for gcs in the cli by @copybara-service[bot] in https://github.com/google/tunix/pull/904
- change from mock.patch to mock.patch.object. by @copybara-service[bot] in https://github.com/google/tunix/pull/896
- [Tunix]Make chat_parser optional in AgenticRLLearner. by @copybara-service[bot] in https://github.com/google/tunix/pull/911
- Model creation smoke test by @copybara-service[bot] in https://github.com/google/tunix/pull/873
- fix llama run script model name by @copybara-service[bot] in https://github.com/google/tunix/pull/909
- [Tunix] Use `jnp.concatenate` instead of `np.concatenate` for merging micro-batches. by @copybara-service[bot] in https://github.com/google/tunix/pull/912
- [Tunix] Fix sharding for act_btf in Tunix models. by @copybara-service[bot] in https://github.com/google/tunix/pull/914
- Add `Colab` and `Kaggle` badges to the example notebooks by @rajasekharporeddy in https://github.com/google/tunix/pull/893
- Doc ci check by @ev-br in https://github.com/google/tunix/pull/888
- [Tunix] Remove the duplicate tests move rules. by @copybara-service[bot] in https://github.com/google/tunix/pull/922
- [Tunix] Moves the smoke_tests folder to top level. by @copybara-service[bot] in https://github.com/google/tunix/pull/923
- Support custom MaxText (vLLM) models in sampler and rollout. by @NicoGrande in https://github.com/google/tunix/pull/841
- Fix for failing nightly regression test by @copybara-service[bot] in https://github.com/google/tunix/pull/918
- fix perf by @pathfinder-pf in https://github.com/google/tunix/pull/915
- update gemma-3 models ids to match HF by @copybara-service[bot] in https://github.com/google/tunix/pull/916
- Validation tests for model id to exist on HF by @copybara-service[bot] in https://github.com/google/tunix/pull/917
- allow users to specify data module outside of tunix; refactor a bit by @Hanjun-Dai in https://github.com/google/tunix/pull/853
- add train step time metric. by @copybara-service[bot] in https://github.com/google/tunix/pull/924
- [Tunix] Force install numpy 2.3.5 for vllm by @copybara-service[bot] in https://github.com/google/tunix/pull/931
- Add an option to cache NNX traversals in PEFT trainer. by @copybara-service[bot] in https://github.com/google/tunix/pull/928
- [Tunix] Fix sharding for act_btf in Tunix models. by @copybara-service[bot] in https://github.com/google/tunix/pull/934
- Fix the dataset post initialization by @wang2yn84 in https://github.com/google/tunix/pull/936
- add metric first_micro_batch_rollout_time in fully diagg mode. by @copybara-service[bot] in https://github.com/google/tunix/pull/925
- remove mesh cm from example by @copybara-service[bot] in https://github.com/google/tunix/pull/908
- Enable padding for attention qkv biases. by @copybara-service[bot] in https://github.com/google/tunix/pull/941
- checkpoint opt state by @copybara-service[bot] in https://github.com/google/tunix/pull/945
- Introduce prompt queue to support off-policy by @copybara-service[bot] in https://github.com/google/tunix/pull/946
- remove type indirection by @copybara-service[bot] in https://github.com/google/tunix/pull/950
- Adding dataclass to naming by @copybara-service[bot] in https://github.com/google/tunix/pull/948
- Remove `_obs_cache` by @copybara-service[bot] in https://github.com/google/tunix/pull/952
- remove type indirection by @copybara-service[bot] in https://github.com/google/tunix/pull/951
- enable proper report for learning rate during training by @Hanjun-Dai in https://github.com/google/tunix/pull/940
- Code update by @copybara-service[bot] in https://github.com/google/tunix/pull/942
- add llama 3.2 1b/3b-instruct model config by @Hanjun-Dai in https://github.com/google/tunix/pull/935
- use taskgroup for batch of async tasks by @copybara-service[bot] in https://github.com/google/tunix/pull/954
- Add Tunix CLI setup instructions to g3docs. by @copybara-service[bot] in https://github.com/google/tunix/pull/957
- [Internal] Adding naming conventions documentation by @copybara-service[bot] in https://github.com/google/tunix/pull/949
- Fix broken example script references in CLI README by @SarveshMahalingam in https://github.com/google/tunix/pull/938
- Fix the cli to have the proper model config overwritten. by @wang2yn84 in https://github.com/google/tunix/pull/960
- [internal] extend XM launch to peft by @copybara-service[bot] in https://github.com/google/tunix/pull/959
- adding missing test case variant by @copybara-service[bot] in https://github.com/google/tunix/pull/956
- fix safetensor publish by @copybara-service[bot] in https://github.com/google/tunix/pull/968
- Add gcsfs. by @copybara-service[bot] in https://github.com/google/tunix/pull/969
- make explicit copy to create actor model by @copybara-service[bot] in https://github.com/google/tunix/pull/973
- extend support for gemma1.1 in automodel by @copybara-service[bot] in https://github.com/google/tunix/pull/970
- [Tunix] replaces a print statement with logging.vlog and other minor nits. by @copybara-service[bot] in https://github.com/google/tunix/pull/974
- Apply transpose rule to safetensor saver function by @wang2yn84 in https://github.com/google/tunix/pull/971
- [Tunix] Updates the split_by_mesh_axis access pattern. by @copybara-service[bot] in https://github.com/google/tunix/pull/975
- [Tunix] Fix the logic to handle multiple intermediate meshes. by @copybara-service[bot] in https://github.com/google/tunix/pull/976
- Fix Gemma 3 model loading by @selamw1 in https://github.com/google/tunix/pull/982
- Add overlong reward shaping for DAPO. by @copybara-service[bot] in https://github.com/google/tunix/pull/947
- Internal Doc by @copybara-service[bot] in https://github.com/google/tunix/pull/984
- Internal change by @copybara-service[bot] in https://github.com/google/tunix/pull/985
- Feat/add lora for sglangjax by @aolemila in https://github.com/google/tunix/pull/826
- Add a multi-turn RL example notebook. by @copybara-service[bot] in https://github.com/google/tunix/pull/988
- Fix Gemma 3 safetensor loading by @copybara-service[bot] in https://github.com/google/tunix/pull/987
- minor resharding updates. by @copybara-service[bot] in https://github.com/google/tunix/pull/989
- Disable Lora in base_config.yaml by @copybara-service[bot] in https://github.com/google/tunix/pull/983
- fix math util parsing by @copybara-service[bot] in https://github.com/google/tunix/pull/994
- Add support for interleaved layer mappings and enhanced key mapping regex. by @copybara-service[bot] in https://github.com/google/tunix/pull/815
- Unify to HF as the single model_id by @copybara-service[bot] in https://github.com/google/tunix/pull/981
- Tunix Documentation V2 for OSS by @copybara-service[bot] in https://github.com/google/tunix/pull/990
- Improve HBM usage tracking by avoiding double counting by @copybara-service[bot] in https://github.com/google/tunix/pull/1003
- Add perf_metrics to tunix cli -- grpo by @copybara-service[bot] in https://github.com/google/tunix/pull/1001
- enable Sphinx to fix slug for every header by @copybara-service[bot] in https://github.com/google/tunix/pull/1007
- [Tunix] Update Copybara to keep a TOC placeholder in g3doc. by @copybara-service[bot] in https://github.com/google/tunix/pull/1011
- Allow single str input by @copybara-service[bot] in https://github.com/google/tunix/pull/1010
- [Tunix] Fix the copybara file transformation issue. by @copybara-service[bot] in https://github.com/google/tunix/pull/1012
- [Distillation] Enhanced Logit Strategy with Top-K and Metrics by @gagika in https://github.com/google/tunix/pull/991
- Add qwen3 vllm/sglang weight mapping support by @wang2yn84 in https://github.com/google/tunix/pull/1009
- [Tunix] Fix Qwen2 vLLM to JAX embedding mapping. by @copybara-service[bot] in https://github.com/google/tunix/pull/1019
- Adding new sharding and performance args for vLLM by @NicoGrande in https://github.com/google/tunix/pull/1006
- [Tunix] Add _put_prompts_to_queue to handle dataset iteration and partial batches. by @copybara-service[bot] in https://github.com/google/tunix/pull/1018
- [Tunix] Adds a check to ensure that arrays are still live before accessing their shards. by @copybara-service[bot] in https://github.com/google/tunix/pull/1025
- Explicitly handle single string inputs in samplers. by @copybara-service[bot] in https://github.com/google/tunix/pull/1030
- reduce log by @copybara-service[bot] in https://github.com/google/tunix/pull/1022
- retry on hf download and list to prevent gateway error that cause presubmit timeout by @copybara-service[bot] in https://github.com/google/tunix/pull/1014
- Add Gemini Code Assist style guide for PR reviews by @copybara-service[bot] in https://github.com/google/tunix/pull/1024
- move scrubber after transformation for it to locate code in github by @copybara-service[bot] in https://github.com/google/tunix/pull/1016
- [Tunix] Fix Qwen2/3 vLLM to JAX lm_head mapping. by @copybara-service[bot] in https://github.com/google/tunix/pull/1020
- bug: fix metric refer_inference_time. by @copybara-service[bot] in https://github.com/google/tunix/pull/1037
- feat: add multi-rollout engine interfaces. by @copybara-service[bot] in https://github.com/google/tunix/pull/1039
- properly set LoRA alpha by @copybara-service[bot] in https://github.com/google/tunix/pull/1044
- Clean up and update the deepscaler training script with sglang configurations. by @copybara-service[bot] in https://github.com/google/tunix/pull/1045
- [Tunix] Use single shared loop for producer. by @copybara-service[bot] in https://github.com/google/tunix/pull/1042
- [Tunix] Add trajectory data logging to disk for further analysis and visualization. by @copybara-service[bot] in https://github.com/google/tunix/pull/980
- [Tunix] Add `convert_messages_to_string` to handle numpy array content in messages. by @copybara-service[bot] in https://github.com/google/tunix/pull/1047
- internal by @copybara-service[bot] in https://github.com/google/tunix/pull/1046
- Cast logits to float32 in sample_top_p. by @copybara-service[bot] in https://github.com/google/tunix/pull/1051
- increase episode timeout by @copybara-service[bot] in https://github.com/google/tunix/pull/1052
- VLM Training (1): Add vision to Gemma 3 by @copybara-service[bot] in https://github.com/google/tunix/pull/986
- update README file in tunix by @copybara-service[bot] in https://github.com/google/tunix/pull/1054
- Add timeline visualization of perf metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1035
- Sort perfetto trace based on uuid so the sequence follows the timeline by @copybara-service[bot] in https://github.com/google/tunix/pull/1050
- Use the rollout engine construct mesh. by @wang2yn84 in https://github.com/google/tunix/pull/1060
- Fix spacing in algorithm diagram. by @copybara-service[bot] in https://github.com/google/tunix/pull/1065
- Add logging for perf metric mode by @copybara-service[bot] in https://github.com/google/tunix/pull/1057
- Skip softmax and sorting of probabilities when top_p == 1.0 and top_k is None. by @copybara-service[bot] in https://github.com/google/tunix/pull/1056
- Refactor tunix Gemma3-4b SFT script to use new config structure. by @copybara-service[bot] in https://github.com/google/tunix/pull/1059
- Log the computed score in GSM8K reward function by @copybara-service[bot] in https://github.com/google/tunix/pull/1058
- [Tunix] Add Qwen3 32B model configuration. by @copybara-service[bot] in https://github.com/google/tunix/pull/1068
- fix gcs paths in deepscaler notebook by @copybara-service[bot] in https://github.com/google/tunix/pull/1074
- Fix wrong parent when nesting perf metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1066
- fix optimizer CP restore by @copybara-service[bot] in https://github.com/google/tunix/pull/1070
- [Tunix] Add log_level config to SglangJaxSampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/1069
- [Tunix] Add trajectory logging to agentic GRPO learner. by @copybara-service[bot] in https://github.com/google/tunix/pull/1071
- remove max_steps from trajectory_collect_engine by @copybara-service[bot] in https://github.com/google/tunix/pull/1076
- use absl logging across the repo by @copybara-service[bot] in https://github.com/google/tunix/pull/1077
- feat: log rollout and train time at micro batch level. by @copybara-service[bot] in https://github.com/google/tunix/pull/1038
- Add all steps to global span by @copybara-service[bot] in https://github.com/google/tunix/pull/1067
- [Tunix] Remove upper bound on JAX version in tunix prod dependencies. by @copybara-service[bot] in https://github.com/google/tunix/pull/1079
- Refactor GRPO rollout to simplify grouping and avoid deepcopy by @copybara-service[bot] in https://github.com/google/tunix/pull/1075
- [Tunix] Pad the number of heads for projection bias. by @copybara-service[bot] in https://github.com/google/tunix/pull/1086
- Remove max_open_buckets from GroupQueueManager by @copybara-service[bot] in https://github.com/google/tunix/pull/1083
- chore: Migrate gsutil usage to gcloud storage by @gurusai-voleti in https://github.com/google/tunix/pull/1082
- [Feat] add log_level in SglangJaxConfig and update default page_size by @aolemila in https://github.com/google/tunix/pull/1090
- fix potential race condition on dictionary update by @copybara-service[bot] in https://github.com/google/tunix/pull/1091
- update doc string and error message by @copybara-service[bot] in https://github.com/google/tunix/pull/1093
- Supports padding kwargs for samplers. by @wang2yn84 in https://github.com/google/tunix/pull/1095
- [Tunix]: Skip the already trained data on job resume. by @copybara-service[bot] in https://github.com/google/tunix/pull/1088
- disable perf metrics by default in the cli. by @copybara-service[bot] in https://github.com/google/tunix/pull/1084
- minor update by @copybara-service[bot] in https://github.com/google/tunix/pull/1092
- Added a GPU demo for PEFT with QLoRA on Llama 3_1 by @katjasrz in https://github.com/google/tunix/pull/1105
- Use Exception instead BaseException in Tunix by @wang2yn84 in https://github.com/google/tunix/pull/1108
- fix loss mask for agentic learner by @copybara-service[bot] in https://github.com/google/tunix/pull/1100
- Set the max worker number in asyncio loop. by @wang2yn84 in https://github.com/google/tunix/pull/1111
- add comment clarifying micro batch has to be 1 by @copybara-service[bot] in https://github.com/google/tunix/pull/1094
- Refactor the vllm sampler config with InitVar by @wang2yn84 in https://github.com/google/tunix/pull/1110
- forbidden_tokens in sampler call accepts token IDs instead of strings. by @copybara-service[bot] in https://github.com/google/tunix/pull/1114
- simplify trajectory result processing by @copybara-service[bot] in https://github.com/google/tunix/pull/1109
- fix group_id and pair_idx in traj by @copybara-service[bot] in https://github.com/google/tunix/pull/1116
- Add Colab and Kaggle badges to `qlora_llam3_gpu` example tutorial by @rajasekharporeddy in https://github.com/google/tunix/pull/1122
- [Tunix] Improve GCS CSV writing. by @copybara-service[bot] in https://github.com/google/tunix/pull/1120
- use max_response_len for deepscaler by @copybara-service[bot] in https://github.com/google/tunix/pull/1124
- fix offpolicy step by @copybara-service[bot] in https://github.com/google/tunix/pull/1128
- Update links in Tunix OSS README. by @copybara-service[bot] in https://github.com/google/tunix/pull/1121
- [Tunix] Engine kwargs overwrite predefined config keys. by @copybara-service[bot] in https://github.com/google/tunix/pull/1129
- [Tunix] Minor fix on rewards by @copybara-service[bot] in https://github.com/google/tunix/pull/1131
- [Tunix] Handles None logits from vLLM. by @copybara-service[bot] in https://github.com/google/tunix/pull/1126
- Change duplicate function registration to log a warning instead of raising an error. by @copybara-service[bot] in https://github.com/google/tunix/pull/1119
- Fix auto-assignment of github issues and pull requests to the eng. by @rajasekharporeddy in https://github.com/google/tunix/pull/1089
- Match changes in README by @copybara-service[bot] in https://github.com/google/tunix/pull/1113
- [Tunix] Fixes a bug in `math_rewards.py` where multiple rewards could be added for a single sample. by @copybara-service[bot] in https://github.com/google/tunix/pull/1142
- Adding fix to logical axis cm for RL. by @NicoGrande in https://github.com/google/tunix/pull/1143
- Fix eos issues in sglang / vllm samplers during on-policy rollout by @yixinw in https://github.com/google/tunix/pull/1148
- allow passing rollout configs from sglang/vllm through cli by @yixinw in https://github.com/google/tunix/pull/1140
- speed up agentic rl by @copybara-service[bot] in https://github.com/google/tunix/pull/1136
- Minor Doc Fixes: Correct a typo and add a hyperlink. by @rajasekharporeddy in https://github.com/google/tunix/pull/800
- [Tunix] Another minor fix for extracting ground truth. We should always evaluate against a ground truth without `boxed` formatting. by @copybara-service[bot] in https://github.com/google/tunix/pull/1154
- remat MLP block by @copybara-service[bot] in https://github.com/google/tunix/pull/1130
- fix group_id by @copybara-service[bot] in https://github.com/google/tunix/pull/1157
- ensure seed is set for VLLM sampling params by @copybara-service[bot] in https://github.com/google/tunix/pull/1162
- fix deepscaler notebook by @copybara-service[bot] in https://github.com/google/tunix/pull/1161
- Move ifrt based reshard out of experimental. Leaving intermediate resharding and sidechannel resharding in experimental. by @copybara-service[bot] in https://github.com/google/tunix/pull/1146
- [Tunix] Add vLLM sampler for math eval. by @copybara-service[bot] in https://github.com/google/tunix/pull/1152
- speedup trainer for RL by @copybara-service[bot] in https://github.com/google/tunix/pull/1141
- some BE work by @copybara-service[bot] in https://github.com/google/tunix/pull/1168
- [Tunix] Update `BaseAgent` to accept observations with a "prompts" key rather than "question". by @copybara-service[bot] in https://github.com/google/tunix/pull/1164
- expert parallelism config in base rollout by @khatwanimohit in https://github.com/google/tunix/pull/1099
- fix metric logging step by @copybara-service[bot] in https://github.com/google/tunix/pull/1173
- Add DeepSWE train script. by @copybara-service[bot] in https://github.com/google/tunix/pull/1134
- Update the `qlora_llama3_gpu.ipynb` notebook by @rajasekharporeddy in https://github.com/google/tunix/pull/1160
- [Tunix] improve trajectory logging to support numpy array and scalar types. by @copybara-service[bot] in https://github.com/google/tunix/pull/1163
- [Tunix] Initialize policy version from global steps. by @copybara-service[bot] in https://github.com/google/tunix/pull/1182
- make trace writing a configurable option by @copybara-service[bot] in https://github.com/google/tunix/pull/1085
- remove unnecessary rollout round by @copybara-service[bot] in https://github.com/google/tunix/pull/1174
- fix expert_parallel_size to not pass through to vLLM args by @khatwanimohit in https://github.com/google/tunix/pull/1181
- measure global step time, prompt len and clip ratio by @copybara-service[bot] in https://github.com/google/tunix/pull/1183
- remove tflops measurement by @copybara-service[bot] in https://github.com/google/tunix/pull/1171
- change defaults for Dropout and BatchNorm by @copybara-service[bot] in https://github.com/google/tunix/pull/1184
- [Resolved 1149] fix oom due to missing closing loop by @aolemila in https://github.com/google/tunix/pull/1151
- [Tunix] Add support for aligning 1D KV biases in sglang_jax. by @copybara-service[bot] in https://github.com/google/tunix/pull/1175
- Enable pr from user's fork to auto assign issues correctly. by @copybara-service[bot] in https://github.com/google/tunix/pull/1188
- [Tunix] Reduce log spam from type mismatch warnings. by @copybara-service[bot] in https://github.com/google/tunix/pull/1187
- split metric prefix by @copybara-service[bot] in https://github.com/google/tunix/pull/1185
- Add support for vllm sampler kwargs. by @NicoGrande in https://github.com/google/tunix/pull/1169
- add pg_clipfrac to grpo_learner by @andytwigg in https://github.com/google/tunix/pull/1203
- fix flax==0.12.4 in tpu-tests.yml temporarily by @aolemila in https://github.com/google/tunix/pull/1206
- [Tunix Perf] New timeline and span definitions by @copybara-service[bot] in https://github.com/google/tunix/pull/1147
- Refactor BackendMappingMixin to use explicit BACKEND_PACKAGE_PATH by @copybara-service[bot] in https://github.com/google/tunix/pull/1198
- Supported mixed precision training in qwen2 by @copybara-service[bot] in https://github.com/google/tunix/pull/1199
- [Tunix Perf] New perf tracer by @copybara-service[bot] in https://github.com/google/tunix/pull/1172
- force loss computation to be in fp32 by @copybara-service[bot] in https://github.com/google/tunix/pull/1210
- Add perfetto and logging export by @copybara-service[bot] in https://github.com/google/tunix/pull/1155
- [Tunix] Make returning logprobs from vLLM sampler configurable. by @copybara-service[bot] in https://github.com/google/tunix/pull/1212
- make perf engine (v1 or v2) and export function selectable. by @copybara-service[bot] in https://github.com/google/tunix/pull/1191
- Add image processor by @copybara-service[bot] in https://github.com/google/tunix/pull/1064
- Allow sampler to take in images by @copybara-service[bot] in https://github.com/google/tunix/pull/1103
- Add VLM SFT example by @copybara-service[bot] in https://github.com/google/tunix/pull/1104
- add docs for perf tracing by @copybara-service[bot] in https://github.com/google/tunix/pull/1209
- Fix typos in README by @copybara-service[bot] in https://github.com/google/tunix/pull/1215
- Add RLOO advantage estimator to Tunix. by @copybara-service[bot] in https://github.com/google/tunix/pull/1211
- [Tunix] Remove utils.time_measure from the training loop. by @copybara-service[bot] in https://github.com/google/tunix/pull/1118
- Fix typo in documentation. by @copybara-service[bot] in https://github.com/google/tunix/pull/1178
- add entropy loss, grad_norm to metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1216
- Add trajectory status to track limit. by @copybara-service[bot] in https://github.com/google/tunix/pull/1005
- add qwen3 grpo example script with simplemath rewards by @andytwigg in https://github.com/google/tunix/pull/1217
- add support for qwen3-base variants by @andytwigg in https://github.com/google/tunix/pull/1204
- add simple_math reward_fn by @andytwigg in https://github.com/google/tunix/pull/1214
- Update issue auto-assignment script to async and clean up logic. by @copybara-service[bot] in https://github.com/google/tunix/pull/1220
- Enable tied embedding for Qwen3-0.6B and Qwen3-1.7B. by @copybara-service[bot] in https://github.com/google/tunix/pull/1219
- [Tunix] Switch to safetensor API based loader when Pathways is enabled. by @copybara-service[bot] in https://github.com/google/tunix/pull/1226
- [Tunix] Update DeepScaler training notebook with vLLM optimizations. by @copybara-service[bot] in https://github.com/google/tunix/pull/1231
- use unified trace_dir for the trace writer by @copybara-service[bot] in https://github.com/google/tunix/pull/1218
- [Tunix] Add a script to run Pathways on GKE. by @copybara-service[bot] in https://github.com/google/tunix/pull/1229
- create NOOP trace writer by @copybara-service[bot] in https://github.com/google/tunix/pull/1232
- Update Qwen3 JAX to HF mappings for vLLM. by @copybara-service[bot] in https://github.com/google/tunix/pull/1234
- [Tunix] Fix the get_per_token_logps signature in vllm_rollout.py. by @copybara-service[bot] in https://github.com/google/tunix/pull/1236
- chore: move agentic rl learner out of experimental. by @copybara-service[bot] in https://github.com/google/tunix/pull/1230
- fix: add lora flag validation in RLCluster init. by @copybara-service[bot] in https://github.com/google/tunix/pull/1239
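
Several entries in the list above add or adjust advantage estimators (the GRPO group advantage, Dr. GRPO, and RLOO). As a rough orientation, the sketch below contrasts the three on a single group of rewards sampled for one prompt. It follows the commonly published definitions and is not taken from the Tunix source: the function names are hypothetical, and the use of the population standard deviation with a small `eps` in the GRPO variant is an assumption, since implementations differ on that detail.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-6):
    """GRPO-style: center by the group mean, scale by the group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some code uses sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


def dr_grpo_advantages(rewards):
    """Dr. GRPO-style: center by the group mean, no std normalization."""
    mu = mean(rewards)
    return [r - mu for r in rewards]


def rloo_advantages(rewards):
    """RLOO: each sample's baseline is the mean reward of the other samples."""
    n = len(rewards)
    if n < 2:
        raise ValueError("RLOO needs at least two samples per group")
    total = sum(rewards)
    return [r - (total - r) / (n - 1) for r in rewards]


# One group of four sampled responses with binary rewards.
rewards = [1.0, 0.0, 0.0, 1.0]
print(grpo_advantages(rewards))
print(dr_grpo_advantages(rewards))
print(rloo_advantages(rewards))
```

All three produce advantages that sum to zero within a group; they differ only in how the baseline and the scale are chosen.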
New Contributors
- @keshavb96 made their first contribution in https://github.com/google/tunix/pull/760
- @selamw1 made their first contribution in https://github.com/google/tunix/pull/865
- @NicoGrande made their first contribution in https://github.com/google/tunix/pull/841
- @SarveshMahalingam made their first contribution in https://github.com/google/tunix/pull/938
- @gagika made their first contribution in https://github.com/google/tunix/pull/991
- @gurusai-voleti made their first contribution in https://github.com/google/tunix/pull/1082
- @katjasrz made their first contribution in https://github.com/google/tunix/pull/1105
- @yixinw made their first contribution in https://github.com/google/tunix/pull/1148
- @khatwanimohit made their first contribution in https://github.com/google/tunix/pull/1099
- @andytwigg made their first contribution in https://github.com/google/tunix/pull/1203
Full Changelog: https://github.com/google/tunix/compare/v0.1.5...v0.1.6