Tunix - Browse /v0.1.6 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2026-03-12	35.4 kB	0
Tunix v0.1.6 -- Agentic RL _ VLM source code.tar.gz	2026-03-12	28.7 MB	0
Tunix v0.1.6 -- Agentic RL _ VLM source code.zip	2026-03-12	29.0 MB	0
Totals: 3 Items		57.7 MB	0

Highlights

Supports agentic RL training. See https://github.com/google/tunix/tree/main/examples/agentic/gemma_grpo_demo_nb.py
Supports VLM training. See https://github.com/google/tunix/blob/main/examples/sft/vlm_training.py

    :::python
    from tunix import AgenticGRPOConfig
    from tunix import AgenticGRPOLearner

    agentic_grpo_config = AgenticGRPOConfig(
        num_generations=NUM_GENERATIONS,
        num_iterations=NUM_ITERATIONS,
        max_response_length=MAX_RESPONSE_LENGTH,
        beta=BETA,
        epsilon=EPSILON,
        system_prompt=SWE_SYSTEM_PROMPT,
        max_concurrency=1,
        epsilon_high=0.28,
        off_policy_steps=0,
    )

    agentic_grpo_learner = AgenticGRPOLearner(
        rl_cluster=rl_cluster,
        reward_fns=reward_fns,
        agent_class=MyAgentClass,
        agent_kwargs={},
        env_class=MyEnv,
        env_kwargs={"max_steps": MAX_STEPS},
        algo_config=agentic_grpo_config,
        chat_parser=chat_parser,
    )

    agentic_grpo_learner.train(train_dataset=train_dataset)
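
The `epsilon` and `epsilon_high` arguments in the config suggest an asymmetric ("clip-higher") range for the policy ratio. The sketch below illustrates that idea in plain Python, assuming the ratio is clipped to `[1 - epsilon, 1 + epsilon_high]`. `clipped_surrogate` is a hypothetical helper written for this note, not a Tunix function, and the default `epsilon=0.2` is an illustrative value; how Tunix applies these settings internally may differ.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantage, epsilon=0.2, epsilon_high=0.28):
    """Per-token clipped policy objective with an asymmetric clip range.

    Assumption (not confirmed by the release notes): the probability ratio
    is clipped to [1 - epsilon, 1 + epsilon_high].
    """
    # Probability ratio between the current policy and the rollout policy.
    ratio = math.exp(logp_new - logp_old)
    # Clamp the ratio into the asymmetric trust region.
    clipped = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon_high)
    # Pessimistic bound: keep the smaller of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)
```

With a larger upper bound than lower bound, tokens with positive advantage can gain more probability mass per update than a symmetric range would allow, while decreases stay bounded by `epsilon`.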

What's Changed

Developing for v0.1.6 now. by @wang2yn84 in https://github.com/google/tunix/pull/785
Fix the vllm server mode not finish issue. by @wang2yn84 in https://github.com/google/tunix/pull/784
[Tunix] Update Dockerfile and deepscaler trainer script to separate trainer model and ref model. by @copybara-service[bot] in https://github.com/google/tunix/pull/725
Add Tunix RL GRPO examples for Gemma3. by @copybara-service[bot] in https://github.com/google/tunix/pull/788
[Tunix] change model implementation to be pytree compatible. by @copybara-service[bot] in https://github.com/google/tunix/pull/782
Fix TPU nightly regression workflow to use vLLM container and add new tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/754
[Tunix] Update sharding configuration for attention weights. by @copybara-service[bot] in https://github.com/google/tunix/pull/759
[Tunix] Add gcsfs to TPU nightly regression dependencies. by @copybara-service[bot] in https://github.com/google/tunix/pull/790
Adding back test_logprobs_extraction_with_missing_token. by @wang2yn84 in https://github.com/google/tunix/pull/789
feat:add device indexes for sglang jax by @pathfinder-pf in https://github.com/google/tunix/pull/786
Fix the rendering issue in Example gallery document. by @rajasekharporeddy in https://github.com/google/tunix/pull/799
[Tunix] Remove the version pin for SGLang. by @copybara-service[bot] in https://github.com/google/tunix/pull/798
[Fixes 794] fix transformers=4.57.1 to solve issue42369 in transformers and use c… by @aolemila in https://github.com/google/tunix/pull/795
Refactor gemma3 modelConfig to explicitly include all models by @copybara-service[bot] in https://github.com/google/tunix/pull/792
[Tunix] Fix nightly regression: remove unnecessary --root-dir argument from TPU nightly regression script. Fix the MATH500 eval script. by @copybara-service[bot] in https://github.com/google/tunix/pull/796
use naming utils in tunix cli by @copybara-service[bot] in https://github.com/google/tunix/pull/736
[Tunix] Remove GitHub Actions replacement in copybara. Relying on a more generic google3 replacement rule by @copybara-service[bot] in https://github.com/google/tunix/pull/803
reduce safetensor loading time by @keshavb96 in https://github.com/google/tunix/pull/760
[Tunix] Remove env_utils.fs_open from safetensors_loader. fsspec object doesn't have fileno. 3P test is broken: https://github.com/google/tunix/actions/runs/19689186862/job/56403241781?pr=744 by @copybara-service[bot] in https://github.com/google/tunix/pull/804
[Tunix] Pass HF_TOKEN to TPU nightly regression tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/805
[Tunix] Follow up of cl/836961494. It was out of sync with github PR. by @copybara-service[bot] in https://github.com/google/tunix/pull/807
[Tunix] Pin the vLLM TPU Docker image to a specific nightly build version for the TPU tests. by @copybara-service[bot] in https://github.com/google/tunix/pull/808
[Tunix] Update tunix nightly regression workflow schedule. Change the cron schedule from 2 AM UTC to 10 AM UTC. by @copybara-service[bot] in https://github.com/google/tunix/pull/806
Centralize Flax sharding setup in env_utils by @copybara-service[bot] in https://github.com/google/tunix/pull/797
Fix gemma3 grpo shell scripts by @sizhit2 in https://github.com/google/tunix/pull/791
[Tunix] Fix GRPO script. by @copybara-service[bot] in https://github.com/google/tunix/pull/811
rename all model configs to use "p" instead of "_" for float values by @copybara-service[bot] in https://github.com/google/tunix/pull/740
[Tunix] Move model alignment tests from CPU to TPU run dev workflow. by @copybara-service[bot] in https://github.com/google/tunix/pull/818
handle the situation when lora_config is not provided by @Hanjun-Dai in https://github.com/google/tunix/pull/813
checkpoint_options->checkpointing_options in cli/config.py by @Hanjun-Dai in https://github.com/google/tunix/pull/814
[Tunix] Remove EOS token appending to the prompt in vLLM and SGLang sampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/827
Fix bos duplication by @Hanjun-Dai in https://github.com/google/tunix/pull/822
remove extra flax sharding check by @copybara-service[bot] in https://github.com/google/tunix/pull/817
Remove irrelevant text in GRPO example by @copybara-service[bot] in https://github.com/google/tunix/pull/823
[TUNIX] Switch to absl.logging in the tunix util file for scripts. by @copybara-service[bot] in https://github.com/google/tunix/pull/831
renaming Transformer to Gemma for gemma model by @copybara-service[bot] in https://github.com/google/tunix/pull/819
Expand model tests and fix gemma from_params parsing by @copybara-service[bot] in https://github.com/google/tunix/pull/828
add missing refactoring to model test by @copybara-service[bot] in https://github.com/google/tunix/pull/835
allow user to config project name and run name in wandb by @Hanjun-Dai in https://github.com/google/tunix/pull/836
fix the issue when eager mode jax is triggered in undesired places by @Hanjun-Dai in https://github.com/google/tunix/pull/837
make TFDS download flag a configurable option by @copybara-service[bot] in https://github.com/google/tunix/pull/763
Fix llama RL verl script by @copybara-service[bot] in https://github.com/google/tunix/pull/839
Fix ref model compute_logps input sharding issue by @copybara-service[bot] in https://github.com/google/tunix/pull/846
Improves the GRPO script to be more configurable. by @wang2yn84 in https://github.com/google/tunix/pull/840
remove unused fn by @copybara-service[bot] in https://github.com/google/tunix/pull/847
Add support for Dr. GRPO by @copybara-service[bot] in https://github.com/google/tunix/pull/681
[Tunix] Update parallel sizes to use ROLLOUT_MESH in grpo_demo. by @copybara-service[bot] in https://github.com/google/tunix/pull/851
Fix typo and citation formatting by @selamw1 in https://github.com/google/tunix/pull/865
[Bug] Fix/sglang jax support pathways by @aolemila in https://github.com/google/tunix/pull/860
[Tunix] Add number of batches argument and reduce nightly regression run time. by @copybara-service[bot] in https://github.com/google/tunix/pull/866
Add AgenticRLLearner base class. by @copybara-service[bot] in https://github.com/google/tunix/pull/829
Add XM launch for tunix cli by @copybara-service[bot] in https://github.com/google/tunix/pull/848
update OSS readme by @copybara-service[bot] in https://github.com/google/tunix/pull/863
use env_utils in config_test by @copybara-service[bot] in https://github.com/google/tunix/pull/872
check integer type by @copybara-service[bot] in https://github.com/google/tunix/pull/877
Add smoke shell scripts to nightly run by @copybara-service[bot] in https://github.com/google/tunix/pull/855
use np instead of jnp to compute rewards in agentic framework by @copybara-service[bot] in https://github.com/google/tunix/pull/881
[Tunix] Support pre-resharding pytrees with different meshes. by @copybara-service[bot] in https://github.com/google/tunix/pull/882
Fix the ValueError while loading the Gemma model in logit_distillation.ipynb by @rajasekharporeddy in https://github.com/google/tunix/pull/870
change qwen3_30b more specific to qwen3_30b_a3b by @copybara-service[bot] in https://github.com/google/tunix/pull/880
add qwen4b model config which uses tie embedding by @Hanjun-Dai in https://github.com/google/tunix/pull/858
Add codewiki link by @copybara-service[bot] in https://github.com/google/tunix/pull/886
[Script] merge grpo_demo_sglang_jax_rollout.py into grpo_demo_llama3_qwen2.py by @aolemila in https://github.com/google/tunix/pull/868
Adding support for gemma-X-, llama-X- naming similar to HF by @copybara-service[bot] in https://github.com/google/tunix/pull/876
enforce rollout tokens to be in RAM by @copybara-service[bot] in https://github.com/google/tunix/pull/889
Adding Automodel interface to Tunix by @copybara-service[bot] in https://github.com/google/tunix/pull/862
allow users to import reward module/fn outside tunix folder by @Hanjun-Dai in https://github.com/google/tunix/pull/852
Fix breaking config test by @copybara-service[bot] in https://github.com/google/tunix/pull/901
use np instead of jnp for reward fn and GRPO group adv by @copybara-service[bot] in https://github.com/google/tunix/pull/891
change qwen3_4b_2507 model config added to match HF model ids by @copybara-service[bot] in https://github.com/google/tunix/pull/897
[Tunix] Remove EOS token appending to the prompt in vLLM and SGLang sampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/900
use model_path instead of model_id for gcs in the cli by @copybara-service[bot] in https://github.com/google/tunix/pull/904
change from mock.patch to mock.patch.object. by @copybara-service[bot] in https://github.com/google/tunix/pull/896
[Tunix]Make chat_parser optional in AgenticRLLearner. by @copybara-service[bot] in https://github.com/google/tunix/pull/911
Model creation smoke test by @copybara-service[bot] in https://github.com/google/tunix/pull/873
fix llama run script model name by @copybara-service[bot] in https://github.com/google/tunix/pull/909
[Tunix] Use jnp.concatenate instead of np.concatenate for merging micro-batches. by @copybara-service[bot] in https://github.com/google/tunix/pull/912
[Tunix] Fix sharding for act_btf in Tunix models. by @copybara-service[bot] in https://github.com/google/tunix/pull/914
Add Colab and Kaggle badges to the example notebooks by @rajasekharporeddy in https://github.com/google/tunix/pull/893
Doc ci check by @ev-br in https://github.com/google/tunix/pull/888
[Tunix] Remove the duplicate tests move rules. by @copybara-service[bot] in https://github.com/google/tunix/pull/922
[Tunix] Moves the smoke_tests folder to top level. by @copybara-service[bot] in https://github.com/google/tunix/pull/923
Support custom MaxText (vLLM) models in sampler and rollout. by @NicoGrande in https://github.com/google/tunix/pull/841
Fix for failing nightly regression test by @copybara-service[bot] in https://github.com/google/tunix/pull/918
fix perf by @pathfinder-pf in https://github.com/google/tunix/pull/915
update gemma-3 models ids to match HF by @copybara-service[bot] in https://github.com/google/tunix/pull/916
Validation tests for model id to exist on HF by @copybara-service[bot] in https://github.com/google/tunix/pull/917
allow users to specify data module outside of tunix; refactor a bit by @Hanjun-Dai in https://github.com/google/tunix/pull/853
add train step time metric. by @copybara-service[bot] in https://github.com/google/tunix/pull/924
[Tunix] Force install numpy 2.3.5 for vllm by @copybara-service[bot] in https://github.com/google/tunix/pull/931
Add an option to cache NNX traversals in PEFT trainer. by @copybara-service[bot] in https://github.com/google/tunix/pull/928
[Tunix] Fix sharding for act_btf in Tunix models. by @copybara-service[bot] in https://github.com/google/tunix/pull/934
Fix the dataset post initialization by @wang2yn84 in https://github.com/google/tunix/pull/936
add metric first_micro_batch_rollout_time in fully disaggregated mode. by @copybara-service[bot] in https://github.com/google/tunix/pull/925
remove mesh cm from example by @copybara-service[bot] in https://github.com/google/tunix/pull/908
Enable padding for attention qkv biases. by @copybara-service[bot] in https://github.com/google/tunix/pull/941
checkpoint opt state by @copybara-service[bot] in https://github.com/google/tunix/pull/945
Introduce prompt queue to support off-policy by @copybara-service[bot] in https://github.com/google/tunix/pull/946
remove type indirection by @copybara-service[bot] in https://github.com/google/tunix/pull/950
Adding dataclass to naming by @copybara-service[bot] in https://github.com/google/tunix/pull/948
Remove _obs_cache by @copybara-service[bot] in https://github.com/google/tunix/pull/952
remove type indirection by @copybara-service[bot] in https://github.com/google/tunix/pull/951
enable proper report for learning rate during training by @Hanjun-Dai in https://github.com/google/tunix/pull/940
Code update by @copybara-service[bot] in https://github.com/google/tunix/pull/942
add llama 3.2 1b/3b-instruct model config by @Hanjun-Dai in https://github.com/google/tunix/pull/935
use taskgroup for batch of async tasks by @copybara-service[bot] in https://github.com/google/tunix/pull/954
Add Tunix CLI setup instructions to g3docs. by @copybara-service[bot] in https://github.com/google/tunix/pull/957
[Internal] Adding naming conventions documentation by @copybara-service[bot] in https://github.com/google/tunix/pull/949
Fix broken example script references in CLI README by @SarveshMahalingam in https://github.com/google/tunix/pull/938
Fix the cli to have the proper model config overwritten. by @wang2yn84 in https://github.com/google/tunix/pull/960
[internal] extend XM launch to peft by @copybara-service[bot] in https://github.com/google/tunix/pull/959
adding missing test case variant by @copybara-service[bot] in https://github.com/google/tunix/pull/956
fix safetensor publish by @copybara-service[bot] in https://github.com/google/tunix/pull/968
Add gcsfs. by @copybara-service[bot] in https://github.com/google/tunix/pull/969
make explicit copy to create actor model by @copybara-service[bot] in https://github.com/google/tunix/pull/973
extend support for gemma1.1 in automodel by @copybara-service[bot] in https://github.com/google/tunix/pull/970
[Tunix] replaces a print statement with logging.vlog and other minor nits. by @copybara-service[bot] in https://github.com/google/tunix/pull/974
Apply transpose rule to safetensor saver function by @wang2yn84 in https://github.com/google/tunix/pull/971
[Tunix] Updates the split_by_mesh_axis access pattern. by @copybara-service[bot] in https://github.com/google/tunix/pull/975
[Tunix] Fix the logic to handle multiple intermediate meshes. by @copybara-service[bot] in https://github.com/google/tunix/pull/976
Fix Gemma 3 model loading by @selamw1 in https://github.com/google/tunix/pull/982
Add overlong reward shaping for DAPO. by @copybara-service[bot] in https://github.com/google/tunix/pull/947
Internal Doc by @copybara-service[bot] in https://github.com/google/tunix/pull/984
Internal change by @copybara-service[bot] in https://github.com/google/tunix/pull/985
Feat/add lora for sglangjax by @aolemila in https://github.com/google/tunix/pull/826
Add a multi-turn RL example notebook. by @copybara-service[bot] in https://github.com/google/tunix/pull/988
Fix Gemma 3 safetensor loading by @copybara-service[bot] in https://github.com/google/tunix/pull/987
minor resharding updates. by @copybara-service[bot] in https://github.com/google/tunix/pull/989
Disable Lora in base_config.yaml by @copybara-service[bot] in https://github.com/google/tunix/pull/983
fix math util parsing by @copybara-service[bot] in https://github.com/google/tunix/pull/994
Add support for interleaved layer mappings and enhanced key mapping regex. by @copybara-service[bot] in https://github.com/google/tunix/pull/815
Unify to HF as the single model_id by @copybara-service[bot] in https://github.com/google/tunix/pull/981
Tunix Documentation V2 for OSS by @copybara-service[bot] in https://github.com/google/tunix/pull/990
Improve HBM usage tracking by avoiding double counting by @copybara-service[bot] in https://github.com/google/tunix/pull/1003
Add perf_metrics to tunix cli -- grpo by @copybara-service[bot] in https://github.com/google/tunix/pull/1001
enable Sphinx to fix slug for every header by @copybara-service[bot] in https://github.com/google/tunix/pull/1007
[Tunix] Update Copybara to keep a TOC placeholder in g3doc. by @copybara-service[bot] in https://github.com/google/tunix/pull/1011
Allow single str input by @copybara-service[bot] in https://github.com/google/tunix/pull/1010
[Tunix] Fix the copybara file transformation issue. by @copybara-service[bot] in https://github.com/google/tunix/pull/1012
[Distillation] Enhanced Logit Strategy with Top-K and Metrics by @gagika in https://github.com/google/tunix/pull/991
Add qwen3 vllm/sglang weight mapping support by @wang2yn84 in https://github.com/google/tunix/pull/1009
[Tunix] Fix Qwen2 vLLM to JAX embedding mapping. by @copybara-service[bot] in https://github.com/google/tunix/pull/1019
Adding new sharding and performance args for vLLM by @NicoGrande in https://github.com/google/tunix/pull/1006
[Tunix] Add _put_prompts_to_queue to handle dataset iteration and partial batches. by @copybara-service[bot] in https://github.com/google/tunix/pull/1018
[Tunix] Adds a check to ensure that arrays are still live before accessing their shards. by @copybara-service[bot] in https://github.com/google/tunix/pull/1025
Explicitly handle single string inputs in samplers. by @copybara-service[bot] in https://github.com/google/tunix/pull/1030
reduce log by @copybara-service[bot] in https://github.com/google/tunix/pull/1022
retry on hf download and list to prevent gateway error that cause presubmit timeout by @copybara-service[bot] in https://github.com/google/tunix/pull/1014
Add Gemini Code Assist style guide for PR reviews by @copybara-service[bot] in https://github.com/google/tunix/pull/1024
move scrubber after transformation for it to locate code in github by @copybara-service[bot] in https://github.com/google/tunix/pull/1016
[Tunix] Fix Qwen2/3 vLLM to JAX lm_head mapping. by @copybara-service[bot] in https://github.com/google/tunix/pull/1020
bug: fix metric refer_inference_time. by @copybara-service[bot] in https://github.com/google/tunix/pull/1037
feat: add multi-rollout engine interfaces. by @copybara-service[bot] in https://github.com/google/tunix/pull/1039
properly set LoRA alpha by @copybara-service[bot] in https://github.com/google/tunix/pull/1044
Clean up and update the deepscaler training script with sglang configurations. by @copybara-service[bot] in https://github.com/google/tunix/pull/1045
[Tunix] Use single shared loop for producer. by @copybara-service[bot] in https://github.com/google/tunix/pull/1042
[Tunix] Add trajectory data logging to disk for further analysis and visualization. by @copybara-service[bot] in https://github.com/google/tunix/pull/980
[Tunix] Add convert_messages_to_string to handle numpy array content in messages. by @copybara-service[bot] in https://github.com/google/tunix/pull/1047
internal by @copybara-service[bot] in https://github.com/google/tunix/pull/1046
Cast logits to float32 in sample_top_p. by @copybara-service[bot] in https://github.com/google/tunix/pull/1051
increase episode timeout by @copybara-service[bot] in https://github.com/google/tunix/pull/1052
VLM Training (1): Add vision to Gemma 3 by @copybara-service[bot] in https://github.com/google/tunix/pull/986
update README file in tunix by @copybara-service[bot] in https://github.com/google/tunix/pull/1054
Add timeline visualization of perf metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1035
Sort perfetto trace based on uuid so the sequence follows the timeline by @copybara-service[bot] in https://github.com/google/tunix/pull/1050
Use the rollout engine construct mesh. by @wang2yn84 in https://github.com/google/tunix/pull/1060
Fix spacing in algorithm diagram. by @copybara-service[bot] in https://github.com/google/tunix/pull/1065
Add logging for perf metric mode by @copybara-service[bot] in https://github.com/google/tunix/pull/1057
Skip softmax and sorting of probabilities when top_p == 1.0 and top_k is None. by @copybara-service[bot] in https://github.com/google/tunix/pull/1056
Refactor tunix Gemma3-4b SFT script to use new config structure. by @copybara-service[bot] in https://github.com/google/tunix/pull/1059
Log the computed score in GSM8K reward function by @copybara-service[bot] in https://github.com/google/tunix/pull/1058
[Tunix] Add Qwen3 32B model configuration. by @copybara-service[bot] in https://github.com/google/tunix/pull/1068
fix gcs paths in deepscaler notebook by @copybara-service[bot] in https://github.com/google/tunix/pull/1074
Fix wrong parent when nesting perf metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1066
fix optimizer CP restore by @copybara-service[bot] in https://github.com/google/tunix/pull/1070
[Tunix] Add log_level config to SglangJaxSampler. by @copybara-service[bot] in https://github.com/google/tunix/pull/1069
[Tunix] Add trajectory logging to agentic GRPO learner. by @copybara-service[bot] in https://github.com/google/tunix/pull/1071
remove max_steps from trajectory_collect_engine by @copybara-service[bot] in https://github.com/google/tunix/pull/1076
use absl logging across the repo by @copybara-service[bot] in https://github.com/google/tunix/pull/1077
feat: log rollout and train time at micro batch level. by @copybara-service[bot] in https://github.com/google/tunix/pull/1038
Add all steps to global span by @copybara-service[bot] in https://github.com/google/tunix/pull/1067
[Tunix] Remove upper bound on JAX version in tunix prod dependencies. by @copybara-service[bot] in https://github.com/google/tunix/pull/1079
Refactor GRPO rollout to simplify grouping and avoid deepcopy by @copybara-service[bot] in https://github.com/google/tunix/pull/1075
[Tunix] Pad the number of heads for projection bias. by @copybara-service[bot] in https://github.com/google/tunix/pull/1086
Remove max_open_buckets from GroupQueueManager by @copybara-service[bot] in https://github.com/google/tunix/pull/1083
chore: Migrate gsutil usage to gcloud storage by @gurusai-voleti in https://github.com/google/tunix/pull/1082
[Feat] add log_level in SglangJaxConfig and update default page_size by @aolemila in https://github.com/google/tunix/pull/1090
fix potential race condition on dictionary update by @copybara-service[bot] in https://github.com/google/tunix/pull/1091
update doc string and error message by @copybara-service[bot] in https://github.com/google/tunix/pull/1093
Supports padding kwargs for samplers. by @wang2yn84 in https://github.com/google/tunix/pull/1095
[Tunix]: Skip the already trained data on job resume. by @copybara-service[bot] in https://github.com/google/tunix/pull/1088
disable perf metrics by default in the cli. by @copybara-service[bot] in https://github.com/google/tunix/pull/1084
minor update by @copybara-service[bot] in https://github.com/google/tunix/pull/1092
Added a GPU demo for PEFT with QLoRA on Llama 3_1 by @katjasrz in https://github.com/google/tunix/pull/1105
Use Exception instead of BaseException in Tunix by @wang2yn84 in https://github.com/google/tunix/pull/1108
fix loss mask for agentic learner by @copybara-service[bot] in https://github.com/google/tunix/pull/1100
Set the max worker number in asyncio loop. by @wang2yn84 in https://github.com/google/tunix/pull/1111
add comment clarifying micro batch has to be 1 by @copybara-service[bot] in https://github.com/google/tunix/pull/1094
Refactor the vllm sampler config with InitVar by @wang2yn84 in https://github.com/google/tunix/pull/1110
forbidden_tokens in sampler call accepts token IDs instead of strings. by @copybara-service[bot] in https://github.com/google/tunix/pull/1114
simplify trajectory result processing by @copybara-service[bot] in https://github.com/google/tunix/pull/1109
fix group_id and pair_idx in traj by @copybara-service[bot] in https://github.com/google/tunix/pull/1116
Add Colab and Kaggle badges to qlora_llama3_gpu example tutorial by @rajasekharporeddy in https://github.com/google/tunix/pull/1122
[Tunix] Improve GCS CSV writing. by @copybara-service[bot] in https://github.com/google/tunix/pull/1120
use max_response_len for deepscaler by @copybara-service[bot] in https://github.com/google/tunix/pull/1124
fix offpolicy step by @copybara-service[bot] in https://github.com/google/tunix/pull/1128
Update links in Tunix OSS README. by @copybara-service[bot] in https://github.com/google/tunix/pull/1121
[Tunix] Engine kwargs overwrite predefined config keys. by @copybara-service[bot] in https://github.com/google/tunix/pull/1129
[Tunix] Minor fix on rewards by @copybara-service[bot] in https://github.com/google/tunix/pull/1131
[Tunix] Handles None logits from vLLM. by @copybara-service[bot] in https://github.com/google/tunix/pull/1126
Change duplicate function registration to log a warning instead of raising an error. by @copybara-service[bot] in https://github.com/google/tunix/pull/1119
Fix auto-assignment of github issues and pull requests to the eng. by @rajasekharporeddy in https://github.com/google/tunix/pull/1089
Match changes in README by @copybara-service[bot] in https://github.com/google/tunix/pull/1113
[Tunix] Fixes a bug in math_rewards.py where multiple rewards could be added for a single sample. by @copybara-service[bot] in https://github.com/google/tunix/pull/1142
Adding fix to logical axis cm for RL. by @NicoGrande in https://github.com/google/tunix/pull/1143
Fix eos issues in sglang / vllm samplers during on-policy rollout by @yixinw in https://github.com/google/tunix/pull/1148
allow passing rollout configs from sglang/vllm through cli by @yixinw in https://github.com/google/tunix/pull/1140
speed up agentic rl by @copybara-service[bot] in https://github.com/google/tunix/pull/1136
Minor Doc Fixes: Correct a typo and add a hyperlink. by @rajasekharporeddy in https://github.com/google/tunix/pull/800
[Tunix] Another minor fix for extracting ground truth. We should always evaluate against a ground truth without boxed formatting. by @copybara-service[bot] in https://github.com/google/tunix/pull/1154
remat MLP block by @copybara-service[bot] in https://github.com/google/tunix/pull/1130
fix group_id by @copybara-service[bot] in https://github.com/google/tunix/pull/1157
ensure seed is set for VLLM sampling params by @copybara-service[bot] in https://github.com/google/tunix/pull/1162
fix deepscaler notebook by @copybara-service[bot] in https://github.com/google/tunix/pull/1161
Move ifrt based reshard out of experimental. Leaving intermediate resharding and sidechannel resharding in experimental. by @copybara-service[bot] in https://github.com/google/tunix/pull/1146
[Tunix] Add vLLM sampler for math eval. by @copybara-service[bot] in https://github.com/google/tunix/pull/1152
speedup trainer for RL by @copybara-service[bot] in https://github.com/google/tunix/pull/1141
some BE work by @copybara-service[bot] in https://github.com/google/tunix/pull/1168
[Tunix] Update BaseAgent to accept observations with a "prompts" key rather than "question". by @copybara-service[bot] in https://github.com/google/tunix/pull/1164
expert parallelism config in base rollout by @khatwanimohit in https://github.com/google/tunix/pull/1099
fix metric logging step by @copybara-service[bot] in https://github.com/google/tunix/pull/1173
Add DeepSWE train script. by @copybara-service[bot] in https://github.com/google/tunix/pull/1134
Update the qlora_llama3_gpu.ipynb notebook by @rajasekharporeddy in https://github.com/google/tunix/pull/1160
[Tunix] improve trajectory logging to support numpy array and scalar types. by @copybara-service[bot] in https://github.com/google/tunix/pull/1163
[Tunix] Initialize policy version from global steps. by @copybara-service[bot] in https://github.com/google/tunix/pull/1182
make trace writing a configurable option by @copybara-service[bot] in https://github.com/google/tunix/pull/1085
remove unnecessary rollout round by @copybara-service[bot] in https://github.com/google/tunix/pull/1174
fix expert_parallel_size to not pass through to vLLM args by @khatwanimohit in https://github.com/google/tunix/pull/1181
measure global step time, prompt len and clip ratio by @copybara-service[bot] in https://github.com/google/tunix/pull/1183
remove tflops measurement by @copybara-service[bot] in https://github.com/google/tunix/pull/1171
change defaults for Dropout and BatchNorm by @copybara-service[bot] in https://github.com/google/tunix/pull/1184
[Resolved 1149] fix oom due to missing closing loop by @aolemila in https://github.com/google/tunix/pull/1151
[Tunix] Add support for aligning 1D KV biases in sglang_jax. by @copybara-service[bot] in https://github.com/google/tunix/pull/1175
Enable pr from user's fork to auto assign issues correctly. by @copybara-service[bot] in https://github.com/google/tunix/pull/1188
[Tunix] Reduce log spam from type mismatch warnings. by @copybara-service[bot] in https://github.com/google/tunix/pull/1187
split metric prefix by @copybara-service[bot] in https://github.com/google/tunix/pull/1185
Add support for vllm sampler kwargs. by @NicoGrande in https://github.com/google/tunix/pull/1169
add pg_clipfrac to grpo_learner by @andytwigg in https://github.com/google/tunix/pull/1203
fix flax==0.12.4 in tpu-tests.yml temporarily by @aolemila in https://github.com/google/tunix/pull/1206
[Tunix Perf] New timeline and span definitions by @copybara-service[bot] in https://github.com/google/tunix/pull/1147
Refactor BackendMappingMixin to use explicit BACKEND_PACKAGE_PATH by @copybara-service[bot] in https://github.com/google/tunix/pull/1198
Supported mixed precision training in qwen2 by @copybara-service[bot] in https://github.com/google/tunix/pull/1199
[Tunix Perf] New perf tracer by @copybara-service[bot] in https://github.com/google/tunix/pull/1172
force loss computation to be in fp32 by @copybara-service[bot] in https://github.com/google/tunix/pull/1210
Add perfetto and logging export by @copybara-service[bot] in https://github.com/google/tunix/pull/1155
[Tunix] Make returning logprobs from vLLM sampler configurable. by @copybara-service[bot] in https://github.com/google/tunix/pull/1212
make perf engine (v1 or v2) and export function selectable. by @copybara-service[bot] in https://github.com/google/tunix/pull/1191
Add image processor by @copybara-service[bot] in https://github.com/google/tunix/pull/1064
Allow sampler to take in images by @copybara-service[bot] in https://github.com/google/tunix/pull/1103
Add VLM SFT example by @copybara-service[bot] in https://github.com/google/tunix/pull/1104
add docs for perf tracing by @copybara-service[bot] in https://github.com/google/tunix/pull/1209
Fix typos in README by @copybara-service[bot] in https://github.com/google/tunix/pull/1215
Add RLOO advantage estimator to Tunix. by @copybara-service[bot] in https://github.com/google/tunix/pull/1211
[Tunix] Remove utils.time_measure from the training loop. by @copybara-service[bot] in https://github.com/google/tunix/pull/1118
Fix typo in documentation. by @copybara-service[bot] in https://github.com/google/tunix/pull/1178
add entropy loss, grad_norm to metrics by @copybara-service[bot] in https://github.com/google/tunix/pull/1216
Add trajectory status to track limit. by @copybara-service[bot] in https://github.com/google/tunix/pull/1005
add qwen3 grpo example script with simplemath rewards by @andytwigg in https://github.com/google/tunix/pull/1217
add support for qwen3-base variants by @andytwigg in https://github.com/google/tunix/pull/1204
add simple_math reward_fn by @andytwigg in https://github.com/google/tunix/pull/1214
Update issue auto-assignment script to async and clean up logic. by @copybara-service[bot] in https://github.com/google/tunix/pull/1220
Enable tied embedding for Qwen3-0.6B and Qwen3-1.7B. by @copybara-service[bot] in https://github.com/google/tunix/pull/1219
[Tunix] Switch to safetensor API based loader when Pathways is enabled. by @copybara-service[bot] in https://github.com/google/tunix/pull/1226
[Tunix] Update DeepScaler training notebook with vLLM optimizations. by @copybara-service[bot] in https://github.com/google/tunix/pull/1231
use unified trace_dir for the trace writer by @copybara-service[bot] in https://github.com/google/tunix/pull/1218
[Tunix] Add a script to run Pathways on GKE. by @copybara-service[bot] in https://github.com/google/tunix/pull/1229
create NOOP trace writer by @copybara-service[bot] in https://github.com/google/tunix/pull/1232
Update Qwen3 JAX to HF mappings for vLLM. by @copybara-service[bot] in https://github.com/google/tunix/pull/1234
[Tunix] Fix the get_per_token_logps signature in vllm_rollout.py. by @copybara-service[bot] in https://github.com/google/tunix/pull/1236
chore: move agentic rl learner out of experimental. by @copybara-service[bot] in https://github.com/google/tunix/pull/1230
fix: add lora flag validation in RLCluster init. by @copybara-service[bot] in https://github.com/google/tunix/pull/1239
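
Several entries in this list touch advantage estimation (the GRPO group advantage, Dr. GRPO, and RLOO). As a rough illustration of how these estimators differ, the sketch below computes each one for a single group of rewards in plain Python. The function names are hypothetical and written for this note; they follow the commonly published definitions and are not taken from the Tunix source, which may differ in details such as the standard-deviation variant or the stabilizing constant.

```python
from statistics import fmean, pstdev


def grpo_advantages(rewards, eps=1e-6):
    """GRPO: center by the group mean, then scale by the group std.

    Assumption: population std plus a small eps; implementations vary.
    """
    mean = fmean(rewards)
    std = pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def dr_grpo_advantages(rewards):
    """Dr. GRPO: center by the group mean without std normalization."""
    mean = fmean(rewards)
    return [r - mean for r in rewards]


def rloo_advantages(rewards):
    """RLOO: baseline for each sample is the mean of the other samples."""
    n = len(rewards)
    if n < 2:
        raise ValueError("RLOO needs at least two samples per group")
    total = sum(rewards)
    return [r - (total - r) / (n - 1) for r in rewards]


# Example group: four sampled responses to one prompt with binary rewards.
group_rewards = [1.0, 0.0, 0.0, 1.0]
for name, fn in [
    ("grpo", grpo_advantages),
    ("dr_grpo", dr_grpo_advantages),
    ("rloo", rloo_advantages),
]:
    print(name, [round(a, 4) for a in fn(group_rewards)])
```

All three estimators sum to (approximately) zero within a group; they differ only in how the centered rewards are scaled.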