NVIDIA NeMo - Browse /v2.7.0 at SourceForge.net


Name	Modified	Size	Downloads / Week
NVIDIA Neural Modules 2.7.0 source code.tar.gz	2026-02-24	50.5 MB	0
NVIDIA Neural Modules 2.7.0 source code.zip	2026-02-24	53.7 MB	0
README.md	2026-02-24	15.3 kB	0
Totals: 3 Items		104.2 MB	0

Highlights

Speech
Adds Per-Stream Phrase Boosting in ASR Decoding (Transducers) [#15125]
Adds support for streaming speech translation [#15132]
Released new model nemotron-speech-streaming-en-0.6b that performs English Streaming ASR
Released new TTS model magpie_tts_multilingual_357m for multilingual Text-to-Speech

Starting with the next release, NeMo 2.8.0, the following collections will be removed: avlm, diffusion, llm, multimodal, multimodal-autoregressive, nlp, speechlm, vision, vlm. From then on, this repo will focus solely on speech tasks: ASR, TTS, speaker diarization, and speech enhancement.
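
The removal notice above can be turned into a quick pre-upgrade check. The following is a minimal sketch, not an official NeMo tool: it scans Python source for imports of the listed collections. It assumes the collections are imported as `nemo.collections.<name>` and that `multimodal-autoregressive` appears as `multimodal_autoregressive` in module paths; the function name `find_removed_imports` is hypothetical.

```python
import ast

# Collection names taken from the removal notice. Mapping them to
# `nemo.collections.<name>` (with "-" written as "_") is an assumption.
REMOVED_COLLECTIONS = {
    "avlm", "diffusion", "llm", "multimodal", "multimodal_autoregressive",
    "nlp", "speechlm", "vision", "vlm",
}
PREFIX = "nemo.collections."


def find_removed_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, module path) pairs for affected imports."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. `import nemo.collections.llm as llm`
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == "nemo.collections":
                # e.g. `from nemo.collections import asr, nlp`
                modules = [f"{node.module}.{alias.name}" for alias in node.names]
            else:
                # e.g. `from nemo.collections.nlp.models import ...`
                modules = [node.module]
        else:
            continue
        for module in modules:
            if module.startswith(PREFIX):
                # First path component after the prefix is the collection name.
                collection = module[len(PREFIX):].split(".")[0]
                if collection in REMOVED_COLLECTIONS:
                    hits.add((node.lineno, module))
    return sorted(hits)
```

Calling `find_removed_imports(open(path).read())` on each file of a project lists the import lines that would need attention before moving to 2.8.0. The match is on the exact collection name, so an import from a differently named collection such as `nemo.collections.asr` is not reported.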

Detailed Changelogs:

ASR

Changelog

- Enable CUDA graphs in streaming tests by @artbataev :: PR: [#14953]
- Update ctc-segmentation by @chtruong814 :: PR: [#14991]
- check asr models by @nithinraok :: PR: [#14989]
- Unified inference of streaming ASR by @naymaraq :: PR: [#14817]
- Update numba to numba-cuda and update cuda python bindings usage by @chtruong814 :: PR: [#15018]
- Fixing lines for multispeaker pipeline by @tango4j :: PR: [#15030]
- Inference optimization for cache-aware pipelines by @naymaraq :: PR: [#15035]
- fix loading of hyb ctc rnnt bpe models when using from pretrained by @nithinraok :: PR: [#15042]
- removed old buffered CTC script by @naymaraq :: PR: [#15061]
- remove nlp related notebooks by @nithinraok :: PR: [#15070]
- Update MagpieTTS model with latest changes by @blisc :: PR: [#15031]
- ASR inference: expose RNN-T decoding params for context biasing by @artbataev :: PR: [#15091]
- update notebook by @nithinraok :: PR: [#15093]
- Fix: Obsolete Attribute [SDE] by @Jorjeous :: PR: [#15105]
- Upgrade NeMo ASR tutorials from Mozilla/CommonVoice to Google/FLEURS by @KunalDhawan :: PR: [#15103]
- Add support for AIS batch loading for ASR audio processing by @gaikwadabhishek :: PR: [#15102]
- Multi-Talker Parakeet Streaming - NeMo Documents and Tutorial Notebooks PR 03 by @tango4j :: PR: [#15025]
- [Fix] Fix the notebook errors on multispeaker data simulation and end to end diarization training by @tango4j :: PR: [#15149]
- Streaming transducer inference: fix memory usage, improve WER by @artbataev :: PR: [#15148]
- Execute with subprocess list by @nithinraok :: PR: [#15165]
- Chunking fix by @nune-tadevosyan :: PR: [#15163]
- ASR Decoding: allow fallback to CUDA graphs without while loops by @artbataev :: PR: [#15173]
- remove nlp/modules by @dimapihtar :: PR: [#14934]
- Asr numpy 2 fix by @nithinraok :: PR: [#15166]
- Adding flexible input sources for Diarization Mixin by @tango4j :: PR: [#15184]
- Add support for streaming speech translation by @naymaraq :: PR: [#15132]
- Confidence fix get_correct_marks by @nune-tadevosyan :: PR: [#15128]
- Chunking edge cases by @nune-tadevosyan :: PR: [#15182]
- update subprocess cmd by @nithinraok :: PR: [#15218]
- Changes required for enabling prompt based models in Nemo Inference by @arushidNV :: PR: [#15036]
- Fixing the missing sample_rate argument in mixin calling in Sortformer model file by @tango4j :: PR: [#15228]
- Fix audio tensor loading canary2 by @nithinraok :: PR: [#15265]
- Fix word confidence return by @nithinraok :: PR: [#15249]
- feat(asr): add optional auxiliary timestamp model restoration for Canary by @chaosido :: PR: [#15268]
- Performance: Optimize .nemo tar extraction & model config processing by @paulirish :: PR: [#15245]
- fix speech commands notebook by @nithinraok :: PR: [#15290]
- fix timestamps processing with audio tensor input by @nithinraok :: PR: [#15291]
- Update conv_asr.py preventing unnecessary calculations by @tamilselvan0x0 :: PR: [#15239]
- Bump to pytorch 25.11 by @chtruong814 :: PR: [#15247]
- Add FeatureBuffer support to Cache-Aware streaming pipeline by @arushidNV :: PR: [#15188]
- Per-Stream Phrase Boosting in ASR Decoding (Transducers) by @artbataev :: PR: [#15125]
- Sort audio by duration in ASR streaming inference script by @artbataev :: PR: [#15297]
- ASR transcribe: fix forced decoder reinstantiation with `timestamps=True` by @artbataev :: PR: [#15298]
- Removes use of torchaudio and moves transforms inside of NeMo by @blisc :: PR: [#15211]
- Add sacrebleu to ASR requirements by @pzelasko :: PR: [#15016]
- SpeechLM2 : Add support for offset key in Multimodal conversation by @AudranBert :: PR: [#15281]
- Add cross-attention to output hypotheses by @mgaido91 :: PR: [#15229]
- Add warm-ups for RTFX calculation in streaming ASR pipelines by @naymaraq :: PR: [#15313]
- Speedup buffered transducer inference: remove double decoding by @artbataev :: PR: [#15301]
- improve canary performance on short audio by @nithinraok :: PR: [#15317]
- Transducer Decoding: Move fusion models to the base class by @artbataev :: PR: [#15322]
- Add typing to speech_to_text_finetune.py by @Garvys :: PR: [#15326]
- Bugfix: correct fusion scores for TDT by @artbataev :: PR: [#15325]
- Fix ASR streaming script: correctly add biasing requests to model by @artbataev :: PR: [#15334]
- Fix ASR context biasing in streaming TDT decoding by @artbataev :: PR: [#15327]

TTS

Changelog

- Remove HeteronymClassificationModel by @blisc :: PR: [#14980]
- remove nlp.parts collection by @dimapihtar :: PR: [#14617]
- Update MagpieTTS model with latest changes by @blisc :: PR: [#15031]
- remove nlp/modules by @dimapihtar :: PR: [#14934]
- [TTS] MagpieTTS Inference Refactoring by @subhankar-ghosh :: PR: [#15178]
- [DRAFT][TTS] Magpietts Simple API and loading audiocodec from Huggingface by @subhankar-ghosh :: PR: [#15172]
- [TTS][MagpieTTS] Change French tokenizer to use 'french_chartokenizer' by @subhankar-ghosh :: PR: [#15205]
- Add Japanese g2p katakana accent support by @quapham :: PR: [#15170]
- [TTS][MagpieTTS] Longform TTS using MagpieTTS by @subhankar-ghosh :: PR: [#15210]
- [voice agent] Fixing the missing arguments calling in `NemoSTTService` by @SangwonSUH :: PR: [#15233]
- [TTS] MagpieTTS inference: Add command line option to select a subset of datasets to run inference on by @rfejgin :: PR: [#15212]
- [TTS] Allow inference without reference audio by @rfejgin :: PR: [#15213]
- [TTS] Refactor Magpie to support codec conversion and bandwidth extension by @rlangman :: PR: [#15191]
- [TTS] MagpieTTS: Implement Frechet Codec Distance metric + some minor inference bugfixes by @rfejgin :: PR: [#15223]
- Update MagpieTTS' Inference Parameter Configuration by @blisc :: PR: [#15254]
- [TTS][MagpieTTS] Add longform capability to do_tts method by @subhankar-ghosh :: PR: [#15241]
- [TTS] Add tests of the MagpieTTS inference CLI by @rfejgin :: PR: [#15272]
- [MagpieTTS][TTS] Support local transformer in longform magpietts by @subhankar-ghosh :: PR: [#15296]
- Removes use of torchaudio and moves transforms inside of NeMo by @blisc :: PR: [#15211]
- [MagpieTTS][Docs] Add magpietts docs by @subhankar-ghosh :: PR: [#15302]
- Add Hindi (hi-IN) support for TTS by @quapham :: PR: [#15248]
- build: Explicitly set torch >= 2.6.0 and remove weights_only=False by @chtruong814 :: PR: [#15314]
- [MagpieTTS] Fix incorrect sort order comment in pareto_rank function by @matteolippi :: PR: [#15333]

NLP / NMT

Changelog

- remove nlp.parts collection by @dimapihtar :: PR: [#14617]
- chore: remove ExportDeploy by @pablo-garay :: PR: [#15033]
- remove nlp related notebooks by @nithinraok :: PR: [#15070]
- Add deprecation notice to modules by @chtruong814 :: PR: [#15050]
- [OMNIML-3034] ModelOpt rename from TRT ModelOpt to ModelOpt by @yueshen2016 :: PR: [#15147]
- remove nlp/modules by @dimapihtar :: PR: [#14934]
- Add support for streaming speech translation by @naymaraq :: PR: [#15132]
- Remove hardcoded DEBUG logging level in gpt_oss.py by @yurekami :: PR: [#15236]
- Docs: replace removed preprocess_data_for_megatron.py with Megatron-L… by @Saibabu7770 :: PR: [#15222]
- remove nlp documentation by @dimapihtar :: PR: [#15304]
- fix speech translation vllm dockerfile by @naymaraq :: PR: [#15310]

Text Normalization / Inverse Text Normalization

Changelog

- Add import guards for mcore lightning module by @chtruong814 :: PR: [#14970]
- chore: update Lightning requirements version by @liquor233 :: PR: [#15004]

NeMo Tools

Changelog

- Fix: Obsolete Attribute [SDE] by @Jorjeous :: PR: [#15105]
- Updated tutorial on SDE, due to changes in colab and libraries by @Jorjeous :: PR: [#15137]

Export

Changelog

- chore: remove ExportDeploy by @pablo-garay :: PR: [#15033]
- [OMNIML-3034] ModelOpt rename from TRT ModelOpt to ModelOpt by @yueshen2016 :: PR: [#15147]
- fix: Raise exception in nemo.export instead of allowing pickle.loads by @chtruong814 :: PR: [#15266]

Bugfixes

Changelog

- Fix PEFT resume with `resume_from_path` by @maanug-nv :: PR: [#14966]
- Update deprecated env var by @maanug-nv :: PR: [#14975]
- Revert lhotse patch after updating to lhotse 1.32.2 by @chtruong814 :: PR: [#15329]

Uncategorized:

Changelog

- Version bump to `2.7.0rc0.dev0` by @github-actions[bot] :: PR: [#14956]
- Update changelog for `v2.5.1` by @github-actions[bot] :: PR: [#14967]
- Bump MCore, TE, Pytorch, and modelopt for 25.11 by @chtruong814 :: PR: [#14946]
- Remove code related to nemo-evaluator (aka nemo-eval) by @athitten :: PR: [#14964]
- Update changelog for `r2.5.0` by @github-actions[bot] :: PR: [#14990]
- Add clear resharding message error message by @mikolajblaz :: PR: [#14962]
- Fix Evo2 checkpoint backward compatibility by @farhadrgh :: PR: [#14914]
- Pass timeout when running speech functional tests by @chtruong814 :: PR: [#15012]
- [Voice Agent] Fix text aggregation, eob handling, logging by @stevehuang52 :: PR: [#14951]
- Fix speechlm inference configuration by @stevehuang52 :: PR: [#14931]
- Enable EP in PTQ by @jenchen13 :: PR: [#15015]
- revert ckpt scripts removal from [#14617] by @dimapihtar :: PR: [#15048]
- fix: fix update-buildcache workflow after ED remove by @pablo-garay :: PR: [#15051]
- Update changelog for `v2.5.3` by @github-actions[bot] :: PR: [#15055]
- [voice agent] Fix RTVI missing bot message by @stevehuang52 :: PR: [#15068]
- [voice agent] make parakeet-eou model default stt by @stevehuang52 :: PR: [#15069]
- chore: Remove Automodel module by @thomasdhc :: PR: [#15044]
- add support for parallel ckpt removal by @dimapihtar :: PR: [#15073]
- Fix VLM mcore engine by @meatybobby :: PR: [#15076]
- Revert "Fix vlm engine changes in mcore (#15076)" by @pablo-garay :: PR: [#15090]
- fix: fix lines with malformed anchor tags by @pablo-garay :: PR: [#15095]
- Update Gemma3VL model training scripts by @genquan9 :: PR: [#15041]
- fix MR layer b2b filter to be comptatible with baseline FFTConv by @moradza :: PR: [#15100]
- guard trust_remote_code by @dimapihtar :: PR: [#15065]
- Fix get_new_ctm_lines_from_alignments function in scripts/speaker_tasks/create_alignment_manifest.py by @KunalDhawan :: PR: [#15118]
- Change title to 'NVIDIA NeMo Speech Collection' by @snowmanwwg :: PR: [#15127]
- remove pinning cuda bindings by @nithinraok :: PR: [#15183]
- Update URL to ModelOpt Speculative by @AAnoosheh :: PR: [#15075]
- remove ckpt_save_pre_mcore_014 support by @dimapihtar :: PR: [#15146]
- Removing pip install instruction for NeMo voice agent environment setting by @tango4j :: PR: [#15101]
- replace deprecated type comment with type annotation. by @XuesongYang :: PR: [#15175]
- Set L2_NeMo_2_llama3_pretraining_recipe to be optional by @chtruong814 :: PR: [#15192]
- Update README with latest news on nano-v3 by @snowmanwwg :: PR: [#15197]
- [speechlm] replace pcikle.loads with json.loads by @stevehuang52 :: PR: [#15232]
- [Fix] Fix safety issue for fsdp_dtensor by @BoxiangW :: PR: [#15227]
- [voice agent] Add examples for tool calling by @stevehuang52 :: PR: [#15243]
- Update README to reflect current Repo Status by @nithinraok :: PR: [#15217]
- remove checks for hydra installation by @nithinraok :: PR: [#15267]
- [voice agent] Improve tool calling and logging ux by @stevehuang52 :: PR: [#15269]
- Implement Nemotron-VoiceChat Speech Decoder by @Edresson :: PR: [#15066]
- Update CONTRIBUTING.md by @chtruong814 :: PR: [#15260]
- Update changelog for `r2.6.0` by @github-actions[bot] :: PR: [#15282]
- set dynamo=False to support latest version of pytorch by @nithinraok :: PR: [#15284]
- Fix progress_printer using wrong variable in on_test_batch_end by @yurekami :: PR: [#15237]
- [voice agent] Add audio logging to NeMo Voice Agent by @tango4j :: PR: [#15279]
- bump transformers version by @nithinraok :: PR: [#15271]
- Use PurePosixPath for cross-platform path handling by @yurekami :: PR: [#15238]
- Fix website link for clearml by @orena1 :: PR: [#14128]
- Clipping, lowpass and lossy codec online augmentations for Lhotse dataloader/sampler by @racoiaws :: PR: [#14809]
- ci: Enable label to force run CI tests by @chtruong814 :: PR: [#15242]
- Remove Codeowners for now by @blisc :: PR: [#15307]
- unset weights_only=False by @dimapihtar :: PR: [#15312]
- update readme by @nithinraok :: PR: [#15341]
- add safe globals to documentation by @dimapihtar :: PR: [#15342]
- Security: Fix command injection and insecure permissions by @AkCodes23 :: PR: [#15288]
- Freeze tags in in `r2.7.0` by @github-actions[bot] :: PR: [#15351]
- Update Imports in Audio Notebook (15345) into r2.7.0 by @blisc :: PR: [#15352]
- cp: `Clarify when to use TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD (15353)` into `r2.7.0` by @chtruong814 :: PR: [#15359]
- cp: `Fix macro accuracy when changing labels (15379)` into `r2.7.0` by @chtruong814 :: PR: [#15386]
- cp: `[voice agent] fix dependency for nemo26.02 (15380)` into `r2.7.0` by @chtruong814 :: PR: [#15383]
- cp: `Remove deprecated LLM, VLM, and diffusion tutorials (15357)` into `r2.7.0` by @chtruong814 :: PR: [#15392]
- update yaml to include new location of losses that were removed in [#15211] (r.2.7.0 fix) by @blisc :: PR: [#15391]
- cp: `fixes nemo tutorial for loading non registered classes (15398)` into `r2.7.0` by @chtruong814 :: PR: [#15399]
- cp: `default weights to false (15397)` into `r2.7.0` by @chtruong814 :: PR: [#15401]
- cp: `Fixing could not find ctc_segmentation. in CTC tutorial (15403)` into `r2.7.0` by @chtruong814 :: PR: [#15404]
- cp: `Adapt to use env variable for adapter mixin model loading (15406)` into `r2.7.0` by @chtruong814 :: PR: [#15407]
- Fix BNR 2.0 inference alignment error with input signal padding by @ManasiRemane :: PR: [#15390]
- chore: Remove pre-release suffix for 2.7.0 by @chtruong814 :: PR: [#15415]
- cp: Update release workflow to include generated changelog (#15429) by @chtruong814 :: PR: [#15430]

Source: README.md, updated 2026-02-24