| Name | Modified | Size | Downloads / Week |
| --- | --- | --- | --- |
| Parent folder | | | |
| README.md | 2024-12-13 | 23.9 kB | |
| v3.0.0-beta3 source code.tar.gz | 2024-12-13 | 20.9 MB | |
| v3.0.0-beta3 source code.zip | 2024-12-13 | 24.9 MB | |
| Totals: 3 items | | 45.9 MB | 0 |
This release improves the overall PaddleNLP experience: it adds the Llama-3.2 and DeepSeekV2 models, upgrades TokenizerFast, and refactors SFTTrainer.
PaddleNLP now also supports offloading and reloading optimizer state, and implements refined (fine-grained) recomputation for a 7% training performance gain. On the Unified Checkpoint side, the asynchronous save logic has been further optimized, and a new checkpoint compression feature saves up to 78.5% of storage space. Finally, large-model inference, auto-parallel training, multi-hardware support, and the documentation have all been substantially improved.
Key Updates and Enhancements
- New models:
  - Added the Llama-3.2 model (#9199) and the DeepSeekV2 model (#9250), further broadening the choice of large models.
- Core framework improvements:
  - Refactored SFTTrainer and SFTConfig to improve code maintainability. (#9318)
  - Added offloading and reloading of optimizer state (#9467), effectively reducing memory usage.
  - Implemented refined (fine-grained) recomputation via hooks; on the Llama model, for example, this improves training performance by 7%. (#9396)
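The refined-recompute idea trades compute for memory: instead of caching every intermediate activation for the backward pass, selected layers cache only their inputs and recompute their outputs on demand. A minimal pure-Python sketch of that caching decision (illustrative only, not the PaddleNLP hook API):

```python
def forward_with_cache(x, layers, recompute_flags):
    """Run `layers` on `x`, caching per-layer state for a later backward pass.

    Layers flagged for recomputation cache only their (small) input and will
    be re-run during backward; other layers cache input and output together.
    """
    cache = []
    for layer, recompute in zip(layers, recompute_flags):
        y = layer(x)
        cache.append((x,) if recompute else (x, y))
        x = y
    return x, cache

layers = [lambda v: v * 2, lambda v: v + 3]
out, cache = forward_with_cache(5, layers, [True, False])
# out is 13; the first layer cached only its input (5,),
# while the second cached both input and output (10, 13)
```

Flagging only the most memory-hungry layers for recomputation is what makes the scheme "refined" compared with recomputing every layer.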
- Unified Checkpoint optimizations:
  - Updated the asynchronous save logic (#9173, #9274, #9321), significantly improving checkpoint save and load efficiency.
  - Added support for expert parallelism (#9055), making model training more flexible.
  - Supported Unified Checkpoint when sharding_comm_overlap is enabled. (#9392)
  - Added checkpoint compression, saving up to 78.5% of storage space. (#9183)
  - Reduced checkpoint loading time through multi-threading (#9034).
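A ~78.5% saving is in the range you would expect from quantizing float32 checkpoint state down to roughly one byte per value. The release notes do not spell out the scheme, so the following is only a generic symmetric int8 quantization sketch, not the actual PaddleNLP compression implementation:

```python
def quantize_int8(values):
    """Symmetric int8 quantization: one float scale plus one signed byte per value."""
    scale = max(abs(v) for v in values) / 127 or 1.0
    return scale, [round(v / scale) for v in values]

def dequantize_int8(scale, quants):
    return [scale * q for q in quants]

state = [0.5, -1.0, 0.25, 0.75]          # stand-in for float32 optimizer state
scale, quants = quantize_int8(state)
restored = dequantize_int8(scale, quants)
# Each value shrinks from 4 bytes to 1 byte (~75% smaller), at the cost of
# a rounding error bounded by the quantization scale.
```

Real checkpoint compressors typically quantize per tensor (or per channel) and keep a small amount of high-precision metadata, which is how savings can exceed the naive 75%.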
- Tokenizer enhancements:
  - Upgraded TokenizerFast support (#9407, #9532) and unified the tokenizer `_pad` logic (#9280).
  - Enabled `padding_side` as a call-time keyword argument (#9258) and added support for reading Tiktoken `tokenizer.model` files (#9215).
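One of the tokenizer changes in this release makes the padding side selectable per call (#9258). The behavior itself is simple; a toy sketch (function name and signature are illustrative, not the PaddleNLP API):

```python
def pad_ids(ids, length, pad_id=0, side="right"):
    """Pad a token-id list to `length` on the requested side."""
    padding = [pad_id] * (length - len(ids))
    return ids + padding if side == "right" else padding + ids

# Decoder-only generation typically wants left padding, so the real tokens
# sit at the end of the sequence, adjacent to the generated continuation.
pad_ids([101, 102], 4)               # → [101, 102, 0, 0]
pad_ids([101, 102], 4, side="left")  # → [0, 0, 101, 102]
```

Exposing the side at call time lets one tokenizer instance serve both training (right padding) and generation (left padding) without being reconfigured.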
- Inference performance improvements:
  - Supported directly loading quantized built-in models from BOS for LLM inference (#9197).
  - Strengthened FP8 quantization support in LLM inference (e.g. #9328, #9423), meeting diverse precision requirements.
  - Enhanced support for speculative decoding and Append Attention. (#9180) (#9244)
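Speculative decoding lets a small draft model propose several tokens that the large target model then verifies in a single pass, cutting the number of expensive decode steps. A greedy-verification toy with the model calls stubbed out as functions (illustrative only, not the PaddleNLP implementation in #9180):

```python
def speculative_step(prefix, draft_next, target_next, k=4):
    """One round: the draft proposes k tokens; the target keeps the agreeing
    prefix plus one corrected token of its own."""
    ctx = list(prefix)
    proposed = []
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    ctx = list(prefix)
    accepted = []
    for tok in proposed:
        if target_next(ctx) == tok:      # target agrees with the draft token
            accepted.append(tok)
            ctx.append(tok)
        else:                            # first disagreement: stop verifying
            break
    accepted.append(target_next(ctx))    # target always emits one token itself
    return accepted

# Stub models: both predict len(context) % 5, but the draft goes wrong
# once the context grows past 3 tokens.
target = lambda ctx: len(ctx) % 5
draft = lambda ctx: len(ctx) % 5 if len(ctx) < 3 else 9
```

Here `speculative_step([0], draft, target)` accepts the two agreeing draft tokens and appends the target's correction, yielding three tokens from one verification pass.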
- Hardware compatibility extensions:
  - Strengthened support for Intel HPU (#9273), which now supports dynamic-graph prediction.
  - Provided the unified checkpoint feature for XPU and other domestic hardware (#9312).
- Auto-parallel optimizations:
  - Fixed multiple issues in the auto-parallel workflow (e.g. #9217, #9355), ensuring stable parallel training.
  - Updated the auto-parallel configuration and checkpoint converter (e.g. #9136, #9432), improving training flexibility and stability.
- Documentation and test updates:
  - Updated multiple documents, including the LLM model docs (e.g. #9314) and the quantization docs (e.g. #9330), keeping the information current and accurate.
  - Added multiple test cases, such as a distributed data loading test (#9438), improving test coverage.
  - Fixed broken links and formatting issues in the documentation (e.g. #9127, #9515), improving the user experience.
This release marks another step in PaddleNLP's continued progress, delivering a more comprehensive, efficient, and stable NLP solution. We look forward to bringing users even more innovation and value in future releases.
What's Changed
- [Unified Checkpoint] update async_save_info in develop by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9173
- add flashmask rm by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9154
- [LLM_INFER] Support quantized model from bos and fix docs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9197
- fix ci not set no_proxy and modify tests in pir mode by @fightfat in https://github.com/PaddlePaddle/PaddleNLP/pull/9205
- [Models] Add Llama-3.2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9199
- move some auto_parallel args into class AutoTrainingArguments by @Wennie396 in https://github.com/PaddlePaddle/PaddleNLP/pull/9155
- [Performance] Compatible with flashmask API rename upgrade by @GuoxiaWang in https://github.com/PaddlePaddle/PaddleNLP/pull/9019
- [AutoParallel] add vpp align and pp amp test by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9176
- fix auto ci return bug when run in v100 by @fightfat in https://github.com/PaddlePaddle/PaddleNLP/pull/9216
- fix auto ci return bug when run in v100 by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9228
- [LLM] Add tools for parameters by @Hanyonggong in https://github.com/PaddlePaddle/PaddleNLP/pull/9137
- [AutoParallel] Add test for fuse_ffn and fuse_attention_qkv pass by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9203
- [CI] Fix ci import. by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9239
- [Version] Update version info by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9241
- [Auto Parallel] Adding align mode support by @zhangyuqin1998 in https://github.com/PaddlePaddle/PaddleNLP/pull/9150
- [LLM INFER] top_p_sampling_reject support top_p=0 and custom seed by @gzy19990617 in https://github.com/PaddlePaddle/PaddleNLP/pull/9202
- [INFER] update tune_cublaslt_gemm op and fix some bugs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9222
- Reduce the time spent on git downloading third-party libraries by @vivienfanghuagood in https://github.com/PaddlePaddle/PaddleNLP/pull/9246
- [PIR] fix pir open bugs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9248
- Cherry-pick some PRs from incubate/paddlenlp-fleety by @sneaxiy in https://github.com/PaddlePaddle/PaddleNLP/pull/9245
- [Unified Checkpoint] Support expert parallel by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9055
- [PIR] fix pir dt2st for chatglm_v2 by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9251
- Cherry-pick some PRs from incubate/paddlenlp-fleety by @LiYuRio in https://github.com/PaddlePaddle/PaddleNLP/pull/9253
- [Unified Checkpoint] Fix generation config save by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9223
- [AutoParallel] Fix tests for pass paddle AutoParallel CI by @liym27 in https://github.com/PaddlePaddle/PaddleNLP/pull/9267
- change dataset by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9266
- [Unified Checkpoint] update async save logic by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9274
- add config file for model chatglm2,gemma,yuan by @Mangodadada in https://github.com/PaddlePaddle/PaddleNLP/pull/9139
- Fix async hang by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9276
- [AutoParallel] Change llama test from sharding stage2 to stage1 by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9281
- [Tokenizer] Enable padding_side as call time kwargs by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9258
- [Trainer] fix save_model by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9286
- [CI] Skip inference test cases by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9270
- [LLM] Add deepseekv2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9250
- [Tokenizer] Unify tokenizer _pad by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9280
- [CI] Fix llm/alignment/rm/flashmask path by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9289
- support attention mask using causal=True by @GuoxiaWang in https://github.com/PaddlePaddle/PaddleNLP/pull/9268
- [FlashMask] Add FlashMask for Qwen2 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9264
- bug fix for xpu_parallel_matmul by @FeixLiu in https://github.com/PaddlePaddle/PaddleNLP/pull/9297
- fix lora sharding v2 by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9300
- [LLM INFER] Append attn by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9244
- [Auto Parallel] fix bugs for split_batches_for_accumulation && fix bu… by @zhangyuqin1998 in https://github.com/PaddlePaddle/PaddleNLP/pull/9217
- [Tokenizer] Fix TokenizerFast missing clean_up_tokenization_spaces by @dynamicheart in https://github.com/PaddlePaddle/PaddleNLP/pull/9304
- clean llama static modeling file by @zhiqiu in https://github.com/PaddlePaddle/PaddleNLP/pull/9301
- [Unified Checkpoint] Accelerate loading checkpoint by multi-thread by @Crystal-X-111 in https://github.com/PaddlePaddle/PaddleNLP/pull/9034
- fix non-pipelinelayer to distributed by @gongel in https://github.com/PaddlePaddle/PaddleNLP/pull/9310
- change the legacy to slm by @wawltor in https://github.com/PaddlePaddle/PaddleNLP/pull/9311
- [TRL] Rename sft trainer. by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9292
- [XPU] support unified ckpt function by @cqulilujia in https://github.com/PaddlePaddle/PaddleNLP/pull/9312
- [LLM INFER] Fix some bugs and chatglm_v2 support block_attn by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9271
- [Readme] Add flash mask by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9219
- update llm infer docs by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9314
- [Unified Checkpoint] Add split param and refactor code by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9240
- [METAX] Support llama for MX C550 by @idontkonwher in https://github.com/PaddlePaddle/PaddleNLP/pull/9186
- update QR code by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9325
- add flash_attention on model chatglm_v2 by @Mangodadada in https://github.com/PaddlePaddle/PaddleNLP/pull/9296
- fix readme by @Mangodadada in https://github.com/PaddlePaddle/PaddleNLP/pull/9326
- [Unified Checkpoint] update non-merge checkpoint loading, move async_save_info.json location by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9321
- [paddle cpu inference]fix cpu doc by @bukejiyu in https://github.com/PaddlePaddle/PaddleNLP/pull/9299
- [LLM INFER] add rope_theta for block_multihead_attention by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9334
- Fix pr 9334 by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9335
- fix parameter calculation in auto_parallel mode by @zhiqiu in https://github.com/PaddlePaddle/PaddleNLP/pull/9327
- [Docs] Update flashmask by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9330
- Update load_save_single_card.py by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9337
- Update README.md by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9339
- [Tokenizer] Support reading Tiktoken tokenizer.model. by @lvdongyi in https://github.com/PaddlePaddle/PaddleNLP/pull/9215
- align default custom black/white list for dygraph and static graph by @zhiqiu in https://github.com/PaddlePaddle/PaddleNLP/pull/9340
- [intel_hpu] initial commit for intel_hpu support by @yanfeich in https://github.com/PaddlePaddle/PaddleNLP/pull/9273
- Compatible with Tensor.to change to out_of_place. by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9343
- [Tokenizer] Fix Llama3Tokenizer import by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9341
- [Docs] Add precision alignment doc by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9346
- [Tokenizer] Support adding special tokens to Qwen tokenizer by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9344
- Add ordered save to avoid OOM by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/9347
- [AutoParallel]Bugfix Hang for VPP-Sharding by @JZ-LIANG in https://github.com/PaddlePaddle/PaddleNLP/pull/9336
- Add CI testing for A100 and V100 device by @waliwali777 in https://github.com/PaddlePaddle/PaddleNLP/pull/9324
- [Inference] Append attn FP8 quant by @ckl117 in https://github.com/PaddlePaddle/PaddleNLP/pull/9328
- [Tokenizer] Add BertTokenizerFast, support register new tokenizer by @lvdongyi in https://github.com/PaddlePaddle/PaddleNLP/pull/9353
- clean print in auto_trainer by @zhiqiu in https://github.com/PaddlePaddle/PaddleNLP/pull/9357
- [Unified Checkpoint] Fix fp32 dtype for using newest paddle by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9360
- [UIE] Fix tokenizer output with return_token_type_ids by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9363
- Add offload/reload for optimizer by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/9359
- refine dtype use by @wanghuancoder in https://github.com/PaddlePaddle/PaddleNLP/pull/9366
- Add check for sharding stage1-v2 using amp master grad by @ForFishes in https://github.com/PaddlePaddle/PaddleNLP/pull/9333
- [Trainer] Update assert to warning by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9332
- [Auto Parallel] fix adapt_stale_fwd_patch for to_static mode by @zhangyuqin1998 in https://github.com/PaddlePaddle/PaddleNLP/pull/9372
- [LLM INFER] Optimize fuse some kernels in postprocess by @gzy19990617 in https://github.com/PaddlePaddle/PaddleNLP/pull/9201
- [AutoParallel] Fix EXCODE bug of AutoParallel CI by @waliwali777 in https://github.com/PaddlePaddle/PaddleNLP/pull/9355
- Support pp + no_recompute_layer. by @tianyuzhou668 in https://github.com/PaddlePaddle/PaddleNLP/pull/9373
- [Unified Checkpoint] Support empty state_dict saving by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9380
- Add submodule by @risemeup1 in https://github.com/PaddlePaddle/PaddleNLP/pull/9385
- [CI] add recursive for submodule by @Liujie0926 in https://github.com/PaddlePaddle/PaddleNLP/pull/9389
- [CI]fix scripts by @Liujie0926 in https://github.com/PaddlePaddle/PaddleNLP/pull/9394
- [LLM]add ktotrainer by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9393
- Refine log freq by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9397
- [XPU] Llama XPU's swiglu uses phi's swiglu by @dynamicheart in https://github.com/PaddlePaddle/PaddleNLP/pull/9414
- fix hip paddlenlp_ops bug by @TBD1 in https://github.com/PaddlePaddle/PaddleNLP/pull/9418
- [CI]update target_lists_for_llm by @Liujie0926 in https://github.com/PaddlePaddle/PaddleNLP/pull/9417
- [INFER][LLM] Add the AutoModel for inference mode by @zeroRains in https://github.com/PaddlePaddle/PaddleNLP/pull/9416
- [Unified Checkpoint] Support sharding_comm_overlap by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9392
- [DCU] update dcu paddlenlp_ops by @TBD1 in https://github.com/PaddlePaddle/PaddleNLP/pull/9433
- Change core.LoDTensor to core.DenseTensor by @co63oc in https://github.com/PaddlePaddle/PaddleNLP/pull/9434
- Change LOD_TENSOR to DENSE_TENSOR by @co63oc in https://github.com/PaddlePaddle/PaddleNLP/pull/9419
- [LLM] Fix deepseekv2 import in py38 by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9446
- [Distributed Dataloader] change process new_group creation by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9438
- Update dist_dataloader.py by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9451
- [llm]fix pp no drop last by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9439
- Reduce long duration for the exit -6 re-run process. by @waliwali777 in https://github.com/PaddlePaddle/PaddleNLP/pull/9400
- Fix row parallel lora layers parameters initialization bug by @will-jl944 in https://github.com/PaddlePaddle/PaddleNLP/pull/9427
- Refactor tool of creating pretrain dataset by @gongel in https://github.com/PaddlePaddle/PaddleNLP/pull/9454
- [Auto-Parallel] update conf for sharding overlap in static by @liym27 in https://github.com/PaddlePaddle/PaddleNLP/pull/9456
- [AutoParallel] add release_gradients and comm_buffer_size_MB to strategy by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9432
- [LLM] Skip zero loss by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9447
- [ChatTemplate] Fix chat template when answer is contained within question. by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9444
- [LLM] Add expert parallel by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9368
- Add handling of abnormal exits to the multi-node benchmark task scripts by @XieYunshen in https://github.com/PaddlePaddle/PaddleNLP/pull/9442
- [llm]add set_seed by @lugimzzz in https://github.com/PaddlePaddle/PaddleNLP/pull/9429
- [AutoParallel] Reconstruct sharding mesh dimension inference logic - Part2 add sharding_mesh_dimension param by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9382
- Fix auto parallel CI exit -6 by @waliwali777 in https://github.com/PaddlePaddle/PaddleNLP/pull/9460
- [ChatTemplate] Fix chat template for Gemma when answer is contained within question. by @lvdongyi in https://github.com/PaddlePaddle/PaddleNLP/pull/9462
- Use paddle.cast instead of Tensor.astype by @HydrogenSulfate in https://github.com/PaddlePaddle/PaddleNLP/pull/9461
- fixed the init problem in tensor parallel by @wawltor in https://github.com/PaddlePaddle/PaddleNLP/pull/9452
- Revised PoSE by @whf313 in https://github.com/PaddlePaddle/PaddleNLP/pull/8822
- fix AutoInferenceModel for qwen-vl by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9463
- add reft method by @TranscenderNing in https://github.com/PaddlePaddle/PaddleNLP/pull/8819
- [AutoParallel]: llama_model_auto support alibi by @blacksheep-Aristotle in https://github.com/PaddlePaddle/PaddleNLP/pull/9422
- [AutoParallel]:gpt 13b model support fused_linear sp fused_attention … by @blacksheep-Aristotle in https://github.com/PaddlePaddle/PaddleNLP/pull/9477
- add Moslora by @TranscenderNing in https://github.com/PaddlePaddle/PaddleNLP/pull/9331
- [Trainer] Fix eval for map dataset by @DesmonDay in https://github.com/PaddlePaddle/PaddleNLP/pull/9472
- [Inference]Move quantization code from run_finetune.py to run_quantization.py by @lixcli in https://github.com/PaddlePaddle/PaddleNLP/pull/9450
- [AutoParallel] Fix parameter passing for comm_buffer_size_MB and release_gradients by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9481
- [AutoParallel]:fix run llama_13b_auto error by @blacksheep-Aristotle in https://github.com/PaddlePaddle/PaddleNLP/pull/9480
- [Unified Checkpoint] Checkpoint compression by @wtmlon in https://github.com/PaddlePaddle/PaddleNLP/pull/9183
- fixbug for chatglm_v2's RetaryEmbedding dtype by @mingMelody in https://github.com/PaddlePaddle/PaddleNLP/pull/9476
- [LLM INFER] Support speculative decoding (llama) by @Wanglongzhi2001 in https://github.com/PaddlePaddle/PaddleNLP/pull/9180
- [Fix] Remove data args print by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9486
- [AutoParallel] open vpp test cast at v100 machines by @AndSonder in https://github.com/PaddlePaddle/PaddleNLP/pull/9468
- [ChatTemplate] Fix chat template for Yuan when answer is contained within question. by @lvdongyi in https://github.com/PaddlePaddle/PaddleNLP/pull/9485
- [AutoParallel]:fix baichuan d2s fail by @blacksheep-Aristotle in https://github.com/PaddlePaddle/PaddleNLP/pull/9478
- [Tokenizer] Support fast tokenizer within AutoTokenizer import by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9466
- [Inference] use fp8 cuda core gemm kernel when M<=4 by @zhink in https://github.com/PaddlePaddle/PaddleNLP/pull/9423
- [XPU] set appropriate mask value for xpu by @runzhech in https://github.com/PaddlePaddle/PaddleNLP/pull/9495
- [LLM INFER] not use gemm_dequant default and fix bug by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9498
- [NEW Feature] Add hook-based refined_recompute support by @JunnYu in https://github.com/PaddlePaddle/PaddleNLP/pull/9396
- [Hackathon 7th No.43] Improve TokenizerFast support, part 1 by @yinfan98 in https://github.com/PaddlePaddle/PaddleNLP/pull/9407
- [BUG] fix pp eval shape bug by @JunnYu in https://github.com/PaddlePaddle/PaddleNLP/pull/9505
- Adding LoKrModel Class to paddle.peft library by @WhuanY in https://github.com/PaddlePaddle/PaddleNLP/pull/9269
- Remove the CUDA_DEVICE_MAX_CONNECTIONS environment variable and optimize the benchmark scripts by @XieYunshen in https://github.com/PaddlePaddle/PaddleNLP/pull/9500
- [Refactor] SFTTrainer SFTConfig by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9318
- fix csrc readme by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9515
- Add document for speculative decoding by @Wanglongzhi2001 in https://github.com/PaddlePaddle/PaddleNLP/pull/9492
- [News] FlashRAG-Paddle by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9511
- support quant ckpt limit strategy by @wtmlon in https://github.com/PaddlePaddle/PaddleNLP/pull/9494
- Fix ckpt convert bug by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9521
- support pp accuracy calculation by @wtmlon in https://github.com/PaddlePaddle/PaddleNLP/pull/9379
- Fix ckpt convert bug1 by @zhangbo9674 in https://github.com/PaddlePaddle/PaddleNLP/pull/9522
- [CI] Compatible with paddle.where by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9534
- [Inference] Update DygraphInferencePredictor by @DrownFish19 in https://github.com/PaddlePaddle/PaddleNLP/pull/9491
- support offload/reload optimizer's states for custom device by @tianhaodongbd in https://github.com/PaddlePaddle/PaddleNLP/pull/9467
- [LLM INFER] fix tune_cublaslt_int8_gemm.py and remove dist_config by @yuanlehome in https://github.com/PaddlePaddle/PaddleNLP/pull/9520
- [Hackathon 7th No.43] TokenizerFast for Qwen2 by @yinfan98 in https://github.com/PaddlePaddle/PaddleNLP/pull/9532
- [INFER][LLM] Add the AutoPredictor for inference by @zeroRains in https://github.com/PaddlePaddle/PaddleNLP/pull/9445
- Support call sft training with clone PaddleNLP by @ZHUI in https://github.com/PaddlePaddle/PaddleNLP/pull/9516
New Contributors
- @Crystal-X-111 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9034
- @idontkonwher made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9186
- @waliwali777 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9324
- @tianyuzhou668 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9373
- @risemeup1 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9385
- @TBD1 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9418
- @zeroRains made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9416
- @XieYunshen made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9442
- @whf313 made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/8822
- @mingMelody made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9476
- @runzhech made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9495
- @WhuanY made their first contribution in https://github.com/PaddlePaddle/PaddleNLP/pull/9269
Full Changelog: https://github.com/PaddlePaddle/PaddleNLP/compare/v3.0.0-beta2...v3.0.0-beta3