| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| py_data_juicer-1.4.6-py3-none-any.whl | 2026-02-02 | 2.0 MB | |
| README.md | 2026-02-02 | 1.8 kB | |
| Release v1.4.6_ introduce Q_A Copilot_ Video bytes I_O_ Tracer for Ray mode source code.tar.gz | 2026-02-02 | 46.7 MB | |
| Release v1.4.6_ introduce Q_A Copilot_ Video bytes I_O_ Tracer for Ray mode source code.zip | 2026-02-02 | 47.5 MB | |
| Totals: 4 Items | 96.2 MB | 0 | |
Major Updates
- 🤖 A Q&A copilot is introduced to answer user questions. It is now available in the docs, the DingTalk group, Discord, etc. [#891]
- 🎬 I/O for video bytes: support reading and storing videos as bytes. [#882]
- 🫆 Tracer for Ray mode: the tracer can now trace changed samples in Ray mode. [#885]
Enhancements
- Add a new Dockerfile for embodied-AI use cases, and update the CUDA/system/... versions of the base Docker image. [#887]
- Add Copilot news and refine the DingTalk and Discord links/QR codes in the docs. [#891]
- Convert the word retrieval from lists to sets to speed up two OPs. [#890]
- Add a new workflow to automatically fetch the traffic report from GitHub Insights. [#899] [#900]
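The list-to-set change in [#890] relies on a general property of Python containers: a membership test on a `set` takes constant time on average, whereas a `list` is scanned linearly. A minimal sketch of the idea follows. The function and variable names are hypothetical and are not taken from the Data-Juicer code base.

```python
def count_flagged_words(words, flagged_words):
    """Count how many tokens in `words` appear in `flagged_words`."""
    # Build the set once; each `in` check below is then O(1) on average,
    # instead of O(len(flagged_words)) for a list.
    flagged = set(flagged_words)
    return sum(1 for w in words if w in flagged)


def flagged_ratio(words, flagged_words):
    """Fraction of tokens that are flagged; 0.0 for an empty sample."""
    if not words:
        return 0.0
    return count_flagged_words(words, flagged_words) / len(words)
```

The gain grows with the size of the lookup collection, so it matters most when each sample is checked against a large word list.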
Fixed Bugs
- Fix `TypeError` when using `field_types` in YAML config for `RequiredFieldsValidator`. [#886]
- Replace the deprecated `concurrency` parameter with `compute` parameter in the ray.data.Dataset.map_batches() call. [#888]
- Prevent divide-by-zero in calculate_ray_np when the Ray cluster is not ready. [#864]
- Add thread limiting for multi-process workloads to prevent over-subscription. [#877]
- Fix a bug where the unit tests of standalone mode could hang. [#896]
- Update several out-of-date links in the docs. [#898]
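Two of the fixes above ([#864] and [#877]) follow common defensive patterns: guard a division against a zero denominator, and cap per-process thread pools so that processes times threads does not exceed the available cores. The sketch below illustrates both under stated assumptions. `safe_num_proc` and `limit_threads` are hypothetical helpers, not the actual `calculate_ray_np` implementation, and the environment variables are the ones conventionally read by OpenMP, OpenBLAS, and MKL.

```python
import os


def safe_num_proc(available_cpus, cpus_per_task, fallback=1):
    """Hypothetical helper: derive a process count without dividing by zero."""
    # A cluster that is not ready yet may report zero resources;
    # fall back to a safe default instead of raising ZeroDivisionError.
    if cpus_per_task <= 0 or available_cpus <= 0:
        return fallback
    return max(1, int(available_cpus // cpus_per_task))


def limit_threads(num_procs, total_cores=None):
    """Hypothetical helper: cap library threads per worker process."""
    total = total_cores or os.cpu_count() or 1
    # Give each process an equal share of the cores, and at least one thread.
    threads = max(1, total // max(1, num_procs))
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(threads)
    return threads
```

These variables are typically read when the numerical libraries are first loaded, so a helper like `limit_threads` would need to run before those imports in each worker process.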
Acknowledgement
- @dubin555 helped fix several bugs and improve the processing performance of some OPs. [#886] [#890]
- @xyuzh helped update the Ray usage to the latest version in some OPs, fix some bugs, and optimize the parallel strategies. [#888] [#864]
- @XinyuLiu1999 helped fix an over-subscription bug in multi-process workloads. [#877]
Full Changelog: https://github.com/datajuicer/data-juicer/compare/v1.4.5...v1.4.6