Ray 2.47.0

Release Highlights

  • Prefill disaggregation is now supported in initial support in Ray Serve LLM (#53092). This is critical for production LLM serving use cases.
  • Ray Data features a variety of performance improvements (locality-based scheduling, non-blocking execution) as well as improvements to observability, preprocessors, and other stability fixes.
  • Ray Serve now features custom request routing algorithms, which is critical for high throughput traffic for large model use cases.

Ray Libraries

Ray Data

🎉 New Features:
  • Add save modes support to file data sinks (#52900)
  • Added flattening capability to the Concatenator preprocessor to support output vectorization use cases (#53378); see the sketch below
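
To show where the Concatenator fits, here is a minimal sketch of the preprocessor on a toy dataset. It assumes the `columns`/`output_column_name` keyword parameters of recent releases and does not demonstrate the exact flattening option added in #53378; check the Concatenator API reference for that flag.

```python
import ray
from ray.data.preprocessors import Concatenator

ds = ray.data.from_items([
    {"sepal_l": 5.1, "sepal_w": 3.5, "label": 0},
    {"sepal_l": 4.9, "sepal_w": 3.0, "label": 1},
])

# Combine the numeric feature columns into a single vector-valued column
# that downstream trainers can consume directly.
prep = Concatenator(columns=["sepal_l", "sepal_w"], output_column_name="features")
print(prep.transform(ds).take(2))
```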

💫 Enhancements:
  • Re-enable Actor locality-based scheduling, and improve the algorithms for ranking candidate locations for a bundle (#52861)
  • Disable blocking the pipeline by default until the Actor Pool fully scales up to its minimum number of actors (#52754)
  • Progress bar and dashboard improvements to show the names of partial functions properly (#52280)

🔨 Fixes:
  • Make Ray Data from_torch respect the underlying Dataset length (#52804); see the sketch below
  • Fix flaky aggregation test (#53383)
  • Fix race condition bug in fault tolerance by disabling the on_exit hook (#53249)
  • Fix the move_tensors_to_device utility for the list/tuple[tensor] case (#53109)
  • Fix ActorPool scaling to avoid scaling down when the input queue is empty (#53009)
  • Fix internal queue accounting for all operators with an internal queue (#52806)
  • Fix backpressure for FileBasedDatasource, which prevents potential OOMs for workloads using FileBasedDatasources (#52852)
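
As a quick illustration of the from_torch fix, a minimal sketch using a small in-memory Torch dataset (names and sizes are arbitrary):

```python
import ray
import torch
from torch.utils.data import TensorDataset

# A map-style Torch dataset with a well-defined length.
torch_ds = TensorDataset(torch.arange(8).unsqueeze(1))

ds = ray.data.from_torch(torch_ds)
# With #52804, the Ray Dataset reports the same number of rows as the
# underlying Torch dataset.
assert ds.count() == len(torch_ds)
```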

📖 Documentation:
  • Fix working code snippets (#52748)
  • Improve AggregateFnV2 docstrings and examples (#52911)
  • Improved documentation for vectorizers and API visibility in Data (#52456)

Ray Train

🎉 New Features:
  • Added support for configuring Ray Train worker actor runtime environments (#52421)
  • Included Grafana panel data in Ray Train export for improved monitoring (#53072)
  • Introduced a structured logging environment variable to standardize log formats (#52952)
  • Added metrics for TrainControllerState to enhance observability (#52805)

💫 Enhancements:
  • Logging of controller state transitions to aid in debugging and analysis (#53344)
  • Improved handling of Noop scaling decisions for smoother scaling logic (#53180)

🔨 Fixes:
  • Improved the move_tensors_to_device utility to correctly handle list / tuple of tensors (#53109)
  • Fixed GPU transfer support for non-contiguous tensors (#52548)
  • Increased timeout in test_torch_device_manager to reduce flakiness (#52917)

📖 Documentation:
  • Added a note about PyTorch DataLoader's multiprocessing and forkserver usage (#52924)
  • Fixed various docstring format and indentation issues (#52855, #52878)
  • Removed unused "configuration-overview" documentation page (#52912)
  • General typo corrections (#53048)

🏗 Architecture refactoring:
  • Deduplicated ML doctest runners in CI for efficiency (#53157)
  • Converted isort configuration to Ruff for consistency (#52869)
  • Removed unused PARALLEL_CI blocks and combined imports (#53087, #52742)

Ray Tune

💫 Enhancements:
  • Updated test_train_v2_integration to use the correct RunConfig (#52882)

📖 Documentation:
  • Replaced session.report with tune.report and corrected import paths (#52801); see the sketch below
  • Removed an outdated graphics-cards reference in the docs (#52922)
  • Fixed various docstring format issues (#52879)
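
For readers updating their own code along with this docs change, here is a minimal sketch of the newer reporting style, assuming the current ray.tune.report(metrics_dict) API and a toy trainable:

```python
from ray import tune


def trainable(config):
    for step in range(3):
        loss = config["lr"] / (step + 1)
        # New style: report a metrics dict via ray.tune.report
        # (previously ray.air.session.report).
        tune.report({"loss": loss, "step": step})


tuner = tune.Tuner(trainable, param_space={"lr": tune.grid_search([0.01, 0.1])})
results = tuner.fit()
print(results.get_best_result(metric="loss", mode="min").metrics)
```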

Ray Serve

🎉 New Features:
  • Added support for implementing custom request routing algorithms (#53251)
  • Introduced an environment variable to prioritize custom resources during deployment scheduling (#51978)

💫 Enhancements:
  • The ingress API now accepts a builder function in addition to an ASGI app object (#52892); see the sketch below
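
A minimal sketch of the builder-function form, assuming the builder is a zero-argument callable that returns the ASGI app (the existing app-object form still works); the Starlette app and route are illustrative only:

```python
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route

from ray import serve


def build_asgi_app():
    # Built on the replica instead of being captured at deployment
    # definition time (assumption based on #52892).
    async def ping(request):
        return JSONResponse({"ok": True})

    return Starlette(routes=[Route("/ping", ping)])


@serve.deployment
@serve.ingress(build_asgi_app)  # previously only an app object was accepted
class Gateway:
    pass


app = Gateway.bind()
# serve.run(app)  # then: GET http://localhost:8000/ping
```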

🔨 Fixes:
  • Fixed runtime_env validation for py_modules (#53186)
  • Disallowed special characters in Serve deployment and application names (#52702)
  • Added a descriptive error message when a deployment name is not found (#45181)

📖 Documentation:
  • Updated the guide on serving models with Triton Server in Ray Serve
  • Added documentation for custom request routing algorithms

Ray Serve/Data LLM

🎉 New Features:
  • Added initial support for prefill/decode disaggregation (#53092)
  • Expose vLLM metrics through the serve.llm API (#52719)
  • Embedding API (#52229)
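
These features layer onto the serve.llm deployment entry point; as a reference point, a minimal sketch of that entry point is below (the model ID and autoscaling numbers are placeholders, and the exact config fields should be checked against the Ray Serve LLM docs):

```python
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

# Placeholder model and scaling settings, for illustration only.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",
        model_source="Qwen/Qwen2.5-0.5B-Instruct",
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=2),
    ),
)

# Build an OpenAI-compatible Serve app for the configured models; the new
# embedding API (#52229) and vLLM metrics (#52719) extend this stack.
app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app)
```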

💫 Enhancements:
  • Allow setting name_prefix in build_llm_deployment (#53316)
  • Minor bug fix for #53144: stop tokens cannot be null (#53288)
  • Add missing repetition_penalty vLLM sampling parameter (#53222)
  • Mitigate serve.llm streaming overhead by properly batching stream chunks (#52766)
  • Fix test_batch_vllm leaking resources by using a larger wait_for_min_actors_s

🔨 Fixes:
  • LLMRouter.check_health() now checks LLMServer.check_health() (#53358)
  • Fix runtime passthrough and auto-executor class selection (#53253)
  • Update the check_health return type (#53114)
  • Bug fix for duplication of the <bos> token (#52853)
  • In stream batching, the first part of the stream was always consumed and not streamed back from the router; this is now fixed (#52848)

RLlib

🎉 New Features:
  • Add GPU inference to offline evaluation (#52718)

💫 Enhancements:
  • Do-over of the examples for connector pipelines (#52604)
  • Cleanup of meta-learning classes and examples (#52680)

🔨 Fixes:
  • Fixed weight syncing in offline evaluation (#52757)
  • Fixed a bug in the split_and_zero_pad utility function related to complex structures vs. simple values or np.arrays (#52818)

Ray Core

💫 Enhancements:
  • The uv run integration is now enabled by default, so you no longer need to set RAY_RUNTIME_ENV_HOOK (#53060); see the sketch below
  • Record GCS process metrics (#53171)
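
A minimal sketch of what the uv integration looks like in practice: launch the script with `uv run --with emoji script.py` (the emoji package is just an example dependency), and the Ray worker processes inherit the same uv-managed environment without any hook configuration.

```python
# script.py
import emoji  # provided by `uv run --with emoji`, not installed globally
import ray


@ray.remote
def hello() -> str:
    # This task runs in a worker process that inherits the uv-managed
    # environment, so `emoji` is importable here as well.
    return emoji.emojize("Ray + uv :thumbs_up:")


if __name__ == "__main__":
    print(ray.get(hello.remote()))
```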

🔨 Fixes:
  • Improvements for using RuntimeEnv in the Job Submission API (#52704)
  • Close unused pipe file descriptors of child processes of the Raylet (#52700)
  • Fix race condition when canceling a task that hasn't started yet (#52703)
  • Implement a thread pool and call the CPython API on all threads within the same concurrency group (#52575)
  • cgraph: Fix execution schedules with collective operations (#53007)
  • cgraph: Fix scalar tensor serialization edge case with serialize_to_numpy_or_scalar (#53160)
  • Fix an issue where a valid RestartActor RPC was ignored (#53330)
  • Fix reference counter crashes during worker graceful shutdown (#53002)

Dashboard

🎉 New Features:
  • train: Add dynolog for on-demand GPU profiling for Torch training (#53191)

💫 Enhancements:
  • Add configurability of the 'orgId' param for requesting Grafana dashboards (#53236)

🔨 Fixes:
  • Fix Grafana dashboard dropdowns for the Data and Train dashboards (#52752)
  • Fix the dashboard for daylight saving time (#52755)

Ray Container Images

💫 Enhancements:
  • Upgrade h11 (#53361); requests, starlette, and jinja2 (#52951); and pyopenssl and cryptography (#52941)
  • Generate multi-arch image indexes (#52816)

Docs

🎉 New Features:
  • End-to-end example: Entity recognition with LLMs (#52342)
  • End-to-end example: XGBoost tutorial (#52383)
  • End-to-end tutorial for audio transcription and LLM-as-judge curation (#53189)

💫 Enhancements:
  • Add pydoclint to pre-commit (#52974)

Thanks!

Thank you to everyone who contributed to this release!

@NeilGirdhar, @ok-scale, @JiangJiaWei1103, @brandonscript, @eicherseiji, @ktyxx, @MichalPitr, @GeneDer, @rueian, @khluu, @bveeramani, @ArturNiederfahrenhorst, @c8ef, @lk-chen, @alanwguo, @simonsays1980, @codope, @ArthurBook, @kouroshHakha, @Yicheng-Lu-llll, @jujipotle, @aslonnie, @justinvyu, @machichima, @pcmoritz, @saihaj, @wingkitlee0, @omatthew98, @can-anyscale, @nadongjun, @chris-ray-zhang, @dizer-ti, @matthewdeng, @ryanaoleary, @janimo, @crypdick, @srinathk10, @cszhu, @TimothySeah, @iamjustinhsu, @mimiliaogo, @angelinalg, @gvspraveen, @kevin85421, @jjyao, @elliot-barn, @xingyu-long, @LeoLiao123, @thomasdesr, @ishaan-mehta, @noemotiovon, @hipudding, @davidxia, @omahs, @MengjinYan, @dengwxn, @MortalHappiness, @alhparsa, @emmanuel-ferdman, @alexeykudinkin, @KunWuLuan, @dev-goyal, @sven1977, @akyang-anyscale, @GokuMohandas, @raulchen, @abrarsheikh, @edoakes, @JoshKarpel, @bhmiller, @seanlaii, @ruisearch42, @dayshah, @Bye-legumes, @petern48, @richardliaw, @rclough, @israbbani, @jiwq
