Name | Modified | Size | Downloads / Week |
---|---|---|---|
Modin 0.33.0 source code.tar.gz | 2025-05-23 | 16.3 MB | |
Modin 0.33.0 source code.zip | 2025-05-23 | 16.8 MB | |
README.md | 2025-05-23 | 6.0 kB | |
Totals: 3 Items | | 33.1 MB | 0 |
This release introduces features for switching Modin execution between backends (e.g., Ray and local pandas), either manually or automatically. It also includes several bug fixes.
Key Features and Updates Since 0.32.0
- Stability and Bugfixes
- FIX-#7327: Use sort parameter of DataFrame.stack (#7396)
- FIX-#7346: Handle execution on Dask workers to avoid creating conflicting clients (#7347)
- FIX-#7375: Fix Series.duplicated dropping name (#7395)
- FIX-#7381: Fix Series binary operators ignoring fill_value (#7394)
- FIX-#7383: Avoid broadcast issue in partition manager with custom NPartitions (#7399)
- FIX-#7404: Implement interchange protocol for datetime columns (#7434)
- FIX-#7405: Internally sort indices for loc/iloc set (#7440)
- FIX-#7413: Always use positional index before computing argmin/argmax (#7463)
- FIX-#7461: Set backend correctly with environment variables. (#7462)
- FIX-#7465: Properly implement Series.rename_axis (#7466)
- FIX-#7486: Add support for .astype(pandas.CategoricalDtype(…)) (#7487)
- FIX-#7490: Exclude move_to and _update_inplace from casting. (#7491)
- FIX-#7495: Separate extensions for aliases. (#7496)
- FIX-#7521: Fix wrong extension being used when backend is pinned (#7546)
- FIX-#7528: Dispatch module-level extensions to the correct backend (#7529)
- FIX-#7532: Display choices in error message of environment vars (#7533)
- FIX-#7536: setuptools / ray version conflict in pkg_resources._vendor (#7537)
- FIX-#7538: set_backend should exit early if there is nothing to do (#7539)
- FIX-#7547: native qc move_to_me_cost does not work with non-subclasses (#7548)
- FIX-#7553: Fix groupby when AutoSwitchBackend is disabled. (#7554)
- FIX-#7555: Get the correct extension when AutoSwitchBackend is False. (#7556)
- FIX-#7559: Create the dummy query compiler just once per backend. (#7560)
- FIX-#7562: Raise AttributeError for missing extension properties. (#7563)
- FIX-#7569: Fix handling of pyarrow dtype and empty dataframes (#7570)
- FIX-#7576: Fix ambiguous AttributeError message (#7577)
- FIX-#7578: Change groupby extension allow list and fix cached_property extensions (#7579)
- Performance enhancements
- PERF-#7397: Avoid materializing index/columns in shape checks (#7398)
- Refactor Codebase
- REFACTOR-#7315: Refactor axis checks in squeeze (#7400)
- REFACTOR-#7418: Rename internal interchange protocol methods. (#7422)
- REFACTOR-#7427: Require query compilers to expose engine and storage format. (#7430)
- REFACTOR-#7470: Combine backend casting and extension code at the API layer. (#7485)
- REFACTOR-#7493: Improve the clarity of the costing functions (#7494)
- REFACTOR-#7527: Add more costing logic to the base query compiler. (#7530)
- REFACTOR-#7534: Provide internal, overridable method for max_shape (#7535)
- REFACTOR-#7564: Fix docstrings for transfer thresholds. (#7565)
- Update testing suite
- TEST-#7419: Fix a few errors in CI (#7420)
- TEST-#7421: Fix unidist with APT-installed MPI (#7423)
- TEST-#7431: Fix formatting for isort 6 and black 25 (#7432)
- TEST-#7437: Check execution-filter outputs correctly in CI. (#7438)
- TEST-#7441: Correctly skip sanity tests if we don't need them. (#7442)
- TEST-#7457: Fix SSL certificate error in notebooks by using http. (#7458)
- TEST-#7497: Skip tests requiring lxml on windows. (#7500)
- TEST-#7571: xfail test_read_csv_s3_issue4658 due to missing s3 bucket (#7572)
- Documentation improvements
- DOCS-#7566: Add pandas on snowflake + backend pinning to documentation page (#7567)
- New Features
- FEAT-#7433: Replace NativeDataFrameMode with a complete "native" execution. (#7436)
- FEAT-#7445: Add metrics interface so third-parties can collect metrics from the modin frontend (#7444)
- FEAT-#7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting (#7464)
- FEAT-#7455: Add Backend config variable as an alias for execution. (#7456)
- FEAT-#7459: Add methods to get and set backend. (#7460)
- FEAT-#7468: Add progress bar for engine switch (#7469)
- FEAT-#7472: Add an option to register dataframe and series accessors with a particular backend. (#7473)
- FEAT-#7474: Register general functions with a particular backend. (#7489)
- FEAT-#7475: Choose the correct init method from extensions and apply casting to init. (#7488)
- FEAT-#7477: Move the query compiler calculator so it can be used in more places (#7478)
- FEAT-#7480: Implement max_cost interface (#7481)
- FEAT-#7482: Add "from_qc" API to QueryCompiler and BackendCostCalculator to handle asymmetric information scenarios (#7483)
- FEAT-#7492: Allow I/O function accessors. (#7502)
- FEAT-#7505: Support post-operation automatic backend switch. (#7506)
- FEAT-#7507: Support pre-operation automatic backend switch. (#7512)
- FEAT-#7509: Add AutoSwitchBackend configuration variable (#7510)
- FEAT-#7511: Support pre-operation switch for init by passing arguments to cost functions. (#7531)
- FEAT-#7521: Support pinning objects to a backend (#7522)
- FEAT-#7523: Improve formal definition of the automatic switching algorithm (#7524)
- FEAT-#7540: Ability to configure NativeQueryCompiler AutoSwitch Settings (#7561)
- FEAT-#7542: Support post-operation backend switch for groupby. (#7545)
- FEAT-#7543: Let plugins register groupby accessors. (#7575)
- FEAT-#7549: Emit metrics on auto-switch and casting behavior (#7550)
- FEAT-#7557: Add operation and size information to backend switch progress (#7558)
- FEAT-#7573: Dispatch array_ufunc to query compilers (#7574)
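Every entry above follows the same shape: a category prefix, the issue number, a summary, and the pull request number in parentheses. For readers who want to tally or filter these entries, the following is a minimal sketch using only the Python standard library. The names `ENTRY_RE`, `parse_entry`, and `count_by_kind` are hypothetical helpers written for this illustration; they are not part of Modin.

```python
import re
from collections import Counter

# Hypothetical helper pattern (not part of Modin): matches entries such as
# "- FIX-#7327: Use sort parameter of DataFrame.stack (#7396)".
ENTRY_RE = re.compile(
    r"^(?:-\s*)?(?P<kind>[A-Z]+)-#(?P<issue>\d+):\s*"
    r"(?P<summary>.+?)\s*\(#(?P<pr>\d+)\)\s*$"
)


def parse_entry(line):
    """Return a dict for one changelog line, or None if it is not an entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "kind": match["kind"],
        "issue": int(match["issue"]),
        # Some titles end with a period and some do not; normalize them.
        "summary": match["summary"].rstrip("."),
        "pr": int(match["pr"]),
    }


def count_by_kind(lines):
    """Count parsed entries per category prefix, skipping non-entry lines."""
    parsed = (parse_entry(line) for line in lines)
    return Counter(entry["kind"] for entry in parsed if entry is not None)


sample = [
    "- FIX-#7327: Use sort parameter of DataFrame.stack (#7396)",
    "- PERF-#7397: Avoid materializing index/columns in shape checks (#7398)",
    "- FEAT-#7459: Add methods to get and set backend. (#7460)",
    "- New Features",  # a category line, not an entry
]
entries = [parse_entry(line) for line in sample]
counts = count_by_kind(sample)
```

Category lines such as "New Features" do not match the pattern and come back as `None`, so they can be passed through the same loop without special handling.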
Contributors
@CRiddler @YarShev @anmyachev @data-makerman @devin-petersohn @emmanuel-ferdman @mpeleshenko @noloerino @sfc-gh-dpetersohn @sfc-gh-jkew @sfc-gh-joshi @sfc-gh-mvashishtha