Horovod v0.28.0: Keras 2.11+ optimizers, faster reducescatter, fixes for latest TensorFlow, CUDA, NCCL (2023-05-09)

Added

  • TensorFlow: Added get_local_and_global_gradients to PartialDistributedGradientTape to retrieve local and non-local gradients separately. (#3859)

Changed

  • Improved reducescatter performance by allocating output tensors before enqueuing the operation. (#3824)
  • TensorFlow: Ensured that tf.logical_and within allreduce tf.cond runs on CPU. (#3885)
  • TensorFlow: Added support for Keras 2.11+ optimizers. (#3860)
  • CUDA_VISIBLE_DEVICES environment variable is no longer passed to remote nodes. (#3865)
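
To illustrate the reducescatter item above: the operation reduces tensors across ranks and then splits the result along the first dimension, so each rank's output shape can be computed from the input shape and the number of ranks before any communication happens. The pure-Python simulation below is only a sketch of those semantics, not Horovod's implementation. The function names are hypothetical, the reduction is assumed to be a sum, and the first dimension is assumed to divide evenly by the number of ranks.

```python
from typing import List

Matrix = List[List[float]]


def reducescatter_output_rows(total_rows: int, num_ranks: int) -> List[int]:
    """Rows each rank receives (hypothetical helper; even split assumed)."""
    if total_rows % num_ranks != 0:
        raise ValueError("this sketch only covers evenly divisible inputs")
    return [total_rows // num_ranks] * num_ranks


def simulate_reducescatter(inputs: List[Matrix]) -> List[Matrix]:
    """Simulate a sum reducescatter; inputs[r] is the tensor held by rank r."""
    num_ranks = len(inputs)
    rows, cols = len(inputs[0]), len(inputs[0][0])
    rows_per_rank = reducescatter_output_rows(rows, num_ranks)

    # Output shapes depend only on the input shape and the rank count,
    # so every output buffer can be allocated up front.
    outputs = [[[0.0] * cols for _ in range(n)] for n in rows_per_rank]

    offset = 0
    for rank, n in enumerate(rows_per_rank):
        for i in range(n):
            for j in range(cols):
                # Elementwise sum over all ranks for this rank's slice.
                outputs[rank][i][j] = sum(t[offset + i][j] for t in inputs)
        offset += n
    return outputs


if __name__ == "__main__":
    rank0 = [[1.0, 2.0], [3.0, 4.0]]
    rank1 = [[10.0, 20.0], [30.0, 40.0]]
    print(simulate_reducescatter([rank0, rank1]))
```

With two ranks, each holding a 2x2 tensor, rank 0 ends up with the first row of the elementwise sum and rank 1 with the second.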

Fixed

  • Fixed build with ROCm. (#3839, #3848)
  • Fixed build of Docker image horovod-nvtabular. (#3851)
  • Fixed linking against recent NCCL versions by defaulting CUDA runtime library linkage to static and ensuring that weak symbols are overridden. (#3867, #3846)
  • Fixed compatibility with TensorFlow 2.12 and recent nightly versions. (#3864, #3894, #3906, #3907)
  • Fixed missing arguments of Keras allreduce function. (#3905)
  • Updated with_device functions in MXNet and PyTorch to skip unnecessary cudaSetDevice calls. (#3912)
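
The with_device item above describes avoiding redundant device switches. One common way to do this, shown here as an assumption rather than as Horovod's actual code, is to compare the requested device with the current one and only switch (and later restore) when they differ. FakeDeviceRuntime and with_device_sketch are hypothetical names; the fake runtime stands in for cudaGetDevice and cudaSetDevice so the sketch runs without a GPU.

```python
from contextlib import contextmanager


class FakeDeviceRuntime:
    """Hypothetical stand-in for a GPU runtime; counts set_device calls."""

    def __init__(self, current: int = 0) -> None:
        self.current = current
        self.set_calls = 0

    def get_device(self) -> int:
        return self.current

    def set_device(self, device: int) -> None:
        self.set_calls += 1
        self.current = device


@contextmanager
def with_device_sketch(runtime: FakeDeviceRuntime, device: int):
    """Run a block on `device`, switching only if it is not already active."""
    previous = runtime.get_device()
    switched = previous != device
    if switched:
        runtime.set_device(device)
    try:
        yield
    finally:
        # Restore only if we actually changed the device.
        if switched:
            runtime.set_device(previous)


if __name__ == "__main__":
    rt = FakeDeviceRuntime(current=0)
    with with_device_sketch(rt, 0):
        pass  # already on device 0, so no set_device call is made
    with with_device_sketch(rt, 1):
        pass  # one call to switch, one to restore
    print(rt.set_calls)
```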
Source: README.md, updated 2023-05-09