Torch-Pruning is an open-source toolkit for structural pruning of PyTorch models. It reduces the size and computational cost of neural networks by physically removing redundant channels and parameters, rather than masking them, so pruned models are smaller and faster during both training and inference. At its core is DepGraph, a graph-based algorithm that automatically identifies dependencies between layers, allowing coupled parameters to be pruned together without breaking the computational graph. This dependency analysis makes it possible to prune complex architectures such as transformers, convolutional networks, and diffusion models, and the toolkit's coverage of computer vision and large language models makes it a flexible solution for model compression.
Features
- Graph-based dependency analysis using the DepGraph pruning algorithm
- Structural pruning that removes channels and parameters from neural networks
- Support for large models including transformers and computer vision architectures
- Integration with PyTorch training and inference workflows
- Tools for measuring parameter importance during pruning operations
- Optimization of models to reduce memory usage and computational cost