
Highlights

peft-v0.15.0

New Methods

CorDA: Context-Oriented Decomposition Adaptation

@iboing and @5eqn contributed CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning. This task-driven initialization method has two modes, knowledge-preservation and instruction-preservation, both of which use external data to select ranks intelligently. The former selects the ranks that correspond to weights not affiliated with knowledge from, say, a QA dataset, so that this knowledge is preserved during fine-tuning; the latter selects the ranks that correspond most to the task at hand (e.g., a classification task). (#2231)
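
Below is a minimal sketch of how a CorDA-initialized LoRA adapter might be set up. The import locations, the CordaConfig option names ("kpm" for knowledge-preservation, "ipm" for instruction-preservation), the preprocess_corda call, and the sample model/dataset are assumptions to verify against the PEFT documentation:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from peft.tuners.lora.config import CordaConfig      # assumed import location
from peft.tuners.lora.corda import preprocess_corda  # assumed import location

model_id = "facebook/opt-125m"  # small placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

@torch.no_grad()
def run_model():
    # Run a handful of samples from the external dataset through the model so
    # that CorDA can collect the statistics it uses to select ranks.
    dataset = load_dataset("imdb", split="train[:16]")  # stand-in dataset
    for sample in dataset:
        inputs = tokenizer(sample["text"], return_tensors="pt", truncation=True, max_length=256)
        model(**inputs)

corda_config = CordaConfig(corda_method="kpm")  # "kpm" / "ipm" are assumed option names
lora_config = LoraConfig(
    init_lora_weights="corda",
    target_modules=["q_proj", "v_proj"],
    corda_config=corda_config,
)
preprocess_corda(model, lora_config, run_model=run_model)  # collect statistics before wrapping
peft_model = get_peft_model(model, lora_config)
```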

Trainable Tokens: Selective token update

The new Trainable Tokens tuner allows selective training of individual tokens without retraining the full embedding matrix, e.g. when adding support for reasoning / thinking tokens. This is far more memory-efficient and the saved checkpoint is much smaller. It can be used standalone or in conjunction with LoRA adapters by passing trainable_token_indices to LoraConfig. (#2376)
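
A small sketch of the LoRA-combined usage, assuming trainable_token_indices accepts a plain list of token indices for the input embedding (it may also take a dict mapping embedding module names to indices); the model id and added tokens are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "facebook/opt-125m"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Add the new special tokens and resize the embedding once, up front.
tokenizer.add_tokens(["<think>", "</think>"])
model.resize_token_embeddings(len(tokenizer))

# Train LoRA adapters on the attention projections, plus only the embedding rows
# of the newly added tokens; the rest of the embedding matrix stays frozen.
config = LoraConfig(
    target_modules=["q_proj", "v_proj"],
    trainable_token_indices=tokenizer.convert_tokens_to_ids(["<think>", "</think>"]),
)
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```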

Enhancements

LoRA now supports targeting multihead attention modules (for now only those with _qkv_same_embed_dim=True). These modules were tricky to support because they may expose linear submodules but do not use those submodules' forward methods, so explicit handling was required. (#1324)
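
A hedged sketch of targeting an nn.MultiheadAttention layer in a custom model; the module names and hyperparameters are illustrative, not part of the release notes:

```python
import torch
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class SmallTransformer(nn.Module):
    def __init__(self, dim=64, num_heads=4, num_classes=2):
        super().__init__()
        # kdim/vdim are left at their defaults, so _qkv_same_embed_dim is True
        self.mha = nn.MultiheadAttention(embed_dim=dim, num_heads=num_heads, batch_first=True)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        attn_out, _ = self.mha(x, x, x)
        return self.head(attn_out.mean(dim=1))

# Target the multihead attention module directly by name.
config = LoraConfig(target_modules=["mha"], r=8, lora_alpha=16)
peft_model = get_peft_model(SmallTransformer(), config)

x = torch.randn(2, 10, 64)
print(peft_model(x).shape)  # torch.Size([2, 2])
```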

Hotswapping now allows different alpha scalings and ranks without recompilation of the model, provided the model is prepared with a call to prepare_model_for_compiled_hotswap() before it is compiled. (#2177)
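
A rough sketch of the intended flow with two pre-trained LoRA checkpoints and the peft.utils.hotswap helpers; the adapter paths are hypothetical and the target_rank keyword is an assumption:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter, prepare_model_for_compiled_hotswap

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder base model
model = PeftModel.from_pretrained(base, "path/to/lora_adapter_A")  # hypothetical checkpoint

# Pad LoRA weights and convert scalings so adapters with different ranks / alphas
# can later be swapped in without triggering a torch.compile recompilation.
prepare_model_for_compiled_hotswap(model, target_rank=32)  # target_rank is an assumed kwarg

model = torch.compile(model)

# Swap the weights of adapter B into the already compiled model, in place.
hotswap_adapter(model, "path/to/lora_adapter_B", adapter_name="default")
```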

GPTQModel support was added as a replacement for AutoGPTQ, which is no longer maintained. (#2247)
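
For reference, a hedged sketch of applying LoRA on top of a pre-quantized GPTQ checkpoint once gptqmodel and optimum are installed; the checkpoint name is just a well-known public example, not one from the release notes:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# A pre-quantized GPTQ checkpoint; transformers selects the GPTQ backend automatically.
model_id = "TheBloke/Llama-2-7B-Chat-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prepare the quantized model for training, then attach LoRA adapters.
model = prepare_model_for_kbit_training(model)
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()
```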

Changes

  • It's now possible to use all-linear as target_modules for custom (non-transformers) models (#2267); see the sketch after this list. This change also includes a bugfix: previously, non-linear layers could be selected when they shared the same name as a linear layer (e.g., bar.foo and baz.foo).
  • The internal tuner API was refactored to make method registration easier. Adding a new method now requires only a single register_peft_method() call instead of changes to numerous files. (#2282)
  • PEFT_TYPE_TO_MODEL_MAPPING is now deprecated and should not be relied upon. Use PEFT_TYPE_TO_TUNER_MAPPING instead. (#2282)
  • Mixed adapter batches can now be used in conjunction with beam search. (#2287)
  • Fixed a bug where modules_to_save keys could wrongly match parts of the state dict when one key was a substring of another (e.g., classifier and classifier2). (#2334)
  • Auto-casting of the input dtype to the LoRA adapter dtype can now be disabled via disable_input_dtype_casting=True. (#2353)
  • The config parameters rank_pattern and alpha_pattern used by many adapters now also support matching full paths by prefixing the pattern with a caret, for example ^foo to target model.foo but not model.bar.foo (see the sketch after this list). (#2419)
  • AutoPeftModels no longer reduce the embedding size when the tokenizer size differs from the embedding size. The embedding matrix is only resized if the tokenizer contains more tokens than the embedding matrix. This prevents resizing of embedding matrices in models that have 'spare' tokens built in. (#2427)
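
As referenced above, a small sketch combining the all-linear shortcut on a custom (non-transformers) model with a caret-anchored rank_pattern; the toy model and the pattern are illustrative:

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
        self.classifier = nn.Linear(64, 2)

    def forward(self, x):
        return self.classifier(self.backbone(x))

config = LoraConfig(
    # "all-linear" now also works for custom (non-transformers) models.
    target_modules="all-linear",
    r=8,
    # The leading caret anchors the pattern at the start of the module path:
    # "^classifier" matches the top-level classifier but not e.g. backbone.classifier.
    rank_pattern={"^classifier": 16},
)
peft_model = get_peft_model(MLP(), config)
peft_model.print_trainable_parameters()
```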

Full Changelog: https://github.com/huggingface/peft/compare/v0.14.0...v0.15.0
