Petals v2.0.1: Inference of longer sequences, Python 3.11 support, bug fixes

Highlights

🛣️ Inference of longer sequences. We extended the max sequence length to 8192 tokens for Llama 2 and added chunking to avoid server out-of-memory errors, which previously happened when processing long prefixes. This became possible thanks to multi-query attention used in Llama 2, which uses 8x less GPU memory for attention caches. Now you can process longer sequences using a Petals client and have dialogues of up to 8192 tokens at https://chat.petals.dev.
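The chunking idea can be sketched in a few lines. This is an illustrative sketch only, not the actual Petals server code; `chunk_size`, `process_prefix`, and `forward_fn` are hypothetical names for illustration:

```python
# Illustrative sketch of prefix chunking: a long prefix is split into
# fixed-size chunks so the server materializes attention activations
# for only one chunk at a time, avoiding out-of-memory errors.
# These helpers are hypothetical, not the actual Petals internals.

def chunked(prefix_tokens, chunk_size=1024):
    """Yield successive chunks of at most chunk_size tokens."""
    for start in range(0, len(prefix_tokens), chunk_size):
        yield prefix_tokens[start:start + chunk_size]

def process_prefix(prefix_tokens, forward_fn, chunk_size=1024):
    """Feed the prefix through forward_fn one chunk at a time.

    forward_fn stands in for a server-side forward pass that appends
    each chunk's keys/values to the attention cache.
    """
    for chunk in chunked(prefix_tokens, chunk_size):
        forward_fn(chunk)

# Example: an 8192-token prefix is processed as eight 1024-token chunks.
prefix = list(range(8192))
seen = []
process_prefix(prefix, seen.append, chunk_size=1024)
assert len(seen) == 8 and sum(len(c) for c in seen) == 8192
```

Peak memory per forward pass then scales with `chunk_size` rather than with the full prefix length, at the cost of a few extra round trips for very long inputs.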

🐍 Python 3.11 support. Petals clients and servers now work on Python 3.11.

🐞 Bug fixes. We fixed the server's --token argument (used to provide your 🤗 Model Hub access token for loading Llama 2), possible deadlocks in the server, issues with fine-tuning speed (servers available via relays are now deprioritized), and other minor load-balancing issues.

🪟 Running server on Windows. We made a better guide for running a server in WSL (Windows Subsystem for Linux).

📦 Running server on Runpod. We added a guide for using a Petals template on Runpod.

What's Changed

Full Changelog: https://github.com/bigscience-workshop/petals/compare/v2.0.0.post1...v2.0.1

Source: README.md, updated 2023-07-23