Page 2 | speed-dreeams free download

Pfl Research

Simulation framework for accelerating research

A fast, modular Python framework released by Apple for privacy-preserving federated learning (PFL) simulation. Integrates with TensorFlow, PyTorch, and classical ML, and offers high-speed distributed simulation (7–72× faster than alternatives).

Downloads: 1 This Week

Last Update: 2026-03-19


SenseVoice

Multilingual speech recognition and audio understanding model

...SenseVoice is trained on more than 400,000 hours of speech data and supports over 50 languages for multilingual recognition tasks. It is built to achieve high transcription accuracy while maintaining efficient inference performance. It includes different model variants optimized for either speed or accuracy, allowing developers to choose a configuration suitable for their use case. In addition to speech transcription, SenseVoice can detect emotional cues in speech and identify common sound events such as applause, laughter, or coughing. It also provides tools for running inference, exporting models to formats like ONNX or LibTorch, and deploying the system through APIs.

Downloads: 10 This Week

Last Update: 3 days ago


OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

...The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to equivalent Edge voices. Because it relies on Edge’s TTS, the audio generation itself is free, and the project essentially acts as a smart proxy that handles formatting and streaming. The server supports Server-Sent Events (SSE) for streaming audio, enabling low-latency playback in chat UIs and other interactive tools. ...
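As a rough illustration of what "redirecting an OpenAI TTS client" involves, the sketch below builds (but does not send) a POST request to a `/v1/audio/speech` endpoint using only the Python standard library. The field names follow the OpenAI speech API (`model`, `input`, `voice`, `response_format`, `speed`). The base URL, port, API key, and the assumption that the proxy accepts a `model` field are all hypothetical and not taken from the project.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice="alloy",
                         response_format="mp3", speed=1.0,
                         api_key="placeholder-key"):
    """Build (without sending) an OpenAI-style text-to-speech request.

    base_url and api_key are placeholders; whether a given server
    checks the key or the model name is an assumption of this sketch.
    """
    payload = {
        "model": "tts-1",            # OpenAI model name; a proxy may ignore it
        "input": text,               # text to synthesize
        "voice": voice,              # OpenAI voice name, mapped server-side
        "response_format": response_format,
        "speed": speed,              # playback speed multiplier
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical local deployment; only the base URL differs from an OpenAI call.
request = build_speech_request("http://localhost:5050",
                               "Hello from Edge TTS", speed=1.25)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` against a running server would return the audio bytes; the point here is only that the path and body shape stay the same when the host changes.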

Downloads: 1 This Week

Last Update: 2025-11-28


FramePack

Let's make video diffusion practical

...It’s useful for diffusion and generative models that learn from sequential image datasets, as well as classical pipelines that batch many related frames. With a simple API and examples, it invites experimentation on tradeoffs between compression, fidelity, and speed.

Downloads: 7 This Week

Last Update: 2025-10-21


openTSNE

Extensible, parallel implementations of t-SNE

openTSNE is a modular Python implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) [1], a popular dimensionality-reduction algorithm for visualizing high-dimensional data sets. openTSNE incorporates the latest improvements to the t-SNE algorithm, including the ability to add new data points to existing embeddings [2], massive speed improvements [3] [4] [5] that enable t-SNE to scale to millions of data points, and various tricks to improve the global alignment of the resulting visualizations.

Downloads: 1 This Week

Last Update: 2024-08-19


Whisper-WebUI

A web UI for easy subtitle generation using the Whisper model

...Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.

Downloads: 6 This Week

Last Update: 2026-03-18


1D Visual Tokenization and Generation

This repo contains the code for 1D tokenizer and generator

The 1D Visual Tokenization and Generation project from ByteDance introduces a novel “one-dimensional” tokenizer designed for images: instead of representing images with large grids of 2D tokens (as in many prior generative/image-modeling systems), it compresses images into as few as 32 discrete tokens (or more, optionally) — thereby achieving a very compact, efficient representation that drastically speeds up generation and reconstruction while retaining strong fidelity. This compact...

Downloads: 0 This Week

Last Update: 2025-12-02


OpenBB

Investment Research for Everyone, Everywhere

Customize and speed up your analysis, bring your own data, and create instant reports to gain a competitive edge. Bring in data from a CSV file, a private endpoint, an RSS feed, or even an embedded SEC filing. Chat with financial data using large language models. Instead of spending time reading, create summaries in seconds and ask how the findings affect your investments.

Downloads: 3 This Week

Last Update: 2026-03-09


SAM 2

The repository provides code for running inference with SAM 2

SAM 2 is a next-generation version of the Segment Anything Model (SAM), designed to improve performance, generalization, and efficiency in promptable image segmentation tasks. It retains the core promptable interface (accepting points, boxes, or masks) but incorporates architectural and training enhancements to produce higher-fidelity masks, better boundary adherence, and robustness to complex scenes. The updated model is optimized for faster inference and lower memory use, enabling real-time...

Downloads: 8 This Week

Last Update: 2025-10-06


DeepSpeed

Deep learning optimization library: makes distributed training easy

DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for deep learning training and inference. With DeepSpeed you can:
1. Train or run inference on dense or sparse models with billions or trillions of parameters
2. Achieve excellent system throughput and efficiently scale to thousands of GPUs
3. Train or run inference on resource-constrained GPU systems
4. Achieve unprecedented low latency and high throughput for inference
5. ...

Downloads: 1 This Week

Last Update: 2026-03-30


Notte

Open-source browser using agents

Notte is an open-source browser framework that enables the development and deployment of web-based AI agents. It introduces a perception layer that transforms web pages into structured, navigable maps described in natural language, allowing agents to interact with the internet more effectively. Notte is designed for building scalable and efficient browser-based AI applications.

Downloads: 1 This Week

Last Update: 2026-04-10


Guidance

A guidance language for controlling large language models

Guidance is an efficient programming paradigm for steering language models. With Guidance, you can control how output is structured and get high-quality output for your use case—while reducing latency and cost vs. conventional prompting or fine-tuning. It allows users to constrain generation (e.g. with regex and CFGs) as well as to interleave control (conditionals, loops, tool use) and generation seamlessly.

Downloads: 1 This Week

Last Update: 2026-03-18


SentenceTransformers

Multilingual sentence & image embeddings with BERT

...Further, it is easy to fine-tune your own models. Our models are evaluated extensively and achieve state-of-the-art performance on various tasks. Further, the code is tuned to provide the highest possible speed.

Downloads: 7 This Week

Last Update: 7 days ago


Elia

Terminal-based LLM chat tool with multi-model and local support

Elia is an open source terminal-based interface designed for interacting with large language models in a fast and efficient way. It runs entirely in the command line, offering a keyboard-driven experience that reduces the need for switching between apps. Users can chat with both proprietary models like ChatGPT and Claude, as well as local models such as Llama 3, Mistral, and Gemma. Elia stores conversations in a local SQLite database, making it easy to revisit past interactions. It supports...

Downloads: 3 This Week

Last Update: 2026-03-19


BoxMOT

Pluggable SOTA multi-object tracking modules for segmentation

BoxMOT is an open-source framework designed to provide modular implementations of state-of-the-art multi-object tracking algorithms for computer vision applications. The project focuses on the tracking-by-detection paradigm, where objects detected by vision models are continuously tracked across frames in a video sequence. It provides a pluggable architecture that allows developers to combine different object detectors with multiple tracking algorithms without modifying the core codebase....
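To make the tracking-by-detection idea concrete, here is a minimal sketch of a greedy IoU-based tracker in plain Python. It is not BoxMOT's API or any of its trackers: the class name `GreedyIouTracker`, the threshold, and the box format `(x1, y1, x2, y2)` are assumptions chosen for illustration. A detector (not shown) would supply the per-frame boxes.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Hypothetical toy tracker: keeps an ID when boxes overlap enough."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}          # track id -> box from the previous frame
        self._ids = count(1)      # source of fresh track ids

    def update(self, detections):
        """Assign a track id to each detection; return {track_id: box}."""
        unmatched = dict(self.tracks)
        current = {}
        for box in detections:
            # Pick the still-unmatched track that overlaps this box the most.
            best_id = max(unmatched,
                          key=lambda t: iou(unmatched[t], box),
                          default=None)
            if (best_id is not None
                    and iou(unmatched[best_id], box) >= self.iou_threshold):
                del unmatched[best_id]        # reuse the existing id
            else:
                best_id = next(self._ids)     # start a new track
            current[best_id] = box
        # Simplification: tracks with no match this frame are dropped at once.
        self.tracks = current
        return current


tracker = GreedyIouTracker()
tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])          # frame 1
frame2 = tracker.update([(51, 50, 61, 60), (1, 0, 11, 10)])  # frame 2
print(frame2)
```

Because the tracker only consumes boxes, any detector can feed it, which is the sense in which detector and tracker are interchangeable. Production trackers typically add motion models, appearance features, and a grace period before dropping lost tracks.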

Downloads: 3 This Week

Last Update: 6 days ago


Spark TTS

Spark-TTS Inference Code

Spark TTS is an open-source, PyTorch-based text-to-speech inference system that leverages large language models to produce highly natural, intelligible speech from text input. It uses an efficient single-stream architecture where speech tokens are directly reconstructed from the predictions of an LLM, removing the need for external acoustic models or complex vocoders and making the generation pipeline cleaner and faster. The project supports zero-shot voice cloning, meaning it can imitate a...

Downloads: 3 This Week

Last Update: 2026-02-04


DeepSparse

Sparsity-aware deep learning inference runtime for CPUs

A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).

Downloads: 0 This Week

Last Update: 2025-06-02


LuxTTS

A high-quality rapid TTS voice cloning model

LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports...

Downloads: 4 This Week

Last Update: 2026-03-12


DocTR

Library for OCR-related tasks powered by Deep Learning

DocTR provides an easy and powerful way to extract valuable information from your documents. Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performance on public document...

Downloads: 4 This Week

Last Update: 2026-02-04


IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...

Downloads: 5 This Week

Last Update: 2025-11-27


Dataherald

Interact with your SQL database, Natural Language to SQL using LLMs

...It is designed to enable real-time, self-service analytics without needing technical knowledge of databases, making business data easily accessible to non-technical users. Dataherald focuses on speed, accuracy, and scalability for enterprise settings.

Downloads: 0 This Week

Last Update: 2025-03-13


OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

The OmniVoice project is a cutting-edge multilingual text-to-speech system designed to generate high-quality speech across more than 600 languages. Built on a diffusion language model-style architecture, it combines scalability with strong performance, enabling both natural-sounding voice synthesis and efficient inference speeds. One of its most notable capabilities is zero-shot voice cloning, allowing users to replicate a speaker’s voice using only a short reference audio clip. In addition,...

Downloads: 3 This Week

Last Update: 6 days ago


DFlash

Block Diffusion for Ultra-Fast Speculative Decoding

DFlash is an open-source framework for ultra-fast speculative decoding using a lightweight block diffusion model to draft text in parallel with a target large language model, dramatically improving inference speed without sacrificing generation quality. It acts as a “drafter” that proposes likely continuations which the main model then verifies, enabling significant throughput gains compared to traditional autoregressive decoding methods that generate token by token. This approach has been shown to deliver lossless acceleration on models like Qwen3-8B by combining block diffusion techniques with efficient batching, making it ideal for applications where latency and cost matter. ...
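The draft-then-verify loop described above can be sketched with toy stand-ins. This is not DFlash's code: `target_next` and `draft_block` are hypothetical deterministic functions, the drafter is a deliberately imperfect guesser rather than a block diffusion model, and verification is done one position at a time instead of in a single batched forward pass. The sketch only shows why greedy verification keeps the output identical to plain autoregressive decoding.

```python
def target_next(tokens):
    """Toy 'target model': deterministic next token for a given context."""
    return (sum(tokens) * 31 + len(tokens)) % 50


def draft_block(tokens, k):
    """Toy 'drafter': proposes k tokens, wrong at some positions on purpose."""
    ctx = list(tokens)
    block = []
    for _ in range(k):
        guess = target_next(ctx)
        if len(ctx) % 3 == 0:          # inject an error every third position
            guess = (guess + 1) % 50
        block.append(guess)
        ctx.append(guess)
    return block


def autoregressive_decode(prompt, num_new):
    """Baseline: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(prompt, num_new, k=4):
    """Accept drafted tokens while they match the target's greedy choice."""
    tokens = list(prompt)
    goal = len(prompt) + num_new
    rounds = 0                          # number of draft-and-verify rounds
    while len(tokens) < goal:
        rounds += 1
        for guess in draft_block(tokens, k):
            expected = target_next(tokens)
            tokens.append(expected)     # equals guess when the draft is accepted
            # On a mismatch the corrected token is kept and the rest of the
            # block is discarded, since it was drafted from a wrong context.
            if guess != expected or len(tokens) == goal:
                break
    return tokens, rounds


output, rounds = speculative_decode([1, 2, 3], num_new=20)
assert output == autoregressive_decode([1, 2, 3], num_new=20)
print(f"{rounds} verification rounds for 20 new tokens")
```

In a real system each round would cost roughly one target forward pass, so fewer rounds than generated tokens is where the speedup comes from; the equality check is the "lossless" property.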

Downloads: 2 This Week

Last Update: 4 days ago


PyBroker

Algorithmic Trading in Python with Machine Learning

Are you looking to enhance your trading strategies with the power of Python and machine learning? Then you need to check out PyBroker! This Python framework is designed for developing algorithmic trading strategies, with a focus on strategies that use machine learning. With PyBroker, you can easily create and fine-tune trading rules, build powerful models, and gain valuable insights into your strategy’s performance.

Downloads: 2 This Week

Last Update: 2026-03-05


Depth Pro

Sharp Monocular Metric Depth in Less Than a Second

...Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The repo and research page emphasize boundary fidelity and crisp geometry, addressing a common weakness in monocular depth where edges can blur. Community integrations (e.g., inference wrappers and UI nodes) have sprung up around the model, reflecting practical interest in video, AR, and generative pipelines. ...

Downloads: 4 This Week

Last Update: 2025-10-08


Search Results for "speed-dreeams" - Page 2

Showing 134 open source projects for "speed-dreeams"

