Showing 21 open source projects for "high level assembler"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Wan2.2

    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 169 This Week
    Last Update:
    See Project
  • 2
    HunyuanVideo-Foley

    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    ...Produces high-quality 48 kHz audio output suitable for professional use. Hybrid architecture combining multimodal transformer blocks and unimodal refinement blocks. Temporal alignment via frame-level synchronization modules (e.g. Synchformer).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    LTX-2

    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is a powerful, open-source toolkit developed by Lightricks that provides a modular, high-performance base for building real-time graphics and visual effects applications. It is architected to give developers low-level control over rendering pipelines, GPU resource management, shader orchestration, and cross-platform abstractions so they can craft visually compelling experiences without starting from scratch. Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries, resource loaders, utilities for texture and buffer handling, and integration points for native event loops and input systems. ...
    Downloads: 57 This Week
    Last Update:
    See Project
  • 4
    LingBot-World

    LingBot-World

    Advancing Open-source World Models

    LingBot-World is an open-source, high-fidelity world simulator designed to advance the state of world models through video generation. Built on top of Wan2.2, it enables realistic, dynamic environment simulation across diverse styles, including real-world, scientific, and stylized domains. LingBot-World supports long-term temporal consistency, maintaining coherent scenes and interactions over minute-level horizons.
    Downloads: 3 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Protenix

    Protenix

    A trainable PyTorch reproduction of AlphaFold 3

    Protenix is an open-source, trainable PyTorch reimplementation of AlphaFold 3, developed by ByteDance with the goal of democratizing high-accuracy protein structure prediction for computational biology and drug-discovery research. Protenix provides a complete pipeline for turning protein sequences (with optional MSA / sequence alignment) or structural inputs (e.g. PDB/CIF) into full 3D atomic-level structure predictions. It supports both “full” models and lightweight variants such as “Protenix-Mini,” offering a trade-off between speed/compute cost and predictive accuracy — making structure prediction accessible even in resource-constrained environments. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Step-Audio-EditX

    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level token operations. This allows users to modify not only what is said (the text) but also how it's said: emotion, tone, speaking style, prosody, accent, even paralinguistic cues. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 8
    GLM-TTS

    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    ...The system introduces a multi-reward reinforcement learning framework that jointly optimizes for voice similarity, emotional expressiveness, pronunciation, and intelligibility, yielding output that can rival commercial options in naturalness and expressiveness. GLM-TTS also supports phoneme-level control and hybrid text + phoneme input, giving developers precise control over pronunciation critical for multilingual or polyphone­-rich languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    DeepSeek Coder

    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective to encourage better contextual completions and infilling. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    LingBot-VLA

    LingBot-VLA

    A Pragmatic VLA Foundation Model

    ...It has been pretrained on tens of thousands of hours of real robotic interaction data across multiple robot platforms, which enables it to generalize well to diverse morphologies and tasks without needing extensive retraining on each new bot. The model aims to bridge vision, language understanding, and motor control within one unified architecture, making it capable of understanding high-level instructions and generating coherent low-level actions in physical environments. Because LingBot-VLA includes not just the model weights but also a full production-ready codebase with tools for data handling, training, and evaluation, developers can adapt it to custom robots or simulation environments efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    GLM-4.7

    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe...
    Downloads: 64 This Week
    Last Update:
    See Project
  • 12
    HY-Motion 1.0

    HY-Motion 1.0

    HY-Motion model for 3D character animation generation

    HY-Motion 1.0 is an open-source, large-scale AI model suite developed by Tencent’s Hunyuan team that generates high-quality 3D human motion from simple text prompts, enabling the automatic production of fluid, diverse, and semantically accurate animations without manual keyframing or rigging. Built on advanced deep learning architectures that combine Diffusion Transformer (DiT) and flow matching techniques, HY-Motion scales these approaches to the billion-parameter level, resulting in strong instruction-following capabilities and richer motion outputs compared to existing open-source models. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    MiniCPM-o

    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 15
    Anthropic SDK Python

    Anthropic SDK Python

    Provides convenient access to the Anthropic REST API from any Python 3

    ...The library includes definitions for all request and response parameters using Python typed objects, automatically handles serialization and deserialization, and wraps HTTP logic (timeouts, retries, error mapping) so that developers can call the API in a clean, high-level way. The SDK supports both synchronous and asynchronous usage (via async/await) depending on context. Importantly, it also supports streaming responses via Server-Sent Events (SSE) so that large outputs can be consumed incrementally rather than waiting for the full response. The client offers helper abstractions for tools (function-style “tools”) and streaming utilities for building interactive agents.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    OpenTinker

    OpenTinker

    OpenTinker is an RL-as-a-Service infrastructure for foundation models

    OpenTinker is an open-source Reinforcement Learning-as-a-Service (RLaaS) infrastructure intended to democratize reinforcement learning for large language model (LLM) agents. Traditional RL setups can be monolithic and difficult to configure, but OpenTinker separates concerns across agent definition, environment interaction, and execution, which lets developers focus on defining the logic of agents and environments separately from how training and inference are run. It introduces a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Tiktoken

    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    ...It also offers extension mechanisms so that custom encodings can be registered. Internally, it includes the core tokenizer logic (often implemented in Rust or efficient lower-level code), APIs for encoding, decoding, and counting tokens, and binding layers to Python (and sometimes other languages) for easy use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    CO3D (Common Objects in 3D)

    CO3D (Common Objects in 3D)

    Tooling for the Common Objects In 3D dataset

    CO3Dv2 (Common Objects in 3D, version 2) is a large-scale 3D computer vision dataset and toolkit from Facebook Research designed for training and evaluating category-level 3D reconstruction methods using real-world data. It builds upon the original CO3Dv1 dataset, expanding both scale and quality—featuring 2× more sequences and 4× more frames, with improved image fidelity, more accurate segmentation masks, and enhanced annotations for object-centric 3D reconstruction. CO3Dv2 enables research...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    DeepSeek-V3.2-Speciale

    DeepSeek-V3.2-Speciale

    High-compute ultra-reasoning model surpassing model surpassing GPT-5

    ...The model uses a scaled reinforcement learning framework that allows it to surpass GPT-5 in several evaluations and reach reasoning performance comparable to Gemini-3.0-Pro. DeepSeek-V3.2-Speciale contributed to gold-medal solutions in the 2025 IMO, IOI, ICPC World Finals, and CMO, demonstrating its ability to handle elite-level problem solving. It is released under the MIT license and includes curated benchmark solutions for community verification and analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    DeepSeek-V3.2

    DeepSeek-V3.2

    High-efficiency reasoning and agentic intelligence model

    DeepSeek-V3.2 is a cutting-edge large language model developed by DeepSeek-AI, focused on achieving high reasoning accuracy and computational efficiency for agentic tasks. It introduces DeepSeek Sparse Attention (DSA), a new attention mechanism that dramatically reduces computational overhead while maintaining strong long-context performance. Built with a scalable reinforcement learning framework, it reaches near-GPT-5 levels of reasoning and outperforms comparable models like DeepSeek-V3.1...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB