Showing 61 open source projects for "precise"

View related business solutions
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    UI-TARS Desktop

    UI-TARS Desktop

    A GUI Agent app based on UI-TARS to control your computer using AI

    ...This cross-platform tool supports both Windows and macOS, allowing users to perform tasks through intuitive commands. Key features include screenshot-based visual recognition, precise mouse and keyboard control, and real-time feedback on actions. Provides immediate responses and visual feedback on actions performed. The application facilitates seamless interaction with the computer, enhancing user experience by simplifying complex operations into straightforward language instructions. Leverages advanced AI to bridge the gap between visual elements and language commands. ...
    Downloads: 53 This Week
    Last Update:
    See Project
  • 2
    LTX-2.3

    LTX-2.3

    Official Python inference and LoRA trainer package

    ...This unified approach allows creators to generate complete multimedia sequences where motion, timing, and sound are aligned automatically. LTX-2 is designed for both research and production workflows and can generate high-resolution video clips with precise control over structure, motion, and camera behavior.
    Downloads: 95 This Week
    Last Update:
    See Project
  • 3
    agentation

    agentation

    The visual feedback tool for agents

    ...The package installs into a React app and shows a floating toolbar that lets you activate element selection and add notes during a development session, helping you capture precise targets for improved AI output.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    gse

    gse

    Go efficient multilingual NLP and text segmentation

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. Gse is implements jieba by golang, and try add NLP support and more feature. Support common, search engine, full mode, precise mode and HMM mode multiple word segmentation modes. Support user and embed dictionary, Part-of-speech/POS tagging, analyze segment info, stop and trim words. Support multilingual: English, Chinese, Japanese and others. Support Traditional Chinese. Support HMM cut text use Viterbi algorithm. Support NLP by TensorFlow (in work). Named Entity Recognition (in work). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    Agent S

    Agent S

    Agent S: an open agentic framework that uses computers like a human

    ...The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. With optional local code execution, reflection mechanisms, and compositional planning, Agent S provides a scalable and research-driven framework for building advanced computer-use agents.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    ML Ferret

    ML Ferret

    Refer and Ground Anything Anywhere at Any Granularity

    ...The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. In practice, this enables tasks like “find that small red icon next to the chart and describe it” where both the linguistic reference and the visual region are ambiguous without fine spatial reasoning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Wan2.2

    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    ...It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting, composition, color tone, and more, for high-quality, customizable video styles. The model is trained on significantly larger datasets than its predecessor, greatly enhancing motion complexity, semantic understanding, and aesthetic diversity. Wan2.2 also open-sources a 5-billion parameter high-compression VAE-based hybrid text-image-to-video (TI2V) model that supports 720P video generation at 24fps on consumer-grade GPUs like the RTX 4090. ...
    Downloads: 163 This Week
    Last Update:
    See Project
  • 8
    Magnitude

    Magnitude

    Vision AI browser agent for automation, testing, and extraction

    ...Developers can use it to automate repetitive web tasks, integrate services without APIs, or build advanced browser-based agents. It also provides flexible abstraction levels, allowing both high-level task execution and precise low-level control of actions like mouse movements and keyboard input.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MCP ZoomEye

    MCP ZoomEye

    A Model Context Protocol server that provides network asset info

    The ZoomEye MCP Server is a Model Context Protocol server that provides network asset information based on query conditions, allowing Large Language Models to obtain data by querying ZoomEye using dorks and other search parameters. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • 10
    Artificial Intelligence Controller

    Artificial Intelligence Controller

    AICI: Prompts as (Wasm) Programs

    AICI is a framework that allows developers to build controllers that constrain and direct the output of Large Language Models (LLMs). By treating prompts as WebAssembly (Wasm) programs, AICI enables more precise and controlled interactions with LLMs, enhancing their utility in various applications. This approach allows for the creation of more reliable and predictable AI-driven systems.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    FLUX.2

    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels),...
    Downloads: 45 This Week
    Last Update:
    See Project
  • 12
    Scriberr

    Scriberr

    Self-hosted AI audio transcription

    ...Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. Beyond transcription, Scriberr also integrates features such as summarization, tagging, and interaction with language models, allowing users to extract insights from conversations or meetings efficiently.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    Get Shit Done

    Get Shit Done

    A light-weight and powerful meta-prompting, context engineering

    Get Shit Done is a high-impact, open-source meta-prompting and spec-driven development system designed to streamline building software with AI assistants like Claude Code, OpenCode, and Gemini CLI. It solves “context rot” — the degradation of AI quality as a chat session grows — by structuring your idea into precise, context-engineered steps that are researched, scoped, planned, executed, and verified with clear commands and outputs instead of ad-hoc prompts. The project emphasizes simplicity and effectiveness over bureaucratic workflows like story points, sprint ceremonies, or enterprise processes, making it especially useful for solo developers or small teams aiming to get reliable execution without overhead. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    canvas-editor

    canvas-editor

    Canvas-based WYSIWYG rich text editor with advanced layout tools

    canvas-editor is a browser-based rich text editor that renders content using HTML5 Canvas and SVG instead of traditional DOM-based approaches. It is designed to provide a WYSIWYG editing experience similar to word processors, enabling precise control over layout, rendering, and document structure. canvas-editor supports a wide range of formatting and document features, including text styling, tables, images, and embedded elements, all managed through a structured data model. Its architecture is modular, allowing developers to extend functionality through plugins, custom commands, and event hooks. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    WhisperJAV

    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 16
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 17
    Aider

    Aider

    Aider is AI pair programming in your terminal

    Aider is an AI pair programming tool that runs directly in your terminal, helping developers build new projects or extend existing codebases faster and more confidently. It works alongside you like a coding partner, using powerful large language models to understand your code and implement precise changes. Aider creates a structured map of your entire repository, allowing it to handle large and complex projects effectively. It supports over 100 programming languages, making it flexible for nearly any development stack. With built-in Git integration, Aider keeps you in control by automatically committing clean, reversible changes. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    ContextGem

    ContextGem

    ContextGem: Effortless LLM extraction from documents

    ContextGem is an open-source framework designed to simplify the extraction of structured data and insights from documents using large language models (LLMs). It provides a flexible, intuitive API that minimizes boilerplate code, enabling developers to build complex extraction workflows efficiently. ContextGem supports various document formats and integrates with multiple LLM providers, making it a versatile tool for tasks like contract analysis, anomaly detection, and information retrieval.​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    koishi-plugin-novelai

    koishi-plugin-novelai

    Koishi plugin for NovelAI image generation with advanced controls

    A Koishi-based plugin that enables image generation through NovelAI, designed for chat-driven workflows and flexible prompt control. It supports multiple configuration options, including model switching, sampler selection, and adjustable image sizes, giving users control over output quality and style. It includes advanced prompt syntax to refine results and allows automatic translation of Chinese keywords to improve usability across languages. A customizable banned word list helps filter...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    Skyvern

    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    ...Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action. Support for proxies, with support for country, state, or even precise zip-code level targeting. Skyvern understands how to solve CAPTCHAs to complete complicated workflows. Support for authenticating into user accounts, including support for 2FA/TOTP. Extract data from workflows in any schema of your choice including CSV or JSON. Automate procurement pipelines, breeze through government forms, and complete workflows in any language.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    Actionbook

    Actionbook

    Browser action engine for AI agents. 10× faster, resilient by design

    Actionbook is an AI-centric automation framework that equips intelligent agents with the ability to interact with real live web pages in a reliable and scalable way, eliminating the guesswork involved in navigating modern dynamic sites. Instead of having agents blindly scrape HTML or blindly try to click things, Actionbook supplies up-to-date action manuals and verified DOM structure, letting agents know exactly how to click, type, and navigate complex interfaces such as SPAs or streaming...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 22
    Wan Move

    Wan Move

    Motion-controllable Video Generation via Latent Trajectory Guidance

    ...By representing motion information as dense point trajectories and integrating them into the latent space of an image-to-video model, the project produces videos with more precise and controllable motion behavior than many existing methods. Wan-Move is particularly notable for eliminating the need for additional motion encoders, instead directly infusing motion cues into spatiotemporal features, which simplifies both training and inference.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Sprite Fusion Pixel Snapper

    Sprite Fusion Pixel Snapper

    A tool to snap pixels to a perfect grid

    ...The tool works by adjusting sprite rendering coordinates and texture sampling so that every pixel aligns cleanly to the screen’s pixel grid, avoiding blurring, distortion, or unintended smoothing artifacts. This is especially important in pixel art games, retro-styled interactive media, or precise UI designs where crisp edges and predictable alignment are essential. SpriteFusion Pixel Snapper integrates with popular game engines and rendering pipelines to ensure that assets remain sharp across a broad range of resolutions and aspect ratios without requiring manual fiddling from artists or developers. It includes options for snapping modes, filtering overrides, and automatic correction detection that can be applied on export or at runtime.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    ...The repo may also include modules that integrate external computational tools (e.g. a CAS / computer algebra system) or calculator assistance backends to enhance correctness. Because math reasoning is a high bar for LLMs, DeepSeek-Math aims to showcase their model’s ability not just in natural text but in precise formal reasoning.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Qwen-2.5-VL

    Qwen-2.5-VL

    Qwen2.5-VL is the multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation...
    Downloads: 9 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB