Search Results for "character recognition source code" - Page 2

Showing 90 open source projects for "character recognition source code"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • 1
    Face Alignment

    Face Alignment

    2D and 3D Face alignment library build using pytorch

    Detect facial landmarks from Python using the world's most accurate face alignment network, capable of detecting points in both 2D and 3D coordinates. Build using FAN's state-of-the-art deep learning-based face alignment method. For numerical evaluations, it is highly recommended to use the lua version which uses identical models with the ones evaluated in the paper. More models will be added soon. By default, the package will use the SFD face detector. However, the users can alternatively...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    StoryMem

    StoryMem

    Official code for StoryMem: Multi-shot Long Video Storytelling

    StoryMem is a narrative-focused memory accumulation system that lets users build, store, and reference past conversational context or story elements with an AI, effectively enabling the AI to maintain and recall personalized story memories or character arcs over time. Instead of treating each interaction as stateless, it tracks user-defined memory nodes, tags, and story threads so that future interactions can draw on established narrative context like character traits, past events, or...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    FAY

    FAY

    Framework for building AI-powered interactive digital humans and agent

    Fay is an open source framework designed to build and deploy interactive digital humans powered by large language models. It acts as a middleware layer that connects digital character technologies with conversational AI systems and business applications. Fay supports various types of digital humans, including 2.5D and 3D avatars, and can be integrated with applications running on mobile devices, PCs, web platforms, and embedded systems.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Best-of Machine Learning with Python

    Best-of Machine Learning with Python

    A ranked list of awesome machine learning Python libraries

    This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! General-purpose machine learning and deep learning...
    Downloads: 1 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    PRML

    PRML

    PRML algorithms implemented in Python

    PRML repository is a respected and well-maintained project that implements the foundational algorithms from the famous textbook Pattern Recognition and Machine Learning by Christopher M. Bishop, providing a practical and accessible Python reference for both students and professionals. Rather than just summarizing concepts, the repository includes working code that demonstrates linear regression and classification, kernel methods, neural networks, graphical models, mixture models with EM...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    PersonaPlex

    PersonaPlex

    PersonaPlex code

    PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    EasyOCR

    EasyOCR

    Ready-to-use OCR with 80+ supported languages

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. EasyOCR is a python module for extracting text from image. It is a general OCR that can read both natural scene text and dense text in document. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first...
    Downloads: 38 This Week
    Last Update:
    See Project
  • 8
    Advanced NLP with spaCy

    Advanced NLP with spaCy

    Advanced NLP with spaCy: A free online course

    Advanced NLP with spaCy is an open-source educational repository that provides the materials for an interactive course on advanced natural language processing using the spaCy library. The course is designed to teach developers how to build real-world NLP systems by combining rule-based techniques with machine learning models. The repository includes lessons, exercises, and examples that guide learners through tasks such as tokenization, named entity recognition, text classification, and training custom NLP models. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    latexcv

    latexcv

    A collection of cv and resume templates written in LaTeX

    A collection of user-friendly LaTeX CV and résumé templates (packaged within the R Markdown vitae ecosystem), offering simple themes and templates for creating professional CVs without heavy TeX coding. Supports multiple display themes such as classic, modern, sidebar layouts.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 10
    Fish Speech

    Fish Speech

    SOTA Open Source TTS

    Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of advanced TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving high-quality TTS, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and achieve exceptionally low word and character error rates, indicating strong intelligibility and alignment between text and audio. ...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 11
    dots.ocr

    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    CutLER

    CutLER

    Code release for Cut and Learn for Unsupervised Object Detection

    CutLER is an approach for unsupervised object detection and instance segmentation that trains detectors without human-annotated labels, and the repo also includes VideoCutLER for unsupervised video instance segmentation. The method follows a “Cut-and-LEaRn” recipe: bootstrap object proposals, refine them iteratively, and train detection/segmentation heads to discover objects across diverse datasets. The codebase provides training and inference scripts, model configs, and references to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    FLUX.2

    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels),...
    Downloads: 50 This Week
    Last Update:
    See Project
  • 14
    minbpe

    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    minbpe is a minimal, clean implementation of byte-level Byte Pair Encoding (BPE), the tokenization approach widely used in modern language models. It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    latexify

    latexify

    A library to generate LaTeX expression from Python code

    latexify_py converts small, math-heavy pieces of Python code into human-readable LaTeX that mirrors the intent of the computation, not just its surface syntax. It parses Python functions and expressions into an abstract syntax tree (AST), applies symbolic rewrites for common mathematical constructs, and then emits LaTeX that compiles cleanly in standard environments. Typical use cases include turning analytical utilities—like probability mass functions, activation formulas, or recurrence...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    ktrain

    ktrain

    ktrain is a Python library that makes deep learning AI more accessible

    ktrain is a Python library that makes deep learning and AI more accessible and easier to apply. ktrain is a lightweight wrapper for the deep learning library TensorFlow Keras (and other libraries) to help build, train, and deploy neural networks and other machine learning models. Inspired by ML framework extensions like fastai and ludwig, ktrain is designed to make deep learning and AI more accessible and easier to apply for both newcomers and experienced practitioners. With only a few lines...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    Claude-Flow

    Claude-Flow

    The leading agent orchestration platform for Claude

    Claude-Flow v2 Alpha is an advanced AI orchestration and automation framework designed for enterprise-grade, large-scale AI-driven development. It enables developers to coordinate multiple specialized AI agents in real time through a hive-mind architecture, combining swarm intelligence, neural reasoning, and a powerful set of 87 Modular Control Protocol (MCP) tools. The platform supports both quick swarm tasks and persistent multi-agent sessions known as hives, facilitating distributed AI...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Jittor

    Jittor

    Jittor is a high-performance deep learning framework

    Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators. The whole framework and meta-operators are compiled just in time. A powerful op compiler and tuner are integrated into Jittor. It allowed us to generate high-performance code specialized for your model. Jittor also contains a wealth of high-performance model libraries, including image recognition, detection, segmentation, generation, differentiable rendering, geometric learning, reinforcement...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    Open Model Zoo

    Open Model Zoo

    Pre-trained Deep Learning models and demos

    Open Model Zoo is a large repository of high-quality pre-trained deep learning models and demonstration applications designed to work with the OpenVINO™ toolkit, offering a comprehensive starting point for a wide range of AI and computer vision workloads. It includes hundreds of models covering object detection, classification, segmentation, pose estimation, speech recognition, text-to-speech, and more, many of which are already converted into formats optimized for inference on CPUs, GPUs,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Lingvo

    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    vJEPA-2

    vJEPA-2

    PyTorch code and models for VJEPA2 self-supervised learning from video

    VJEPA2 is a next-generation self-supervised learning framework for video that extends the “predict in representation space” idea from i-JEPA to the temporal domain. Instead of reconstructing pixels, it predicts the missing high-level embeddings of masked space-time regions using a context encoder and a slowly updated target encoder. This objective encourages the model to learn semantics, motion, and long-range structure without the shortcuts that pixel-level losses can invite. The...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Hiera

    Hiera

    A fast, powerful, and simple hierarchical vision transformer

    Hiera is a hierarchical vision transformer designed to be fast, simple, and strong across image and video recognition tasks. The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation and fine-tuning on standard benchmarks. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 24
    OculiX

    OculiX

    Visual Automation IDE — automate anything you see on screen

    OculiX is the evolution of SikuliX, actively maintained with the full agreement of its original creator RaiMan. Automate any desktop application using image recognition (OpenCV) and OCR (Tesseract + PaddleOCR). No access to source code or DOM required — if you can see it, you can automate it. Key features: - Guided step-by-step recorder with live code preview - Image recognition via OpenCV 4.10 - Dual OCR: Tesseract (built-in) + PaddleOCR (neural, high precision) - Local and remote automation via integrated VNC - SSH tunnels via embedded JSch - Cross-platform: Windows, macOS (Apple Silicon M1-M4), Linux - Scripting: Jython, JRuby, Java, PowerShell, AppleScript - Java 17 recommended (Java 8+ supported) - Full CI/CD with automated builds for all platforms Used worldwide for test automation, RPA, and visual regression testing. ...
    Downloads: 36 This Week
    Last Update:
    See Project
  • 25
    KeyTik: The All-in-One Automation Tool

    KeyTik: The All-in-One Automation Tool

    A Powerful Multi-Profile Key Mapper, Clicker, Macro, and More

    KeyTik is a Python program that uses AutoHotkey to handle many things, including a powerful key mapper and various macros such as clickers and more. It comes with comprehensive key support including ASCII, ANSI, Unicode, Scan Code, Virtual Keyboard Code, and more. KeyTik is also packed with features like Bind to Programs and Devices, Assign Shortcuts, Text Format, Hold Format, and more. KeyTik Pro is available with more feature for $20. Take 25% Off for 10 people only. Check out...
    Downloads: 15 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB