Showing 27 open source projects for "parse-stdf"

  • 1
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL database providers.
    Downloads: 5 This Week
  • 2
    Claude Code SDK Python

    Python SDK for Claude Agent

    ...The repo is MIT-licensed and includes documentation and installation instructions (requires Python 3.10+, Node installation of Claude Code). Example usage shows how to stream responses, parse structured message blocks, or create persistent client sessions.
    Downloads: 9 This Week
  • 3
    compromise

    Modest natural-language processing

    ...It works mainly by conjugating all forms of a basic word list. Decide how words get interpreted or make heavier changes with a compromise-plugin. Parse text without running POS-tagging. Pre-parse any match statements for faster lookups. It is not the most accurate or clever NLP library, but it has found its niche as an easy, small library that can run everywhere.
    Downloads: 1 This Week
  • 4
    Kor

    LLM

    This is a half-baked prototype that “helps” you extract structured data from text using LLMs. Specify the schema of what should be extracted and provide some examples. Kor will generate a prompt, send it to the specified LLM and parse out the output. You might even get results back.
    Downloads: 0 This Week
  • 5
    PaddleOCR-json

    Offline OCR image text recognition command-line program for Windows

    ...Projects and wrappers built around PaddleOCR-json demonstrate how it can be integrated into other applications, such as desktop OCR utilities or language-specific bindings, because the JSON output is easy to parse and consume.
    Downloads: 11 This Week
  • 6
    Code-Graph-RAG

    The ultimate RAG for your monorepo

    Code-Graph-RAG is an advanced retrieval-augmented generation system designed specifically for understanding and interacting with large, multi-language codebases by transforming them into structured knowledge graphs. It uses Tree-sitter to parse source code into abstract syntax trees, extracting relationships between functions, classes, and modules to build a graph-based representation of the entire codebase. This structured approach enables more accurate and context-aware querying compared to traditional text-based search methods, allowing users to ask natural language questions about code structure and functionality. ...
    Downloads: 3 This Week
  • 7
    Stable Diffusion Web UI Extensions

    Extension index for stable-diffusion-webui

    ...The index maintains short descriptions, tags, and repository links, enabling quick filtering by purpose or workflow. It also standardizes submission format so extension authors can contribute entries that the Web UI can parse reliably. For end users, this turns the Web UI into a modular platform where new features appear without manual cloning or guesswork. The project effectively coordinates a thriving plugin ecosystem, keeping discovery and updates lightweight and centralized.
    Downloads: 2 This Week
  • 8
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    ...It achieves state-of-the-art performance on document parsing benchmarks while maintaining a relatively compact model size, demonstrating efficiency without sacrificing accuracy. Beyond standard OCR tasks, it extends its capabilities to parse complex visual elements such as charts, diagrams, and web interfaces, converting them into structured outputs like SVG code.
    Downloads: 0 This Week
  • 9
    SemTools

    Semantic search and document parsing tools for the command line

    ...Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. SemTools can parse documents, build semantic embeddings, and perform similarity searches across datasets, making it useful for research, knowledge management, and AI-assisted coding workflows. The toolkit is designed to work well with modern AI pipelines, particularly those involving large language models that require structured knowledge retrieval.
    Downloads: 0 This Week
  • 10
    BettaFish

    Public opinion analysis system

    ...Unlike simpler analytics tools, BettaFish employs agent collaboration and a “forum” style internal mechanism to combine diverse model outputs, making the analysis richer and more robust. It also integrates multimodal processing, enabling it to parse images and video alongside text.
    Downloads: 0 This Week
  • 11
    Stanza

    Stanford NLP Python library for many human languages

    ...It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation with your own annotated data.
    Downloads: 1 This Week
  • 12
    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction...
    Downloads: 0 This Week
  • 13
    Token-Oriented Object Notation

    Token-Oriented Object Notation (TOON)

    ...This design allows prompts containing structured data to use significantly fewer tokens, which can reduce inference costs and improve efficiency in LLM applications. The project includes a formal specification, encoding rules, and reference implementations that developers can use to serialize and parse TOON data in their applications.
    Downloads: 0 This Week
  • 14
    Jimp

    An image processing library written entirely in JavaScript for Node

    An image processing library for Node written entirely in JavaScript, with zero native dependencies. If you're using this library with TypeScript, the method of importing differs slightly from JavaScript. Instead of using require, you must import it with the ES6 default import scheme. If you're using a web bundler (webpack, rollup, parcel), you can benefit from using the module build of jimp. Using the module build will allow your bundler to understand your code better and exclude things you aren't...
    Downloads: 0 This Week
  • 15
    MGIE

    Guiding Instruction-based Image Editing via Multimodal Large Language

    MGIE—Guiding Instruction-based Image Editing—demonstrates how a multimodal LLM can parse natural-language editing instructions and then drive image transformations accordingly. The project focuses on making edits explainable and controllable: the model interprets text guidance, reasons over image content, and outputs edits aligned with user intent. It’s positioned as an ICLR 2024 Spotlight work, with code and references that show how to connect language planning to concrete image operations. ...
    Downloads: 0 This Week
  • 16
    simpleaichat

    Python package for easily interfacing with chat apps

    ...The package emphasizes simplicity over heavy frameworks, making it ideal for scripts, notebooks, and small services that need LLMs without architectural lock-in. It supports structured responses and validation patterns so your app can reliably parse model outputs instead of wrestling with brittle free-text parsing. The project encourages clean separation between system prompts, user messages, and tool outputs to keep conversations predictable. With convenience helpers for logging, environment configuration, and retries, it reduces the friction of moving from a quick experiment to a reliable internal tool.
    Downloads: 0 This Week
  • 17
    Snips NLU

    Snips Python library to extract meaning from text

    Snips NLU is a Natural Language Understanding Python library that allows you to parse sentences written in natural language and extract structured information. It’s the library that powers the NLU engine used in the Snips Console, which you can use to create awesome and private-by-design voice assistants. The exact output is a bit richer; the point here is to give a glimpse of what kind of information can be extracted.
    Downloads: 0 This Week
  • 18
    Accord.NET Framework

    Machine learning, computer vision, statistics and computing for .NET

    The Accord.NET Framework is a .NET machine learning framework combined with audio and image processing libraries, written entirely in C#. It is a complete framework for building production-grade computer vision, computer audition, signal processing and statistics applications, even for commercial use. A comprehensive set of sample applications provides a fast start, and extensive documentation and a wiki help fill in the details. The Accord.NET project provides...
    Downloads: 1 This Week
  • 19
    Command Line Parser GetPot

    Tool to parse the command line and configuration files.

    Powerful command line and configuration file parsing for C++, Python, Ruby and Java (others to come). This tool provides many features, such as separate treatment for options, variables, and flags, unrecognized object detection, prefixes and much more.
    Downloads: 0 This Week
  • 20

    Lords Mobile Player Statistics

    Compare yourself with others by analyzing Lords Mobile Screenshots

    Lords Mobile Player Statistics (Lords Mobile Stats for short) is a Windows application that lets you parse Lords Mobile screenshots to extract player statistics for yourself and other players. After extracting the data, you can compare players and view the data as a large sortable table. You can also export the results as a text file or render the table as an image (for sharing with your guild, for example). This project is still in an early development stage; please read the Wiki (see menu above) for details about what is working and what is not. ...
    Downloads: 1 This Week
  • 21
    DeSR is a multilingual statistical dependency parser. It produces dependency parse trees for natural language sentences using a parsing model learned from annotated corpora.
    Downloads: 0 This Week
  • 22
    The MaxParser is written in C++ and can parse with first-, second-, third-, and fourth-order projective graph-based dependency parsing algorithms. The project is the new version of the project "Max-MSTParser". If you want to use this software for research, please reference this web address in your papers.
    Downloads: 0 This Week
  • 23

    Medical Treebank

    Community-based linguistic annotation work on clinical documents.

    This project hosts linguistic annotations and guidelines for clinical text. We plan to include several types of annotation (Token, POS and Parse) in WordFreak format on clinical notes originally from the i2b2/VA NLP challenges. The guidelines are copyrighted, but free for the community to use. Annotation in WordFreak format contains only linguistic labels and character offsets, and can be distributed independently from the note text. Instructions are provided for setting up WordFreak to align and visualize the annotations with the source text, which should be obtained through the official i2b2 data host https://www.i2b2.org/NLP/DataSets/Main.php.
    Downloads: 0 This Week
  • 24
    Program Q AIML is a C++ Qt-based library offering a simple API to parse AIML XML files and then interact with user input, with Latin/Arabic support (Unicode). AIML is a technology for building AI chatbots. A sample application is provided.
    Downloads: 0 This Week
  • 25
    ...Zlatan Mur's idea of creating rules is a unique way to connect (PMML) all sources of the rules. Ruledit is protected by copyright of Zlatan Mur. Graphical rule editor for JBoss Drools rules. Can be easily extended to parse rules for any other rule engine. Also includes a parser for HQL/SQL for rule testing on a database. Additional plugins for rule deployment can also be obtained.
    Downloads: 0 This Week