Showing 19 open source projects for "structured text"

  • 1
    OpenMed

    Open source healthcare AI

    OpenMed is an open-source healthcare AI and medical NLP toolkit designed to turn clinical text into structured insights using transformer-based models and production-oriented interfaces. Its core purpose is to provide specialized medical entity extraction, PII detection and de-identification, assertion-aware analysis, and related healthcare text processing capabilities without locking users into a proprietary platform. The project includes a curated registry of more than a dozen medical NER models focused on areas such as diseases, drugs, anatomy, genes, and protected health information, and it is built to support both research and deployment scenarios. ...
    Downloads: 20 This Week
  • 2
    npm-pdfreader

    Parse text and tables from PDF files.

    npm-pdfreader is a Node.js library for reading text and parsing tables from PDF files. It supports tabular data with automatic column detection and rule-based parsing, making it useful for extracting structured data from PDFs.
    Downloads: 0 This Week
  • 3
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do....
    Downloads: 6 This Week
  • 4
    visual-explainer

    Agent skill + prompt templates that generate rich HTML pages

    ...The project includes prompt templates and automation logic that enable coding agents to generate visual summaries such as diff reviews, architecture overviews, plan audits, and structured data tables. Its primary goal is to bridge the readability gap between raw machine output and stakeholder-friendly documentation. By producing styled web pages instead of plain text logs, visual-explainer improves communication in engineering and AI workflows where clarity is critical. The tool is particularly useful in environments that rely on autonomous agents or CI pipelines that generate dense technical output.
    Downloads: 12 This Week
  • 5
    theByteBook

    In-depth explanation of cloud native related technologies

    theByteBook is a large open-source repository that publishes a comprehensive technical book focused on high-availability system design, modern cloud-native infrastructure, and foundational engineering concepts, serving as both a learning resource and architecture reference. The content covers deep dives into networking principles, container ecosystems, Kubernetes, service meshes, distributed systems, and SRE/DevOps practices, aiming to help practitioners build reliable, scalable, and...
    Downloads: 0 This Week
  • 6
    Scanopy

    Clean network diagrams, One-time setup, zero upkeep

    Scanopy is a powerful multi-modal data capture and analysis toolkit that enables users to collect, process, and visualize structured and unstructured information from a variety of sources in a flexible pipeline. It is built to handle complex scanning tasks — such as OCR, document analysis, audio transcription, network data capture, and image extraction — while providing unified APIs and workflows that make managing heterogeneous data sources seamless. Developers can compose custom pipelines...
    Downloads: 2 This Week
  • 7
    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that is as good as or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 0 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 1 This Week
  • 9
    DataKit

    Connect processes into powerful data pipelines

    Connect processes into powerful data pipelines with a simple git-like filesystem interface. DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept, with a modern twist: streams of tree-structured data instead of raw text. DataKit allows you to define complex build pipelines over version-controlled data. DataKit is currently used as the coordination layer for HyperKit, the hypervisor component of Docker for Mac and Windows, and for the DataKitCI continuous integration system. src contains the main DataKit service. This is a Git-like database to which other services can connect. ci contains DataKitCI, a continuous integration system that uses DataKit to monitor repositories and store build results. ...
    Downloads: 0 This Week
  • 10
    Civilizer

    Civilizer - Tool to efficiently manage your data, knowledge, and ideas

    Civilizer is a dedicated personal knowledge management tool built with Java. You can use it as a high-level note application, document store, information hub, and more. With Civilizer's Markdown editor, you can edit text in Markdown format, but the editor provides many additional features, so you can edit your content in a more stylish fashion. You can structure and organize your data with tagging, relating, and bookmarking. Civilizer's full text search functionality is simple...
    Downloads: 1 This Week
  • 11
    Roman Life Manager

    Personal information manager. Free, open source. Powerful and simple.

    Personal information management application. Powerful and simple. Free and open source. Inspired by todos.txt and Markdown.
    Downloads: 0 This Week
  • 12
    Piggydb

    Piggydb helps you have more fun with knowledge creation.

    Piggydb is a flexible and scalable knowledge building platform that supports a heuristic or bottom-up approach to discover new concepts or ideas based on your input. You can begin with using it as a flexible outliner, diary or notebook, and as your database grows, Piggydb helps you to shape or elaborate your own knowledge. Piggydb is a Web application provided as a self-contained package that contains a Web server and database engine. With Piggydb, you can create highly structured content...
    Downloads: 0 This Week
  • 13
    GTE Note Taker

    A simple yet flexible portable plain text note taker for note addicts.

    This small application aims to help the user take structured plain text notes one at a time. By design, the application aims to make taking a note as simple as possible, without distracting bells and whistles. The application can also be used to record todos and journal entries. Notes are structured because they have consistent names ([type]-timestamp.txt), are stored in one configurable path, and, optionally, their contents follow predefined formats (see the cnf file).
    Downloads: 1 This Week
  • 14

    Arrows

    Software for creating arrow and ellipse shapefiles for GIS

    Software for creating arrow and ellipse shapefiles to visualize vectors, error ellipses, etc. Input is a structured text file. Output is a shapefile (an open specification from ESRI). The shapefile format is widely used in GIS software.
    Downloads: 0 This Week
  • 15
    The UIMA Annotator (called BRUTUS - Business Rules from Unstructured Text and Unstructured Sources) is a component for the UIMA Framework that allows for capturing business knowledge formalized in Structured English syntax (based on OMG's SBVR) with MOF
    Downloads: 3 This Week
  • 16
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 17
    RSane Publisher allows a team of authors to easily maintain shared documents. Publisher speeds authoring of large structured documents, especially technical, business, and reference materials. Runs on Java, JBoss, and Tomcat.
    Downloads: 0 This Week
  • 18
    WordXML is an extension for Microsoft Word that makes it easy to convert content into XML. Content can be structured and saved as XML with a document template, without any special technical knowledge. WordXML is written in Python.
    Downloads: 0 This Week
  • 19
    SystemSpecifyer is used to develop and maintain complex, well-structured system specifications in normal language. Using the System Matrix double-matrix relation-building mechanism, specs can be tested for completeness, correctness, and consistency.
    Downloads: 0 This Week