Showing 15 open source projects for "text to html"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Automatic text summarizer

    Automatic text summarizer

    Module for automatic summarization of text documents and HTML pages

    Sumy is an automatic text summarization library that provides multiple algorithms for extracting key content from documents and articles. Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains a simple evaluation framework for text summaries. Implemented summarization methods are described in the documentation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Text Generation Web UI

    Text Generation Web UI

    Oobabooga - The definitive Web UI for local AI, with powerful features

    ...Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS). Very efficient text streaming. Parameter presets, 8-bit mode. Layers splitting across GPU(s), CPU, and disk. CPU mode, FlexGen, DeepSpeed ZeRO-3, API with streaming and without streaming. ...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 3
    Chandra

    Chandra

    OCR model for complex documents with layout-aware structured outputs

    Chandra is an advanced OCR model designed to extract and structure information from complex documents such as tables, forms, handwritten notes, and mathematical content. It focuses on preserving full document layout, meaning that extracted text is accompanied by positional metadata like bounding boxes for each element. Chandra supports multiple output formats including Markdown, HTML, and JSON, making it suitable for downstream processing and integration into data pipelines. It is capable of handling over 40 languages and is optimized to read difficult inputs such as messy handwriting and multi-column layouts. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    shuyuan

    shuyuan

    Reading book source

    ...For learners, researchers, or avid readers, Shuyuan offers a way to bridge from plain text files or eBooks into a manageable, interactive resource — one where notes, references, and reading progress can be tracked. It likely supports different input formats (text, HTML, PDF), and may integrate optional translation or text normalization tools.
    Downloads: 0 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Label Studio

    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ...Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    PasteMD

    PasteMD

    Paste Markdown and AI responses into Word Excel instantly fast

    ...It primarily targets users who frequently copy content from AI chat tools or web pages and encounter formatting issues, especially with Markdown, tables, and LaTeX formulas. PasteMD operates from the system tray and monitors clipboard content, automatically converting Markdown or HTML into properly formatted documents using Pandoc. With a single global hotkey, users can paste structured content directly into the active application without manual cleanup or reformatting. It includes intelligent detection mechanisms that distinguish between Markdown tables, rich HTML content, and plain text, ensuring the correct output format is used for each target application. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    LangChain Extract

    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents. Built using FastAPI and the LangChain framework, the application exposes a REST API that can process documents and return structured outputs that match user-defined JSON schemas. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    LlamaParse

    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured, and semi-structured, to structured data (API's, PDFs, documents, SQL, etc.) Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MCP UI

    MCP UI

    SDK for building interactive UI components over MCP for AI tools

    ...It enables developers to create rich, dynamic UI components that can be delivered from an MCP server and rendered seamlessly by a compatible client. Instead of returning only text responses, tools can provide structured UI resources such as HTML or remote-rendered components, allowing more engaging and functional interactions. mcp-ui introduces a standardized approach where tools and their associated interfaces are linked through metadata, enabling clients to automatically discover and display the correct UI. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Unstructured.IO

    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. unstructured modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and is efficient in transforming unstructured data into structured outputs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    LexiFinder

    LexiFinder

    AI-powered semantic indexing: automating the creation of book indexes

    ...Both interfaces share the same underlying engine and support the same features. Supported input formats are PDF, DOCX, and ODT. The index can be exported as plain text, JSON, CSV, or HTML.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Pattern

    Pattern

    Web mining module for Python, with tools for scraping

    Pattern is an open-source Python library that provides tools for web mining, natural language processing, machine learning, and network analysis. The project integrates multiple capabilities into a single framework that allows developers to collect, process, and analyze textual data from the web. It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases. In addition to data mining...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Text2Video

    Text2Video

    Software tool that converts text to video for more engaging experience

    ...I created a prototype web application that takes text as an input and generates a video as an output. I plan to further work on the project targeting young college students who are aged between 18 to 23 because they tend to prefer learning through videos over books based on the survey I found. The technologies I used for the project are HTML, CSS, Javascript, Node.js, CCapture.js, ffmpegserver.js, Amazon Polly, Python, Flask, gevent, spaCy, and Pixabay API.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    TexLexAn is an open source text analyser for Linux, able to estimate the readability and reading time, to classify and summarize texts. It has some learning abilities and accepts html, doc, pdf, ppt, odt and txt documents. Written in C and Python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB