Showing 98 open source projects for "documents"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Cloud tools for web scraping and data extraction Icon
    Cloud tools for web scraping and data extraction

    Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.

    Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
    Explore 10,000+ tools
  • 1
    TeXworks

    TeXworks

    A simple interface for working with TeX documents

    TeXworks is a free and simple working environment for authoring TeX (LaTeX, ConTeXt and XeTeX) documents. Inspired by Dick Koch's award-winning TeXShop program for Mac OS X, it makes entry into the TeX world easier for those using desktop operating systems other than OS X. It provides an integrated, easy-to-use environment for users on other platforms particularly GNU/Linux and Windows and features a clean, simple interface accessible to casual and non-technical users.
    Downloads: 99 This Week
    Last Update:
    See Project
  • 2
    OmegaT - multiplatform CAT tool

    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Leader badge
    Downloads: 1,848 This Week
    Last Update:
    See Project
  • 3
    DocScript is an approach to document preparation. It presents tools and utilities to edit and publish documents. The philosophy behind the DocScript project is to utilize the programming tools you're working with anyway in your daily work.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4

    joy of text

    Editor with scripting language, security features & system interfaces.

    Jot was developed general purpose editor for large CAD files. It's command-driven UI requires no mode switching and hence requires fewer keystrokes to get a typical job done. It is particularly useful for checking and cross-referencing between several source, intermediate and output files - a common requirement for CAD work. But jot's usefulness doesn't stop there. It's sophisticated search features can, for example, be used for interactive data mining or automating the extraction of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Turn traffic into pipeline and prospects into customers Icon
    Turn traffic into pipeline and prospects into customers

    For account executives and sales engineers looking for a solution to manage their insights and sales data

    Docket is an AI-powered sales enablement platform designed to unify go-to-market (GTM) data through its proprietary Sales Knowledge Lake™ and activate it with intelligent AI agents. The platform helps marketing teams increase pipeline generation by 15% by engaging website visitors in human-like conversations and qualifying leads. For sales teams, Docket improves seller efficiency by 33% by providing instant product knowledge, retrieving collateral, and creating personalized documents. Built for GTM teams, Docket integrates with over 100 tools across the revenue tech stack and offers enterprise-grade security with SOC 2 Type II, GDPR, and ISO 27001 compliance. Customers report improved win rates, shorter sales cycles, and dramatically reduced response times. Docket’s scalable, accurate, and fast AI agents deliver reliable answers with confidence scores, empowering teams to close deals faster.
    Learn More
  • 5
    EpiDoc: Epigraphic Documents in TEI XML

    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Leader badge
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Madedit-Mod

    Madedit-Mod

    MadEdit-Mod is a cross platform Text/Hex editor based on MadEdit

    Madedit-Mod is a cross platform text/hex editor base on MadEdit with a log of critical bug fix from me or other developers. A lot of new features were added, such as Drag-Drop Edit(cross platform), Highlight word, etc. The reason that I maintained this project is that the author of MadEdit had not worked on it for for a long time and I really like it and need more features. Find more information on Wiki pages. Currently supported Languages: English Chinese Simplified...
    Leader badge
    Downloads: 26 This Week
    Last Update:
    See Project
  • 7
    RTextDoc

    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking and built-in free dictionaries. RTextDoc has syntax highlighting, bracket matching, folding, document structure browser for sections and labels, bookmarks, manager for LaTeX symbols, an editor for mathematical equations,integrated BibTeX database manager and several tools to convert LaTeX to HTML and back. ...
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    The goal of this tool is to simplify and accelerate the process of creating bookmarks for DjVu and PDF documents. You can see additional information on the project page.
    Leader badge
    Downloads: 51 This Week
    Last Update:
    See Project
  • 9
    RefDB is a reference database and bibliography tool for SGML, XML, and LaTeX documents, sort of a Reference Manager or BibTeX for markup languages. It is portable and known to run on Linux, Free/NetBSD, OSX, Solaris, and Windows/Cygwin.
    Downloads: 15 This Week
    Last Update:
    See Project
  • Create and run cloud-based virtual machines. Icon
    Create and run cloud-based virtual machines.

    Secure and customizable compute service that lets you create and run virtual machines.

    Computing infrastructure in predefined or custom machine sizes to accelerate your cloud transformation. General purpose (E2, N1, N2, N2D) machines provide a good balance of price and performance. Compute optimized (C2) machines offer high-end vCPU performance for compute-intensive workloads. Memory optimized (M2) machines offer the highest memory and are great for in-memory databases. Accelerator optimized (A2) machines are based on the A100 GPU, for very demanding applications.
    Try for free
  • 10
    RText is a customizable programmer's text editor written in Java. Some of its features include: syntax highlighting, editing multiple documents at once, printing and print preview, find/replace/find in files dialogs, undo/redo, and online help.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    SimplyHTML is an application and a java component for rich text processing. It stores documents as HTML files in combination with Cascading Style Sheets (CSS). SimplyHTML is not intended to be used as an editor for web pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    fastText

    fastText

    Library for fast text classification and representation

    ...In order to build such classifiers, we need labeled data, which consists of documents and their corresponding categories (or tags, or labels).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    iText®, a JAVA PDF library

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. ...
    Leader badge
    Downloads: 174 This Week
    Last Update:
    See Project
  • 14
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Leader badge
    Downloads: 47 This Week
    Last Update:
    See Project
  • 15
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techinques. A command line executeable is shipped that allows to sort documents by codepage.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16

    Indexmeister

    automatic indexing for large LaTex documents

    Indexmeister reads a variety of formats (.tex, .docx, .epub, and others) and suggests keywords for indexing. The included program Imbrowse provides a semi-automatic interface to rapidly add index tags to multi-file latex documents.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    XMLStarlet is a set of command line utilities (tools) to transform, query, validate, and edit XML documents and files using simple set of shell commands in similar way it is done for text files with UNIX grep, sed, awk, diff, patch, join, etc utilities.
    Leader badge
    Downloads: 1,248 This Week
    Last Update:
    See Project
  • 18
    A Swiss Army Knife GUI application for PDF documents: combine, split, rotate, reorder (n-up, booklet), watermark, edit bookmarks/fileinfo/pagetransition, compress, encrypt, decrypt, sign, repair, edit attachments and more.
    Leader badge
    Downloads: 105 This Week
    Last Update:
    See Project
  • 19
    BVH

    BVH

    Bibliothèques Virtuelles Humanistes

    ...Il constitue, avec l'informatisation des Catalogues régionaux des incunables et l’élaboration de la base "De minute en minute" le volet recherche sur le document ancien du CESR. Il diffuse des fonds documents patrimoniaux et poursuit des recherches associant des compétences en sciences humaines et en informatique. Depuis 2008, il agrège plusieurs types de documents numériques : Une sélection de fac-similés d'ouvrages de la Renaissance numérisés en Région Centre et dans les établissements partenaires La base textuelle Epistemon, qui offre des transcriptions en XML-TEI Des transcriptions ou analyses de minutes notariales et des manuscrits. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20

    PanDocElectron

    Graphical User Interface for PanDoc for Linux, Mac & Windows

    PanDoc Graphical User Interface implemented with Electron for Linux, Mac and Windows. It support users in converting source documents into various other formats like docx, odt, html and reveal documentation. The zip files contain the full source code because PanDocElectron is written in HTML/Javascript. Electron is used more or less as browser that runs the HTML/Javascript application. [Download PanDocElectron](https://sourceforge.net/p/pandocelectron/wiki/Home/) Extract the zip-file from Downloads in your Documents folders (directory Documents/PanDoc). ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21

    WebDjVuTextEd

    Edit the OCR text layer of DjVu documents in a web browser

    WebDjVuTextEd allows to edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...) create, delete, edit text nodes, modify their container box by mouse, and run a spellchecker. The program does not directly read the DjVu files, it requires exported XML text data and images. When using without a webserver, you can open and save local files, but cannot take advantages of auto-save and spell checking.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    MindRaider

    MindRaider is a personal notebook and outliner.

    MindRaider is a personal notebook and outliner. Where do you keep private remarks like ideas, plans, gift tips and howtos? Loads of documents and remarks spread around the file system? Can you find a remark when you need it? No? Try MindRaider!
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    JGloss

    JGloss

    Add readings and translations to Japanese text

    JGloss lets you import Japanese text documents and add reading and translation annotations for words, both automatically during import, and manually. It is written in Java.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    A java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    writeup
    Programming language for converting source documents into HTML or XML. Writeup is a combination of a markup language (similar to markdown) and a macro pre-processing language that enables a formal production system to be set up for documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next