Showing 30 open source projects for "documents"

  • 1
    LibreWeb Browser

    Decentralized Web Browser

    ...Built-in easy-to-use editor (for publishing content without programming language knowledge). Decentralized (no single point of failure or censorship), like P2P, DHT, and IPFS. Versioning/revisions of content and documents (automatically solves broken 'links', which can no longer happen). Publishers should be able to add additional information about the document/page, e.g. title or path (similar to how Jekyll uses the YAML format for metadata).
    Downloads: 0 This Week
  • 2
    Mod_Rexx is an Apache loadable module which interfaces to Rexx. All phases of an Apache request can be processed with Mod_Rexx. It supports Open Object Rexx and Regina Rexx.
    Downloads: 0 This Week
  • 3
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
  • 4

    Z Notation E-Mail Mark-up Tools

    Tools to convert Z mark-up to HTML or text.

    A small library and two command-line tools to parse and convert Z notation from the "e-mail" mark-up into HTML code, or into UTF-8 text with box-drawing graphics, or into the Z Standard text format. See the project's Wiki Home Page for details --- the "Wiki" button in the bar above, or the following link:
    Downloads: 0 This Week
  • 5
    SportWire Web News
    SportWire is a website toolkit for high-speed, high-volume collection, transformation and redistribution of documents. SportWire is currently employed by xmlteam.com to translate multiple concurrent vendor news feeds into standard SportsML at over 20 documents per second.
    Downloads: 0 This Week
  • 6

    Django Live OS

    Django Live OS for building webapps using Django and MongoDB.

    This is Django Live, a live CD based on Debian stable (Squeeze 6/Lenny 5) that lets you set up, host, and test Django apps with ease. No worries about how to install Apache/Python/MySQL/Django: just fill it, shut it, and go on. LAMP made easy :)
    Downloads: 0 This Week
  • 7
    Hypermail is a program that takes a file of mail messages in UNIX mailbox format and generates a set of cross-referenced HTML documents. Development of Hypermail now continues on GitHub: https://github.com/hypermail-project/hypermail
    Downloads: 1 This Week
  • 8

    LinkChecker

    check links in web documents or full websites

    New Homepage: http://wummel.github.io/linkchecker/
    LinkChecker features:
    - recursive and multithreaded checking and site crawling
    - output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
    - HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
    - restrict link checking with regular expression filters for URLs
    - proxy support
    -...
    Downloads: 2 This Week
  • 9
    This project addresses the need for managed information and documents in the treatment of patients, i.e. integrated care. The underlying data model implements standards for electronic health records; the project is based on an SQL database and the Java EE framework. This is currently a student-driven project for educational purposes.
    Downloads: 0 This Week
  • 10

    HXPath

    XPath HTML parser

    HXPath is a command line tool useful to extract data from HTML documents. HXPath can select sub trees, like the standard xpath tool, but is also able to read contents and attributes and output them in a bash friendly format. HTML Tidy and HTTP/HTTPS get are built in too.
    Downloads: 0 This Week
  • 11
    Transform XML documents using XSL style-sheets. Process embedded blocks of XXSLT (XSLT + include directive) commands in any document: download XML data and define XSL style-sheets. Insert downloaded content directly into the page source. Cache support.
    Downloads: 0 This Week
  • 12
    An Apache module for performing server-side content negotiation of XHTML documents. Compatible clients will be sent the correct Content-Type header; older or non-compliant clients will be sent text/html.
    Downloads: 0 This Week
  • 13
    Irudiko is a library written in C++ for generating Locality Sensitive Hashing sketches from any textual or web document. Mainly designed to work with HTML pages, it also has optimization support for English and Italian documents.
    Downloads: 0 This Week
  • 14
    Anastasia is an SGML/XML publication tool which allows the processing and searching of large documents using Tcl scripting. See https://github.com/peterrobinson/Anastasia2 for updated code, etc., for running in Apache 2 environments.
    Downloads: 0 This Week
  • 15
    The SBus is a family of high-speed packet-based databus standards, suitable for both networking and interdevice communication. They are optimized for high data density transactions. This project creates and documents the standards, schematics, and driver
    Downloads: 0 This Week
  • 16
    CMSpider is a Java application that, by "spidering" a hierarchically ordered collection of wiki documents, generates a hierarchical site with a tree menu (without frames). It's tailored for personal web-notebooks.
    Downloads: 0 This Week
  • 17
    Produces an alphabetical index for a document repository using SWISH-E. Index files are analysed with WordNet to produce a theme list, which is used in searches to find documents. Theme words in documents are automatically hyperlinked to a list of references.
    Downloads: 0 This Week
  • 18
    The WBXML Library is a C library for handling WBXML (Wireless Binary XML) documents. It consists of a WBXML Parser (with a SAX-like interface), a generic WBXML Encoder, and an internal representation of the document (WBXMLTree).
    Downloads: 5 This Week
  • 19
    Vade Mecum is a feature-packed Pocket PC-based viewer for Plucker (http://www.plkr.org) documents, supporting text selection, highlighting, annotations, and more.
    Downloads: 0 This Week
  • 20
    Estraier is a personal full-text search system for web sites, local file systems, mailboxes, and so on. Estraier has a flexible interface and can handle multilingual documents and various file formats with external plug-ins.
    Downloads: 0 This Week
  • 21
    The Infomap NLP software performs automatic indexing of words and documents from free-text corpora, using a variant of LSA to enable information retrieval and other applications. It was developed by the Infomap Project at Stanford University's CSLI.
    Downloads: 0 This Week
  • 22
    Xtree is a Document Object Model XML extension library for PHP (written in C) that is extremely fast, simple, and efficient. With this extension, loading, saving, and manipulating XML documents couldn’t be easier. An XPath Interpreter is also included.
    Downloads: 0 This Week
  • 23
    Tools, programs, snippets, documents from Edwin `MavEtJu' Groothuis (ddc dhcping dhcpdump morse daychooser radius ngrep-lib mavbiff httpgrabber ipfw-graph dnstracer)
    Downloads: 0 This Week
  • 24
    KINg (KINg Is Not google!) is an effort to create a smart search engine, initially intended not for the web but for documents in electronic format on our own machines.
    Downloads: 0 This Week
  • 25
    ApMl provides users with the ability to crawl the web and download pages to their computer in a directory structure suitable for a Machine Learning system to both train itself and classify new documents. Classification Algorithms include Naive Bayes, KNN
    Downloads: 0 This Week