Showing 16 open source projects for "content analysis"

  • 1
    PDF4QT

    Open source PDF editor

    PDF4QT is an open source PDF editor for Windows and Linux, based on the Qt framework. It is a modern solution for viewing, editing, and rendering PDF documents, for users and developers alike. For developers, there is a C++ library and a command line tool for use in scripts. For users, there are four applications offering many features. The project is hosted on GitHub and...
    Downloads: 53 This Week
  • 2
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    ...The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced PDF features such as adding CMS (PKCS#7) digital signatures, modifying content streams and metadata, and working with encryption and permissions based on standard PDF security models. It includes tools for parsing PDF internals like cross-reference tables and objects, providing fine-grained document analysis capabilities. The project is unit-tested with continuous integration pipelines, supporting sanitizers for enhanced code quality and stability.
    Downloads: 2 This Week
  • 3
    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical...
    Downloads: 7 This Week
  • 4
    LOL HTML

    Low output latency streaming HTML parser/rewriter with CSS API

    Low Output Latency streaming HTML rewriter/parser with CSS-selector based API. It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks. The parser switches back to the tag scanner...
    Downloads: 0 This Week
  • 5
    TAPClean is a Commodore tape preservation / restoration tool. It will check, repair, and remaster Commodore 64 and VIC 20 TAP or DC2N DMP files (tape images).
    Downloads: 12 This Week
  • 6

    xsd2pgschema

    Relational database replication tool based on XML Schema

    ...PgSchema server, a serialized relational data model server, can be used to speed up the analysis of complex XML Schema. Large XML files can be split with xmlsplitter, a flexible XML splitter based on XPath and StAX.
    Downloads: 2 This Week
  • 7
    javahexeditor Java Hex Editor

    A hex editor Eclipse plugin and multi-platform desktop application

    You can install the latest Eclipse plugin version from the update site https://javahexeditor.sourceforge.io/update or the Eclipse Marketplace https://marketplace.eclipse.org/content/java-hex-editor. Older versions of the Eclipse plugin are available via the update site for that version (for example, https://javahexeditor.sourceforge.io/update/0.5.1). You can download the latest stand-alone version and older versions on the "Files" tab.
    Downloads: 0 This Week
  • 8
    LaTeX Reference Card Creator

    A Makefile based build system for creating LaTeX reference cards

    LaTeX Reference Card Creator is a Makefile based build system for creating reference cards. LaTeX Reference Card Creator compiles content into PDF, DjVu, TEX DVI, HTML and PostScript output formats. A three column reference card will be created. Features include batch image format conversions, spell checking, broken link checking, automatic backups and .zip and .tar.gz distribution building. LaTeX Reference Card Creator provides many LaTeX examples which can be used to make a reference card.
    Downloads: 0 This Week
  • 9
    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document, stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It is aimed at content analysis, web scraping, building datasets, or repurposing article text for downstream processing (such as machine learning or summarization).
    Downloads: 0 This Week
  • 10
    LaTeX Web Publisher

    LaTeX Web Publisher is a Makefile based Web publishing system

    LaTeX Web Publisher is a Makefile based Web publishing system featuring content creation into HTML, non-split HTML, HTML Zip, PDF, DjVu, PostScript, DVI and Plain text formats. All LaTeX Web Publisher output formats are from a single LaTeX source and have indices. LaTeX Web Publisher can be used for website creation and has FTP deployment capabilities. A website created with LaTeX Web Publisher will have HTML, non-split HTML and PDF content formats.
    Downloads: 0 This Week
  • 11

    XML Content Provider

    Plugin to connect XML data sources to the GIN Server

    The GIN Server is a semantic middleware for efficient "bottom-up" data integration and automated semantic analysis for dynamically linked data. The XML Content Provider is a configurable plugin for the GIN Server to integrate any XML data source with a simple structure.
    Downloads: 0 This Week
  • 12

    RDF Content Provider for iQser GIN

    Plugin to connect RDF sources with the GIN Server

    GIN Server is a semantic middleware for easy data integration and automated analysis. The extensible architecture allows plugging in data sources, analytics and event handling. This RDF Content Provider enables access to Semantic Web content as an RDF file or SPARQL endpoint.
    Downloads: 0 This Week
  • 13
    XmlView
    GUI utility in pure Java for viewing and editing XML content; an example of an application built with Superficial (http://superficial.sourceforge.net).
    Downloads: 0 This Week
  • 14
    Yet another lightweight HTTP server. Implements a few HTML rewriting rules delineated by '<?pico>' tags, and makes it fairly easy to add more straight into the C code. Does not have complete HTTP standards compliance; if you need foo, add it in!
    Downloads: 0 This Week
  • 15
    XMPP Web Services for Java (XWS4J) is an implementation of machine-to-machine communication over XMPP. The communicated content is encoded in XML, according to customized definitions of input and output in W3C XML Schemata.
    Downloads: 0 This Week
  • 16
    Scan, the Semantic Content ANnotator, is a semantic pipeline that helps connect information extraction tools to a semantic database. UIMA-based, it allows easy plugin writing for information extraction, ontology control, and storage in RDF repositories.
    Downloads: 0 This Week