Showing 20 open source projects for "html parse c"

  • 1
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 2 This Week
  • 2
    HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.
    Downloads: 0 This Week
  • 3
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 1 This Week
  • 4
    DeSR is a multilingual statistical dependency parser. It produces dependency parse trees for natural language sentences using a parsing model learned from annotated corpora.
    Downloads: 0 This Week
  • 5
    Geoportal Server
    Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
    Downloads: 1 This Week
  • 6
    Hypermail is a program that takes a file of mail messages in UNIX mailbox format and generates a set of cross-referenced HTML documents. Development of Hypermail now continues on GitHub: https://github.com/hypermail-project/hypermail
    Downloads: 1 This Week
  • 7

    HXPath

    XPath HTML parser

    HXPath is a command line tool for extracting data from HTML documents. HXPath can select subtrees, like the standard xpath tool, but can also read contents and attributes and output them in a bash-friendly format. HTML Tidy and HTTP/HTTPS GET are built in too.
    Downloads: 0 This Week
  • 8
    webchanges can keep track of user-defined important changes to (X)HTML-based webpages.
    Downloads: 0 This Week
  • 9
    A utility to extract meta-information (properties/comments) out of various file-types; e.g. HTML, PDF, RTF & various Office documents; OGG/MP3 files and JPEG/PNG/GIF images, which can be presented in various output formats (HTML, XML, LaTeX & plain t
    Downloads: 0 This Week
  • 10
    SWISH++ is a Unix-based file indexing and searching engine (typically used to index and search files on web sites). It's very fast, robust, and can index several file formats including text, HTML, mail, news, LaTeX, and MP3, and apply filters.
    Downloads: 2 This Week
  • 11
    Irudiko is a library written in C++ for generating Locality Sensitive Hashing sketches from any textual or web document. Designed mainly to work with HTML pages, it also has optimization support for English or Italian documents.
    Downloads: 0 This Week
  • 12
    SWISH-Enhanced is a fast, powerful, *flexible*, free, and easy to use system for indexing collections of Web pages or other files. Key features include the ability to limit searches to certain HTML tags (META, TITLE, comments, etc.).
    Downloads: 0 This Week
  • 13
    CitemaPP is a Google Sitemap generator written in C++. Instead of crawling the html-doc directory on your server, CitemaPP crawls the content of your server via the HTTP protocol.
    Downloads: 0 This Week
  • 14
    This is a simple command line tool that solves the problem of full mailboxes with stuff you don't want to lose. It fetches all the mail from any POP3 mailbox account and generates a searchable HTML archive on your local hard drive. OS: Unix/Linux
    Downloads: 0 This Week
  • 15
    Command line HTML parser to be used in scripts to extract data from an HTML page according to a supplied path and options. Useful for systematic, periodic parsing of pages with known structures where the information keeps changing, such as looking for an item on eBay
    Downloads: 0 This Week
  • 16
    An Apache2 DSO module search engine based on the Swish-e C API that returns results by replacing tags in a user-supplied HTML template. Anyone with Swish-e knowledge and the ability to generate a Swish-e index file should find the searchm interface familiar.
    Downloads: 0 This Week
  • 17
    High-performance software for information retrieval research. Emphasis on semi-structured text retrieval, especially for HTML and XML. The goal is to facilitate information retrieval research by providing an interchangeable toolkit of functions.
    Downloads: 0 This Week
  • 18
    GoldSeeker is a small application for extracting formatted data. It can parse information from a text, HTML or other file and export it to a database.
    Downloads: 0 This Week
  • 19
    Automagically categorize content into configurable/trainable taxonomies. This allows you to define taxonomies and feed HTML documents into the categorization engine.
    Downloads: 0 This Week
  • 20
    ICECrawler is a WWW crawler and map generator intended to help understand and analyze links between websites and web documents.
    Downloads: 0 This Week