Showing 34 open source projects for "html parser c"

  • 1

    Tesseract OCR

    Open Source OCR Engine

    ... various output formats, including plain text, HTML, PDF and more. It also has unicode (UTF-8) support.
    Downloads: 1,391 This Week
  • 2
    BudouX

    Standalone, small, language-neutral

    Standalone. Small. Language-neutral. BudouX is the successor to Budou, the machine learning-powered line break organizer tool. It is standalone. It works with no dependency on third-party word segmenters such as Google cloud natural language API. It is small. It takes only around 15 KB including its machine learning model. It's reasonable to use it even on the client-side. It is language-neutral. You can train a model for any language by feeding a dataset to BudouX’s training...
    Downloads: 0 This Week
  • 3
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 0 This Week
  • 4
    DotVVM

    Open source MVVM framework for Web Apps

    DotVVM is an open-source framework for ASP.NET. It lets you create web apps using the MVVM pattern, with just C# and HTML. DotVVM can be used to build new ASP.NET Core web apps, or to modernize legacy ASP.NET apps and migrate them to .NET 5. Save your time with GridView, FileUpload and other components shipped with the framework. Don't spend the time building an API. Just load data from the database and use data-binding to display them. DotVVM needs less than 100 kB of JavaScript code. It's...
    Downloads: 0 This Week
  • 5
    pdf-extractor

    Node.js module for rendering pdf pages to images, svgs and HTML files

    Pdf-extractor is a wrapper around pdf.js to generate images, svgs, html files, text files and json files from a pdf on node.js. A DOM Canvas is used to render and export the graphical layer of the pdf. Canvas exports *.png as a default but can be extended to export to other file types like .jpg. Pdf objects are converted to svg using the SVGGraphics parser of pdf.js. Pdf text is converted to HTML. This can be used as a (transparent) layer over the image to enable text selection. Pdf text...
    Downloads: 1 This Week
  • 6
    SPM - Monitoring system

    Monitoring Tool for your IT Environment

    SPM Monitoring System - Complete Solution for Efficient Monitoring and Alerting SPM Monitoring System is an all-in-one monitoring solution for IT environments that offers comprehensive features to ensure high availability, stability, and optimal performance of your infrastructure. With SPM Monitoring Systems, you can monitor your network, servers, applications, and services with ease, and receive timely alerts when issues arise. Host Availability Monitoring. Agent-Based Monitoring:...
    Downloads: 16 This Week
  • 7
    TXM

    Unicode-XML-TEI text/corpus analysis platform

    TXM is a free and open-source cross-platform Unicode & XML based text/corpus analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful...
    Downloads: 15 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing in C++17/20

    DocWire SDK, a standout C++17/20 data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. The upcoming integration of C++17 and C++20 will bring advanced functionalities, particularly in areas like HTTP capabilities and web data extraction. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document...
    Downloads: 2 This Week
  • 9
    libpostal

    A C library for parsing/normalizing street addresses around the world

    libpostal is a C library for parsing/normalizing street addresses around the world, powered by statistical NLP and open geo data. The goal of this project is to understand location-based strings in every language, everywhere. Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services, check-ins...
    Downloads: 0 This Week
  • 10

    cuneiformplus

    Fork of OCR software cuneiform

    Fork of the OCR software Cuneiform. For the original software, see https://launchpad.net/cuneiform-linux (by Cognitive Technologies and Jussi Pakkanen). Other open source OCR projects: Tesseract by Ray Smith (using the Leptonica image library), GOCR, and OCRAD.
    Downloads: 3 This Week
  • 11
    perkun

    two experimental AI languages + zubr

    Two experimental AI languages: Perkun and its successor Wlodkowic. They attempt to maximize the expected value of the payoff function by appropriately choosing the actions (output variable values). The package also contains a tool called zubr, a Java code generator based on Perkun. Also take a look at my blog: http://pawel-biernacki.blogspot.fi/ For Windows users there is an installer: http://www.pawelbiernacki.net/perkun.msi
    Downloads: 0 This Week
  • 12
    AIML_chung

    an AIML chatbot engine with 3D avatars, maths parser, speech and dll

    AIML chung is a full AIML 1.0-based standalone chat bot engine trial with a DLL, TTS / eSpeak speech voices, synonym substitutions, a maths parser and 3D photorealistic OpenGL avatars, written in compiled FreeBASIC. Comes with GUI window and console examples, a 3D world mode and a DLL version to use with other programming languages like C++ or Liberty BASIC, or to easily embed in your applications. Talk with your A.I. computer.
    Downloads: 0 This Week
  • 13
    SLING

    A natural language frame semantics parser

    The aim of the SLING project is to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion, e.g. adding facts mentioned in Wikipedia (and other sources) to the Wikidata knowledge base. We use frame semantics as a common representation for both knowledge representation and document annotation. The SLING parser can be trained to produce frame semantic representations of text directly without any explicit intervening linguistic representation...
    Downloads: 0 This Week
  • 14
    Command Line Parser GetPot

    Tool to parse the command line and configuration files.

    Powerful command line and configuration file parsing for C++, Python, Ruby and Java (others to come). This tool provides many features, such as separate treatment for options, variables, and flags, unrecognized object detection, prefixes and much more.
    Downloads: 1 This Week
  • 15
    DeSR is a multilingual statistical dependency parser. It produces dependency parse trees for natural language sentences using a parsing model learned from annotated corpora.
    Downloads: 0 This Week
  • 16
    The MaxParser is written in C++ and can parse with first, second, third and fourth order projective graph-based dependency parsing algorithms. The project is the new version of the project "Max-MSTParser". If you want to use this software for research, please reference this web address in your papers.
    Downloads: 0 This Week
  • 17
    Azul OS

    Azul OS version dev(Linux) IA

    ... 0.4.1. Available. [changelog] Software added: php5-mysql gcc-c++ php5-gd php5-ctype perl-HTML-Tagset php5-zip php5-curl kernel-source mysql-connector-java php5-pear php5-mcrypt php5-ftp devel_C_C++ gimp gedit recode libreoffice MozillaFirefox wireshark audacity nano. This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License. Blog: http://azul0.wordpress.com/
    Downloads: 0 This Week
  • 18
    TexLexAn is an open source text analyser for Linux, able to estimate the readability and reading time, to classify and summarize texts. It has some learning abilities and accepts html, doc, pdf, ppt, odt and txt documents. Written in C and Python.
    Downloads: 1 This Week
  • 19
    VitoshaTrade is a distributed artificial neural network trained by differential evolution for Forex prediction. Project development is in Sofia, Bulgaria. Vitosha is a mountain massif on the outskirts of Sofia, the capital of Bulgaria.
    Downloads: 0 This Week
  • 20
    POESIA (Public Opensource Environment for a Safer Internet Access) is an open source Internet content filter (multimodal, multilingual) aimed at the protection of youth (in schools...); partly funded by the European Commission.
    Downloads: 0 This Week
  • 21
    CaboCha is a Japanese dependency/syntactic parser based on machine learning.
    Downloads: 0 This Week
  • 22
    Webvoice is a text-to-speech CGI program. You can embed a link in an HTML page to send things you want to say, via sound. No software is required on the client side. Festival and sox are needed on the server. Webvoice has its own interface (if needed).
    Downloads: 0 This Week
  • 23
    A modular language generator based on the theory of Functional Grammar (FG) by Simon C. Dik. Implemented using Java for the user interface, ANTLR for the input format parser and Prolog for the grammar and lexicon module, treating underlying linguistic st
    Downloads: 0 This Week
  • 24
    A multi-platform information extraction/ontology population library from HTML documents, written in C++
    Downloads: 0 This Week
  • 25
    Lillybot is an OpenCyc-based IRC chatbot. It implements a very simple reasoning engine that works with the OpenCyc ontology, and hooks it to a natural-language parser. It can answer simple English questions with short replies in simple English.
    Downloads: 0 This Week