Showing 149 open source projects for "html source extractor"

  • 1

    blog99

    A blog engine that does html and gopher

    This is the blog engine for HTML and Gopher. Blog entries are written as html files. For HTML, it is an Apache/MySQL/Python application using WSGI. For Gopher, it is Gophernicus/MySQL/Python using CGI.
    Downloads: 0 This Week
  • 2
    WeChatSogou

    Python library to crawl and retrieve data from WeChat accounts

    WechatSogou is an open source Python library designed to retrieve data from WeChat official accounts by using the Sogou WeChat search service as its data source. It provides developers with a programmatic way to search for public accounts and collect article information without manually browsing the search interface. It functions as a crawler interface that sends requests to the search engine, retrieves results, and converts the returned pages into structured data that can be used in applications or analysis pipelines. ...
    Downloads: 4 This Week
  • 3
    Toapi

    Convert websites into structured APIs automatically with a Python tool

    ...Instead of building a traditional web crawler that collects and stores data before exposing it through an API, Toapi simplifies the process by allowing developers to define data structures that automatically generate an API layer from existing web pages. It works by parsing HTML content from a source site and mapping selected elements into structured data that can be returned as JSON through API endpoints. Developers define items and routes that determine how web pages are parsed and how the resulting data is exposed through the API interface. It also includes mechanisms for caching both page content and API requests, helping reduce repeated network calls and improving performance. ...
    Downloads: 0 This Week
  • 4

    Offline Websites

    The Website2Pdf application saves webpages for offline use.

    Favorite webpages can be made available offline as PDF files. Enter the URL of your favorite website and, with one click, PDF files are created without losing any CSS or HTML styling. All the web files are retained. Please read the help (via the Help button) before converting webpages to offline files.
    Downloads: 0 This Week
  • 5
    HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.
    Downloads: 1 This Week
  • 6
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. The current version is 4.0-2016-08-29.
    Downloads: 10 This Week
  • 7
    Simple yet powerful multi-threaded object-oriented CGI/FastCGI/WSGI/mod_python/html-templating modules for Python. This project has moved to GitHub: https://github.com/jribbens/jonpy
    Downloads: 1 This Week
  • 8
    This is an Apache v2.0 authentication module based on HTML form authentication and cookie-based authentication sessions. Cookie sessions are stored in a memcache daemon. It can be used as a simple "Single Sign-On" (SSO). All the source code and bug tracking have migrated to GitHub: https://github.com/ZenProjects/Apache-Authmemcookie-Module All the documentation is here: https://zenprojects.github.io/Apache-Authmemcookie-Module/
    Downloads: 0 This Week
  • 9

    sitecheck

    Modular web site spider for web developers.

    More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation...
    Downloads: 0 This Week
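The Flesch Reading Ease test named in the sitecheck description can be sketched in a few lines of Python. This is only an illustration of the published formula, not sitecheck's own code: the function names and the vowel-group syllable counter are assumptions made here, and a real checker would likely use a more careful syllable estimate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat."))
```

Higher scores indicate text that is easier to read, so a site spider could flag pages whose score falls below a chosen threshold.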
  • 10
    Html SymboliZe

    transcodes between html entities and regular text

    Hsz takes the text you type and turns it into the proper html entities. Hsz is designed to make web developing easier by providing an easy means of looking up html entity codes. (see http://www.w3schools.com/html/html_entities.asp for info about what html entity codes are)
    Downloads: 0 This Week
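As a rough illustration of the kind of transcoding described for Html SymboliZe, the Python standard library can convert between plain text and HTML entities. This sketch is not taken from Hsz; the helper names `to_entities` and `from_entities` are hypothetical, and the choice to prefer named entities over numeric ones is an assumption.

```python
import html
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Encode markup characters and non-ASCII characters as HTML entities."""
    out = []
    # html.escape handles &, <, >, and both quote characters.
    for ch in html.escape(text, quote=True):
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code in codepoint2name:
            # Use a named entity such as &copy; when one exists.
            out.append(f"&{codepoint2name[code]};")
        else:
            # Fall back to a decimal numeric character reference.
            out.append(f"&#{code};")
    return "".join(out)


def from_entities(text: str) -> str:
    """Decode named and numeric HTML entities back to plain text."""
    return html.unescape(text)


if __name__ == "__main__":
    sample = "© 2016 <b>é</b>"
    encoded = to_entities(sample)
    print(encoded)
    print(from_entities(encoded))
```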
  • 11
    pyMantis
    pyMantis is a data-management system for (systems) biology built on the web2py framework. It features: tree-based file explorer, relational db table wizard with automated creation of user interfaces, internal and external access management, wiki, ...
    Downloads: 0 This Week
  • 12
    PynDora

    Python WebServer Log File Analyzer

    This is a web log file analyzer we are building in Python. The IIS parsing engine will be built first, followed by Apache and possibly other servers. It will support multiple log files from any date and output the statistics as HTML-formatted files, incorporating automatically built charts. It will be a pure Python, self-contained solution, i.e. no installation will be required other than the standard Python modules.
    Downloads: 0 This Week
  • 13
    awb combines simple but powerful AsciiDoc markup with templates, blog and image gallery generation, and sitemap.xml generation to allow you to easily maintain and update a website.
    Downloads: 0 This Week
  • 14
    Booktype

    Open source platform to write and publish print and digital books

    Booktype makes it easier for people and organisations to collate, organise, edit and publish books. Delivering frictionlessly to print, lulu.com, and almost any ereader, Booktype facilitates collaborative production processes. No more lost manuscripts, overwritten Word files, awkward wikis or cumbersome CMSes.
    Downloads: 6 This Week
  • 15
    PyQueryDNS

    A graphical DNS client with very useful features

    PyQueryDNS is a graphical DNS client with very useful features
    Downloads: 0 This Week
  • 16
    Charm is a full-featured, cross-platform blogging client for LiveJournal, Atom (Movable Type, Blogger), and MetaWeb (WordPress). It is console-based, all-text, and can be used entirely from the command line. It is written in Python.
    Downloads: 1 This Week
  • 17

    RAWR - Rapid Assessment of Web Resources

    A web interface enumeration tool for simplifying red team reporting.

    Introducing RAWR (Rapid Assessment of Web Resources). There's a lot packed in this tool that will help you get a better grasp of the threat landscape that is your client's web resources. It has been tested from extremely large network environments, down to 5 node networks. It has been fine-tuned to promote fast, accurate, and applicable results in usable formats. RAWR will make the mapping phase of your next web assessment efficient and get you producing positive results faster!
    Downloads: 0 This Week
  • 18
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor, and svn/cvs control to write static websites, scientific articles, or even blogs.
    Downloads: 0 This Week
  • 19

    LinkChecker

    check links in web documents or full websites

    New Homepage: http://wummel.github.io/linkchecker/
    Linkchecker features:
    - recursive and multithreaded checking and site crawling
    - output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
    - HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
    - restrict link checking with regular expression filters for URLs
    - proxy support
    -...
    Downloads: 2 This Week
  • 20

    Spondulas

    Spondulas is a browser emulator designed to retrieve web pages for hunting malware

    Spondulas is a browser emulator and parser designed to retrieve web pages for hunting malware. It supports generation of browser user agents, GET/POST requests, and SOCKS5 proxy. It can be used to parse HTML files sent via e-mail. Monitor mode allows a website to be monitored at intervals to discover changes in DNS or content over time. Autolog mode creates an investigation file that documents redirection chains. The retrieved web pages are parsed for links and reported to an output file. More...
    Downloads: 0 This Week
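Parsing a retrieved page for links, as the Spondulas description mentions, can be sketched with Python's standard `html.parser` module. This is a minimal illustration, not Spondulas's actual parser: the class name `LinkCollector`, the function `extract_links`, and the example URLs are all hypothetical, and only `<a href>` links are collected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(page: str, base_url: str) -> list[str]:
    """Return all anchor links found in the HTML text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(page)
    collector.close()
    return collector.links


if __name__ == "__main__":
    sample = '<a href="/a">A</a> <a href="http://other.example/b">B</a> <a>none</a>'
    for link in extract_links(sample, "http://example.com/dir/"):
        print(link)
```

Writing the returned list to a file, one URL per line, would give a simple link report of the kind the description refers to.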
  • 21

    HTML DOM Parser

    HTML parser which can be used for screen-scraping applications

    htmldom parses an HTML file and provides methods for iterating over and searching the parse tree in a way similar to jQuery. To report bugs please mail me at bhimsen.pes@gmail.com
    Downloads: 0 This Week
  • 22
    ZetaBoards topic fetcher
    Fetches topics with new posts from ZetaBoards forums and does something with the URLs, like opening them in a browser. Configurations can be stored and manipulated for quicker fetching. Development, translations, bug reports, etc. are handled at Launchpad: https://launchpad.net/zb-fetcher SourceForge is used to host released files.
    Downloads: 0 This Week
  • 23
    Html Assembler
    Html Assembler is a static site generator. It automatically integrates page content such as text and photos into a modifiable page template, creating a complete set of HTML files ready for upload to your site.
    Downloads: 0 This Week
  • 24

    Web Crawler Security Tool

    A web crawler oriented to information security.

    Last update on Tue Mar 26 16:25 UTC 2012. The Web Crawler Security is a Python-based tool that automatically crawls a web site. It is a web crawler oriented toward penetration testing tasks. Its main task is to search for and list all the links (pages and files) in a web site. The crawler was completely rewritten in v1.0, bringing many improvements: improved data visualization, an interactive option to download files, increased crawling speed, exports list of...
    Downloads: 0 This Week
  • 25
    TBlogger is an application written in Python. Its main purpose is to make maintaining static HTML entries easier, for example a static blog or diary. TBlogger currently supports only the FTP protocol. This will change in the future.
    Downloads: 0 This Week