Page 2 | html source extractor free download

Showing 149 open source projects for "html source extractor"

1

blog99

A blog engine that does html and gopher

This is a blog engine for HTML and Gopher. Blog entries are written as HTML files. For HTML, it is an Apache/MySQL/Python application using WSGI. For Gopher, it is a Gophernicus/MySQL/Python application using CGI.

Downloads: 0 This Week

Last Update: 2018-08-14
See Project
2

WeChatSogou

Python library to crawl and retrieve data from WeChat accounts

WechatSogou is an open source Python library designed to retrieve data from WeChat official accounts by using the Sogou WeChat search service as its data source. It provides developers with a programmatic way to search for public accounts and collect article information without manually browsing the search interface. It functions as a crawler interface that sends requests to the search engine, retrieves results, and converts the returned pages into structured data that can be used in applications or analysis pipelines. ...

Downloads: 4 This Week

Last Update: 2026-03-10
See Project
3

Toapi

Convert websites into structured APIs automatically with Python tool

...Instead of building a traditional web crawler that collects and stores data before exposing it through an API, Toapi simplifies the process by allowing developers to define data structures that automatically generate an API layer from existing web pages. It works by parsing HTML content from a source site and mapping selected elements into structured data that can be returned as JSON through API endpoints. Developers define items and routes that determine how web pages are parsed and how the resulting data is exposed through the API interface. It also includes mechanisms for caching both page content and API requests, helping reduce repeated network calls and improving performance. ...

Downloads: 0 This Week

Last Update: 23 hours ago
See Project
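
The Toapi description above (parsing HTML from a source page and mapping selected elements into JSON) can be illustrated with a minimal sketch. It uses only Python's standard library and does not use Toapi's own API; the `ItemParser` class, the `page_to_json` helper, the choice of `<title>` and `<h2>` as the selected elements, and the sample markup are all hypothetical.

```python
import json
from html.parser import HTMLParser


class ItemParser(HTMLParser):
    """Hypothetical parser: collects the <title> text and every <h2> heading."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "headings": []}
        self._current = None  # name of the tracked tag we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h2"):
            self._current = tag
            if tag == "h2":
                # Start a new, empty heading entry to be filled by handle_data.
                self.fields["headings"].append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.fields["title"] += data
        elif self._current == "h2":
            self.fields["headings"][-1] += data


def page_to_json(html_text):
    """Map the selected elements of a page to a JSON string."""
    parser = ItemParser()
    parser.feed(html_text)
    parser.close()
    return json.dumps(parser.fields)


# Made-up sample page, for illustration only.
sample = (
    "<html><head><title>Demo</title></head>"
    "<body><h2>First</h2><p>x</p><h2>Second</h2></body></html>"
)
print(page_to_json(sample))
```

A real service would fetch the page over HTTP and cache the result; both steps are left out here to keep the sketch self-contained.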
4

Offline Websites

The Website2Pdf application saves webpages for offline use.

Favorite webpages can be made available offline as PDF files. Enter the URL of your favorite website and, with one click, PDF files are created without losing any of the page's CSS or HTML styling. All the web files are retained. Please read the help (via the Help button) before converting webpages to offline files.

Downloads: 0 This Week

Last Update: 2017-12-17
See Project
5

HyperSQL

HyperSQL is like a doxygen plus javadoc for SQL, hypermapping SQL views, packages, procedures, and functions to HTML source code listings and showing all code locations where these are used.

Downloads: 1 This Week

Last Update: 2016-09-19
See Project
6

htmlarea

Small, powerful, full featured WYSIWYG editor

HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. The current version is 4.0-2016-08-29.

1 Review

Downloads: 10 This Week

Last Update: 2016-08-29
See Project
7

Jon's Python modules

Simple yet powerful multi-threaded object-oriented CGI/FastCGI/WSGI/mod_python/html-templating modules for Python. This project has moved to GitHub: https://github.com/jribbens/jonpy

Downloads: 1 This Week

Last Update: 2016-01-30
See Project
8

Auth MemCache Cookie

This is an authentication module for Apache v2.0, based on HTML form authentication and cookie-based authentication sessions. Cookie sessions are stored in a memcache daemon. It can be used as a simple "Single Sign-On" (SSO). All source code and bug tracking have migrated to GitHub: https://github.com/ZenProjects/Apache-Authmemcookie-Module All documentation is here: https://zenprojects.github.io/Apache-Authmemcookie-Module/

2 Reviews

Downloads: 0 This Week

Last Update: 2015-08-07
See Project
9

sitecheck

Modular web site spider for web developers.

More than just a link checker, sitecheck is a website spider (also known as a crawler) which can assist with SEO by testing an entire site plus both inbound links from search engines and outbound links to other sites for the following issues: looping redirects (HTTP 301/302), broken links (HTTP 404), server errors (HTTP 500), spelling mistakes, low readability scores (using the Flesch Reading Ease test), missing/empty/duplicate meta tags, duplicate content, slow page speed, W3C validation...

1 Review

Downloads: 0 This Week

Last Update: 2014-10-04
See Project
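
The Flesch Reading Ease test mentioned in the sitecheck description scores text as 206.835 - 1.015 × (words / sentences) - 84.6 × (syllables / words), with higher scores meaning easier text. The sketch below shows the arithmetic only; it is not sitecheck's code. The function names are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here for illustration.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Return 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


easy = "The cat sat on the mat. It was happy."
hard = "Organizational considerations necessitate comprehensive documentation."
print(round(flesch_reading_ease(easy), 1))
print(round(flesch_reading_ease(hard), 1))
```

Short sentences built from short words score high, while a single sentence of long words scores far lower, which is the signal a site checker would flag.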
10

Html SymboliZe

Transcodes between HTML entities and regular text

Hsz takes the text you type and turns it into the proper HTML entities. Hsz is designed to make web development easier by providing an easy means of looking up HTML entity codes. (See http://www.w3schools.com/html/html_entities.asp for information about what HTML entity codes are.)

Downloads: 0 This Week

Last Update: 2013-12-23
See Project
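
For readers who only need the kind of conversion described for Html SymboliZe, Python's standard `html` module covers both directions. This is a minimal sketch and not the Hsz implementation; the `to_named_entities` helper and the sample strings are hypothetical.

```python
import html
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace each character that has a named HTML entity with that entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        out.append(f"&{name};" if name else ch)
    return "".join(out)


# Escape only the characters that are special in HTML markup.
escaped = html.escape('Fish & Chips <"tasty">')
print(escaped)

# Convert accented letters and symbols to named entities, then back again.
encoded = to_named_entities("café © 2013")
print(encoded)
print(html.unescape(encoded))
```

`html.escape` handles only the markup-significant characters, while the `codepoint2name` table also covers symbols and accented letters; `html.unescape` reverses both.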
11

pyMantis

pyMantis is a data-management system for (systems) biology built on the web2py framework. It features: a tree-based file explorer, a relational DB table wizard with automated creation of user interfaces, internal and external access management, a wiki, ...

Downloads: 0 This Week

Last Update: 2014-02-25
See Project
12

PynDora

Python WebServer Log File Analyzer

This is a web log file analyzer we are writing in Python. The IIS parsing engine will be built first, followed by engines for Apache and possibly other servers. It is going to support multiple log files from any date and output the statistics as HTML-formatted files, incorporating automatically built charts. It will be a pure Python, self-contained solution, i.e. no installation will be required other than the standard Python modules.

Downloads: 0 This Week

Last Update: 2014-01-13
See Project
13

AsciiDoc Website Builder

awb combines simple but powerful AsciiDoc markup with templates, blog and image gallery generation, and sitemap.xml generation to allow you to easily maintain and update a website.

Downloads: 0 This Week

Last Update: 2013-11-11
See Project
14

Booktype

Open source platform to write and publish print and digital books

Booktype makes it easier for people and organisations to collate, organise, edit and publish books. Delivering frictionlessly to print, lulu.com, and almost any ereader, Booktype facilitates collaborative production processes. No more lost manuscripts, overwritten Word files, awkward wikis or cumbersome CMSes.

Downloads: 6 This Week

Last Update: 2015-12-17
See Project
15

PyQueryDNS

A graphical DNS client with very useful features

PyQueryDNS is a graphical DNS client with very useful features

Downloads: 0 This Week

Last Update: 2013-05-29
See Project
16

Charm

Charm is a full-featured, cross-platform blogging client for LiveJournal, Atom (Movable Type, Blogger), and MetaWeblog (WordPress). It is console-based, all-text, and can be used entirely from the command line. It is written in Python.

1 Review

Downloads: 1 This Week

Last Update: 2013-04-30
See Project
17

RAWR - Rapid Assessment of Web Resources

A web interface enumeration tool for simplifying red team reporting.

Introducing RAWR (Rapid Assessment of Web Resources). There's a lot packed into this tool that will help you get a better grasp of the threat landscape that is your client's web resources. It has been tested on everything from extremely large network environments down to 5-node networks. It has been fine-tuned to promote fast, accurate, and applicable results in usable formats. RAWR will make the mapping phase of your next web assessment efficient and get you producing positive results faster!

Downloads: 0 This Week

Last Update: 2016-06-22
See Project
18

WiKo

WiKo, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor and svn/cvs version control to write static websites, scientific articles or even blogs.

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
19

LinkChecker

check links in web documents or full websites

New Homepage: http://wummel.github.io/linkchecker/

Linkchecker features:
- recursive and multithreaded checking and site crawling
- output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats
- HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links support
- restrict link checking with regular expression filters for URLs
- proxy support
-...

25 Reviews

Downloads: 2 This Week

Last Update: 2014-02-14
See Project
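
The regular expression URL filtering listed among the LinkChecker features can be pictured with a short sketch. It is not LinkChecker's code or command-line interface: the `LinkCollector` class, the `links_to_check` helper, the ignore pattern, and the sample page are hypothetical, and no network requests are made.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.urls.append(urljoin(self.base_url, href))


def links_to_check(html_text, base_url, ignore_pattern):
    """Return the page's links, minus those matching the ignore regex."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    ignore = re.compile(ignore_pattern)
    return [url for url in collector.urls if not ignore.search(url)]


# Made-up page: one relative link, one mailto: link, one PDF link.
page = (
    '<a href="/docs">Docs</a> '
    '<a href="mailto:me@example.com">Mail</a> '
    '<a href="http://example.org/x.pdf">PDF</a>'
)
print(links_to_check(page, "http://example.com/", r"^mailto:|\.pdf$"))
```

Each URL that survives the filter would then be requested and its status recorded; that step is omitted so the example runs offline.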
20

Spondulas

Spondulas is browser emulator designed to retrieve web pages for hunti

Spondulas is a browser emulator and parser designed to retrieve web pages for hunting malware. It supports generation of browser user agents, GET/POST requests, and SOCKS5 proxies. It can be used to parse HTML files sent via e-mail. Monitor mode allows a website to be monitored at intervals to discover changes in DNS or content over time. Autolog mode creates an investigation file that documents redirection chains. The retrieved web pages are parsed for links, which are reported to an output file. More...

1 Review

Downloads: 0 This Week

Last Update: 2015-05-05
See Project
21

HTML DOM Parser

HTML parser which can be used for screen-scraping applications

htmldom parses an HTML file and provides methods for iterating over and searching the parse tree in a way similar to jQuery. To report bugs, please mail me at bhimsen.pes@gmail.com

1 Review

Downloads: 0 This Week

Last Update: 2012-08-29
See Project
22

ZetaBoards topic fetcher

Fetches topics with new posts from ZetaBoards forums and does something with the URLs, like opening them in a browser. Configurations can be stored and manipulated for quicker fetching. Development, translations, bug reports, etc. are handled at Launchpad: https://launchpad.net/zb-fetcher SourceForge is used to host released files.

Downloads: 0 This Week

Last Update: 2020-08-14
See Project
23

Html Assembler

Html Assembler is a static site generator. It automatically integrates page content such as text and photos in a modifiable page template creating a complete set of html files ready for upload to your site.

Downloads: 0 This Week

Last Update: 2013-04-15
See Project
24

Web Crawler Security Tool

A web crawler oriented to information security.

Last update: Tue Mar 26 16:25 UTC 2012. Web Crawler Security is a Python-based tool that automatically crawls a web site. It is a web crawler oriented to help in penetration testing tasks. Its main task is to search for and list all the links (pages and files) in a web site. The crawler was completely rewritten in v1.0, bringing many improvements: improved data visualization, an interactive option to download files, increased crawling speed, exports list of...

3 Reviews

Downloads: 0 This Week

Last Update: 2015-10-10
See Project
25

TBlogger

TBlogger is an application written in Python. Its main purpose is to make maintaining static HTML entries easier, for example a static blog or diary. TBlogger currently supports only the FTP protocol. This will change in the future.

Downloads: 0 This Week

Last Update: 2014-06-09
See Project