Page 7 | python web crawler free download

Showing 1049 open source projects for "python web crawler"

1

Decentralized Internet

SDK for building decentralized web and distributed computing projects

This project was created to support a new internet: one that is more open, free, and censorship-resistant than the old internet, and that eventually would not need to rely on telecom towers, an outdated grid, or other "old school" forms of tech. We believe P2P compatibility is an important part of the future of the net. Grid computing also plays a role in transferring information in a faster, more cost-efficient, and more reliable manner.

Downloads: 0 This Week

Last Update: 2020-09-30
See Project
2

R-Project

MOVED TO: https://github.com/echoes1971/r-prj

Downloads: 0 This Week

Last Update: 2020-08-14
See Project
3

TCellXTalk

TCellXTalk Web-App from LP CSIC/UAB

TCellXTalk is a comprehensive database of experimentally detected phosphorylation, ubiquitination and acetylation sites in human T cells. The web-app at www.TCellXTalk.org makes TCellXTalk accessible from the Internet and enables the in silico prediction of potential co-modified peptides to facilitate their experimental detection, using targeted or directed mass spectrometry, for the study of protein post-translational modification cross-talk. More detailed information on TCellXTalk and...

Downloads: 0 This Week

Last Update: 2020-07-13
See Project
4

LymPHOS2

LymPHOS2 Web-App

LymPHOS2 is a web-based application at www.LymPHOS.org containing peptidic and protein sequences and spectrometric information on the phosphoproteome of human T lymphocytes. - Nguyen, TD., Vidal-Cortes, O., Gallardo, Ó., Abian, J., Carrascal, M., LymPHOS 2.0: an update of a phosphosite database of primary human T cells. Database 2015, 2015. DOI: 10.1093/database/bav115 - Carrascal, M., Ovelleiro, D., Casas, V., Gay, M., Abian, J., Phosphorylation analysis of primary human T lymphocytes...

1 Review

Downloads: 0 This Week

Last Update: 2020-07-03
See Project
5

istSOS

Free and Open Source Sensor Observation Service Data Management System

istSOS is an OGC SOS server implementation written in Python. It manages and dispatches observations from monitoring sensors according to the Sensor Observation Service standard. The project also provides a graphical user interface that eases daily operations and a RESTful web API for automating administration procedures. istSOS is released under the GPL license and runs on all major platforms (Windows, Linux, Mac OS X), although tests were conducted in a Linux environment.

Downloads: 0 This Week

Last Update: 2020-04-23
See Project
6

SFM2Web

SFM2Web reads text and database files encoded with SFMs (Standard Format Markers) and then generates a web site according to flags specified in control files. This is useful for web publication of MDF lexicons, USFM Bible books, texts, phrasebooks, etc.

Downloads: 0 This Week

Last Update: 2020-04-24
See Project
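The SFM-to-web conversion described for SFM2Web can be illustrated with a small sketch. This is not SFM2Web's own code: the function names `parse_sfm` and `records_to_html` are hypothetical, and the sketch assumes one `\marker value` field per line with `\lx` opening each record, as in MDF lexicons. Control files and flags are not modelled.

```python
import html


def parse_sfm(text, record_marker="lx"):
    """Group '\\marker value' lines into records; a record starts at record_marker."""
    records, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # this sketch ignores blank and continuation lines
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = []
            records.append(current)
        if current is not None:
            current.append((marker, value.strip()))
    return records


def records_to_html(records):
    """Render each record as a list item, one <span> per field."""
    items = []
    for fields in records:
        parts = " ".join(
            f'<span class="{html.escape(m)}">{html.escape(v)}</span>'
            for m, v in fields
        )
        items.append(f"<li>{parts}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    sample = "\\lx kofi\n\\ps n\n\\ge coffee\n"
    print(records_to_html(parse_sfm(sample)))
```

Using the marker name as the CSS class keeps the output stylable per field type; a real converter would also need to handle multi-line field values.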
7

magnetW

Magnet link aggregation search

magnetW follows the rule-based approach of magnetX: the search results from each magnet site are converted to a uniform format. The project has no chat group; GitHub is used only for code hosting and related technical discussion, and any other address may be risky, so please judge carefully. This project is open source and free. It has no collection channels of any kind, such as donations, and no advertising of any kind. If you encounter anything similar to the above situation, please don't believe...

Downloads: 0 This Week

Last Update: 2021-05-31
See Project
8

CountBookmarks

Makes a detailed count of your browser bookmarks by folder

This simple program performs a detailed count of exported web browser bookmarks by folder. Its output file can be imported into a spreadsheet and sorted to show the relative size of all your bookmark folders.

Downloads: 0 This Week

Last Update: 2020-06-20
See Project
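A per-folder bookmark count of the kind CountBookmarks describes could look roughly like the sketch below. It is not the project's code. It assumes the common Netscape-style bookmark export, where `<H3>` holds a folder title, the following `<DL>` holds that folder's contents, and each `<A>` is one bookmark. The names `BookmarkCounter` and `count_bookmarks` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class BookmarkCounter(HTMLParser):
    """Counts <A> bookmarks per folder path in a Netscape-style export."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.path = []        # folder names from the root to the current folder
        self.pending = None   # folder title read from <H3>, waiting for its <DL>
        self.in_h3 = False
        self.opened = []      # per open <DL>: True if it pushed a folder onto path

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
            self.pending = ""
        elif tag == "dl":
            if self.pending is not None:
                self.path.append(self.pending.strip())
                self.pending = None
                self.opened.append(True)
            else:
                self.opened.append(False)
        elif tag == "a":
            self.counts["/".join(self.path) or "(root)"] += 1

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False
        elif tag == "dl" and self.opened:
            if self.opened.pop():
                self.path.pop()

    def handle_data(self, data):
        if self.in_h3:
            self.pending += data


def count_bookmarks(export_html):
    """Return {folder_path: number_of_bookmarks} for an exported bookmark file."""
    parser = BookmarkCounter()
    parser.feed(export_html)
    return dict(parser.counts)
```

Writing the resulting dictionary out as CSV rows (folder path, count) would give a file that a spreadsheet can import and sort.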
9

BotSlayer

BotSlayer Community Edition

BotSlayer is an application that helps track and detect potential manipulation of information spreading on Twitter. The tool is developed by the Observatory on Social Media at Indiana University, the same lab that brought you Botometer and Hoaxy. BotSlayer is not a tool to detect and remove likely social bots from your list of Twitter followers or friends. For that purpose, check out Botometer. If you just want to visualize the spread of some piece of information, consider Hoaxy....

Downloads: 0 This Week

Last Update: 2023-07-13
See Project
10

pyindi-client

Python binding to the libindi library

...PyQt applications may also be built on top of IndiClient, thus allowing rapid development of GUI Indi clients. Besides Python there are also bindings for node.js, Tcl (incomplete) and PHP (not useful). As application examples you will find a Python Websocket server with which you may build a web application interacting with Indi servers, and a simple PyQt application similar to the Kstars Indi Control Panel (was built as an exercise). Finally there is an equatorial mount 3D simulator written with Freecad and Python, planned to be connected with the PyIndi module. *** The pyindi-client binding has moved to github...

Downloads: 0 This Week

Last Update: 2019-12-11
See Project
11

AET

Detects visual changes on websites and performs page health checks

AET is a system that detects visual changes on websites and performs basic page health checks (such as W3C compliance, accessibility, HTTP status codes, JS error checks and others). AET is designed as a flexible system that can be adapted and tailored to the regression requirements of a given project. The tool has been developed to aid front-end client-side layout regression testing of websites or portfolios, in essence assessing the impact or change of a website from one snapshot to the next.

Downloads: 0 This Week

Last Update: 2023-10-19
See Project
12

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...

Downloads: 0 This Week

Last Update: 2021-10-05
See Project
13

YouTube Video Downloader

Allows you to download YouTube videos in a video or audio format.

YouTube Video Downloader by Chase is a tool developed in Python. It uses web scraping to fetch videos from YouTube and download them to your machine in a video or audio format, and it offers an easy-to-use GUI with a dark theme.

1 Review

Downloads: 14 This Week

Last Update: 2019-07-10
See Project
14

Requests-HTML

Pythonic HTML Parsing for Humans

This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When using this library you automatically get: full JavaScript support (using Chromium, thanks to pyppeteer); CSS selectors (a.k.a. jQuery-style, thanks to PyQuery); XPath selectors, for the faint of heart; a mocked user-agent (like a real web browser); automatic following of redirects; connection pooling and cookie persistence; the Requests experience you know and love, with magical parsing...

Downloads: 0 This Week

Last Update: 2023-04-10
See Project
15

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

Django Dynamic Scraper (DDS) is an app for Django built on top of the scraping framework Scrapy. While preserving many of the features of Scrapy, it lets you dynamically create and manage spiders via the Django admin interface. With DDS you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things, DDS is not usable for all kinds of scrapers, but...

Downloads: 0 This Week

Last Update: 2022-09-05
See Project
16

Jupyter Server Proxy

Jupyter notebook server extension to proxy web services.

Jupyter Server Proxy lets you run arbitrary external processes (such as RStudio, Shiny Server, Syncthing, PostgreSQL, Code Server, etc) alongside your notebook server and provide authenticated web access to them using a path like /rstudio next to others like /lab. Alongside the Python package that provides the main functionality, the JupyterLab extension (@jupyterhub/jupyter-server-proxy) provides buttons in the JupyterLab launcher window to get to RStudio for example.

Downloads: 0 This Week

Last Update: 2023-12-21
See Project
17

Rendora

dynamic server-side rendering using headless Chrome

Rendora is a dynamic renderer that provides zero-configuration server-side rendering, mainly to web crawlers, in order to effortlessly improve SEO for websites developed in modern JavaScript frameworks such as React.js, Vue.js, Angular.js, etc. Rendora works totally independently of your frontend and backend stacks. It can be seen as a reverse HTTP proxy server sitting between your backend server (e.g. Node.js/Express.js, Python/Django, etc.) and potentially your frontend proxy server (e.g. nginx, traefik, apache, etc.), or even directly facing the outside world, that does nothing but transport requests and responses as they are, except when it detects whitelisted requests according to the config. ...

Downloads: 0 This Week

Last Update: 2022-03-08
See Project
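The pass-through-unless-whitelisted behaviour described for Rendora can be sketched as a single routing decision. This is an illustration only, not Rendora's implementation or configuration format: the keyword list, the restriction to GET requests, and the function name `choose_handler` are all assumptions made for the example.

```python
# Assumed whitelist: substrings that mark a User-Agent as a crawler.
CRAWLER_KEYWORDS = ("bot", "crawler", "spider", "slurp")


def choose_handler(method, user_agent, keywords=CRAWLER_KEYWORDS):
    """Return 'render' for whitelisted crawler GET requests, otherwise 'proxy'.

    'render' stands for serving a headless-browser rendering of the page;
    'proxy' stands for forwarding the request and response unchanged.
    """
    ua = (user_agent or "").lower()
    if method.upper() == "GET" and any(k in ua for k in keywords):
        return "render"
    return "proxy"


if __name__ == "__main__":
    print(choose_handler("GET", "Mozilla/5.0 (compatible; Googlebot/2.1)"))
    print(choose_handler("GET", "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))
```

Keeping the decision in one pure function makes the whitelist easy to test separately from the proxying and rendering machinery.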
18

Transcrypt

Python in the Browser

Lean and mean Python 3.6 to JavaScript compiler. Supports multiple inheritance, operator overloading and Python source level debugging, even of minified JavaScript files. Transcrypt code is as fast and compact as its JavaScript counterpart, and it is precompiled for page load speed. You can now develop your web applications completely in Python, with full access to any JavaScript library.

Downloads: 2 This Week

Last Update: 2025-05-23
See Project
19

pyspider

A powerful Spider (Web Crawler) system in Python

pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process or thread and is replaceable. That means that when processing is slow, you can run many processor instances and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast.

Downloads: 1 This Week

Last Update: 2021-03-31
See Project
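The queue-connected architecture described for pyspider can be illustrated with a small standard-library sketch. It does not use pyspider's API. The function `run_pipeline` and its `fetch` and `process` parameters are hypothetical, and threads are used for brevity; using several CPUs for CPU-bound processing would need separate processes or machines, as the description notes.

```python
import queue
import threading


def run_pipeline(urls, fetch, process, n_processors=3):
    """One fetcher feeds a queue; several processor workers drain it."""
    fetched = queue.Queue()   # fetcher -> processors
    results = queue.Queue()   # processors -> caller
    stop = object()           # sentinel telling a processor to exit

    def fetcher():
        for url in urls:
            fetched.put((url, fetch(url)))
        for _ in range(n_processors):
            fetched.put(stop)  # one sentinel per processor

    def processor():
        while True:
            item = fetched.get()
            if item is stop:
                break
            url, body = item
            results.put((url, process(body)))

    workers = [threading.Thread(target=fetcher)]
    workers += [threading.Thread(target=processor) for _ in range(n_processors)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    collected = {}
    while not results.empty():
        url, value = results.get()
        collected[url] = value
    return collected


if __name__ == "__main__":
    # Stand-in fetch/process functions so the sketch runs without network access.
    print(run_pipeline(["a", "bb", "ccc"], fetch=str.upper, process=len))
```

Because the stages only share queues, a slow stage can be scaled by raising `n_processors` without touching the fetcher.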
20

gdpr

Tool to maintain a GDPR data protection declaration

Admins often maintain multiple web pages, each of which requires a privacy statement under the EU GDPR. To keep the statements coherent and up to date while avoiding doing the same work multiple times, this project provides a tool that automatically creates the appropriate statement for each page from a single source. The project is currently available in PHP; however, if anyone is willing to provide a version in Python, Perl, or another language, it is more than welcome. ...

Downloads: 0 This Week

Last Update: 2018-10-16
See Project
21

Twitter Intelligence

Twitter Intelligence OSINT project performs tracking and analysis

A project written in Python for Twitter tracking and analysis without using the Twitter API. This project is a Python 3.x application. Install the package dependencies listed in the file requirements.txt. SQLite is used as the database. Tweet data is stored in the Tweet, User, Location, Hashtag, and HashtagTweet tables. The database is created automatically. analysis.py performs analysis processing. User, hashtag, and location analyses are performed. You must write...

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
22

crawler4j

Open source web crawler for Java

crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can set up a multi-threaded web crawler in a few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. The shouldVisit function decides whether the given URL should be crawled or not.

Downloads: 1 This Week

Last Update: 2022-01-12
See Project
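The subclass-and-override pattern described for crawler4j can be shown with a Python analogue. This is not crawler4j's Java API: the class names `Crawler` and `SameHostCrawler`, the method names, and the caller-supplied `fetch` function are all hypothetical, and the sketch is single-threaded for brevity.

```python
from collections import deque
from urllib.parse import urlparse


class Crawler:
    """Base class: subclasses override should_visit() and visit()."""

    def should_visit(self, url):
        return True

    def visit(self, url, page):
        pass

    def crawl(self, seed, fetch):
        # fetch(url) must return (page_text, list_of_outgoing_links).
        seen, frontier = {seed}, deque([seed])
        while frontier:
            url = frontier.popleft()
            page, links = fetch(url)
            self.visit(url, page)
            for link in links:
                if link not in seen and self.should_visit(link):
                    seen.add(link)
                    frontier.append(link)


class SameHostCrawler(Crawler):
    """Follows only links on one host and records the pages it handled."""

    def __init__(self, host):
        self.host = host
        self.visited = []

    def should_visit(self, url):
        return urlparse(url).netloc == self.host

    def visit(self, url, page):
        self.visited.append(url)


if __name__ == "__main__":
    # An in-memory link graph stands in for real HTTP fetching.
    graph = {
        "http://a.test/": ("home", ["http://a.test/x", "http://b.test/"]),
        "http://a.test/x": ("x", ["http://a.test/"]),
    }
    crawler = SameHostCrawler("a.test")
    crawler.crawl("http://a.test/", lambda url: graph[url])
    print(crawler.visited)
```

Separating the filter (`should_visit`) from the handler (`visit`) lets the crawl loop stay generic while each project decides what to follow and what to do with each page.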
23

OpenSearchServer Search Engine

An open source search engine with RESTful API and crawlers

OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...

31 Reviews

Downloads: 3 This Week

Last Update: 2018-08-26
See Project
24

blog99

A blog engine that does HTML and Gopher

This is a blog engine for HTML and Gopher. Blog entries are written as HTML files. For HTML, it is an Apache/MySQL/Python application using WSGI. For Gopher, it is Gophernicus/MySQL/Python using CGI.

Downloads: 0 This Week

Last Update: 2018-08-14
See Project
25

PiHass

Pre-defined and easy-to-use Home-Assistant image for Raspberry Pi

This is a Raspbian Stretch base image with Home-Assistant on it. It uses a virtualenv-based installation and adds some custom UI and custom components. A MySQL server and database are also configured, along with some scripts, sensors and groups to help users start working with the system.

Downloads: 0 This Week

Last Update: 2018-07-03
See Project