Showing 1049 open source projects for "python web crawler"

  • 1
    Decentralized Internet

    SDK for building decentralized web and distributed computing projects

    This project was created in order to support a new internet. One that is more open, free, and censorship-resistant in comparison to the old internet. An internet that eventually wouldn't need to rely on telecom towers, an outdated grid, or all these other "old school" forms of tech. We believe P2P compatibility is an important part of the future of the net. Grid Computing also plays a role in having a better means of transferring information in a speedy, more cost-efficient and reliable manner.
    Downloads: 0 This Week
  • 2
    MOVED TO: https://github.com/echoes1971/r-prj
    Downloads: 0 This Week
  • 3
    TCellXTalk

    TCellXTalk Web-App from LP CSIC/UAB

    TCellXTalk is a comprehensive database of experimentally detected phosphorylation, ubiquitination and acetylation sites in human T cells. The web-app at www.TCellXTalk.org makes TCellXTalk accessible from the Internet and enables the in silico prediction of potential co-modified peptides to facilitate their experimental detection, using targeted or directed mass spectrometry, for the study of protein post-translational modification cross-talk. More detailed information on TCellXTalk and...
    Downloads: 0 This Week
  • 4
    LymPHOS2

    LymPHOS2 Web-App

    LymPHOS2 is a web-based Application at www.LymPHOS.org containing peptidic and protein sequences and spectrometric information on the PhosphoProteome of human T-Lymphocytes. - Nguyen, TD., Vidal-Cortes, O., Gallardo, Ó., Abian, J., Carrascal, M., LymPHOS 2.0: an update of a phosphosite database of primary human T cells. Database 2015, 2015. DOI: 10.1093/database/bav115 - Carrascal, M., Ovelleiro, D., Casas, V., Gay, M., Abian, J., Phosphorylation analysis of primary human T lymphocytes...
    Downloads: 0 This Week
  • 5
    istSOS

    Free and Open Source Sensor Observation Service Data Management System

    istSOS is an OGC SOS server implementation written in Python. istSOS allows managing and dispatching observations from monitoring sensors according to the Sensor Observation Service standard. The project also provides a graphical user interface that eases daily operations and a RESTful web API for automating administration procedures. istSOS is released under the GPL license and runs on all major platforms (Windows, Linux, Mac OS X), although tests were conducted under a Linux environment.
    Downloads: 0 This Week
  • 6
    SFM2Web reads text and database files encoded with SFMs (Standard Format Markers) and then generates a web site according to flags specified in control files. This is useful for web publication of MDF lexicons, USFM Bible books, texts, phrasebooks, etc.
    Downloads: 0 This Week
  • 7
    magnetW

    Magnet link aggregation search

    magnetW is based on the rule-based approach of magnetX: the search results of each magnet site are formatted uniformly. This project has no chat group; GitHub is used only for code hosting and related technical discussion, and other addresses may be risky, so please check carefully. This project is open source and free. There are no collection channels of any kind, such as donations, and no advertising of any kind. If you encounter anything similar to the above situation, please don't believe...
    Downloads: 0 This Week
  • 8
    CountBookmarks

    Makes a detailed count of your browser bookmarks by folder

    This simple program performs a detailed count of exported web browser bookmarks by folder. Its output file can be imported into a spreadsheet and sorted to show the relative size of all your bookmark folders.
    Downloads: 0 This Week
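A folder-by-folder count of this kind can be sketched in a few lines of Python. The sketch below is not CountBookmarks' own code. It assumes the Netscape-style HTML export that most browsers write, and the names `BookmarkCounter`, `count_bookmarks`, `to_csv` and `SAMPLE` are hypothetical.

```python
import csv
import io
from collections import Counter
from html.parser import HTMLParser


class BookmarkCounter(HTMLParser):
    """Counts <A> entries per folder path in a Netscape-style bookmark export."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.stack = []       # names of the folders we are currently inside
        self.pending = None   # folder name read from <H3>, waiting for its <DL>
        self.in_h3 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
            self.pending = ""
        elif tag == "dl":
            # A <DL> opens the folder named by the preceding <H3> (or the root).
            self.stack.append(self.pending if self.pending is not None else "(root)")
            self.pending = None
        elif tag == "a" and self.stack:
            self.counts["/".join(self.stack)] += 1

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False
        elif tag == "dl" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.in_h3:
            self.pending += data


def count_bookmarks(html_text):
    parser = BookmarkCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def to_csv(counts):
    """Render the counts as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["folder", "bookmarks"])
    for folder, n in sorted(counts.items()):
        writer.writerow([folder, n])
    return buffer.getvalue()


# Made-up sample export used only for illustration.
SAMPLE = """<DL><p>
<DT><A HREF="https://example.org/">Home</A>
<DT><H3>News</H3>
<DL><p>
<DT><A HREF="https://example.org/n1">N1</A>
<DT><A HREF="https://example.org/n2">N2</A>
</DL><p>
</DL><p>"""

print(to_csv(count_bookmarks(SAMPLE)))
```

Keeping the folder path as a single string makes the output easy to sort in a spreadsheet, at the cost of ambiguity if a folder name itself contains a slash.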
  • 9
    BotSlayer

    BotSlayer Community Edition

    BotSlayer is an application that helps track and detect potential manipulation of information spreading on Twitter. The tool is developed by the Observatory on Social Media at Indiana University, the same lab that brought you Botometer and Hoaxy. BotSlayer is not a tool to detect and remove likely social bots from your list of Twitter followers or friends. For that purpose, check out Botometer. If you just want to visualize the spread of some piece of information, consider Hoaxy....
    Downloads: 0 This Week
  • 10

    pyindi-client

    Python binding to the libindi library

    ...PyQt applications may also be built on top of IndiClient, thus allowing rapid development of GUI Indi clients. Besides Python there are also bindings for node.js, Tcl (incomplete) and PHP (not useful). As application examples you will find a Python Websocket server with which you may build a web application interacting with Indi servers, and a simple PyQt application similar to the Kstars Indi Control Panel (was built as an exercise). Finally there is an equatorial mount 3D simulator written with Freecad and Python, planned to be connected with the PyIndi module. *** The pyindi-client binding has moved to github...
    Downloads: 0 This Week
  • 11
    AET

    Detects visual changes on websites and performs page health checks

    AET is a system that detects visual changes on websites and performs basic page health checks (like w3c compliance, accessibility, HTTP status codes, JS Error checks and others). AET is designed as a flexible system that can be adapted and tailored to the regression requirements of a given project. The tool has been developed to aid front-end client-side layout regression testing of websites or portfolios, in essence assessing the impact or change of a website from one snapshot to the next.
    Downloads: 0 This Week
  • 12
    X-RAY

    The next web scraper, see through the <html> noise

    Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...
    Downloads: 0 This Week
  • 13
    YouTube Video Downloader

    Allows you to download YouTube videos in a video or audio format.

    YouTube Video Downloader by Chase is a tool developed in Python. It uses web scraping to fetch videos from YouTube and download them to your machine in a video or audio format, and it offers an easy-to-use GUI with a dark theme.
    Downloads: 14 This Week
  • 14
    Requests-HTML

    Pythonic HTML Parsing for Humans

    This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When using this library you automatically get full JavaScript support! (Using Chromium, thanks to pyppeteer) CSS Selectors (a.k.a. jQuery-style, thanks to PyQuery). XPath Selectors, for the faint of heart. Mocked user-agent (like a real web browser). Automatic following of redirects. Connection-pooling and cookie persistence. The Requests experience you know and love, with magical parsing...
    Downloads: 0 This Week
  • 15
    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    Django Dynamic Scraper (DDS) is an app for Django built on top of the scraping framework Scrapy. While preserving many of the features of Scrapy, it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things, DDS is not usable for all kinds of scrapers, but...
    Downloads: 0 This Week
  • 16
    Jupyter Server Proxy

    Jupyter notebook server extension to proxy web services.

    Jupyter Server Proxy lets you run arbitrary external processes (such as RStudio, Shiny Server, Syncthing, PostgreSQL, Code Server, etc) alongside your notebook server and provide authenticated web access to them using a path like /rstudio next to others like /lab. Alongside the Python package that provides the main functionality, the JupyterLab extension (@jupyterhub/jupyter-server-proxy) provides buttons in the JupyterLab launcher window to get to RStudio for example.
    Downloads: 0 This Week
  • 17
    Rendora

    dynamic server-side rendering using headless Chrome

    Rendora is a dynamic renderer that provides zero-configuration server-side rendering, mainly to web crawlers, in order to effortlessly improve SEO for websites developed in modern JavaScript frameworks such as React.js, Vue.js, Angular.js, etc. Rendora works totally independently of your frontend and backend stacks. Rendora can be seen as a reverse HTTP proxy server sitting between your backend server (e.g. Node.js/Express.js, Python/Django, etc...) and potentially your frontend proxy server (e.g. nginx, traefik, apache, etc...), or even facing the outside world directly. It does nothing but pass requests and responses through unchanged, except when it detects whitelisted requests according to the config. ...
    Downloads: 0 This Week
  • 18
    Transcrypt

    Python in the Browser

    Lean and mean Python 3.6 to JavaScript compiler. Supports multiple inheritance, operator overloading and Python source level debugging, even of minified JavaScript files. Transcrypt code is as fast and compact as its JavaScript counterpart, and it is precompiled for page load speed. You can now develop your web applications completely in Python, with full access to any JavaScript library.
    Downloads: 2 This Week
  • 19
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process or thread and is replaceable. This means that when processing is slow, you can run many instances of the processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast.
    Downloads: 1 This Week
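The queue-connected design described for pyspider can be illustrated with the standard library alone. This is a minimal sketch of the general pattern, not pyspider's API: `run_pipeline`, its parameters and the stand-in `fetch` and `process` callables are all assumptions made for the example.

```python
import queue
import threading


def run_pipeline(urls, fetch, process, n_processors=3):
    """One fetcher thread feeds several processor threads through a queue."""
    fetched = queue.Queue()   # fetcher -> processors
    results = queue.Queue()   # processors -> caller
    stop = object()           # sentinel telling a processor to exit

    def fetcher():
        for url in urls:
            fetched.put((url, fetch(url)))
        # One sentinel per processor so every worker shuts down.
        for _ in range(n_processors):
            fetched.put(stop)

    def processor():
        while True:
            item = fetched.get()
            if item is stop:
                break
            url, body = item
            results.put((url, process(body)))

    threads = [threading.Thread(target=fetcher)]
    threads += [threading.Thread(target=processor) for _ in range(n_processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All workers have finished, so draining the queue here is safe.
    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)


# Stand-in fetch/process functions; a real crawler would do network I/O here.
print(run_pipeline(["a", "bb", "ccc"], fetch=lambda u: u * 2, process=len))
```

Because the stages only share queues, the number of processors can be raised without touching the fetcher, which is the property the description attributes to the architecture.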
  • 20

    gdpr

    Tool to maintain GDPR data protection declarations

    Admins often maintain multiple web pages, each of which requires a privacy statement under the EU GDPR. To keep the statements coherent and up to date while avoiding doing the same work multiple times, this project provides a tool that automatically creates the appropriate statement for each page from a single source. The project is currently available in PHP; however, if anyone is willing to provide a version in Python, Perl or another language, it is more than welcome. ...
    Downloads: 0 This Week
  • 21
    Twitter Intelligence

    Twitter Intelligence OSINT project performs tracking and analysis

    A project written in Python for Twitter tracking and analysis without using the Twitter API. This project is a Python 3.x application. The package dependencies are in the file requirements.txt. Run that command to install the dependencies. SQLite is used as the database. Tweet data is stored in the Tweet, User, Location, Hashtag, and HashtagTweet tables. The database is created automatically. analysis.py performs analysis processing. User, hashtag, and location analyses are performed. You must write...
    Downloads: 0 This Week
  • 22
    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can set up a multi-threaded web crawler in a few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. The shouldVisit function decides whether the given URL should be crawled or not.
    Downloads: 1 This Week
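The division of labour described for crawler4j (one hook that filters URLs, one that handles each page) can be sketched in Python. This is only an analogue of the pattern, not crawler4j's Java API: `SketchCrawler`, `should_visit`, `visit`, `crawl` and the in-memory `FAKE_WEB` are hypothetical names, and no network access takes place.

```python
from collections import deque
from urllib.parse import urlparse


class SketchCrawler:
    """Hypothetical crawler class with a shouldVisit-style filter hook."""

    def __init__(self, allowed_host):
        self.allowed_host = allowed_host
        self.visited = []

    def should_visit(self, url):
        # Follow only http(s) links on the allowed host, skipping static assets.
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            return False
        if parsed.netloc != self.allowed_host:
            return False
        return not parsed.path.lower().endswith((".png", ".jpg", ".css", ".js"))

    def visit(self, url, links):
        # Handle the downloaded page; here we only record that we saw it.
        self.visited.append(url)


def crawl(crawler, seed, fetch_links):
    """Breadth-first crawl that consults should_visit before queueing a link."""
    pending = deque([seed])
    seen = {seed}
    while pending:
        url = pending.popleft()
        links = fetch_links(url)
        crawler.visit(url, links)
        for link in links:
            if link not in seen and crawler.should_visit(link):
                seen.add(link)
                pending.append(link)
    return crawler.visited


# Made-up link graph standing in for real pages.
FAKE_WEB = {
    "https://example.org/": [
        "https://example.org/a",
        "https://other.example/x",
        "https://example.org/logo.png",
    ],
    "https://example.org/a": ["https://example.org/", "https://example.org/b"],
    "https://example.org/b": [],
}

pages = crawl(SketchCrawler("example.org"), "https://example.org/",
              lambda u: FAKE_WEB.get(u, []))
print(pages)
```

Keeping the filter separate from the page handler means the crawl loop never needs to change when the URL policy does.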
  • 23
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexation, web scraping, etc. OpenSearchServer runs on...
    Downloads: 3 This Week
  • 24

    blog99

    A blog engine that does html and gopher

    This is the blog engine for HTML and Gopher. Blog entries are written as html files. For HTML, it is an Apache/MySQL/Python application using WSGI. For Gopher, it is Gophernicus/MySQL/Python using CGI.
    Downloads: 0 This Week
  • 25

    PiHass

    Pre-defined and easy-to-use Home-Assistant image for Raspberry Pi

    This is a Raspbian Stretch base image with Home-Assistant on it. I used a virtualenv-based installation and added some custom UI and custom components. I have also configured a MySQL server and database, along with some scripts, sensors and groups to help users start working with the system.
    Downloads: 0 This Week