Showing 70 open source projects for "python web crawler"

  • 1
    Graph Notebook

    Library extending Jupyter notebooks to integrate with Apache TinkerPop

    The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs. This project includes many examples of Jupyter notebooks...
    Downloads: 0 This Week
  • 2
    Papis

    Powerful and highly extensible command-line based document and bibliography manager

    Papis is a powerful and highly extensible CLI document and bibliography manager. With Papis, you can search your library for books and papers, add documents and notes, import and export to and from other formats, and much much more. Papis uses a human-readable and easily hackable .yaml file to store each entry's bibliographical data. It strives to be easy to use while providing a wide range of features. And for those who still want more, Papis makes it easy to write scripts that extend its...
    Downloads: 0 This Week
  • 3
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    ... to various endpoints in real time. Agile development experience with SQL-like query language and graphical drag-and-drop editor supporting event simulation. Lightweight runtime that can natively run on Kubernetes, Docker, VM, or bare metal, and embedded in any Java or Python application. Scalable, and highly available distributed event processing on Kubernetes, with NATS Streaming and Siddhi Kubernetes Operator.
    Downloads: 0 This Week
  • 4
    Goutte

    Goutte, a simple PHP Web Scraper

    Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...
    Downloads: 0 This Week
  • 5
    lixa

    LIXA, LIbre XA, is a free and open source XA transaction manager

    ... technology enables every application container, like a web server or a shell, to become a two phase commit application server. The client/server architecture of LIXA allows many application containers to share a single LIXA (state) server: this is ideal when horizontal scalability is a must and many identical application containers must refer to a single transactional environment. LIXA can be used with the C, C++, Java, Python and COBOL programming languages.
    Downloads: 0 This Week
  • 6
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 0 This Week
  • 7
    Whisper Library

    Whisper is a file-based time-series database format for Graphite

    Whisper is one of three components within the Graphite project. Whisper is a fixed-size database, similar in design and purpose to RRD (round-robin-database). It provides fast, reliable storage of numeric data over time. Whisper allows for higher resolution (seconds per point) of recent data to degrade into lower resolutions for long-term retention of historical data. Copies data from src to dst, if missing. Unlike whisper-merge, it doesn't overwrite data that's already present in the target...
    Downloads: 0 This Week
  • 8
    Spyne

    A transport agnostic sync/async RPC library

    Spyne is a Python RPC toolkit that makes it easy to expose online services that have a well-defined API using multiple protocols and transports. It integrates with popular Python web frameworks as well as libraries like SQLAlchemy to keep your code as DRY as possible. Spyne aims to save the protocol implementers the hassle of implementing their own remote procedure call api and the application programmers the hassle of jumping through hoops just to expose their services using multiple protocols...
    Downloads: 0 This Week
  • 9
    Hack-Tools

    Hack tools

    hack-tools is a collection of various hacking tools and utilities. It serves as a comprehensive toolkit for penetration testers and cybersecurity enthusiasts, encompassing a wide range of functionalities.
    Downloads: 5 This Week
  • 10
    Alfred-Workflow

    Full-featured library for writing Alfred 3 & 4 workflows

    Alfred-Workflow is a Python helper library for Alfred 2, 3 and 4 workflow authors, developed and hosted on GitHub. Alfred workflows typically take user input, fetch data from the Web or elsewhere, filter them and display results to the user. Alfred-Workflow takes care of a lot of the details for you, allowing you to concentrate your efforts on your workflow’s functionality. Alfred-Workflow supports macOS 10.7+ (Python 2.7). Easily launch background tasks (daemons) to keep your workflow...
    Downloads: 0 This Week
  • 11

    bluetroller

    A library and interface for controlling bluetooth LE devices

    bluetroller is a library and interface for controlling all kinds of bluetooth LE devices. A vast number of devices can be controlled via Bluetooth LE, including fitness trackers, lighting, camera sliders, gimbals and many more. Right now these devices can only be controlled via phone apps which are frequently buggy, unmaintained and will stop working after some future phone update. This project aims to grow to become an exhaustive library of these devices.
    Downloads: 0 This Week
  • 12
    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library to automate solving ReCAPTCHA v2 by images/audio using Mozilla's DeepSpeech, PocketSphinx, Microsoft Azure’s, Google Speech and Amazon's Transcribe Speech-to-Text API. Also image recognition to detect the object suggested in the captcha. Built with Pyppeteer for Chrome automation framework and similarities to Puppeteer, PyDub for easily converting MP3 files into WAV, aiohttp for async minimalistic web-server, and Python’s built-in AsyncIO for convenience.
    Downloads: 0 This Week
  • 13

    Easy Web automation library

    This library is designed to work with Selenium for web automation. It wraps Selenium functions and handles Selenium exceptions, and it uses the Selenium library for web interfaces.
    Downloads: 0 This Week
  • 14
    Functional, Data Science Intro To Python

    [tutorial] A functional, Data Science focused introduction to Python

    The first section is an intentionally brief, functional, data science-centric introduction to Python. The assumption is that someone with zero experience in programming can follow this tutorial and learn Python with the smallest amount of information possible. The sections after that involve varying levels of difficulty and cover topics as diverse as Machine Learning, Linear Optimization, build systems, command line tools, recommendation engines, Sentiment Analysis and Cloud Computing.
    Downloads: 0 This Week
  • 15

    Face Recognition

    World's simplest facial recognition api for Python & the command line

    Face Recognition is the world's simplest face recognition library. It allows you to recognize and manipulate faces from Python or from the command line using dlib's (a C++ toolkit containing machine learning algorithms and tools) state-of-the-art face recognition built with deep learning. Face Recognition is highly accurate and is able to do a number of things. It can find faces in pictures, manipulate facial features in pictures, identify faces in pictures, and do face recognition...
    Downloads: 12 This Week
  • 16
    ACMESharp

    An ACME client library and PowerShell client for the .NET platform

    ... is broken up into layers that build upon each other. Basic tools and services required for implementing the ACME protocol and its semantics (JSON Web Signature (JWS), PKI operations, client-side persistence) Low-level ACME protocol client library that can interoperate with a compliant ACME server. PowerShell module that implements a powerful client, that functions equally well as a manual tool or a component of a larger automation process, for managing ACME Registrations, Identifiers, etc.
    Downloads: 2 This Week
  • 17
    Assorted projects. General-purpose libraries for Python, C++, Scala, bash, and others. Meta-programming tools. System utilities. UI components. Web APIs. Configuration files. Benchmarks. Programming competition entries. And much more.
    Downloads: 0 This Week
  • 18
    TensorFlow World

    Simple and ready-to-use tutorials for TensorFlow

    This repository aims to provide simple and ready-to-use tutorials for TensorFlow. The explanations are present in the wiki associated with this repository. There are different motivations for this open source project. TensorFlow (as we write this document) is one of / the best deep learning frameworks available. The question that should be asked is why has this repository been created when there are so many other tutorials about TensorFlow available on the web? Deep Learning is in very high...
    Downloads: 0 This Week
  • 19
    Icon Font to PNG

    Python script (and library) for exporting icons from icon fonts

    Python script (and library) for easy and simple export of icons from web icon fonts (e.g. Font Awesome, Octicons) as PNG images. The best part is the provided shell script, but you can also use its functionality directly in your (probably awesome) Python project. There’s also a font-awesome-to-png script for backward compatibility with the first iteration of the concept. You can use IconFont (and IconFontDownloader for that matter) directly inside your Python project. There's no proper...
    Downloads: 0 This Week
  • 20
    C++ Standard Airline IT Object Library
    This project aims to provide a clean API, and the corresponding C++ implementation, for the basis of the Airline IT Business Object Model (BOM), i.e., to be used by several other Open Source projects, such as RMOL, Air-Sched, Travel-CCM, OpenTREP, etc.
    Downloads: 0 This Week
  • 21

    mds-utils

    General purpose utilities for C++ and Python developers

    ...++ classes that help on treating Python file objects as C++ streams. 6. a review and refactor of the indexing support in Python extensions. Now access in write mode is supported too. More details are in the Doxygen documentation. Once downloaded and uncompressed, issue the "doxygen" command from the root folder. The documentation will be generated in "doc/html". An online version of this documentation is available at the link below (mds-utils Web Site).
    Downloads: 0 This Week
  • 22
    Awesome AWS

    A curated list of awesome Amazon Web Services libraries

    A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome. Each repo listed meets at least one of the following requirements: a community-authored repo with 100+ stars, a community-vouched repo with < 100 stars, or an official repo from aws or awslabs. The 100+ stars threshold for community repos is not a strict requirement; it only serves as a guideline for the initial compilation. If you can vouch for the awesomeness...
    Downloads: 0 This Week
  • 23
    Node Crawler

    Web Crawler/Spider for NodeJS + server-side jQuery

    Most powerful, popular and production crawling/scraping package for Node, happy hacking.
    Downloads: 0 This Week
  • 24
    Poor Http

    WSGI Server, WSGI Connector, Python doc generator

    Poor Http Server is a standalone WSGI server designed for running Python web applications. Unlike other projects, it is not a framework but a single server, a light WSGI connector, and a Python doc generator.
    Downloads: 0 This Week
  • 25
    C++ Simulated Travel Distribution System
    This project aims to provide a clean API and a simple implementation, as a C++ library, of a Travel-oriented Distribution System. It corresponds to the simulated version of the real-world Computerized Reservation Systems (CRS).
    Downloads: 0 This Week