Showing 109 open source projects for "python web crawler"

  • 1
    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a simple but scalable crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence. It simplifies the development of a specific crawler, with a simple, highly flexible core and a simple API for HTML extraction. It also provides POJO annotations to customize a crawler, and no configuration is needed. Some other features...
    Downloads: 1 This Week
  • 2
    QR Code generator library

    High-quality QR Code generator library in Java, TypeScript/JavaScript

    ... to TypeScript, Python, Rust, C++, and C. It is open source under the MIT License. For each language, the codebase is roughly 1000 lines of code and has no dependencies other than the respective language’s standard library.
    Downloads: 18 This Week
  • 3
    Heritrix

    Internet Archive's open-source, web-scale, web crawler project

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or missaid as heratrix/heritix/heretix/heratix) is an archaic word for heiress (woman who inherits). Since our crawler seeks to collect and preserve the digital artifacts of our culture for the benefit of future researchers and generations, this name seemed apt. Heritrix is designed to respect the robots.txt exclusion directives...
    Downloads: 2 This Week
  • 4
    Odigos

    Distributed tracing without code changes

    Odigos supports any application written in Java, Python, .NET, Node.js and Go. Historically, compiled languages like Go have been difficult to instrument without code changes. Odigos solves this problem by uniquely leveraging eBPF. Odigos currently supports all the popular managed and open source destinations. By producing data in the OpenTelemetry format, Odigos can be used with any observability tool that supports OTLP. Odigos automatically scales OpenTelemetry collectors based...
    Downloads: 2 This Week
  • 5
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    ... to various endpoints in real time. Agile development experience with SQL-like query language and graphical drag-and-drop editor supporting event simulation. Lightweight runtime that can natively run on Kubernetes, Docker, VM, or bare metal, and embedded in any Java or Python application. Scalable, and highly available distributed event processing on Kubernetes, with NATS Streaming and Siddhi Kubernetes Operator.
    Downloads: 0 This Week
  • 6
    Framework Benchmarks

    Source for the TechEmpower Framework Benchmarks project

    If you're new to the project, welcome! Please feel free to ask questions here. We encourage new frameworks and contributors to ask questions. We're here to help! This project provides representative performance measures across a wide field of web application frameworks. With much help from the community, coverage is quite broad and we are happy to broaden it further with contributions. The project presently includes frameworks in many languages including Go, Python, Java, Ruby, PHP, C#, F...
    Downloads: 0 This Week
  • 7
    ZK - Simply Ajax and Mobile
    ZK is an open-source Java framework for building modern web and mobile applications. It enables developers to create rich, interactive UIs using only Java — no JavaScript required. With 200+ Ajax-powered components, event-driven architecture, and support for popular technologies like Spring, Java EE, and JSP/JSF, ZK makes it simple to deliver powerful and user-friendly web applications.
    Downloads: 39 This Week
  • 8
    EulerSharp

    Euler Yet another proof Engine

    EYE [1] is a reasoning engine supporting the Semantic Web layers [2]. It performs controlled chaining and it supports Euler paths [3]. Via N3 [4] it is interoperable with Cwm [5]. [1] http://eulersharp.sourceforge.net/README [2] http://www.w3.org/DesignIssues/diagrams/sweb-stack/2006a [3] http://mathworld.wolfram.com/KoenigsbergBridgeProblem.html [4] http://www.w3.org/TeamSubmission/n3/ [5] http://www.w3.org/2000/10/swap/doc/cwm
    Downloads: 4 This Week
  • 9
    NanoH5 (tsl2nano)

    java bean / database driven zero code application framework

    NanoH5 (or FullRelation) is a fullstack UI implementation framework providing a model driven design (MDA). Build a complete html5 application through a given class- or database-model without coding (coding APIs are available).
    Downloads: 1 This Week
  • 10
    lixa

    LIXA, LIbre XA, is a free and open source XA transaction manager

    ... technology enables every application container, like a web server or a shell, to become a two phase commit application server. The client/server architecture of LIXA allows many application containers to share a single LIXA (state) server: this is ideal when horizontal scalability is a must and many identical application containers must refer to a single transactional environment. LIXA can be used with the C, C++, Java, Python and COBOL programming languages.
    Downloads: 0 This Week
  • 11
    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to it. However, it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx uses a Makefile build system. Java- and Python-based applications/tools are also included to ease development and debugging on the host PC. On one side, eCxx supports the original...
    Downloads: 0 This Week
  • 12
    The OMLX project is a place where many projects are prepared to become open source projects.
    Downloads: 179 This Week
  • 13
    Zebrunner Community Edition

    Test Automation Management Tool

    Zebrunner CE (Community Edition) is a Test Automation Management Tool for continuous testing and continuous deployment. It allows you to run various kinds of tests and gain successive levels of confidence in the code quality. Zebrunner CE is integrated by default with Carina open-source TestNG framework and uses Jenkins as a CI Tool. It is built on top of popular docker solutions and includes Postgres database, Zebrunner Reporting, Jenkins Master/Slaves Nodes, Selenium Hub, Mobile...
    Downloads: 0 This Week
  • 14
    GUIDOLib
    GUIDOLib provides a powerful engine for the graphic rendering of music scores, based on the Guido Music Notation format. It supports the Linux, Mac OS X, Windows, Android and iOS operating systems. A Java JNI interface is available, as well as a JavaScript version of the library. A Web API has also been designed, allowing the engine to be deployed as a Web service.
    Downloads: 3 This Week
  • 15

    Easy Web automation library

    This library is designed to work with Selenium for web automation. It wraps Selenium functions and handles Selenium exceptions, and it uses the Selenium library for web interfaces.
    Downloads: 0 This Week
  • 16
    Mapbox Maps SDK for React Native

    A Mapbox GL react native module for creating custom maps

    Mapbox is the location data platform for mobile and web applications. We provide building blocks to add location features like maps, search, and navigation into any experience you create. Use our simple and powerful APIs & SDKs and our open-source libraries for interactivity and control. Once you’re signed in, all you need to start building is a Mapbox access token. Use this same short code with all of our interactive mapping libraries, Python and JavaScript SDKs, and directly against our REST...
    Downloads: 0 This Week
  • 17
    phoneutria
    A Java Web crawler: multi-threaded, scalable, high-performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through an XML configuration file.
    Downloads: 0 This Week
  • 18
    SIGAR (System Information Gatherer and Reporter) is a cross-platform, cross-language library and command-line tool for accessing operating system and hardware level information in Java, Perl and .NET.
    Downloads: 25 This Week
  • 19

    Platform Course

    Open-source framework for creating business web applications

    Platform Course 5.0.0 is an open-source framework for easy development of original solutions for unique business processes. Main advantages: cross-domain auth, LDAP integration, cross-browser GWT-based UI, big table handling, charts and geo maps, input forms on XForms. Tested with MSSQL, PostgreSQL and Oracle. The current SVN repository for this product is located at https://share.curs.ru/svn/showcase/branches/stable/ Login: reader Password: reader If you wish to know further...
    Downloads: 0 This Week
  • 20
    COAR-DMS

    DMS for Linux: C++ library, server, web UI, SOAP

    ... security (rwx), special authorities - from thousands to tens of billions of documents - dashboard (working copies, new documents) - electronic signatures - search statements with SQL-like syntax - multithreaded, multiprocess library. Servers: - native HTTP server (libmicrohttp) - SOAP server - WebDAV (planned) - Indexer Python API WebUI GWT, JSP, SOAP-API
    Downloads: 0 This Week
  • 21
    Java-based framework for decoupling back-end services and front-end interfaces. Browse and interact with a database, a class library, a network, a log file, or any live Java object as though it were a filesystem. (It works with filesystems too!)
    Downloads: 2 This Week
  • 22
    OVNI

    Open Virtualization Nodes Infrastructure

    OVNI is, first of all, an AJAX web application for creating and managing virtual machines on KVM nodes. It is developed with WaveMaker and relies on Libvirt to be compatible with other tools such as virsh. In the future, the project aims to provide a complete virtualization environment.
    Downloads: 0 This Week
  • 23
    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
  • 24
    The goal of this project is to make it possible to access a Progress database from any external program that can use sockets. The server (broker and agents) is written in Progress 4GL and makes use of the socket capabilities of Progress V9.
    Downloads: 0 This Week
  • 25
    Cyclone - Task Automation

    Task Scheduler for Java, Groovy, JavaScript, Python & Ruby

    Cyclone is a Java-based, rich-GUI web tool for automating tasks with ease. Cyclone is a project that was born out of necessity. Agreed, there are many native tools that allow tasks to be scheduled or automated on most operating systems; however, Cyclone is more than just an automation tool. Many programming languages have batch modules and libraries for scheduling jobs, but they need quite a few enhancements to provide a feature set similar to Cyclone's. Cyclone comes with a number...
    Downloads: 0 This Week