Showing 153 open source projects for "python web crawler"

  • 1
    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read...
    Downloads: 2 This Week
  • 2
    ADFILT

    Web filter lists for countless different topics

    This is the place where I, Imre Kristoffer Eilertsen, host my web filter lists for countless different topics, for use in adblock tools and the like. GitHub was in mid-2017 by far the easiest way for laymen like me to store pure text files, which is a necessity to create subscribable lists. This is a hobby project of mine, in which I work just as much on these lists and this repo as I feel like. But don't be fooled by the appearance, as these are nevertheless some lists that I've placed lots...
    Downloads: 0 This Week
  • 3
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without losing any data.
    Downloads: 0 This Week
  • 4
    Pytago

    A source-to-source transpiler for Python to Go translation

    pytago is a source-to-source transpiler that converts some Python into human-readable Go. It enables developers to translate Python codebases into Go, facilitating migration or interoperability between the two languages.
    Downloads: 1 This Week
  • 5
    I Heart LA

    Compilable markdown for linear algebra

    I Heart LA is a compilable markdown for math. It can generate working code in your favorite language (C++, Python, MATLAB, more to come) and LaTeX from snippets.
    Downloads: 1 This Week
  • 6
    Just Another Desktop Environment

    Linux desktop environment built with HTML5, CSS, JavaScript and Python

    Desktop environment built with web technologies. JDE takes over your desktop to manage applications; a dock or panel is still needed to complement it. Clean and minimalistic interface. Settings panel. Show/Hide application categories. Keyboard application search. Visual application search. Settings panel integrates with individual application settings. D-Bus integration. UI inspector. Animated Backgrounds. Drag and Drop, optional Window auto tile. Desktop Tour on first run. Scriptable workspaces.
    Downloads: 1 This Week
  • 7
    CBMPy

    PySCeS Constraint Based Modelling

    ... these into the underlying mathematical structures. CBMPy implements popular analyses such as FBA, FVA, element/charge balancing, network analysis and model editing, as well as advanced methods developed for ecosystem modelling. CBMPy supports user interaction via:
    - an interactive console, or as a library for advanced use
    - a GUI with a visual representation of the model and analysis methods
    - a SOAP-based web API that exposes high-level functionality via web services
    Downloads: 0 This Week
  • 8
    Full Stack FastAPI Couchbase

    Full stack, modern web application generator

    Full stack, modern web application generator. Using FastAPI, Couchbase as a database, Docker, automatic HTTPS, and more. Couchbase has a great set of features that is not easily or commonly found in alternatives. REST backend tests based on Pytest, integrated with Docker, so you can test the full API interaction, independent of the database. As it runs in Docker, it can build a new data store from scratch each time (so you can use ElasticSearch, MongoDB, or whatever you want, and just test...
    Downloads: 0 This Week
  • 9
    Bots open source edi translator

    Bots is a complete translator for edi: edifact, x12, xml, tradacoms

    Bots is a complete translator for EDI (Electronic Data Interchange). Supported EDI data formats include edifact, x12, tradacoms, and xml. Mail: http://groups.google.com/group/botsmail Web-site: http://bots.sourceforge.net Wiki: http://bots.readthedocs.io Develop: https://github.com/eppye-bots/bots
    Downloads: 71 This Week
  • 10
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process/thread and is replaceable. That means that when processing is slow, you can run many instances of the processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast. Since pyspider has various components, you can just run pyspider to start a standalone...
    Downloads: 1 This Week
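
    The queue-connected layout described for pyspider can be illustrated with a minimal sketch. This is not pyspider's code or API: `fetcher`, `processor`, and `run_pipeline` are hypothetical names, the "pages" are fake strings rather than real HTTP responses, and threads with `queue.Queue` stand in for whatever processes and message queue a real deployment would use.

```python
import queue
import threading


def fetcher(urls, fetched):
    """Hypothetical fetcher: puts (url, fake page body) pairs on the queue."""
    for url in urls:
        fetched.put((url, f"<html>{url}</html>"))


def processor(fetched, results):
    """Hypothetical processor: consumes pages until it receives a None sentinel."""
    while True:
        item = fetched.get()
        if item is None:
            break
        url, body = item
        # Stand-in for real parsing work: record the page size.
        results.put((url, len(body)))


def run_pipeline(urls, n_processors=3):
    """Run one fetcher and several processor instances joined by queues."""
    fetched = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=processor, args=(fetched, results))
        for _ in range(n_processors)
    ]
    for worker in workers:
        worker.start()
    fetcher(urls, fetched)
    # One sentinel per processor so every worker shuts down.
    for _ in workers:
        fetched.put(None)
    for worker in workers:
        worker.join()
    collected = {}
    while not results.empty():
        url, size = results.get()
        collected[url] = size
    return collected


if __name__ == "__main__":
    print(run_pipeline(["http://example.com/a", "http://example.com/b"]))
```

    Because the processors only share the queues, adding more of them is a matter of raising `n_processors`, which mirrors the scaling idea in the description above.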
  • 11
    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can set up a multi-threaded web crawler in a few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. The shouldVisit function decides whether the given URL should be crawled or not. The project's example disallows .css, .js and media files and only allows pages within...
    Downloads: 1 This Week
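
    The filtering rule described for shouldVisit can be sketched in Python to match the search topic. This is an analogue, not crawler4j's Java API: `should_visit`, `BLOCKED_EXTENSIONS`, and the `allowed_prefix` parameter are assumptions made for illustration, and the extension list is only a guess at what "media files" might cover.

```python
from urllib.parse import urlparse

# Assumed list of extensions to skip; the real project may use a different set.
BLOCKED_EXTENSIONS = (
    ".css", ".js", ".gif", ".jpg", ".png", ".mp3", ".mp4", ".zip", ".gz",
)


def should_visit(url, allowed_prefix):
    """Return True if the URL is under allowed_prefix and is not a blocked file type."""
    lowered = url.lower()
    # Check the path only, so query strings such as ?v=2 do not hide the extension.
    path = urlparse(lowered).path
    if path.endswith(BLOCKED_EXTENSIONS):
        return False
    return lowered.startswith(allowed_prefix.lower())


if __name__ == "__main__":
    prefix = "https://www.example.com/"  # hypothetical site to stay within
    for candidate in (
        "https://www.example.com/docs/page.html",
        "https://www.example.com/static/site.css",
        "https://other.example.org/page.html",
    ):
        print(candidate, should_visit(candidate, prefix))
```

    A crawler loop would call such a predicate on every discovered link before queueing it for download.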
  • 12

    survol

    RDF-based framework monitoring business systems activity

    A Python agent and a web interface aiming to help the analysis and investigation of a legacy application: a set of machines, processes, databases, programs, etc., all communicating with each other, manipulating your data, and whose software architecture has become, with time, complicated, difficult to understand, and undocumented. Data are aggregated with an RDF inference engine, creating a global vision of the business information processing.
    Downloads: 0 This Week
  • 13
    Extended Memory Semantics (EMS)

    Persistent shared object memory and parallelism for Node.js and Python

    EMS makes possible persistent shared memory parallelism between Node.js, Python, and C/C++. Extended Memory Semantics (EMS) unifies synchronization and storage primitives to address several challenges of parallel programming. A modern multi-core server has 16-32 cores and nearly 1TB of memory, equivalent to an entire rack of systems from a few years ago. As a consequence, jobs formerly requiring a Map-Reduce cluster can now be performed entirely in shared memory on a single server without using...
    Downloads: 0 This Week
  • 14

    redfish-lab

    Get started with the Redfish RESTful API from the DMTF

    Redfish-lab allows a smooth ramp-up with the Redfish RESTful API on an HPE ProLiant server, including UEFI/BIOS configuration with various scripting languages like PowerShell and Python. Small tutorials/articles are also proposed in the Wiki section.
    Downloads: 0 This Week
  • 15
    Linux Dash

    A beautiful web dashboard for Linux

    Using the 3 bar hamburger icon on each module (widget) on Linux Dash, you can drag and re-arrange them. It is strongly recommended that all linux-dash installations be protected via a security measure of your choice. All of your changes are saved in LocalStorage, so your changes will be preserved permanently on that browser for convenience. Each of the modules (widgets) on the Linux Dash screen can be minimized to hide it, expanded in one click to maximize it, adjusted to a custom width....
    Downloads: 0 This Week
  • 16
    PyXB (“pixbee”) is a pure Python package that generates Python source code for classes that correspond to data structures defined by XMLSchema. In concept it is similar to JAXB for Java and CodeSynthesis XSD for C++.
    Downloads: 20 This Week
  • 17
    Note: the latest version can be found at https://github.com/targeted/pythomnic3k. Pythomnic3k is a Python 3 framework for service-oriented middleware with hot reloading and fault tolerance. It is used for integrating various systems in an enterprise network or writing standalone network services.
    Downloads: 0 This Week
  • 18
    Smart-M3 is a functional platform that provides a cross domain search extent for triple based information. Smart-M3 enables smart cross domain applications that rely on information level interoperability.
    Downloads: 1 This Week
  • 19
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript, and suitable for use in any modern web browser, and any page on your web site. Current version is 4.0-2016-08-29
    Downloads: 7 This Week
  • 20
    Cloud Export is a tool to automatically extract your data from web applications and save it to your local file system for backup purposes, but more extensive than Google Takeout. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
  • 21
    QAL

    Query Abstraction Layer

    Project has moved to: https://github.com/OptimalBPM/qal. QAL is a collection of libraries for mining, transforming and writing data from and to a number of places. Sources and destinations include different SQL and NoSQL backends and file formats like .csv, XML and Excel, even untidy HTML web pages. It has a database abstraction layer that supports connectivity to Postgres, MySQL, DB2, Oracle, and MS SQL Server. JSON and MongoDB are coming. It uses XML/JSON formats (self-generated SQL schemas...
    Downloads: 0 This Week
  • 22
    A project with all the bells and whistles to let the average user fully benefit from HTTP, DNS, FTP, and SSH through Python, allowing quick and easy deployment of servers without compiling or installing anything but our favorite language.
    Downloads: 0 This Week
  • 23

    Maximo SOAP Web Service Tester

    Simple application for testing XML Web Services in Maximo

    Maximo SOAP WebService Tester (Windows) Source: https://github.com/SVSagi/mxwst
    Downloads: 0 This Week
  • 24
    Another plain-text format, aimed at easy editing of outlines.
    Downloads: 0 This Week
  • 25
    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/. The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets, written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week