Showing 82 open source projects for "python web crawler"

  • 1
    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome Python libraries for web development. Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
    Downloads: 2 This Week
  • 2
    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps.
    Downloads: 2 This Week
  • 3
    Mercury

    Convert a Python notebook to a web app and share it with non-technical users

    Turn Python notebooks into web applications with the open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute the notebook with new parameters. The core of Mercury is open source under AGPLv3. We provide Mercury Pro with additional features, dedicated support, and a friendly commercial license.
    Downloads: 1 This Week
  • 4
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web UI. The web browser as the main tool for inspecting, running and debugging pipelines. GNU make semantics. Nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools as the main tool for interacting with databases and data. ...
    Downloads: 0 This Week
  • 5
    Astropy

    Repository for the Astropy core package

    The Astropy Project is a community effort to develop a common core package for Astronomy in Python and foster an ecosystem of interoperable astronomy packages. Astropy is a Python library for use in astronomy. Learn Astropy provides a portal to all of the Astropy educational material through a single dynamically searchable web page. It allows you to filter tutorials by keywords, search for filters, and make search queries in tutorials and documentation simultaneously. ...
    Downloads: 0 This Week
  • 6
    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 4 This Week
  • 7
    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python, Pandas, SQL, JS, Excel Formulas, etc). ...
    Downloads: 6 This Week
  • 8
    CKAN

    CKAN is an open-source DMS for powering data hubs

    CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work with data. It's a data management system that provides a powerful platform for cataloging, storing and accessing datasets with a rich front-end, full API (for both data and catalog), visualization tools and more. CKAN is used by national and regional government organizations throughout the European Union, the Americas, Asia, and Oceania to power a variety of official and community data...
    Downloads: 7 This Week
  • 9
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. ...
    Downloads: 16 This Week
  • 10
    Emerge

    Browser-based interactive codebase and dependency visualization tool

    Emerge (or emerge-viz) is an interactive code analysis tool to gather insights about source code structure, metrics, dependencies, and complexity of software projects. You can scan the source code of a project, calculate metric results and statistics, generate an interactive web app with graph structures (e.g. a dependency graph or a filesystem graph), and export the results in some file formats. Emerge currently has parsing support for the following languages: C, C++, Groovy, Java, JavaScript, TypeScript, Kotlin, ObjC, Ruby, Swift, Python, and Go. The structure, coloring, and clustering are calculated based on the idea of combining a force-directed graph simulation and Louvain modularity. Emerge is mainly written in Python 3 and is tested on macOS, Linux, and modern web browsers (i.e., the latest Safari, Chrome, Firefox, and Edge).
    Downloads: 0 This Week
  • 11
    Recap

    Recap tracks and transforms schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 0 This Week
  • 12
    ClearML

    Streamline your ML workflow

    ...It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package integrates ClearML into your existing scripts by adding just two lines of code, and optionally extends your experiments and other workflows with ClearML's powerful and versatile set of classes and methods. The ClearML Server stores experiment, model, and workflow data, and supports the Web UI experiment manager and MLOps automation for reproducibility and tuning. ...
    Downloads: 0 This Week
  • 13
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit,...
    Downloads: 4 This Week
  • 14
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    Fully open source, cloud-native, scalable, micro streaming, and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written using Streaming SQL queries via graphical and source editors, to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish...
    Downloads: 0 This Week
  • 15
    TensorBoardX

    tensorboard for pytorch (and chainer, mxnet, numpy, etc.)

    The SummaryWriter class provides a high-level API to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously. This allows a training program to call methods to add data to the file directly from the training loop, without slowing down training. TensorBoardX now supports logging directly to Comet. Comet is a free cloud-based solution that allows you to automatically track, compare and explain your experiments. It adds a...
    Downloads: 0 This Week
  • 16
    Perspective

    A data visualization and analytics component

    Perspective is a high-performance data visualization library for building real-time, interactive analytics dashboards. Developed by FINOS, it supports WebAssembly-powered pivot tables and can handle large streaming datasets with speed and flexibility. Perspective is ideal for fintech, trading, and IoT applications where insights from live data need to be visualized, sliced, and explored quickly in a browser.
    Downloads: 0 This Week
  • 17
    D-Tale

    Visualizer for pandas data structures

    D-Tale is the combination of a Flask backend and a React front-end to bring you an easy way to view & analyze Pandas data structures. It integrates seamlessly with IPython notebooks & Python/IPython terminals. Currently, this tool supports such Pandas objects as DataFrame, Series, MultiIndex, DatetimeIndex & RangeIndex. D-Tale was the product of a SAS to Python conversion. What was originally a Perl script wrapper on top of SAS's insight function is now a lightweight web client on top of Pandas data structures. To help guard against users loading the same data to D-Tale multiple times and thus eating up precious memory, we have a loose check for duplicate input data. ...
    Downloads: 0 This Week
  • 18
    PULSAR

    Distributed pub-sub messaging system

    Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project. Easy to deploy, lightweight compute process, developer-friendly APIs, no need to run your own stream processing engine. Run in production at Yahoo! scale for over 5 years, with millions of messages per second across millions of topics. Expand capacity seamlessly to hundreds of nodes. Low publish latency (< 5ms) at scale with strong...
    Downloads: 0 This Week
  • 19
    HEALPix

    Data Analysis, Simulations and Visualization on the Sphere

    Software for pixelization, hierarchical indexation, synthesis, analysis, and visualization of data on the sphere. Please acknowledge HEALPix by quoting the web page http://healpix.sourceforge.net (or https://healpix.sourceforge.io) and publication: K.M. Gorski et al., 2005, Ap.J., 622, p.759 Full software documentation available at https://healpix.sourceforge.io/documentation.php Wiki Pages: https://sourceforge.net/p/healpix/wiki/Home Exchanging Data with HEALPix (in FITS files):...
    Downloads: 489 This Week
  • 20
    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is a USB-pluggable Live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and making software tools to automate repetitive tasks. It is ready for work in Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in-depth stock market analysis, track weather trends, follow social media sentiment, or do other tasks in data science. ...
    Downloads: 16 This Week
  • 21
    text-dedup

    All-in-one text de-duplication

    ...It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 1 This Week
  • 22
    QUAST

    Quality Assessment Tool for Genome Assemblies

    QUAST performs fast and convenient quality evaluation and comparison of genome assemblies. It is maintained by the Gurevich lab at HIPS (https://helmholtz-hips.de/en/hmsb). For the most up-to-date description, please visit http://quast.sf.net. Below are just some highlights. QUAST computes several well-known metrics, including contig accuracy, the number of genes discovered, N50, and others, as well as introducing new ones, like NA50 (see details in the paper and manual). A...
    Downloads: 58 This Week
  • 23
    Euler Math Toolbox

    Numerical and Symbolic Math Tool

    Euler is a powerful all-in-one numerical software and includes Maxima for seamless symbolic computations. Euler supports LaTeX for math display, POV-Ray for photo-realistic 3D scenes, Python, Matplotlib and C for scripting, and contains a full programming language. Features include libraries for numerical algorithms, optimization, plotting in 2D and 3D, graphics export, a complete help system, tutorials and examples. Euler runs on Windows natively, or on Linux via Wine. It is completely...
    Downloads: 69 This Week
  • 24
    Autoplot

    Autoplot is an interactive browser for data on the web

    Autoplot is an interactive browser for data on the web. Give Autoplot a URL or local file name and it creates a sensible plot of the data. Autoplot allows you to interactively browse data stored in ASCII, .cdf, NetCDF, and many other formats. Autoplot's source has been moved to GitHub. Thanks to SourceForge for many years of hosting!
    Downloads: 0 This Week
  • 25
    Expert Baseball Skipper

    Software that helps you manage a baseball team (real or fantasy).

    Expert Baseball Skipper is libre (as in free software) software for managing a baseball team, whether real or fantasy. While still in its early stages, Expert Baseball Skipper is a modern take on LazyDogSoftware’s Baseball Memories (https://archive.org/details/tucows_247031_Baseball_Memories). Expert Baseball Skipper is actively being tested with the most advanced baseball simulator on the web: https://www.franchiseball.com.
    Downloads: 2 This Week