Showing 68 open source projects for "python web crawler"

  • 1
    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome python libraries for web development. Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
    Downloads: 2 This Week
  • 2
    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps.
    Downloads: 2 This Week
  • 3
    Mara Pipelines

    A lightweight opinionated ETL framework, halfway between plain scripts

    This package contains a lightweight data transformation framework with a focus on transparency and complexity reduction. Data integration pipelines as code: pipelines, tasks and commands are created using declarative Python code. PostgreSQL as a data processing engine. Extensive web UI: the web browser is the main tool for inspecting, running and debugging pipelines. GNU make semantics: nodes depend on the completion of upstream nodes. No data dependencies or data flows. No in-app data processing: command line tools are the main tool for interacting with databases and data. ...
    Downloads: 0 This Week
  • 4
    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 4 This Week
  • 5
    Quadratic

    Data science spreadsheet with Python & SQL

    Quadratic enables your team to work together on data analysis to deliver better results, faster. You already know how to use a spreadsheet, but you’ve never had this much power before. Quadratic is a Web-based spreadsheet application that runs in the browser and as a native app (via Electron). Our goal is to build a spreadsheet that enables you to pull your data from its source (SaaS, Database, CSV, API, etc) and then work with that data using the most popular data science tools today (Python, Pandas, SQL, JS, Excel Formulas, etc). ...
    Downloads: 7 This Week
  • 6
    CKAN

    CKAN is an open-source DMS for powering data hubs

    CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work with data. It's a data management system that provides a powerful platform for cataloging, storing and accessing datasets with a rich front-end, full API (for both data and catalog), visualization tools and more. CKAN is used by national and regional government organizations throughout the European Union, the Americas, Asia, and Oceania to power a variety of official and community data...
    Downloads: 7 This Week
  • 7
    Recap

    Recap tracks and transforms schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 1 This Week
  • 8
    RStudio

    RStudio is an integrated development environment (IDE) for R

    RStudio is a powerful, full-featured integrated development environment (IDE) tailored primarily for the R programming language but increasingly supportive of other languages like Python and Julia. It brings together console, editor, plotting, workspace, history, and file-management panes into a unified interface, helping data scientists, statisticians, and analysts to work more productively. The IDE is cross-platform: there are desktop versions for Windows, macOS and Linux, as well as a server version for remote or multi-user deployment via a web browser. ...
    Downloads: 15 This Week
  • 9
    ClearML

    Streamline your ML workflow

    ...It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code and automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package integrates ClearML into your existing scripts by adding just two lines of code, and optionally extends your experiments and other workflows with ClearML's powerful and versatile set of classes and methods. The ClearML Server stores experiment, model, and workflow data, and supports the Web UI experiment manager and MLOps automation for reproducibility and tuning. ...
    Downloads: 0 This Week
  • 10
    Perspective

    A data visualization and analytics component

    Perspective is a high-performance data visualization library for building real-time, interactive analytics dashboards. Developed by FINOS, it supports WebAssembly-powered pivot tables and can handle large streaming datasets with speed and flexibility. Perspective is ideal for fintech, trading, and IoT applications where insights from live data need to be visualized, sliced, and explored quickly in a browser.
    Downloads: 0 This Week
  • 11
    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    Fully open source, cloud-native, scalable, micro streaming, and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written using Streaming SQL queries via graphical and source editors, to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish...
    Downloads: 0 This Week
  • 12
    Synapse Machine Learning

    Simple and distributed Machine Learning

    SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. SynapseML builds on Apache Spark and SparkML to enable new kinds of machine learning, analytics, and model deployment workflows. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Open Neural Network Exchange (ONNX), LightGBM, The Cognitive Services, Vowpal Wabbit,...
    Downloads: 0 This Week
  • 13
    HEALPix

    Data Analysis, Simulations and Visualization on the Sphere

    Software for pixelization, hierarchical indexation, synthesis, analysis, and visualization of data on the sphere. Please acknowledge HEALPix by quoting the web page http://healpix.sourceforge.net (or https://healpix.sourceforge.io) and publication: K.M. Gorski et al., 2005, Ap.J., 622, p.759 Full software documentation available at https://healpix.sourceforge.io/documentation.php Wiki Pages: https://sourceforge.net/p/healpix/wiki/Home Exchanging Data with HEALPix (in FITS files):...
    Downloads: 520 This Week
  • 14
    text-dedup

    All-in-one text de-duplication

    ...It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 1 This Week
  • 15
    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is a USB-pluggable live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and building software tools to automate repetitive tasks. It is ready for work in the Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in-depth stock market analysis, track weather trends, follow social media sentiment, or do other tasks in data science. ...
    Downloads: 14 This Week
  • 16
    QUAST

    Quality Assessment Tool for Genome Assemblies

    QUAST performs fast and convenient quality evaluation and comparison of genome assemblies. It is maintained by the Gurevich lab at HIPS (https://helmholtz-hips.de/en/hmsb). For the most up-to-date description, please visit http://quast.sf.net. Below are just some highlights. QUAST computes several well-known metrics, including contig accuracy, the number of genes discovered, N50, and others, as well as introducing new ones, like NA50 (see details in the paper and manual). A...
    Downloads: 68 This Week
  • 17
    Euler Math Toolbox

    Numerical and Symbolic Math Tool

    Euler is a powerful all-in-one numerical software and includes Maxima for seamless symbolic computations. Euler supports LaTeX for math display, POV-Ray for photo-realistic 3D scenes, Python, Matplotlib and C for scripting, and contains a full programming language. Features include libraries for numerical algorithms, optimization, plotting in 2D and 3D, graphics export, a complete help system, tutorials and examples. Euler runs in Windows natively, or in Linux via Wine. It is completely...
    Downloads: 80 This Week
  • 18
    Autoplot

    Autoplot is an interactive browser for data on the web

    Autoplot is an interactive browser for data on the web. Give Autoplot a URL or local file name and it creates a sensible plot of the data. Autoplot allows you to interactively browse data stored in ASCII, .cdf, netCDF, and many other formats. Autoplot's source has been moved to GitHub. Thanks to SourceForge for many years of hosting!
    Downloads: 0 This Week
  • 19
    Quick 2d Plot

    Program for live 2d graphical representation of data streams

    Quick2dPlot, or q2d for short, is an open source minimalistic plotting program designed for live 2d graphical representation of data streams. The program may be useful for plotting the output of a user's application programs, especially when the user wants to see one or more plots during calculations or a data acquisition process. The program is command-driven and uses no widgets. Q2d is written in C and uses the SDL2 library for plotting. Currently...
    Downloads: 0 This Week
  • 20
    Wooey

    A Django app that creates automatic web UIs for Python scripts

    Wooey is a simple web interface to run command line Python scripts. Think of it as an easy way to get your scripts up on the web for routine data analysis, file processing, or anything else. The project was inspired by how simply and powerfully sandman could expose users to a database and by how Gooey turns ArgumentParser-based command-line scripts into WxWidgets GUIs.
    Downloads: 0 This Week
  • 21
    rocket-bi

    An open-source web-based self-service BI for analytical databases

    Rocket.BI is a free, open-source, web-based business intelligence solution specifically designed for analytical databases. It enables data analysts and business users alike to easily integrate different data sources, perform advanced data analysis, ad hoc, and more. With an easy-to-use editor, you can create personalized reports, build interactive business dashboards and generate actionable business insights. Rocket.BI also allows collaboration as working together with other people in the...
    Downloads: 0 This Week
  • 22
    Bloxs

    Build dashboards in Jupyter Notebook with numeric and chart boxes

    Bloxs is a simple Python package that helps you display information in an attractive way (formed in blocks). Perfect for building dashboards, reports and apps in the notebook.
    Downloads: 0 This Week
  • 23
    ML workspace

    All-in-one web-based IDE specialized for machine learning

    All-in-one web-based development environment for machine learning. The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. This workspace is the ultimate tool for developers, preloaded with a variety of popular data science libraries (e.g., TensorFlow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, Tensorboard)...
    Downloads: 0 This Week
  • 24
    TransPose

    PyTorch Implementation for "TransPose, Keypoint localization

    TransPose is a human pose estimation model based on a CNN feature extractor, a Transformer Encoder, and a prediction head. Given an image, the attention layers built into the Transformer can efficiently capture long-range spatial relationships between keypoints and explain which dependencies the predicted keypoint locations rely on most.
    Downloads: 1 This Week
  • 25
    Jupytab

    Display in Tableau data from Jupyter notebooks

    Jupytab allows you to explore, in Tableau, data that is generated dynamically by a Jupyter Notebook. You can thus create Tableau data sources in a very flexible way using all the power of Python. This is achieved by having Tableau access data through a web server created by Jupytab.
    Downloads: 0 This Week