Showing 3125 open source projects for "data"

  • 1
    CakeChat

    CakeChat: Emotional Generative Dialog System

    CakeChat is a backend for chatbots that can express emotions in conversation. The code is flexible and lets you condition the model's responses on an arbitrary categorical variable. For example, you can train your own persona-based neural conversational model or create an emotional chatting machine. It uses a Hierarchical Recurrent Encoder-Decoder (HRED) architecture for handling deep dialog context and a multilayer RNN with GRU cells. The first layer of the utterance-level encoder is always...
    Downloads: 0 This Week
  • 2
    Active Learning

    Framework and examples for active learning with machine learning models

    ...It provides modular tools for running reproducible experiments across different datasets, sampling strategies, and machine learning models. The system allows researchers to study how models can improve labeling efficiency by selectively querying the most informative data points rather than relying on uniformly sampled training sets. The main experiment runner (run_experiment.py) supports a wide range of configurations, including batch sizes, dataset subsets, model selection, and data preprocessing options. It includes several established active learning strategies such as uncertainty sampling, k-center greedy selection, and bandit-based methods, while also allowing for custom algorithm implementations. ...
    Downloads: 2 This Week
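
The uncertainty sampling strategy mentioned in the Active Learning entry can be illustrated with a minimal sketch. This is not code from the project: the function name `least_confident` and the toy probability pool are assumptions made for illustration, and the "least confident" criterion is only one common variant of uncertainty sampling.

```python
def least_confident(probabilities, batch_size):
    """Return indices of the samples whose top class probability is lowest.

    `probabilities` is a list of per-sample class-probability lists
    (hypothetical model outputs, not produced by the project itself).
    """
    # Confidence of a sample is the probability of its most likely class.
    confidence = [max(p) for p in probabilities]
    # Rank sample indices by ascending confidence: least confident first.
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Toy unlabeled pool: four samples, two classes.
pool = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

# Query the two samples the model is least sure about.
selected = least_confident(pool, batch_size=2)
print(selected)
```

The selected indices would then be sent for labeling instead of a uniformly sampled batch.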
  • 3
    IDEA (Text Data Visualizer)

    Text Data Visualizer with Django

    It is hard for non-developers to visualize data, but IDEA makes it easy. To test Project: IDEA locally in your environment, you need mecab-ko and mecab-ko-dic. If you have data you want to visualize, just put it in IDEA and click the Visualization button!
    Downloads: 0 This Week
  • 4
    data-science-ipython-notebooks

    Data science Python notebooks: Deep learning

    Data Science IPython Notebooks is a broad, curated set of Jupyter notebooks covering Python, data wrangling, visualization, machine learning, deep learning, and big data tools. It aims to be a practical map of the ecosystem, showing hands-on examples with libraries such as NumPy, pandas, matplotlib, scikit-learn, and others. Many notebooks introduce concepts step by step, then apply them to real datasets so readers can see techniques in action.
    Downloads: 1 This Week
  • 5
    automl-gs

    Provide an input CSV and a target field to predict, generate a model

    Give automl-gs an input CSV file and a target field you want to predict, and get a trained, high-performing machine learning or deep learning model plus native Python code pipelines that let you integrate that model into any prediction workflow. No black box: you can see exactly how the data is processed and how the model is constructed, and you can make tweaks as necessary. automl-gs is an AutoML tool which, unlike Microsoft's NNI, Uber's Ludwig, and TPOT, offers a zero code/model definition interface for getting an optimized model and data transformation pipeline in multiple popular ML/DL frameworks, with minimal Python dependencies (pandas + scikit-learn + your framework of choice). automl-gs is designed for citizen data scientists and engineers without a deep statistical background, under the philosophy that you don't need to know any modern data preprocessing and machine learning engineering techniques to create a powerful prediction workflow.
    Downloads: 1 This Week
  • 6
    MUSE

    A library for Multilingual Unsupervised or Supervised word Embeddings

    MUSE is a framework for learning multilingual word embeddings that live in a shared space, enabling bilingual lexicon induction, cross-lingual retrieval, and zero-shot transfer. It supports both supervised alignment with seed dictionaries and unsupervised alignment that starts without parallel data by using adversarial initialization followed by Procrustes refinement. The code can align pre-trained monolingual embeddings (such as fastText) across dozens of languages and provides standardized evaluation scripts and dictionaries. By mapping languages into a common vector space, MUSE makes it straightforward to build cross-lingual applications where resources are scarce for some languages. ...
    Downloads: 0 This Week
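
The Procrustes refinement step named in the MUSE entry can be sketched in a heavily simplified form. The example below is an assumption-laden toy, not MUSE's code: it works on 2-D points instead of word embeddings, restricts the mapping to a pure rotation (the general orthogonal Procrustes problem is normally solved with an SVD), and the names `fit_rotation` and `rotate` are hypothetical.

```python
import math


def fit_rotation(source, target):
    """Angle of the rotation that best maps `source` onto `target`
    in the least-squares sense (2-D, rotation-only Procrustes)."""
    # Sum of dot products and of 2-D cross products over paired points.
    s_dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    s_cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    # The squared alignment error is minimized at this angle.
    return math.atan2(s_cross, s_dot)


def rotate(points, angle):
    """Rotate 2-D points counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Toy "seed dictionary": paired points in two spaces that differ by a rotation.
source = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
target = rotate(source, 0.5)

angle = fit_rotation(source, target)
aligned = rotate(source, angle)
print(round(angle, 6))
```

In this toy setting the recovered angle matches the rotation used to build `target`, so `aligned` coincides with `target` up to floating-point error.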
  • 7
    Django REST Pandas

    Serves up Pandas dataframes via the Django REST Framework

    ...The resulting API can serve up CSV (and a number of other formats) for consumption by a client-side visualization tool like d3.js. The design philosophy of DRP enforces a strict separation between data and presentation. This keeps the implementation simple, but also has the nice side effect of making it trivial to provide the source data for your visualizations. This capability can often be leveraged by sending users to the same URL that your visualization code uses internally to load the data. While DRP is primarily a data API, it also provides a default collection of interactive visualizations through the @wq/chart library, and a @wq/pandas loader to facilitate custom JavaScript charts that work well with CSV output served by DRP. ...
    Downloads: 0 This Week
  • 8
    pyfolio

    Portfolio and risk analytics in Python

    ...Here's an example of a simple tear sheet analyzing a strategy. Quantopian also offers a fully managed service for professionals that includes Zipline, Alphalens, Pyfolio, FactSet data, and more.
    Downloads: 0 This Week
  • 9
    LaueTools

    Open-source Python packages for X-ray MicroLaue diffraction analysis

    LaueTools is an open-source project for white-beam Laue X-ray microdiffraction data analysis, including tools for image processing, peak searching and indexing, crystal structure solving (orientation and strain), and data and grain mapping visualisation. Python 3 code and new features are now at: https://gitlab.esrf.fr/micha/lauetools
    Downloads: 0 This Week
  • 10
    django-rest-auth

    This app makes it extremely easy to build Django-powered SPAs

    This app makes it extremely easy to build Django-powered SPAs (single-page apps) or mobile apps by exposing all registration and authentication-related functionality as CBVs (class-based views) and REST (JSON). Tivix rebuilt a NATO software system to organize and coordinate rescue missions for submarines in distress across the globe. The United Nations Partner Portal (UNPP) is a web application built for a group of UN agencies to simplify their business processes and streamline collaboration...
    Downloads: 0 This Week
  • 11
    pyFileSearcher

    Simple search tool for big file servers

    pyFileSearcher was designed to be lightweight and easy to use, yet capable of handling a large volume of files: a tool that I personally could use on large corporate servers to find out which files have taken up all my space in the last few days. It's free, it's open source, and it runs on Linux and Windows. The program is written in Python 3 using Qt5.
    Downloads: 0 This Week
  • 12
    PyMOL Molecular Graphics System

    PyMOL is an OpenGL-based molecular visualization system

    The open-source PyMOL repository has moved to GitHub: https://github.com/schrodinger/pymol-open-source. We still use the pymol-users mailing list here on SourceForge. Please subscribe for community support: https://pymol.org/maillist (note: the SourceForge email newsletter and special offers are optional and can be unchecked). The PyMOL community wiki has its own home: https://pymolwiki.org/
    Downloads: 52 This Week
  • 13
    Invenio

    Invenio digital library framework

    Invenio is a highly customizable open-source framework for building large-scale digital repositories and research data platforms. Developed by CERN, it is designed to manage, index, and provide access to metadata-rich content such as publications, datasets, and multimedia files. Invenio provides a modular architecture, making it suitable for libraries, archives, and research institutions.
    Downloads: 0 This Week
  • 14
    Zabbix-in-Telegram

    Zabbix Notifications with graphs in Telegram
    Downloads: 1 This Week
  • 15
    Scalable Distributed Deep-RL

    A TensorFlow implementation of Scalable Distributed Deep-RL

    Scalable Agent is the open implementation of IMPALA (Importance Weighted Actor-Learner Architectures), a highly scalable distributed reinforcement learning framework developed by Google DeepMind. IMPALA introduced a new paradigm for efficiently training agents across large-scale environments by decoupling acting and learning processes. In this architecture, multiple actor processes interact with their environments in parallel to collect trajectories, which are then asynchronously sent to a...
    Downloads: 2 This Week
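
The decoupling of acting and learning described in the Scalable Distributed Deep-RL entry can be sketched with a thread-safe queue. This is a toy illustration under stated assumptions, not the project's TensorFlow code: the `actor` and `learner` functions are hypothetical, the "environment" is a stand-in that emits (actor_id, step) pairs, and no importance weighting or gradient update is performed.

```python
import queue
import threading


def actor(actor_id, steps, out_queue):
    """Collect one trajectory and hand it off without waiting for the learner."""
    # Stand-in for environment interaction: one (actor_id, step) pair per step.
    trajectory = [(actor_id, t) for t in range(steps)]
    out_queue.put(trajectory)


def learner(num_trajectories, in_queue):
    """Consume trajectories in whatever order the actors deliver them."""
    consumed = []
    for _ in range(num_trajectories):
        # Blocks until some actor has produced a trajectory.
        consumed.append(in_queue.get())
    return consumed


trajectories = queue.Queue()

# Four actors run in parallel, each producing a 3-step trajectory.
actors = [
    threading.Thread(target=actor, args=(i, 3, trajectories))
    for i in range(4)
]
for thread in actors:
    thread.start()

batch = learner(len(actors), trajectories)

for thread in actors:
    thread.join()

total_steps = sum(len(trajectory) for trajectory in batch)
print(total_steps)
```

The queue is the only coupling point: actors never wait on the learner's progress, and the learner processes trajectories in arrival order rather than in actor order.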
  • 16
    lazynlp

    Library to scrape and clean web pages to create massive datasets

    LazyNLP is a lightweight tool for collecting and curating large-scale text datasets for machine learning and NLP applications with minimal manual effort.
    Downloads: 0 This Week
  • 17
    PDF Merge and Edit

    Python script to merge and edit sensitive PDF files

    ...Update a single page in a PDF (good for adding a signed page to a form). Insert a page into an existing PDF. Delete a page. Click one of the buttons and a new window will pop up, depending on the function. Pick your files and enter the data. If there are no problems, a confirmation will pop up.
    Downloads: 2 This Week
  • 18
    ShadowSocksShare

    Python ShadowSocks framework

    This project obtains shared ss(r) accounts by crawling ss(r) sharing websites, parses the accounts and verifies their connectivity, then redistributes them and generates a subscription link. Almost all of the available accounts crawled so far come from Google Plus, which will be closed on April 2, 2019. So if you are building your own website, please keep an eye on the updates of this project and redeploy using the latest source code.
    Downloads: 0 This Week
  • 19
    Pipelines

    An experimental programming language for data flow

    Pipelines is a language and runtime for crafting massively parallel pipelines. Unlike other languages for defining data flow, the Pipeline language requires the implementation of components to be defined separately in the Python scripting language. This allows the details of implementations to be separated from the structure of the pipeline while providing access to thousands of active libraries for machine learning, data analysis, and processing.
    Downloads: 0 This Week
  • 20
    CMD Plot Tool

    Calculates and plots Colour Magnitude Diagrams from astronomical data

    ...It works “out of the box” and does not require installing development environments or additional libraries, or resetting system paths. The tool is available as a single application/executable file, with the source code, on SourceForge. Sample data is also bundled to demonstrate its complete functionality to users. The application can also convert DAOPHOT magnitude files to CSV format.
    Downloads: 0 This Week
  • 21
    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staffs, polyphony, MIDI playback of notes, chord markings, lyrics, and import/export filters for formats like MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond.
    Downloads: 11 This Week
  • 22
    Requests-HTML

    Pythonic HTML Parsing for Humans

    ...The rest of the code operates the same way as the synchronous version, except that results is a list containing multiple response objects; the same basic process can be applied as above to extract the data you want.
    Downloads: 3 This Week
  • 23
    GeoNotebook

    A Jupyter notebook extension for geospatial visualization and analysis

    GeoNotebook is an open-source extension to the Jupyter Notebook ecosystem that equips users with powerful geospatial visualization and analysis capabilities directly within the notebook interface. It integrates with GeoJS and other geospatial services to enable rich, interactive map rendering, layer control, and GIS data manipulation alongside traditional code and markdown cells in a Jupyter environment. Users can execute Python geospatial analysis and immediately visualize results on slippy web maps, allowing them to explore, annotate, and interpret large spatial datasets without leaving the notebook. GeoNotebook bridges the gap between data science workflows and GIS exploration by combining the flexibility of interactive notebooks with browser-based map display driven by a Python backend and WebGL/Canvas tools. ...
    Downloads: 7 This Week
  • 24
    ArtnetScanner

    Network Analyzer for lighting protocols

    Beta package released! Network analyzer for lighting protocols like Art-Net, ACN, MA-Net and Compu-Net. It displays DMX data and discovers connected desks.
    Downloads: 2 This Week
  • 25
    Trigger Happy

    Automate the exchange of data between applications

    Automate the exchange of data between the applications and services you use on the web. Make Twitter talk to Mastodon, make GitHub talk to Mattermost, store your favorite tweets by creating notes in Evernote, follow RSS feeds and post each news item to Wallabag, Pocket or Evernote. The possibilities are too numerous to list, but with this project you won't have to lift a finger: automate everything and make your life easier.
    Downloads: 0 This Week