Showing 184 open source projects for "data"

  • 1
    RecNN

    Reinforced Recommendation toolkit built around pytorch 1.7

    This is my school project. It focuses on Reinforcement Learning for personalized news recommendation. The main distinction is that it tries to solve online off-policy learning with dynamically generated item embeddings. I want to create a library with SOTA algorithms for reinforcement learning recommendation, providing the level of abstraction you like.
    Downloads: 1 This Week
  • 2
    PixieDust

    Python Helper library for Jupyter Notebooks

    PixieDust is an open source Python helper library that works as an add-on to Jupyter notebooks to improve the user experience of working with data. It also fills a gap for users who have no access to configuration files when a notebook is hosted on the cloud.
    Downloads: 0 This Week
  • 3

    Optimized Storage for temporal Data

    open Optimized Storage of time series data

    Beta version. Base class for optimized storage of time series data. Works with any kind of relational database. Cross-platform, with support for multiple languages (C++, C#, Java). Conditional storage based on value variation, controlled by the DeltaValue and DeltaTime parameters. Data can be retrieved without loss.
    Downloads: 0 This Week
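    The conditional storage idea in the description above can be illustrated with a minimal sketch. This is not the project's code: the function name `compress` and the exact rule (keep a sample when its value differs from the last stored value by at least `delta_value`, or when at least `delta_time` has elapsed since the last stored sample) are assumptions made for illustration. The listing only names the DeltaValue and DeltaTime parameters, and the sketch does not show how the project reconstructs the original data.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp, value)


def compress(samples: Iterable[Sample],
             delta_value: float,
             delta_time: float) -> List[Sample]:
    """Keep a sample only if it differs enough from the last stored one.

    Hypothetical rule: store when the value moved by at least delta_value,
    or when at least delta_time has passed since the last stored sample.
    """
    stored: List[Sample] = []
    for t, v in samples:
        if not stored:
            # Always keep the first sample as the reference point.
            stored.append((t, v))
            continue
        last_t, last_v = stored[-1]
        if abs(v - last_v) >= delta_value or (t - last_t) >= delta_time:
            stored.append((t, v))
    return stored


readings = [(0, 20.0), (1, 20.1), (2, 20.2), (3, 21.5), (4, 21.6), (10, 21.6)]
kept = compress(readings, delta_value=1.0, delta_time=5.0)
print(kept)
```

    With these illustrative readings, only the first sample, the jump at t=3, and the late sample at t=10 are kept.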
  • 4
    I3D models trained on Kinetics

    Convolutional neural network model for video classification

    ...The project provides TensorFlow and Sonnet-based implementations, pretrained checkpoints, and example scripts for evaluating or fine-tuning models. It also offers sample data, including preprocessed video frames and optical flow arrays, to demonstrate how to run inference and visualize outputs.
    Downloads: 1 This Week
  • 5
    MLBox

    MLBox is a powerful Automated Machine Learning python library

    MLBox is a powerful automated machine learning Python library. It offers fast reading and distributed data preprocessing, cleaning, and formatting; highly robust feature selection and leak detection; accurate hyper-parameter optimization in high-dimensional space; state-of-the-art predictive models for classification and regression (Deep Learning, Stacking, LightGBM, ...); and prediction with model interpretation. MLBox has been developed and used by many active community members.
    Downloads: 0 This Week
  • 6
    captcha_break

    Identification codes

    This project uses Keras to build a deep convolutional neural network that recognizes captcha verification codes. Running the project on a graphics card is recommended. The visualization code that follows was all written in a Jupyter notebook. If you want to write a Python script instead, it runs normally with a little modification. Of course, you can also remove the visualization code. captcha is a library written in Python to generate verification codes. It supports image verification...
    Downloads: 0 This Week
  • 7
    data-science-ipython-notebooks

    Data science Python notebooks: Deep learning

    Data Science IPython Notebooks is a broad, curated set of Jupyter notebooks covering Python, data wrangling, visualization, machine learning, deep learning, and big data tools. It aims to be a practical map of the ecosystem, showing hands-on examples with libraries such as NumPy, pandas, matplotlib, scikit-learn, and others. Many notebooks introduce concepts step by step, then apply them to real datasets so readers can see techniques in action.
    Downloads: 1 This Week
  • 8
    Django REST Pandas

    Serves up Pandas dataframes via the Django REST Framework

    ...The resulting API can serve up CSV (and a number of other formats) for consumption by a client-side visualization tool like d3.js. The design philosophy of DRP enforces a strict separation between data and presentation. This keeps the implementation simple, but also has the nice side effect of making it trivial to provide the source data for your visualizations. This capability can often be leveraged by sending users to the same URL that your visualization code uses internally to load the data. While DRP is primarily a data API, it also provides a default collection of interactive visualizations through the @wq/chart library, and a @wq/pandas loader to facilitate custom JavaScript charts that work well with CSV output served by DRP. ...
    Downloads: 0 This Week
  • 9

    Autologging

    Easier logging and tracing of Python functions and class methods.

    Autologging eliminates boilerplate logging setup and tracing code, and provides a means to separate application logging from program flow and data tracing. Autologging provides two decorators and a custom log level. "autologging.logged" decorates a class to create a __log member; by default, the logger is named for the class's containing module and name (e.g. "my.module.ClassName"). "autologging.traced" decorates a class to provide automatic CALL/RETURN tracing for all class, static, and instance methods, as well as the special __init__ method (by default). "autologging.TRACE" is a custom log level (lower than logging.DEBUG) that is registered with the Python logging module when autologging is imported.
    Downloads: 0 This Week
  • 10
    SFD

    S³FD: Single Shot Scale-invariant Face Detector, ICCV, 2017

    S³FD (Single Shot Scale-invariant Face Detector) is a real-time face detection framework designed to handle faces of various sizes with high accuracy using a single deep neural network. Developed by Shifeng Zhang, S³FD introduces a scale-compensation anchor matching strategy and enhanced detection architecture that makes it especially effective for detecting small faces—a long-standing challenge in face detection research. The project builds upon the SSD framework in Caffe, with...
    Downloads: 2 This Week
  • 11
    Mixup-CIFAR10

    mixup: Beyond Empirical Risk Minimization

    ...The approach acts as a regularizer, encouraging linear behavior in the feature space between samples, which helps reduce overfitting and enhance performance on unseen data.
    Downloads: 3 This Week
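    The interpolation behind mixup can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the listed repository: the helper names `mixup_pair` and `sample_lambda` are hypothetical. It assumes the standard formulation, in which each training example is a convex combination of two samples and their one-hot labels, with the mixing weight drawn from a Beta(alpha, alpha) distribution.

```python
import random
from typing import List, Sequence, Tuple


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw the mixing weight from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x1: Sequence[float], y1: Sequence[float],
               x2: Sequence[float], y2: Sequence[float],
               lam: float) -> Tuple[List[float], List[float]]:
    """Blend two (input, one-hot label) pairs with weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# Two toy samples with one-hot labels for a two-class problem.
x, y = mixup_pair([1.0, 0.0], [1.0, 0.0],
                  [0.0, 1.0], [0.0, 1.0],
                  lam=0.75)
print(x, y)

# In training, lam would be redrawn for every batch or pair.
lam = sample_lambda(1.0, random.Random(0))
```

    Because the label is blended with the same weight as the input, the model is pushed toward predictions that vary linearly between training samples.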
  • 12
    Skater

    Python library for model interpretation/explanations

    Skater is a unified framework that enables model interpretation for all forms of models, to help build the interpretable machine learning systems often needed for real-world use cases (we are actively working towards enabling faithful interpretability for all forms of models). It is an open-source Python library designed to demystify the learned structures of a black-box model both globally (inference on the basis of a complete data set) and locally (inference about an individual prediction). The concept of model interpretability in the field of machine learning is still new, largely subjective, and, at times, controversial. Model interpretation is the ability to explain and validate the decisions of a predictive model to enable fairness, accountability, and transparency in algorithmic decision-making. ...
    Downloads: 0 This Week
  • 13
    cnn-text-classification-tf

    Convolutional Neural Network for Text Classification in Tensorflow

    The cnn-text-classification-tf repository by Denny Britz is a well-known educational implementation of convolutional neural networks for text classification using TensorFlow, aimed at helping developers and researchers understand how CNNs can be applied to natural language processing tasks. Based loosely on Kim’s influential paper on CNNs for sentence classification, this codebase demonstrates how to preprocess text data, convert words into learned embeddings, and apply multiple convolution filters to extract n-gram features that are then pooled and fed into a classifier. The project includes scripts for training, evaluation, and data handling, making it easy to run experiments on datasets such as movie reviews or other labeled text collections. By breaking down the model into understandable components, it serves as a practical reference for students and practitioners learning how deep learning models handle text beyond traditional bag-of-words approaches.
    Downloads: 3 This Week
  • 14
    Zhao

    A compilation of "The Princely Party Relationship Network"

    zhao is a repository that consolidates research, data, and insights related to Zhao, which is likely an individual’s research collection, notes, or curated resources on deep learning, AI, or computational topics (name and content context suggest specialized study). The project may include code examples, experiment results, references to academic papers, mathematical notes, and supporting scripts to explore specific ML methods, benchmarks, or theoretical findings.
    Downloads: 0 This Week
  • 15
    jsondata

    Modular JSON by trees and branches, pointers and patches

    The 'jsondata' package provides for the modular in-memory processing of JSON data by trees, branches, pointers, and patches. The main interface classes are:
      - JSONData: core for RFC 7159 based data structures. Provides modular data components.
      - JSONDataSerializer: core for RFC 7159 based data persistence. Provides modular data serialization.
      - JSONPointer: RFC 6901 for addressing by pointer paths. Provides pointer arithmetic.
    Downloads: 0 This Week
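    Addressing by pointer path, as defined in RFC 6901, can be illustrated with a small standalone resolver. This sketch uses only the Python standard library and does not use or reproduce the jsondata API: the function name `resolve_pointer` is hypothetical, and special cases such as the "-" array token are left out.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Return the value that an RFC 6901 JSON Pointer refers to."""
    if pointer == "":
        # The empty pointer addresses the whole document.
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order required by the RFC: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


doc = {"foo": ["bar", "baz"], "a/b": 1, "m~n": 8}
print(resolve_pointer(doc, "/foo/0"))
print(resolve_pointer(doc, "/a~1b"))
print(resolve_pointer(doc, "/m~0n"))
```

    The unescaping order matters: replacing "~0" first would turn the escaped sequence "~01" into "/" instead of the literal "~1".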
  • 16
    pivottablejs

    Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook

    PivotTable.js is a Javascript Pivot Table and Pivot Chart library with drag-drop interactivity, and it can now be used with Jupyter/IPython Notebook via the pivottablejs module. I first built PivotTable.js with a plan to build an in-browser data analysis tool, and got as far as one where you could load up a CSV file in the browser for display. Since then, however, the Jupyter project has gathered steam and now provides a browser-based interface to some of the most powerful data processing libraries in the world, so it makes sense to interface with it.
    Downloads: 0 This Week
  • 17
    pyCGNS

    A Python package for CGNS

    pyCGNS is now on github: https://pycgns.github.io/index.html https://github.com/pyCGNS/pyCGNS
    Downloads: 0 This Week
  • 18

    EasyHTML

    A Python package for building the DOM of HTML documents

    A Python package that provides easy access to elements of HTML and XHTML documents through the Document Object Model.
    Downloads: 0 This Week
  • 19
    C++ library for working with OWL ontologies
    Downloads: 0 This Week
  • 20

    Waterloo

    Java-based scientific graphics

    Java-based scientific graphics with support for Java, Groovy, MATLAB, Python, the R statistical environment, Scala and SciLab.
    Downloads: 0 This Week
  • 21
    Question Answering Corpus

    Question answering dataset in "Teaching Machines to Read & Comprehend"

    RC-Data is a dataset generation framework created by Google DeepMind to produce large-scale reading comprehension question-answer pairs from CNN and Daily Mail news articles. The dataset, introduced in the 2015 paper “Teaching Machines to Read and Comprehend” (Hermann et al., NIPS 2015), was among the first large corpora designed to train and evaluate machine reading and comprehension models.
    Downloads: 4 This Week
  • 22
    Feed-forward neural network for python
    ffnet is a fast and easy-to-use feed-forward neural network training solution for Python. Many nice features are implemented: arbitrary network connectivity, automatic data normalization, very efficient training tools, and network export to Fortran code. ffnet now also has a GUI called ffnetui.
    Downloads: 0 This Week
  • 23

    (py)biblib

    A python library to handle BibTeX bibliographic data.

    Downloads: 0 This Week
  • 24
    Neural Libs

    Neural network library for developers

    ...The project also includes examples of using neural networks for function approximation and time series prediction. It also includes a special program that makes it easy to test a neural network on training data and to optimize the network.
    Downloads: 10 This Week
  • 25
    Importer library to import assets from various common 3D file formats such as Collada, Blend, Obj, X, 3DS, LWO, MD5, MD2, MD3, MDL, MS3D, and many others. The data is stored in its own in-memory data format, which can be easily processed. www.open3mod.com/ is a 3D model viewer and exporter based on Assimp that is also open source.
    Downloads: 24 This Week