Showing 306 open source projects for "python data analysis"

  • 1
    CakeChat

    CakeChat: Emotional Generative Dialog System

    CakeChat is a backend for chatbots that are able to express emotions via conversations. The code is flexible and allows you to condition the model's responses on an arbitrary categorical variable. For example, you can train your own persona-based neural conversational model or create an emotional chatting machine. It uses a Hierarchical Recurrent Encoder-Decoder (HRED) architecture for handling deep dialog context and a multilayer RNN with GRU cells. The first layer of the utterance-level encoder is always...
    Downloads: 0 This Week
  • 2
    xLearn

    High performance, easy-to-use, and scalable machine learning (ML)

    xLearn is a high-performance, easy-to-use, and scalable machine learning package that contains linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), all of which can be used to solve large-scale machine learning problems. xLearn is especially useful for solving machine learning problems on large-scale sparse data. Many real-world datasets deal with high-dimensional sparse feature vectors, such as a recommendation system where the number of categories...
    Downloads: 0 This Week
  • 3
    automl-gs

    Provide an input CSV and a target field to predict, generate a model

    Give an input CSV file and a target field you want to predict to automl-gs, and get a trained high-performing machine learning or deep learning model plus native Python code pipelines allowing you to integrate that model into any prediction workflow. No black box: you can see exactly how the data is processed, and how the model is constructed, and you can make tweaks as necessary. automl-gs is an AutoML tool which, unlike Microsoft's NNI, Uber's Ludwig, and TPOT, offers a zero code/model...
    Downloads: 0 This Week
  • 4
    Tensorpack

    A Neural Net Training Interface on TensorFlow, with focus on speed

    Tensorpack is a neural network training interface based on TensorFlow v1. It uses TensorFlow efficiently, with no extra overhead. On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably get faster if written with Tensorpack. A scalable data-parallel multi-GPU / distributed training strategy is available off the shelf. Squeeze the best data-loading performance out of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not offer...
    Downloads: 0 This Week
  • 5
    TEXT2DATA

    Text Analytics Platform

    Bring a text analytics platform that uses NLP (Natural Language Processing) and machine learning to your work environment. Extract essential information from your text documents and let artificial intelligence save you time. Get detailed and agile reports on your unstructured data.
    Downloads: 0 This Week
  • 6
    Pragmatic AI

    [Book-2019] Pragmatic AI: An Introduction to Cloud-based ML

    Pragmatic AI is the first truly practical guide to solving real-world problems with contemporary machine learning, artificial intelligence, and cloud computing tools. Writing for business professionals, decision-makers, and students who aren’t professional data scientists, Noah Gift demystifies all the tools and technologies you need to get results. He illuminates powerful off-the-shelf cloud-based solutions from Google, Amazon, and Microsoft, as well as accessible techniques using Python and R...
    Downloads: 2 This Week
  • 7
    Skater

    Python library for model interpretation/explanations

    Skater is a unified framework that enables model interpretation for all forms of models, helping you build the interpretable machine learning systems often needed for real-world use cases (we are actively working towards enabling faithful interpretability for all forms of models). It is an open-source Python library designed to demystify the learned structures of a black box model both globally (inference on the basis of a complete data set) and locally (inference about an individual prediction...
    Downloads: 0 This Week
  • 8
    Scikit-plot

    An intuitive library to add plotting functionality to scikit-learn

    Single line functions for detailed visualizations. Scikit-plot is the result of an unartistic data scientist's dreadful realization that visualization is one of the most crucial components in the data science process, not just a mere afterthought. Gaining insights is simply a lot easier when you're looking at a colored heatmap of a confusion matrix complete with class labels rather than a single-line dump of numbers enclosed in brackets. Besides, if you ever need to present your results...
    Downloads: 0 This Week
  • 9
    anaGo

    Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition

    anaGo is a Python library for sequence labeling (NER, PoS tagging, ...), implemented in Keras. anaGo can solve sequence labeling tasks such as named entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL) and so on. Unlike traditional sequence labeling solvers, anaGo doesn't require you to define any language-dependent features, so it can easily be used for any language. In anaGo, the simplest type of model is the Sequence model. The Sequence model includes essential...
    Downloads: 0 This Week
  • 10

    TensorImage

    Image classification library for easily training and deploying models

    (Visit our GitHub repository at https://github.com/TensorImage/tensorimage for more information) TensorImage is an open source package for image classification. It has a wide range of data augmentation operations that can be performed over training data to prevent overfitting and increase testing accuracy. TensorImage is easy to use and manage, as all files, trained models and data are organized within a workspace directory, which you can change at any time in the configuration file...
    Downloads: 0 This Week
  • 11
    DIGITS

    Deep Learning GPU training system

    The NVIDIA Deep Learning GPU Training System (DIGITS) puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation and object detection tasks. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real-time with advanced visualizations, and selecting...
    Downloads: 0 This Week
  • 12
    The Deep Review

    A collaboratively written review paper on deep learning, genomics, etc

    This repository is home to the Deep Review, a review article on deep learning in precision medicine. The Deep Review is collaboratively written on GitHub using a tool called Manubot (see below). The project operates on an open contribution model, welcoming contributions from anyone. To see what's incoming, check the open pull requests. For project discussion and planning see the Issues. As of writing, we are aiming to publish an update of the deep review. We will continue to make project...
    Downloads: 0 This Week
  • 13
    Edward

    A probabilistic programming language in TensorFlow

    Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming. Edward is built on TensorFlow...
    Downloads: 0 This Week
  • 14
    Intel neon

    Intel® Nervana™ reference deep learning framework

    neon is Intel's reference deep learning framework committed to best performance on all hardware. Designed for ease of use and extensibility. See the new features in our latest release. We want to highlight that neon v2.0.0+ has been optimized for much better performance on CPUs by enabling Intel Math Kernel Library (MKL). The DNN (Deep Neural Networks) component of MKL that is used by neon is provided free of charge and downloaded automatically as part of the neon installation. The gpu...
    Downloads: 0 This Week
  • 15

    AerinSistemas-Noname

    Elasticsearch to Pandas dataframe or CSV

    API and command line utility, written in Python, for querying Elasticsearch and exporting the results as documents to a CSV file. The search can be done using logical operators or ranges, in combination or alone. The output can be limited to the desired attributes. ToT can also load the query results into a Pandas DataFrame and/or save them in an HDF5 container (under development).
    Downloads: 1 This Week
  • 16
    Tangent

    Source-to-source debuggable derivatives in pure Python

    Existing libraries implement automatic differentiation by tracing a program's execution (at runtime, like PyTorch) or by staging out a dynamic data-flow graph and then differentiating the graph (ahead-of-time, like TensorFlow). In contrast, Tangent performs ahead-of-time autodiff on the Python source code itself, and produces Python source code as its output. Tangent fills a unique location in the space of machine learning tools. As a result, you can finally read your automatic derivative code...
    Downloads: 0 This Week
  • 17
    AI learning

    AiLearning, data analysis plus machine learning practice

    We actively respond to the Research Open Source Initiative (DOCX). Open source today is not just source code, but also datasets, models, tutorials, and experimental records. We are also exploring other categories of open source solutions and protocols. We hope you will understand this initiative, combine it with your own interests, and do what you can. Everyone's tiny contributions, together, make up the entire open source ecosystem. We are iBooker, a large open-source community,...
    Downloads: 0 This Week
  • 18
    auto_ml

    Automated machine learning for analytics & production

    auto_ml is designed for production. Here's an example that includes serializing and loading the trained model, then getting predictions on single dictionaries, roughly the process you'd likely follow to deploy the trained model. Before you go any further, try running the code. Load up some data (either a DataFrame, or a list of dictionaries, where each dictionary is a row of data). Make a column_descriptions dictionary that tells us which attribute name in each row represents the value we’re...
    Downloads: 0 This Week
  • 19
    Python Machine Learning book

    The book code repository and info resource

    What you can expect are 400 pages rich in useful material, covering just about everything you need to know to get started with machine learning, from theory to actual code that you can directly put into action! This is not just yet another "this is how scikit-learn works" book. I aim to explain all the underlying concepts, tell you everything you need to know in terms of best practices and caveats, and we will put those concepts into action mainly using NumPy, scikit-learn, and Theano. This is not...
    Downloads: 0 This Week
  • 20

    SPAWNN

    SPatial Analysis With self-organizing Neural Networks

    The SPAWNN toolkit is an innovative toolkit for spatial analysis with self-organizing neural networks which is particularly useful for spatial analysis, visualization and geographical data mining. To run the toolkit, simply download and execute (double-click) the jar file. Please cite: - Hagenauer, J., & Helbich, M. (2016). SPAWNN: A Toolkit for SPatial Analysis With Self-Organizing Neural Networks. Transactions in GIS, 20(5), 755-775. Other related publications: - Hagenauer, J...
    Downloads: 0 This Week
  • 21

    PyDaMelo

    Python-compatible Data mining elementary objects

    An attempt at offering machine learning and data mining algorithms at the finest grain possible, as Lego-like bricks that are easy to combine and glue together through Python scripting.
    Downloads: 0 This Week
  • 22
    All future developments will be implemented in the new MATLAB toolbox SciXMiner. Please visit https://sourceforge.net/projects/scixminer/ to download the newest version. The former MATLAB toolbox Gait-CAD was designed for the visualization and analysis of time series and features, with a special focus on data mining problems including classification, regression, and clustering.
    Downloads: 0 This Week
  • 23

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source and developed with the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies that enable it to process information while applying AI techniques, in order to build predictive models for text classification. Through a flexible structure of interfaces and classes, JCLTP can be extended, adapted, and given new functionality. Thus, analysis of new types...
    Downloads: 0 This Week
  • 24
    Mass-based dissimilarity

    A data dependent dissimilarity measure based on mass estimation.

    This software calculates the mass-based dissimilarity matrix for data mining algorithms relying on a distance measure. References: Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure. KDD 2016 http://dx.doi.org/10.1145/2939672.2939779 The source code, presentation slides and poster are attached under "Files". The presentation video from KDD 2016 is published at https://youtu.be/eotD_-SuEoo . Since this software is licensed...
    Downloads: 0 This Week
  • 25
    ExSTraCS

    Extended Supervised Tracking and Classifying System

    ... to address problems in epidemiological data mining to identify complex patterns relating predictive attributes in noisy datasets to disease phenotypes of interest. ExSTraCS combines a number of recent advancements into a single algorithmic platform. It can flexibly handle (1) discrete or continuous attributes, (2) missing data, (3) balanced or imbalanced datasets, and (4) binary or many classes. A complete user's guide for ExSTraCS is included. Coded in Python 2.7.
    Downloads: 0 This Week