Showing 93 open source projects for "statistical"

  • 1
    Pipeline for training Language Models

    Pipeline for training Language Models using PyTorch.

    Inspired by the Yandex Data School NLP Course (week 03: Language Modeling). Input: a prepared text file with space-separated words on each line.
    Downloads: 1 This Week
  • 2
    Eiten

    Statistical and Algorithmic Investing Strategies for Everyone

    Eiten is an open-source Python project focused on providing statistical and algorithmic trading strategies powered by data analysis and machine learning techniques. It is designed to make quantitative investing more accessible by offering ready-to-use strategies that analyze market behavior, detect patterns, and generate actionable insights. The project includes tools for evaluating stock performance, identifying trends, and applying algorithmic models to financial data, enabling users to experiment with different investment approaches. ...
    Downloads: 1 This Week
  • 3
    Machine Learning Mindmap

    A mindmap summarising Machine Learning concepts

    ...The project organizes a wide range of machine learning topics into an interconnected diagram that helps learners understand how concepts relate to one another across the broader field of artificial intelligence. The mind map covers fundamental areas such as data preprocessing, statistical analysis, supervised learning, unsupervised learning, reinforcement learning, and deep learning architectures. By arranging these concepts visually, the repository allows students and practitioners to quickly explore the relationships between algorithms, techniques, and modeling approaches used in modern machine learning workflows. ...
    Downloads: 0 This Week
  • 4
    NeuralCoref

    Fast Coreference Resolution in spaCy with Neural Networks

    ...For a brief introduction to coreference resolution and NeuralCoref, please refer to our blog post. NeuralCoref is written in Python/Cython and comes with a pre-trained statistical model for English only. NeuralCoref is accompanied by a visualization client NeuralCoref-Viz, a web interface powered by a REST server that can be tried online.
    Downloads: 0 This Week
  • 5
    automl-gs

    Provide an input CSV and a target field to predict, generate a model

    ...No black box: you can see exactly how the data is processed and how the model is constructed, and you can make tweaks as necessary. automl-gs is an AutoML tool which, unlike Microsoft's NNI, Uber's Ludwig, and TPOT, offers a zero code/model definition interface for getting an optimized model and data transformation pipeline in multiple popular ML/DL frameworks, with minimal Python dependencies (pandas + scikit-learn + your framework of choice). automl-gs is designed for citizen data scientists and engineers without a deep statistical background, under the philosophy that you don't need to know any modern data preprocessing and machine learning engineering techniques to create a powerful prediction workflow.
    Downloads: 0 This Week
  • 6
    spark-ml-source-analysis

    Analysis of Spark ML algorithm principles with the corresponding source code

    spark-ml-source-analysis is a technical repository that analyzes the internal implementation of machine learning algorithms within Apache Spark’s MLlib library. The project aims to help developers and data scientists understand how distributed machine learning algorithms are implemented and optimized inside the Spark ecosystem. Instead of providing a runnable software system, the repository focuses on explaining algorithm principles and examining the underlying source code used in Spark’s...
    Downloads: 0 This Week
  • 7
    DS-Take-Home

    Solutions to the book A Collection of Data Science Take-Home Challenges

    DS-Take-Home is a repository that provides practical solutions to a series of real-world data science challenges inspired by the book A Collection of Data Science Take-Home Challenges. The project is designed as a learning resource where aspiring data scientists can study how typical industry-style take-home assignments are solved using data analysis and machine learning techniques. Each challenge is implemented in a separate Jupyter notebook that walks through the process of analyzing...
    Downloads: 0 This Week
  • 8
    newLISP for BSDs, LINUX, MacOS X, SunOS and Win32: small and fast, with 350+ functions; C, MySQL, PostgreSQL, SQLite, ODBC, TCP/IP, UDP, XML and Java interfaces; string processing, regular expressions, math, financial and statistical functions; Win32 DLL
    Downloads: 8 This Week
  • 9
    Lihang

    Statistical learning methods (2nd edition) [Li Hang]

    Lihang is an open-source repository that provides educational notes, mathematical derivations, and code implementations based on the book Statistical Learning Methods by Li Hang. The repository aims to help readers understand the theoretical foundations of machine learning algorithms through practical implementations and detailed explanations. It includes notebooks and scripts that demonstrate how key algorithms such as perceptrons, decision trees, logistic regression, support vector machines, and hidden Markov models work in practice. ...
    Downloads: 0 This Week
  • 10
    DeepLearn

    Implementation of research papers on Deep Learning + NLP + CV in Python

    Welcome to DeepLearn. This repository contains implementations of research papers on NLP, CV, ML, and deep learning. The required dependencies are listed in requirement.txt. I will also use dl-text modules for preparing the datasets. If you haven't used it, please do have a quick look at it. CV, transfer learning, representation learning.
    Downloads: 0 This Week
  • 11

    BioRec: Bird Census field data annotation

    Recognizing biological data from a notebook.

    ...Namely, bird census based on personal inspection of small (~10 km^2) regions, with the birds' position and behaviour recorded on paper. This project makes it easy to annotate such field data and to make this data available for statistical analysis.
    Downloads: 0 This Week
  • 12

    cbrTekStraktor

    An application to automatically extract text from comic books.

    ...The application also allows text areas in CBR files to be defined manually, and includes a simple graphical editor for further processing of the extracted text. Text extraction is achieved by a combination of statistical and graphical processing operations. It is based on three major algorithms: binarization of color images (Niblack and other methods), connected components, and K-Means clustering. Tesseract is used to perform Optical Character Recognition on the extracted text. A subsequent version of the application will integrate with translation software in order to provide automated translation of comic book texts and re-insertion of the translated texts.
    Downloads: 6 This Week
  • 13
    ZPar statistical parser. Universal language support (depending on the availability of training data), with language-specific features for Chinese and English. Currently supports word segmentation, POS tagging, dependency parsing and phrase-structure parsing.
    Downloads: 0 This Week
  • 14
    Phrasal

    Statistical phrase-based machine translation system

    Stanford Phrasal is a state-of-the-art statistical phrase-based machine translation system, written in Java. At its core, it provides much the same functionality as the core of Moses. Distinctive features include an easy-to-use API for implementing new decoding model features, the ability to translate using phrases that include gaps (Galley et al. 2010), and conditional extraction of phrase tables and lexical reordering models.
    Downloads: 0 This Week
  • 15
    Adaptive Gaussian Filtering

    Machine learning with Gaussian kernels.

    Libagf is a machine learning library that includes adaptive kernel density estimators using Gaussian kernels and k-nearest neighbours. Operations include statistical classification, interpolation/non-linear regression and pdf estimation. For statistical classification there is a borders training feature for creating fast and general pre-trained models that nonetheless return the conditional probabilities. Libagf also includes clustering algorithms as well as comparison and validation routines. ...
    Downloads: 0 This Week
  • 16

    Natural Language Analysis with Ngrams

    NLP tool for statistical analysis of words, sentences, documents

    The goal of this project is an NLP tool that provides statistical analysis results based on Google Ngram data. At present it is only a NetBeans project without a final JAR. A GitHub version is planned for anyone who wishes to contribute. In future versions, users will be able to convert a single word to numerical data, compare two words and get the comparison data, and do the same for sentences, paragraphs and documents. ...
    Downloads: 0 This Week
  • 17

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    Project moved to https://github.com/fpetitjean/Chordalysis. Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not scale to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. ...
    Downloads: 0 This Week
  • 18
    NetKit-SRL, or NetKit for short, is an open-source Network Learning Toolkit for statistical relational learning. The toolkit provides functionalities not found in any existing open source projects and integrates with the WEKA machine learning toolkit.
    Downloads: 0 This Week
  • 19

    SocialModeler

    A set of tools for analyzing open source social media

    SocialModeler leverages natural language processing and statistical text analysis approaches to quickly analyze and explore social media data (e.g. news articles or blogs). It uses an application-based user interface for configuration and analysis.
    Downloads: 0 This Week
  • 20
    iGREAT is an open-source, statistical machine translation software toolkit based on finite-state models.
    Downloads: 0 This Week
  • 21
    LExAu: Learning Expectations Autonomously. Library for on-line, data-driven statistical machine learning.
    Downloads: 0 This Week
  • 22

    EMGU Face Recognition

    Using EMGU to perform Principal Component Analysis (PCA)

    ...The reason that face recognition is so popular is not only its real-world application but also the common use of principal component analysis (PCA). PCA is an ideal method for recognising statistical patterns in data. Face recognition is also popular because a user can apply a method easily and see whether it is working without needing to know too much about how the process works.
    Downloads: 0 This Week
  • 23
    LoonyBin is a workflow management system specifically geared toward the needs of computational research. It is currently used in Natural Language Processing and statistical machine translation. More at http://www.cs.cmu.edu/~jhclark/loonybin/.
    Downloads: 0 This Week
  • 24

    Aliza Gaming API

    An extensible development framework for roleplay games.

    ...It offers a comprehensive set of tools, utilities, and libraries, empowering developers to create immersive and dynamic gaming experiences with ease. Key features: modular architecture, rich graphics and UI components, comprehensive game logic and character management, environment and world-building tools, statistical and mathematical utilities, enhanced debugging and logging, data management and integration, integrated LLM for NPC dialog, and cross-platform compatibility. Dive into the world of game development with AlizaGameAPI. Download the latest version, explore the detailed documentation, and join the community of developers pushing the boundaries of gaming technology.
    Downloads: 0 This Week
  • 25
    JProGraM (PRObabilistic GRAphical Models in Java) is a statistical machine learning library. It supports statistical modeling and data analysis along three main directions: (1) probabilistic graphical models (Bayesian networks, Markov random fields, dependency networks, hybrid random fields); (2) parametric, semiparametric, and nonparametric density estimation (Gaussian models, nonparanormal estimators, Parzen windows, Nadaraya-Watson estimator); (3) generative models for random networks (small-world, scale-free, exponential random graphs, Fiedler random fields), subgraph sampling algorithms (random walk, snowball, etc.), and spectral decomposition.
    Downloads: 0 This Week