Showing 1214 open source projects for "python data analysis"

  • 1
    LongWriter

    Unleashing 10,000+ Word Generation from Long Context LLMs

    LongWriter is an open-source framework and set of large language models designed to enable ultra-long text generation that can exceed 10,000 words while maintaining coherence and structure. Traditional large language models can process large inputs but often struggle to generate long outputs due to limitations in training data and alignment strategies. LongWriter addresses this challenge by introducing a specialized dataset and training approach that encourages models to produce longer...
    Downloads: 0 This Week
  • 2
    LlamaGen

    Autoregressive Model Beats Diffusion

    LlamaGen is an open-source research project that introduces a new approach to image generation by applying the autoregressive next-token prediction paradigm used in large language models to visual generation tasks. Instead of relying on diffusion models, the framework treats images as sequences of tokens that can be generated progressively using transformer architectures similar to those used for text generation. The project explores how scaling autoregressive models and improving image...
    Downloads: 0 This Week
  • 3
    Magicoder

    Empowering Code Generation with OSS-Instruct

    Magicoder is an open-source family of large language models designed specifically for code generation and software development tasks. The project focuses on improving the quality and diversity of code generation by training models with a novel dataset construction approach known as OSS-Instruct. This technique uses open-source code repositories as a foundation for generating more realistic and diverse instruction datasets for training language models. By grounding training data in real...
    Downloads: 0 This Week
  • 4
    xLSTM

    Neural Network architecture based on ideas of the original LSTM

    xLSTM is an open-source machine learning architecture that reimagines the classic Long Short-Term Memory (LSTM) network for modern large-scale language modeling and sequence processing tasks. The project introduces a new recurrent neural network design that incorporates exponential gating mechanisms and enhanced memory structures to overcome limitations of traditional LSTM models. By introducing innovations such as matrix-based memory and improved normalization techniques, xLSTM improves the...
    Downloads: 0 This Week
  • 5
    Huatuo-Llama-Med-Chinese

    Instruction-tuning LLM with Chinese Medical Knowledge

    Huatuo-Llama-Med-Chinese is an open-source project that develops medical-domain large language models by instruction-tuning existing models using Chinese medical knowledge. The project builds specialized models by fine-tuning architectures such as LLaMA, Alpaca-Chinese, and Bloom with curated medical datasets. These datasets are constructed from medical knowledge graphs, academic literature, and question-answer pairs designed to teach models how to respond accurately to healthcare-related...
    Downloads: 0 This Week
  • 6
    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. The model supports a...
    Downloads: 0 This Week
  • 7
    TimesFM

    Pretrained time-series foundation model developed by Google Research

    TimesFM is a pretrained time-series foundation model from Google Research built for forecasting tasks, designed to generalize across many domains without requiring extensive per-dataset retraining. It provides a decoder-only model approach to forecasting, aiming for strong performance even in zero-shot or low-data settings where traditional models often struggle. The project includes code and an inference API intended to make it practical to run forecasts programmatically, with options to...
    Downloads: 0 This Week
  • 8
    Ling-V2

    Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI

    Ling-V2 is an open-source family of Mixture-of-Experts (MoE) large language models developed by the InclusionAI research organization with the goal of combining state-of-the-art performance, efficiency, and openness for next-generation AI applications. It introduces highly sparse architectures where only a fraction of the model’s parameters are activated per input token, enabling models like Ling-mini-2.0 to achieve reasoning and instruction-following capabilities on par with much larger...
    Downloads: 0 This Week
  • 9
    JEPA

    PyTorch code and models for V-JEPA self-supervised learning from video

    JEPA (Joint-Embedding Predictive Architecture) captures the idea of predicting missing high-level representations rather than reconstructing pixels, aiming for robust, scalable self-supervised learning. A context encoder ingests visible regions and predicts target embeddings for masked regions produced by a separate target encoder, avoiding low-level reconstruction losses that can overfit to texture. This makes learning focus on semantics and structure, yielding features that transfer well...
    Downloads: 0 This Week
  • 10
    DLRM

    An implementation of a deep learning recommendation model (DLRM)

    DLRM (Deep Learning Recommendation Model) is Meta’s open-source reference implementation for large-scale recommendation systems built to handle extremely high-dimensional sparse features and embedding tables. The architecture combines dense (MLP) and sparse (embedding) branches, then interacts features via dot product or feature interactions before passing through further dense layers to predict click-through, ranking scores, or conversion probabilities. The implementation is optimized for...
    Downloads: 0 This Week
  • 11
    MoCo (Momentum Contrast)

    Self-supervised visual learning using momentum contrast in PyTorch

    MoCo is an open source PyTorch implementation developed by Facebook AI Research (FAIR) for the papers “Momentum Contrast for Unsupervised Visual Representation Learning” (He et al., 2019) and “Improved Baselines with Momentum Contrastive Learning” (Chen et al., 2020). It introduces Momentum Contrast (MoCo), a scalable approach to self-supervised learning that enables visual representation learning without labeled data. The core idea of MoCo is to maintain a dynamic dictionary with a...
    Downloads: 0 This Week
  • 12
    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning method

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval,...
    Downloads: 0 This Week
  • 13
    IVY

    The Unified Machine Learning Framework

    Take any code that you'd like to include, for example an existing TensorFlow model and some useful functions from both the PyTorch and NumPy libraries. Choose any framework for writing your higher-level pipeline, including data loading, distributed training, analytics, logging, visualization, etc. Choose any backend framework to use under the hood for running this entire pipeline. Choose the most appropriate device or combination of devices for your needs. DeepMind releases an...
    Downloads: 0 This Week
  • 14
    PyG

    Graph Neural Network Library for PyTorch

    PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data. It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it provides easy-to-use mini-batch loaders for operating on many small and single giant graphs, multi-GPU support, DataPipe support,...
    Downloads: 0 This Week
  • 15
    Korvus

    Korvus is a search SDK that unifies the entire RAG pipeline

    Korvus is an open-source retrieval-augmented generation (RAG) pipeline designed to run entirely inside PostgreSQL, allowing developers to build AI search and knowledge systems directly within a database environment. The project consolidates the typical steps of a RAG pipeline—including embedding generation, document retrieval, reranking, and text generation—into a single query executed within the Postgres ecosystem. By leveraging PostgresML and vector extensions such as pgvector, Korvus...
    Downloads: 0 This Week
  • 16
    higgsfield

    Fault-tolerant, highly scalable GPU orchestration

    Higgsfield is an open-source, fault-tolerant, highly scalable GPU orchestration and machine learning framework designed for training models with billions to trillions of parameters, such as large language models (LLMs).
    Downloads: 10 This Week
  • 17
    CogAgent

    An open-source end-to-end VLM-based GUI agent

    CogAgent is a 9B-parameter bilingual vision-language GUI agent model based on GLM-4V-9B, trained with staged data curation, optimization, and strategy upgrades to improve perception, action prediction, and generalization across tasks. It focuses on operating real user interfaces from screenshots plus text, and follows a strict input–output format that returns structured actions, grounded operations, and optional sensitivity annotations. The model is designed for agent-style execution rather...
    Downloads: 0 This Week
  • 18
    DeepXDE

    A library for scientific machine learning & physics-informed learning

    DeepXDE is a library for scientific machine learning and physics-informed learning. It includes the following algorithms. Physics-informed neural network (PINN), for solving different problems: forward/inverse ordinary/partial differential equations (ODEs/PDEs) [SIAM Rev.]; forward/inverse integro-differential equations (IDEs) [SIAM Rev.]; fPINN: solving forward/inverse fractional PDEs (fPDEs) [SIAM J. Sci. Comput.]; NN-arbitrary polynomial chaos (NN-aPC): solving...
    Downloads: 0 This Week
  • 19
    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and to output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks, and achieves strong multi-benchmark performance...
    Downloads: 0 This Week
  • 20
    Flock

    Flock is a workflow-based low-code platform for building chatbots

    Flock is a workflow-based low-code platform designed for building AI applications such as chatbots, retrieval-augmented generation systems, and multi-agent workflows. The platform uses a visual workflow architecture where different nodes represent processing steps such as input processing, model inference, retrieval operations, and tool execution. Developers can connect these nodes to create complex pipelines that orchestrate multiple language models and external services. Built on...
    Downloads: 3 This Week
  • 21
    UNO

    A Universal Customization Method for Single and Multi Conditioning

    UNO is a project by ByteDance introduced in 2025, titled “A Universal Customization Method for Both Single and Multi-Subject Conditioning.” It proposes a framework for image (or more general generative) modeling in which the model can be conditioned on either a single subject or multiple subjects, which may correspond to generating or customizing images featuring specific people, styles, or objects, possibly with fine-grained control over subject identity or composition. Because the project is...
    Downloads: 0 This Week
  • 22
    PyTorch/XLA

    Enabling PyTorch on Google TPU

    PyTorch/XLA is a Python package that uses the XLA deep learning compiler to connect the PyTorch deep learning framework and Cloud TPUs. You can try it right now, for free, on a single Cloud TPU with Google Colab, and use it in production and on Cloud TPU Pods with Google Cloud. Take a look at one of our Colab notebooks to quickly try different PyTorch networks running on Cloud TPUs and learn how to use Cloud TPUs as PyTorch devices. We are also introducing new TPU VMs for more transparent...
    Downloads: 0 This Week
  • 23
    XIAOJUSURVEY

    Powerful survey system for creating, managing, and analyzing forms

    ...It also focuses on data analysis capabilities, helping users extract insights from collected responses through built-in reporting and visualization features. Xiaoju Survey is designed with extensibility in mind, allowing developers to customize or integrate it into existing systems. Its architecture supports high availability and scalability, making it suitable for enterprise-level deployments.
    Downloads: 4 This Week
  • 24
    iX

    Autonomous GPT-4 agent platform

    IX is a platform for designing and deploying autonomous and [semi]-autonomous LLM-powered agents and workflows. IX provides a flexible and scalable solution for delegating tasks to AI-powered agents. Agents created with the platform can automate a wide variety of tasks while running in parallel and communicating with each other.
    Downloads: 5 This Week
  • 25
    TensorFlow Privacy

    Library for training machine learning models with privacy for data

    Library for training machine learning models with privacy for training data. This repository contains the source code for TensorFlow Privacy, a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy. The library comes with tutorials and analysis tools for computing the privacy guarantees provided.
    Downloads: 0 This Week