Showing 30 open source projects for "python data analysis"

  • 1
    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses locally deployed GPT large models to interact with your data and environment. Because the models run entirely within your own infrastructure, your data stays local and private, minimizing the risk of leakage.
    Downloads: 8 This Week
  • 2
    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on edge devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 4 This Week
  • 3
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills

    NVIDIA Isaac GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs, such as language and images, and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data.
    Downloads: 3 This Week
  • 4
    Clay Foundation Model

    The Clay Foundation Model - An open source AI model and interface

    The Clay Foundation Model is an open-source AI model and interface designed to provide comprehensive data and insights about Earth. It aims to serve as a foundational tool for environmental monitoring, research, and decision-making by integrating various data sources and offering an accessible platform for analysis.
    Downloads: 0 This Week
  • 5
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains medical GPT models with a ChatGPT-style training pipeline, implementing the full sequence of secondary (continued) pre-training, supervised fine-tuning, reward modeling, and reinforcement learning training.
    Downloads: 0 This Week
  • 6
    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project builds on Llama-2, the commercially usable large model released by Meta, and is the second phase of the Chinese LLaMA & Alpaca large model project. It open-sources the Chinese LLaMA-2 base model and the Alpaca-2 instruction fine-tuned large model. These models expand and optimize the Chinese vocabulary of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve basic Chinese semantics and instruction understanding...
    Downloads: 0 This Week
  • 7
    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese LLaMA & Alpaca large language model + local CPU/GPU training

    This project open-sources the Chinese LLaMA base model and the instruction fine-tuned Chinese Alpaca large model to further promote open research on large models in the Chinese NLP community. Based on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves basic Chinese semantic understanding. In addition, the Chinese Alpaca model is fine-tuned on Chinese instruction data, which...
    Downloads: 0 This Week
  • 8
    VALL-E

    PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

    We introduce a language modeling approach for text-to-speech (TTS) synthesis. Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than existing systems. VALL...
    Downloads: 7 This Week
  • 9
    FinGPT

    Open-Source Financial Large Language Models!

    FinGPT is an open-source large language model tailored specifically for financial tasks. Developed by AI4Finance Foundation, it is designed to assist with various financial applications, such as forecasting, financial sentiment analysis, and portfolio management. FinGPT has been trained on a diverse range of financial datasets, making it a powerful tool for finance professionals looking to leverage AI for data-driven decision-making. The model is freely available on platforms like Hugging Face...
    Downloads: 19 This Week
  • 10
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting, composition...
    Downloads: 32 This Week
  • 11
    Janus-Pro

    Janus-Series: Unified Multimodal Understanding and Generation Models

    .... Its latest iteration, Janus-Pro, improves on this with a more optimized training strategy, expanded data, and larger model scaling, leading to significant advancements in both multimodal understanding and text-to-image generation.
    Downloads: 3 This Week
  • 12
    GLM-4-32B-0414

    Open Multilingual Multimodal Chat LMs

    GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced...
    Downloads: 1 This Week
  • 13
    Universal Sentence Encoder

    Encoder of greater-than-word length text trained on a variety of data

    The Universal Sentence Encoder (USE) is a pre-trained deep learning model designed to encode sentences into fixed-length embeddings for use in various natural language processing (NLP) tasks. It leverages Transformer and Deep Averaging Network (DAN) architectures to generate embeddings that capture the semantic meaning of sentences. The model is designed for tasks like sentiment analysis, semantic textual similarity, and clustering, and provides high-quality sentence representations...
    Downloads: 0 This Week
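    A minimal sketch of how the encoder is typically loaded and applied; the TensorFlow Hub handle and version below are assumptions, so check tfhub.dev for the current address:

```python
# Hedged sketch: embed sentences with the Universal Sentence Encoder.
# Requires: pip install tensorflow tensorflow-hub
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
embeddings = embed([
    "Python makes data analysis approachable.",
    "Exploratory data analysis is easy in Python.",
])
print(embeddings.shape)  # (2, 512): one fixed-length vector per sentence
```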
  • 14
    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    An implementation of model & data parallel GPT-3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformers integration. Training and inference are officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB: while GPT-Neo can technically run a training step at 200B+ parameters, it is very...
    Downloads: 8 This Week
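    A minimal sketch of the HuggingFace Transformers integration mentioned above; the 1.3B checkpoint id is an assumption, as EleutherAI publishes several sizes:

```python
# Hedged sketch: sample text from a pre-trained GPT-Neo checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
result = generator("Python data analysis begins with",
                   max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```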
  • 15
    segmentation-3.0

    Speaker segmentation model for 10s audio chunks with powerset labels

    ... (VAD), overlapped speech detection, and speaker diarization when combined with additional models. While it doesn't process full recordings directly, it powers pipelines for detailed segmentation and analysis of speech data. Its MIT license ensures it's openly accessible, though users must agree to usage conditions for access. The model showcases state-of-the-art segmentation performance and is used in both academic and production-oriented pipelines.
    Downloads: 0 This Week
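    A hedged sketch of how the checkpoint typically powers a voice activity detection pipeline in pyannote.audio 3.x; the token placeholder and hyperparameter values are illustrative, not authoritative:

```python
# Hedged sketch: VAD built on pyannote/segmentation-3.0 (gated checkpoint;
# you must accept the usage conditions and pass a Hugging Face token).
from pyannote.audio import Model
from pyannote.audio.pipelines import VoiceActivityDetection

model = Model.from_pretrained("pyannote/segmentation-3.0",
                              use_auth_token="HF_TOKEN")  # placeholder token
vad = VoiceActivityDetection(segmentation=model)
vad.instantiate({"min_duration_on": 0.0, "min_duration_off": 0.0})  # illustrative
speech_regions = vad("audio.wav")  # Annotation of detected speech segments
print(speech_regions)
```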
  • 16
    xlm-roberta-base

    Multilingual RoBERTa trained on 100 languages for NLP tasks

    xlm-roberta-base is a multilingual transformer model trained by Facebook AI on 2.5TB of filtered CommonCrawl data spanning 100 languages. It is based on the RoBERTa architecture and pre-trained using a masked language modeling (MLM) objective. Unlike models like GPT, which predict the next word, this model learns bidirectional context by predicting masked tokens, enabling robust sentence-level representations. xlm-roberta-base is particularly suited for cross-lingual understanding...
    Downloads: 0 This Week
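    A minimal sketch of the masked-language-modeling objective described above, using the Transformers fill-mask pipeline (XLM-RoBERTa's mask token is <mask>):

```python
# Hedged sketch: predict a masked token with xlm-roberta-base.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="xlm-roberta-base")
for pred in unmasker("Paris is the <mask> of France."):
    print(pred["token_str"], round(pred["score"], 3))
```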
  • 17
    twitter-roberta-base-sentiment-latest

    RoBERTa model for English sentiment analysis on Twitter data

    twitter-roberta-base-sentiment-latest is a RoBERTa-based transformer model fine-tuned on over 124 million tweets collected between 2018 and 2021. Designed for sentiment analysis in English, it categorizes tweets as Negative, Neutral, or Positive. The model is optimized using the TweetEval benchmark and integrated with the TweetNLP ecosystem for seamless deployment. Its training emphasizes real-world, social media content, making it highly effective for analyzing informal or noisy text...
    Downloads: 0 This Week
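    A minimal sketch, assuming the Hugging Face Hub id cardiffnlp/twitter-roberta-base-sentiment-latest for this model:

```python
# Hedged sketch: classify tweet sentiment as negative/neutral/positive.
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="cardiffnlp/twitter-roberta-base-sentiment-latest")
print(classifier("Looking forward to the long weekend!"))
# e.g. [{'label': 'positive', 'score': 0.98}]
```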
  • 18
    mms-300m-1130-forced-aligner

    CTC-based forced aligner for audio-text in 158 languages

    ... to the TorchAudio forced alignment API. Users can integrate it easily through the Python package ctc-forced-aligner, and it supports GPU acceleration via PyTorch. The alignment pipeline includes audio processing, emission generation, tokenization, and span detection, making it suitable for speech analysis, transcription syncing, and dataset creation. This model is especially useful for researchers and developers working with low-resource languages or building multilingual speech systems.
    Downloads: 0 This Week
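    A hedged sketch of the CTC forced-alignment step via the TorchAudio API the model is compatible with; the tensors below are stand-ins for real emissions and transcript token ids, and the ctc-forced-aligner package mentioned above wraps model loading and preprocessing:

```python
# Hedged sketch: align transcript tokens to audio frames with CTC.
import torch
import torchaudio.functional as F

# Stand-ins: (batch, frames, vocab) log-probs from the CTC model and
# (batch, num_tokens) transcript token ids (the blank id 0 must not appear).
emissions = torch.randn(1, 1000, 32).log_softmax(dim=-1)
targets = torch.randint(1, 32, (1, 12))

frame_labels, frame_scores = F.forced_align(emissions, targets, blank=0)
print(frame_labels.shape)  # one token label per audio frame
```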
  • 19
    Meta-Llama-3-8B-Instruct

    Instruction-tuned 8B LLM by Meta for helpful, safe English dialogue

    ... available data and more than 10 million human-annotated examples, it excludes any Meta user data. The model is released under the Meta Llama 3 Community License, which allows commercial use for organizations with fewer than 700 million MAUs, and imposes clear use, attribution, and redistribution rules. Meta provides safety tools like Llama Guard 2 and Code Shield to help developers implement system-level safety in applications.
    Downloads: 0 This Week
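    A hedged sketch of chat-style inference, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct repository; recent Transformers pipelines accept chat-message lists for models that define a chat template:

```python
# Hedged sketch: instruction-following dialogue with Llama 3 8B Instruct.
from transformers import pipeline

chat = pipeline("text-generation",
                model="meta-llama/Meta-Llama-3-8B-Instruct",
                device_map="auto")
messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Explain what a confidence interval is."},
]
reply = chat(messages, max_new_tokens=120)[0]["generated_text"][-1]
print(reply["content"])
```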
  • 20
    bert-base-chinese

    BERT-based Chinese language model for fill-mask and NLP tasks

    ... applications, including text classification, named entity recognition, and sentiment analysis in Chinese. It uses the same structure as the BERT base uncased English model, but it is trained entirely on Chinese data. While robust, like other large language models, it may reflect or amplify existing biases present in its training data. Due to limited transparency around the dataset and evaluation metrics, users should test it thoroughly before deployment in sensitive contexts.
    Downloads: 0 This Week
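    A minimal sketch of the fill-mask task named in the tagline; BERT models use the [MASK] token:

```python
# Hedged sketch: masked-token prediction with bert-base-chinese.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-chinese")
for pred in unmasker("北京是中国的[MASK]都。"):  # "Beijing is China's [MASK] capital."
    print(pred["token_str"], round(pred["score"], 3))
```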
  • 21
    Llama-2-7b

    7B-parameter foundational LLM by Meta for text generation tasks

    Llama-2-7B is a foundational large language model developed by Meta as part of the Llama 2 family, designed for general-purpose text generation in English. It has 7 billion parameters and uses an optimized transformer-based, autoregressive architecture. Trained on 2 trillion tokens of publicly available data, it serves as the base for fine-tuned models like Llama-2-Chat. The model is pretrained only, meaning it is not optimized for dialogue but can be adapted for various natural language...
    Downloads: 0 This Week
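    A hedged sketch of plain next-token generation from the pretrained base model; the Hub id meta-llama/Llama-2-7b-hf is an assumption for the converted weights, and the repository is gated:

```python
# Hedged sketch: base-model (non-chat) text completion with Llama-2-7B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "meta-llama/Llama-2-7b-hf"  # assumed Hub id for the converted weights
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16,
                                             device_map="auto")
inputs = tokenizer("Exploratory data analysis starts by",
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```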
  • 22
    GPT-2

    GPT-2 is a 124M parameter English language model for text generation

    GPT-2 is a pretrained transformer-based language model developed by OpenAI for generating natural language text. Trained on 40GB of internet data from outbound Reddit links (excluding Wikipedia), it uses causal language modeling to predict the next token in a sequence. The model was trained without human labels and learns representations of English that support text generation, feature extraction, and fine-tuning. GPT-2 uses a byte-level BPE tokenizer with a vocabulary of 50,257 and handles...
    Downloads: 0 This Week
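    A minimal sketch using the 124M checkpoint, published on the Hugging Face Hub as "gpt2":

```python
# Hedged sketch: reproducible sampling from GPT-2.
from transformers import pipeline, set_seed

set_seed(42)  # make the sample reproducible
generator = pipeline("text-generation", model="gpt2")
out = generator("Hello, I'm a language model,", max_new_tokens=30)
print(out[0]["generated_text"])
```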
  • 23
    starcoder

    Code generation model trained on 80+ languages with FIM support

    ... natural language. While it is not an instruction-tuned model, it can act as a capable technical assistant when prompted appropriately. Developers can use it for general-purpose code generation, with fine control over prefix/middle/suffix tokens. The model has some limitations: generated code may contain bugs or licensing constraints, and attribution must be observed when output resembles training data. StarCoder is licensed under the BigCode OpenRAIL-M license.
    Downloads: 0 This Week
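    A hedged sketch of the prefix/middle/suffix control mentioned above, following the BigCode fill-in-the-middle token convention; the bigcode/starcoder checkpoint is gated behind the OpenRAIL-M license:

```python
# Hedged sketch: fill-in-the-middle (FIM) prompting with StarCoder.
# The model is asked to generate the function body between prefix and suffix.
from transformers import pipeline

generator = pipeline("text-generation", model="bigcode/starcoder")
prompt = ("<fim_prefix>def mean(xs):\n    "
          "<fim_suffix>\n    return total / len(xs)<fim_middle>")
print(generator(prompt, max_new_tokens=16)[0]["generated_text"])
```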
  • 24
    Nanonets-OCR-s

    State-of-the-art image-to-markdown OCR model

    Nanonets-OCR-s is an advanced image-to-markdown OCR model that transforms documents into structured and semantically rich markdown. It goes beyond basic text extraction by intelligently recognizing content types and applying meaningful tags, making the output ideal for Large Language Models (LLMs) and automated workflows. The model expertly converts mathematical equations into LaTeX syntax, distinguishing between inline and display modes for accuracy. It also generates descriptive <img> tags...
    Downloads: 0 This Week
  • 25
    Llama-3.1-8B-Instruct

    Multilingual 8B-parameter chat-optimized LLM fine-tuned by Meta

    ...), and high-quality human and synthetic safety data. It excels at conversational AI, tool use, coding, and multilingual reasoning, achieving strong performance across a wide range of academic and applied benchmarks. The model is released under the Llama 3.1 Community License, which permits commercial use for organizations with fewer than 700 million monthly active users, provided they comply with Meta’s Acceptable Use Policy.
    Downloads: 0 This Week