Search Results for "sql data generator"

Showing 21 open source projects for "sql data generator"

1

Data-Science-Interview-Questions-Answers

Curated list of data science interview questions and answers

...The repository focuses on core data science fundamentals rather than acting as a software framework, which makes it especially useful as a study and revision resource. Its content is organized into subject-specific documents that cover machine learning, deep learning, statistics, probability, Python, SQL and databases, and resume-based interview questions. That structure makes it practical for users who want to study by topic, strengthen weak areas, or simulate the range of questions they may encounter in interviews.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
2

Data Science Interviews

Data science interview questions and answers

Data Science Interviews is an open-source repository that collects common data science interview questions along with community-provided answers and explanations. The project serves as a preparation resource for students, job seekers, and professionals who want to review the technical knowledge required for data science roles. The repository organizes questions into different categories including theoretical machine learning concepts, technical programming questions, and probability or statistics problems. ...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
3

cracking-the-data-science-interview

A Collection of Cheatsheets, Books, Questions, and Portfolio

Cracking the Data Science Interview is an open educational repository that collects study materials, resources, and reference links for preparing for data science interviews. The project organizes content across many fundamental areas of data science, including statistics, probability, SQL, machine learning, and deep learning. It includes cheat sheets that summarize important technical concepts commonly discussed during technical interviews.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
4

Spice.ai OSS

A self-hostable CDN for databases

Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB. ...

Downloads: 32 This Week

Last Update: 2 days ago
See Project
5

MindsDB

Making Enterprise Data Intelligent and Responsive for AI

MindsDB is an AI data solution that enables humans, AI, agents, and applications to query data in natural language and SQL, and get highly accurate answers across disparate data sources and types. MindsDB connects to diverse data sources and applications, and unifies petabyte-scale structured and unstructured data. Powered by an industry-first cognitive engine that can operate anywhere (on-prem, VPC, serverless), it empowers both humans and AI with highly informed decision-making capabilities. ...

Downloads: 7 This Week

Last Update: 2026-03-03
See Project
6

OpenMLDB

OpenMLDB is an open-source machine learning database

OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference. It is committed to solving data and feature challenges and has been deployed in hundreds of real-world enterprise applications. It prioritizes SQL-based feature engineering, offering a feature platform that keeps features consistent between training and inference. Real-time features are essential for many machine learning applications, such as real-time personalized recommendations and risk analytics. ...

Downloads: 8 This Week

Last Update: 2025-02-21
See Project
7

fugue

A unified interface for distributed computing

Fugue is a unified interface for distributed computing that lets users execute Python, Pandas, and SQL code on Spark, Dask, and Ray with minimal rewrites.

Downloads: 7 This Week

Last Update: 2026-02-20
See Project
8

Deepnote

Deepnote is a drop-in replacement for Jupyter

...The system supports programming languages such as Python, R, and SQL and allows users to execute and analyze data directly within interactive notebooks. Deepnote emphasizes team-based data science by enabling real-time collaboration similar to shared document editors, allowing multiple users to work simultaneously on the same notebook environment.

Downloads: 5 This Week

Last Update: 2026-03-26
See Project
9

PostgresML

The GPU-powered AI application database

...Leverage multiple types of natural language processing and machine learning models such as vector search and personalization with embeddings to improve search results. Leverage your data with time series forecasting to garner key business insights. Build statistical and predictive models with the full power of SQL and dozens of regression algorithms. Return results and detect fraud faster with ML at the database layer. PostgresML abstracts the data management overhead from the ML/AI lifecycle by enabling users to run ML/LLM models directly on a Postgres database.

Downloads: 8 This Week

Last Update: 2025-01-16
See Project
10

Learning Interpretability Tool

Interactively analyze ML models to understand their behavior

The Learning Interpretability Tool (LIT, formerly known as the Language Interpretability Tool) is a visual, interactive ML model-understanding tool that supports text, image, and tabular data. It can be run as a standalone server, or inside of notebook environments such as Colab, Jupyter, and Google Cloud Vertex AI notebooks.

Downloads: 7 This Week

Last Update: 2024-12-20
See Project
11

HeavyDB

HeavyDB (formerly MapD/OmniSciDB)

HeavyDB is an open-source GPU-accelerated analytical database designed to perform extremely fast queries on large datasets. The system is built as a SQL-based relational columnar database engine that leverages modern hardware parallelism, including GPUs and multicore CPUs. Its architecture allows users to query datasets containing billions of rows in milliseconds without requiring traditional indexing, pre-aggregation, or sampling techniques. HeavyDB was originally developed as part of the OmniSci platform (formerly MapD) and is commonly used for large-scale analytics and geospatial data processing. ...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
12

'Lightweight' GAN

Implementation of 'lightweight' GAN, proposed in ICLR 2021

Implementation of 'lightweight' GAN, proposed in ICLR 2021, in PyTorch. The main contribution of the paper is a skip-layer excitation in the generator, paired with autoencoding self-supervised learning in the discriminator. Quoting the one-line summary: "converge on single gpu with few hours' training, on 1024 resolution sub-hundred images". Augmentation is essential for Lightweight GAN to work effectively in a low-data setting. If you use augmentation, you can test and see how your images will be augmented before they pass into the neural network. ...

Downloads: 1 This Week

Last Update: 2025-01-12
See Project
13

PyTorch Implementation of SDE Solvers

Differentiable SDE solvers with GPU support and efficient sensitivity

...The example trains an SDE as the generator of a GAN, whilst using a neural CDE [4] as the discriminator.

Downloads: 0 This Week

Last Update: 2023-09-26
See Project
14

Byzer-lang

A low-code open-source programming language for data pipeline

Byzer (formerly MLSQL) is a low-code, open-source, distributed programming language for data pipelines, analytics, and AI in a cloud-native way. Its design principle is that everything is a table. Byzer is a SQL-like language that simplifies data pipelines, analytics, and AI, combined with built-in algorithms and extensions. We believe that, because everything is a table, a simple and powerful SQL-like language can significantly reduce the human effort of data development without switching between different tools.

Downloads: 0 This Week

Last Update: 2024-08-13
See Project
15

EZStacking

EZStacking is a Jupyter notebook generator for machine learning

EZStacking is a Jupyter notebook generator for supervised learning problems using Scikit-Learn pipelines and stacked generalization. EZStacking handles classification and regression problems for structured data. It can also be viewed as a development tool, because a notebook generated with EZStacking contains:
- an exploratory data analysis (EDA) used to assess data quality
- a modelling step producing a reduced-size stacked estimator
- a server returning a prediction, a measure of the quality of the input data, and the execution time

Downloads: 0 This Week

Last Update: 2022-06-30
See Project
16

Synthetic Mixed Data Generator

A Synthetic Data Generator for producing mixed datasets described by relevant, irrelevant, and redundant features.
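The distinction between relevant, irrelevant, and redundant features can be illustrated with a minimal sketch. This is not the project's algorithm: the function name `make_mixed_dataset`, the column layout, and the reading of "mixed" as numeric plus categorical columns are all assumptions made for illustration.

```python
import random


def make_mixed_dataset(n_rows, seed=0):
    """Hypothetical sketch: return (rows, labels).

    Each row is [relevant, redundant, irrelevant, category].
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        relevant = rng.gauss(0.0, 1.0)          # the only feature that drives the label
        redundant = 2.0 * relevant + 1.0        # linear copy of the relevant feature, adds no new information
        irrelevant = rng.gauss(0.0, 1.0)        # independent noise, unrelated to the label
        category = rng.choice(["a", "b", "c"])  # categorical column (assumed meaning of "mixed"), also unrelated
        rows.append([relevant, redundant, irrelevant, category])
        labels.append(1 if relevant > 0 else 0)
    return rows, labels


rows, labels = make_mixed_dataset(5, seed=42)
for row, label in zip(rows, labels):
    print(row, label)
```

A feature-selection method run on such data would ideally keep one of the first two columns and discard the rest.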

Downloads: 0 This Week

Last Update: 2021-11-17
See Project
17

BlazingSQL

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python

BlazingSQL is a GPU-accelerated SQL engine built on top of the RAPIDS ecosystem. RAPIDS is based on the Apache Arrow columnar memory format, and cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. BlazingSQL is a SQL interface for cuDF, with various features to support large-scale data science workflows and enterprise datasets.

Downloads: 0 This Week

Last Update: 2024-08-14
See Project
18

awesome-TS-anomaly-detection

List of tools & datasets for anomaly detection on time-series data

All lists are in alphabetical order. Within the lists, maintained projects are prioritized over unmaintained ones. A repository is considered "not maintained" if the latest commit is more than 1 year old, or if the authors explicitly say so.
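The maintenance rule above can be written as a small predicate. This is a sketch only: the function name `is_not_maintained` and its parameters are hypothetical, and "1 year" is approximated as 365 days.

```python
from datetime import date, timedelta


def is_not_maintained(last_commit, today, flagged_by_authors=False):
    """Hypothetical helper applying the list's stated rule.

    A repository counts as not maintained when its authors say so, or when
    the latest commit is more than one year old (approximated as 365 days).
    """
    return flagged_by_authors or (today - last_commit) > timedelta(days=365)


print(is_not_maintained(date(2023, 1, 1), today=date(2024, 8, 7)))
print(is_not_maintained(date(2024, 6, 1), today=date(2024, 8, 7)))
```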

Downloads: 0 This Week

Last Update: 2024-08-07
See Project
19

midipiano_chung

midipiano_chung is a free, open-source, standalone virtual MIDI acoustic piano synthesizer/expander with sample-based sounds and original DSP effects. It connects to the chosen virtual MIDI input ports (up to 3, if any) and the MIDI out (thru) ports of your computer. It is easily extensible by adding or modifying sound files (mp3, wav) in the /sounds/ folder. It works well as the output for the midi_chung player and the midirec_chung recorder, or with an external USB MIDI master keyboard. It is written in compiled FreeBASIC and uses fbsound, and it can run on a small netbook. The recorder/player has autochord, themeonly, and learn (automatic chord learning from played data) functions, which automatically add or replace chords for any melody, and it can record and export to MIDI files. Also included are brainpiano_chung, a version with a trial neural-network "brain" autochord (it really adds chords), and brainpiano2_chung, a version with a custom neural-network music generator. brainpiano3_chung has also been added, and a trial quantize function has been added to the recorder.

Downloads: 1 This Week

Last Update: 2019-10-28
See Project
20

libVMR

VMR - machine learning library

libVMR is a class library written in Java that implements a code generator for the group method of data handling (GMDH). The library is intended for users with machine learning skills. libVMR provides an effective framework for research and development in data mining and predictive analytics. It is based on the vector machine by Reshetov (VMR), which the project describes as the most popular neural network model, with higher generalization ability gained from kernel tricks.

Downloads: 1 This Week

Last Update: 2015-06-21
See Project
21

pyIRDG

IMDb Relational Dataset Generator

pyIRDG is a program written in Python that generates relational datasets in Prolog format. It uses data from the Internet Movie Database, with IMDbPY as the backend. A graphical user interface written in PyQt allows the user to link multiple entities together as a model for the generation process. The four main entities are Title, Person, Company, and Character. Many attributes can be chosen for inclusion in the output .pl file. Three types of constraints on attributes are available to limit...
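To give a rough idea of what relational data in Prolog format looks like, here is a minimal sketch that renders records as Prolog facts. It is not pyIRDG's code or its actual output schema: the helpers `prolog_atom` and `to_fact`, the predicate names, and the placeholder values are all hypothetical.

```python
def prolog_atom(value):
    """Quote a value as a Prolog atom, escaping backslashes and single quotes."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_fact(predicate, *args):
    """Format one Prolog fact; integers stay bare, everything else is quoted."""
    rendered = [
        str(a) if isinstance(a, int) and not isinstance(a, bool) else prolog_atom(a)
        for a in args
    ]
    return f"{predicate}({', '.join(rendered)})."


# Placeholder records, not real IMDb data.
facts = [
    to_fact("title", "Example Film", 1999),
    to_fact("person", "Jane O'Example"),
    to_fact("acted_in", "Jane O'Example", "Example Film"),
]
print("\n".join(facts))
```

Linking entities, as the GUI does, corresponds here to facts such as the hypothetical `acted_in` that mention two entities at once.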

Downloads: 0 This Week

Last Update: 2014-03-09
See Project
