WikiSQL

A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work, Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. A note on tokenization and Stanza: when WikiSQL was written three years ago, it relied on Stanza, a CoreNLP Python wrapper that has since been deprecated. If you would still like to use the tokenizer, please use the Docker image. We do not anticipate switching to the current Stanza, because changes to the tokenizer would make the previous results impossible to reproduce.

Features

Both the evaluation script and the dataset are stored in the repo
Only Python 3 is currently supported
The data folder contains the files in jsonl and db formats
We supply a sample predictions file for the dev set
In addition to the raw data dump, we also release an optional annotation script that annotates WikiSQL
Develop natural language interfaces for relational databases
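
The jsonl files mentioned above hold one JSON object per line, so they can be read with the standard library alone. The following is a minimal sketch, not the project's own loader: the helper name read_jsonl and the sample records (including the keys "question" and "table_id") are hypothetical and only illustrate the line-by-line parsing.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines rather than failing on them
            yield json.loads(line)


# Hypothetical records for illustration only; the real files may use
# different field names.
sample = io.StringIO(
    '{"question": "How many rows are there?", "table_id": "t1"}\n'
    '\n'
    '{"question": "Who won in 2001?", "table_id": "t2"}\n'
)

records = list(read_jsonl(sample))
print(len(records), records[0]["question"])
```

To read a file from the data folder instead, pass an open file handle, for example open(path, encoding="utf-8"), where path is whichever jsonl file you want to inspect.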

License

BSD License

Additional Project Details

Programming Language

Python

Related Categories

Python HTML XHTML, Python Database Software, Python Reinforcement Learning Frameworks, Python Reinforcement Learning Libraries, Python Reinforcement Learning Algorithms

Registered

2022-07-26
