Tofu is a Python library for generating synthetic UK Biobank data. The UK Biobank is a large open-access prospective research cohort study of 500,000 middle-aged participants recruited in England, Scotland and Wales. The study has collected, and continues to collect, extensive phenotypic and genotypic detail about its participants, including data from questionnaires, physical measures, sample assays, accelerometry, multimodal imaging, genome-wide genotyping and longitudinal follow-up for a wide range of health-related outcomes.

Tofu generates synthetic data that conform to the structure of the baseline data UK Biobank sends to researchers. It does so by filling each field with random values; how a value is drawn depends on the field type, as listed under Features.

Features

  • For categorical variables (single or multiple choices), a random value will be picked from the UK Biobank data dictionary for that field
  • For continuous variables, a random value will be generated based on the distribution of values reported for that field on the UK Biobank showcase
  • For date and date/time fields, a random date will be generated
  • For all other fields, such as polymorphic fields, no data will be generated
  • The lookups directory contains lookups downloaded from the UK Biobank showcase
  • Data conform to the structure and schema of the baseline file but are otherwise nonsensical: no checks have been implemented across fields
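
The per-type rules listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Tofu's actual code or API: the field names, codes, summary statistics and the functions synth_value and synth_row are all hypothetical, and the continuous case assumes a normal distribution clipped to a reported minimum and maximum, which the project description does not confirm.

```python
import random
from datetime import date, timedelta

# Hypothetical field specifications. In a real run these would be derived
# from the UK Biobank data dictionary and showcase summaries; every name,
# code and statistic below is invented for illustration.
FIELDS = {
    "example_single": {"type": "categorical_single", "codes": [0, 1]},
    "example_multiple": {"type": "categorical_multiple", "codes": [1, 2, 3, 4]},
    "example_continuous": {
        "type": "continuous", "mean": 27.0, "sd": 5.0, "min": 12.0, "max": 75.0,
    },
    "example_date": {
        "type": "date", "start": date(2006, 1, 1), "end": date(2010, 12, 31),
    },
    "example_other": {"type": "compound"},  # stands in for unsupported types
}


def synth_value(spec, rng):
    """Draw one random value for a field, or None for unsupported types."""
    kind = spec["type"]
    if kind == "categorical_single":
        # Pick one code from the field's list of allowed codes.
        return rng.choice(spec["codes"])
    if kind == "categorical_multiple":
        # Pick a random non-empty subset of the allowed codes.
        k = rng.randint(1, len(spec["codes"]))
        return sorted(rng.sample(spec["codes"], k))
    if kind == "continuous":
        # Assumption: approximate the reported distribution with a normal
        # distribution and clip to the reported range.
        value = rng.gauss(spec["mean"], spec["sd"])
        return min(max(value, spec["min"]), spec["max"])
    if kind == "date":
        # Uniformly random day between start and end, inclusive.
        span_days = (spec["end"] - spec["start"]).days
        return spec["start"] + timedelta(days=rng.randint(0, span_days))
    # All other field types: no data generated.
    return None


def synth_row(fields, rng):
    """Generate one synthetic participant row; fields are drawn independently."""
    return {name: synth_value(spec, rng) for name, spec in fields.items()}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs give the same rows
    for _ in range(3):
        print(synth_row(FIELDS, rng))
```

Because each field is drawn independently, rows produced this way match a schema but carry no cross-field consistency, which mirrors the caveat in the last feature above.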

Additional Project Details

Programming Language

Python

Related Categories

Python Synthetic Data Generation Software

Registered

2023-05-22