Aggregate dataset of open data without identifying information

doi: 10.53962/g9j4-v2gy

Originally published on 2022-09-21 under a CC0 Public Domain Dedication



This module contains a principal dataset collated from various open datasets that we previously identified as not containing identifying information. The principal dataset serves as a pseudo-population from which smaller sample datasets, likewise free of identifying information, can be drawn. In a next step, these sample datasets will be used to generate precision estimates (α and 1-α) for algorithms that check for identifying information in open data. The principal dataset shared here contains 30,251 rows and a maximum of 23 columns.
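As an illustration of how sample datasets might be drawn from a pseudo-population, the sketch below takes repeated simple random samples without replacement. The module does not specify the sampling scheme, so this choice is an assumption, as are the function name `draw_samples`, its parameters, and the toy population that stands in for the principal dataset.

```python
import random


def draw_samples(rows, n_samples, sample_size, seed=0):
    """Draw n_samples simple random samples from rows.

    Hypothetical helper: each sample is drawn without replacement,
    but the same row may appear in more than one sample.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [rng.sample(rows, sample_size) for _ in range(n_samples)]


# Toy stand-in for the principal dataset (not the real data).
population = [{"row_id": i, "value": i % 7} for i in range(1000)]

# Five sample datasets of 100 rows each.
samples = draw_samples(population, n_samples=5, sample_size=100)
```

Fixing the seed keeps the sample datasets reproducible, which may matter if the later precision estimates need to be recomputed on identical samples.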
