Student
Research supervisor(s)
Anne-Sophie Charest
Start date
Title of the research project
Confidentiality guarantees of a new method to generate synthetic data
Description

It is often difficult, and sometimes impossible, to share de-identified data between organisations and researchers because of ethical constraints on participant confidentiality. Synthetic datasets could facilitate data sharing. However, many current methods, which rely on multiple imputation (MI) techniques developed for missing data, reduce both the analytical potential of the data and the quality of the results.

This project therefore aims to assess the confidentiality guarantees of a promising new data synthesis method. The method adds a data masking step to a multiple imputation technique, so that synthetic data are generated according to the risk of each observation. In particular, the project will test attribute disclosure risks, that is, the risk that certain attributes can be inferred from other, known ones.

The feasibility and quality of the results will be tested on a dataset provided by the Institut de la statistique du Québec.
 
