Sharing de-identified data between organizations and researchers is often difficult because of ethical constraints related to respondent confidentiality. This is especially common in healthcare, where the data are inherently sensitive. One alternative is not to share the data directly, but to provide access through a tool that assesses the disclosure risk of each query and allows only those it considers safe. DataSHIELD is one such tool: it was proposed to protect the confidentiality of a dataset and can be used from the statistical software R. It also supports statistical analysis across several datasets hosted in different locations while preserving the confidentiality of the respondents.
In this project, we examine the confidentiality guarantees that the software provides, as well as its limitations. In particular, we aim to establish principles to guide the choice of the disclosure control parameters offered by the tool, and to understand more precisely how these controls affect the quality of the descriptive statistics, linear models, and graphs produced.