Daniel Gourdeau presents his research project as part of a student seminar organized by the Institute Intelligence and Data
Leila Nombo
Ph.D. candidate
Faculté des sciences et de génie
Université Laval
Data sharing is often limited by privacy concerns, particularly for health datasets, given the inherent sensitivity of this type of data. When sharing the original dataset is not possible, one approach is to generate a synthetic dataset that retains as much statistical information as possible from the original while containing records for fictitious individuals, thereby protecting respondents' confidentiality. One way to ensure that these synthetic data effectively protect respondents is to use differential privacy, a rigorous measure of disclosure risk.
This project investigates how to analyze these synthetic datasets to obtain valid statistical results, since traditional inference methods must be modified to account for the variability introduced by generating the synthetic dataset.
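To make the idea concrete, here is a minimal sketch (not the project's actual method; all parameter names and values are illustrative assumptions) of generating a synthetic dataset from a differentially private histogram using the Laplace mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_synthetic(data, bins, epsilon, n_synth, rng):
    """Draw a synthetic sample from a Laplace-noised histogram.

    Each individual contributes to exactly one bin, so the histogram has
    L1 sensitivity 1, and adding Laplace(1/epsilon) noise to each bin
    count gives epsilon-differential privacy for the released histogram.
    """
    counts, edges = np.histogram(data, bins=bins)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.size)
    probs = np.clip(noisy, 0, None)
    probs = probs / probs.sum()  # normalize into a sampling distribution
    # Sample a bin for each synthetic record, then draw uniformly within it.
    idx = rng.choice(probs.size, size=n_synth, p=probs)
    return rng.uniform(edges[idx], edges[idx + 1])

original = rng.normal(50, 10, size=1000)  # fictitious "health" variable
synthetic = dp_synthetic(original, bins=20, epsilon=1.0, n_synth=1000, rng=rng)
print(synthetic.mean(), original.mean())  # the two means should be close
```

The synthetic records preserve the overall distribution (so aggregate statistics remain useful) while no record corresponds to a real individual; the added Laplace noise is exactly the extra variability that modified inference methods must account for.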
Marzieh Ghiyasinasab
Postdoc fellow
Département de mathématiques et de génie industriel
Polytechnique Montréal
This research project is based on the analysis of massive data on the NOL index and other intraoperative clinical parameters used by anesthesiologists during surgery. These parameters help them make analgesic treatment decisions for a non-communicating patient under general anesthesia, for whom it is impossible to assess pain and analgesic needs with the standard questionnaires used with awake patients.
The first objective is to interpret the values of this index in relation to the decisions made by the clinician.
The second step is to develop an artificial intelligence algorithm that can guide decision-making, providing greater precision and better anesthetic safety for the patient.
Mathieu Baillargeon
M.Sc. candidate
Faculté des sciences et de génie
Université Laval
As in the project above, this project concerns synthetic datasets: datasets that preserve as much statistical information as possible from an original, sensitive dataset (health data in particular) while containing only fictitious individuals, in order to protect respondents' confidentiality.
This project focuses on rigorously measuring the confidentiality protection offered by a synthetic dataset. We will carefully examine measures proposed in the literature to understand their guarantees, and the differences and similarities between them, in order to identify the measure(s) most relevant for sharing synthetic data.
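As one illustrative example of such a measure (an assumption for illustration, not necessarily among those the project studies), here is a sketch of "distance to closest record" (DCR), a simple empirical proxy for disclosure risk: synthetic records that sit very close to real records may leak information about respondents.

```python
import numpy as np

def distance_to_closest_record(original, synthetic):
    """For each synthetic record, the Euclidean distance to its
    nearest record in the original dataset."""
    # Pairwise distances via broadcasting: shape (n_synth, n_orig).
    d = np.linalg.norm(synthetic[:, None, :] - original[None, :, :], axis=-1)
    return d.min(axis=1)

rng = np.random.default_rng(1)
orig = rng.normal(size=(200, 3))   # fictitious original records
synth = rng.normal(size=(200, 3))  # fictitious synthetic records
dcr = distance_to_closest_record(orig, synth)
# Very small DCR values would flag synthetic records that are near-copies
# of real respondents.
print(dcr.min(), np.median(dcr))
```

Unlike differential privacy, which is a formal guarantee on the generation mechanism, DCR is an after-the-fact empirical check; clarifying such differences is precisely the kind of comparison the project undertakes.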
Boby Lessard
M.Sc. candidate
Faculté des sciences et de génie
Université Laval
Multipoint scintillation detectors measure the radiation dose deposited simultaneously at many locations in space, with the advantage of allowing real-time measurements. However, these detectors must be precisely calibrated to provide accurate dose measurements.
The goal of this project is to develop an automated routine for calibrating multipoint scintillation detectors under the beam of a linear accelerator, such as those used for cancer treatments, by representing the calibration data in the principal component space.
A multipoint scintillation detector measures the spectrum of the light produced within it; this light is emitted in proportion to the radiation dose deposited in the detector. From a calibration dataset, a Non-Negative Matrix Factorization (NMF) algorithm is used to retrieve the pure spectral components of the measurements. To simplify visualization, the calibration dataset is transformed with Principal Component Analysis (PCA) and represented graphically in the principal component space, which shows the spectral composition of the data relative to the pure spectra.
Many calibration datasets can therefore be built, represented in this space, and used with the NMF algorithm to evaluate its performance across different calibration datasets.
Ultimately, this will determine which experimental datasets must be acquired to calibrate multipoint scintillation detectors accurately.
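The NMF-plus-PCA pipeline described above can be sketched as follows on simulated spectra (a minimal illustration only; the two "pure" spectra, channel count, and all parameters are assumptions, not the project's real calibration data):

```python
import numpy as np
from sklearn.decomposition import NMF, PCA

rng = np.random.default_rng(42)

# Two assumed "pure" emission spectra over 100 wavelength channels.
wavelengths = np.linspace(0, 1, 100)
pure = np.vstack([
    np.exp(-((wavelengths - 0.3) ** 2) / 0.005),  # first spectral component
    np.exp(-((wavelengths - 0.7) ** 2) / 0.010),  # second spectral component
])

# Simulated calibration measurements: non-negative mixtures of the pure
# spectra plus a small positive noise floor.
weights = rng.uniform(0, 1, size=(50, 2))
measurements = weights @ pure + rng.uniform(0, 0.01, size=(50, 100))

# NMF factorizes measurements ~ W @ H, where the rows of H approximate
# the pure spectral components and W holds their mixing weights.
nmf = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(measurements)
H = nmf.components_

# PCA projects the same measurements into principal component space,
# where the spectral composition of each measurement can be visualized.
scores = PCA(n_components=2).fit_transform(measurements)
print(W.shape, H.shape, scores.shape)  # (50, 2) (2, 100) (50, 2)
```

Comparing the recovered rows of `H` against known pure spectra, for many simulated calibration sets like this one, is one way to evaluate which experimental datasets yield an accurate calibration.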
Featured project
Prostate cancer is the second most frequent cancer and the fifth leading cause of cancer death among men. To improve patient outcomes, treatment must be personalized based on accurate prognosis. Nomograms already exist to identify patients at low risk for recurrence based on preoperative clinical information, but these tools do not use patients’ medical images.