As tools derived from artificial intelligence are used more frequently in medicine and health-related domains, understanding their predictions becomes increasingly important for judging whether those predictions can be trusted.
Because the goal of this project was to interpret a neural network, it required two main phases: evaluating the model and interpreting it. The first phase required implementing generalised methods to evaluate neural networks against pre-established metrics. Once the model's quality had been determined, interpretation techniques could be implemented that allow a human user to understand and analyse the model's predictions, which concluded the second phase of the project.
The project's third and final phase consisted of analysing the interpretation data obtained from the new methods and presenting the results to the other people working on the same neural network.
The intern developed a tool for converting pulmonary nodule annotation data stored in HDF5 files to the DICOM file format. The tool extracts annotation data from the HDF5 file as well as the lung computed tomography (CT) data of patients stored in a database. It then generates and saves a DICOM annotation file following the structure indicated by the DICOM Standard Browser. The intern programmed this tool in Python and tracked versions using GitLab. The project made it possible to convert the data of around a hundred patients.
Ultimately, the tool can be reused by other members of the research group for future projects that require converting annotation data to DICOM.
One of the primary challenges of diagnosing Alzheimer’s Disease (AD) lies in its progression through two silent decades. The lack of symptoms during this time makes it unlikely that patients will suspect the disease, or even be offered a precautionary brain scan. Moreover, the initial endogenous signs and noticeable symptoms often overlap with those seen in aging individuals without any diagnosed neurological disease. Amid these diagnostic challenges, fundamental and clinical research efforts have produced a sea of disparate information about the pathophysiology by tracing back the events that unfold in small- and large-scale components. Given the multifactorial nature of AD and this heavily delayed diagnostic timeframe, accounting for the abundance of candidate causal factors with standard statistical techniques becomes intractable.
Instead of tracing back AD signs and symptoms, we aim to simulate normal aging going forward in time, in the hope of detecting early Alzheimer’s signs more accurately as they emerge and subsequently diverge from typical aging-associated abnormalities. We therefore propose to restructure AD knowledge into several levels of abstraction, or scales, relying on mathematical modeling techniques to represent AD more comprehensively, while basing the model on the process of normal aging in humans from 18 to 100 years of age.
We will conduct a thorough literature search to estimate the parameter values required by our system of ordinary and partial differential equations, tailored to simulate normal aging. We will use an Agile approach to categorize entities known to play a role in aging: 1) at the nanoscale, compounds such as glucose and insulin, and proteins such as amyloid and tau; 2) at the microscale, neuronal and glial populations as well as the vascular endothelium; and 3) at the mesoscale, where these entities are brought together to simulate and predict the trajectory of biomarkers (e.g., neuronal integrity via cortical thickness, metabolic integrity via FDG-PET). The model’s estimated theoretical parameter values will in turn be validated against human data so that its development stays consistent with the longitudinal trajectory of the aging human.
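To make the idea of simulating forward in age concrete, here is a minimal sketch of integrating a single ordinary differential equation from age 18 to age 100 with the forward Euler method. It is not the project's model: the one-variable equation (constant production, first-order clearance), the names `simulate_euler` and `level_rate`, and the values of `K_PROD` and `K_CLEAR` are all hypothetical and chosen only for illustration.

```python
def simulate_euler(rate, y0, t_start, t_end, dt):
    """Integrate dy/dt = rate(t, y) from t_start to t_end with forward Euler."""
    n_steps = round((t_end - t_start) / dt)
    times = [t_start]
    values = [y0]
    for i in range(n_steps):
        t, y = times[-1], values[-1]
        # Euler update: follow the current slope for one step of size dt.
        values.append(y + dt * rate(t, y))
        times.append(t_start + (i + 1) * dt)
    return times, values


# Hypothetical parameters (arbitrary units), not taken from the literature.
K_PROD = 1.0    # production per year
K_CLEAR = 0.05  # fraction cleared per year


def level_rate(age, level):
    """Hypothetical dynamics: constant production, first-order clearance."""
    return K_PROD - K_CLEAR * level


ages, levels = simulate_euler(level_rate, y0=0.0, t_start=18.0, t_end=100.0, dt=0.1)
print(f"level at age {ages[-1]:.0f}: {levels[-1]:.2f}")
```

In this toy equation the level rises towards the steady state `K_PROD / K_CLEAR`. A multiscale model would couple many such equations across scales and would likely call for a more robust integrator than forward Euler.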
The multiscale hierarchy of neurological diseases, which is composed of an incredibly complex interactome, prompts us to move on from single-component analyses towards more carefully dissecting the most impactful entities, while adequately accounting for how they intertwine with each other during aging. This framework can provide a starting point for earlier detection of AD neurodegeneration and potentially facilitate the identification of more AD-specific pathways for future pharmacological interventions.
Alexandre Boulay's project involves the analysis of phages and bacteria in the gut microbiota, using a metagenomic dataset from the Institute of Nutrition and Functional Foods (INAF) at Université Laval and relying on bioinformatics and artificial intelligence (AI) methods. The dataset comes from a recent study that examined the interaction of the endocannabinoid axis with host environmental factors as well as gut, metabolic and mental health status in Quebec adults with various metabolic and lifestyle statuses. The overall objective is to train interpretable AI algorithms to identify phage and bacterial biomarkers of metabolic and mental health in individuals, and to study the interactions between bacteria and phages. This project could have important implications for the understanding of the interactions between bacteria and phages, which remain poorly understood, and for knowledge of the gut microbiota of the Quebec population in relation to their metabolic and mental health.
Rose-Marie's project focuses on the analysis of interactions between bacteriophages - the viruses of bacteria - and bacteria of the intestinal microbiota, based on datasets from experiments carried out by the student in collaboration with members of the Institute of Nutrition and Functional Foods (INAF) at Université Laval. The first objective is to study the impact of phages on bacterial dynamics in a simplified microbiota composed of 8 key bacterial strains of the human intestinal microbiota. The second objective is to study bacteria-phage dynamics in a complex microbiota representative of the human gut microbiota. For both objectives, following the experimentation and acquisition of sequencing data, Rose-Marie will perform data analysis using bioinformatics methods. This project could have important implications for the understanding of interactions between bacteria and phages, which remain poorly understood, and for knowledge of the intestinal microbiota in relation to the nutrition and health of individuals.
Synthetic healthcare datasets support the development of data analysis and machine learning techniques in healthcare by offering access to representative data for experimentation and model building, while mitigating the issues associated with handling highly sensitive data about human subjects. However, the performance and usefulness of the data analysis and machine learning methods applied depend on the quality of these synthetic datasets and on how well they represent the phenomenon being modelled.
The objective of the project is to develop machine learning methods for generating synthetic healthcare datasets that preserve the distribution and the temporality of real administrative healthcare datasets while ensuring that the confidentiality of sensitive information about the persons in the real dataset is preserved. This requires some guarantee that identifying real people from the original dataset is impossible or very unlikely, and that attributes of the real records (e.g. personal healthcare history) cannot be inferred from the synthetic dataset. Depending on the confidentiality guarantees that can be obtained for the real open medical data used to generate the synthetic datasets, synthetic versions of RAMQ datasets could be produced, and even disclosed more openly for research and analysis purposes if that is deemed acceptable.
It is often difficult, and sometimes impossible, to share de-identified data between organisations and researchers because of ethical constraints regarding participant confidentiality. Synthetic datasets could facilitate data sharing. However, many current methods, which use multiple imputation (MI) techniques for missing data, lower the analysis potential and the quality of the results.
This project therefore aims to assess the confidentiality guarantees of a promising new data synthesis method. This method adds a data masking step to a multiple imputation technique to generate synthetic data based on the risk of each observation. In particular, attribute disclosure risks, which refer to the disclosure of certain attributes based on other, known ones, will be tested.
The feasibility and quality of the results will be tested on a dataset provided by l’Institut de la statistique du Québec.
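As an illustration of what an attribute disclosure test can look like, the sketch below estimates how often an attacker who knows some attributes of a real person could correctly guess a sensitive attribute by looking up matching records in the synthetic data. This is one simple measure chosen for illustration, not necessarily the one used in the project. The function name `attribute_disclosure_rate`, the column names and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict


def attribute_disclosure_rate(real, synthetic, known_keys, target_key):
    """Share of real records whose target value is guessed correctly.

    The simulated attacker groups synthetic records by the known attributes
    and guesses the most frequent target value in the matching group.
    Real records with no matching synthetic group count as not disclosed.
    """
    votes = defaultdict(Counter)
    for row in synthetic:
        key = tuple(row[k] for k in known_keys)
        votes[key][row[target_key]] += 1

    hits = 0
    for row in real:
        key = tuple(row[k] for k in known_keys)
        if key in votes:
            guess, _count = votes[key].most_common(1)[0]
            if guess == row[target_key]:
                hits += 1
    return hits / len(real) if real else 0.0


# Toy, made-up records for illustration only.
real = [
    {"age_group": "35-44", "sex": "F", "diagnosis": "asthma"},
    {"age_group": "35-44", "sex": "M", "diagnosis": "diabetes"},
    {"age_group": "65-74", "sex": "F", "diagnosis": "diabetes"},
    {"age_group": "65-74", "sex": "M", "diagnosis": "none"},
]
synthetic = [
    {"age_group": "35-44", "sex": "F", "diagnosis": "asthma"},
    {"age_group": "35-44", "sex": "F", "diagnosis": "asthma"},
    {"age_group": "35-44", "sex": "M", "diagnosis": "none"},
    {"age_group": "65-74", "sex": "F", "diagnosis": "diabetes"},
]

rate = attribute_disclosure_rate(real, synthetic, ["age_group", "sex"], "diagnosis")
print(rate)  # 0.5 on this toy data: two of the four real records are guessed
```

A rate like this is only meaningful relative to a baseline, such as the rate obtained by always guessing the most common value of the target attribute.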
While some studies report the positive effects of continuing professional development (CPD) on clinical behaviour, few address the sustainability of these effects as well as the types of approaches that could improve this sustainability.
Our aim was to compare the durability of healthcare professionals' intention to have conversations with patients in cases of serious illness, after training using an interprofessional approach or an individual approach. We conducted a cluster randomised clinical trial with measurements immediately (T1), at 1 year (T2) and at 2 years (T3) after the intervention in primary care clinics in Canada and the United States. Results are reported according to CONSERVE (2021) guidelines. Clinics were randomly assigned to either interprofessional team training (intervention) or individual training (comparator). Our primary outcome of interest, healthcare professionals' intention to have conversations in cases of serious illness, and associated psychosocial variables (social norm, moral norm, beliefs about consequences, and beliefs about abilities) were measured using the CPD-Reaction questionnaire. Data were collected using self-administered questionnaires at 3 stages after training (T1, T2 and T3). Bivariate and multivariate statistical analyses were performed using a linear mixed model for each study time with an interaction term between time and arm. The average age of the 373 participants was 35-44 years, and 79% were women. On a scale of 1 to 7, at T1 the mean intention was 6.0 (SD 1.12) for the interprofessional arm and 6.4 (SD 0.7) for the individual arm. At T2, it was 5.65 (SD 1.39) and 6.04 (SD 0.88) in the interprofessional and individual arms respectively. At T3, it was 5.5 (SD 1.53) in the interprofessional arm and 6.3 (SD 0.74) in the individual arm. The p-value for the interaction between study arm and time was 0.05. The difference in mean intention between the two study arms was 0.02 (CI -0.26 to 0.31), -0.07 (CI -0.49 to 0.34) and -0.55 (CI -1.00 to -0.10) at T1, T2 and T3 respectively.
In conclusion, healthcare professionals' intention to have conversations in cases of serious illness varied over time according to the training approach. This intention was lower at the 1- and 2-year follow-up after training using an interprofessional approach compared with training using an individual approach.
Our results could help to improve continuing professional development, and hence the quality of care offered.
Fadwa Mehdaoui's project focuses on the analysis of interactions between bacteria and their viruses, called phages, in the microbiota using metagenomic data coupled with bioinformatics and machine learning methods.
The metagenomic data come from the North Sentinel project 3.6 (axis Environment-health interactions in the North) that sampled the gut microbiota of young Inuit from Nunavik. Following the identification of bacteria and phages in the sequencing data, her work will consist of an exploratory statistical analysis of the interactions between bacteria and phages, followed by a machine learning analysis based on interpretable algorithms such as the set covering machine or random forests. It is also envisaged to develop a new machine learning model for phage host prediction based on these data.
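To show why an algorithm such as the set covering machine is considered interpretable, here is a simplified sketch of its greedy training procedure for a conjunction of presence/absence features. It is an illustration under stated assumptions, not the project's pipeline: the function names `train_scm_conjunction` and `predict`, the trade-off parameter `p`, and the toy data are hypothetical, and only raw boolean features (no thresholds or negations) are considered.

```python
def train_scm_conjunction(X, y, p=1.0, max_rules=3):
    """Greedily pick features whose conjunction separates the classes.

    A feature "covers" a negative example when it is absent (0) in it, so the
    conjunction correctly rejects that example. Each step picks the feature
    maximising: covered negatives - p * positives it would wrongly reject.
    """
    n_features = len(X[0])
    negatives = {i for i, label in enumerate(y) if not label}
    positives = {i for i, label in enumerate(y) if label}
    chosen = []
    while negatives and len(chosen) < max_rules:
        best = None
        for j in range(n_features):
            if j in chosen:
                continue
            covered = {i for i in negatives if not X[i][j]}
            errors = {i for i in positives if not X[i][j]}
            utility = len(covered) - p * len(errors)
            if best is None or utility > best[0]:
                best = (utility, j, covered, errors)
        if best is None or not best[2]:
            break  # no remaining feature covers any negative example
        _utility, j, covered, errors = best
        chosen.append(j)
        negatives -= covered
        positives -= errors
    return chosen


def predict(chosen, x):
    """Predict positive only if every chosen feature is present."""
    return all(x[j] for j in chosen)


# Toy presence/absence matrix: rows are samples, columns are hypothetical taxa.
X = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
y = [True, True, False, False, False, False]

rules = train_scm_conjunction(X, y)
print(rules)  # [2, 0]: positive when features 2 and 0 are both present
```

The learned model is a short list of features that must all be present, which a domain expert can read and check directly. That readability is the appeal of this family of methods for biomarker discovery.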
This project could have important implications for the understanding of the interactions between bacteria and phages, which remain poorly understood, and for knowledge of the gut microbiota of the Inuit in relation to their unique diet.
Prediction and early identification of stroke are crucial to prevent emergency department (ED) revisits and to initiate treatment, reducing morbidity and mortality.
This project focuses on the analysis of non-contrast brain CT (NCCT) data to predict early ED revisits by patients who return with a stroke diagnosis. The first objective will be to gather open-source NCCT data as well as NCCT data from the Integrated Health and Social Services Center of Chaudière-Appalaches (CISSS-CA) in order to classify the presence or absence of stroke using an existing model. The second objective will be to develop and test a machine learning model that uses the weights of the previous model together with other relevant clinical data to predict short-term revisits to the ED.
From a clinical perspective, the development of such a tool may help support neuroradiologists in image interpretation and clinical decision making in the ED.
Prostate cancer is the second most frequent cancer and the fifth leading cause of cancer death among men. To improve patient outcomes, treatment must be personalized based on accurate prognosis. Nomograms already exist to identify patients at low risk for recurrence based on preoperative clinical information, but these tools do not use patients’ medical images.