Workshop
Deep Generative Models for Health
Emanuele Palumbo · Laura Manduchi · Sonia Laguna · Melanie F. Pradier · Vincent Fortuin · Stephan Mandt · Julia Vogt
Room 260 - 262
Deep generative models have recently gained increasing attention in machine learning research, with breakthroughs such as Stable Diffusion, DALL-E, and ChatGPT, among others. Despite significant advancements, the potential of generative AI in the health sector is not yet fully exploited. To address this gap, our workshop serves as a forum for presenting the latest research trends in generative models tailored for health applications. By bringing together a diverse pool of experts, we aim to investigate the methodological requirements and clinical implications of generative AI for health applications, thus shedding light on the challenges that lie ahead. Through this collaborative effort, we aspire to unlock the potential of generative models for groundbreaking advancements in the health sector.
Schedule
Fri 6:30 a.m. - 6:35 a.m. | Opening Remarks
Fri 6:35 a.m. - 7:20 a.m. | Invited Talk - Mihaela van der Schaar
In my talk, I will showcase how synthetic data, generated by deep generative models based on real-world data, enables solutions in healthcare that are unattainable with real data alone. I will discuss the transformation of biased datasets into unbiased ones using synthetic data. My talk will also explore how generative models facilitate transfer learning across various domains, enhancing the versatility of machine learning models. I will also cover the importance of data augmentation, where synthetic data enriches training sets for more comprehensive machine learning outcomes. Additionally, I will highlight the crucial role of synthetic data in the thorough testing and debugging of these models, ensuring their dependability in healthcare settings.
Fri 7:20 a.m. - 8:00 a.m. | Invited Talk - Rajesh Ranganath
This talk will discuss some of the uses of generative models in healthcare, dive into continuous-time generative models, and, as this is a workshop, step back to high-level speculations about generative modeling and the needs of generative modeling in healthcare. Along the way, I will cover two developments in continuous-time generative models: 1) learning the noising process in a diffusion model to maximize likelihood and 2) choosing the base distribution in flows/interpolants to facilitate learning.
References to what I will cover:
- Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions: https://arxiv.org/abs/2302.07261
- Stochastic interpolants with data-dependent couplings: https://arxiv.org/abs/2310.03725
- On the Feasibility of Machine Learning Augmented Magnetic Resonance for Point-of-Care Identification of Disease: https://arxiv.org/abs/2301.11962
Fri 8:00 a.m. - 8:15 a.m. | Break
Fri 8:15 a.m. - 8:30 a.m. | Best Paper Award (Spotlight): Protein Inpainting Co-Design with ProtFill
Designing new proteins with specific binding capabilities is a challenging task that has the potential to revolutionize many fields, including medicine and material science. Here we introduce ProtFill, a unified method for simultaneous protein structure and sequence design. Distinct from most existing computational design frameworks, which focus on either structure or sequence design, our method embraces both representations concurrently. Employing an SE(3) equivariant diffusion graph neural network, our method excels in both sequence prediction and structure recovery. We demonstrate the model's applicability in interface redesign for antibodies as well as other proteins, underscoring the efficacy of our approach and the potential of the diffusion framework in protein design.
Elizaveta Kozlova · Arthur Valentin · Daniel Nakhaee-Zadeh Gutierrez
Fri 8:30 a.m. - 8:45 a.m. | Spotlight: Hierarchical Protein Representation for Interface Co-design with HICON
Protein-protein interactions (PPIs) are essential for many biological processes, but their design is challenging due to their complex and dynamic nature. We propose a new model called Hierarchical Interface CO-design Network (HICON) that can jointly generate the sequence and 3D structure of protein interfaces. HICON uses a novel hierarchical architecture that combines atomic and amino acid resolutions in an equivariant manner and leverages Large Protein Language Models for sequence initialization. We evaluate HICON on a variety of biological interfaces, including protein-protein, enzyme-ligand, and antibody paratope-epitope interfaces. Our results show that HICON outperforms state-of-the-art models on sequence prediction and paratope co-design on several computational metrics.
Aous Khadhraoui · Daniel Nakhaee-Zadeh Gutierrez · Elizaveta Kozlova
Fri 8:45 a.m. - 9:25 a.m. | Invited Talk - Theofanis Karaletsos
Studying biological systems is hard, since they are the domain of microscopic processes that are typically difficult to measure and observe and are mired in complexity. A typical approach towards studying systems of such complexity is to perform perturbations, study their outcomes, and try to understand the links to mechanisms we may want to control better. In this talk, we will discuss a class of deep generative models [1] that is tailored to this task, in that it studies readouts of cells and disentangles latent spaces suitably to isolate perturbation effects. We will introduce the model, show how it can help us perform counterfactual reasoning over cells, discuss evaluation of such models, and sketch the work ahead to apply it fruitfully in service of discovery work.
[1] Modelling cellular perturbations with the sparse additive mechanism shift variational autoencoder. Michael Bereket, Theofanis Karaletsos. NeurIPS 2023.
Fri 9:30 a.m. - 10:30 a.m. | Poster Session 1: Paper ID <= 33
Fri 10:30 a.m. - 11:40 a.m. | Lunch
Fri 11:40 a.m. - 12:20 p.m. | Invited Talk - Finale Doshi-Velez
Title: Validation with Large Generative Models: A Need for Human-Centric Approaches
Abstract: Especially in applications such as health, we really want to know whether or not our models will behave as we want them to. And for smaller-surface models, including deep generative ones, we have a number of statistical and human-centered techniques to gain confidence that these models are doing largely reasonable things. However, these techniques, already partial for smaller-surface models, are able to provide even fewer assurances in the context of larger-surface models. In this talk, I will discuss how we must fundamentally re-think our approach to validation for larger-surface models. In particular, much of the validation effort must shift from statistical checks done in advance to human-centered checks for a particular output at task time. I will discuss how this effort will require new methods and lay out some open questions and directions in this space.
Fri 12:20 p.m. - 1:10 p.m. | Panel Discussion
Fri 1:10 p.m. - 1:25 p.m. | Break
Fri 1:25 p.m. - 1:40 p.m. | Spotlight: Generative Time Series Models with Interpretable Latent Processes for Complex Disease Trajectories
We propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories and demonstrate its effectiveness in modeling systemic sclerosis. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. We show that the learned temporal latent processes can be utilized for further data analysis, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series, including uncertainty quantification.
Cécile Trottet · Manuel Schürch · Maolaaisha Aminanmu · Ahmed Allam · Michael Krauthammer
Fri 1:40 p.m. - 1:55 p.m. | Spotlight: Synthetic Sleep EEG Signal Generation using Latent Diffusion Models
Electroencephalography (EEG) is a non-invasive method that allows for recording rich temporal information and is a valuable tool for diagnosing various neurological and psychiatric conditions. One of the main limitations of EEG is the low signal-to-noise ratio and the lack of data availability to train large data-hungry neural networks. Sharing large healthcare datasets is crucial to advancing medical imaging research, but privacy concerns often impede such efforts. Deep generative models have gained attention as a way to circumvent data-sharing limitations and as a possible way to generate data to improve the performance of these models. This work investigates latent diffusion models with a spectral loss as deep generative models to generate 30-second windows of synthetic EEG signals of sleep stages. The spectral loss is essential to guarantee that the generated signal contains structured oscillations in specific frequency bands that are typical of EEG signals. We trained our models using two large sleep datasets (Sleep EDFx and SHHS) and used the Multi-Scale Structural Similarity metric, Fréchet inception distance, and a spectrogram analysis to evaluate the quality of synthetic signals. We demonstrate that the latent diffusion model can generate realistic signals with the correct neural oscillations and could, therefore, be used to overcome the scarcity of EEG data.
Bruno Aristimunha · Raphael Yokoingawa de Camargo · Sylvain Chevallier · Oeslle Lucena · Adam Thomas · M. Jorge Cardoso · Walter Lopez Pinaya · Jessica Dafflon
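The spotlight above hinges on a spectral loss that keeps the generated EEG windows oscillating in the right frequency bands. As a minimal, hedged sketch of that idea (the authors' exact loss may differ), one can penalise the mismatch between log power spectra of real and generated windows, assuming signals shaped (batch, channels, time):

```python
# Hedged sketch of a spectral loss for generated EEG windows.
# Assumes 30-second signals as (batch, channels, time) tensors; this only
# illustrates the idea of matching power in EEG frequency bands.
import torch

def spectral_loss(x_real: torch.Tensor, x_fake: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """MSE between log power spectra of real and generated signals."""
    # rFFT over the time axis -> complex spectrum, then power per frequency bin
    p_real = torch.fft.rfft(x_real, dim=-1).abs() ** 2
    p_fake = torch.fft.rfft(x_fake, dim=-1).abs() ** 2
    # log scale emphasises relative power in each band
    return torch.mean((torch.log(p_real + eps) - torch.log(p_fake + eps)) ** 2)

# Usage: add this term to the usual reconstruction/diffusion objective.
x_real = torch.randn(4, 2, 30 * 100)   # e.g. 4 windows, 2 channels, 30 s at 100 Hz
x_fake = torch.randn(4, 2, 30 * 100)
loss = spectral_loss(x_real, x_fake)
```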
Fri 1:55 p.m. - 2:10 p.m. | Spotlight: Counterfactual Generative Models for Time-Varying Treatments
Estimating the counterfactual outcome of treatment is essential for decision-making in public health and clinical science, among others. Often, treatments are administered in a sequential, time-varying manner, leading to an exponentially increased number of possible counterfactual outcomes. Furthermore, in modern applications, the outcomes are high-dimensional, and conventional average treatment effect estimation fails to capture disparities in individuals. To tackle these challenges, we propose a novel conditional generative framework capable of producing counterfactual samples under time-varying treatment, without the need for explicit density estimation. Our method carefully addresses the distribution mismatch between the observed and counterfactual distributions via a loss function based on inverse probability weighting. We present a thorough evaluation of our method using both synthetic and real-world data. Our results demonstrate that our method is capable of generating high-quality counterfactual samples and outperforms the state-of-the-art baselines.
Shenghao Wu · Wenbin Zhou · Minshuo Chen · Shixiang Zhu
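The abstract above mentions a loss based on inverse probability weighting to correct the mismatch between observed and counterfactual distributions. A hedged sketch of the weighting idea (not the authors' exact objective) is to reweight each trajectory's generative loss by the inverse probability of the treatment sequence it actually received, estimated by a propensity model:

```python
# Hedged sketch of an inverse-probability-weighted generative loss.
# Assumes a separate propensity model providing P(a_t | history) per step.
import torch

def ipw_generative_loss(per_sample_loss: torch.Tensor,
                        treatment_probs: torch.Tensor,
                        clip=(0.05, 20.0)) -> torch.Tensor:
    """
    per_sample_loss: (batch,) generative loss (e.g. NLL) per observed trajectory
    treatment_probs: (batch, T) probability of the treatment actually taken at each step
    """
    # product over time of 1 / P(a_t | history), clipped for numerical stability
    weights = torch.prod(1.0 / treatment_probs.clamp_min(1e-6), dim=1)
    weights = weights.clamp(*clip)
    # normalise so the weighted loss stays on the same scale as the unweighted one
    weights = weights / weights.mean()
    return torch.mean(weights * per_sample_loss)
```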
Fri 2:10 p.m. - 2:25 p.m. | Spotlight: Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers
In recent years, significant progress has been made in the field of protein function prediction with the development of various machine-learning approaches. However, most existing methods formulate the task as a multi-classification problem, i.e., assigning predefined labels to proteins. In this work, we propose a novel approach, Prot2Text, which predicts a protein's function in a free-text style, moving beyond the conventional binary or categorical classifications. By combining Graph Neural Networks (GNNs) and Large Language Models (LLMs) in an encoder-decoder framework, our model effectively integrates diverse data types including protein sequence, structure, and textual annotation and description. This multimodal approach allows for a holistic representation of proteins' functions, enabling the generation of detailed and accurate functional descriptions. To evaluate our model, we extracted a multimodal protein dataset from SwissProt and demonstrate empirically the effectiveness of Prot2Text. These results highlight the transformative impact of multimodal models, specifically the fusion of GNNs and LLMs, empowering researchers with powerful tools for more accurate function prediction of existing as well as first-to-see proteins.
Hadi Abdine · Michail Chatzianastasis · Costas Bouyioukos · Michalis Vazirgiannis
Fri 2:25 p.m. - 2:30 p.m. | Closing Remarks
Fri 2:30 p.m. - 3:30 p.m. | Poster Session 2: Paper ID > 33
Poster: Lesion in-and-out painting for medical image augmentation
Deep learning (DL) in the medical imaging field suffers from a lack of usable data compared to natural images because of the private and sensitive nature of medical data. The data are also highly imbalanced, because for almost any disease, medical imaging datasets contain more patients without the condition than with it. To address these problems, synthetic data generation is considered a promising solution. In this study, we present Lesion In-aNd-Out Painting (LINOP) to generate synthetic medical images for data augmentation. A generative model based on the Mask-Aware Transformer (MAT) architecture was used to synthesize lesions onto normal images (inpainting) and to synthesize tissue outside the lesion area (outpainting). We train and validate a lesion inpainting pipeline on a mammography dataset and a lesion outpainting pipeline on a chest X-ray dataset. For mammography, the proposed augmentation showed up to 30.3% improvement on mass localization in terms of mAP@50, and for CXR, up to 10.3% improvement on disease classification in terms of AUROC.
Yisak Kim · Kyungmin Jeon · Soyeon Kim · Chang Min Park
Poster: Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance
Deep generative models have emerged as a promising approach in the medical image domain to address data scarcity. However, their use for sequential data like respiratory sounds is less explored. In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder. We also demonstrate a simple yet effective adversarial fine-tuning method to align features between the synthetic and real respiratory sound samples to improve respiratory sound classification performance. Our experimental results on the ICBHI dataset demonstrate that the proposed adversarial fine-tuning is effective, while only using the conventional augmentation method shows performance degradation. Moreover, our method outperforms the baseline by 2.24% on the ICBHI Score and improves the accuracy of the minority classes up to 26.58%. For the supplementary material, we provide the code and generated results.
June-Woo Kim · Chihyeon Yoon · Miika Toikkanen · Sangmin Bae · Ho-Young Jung
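The poster above aligns features between real and synthetic respiratory sounds via adversarial fine-tuning. One common way to realise such alignment, shown here as a hedged sketch rather than the authors' exact setup, is a domain discriminator trained through a gradient-reversal layer so that the encoder is pushed to make real and synthetic features indistinguishable:

```python
# Hedged sketch of adversarial feature alignment between real and synthetic
# audio features using a gradient-reversal domain classifier.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # flip the gradient flowing back into the feature encoder
        return -ctx.lam * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feats, lam: float = 1.0):
        return self.net(GradReverse.apply(feats, lam))

def alignment_loss(disc: DomainDiscriminator,
                   real_feats: torch.Tensor,
                   synth_feats: torch.Tensor,
                   lam: float = 1.0) -> torch.Tensor:
    # discriminator tries to tell real from synthetic features; the reversed
    # gradient makes the encoder pull the two distributions together
    logits = disc(torch.cat([real_feats, synth_feats]), lam)
    labels = torch.cat([torch.ones(len(real_feats), 1), torch.zeros(len(synth_feats), 1)])
    return nn.functional.binary_cross_entropy_with_logits(logits, labels)
```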
Poster: Clinical Time Series Imputation using Conditional Information Bottleneck
Clinical time series imputation presents a significant challenge because it requires capturing the underlying temporal dynamics from partially observed time series data. Among the recent successes of imputation methods based on generative models, the information bottleneck (IB) framework offers a well-suited theoretical foundation for multiple imputation, allowing us to account for the uncertainty associated with the imputed values. However, direct application of the IB framework to time series data without considering temporal context can lead to a substantial loss of temporal dependencies. To address this challenge, we propose a novel conditional information bottleneck (CIB) approach for time series imputation, which aims to mitigate the potentially negative consequences of the regularization constraint by reducing the redundant information conditioned on the temporal context. Our experiments, conducted on a real-world healthcare dataset and image sequences, demonstrate that our method significantly improves imputation performance and also enhances prediction performance based on the imputed values.
MinGyu Choi · Changhee Lee
Poster: fcVI: Flow Cytometry Variational Inference
Single-cell flow cytometry is pivotal in biomedical research, offering invaluable insights into cellular phenotypes and functions. However, its potential is often constrained by technical limitations, noise interference, and batch effects. In this context, we propose fcVI, a multimodal deep generative model tailored for integrative analysis of multiple massively parallel cytometry datasets from diverse sources. By effectively modeling noise variances, technical biases, and batch-specific disparities using a probabilistic data representation, we show that fcVI not only excels in missing protein marker imputation but also sets a pioneering standard in seamlessly integrating multiple cytometry panels. As a result, fcVI emerges as a potent tool for constructing comprehensive flow cytometry atlases and enhancing the precision of flow cytometry data analyses.
Kemal Inecik · Adil Meric · Fabian Theis
Poster: A GAN Model with Controllable Lesion Generation for Synthetic Capsule Endoscopy Datasets
In this paper, we address a novel approach to creating a synthetic capsule endoscopy dataset. Research using deep learning has been actively conducted in the medical area, and developing a deep learning model requires securing large amounts of high-quality data. However, medical data have privacy concerns and data bias issues; as a result, medical data for learning can be noisy and incomplete, and it is difficult to obtain data of sufficient quality and quantity. To overcome these limitations, synthetic data research has recently been in the spotlight. If we use synthetic data to train deep learning models, we can maintain a more uniform data format and labels. In this study, we aim to solve the problem of data scarcity by creating sufficient endoscopic datasets through naturally synthesizing the desired lesions at the desired locations. We apply the crop-and-paste method and CycleGAN to the capsule endoscopy dataset for the first time: after placing the desired lesion at the desired coordinates using crop-and-paste, a widely used data augmentation technique, we achieve natural synthesis using the CycleGAN model. We propose an image-to-image model that adjusts the location and type of lesion in the generated synthetic data. Through high-quality synthetic data generated in this way, we aim to realize the potential of deep learning in the medical field.
Hyundong Choi · Heechul Jung
Poster: Semantic Map Guided Synthesis of Wireless Capsule Endoscopy Images using Diffusion Models
Wireless capsule endoscopy (WCE) is a non-invasive method for visualizing the gastrointestinal (GI) tract, crucial for diagnosing GI tract diseases. However, interpreting WCE results can be time-consuming and tiring. Existing studies have employed deep neural networks (DNNs) for automatic GI tract lesion detection, but acquiring sufficient training examples, particularly due to privacy concerns, remains a challenge. Public WCE databases lack diversity and quantity. To address this, we propose a novel approach leveraging generative models, specifically the diffusion model (DM), for generating diverse WCE images. Our model incorporates a semantic map produced by a visualization scale (VS) engine, enhancing the controllability and diversity of generated images. We evaluate our approach using visual inspection and visual Turing tests, demonstrating its effectiveness in generating realistic and diverse WCE images.
Haejin Lee · Jeongwoo Ju · Jonghyuck Lee · Yeoun Joo Lee · Heechul Jung
Poster: Federated learning for causal inference using deep generative disentangled models
In the context of decentralized and privacy-constrained healthcare data settings, we introduce an innovative approach to estimate individual treatment effects (ITE) via federated learning. Emphasizing the critical importance of data privacy in healthcare, especially when drawing on data from various global hospitals, we address challenges arising from data scarcity and specific treatment assignment criteria influenced by the availability of the medication of interest. Our methodology uses federated learning applied to neural network-based generative causal inference models to bridge the gap between decentralized and centralized ITE estimation on a benchmark dataset.
Alejandro Almodóvar · Juan Parras · Santiago Zazo
Poster: CHIRon: A Generative Foundation Model for Structured Sequential Medical Data
Recent advances in large language models (LLMs) have shown that foundation models (FMs) can learn highly complex representations of sequences that can be used for downstream generative and discriminative tasks such as text generation and classification. While most FMs focus on text, recent work has shown that FMs can be learnt for sequential medical data, e.g. ICD-10 diagnosis codes associated with specific patient visits. These FMs demonstrate improved performance on downstream discriminative disease classification tasks, but cannot be used for generative tasks such as synthesizing artificial patient visits for data augmentation or privacy-preserving data sharing, since they utilize BERT-based pre-training. In this paper, we introduce CHIRon, the first generative FM for sequential medical data. CHIRon utilizes causal masking during pre-training, enabling generative applications, and incorporates a number of architectural improvements and support for additional medical data types (diagnoses, procedures, medications, lab results, place of service, demographics). We show empirically that CHIRon can be used to generate realistic sequential medical data and also outperforms state-of-the-art FMs for sequential medical data on disease classification tasks.
Brian Hill · Melika Emami · Vijay Nori · Aldo Cordova-Palomera · Robert Tillman · Eran Halperin
Poster: Mapping and Diagnosing Augmented Whole Slide Image Datasets with Training Dynamics
Pediatric heart transplantation represents the standard of care for children confronting end-stage heart failure. One of the most common postoperative complications, heart transplant rejection, has been monitored via surveillance endomyocardial biopsies and manual assessment by cardiac pathology experts. However, manual annotations with interobserver and intraobserver variability among cardiovascular pathology experts lead to significant disagreements about the severity of rejection. Artificial intelligence (AI)-enabled computational pathology usually requires large-scale manual annotations of gigapixel whole-slide images (WSIs) for effective model training. To address these challenges, we develop an AI-enabled rare disease detection framework for automating heart transplant rejection detection from WSIs of pediatric patients. Specifically, we conduct dataset cartography with data maps and training dynamics to map and diagnose the augmented samples, exploring the model behavior on individual instances during model training. Extensive experiments on internal and external patient cohorts have demonstrated the feasibility of both tile-level and biopsy-level detection with augmented samples. The proposed data-efficient learning framework may support seamless scalability to real-world rare disease detection without the burden of iterative expert annotations.
Wenqi Shi · Benoit Marteau · May Dongmei Wang
Poster: Automated clinical coding using off-the-shelf large language models
The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders. Efforts towards automated ICD coding are dominated by supervised deep learning models. However, difficulties in learning to predict the large number of rare codes remain a barrier to adoption in clinical practice. In this work, we leverage off-the-shelf pre-trained generative large language models (LLMs) to develop a practical solution that is suitable for zero-shot and few-shot code assignment. Unsupervised pre-training alone does not guarantee precise knowledge of the ICD ontology and the specialist clinical coding task, therefore we frame the task as information extraction, providing a description of each coded concept and asking the model to retrieve related mentions. For efficiency, rather than iterating over all codes, we leverage the hierarchical nature of the ICD ontology to sparsely search for relevant codes. Then, in a second stage, which we term ‘meta-refinement’, we utilise GPT-4 to select a subset of the relevant labels as predictions. We validate our method using Llama-2, GPT-3.5 and GPT-4 on the CodiEsp dataset of ICD-coded clinical case documents. Our tree-search method achieves state-of-the-art performance on rarer classes, achieving the best macro-F1 of 0.225, whilst achieving slightly lower micro-F1 of 0.157, compared to 0.216 and 0.219 respectively from PLM-ICD. To the best of our knowledge, this is the first method for automated ICD coding requiring no task-specific learning.
Joseph Boyle · Antanas Kascenas · Pat Lok · Maria Liakata · Alison O'Neil
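The sparse tree search over the ICD ontology described above can be illustrated with a short, hedged sketch: descend the hierarchy and only expand branches an LLM judges relevant to the clinical note, so that most codes are never scored. Here `ask_llm_is_relevant` is a hypothetical helper wrapping a prompt to a model such as Llama-2 or GPT-4; the actual prompts and the ‘meta-refinement’ stage are not reproduced.

```python
# Hedged sketch of LLM-guided sparse search over the ICD hierarchy.
from dataclasses import dataclass, field

@dataclass
class ICDNode:
    code: str
    description: str
    children: list = field(default_factory=list)

def ask_llm_is_relevant(note: str, description: str) -> bool:
    """Hypothetical: prompt an LLM to say whether the note mentions this concept."""
    raise NotImplementedError

def search_icd_tree(note: str, node: ICDNode, assigned: list) -> None:
    if not ask_llm_is_relevant(note, node.description):
        return                       # prune this whole subtree
    if not node.children:
        assigned.append(node.code)   # leaf code retrieved as a candidate label
        return
    for child in node.children:
        search_icd_tree(note, child, assigned)
```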
Poster: mmNormVAE: Normative Modeling on Multimodal Neuroimaging Data using Variational Autoencoders
Normative modelling is a popular method for studying brain disorders like Alzheimer's Disease (AD), where the normal brain patterns of cognitively normal subjects are modelled and can be used at the subject level to detect deviations relating to disease pathology. So far, deep learning-based normative frameworks have largely been applied to a single imaging modality. We aim to design a multimodal normative modelling framework based on multimodal variational autoencoders (mmNormVAE) where disease abnormality is aggregated across multiple neuroimaging modalities (T1-weighted and T2-weighted MRI) and subsequently used to estimate subject-level neuroanatomical deviations due to AD.
Sayantan Kumar · Philip Payne · Aristeidis Sotiras
Poster: DySurv: Dynamic Deep Learning Model for Survival Prediction in the ICU
Survival analysis helps approximate underlying distributions of times to events, which, in the case of critical care such as the ICU, can be a powerful tool for dynamic mortality risk prediction. Extending beyond the classical Cox model, deep learning techniques have been leveraged over the last years, relaxing many of the constraints of their statistical counterparts. In this work, we propose a novel conditional variational autoencoder-based method called DySurv, which uses a combination of static and time-series measurements from patient electronic health records to estimate the risk of death dynamically in the ICU. DySurv has been tested on standard benchmarks, where it outperforms most existing methods including other deep learning methods, and we evaluate it on a real-world patient database from MIMIC-IV. The predictive capacity of DySurv is consistent and the survival estimates remain disentangled across different datasets, supporting the idea that dynamic deep learning models based on conditional variational inference in multi-task cases can be robust models for survival analysis.
Munib Mesinovic · Peter Watkinson · Tingting Zhu
Poster: Semi-Supervised Diffusion Model for Brain Age Prediction
Brain age prediction models have succeeded in predicting clinical outcomes in neurodegenerative diseases, but can struggle with tasks involving faster-progressing diseases and clinical-grade data. To enhance their performance, we employed a semi-supervised diffusion model, obtaining a 0.90 (p<0.01) correlation between chronological and predicted age on clinical-grade T1w MR images. This outperformed standard non-generative methods. Furthermore, the predictions produced by our model were significantly associated with survival duration (r=0.24, p<0.01) in Amyotrophic Lateral Sclerosis. Thus, our approach demonstrates the value of diffusion-based architectures for the task of brain age prediction.
Ayodeji Ijishakin · Sophie Martin · Florence Townend · James Cole
Poster: Closing Gaps: An Imputation Analysis of ICU Vital Signs
As more ICU EHR data becomes available, the interest in developing clinical prediction models to improve healthcare protocols increases. However, insufficient data quality still hinders clinical prediction using Machine Learning (ML). Many vital sign measurements, such as heart rate, contain sizeable missing segments, leaving gaps in the data that could negatively impact prediction performance. Previous works have introduced numerous time-series imputation techniques. Nevertheless, more comprehensive work is needed to compare a representative set of imputation methods for imputing ICU vital signs to determine the best practice. In reality, simple and ad-hoc imputation techniques that could decrease prediction accuracy, like zero imputation, are still used. In this work, we compare established and recently developed imputation techniques to guide researchers in improving clinical prediction model performance by choosing the most accurate imputation technique. We introduce an extensible, reusable benchmark with, currently, 15 imputation and 4 amputation methods created for benchmarking on major ICU datasets. We hope to provide a comparative basis and facilitate further clinical ML development to bring more models to practice.
Robin van de Water · Bert Arnrich
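The benchmarking setup implied above (amputate values from an observed vital-sign trace, impute them, and score the error on the held-out values) can be sketched in a few lines. This toy example uses simple baselines only and is not the benchmark's own code:

```python
# Hedged sketch: mask ("amputate") part of a toy heart-rate series, impute it
# with a few simple baselines, and compare RMSE on the hidden values.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
hr = pd.Series(80 + 5 * np.sin(np.arange(200) / 10) + rng.normal(0, 1, 200))  # toy heart-rate trace

mask = rng.random(200) < 0.2          # amputation: hide 20% of the values
observed = hr.copy()
observed[mask] = np.nan

imputers = {
    "zero": observed.fillna(0.0),
    "mean": observed.fillna(observed.mean()),
    "forward_fill": observed.ffill().bfill(),
    "linear_interp": observed.interpolate(limit_direction="both"),
}
for name, imputed in imputers.items():
    rmse = np.sqrt(np.mean((imputed[mask] - hr[mask]) ** 2))
    print(f"{name:>14s}: RMSE on masked values = {rmse:.2f}")
```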
Poster: Uncovering the latent dynamics of whole-brain fMRI tasks with a sequential variational autoencoder
The neural dynamics underlying brain activity are critical to understanding cognitive processes and mental disorders. However, current voxel-based whole-brain dimensionality reduction techniques fail to capture these dynamics, producing latent timeseries that inadequately relate to behavioral tasks. To address this issue, we introduce a novel approach to learning low-dimensional approximations of neural dynamics using a sequential variational autoencoder (SVAE) that learns the latent dynamical system. Importantly, our method finds smooth dynamics that can predict cognitive processes with accuracy higher than classical methods, with improved spatial localization to task-relevant brain regions, and we find fixed points for the dynamics that are stable across random initialization of the model.
Eloy Geenjaar · Donghyun Kim · Riyasat Ohib · Marlena Duda · Amrit Kashyap · Sergey Plis · Vince Calhoun
Poster: Generative Multimodal Decoding: Reconstructing Images and Text from Human fMRI
The human brain adeptly processes immense visual information using complex neural mechanisms. Recent advances in functional MRI (fMRI) enable decoding this visual information from recorded brain activity patterns. In this work, we present an innovative approach for reconstructing meaningful images and captions directly from fMRI data, with a focus on brain captioning due to its enhanced flexibility over image decoding. We utilize the Natural Scenes fMRI dataset containing brain recordings from subjects viewing images. Our method leverages state-of-the-art image captioning and diffusion models for multimodal decoding. We train regression models between fMRI data and textual/visual features and incorporate depth estimation to guide image reconstruction. Our key innovation is a multimodal framework aligning neural and deep learning representations to generate both semantic captions and photorealistic images from brain activity. We demonstrate quantitative improvements in captioning over prior art and in image spatial relationships through our reconstruction pipeline. In conclusion, this work significantly advances brain decoding capabilities through an integrated vision-language approach. Our flexible decoding platform combining high-level semantic text and low-level visual depth information provides new insights into human visual cognition. The proposed methods could enable future applications in brain-computer interfaces, neuroscience, and AI.
Matteo Ferrante · Tommaso Boccato · Furkan Ozcelik · Rufin VanRullen · Nicola Toschi
Poster: Texture synthesis for realistic-looking virtual colonoscopy using mask-aware transformer
In virtual colonoscopy, computer vision techniques focus on depth estimation, photometric tracking, and simultaneous localization and mapping (SLAM). To narrow the domain gap between virtual and real colonoscopy data, it is necessary to utilize real-world data or employ a realistic-looking virtual dataset. We introduce a texture synthesis and outpainting strategy using a mask-aware transformer. The proposed method crafts textures for the colon's inner mucosa by utilizing a real colonoscopy dataset. The primary objective is to develop texture maps tailored for virtual colonoscopy. The proposed method provides an RGB-D dataset of synthesized textures for virtual colonoscopy, meeting requirements for realistic, controllable, and varied texture appearances. The proposed dataset leverages 9 video sequences, each generated from distinct colon models, accumulating a total of 14,120 frames, paired with ground-truth depth.
Seunghyun Jang · Yisak Kim · Dongheon Lee · Chang Min Park
Poster: DDxT: Deep Generative Transformer Models for Differential Diagnosis
Differential Diagnosis (DDx) is the process of identifying the most likely medical condition among the possible pathologies through the process of elimination based on evidence. Prior works have primarily relied on the Reinforcement Learning (RL) paradigm, under the intuition that it aligns better with how physicians perform DDx. In this paper, we show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark. The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., the DDx, and predicts the actual pathology using a neural network. Experiments are performed using the DDXPlus dataset. In the case of DDx, the proposed network achieved a mean accuracy of 99.82% and a mean F1 score of 0.9472. Additionally, mean accuracy reaches 99.98% with a mean F1 score of 0.9949 when predicting the ground truth pathology. The proposed DDxT outperformed previous RL-based approaches by a large margin. Overall, the automated DDx generative model has the potential to become a useful tool for a physician in times of urgency.
Mohammad Mahmudul Alam · Edward Raff · Tim Oates · Cynthia Matuszek
Poster: Large Language Models with Retrieval-Augmented Generation for Zero-Shot Disease Phenotyping
Identifying disease phenotypes from electronic health records is critical for numerous secondary uses. Manually encoding physician knowledge into rules is particularly challenging for rare diseases due to inadequate EHR coding, necessitating review of clinical notes. Large language models (LLMs) offer promise in text understanding but may not efficiently handle real-world clinical documentation. We propose a zero-shot LLM-based method enriched by retrieval-augmented generation and MapReduce, which pre-identifies disease-related text snippets to be used in parallel as queries for the LLM to establish the diagnosis. Applied to pulmonary hypertension, a rare disease, this method significantly outperforms physician logic rules (F1 score of 0.62 vs. 0.75), which has the potential to enhance rare disease cohort and care gap identification, expanding the scope of robust clinical research.
Will Thompson · David Vidmar · Jessica De Freitas · Gabriel Altay · Kabir Manghnani · Andrew Nelsen · Kellie Morland · John Pfeifer · Brandon Fornwalt · RuiJun Chen · Martin Stumpe · Riccardo Miotto
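The retrieve-then-MapReduce pattern described above can be sketched in a few lines, purely as an illustration: pre-select disease-related snippets, query the LLM on each snippet in parallel (the "map" step), then combine the per-snippet judgements (the "reduce" step). `retrieve_snippets` and `llm_phenotype_query` are hypothetical placeholders for the retriever and the zero-shot LLM call; the paper's actual prompts and aggregation rule may differ.

```python
# Hedged sketch of retrieval-augmented, MapReduce-style zero-shot phenotyping.
from concurrent.futures import ThreadPoolExecutor

def retrieve_snippets(notes: list, disease_terms: list) -> list:
    """Hypothetical retriever: keep only note passages mentioning disease-related terms."""
    return [n for n in notes if any(t.lower() in n.lower() for t in disease_terms)]

def llm_phenotype_query(snippet: str, disease: str) -> bool:
    """Hypothetical zero-shot LLM call: does this snippet support a diagnosis of `disease`?"""
    raise NotImplementedError

def phenotype_patient(notes, disease="pulmonary hypertension",
                      terms=("pulmonary hypertension", "PH")):
    snippets = retrieve_snippets(notes, list(terms))
    with ThreadPoolExecutor() as pool:                       # "map": query snippets in parallel
        votes = list(pool.map(lambda s: llm_phenotype_query(s, disease), snippets))
    return any(votes)                                        # "reduce": any supporting snippet
```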
Poster: JoLT: Jointly Learned Representations of Language and Time-Series
Time-series and text data are prevalent in healthcare and frequently exist in tandem, e.g., in electrocardiogram (ECG) interpretation reports. Yet, these modalities are typically modeled independently. Even studies that jointly model time-series and text do so by converting time-series to images or graphs. We hypothesize that explicitly modeling time-series jointly with text can improve tasks such as summarization and question answering for time-series data, which have received little attention so far. To address this gap, we introduce JoLT to jointly learn desired representations from pre-trained time-series and text models. JoLT utilizes a Querying Transformer (Q-Former) to align the time-series and text representations. Our experiments on a large real-world electrocardiography dataset for medical time-series summarization show that JoLT outperforms state-of-the-art image captioning and medical question-answering approaches, and that the decoder architecture, size, and pre-training data can vary the performance on said tasks.
Yifu Cai · Mononito Goswami · Arjun Choudhry · Arvind Srinivasan · Artur Dubrawski
Poster: A 3D Conditional Diffusion Model for Image Quality Transfer - An Application to Low-Field MRI
Low-field (LF) MRI scanners (<1T) are still prevalent in settings with limited resources or unreliable power supply. However, they often yield images with lower spatial resolution and contrast than high-field (HF) scanners. This quality disparity can result in inaccurate clinician interpretations. Image Quality Transfer (IQT) has been developed to enhance the quality of images by learning a mapping function between low and high-quality images. Previous IQT models often fail to restore high-frequency features, leading to blurry output. In this paper, we propose a 3D conditional diffusion model to improve 3D volumetric data, specifically LF MR images. Additionally, we incorporate a cross-batch mechanism into the self-attention and padding of our network, ensuring broader contextual awareness even under small 3D patches. Evaluations on the publicly available Human Connectome Project (HCP) dataset for IQT and brain parcellation demonstrate that our model outperforms existing methods both quantitatively and qualitatively.
Seunghoi Kim · Daniel Alexander · Ahmed Karam Eldaly · Matteo Figini
Poster: The Negative Impact of Denoising on Automated Classification of Electrocardiograms
We present an evaluation of recent state-of-the-art electrocardiogram denoising methods and assess their impact on the performance of automatic diagnosis classifiers, with a focus on the risk prediction of torsade de pointes arrhythmia. Our findings indicate that the traditional approach of evaluating denoising methods independently of the application is insufficient. This is particularly the case for applications where the signals are used for phenotype prediction. We observed that when classifiers are fed denoised data instead of raw data, their performance significantly deteriorates, with a decline of up to 40 percentage points in accuracy and up to 27 percentage points in AUROC when a misclassification detection method is further applied, underscoring a notable reduction in model reliability. These findings highlight the importance of considering the downstream impact of denoising on automated classification tasks and shed light on the complexities of trustworthiness in the context of healthcare applications.
Federica Granese · Ahmad Fall · Alex Lence · Joe-Elie Salem · Jean-Daniel Zucker · Edi Prifti
Poster: Generating Personalized Insulin Treatments Strategies with Conditional Generative Time Series Models
We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of novel multivariate treatment strategies tailored to the personalized patient history and trained for optimal expected future outcomes based on conditional expected utility maximization. We demonstrate our framework by generating personalized insulin treatment strategies and blood glucose predictions for hospitalized diabetes patients, showcasing the potential of our approach for generating improved personalized treatment strategies.
Manuel Schürch · Xiang Li · Ahmed Allam · Giulia Hofer · Maolaaisha Aminanmu · Claudia Cavelti-Weder · Michael Krauthammer
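Conditional expected utility maximization, as used above, can be illustrated with a hedged sketch: sample future blood-glucose trajectories from a conditional generative model for each candidate insulin strategy and keep the strategy with the highest expected utility. Here `model.sample_outcomes` and the time-in-range `utility` are hypothetical placeholders, not the authors' implementation.

```python
# Hedged sketch of treatment selection by conditional expected utility maximization.
import torch

def utility(glucose_traj: torch.Tensor, target=(70.0, 180.0)) -> torch.Tensor:
    """Toy utility: fraction of time the predicted glucose stays in the target range."""
    lo, hi = target
    return ((glucose_traj >= lo) & (glucose_traj <= hi)).float().mean(dim=-1)

def select_treatment(model, history, candidate_strategies, n_samples: int = 100):
    best_strategy, best_value = None, -float("inf")
    for strategy in candidate_strategies:
        # hypothetical API: sample future outcome trajectories given history and treatment
        trajs = model.sample_outcomes(history, strategy, n_samples=n_samples)
        expected_utility = utility(trajs).mean().item()
        if expected_utility > best_value:
            best_strategy, best_value = strategy, expected_utility
    return best_strategy, best_value
```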
Poster: LC-SD: Realistic Endoscopic Image Generation with Limited Training Data
Computer-assisted surgical systems provide support information to the surgeon, which can improve the execution and overall outcome of the procedure. These systems are based on deep learning models that are trained on complex and challenging-to-annotate data. Generating synthetic data can overcome these limitations, but it is necessary to reduce the domain gap between real and synthetic data. We propose a method for image-to-image translation based on a Stable Diffusion model, which generates realistic images starting from synthetic data. Compared to previous works, the proposed method is better suited for clinical application as it requires a much smaller amount of input data and allows finer control over the generation of details by introducing different variants of supporting control networks. The proposed method is applied in the context of laparoscopic cholecystectomy, using synthetic and real data from public datasets. It achieves a mean Intersection over Union of 69.76%, significantly improving the baseline results (69.76% vs. 42.21%). The proposed method for translating synthetic images into images with realistic characteristics will enable the training of deep learning methods that can generalize optimally to real-world contexts, thereby improving computer-assisted intervention guidance systems.
Joanna Kaleta · Diego Dall'alba · Szymon Plotka · Przemyslaw Korzeniowski
Poster: DiffRNAFold: Generating RNA Tertiary Structures with Latent Space Diffusion
RNA molecules provide an exciting frontier for novel therapeutics. Accurate determination of RNA structure could accelerate development of therapeutics through an improved understanding of function. However, the extremely large conformation space has kept the RNA 3D structure space largely unresolved. Using recent advances in generative modeling, we propose DiffRNAFold, a latent space diffusion model for RNA tertiary structure design. Our preliminary results suggest that DiffRNAFold generated molecules are similar in 3D space to true RNA molecules, providing an important first step towards accurate structure and function prediction in vivo.
Mihir Bafna · Vikranth Keerthipati · Subhash Kanaparthi · Ruochi Zhang
Poster: Effectively Fine-tune to Improve Large Multimodal Models for Radiology Report Generation
Writing radiology reports from medical images requires a high level of domain expertise. It is time-consuming even for trained radiologists and can be error-prone for inexperienced radiologists. It would be appealing to automate this task by leveraging generative AI, which has shown drastic progress in vision and language understanding. In particular, Large Language Models (LLMs) have demonstrated impressive capabilities recently and continued to set new state-of-the-art performance on almost all natural language tasks. While many have proposed architectures to combine vision models with LLMs for multimodal tasks, few have explored practical fine-tuning strategies. In this work, we propose a simple yet effective two-stage fine-tuning protocol to align visual features to an LLM's text embedding space as soft visual prompts. Our framework with OpenLLaMA-7B achieved state-of-the-art level performance without domain-specific pretraining. Moreover, we provide detailed analyses of soft visual prompts and attention mechanisms, shedding light on future research directions.
Yuzhe Lu · Sungmin Hong · Yash Shah · Panpan Xu
Poster: Adversarial Denoising Diffusion Model for Unsupervised Anomaly Detection
In this paper, we propose the Adversarial Denoising Diffusion Model (ADDM). The ADDM is based on the Denoising Diffusion Probabilistic Model (DDPM) but is complementarily trained by adversarial learning. The proposed adversarial learning is achieved by classifying model-based denoised samples and samples to which random Gaussian noise is added at a specific sampling step. With the addition of explicit adversarial learning on data samples, ADDM can learn the semantic characteristics of the data more robustly during training, achieving similar data sampling performance with much fewer sampling steps than DDPM. We apply ADDM to unsupervised anomaly detection in MRI images. Experimental results show that the proposed ADDM outperformed existing generative model-based unsupervised anomaly detection methods. In particular, compared to other DDPM-based anomaly detection methods, the proposed ADDM shows better performance with the same number of sampling steps and similar performance with 50% fewer sampling steps.
Jongmin Yu · Hyeontaek Oh · Jinhong Yang
Poster: Robust semi-supervised segmentation with timestep ensembling diffusion models
Medical image segmentation is challenging due to limited data and annotations. Denoising diffusion probabilistic models (DDPM) show promise in modelling natural image distributions and are successfully applied in medical imaging. Our research focuses on semi-supervised image segmentation using diffusion models' latent representations and addressing domain generalisation. We found that optimal performance depends on the choice of diffusion steps and ensembling. Our model outperformed in domain-shifted settings while remaining competitive within domain, highlighting DDPMs' potential for medical image segmentation.
Margherita Rosnati · Mélanie Roschewitz · Ben Glocker
Poster: ECG Inpainting with denoising diffusion prior
In this work, we train a generative denoising diffusion model (DDGM) on healthy electrocardiogram (ECG) data, capable of generating realistic healthy heartbeats. We then show how recent advances in solving linear inverse Bayesian problems with DDGMs can be used to derive interpretable outlier detection tools for electrophysiological anomalies.
Lisa Bedin · Gabriel Cardoso · Remi Dubois · Eric Moulines
Poster: Investigating Causality Between Genotype And Clinical Phenotype In Neurological Disorders Using Structural Causal Model and Normalizing Flow
Understanding the causal relationship between genotype and clinical phenotype is crucial for disease treatment and prognosis. Despite the existing literature exploring associations of genetics with clinical phenotypes such as imaging patterns and survival in various diseases, few if any works address the causation behind these correlated genotypes. This paper leverages recent advances in causal deep learning to formulate the phenotypical outcome given a change in genotype as a causal inference problem. We build upon a structural causal model (SCM) with normalizing flows parameterized by deep networks to perform counterfactual queries investigating the causal relationship between genotype and clinical phenotype in two types of neurological disorders. Specifically, we focus on the causal effect of (1) the APOE4 allele on brain volumetric measures in Alzheimer's disease and (2) key driver gene mutations on overall survival (OS) in glioblastoma. Experimental results show that APOE4 noncarriers causally lead to greater gray matter atrophy in the frontal lobe, and that survival-correlated genes do not exhibit a causal effect on OS in glioblastoma.
Fanyang Yu · Rongguang Wang · Pratik Chaudhari · Christos Davatzikos
Poster: Transferring Movement Understanding for Parkinson’s Therapy by Generative Pre-Training
Motion data is a modality of clinical importance for Parkinson's research but modeling it typically requires careful design of the machine learning system. Inspired by recent advances in autoregressive language modeling, we investigate the extent to which these modeling assumptions may be relaxed. We quantize motion capture data into discrete tokens and apply a generic autoregressive model to learn a model of human motion. Representing both positions and joint angles in a combined vocabulary, we model forward and inverse kinematics in addition to autoregressive prediction in 3D and angular space. This lets us pre-train on a 1B token, 40 hour dataset of motion capture, and then finetune on one hour of clinically relevant data in a downstream task. Despite the naivety of this approach, the model is able to perform clinical tasks and we demonstrate high performance classifying 5 hours of dance data.
Emily Napier · Gavia Gray · Sageev Oore
Poster: Language models are susceptible to incorrect patient self-diagnosis in medical applications
Large language models (LLMs) are becoming increasingly relevant as a potential tool for healthcare, aiding communication between clinicians, researchers, and patients. However, traditional evaluations of LLMs on medical exam questions do not reflect the complexity of real patient-doctor interactions. An example of this complexity is the introduction of patient self-diagnosis, where a patient attempts to diagnose their own medical conditions from various sources. While the patient sometimes arrives at an accurate conclusion, they are more often led toward misdiagnosis due to over-emphasis on bias-validating information. In this work, we present a variety of LLMs with multiple-choice questions from United States medical board exams, modified to include self-diagnostic reports from patients. Our findings highlight that when a patient proposes incorrect bias-validating information, the diagnostic accuracy of LLMs drops dramatically, revealing a high susceptibility to errors in self-diagnosis.
Rojin Ziaei · Samuel Schmidgall
Poster: Are we going MAD? Benchmarking Multi-Agent Debate between Language Models for Medical Q&A
Recent advancements in large language models (LLMs) underscore their potential for responding to medical inquiries. However, ensuring that generative agents provide accurate and reliable answers remains an ongoing challenge. In this context, multi-agent debate (MAD) has emerged as a prominent strategy for enhancing the truthfulness of LLMs. In this work, we provide a comprehensive benchmark of MAD strategies for medical Q&A, along with open-source implementations. This sheds light on the effective utilization of various strategies including the trade-offs between cost, time, and accuracy. We build upon these insights to provide a novel debate-prompting strategy based on agent agreement that outperforms previously published strategies on medical Q&A tasks.
Andries Smit · Paul Duckworth · Nathan Grinsztajn · Kale-ab Tessera · Tom Barrett · Arnu Pretorius
Poster: JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning
In the ongoing wave of impact driven by large language models (LLMs) like ChatGPT, the adaptation of LLMs to the medical domain has emerged as a crucial research frontier. Since mainstream LLMs tend to be designed for general-purpose applications, constructing a medical LLM through domain adaptation is a huge challenge. While instruction-tuning is used to fine-tune some LLMs, its precise roles in domain adaptation remain unknown. Here we show the contribution of LoRA-based instruction-tuning to performance in Japanese medical question-answering tasks. In doing so, we employ a multifaceted evaluation for multiple-choice questions, including scoring based on "Exact match" and "Gestalt distance" in addition to the conventional accuracy. Our findings suggest that LoRA-based instruction-tuning can partially incorporate domain-specific knowledge into LLMs, with larger models demonstrating more pronounced effects. Furthermore, our results underscore the potential of adapting English-centric models for Japanese applications in domain adaptation, while also highlighting the persisting limitations of Japanese-centric models. This initiative represents a pioneering effort in enabling medical institutions to fine-tune and operate models without relying on external services.
Issey Sukeda · Masahiro Suzuki · Hiroki Sakaji · satoshi kodera
Poster: Generative models for wearables data
Data scarcity is a common obstacle in medical research due to the high costs associated with data collection and the complexity of gaining access to and utilizing data. Synthesizing health data may provide an efficient and cost-effective solution to this shortage, enabling researchers to explore distributions and populations that are not represented in existing observations or difficult to access due to privacy considerations. To that end, we have developed a multi-task self-attention model that produces realistic wearable activity data. We examine the characteristics of the generated data and quantify its similarity to genuine samples.
Arinbjörn Kolbeinsson · Luca Foschini
Poster: Synthetic Data: Can We Trust Statistical Estimators?
The increasing interest in data sharing makes synthetic data particularly appealing. However, the analysis of synthetic data raises a unique set of methodological challenges. In this work, we highlight the importance of inferential utility and provide empirical evidence that naive inference from synthetic data (that handles these data as if they were really observed) is not appropriate, as the rate of false-positive findings (type I error) will be unacceptably high, even when the estimates are unbiased, due to underestimation of the true standard error. This is even more problematic for deep generative models. Valid inference from synthetic data will necessitate the construction of valid standard errors, to which our work contributes.
Alexander Decruyenaere · Paloma Rabaey · Christiaan Polet · Johan Decruyenaere · Stijn Vansteelandt · Thomas Demeester · Heidelinde Dehaene
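The statistical point made above (naive inference on synthetic data underestimates the true standard error and inflates type I error) can be demonstrated with a hedged toy simulation. This Gaussian example is only an illustration of the phenomenon, not the paper's experiments:

```python
# Hedged sketch: fit a simple "generator" (a Gaussian) to a small real sample,
# draw a larger synthetic sample, and run a naive test as if it were observed data.
# The naive standard error shrinks with the synthetic sample size, so confidence
# intervals are too narrow and the type I error far exceeds the nominal 5%.
import numpy as np

rng = np.random.default_rng(1)
n_real, n_synth, n_reps = 50, 5000, 2000
true_mean = 0.0
rejections = 0

for _ in range(n_reps):
    real = rng.normal(true_mean, 1.0, n_real)
    # generator fitted to the real sample
    synth = rng.normal(real.mean(), real.std(ddof=1), n_synth)
    # naive inference on the synthetic data as if it were really observed
    se_naive = synth.std(ddof=1) / np.sqrt(n_synth)
    z = (synth.mean() - true_mean) / se_naive
    rejections += abs(z) > 1.96

print(f"Naive type I error: {rejections / n_reps:.2f} (nominal 0.05)")
```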
Poster: BuDDI: Bulk Deconvolution with Domain Invariance to predict cell-type-specific perturbations from bulk
While single-cell experiments provide deep cellular resolution within a single sample, some single-cell experiments are inherently more challenging than bulk experiments due to dissociation difficulties, cost, or limited tissue availability. This creates a situation where we have deep cellular profiles of one sample or condition, and bulk profiles across multiple samples and conditions. To bridge this gap, we propose BuDDI (BUlk Deconvolution with Domain Invariance). BuDDI utilizes domain adaptation techniques to effectively integrate available corpora of case-control bulk and reference scRNA-seq observations to infer cell-type-specific perturbation effects. BuDDI achieves this by learning independent latent spaces within a single variational autoencoder (VAE) encompassing at least four sources of variability: 1) cell-type proportion, 2) perturbation effect, 3) structured experimental variability, and 4) remaining variability. Since each latent space is encouraged to be independent, we simulate perturbation responses by independently composing each latent space to simulate cell-type-specific perturbation responses. We evaluated BuDDI’s performance on simulated and real data with experimental designs of increasing complexity. We first validated that BuDDI could learn domain invariant latent spaces on data with matched samples across each source of variability. Then we validated that BuDDI could accurately predict cell-type-specific perturbation response when no single-cell perturbed profiles were used during training; instead, only bulk samples had both perturbed and non-perturbed observations. Finally, we validated BuDDI on predicting sex-specific differences, an experimental design where it is not possible to have matched samples. In each experiment, BuDDI outperformed all other comparative methods and baselines. As more reference atlases are completed, BuDDI provides a path to combine these resources with bulk-profiled treatment or disease signatures to study perturbations, sex differences, or other factors at single-cell resolution.
Natalie Davidson · Casey Greene
Poster: Multi-V-Stain: Multiplexed Virtual Staining of Histopathology Whole-Slide Images
Pathological assessment on Hematoxylin & Eosin (H&E) stained tissue samples is a clinically-established routine for cancer diagnosis. While providing rich morphological information, it lacks insights on protein expression patterns, essential for cancer prognosis and treatment decisions. Imaging Mass Cytometry (IMC) is adept at highly multiplexed protein profiling. However, it has challenges such as high operational cost and a restrictive focus on small Regions-of-Interest. To this end, we propose Multi-V-Stain, a novel image-to-image translation method for multiplexed IMC virtual staining. Our method can effectively leverage the rich morphological features from H&E images to predict multiplexed protein expressions on a Whole-Slide Image level. In our assessments using an in-house melanoma dataset, Multi-V-Stain consistently achieves higher image quality and generates stains that are more biologically relevant when compared to existing techniques.
Sonali Andani · Boqi Chen · Joanna Ficek-Pascual · Simon Heinke · Ruben Casanova · Bettina Sobottka · Bernd Bodenmiller · Viktor H Koelzer · Gunnar Rätsch
Poster: MEDiC: Mitigating EEG Data Scarcity Via Class-Conditioned Diffusion Model
Learning with a small-scale Electroencephalography (EEG) dataset is a non-trivial task. On the other hand, collecting a large-scale EEG dataset is equally challenging due to subject availability and procedure sophistication constraints. Data augmentation offers a potential solution to address the shortage of data; however, traditional augmentation techniques are inefficient for EEG data. In this paper, we propose MEDiC, a class-conditioned Denoising Diffusion Probabilistic Model (DDPM) based approach to generate synthetic EEG embeddings. We perform experiments on a publicly accessible dataset. Empirical findings indicate that MEDiC efficiently generates synthetic EEG embeddings, which can serve as effective proxies to original EEG data. The code and pre-trained model will be made publicly available after the paper's acceptance.
Gulshan Sharma · Abhinav Dhall · Ramanathan Subramanian