Data Science in Medicine & Biology II

Session Type: Lecture
Session Code: A2L-B
Location: Room 2
Date & Time: Wednesday, March 22, 2023 (10:20 - 11:20)
Chair: Adam Charles
Track: 5

Paper ID | Paper Name | Authors | Abstract
Paper ID: 3175
Paper Name: Boolean Factor Graph Modeling and Analysis of Gene Graphs: Budding Yeast Cell-Cycle
Authors: Stephen Kotiang, Ali Eslami
Abstract: The desire to understand genomic functions and the behavior of complex gene regulatory networks has recently been a major research focus in systems biology. As a result, a plethora of computational and modeling tools have been proposed to identify and infer interactions among biological entities. Here, we consider the general question of the effect of perturbation on the global dynamical network behavior as well as error propagation in biological networks to incite research pertaining to intervention strategies. This paper introduces a computational framework that combines the formulation of Boolean networks (BNs) and factor graphs to explore the global dynamical features of biological systems. A message-passing algorithm is proposed for this formalism to evolve network states as messages in the graph. The model is applied to assess the network state progression and the impact of gene deletion in the budding yeast cell cycle. Simulation results show that our model predictions match published experimental data.
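
As a rough illustration of the kind of state progression and gene-deletion experiment the abstract for paper 3175 describes, the following minimal sketch evolves a plain synchronous Boolean network. It is not the authors' factor-graph message-passing algorithm. The three-gene network (`A`, `B`, `C`), its signed edges, and the threshold update rule (switch on for positive net input, off for negative, keep the current value for zero) are all assumptions made for illustration, and the function names are hypothetical.

```python
# Hypothetical toy network, NOT the budding yeast cell-cycle network.
# (regulator, target) -> +1 for activation, -1 for inhibition.
GENES = ("A", "B", "C")
EDGES = {("A", "B"): 1, ("B", "C"): 1, ("C", "A"): -1}


def step(state, deleted=frozenset()):
    """Apply one synchronous threshold update to every gene."""
    nxt = {}
    for gene in GENES:
        if gene in deleted:
            nxt[gene] = 0  # a deleted gene is clamped to "off"
            continue
        total = sum(w * state[src] for (src, dst), w in EDGES.items() if dst == gene)
        if total > 0:
            nxt[gene] = 1
        elif total < 0:
            nxt[gene] = 0
        else:
            nxt[gene] = state[gene]  # no net input: keep the current value
    return nxt


def trajectory(state, deleted=frozenset(), max_steps=20):
    """Return the visited states until one repeats (fixed point or cycle)."""
    current = {g: (0 if g in deleted else v) for g, v in state.items()}
    seen = []
    while current not in seen and len(seen) < max_steps:
        seen.append(current)
        current = step(current, deleted)
    return seen


if __name__ == "__main__":
    start = {"A": 1, "B": 0, "C": 0}
    print("wild type:", trajectory(start))
    print("B deleted:", trajectory(start, deleted=frozenset({"B"})))
```

Comparing the two trajectories shows how clamping a single node changes which states the toy network visits, which is the type of question a deletion analysis asks.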

Paper ID: 3100
Paper Name: Prediction and Functional Characterization of Transcriptional Activation Domains
Authors: Saloni Mahatma, Lisa Van Den Broeck, Nicholas Morffy, Max Staller, Lucia Strader, Rosangela Sozzani
Abstract: Gene expression is induced by transcription factors (TFs) through their activation domains (ADs). However, ADs are unconserved, intrinsically disordered sequences without a secondary structure, making it challenging to recognize and predict these regions and limiting our ability to identify TFs. Here, we address this challenge by leveraging a neural network approach to systematically predict ADs. As input for our neural network, we used computed properties for amino acid (AA) side chain and secondary structure, rather than relying on the raw sequence. Moreover, to shed light on the features learned by our neural network and greatly increase interpretability, we computed the input properties most important for an accurate prediction. Our findings further highlight the importance of aromatic and negatively charged AA and reveal the importance of unknown AA properties. Taking advantage of these most important features, we used an unsupervised learning approach to classify the ADs into 10 subclasses, which can further be explored for AA specificity and AD functionality. Overall, our pipeline, relying on supervised and unsupervised machine learning, shed light on the non-linear properties of ADs.
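
The abstract for paper 3100 describes feeding computed amino acid properties, rather than raw sequence, into a neural network. A minimal sketch of that idea is below. It covers only the two properties the abstract singles out (aromaticity and negative charge, here folded into a net charge), averaged over sliding windows. The window width, the function names, and the example peptide are hypothetical, and the authors' actual feature set is not given in the abstract.

```python
# Standard residue groupings: F, W, Y are aromatic; D, E carry a negative
# side-chain charge; K, R carry a positive one. Histidine is left out here
# as a simplifying assumption.
AROMATIC = set("FWY")
NEGATIVE = set("DE")
POSITIVE = set("KR")


def residue_features(seq):
    """Map each residue to (is_aromatic, net_charge)."""
    feats = []
    for aa in seq.upper():
        aromatic = 1.0 if aa in AROMATIC else 0.0
        charge = (1.0 if aa in POSITIVE else 0.0) - (1.0 if aa in NEGATIVE else 0.0)
        feats.append((aromatic, charge))
    return feats


def window_features(seq, width=5):
    """Average the per-residue features over every window of `width` residues."""
    feats = residue_features(seq)
    windows = []
    for start in range(len(feats) - width + 1):
        win = feats[start:start + width]
        mean_aromatic = sum(f[0] for f in win) / width
        mean_charge = sum(f[1] for f in win) / width
        windows.append((mean_aromatic, mean_charge))
    return windows


if __name__ == "__main__":
    # Hypothetical peptide, chosen only to exercise the code.
    for row in window_features("MDDFWEKLLDE", width=5):
        print(row)
```

Vectors like these could then serve as inputs to any supervised model. The choice of window width controls how local the resulting description of the sequence is.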

Paper ID: 3157
Paper Name: Stacking Multiple Optimal Transport Policies to Map Functional Connectomes
Authors: Javid Dadashkarimi, Matthew Rosenblatt, Amin Karbasi, Dustin Scheinost
Abstract: Connectomics is a popular approach for understanding the brain with neuroimaging data. However, a connectome generated from one atlas is different in size, topology, and scale compared to a connectome generated from another. Consequently, connectomes generated from different atlases cannot be used in the same analysis pipelines. This limitation hinders efforts toward increasing sample size and demonstrating generalizability across datasets. Recently, we proposed Cross Atlas Remapping via Optimal Transport (CAROT) to find a spatial mapping between a pair of atlases based on a set of training data. The mapping transforms timeseries fMRI data parcellated with an atlas to form a connectome based on a different one. Crucially, CAROT does not need raw fMRI data and thus does not require re-processing, which can otherwise be time-consuming and expensive. The current CAROT implementation leverages information from several source atlases to create robust mappings for a target atlas. In this work, we extend CAROT to combine existing mappings between a source and target atlas for an arbitrary number of mappings. This extension (labeled Stacking CAROT) allows mappings between a pair of atlases to be created once and re-used with other pre-trained mappings to create new mappings as needed. Reconstructed connectomes from Stacking CAROT perform as well as those from CAROT in downstream analyses. Importantly, Stacking CAROT significantly reduces training time and storage requirements compared to CAROT. Overall, Stacking CAROT improves previous versions of CAROT.
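
To make the idea in the abstract for paper 3157 of an optimal transport mapping between two atlases more concrete, here is a minimal sketch. It uses entropy-regularized optimal transport (Sinkhorn iterations) between a three-region source parcellation and a two-region target parcellation. It is not the CAROT or Stacking CAROT implementation. The cost matrix, the uniform region masses, the regularization strength `eps`, and the barycentric projection used to move one time frame across atlases are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy problem: 3 source regions, 2 target regions.
COST = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]  # assumed dissimilarity between regions
SOURCE_MASS = [1.0 / 3.0] * 3
TARGET_MASS = [0.5, 0.5]


def sinkhorn_plan(cost, a, b, eps=0.5, iters=500):
    """Entropy-regularized transport plan between marginals a and b."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def map_frame(plan, x):
    """Barycentric projection of one source-atlas frame onto the target atlas."""
    mapped = []
    for j in range(len(plan[0])):
        mass = sum(row[j] for row in plan)
        mapped.append(sum(plan[i][j] * x[i] for i in range(len(x))) / mass)
    return mapped


if __name__ == "__main__":
    plan = sinkhorn_plan(COST, SOURCE_MASS, TARGET_MASS)
    print(map_frame(plan, [2.0, 1.0, 0.0]))
```

Applying `map_frame` to every time point would give target-atlas timeseries from which a connectome could be computed. How Stacking CAROT combines several such pre-trained mappings is not detailed in the abstract, so it is not sketched here.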