Modeling the Three-Dimensional Chromatin Structure from Hi-C Data with Transfer Learning
Abstract
Recent studies have revealed the importance of three-dimensional (3D) chromatin structure in the regulation of vital biological processes. In contrast to protein folding, no experimental procedure exists that can directly determine ground-truth 3D chromatin coordinates. Instead, chromatin conformation is studied indirectly using high-throughput chromosome conformation capture (Hi-C) methods, which quantify the frequency of all pairwise chromatin contacts. Computational methods that infer 3D chromatin structure from Hi-C data are thus unsupervised and limited by the assumption that contact frequency determines Euclidean distance. Inspired by recent developments in deep learning, in this work we explore transfer learning as a way to address the crucial lack of ground-truth data for 3D chromatin structure inference. We present a novel method, Transfer learning Encoder for CHromatin 3D structure prediction (TECH-3D), which combines transfer learning with data generation procedures to reconstruct chromatin structure. Our method outperforms previous deep learning approaches to chromatin structure inference and achieves results comparable to those of state-of-the-art algorithms on many tests, without making any assumptions about the relationship between contact frequencies and Euclidean distances. TECH-3D thus offers a new approach to chromatin structure inference and paves the way for future deep learning models.
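
For context, the assumption referred to above is commonly written as an inverse power law, d_ij = f_ij^(-alpha), where f_ij is the contact frequency between bins i and j and d_ij is the target Euclidean distance. The minimal sketch below illustrates that conventional conversion, which is the step TECH-3D is designed to avoid. The function name, the exponent alpha, and the toy contact matrix are illustrative assumptions and are not taken from this work.

```python
import numpy as np


def contacts_to_distances(contacts, alpha=1.0):
    """Convert contact frequencies to target distances via d = f^(-alpha).

    Hypothetical helper for illustration only. `alpha` is an assumed free
    parameter. Pairs with zero contacts are set to NaN because the power
    law assigns them no finite distance.
    """
    contacts = np.asarray(contacts, dtype=float)
    distances = np.full(contacts.shape, np.nan)
    observed = contacts > 0
    distances[observed] = contacts[observed] ** (-alpha)
    # A bin is at distance zero from itself.
    np.fill_diagonal(distances, 0.0)
    return distances


# Toy symmetric 3-bin contact matrix (made-up counts, not real Hi-C data).
toy_contacts = np.array([
    [0.0, 4.0, 1.0],
    [4.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
])
toy_distances = contacts_to_distances(toy_contacts, alpha=0.5)
```

Under this conversion the reconstructed structure depends directly on the chosen exponent, and unobserved contacts have no defined distance, which is why avoiding the assumption altogether is attractive.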