OPEN Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram
Charting an organ's biological atlas requires us to spatially resolve the entire single-cell transcriptome and to relate such cellular features to the anatomical scale. Single-cell and single-nucleus RNA-seq (sc/snRNA-seq) can profile cells comprehensively, but lose spatial information. Spatial transcriptomics allows for spatial measurements, but at lower resolution and with limited sensitivity. Targeted in situ technologies solve both issues, but are limited in gene throughput. To overcome these limitations, we present Tangram, a method that aligns sc/snRNA-seq data to various forms of spatial data collected from the same region, including MERFISH, STARmap, smFISH, Spatial Transcriptomics (Visium) and histological images. Tangram can map any type of sc/snRNA-seq data, including multimodal data such as those from SHARE-seq, which we used to reveal spatial patterns of chromatin accessibility. We demonstrate Tangram on healthy mouse brain tissue by reconstructing a genome-wide, anatomically integrated spatial map at single-cell resolution of the visual and somatomotor areas.
A Human Cell Atlas should combine high-resolution molecular and histological mapping with anatomical and functional data. Advances in single-cell and spatial genomics have opened the way to high-resolution spatial profiles, but each of the currently available technologies addresses only part of the challenge of resolving entire transcriptomes in space at single-cell resolution. On the one hand, sc/snRNA-seq profiles single cells transcriptome-wide, from which we can recover cell types, gene expression programs, and developmental relations, but it necessarily loses direct spatial information. On the other hand, spatial technologies resolve transcriptomes in space, but are limited in either gene throughput or spatial resolution. Targeted in situ technologies (such as in situ sequencing, multiplexed error-robust fluorescence in situ hybridization (MERFISH), single-molecule FISH (smFISH), cyclic-ouroboros smFISH (osmFISH), spatially resolved transcript amplicon readout mapping (STARmap), targeted expansion sequencing, and sequential FISH (seqFISH+)) are typically limited to hundreds of preselected genes, and adding more probes can reduce accuracy for some genes. Spatial transcriptomics methods (such as Spatial Transcriptomics (ST/Visium), Slide-seq, and High Definition Spatial Transcriptomics) spatially barcode entire transcriptomes, but with a limited capture rate (and substantial 'dropouts', which increase at higher resolution) and a spatial resolution coarser than a single cell, ranging from 50 to 100 μm for ST to 10 μm for Slide-seq. In addition, for biological interpretation, cellular features would ideally be related to the histological or organ scale, which is conventionally done using computer vision methods for the registration of medical images. However, these methods typically require human supervision, such as identification of anatomical landmarks in images, preventing the complete automation that is desirable for organ-scale mapping.
nature methods
Computational methods have previously bridged this gap by combining single-cell and spatial measurements. These methods can reconstruct key landmark genes by leveraging local alignment in transcriptome space, or assumptions such as continuity in gene expression. However, genes that are intrinsically sparse or granularly distributed are difficult to predict. For measurements at coarse spatial resolution, computational methods aim to deconvolve the data, by learning either a program dictionary or a probability distribution of the data, to infer the cell-type composition within a spatial voxel. However, deconvolution is hindered by spatial 'dropouts', in which cell types defined by sparse or dim markers are not correctly detected.
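To make the deconvolution idea and its sensitivity to dropouts concrete, the following is a minimal sketch and not any published method. It assumes a voxel's expression is a non-negative mixture of per-cell-type mean profiles and estimates the mixture by projected gradient descent. The function `deconvolve`, the three cell types, and all numbers are hypothetical and chosen only for illustration.

```python
# Toy deconvolution of one spatial voxel into cell-type proportions.
# Signatures and voxel values are invented for illustration only.

def deconvolve(signatures, voxel, steps=5000, lr=0.1):
    """Estimate non-negative weights w with sum_k w[k] * signatures[k] ~ voxel.

    Runs projected gradient descent on the squared error (weights are clipped
    at zero after each step), then rescales the weights to sum to 1.
    """
    n_types, n_genes = len(signatures), len(voxel)
    w = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Per-gene residual between the reconstructed and observed voxel.
        residual = [
            sum(w[k] * signatures[k][g] for k in range(n_types)) - voxel[g]
            for g in range(n_genes)
        ]
        # Gradient of the squared error with respect to each weight.
        grad = [
            2.0 * sum(residual[g] * signatures[k][g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Simultaneous step for every cell type, then clip at zero.
        w = [max(0.0, w[k] - lr * grad[k]) for k in range(n_types)]
    total = sum(w)
    return [x / total for x in w] if total > 0 else w


# Mean expression of four genes in three hypothetical cell types.
# Type C is marked only by a dim fourth gene.
signatures = [
    [1.0, 0.0, 0.2, 0.0],  # type A
    [0.0, 1.0, 0.2, 0.0],  # type B
    [0.1, 0.1, 0.2, 0.3],  # type C
]

# A voxel mixed as 50% A, 30% B, 20% C.
voxel = [0.52, 0.32, 0.20, 0.06]
clean = deconvolve(signatures, voxel)

# The same voxel with the dim marker of type C dropped out.
voxel_dropout = [0.52, 0.32, 0.20, 0.0]
dropped = deconvolve(signatures, voxel_dropout)

print("no dropout:  ", [round(p, 3) for p in clean])
print("with dropout:", [round(p, 3) for p in dropped])
```

In this toy setting the mixture is recovered when all markers are observed, while losing the single dim marker causes the proportion of type C to be substantially underestimated.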
Here, we present Tangram, a deep-learning framework that addresses two challenges: learning spatial gene-expression maps transcriptome-wide at single-cell resolution, and relating those maps to histological and anatomical information from the same specimens. Tangram learns a spatial alignment of sc/snRNA-seq data from reference spatial data of any kind, whether fine or coarse grained. We demonstrate this by spatially mapping snRNA-seq data from the isocortex of the healthy adult mouse brain using each of five kinds of spatial supports, which differ in resolution and gene coverage: ISH, smFISH, Visium (Spatial Transcriptomics), STARmap and MERFISH. Tangram produces consistent spatial maps of cell types and overcomes limitations in throughput or resolution: it corrects low-quality gene measurements, even in high-resolution methods; provides single-cell resolution for low-resolution methods; and provides genome-wide coverage for targeted methods. By mapping multimodal single-cell data (simultaneous high-throughput ATAC and RNA expression with sequencing (SHARE-seq)) onto a spatial support, Tangram visualizes spatial patterns of chromatin accessibility and transcription factor motif scores at single-cell resolution. Finally, Tangram includes a new, dedicated computer vision module that leverages histological data and maps it to anatomical positions in an existing Common Coordinate Framework of the brain. If a histology image is available, even without any further annotation, this module relates all scales in a single integrated atlas.
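The claim that an alignment yields genome-wide coverage for targeted methods can be illustrated with a small sketch. It assumes the alignment can be summarized as a matrix of probabilities that each cell belongs to each voxel. This section does not specify how Tangram learns that alignment, so the sketch substitutes a simple stand-in (a softmax over cosine similarities on the shared genes), which is an assumption and not the Tangram objective. The functions `soft_mapping` and `project_gene` and all values are hypothetical.

```python
import math

# Toy illustration: once cells are (softly) assigned to voxels using the genes
# measured in both datasets, any gene measured only in the single-cell data
# can be projected into space. The similarity-based mapping below is a
# stand-in heuristic, not Tangram's learned alignment.


def cosine(u, v):
    """Cosine similarity between two expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def soft_mapping(cells, voxels, temperature=0.05):
    """Return M[i][j], the probability that cell i sits in voxel j (rows sum to 1)."""
    mapping = []
    for cell in cells:
        scores = [cosine(cell, voxel) / temperature for voxel in voxels]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        mapping.append([w / total for w in weights])
    return mapping


def project_gene(mapping, gene_per_cell):
    """Predicted expression of one gene per voxel: sum_i M[i][j] * expression_i."""
    n_cells, n_voxels = len(mapping), len(mapping[0])
    return [
        sum(mapping[i][j] * gene_per_cell[i] for i in range(n_cells))
        for j in range(n_voxels)
    ]


# Three genes measured in both modalities (hypothetical counts).
cells = [[5, 0, 1], [4, 1, 0], [0, 6, 1], [1, 5, 0]]  # 4 single cells
voxels = [[9, 1, 1], [1, 11, 1]]                      # 2 spatial voxels

# A gene measured only in the single-cell data.
unmeasured_gene = [8, 7, 0, 1]

mapping = soft_mapping(cells, voxels)
predicted = project_gene(mapping, unmeasured_gene)
print([round(x, 2) for x in predicted])
```

Here the first two cells resemble the first voxel, so the gene they express is predicted to be concentrated there, even though it was never measured spatially.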