Source: Piplmetar.rs, 21 Sep 2022, 10:47

Mostly natural sequencing-by-synthesis for scRNA-seq using Ultima sequencing

Sean K. Simmons, Gila Lithwick-Yanai, Xian Adiconis, Florian Oberstrass, Nika Iremadze, Kathryn Geiger-Schuller (ORCID: orcid.org/0000-0002-6705-0681), Pratiksha I. Thakore, Chris J. Frangieh, Omer Barad, Gilad Almogy, Orit Rozenblatt-Rosen (ORCID: orcid.org/0000-0001-6313-3570), Aviv Regev, Doron Lipson & … Joshua Z. Levin (ORCID: orcid.org/0000-0002-0170-3598)
Nature Biotechnology (2022)


Abstract
Here we introduce a mostly natural sequencing-by-synthesis (mnSBS) method for single-cell RNA sequencing (scRNA-seq), adapted to the Ultima Genomics platform, and systematically benchmark it against current scRNA-seq technology. mnSBS uses mostly natural, unmodified nucleotides and only a low fraction of fluorescently labeled nucleotides, which allows for high polymerase processivity and lower costs. We demonstrate successful application in four scRNA-seq case studies of different technical and biological types, including 5′ and 3′ scRNA-seq, human peripheral blood mononuclear cells from a single individual and in multiplex, as well as Perturb-Seq. Benchmarking shows that results from mnSBS-based scRNA-seq are very similar to those using Illumina sequencing, with minor differences in results related to the position of reads relative to annotated gene boundaries, owing to single-end reads of Ultima being closer to gene ends than reads from Illumina. The method is thus compatible with state-of-the-art scRNA-seq libraries independent of the sequencing technology. We expect mnSBS to be of particular utility for cost-effective large-scale scRNA-seq projects.

Main
Single-cell RNA sequencing (scRNA-seq) enables the study and characterization of cellular states and pathways at ever-growing experimental scales, including the Human Cell Atlas1, cell atlases for tumors2 and other diseases3,4, and large-scale Perturb-Seq screens of millions of cells under genetic5,6 or drug7 perturbations. Methods for capturing and processing single-cell libraries have been radically scaled up in the past few years8,9,10,11, but sequencing itself has mostly relied on Illumina technology. Here we report the development of a sequencing technology intended to facilitate large-scale studies. Mostly natural sequencing-by-synthesis (mnSBS) is a new sequencing chemistry that relies on a low fraction of labeled nucleotides, combining the efficiency of non-terminating chemistry with the throughput and scalability of optical endpoint scanning within an open fluidics system to enable high-throughput sequencing, and has been demonstrated on Genome-in-a-Bottle reference samples and samples from the 1000 Genomes project12. To benchmark mnSBS with scRNA-seq, we performed experiments with four library types, sequenced in parallel on an Illumina sequencer and on an Ultima Genomics (Ultima) prototype sequencer implementing mnSBS (Fig. 1a).
Fig. 1: Experimental design.
a, Workflow showing the four samples used and the changes made for Ultima sequencing. b, Library conversion showing the PCR process to replace adapters from Illumina (P5 and P7, parts of Read 1 and 2) with Ultima adapters (Primer for Sequencing + Sample Barcode (PS-SBC) and Primer for Bead (PB), parts of Read 1 and 2). The 5′ libraries have TSO and the 3′ libraries have poly(dT). Our Ultima libraries did not require index sequences for combining libraries together, though this feature could be added. c, mnSBS schematic. d, Data conversion of single-end reads to simulated paired-end reads needed for Cell Ranger analysis. White box shows five bases trimmed from the cDNA and three bases trimmed from the UMI adjacent to the poly(dT) sequence in 3′ libraries. In 5′ libraries, only three bases were trimmed from the cDNA next to the TSO. The PS-SBC read is used to deconvolute multiplexed libraries.


To implement mnSBS for massively parallel, droplet-based scRNA-seq, we modified a standard scRNA-seq workflow to be compatible with Ultima sequencing (Fig. 1b–d; Methods). Focusing on 10x Chromium scRNA-seq (Methods), a popular method, we first added adapters specific for Ultima sequencing to cDNA libraries (Fig. 1b). Next, we addressed the fact that droplet-based scRNA-seq relies on pairing each cDNA read with a cell barcode (CBC) and a unique molecular identifier (UMI) (Methods). With Illumina sequencing, the two ends of the library are sequenced separately by paired-end sequencing, but for single-end Ultima sequencing, we capture all the information in a single read of 200–250 bases (Fig. 1d and Extended Data Fig. 1), such that the CBC and UMI are read first, followed by the cDNA. For reads derived from the 3′ end of the transcript, we sequence through poly(T) bases, which result from the mRNA poly(A) tail, adjacent to the cDNA sequence.
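The read layout described here (CBC, then UMI, then poly(T), then cDNA for 3′ libraries) can be illustrated with a minimal Python sketch that splits one single-end read into a simulated read pair. The barcode and UMI lengths (16 and 12 bases), the function name and the poly(T) detection rule are assumptions made for illustration and are not given in the text; the trimming of three UMI bases and five cDNA bases follows the Fig. 1d caption. This is not the authors' pipeline.

```python
from typing import NamedTuple

CBC_LEN = 16    # assumed cell-barcode length (not stated in the text)
UMI_LEN = 12    # assumed UMI length (not stated in the text)
UMI_TRIM = 3    # UMI bases next to the poly(T) that are dropped (Fig. 1d)
CDNA_TRIM = 5   # cDNA bases next to the poly(T) that are dropped (Fig. 1d)


class SimulatedPair(NamedTuple):
    read1: str  # cell barcode followed by the trimmed UMI
    read2: str  # trimmed cDNA


def split_three_prime_read(seq: str) -> SimulatedPair:
    """Split a single-end 3' read laid out as CBC + UMI + poly(T) + cDNA."""
    cbc = seq[:CBC_LEN]
    umi = seq[CBC_LEN:CBC_LEN + UMI_LEN]
    rest = seq[CBC_LEN + UMI_LEN:]
    # Treat the leading run of T as the poly(T) stretch. This over-trims when
    # the cDNA itself starts with T; a real pipeline would need a more careful
    # rule and would carry base qualities along with the sequence.
    cdna = rest.lstrip("T")
    return SimulatedPair(cbc + umi[:UMI_LEN - UMI_TRIM], cdna[CDNA_TRIM:])


# Toy read: 16 x A (CBC), 12 x C (UMI), 20 x T (poly(T)), then cDNA.
example = "A" * CBC_LEN + "C" * UMI_LEN + "T" * 20 + "GGGGGACGTACGT"
pair = split_three_prime_read(example)
print(len(pair.read1), pair.read2)
```

The simulated `read1` plays the role of the barcode read and `read2` the role of the cDNA read that a paired-end oriented tool would expect.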
To evaluate mnSBS with scRNA-seq, we performed experiments with four libraries, spanning different technical and biological use cases, and sequenced each in parallel on both Ultima and Illumina sequencers (Methods). Three libraries were from peripheral blood mononuclear cells (PBMCs) of healthy human donors, spanning 3′ scRNA-seq (~7,000 cells, 1 individual), 5′ scRNA-seq (~7,000 cells, 1 individual) and a library generated in multiplex by pooling cells from eight donors (~24,000 cells, 8 individuals, 5′ scRNA-seq). We chose PBMCs because they are primary human cells, include diverse cell types of various sizes and frequencies and have been used for previous benchmarking13,14. The fourth library was from a Perturb-Seq5,6 experiment, where ~20,000 cells were profiled after clustered regularly interspaced short palindromic repeats (CRISPR)–Cas9 pooled genetic perturbation, followed by scRNA-seq to detect both the profile of the cell and the associated guide RNA. Together, the four libraries span three major use cases (individual patient atlas, multiplex patient profiling and large-scale screens) and the two most commonly used library types for scRNA-seq.
We first tested the feasibility of mnSBS for scRNA-seq, with matched Ultima and Illumina 5′ and 3′ droplet-based scRNA-seq of PBMCs. Initial analysis (Methods) showed that the number of UMIs generated at a given sequencing depth was similar between Ultima and Illumina in the 5′ libraries, whereas for the 3′ libraries we obtained more UMIs with Illumina than Ultima (Fig. 2a), owing to differences in sequence quality. While Ultima and Illumina data for 5′ libraries were similar, for the 3′ data there was lower quality for Ultima in the bases flanking the poly(T) region, that is, the 3′ end of the UMI and the 5′ end of the cDNA (Extended Data Fig. 2a). Indeed, filtering out reads that have bases with quality […] 2 using a pseudocount of 10 TPM). The 20 genes with the highest FC are labeled in each plot. For all 3′ libraries, the last three UMI bases were trimmed for quality reasons.



We further investigated whether trimming the last three bases of the UMI impacted the results owing to increased ‘collisions’, in which different full UMIs collapse to the same trimmed UMI. This may be the case when high UMI complexity is needed, for a cell with many UMIs detected or for a highly expressed gene. To assess this, we examined the ratio of the number of trimmed to untrimmed UMIs for cells and for genes with different numbers of UMIs, in the Illumina 3′ PBMC dataset (with the higher quality UMIs). At the cell level, the ratio decreases as the coverage increases, but only modestly, by less than 10% for all but a few dozen cells with a very high number of UMIs (Supplementary Fig. 1a). At the gene level, only a few of the highly expressed genes (8 of 3,908 genes with >1,000 UMIs) show a reduction of >10% (Supplementary Fig. 1b). Conversely, some very lowly expressed genes have lower ratios, likely because, for these genes, losing even one UMI leads to a smaller ratio. Taken together, our analyses show that shortening UMIs has only a modest effect on highly expressed genes and high-complexity cell profiles. This led us to exclude the last three bases of each UMI in Ultima 3′ data in subsequent downstream analysis (Methods).
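The trimmed-to-untrimmed UMI ratio can be sketched in a few lines of Python. The record layout (cell, gene, UMI), the function name and the choice to count distinct (gene, UMI) pairs per cell are assumptions made for illustration; the text does not describe how the authors computed the ratio in detail.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def trimmed_to_full_ratio(
    records: Iterable[Tuple[str, str, str]], trim: int = 3
) -> Dict[str, float]:
    """Per cell: distinct (gene, trimmed UMI) pairs / distinct (gene, full UMI) pairs."""
    full = defaultdict(set)
    short = defaultdict(set)
    for cell, gene, umi in records:
        full[cell].add((gene, umi))
        # Drop the last `trim` bases; UMIs that differ only there now collide.
        short[cell].add((gene, umi[:len(umi) - trim]))
    return {cell: len(short[cell]) / len(full[cell]) for cell in full}


records = [
    ("cell1", "geneA", "AAAAAAAAAAAA"),
    ("cell1", "geneA", "AAAAAAAAAAAC"),  # differs only within the trimmed bases
    ("cell1", "geneA", "CAAAAAAAAAAA"),
    ("cell2", "geneA", "GGGGGGGGGGGG"),
]
ratios = trimmed_to_full_ratio(records)
print(ratios)
```

A ratio of 1.0 means trimming caused no collisions for that cell; values below 1.0 give the fraction of UMIs that survive after collapsing. Grouping by gene instead of by cell would give the gene-level view.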
Next, comparing the performance of these matched PBMC 3′ and 5′ libraries, we obtained similar overall performance for both sequencing technologies. First, to correct for differences in sequencing depth, which was higher in Ultima than Illumina, we randomly sampled Ultima reads so that we used the same number of reads for each sequencing platform (Methods). Both technologies identified nearly all the same CBCs (Fig. 2b; 7,916 cells (Ultima) versus 7,926 cells (Illumina) in the 3′ data, and 7,875 cells (Ultima) versus 7,854 cells (Illumina) in the 5′ data), with the same number of UMIs and genes per cell for 5′ libraries and slightly lower numbers for 3′ libraries with Ultima (as expected) (Fig. 2c,d). When we sampled reads to have the same number of UMIs (Methods), we obtained a similar number of genes per cell in Illumina and Ultima for 3′ libraries as well (Extended Data Fig. 3). Other metrics (Supplementary Table 1) also showed similar overall performance, with slightly higher genome mapping rates in Ultima but similar transcriptome mapping rates.
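Depth matching by random sampling can be sketched as follows. This is a minimal illustration, assuming reads are held in memory as a list and that sampling is uniform without replacement; the function name, the seed and the toy depths are hypothetical, and the authors' actual procedure is described only in their Methods.

```python
import random
from typing import List, Sequence


def downsample_reads(reads: Sequence[str], target: int, seed: int = 0) -> List[str]:
    """Draw `target` reads uniformly at random, without replacement."""
    if target > len(reads):
        raise ValueError("target exceeds the number of available reads")
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    return rng.sample(reads, target)


ultima_reads = [f"read{i}" for i in range(1000)]  # toy stand-in for the deeper run
illumina_depth = 600                              # toy stand-in for the shallower run
matched = downsample_reads(ultima_reads, illumina_depth)
print(len(matched))
```

For real FASTQ or BAM files one would stream records instead of loading them all, but the principle of drawing a fixed number of reads from the deeper dataset is the same.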
The two sequencing technologies yielded highly correlated expression levels for the matched 5′ and 3′ PBMC libraries, albeit with some outlier genes and minor differences (Pearson’s r = 0.98 in all cases; Fig. 2e and Extended Data Fig. 3c). As expected, when a single sequencing run was randomly split into two datasets, we observed even higher correlation of expression levels (Extended Data Fig. 3d). Notably, there was a modest bias, particularly in the 3′ libraries, toward genes with higher GC content having higher expression in Illumina and the longest genes having higher expression in Ultima 3′ libraries (Extended Data Fig. 4a,b). Of the 166 genes with differences in expression for 3′ PBMC between the two sequencing platforms, most (130 genes, 78.3%) differed in the fraction of reads that were assigned by Cell Ranger to the gene out of all the reads mapped to that gene region (Extended Data Fig. 4c). This is likely related to how Ultima and Illumina reads map to different locations relative to the transcript, as expected from the difference between single-end and paired-end reads (Fig. 1d). In 5′ data, Ultima reads map closer to the 5′ end than Illumina reads, whereas in 3′ data, Ultima reads map closer to the 3′ end than Illumina reads (Extended Data Fig. 4d,e). Because Cell Ranger excludes reads that do not fully map within annotated gene boundaries, more Ultima reads are excluded from analysis as they are closer to gene ends (Extended Data Fig. 4d,e), as shown, for example, for LILRA5 and HIST1H1D (Extended Data Fig. 4f,g). This difference in position can also lead to more multimapping or ambiguous reads (Extended Data Fig. 4h and Supplementary Table 2).
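The effect of read position on the assigned fraction can be illustrated with a toy model of the containment rule as the text describes it. This is only a sketch: it treats a gene as a single interval, ignores exons, splicing and strand, and is not Cell Ranger's actual implementation. The coordinates and function names are hypothetical.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # (start, end), half-open; same chromosome and strand assumed


def fully_within(read: Interval, gene: Interval) -> bool:
    """True when the read lies entirely inside the annotated gene interval."""
    return gene[0] <= read[0] and read[1] <= gene[1]


def assigned_fraction(reads: Iterable[Interval], gene: Interval) -> float:
    """Fraction of reads overlapping the gene that lie fully inside it."""
    overlapping = [r for r in reads if r[0] < gene[1] and r[1] > gene[0]]
    if not overlapping:
        return 0.0
    return sum(fully_within(r, gene) for r in overlapping) / len(overlapping)


gene = (1000, 2000)
inner_reads = [(1500, 1600), (1700, 1800), (1850, 1950)]
end_reads = [(1500, 1600), (1900, 2050), (1950, 2100)]  # two run past the gene end

print(assigned_fraction(inner_reads, gene), assigned_fraction(end_reads, gene))
```

In this toy setting, reads that sit closer to the annotated gene end are more likely to cross the boundary and be dropped, which lowers the assigned fraction for that gene.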
For example, four (ARF5, MIF, IFITM1 and TCIRG1) of the 20 genes with the highest log fold change (FC) (all logs are the natural logarithm (base e) in this study, unless otherwise noted) between Ultima and Illumina in the 3′ data (labeled in Fig. 2e) have higher expression in Illumina and a much higher rate of mapped ambiguous reads in the Ultima than the Illumina data (>50 versus […] ln2) compared with the standard reference, whereas other overall metrics were largely unchanged. In the 3′ data, there was a similar number of DE genes in analyses with the extended and standard references, although the expression of some genes, for example, LILRA5 and MT-CO2, agreed much more closely using the extended reference. Comparing gene expression levels for the same sequencing dataset processed with the standard or an extended reference shows that most levels are very similar, although a sizeable number (23 to 83) are higher and a few (1 to 2) are lower (Extended Data Fig. 5c). Moreover, some of the top genes that differ between the extended and standard references are genes that differ between Ultima and Illumina with the standard reference, for example, MT-CO2 and LILRA5 in the 3′ data and HIST1H1D and HIST1H1E in the 5′ data (Fig. 2e and Extended Data Fig. 4f,g). This suggests that a data-driven extended reference could help improve expression estimates in Ultima scRNA-seq data, particularly when using 5′ data. Alternatively, one can consider improving the way Cell Ranger counts UMIs to take better advantage of reads that overlap genes but are not fully contained within them.
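A natural-log fold change with a pseudocount can be written as a short Python function. The text specifies natural logarithms, and the Fig. 2 caption fragment mentions a pseudocount of 10 TPM; adding the pseudocount to both values before taking the ratio, the direction of the ratio and the function name are assumptions made here for illustration.

```python
import math

PSEUDOCOUNT_TPM = 10.0  # value mentioned in the Fig. 2 caption fragment


def log_fold_change(tpm_a: float, tpm_b: float,
                    pseudocount: float = PSEUDOCOUNT_TPM) -> float:
    """Natural-log fold change of A over B after adding a pseudocount to both."""
    return math.log((tpm_a + pseudocount) / (tpm_b + pseudocount))


# Toy values: 30 TPM on one platform versus 10 TPM on the other.
lfc = log_fold_change(30.0, 10.0)
print(lfc)
```

The pseudocount damps fold changes for lowly expressed genes: with these toy values the ratio is (30 + 10) / (10 + 10) = 2 rather than 3, so the log fold change equals ln 2.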
We examined the impact of single-end Ultima versus paired-end Illumina data by sequencing the 5′ PBMC library with single-end Illumina sequencing. Applying a pipeline similar to the one we used for Ultima, with minor required changes (Methods), we sampled the Ultima data to have the same number of reads as the single-end Illumina data and compared them (Supplementary Fig. 2). The two methods showed very high agreement in terms of the number of UMIs per cell and genes per cell, with far fewer outlier genes between Ultima and the single-end Illumina data than observed when comparing to the paired-end data (Supplementary Table 3). Overall, the quality control metrics of single-end Illumina and Ultima sequencing are much more similar, particularly the mapping metrics (Supplementary Table 1).
To test the biological insights derived from scRNA-seq using the two technologies, we turned to analyze 5′ scRNA-seq of PBMCs from eight individuals processed together and sequenced with both Ultima and Illumina (Methods). Both methods have approximately the same number of UMIs in this dataset (

Continued on Piplmetar.rs...


