RNA-seq: Research Methods and Strategies
张壮壮, Marketing Department
上海天昊生物科技有限公司

DNA makes RNA makes protein
mRNA is the "bridge" between DNA and protein.

RNA world is more colorful
- Messenger RNA (mRNA) is a large family of RNA molecules that convey genetic information from DNA to the ribosome, where they specify the amino acid sequence of the protein products of gene expression.
- A non-coding RNA (ncRNA) is a functional RNA molecule that is not translated into a protein.
- microRNAs (miRNAs) are small non-coding RNAs of about 22 nucleotides that are integral components of the RNA-induced silencing complex (RISC) and that recognize partially complementary target mRNAs to induce translational repression, which is often linked to degradation.
- Long non-coding RNAs (long ncRNAs, lncRNA) are non-protein-coding transcripts longer than 200 nucleotides.
[Figure: RNA classification. Coding RNA: mRNA. Non-coding RNA: rRNA, tRNA, snoRNA, scaRNA, snRNA, RNAi, rasiRNA, piRNA, siRNA, miRNA, stRNA, anti-sense, lncRNA, circRNA.]
Chris P. Ponting, Peter L. Oliver, and Wolf Reik. Evolution and Functions of Long Noncoding RNAs. Cell 136, 629-641, February 20, 2009.

RNA Type
[Figure: RNA types. Source: Dual RNA-seq of pathogen and host. 10, 618-630 (2012).]

RNA content of a typical human cell
Parameter | Amount
Total RNA per cell | 1-30 pg
Proportion of total RNA in the nucleus | 14%
DNA:RNA in the nucleus | 2:1
mRNA molecules | 2 x 10^5 to 1 x 10^6
Typical mRNA size | 1900 nt
In a typical fast-growing mammalian cell culture, each cell contains roughly 10-30 pg of RNA, whereas a fully differentiated primary cell contains far less, under about 1 pg per cell. Most of the RNA molecules in a cell are tRNA and rRNA. mRNA accounts for about 1-5% of total cellular RNA, but the exact amount depends on the cell type and its physiological state.

Characteristics of RNA
- Relatively small molecules, usually single-stranded
- Short-lived and rapidly degraded
- Usually have specialized structures (mRNA, miRNA, tRNA and rRNA)
- Usually have precursors that require cleavage and modification (mRNA, miRNA, tRNA and rRNA)

Characteristics of mRNA
- 5' cap structure and 3' poly-A tail
- Length generally between 500 and 10,000 nt
- Has a precursor that contains introns
- Can be translated into functional protein
- Prokaryotic mRNA lacks the cap and poly-A tail structures!

mRNA classes by abundance
Abundance class | Copies/cell | Number of different mRNAs per cell | Abundance of each mRNA
Low | 5-15 | 11,000 | 0.004%
Intermediate | 200-400 | 500 | 0.1%
High | 12,000 | (missing) | (missing)

Applications
Library types compared: RNA-Seq, Small RNA, LncRNA (>200 nt)
Read length: 50SE, 90PE, 50SE, 90PE
Applications compared: identify novel transcripts, profiling, gene structure, SNP/SNV, biomarker, gene fusion

RNA-seq Type | Alternative | Comment
mRNA-seq/LncRNA-seq | poly-A+ | mRNA and LncRNA
Small RNA-seq (miRNA-seq) | poly-A- | miRNA, piRNA, ...
rmRNA-seq | rRNA- | coding and non-coding RNAs
Total RNA-seq | Both | all RNAs, but most of them are rRNAs and tRNAs

RNA-seq project steps
1. Experimental design
2. RNA extraction and quality control
3. Sequencing library construction
4. Sequencing and data quality control
5. Data analysis and presentation of results
[Figure: library types. Total RNA is split into mRNA and non-coding RNA. Libraries: standard transcriptome library, mRNA library, LncRNA library, Small RNA library, eukaryotic strand-specific library. Assays: mRNA-seq, LncRNA-seq, miRNA-seq. Branches: eukaryote, prokaryote; de novo assembly, transcript re-sequencing.]

RNA-seq Workflow
Figure 1 RNA-seq work flow. Schematic diagram of RNA-seq library construction. Total RNA is extracted from 300,000 cells to 3 million cells, and a small aliquot is used to measure the integrity of the RNA. rRNA is then depleted through one of several methods to enrich subpopulations of RNA molecules, such as mRNA or small RNA. mRNA is fragmented into a uniform size distribution, and the fragment size can be monitored by RNA gel electrophoresis or Agilent Bioanalyzer. The cDNA is then built into a library. The size distribution pattern of the library can be checked by Agilent Bioanalyzer; this information is important for RNA-seq data analysis.
- Mapping programs align reads to the reference genome and map splice junctions. Gene expression can be quantified as absolute read counts or normalized values such as RPKM.
- If RNA-seq data sets are deep enough and the reads are long enough to map enough splice junctions, the mapped reads can be assembled into transcripts.
- The sequences of the reads can be mined by comparing the transcriptome reads with the reference genome to identify nucleotide variants that are either genomic variants (for example, SNPs) or candidates for RNA editing.
Technical considerations for functional sequencing assays. 13, 802-807 (2012).

Comparison of protocols and platforms
We carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (poly-A-selected, ribodepleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies PGM and Proton, Pacific Biosciences RS and Roche 454).
The results show high intraplatform (Spearman rank R > 0.86) and inter-platform (R > 0.83) concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms.
For intact RNA, gene expression profiles from rRNA-depletion and poly-A enrichment are similar. In addition, rRNA depletion enables effective analysis of degraded RNA samples.

Read length (structural accuracy) versus throughput (expression accuracy)
Platform | Read length | Throughput | Cost
Roche 454 | Very long (700 bp) | Low (700 M) | Very high
MiSeq | Medium (2 x 300 bp) | Medium (15 G) | Medium
HiSeq | Medium (2 x 150 bp) | High (1.8 T) | Low

Replicate design: technical replicates and biological replicates
- Technical error and individual variation can be assessed by including replicates, but they cannot be eliminated.
- RNA-seq results can be used to explain between-group differences only when technical error and individual variation have been properly balanced.
RNA-seq result variation = between-group difference + technical error + individual variation
- Between-group difference: the purpose of the experiment
- Technical error: arises from the technique; assessed with technical replicates
- Individual variation: arises from differences between individuals; assessed with biological replicates
The technical reproducibility of both RNA-seq library construction and sequencing is above 0.99, so technical replicates can be omitted.
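As a rough illustration of how technical reproducibility between two replicate libraries might be checked, the sketch below normalizes raw read counts to RPKM (the normalization named in the Figure 1 caption) and then computes a correlation between the two replicates. The slides do not say which statistic the 0.99 figure refers to, so the use of Pearson correlation on log2(RPKM + 1) is an assumption. The function names (`rpkm`, `pearson`, `replicate_correlation`), the gene names, and all counts and lengths are hypothetical toy values, not data from the deck.

```python
import math


def rpkm(read_counts, gene_lengths_bp):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    total_reads = sum(read_counts.values())
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_reads)
        for gene, count in read_counts.items()
    }


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def replicate_correlation(counts_a, counts_b, gene_lengths_bp):
    """Correlation of log2(RPKM + 1) between two replicate libraries."""
    rpkm_a = rpkm(counts_a, gene_lengths_bp)
    rpkm_b = rpkm(counts_b, gene_lengths_bp)
    genes = sorted(gene_lengths_bp)  # fixed gene order for both replicates
    log_a = [math.log2(rpkm_a[g] + 1) for g in genes]
    log_b = [math.log2(rpkm_b[g] + 1) for g in genes]
    return pearson(log_a, log_b)


# Hypothetical toy data: four genes, two technical replicates of one sample.
gene_lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 500, "geneD": 4000}
replicate_1 = {"geneA": 100, "geneB": 400, "geneC": 50, "geneD": 450}
replicate_2 = {"geneA": 210, "geneB": 790, "geneC": 95, "geneD": 905}

r = replicate_correlation(replicate_1, replicate_2, gene_lengths_bp)
print(f"Pearson r on log2(RPKM + 1): {r:.4f}")
```

Because RPKM divides by the total number of mapped reads, the second replicate being sequenced about twice as deeply does not by itself lower the correlation. A real analysis would use thousands of genes, and a rank-based statistic such as Spearman correlation could be substituted for `pearson`.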