Wednesday, January 08, 2014 8:55:54 PM
January 8, 2014 by Next-Gen Sequencing Data
A bit late to the yearly review posts, but here it is: Long Reads of the year 2013. As you can see, these “Long Reads” are slightly different. Here we summarize a few “long read” sequence data sets that became publicly available last year and point to where one can download them. They are awesome resources, great to start playing with in the new year.
One of the most exciting things that happened in “next-gen sequencing” in 2013 is the availability of “long” sequence reads, be they genomic or transcriptomic. Two sequencing technologies that already have “long reads” and got a lot of attention during the year are Illumina’s Moleculo and PacBio. And Oxford Nanopore data is just around the corner. With Oxford Nanopore’s early access program, we might see some data by February 2014 (AGBT 2014?).
The year 2013 started with Illumina acquiring Moleculo for its long-read technology. Another big change is that PacBio got more social (possibly realizing the threat from Illumina) :). PacBio started blogging in mid-2012, but had just two blog posts in 2012. Then 2013 came, PacBio got really prolific, and it now has over 55 posts. In addition, PacBio started making its data publicly available through the blog.
Moleculo and PacBio sequence data from Drosophila
After acquiring Moleculo, Illumina launched the Fast Track Long Read sequencing service using Moleculo long-read technology. As part of the early access launch, Illumina shared a long-read data set from Dr. Dmitri Petrov’s group at Stanford, comprising two libraries of Drosophila melanogaster, each run on a single HiSeq lane and producing ~30 Gb of data. Visit Illumina’s BaseSpace to get the data.
Around the same time, Casey Bergman’s lab made PacBio long reads publicly available. The raw PacBio data totals 1,357,183,439 bp, giving ~7.5x coverage of the 180 Mb male D. melanogaster genome. The 63G of PacBio data can be downloaded from the Bergman lab website. In addition, the Bergman lab had Illumina data from the same sample and combined it with the PacBio reads to offer error-corrected sequence data.
Another possible Moleculo data set is from the first publication using Moleculo technology. The Moleculo team worked on the project before the technology was named Moleculo, and the results came out in a paper in eLife. However, it looks like the data is not freely available. Are there other Moleculo data sets out in the wild?
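The ~7.5x figure follows directly from the two numbers quoted for the Bergman lab data: total sequenced bases divided by genome size. A minimal sketch of that arithmetic, where the function name `fold_coverage` is only an illustrative choice and the 180 Mb genome size is taken from the text as a round number:

```python
def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average fold coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures quoted in the post for the Drosophila PacBio data set.
raw_bases = 1_357_183_439      # total raw PacBio bases
genome_size = 180_000_000      # ~180 Mb male D. melanogaster genome

coverage = fold_coverage(raw_bases, genome_size)
print(f"{coverage:.1f}x")      # prints 7.5x
```

This is an average over the whole genome; actual per-site depth will vary, and raw read bases are counted before any filtering or error correction.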
PacBio RNA-seq data from Human MCF-7
PacBio generated long-read sequencing data from RNA of MCF-7, a human breast cancer cell line, and made it available on its website. The data was obtained with the P4-C2 sequencing chemistry and contains 44,531 non-redundant transcript-length consensus sequences with read lengths ranging from 400 bp to 4,900 bp (average length 1,929 bp). Here is the PacBio blog post offering more details on the “long read” data.
Long-Read Shotgun Sequencing of a Human Genome
PacBio released data generated with the P5-C3 sequencing chemistry, containing over 3.6 M reads with an average length of 8,849 bases (half of the sequenced bases are in reads longer than 10,985 bp). The data is from an interesting human cell line derived from a complete hydatidiform mole (CHM).
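The “half of the sequenced bases are in reads longer than ...” statistic is what is usually called the read-length N50. For anyone who downloads the reads and wants to recompute it, here is a minimal sketch. The function name `n50` and the toy read lengths are made up for illustration and are not taken from the PacBio release:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that reads of length >= L hold at least
    half of all sequenced bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one read length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop once the longest reads seen so far cover half the bases.
        if running * 2 >= total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Toy read lengths (hypothetical, for illustration only).
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))  # prints 5
```

N50 is weighted by bases rather than by read count, so it always sits at or above the median read length; that is consistent with the 10,985 bp figure being larger than the 8,849-base average quoted above.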
A hydatidiform mole is defined as a pregnancy with no embryo and clinically presents in approximately 1 in 1,500 pregnant women in North America. The CHM cells have a diploid genome, typically XX, that is a result of replication of a haploid paternal (sperm) genome. Through the corresponding absence of allelic variation, this sample has been used to generate a haploid reference genome sequence, and many associated resources are available, including physical maps, genotypes (iSCAN), and a large-insert BAC library (CHORI-17). It is also one of the targets for the production of a higher quality “platinum” genome assembly.
Visit the PacBio blog to access the data.
PacBio RNA-seq data
Mike Snyder’s group at Stanford did the first long-read survey of the human transcriptome, generating 476,000 CCS reads from cDNA with an average length of 1 kb to investigate the isoform complement of a diverse pool of RNA samples representing 20 human tissues and organs. Data from the 454 platform on the same samples, with an average read length of 522 bp, is also available. PacBio RNA-seq data on ENA: PRJEB3969
PacBio RNA-seq data from hESC cell line
Wing Wong’s team at Stanford published, in PNAS, a new method that uses PacBio and Illumina reads together to identify isoforms. The team used C2 chemistry to generate over 7.5 M reads of average length 2-3 kb from the hESC cell line H1. Data can be accessed at GSE51861.
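The two accessions mentioned above (PRJEB3969 on ENA and GSE51861 on GEO) can be turned into landing-page links. The sketch below assumes the URL patterns the ENA browser and NCBI GEO use at the time of writing; the helper names are hypothetical, and the patterns may change, so treat this as a convenience rather than a stable API:

```python
from urllib.parse import quote, urlencode


def ena_browser_url(accession: str) -> str:
    """Landing page for an ENA project accession (assumed URL pattern)."""
    return "https://www.ebi.ac.uk/ena/browser/view/" + quote(accession)


def geo_series_url(accession: str) -> str:
    """Landing page for a GEO series accession (assumed URL pattern)."""
    return "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?" + urlencode({"acc": accession})


# Accessions quoted in the post.
print(ena_browser_url("PRJEB3969"))
print(geo_series_url("GSE51861"))
```

Both pages list the individual runs or samples, from which the read files can then be downloaded.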
Comments
Lex Nederbragt says:
January 8, 2014 at 4:18 pm
Great idea, this post! Some comments:
The Drosophila Moleculo data is available through Illumina’s BaseSpace (free registration required).
PacBio released several bacterial genome datasets, from projects illustrating the potential for finished genomes using this platform.
Lex Nederbragt says:
January 8, 2014 at 6:59 pm
And then I forgot to include the Arabidopsis PacBio long reads, as well as the reads generated from the Human Microbiome Project ‘mock community’ sample, both released by the company and available through pacbiodevnet.com
http://nextgenseek.com/2014/01/long-sequence-reads-to-play-with-during-the-holidays/