Tuesday, October 22, 2013 3:56:55 PM
To help evaluate the utility of long, unbiased sequence reads for characterizing structural variation in the human genome, we have collected 10x long-read shotgun coverage of a human genome sample using our recently released P5-C3 scaffolding sequencing chemistry. The human genome harbors many structural variations, including variable number tandem repeats, deletions, insertions, inversions, and repetitive mobile elements, which are often difficult to resolve using short-read technologies. We hope this data set will be of value to the bioinformatics and scientific communities studying various forms of structural variation across the human genome. To access it, simply send us an email and you will receive instructions for downloading the data set.
In collaboration with Evan Eichler (Howard Hughes Medical Institute, University of Washington), we sequenced CHM1TERT, a well-studied cell line derived from a complete hydatidiform mole (CHM). A hydatidiform mole is defined as a pregnancy with no embryo and clinically presents in approximately 1 in 1,500 pregnant women in North America. The CHM cells have a diploid genome, typically XX, that is a result of replication of a haploid paternal (sperm) genome. Through the corresponding absence of allelic variation, this sample has been used to generate a haploid reference genome sequence, and many associated resources are available, including physical maps, genotypes (iSCAN), and a large-insert BAC library (CHORI-17). It is also one of the targets for the production of a higher quality “platinum” genome assembly.
We prepared ~20 kb DNA fragment libraries, size-selected with the BluePippin™ system from Sage Science, and sequenced with 3-hour movies using the P5-C3 sequencing chemistry. Some sequencing statistics are listed below:
Total number of reads: 3,679,463
Total number of post-filtered bases: 32,559,803,198
Average read length: 8,849 bp
Half of sequenced bases in reads greater than: 10,985 bp
5% of sequenced DNA inserts longer than: 18,060 bp
Longest DNA insert sequenced: 41,460 bp
PacBio® RS II instrument time for sequencing: 10 days
Number of SMRT® Cells: 66
(see link)
Figure 1. Subread length distribution. A subread is a DNA insert sequenced between two SMRTbell™ hairpin adapters. The solid black line (right y axis) denotes the amount of sequenced bases greater than a given subread length (x axis).
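The summary statistics and the cumulative curve in Figure 1 can all be derived from a list of read lengths. The following is a minimal sketch, not the pipeline used to produce the numbers above: the function names are hypothetical, the toy lengths are made up, and the ~3.1 Gb genome size used for the coverage estimate is an assumption that does not appear in the post.

```python
def mean_length(lengths):
    """Average read length in bp."""
    return sum(lengths) / len(lengths)


def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no reads given")


def bases_in_reads_longer_than(lengths, threshold):
    """One point on the cumulative curve: bases in reads longer than threshold."""
    return sum(length for length in lengths if length > threshold)


# Totals as published in the list above.
total_reads = 3_679_463
total_bases = 32_559_803_198
mean_from_totals = total_bases / total_reads  # rounds to the listed 8,849 bp

# ASSUMPTION: a haploid human genome size of roughly 3.1 Gb (not stated in the post).
assumed_genome_size = 3.1e9
approx_coverage = total_bases / assumed_genome_size  # consistent with "10x"

# Toy read lengths (made up) to exercise the helpers.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
toy_tail = bases_in_reads_longer_than(toy_lengths, 4)
```

The "half of sequenced bases in reads greater than" figure corresponds to what `n50` computes here; the exact tie-breaking convention used for the published value is not stated.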
We also mapped the data against the human reference genome (GRCh37) and found generally even coverage across the reference, with numerous examples of structural variations highlighted by the long reads. A mapping coverage summary and a few examples highlighting structural variation are given below.
Figure 2. Uniform sequencing coverage upon mapping against the GRCh37 human genome reference. (A) Example coverage for chromosome 3. The gap in the center corresponds to the centromere, where the reference contains no sequence (~3 million N bases). (B) Coverage histogram over all non-N bases of the GRCh37 reference.
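A coverage histogram restricted to non-N reference bases, as in panel (B), can be sketched as follows. This is an illustration under simple assumptions, not the method used for the figure: the reference is a plain string, the per-base depth is an already-computed list of the same length, and `coverage_histogram` is a hypothetical helper name. The toy sequence and depths are made up.

```python
from collections import Counter


def coverage_histogram(reference, depth):
    """Count how many non-N reference positions have each coverage depth."""
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")
    histogram = Counter()
    for base, d in zip(reference, depth):
        # Skip gap positions so unsequenced regions do not inflate the zero bin.
        if base.upper() != "N":
            histogram[d] += 1
    return histogram


# Toy data: four N bases stand in for a reference gap such as a centromere.
toy_reference = "ACGTNNNNACGT"
toy_depth = [9, 10, 10, 11, 0, 0, 0, 0, 10, 10, 12, 9]
toy_histogram = coverage_histogram(toy_reference, toy_depth)
```

Excluding N positions matters because a gap in the reference cannot attract any reads, so counting it would report zero coverage where there was simply nothing to map against.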
(see link)
Figure 3. Examples of large deletions. The sharp breakpoints from the even shotgun read structure, combined with the lack of read coverage, indicate a 114.2 kb and a 4.9 kb deletion in this ~375 kb region of chromosome 3. The individual sequence reads are shaded by length (reads in black are >10 kb). Both deletions have been validated and are polymorphic in the human population.
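The coverage half of this signal, a stretch with no mapped reads, can be sketched as a scan for zero-depth runs. This is only a rough illustration: `zero_coverage_runs` is a hypothetical helper, the depth values are made up, and the sketch does not examine the read breakpoints that the figure also relies on. Reference gaps (N bases) would need to be masked first, since they also show zero coverage.

```python
def zero_coverage_runs(depth, min_length):
    """Return half-open (start, end) intervals where depth is 0 for >= min_length bases."""
    runs = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos  # a zero-coverage run begins here
        elif d != 0 and start is not None:
            if pos - start >= min_length:
                runs.append((start, pos))
            start = None
    # Close a run that extends to the end of the region.
    if start is not None and len(depth) - start >= min_length:
        runs.append((start, len(depth)))
    return runs


# Toy depth profile: one long gap and one short gap.
toy_depth = [10] * 5 + [0] * 8 + [10] * 4 + [0] * 2 + [10] * 3
long_gaps = zero_coverage_runs(toy_depth, min_length=5)
all_gaps = zero_coverage_runs(toy_depth, min_length=2)
```

A zero-coverage interval on its own is only a candidate; in the figure, it is the combination with sharp read breakpoints at both edges that supports calling a deletion.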
Figure 4. Sequence structure of the Fragile X Mental Retardation (FMR1) Triplet CGG Repeat. (A) Read mapping to the reference genome sequence shows many insertions (green vertical lines) across this region on the X chromosome. (B) Consensus building from the reads and dot plot comparison reveals the true structure including an additional AGG-(CGG)9 repeat block in the CHM1 genome.
http://blog.pacificbiosciences.com/2013/10/data-release-long-read-shotgun.html?utm_content=buffer3a5d4&utm_source=buffer&utm_medium=twitter&utm_campaign=Buffer