“Long-read sequencing and de novo assembly of a Chinese genome” comes from lead author Lingling Shi at Jinan University and senior author Kai Wang from the University of Southern California, as well as many other collaborators in China and the US. The team was particularly interested in finding population-specific variants, including structural variants, which required the use of long-read sequencing. Assemblies based on short-read sequence data “may have inherent technical limitations in characterizing repeat elements that span longer than the read length, yet repeats and segmental duplications are known to cover approximately half of the human genome,” the scientists write. Using SMRT Sequencing and mapping technology from BioNano Genomics, “we perform detailed characterization of the HX1 genome and demonstrate that long-read sequencing can detect functional elements in human genomes that are missed by short-read sequencing.”
For the genome assembly, the team sequenced DNA from an anonymous Chinese individual (HX1) to 103x coverage, producing a 2.93 Gb genome with a contig N50 of 8.3 Mb. Included in the results were 206 Mb of alternative haplotypes that “were constructed along with the primary contigs,” Shi et al. write. Consensus accuracy for the assembly was 99.73%, matching the accuracy of the well-known NA12878 genome assembly. In an analysis of structural variants, the team found about 20,000 insertions and deletions, with half of them classified as short tandem repeats or mobile elements. Nearly 50 exonic deletions or insertions were specific to the HX1 genome, including one previously characterized deletion that has only been seen in the Asian population.
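For readers unfamiliar with the contig N50 metric quoted above: it is the contig length at which contigs of that length or longer account for at least half of the total assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the HX1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that pushes the running sum
        # to at least half of the assembly size.
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, arbitrary units), total = 30.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # 10 + 6 = 16 >= 15, so N50 is 6
```

A higher N50 generally indicates a more contiguous assembly, which is why it is reported alongside coverage and total assembly size.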
The team also developed a new gap-filling method to make use of all this sequence data. They determined that nearly 30% of gaps in the GRCh38 reference genome could be addressed with data from HX1. “The total length of filled or shortened gaps amounts to 7.1 Mb,” they report. “We further evaluated the repeat contents within the gaps that can be closed by us, and found that simple repeats and satellite sequences were significantly enriched within the closed gaps compared with GRCh38.”
Using the Iso-Seq method, the scientists also analyzed the transcriptome of this individual and detected more than 58,000 isoforms, including “57 isoforms at 42 loci that do not overlap with any GENCODE transcript,” they write. Follow-up studies of some of the more complex findings, such as “a novel transcribed element with at least five exons and six isoforms,” validated the predicted splicing events. They also found at least two genes that have never been identified with short-read data. The team looked for disease-causing variants, finding two that were classified in ClinVar as pathogenic. However, “manual review of the literature cited in the two ClinVar records indicated that both of them represented erroneous database records,” the scientists report. “This analysis highlights the need for extreme caution in interpreting ‘pathogenic’ variants documented in variant databases.”
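The idea of an isoform that does "not overlap with any GENCODE transcript" can be illustrated with a simple interval check. This is only a sketch under stated assumptions, not the method used in the paper: the function names `overlaps` and `novel_candidates`, the half-open coordinate convention, and all coordinates below are hypothetical, and a real analysis would also account for strand and use an indexed interval structure rather than a nested loop.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def novel_candidates(isoforms, annotated):
    """Return names of isoforms that overlap no annotated transcript.

    isoforms:  list of (chrom, start, end, name) tuples
    annotated: list of (chrom, start, end) tuples
    """
    novel = []
    for chrom, start, end, name in isoforms:
        hit = any(
            chrom == a_chrom and overlaps(start, end, a_start, a_end)
            for a_chrom, a_start, a_end in annotated
        )
        if not hit:
            novel.append(name)
    return novel


# Hypothetical coordinates, for illustration only.
isoforms = [
    ("chr1", 100, 200, "iso_a"),  # overlaps the chr1 annotation
    ("chr1", 500, 600, "iso_b"),  # no annotation nearby
    ("chr2", 100, 200, "iso_c"),  # same chromosome as an annotation, no overlap
]
annotated = [("chr1", 150, 300), ("chr2", 300, 400)]
print(novel_candidates(isoforms, annotated))  # ['iso_b', 'iso_c']
```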
“In summary, while short-read-based alignment and variant calling based on reference genome remain a common practice to assay personal genomes, de novo assembly by long-read sequencing may reveal novel and complementary biological insights,” Shi et al. conclude. “Furthermore, long-read RNA sequencing may identify novel transcripts that can be missed by short-read RNA sequencing.”
http://www.pacb.com/blog/chinese-genome-assembly-smrt-sequencing-finds-novel-genes-recovers-missing-sequence/