Thursday, July 11, 2013 4:33:41 PM
COD GENOME ASSEMBLY: LONG READS OFFER UNIQUE INSIGHT

Scientists at the University of Oslo’s Centre for Ecological and
Evolutionary Synthesis (CEES) applied long PacBio® reads to
a genome that was proving particularly difficult to assemble.
Today, sequencing problems associated with the Atlantic
cod genome are a thing of the past — and researchers are
using their new assembly as the foundation for a major
resequencing effort that’s just getting started.
Recent work using multi-kilobase sequence reads
generated from Single Molecule, Real-Time (SMRT®)
technology is enabling a dramatically improved genome
assembly for cod, an economically important fish species.
In many ways the cod genome seemed like a puzzle that
might never be fully solved, but the Pacific Biosciences®
sequencing platform made significant inroads — and
just in time, as the team of researchers working on cod
recently received funding to resequence 1,000 more
individuals. Being able to base these new efforts on a reliable
genome assembly will make future results far more
meaningful.
Lex Nederbragt, a research fellow at the University of
Oslo and a member of the Norwegian High-Throughput
Sequencing Centre, regularly contemplates the broader
importance of cod. There is interest in domesticating cod
for aquaculture, and the genome assembly can aid in
finding those regions that influence traits important for
disease resistance and growth rates, which may prove
crucial for the economic success of this industry. Cod
is the most important aquatic species in Norway and
other commercial fishery nations. Moreover, cod has an
interesting population ecology; that is, some populations
do exceedingly well, whereas others get depleted through
fishing and never recover to historic abundances. In the
last decade or so, “there has been a growing interest in
the genomics of this organism,” Nederbragt says.
In 2008, Nederbragt and his colleagues Bastiaan Star,
Sissel Jentoft, Kjetill S. Jakobsen, and others from
the CEES-led Cod Genome Sequencing Consortium
began a cod genome project using shotgun and mate-pair
sequencing on the 454® platform. They added
long-range information from BACs sequenced
with traditional Sanger sequencing. The result was an
assembly with thousands of scaffolds and hundreds
of thousands of contigs for the 830 Mb genome. Some
35 percent of the bases in the scaffolds were gaps,
Nederbragt says, which of course proved quite a challenge
for the Ensembl annotation team. “They managed to
produce a meaningful annotated genome by taking well-known
genes from stickleback and other fishes to try
to put together the missing pieces in cod,” he adds. In
generating an assembly and annotation, the project was a
success; but scientists knew that for certain regions of the
genome, the genes would not accurately represent the cod
genome.
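As a rough illustration of the gap figure cited above, the share of gap bases in a set of scaffolds can be estimated by counting N characters. This is a minimal sketch, assuming scaffolds are held as plain strings in which gaps are written as runs of N (the usual FASTA convention); the function name and the toy sequences are hypothetical and are not taken from the cod project.

```python
def gap_fraction(scaffolds):
    """Return the fraction of scaffold bases that are gap characters (N or n)."""
    total = sum(len(s) for s in scaffolds)
    if total == 0:
        return 0.0
    gaps = sum(s.upper().count("N") for s in scaffolds)
    return gaps / total


# Toy scaffolds for illustration only (not real cod sequence).
toy_scaffolds = ["ACGTNNNNACGT", "NNNACGTACG", "ACGTACGTAC"]
print(f"{gap_fraction(toy_scaffolds):.1%} of scaffold bases are gaps")
```

Applied to a real assembly, the same count over every scaffold would give the kind of percentage quoted for the first cod assembly.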
Still, having the draft genome assembly was a real step
forward and enabled the first big biological finding:
nowhere in the cod’s 22,000 genes could scientists find the
genes necessary for functionality of the MHC II pathway,
a critical component of the white blood cell-mediating
major histocompatibility complex that exists in all jawed
vertebrates. “The immune system of cod is really different
from what we’re used to seeing,” Nederbragt says. “Some
key genes associated with that pathway are completely
gone from the genome. This has never been seen before,
and people are really surprised that this was at all
possible.” Knowing the pitfalls of the genome assembly,
the research team validated the
finding by adding more data and
mining information from various
sources, including other cod-like fish.
Photo: Lex Nederbragt, a research fellow at the University of Oslo, uses
the PacBio Sequencing System to sequence the cod genome.
“Having a really good reference
genome will make a big difference.”
Naturally, CEES scientists are
following up on this oddity of the
cod genome, looking into how the
immune response of cod functions
compared to other fish species.
Also, the cod genome indicates an
expansion of the MHC I complex, so
some are investigating whether that
has enabled MHC I to compensate for
the potential depletion of MHC II in
this organism.
Assembly Improvement
Even while the first cod genome
assembly was being published,
Nederbragt and his colleagues were
casting about for ways to improve
it. The team’s initial attempt at this
extracted DNA from the same cod
used for the original study and ran
mate-pair sequencing with the
Illumina® platform. That data helped
explain why the genome was so
fragmented, though it could not fix
the problem. The cod sequenced
was from the wild population, a
normal diploid fish whose marked
heterozygosity — sequence
differences between maternal and
paternal chromosomes — seemed to
be causing issues during assembly.
“Besides the SNPs that you would
normally expect, we see large
differences over hundreds of bases —
sometimes even kilobases — either
missing from the other chromosome,
or causing differences in regions
when we align them,” Nederbragt
explains. “This confuses assembly
programs.”
Another problem for the assembly
was the presence of many short
tandem repeats (STRs). “They’re
so long that they’re longer than
the Illumina reads,” Nederbragt
says, noting that these regions are
challenging for Sanger and 454
sequencing as well. “We estimate
that 10 to 20 percent of the gaps are
flanked, and probably spanned, by
those sequences.”
What the cod team really needed was
sequence data long enough to span
these regions of heterozygosity and
STRs. Their big break came in 2012
when the Oslo center acquired the
PacBio RS.
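The read-length argument can be made concrete: a read bridges a repeat only if it covers the whole repeat plus some unique flanking sequence on each side. The sketch below assumes 0-based, half-open reference coordinates; the function name, the 50-base flank requirement, and all positions are hypothetical values chosen for illustration, not measurements from the cod genome.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, min_flank=50):
    """True if the read covers the whole repeat plus min_flank bases on each side.

    Coordinates are 0-based, half-open [start, end) positions on the reference.
    """
    return (read_start <= repeat_start - min_flank
            and read_end >= repeat_end + min_flank)


# Hypothetical 600 bp tandem repeat at reference positions 1000..1600.
repeat = (1000, 1600)
short_read = (1100, 1250)   # 150 bp read that falls entirely inside the repeat
long_read = (200, 3200)     # 3 kb read that extends well past both ends

print(spans_repeat(*short_read, *repeat))  # False
print(spans_repeat(*long_read, *repeat))   # True
```

A read shorter than the repeat can never satisfy the check, which is the situation Nederbragt describes for the Illumina reads.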
Building a Better Reference
As they tested out the new
instrument, Nederbragt and his
colleagues ran their default cod
sample to get a sense of the PacBio
performance with DNA they already
knew very well. “When we looked
at these PacBio reads mapping
to the assembly, we saw them
crossing large gaps of even multiple
kilobases,” he says. It was a moment
the team had been anticipating for
years. “I could see that the problem
of STRs and heterozygosity could
be addressed by this technology,”
Nederbragt adds.
Indeed, the multi-kilobase reads
from the PacBio RS confirmed what
the team had suspected all along:
that these short tandem repeats
were preventing other sequencing
technologies from getting through
gaps, and that the proliferation
of these regions was causing a
fragmented assembly. For the first
time, the team actually had data
indicating that its theory about the
heterozygosity problem was correct:
the reads showed long stretches
of different sequence flanked by
sequences that matched each other,
indicating heterozygous regions.
This development altered the course
of the genome project. “It made
us change our sequencing plans,”
Nederbragt says. “We decided to use
the remaining funds on generating
about 8x coverage of PacBio reads.”
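For scale, coverage is the total number of sequenced bases divided by the genome size, so 8x coverage of the 830 Mb genome mentioned earlier corresponds to roughly 6.6 Gb of PacBio sequence. The sketch below is back-of-the-envelope arithmetic only; the mean read length is a purely hypothetical placeholder, not a figure reported by the project.

```python
GENOME_SIZE_BP = 830_000_000   # 830 Mb, as stated in the article
TARGET_COVERAGE = 8            # "about 8x", as stated in the article


def bases_needed(genome_size_bp, coverage):
    """Total sequenced bases required to reach the given average coverage."""
    return genome_size_bp * coverage


def reads_needed(total_bases, mean_read_length_bp):
    """Number of reads of a given mean length needed, rounded up."""
    return -(-total_bases // mean_read_length_bp)  # ceiling division


total = bases_needed(GENOME_SIZE_BP, TARGET_COVERAGE)
print(f"{total / 1e9:.2f} Gb of sequence")  # 6.64 Gb of sequence

ASSUMED_MEAN_READ_BP = 3_000   # hypothetical value, for illustration only
print(reads_needed(total, ASSUMED_MEAN_READ_BP), "reads at the assumed length")
```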
Since they already had so much 454
and Illumina data, the team opted
to add in the new PacBio reads
to improve the existing assembly.
“Together with PhD student Ole
Kristian Tørresen from our group, we
Photo: In the northernmost part of the Dutch North Sea live large
schools of North Sea cod, one of the most endangered species.
“We’ve never seen a faster assembly,” Nederbragt says; it came together in just 36 hours.