UC DAVIS: 100K Genome Project unveils 20 more foodborne pathogen genomes
July 22, 2013
The 100K Genome Project, led by the University of California, Davis, the U.S. Food and Drug Administration’s Center for Food Safety and Applied Nutrition, and Agilent Technologies, today announced that it has added 20 newly completed genome sequences of foodborne disease-causing microorganisms to its public database at the National Center for Biotechnology Information.
The genomes were determined using Single Molecule, Real-Time (SMRT®) Sequencing technology from Pacific Biosciences of California, Inc.
This brings to 30 the number of genomic sequences completed by the 100K Genome Project, which aims to sequence 100,000 bacterial and viral genomes. This sequencing effort is focused on speeding the diagnosis and treatment of foodborne diseases, as well as shortening the duration and limiting the spread of foodborne illness outbreaks. In the United States alone, foodborne diseases annually sicken around 48 million people and kill approximately 3,000, according to the Centers for Disease Control and Prevention.
The newly deposited sequences include several isolates of Salmonella, Listeria, Campylobacter, and Vibrio, as well as a full characterization of their epigenomes – a diagnostic feature that defines how the DNA is chemically modified and changes how the organism behaves.
“These finished genome sequences represent the highest quality standard, with each strain closed in a single bacterial chromosome and the associated mobile DNA,” said Bart Weimer, director of the 100K Genome Project and professor at the School of Veterinary Medicine at UC Davis. “They also contain complete associated phage or plasmid elements, which are critical for understanding pathogenicity, drug resistance and other biologically important traits that are linked to survival.”
“The genomes we have analyzed to date are from pathogens responsible for common and debilitating foodborne infections,” Weimer said, noting that the ready availability of this information will aid in reducing the time needed to diagnose and define outbreak strains.
“Making these genomic sequences publicly available through the National Center for Biotechnology Information database provides researchers and public health officials with information that will allow tracking of foodborne pathogens to their source,” said Marc Allard, an FDA genomics expert and advisor to the 100K Genome Project. “This will ultimately speed outbreak investigations, reduce illness, and facilitate the development of new rapid test methods to detect pathogens.”
The new genomes were sequenced and assembled using technology capable of detecting and identifying genome-wide methylation patterns as it performs DNA sequencing.
“Increasingly, microbiologists are recognizing that epigenetic information provides essential clues to the virulence of an outbreak strain,” said Jonas Korlach, chief scientific officer at Pacific Biosciences. “The automated pipelines that made the completion of these 20 genomes and epigenomes possible serve as a solid foundation for the production of many more high-quality, finished genomes of foodborne pathogens through this project in the near future.”
A complete list of the bacteria for which finished genomes have been generated by the 100K Project is available online at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/bioproject/186441), a partner in the project.
About the 100K Genome Project:
The 100K Genome Project was launched in March 2012 as a collaborative effort between UC Davis, Agilent Technologies, and the U.S. Food and Drug Administration. Since then, the U.S. Centers for Disease Control and Prevention, U.S. Department of Agriculture and additional corporate partners have joined the worldwide effort. The project’s first data were deposited in June 2013. For more information visit http://100kgenome.vetmed.ucdavis.edu
http://news.ucdavis.edu/search/news_detail.lasso?id=10660&
PacBio Sequencing for Whole Human Genomes. Published on July 15, 2013. In this talk, Mount Sinai's Eric Schadt uses PacBio sequencing on human genomes and reports finding uncharacterized structural variation that could have diagnostic utility. Schadt says that SMRT sequencing is advantageous for long-range genetic information, extreme GC content, and highly repetitive regions. He presents sequence data for a CEPH individual studied for repeat expansions, showing that long reads can resolve the majority of these regions.
(Correction to the last part of the article on COD GENOME ASSEMBLY:) “We are currently trying to make use of all the data we have in the best possible way,” he says.

Layering these reads together, and using the highly accurate consensus, the team generated very long reads, error-corrected them using the short-read data, and ran them through Celera® Assembler. “We’ve never seen a faster assembly,” Nederbragt says; it came together in just 36 hours.
As Nederbragt and his colleagues sifted through the new assembly, they realized that the assembler was splitting haplotypes rather than merging them, so the heterozygous regions were being run as linear sections of the genome, rather than alternates of the same section. “The sequencing problem is now gone; it looks like we have the whole genome present in PacBio reads,” Nederbragt says. “Now it has become a bioinformatics challenge.” He and his team are currently working to quantify the regions they believe should be split into haplotypes and to figure out the differences between them.
As they determine the best bioinformatic solution to the assembly, they are starting to investigate the new genome data to see where it varies from the original stickleback-oriented assembly. They’ve already seen an exon in the original annotation that potentially does not exist in the all-cod assembly, Nederbragt says, noting that a full comparison of the two genome assemblies will take place in the future. For now, they are focused on getting this new assembly into its 23 pseudochromosomes, which can then be shipped off for annotation. “The goal is to get the annotators an assembly good enough that they don’t need to retrofit it with information from other organisms,” he says.
Implications for Future Initiatives

The interest in generating an improved cod genome assembly took a critical turn when the Norwegian Research Council funded a large grant to resequence 1,000 cod. The four-year Aqua Genome Project, as it is known, will produce truly meaningful and useful results if this resequencing can be done with a high-quality draft of the actual fish being studied, rather than relying on the old cod assembly with information from other fish genes woven into it. “The goal is to do deep cataloging of genomic variation,” a process that will also include studies of transcriptome, methylation, and structural variation data on the PacBio platform, Nederbragt says. “Having a really good reference genome will make a big difference.”

The Aqua Genome Project seeks to characterize the variation between wild-caught and farmed fish in an effort to increase the success of aquaculture. This is no trivial matter: “Sustainable aquaculture could contribute to solving the world’s food problems,” Nederbragt says.

As he and his colleagues get started on that new project, they’ll be relying on their PacBio sequencer. “If you want to have long-range information that you can trust, PacBio reads are very useful,” Nederbragt says.
(Case Study, June 2013) COD GENOME ASSEMBLY: LONG READS OFFER UNIQUE INSIGHT

Scientists at the University of Oslo’s Centre for Ecological and Evolutionary Synthesis (CEES) applied long PacBio® reads to a genome that was proving particularly difficult to assemble. Today, sequencing problems associated with the Atlantic cod genome are a thing of the past — and researchers are using their new assembly as the foundation for a major resequencing effort that’s just getting started.
Recent work using multi-kilobase sequence reads generated from Single Molecule, Real-Time (SMRT®) technology is enabling a dramatically improved genome assembly for cod, an economically important fish species. In many ways the cod genome seemed like a puzzle that might never be fully solved, but the Pacific Biosciences® sequencing platform made significant inroads — and just in time, as the team of researchers working on cod recently received funding to resequence 1,000 more of them. Being able to base these new efforts on a reliable genome assembly will make future results far more meaningful.
Lex Nederbragt, a research fellow at the University of Oslo and a member of the Norwegian High-Throughput Sequencing Centre, regularly contemplates the broader importance of cod. There is interest in domesticating cod for aquaculture, and the genome assembly can aid in finding those regions that influence traits important for disease resistance and growth rates, which may prove crucial for the economic success of this industry. Cod is the most important aquatic species in Norway and other commercial fishery nations. Moreover, cod has an interesting population ecology; that is, some populations do exceedingly well, whereas others get depleted through fishing and never recover to historic abundances. In the last decade or so, “there has been a growing interest in the genomics of this organism,” Nederbragt says.
In 2008, Nederbragt and his colleagues Bastiaan Star, Sissel Jentoft, Kjetill S Jakobsen, and others from the CEES-led Cod Genome Sequencing Consortium began a cod genome project using shotgun and mate-pair sequencing on the 454® platform. They mixed in some long-range information from BACs sequenced using traditional Sanger sequencing, which resulted in an assembly having thousands of scaffolds and hundreds of thousands of contigs for the 830 Mb genome. Some 35 percent of the bases in the scaffolds were gaps, Nederbragt says, which of course proved quite a challenge for the Ensembl annotation team. “They managed to produce a meaningful annotated genome by taking well-known genes from stickleback and other fishes to try to put together the missing pieces in cod,” he adds. In generating an assembly and annotation, the project was a success; but scientists knew that for certain regions of the genome, the genes would not accurately represent the cod genome.
Still, having the draft genome assembly was a real step forward and enabled the first big biological finding: nowhere in the cod’s 22,000 genes could scientists find the genes necessary for functionality of the MHC II pathway, a critical component of the white blood cell-mediating major histocompatibility complex that exists in all jawed vertebrates. “The immune system of cod is really different from what we’re used to seeing,” Nederbragt says. “Some key genes associated with that pathway are completely gone from the genome. This has never been seen before, and people are really surprised that this was at all possible.” Knowing the pitfalls of the genome assembly, the research team validated the finding by adding more data and mining information from various sources, including other cod-like fish.

(Photo caption: Lex Nederbragt, a research fellow at the University of Oslo, uses the PacBio Sequencing System to sequence the cod genome.)

Naturally, CEES scientists are following up on this oddity of the cod genome, looking into how the immune response of cod functions compared to other fish species. Also, the cod genome indicates an expansion of the MHC I complex, so some are investigating whether that has enabled MHC I to compensate for the potential depletion of MHC II in this organism.
Assembly Improvement

Even while the first cod genome assembly was being published, Nederbragt and his colleagues were casting about for ways to improve it. The team’s initial attempt at this extracted DNA from the same cod used for the original study and ran mate-pair sequencing with the Illumina® platform. That data helped explain why the genome was so fragmented, though it could not fix the problem. The cod sequenced was from the wild population, a normal diploid fish whose marked heterozygosity — sequence differences between maternal and paternal chromosomes — seemed to be causing issues during assembly. “Besides the SNPs that you would normally expect, we see large differences over hundreds of bases — sometimes even kilobases — either missing from the other chromosome, or causing differences in regions when we align them,” Nederbragt explains. “This confuses assembly programs.”
Another problem for the assembly was the presence of many short tandem repeats (STRs). “They’re so long that they’re longer than the Illumina reads,” Nederbragt says, noting that these regions are challenging for Sanger and 454 sequencing as well. “We estimate that 10 to 20 percent of the gaps are flanked, and probably spanned, by those sequences.”
What the cod team really needed was sequence data long enough to span these regions of heterozygosity and STRs. Their big break came in 2012 when the Oslo center acquired the PacBio RS.
Building a Better Reference

As they tested out the new instrument, Nederbragt and his colleagues ran their default cod sample to get a sense of the PacBio performance with DNA they already knew very well. “When we looked at these PacBio reads mapping to the assembly, we saw them crossing large gaps of even multiple kilobases,” he says. It was a moment the team had been anticipating for years. “I could see that the problem of STRs and heterozygosity could be addressed by this technology,” Nederbragt adds.
Indeed, the multi-kilobase reads from the PacBio RS confirmed what the team had suspected all along: that these short tandem repeats were preventing other sequencing technology from getting through gaps, and that the proliferation of these regions was causing a fragmented assembly. For the first time, the team actually had data indicating that its theory about the heterozygosity problem was correct: the reads showed long stretches of different sequence flanked by sequences that matched each other, indicating heterozygous regions.

This development altered the course of the genome project. “It made us change our sequencing plans,” Nederbragt says. “We decided to use the remaining funds on generating about 8x coverage of PacBio reads.” Since they already had so much 454 and Illumina data, the team opted to add in the new PacBio reads to improve the existing assembly. “Together with PhD student Ole Kristian Tørresen from our group, we ...” [the sentence is cut off in the original; the corrected ending appears at the top of this post]

(Photo caption: In the most northern part of the Dutch North Sea live big schools of North Sea cod, one of the most endangered species.)
Friday, Jul. 05, 2013 (this link changes with a new story daily): PACBIO is the TOP STORY headline today! http://paper.li/mtwolfinger/1342906758 (Next-Generation-Sequencing Update)
Genome Biology | Full text | The advantages of SMRT sequencing
Shared by Pacific Biosciences (genomebiology.com): “So-called next-generation technologies for sequencing DNA are penetrating every aspect of biology thanks to the immense amount of information that is encoded within nucleic acid sequences. However, ...”
(This same story was posted on this message board on July 3.) http://paper.li/mtwolfinger/1342906758
(Posted on July 5, 2013) De novo bacterial genome assembly: a solved problem?

Pacific Biosciences published a paper earlier this year on an approach to sequence and assemble a bacterial genome leading to a near-finished, or finished, genome. The approach, dubbed the Hierarchical Genome Assembly Process (HGAP), is based on PacBio reads only, without the need for short reads. This is how it works:
•generate a high-coverage dataset of the longest reads possible; aim for 60-100x in raw reads
•pre-assembly: use the reads from the shorter part of the raw read-length distribution to error-correct the longest reads; set the cutoff in such a way that the longest reads make up about 30x coverage
•use the long, error-corrected reads in a suitable assembler, e.g. Celera, to produce contigs
•map the raw PacBio reads back to the contigs to polish the final sequence (rather, recall the consensus using the raw reads as evidence) with the Quiver tool
The approach is very well explained on this website. As an aside, the same principle can now be used with the PacBioToCA pipeline.
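For readers who like to see the flow in code: below is a minimal sketch of the seed-read selection behind the pre-assembly step, assuming an estimated genome size. The function and variable names are mine, not PacBio's smrtpipe internals.

def select_seed_reads(read_lengths, genome_size, seed_coverage=30):
    # Pick the longest raw reads as "seeds" until they total ~30x of the
    # estimated genome size; all remaining, shorter reads are then used
    # to error-correct the seeds (illustrative only, not PacBio's code).
    target = seed_coverage * genome_size
    total, seeds = 0, []
    for read_id, length in sorted(read_lengths.items(),
                                  key=lambda kv: kv[1], reverse=True):
        if total >= target:
            break
        seeds.append(read_id)
        total += length
    cutoff = read_lengths[seeds[-1]] if seeds else 0
    return cutoff, set(seeds)

# Toy example: a 4.3 Mbp genome and 100,000 synthetic reads
reads = {"read%d" % i: 4000 + (i % 50) * 300 for i in range(100000)}
cutoff, seeds = select_seed_reads(reads, genome_size=4300000)
print("seed length cutoff:", cutoff, "bp;", len(seeds), "seed reads")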
In principle, this approach could result in a finished genome, i.e. a gapless contig per chromosomal element (chromosomes and plasmids). A more theoretical study confirms this:
“Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library.” (Koren et al, arXiv:1304.3752)
As always, the proof is in the pudding. There have been reports, and even a publication here and there, that the HGAP approach actually works. In this blog post I would like to add our experiences, which in short are that HGAP can indeed result in (close-to) finished genomes.
At the Norwegian Sequencing Centre, with which I am affiliated, we recently received several bacterial genome DNA samples for PacBio sequencing. Given our very positive first experiences with size-selecting PacBio libraries using the BluePippin (see my previous post), we decided to use this instrument for these samples as well. Four of the samples yielded very nice libraries, which were sequenced, two SMRTcells each, on our (recently upgraded) PacBio RSII instrument.
Raw reads
We have never seen such long reads:
                 PB_0027   PB_0028   PB_0029   PB_0031
Count            80512     58524     45514     84169
Sum              595 Mbp   462 Mbp   351 Mbp   669 Mbp
Av. length (bp)  7393      7893      7714      7951
N50 (bp)         10662     11205     11109     11162
Largest (bp)     24397     25552     23992     25678
Note that the average read length is much longer than the RSII specification, which is about 4.6 kbp.
These reads were then used in HGAP. We have smrtpipe, the analysis suite of Pacific Biosciences, installed, so I could simply make a file with the names of the input files, a default HGAP settings xml file, and run the whole thing on one of our big servers. The assemblies took about two days when given 32 CPUs and a lot of memory – I haven’t logged how much RAM they actually used.
Pre-assembly
Here are the results of the pre-assembly, the correction of the longest reads (comprising 30x in raw coverage) with the rest of the reads:
              PB_0027   PB_0028   PB_0029   PB_0031
Cutoff (bp)*  12106     12077     10371     12780
Count         9186      8636      10059     9252
Sum           107 Mbp   100 Mbp   106 Mbp   110 Mbp
Av. (bp)      11594     11562     10540     11876
N50 (bp)      12519     12770     11513     13120
Largest (bp)  20043     19090     19030     18681
*Cutoff: minimum length of seeds for error-correction.
After pre-assembly, there was more than 100 Mbp of error-corrected, potentially high-quality reads, with an N50 higher than one sometimes sees for the contigs of a short-read bacterial genome assembly!
Assembly
These 8 – 10 thousand reads were assembled by Celera, with Quiver polishing, into:
Contig sizes per assembly:
PB_0027: 3.4 Mbp, 45 kbp, 64 kbp, 45 kbp, 17 kbp
PB_0028: 3.2 Mbp, 76 kbp
PB_0029: 4.3 Mbp, 80 kbp
PB_0031: 1.8 Mbp, 1.3 Mbp, 1.1 Mbp, 0.95 Mbp, 95 kbp
Wow, mostly one laaarge contig (and I checked, these are without ‘N’ bases) and a few shorter ones. The exception was the last strain, which assembled into a few large pieces that together, as I understand it, are too large. A further step for this assembly is trying the Minimus2 tool, to see whether there is enough overlap between the contigs to further reduce their number – a step generally recommended for HGAP assemblies. I haven’t tried this yet for this assembly.
So, it looks like ‘it just works’. Well, there was at least one case where a misassembly is suspected. Looking at the coverage plot (of the remapped raw PacBio reads) for the 4.3 Mbp contig of PB_0029, we saw this:
(Figure: mapping coverage of raw PacBio reads to the largest contig of the PB_0029 assembly.)
The sudden jump in coverage after 1.2 Mbp points to a fusion of the sequences of two chromosomes – and in fact this is quite likely the case given what is known about these strains. For the others, it remains to be seen whether the smaller pieces are in fact plasmids, or should be part of the major chromosome.
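Such coverage jumps are easy to scan for programmatically. Here is a rough sketch of the idea, entirely my own and hypothetical rather than an existing tool: build a per-base coverage profile from read alignment intervals and flag abrupt shifts in windowed mean depth.

def coverage_profile(alignments, contig_len):
    # alignments: list of (start, end) 0-based, half-open intervals.
    # Difference-array trick: O(reads + contig length).
    diff = [0] * (contig_len + 1)
    for start, end in alignments:
        diff[start] += 1
        diff[min(end, contig_len)] -= 1
    cov, running = [], 0
    for d in diff[:-1]:
        running += d
        cov.append(running)
    return cov

def find_jumps(cov, window=10000, fold=1.5):
    # Flag positions where windowed mean coverage changes by more than `fold`.
    for pos in range(window, len(cov) - window, window):
        left = sum(cov[pos - window:pos]) / window
        right = sum(cov[pos:pos + window]) / window
        if left > 0 and not (1 / fold < right / left < fold):
            yield pos, left, right

# Toy contig: ~40x depth for the first 1 Mbp, ~80x afterwards
aln = [(i * 25, i * 25 + 1000) for i in range(48000)]
aln += [(1000000 + i * 25, 1000000 + i * 25 + 1000) for i in range(8000)]
for pos, l, r in find_jumps(coverage_profile(aln, 1200000)):
    print("coverage jump near", pos, ":", round(l), "->", round(r))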
A few remarks before I conclude:
•these four samples are clearly success stories
•all had modest GC percentages, around 35 – 50%
•we also have had a sample that didn’t fragment very well and only yielded a 2 kbp insert library (giving CCS reads after sequencing)
•another strain didn’t behave as well either, resulting in reads averaging 3.5 kbp – assembly for this one has not been started yet
•there is no reference genome for these samples, so assembly accuracy, and per-base quality, could not be assessed fully
Conclusion
It looks like, for well-behaved samples, the approach of combining PacBio library creation, BluePippin size selection (optional, but highly recommended) and sequencing of two SMRTcells works very well to give finished, or near-finished, bacterial genome assemblies. I want to emphasise the following, though: even though the assembly looks great, it is afterwards up to the biologist/researcher to make sure the contigs actually make sense given:
•the remapping of the reads
•what is known about the species (e.g. expected number of chromosomes)
•what is known about the sample (e.g., presence of plasmids)
•other, independent evidence, e.g. Illumina reads, optical mapping results, etc.
The title of this post asks: “De novo bacterial genome assembly: a solved problem?”. I dare to say we’re pretty close…
A bioinformatician’s side note

The bottleneck of the HGAP process was the two consensus-calling steps: when the consensus of the longest reads is called (based on the mapped shorter ones), and especially the Celera contig consensus calling. The latter takes one contig at a time, and since contigs are now becoming millions of base pairs long, this can take many hours, perhaps even half the total assembly time. By the way, overlapping the error-corrected reads was done in minutes… So, if someone is interested in developing a parallelised consensus caller that can work with parts of a long contig and stitch the consensi back together when done, we bioinformaticians doing HGAP would be very grateful…
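To make the request concrete, here is a bare-bones sketch of the chunk-and-stitch idea, with a placeholder where a real consensus caller (e.g. Quiver) would run; everything here is hypothetical, not an existing tool.

from concurrent.futures import ProcessPoolExecutor

def call_consensus(window_seq):
    # Placeholder: a real implementation would run a consensus caller
    # (e.g. Quiver) over the reads mapped to this window.
    return window_seq

def chunked_consensus(contig, chunk=500000, overlap=10000, workers=8):
    # Split the contig into windows that overlap by `overlap` bases,
    # call consensus on each window in parallel, then rejoin them.
    windows = [contig[max(0, s - overlap):s + chunk]
               for s in range(0, len(contig), chunk)]
    with ProcessPoolExecutor(workers) as pool:
        pieces = list(pool.map(call_consensus, windows))
    # Naive stitch: drop each later window's leading overlap. A real
    # stitcher would align the overlapping ends, since consensus calling
    # can change sequence lengths (indels).
    stitched = pieces[0]
    for piece in pieces[1:]:
        stitched += piece[overlap:]
    return stitched

if __name__ == "__main__":
    contig = "ACGT" * 300000          # a 1.2 Mbp toy contig
    assert chunked_consensus(contig) == contig
    print("stitched", len(contig), "bp from overlapping windows")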
Acknowledgments
This post would not have been possible without the excellent skills of the NSC lab team, and I thank the owners of the bacterial samples whose results this post describes for permission to use the metrics. I apologize in advance for not being able to share the (raw and assembled) data presented here…
A better read if you click on the link! http://flxlexblog.wordpress.com/2013/07/05/de-novo-bacterial-genome-assembly-a-solved-problem/#more-420
Open this link and click on Figures 1 & 2: http://genomebiology.com/2013/14/6/405 (the captions for both figures are reproduced in the full-text commentary below).
(Wednesday, July 3, 2013) Using PacBio Sequencing, Scientists Find No Evidence of Horizontal Gene Transfer in Haitian Cholera Strain
A new analysis from public health scientists has found that the cholera strain responsible for the 2010 outbreak in Haiti has a limited ability to add to its genetic repertoire through horizontal gene transfer.
The paper, “Evolutionary Dynamics of Vibrio cholerae O1 Following a Single-Source Introduction to Haiti,” from lead author Lee Katz and senior author Cheryl Tarr, both from the US Centers for Disease Control and Prevention, came out this week in mBio, a journal published by the American Society for Microbiology.
Because the 2010 epidemic represents the first known arrival of cholera in Haiti, the outbreak served as an unusual opportunity to study the pathogen from a single point of origin and its subsequent evolution as it spread across the country.
The CDC scientists, together with collaborators at the Public Health Agency of Canada, Mount Sinai School of Medicine, Georgia Institute of Technology, and others, sequenced several cholera isolates from different regions and different stages of the outbreak. Nine of those isolates, plus a reference genome, were sequenced to near-closure on the PacBio® platform.
The data corroborate earlier findings that grouped the Haitian isolates with strains found in Nepal. In Haiti, however, the pathogen showed little in the way of evolution, adopting no new genes or genomic regions through the usual mechanism of horizontal gene transfer, the scientists report. They note that the strain is “severely impaired for transformation”; in addition to the lack of acquired genes in any of the isolates’ genomes, the bacteria failed in lab tests to take up new genetic material through traditional mechanisms.
“A pangenome analysis showed nearly homogeneous genomic content, with no evidence of gene acquisition among Haiti isolates,” the authors write. They add, “It is well accepted that [horizontal gene transfer] is a major force driving evolution in bacteria, including Vibrio; thus, the lack of HGT observed in our study might be surprising.”
The authors also made note of a technical challenge in their work: pulsed-field gel electrophoresis, useful at the outset to verify that the outbreak was caused by a single strain, was in time no longer able to confirm the single-founder theory. This issue was addressed by switching to whole-genome sequencing for bacterial analysis, making comprehensive surveillance more effective and conclusive.
(Same article, just a shorter version, from the PacBio Blog, Wednesday, July 3, 2013:) Genome Biology Commentary Discusses the Advantages of SMRT Sequencing
A new commentary in Genome Biology from highly respected scientific authors, including a Nobel Prize winner, highlights the benefits of Single Molecule, Real-Time (SMRT®) Sequencing. The commentary, entitled “The advantages of SMRT sequencing,” comes from Richard Roberts at New England BioLabs, Mauricio Carneiro at the Broad Institute, and Michael Schatz at Cold Spring Harbor Laboratory.
The authors begin with the premise that the PacBio® RS is sometimes overlooked as a next-generation sequencing option, even though it serves as an “ideal approach to the complete sequencing of small genomes.”
The commentary focuses on three advantages of SMRT Sequencing: extraordinarily long reads, methylation data, and high accuracy in conjunction with lack of sequencing bias. The scientists note the importance of long reads for de novo genome assemblies and for sequencing through repeats and other complex regions. Because SMRT Sequencing can directly detect chemical base modifications to RNA and DNA, such as methylation, the authors state that the technology provides key functional information through the sequencing process. Finally, they highlight the lack of systematic bias in this sequencing method; as a result, “SMRT sequencing provides a highly accurate statistically averaged consensus perspective of the genome, as it is highly unlikely that the same error will be randomly observed multiple times,” they write.
The authors say that all of these factors “make a strong case for combining the more traditional, sequence-dense data from other technologies with at least moderate coverage of SMRT data so that genomes can be improved, their methylation patterns obtained, and the functional activity of their methyltransferase genes deduced.” They conclude, “SMRT sequencing opens a new window that may have a dramatic effect on our understanding of this biology.”
PacBio: Debunking the Error Myth (homolog.us, 2013/07/03)

Richard J. Roberts, Mauricio O. Carneiro and Michael C. Schatz published a short commentary in Genome Biology that is worth taking a look at.
The power of SMRT sequencing data lies both in its long read lengths and in the random nature of the error process (Figure 2). It is true that individual reads contain a higher number of errors: approximately 11% to 14% or Q12 to Q15, compared with Q30 to Q35 from Illumina and other technologies. However, given sufficient depth (8x or more, say), SMRT sequencing provides a highly accurate statistically averaged consensus perspective of the genome, as it is highly unlikely that the same error will be randomly observed multiple times. Notoriously, other platforms have been found to suffer from systematic errors that need to be resolved by complementary methods before the final sequence is produced [16].
In a nutshell, smart (SMRT) people make mistakes and many smart persons are whimsical
In our understanding, more than the error rate itself, it is the type of error that made PacBio data so difficult for bioinformaticians. Bioinformatics tools and algorithms are designed to cope with SNPs, whereas PacBio errors are mostly indels. That is why we took time to explain the point in the (still incomplete and error-prone) PacBio tutorials.
http://www.homolog.us/blogs/blog/2013/07/03/pacbio-debunking-the-error-myth/
Published: 3 July 2013

Abstract
Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.
Correspondence
Pacific Biosciences' single molecule, real-time sequencing technology, SMRT, is one of several next-generation sequencing technologies that are currently in use. In the past, it has been somewhat overlooked because of its lower throughput compared with methods such as Illumina and Ion Torrent, and because of persistent rumors that it is inaccurate. Here, we seek to dispel these misconceptions and show that SMRT is indeed a highly accurate method with many advantages when used to sequence small genomes, including the possibility of facile closure of bacterial genomes without additional experimentation. We also highlight its value in being able to detect modified bases in DNA.
Extending read lengths
So-called next-generation technologies for sequencing DNA are penetrating every aspect of biology thanks to the immense amount of information that is encoded within nucleic acid sequences. However, today's next-generation sequencing technologies, such as Illumina, 454 and Ion Torrent, have several significant limitations, especially short read lengths and amplification biases, that restrict our ability to fully sequence genomes. Unfortunately, with the rise of next-generation sequencing, even less emphasis is being placed on trying to understand at the biological and biochemical levels just what functions newly discovered genes have and how those functions allow an organism to work, which is surely why we are sequencing DNA in the first place. Now a new technology, SMRT sequencing from Pacific Biosciences [1], has been developed that not only produces considerably longer and highly accurate DNA sequences from individual unamplified molecules, but can also show where methylated bases occur [2] (and thereby provide functional information about the DNA methyltransferases encoded by the genome).
SMRT sequencing is a sequencing-by-synthesis technology based on real-time imaging of fluorescently tagged nucleotides as they are synthesized along individual DNA template molecules. Because the technology uses a DNA polymerase to drive the reaction, and because it images single molecules, there is no degradation of signal over time. Instead, the sequencing reaction ends when the template and polymerase dissociate. As a result, instead of the uniform read length seen with other technologies, the read lengths have an approximately log-normal distribution with a long tail. The average read length from the current PacBio RS instrument is about 3,000 bp, but some reads may be 20,000 bp or longer. This is roughly 30 to 200 times longer than the read length from a next-generation sequencing instrument, and more than a four-fold improvement since the original release of the instrument two years ago. It is notable that the recently announced PacBio RS II platform claims to have a further four-fold improvement, with twice the mean read length and twice the throughput of the current machine.
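To make the log-normal claim concrete, here is a quick simulation; the mu/sigma parameters are my own guesses chosen so the mean lands near 3,000 bp, not published PacBio specifications.

import random, statistics

random.seed(1)
mu, sigma = 7.72, 0.75     # exp(mu + sigma**2 / 2) is roughly 3,000 bp
lengths = [int(random.lognormvariate(mu, sigma)) for _ in range(100000)]
print("mean length:", round(statistics.mean(lengths)), "bp")
print("median length:", round(statistics.median(lengths)), "bp")
print("longest read:", max(lengths), "bp")
print("reads over 10 kb:", sum(l > 10000 for l in lengths))

The long tail is the point: the median sits well below the mean, while a small fraction of reads are many times longer than average.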
Applications of SMRT sequencing
The SMRT approach to sequencing has several advantages. First, consider the impact of the longer reads, especially for de novo assemblies of novel genomes. While typical next-generation sequencing can provide abundant coverage of a genome, the short read lengths and amplification biases of those technologies can lead to fragmented assemblies whenever a complex repeat or poorly amplified region is encountered. As a result, GC-rich and GC-poor regions, which tend to be poorly amplified, are particularly susceptible to poor quality sequencing. Resolving fragmented assemblies requires additional costly bench work and further sequencing. By also including the longer reads of SMRT sequencing runs, the read set will span many more repeats and missing bases, thereby closing many of the gaps automatically and simplifying, or even eliminating, the finishing time (Figure 1). It is becoming routine for bacterial genomes to be completely assembled using this approach [3,4], and we expect this practice will translate to larger genomes in the near future. A complete genome is far more useful than the poor quality draft sequences that litter GenBank because it provides a complete blueprint for the organism; the genes encoded therein represent the full biological potential of that organism. With only draft assemblies available, one is always left with the nagging feeling that some crucial gene is missing - perhaps the one in which you are most interested! The long read lengths also have more power to reveal complex structural variations present in DNA samples, such as pinpointing precisely where copy number variations have occurred relative to the reference sequence [5]. They are also extremely powerful for resolving complex RNA splicing patterns from cDNA libraries, since a single long read may contain the entire transcript end-to-end, thus eliminating the need to infer the isoforms [6].
Figure 1. Idealized assembly graphs [18]of the 5.2 megabase-pair B. anthracis Ames Ancestor main chromosome using (a) 100 bp, (b) 1,000 bp and (c) 5,000 bp reads. The graphs encode the compressed de Bruijn graph derived from infinite coverage error-free reads, effectively representing the repeats in the genome and the upper bound of what could be achieved in a real assembly. Increasing the read length decreases the number of contigs because the longer reads will span more of the repeats. Note the assembly with 5,000 bp reads has a self-edge because the chromosome is circular.
Second, consider DNA methyltransferases. These can exist as solitary entities or as parts of restriction-modification systems. In both cases, they methylate relatively short sequence motifs that can easily be recognized from SMRT sequencing data because of the change in DNA polymerase kinetics, as it moves along the template molecule, that result from the presence of epigenetic modifications. The altered kinetics cause a change in the timing of when the fluorescent colors are observed, thus enabling direct detection of epigenetic modifications, which can ordinarily only be inferred, and bypassing the usual necessity of enrichment or chemical conversion. Often, thanks to bioinformatics, the gene responsible for any given modification can be matched to the sequence motif in which the modification lies [7,8]. When it cannot, then simply cloning the gene into a plasmid, which is subsequently grown in a non-modifying host and re-sequenced, can provide the match [9]. Moreover, SMRT sequencing has also been able to identify RNA base modifications through the same approach as DNA base modifications, but using an RNA transcriptase in place of the DNA polymerase [10]. In fact, SMRT sequencing represents an important step toward uncovering the biology that happens between DNA and proteins, including not only the study of mRNA sequences but also the regulation of translation [11,12]. Thus, functional information emerges directly from the SMRT sequencing approach.
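As a toy illustration of the kinetics idea (my own simplification, not PacBio's actual modification-calling pipeline): compare the mean inter-pulse duration (IPD) at each position between a native sample and an unmodified control, and flag positions where polymerase progress is markedly slower.

import statistics

def flag_modified_positions(native_ipds, control_ipds, min_ratio=2.0):
    # Both inputs: {position: [IPD samples in seconds]}. A large
    # native/control IPD ratio suggests a base modification at that site.
    hits = []
    for pos in native_ipds:
        ratio = (statistics.mean(native_ipds[pos]) /
                 statistics.mean(control_ipds[pos]))
        if ratio >= min_ratio:
            hits.append((pos, round(ratio, 2)))
    return hits

# Toy data: position 100 is slow in the native sample only
native  = {100: [0.90, 1.10, 1.00], 101: [0.30, 0.35, 0.32]}
control = {100: [0.31, 0.29, 0.30], 101: [0.30, 0.33, 0.31]}
print(flag_modified_positions(native, control))   # [(100, 3.33)]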
Third, we must consider the persistent rumor that SMRT sequencing is much less accurate than other next-generation sequencing platforms, which has now been demonstrated to be untrue in several ways. First, a direct comparison of several approaches to determining genetic polymorphisms has shown that SMRT sequencing has comparable performance to other sequencing technologies [13]. Second, the accuracy of assembling a complete genome using SMRT sequencing in combination with other technologies has proved to be as reliable and accurate as more traditional approaches [3,6,14]. Moreover Chin et al. [15] showed that an assembly using only long SMRT sequencing reads achieves comparable or even higher performance than other platforms (99.999% accuracy in three organisms with known reference sequences), including 11 corrections to the Sanger reference of these genomes. Koren et al. [6] showed that most microbial genomes could be assembled into a single contig per chromosome with this approach; it is by far the least expensive option for doing so.
Debunking the error myth
The power of SMRT sequencing data lies both in its long read lengths and in the random nature of the error process (Figure 2). It is true that individual reads contain a higher number of errors: approximately 11% to 14% or Q12 to Q15, compared with Q30 to Q35 from Illumina and other technologies. However, given sufficient depth (8x or more, say), SMRT sequencing provides a highly accurate statistically averaged consensus perspective of the genome, as it is highly unlikely that the same error will be randomly observed multiple times. Notoriously, other platforms have been found to suffer from systematic errors that need to be resolved by complementary methods before the final sequence is produced [16].
Figure 2. A sequencing context breakdown of the empirical insertion error rate of the two platforms on NA12878 whole genome data. In this figure we show all contexts of size 8 that start with AAAAA. The empirical insertion quality score (y-axis) is PHRED scaled. Despite the higher error rate (approximately Q12) of the PacBio RS instrument, the error is independent of the sequencing context. Other platforms are known to have different error rates for different sequencing contexts. Illumina's HiSeq platform, shown here, has a lower error rate (approximately Q45 across eight independent runs), but contexts such as AAAAAAAA and AAAAACAG have extremely different error rates (Q30 versus Q55). This context-specific error rate creates bias that is not easily clarified by greater sequencing depth. Empirical insertion error rates were measured using the Genome Analysis Toolkit (GATK) - Base Quality Score Recalibration tool.
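A toy Monte Carlo makes the point above tangible. This is a deliberate simplification of my own: substitution-only errors and simple majority voting, whereas real SMRT errors are mostly indels and real consensus callers (e.g. Quiver) are far more sophisticated.

import random
from collections import Counter

random.seed(0)
BASES = "ACGT"

def consensus_error_rate(depth, p_err=0.14, trials=200000):
    # Each "read" observes the true base, or (with prob p_err) a random
    # wrong base. The consensus is a simple majority vote per position.
    wrong = 0
    for _ in range(trials):
        true = random.choice(BASES)
        calls = [random.choice([b for b in BASES if b != true])
                 if random.random() < p_err else true
                 for _ in range(depth)]
        if Counter(calls).most_common(1)[0][0] != true:
            wrong += 1
    return wrong / trials

for depth in (1, 4, 8, 15):
    print("%2dx coverage -> consensus error rate %.5f"
          % (depth, consensus_error_rate(depth)))

Because each error lands at a random position with a random wrong value, the chance that several reads agree on the same mistake falls off rapidly with depth.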
Another approach that benefits from the stochastic nature of the SMRT error profile is the use of circular consensus reads, where a sequencing read produces multiple observations of the same base in order to generate high-accuracy consensus sequence from single molecules [17]. This strategy trades read length for accuracy, which can be effective in some cases (targeted re-sequencing, small genomes) but is not necessary if one can achieve some redundancy in the sequencing data (8x is recommended). With this redundancy, it is preferable to benefit from the improved mapping of longer inserts than opt for circular consensus reads, because the longer reads will be able to span more repeats and high accuracy will still be achieved from their consensus.
Conclusions
The considerations above make a strong case for combining the more traditional, sequence-dense data from other technologies with at least moderate coverage of SMRT data so that genomes can be improved, their methylation patterns obtained, and the functional activity of their methyltransferase genes deduced. We would especially urge all groups currently sequencing bacterial genomes to adopt this policy. That said, SMRT sequencing has also substantially improved eukaryotic genome assemblies, and we expect it to become more widely applied in this context over time, in light of the greater read lengths and throughput of the PacBio RS II instrument.
Perhaps it would even be worth redoing many genomes so that existing shotgun dataset-based assemblies could be closed and their complete methylomes obtained. The resultant assembled (epi)genomes would be inherently more valuable: the usefulness of a closed genome with associated functional annotation of its methyltransferase genes is far greater than the uncertainties left with a shotgun data set. Whereas we currently know much about the importance of epigenetic phenomena for higher eukaryotes, very little is known about the epigenetics of bacteria and the lower eukaryotes. SMRT sequencing opens a new window that may have a dramatic effect on our understanding of this biology.
http://genomebiology.com/2013/14/6/405
Development of SureSelect Target Capture Methods for Sequencing on the PacBio RS
1 Jul 2013
Pacific Biosciences’ PacBio RS is a powerful new technology for single molecule, real-time sequencing, featuring long reads, fast turnaround times, and capabilities for both de novo sequencing and resequencing applications. In this poster read how the efficacy and the advantages of the Agilent SureSelect Target Enrichment System was assessed for targeted resequencing on the PacBio RS.
Agilent Technologies
http://www.selectscience.net/application-articles/development-of-sureselect-target-capture-methods-for-sequencing-on-the-pacbio-rs/?artID=29140
Comparison of Sequencing Instruments (Wish they Included PacBio)
June 30th, 2013. Here is a good paper comparing various sequencing technologies and options. Wish they had included PacBio. They have an interesting chart comparing k-mer frequencies from various technologies. That method would not work with PacBio, however, because very few 31-mers stay intact at PacBio's roughly 15% indel error rate (about 85% per-base accuracy).
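The arithmetic behind that claim, as a quick sanity check:

# Expected fraction of error-free 31-mers at a given per-base accuracy.
for accuracy in (0.85, 0.99, 0.999):
    print("accuracy %.3f -> %.4f of 31-mers intact" % (accuracy, accuracy ** 31))
# At 85% accuracy only ~0.65% of raw-read 31-mers survive intact, so k-mer
# frequency comparisons are uninformative for uncorrected PacBio reads.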
Abstract
The fully annotated genome sequence of the European strain, 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publicly available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high-error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology, however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori.
The paper also mentioned observing different GC-content from different technologies.
Unique 31-mers in the output sequence data are expected to be errors given there is adequate depth to cover each genome more than 50 times. To determine if these mapped unique 31-mers were similar in GC content to highly covered regions, we compared their GC content to 10^5 randomly chosen, highly frequent 31-mers (occurring at a frequency between 20 and 50). This analysis may provide insights as to the reason for the low coverage. We found significant differences in GC content across all sequencing technologies and library preparation methods (Figure S4 and Table 2). The GC content of the Ion Torrent mapped unique reads was significantly higher for both genomes, suggesting a technical reason for the lack of coverage. Interestingly, the GC content of unique 31-mers that map to the reference genomes derived by the Illumina library preparation methods (Nextera and Nextera XT) identified conflicting results. These comments are from this article! http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0067539?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+plosone%2FPLoSONE+%28PLOS+ONE+Alerts%3A+New+Articles%29&utm_content=Google+Reader
http://www.homolog.us/blogs/chem/2013/06/30/choosing-a-benchtop-sequencing-machine-to-characterise-helicobacter-pylori-genomes/
(Tuesday, June 18, 2013) Back from SFAF, and Eager for More Finished Genomes
Last month’s Sequencing, Finishing, Analysis in the Future (SFAF) meeting in Santa Fe, New Mexico, hosted by Los Alamos National Laboratory, attracted terrific scientists and we really enjoyed hearing about their work as well as sharing our own technology advances. It was great to be at a meeting where genome finishing and analysis were key themes; it was an environment where our customers’ experience with HGAP and Quiver resonated, particularly around the automated finishing of microbial genomes.
SFAF had a number of keynote speakers, including Mark Adams from the J. Craig Venter Institute, who spoke about antibiotic resistance in microbes. He noted that lateral transfer of multidrug resistance genes is creating new drug-resistant pathogens in our hospitals. A key theme in his talk was the need for comprehensive information about the genomes of these drug-resistant microbes, including the difficult-to-assemble regions such as duplications, repetitive sequence, plasmids, and so on. He said that standard strain typing does not provide enough information to distinguish the particular form of resistance between bugs. For example, plasmids and phages play a critical role in horizontal gene transfer of drug resistance genes, yet these elements are notoriously difficult to assemble with short-read NGS methods. Adams commented that PacBio’s HGAP assemblies provide both finished genomes and plasmids, which offer important clues about drug resistance mechanisms and microbial adaptation.
Throughout the conference, many speakers mentioned the challenge of reference-based sequencing when there are errors in the reference, or when the reference is not a good enough representation of the genome being sequenced and compared to it. It was apparent that the trend is shifting back to de novo sequencing, which provides more information about the organism under investigation and is more likely to pick up unexpected differences that are not included in a reference genome. This idea is especially compelling in the microbial world, where pan-genomes and horizontal gene transfer have turned the whole idea of a single reference genome on its head.
Finishing genomes — how to generate them and how much they cost — was another common thread. It was impressive to see how many conference attendees were using the HGAP/Quiver method to finish all sorts of microbial genomes. Ken Dewar from McGill University gave a presentation based on a simple question: Can we sequence one full bacterium in one day for less than $1,000? We were thrilled to see that PacBio’s technology is meeting his goal. In another talk, Adam Phillippy from the National Biodefense Analysis and Countermeasures Center said that with PacBio® technology, it’s now cheaper to finish a microbial genome than it is to publish one. Phillippy noted that once reads exceed 7 kb for microbial genomes, assembly complexity is drastically reduced; using the XL chemistry, 40 percent of their PacBio reads have been at least that long. He and other speakers noted that they have been getting Q60 scores from their PacBio data with the latest hardware and software upgrades.
Most of all, we were inspired by so much compelling science and the community’s renewed commitment to finished genomes. Many attendees talked about their motivation to go back to their labs and try out these new methods for finishing microbial genomes, especially since the whole process can now be automated. http://blog.pacificbiosciences.com/2013/06/back-from-sfaf-and-eager-for-more.html
Longing for the longest reads: PacBio and BluePippin
Posted on June 19, 2013 by flxlex
PacBio sequencing is all about looong reads, especially in relation to de novo sequencing. A few things are needed to get the longest reads possible:
•the longer the enzyme is active on the template, the longer the raw read will be – PacBio calls these ‘polymerase reads’
•a library for PacBio sequencing consists of circular molecules, with the target insert between two hairpin adaptors, allowing the enzyme to ‘go around’ and sequence the opposite strand once it reaches the end of the insert (see my previous post on this here). It then follows that the longer the template used for library preparation, the smaller the chance the polymerase goes around the hairpin, leading to longer uniquely sequenced template – PacBio calls these ‘reads of insert’ – and these represent the most useful reads for de novo sequencing applications
•finally, the size distribution of the library has an influence: any high-throughput sequencing technology, as well as PCR, has problems with ‘preferential treatment’ of smaller fragments. With PacBio sequencing, shorter molecules tend to load preferentially into the wells of the SMRTCell (‘chip’). It then makes sense to try to reduce the shoulder of shorter fragments in the final library preparation
Recently, PacBio and Sage Science announced a co-marketing partnership for the BluePippin. This instrument allows for tight size selection of DNA samples, effectively making the peak of the size distribution much more narrow. With regard to PacBio sequencing, a narrow peak lessens the problem of preferential loading of short fragments, leading to much longer ‘reads of insert’. A demonstration can be seen on this poster (I think I know which fish they used for that one plot…).
It’s nice when a company demonstrates a new improvement to their technology. But the proof is always in the pudding; in this case: can the sequencing centres out there actually show the same results? Based on recent tweets from the PacBio USA user group meeting, one would believe so:
The Norwegian Sequencing Centre, with which I am affiliated, recently bought a BluePippin instrument for the reasons outlined above. Our first test was on a good, 10 kb insert PacBio (SMRTBell) library that we already had used for sequencing. The (excellent!) lab team ran a (large) sample of the library on the BluePippin instrument and showed a significant reduction of the small fragments:
(see link for chart) Note that we didn’t try to remove the larger fragments (which you would probably do for mate-pair libraries) because, in the case of PacBio library preparation, length is more important than a tight distribution per se. The next question was what effect such a cleanup has on sequencing. We ran both libraries on our PacBio instrument with C2XL chemistry and 120-minute movies. First, some numbers. Please note that the regular-library data was obtained from our PacBio RS before maintenance and the upgrade to RSII (a doubling of the laser capacity), and this may have had a slight negative effect on the read lengths. There were also more reads included for the ‘before’ data, and these were analysed with version 1.4 of the PacBio software versus 2.0 for the BluePippin library.
Library              # of reads   Ave. length   Longest-subread N50*   Longest
Regular library      693,275      3,009 bp      4,041 bp               22,298 bp
BluePippin library   325,407      6,045 bp      8,820 bp               25,931 bp
*N50 is a metric often used for genome assemblies, and here translates to ‘the length at which half the bases of the entire dataset are in reads of at least length N50’.
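For the curious, N50 takes only a few lines to compute; a minimal sketch:

def n50(lengths):
    # N50: the length L such that reads of length >= L together contain
    # at least half of all bases in the dataset.
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

print(n50([2000, 2000, 2000, 6000]))   # 6000: the longest read alone holds half the bases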
The following graph shows the distribution of the longest subreads (or reads of insert) for each raw read produced, for the library before and after BluePippin cleanup.
FOR MORE GO TO THIS LINK http://flxlexblog.wordpress.com/2013/06/19/longing-for-the-longest-reads-pacbio-and-bluepippin/
Posted on June 17, 2013 by ehine

Our PacBio throughput and read lengths have been improving steadily over the past year and may have just taken yet another big step forward. We upgraded our PacBio sequencer to RSII in mid-May and we are seeing significant increases in per-cell yield and improved read lengths with our longer libraries. The most notable change in the upgrade from RSI to RSII is the doubling of the number of simultaneously observable sequencing reactions on the SMRTcell, allowing throughput to be effectively doubled as well. Let’s take a look at some examples:
http://www.igs.umaryland.edu/labs/grc/2013/06/17/pacbio-rsii-producing-encouraging-early-results/
(Friday, June 14, 2013) Two Worlds of Genome Assemblers
by Jonas Korlach, Chief Scientific Officer
Finished genomes were the focus of last month's Sequencing, Finishing, Analysis in the Future (SFAF) meeting in Santa Fe, New Mexico. In addition to several presentations, including a talk by Adam Phillippy from the National Biodefense Analysis and Countermeasures Center that demonstrated the ability to generate high-quality, finished microbial genomes using just long-read PacBio data, several papers have appeared recently describing the same principle: the HGAP/Quiver Nature Methods paper, the FDA’s Salmonella Javiana outbreak genome publication, a blog entry by the University of Maryland using HGAP, and a preprint by Adam Phillippy and colleagues describing a similar genome assembly strategy and results.
These presentations and papers highlight the fact that SMRT® Sequencing, in conjunction with the appropriate bioinformatics tools, achieves highly accurate genomes, exceeding 99.999% accuracy, despite a higher single-pass error rate. This is possible because final genome assemblies build sequence through consensus(1); as the errors in SMRT Sequencing are random, very high consensus accuracy can be achieved.
Long reads and consensus are also at the heart of the genome assemblers appropriate for our type of sequencing reads. These overlap-based assemblers, such as Celera® Assembler or MIRA — originally developed during the era of Sanger sequencing — are robust to errors. The long reads provide ample information about which reads belong together in pair-wise alignments, thereby connecting them properly for a correct genome assembly. In contrast, short-read technologies have largely relied on de Bruijn graph-based assemblers: short reads are fragmented further into k-mers, from which a k-mer graph is constructed and the assembly is derived.(2) As such, de Bruijn graph assemblers are very sensitive to single-read errors, which is why there has been a focus on single-pass sequence read accuracy in recent years.(3) Overlap and de Bruijn assemblers therefore differ fundamentally in their approach, highlighting the fact that the right bioinformatic tools need to be applied together with different types of sequencing data, and different parameters need to be evaluated for their performance.
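To see the de Bruijn side of that contrast in miniature, here is a toy sketch (my own illustration, not either assembler's real code): each read is shredded into k-mers, nodes are (k-1)-mers, and a single base error in one read spawns spurious nodes and a false branch.

from collections import defaultdict

def de_bruijn(reads, k):
    # Map each (k-1)-mer to the set of (k-1)-mers that follow it.
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

clean = ["ACGTACGGT", "CGTACGGTA"]
g_clean = de_bruijn(clean, 4)
g_error = de_bruijn(clean + ["ACGTACCGT"], 4)   # same region, one base error
print(len(g_clean), "nodes without the error,", len(g_error), "with it")

An overlap-based assembler comparing those same reads end-to-end would still find one long, high-scoring alignment; the single mismatch barely dents the overlap score.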
We are excited about the application of these new assembly strategies to large numbers of microbial genomes (e.g., in the context of the 100K Foodborne Pathogen Genome Project) to close the large gap that currently exists between draft genomes and finished genomes in GenBank. Finished microbial genomes are the foundation for functional genomics studies, comparative genomics, forensics, microbial outbreak source identification, and phylogenetic analysis, and are thereby crucial for understanding microbes and advancing the field of microbiology (4). Sequencing microbial genomes de novo, i.e. without the need for a pre-existing reference genome, is important to capture novel elements, such as plasmids or phages. These are sometimes referred to as the accessory genome (5), and can make the crucial difference between a commensal, harmless bacterium and a serious, perhaps drug-resistant pathogen.
References
1. S. Junemann, F. J. Sedlazeck, K. Prior et al., Nat Biotechnol 31 (4), 294 (2013).
2. J. R. Miller, S. Koren, and G. Sutton, Genomics 95 (6), 315 (2010).
3. N. J. Loman, C. Constantinidou, J. Z. Chan et al., Nat Rev Microbiol 10 (9), 599 (2012).
4. C. M. Fraser, J. A. Eisen, K. E. Nelson et al., J Bacteriol 184 (23), 6403 (2002).
5. D. Croll and B. A. McDonald, PLoS Pathog 8 (4), e1002608 (2012).
http://blog.pacificbiosciences.com/2013/06/two-worlds-of-genome-assemblers.html
Next-Generation-Sequencing Update
Saturday, Jun. 15, 2013
http://paper.li/mtwolfinger/1342906758
Complete microbial genomes using only PacBio data? Testing HGAP…
Posted on April 4, 2013 by ehine
We've spent some time recently testing a new way to assemble PacBio data called HGAP, which stands for "hierarchical genome assembly process". Unlike previous assemblers of PacBio data that have relied on either Illumina and/or PacBio CCS reads for error correction of PacBio long reads, HGAP uses multiple alignments of all reads to perform the corrections, potentially eliminating the need for other libraries and data types. The corrected reads are assembled with an overlap-layout-consensus assembler (in this case Celera Assembler) to form contigs. More details about HGAP can be found here: https://github.com/PacificBiosciences/DevNet/wiki/Hierarchical-Genome-Assembly-Process-%28HGAP%29
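As a rough illustration of the correction principle (a toy Python sketch of mine, not PacBio's actual HGAP code, which uses BLASR alignments and a proper consensus model): the longest reads act as seeds, and shorter reads aligned to them are collapsed column by column.

    from collections import Counter

    def correct_seed(seed, aligned_reads):
        # aligned_reads are shorter reads already aligned to the seed,
        # padded with '-' where they do not cover a position.
        corrected = []
        for i, seed_base in enumerate(seed):
            votes = Counter(r[i] for r in aligned_reads
                            if i < len(r) and r[i] != '-')
            votes[seed_base] += 1  # the seed itself casts one vote
            corrected.append(votes.most_common(1)[0][0])
        return "".join(corrected)

    seed = "ACGTTGCA"  # long read carrying one error (pos 3 should be A)
    pile = ["ACGATGCA", "ACGATG--", "--GATGCA"]
    print(correct_seed(seed, pile))  # -> ACGATGCA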
We have evaluated HGAP on several of our projects and compared it to our assembly of Illumina-corrected PacBio reads assembled with Celera Assembler. So far, the results have been very encouraging and we have seen significant improvement in many cases. The chart (not reproduced here; see the link below) shows several examples. (This article was posted by the University of Maryland School of Medicine Institute for Genome Sciences.)
So the assemblies are more contiguous, but are the corrections good enough to generate accurate consensus sequence? In an attempt to verify the consensus accuracy of these HGAP assemblies for several Bordetella genomes, we aligned >240x coverage of 250bp Illumina MiSeq data to the HGAP-generated contigs and looked for discrepancies and SNPs using GATK. We found no cases of high-quality, passed-filter variants, which supports a highly accurate consensus sequence generated by the HGAP assembly. We continue to test and compare HGAP with other PacBio assembly methods but are encouraged by initial results.
http://www.igs.umaryland.edu/labs/grc/2013/04/04/complete-microbial-genomes-using-only-pacbio-data-testing-hgap/
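The authors used GATK for that check; as a hypothetical alternative sketch of the same idea (not the UMD pipeline; file names are placeholders), one could scan the Illumina pile-up over the HGAP contigs with pysam and count positions where the majority base disagrees with the assembly:

    import pysam  # assumes reads are already mapped (e.g. with BWA), sorted and indexed

    bam = pysam.AlignmentFile("miseq_vs_hgap_contigs.bam", "rb")  # placeholder path
    ref = pysam.FastaFile("hgap_contigs.fasta")                   # placeholder path

    disagreements = 0
    for col in bam.pileup(min_base_quality=20):
        ref_base = ref.fetch(col.reference_name, col.reference_pos,
                             col.reference_pos + 1).upper()
        bases = [b.upper() for b in col.get_query_sequences() if b]
        if len(bases) >= 10:  # require reasonable depth before judging
            majority = max(set(bases), key=bases.count)
            if majority != ref_base:
                disagreements += 1
    print("positions where the Illumina pile disagrees:", disagreements)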
Correction: PACB is not presenting; they are a sponsor. Sorry.
(Earlier note: PACB will be presenting 6/14/13.) New exhibition makes genome accessible to public
Unique NIH-Smithsonian collaboration unlocks the present and future of genome science
The Smithsonian Institution’s first state-of-the-art exhibition about genome science, Genome: Unlocking Life’s Code, opens Friday, June 14, 2013, at the National Museum of Natural History in partnership with the National Human Genome Research Institute (NHGRI), a part of the National Institutes of Health.
http://www.nih.gov/news/health/jun2013/nhgri-13.htm
Presenting Sponsors
Bio-Rad Laboratories, Inc.
The Brin Wojcicki Foundation
Celgene Corporation
History™
Pacific Biosciences and Mike & Beth Hunkapiller
http://unlockinglifescode.org/about/sponsors
Long read, but very good info!
For example, the fraction of the GC-poor P. falciparum genome that had relative coverage ≤ 0.25 (i.e., four-fold undercovered or worse) ranged from 0.33% in Pacific Biosciences data (PACB BEST) to 3.7% in Illumina data to 22% in Ion Torrent data (worst). In the GC-rich R. sphaeroides genome, the four-fold undercoverage fractions were 0.0071% for Pacific Biosciences (PACB BEST), 0.39% for Illumina, and 36% for Ion Torrent (worst). The better performance of Pacific Biosciences is probably attributable to the lack of any amplification in their process (cf. [20, 21]).
Michael G Ross, Carsten Russ, Maura Costello, Andrew Hollinger, Niall J Lennon, Ryan Hegarty, Chad Nusbaum, and David B Jaffe. Characterizing and measuring bias in sequence data. Genome Biology 2013, 14:R51. doi:10.1186/gb-2013-14-5-r51. Published 29 May 2013.
Article URL: http://genomebiology.com/2013/14/5/R51 (for the whole article, click the link and then the provisional PDF)
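The undercoverage metric quoted above is easy to reproduce from a per-base coverage track; a minimal Python sketch with toy numbers (not data from the paper):

    def undercovered_fraction(coverage, factor=4):
        # Fraction of positions with relative coverage <= 1/factor,
        # i.e. <= 0.25 of the mean for factor=4, as in the paper.
        mean = sum(coverage) / len(coverage)
        return sum(c <= mean / factor for c in coverage) / len(coverage)

    # Toy per-base coverage with a dip (e.g. a GC-extreme stretch):
    print(undercovered_fraction([50, 48, 52, 3, 2, 49, 51, 50, 47, 48]))  # -> 0.2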
Era7 Bioinformatics publishes a new bacterial genome: Illumina sequencing data plus a little PacBio data produces much better results than Illumina alone
07/06/2013
We have published the first complete genome sequence of a Staphylococcus aureus strain assigned to clonal complex 12. The strain was isolated in a food poisoning outbreak due to contaminated potato salad in Switzerland in 2009, and it produces staphylococcal enterotoxin B.
The genome of the Staphylococcus aureus KLT6 strain has been sequenced with Illumina and PacBio technologies. The bacterium sequenced is a Staphylococcus aureus from an outbreak in Switzerland. The outbreak was traced back to a staphylococcal enterotoxin B (SEB)-producing S. aureus strain that was designated strain KLT6 and was assigned to CC12 and spa type t160.
The assembly and scaffolding were done with Era7 Bioinformatics' AG7 approach and the annotation with Era7 Bioinformatics' bacterial annotation system BG7. This article shows how, by adding a little PacBio data to an Illumina bacterial genome project, much better results can be obtained. The results have been checked by means of optical mapping.
This work is a new success for Era7 Bioinformatics' methods for bacterial genomics projects (link). In addition, it also constitutes a new successful step in our project NEXTMICRO, on NGS and outbreak management.
http://era7bioinformatics.com/en/news.cfm?id=15&title=era7-bioinformatics-publishes-a-new-bacterial-genome:-illumina-sequencing-data-plus-a-little-of-pacbio-data-produces-much-better-results-than-only-llumina#.UbH5C0D0E1M
Long, but a good read on PACB.
1) CCS for sequencing full-length 16S rRNA
Before the days of Next Generation Sequencing, people used to amplify 16S regions of samples and make bacterial clone libraries, sequencing a set of clones using Sanger sequencing. NGS allowed for many more reads per sample, but restricted the length of the sequenced part to a few hundred bases at best. NGS therefore yielded much deeper sequenced datasets, but at a lower discriminatory power (less phylogenetic/taxonomic signal).
The long CCS reads now possible with PacBio imply that one could consider going back to full-length 16S sequencing. The throughput (40 000 reads per SMRTCell) will be much higher than doable using Sanger sequencing, and the quality potentially even better than Sanger. However, the price per read will be significantly higher than using short-read technology. I think that for certain diversity studies, using PacBio CCS with full-length 16S amplicons will be very beneficial.
2) CCS for shotgun metagenomics
Similarly, people interested in whole-sample shotgun metagenomics (as opposed to PCR-based diversity studies) could consider using the 1-2 kbp CCS reads. The long reads could yield much more useful information for gene mining, for example. Some may suggest that using the Roche/454 GS FLX+, which now seems to be working (at least in our lab it does), will yield many more reads (1 to 1.2 million) around that length (we have seen 1 kb mode read lengths with the GS FLX+!), making that technology more cost-effective. Some back-of-the-envelope calculations using prices for the Norwegian Sequencing Centre show that the comparison actually is in favour of PacBio. Given one library preparation and 1 million required reads, the 454 is currently out-priced by PacBio, while the latter has the potential to give better quality (no homopolymer errors) and longer reads. However, first, generating 1 million CCS reads (25 SMRTCells) takes more time (several days, not counting library preparation) than a full 454 run (about a day). [Note that lab teams will like not having to do emulsion PCR for PacBio CCS!] Secondly, the pricing situation may be different for other centres (pricing structures, I think, really differ from centre to centre).
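A Python sketch of the kind of back-of-the-envelope comparison described above. Only the ~40,000 reads per SMRTCell and ~1 million reads per GS FLX+ run come from the post; the per-unit prices are placeholders I invented, so substitute your own centre's rates:

    READS_PER_SMRTCELL = 40_000      # from the post
    READS_PER_454_RUN = 1_000_000    # ~1 to 1.2 million quoted for GS FLX+

    PRICE_PER_SMRTCELL = 300.0       # placeholder price, not a real quote
    PRICE_PER_454_RUN = 9_000.0      # placeholder price, not a real quote

    needed = 1_000_000
    cells = -(-needed // READS_PER_SMRTCELL)   # ceiling division -> 25 cells
    runs = -(-needed // READS_PER_454_RUN)     # -> 1 run

    print(f"PacBio CCS: {cells} SMRTCells, ~${cells * PRICE_PER_SMRTCELL:,.0f}")
    print(f"454 FLX+:   {runs} run(s),    ~${runs * PRICE_PER_454_RUN:,.0f}")

With these made-up prices PacBio comes out ahead, matching the author's conclusion for the Norwegian Sequencing Centre; as the post notes, though, the answer can easily flip under different local pricing.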
3) CCS to replace Sanger capillary plate sequencing
Sanger sequencing is far from dead. Its main attraction is that it allows very small sample sizes (even a single sample can be submitted to a facility) and long reads with high quality. A key difference between Sanger and NGS is the fact that each read can be traced back to a well on a plate. I think that, if a suitable and cost-effective barcoding scheme could be designed for multiplexed PacBio CCS sequencing, PacBio CCS could potentially replace Sanger plate sequencing. To keep the costs down, one would need to massively multiplex, perhaps dozens of 96-well plates with fragments that need to be tracked back to their original plate and well, but a laboratory with good automation experience might pull it off. At the same time, the scheme requires a steady flow of Sanger samples; it wouldn't work for a facility that sequences less than, say, a dozen plates per week. Commercial providers may actually already be considering this switch. The benefits could be longer reads than one can get with Sanger, with higher per-base qualities.
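On the scale of multiplexing involved: a dual (plate, well) barcoding scheme keeps the number of distinct barcode sequences manageable even across dozens of plates. A toy Python calculation of mine (the plate count is arbitrary; the post does not specify a scheme):

    plates, wells = 24, 96
    samples = plates * wells
    dual_barcodes = plates + wells   # one plate barcode + one well barcode per sample
    naive_barcodes = samples         # a unique barcode per sample
    print(f"{samples} samples need {dual_barcodes} barcodes (dual) "
          f"vs {naive_barcodes} (one-per-sample)")
    # -> 2304 samples need 120 barcodes (dual) vs 2304 (one-per-sample)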
[Technical note: the per-SMRTCell throughput of 40,000 reads may allow for adding the same template multiple times, increasing the final consensus accuracy. The fraction of raw reads too short to give a consensus call (see above) may actually still contribute to quality, as they are barcoded.]
In summary: PacBio CCS may be an alternative to short-read sequencing, or even Sanger capillary sequencing. However, there will be a trade-off between information content and price per read.
For a technical, but very readable paper describing CCS (from the PacBio researchers themselves), see this paper in Nucleic Acids Research.
http://flxlexblog.wordpress.com/
New Assembly Method Published for Rapid and Automated Genome Sequencing Using Long-Read, Single Molecule, Real-Time (SMRT(R)) Sequencing
MENLO PARK, Calif., May 6, 2013 (GLOBE NEWSWIRE) -- Researchers from Pacific Biosciences of California, Inc., (Nasdaq:PACB), the U.S. Joint Genome Institute and the University of Washington have published a new method for assembling high-quality genomes from Single Molecule, Real-Time (SMRT®) DNA sequencing. Published in the May 5 edition of Nature Methods, the paper by Chin et al. describes the hierarchical genome assembly process (HGAP) and demonstrates the method for efficient, automated de novo assembly from genomic DNA to a finished genome sequence for several microorganisms and a human bacterial artificial chromosome (BAC) clone. As part of the paper, the authors also describe a new consensus algorithm, Quiver, that achieves highly accurate de novo genome sequence results exceeding 99.999% (QV 50) accuracy.
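"QV 50" is the Phred-scaled form of that accuracy figure; a quick Python check of the conversion:

    from math import log10

    def qv(accuracy):
        # Phred quality value: QV = -10 * log10(error probability)
        return -10 * log10(1 - accuracy)

    print(round(qv(0.99999)))    # -> 50, the QV 50 quoted for Quiver
    print(round(qv(0.9999965)))  # -> 55 (an accuracy figure quoted later in this thread)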
Finished genomes are crucial for understanding microbes and advancing the field of microbiology. Previous attempts at obtaining the complete genome sequence of microbes in an automated, high-throughput manner have challenged researchers. For example, with second-generation sequencing methods, short read lengths inhibit the ability to resolve long repeats, resulting in unfinished, fragmented draft assemblies. Further, extreme sequence contexts, such as GC- or AT-rich regions, or palindromic sequences, lead to gaps in draft genome assemblies that cannot be covered using these second-generation methods. As a result, Sanger sequencing has typically been employed for finishing microbial genomes, but due to its laborious and low-throughput nature this process is slow and expensive.
More recently, hybrid-assembly approaches have been described in which long PacBio reads were used in combination with short-read data. Building on these advances, in this new paper the authors utilize just a single, long-insert shotgun DNA library in conjunction with SMRT Sequencing, thereby removing the need for the additional sample preparation and sequencing data sets required by previously described hybrid strategies. A paper describing a similar strategy and assembly results by S. Koren, A. Phillippy, and colleagues from the National Biodefense Analysis and Countermeasures Center, Frederick, MD, and the U.S. Department of Agriculture has been deposited in a pre-print archive.
"This approach can close the large gap that currently exists between 'draft' and high-quality 'finished' genomes," said Jonas Korlach, senior author on the paper and Chief Scientific Officer at Pacific Biosciences. "Further, the ability to automatically and cost-effectively assemble genomes independent of the availability of a reference sequence can be critical in the rapid characterization of new pathogen strains."
Evan Eichler, co-author on the paper, Howard Hughes Medical Investigator and Professor at the Department of Genome Sciences at the University of Washington, said, "I am excited by the ability of SMRT DNA Sequencing and HGAP for finishing complex regions of the human genome. This approach has demonstrated the potential to cost-effectively generate high-quality finished sequence from large-insert clones of these regions, such as BACs. Short-read sequencing technologies simply cannot adequately access and assemble through these complex regions of genomes."
Pacific Biosciences recently launched the PacBio® RS II – a new SMRT Sequencing system that provides the industry's highest consensus accuracy and longest reads with double the throughput from the previous version of the system. The PacBio RS II allows scientists to rapidly and cost-effectively generate finished genome assemblies, reveal and understand epigenomes, and characterize genomic variation. The PacBio RS II system, including consumables and software, provides a simple, fast, end-to-end sequencing workflow for applications such as infectious disease and microbiology, agriculture, and understanding rare diseases.
http://www.globenewswire.com/news-release/2013/05/06/544566/10031482/en/New-Assembly-Method-Published-for-Rapid-and-Automated-Genome-Sequencing-Using-Long-Read-Single-Molecule-Real-Time-SMRT-R-Sequencing.html
Older article, but a good read:
A coming of age for PacBio and long read sequencing #AGBT13
Saturday, February 23, 2013
Aside from the dubstep pumping out of the Roche and Agilent booths, the volume of AGBT has been somewhat muted. There was no grand offering of new hardware or over-the-top promises of sequencing genomes on what now appear to be vaporware USB thumb drives. This is my first in-person experience of AGBT, so as a virgin it seems for the most part to be rooted in the science despite the ridiculous parties and "showgirl" casino nights. The atmosphere here is unlike any other science conference I've attended. It's like the bastard child of a Gordon Conference and a Las Vegas Porn Convention. I really hope that the deep pockets of Sequencing Centers are more influenced by the science than the free dinner parties and alcohol, but I have pretty low confidence in humanity. Regardless, I think everyone in attendance today was overwhelmed by a stunning talk from PacBio and the dramatic advancements of their long read technology.
The PacBio talk came on the heels of what felt like a warm-up opening act from Jeremy Schmutz of the Hudson Alpha Institute. Schmutz has been working with a start-up recently acquired by Illumina called Moleculo, which promises highly accurate long reads using short read sequencers. I sat through two other talks, went to the Moleculo poster, and still do not have a clear idea of how the technology actually works. From what I could deduce from the talks and the questions I asked at the poster session, Moleculo technology works by fragmenting DNA into 10-kilobase fragments. These fragments are then diluted and clonally amplified in separate volumes. How Moleculo does this clonal amplification of what have to be tens of thousands of small-volume PCR reactions is beyond me. When I asked this question at the poster session, Dmitry Pushkarev of Moleculo said that it was simple: they have a $4 piece of plastic about the size of a shoe box that does all of the hard work. Right. So the first step involves magic. Once the 10kb fragments are clonally amplified, they are again fragmented into even smaller pieces and Nextera-tagged. These Nextera-tagged sequences are then sequenced on a HiSeq and the data output is sent back to Moleculo, where they do the de-multiplexing and sequence assembly. As before, when asked how the Moleculo assembler can accurately span long repeat regions greater than the 100bp short read output, Pushkarev offered, "We wrote the best assembler ever." Ok, I get it, you guys are going to be coy about how everything actually works. From an early business perspective, I guess that's a great idea, but some of us like to know the details of how techniques work so that we can do our own validation and be positive that the data we're getting out is representative of what has actually been sequenced. The sample-to-black-box-to-data pipeline is not ideal to me, but others, like Schmutz, seem to be ok with this for now. Hopefully, as the Moleculo technology matures, we get more information from them about how the technique actually works.
Fortunately, Schmutz was at least able to show us some of his downstream data using Moleculo and also how it compared to other long and longer read sequencers such as Roche 454 and PacBio. Schmutz works mostly with plants, and plants are tricky organisms to sequence because they're full of repetitive information that cannot be accurately sequenced using the current short read technologies. Schmutz showed that Moleculo reads appear to be highly accurate, with an error rate of 1.2bp per 10kb of sequence; however, the Moleculo reads created a similar number of sequence gaps as shorter 800bp 454 reads, and these regions had to be fixed with secondary MiSeq runs. In contrast, Schmutz showed that PacBio RS reads provided the least "gappy" data, which is important from a cost perspective. It seemed like Schmutz's final conclusion was that he liked the Moleculo technology the best because they did all of the hard work: they made the libraries and performed the read assembly, so he didn't have to train staff on new techniques or machines.
After Schmutz had thoroughly fluffed the crowd over long read sequencing, Jonas Korlach from Pacific Biosciences took the stage to give a talk on the progress of assembling genomes using their long read sequencing technology. If you've been in the sequencing game for any amount of time, you've definitely heard of PacBio and their early promise of accurate, real-time, single-molecule, long-read sequencing technology and their subsequent absolute failure in rapidly delivering those promises. PacBio has spent the last 2 years trying to dig itself out of a hole by aggressively working with early adopters to fix their original lapses in quality. In an earlier talk in the week (Slides!), Michael Schatz, another plant geneticist and PacBio early adopter, declared, "You don't have to worry about the errors anymore." This is in part because of major improvements in PacBio chemistry and the introduction of a new, more accurate polymerase. PacBio has consistently doubled its average read length over the last two years and has made gains on the error correction front. Last year, Schatz published a paper in Nature Biotechnology which showed that highly accurate Illumina or 454 short reads could be used to error correct PacBio long reads to generate the most accurate long read data available today. Korlach followed up on this short read correction scheme by showing that the Illumina/454 step can now be eliminated completely and researchers can use the short reads generated during the PacBio run to error correct the much longer reads in a process called the Hierarchical Genome Assembly Process (HGAP), with a base accuracy of 99.99965%. That's a far cry from the early real-world production numbers of 85%! Korlach then supported these tech numbers with an astounding amount of real world data. Granted, all of the de novo sequencing presented was done in pathogenic organisms or bacteria with small circular genomes, but no one can dispute how impressive the data looked during the presentation.
The big question now is: has PacBio finally weathered the storm, and can it overcome its previous reputation as a failed sequencing company? Only time will tell. It's hard to predict winners and losers in this space, especially in light of the coming sequestration and dwindling research budgets. PacBio may be peaking at the wrong time. With the threat of Moleculo long read technology on the horizon, major sequencing labs may hold out on purchasing PacBio RS systems. Why invest $700,000 in a sequencer if you can get "good enough" long reads off of your HiSeqs?
http://www.labspaces.net/blog/1618/A_coming_of_age_for_PacBio_and_long_read_sequencing___AGBT___
MIT Technology Review (a good read): A Cancer Gene Therapy Activated by a Pill
Patients can turn off an experimental treatment if side effects get too bad.
By Susan Young on March 18, 2013
Why It Matters: Revving up the immune system to attack cancer cells is an attractive area of oncology research, but such treatments can have dangerous side effects.
A unique new cancer treatment uses gene therapy to induce a cancer-fighting immune response whose intensity can then be controlled with a pill. The combination could help tailor treatment to a patient’s individual response.
The treatment uses the body’s own cells or tumor cells to produce extra copies of a naturally occurring hormone-like molecule called IL-12, which regulates anticancer immune responses. Last week, Ziopharm Oncology announced a clinical study of the treatment for patients with breast cancer. The company is already testing it in patients with melanoma.
Many researchers have explored techniques that rev up the natural response the body uses to detect and attack cancerous cells (see, for example, “Engineering Better Immune Cells” and “Priming the Body to Tackle Cancer”). But controlling the killer cells of the immune system can sometimes be a challenge, as researchers found in the 1990s when cancer patients who were given IL-12 in a clinical trial died from toxic side effects.
“IL-12 is a very potent [immune system regulator] and can generate a lot of side effects,” says Per Basse, a physician-scientist at the University of Pittsburgh School of Medicine, who studies immune cells and their ability to fight cancer. “As a clinician, I would like to be able to dial it up and down so that if it all starts to look not so good, you can stop the process,” he says.
To avoid the dangerous side of the molecule, Ziopharm’s system is designed to control IL-12 with a combined genetic and pharmaceutical switch. A virus is injected into the tumor to deliver the gene for IL-12. The gene starts out in “off mode,” so it doesn’t actually produce any IL-12. To activate the gene, a patient has to take a pill that delivers another molecule. The advantage is that any patient who starts to experience nasty side effects from the IL-12 can stop taking the pill. “If things go awry, you have an escape valve,” says Ziopharm’s CEO, Jonathan Lewis.
The key to the “inducible” system is a version of the receptor that controls molting in arthropods (insects, spiders, and crustaceans), modified so that it determines whether the IL-12 gene is on. The gene for that receptor, which is also delivered into the body by a virus, is always on, but its protein product and thus IL-12 expression is activated by the pill. Ziopharm licensed the control system from Intrexon for use in its oncology treatment.
“The inducibility is a great idea, but the trick is getting something that you can get into the tumor,” says Ralph Weichselbaum, a cancer researcher at the University of Chicago, who has worked on a cancer therapy induced by radiation. Currently, Ziopharm injects the gene-toting virus directly into patients’ tumors, but Lewis says the plan is to inject it into muscles in the future. “Muscle cells are extremely good protein production factories,” he says.
But even injecting the virus into a single tumor has an effect on other tumors—both in lab animals and in humans. In animal studies, the tumor that receives the injection will at first get bigger because immune cells are accumulating in response to the IL-12. “Then it will get smaller and go away,” says Lewis. Tumors that received no injection will do the same thing—grow, then shrink, and then disappear. “We are seeing similar things in people,” says Lewis.
Eventually, the system could be used to deliver multiple genetic treatments at once, says Lewis. “With one injection you could be able to control three or four [cancer-fighting] proteins in different ways.”
http://www.technologyreview.com/news/512216/a-cancer-gene-therapy-activated-by-a-pill/
In a very recent development, at the Barclays Global Healthcare Conference on March 13, ZIOP CEO Jonathan Lewis suggested that the PICASSO III trial's primary efficacy endpoint has been changed to just PFS (it used to be PFS followed by OS), and that the company and FDA have agreed that PFS would suffice for FULL approval. Here is what Lewis said:
Notable and new is the following -- and that is in the study, as of now, PFS is the primary endpoint and for protocol, its primary endpoint, full approval. OS is a secondary endpoint. This modification has been done, incorporating changes with dialogue with FDA.
He later reiterated:
Again, consistent with FDA policy as articulated by them, and again, consistent with the amendment, as done now, to this protocol, with PFS as a primary endpoint, for full approval.
If these statements accurately reflect FDA's stance on the issue, this particular bear argument is no longer valid.
Should the PICASSO III trial succeed, we value the company at 3 times expected peak revenues (for the STS indication). We estimate peak worldwide sales of palifosfamide for STS at roughly $330 million, which is in line with company estimates for US-only sales (see slide 18 in this presentation). While the company based its market sizing on prescription data from IntrinsicQ Data for first-line STS, we note that with 11,000 newly reported STS cases per year and 4,000 deaths due to the disease, these numbers may be an overestimate of US-only sales.
Based on our $330 million peak revenue estimate and a 3x revenue multiple (equivalent to peak revenues in year 5 discounted at 10%), we employ a positive-outcome market cap of $990 million.
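The arithmetic behind that market cap, spelled out in Python (numbers as stated in the article; the discount-factor line just restates the author's rationale for the 3x multiple):

    peak_revenue = 330e6   # estimated peak worldwide STS sales, USD
    multiple = 3           # revenue multiple the author applies
    print(f"positive-outcome market cap: ${peak_revenue * multiple / 1e6:,.0f}M")  # -> $990M

    # The author equates the 3x multiple with year-5 peak revenue discounted
    # at 10%; the 5-year discount factor at 10% is:
    print(round(1 / 1.1**5, 3))  # -> 0.621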
03/13/2013 from Barclays (I think this means the primary endpoints have been met?): https://twitter.com/ChimeraResearch/status/312233280848146435/photo/1
Amgen and Cytokinetics at an important cardiology meeting. The American College of Cardiology is coming to San Francisco for ACC.13, March 9 – 11, 2013. Focus on the transformation of cardiovascular care — from discovery to delivery. See how innovation and science impact the delivery of quality care and the prevention of cardiovascular disease in our patients. Learn more about the education we have planned for ACC.13 — including focused full-day sessions, MOC sessions to update your knowledge, and education across each of our 16 learning pathways.
We give you the tools you need to improve your quality of care. Plus, at ACC.13, you can personalize your experience — by specialty, by interest area, by role in the CV team, and more! Learn more about the ACC.13 eMeeting Planner App and Online Program Planner.
http://accscientificsession.cardiosource.org/ACC.aspx
Upcoming event! Date: 07.03.2013 - 10.03.2013 (U.S. format: 03/07/2013 - 03/10/2013)
Hoping OMEROS (OMER) will be there??
The International Symposium on Ocular Pharmacology and Therapeutics
Location: Paris, France
Type: Symposium
ISOPT will reconvene in Paris March 7 – 10 for our 10th symposium. Over the years, ISOPT has repeatedly reshaped its format to provide a collaborative atmosphere for clinicians, clinical trialists, the pharma industry and regulatory officers. Our vision has been to increase knowledge and awareness of drug usage in ophthalmology, reflecting innovations and their utilization in practice. In order to achieve this vision, ISOPT is set to implement the following missions:
Educate via updating and sharing current and evolving treatment algorithms for practitioners.
Facilitate innovation via panels and discussions of early and late “development related topics”.
http://www.ophthalmologyweb.com/1337-Events/38636-The-International-Symposium-on-Ocular-Pharmacology-and-Therapeutics/
Omeros Corporation's Stock Has Potential To Double This Year
February 18, 2013 | by: Biomed Group | about: OMER
Disclosure: I am long OMER. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.
We wanted to share some thoughts about Omeros (OMER), our top pick in the biotechnology industry, which has fallen from its long-term highs and is currently vastly undervalued. Moreover, Omeros has not yet been discovered by most investors, and has fundamental catalysts in place that are expected to drive its stock price higher. Omeros also has highly positive analyst ratings, with one-year price targets more than double the current price.
Let's take a look at the company's upcoming catalysts as follows:
NDA submission for OMS302 - OMS302, the company's lead drug candidate, is a surgical irrigation solution that aims to assist surgeons during lens replacement surgeries by maintaining pupil dilation, as well as reduce postoperative pain. Maintenance of pupil dilation is important for lens replacement surgery; a pupil that shrinks during surgery increases the amount of time necessary to complete the procedure and risks damage to structures in the eye.
In January, Omeros announced it had completed the 90-day safety database lock in the second positive Phase III trial of OMS302 in 416 patients undergoing intraocular lens replacement surgery (ILR), randomized 1:1 to OMS302 and placebo. The adverse event profile of OMS302 was similar to that observed in its previous clinical trials, and the incidence of adverse events was similar between the two treatment groups. No safety concerns have been identified during OMS302's clinical development.
According to analysts at Wedbush, if approved, OMS302 could become widely used, and the gross peak annual sales worldwide could reach over $600 million. With that said, we would like to point out that Omeros's market cap is only $144.2M. In addition, Maxim Group sees 95% probability of approval for this product, given the robust P-values associated with the primary and secondary endpoints in both pivotal trials.
Omeros plans to file an NDA for OMS302 in Q1 2013, followed by an MAA in mid-2013, and to launch OMS302 in the first half of 2014. We expect to see Omeros's shares rising, especially once the regulatory process for OMS302 begins with the NDA submission.
Completion of Phase I for OMS824 - OMS824 is the lead compound in Omeros's PDE10 program for the treatment of schizophrenia. In December last year, Omeros announced positive safety and pharmacokinetic data from the single-ascending-dose (SAD) portion of a Phase I trial for OMS824 in healthy subjects. Specifically, OMS824 was well tolerated, had linear pharmacokinetics, a long half-life consistent with once daily dosing, and good systemic exposure with the highest dose resulting in the expected pharmacological effects in healthy subjects.
OMS824 appears to have excellent pharmaceutical properties, and we look forward to completing this Phase I clinical trial and starting studies in schizophrenia patients.
- Gregory A. Demopulos, M.D., chairman and chief executive officer of Omeros.
OMS824 will complete its Phase I trial during this quarter and could potentially be tested in Phase II for schizophrenia. It is important to mention that schizophrenia drugs like Eli Lilly's (LLY) Zyprexa have achieved annual sales of over $1 billion, despite safety and efficacy limitations.
What do insiders think about Omeros?
In early January, 3 (out of 6) directors bought Omeros's shares. When you see insiders buying at depressed prices, it often means the shares are oversold and offer great long-term value.
Ray Aspiri purchased 40,000 shares on January 4 at $5.72 per share.
Arnold Hanish purchased 2,000 shares on January 3 at $5.58 per share.
Peter Demopulos purchased 9,345 shares on January 3 at $5.42 per share.
Currently, 41% of the company's outstanding shares are owned by insiders, institutions and mutual funds (approximately 5.4% by the CEO). It is a good indicator, since they have access to sophisticated research and have a great deal of information on the company.
What do analysts think about Omeros?
Of the 5 analysts we checked, the lowest price target stands at $13 (the highest at $25), which implies a potential upside of approximately 133% from the current closing price of $5.57 per share.
Financial position:
According to the last financial report, Omeros ended Q3 2012 with $30.6 million in cash and previously provided guidance for a 12-month runway. At the Oppenheimer 23rd Annual Healthcare Conference, which took place on December 12, 2012, the CEO was asked about the company's cash position and partnering plans. The CEO answered that as of Q3 2012, the company had $34 million in cash, with an additional $40 million equity credit line, which the company had not used. With respect to partnering, the company is now in partnering discussions for all or almost all of its programs.
Additional disclosure: Biomed Group is a group of investment professionals and writers. This article was written by Amit Cohen. We did not receive compensation for this article, and we have no business relationship with any company whose stock is mentioned in this article. This information is not to be construed as an offer to buy or sell any security mentioned in this article.
http://seekingalpha.com/article/1200161-omeros-corporation-s-stock-has-potential-to-double-this-year?source=yahooTh
FDA ALS Advocacy Meeting coming up; FDA Convenes ALS Public Hearing February 25
To: ALS Community Advocates
From: Advocacy - MDA
Date: January 30, 2013
Please ensure that our voice is heard — submit your request to the FDA by Feb. 8, 2013!
We are pleased to share news of a landmark opportunity for our ALS community. On February 25, 2013, the U.S. Food and Drug Administration (FDA) will hold a public hearing that will be open to individuals and caregivers affected by ALS, ALS clinical research experts, and those with a strong and passionate voice about the needs of our community.
We urge our entire ALS community to participate, either in person or in writing. Please do all you can to help ensure that the unified voice of our ALS community is heard.
Hearing remarks and written submissions may pertain to topics of importance to you and your ALS experience and/or expertise, including:
•the tremendous unmet medical needs within our community;
•considerations pertaining to benefit/risk decisions when participating in clinical trials and undergoing therapies;
•accelerated drug approval and compassionate use programs within the FDA;
•disease burden in daily life with ALS; and much more.
Click here to read the comment that was recently submitted to the FDA regarding these issues by MDA and ALSA.
Click here to read the official FDA hearing notice, and learn more about submitting comments or attending the meeting. Please note that all requests to participate, either in person or in writing, must be submitted to the FDA by February 8, 2013.
This is a rare opportunity to impact therapeutic development and ALS health policy — please join us!
mda.org/advocacy
Annie Kennedy
MDA Senior Vice President - Advocacy
http://mda.org/advocacy/news/FDA-Convenes-ALS-Public-Hearing-February-25
Dawson James Initiates Cytokinetics at Market Outperform on Competitive Positioning
12:02p ET November 27, 2012 (Benzinga) Dawson James initiated coverage on Cytokinetics (NASDAQ: CYTK) with a Market Outperform rating and a $2 price target.
Dawson James said, "Although CYTK's programs are still relatively early, they are based on an understanding of the underlying biology unparalleled in the industry. In addition, the competitive landscape is characterized by few opposing programs. Our company valuation is based primarily on the weighted-probability of the DCF analysis for two tirasemtiv scenarios: approval based on accelerated approval or a subsequent Phase III trial. We believe there is a 70% chance of accelerated approval for tirasemtiv ($2.50) and a 30% chance a Phase III is required ($0.75). We derive our $2 price target from the synthesis of these analyses."
Cytokinetics closed at $0.67 on Monday.
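The $2 target follows directly from the probability-weighted scenarios in the note; a quick Python check of the arithmetic:

    # probability (%) and per-share value (cents) for each scenario,
    # as stated by Dawson James:
    scenarios = [(70, 250),   # accelerated approval -> $2.50
                 (30, 75)]    # Phase III required   -> $0.75
    target_cents = sum(p * v for p, v in scenarios) / 100
    print(f"weighted price target: ${target_cents / 100}")  # -> $1.975, i.e. ~$2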
Houston, we may have ignition. Ready for blast off!!
Price is being very controlled!?
I believe CYTK has 133 million shares outstanding; total number of shares owned: 98.5M (percentage of shares owned: 63%); institutions: 64 holders; mutual funds: 48 holders; other major holders: 15!
http://data.cnbc.com/quotes/CYTK/tab/8