I am always for the truth and sincere sharing of information. So any factual information that you can share about the lawsuit would be appreciated. If not, then perhaps we should all just wait to see how the lawsuit turns out. Otherwise, the continuing hints at something sinister, with never anything concrete, woven into feeble theories about how the FDA operates in an illogical and inhumane way do have a feel of sneakiness. Hopefully that isn't your intent, but that is how it came across.
So what is it that you disagree with? The fact that the CMC problems did not affect the efficacy decision of the FDA? I disagree with you because human nature says that one decision affects another in these games. If the FDA was convinced that the CMC issues were going to cause a delay of a year or so, then it is easy to see why they would require further efficacy data as well since they would not at that stage feel guilty on account of delaying the treatment from patients, and additional efficacy data is always a good thing...
Let's think this through. Suppose that the FDA did not believe that there was sufficient evidence that Provenge works, then it was clear that new efficacy data should be required. The CMC issue would have had nothing to do with it. This decision would be a rational one.
Now, suppose that they actually believed that there was substantial evidence that Provenge works, as the vote of the AC showed. But because of a CMC issue that could be resolved in a few months or a year and because "additional efficacy data is always a good thing", human nature then let them require new efficacy data that at the time were estimated not to be seen for 3 years. That's two to three years of waiting while people are dying. Given that they always had the ability to withdraw approval on new negative evidence, this decision would be not just irrational but also inhumane.
Invoking human nature, etc. to make a point basically shows that you are thinking about this based on a system of belief rather than reasoning about it based on the facts at hand. I fully understood where you were, hence my previous parting comment "Believe what you will".
Drug 1: Another NSAID that shows pain relief N=2000 , p=.05
Drug 2: Improves OS (HR=1.8) in stage 3 NSLC N=100, p=.06
Which one should be on market?
The issue is whether the randomization principle holds up.
For the larger trial, randomization likely ensures that there won't be any serious imbalance in any prognostic factors including possibly hidden ones (ie, ones we haven't thought of yet).
For the smaller trial, you would need to worry about imbalances. But that is why it is important to do a variety of sensitivity analyses, as the FDA statisticians always do. And remember that this is clinical data, and often there are also other sources of data outside of the immediate trial that can provide corroborative evidence.
So, if we can assure that no imbalance could influence the p value computation, then the smaller trial would be a better gauge of the drug benefit than the larger one. The difference between .05 and .06 is just noise.
"Has DNDN stated what the new alpha allocation is?"
I am leaning toward [228,253] as the range for the interim trigger with 240 as the likely value. The OBF interim alphas for these values are about 0.019 for 228, 0.023 for 240, and 0.028 for 253.
"let us have more efficacy data since we have to wait for CMC anyway"
If you cannot weigh the difference between a requirement that can be satisfied with some work and a requirement that can kill a drug forever due to the high cost and long time to run a new trial, then there really isn't much for us to discuss. Believe what you will.
My guess is that the true underlying treatment effect of Provenge in the 9902b patient pool is about 1.25
1.25 HR is a very definite number and you are right that it will kill IMPACT. Now that you have scared me straight, could you explain how you came to it?
wrt stopping at 98 deaths in the TAX 327 docetaxel trial yielding a calculated p value of 0.20, IMO, that statement would have stopped Hussein and Scher cold at the Advisory Committee meeting,
This type of retrospective statement about a different trial is good for an exploratory discussion but it would be unlikely to go over well in a regulatory setting.
the failed February preapproval manufacturing inspection would ultimately have produced the same result.
If by "same result" you mean a delay of approval, perhaps. But, IMHO, it is deceptive to equate a delay of approval due to CMC matters to one due to a requirement for efficacy data. The former is fixable - heck even DSCO manufacturing issues seem to be getting resolved, while the latter could be for good. Put it another way, if CMC was the issue, DNDN would do well, but as it is, if IMPACT fails, DNDN will likely cease to exist.
1. If IMPACT is more like 9901 than 9902a, then interim power is at least 90%
Must be running East again ? That's far too high. My estimate for interim assuming D9901 is about 70% give or take a few points.
About D9902a, with very high probability, the trial would have achieved stat sig if the enrollment size was about 700. Interestingly, if you fit an exponential curve to the latter part of the survival curves to extrapolate the survival times for the 36-month survivors, the trial HR gets to larger than 1.45. This is yet another piece of evidence showing how the late effect of Provenge is a really big deal in any effort to try guessing the success of IMPACT.
Obtaining financing certainly does not seem to be a problem for them.
If DNDN can raise as much money as they need, why should they worry about keeping investors' interest between the interim and the final and risk IMPACT failure?
It seems clear that optimizing the power of the interim was a large, if not the major, reason behind the complex maneuver to change to OBF as the alpha spending function, raise the interim trigger and lower the final trigger. The computation to set these numbers properly is sufficiently complex that this would not be done without serious data analysis.
The open question still is what new data have they gotten that drove them into trying to solve that problem instead of just sitting tight waiting for the 360 trigger which was powered at more than 90% based on just the old D9901+D9902a data?
Dew - The meaning of my post might have been lost on you because you couldn't get past its flippancy. I'll try to be more serious.
I do agree with you that program-survival bias is a good filter for taking a first impression of any new dataset that seems more impressive than it should be. You could, in fact, argue that program-survival bias is a corollary of the "Regression to the mean" argument that Thomas Fleming brought up in his FDA letter against the D9901 and D9902a data in the BLA.
In your case, you would say that the dominant mode (ie, the mean) for the universe of possible trials is that the null hypothesis takes place. As such, a phase-3 study designed from weak and possibly spurious data in a phase-2 trial will be susceptible to a regression to the mean, ie, showing the null hypothesis. There, that is a slightly more rigorous argument for your principle from Fleming's argument.
But as I said, general principles for filtering or taking a first impression do not necessarily apply to a particular dataset once you've decided to take a closer examination of it. There, you need to understand the semantics of the domain of study where the data come from and take in all corroborative data to see if the whole picture makes sense. That too should be self-evident to anyone who ever does data analysis. And, I can assure you that I've looked at data far larger than you could ever imagine.
medchal - If the shoe fits,... (...)
Now that was a truly "animalistic" attack that went right to the beneath of my sole.
You evidently still don’t understand the mathematical concept of program-survival bias even after myriad attempts have been made to explain it to you. (Are you really a professional mathematician?)
Dew - "Program-survival bias" is a trivial statistical statement about the overall population of trials. It can only tell you the shape of a thing before you decide to examine it in detail. But, for any one data point, other domain-specific considerations including particular statistical measures could be applied to see if the data point on hand means what it appears to mean. Such considerations then give a different type of certainty estimate about the data. Does this make sense to you or is it perhaps too subtle?
An analogue of your frequently touted (by yourself) principle is what I call the broken glass principle. Given the abundance of broken glassware over the past several centuries, if you see a shiny object on the beach, be careful: it might not have any value. You should think hard about it before starting to jump up and down, especially if there is an abundance of such sharp-looking objects on the ground.
On the other hand, let's say you pick up a shiny object on the beach that is attached to some sort of a ring and you decide that it might be worth looking into. The broken glass principle says that you could now try techniques other than your eyes and fingers: physical, chemical and indeed even social (let a few jewelers take a look at it and see if they salivate) to see if the object is consistent with being something other than just a piece of glass.
The broken glass principle is rather deep as it applies to itself recursively and also to other similar principles such as yours. It tells us not to confuse impression with examination.
So, that's a long answer to your question. And, as a bonus, my broken glass principle actually lends evidence to a common folk wisdom: Do not throw figurative stones at a purported mathematician while you live in a mental house of glass. You will only contribute to the future of my broken glassware principle.
My simulations took into account the timing of enrollees. As I indicated in the previous post, the trial will achieve stat sig about 50% of the time if the trigger was 240 and D9901+02a assumed as the model. But you also need to think through issues about the health of early patients and also of late patients as deviations from D9901+02a. Early simulations seem to indicate that the chance would be better, not worse.
Are you confident, based on the evidence to date, that it does work?
The totality of all data available on Provenge (phase 1&2 trials, D9905, P16, P11, D9901, D9902a) says that it is an active agent. As you know, a statistical proof is always subject to the will of the quantum God. So, there is always a chance of some colossal coincidence that made the data from all these trials show what they showed. But, yes, within the limited extent of my perception, I believe that Provenge works.
Steve - I don't even know what it means to have a minimum hazard ratio to meet stat sig. But, since the hazard ratio is a measure of drug benefit, if it gets too low, then who cares if the trial meets stat sig anyway.
The real question is how to model the data so that any prediction from that would be trustworthy. As I mentioned a few times already, if you assume 240 as the trigger and only D9901+02a as the model, then the chance of success is about 50%.
But we know that the early phase of IMPACT enrolled much better patients than D9901, ie, the GS<=7 patients. I estimated that the GS<=7 patients could make up about 2/3 or more of IMPACT. That's much better than the 59-60% in D9901 or D9902a. If you go back to the Jan 2004 PR on the survival data of D9901, these patients exhibited an HR of 1.89.
Now, I suspect that the latter phase of IMPACT, especially after the CRL, when they tried to finish enrollment might include much sicker patients (caseystarman's friend is probably a good example). So in any simulation work, this effect should be taken into account.
Overall, my personal belief (and it is that, just a belief) is that IMPACT will turn out better than the integrated data. How much better is an open question and that, in turn, puts into question the chance of the interim success. But I think that chance is better than 50%.
My "obsession" is your delusion... people were less animalistic... Dendreon owners are the most rabidly self-righteous--and disingenuous--group...
You have no obsession, but virtually every piece of your writing here and on iVillage is about a single topic, sometimes with "rabidly self-righteous" evaluative statements like the above.
About analysis of people owning a stock, what does that have to do with the type of "animalistic" emotion that you exhibit in your above utterances? Perhaps you might not realize this "disingenuous" thought process of yours when you accuse others?
But thanks for your reply. You can carry on as you wish.
Medchal,
You follow the people on iVillage closely and often put down what they try to do as some misguided obsession. Have you ever stopped and thought a bit about your own obsession with them?
Whatever motivates the people there only they can know themselves. But, what they do, if successful, will help many patients.
As for you, what will your obsession get you?
Wall - I do realize that this is entirely off-topic. Please delete this post when you get around to it.
Ok Dew. Moving on.
Dew - Independent thinking is about examining your thoughts from all data at hand, not being a slave to your biases, either in general or specific, such as against a particular company. Unlike other common biotech situations, Provenge is one where there is quite a bit of data available to compute with. Until you've done so, harping on the power issue from some sort of vague general principle is uninformed.
Separately and I hope you will take this constructively, you frequently give in to some trivial cleverness and one-up kind of thing instead of focusing on an earnest discussion. That diminishes the value of your inputs. It's too bad as you do have great biotech knowledge and experience to share.
The interim analysis does not have “almost the same power” as the final analysis. Please get a grip!
Dew - If you have an idea of the difference in magnitude between the interim power and the final one, please enlighten us. Otherwise, your comment is yet another empty one.
I'd bet even iwfal won't say what that might be even if he has given his reason for not trusting the published power for the final.
Just a note that moving of the trigger closer to the final also means that in the event that the interim is missed the ability to make it on the final goes down (compared to having the interim much earlier than the final).
This really isn't an issue for IMPACT as modeled by D9901+02a. With OBF, the nominal alpha for the final is always above .04 regardless of where the interim is. As such, the power for the final look at 360 would always be over 90%. The "apparent" issue with reduction of power remains the lowering of the final trigger from 360 to 304. "Apparent" because, as we discussed elsewhere, estimating power is a tricky thing where assumptions might be changed based on knowledge gained - even if we outside of DNDN might not know exactly what that knowledge was.
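For anyone who wants a rough feel for how the power moves with the event count, here is a minimal sketch using the Schoenfeld approximation for a two-sided log-rank test. It is not the simulation I ran and it ignores the interim look. The hazard ratio of 1.45, the 50/50 allocation default and the function name are all illustrative assumptions, so treat the output as a ballpark only.

```python
from math import log, sqrt
from statistics import NormalDist


def schoenfeld_power(events, hazard_ratio, alpha=0.05, treated_fraction=0.5):
    """Approximate power of a two-sided log-rank test (Schoenfeld formula).

    events           -- number of deaths at the analysis
    hazard_ratio     -- assumed true HR (control/treatment, as used in this thread)
    alpha            -- nominal two-sided alpha at this look
    treated_fraction -- share of patients randomized to treatment (assumption)
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    # Statistical information carried by the events for the chosen allocation.
    information = events * treated_fraction * (1 - treated_fraction)
    return nd.cdf(sqrt(information) * abs(log(hazard_ratio)) - z_alpha)


# Illustrative only: HR = 1.45 and alpha = 0.044 are assumed inputs.
for deaths in (304, 360):
    print(deaths, round(schoenfeld_power(deaths, 1.45, alpha=0.044), 3))
```

Whatever inputs you pick, the power at 360 events comes out higher than at 304 for the same assumed HR, which is the "apparent" loss discussed above.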
The below paragraph from the CEGE 07/10/2007 PR gave a hint of why Vital-1 might be turning out not as CEGE expected. The phase 2 data were based on rather healthy patients as predicted by nomograms. Vital-1 patients might track the normal HRPC population with nomogram-predicted survival time below 20 months. Who knows whether or not the high medians would hold up in that case.
Cell Genesys' ongoing Phase 3 GVAX immunotherapy for prostate cancer program is supported by the median survival results from two, independent, multi-center Phase 2 clinical trials in approximately 115 patients. The subset of patients in these two trials who received the doses comparable to the Phase 3 dose showed median survival of 34.9 months and 35.0 months, respectively. These results also exceeded the predicted survival of 22.5 months and 22.0 months, respectively, as determined by a seven point patient disease characteristic nomogram. The results of the first trial were published in the July 1 issue of Clinical Cancer Research. Results from both studies compare favorably to the previously published median survival of 18.9 months for metastatic HRPC patients treated with Taxotere chemotherapy plus prednisone, the current standard of care.
Also note the bogus comparison of the benefit of GVAX shown in the phase-2 data vs. that of Taxotere shown in a much sicker population. The predicted survival time of patients before GVAX treatment was already better than the actual survival time of patients treated with Taxotere. Should that not tell them that the comparison was invalid due to wildly different patient populations? If this sort of sloppy thinking was a part of the trial design process...
The CEGE CEO stated that the interim for the Vital 1 GVAX Ph 3 trial in asymptomatic AIPC occurred as modeled and not earlier.
This is what Dr. Sherwin said in the 07/10/2007 PR announcing the completion of Vital-1:
"With the completion of patient recruitment behind us, we can now estimate the timing of the pre-planned interim analysis for the VITAL-1 trial to be in 2008, probably during the first half of the year, and that we will have a sufficient number of events required for the final analysis to follow sometime later in 2009."
Technically, the 01/14/2008 PR was in the first half of the year. But given that it usually takes a few weeks to process data, the event must have happened either near the first day of the year or at the end of 2007. The phrasing of the above implied expecting the interim much later in the year. So that was either a really poor job of a PR or a really poor job of projecting the event.
Since Vital-1 continued after the interim failed, there wasn't an imbalance of deaths against the treatment to warrant stopping the trial. However, since the interim occurred in early Jan where CEGE estimated 1H08, it likely meant that the interim did occur a few months earlier than CEGE had modeled the drug effect for. That is, patients on the treatment arm were dying faster than projected. If this trend continues, it will not bode well for Vital-1.
I agree that 2% is not noise.
Suppose you were told that the powers were computed in two separate sets of simulations such that the averages were .88 with a standard deviation .015 and .90 with about the same standard deviation. Would you still say that the 2% difference was significant? Then, does it matter if you say yes or no?
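To put a number on that question, here is a minimal sketch. It assumes the two sets of simulations are independent and that each quoted standard deviation describes the uncertainty of its average (both are my assumptions for the sake of the example, and the function name is made up).

```python
from math import sqrt
from statistics import NormalDist


def difference_z(mean_a, sd_a, mean_b, sd_b):
    """z statistic and two-sided p value for the difference of two estimates.

    Assumes independent, roughly normal estimates whose uncertainties
    are given by sd_a and sd_b.
    """
    standard_error = sqrt(sd_a ** 2 + sd_b ** 2)
    z = (mean_b - mean_a) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z, p = difference_z(0.88, 0.015, 0.90, 0.015)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under those assumptions the difference is less than one standard error of the difference (z of about 0.94, p of about 0.35), so the two power estimates cannot be told apart.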
Clark - That was clear. Maybe it's just my naivete again but they must have known that themselves. That's why the conjecture of trading reduced power for a more beefed up interim if OBF was in use. Otherwise, the power has been reduced from more than 90% to close to 80% - big loss.
Clark -
Rancherho pointed out a while back that DNDN used Haybittle-Peto for interim alpha allocation in D9901. That might remain the same for the original IMPACT SPA. In that case, regardless of how many interims or when, the interim alpha remains a constant at .01.
On the other hand, they could have changed the alpha spending function to O'Brien-Fleming. This is "a big assumption". But, if true, the effect of lowering the final trigger and raising the interim one at the same time is to increase the information fraction (interim/final). In turn, this would raise the nominal alpha for the interim look without affecting that of the final too much. For example, according to OBF, if the interim is 230 and the final is 304, then the interim alpha would be .02 while the final alpha remains at .044. Although the sum of the alphas is larger than .05, this does not violate the preservation of the family alpha since the two analyses share the same set of events in the interim.
If we focus only on the data, lowering the final trigger will hurt the final p value since the 56 events taken out are the longer lasting events. So this move would make sense only if they see a substantial benefit in making the interim better. The PR was vague but its use of the term "alpha spending function" instead of "alpha value" would make the above scenario not too tenuous.
If, as you thought, this was all about finance and the alpha spending function remains Haybittle-Peto or whatever, not OBF, then all doors are open. Let's see what they say tomorrow before deciding what is what.
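For those who want to check the OBF arithmetic without special software, here is a minimal sketch for a two-look design. It assumes the classical O'Brien-Fleming boundary shape (interim boundary = c/sqrt(t), final boundary = c, where t is the information fraction) and a two-sided family alpha of .05. The function names are mine, and whether DNDN's actual plan has this form is not known.

```python
from math import sqrt
from statistics import NormalDist

ND = NormalDist()


def crossing_probability(c, t, steps=2000):
    """P(|Z1| >= c/sqrt(t) or |Z2| >= c) under the null hypothesis.

    Z1 (interim) and Z2 (final) are standard normal with correlation
    sqrt(t), where t is the information fraction (0 < t < 1).
    Uses Simpson's rule over the interim continuation region.
    """
    b1 = c / sqrt(t)
    rho, resid = sqrt(t), sqrt(1 - t)
    h = 2 * b1 / steps
    total = 0.0
    for i in range(steps + 1):
        z = -b1 + i * h
        # Probability that the final statistic also stays inside (-c, c).
        inner = ND.cdf((c - rho * z) / resid) - ND.cdf((-c - rho * z) / resid)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * ND.pdf(z) * inner
    return 1 - total * h / 3


def obf_two_look(t, alpha=0.05):
    """Return (interim, final) nominal two-sided alphas for two looks."""
    lo, hi = 1.5, 3.0
    for _ in range(50):  # bisection on the boundary constant c
        mid = (lo + hi) / 2
        if crossing_probability(mid, t) > alpha:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2
    return 2 * (1 - ND.cdf(c / sqrt(t))), 2 * (1 - ND.cdf(c))


for interim_events in (228, 230, 240, 253):
    a1, a2 = obf_two_look(interim_events / 304)
    print(interim_events, round(a1, 3), round(a2, 3))
```

The two nominal alphas always add up to more than .05 because the rejection regions overlap. Other software, or the Lan-DeMets version of OBF, may give slightly different numbers than this sketch.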
IMO, the difference in results is so dramatic between the Provenge-only results and the combination results that it will be reaffirmed in 9902b.
Although I do think that the Provenge+Taxotere effect is believable, esp given a number of NCI publications on Taxotere+various vaccines, there are sufficient unknowns in how Taxotere would be taken in the trial that we cannot be so sure of a stat sig result based solely on that possibility. For example, caseystarman reported that his friend who was randomized onto the control arm took Taxotere before Provenge.
"The conservatism of the Haybittle-Peto boundaries at the interim analyses came from applying a uniformly stringent criterion, with a nominal P-value of 0.001, at all interim analyses." If the Haybittle-Peto method is used in 9902b as well, would this reduce your assessment that the interim might be successful?
I do not have the software to compute the interim nominal alpha for Haybittle but if it's .001, the chance of meeting stat sig will be very low. For OBF, the nominal alpha is a direct function of the trigger number, the higher the trigger, the higher the alpha. So if OBF was used for 9902b, the chance of stat sig will be a function of the trigger.
Rancherho - We've been through all those presentations and papers more than once. Without being too pedantic about this, it is always a danger to reason informally about these data based on just sketchy statistics such as medians.
I'll just mention that it is possible to put the data in Petrylak's presentation and the integrated data together to approximate which patients actually took Taxotere. For example, from there, you can then estimate the effect of Provenge alone against placebo alone. If you've done that, you might see that Provenge worked fairly well. But even then, you still run into the issue that the subgroups that took Taxotere were self-selecting and so were the complements of those subgroups. That puts these results into question.
That said, there has been a substantial amount of work done at the NCI on combining Taxotere with different vaccines to indicate that this type of combination might prove to be efficacious.
The other topic, which was obviously sensitive for CEGE yesterday, and may be for DNDN, is the permissive use of subsequent Taxotere after GVAX in Vital 1 and after Provenge in 9902b. DNDN finessed this subject in the Briefing Materials for the AC by reporting that a slightly higher percentage of the control group in 9901 had received subsequent Taxotere, suggesting that the overall increase in median survival, therefore, did not disproportionately favor the Provenge experimental arm.
The BLA indicated that DNDN performed Cox analysis adjusting for Taxotere usage and the data remained significant in favor of Provenge. In fact, for D9902a, after adjustment for Taxotere, the pvalue improved from .32 to .12 (page 72). Below is an excerpt related to the integrated data:
"Sensitivity analyses similar to those performed for Studies 1 and 2 were performed on the integrated Study 1 and 2 survival results. The treatment effect was consistent among study sub-populations defined by 21 known or potential baseline prognostic factors. The final model that was developed for Study 1 was applied to the integrated data. After adjusting for the 5 factors in the final model, the treatment effect remained strong (HR = 1.86 [95% CI: 1.31, 2.63]; P < 0.001). The treatment effect also remained after adjustment for docetaxel use following investigational therapy (HR = 1.50 [95% CI: 1.07, 2.08]; P = 0.017). Prostate cancer-specific survival revealed an HR of 1.72 ([95% CI: 1.21, 2.44]; P = 0.002)."
<Why would those who know that "the trend is your friend" make some Herculean effort to buck it?>
On a random channel flipping many years ago, I saw a professional gambler teaching a class on how to play roulette. He actually said that the right way to play the colors is to always bet on the same color that just showed. Is this what you mean by "the trend is your friend"?
5. The interim is scheduled at an appropriate later point than half time so that the nominal alpha is reasonable.
For example, I'll take a small wager that that's why Dr. Gold is optimistic about the IMPACT interim look rather than because some alpha allocation function other than O'Brien-Fleming was used. There is a lot of room between 180 and 360 to play with. And, Dendreon surely does not want Tom to feel slighted and write another letter after the interim.
<I used the word 'epistemology' because it has a connotation of how well we can know the TRUTH given the data at hand. Talking about statistics, statistical rules, etc loses sight of the fact that there is a fundamental limitation. We can debate the assumptions, and thus the size, of the multiple looks penalty - but at heart it is indisputably true that some kind of penalty is required to avoid false positives. False knowledge.>
Agree. But let's also not forget the other side of the coin, avoiding false negatives - true knowledge too quickly lost due to some limited thinking/evaluating process and/or, sometimes, the fear of false positives. Over many decades of work on computer tomography, we have learned that not all elephants must be lost because we only got limited impressions of them.
I chose to live in Flushing Queens, which is probably the most ethically diversified neighborhood in the US.
Like an investment chat board :?).
The p value would be somewhere below 0.0001.
<It is only events that matter. No events, no change in p value.
And, conversely, the more events, the better the estimate for the p value. This is why it is short-sighted for companies to keep the intervals between check-ups too long when tracking an event such as progression for which the exact occurring time is unknown so can be set only at check-up time.
As to the p-value remaining unchanged during a quiet period, a metaphor that is not exactly correct but useful for thinking about these statistics is that every event is a vote for one arm of the trial or the other. The more votes we get, the more certainty we have in the observation. In a survival analysis, during a period without events, nothing new is learned, therefore the p value does not change. The metaphor is not exactly correct because events have different weights as the trial progresses. They often have more meaning toward the end than at the beginning - the denominators get smaller as Clark would say :).
Relevant to this discussion is FVRL's recent announcement that they will abandon the event-based endpoint and simply stop their trial in April next year. This decision was made after seeing that the progression rate has slowed down over the past year. This would be your option (c).
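The "no events, no change" point can be checked with a minimal log-rank sketch. The twelve patients below are made up purely for illustration, and the function is a bare-bones textbook log-rank, not anything from the trial's analysis plan.

```python
from math import sqrt


def logrank_z(times, events, arms):
    """Log-rank z statistic for arm 1 versus arm 0.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    arms   -- 1 for treatment, 0 for control
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk, both arms
        n1 = sum(1 for x, a in zip(times, arms) if x >= t and a == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, a in zip(times, events, arms)
                 if x == t and e and a == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected / sqrt(variance)


# Hypothetical data: survivors are all censored at a month-24 data cutoff.
times = [5, 9, 14, 24, 24, 24, 3, 6, 8, 11, 17, 24]
events = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0]
arms = [1] * 6 + [0] * 6

z_now = logrank_z(times, events, arms)

# Six quiet months later: survivors are now censored at month 30, no new deaths.
later = [t + 6 if e == 0 else t for t, e in zip(times, events)]
z_quiet = logrank_z(later, events, arms)

# One more control-arm death at month 27 instead.
later_with_death = later[:-1] + [27]
events_with_death = events[:-1] + [1]
z_new_event = logrank_z(later_with_death, events_with_death, arms)

print(z_now, z_quiet, z_new_event)
```

The first two numbers are identical because the risk sets at every death are unchanged. Only the added death moves the statistic.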
http://biz.yahoo.com/prnews/071025/lath147.html?.v=8
Thanks.
Intacs are approved only for <=3 diopters and they do not correct for astigmatism.
Dew - Does Lasik correct for astigmatism?
It's hard to get through this with informal reasoning but by varying what the different parts of the population look like at different enrollment times (more like D9901, D9901+02a or D9902a, etc.), simulation shows that the power for the final look is somewhere in the mid to upper 80% assuming alpha .04. But, FWIW, if you trust the Halabi data that about 80% of the population looks like D9901, then the power is above 90%. This is all log-rank.