esedu

Sunday, 04/24/2016 11:08:27 PM

What to expect at midterm review

I recently had a chance to work with the original clinical design document of the Neuvax Phase 3 PRESENT trial (dated 2011). Some of this information may be outdated, but it can serve as a baseline for our expectations at this juncture in the clinical process. There are some indications that there have been amendments, so keep that in mind.

Futility Test

Four formal analyses are planned for efficacy. The first will occur upon reaching 50% (n=70) of the required DFS events. This analysis will be used for assessing efficacy (DFS), futility, and an overall safety evaluation; DFS futility will be established if the two-sided 95% lower bound of the hazard ratio (hV/hC) is >0.9.

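A hazard ratio of 0.9 (roughly, vaccine arm recurrences divided by control arm recurrences, assuming equal-sized arms) corresponds to a 10% reduction in recurrence risk. A 33:37 split of recurrences between the vaccine and control arms (a ratio of about 0.89) would already clear that mark on the point estimate alone. Because the rule as written applies to the lower bound of the confidence interval rather than to the point estimate, the actual bar is lower still.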
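For those who want to check the arithmetic, here is a rough sketch of the futility rule as quoted above. It assumes 1:1 randomization and similar follow-up in both arms, so that the hazard ratio can be approximated by the ratio of event counts and the standard error of log(HR) by sqrt(1/d_v + 1/d_c). Those approximations and the function names are mine, not from the design document, which would rely on a proper survival model.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def hr_lower_bound(vaccine_events, control_events, confidence=0.95):
    """Approximate lower confidence bound of the hazard ratio (vaccine/control).

    Assumption: equal-sized arms with similar follow-up, so the hazard ratio
    is approximated by the ratio of event counts.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    log_hr = log(vaccine_events / control_events)
    se = sqrt(1 / vaccine_events + 1 / control_events)
    return exp(log_hr - z * se)


def is_futile(vaccine_events, control_events, threshold=0.9):
    """Futility as quoted: lower bound of the 95% interval above the threshold."""
    return hr_lower_bound(vaccine_events, control_events) > threshold


# Hypothetical splits of the 70 interim events (vaccine, control).
for split in [(33, 37), (35, 35), (45, 25)]:
    print(split, round(hr_lower_bound(*split), 2), is_futile(*split))
```

Under these assumptions, neither the 33:37 split nor a dead-even 35:35 split comes anywhere near the futility line. The vaccine arm would have to be doing clearly worse than control to trip it.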

This original part of the design was probably amended, if we are to take CMO Bijan Nejadnik's statements in the most recent conference call at face value. Now the bar seems to be set at a conditional power of 0.15, meaning that the trial must still have at least a 15% chance of reaching its stated endpoint goal (38% recurrence reduction at 141 total recurrences). From the call:

The interim analysis itself of course consists looking at separation of the rate, difference of the rate between the control group and the treatment group and within that knowing that certain time has passed, but the study is not over. Looking at that separation to calculate the probability of this study of succeed. And that hurdle have been fixed at 15% which is called a conditional power and that conditional power is very much low hurdle, we do not anticipate any problem there at all.

It is still a pretty low threshold. It might as well be a rubber stamp. There should be no chance of a stoppage for futility at this stage if Neuvax even works a tiny little bit.
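To give a feel for what a conditional power number looks like, here is a minimal sketch using the standard B-value formula. Everything specific in it is an assumption on my part: the 38% recurrence reduction is read as a hazard ratio of 0.62, the interim is taken at 70 of 141 events, the final test is a plain two-sided 0.05, and the calculation assumes the design effect holds for the rest of the trial. Neither the call nor the 2011 document says which assumed effect the DMC actually uses, and that choice moves the answer a lot.

```python
from math import log, sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def design_drift(hazard_ratio, total_events):
    """Expected final Z-score if the assumed hazard ratio is true (1:1 arms)."""
    return -log(hazard_ratio) * sqrt(total_events) / 2


def conditional_power(z_interim, info_frac, drift, alpha_two_sided=0.05):
    """Chance of a significant final result given the interim Z-score.

    z_interim: interim Z-score, positive when the vaccine arm is doing better.
    info_frac: fraction of the planned events observed so far.
    drift:     assumed expected final Z-score for the remainder of the trial.
    """
    z_crit = NORMAL.inv_cdf(1 - alpha_two_sided / 2)
    b_value = z_interim * sqrt(info_frac)
    remaining = 1 - info_frac
    return 1 - NORMAL.cdf((z_crit - b_value - drift * remaining) / sqrt(remaining))


# Hypothetical interim with no separation at all between the arms (Z = 0).
drift = design_drift(0.62, 141)
print(round(conditional_power(0.0, 70 / 141, drift), 3))
```

Swapping in a different `drift` (for example, one estimated from the interim data itself) gives a very different figure, so treat this as an illustration of the mechanics rather than a reconstruction of Galena's test.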

Efficacy Stoppage Rules

There is no need to spend any alpha for the concurrent futility analyses, but 0.01% of the type I error will be spent at each safety and efficacy analysis to allow for the remote possibility of a compelling OS win before the final planned OS look (event driven). Allowing for up to 10 semi-annual DMC assessments for DFS and for up to 20 semi-annual OS assessments, this alpha spending function will collectively spend 0.1% of alpha on interim analyses, leaving approximately 4.9% type I error (2-sided p=0.049) for the final DFS efficacy analysis and approximately 4.8% type I error (2-sided p=0.048) for the final OS efficacy analysis.

Most of you may not be familiar with the term "error spending," but the concept should be fairly easy to grasp. Type I error, or alpha, is the chance of mistakenly rejecting the null hypothesis of a study. In practice, it is the threshold that a study's p-value has to fall below to be called statistically significant. Almost always, this is set at 0.05.

In a typical study designed for interim efficacy endpoints, a certain amount of this final error threshold is "spent" at each interim review. The amounts spent across all looks, the final analysis included, should add up to the target value (almost always 0.05). To give you an example of how this works, here is a hypothetical study with 2 interim reviews at the 1/3 and 2/3 completion points. The spending function is of the O'Brien-Fleming type, a very common choice in trials that plan for interim efficacy endpoints.

Completion %    p-value for halt    error spent (total = 0.05)
0               0.0000              ---
33              0.0002              0.0002
67              0.0121              0.0119
100             0.05                0.0379
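The figures in the table can be reproduced, to four decimals, with the Lan-DeMets O'Brien-Fleming-type spending function applied to each tail at half the total alpha. That this is the exact formula behind the table is my assumption; the sketch below simply shows that it gives the same numbers.

```python
from math import sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def obf_cumulative_alpha(info_frac, alpha_two_sided=0.05):
    """Cumulative two-sided alpha spent at information fraction info_frac.

    Lan-DeMets O'Brien-Fleming-type spending, applied to each tail at
    alpha/2 and then doubled. Valid for 0 < info_frac <= 1.
    """
    one_sided = alpha_two_sided / 2
    z = NORMAL.inv_cdf(1 - one_sided / 2)
    per_tail = 2 * (1 - NORMAL.cdf(z / sqrt(info_frac)))
    return 2 * per_tail


previous = 0.0
for frac in (1 / 3, 2 / 3, 1.0):
    cumulative = obf_cumulative_alpha(frac)
    spent_now = cumulative - previous
    print(f"{frac:.0%}  cumulative={cumulative:.4f}  spent at this look={spent_now:.4f}")
    previous = cumulative
```

The point to take away is how back-loaded this kind of function is: almost nothing is spent at the first look, and most of the 0.05 is saved for the final analysis.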

The issue with the Neuvax Phase 3 PRESENT's design is that it does not seem to be designed with early efficacy stopping in mind. Instead of spreading the full 0.05 of Type I error across the interim and final looks, the design allocates a mere 0.001 to the interim reviews ("this alpha spending function will collectively spend 0.1% of alpha on interim analyses"). And because there are multiple interim looks planned, not all of that 0.001 is available at the first one.

In fact, the trial design calls for spending "0.01% of the type I error at each safety and efficacy analysis." This puts the p-value target at roughly 0.0001 for each review, with the cumulative error spent reaching 0.0002 by the second review, 0.0003 by the third, and so forth until we get to 0.001 by the 10th. The trial design calls for 10 years of data collection with "up to 20" interim assessments.
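The bookkeeping from the quoted design text is simple enough to tabulate. This is just arithmetic on the numbers in the quote (0.01% per look, up to 10 DFS looks, up to 20 OS looks); the function name is mine.

```python
PER_LOOK_ALPHA = 0.0001  # 0.01% of type I error per interim analysis, as quoted
TOTAL_ALPHA = 0.05


def spending_schedule(n_looks, per_look=PER_LOOK_ALPHA, total=TOTAL_ALPHA):
    """Return (cumulative alpha after each interim look, alpha left for the final)."""
    cumulative = [round(per_look * k, 6) for k in range(1, n_looks + 1)]
    remaining = round(total - per_look * n_looks, 6)
    return cumulative, remaining


dfs_cumulative, dfs_final = spending_schedule(10)  # up to 10 DFS looks
os_cumulative, os_final = spending_schedule(20)    # up to 20 OS looks
print(dfs_cumulative[0], dfs_cumulative[-1], dfs_final)
print(os_cumulative[-1], os_final)
```

Ten looks at 0.01% each leave 4.9% for the final DFS analysis, and twenty looks leave 4.8% for the final OS analysis, which matches the figures in the design text.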

A p-value of under 0.0001 will be very, very difficult to attain with the 70 events at the first interim look, but what takes this from the unlikely to the nigh-impossible is the fact that, for some reason, interim efficacy stoppage is established by OS (overall survival), not DFS (disease-free survival). Overall survival simply counts patients who are still alive at a given time point, while disease-free survival requires patients to be both alive and free of recurrence. This means that people who have had recurrences but are still living count as survivors under OS.

In these kinds of adjuvant studies (studies to prevent cancer recurrence after successful primary treatment), separation of DFS curves can happen relatively quickly, but it can take many years before OS curves show significant divergence. Median overall survival time after breast cancer recurrence, according to data from the previous decade, is in excess of 4 years.

Comparison of DFS and OS plots from a trastuzumab trial.
There is zero divergence in the OS curves at the 2 year mark.

Designing interim endpoints with completely different criteria from the primary endpoint is a major pitfall of study design, especially when the interim evaluation is tied to data that takes so much longer to accrue. Here is a study that demonstrates exactly this.

Trastuzumab emtansine demonstrated improved efficacy compared with TPC.

The ORR 31.3% vs 8.6%, p <0.0001
There was a significant improvement in PFS (HR = 0.528; p <0.0001) and a clear and consistent treatment effect across subgroups.
The interim OS analysis favoured trastuzumab emtansine (HR = 0.552; p = 0.0034) but efficacy stopping boundary was not crossed.

The primary metrics were response rate and progression-free survival (metastatic setting), but because interim efficacy stoppage depended only on overall survival, the study did not get an interim halt for efficacy. The data on the primary metrics ended up very, very strong (p < 0.0001), yet on the criterion of overall survival the experimental and control groups had not diverged enough at the interim look to cross the stopping boundary.

To bring this back to Neuvax, there is indeed a clause for possible interim efficacy stopping. However, it seems that this is a possibility that Galena never really expected or planned for. The reasons are as follows:

1) Instead of allocating the full final Type I error of 0.05, they chose to allocate only 0.001, split into 0.0001 increments. A p-value target of 0.0001 with an interim data pool of 70 events is a tall order.

2) In a trial whose primary endpoint measures DFS, interim efficacy stoppage is tied to OS, a metric that takes much longer to show a statistically significant difference. In an adjuvant trial where every patient starts "disease free" so as to measure reductions in recurrence rates, it is very likely that there will be little to no divergence in overall survival outcomes in the 1-2 year time frame.
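To put a number on point 1, here is a back-of-the-envelope sketch of how lopsided 70 events would have to be to get under p = 0.0001. It uses a crude stand-in for the log-rank test, z = (control events - vaccine events) / sqrt(total events), which assumes 1:1 arms with similar follow-up. That approximation and the function names are my own, not the trial's actual analysis method.

```python
from math import sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def two_sided_p(vaccine_events, control_events):
    """Crude log-rank-style two-sided p-value from event counts alone."""
    total = vaccine_events + control_events
    z = (control_events - vaccine_events) / sqrt(total)
    return 2 * (1 - NORMAL.cdf(abs(z)))


def most_vaccine_events_allowed(total_events, p_threshold):
    """Largest vaccine-arm event count that still gives p below the threshold.

    Only splits favouring the vaccine are searched. Returns None if no
    split of the given total reaches the threshold.
    """
    best = None
    for vaccine in range(0, total_events // 2 + 1):
        if two_sided_p(vaccine, total_events - vaccine) < p_threshold:
            best = vaccine
    return best


limit = most_vaccine_events_allowed(70, 0.0001)
print(limit, 70 - limit)
```

Under this approximation the vaccine arm could have no more than 18 of the 70 events, against 52 in the control arm. That is before accounting for point 2, where the events that count are deaths rather than recurrences.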

In conclusion, unless people in the control arm start literally dying at historically unprecedented rates, we should not expect even a reasonable possibility of an early efficacy stoppage. This conclusion does, however, depend on the criteria laid out in the original clinical design from 2011 still being in force. There may have been amendments since then, though per Galena's company line, this interim review is for "safety and futility" only.

Expect a near-certain "go to completion."
