I got an ex-Amgen VP Comms to listen to BP's TED talk and she is psyched. Would be a perfect fit for emergency comms leadership - from broad comms strategy to mgmt coaching to writing press releases.
Pls contact me if you can confidently reach NP - I've sent an email and a linkedin message.
Man, I remember all those names! Remember what most of them are about.
I would love to have what I think is called the "statistical analysis package" - the stats addendum submitted with the initial documents requesting the FDA approve the trial.
The calculations you recall - hazard ratio, Kaplan-Meier, etc. - are for endpoints that either occur at some date or not within the trial, e.g. death. Cochran-Mantel-Haenszel is for binary outcomes stratified by groups.
I think the take away from the group convo today is that, because the primary endpoint is specified as the change in total symptom score, they are treating it as a simple, numeric, cardinal variable. Hence, a basic t-test seems likely to do the job. We'll see.
By contrast, the secondary outcome "7-point ordinal scale" by that phrasing doesn't allow adding or subtracting - I saw the calc "# of patients with a one or more point improvement" somewhere.
Don't think so - that's for categorical data and small samples, and it requires the assumption that the marginal cross-tab totals are known in advance.
Good catch. I've heard of fever being erratic and therefore noisy. What about cough if you are sedated? Breathing problems and body ache, sure. So, 68% of patients move =< 4 points either way - 1 per measurement?
Anyway, before we got off on numerical explorations, the starting observation was that there would be a lot of patients in the mid-range on change. The high-leverage patients are the moderate-pluses who recover well and the mild-minuses who are seriously lost.
So, Bobshmob, your response shows you have something better to do than watch paint dry. Our musings are for entertainment.
We do know a bit more about distribution: we are looking at change scores: the milds might come in at TSS = 4, meaning change of -4 to +8; moderates come in at 8, meaning change scores of -8 to +4, resulting in some symmetry and vague normality.
Then it is a w.a.g. on the range that includes 68% of outcomes (I guessed +/- 4 points - one level each parameter - others lower), but that gives an SD and the rest is math.
Ahhh! The geek crew...
Just to be really sticky with the language, the figure of merit is the difference in change scores:
X = (mean-delta-TSS-leronlimab) - (mean-delta-TSS-control)
I'd agree X =< -2 is solidly significant, and X =< -3 is blowout. You two are about a half point more certain than I.
Two comments:
- my w.a.g. is that both groups improve
- the unknown unknowns ...
Iddrisw - it's been several decades - is a logistic regression how you get significance levels for an odds ratio?
I may even go paint something just so I have some paint to watch dry.
Sounds reasonable to me.
I don't know for sure. I understand the clinical reason you might suspect this - interesting. My first reaction is that the variability of the placebo group is the larger contributor to the standard error due to lower numbers, so significance would benefit if that group had a lower relative SD.
It's interesting - both underlying groups might be skewed, but the combined distribution symmetric. However, that's not the case under the null hypothesis.
Again, I think the "unknown unknowns" dominate here. The rest here becomes musings.
However, I like your approach: in a normal distribution 2/3 of observations are within 1 SD of the mean. The distribution of changes (delta-TSS) runs pretty much from -8 to +8. My w.a.g. might be 2/3 between -4 and +4, so I'd guess 4 as an SD. Maybe that's high.
Then, in normal theory, it's just a t-test. Standard error = sqrt((s1^2/n1)+(s2^2/n2)) = sqrt((4^2/56)+(4^2/28)) = .93.
Lastly, the half-width of the confidence interval would be either 1.64 or 1.96 times that SE (one-sided or two-sided at 5%), requiring an observed difference of about 1.5 to 1.8 units in average delta-TSS for significance.
Fun for a Saturday morning, but, again, it ignores the "unknown unknowns."
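For anyone who wants to replay the back-of-envelope t-test above, here is a minimal sketch. It assumes the guessed SD of 4 in both arms, the 56/28 split, and a normal approximation (z of 1.645 or 1.960 rather than exact t quantiles). The function name is just for illustration.

```python
import math

def required_difference(sd, n_treat, n_control, z_crit):
    """Return (standard error, smallest difference in mean delta-TSS reaching z_crit).

    Assumes the same guessed SD in both arms and a normal approximation.
    """
    se = math.sqrt(sd**2 / n_treat + sd**2 / n_control)
    return se, z_crit * se

# Guessed SD of 4 points, 56 leronlimab patients, 28 controls.
se, need_one_sided = required_difference(4.0, 56, 28, 1.645)
_, need_two_sided = required_difference(4.0, 56, 28, 1.960)

print(f"standard error        = {se:.2f}")
print(f"needed, one-sided 5%  = {need_one_sided:.2f}")
print(f"needed, two-sided 5%  = {need_two_sided:.2f}")
```

Change the SD guess and the required difference scales in direct proportion, which is why the SD assumption matters so much.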
I don't have a standard deviation to run p-values - just noting the shape of the distribution (not unfriendly) and what data points have leverage.
Actually, I'm as concerned about the "unknown unknowns," to use an old phrase. For example, in HIV, when thinking about when we would get a PDUFA date, I didn't consider anything about 700mg data being needed from a subsequent trial and how that could be a problem.
Note: The endpoint is CHANGE in total symptom score ("TSS"). A patient who came in "all mild" would have baseline TSS = 4, so their change ("deltaTSS") score could run from -4 to +8. An "all moderate," with baseline TSS = 8, could have a deltaTSS from -8 to +4.
This spreads the numbers out much more than current modelers are assuming. Some nice properties: the distribution will be more symmetric and likely not that far off normal. Fewer ties helps in non-parametrics.
There will be lots of data points with low leverage between -4 and +4, but significance will be won with the high-leverage data points at the extremes: Are moderates who got healthy (deltaTSS =< -5) largely on leronlimab? Are the milds that deteriorated (deltaTSS => +5) largely placebo?
I'm not sure all modelers understand what SDs they are (or should be) assuming.
Rockleo - all those data detail queries - aren't they done before the data is unlocked, which is before it is unblinded?
Let me purely speculate on one alternate explanation for the delay: the results are good enough to create a crisis in the FDA about how to respond. You know they'll get beat up either way - despite the p-value, it's a lot of policy to base on results from 84 people.
I've misplaced the link to Patterson speaking at Beckman Coulter. Can someone help? Thanks!
Most likely because Amarex hasn't updated the site, but one could speculate they are leaving them open for an extension of the trial into Phase III.
By data points I mean events - that is, deaths. The extra events come partly from a somewhat larger sample size than the M/M, but more from my guess about death frequency.
If a very small fraction of people die, it takes a really big trial to detect improvement.
I used 120 patients just because I know they have 100 with data and I assume a few more while they mess with the data.
Maybe a guess, but I'd say an educated one.
Try this for an S/C mental image: assume progression in M/M means an SAE; in S/C it means death. We are targeting cutting progression in half in either case.
On the S/C, yes, we only have a binomial. But we do have a somewhat larger sample size and, I'm guessing, a lot higher progression frequency (deaths in S/C) than the 21.4% SAE rate in M/M. Combining those, I'm thinking 50-75% more data points.
Difference of binomials, 120 pats,
Deaths (C=30%,L=15%)==> p=.035
Deaths (C=40%,L=20%)==> p=.012
Deaths (C=50%,L=25%)==> p=.003
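A minimal sketch of a difference-of-binomials calculation like the one above. The post gives only the 120 total, so the 80/40 (2:1) split, the unpooled standard error, and the one-sided test are all assumptions on my part. Under those assumptions the results come out close to the three posted p-values.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def one_sided_p(rate_control, rate_treat, n_control, n_treat):
    """One-sided p-value for rate_control > rate_treat, unpooled normal approximation."""
    se = math.sqrt(rate_control * (1 - rate_control) / n_control
                   + rate_treat * (1 - rate_treat) / n_treat)
    return normal_upper_tail((rate_control - rate_treat) / se)

# ASSUMPTION: 120 patients randomized 2:1, i.e. 80 leronlimab and 40 control.
N_TREAT, N_CONTROL = 80, 40

scenarios = [(0.30, 0.15), (0.40, 0.20), (0.50, 0.25)]
p_values = [one_sided_p(c, l, N_CONTROL, N_TREAT) for c, l in scenarios]

for (c, l), p in zip(scenarios, p_values):
    print(f"Deaths (C={c:.0%}, L={l:.0%}) ==> one-sided p = {p:.3f}")
```

A pooled standard error or a two-sided test would move these numbers, so treat them as one plausible reading rather than the definitive calculation.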
Fair response - agree with the positives. I'm glad to hear I may be being an Eeyore at 80:20. Were you above 60:40 before?
My remaining concerns: (1) people getting better by day 14, (2) some SAEs in control group could be drug related, which won't count in primary endpoint.
I admit I haven't been a practicing consulting biostatistician since 1981, but, damn, I can't imagine what they are messing around with now.
The number crunching, table and graph generating software should have been completed and tested long ago. The methods section of the paper hasn't been waiting on anything to write.
Multiple versions of the abstract, auto-populated with numbers from the stat software, should have been prepped. The results section is even easier to auto-insert numbers and summary sentences.
Diagnostic software for finding outlier data points, and analyzing their impact on results is pre-existing.
Aargh! Then again, Amarex appears to have been less than flawless in HIV BLA execution - let them take all the time they want.
Before I saw the SAE data, I was 60/40 as to whether the primary endpoint would be statistically significant. Now my (internal) odds are 80/20. I feel close to certain they will be clinically significant.
My main concern was that not enough people would progress to confidently detect a reduction.
I'm among the crew that see the S/C trial as in leronlimab's wheelhouse. It's the heavyweight battle - the Trial of the Sickest - because our exclusions are so few. Sadly, many will die, but that makes a 50% reduction easier to detect.
SAEs are efficacy data in this environment. They are defined as any "untoward medical occurrence," whether caused by the treatment or the disease, that either (a) causes or extends a hospital stay, or (b) is life threatening or causes death. Because drug-caused adverse events are potentially significant, SAEs are arguably a more complete measure than the primary and secondary endpoints in the trial.
As a solo endpoint, reducing the percent of patients with an SAE from 21.4% to 8.9% (a 58% reduction) was nearly significant, with a one-sided p-value of .07.
Presumably, the more detailed clinical metrics will offer higher resolution (and lower p-values) than the blunt instrument of SAEs. (Also note a slightly different metric, total SAEs per patient, dropped 64%.)
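Here is a rough check of that one-sided p-value, using the raw counts posted in this thread (5 of 56 leronlimab patients and 6 of 28 controls with an SAE). The normal approximation and the choice between unpooled and pooled standard errors are my assumptions. With counts this small, an exact test could give a somewhat different answer.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def sae_p_values(events_treat, n_treat, events_control, n_control):
    """One-sided p-values (control rate > treatment rate), unpooled and pooled."""
    p_t = events_treat / n_treat
    p_c = events_control / n_control
    diff = p_c - p_t

    # Unpooled: each arm contributes its own variance.
    se_unpooled = math.sqrt(p_t * (1 - p_t) / n_treat + p_c * (1 - p_c) / n_control)

    # Pooled: one common rate under the null hypothesis of no difference.
    p_all = (events_treat + events_control) / (n_treat + n_control)
    se_pooled = math.sqrt(p_all * (1 - p_all) * (1 / n_treat + 1 / n_control))

    return normal_upper_tail(diff / se_unpooled), normal_upper_tail(diff / se_pooled)

p_unpooled, p_pooled = sae_p_values(5, 56, 6, 28)
print(f"unpooled one-sided p = {p_unpooled:.3f}")
print(f"pooled   one-sided p = {p_pooled:.3f}")
```

The unpooled version lands near the .07 quoted above. The pooled version comes out a bit lower, which shows how sensitive "nearly significant" is to the method chosen.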
OK - Everyone else has complained. Here's how the PR should have been written - IM(not so)HO. Excuse table formatting.
Leronlimab in COVID: Sixty percent reductions in Serious Adverse Event ("SAE") safety metrics in controlled trial. Efficacy data upcoming.
----
In an 86-patient randomized controlled trial, leronlimab reduced the percentage of patients with SAEs by 58.4% and the total number of SAEs per patient by 63.8%. Considering any level of adverse event, leronlimab decreased patients impacted by 32.2%.
The percentage of leronlimab patients with SAEs was 8.9%, versus 21.4% in the control arm of the trial. The control arm received "standard of care" treatment, which typically included multiple other drugs.
Because the definition of Serious Adverse Events includes both disease-caused and drug-caused events, they provide possible insight into the more detailed clinical efficacy findings that will be released as available. No drug-caused SAEs were recorded among leronlimab users.
Raw numbers are as follows:
Totals                    Leronlimab   Control
Number of patients            56          28
# patients with SAEs           5           6
Total SAEs                     8          11
# patients, any AE            19          14

Per patient               Leronlimab   Control   Relative change
% patients with SAEs         8.9%       21.4%       -58.4%
Total SAEs per patient       .142       .392        -63.8%
% patients with any AE      33.9%       50.0%       -32.2%
SAE definition: in a clinical trial context, the FDA defines an SAE as any "untoward medical occurrence," however caused, that (a) causes or prolongs a hospital stay, or (b) is life threatening or causes death.
Drug related SAEs are called SARs - serious adverse reactions. Not relevant to the calculation, but good not to have on your record.
M/M patient pop: although most recruiting centers were clearly hospitals, the inclusion criteria don't require patient hospitalization. Rockleo: do you think your patients are representative of the trial?
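The per-patient figures in the table above can be rechecked from the raw counts. This is a minimal sketch, and the helper name is just for illustration. Computed at full precision, the relative changes come out a tenth or two of a percentage point away from the posted ones, which look like they were derived from already-rounded rates.

```python
def relative_change(count_treat, n_treat, count_control, n_control):
    """Return (treatment rate, control rate, relative change of treatment vs control)."""
    rate_t = count_treat / n_treat
    rate_c = count_control / n_control
    return rate_t, rate_c, rate_t / rate_c - 1.0

N_LERONLIMAB, N_CONTROL = 56, 28

# Raw counts as posted: (leronlimab, control).
rows = {
    "patients with SAEs": (5, 6),
    "total SAEs per patient": (8, 11),
    "patients with any AE": (19, 14),
}

results = {
    name: relative_change(t, N_LERONLIMAB, c, N_CONTROL)
    for name, (t, c) in rows.items()
}

for name, (rate_t, rate_c, change) in results.items():
    print(f"{name:24s} {rate_t:.3f}  {rate_c:.3f}  {change:+.1%}")
```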
Thanks for putting it all in one place.
Would you prefer Patterson delay a day to Tuesday night to bask in results and show a few new charts?
Has Dr. Drew pre-recorded Dr. Yo? Or will he put the M/M results on big media just before the market closes?
Anyone wish Patterson would delay his web interview from Monday night until Tuesday?
Also, does it sound like Dr. Yo's Tue interview on Dr. Drew (unless pre-recorded) could be the main-stream media premiere of the M/M trial results?
(Correct me if dates are wrong.)
Patterson's "phosphorescent" leronlimab: (1) hearing of this technique in trials of other drugs over last year or two - critical to measurements like "receptor occupancy" - even hearing of "bar coded" tags (2) I'm glad this is OK with the FDA and isn't too much of a modification of the drug.
I'm listening to the arguments as to why the m/m or s/c trial is more likely to be statistically significant. But is "significant" all there is to "important?"
I've been seeing another figure of merit: how many patients do you have to treat to save one life? Clearly, that metric favors using the (presumably limited) doses of leronlimab in the s/c setting, where death is otherwise imminent for a substantial fraction of patients, even though a few patients might be unsalvageable.
For the m/m benefit calc, there would need to be some estimate of what fraction of patients "saved" by leronlimab (from progressing to s/c) would have eventually died.
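To put rough numbers on the "patients treated per life saved" idea, here is a minimal sketch. Number needed to treat is 1 divided by the absolute risk reduction. Every rate below is hypothetical: the 30% vs 15% s/c mortality is just one of the what-if scenarios floated in this thread, the m/m progression rates borrow the SAE percentages as a stand-in, and the fraction of progressing patients who would eventually die is a pure placeholder.

```python
def number_needed_to_treat(control_death_rate, treated_death_rate):
    """NNT = 1 / absolute risk reduction."""
    arr = control_death_rate - treated_death_rate
    if arr <= 0:
        raise ValueError("no absolute risk reduction, so NNT is undefined")
    return 1.0 / arr

# S/C setting. HYPOTHETICAL mortality: 30% control vs 15% treated.
nnt_sc = number_needed_to_treat(0.30, 0.15)

# M/M setting. HYPOTHETICAL: progression falls from 21.4% to 8.9%, and a
# placeholder 30% of patients who progress would eventually have died.
HYPOTHETICAL_DEATH_FRACTION = 0.30
nnt_mm = number_needed_to_treat(0.214 * HYPOTHETICAL_DEATH_FRACTION,
                                0.089 * HYPOTHETICAL_DEATH_FRACTION)

print(f"s/c: treat about {nnt_sc:.1f} patients per death avoided")
print(f"m/m: treat about {nnt_mm:.1f} patients per death avoided")
```

The m/m answer is driven almost entirely by that placeholder fraction, which is exactly the estimate the post says would be needed.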
With the robust Dr Been / Dr Patterson agenda in the tweet, there will be no time for them to discuss clinical trial results.
Stat significance calculator, mess around yourself:
https://www.wolframalpha.com/input/?i=two+proportion+hypothesis+test&lk=3
Remember: (1) this is a difference-in-proportions test. Some papers have reported "risk ratios" and "odds ratios" - I don't know about those tests, and (2) it's not clear if we get a one-sided or two-sided test.
MM also said the annual report's audited financials were the limiting factor - the NASDAQ would want to spend time with them.
They are due 8/14 - hard to accelerate by more than a week. So, with NASDAQ review time, back to 8/14 earliest.
"Cytodyn FDA HIV Pause: 95% ready to file, last 5% addressable with no new data." That's my headline and take-away. Heck, the FDA could have said Samsung wasn't ready to make it. I'll definitely take this outcome. AF's got nothing.
Your suggestion is called a "meta-analysis" of the two trials.
One approach: both trials collect a "7-point ordinal scale" at 0 and 14 days (1=death, then declining oxygen support, 7=out of hospital no restrictions). I think (it's been a long time) you compute each person's change and use a 2-way Kruskal-Wallis test.
I raised this in an email to NP a month ago - I'm sure he and Kush are making all these arguments. I can't say I've heard of the FDA doing precisely this.
Chuckles - if a statistics course were administered as a clinical trial treatment (I guess the placebo arm would be anecdotes), would that be unjust cruelty to the "patients?"
Yep, us stats people are happy to play second fiddle, as long as the first fiddler is really top notch. Hey, way back in the 1980s I was a co-author on a JAMA article - one of only a few non-MDs to have such a cite that year. As I said, the lead author was really good.
Phase II trials are usually about assessing dose-response. That's the main point I'd add to your summary.
Typically there will be 2 or even 3 dose levels given to subgroups of the treatment arm. The objective is to set the optimal dose for the much larger, more costly Phase III trial.
Leronlimab's Phase II is logically like a Phase III, with only one dose level. For that reason, I could conceive of the FDA extending this trial into a II/III, but not likely requiring a start-over.
Do you really need, say, 400 patients? Not if the result is pronounced and consistent. Statistical precision only increases with the square root of patients: 4x patients ==> 2x precision.
Put another way, if a drug's underlying effect is twice as large, a 100 person trial has the same chance of being significant as a drug with the lower effect in a 400 person trial.
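The square-root rule above is easy to check numerically. This is a minimal sketch assuming two equal-sized arms, a common SD, and a plain z-statistic for a difference in means. The effect sizes and the SD of 4 are illustrative, not trial data.

```python
import math

def z_statistic(effect, sd, n_total):
    """z for a difference in means with n_total patients split evenly into two arms."""
    n_per_arm = n_total / 2
    se = sd * math.sqrt(2 / n_per_arm)
    return effect / se

# Illustrative numbers only: effect of 2 units in 100 patients
# versus an effect of 1 unit in 400 patients, same SD.
z_big_effect_small_trial = z_statistic(effect=2.0, sd=4.0, n_total=100)
z_small_effect_big_trial = z_statistic(effect=1.0, sd=4.0, n_total=400)

print(z_big_effect_small_trial, z_small_effect_big_trial)
```

Both calls give the same z, so the two trials have the same chance of reaching significance.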
Did the FDA possibly reset the BLA clock when the "remaining tables" were provided on May 11th? If so, July 10 is 60 days, when they can reply, and not 74 days, when they have to reply.
In what I've read, the FDA only communicates really good (priority review) or really bad (refuse to file) news at the 60 day mark. This would make NP's use of the word "could" in the June 8 PR precisely correct.
I am likely entirely wrong, and NP will have a PR on the BLA Monday morning, but, if not, this is one of many possible "no biggie" explanations.
My guess is, with FDA resources strained with COVID work, and this approval being focused on a combo therapy, it might be hard to get an FDA commitment for a priority review right now. That wouldn't show disrespect for leronlimab.
I like the meta-analysis idea. Even if just on the two trials. There is one measure they have in common: 14-day change in "7-point ordinal scale." The scale runs from 1 for death, then levels of oxygen support, through 7=discharge, no restrictions.
Maybe a (2-way?) ANOVA on ranks? I forget this stuff.
Or go real meta and use Z-scores on the two: eg P(z-mm>x and z-sc>y)
My odds (though, given the weekend, I'd give it until Tue AM):
0% May 11 - refuse to file
30% May 11 - no FDA comment
60% May 11 - FDA Priority Review, but no schedule yet
5% April 27- Priority review and PDUFA schedule
5% April 27- Standard review and PDUFA schedule
Remember, the AIDS conference runs through Friday, and they liked the SHIV study enough to include it in their "breaking news" section.
When did the FDA start the clock? April 27 with the main filing or May 11 when additional data tables were provided?
If April 27, Saturday is 74 days and the FDA has to respond.
If May 11, Saturday is 60 days, and there is an optional FDA response (per my reading) only in case of really good news (priority review) or really bad news (refuse to file).
All just my understanding.
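The two clock scenarios can be counted out with simple date arithmetic. This is a minimal sketch, and the year 2020 is my assumption since the posts give only month and day. The day counts between April and July do not depend on the year.

```python
from datetime import date, timedelta

ASSUMED_YEAR = 2020  # assumption: the posts give only month and day

main_filing = date(ASSUMED_YEAR, 4, 27)       # clock starts with the main filing
tables_provided = date(ASSUMED_YEAR, 5, 11)   # clock starts with the remaining tables

day_74_from_filing = main_filing + timedelta(days=74)
day_60_from_tables = tables_provided + timedelta(days=60)

print("74 days after April 27:", day_74_from_filing.isoformat())
print("60 days after May 11:  ", day_60_from_tables.isoformat())
```

Both counts land on the same calendar day, July 10, so the same date is either a mandatory or an optional FDA response depending on which clock applies.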
Hey, Chuckles, I hadn't done the back part of the math. There is probably a filtering process that reduces volume. Also, (lacking your specific data) if you are telling me the entire annual Samsung capacity is 460b (52x9) at street value (not cost) that would seem off but not nonsense. Thanks for the sanity check. Studhoss could help.