
Elmer Phud

05/17/03 1:50 PM

#4702 RE: sgolds #4700

sgolds -

smooth2o, do not confuse 'recall' with 'cancel'. The bug that Intel documented can cause data corruption and crashes, and is not due to "hardware or software bugs" (I guess it means that the problem is in the processor, not the mobo). This is all by their own admission.

Let's be clear here. The problem affects only a small number of systems, and Intel has offered to replace any affected processor. There have been occasions where AMD had defective product in the field and made the same offer. There's a difference between replacing a defective gaming processor and replacing a mission-critical one, but either way there's a field test and a free replacement.

This brings up a more serious problem. This is another testing issue, or test escape. If the device runs reliably at a slower speed, then it should have been binned out that way. Why wasn't this speed path found earlier and a proper screen put in place? Sounds simple, but the problem of identifying speedpaths and screening for them in the few seconds the device sits on the tester is what PhDs are made of and hundreds of millions of dollars are spent on. Defects fall into two categories: hard and soft. The hard ones are easy to catch because they appear to be stuck in a given state. A gate whose output is stuck in a given state will look stuck to every other node it connects to. A soft defect may be a signal delay from that gate to only one single other point. There are tools available to quantify the number of hard defects a test suite catches, but there is no effective tool to even identify how many soft speed paths exist, much less how effective a test suite is at catching them. The sequence of cycles needed just to condition the part so that some obscure internal state can be established and observed exceeds the memory capacity of even the most expensive test systems after only a few milliseconds of actual functional device operation. To test every possible speedpath is impossible even under lab conditions and certainly hopeless in a production environment. If you think this is carelessness, then you don't understand the size of the problem.
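
To make the hard-defect half of that concrete, here's a toy sketch of what "quantifying the hard defects a test suite catches" looks like. Everything in it is made up for illustration: the three-input circuit y = (a AND b) OR c, the node names, and the `simulate` and `stuck_at_coverage` helpers are my own assumptions, not anything from Intel's or anyone else's test flow. It only models stuck-at faults. A soft speedpath defect depends on timing, which a logic-level model like this can't see at all, and that's the point.

```python
from itertools import product

# Hypothetical toy circuit: y = (a AND b) OR c, with one internal node n1.
NODES = ["a", "b", "c", "n1", "y"]

def simulate(a, b, c, fault=None):
    """Return output y; fault is None or (node_name, stuck_value)."""
    def val(name, v):
        # If this node is the faulty one, force it to its stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return v
    a = val("a", a)
    b = val("b", b)
    c = val("c", c)
    n1 = val("n1", a & b)
    return val("y", n1 | c)

def stuck_at_coverage(vectors):
    """Count stuck-at faults that at least one test vector exposes."""
    faults = [(node, v) for node in NODES for v in (0, 1)]
    detected = [
        f for f in faults
        if any(simulate(*vec) != simulate(*vec, fault=f) for vec in vectors)
    ]
    return len(detected), len(faults)

# Exhaustive test set versus a single all-zeros vector.
print(stuck_at_coverage(list(product((0, 1), repeat=3))))
print(stuck_at_coverage([(0, 0, 0)]))
```

On a circuit this small you can just enumerate every fault and every input. The speedpath problem has no equivalent fault list to count against, so there's no denominator for the coverage number.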

The other shoe has not dropped.

No, it hasn't. Do you think AMD lives in another universe where defects don't happen?



wbmw

05/17/03 6:15 PM

#4711 RE: sgolds #4700

Sgolds, Re: Will Madison have the same bug? Will Madison be delayed for a rework of the layout, or perhaps be limited in clock rate also?

They already said that Madison does not have the problem.

In fact, not all McKinley chips have the problem, either. Intel apparently has a detection script that can tell a bad CPU from a good one. It also sounds like the particular failing instruction sequence is well known. If an Itanium 2 system runs software that would never use the particular instructions, it does not need to be replaced (though Intel is offering replacements for all CPUs that fail the detection script).