Wbmw:
Name one 64 bit CPU that can run its 32 bit OS and still use its larger 64 bit register set without a recompile. Couldn't do it. Alpha and Power required recompiles because they were brand new ISAs. MIPS required a recompile to use 64 bit addressing and 64 bit data types, which required rewriting the OS & drivers for 64 bit operation. Ditto for SPARC. Your objections are worthless.
8080 to x86 needed a reassemble going from 8 to 16 bits, but getting the most out of it required rewriting for the new 1MB address space and x87. X86 to x86-32 needed the same things going from 16 to 32 bits. Yet everyone I know liked the x86 to x86-32 (386) transition, which got back to flat addressing from the segmentation models that existed before. But it still needed a recompile to use the 32 bit modes (and the OS & drivers to get the 4GB address space).
Yet the conversion to x86-64 was so easy that many 64 bit applications could be ported to Long mode quickly (optimization took longer). Most OSes that were successfully using 64 bits on other ISAs (CPUs) were ported before there was working silicon (on the simulators). With hardware becoming available, most just dropped in and worked. That points to how well the transition was done. Many of the changes were welcomed (making flat addressing required, which gets rid of the segmentation stuff, is a biggie).
So I think that they cut out the parts of the medusa that were no longer needed or wanted. Pointed to other things that will someday go away. And still allowed backwards compatibility, if desired. A fact that Intel and others fail to grasp. Grandfathering and not having to spend to change software are king. This King keeps x86 in its various flavors at the top of the revenue and profit heap. Every time Intel tries to cleanly break away, they have gotten burned (iAPX 432, i860, i960 and now IPF).
Pete
Dear Chipguy:
EPIC is just warmed over VLIW. All of the changes were to get more performance from the hardware when both it and the compilers weren't able to deliver on the real world performance promises. EPIC has the same disadvantages that most VLIWs have when applied to the general purpose computing world. Their best advantages come in predictable, narrowly targeted applications. There they can do quite well.
EPIC has failed in its original targets like all of the other VLIWs. On the same transistor budgets, the non explicitly parallel instruction CPUs (RISC and CISC) seem to blow it away.
Pete
Wbmw:
A lot of things in AMD64 were done with reference to what people would want in a 64 bit extension of x86-32. Some features were added at the request of programmers and others. How many times did we see 64 bit extensions done by others that were dictated by someone on high without regard to end users and others in the community? That is what Intel did in giving us the 286, the 386 and so on.
Of course it looks good in retrospect, but that is hard to get right the first time. And Intel's version first couldn't be done without large sacrifices (their way of course) in performance and they needed (highly likely wanted) to go a new direction. Then their x86-64 flavor was not as good for Microsoft, so they had to go AMD's route with AMD64. The customers made it clear: either Intel got with the program or they would leave in droves.
The innovation you fail to see is that AMD, with so much smaller resources, did what Intel consistently seems not to do: listen to its customers. They didn't want to throw their x86-32 investment away. They did want to incrementally upgrade to 64 bit. You cite it as a disadvantage, the slow move to 64 bit. It is because of AMD64 that the move to 64 bit can be so slow. IA64 requires a leap of faith (and it doesn't succeed that often either). Customers like to slowly advance to their goals without taking huge risks. They hate it when they must leap and have a good chance of crashing and burning.
How many of those MIPS customers asked for MIPS64's way? How many SYSTEM 370 users asked for Power? How many SPARC customers asked for SPARC64? How many VAX customers asked for Alpha? How many Pentium customers asked for IPF? Not many. Many asked for AMD64 instead of IPF.
By your narrow idea of what constitutes innovation, you wouldn't consider air brakes for trains to be an innovation. RISC back ends weren't innovations. PCI wasn't an innovation. The web wasn't an innovation. The Internet wasn't an innovation because computers talked to other computers before. Heck, even the i4004 wasn't an innovation because people built CPUs with transistors before. Good innovations always seem so small, easy and obvious, after the fact. It is hard to make them in such a way that they seem easy and obvious.
I do agree that some so called innovations weren't. Like IPF, as VLIWs were done by many before and since. Like RDRAM, which was built on things that are commonplace elsewhere. Like long pipelines and clock zones, of which there are many previous implementations. But marketing people hype them to be the next best thing after sliced bread.
And they did a lot more than simply extend and boost the number of registers. There was the elimination of addressing modes, the deciding of the default types, what instructions to drop and all of those details that make people wonder "why did they ...?" when that wasn't obvious. If it were easy, Intel would have delivered the exact same thing, and it didn't, in either performance or capabilities.
All in all, it just sounds like sour grapes and "Why didn't I(ntel) think of it?"
Pete
Dear Mmoy:
Yes, in the real world, sans Intel "dirty tricks", the compilers of choice are gcc and MSVC. There, reliability, repeatability and portability mean far more than the last 1% reduction in run times.
Pete
No, it's Kate and Chipguy who are in the minor leagues.
Chipguy:
When looking at SPECint, make sure the two use the same compiler, etc. Dothan uses ICC 8.0 whereas the A64 3200+ (2GHz) submission uses ICC 7.0. A test of Opteron 270 (2x2GHz) using ICC 8.0 yields 31.5 SPECint2000_rate peak. That works out to 1575 SPECint2000 peak, which is higher than the 2GHz Dothan 755. Many fall into that trap of not making all else equal. That's also a big problem when there is only one test with a Pentium M.
Using gcc or MSC makes the Dothan worse than an equally clocked A64. And in SPECfp peak, A64 3200+ and Opteron 246 stomp all over Dothan 2GHz. They are 45% faster per clock using the Pathscale compiler versus Dothan's ICC 8.0 and IFC 8.0. It's not so nice when the shoe is on the other foot, is it?
Pete
Dear Tenchu:
I can use the same argument:
Stockholders' equity:
In millions of dollars   Q2/04    Q2/05   Difference
Intel                   38,593   37,614         -979
AMD                      2,520    2,847          327
Result, Intel has been losing scads of its equity as it is. A price war would make it lose 5 times what AMD would. And it has to first shear $3 billion of its quarterly revenue just to start by matching AMD's ASPs. And before they can do that, they must at least match AMD's chips in performance, power and capabilities. Else they will have to go even lower, like a bin down or two, on ASP.
That's why it used "dirty tricks". It couldn't "crush" AMD in open competition. It couldn't win using performance, since they are behind. And a price war without those "dirty tricks" would quickly use up that cash cushion (if there is any left after the lawsuit). AMD has succeeded when both "dirty tricks" and a price war were used on it before. Intel has not been in that position recently. The last time they were, in DRAM, they gave up. If you want, they also gave up in the web hosting business and in the CE TV DLP chip business. But those didn't hit a core business.
Pete
Dear Tenchu:
AMD had chipsets in CPG for my GM calculations. The GM calculations were for flash being pulled out of each (I assumed that Intel's GM for flash is the same as Spansion's (although that may be giving Intel the benefit of the doubt)). That means I included "other" as well (PIC, embedded, math library writing, etc.). If you want to argue that chipsets should be pulled out of each, fine. But doesn't Intel use their chipsets to sell CPUs? Else where would Centrino be?
So without chipsets especially Centrino ones, we should eliminate the 90% of mobile CPU revs as well. Why don't you take out the whole division and we look at only DEG? The big problem is removing the COGS portions of the other divisions. My estimates for flash COGS are conservative (smaller than they probably are for AMD and higher than they likely are for Intel). So you should show your calculations of GM by getting all those things out that you want (show those numbers, especially the guesses).
But you didn't show any calculations for Intel's GM. AFAIK, you made a WAG with no justification at all for your statement, "If Intel dropped their GMs down to AMD's, they would crush them". At least I put publicly available numbers and a couple of guesses on the flash COGS side into my calculations. I figure you just don't like the numbers because they don't fit your WAG.
You also must be assuming that the lawsuit fails. If it succeeds, all of your statements are "irrelevant".
Pete
Tench:
The only "bunk" is what you are shoveling. Take a look at the GM for Intel in Q2. Take a look at the GM for AMD in Q2. They are not as different as the ASP differential in GP CPUs. AMD would be trumpeting and holding big parties if their ASPs went above $100. And Intel's are much higher than that.
Let's look at the GMs (Q2, $ millions; GM = 100% * (Revs - COGS) / Revs):
                   Revs     COGS     GM
Intel             $9,231   $4,028   56.4%
Intel less flash  $8,703   $3,712   57.3%
AMD               $1,260     $766   39.2%
AMD less flash      $798     $434   45.6%
AMD (CPG)           $767     $398   48.1%
So taking out flash from both (since AMD's flash business isn't wholly owned by AMD and will be spun off completely later anyway), Intel's GMs are only 11.7 percentage points higher, or higher by about 25% in relative terms. To match them, a $1,879 million revenue haircut is needed for Intel (ASPs should drop about $46 (from $162 to $116)). On the other hand a $218 million rise for AMD would take their GMs up to Intel's (about a $26 ASP rise (from $90 to $116)).
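For anyone who wants to check the haircut arithmetic, here is a minimal sketch. It assumes COGS stays fixed while revenue moves, which is how the $1,879 million and $218 million figures appear to have been derived; the function and variable names are mine, not from any filing.

```python
# Q2 figures with flash removed, in $ millions, as quoted in the post.
INTEL_EX_FLASH = (8703, 3712)   # (revenue, COGS)
AMD_EX_FLASH = (798, 434)       # (revenue, COGS)


def gross_margin(revs, cogs):
    """Gross margin as a fraction of revenue."""
    return (revs - cogs) / revs


def revs_for_margin(cogs, target_gm):
    """Revenue that yields target_gm if COGS is held constant (assumption)."""
    return cogs / (1.0 - target_gm)


# Round to three places, matching the 57.3% and 45.6% used in the post.
intel_gm = round(gross_margin(*INTEL_EX_FLASH), 3)
amd_gm = round(gross_margin(*AMD_EX_FLASH), 3)

# Revenue Intel would give up to fall to AMD's margin.
intel_haircut = INTEL_EX_FLASH[0] - revs_for_margin(INTEL_EX_FLASH[1], amd_gm)
# Revenue AMD would have to add to rise to Intel's margin.
amd_rise = revs_for_margin(AMD_EX_FLASH[1], intel_gm) - AMD_EX_FLASH[0]

print(f"Intel GM {intel_gm:.1%}, AMD GM {amd_gm:.1%}")
print(f"Intel haircut ${intel_haircut:,.0f}M, AMD rise ${amd_rise:,.0f}M")
```

Dividing those two amounts by the estimated unit counts (41.0 and 8.5 million) gives the roughly $46 and $26 ASP moves.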
Let's look at ASPs (Q2 revs in $ millions, estimated units in millions):
         Revs    Units (est)   ASP
Intel   $6,659      41.0      $162
AMD       $767       8.5       $90
$6,659*90/162=$3,699 or $2,960 million less to merely match ASP. Given they are behind in performance, to "crush" AMD, they would have to go much lower. However dropping $2,960 million in revenue, they would have losses.
From the Q2/05 Intel earnings report:
Income before taxes was $2,754
ASP equalization adjust $2,960
Income before taxes now ($206)
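The same ASP equalization can be written out as a short sketch. It assumes, as the figures above do, that unit volume and costs stay unchanged so the whole revenue drop falls straight through to pre-tax income; the variable names are illustrative only.

```python
# Q2 CPU revenue ($ millions) and estimated units (millions), from the post.
intel_revs, intel_units = 6659, 41.0
amd_revs, amd_units = 767, 8.5
intel_pretax_income = 2754  # $ millions, from the Q2/05 earnings report

# ASPs rounded to whole dollars, as in the table.
intel_asp = round(intel_revs / intel_units)
amd_asp = round(amd_revs / amd_units)

# Intel revenue if it sold the same units at AMD's ASP.
matched_revs = intel_revs * amd_asp / intel_asp
revenue_drop = round(intel_revs - matched_revs)

# Assumption: costs do not change, so the drop comes straight out of income.
pretax_after = intel_pretax_income - revenue_drop

print(intel_asp, amd_asp, revenue_drop, pretax_after)
```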
Interesting note here is if both matched ASPs at $116 each, Intel would have operating profits of $875 million and AMD would have $232 million in operating profits at each other's current GMs.
Given some suggestions that AMD could get to $1 billion in CPG revenue by Q4 of this year, AMD could have GMs as high as (or higher than) Intel's this year and be surging ahead in both unit share and revenue share. Far from any "crush" of AMD.
I think you had better check your assumptions before labelling anyone else's numbers as "bunk".
Pete
Dear Tench:
To "crush" AMD, it would need to drop its ASPs to below $80. Considering that would cost Intel about $4 billion per quarter, or over $1 billion in losses every quarter, IMHO Intel could not get that past either the board or the shareholders.
Dropping ASPs to AMD's GM, wouldn't change the picture much as the ASP required to make money at Intel is far higher than the amount AMD needs to make money. It is likely to happen as AMD reduces the ASP differential. Intel can either allow AMD's ASPs to rise making AMD's GMs much higher than Intel's or be forced to drop its ASPs, lowering its GMs below AMD. Once Intel decides it can't stop the ASP differential's erosion, it will likely choose the first course. At least that makes it a win-win outcome (Intel makes large profits and so does AMD). The latter decision will be a loser for Intel's management no matter what happens to AMD.
Pete
The effect of stock options on Q1 05:
INTEL CORPORATION
NOTES TO CONSOLIDATED CONDENSED FINANCIAL STATEMENTS - Unaudited (Continued)
(In Millions-Except Per Share Amounts)                  Three Months Ended
                                                   April 2, 2005   March 27, 2004
Net income, as reported                                   $2,178           $1,730
Less: total stock-based employee compensation
  expense determined under the fair value method
  for all awards, net of tax                                 333              288
Pro forma net income                                      $1,845           $1,442
Reported basic earnings per common share                   $0.35            $0.27
Pro forma basic earnings per common share                  $0.30            $0.22
Reported diluted earnings per common share                 $0.35            $0.26
Pro forma diluted earnings per common share                $0.29            $0.22
SFAS No. 123 requires the use of option pricing models that were not developed for use in valuing employee stock options. The Black-Scholes option pricing model was developed for use in estimating the fair value of short-lived exchange traded options that have no vesting restrictions and are fully transferable. The company's employee stock options have characteristics significantly different from those of traded options. In addition, option pricing models require the input of subjective assumptions, including the option's expected life and the price volatility of the underlying stock, and changes in the subjective input assumptions can materially affect the fair value estimate of employee stock options.
The weighted average estimated value of employee stock options granted during the first quarter of 2005 was $6.51 ($15.00 for the first quarter of 2004). The value of options granted was estimated at the date of grant using the following weighted average assumptions:
                                  Three Months Ended
                             April 2, 2005   March 27, 2004
Expected life (in years)               5.6              5.3
Risk free interest rate               3.7%             3.0%
Volatility                             .28              .52
Dividend yield                        1.4%              .5%
In light of recent regulatory guidance, the company reevaluated the assumptions used to estimate the value of employee stock options granted in the first quarter of 2005. Management determined that implied volatility is more reflective of market conditions and a better indicator of expected volatility than historical volatility. Additionally, in the first quarter of 2005, the company began using the simplified calculation of expected life, described in the SEC's Staff Accounting Bulletin 107, due to changes in the vesting terms and contractual life of current option grants compared to the company's historical grants. Management believes this calculation provides a reasonable estimate of expected life for the company's employee stock options. The expected life for options granted in the first quarter of 2005 and the first quarter of 2004 included the impact of options with extended vesting periods granted to key officers and other senior level employees.
It is in "interest and other net". The 10Q makes this more clear.
Pete
Dear Mas:
Intel already includes the money paid to exercise the options in revenue. Thus 50% of the buy back should be deducted from profits. That makes $1.75 billion returned to shareholders, of which only 42% was paid out of profits and the other 58% out of cash. There were also some quarters where they didn't buy any shares back, which means more of each subsequent buy back was for deferred employee compensation.
In addition, since the buy backs paid more than book value for each share, the book value drops for each remaining share. Stock buy backs usually are a net negative for remaining shareholders. Many studies on the topic show this to be true. If you want to give money to shareholders, make a one-time special dividend pay out or increase dividends until cash is stable over time. That way all shareholders benefit not just those who get their shares bought.
Pete
Wbmw:
AMD already ships enough CPUs to supply the entire x86 server market. K8 Semprons (a little better than any Pentium Ms and far better than any Celerons), A64s (better than any P4s), A64X2s (better than any Xeon DPs) and Opterons (better than any Xeon MPs) can all be used in servers. And that does not include Athlon MPs or XPs.
AMD made more than 8 million CPUs last quarter alone. Are you saying the x86 server market is greater than 8 million CPUs per quarter? It is nowhere near that much. From IDC's Q4 2004 report: "Factory revenue for x86 servers grew 14.4%, while unit shipments grew 16.8% to 1.6 million servers". 1.6 million servers, where the average size is less than 2 CPUs, would mean that a supply of 3.2 million x86 CPUs a quarter is enough to supply the entire x86 server market. And since K8 CPUs are much faster than P4 Xeons on a per CPU basis, substantially fewer than 3.2 million K8s would be needed.
So the fact is proven. AMD can supply the entire x86 server market.
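The supply arithmetic above fits in a few lines. This is only a sketch of the upper-bound reasoning; it takes the post's premise that any of the listed AMD CPUs could go into a server, and it uses 2 CPUs per server as a ceiling because the average is stated to be below that.

```python
# Figures quoted in the post.
servers_per_quarter = 1.6e6     # IDC, Q4 2004 x86 server unit shipments
max_cpus_per_server = 2         # ceiling; the stated average is below 2
amd_cpus_per_quarter = 8e6      # "more than 8 million CPUs last quarter"

# Upper bound on CPUs the whole x86 server market needs in a quarter.
cpus_needed_upper_bound = servers_per_quarter * max_cpus_per_server

# How many times over AMD's total output covers that bound.
coverage = amd_cpus_per_quarter / cpus_needed_upper_bound

print(f"Need at most {cpus_needed_upper_bound / 1e6:.1f}M CPUs; "
      f"AMD output covers that {coverage:.1f}x")
```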
Pete
Dear Fpg:
Quick stab at damages for Server Market only:
X86 server market in 4/03 to 7/05, $22.5 billion.
AMD's should have at least got 25% of that, $5.5 billion very conservatively.
Amount actually received, $500 million.
Net Damages, $5 billion.
Award, $15 billion for Server Market alone.
The Desktop and Mobile Markets would have similar awards.
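A quick sketch of that damages stab, in $ billions. Two things here are my reading rather than anything stated above: the 3x step from $5 billion to $15 billion looks like the trebling of damages available under US antitrust law, and the $5.5 billion figure is a rounded-down version of 25% of $22.5 billion.

```python
# All amounts in $ billions, taken from the list above.
server_market = 22.5        # x86 server market, 4/03 to 7/05
fair_share = 0.25           # share the post says AMD should have had

exact_expected = server_market * fair_share   # 25% of the market
expected_revenue = 5.5      # the post's rounded-down "very conservative" figure
actual_revenue = 0.5        # what AMD actually received

net_damages = expected_revenue - actual_revenue

# Assumption: the 3x multiplier is treble damages; the post does not say so.
TREBLE = 3
award = net_damages * TREBLE

print(exact_expected, net_damages, award)
```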
Capacity issues are not as relevant, because if AMD had capacity issues, they would have gotten foundries to produce their low end products. In addition, Fab 25 would have stayed a CPU fab and been upgraded to 90nm Copper SOI (or even smaller geometry).
We all know what AMD would have done making $150 ASPs at 15 million CPUs they could have produced per quarter. $2.25 billion a quarter builds or upgrades a lot of fabs.
Pete
Wbmw:
Re: You know as well as I do that legality can be interpreted in many different ways. In this case, all you need to do is cast reasonable doubt that AMD would have gained no more market share than if every one of the proven allegations had never happened. (Notice that I said "proven" as well, because any unproven allegations are moot.) Do you think this task is insurmountable? Keep in mind that casting reasonable doubt is a lot easier than proving beyond a shadow of a doubt, especially in front of a judge as opposed to the willing believers on this thread
It is in fact easy to prove that AMD could supply 100% of the x86 Server Market. Since they had less than 10% revenue share last year, it is evident that they could have gained more share had Intel not performed illegal acts in 2003 and 2004.
Re: Again, can this be proven? How does one put a quantitative value on how much any *proven* allegations have lessened AMD's ability to manufacture and ship said parts? I assume you'd need to subpoena AMD's yields and bin splits, as well as their entire test, sort, and shipping process to find out."
Again you are barking up the wrong tree here. All they need prove is that AMD received less revenue than it would have if Intel didn't act illegally. I think it easy to prove that Intel's acts both reduced unit volume and ASP for AMD CPUs. Interfering with the Opteron launch alone effectively proves harm to AMD and that is illegal no matter the size of the company doing it. That renders your arguments moot.
Pete
Dear Tenchu:
Intel is a monopoly. It has market power. Else Prescott would have never sold. Or Willamette. Or the original Celeron without cache. A few mistakes do not keep them from being judged a monopoly.
To think otherwise is wishful thinking. Can you think of another company having 90% or more of a $10 billion or more a year market revenue where it wasn't thought to be a monopoly?
Pete
Dear Tench:
Not so easy. AT&T was ruled a monopoly even though there were two major competitors, Sprint and GTE, in the local lines business. They had two major competitors in the long distance end, MCI and Sprint. They also had another in the telecom equipment market, Northern Telecom. AT&T was broken up.
Yet AT&T couldn't push video telephones. Monopolies can't force end users to buy what they clearly don't want or are unwilling to pay the price for. RAMBUS and IPF couldn't convince the end users of their desirability for the asking price. Strong arming the companies in the middle doesn't help in those cases.
Failing to push something on the market does not keep you from being a monopoly. Being inept isn't a defense. Successfully forcing something on the market it doesn't want, proves you are a monopoly however.
Pete
It's a typical fallacy. Given A implies B, it does not follow that not A implies not B. Only not B can imply not A.
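If anyone doubts it, the point can be checked by brute force over a truth table. A small sketch (the helper name `implies` is mine): it keeps every assignment where A implies B holds, then tests whether the inverse and the contrapositive hold in all of them.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q


# Every (A, B) assignment in which "A implies B" is true.
rows = [(a, b) for a, b in product([False, True], repeat=2) if implies(a, b)]

# Inverse: not A -> not B. Fails when A is false and B is true.
inverse_holds = all(implies(not a, not b) for a, b in rows)

# Contrapositive: not B -> not A. Holds in every remaining row.
contrapositive_holds = all(implies(not b, not a) for a, b in rows)

print("inverse always holds:", inverse_holds)
print("contrapositive always holds:", contrapositive_holds)
```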
Tenchu:
Here's another: http://web.singnet.com.sg/~duane/merced.htm
Pete
Dear Dan3:
Take a look at the processors listed. What is a Turion ML-44? An A64 2.4GHz/1MB San Diego or a 2.4GHz/512KB Venice?
Pete
Wbmw:
You can't remember what you don't want to see. I gave references, you just don't like what they said. In 1998, Merced was to get above 1GHz. Then in 1999, it was to get to 800MHz. Later they were found to "not be stable" by HP. Someone here claimed they were only to do 600MHz. Consistently lowering the bar.
Face it Merced was slow. A lot slower than when Intel began hyping Itanium during development.
Now Intel got smart. They don't tell you the launch speed until near the launch. That way they can claim it met or exceeded "internal targets".
So what speed do you think, Intel's internal target for Montecito is? What is your target? At least then we can see if it meets it when it launches.
Pete
Wbmw:
Why take my word for it? Intel states this in the datasheet. They acknowledge that a "power virus" will exceed their published TDPmax. Thus it can not be used to compare against AMD's TDPmax which can NEVER be exceeded. Using the criteria AMD uses to determine TDPmax, Pentium M does not have a lower TDPmax.
Face it. You can not use different methods to compute TDP on different CPUs and then use those TDPs to compare power usage of those CPUs. Intel uses a different standard than AMD. It is a fallacy you keep trying to foist upon others. People have exceeded Intel's published TDPmax for Pentium M using programs like Prime95 and K7burn at nominal Vcc and clock. No one has made Athlon64 exceed AMD's TDPmax using any program at nominal Vcc and clock.
The question of which uses less power can only be correctly answered by scientific empirical means. That is, taking a sample group of each CPU (randomly selected), measuring each CPU's power usage against a battery of programs and taking the highest power usage of any CPU on any program. Then you compare the results. Unfortunately only AMD and Intel normally do this. The rest of us take a sample of 1 and run just a few programs.
I found such direct measurement results only for Athlon 64s and P4s. I have not found any direct measurements on Pentium Ms.
Pete
Now you claim that somewhere an AMD Venice uses 67.5W given any conditions within those specified by the datasheet. The best power virus found to date, K8burn, finds a usage of 30.4W for a 2.4GHz Venice. Now come up with a link to a higher usage program (with proof) using direct metering (actually measuring Icc and Vcc and then multiplying them).
Chipguy:
You don't like that reports show that HP and Intel were saying >1GHz for Merced in 1998. It was reported at many sites. Few sites keep unaltered records that far back that can be searched. We know now that even 800MHz "was unreliable" for Merced as stated by HP and Intel.
Here is the reference for Montecito as of 11/17/2002: http://haxor.dk/articles/intel.html Definitely after Mckinley was released.
Notice the 2004 debut of Montecito was missed. It shows here at 1.6-1.8GHz with 12MB L3. Chivano at 2.0-2.2GHz was to be released this year with 8MB L3. Has it been cancelled? According to this Intel presentation in 2003, it has: http://www.apac.edu.au/APAC03/Papers/Sponsors/David_Scott_ItPerf.pdf
Tanglewood is to be the next version. Here they get smart and say nothing of the clock, number of cores, L3 size or TDP, just >7 times the performance of the I2 1.5/6MB.
This site has Montecito at 1.6GHz in one place, 1.6GHz*1.1=1.76GHz with Foxton in another and a target of 2.5GHz in a third: http://translate.google.com/translate?hl=en&sl=zh-CN&u=http://www.pcpro.com.cn/topic.php%3Fi....
Ostensibly from Intel at ISSCC.
Pete
Wbmw:
Check at Intel: http://www.intel.com/products/roadmap/notebook.htm 2.13GHz PM 770 is largest listed there till December 2005.
You should check the "horse's mouth".
Pete
Wbmw:
Just look at your second reference. The 865PE makes the same 2GHz Dothan use 11W more power at idle and 9W more at load than the 855GME. And it doesn't have half of those differences. Your 1/10th is way off since with less than half, we are already at 1/4th. A big part of the difference is that where the 939 A64 uses two PC3200 DDR DIMMs simultaneously, the 865PE uses only one at PC2700 at any given time. Open DDR banks use a lot more power than closed ones. Clock speed also has a big influence on DDR power.
Try it yourself. Run a "power virus" like K8burn or Prime95. Feel your PC3200 RAM's temps. Then down clock them to PC2700. You can feel the lower temps. If you have a dual channel DDR board, put one in each channel and take the temp. Now put both of them into the same channel and take that temp. The latter one will be smaller. Do both at the same time. That is quite a bit lower in power. And the efficiency of the PS isn't 100% which will boost the differences at the wall plug.
Pete
Wbmw:
Using Intel's datasheets and AMD's methodologies to create an apples to apples comparison, I get just under 40W TDPmax for the 2.13GHz Pentium M 770. And just under 8.9W for the FSB, DDR and hub portions of the 855PM from its datasheet. If you want us to use TDPtyp from Intel's methodologies, the TDP for K8Burn is 30.4W for the 2.4GHz Venice that has a 67.5W TDPmax from AMD. Intel's TDP(typ) doesn't include a "power virus" like K8burn when measuring power usages, so we can take that as a loose upper bound.
So if TDPtyp is 45% of TDPmax for Venice (30.4W/67.5W), then a 35W TDPmax Turion has a 15.75W TDPtyp and a 25W TDPmax Turion has an 11.25W TDPtyp. Dothan 2.13GHz has a TDPtyp which is about 67% of the 40W TDPmax, so applying the same ratio to the Turions gets us to 23.62W/16.88W for the 35W/25W TDPmax parts. Either way, Dothan is more power hungry than Turion when the same methods are used.
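Here is the ratio arithmetic as a sketch so it can be rechecked. One assumption to flag: the post says "67%" for Dothan, but the 23.62W/16.88W results correspond to a ratio of 0.675, so that is what the code uses. The names are illustrative.

```python
# Venice: K8burn measurement over AMD's published TDPmax, both from the post.
venice_ratio = round(30.4 / 67.5, 2)

# Assumption: 0.675 is inferred from the post's 23.62W/16.88W results;
# the text itself rounds this to "67%".
dothan_ratio = 0.675

turion_tdpmax = {"ML (35W)": 35.0, "MT (25W)": 25.0}

for name, tdpmax in turion_tdpmax.items():
    typ_venice_method = tdpmax * venice_ratio   # about 15.75 W and 11.25 W
    typ_dothan_method = tdpmax * dothan_ratio   # about 23.6 W and 16.9 W
    print(f"{name}: {typ_venice_method:.2f} W (Venice ratio), "
          f"{typ_dothan_method:.2f} W (Dothan ratio)")
```

Under either ratio the Turion estimate stays below the matching Dothan figure (27W at the 0.675 ratio of a 40W TDPmax), which is the point being made.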
But the hardware isn't the same. That graph doesn't take into account many of the enhanced options on the Asus A8V (939) not present on the DFI 855GME. The 855 has 2 fewer DIMM slots, 2 fewer SATA ports, 1 less IDE port, 6 fewer sound channels, 1/10th the Ethernet speed, dual channel PC3200 vs single channel PC2700, no firewire and 4 less USB ports. Add in cards like a SB Audigy, a RAID IDE card with a DDR cache, a Gb Ethernet card and a firewire card and that system may go beyond that 44W under load.
As to the performance, it overclocks the AGP port by 20%. Do the same for the AMD systems and the Dothan will be further behind the A64s.
If they would measure direct to CPU power ala Lost Circuits, then you could remove all of that (well not the dual PC3200).
Pete
Wbmw:
This says different: http://haxor.dk/articles/intel.html
Pete
You are not great!
As for you, you look to be in some far off fantasy world: http://www.investorshub.com/boards/read_msg.asp?message_id=6215915
You try to later retract that somewhat, but as you were caught dead to rights by Keith, you lost any claim to greatness.
BTW, the above post is speculation. And false as the major stockholders are American companies and people.
Lastly, here is another of your lapses: http://www.investorshub.com/boards/read_msg.asp?message_id=6130746 Banias was designed by Israelis as a team effort that became public once the initial results were promising. Many faits accomplis happen at companies where management later makes them into success stories. There are just too many stories about how a group working for an entity makes something better and has the entity take full credit when it is released. A famous fictional one is the theft of 5 video games in "Tron".
Here is a famous one: the i4004 was a private effort by Intel for a customer to make a calculator IC that was publicized later when it became a fait accompli. Certainly the calculator company didn't think of spawning the MPU business. We wouldn't even be having this conversation now, if it wasn't for them. "Oh it never happens at Intel"! Just another time you put your foot in your mouth.
Pete
Chipguy:
Your revisions of history are typical when your company fails to meet the original targets. Itanium was to be released at only 600MHz. Here is a story from 1999 by the Register that states that HP had 800MHz as the original target, but that current samples barely make 400MHz: http://www.theregister.co.uk/1999/12/17/merced_slow_slow_quick_quick/ Just one month before: http://www.theregister.co.uk/1999/11/15/first_merceditanium_systems_get_cobbled/ For more on 1999 see: http://home.swipnet.se/~w-10554/intel_news_1999.html
Oh the original target wasn't 1GHz. Yeah right!
Here is an older look: http://www.theregister.co.uk/1998/10/15/intel_doctors_foster_to_extend/
This states that Merced was to be over 1GHz and that McKinley and Madison would be each higher than the previous incarnation. The names were kept, but the time was delayed by 2.5 years at lower speeds: http://www.theregister.co.uk/2001/08/29/mckinley_deerfield_speeds_and_feeds/
As each generation came out, Itanium release speeds were scaled back. This didn't damp speed speculations by Intel boosters that were initially high until near launch, when reality set in. Montecito's are now down to 1.6-1.8GHz.
Pete
Take your "comparisons" one by one.
The first compares a uATX board with minimal on board capabilities against a capable, quite featured, full ATX A8V. Like only 2 DIMM slots 855GME versus 4 DIMMs for the A8V. PC2700 speeds versus PC3200 speeds. No third IDE channel. 2 SATA ports versus 4 SATA ports. No firewire ports. 2CH audio versus 8CH audio. 10/100 Ethernet vs 10/100/1000 Ethernet. No Cool&Quiet for AMD, but PM could use its internal shutoffs. Higher capable power circuits on the A8V so it can run A64FX 55 with 3PH power versus 2PH power on the 855. Dual channel A8V pushes DDR harder than single channel 855 does. Lastly, A64 can push the graphics harder and push its power use up. Oh and they used a Winchester and not the current Venice. And overclocking may have been with a "cherry picked" CPU versus 2.4GHz Venices that are now "off the shelf". Venices and San Diegos typically overclock to 2.7GHz on air.
The second compares two low end boards, the 855 and 865PE versus a top end Nforce4 SLI with tons of features. Again only 2 DIMM slots versus 4 slots for the SLI. No RAID either for the 2 IDEs or 2 SATA ports on the 855/865, but available for the 2 IDEs and 8 SATA on the SLI. 10/100 Ethernet versus 2 10/100/1000 Ethernet ports. 2 SLI PCIe ports versus AGP only. Ditto on Geforce 6800 Ultra which needs a PCIe/AGP bridge for the SLI vs the 855/865 which uses it in its native AGP form. 2CH sound versus 8CH sound. Firewire missing. No firewall for Ethernet. Higher available power on the SLI versus the 855/865. Dual channel DDR vs single channel on the 855/865. All of these high end desktop stff takes power that PM doesn't have and can't really use. The 865 does have more stuff and uses 11W/9W idle/load than the 855 for the same 2GHz PM. This should server as a wakeup call that MB has a large effect. When looked at the 12V input to the CPU power circuits, A64 Winchester does much better wrt the PM (I forget which site did this, but the reference below uses an even better method, direct measurement of the VRM output (V*I)). Venice does even better than Winchester. But, you don't like to cite those references that use those methods.
Take another look and going from 2GHz to 2.13GHz uses 6W both at idle and full load. The A64 2GHz to 2.2GHz uses only 2W/5W idle/load for the 90nm Winchester. Scaling A64 to a 6.5% clock increase would be just 1.5W/3.3W vs 6W or 25%/55%. Enabling Cool&Quiet will likely show just how much power is used in those ancillary circuits on the MB as Winchester uses less than 3W in that mode. Performance number needs to take into consideration that FSB is at 130MHz for the OC Dothan likely making AGP run at 87MHz. This has been shown to boost gaming numbers. NF4-SLI locks PCIe and PCI at the normal levels.
The last reference shows no graph for power levels, but given the numbers they cite, it is likely again measuring wall plug power. That is not like the direct measurement used by Lost Circuits in the references below.
As for my numbers, I used http://www.sandpile.org and a few articles like http://www.lostcircuits.com/cpu/amd_venice/ , http://www.lostcircuits.com/cpu/amd_x2/ and of course datasheets like http://developer.intel.com/design/mobile/datashts/305262.htm .
By that datasheet and using AMD's methods for calculating TDPmax (Sum(Izmax*Vzmax) for all supplies z), you get 40W TDPmax for the Pentium M 770 (2.13GHz@533FSB) (Used Vcc, Vcca and Vccp). Intel uses 3 milliohm VRM series resistance to lower TDPtyps, AMD uses a value of 0 milliohms. I do this to use apples to apples comparisons. Of course you need to look at the datasheet of the NB (855PM: http://www.intel.com/design/chipsets/datashts/252613.htm ) to try to pick out the FSB switch and DDR controller power to match TDPs (I estimate 8.9W TDPmax).
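The TDPmax method described above (sum of Izmax*Vzmax over all supplies) can be sketched in Python as follows. The rail voltages and currents below are placeholders, not the datasheet values, and the way the VRM series resistance is applied (the rail voltage drooping by I*R) is my assumption about how the 3 milliohm figure enters the calculation. `Rail` and `tdp_max` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Rail:
    name: str
    v_max: float               # maximum rail voltage in volts
    i_max: float               # maximum rail current in amps
    series_ohms: float = 0.0   # assumed VRM series resistance (0 = AMD style)


def tdp_max(rails):
    """Sum I*V over every supply rail, with an optional I*R droop per rail."""
    return sum(r.i_max * (r.v_max - r.i_max * r.series_ohms) for r in rails)


# Placeholder numbers for illustration only, NOT taken from any datasheet.
amd_style = [
    Rail("Vcc", 1.30, 25.0),
    Rail("Vcca", 1.80, 0.1),
    Rail("Vccp", 1.05, 2.5),
]
# Same rails, but with 3 milliohms in series on the core rail.
intel_style = [
    Rail("Vcc", 1.30, 25.0, series_ohms=0.003),
    Rail("Vcca", 1.80, 0.1),
    Rail("Vccp", 1.05, 2.5),
]

print(f"0 milliohm method: {tdp_max(amd_style):.2f} W")
print(f"3 milliohm method: {tdp_max(intel_style):.2f} W")
```

The point of the sketch is only that the same silicon reports a lower number once a series resistance is assumed, so both CPUs need the same method before their TDPs are compared.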
Without this, 2.13GHz Dothan uses more power than either MT or ML Turions and 35W TDPmax Turion ML-37 is already at 2.2GHz. Before Dothan gets to 2.26GHz some time next year (highest on current roadmap is 2.13GHz 770 at least till December 2005), ML-40 will be out at 2.4GHz and 25W TDPmax MT-37 at 2.2GHz. Here is speculation that Yonah will get to 2.5GHz: http://news.softpedia.com/news/Intel-s-Yonah-could-run-at-2-5-GHz-056.shtml . Another poster at Ace's Hardware states that Intel specifies that Yonah will be at 2GHz at 31W and 1.67GHz at 15W ( http://www.theinquirer.net/?article=22508 ), but this looks like a server successor, Sossaman H1?/06. Others have it at 2.17GHz @ 667FSB x50 ( http://www.vr-zone.com/?i=2142&s=1 ) no word on TDP target. But the 780 PM is no longer on the 2H05 roadmap.
I don't take drugs. I am not lying here. Intel has cancelled projects that they tout highly. Tejas is one example. The 4GHz P4 is another. Some they push out. Itanium is but one example. Prescott and Dothan are others.
The original Itanium was late and much slower than advertised when it finally came out (800MHz vs 1GHz). McKinley missed its speed targets as well (1GHz vs 1.3-1.5GHz). Madison was to be 1.9GHz, then 1.8GHz and it didn't even make 1.7GHz. 90nm 12MB Itanium was supposed to be out last summer and then they pushed it to this summer as dual core and replaced it with 1.6GHz 9MB, which was then 1 quarter late. And this summer will arrive with no 2+GHz Montecito. It may make it in Q3 or more likely Q4. Again late and likely slow.
So many here say Intel will hit its targets when they have failed for the last few years. One sees that "your world" is full of FUD and unkept promises that never seem to pan out. Where did 7GHz P4s go? Where did "no need for 64 bit on the desktop" wind up? What happened with the "memory of the future" RAMBUS? Intel is full of "wait till ..." that never pans out. And everyone's plans slip from time to time. And I include AMD. Things usually take longer and are harder than planned. And the competition doesn't stop pushing either.
Does Intel always foul up? No they don't. They do come up with good ideas like PCI and Pentium Pro. But they have blind spots and a tendency to go down blind alleys. Look at the "web hosting" and "consumer projection" businesses. They flopped. To take all they say as gospel is setting yourself up for a lot of disappointments and a lot of pain. Just look at how many "roadmap" changes have occurred in the last year. And the fallacy of comparing current products against a competitor's future products gets one in trouble, fast.
That you can't see this should be a wakeup call. Perhaps you need to go to "Detox" or stop lying to yourself.
Dear Wbmw:
Pentium M does not have lower power dissipation at competitive speeds (34.14W TDPmax at 2.13GHz without a memory controller vs 35W TDPmax with a memory controller @ 2.2GHz and that is with a narrow spec from Intel vs a family spec from AMD). It can't get to the speeds of K8 (2.6GHz FX55 or 2.4GHz A64 4000+). It doesn't have 64 bit mode. It can't access enough memory (2GB vs 8GB+).
Yonah won't fix these lacks. A64X2 is already at 2.4GHz and will be out much sooner than Yonah. Opteron will even be 1.8GHz dual core at 30W TDPmax before Yonah comes out. Pentium M can't do 15W TDPmax at 1.8GHz. It only does 1.4GHz at 13.4W TDPmax. And that's with only a 400MHz FSB for the PM vs 800MHz HT for K8.
We'll see if Yonah can even do 2.4GHz by 1H06. It may even be cancelled to wait for the 64 bit capable version.
Pete
Dear Tenchu:
The only "marketing speak" is your attempt to portray Opteron as the only "glueless" solution that matters.
Where did I say that? I gave other examples of "glueless" interconnect. Like Thicknet Ethernet. Like Thin Ethernet. Or PCI. Here are some more, SCSI and CompactPCI.
Pete
Dear Wbmw:
As far as Opteron goes, data may be located in local or remote memory. Best case, you have CPU->memory and worst case, you have CPU->CPU->CPU->memory.
You are being disingenuous. Opteron CPU1 can talk directly with CPU2 without any step to memory. CPU1 owns L1 cache line X for address xxxxxxxxxxH that CPU2 wants. CPU1 then transmits that cache data directly to CPU2 without ever going to memory. Even if that packet travels through 1 or more CPUs, it is still "glueless".
Traffic goes CPU->NB->Memory. I understand your point that there absolutely has to be NB in the middle, but I disagree with your representation that CPU->NB and NB->CPU is at all relevant.
It is relevant as Xeon CPU5 cannot talk to Xeon CPU3 directly. It can't give it cache data that it just modified. It has to write it to memory, and that requires an arbitration cycle by the NB just to begin a data transfer on the FSB no matter where it goes. Then CPU5 sends the data to the NB. Now if CPU3 is on the same FSB, it could snoop the changes into its own cache. If not, the NB must initiate a data movement to CPU3 with the data (it wins arbitration once the bus is released). It does not need to actually read it from memory but can use its internal buffers.
So part of the arbitration logic must be on the NB and any inter FSB switch. Notice that the memory controllers may actually be off the NB chip like in the bad old RAMBUS days (on the MTH chip) or like in the Itanium server (DTH chips) chipsets. So even if Xeon had on die memory controllers, Xeon still needs the NB to talk to another CPU. If the memory was local, it still needs to put that traffic to the NB for the other Xeon caches to snoop the updates. And the NB would need to broadcast that to all other FSBs. Thus the external "glue" is always required even on systems with 1 CPU.
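The difference in data paths described above can be sketched as a tiny graph search in Python. The two topologies are simplified models of what I describe in this post (a 4 socket Opteron square and a Xeon box where every transfer passes through the NB), not wiring diagrams of any real board, and the helper names are made up.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def glue_on_path(path):
    """Intermediate hops that are not CPUs, i.e. the external 'glue'."""
    return [node for node in path[1:-1] if not node.startswith("CPU")]


# Assumed model: 4 Opterons linked CPU to CPU in a square.
opteron_links = [("CPU1", "CPU2"), ("CPU2", "CPU4"), ("CPU4", "CPU3"), ("CPU3", "CPU1")]
# Assumed model: every Xeon transfer is arbitrated by and routed through the NB.
xeon_links = [("CPU1", "NB"), ("CPU2", "NB"), ("CPU3", "NB"), ("CPU4", "NB")]

for name, links in (("Opteron", opteron_links), ("Xeon", xeon_links)):
    path = shortest_path(links, "CPU1", "CPU4")
    print(name, " -> ".join(path), "glue:", glue_on_path(path))
```

In the Opteron model the packet may pass through another CPU but never through a non-CPU chip, while in the Xeon model the NB shows up on every path.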
This misses the point. The point is that these "other chips to translate HTT to other types of I/O" are completely necessary to make a computer system. Without I/O, you can't even load a program. Memory is a volatile storage unit, and Opteron by itself cannot function without a non-volatile storage. So there is necessary "glue" that needs to be in an Opteron system to provide this kind of capability. This "glue" may be less complex than a fully functional North Bridge, but that isn't the point.
You miss the point. The NB is required in Xeon systems even if they had on die memory controllers or even on die I/O controllers. The FSB arbitration system requires it to function. As to needing glue logic on the MB, that is incorrect. All DDR DIMMs come with some ROM on them (serially accessed, but it is there). There is nothing in the DDR interface preventing you from using ROM, battery backed SRAM or FLASH instead of DRAM on a DIMM. There are ones with LEDs on them, so an IR LED/photodiode isn't far fetched as an upgrade. Granted, that would be unusual, but doable. So you can make an Opteron computer with no SB at all.
As to needing some external HW, that is being a nitpicker. If you go that route, no system is glueless. Even those with everything on die, because they need to be connected to outside power or packaged in some way. Coax Ethernet isn't then glueless because it needs a terminator on both ends. It is just another case of you pushing things to extremes until they become totally senseless.
So pure I/O and memory is removed from being "glue". That allows MTHs, SBs (ICHs) and DTHs for Xeon/Pentium/Itanium. Opteron can have those too. Still you need that FSB arbitrator and broadcast logic for Xeon/Pentium/Itanium. It can't just translate FSB bus "packets" to I/O bus packets. Or simply pass along memory read and write requests. You can also tell because some only allow 1, 2 or 4 devices on any given FSB.
The CPU in an Intel system only needs a NB as a medium to transfer data.
Sorry, the NB does more than transfer data. It arbitrates access to the bus. It controls which CPU talks on the bus. It acts like a traffic cop at an intersection. And it must act like an Ethernet BaseT hub when more than 1 FSB is present. That last is why CPU->NB and NB->CPU is relevant. Cutting out the NB->MTH->DRAM->MTH->NB steps is a typical speedup or acceleration done to both decrease latency and preserve FSB bandwidth. Look at any recent Intel NB and you will see quite a few buffers to speed this up. Only 1 would be needed for any clock domain boundary. Forgetting this makes your server designs much slower than your competition. Is that why you don't do them any more?
Re: The BIOS can be placed on a DIMM as well.
That's a laughable way to make your argument, Pete. No one does this.
So cell phones don't do this? I seem to remember even Intel stacking SRAM and flash into one package. And there is ROM on every DDR DIMM sold, holding the timing, size and other parameters. Isn't Samsung stacking DRAM, SRAM and flash into a package? How about those HDs with 128MB of flash and a few MB of DRAM? Look at Corsair with their TWINX1024-3200XLPRO memory with LEDs showing usage patterns. The jump to an IR port wouldn't be that far away, with enough ROM/flash to contain an AMD64 BIOS. Oh, "nobody" does this. Sorry, wrong again.
Sure it will work, but will an 8-socket server scale in performance with just this? Of course not. It isn't a viable solution.
You forget about compute servers. They do not need much I/O and 1 Nforce Pro 2200 would be enough with 2 Gb NICs inside. That has 8GB/s of I/O. Even 8 socket Itaniums only have 6.4GB/s to memory using Intel's chipsets. You just keep digging yourself in deeper.
Isn't it time to stop before you completely bury yourself?
Pete
Dear Chipguy:
Your "glueless" design has some flaws, if it requires any other chip to tie them together. "Glueless" is not defined as from N to N+1 for any N, but from M to M+1 for all M between 0 and N-1. That is a big difference.
By your definition any cluster is "glueless" just because I can add a CPU to any node's MB. Even a Horus based system would be considered "glueless" (just think of going from 5 way to 6 way) by your definition. And many consider any system using Horus as needing "glue". That includes myself and probably even you. That is why it must be for each and every value of M between 1 and N-1.
You need the glue to get M=0 in all Xeon/Pentium/Itanium systems. If those Xeons were like PCI bus masters, you could say that Xeon was "glueless". They need no other glue logic to intercommunicate. For Coax Ethernet, it is glueless, but 10/100/1000BaseT Ethernet requires either a hub or switch beyond N=2, which means it needs "glue" (N=2 is made by a simple mirrored cable and N=1 is just a disconnected NIC). For Opterons, N can be 1-8 and be truly "glueless".
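The two definitions argued over above can be put side by side in a short Python sketch. The step functions are my reading of the examples in this post (a step M means growing the system from M to M+1 nodes, and the function says whether that step requires adding a new external chip); all names are hypothetical.

```python
def glueless_for_all_steps(step_needs_glue, n):
    """My definition: every step M -> M+1, for M from 0 to N-1, needs no glue."""
    return all(not step_needs_glue(m) for m in range(0, n))


def glueless_for_some_step(step_needs_glue, n):
    """The looser definition: at least one step N -> N+1 needs no glue."""
    return any(not step_needs_glue(m) for m in range(0, n))


def opteron_step(m):
    return m + 1 > 8      # no extra chip for 1 to 8 sockets


def base_t_step(m):
    return m == 2         # the third node is where a hub or switch first appears


def xeon_step(m):
    return m == 0         # the NB is needed just to bring up the first CPU


def coax_step(m):
    return False          # coax Ethernet: just tap onto the cable


for name, step, n in (("Opteron", opteron_step, 8), ("BaseT", base_t_step, 4),
                      ("Xeon", xeon_step, 4), ("Coax", coax_step, 4)):
    print(name, "all steps:", glueless_for_all_steps(step, n),
          "some step:", glueless_for_some_step(step, n))
```

Under the looser test nearly everything passes, which is why it says nothing useful about whether a design is "glueless".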
Pete
Dear Tenchu:
"Glueless" in those is marketing speak, not from a MB designer's standpoint. The "glue" is in the NB, which is inside each Opteron/A64/A64X2 and external in Xeon/Pentium/Itanium/AthlonMP systems. No NB needs to be present for any Opteron system, just SB(s) or HT tunnel(s). Xeon/Pentium/Itanium must have at least one NB chip and sometimes many more. What Intel SB is there that can directly connect to Xeons? None! Opteron has quite a few.
So find a link to a Xeon/Pentium/Itanium MB without a NB. Can't find one, eh?
Pete
Dear Wbmw:
I suggest you take a look at how data is moved between CPUs on Intel's FSB. One CPU cannot ring up another CPU and move data. It must go at least to the NB, as there are no CPU to CPU bus states, only idle, CPU to NB and NB to CPU. There is an arbitration done, with the NB deciding who "wins" access to the FSB. And CPUs can snoop other CPUs' transactions with the NB to keep the caches coherent. But that is not really communication between the CPUs, as the data is only that which is stored in each CPU's cache.
As to Opterons not being able to "string" the CPUs together to make a computer system, they can do just that and only need other chips to translate HTT to other types of I/O. In fact, if you bury a message passing device on a memory DIMM, you do not need anything on the MB but power, reset, CPU sockets, DIMM sockets and traces connecting them. The BIOS can be placed on a DIMM as well. Granted, that computer system will be quite limited, but it can work. Try that with an Intel MB with no NB.
And many A64 systems could be made to work with just one SB. It can have a GPU, NICs, IDEs, SATAs, USBs, sound, firewire, PS/2 and no memory controller and still be a complete computer even by your standards. And the very same SB can be used on an 8 socket Opteron system with no changes.
Pete
Dear Tenchu:
Sorry, the first link states 8 way Xeon II, yet in all the literature, Xeon requires a bridge chip between FSBs after 4 way. That isn't glueless. The second link says that a 9,000 CPU system is "glueless", yet those also require ASICs to connect them, else it's a cluster, and those require glue in the form of Ethernet NICs, hubs and/or switches. That author is particularly clueless.
As to the third link, it states specifically that "can effectively scale to 8-way and larger configurations with specialized chipset and clustering technologies". And needing a NB to communicate between CPUs (even on the FSB, as one CPU can NOT talk to another directly) does constitute "glue". The Serverworks chipset uses the NB to switch between FSBs to get to 8 way. Opterons do not need to have any chip between them or on an HTT link to talk to each other. Think of the answer this way: can two Xeons without a NB talk to each other? They can snoop on the NB talking to another CPU, but CPU1 can't do a transfer of data directly to CPU2. It goes first from CPU1 to the NB to memory and then from memory to the NB to CPU2. Some NB implementations may cut out the memory steps, but still need the NB.
The last is just another rehash of the Intel P/R documents.
Now the FSB can be made glueless like the PCI bus masters, but the NB is the only bus master on Intel's FSBs. It has the critical circuitry needed to make the FSB work. Ignoring its need when considering something to be "glueless" is, at best, misleading. Opterons need no other chips to communicate amongst themselves. They do need external glue beyond 16 way (with dual core) currently.
That could change in the future if they add cHT links to each package (4 cHT links = 32 way, 5 cHT = 64 way and 6 cHT = 128 way assuming dual core). They could also add some cache coherency accelerator (with OS help, to make even more efficient) to Opterons to reduce cache coherency's impact on cHT traffic in high way SMP setups. But currently, up to 8 socket SMP, Opterons do not require any "glue".
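The way counts above follow a simple doubling pattern, sketched here in Python. The pattern (each extra cHT link doubles the number of sockets, as in a hypercube) is inferred from the figures in this post rather than taken from any AMD document, and `max_smp_ways` is a made up name.

```python
def max_smp_ways(cht_links, cores_per_socket=2):
    """Glueless SMP size, assuming each cHT link doubles the socket count."""
    sockets = 2 ** cht_links
    return sockets * cores_per_socket


# 3 links is the current 8 socket / 16 way limit with dual core.
for links in (3, 4, 5, 6):
    print(f"{links} cHT links -> {2 ** links} sockets, {max_smp_ways(links)} way")
```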
Pete
Dear Tenchu:
Opteron is glueless in that it needs no other chips to connect 2 or more CPUs together. Intel needs at least 1 NB. What Intel box allows for an I/O chip like a SB to be used without any NB? Answer, none. What Intel box uses no additional chips to connect 5 CPUs together? Answer, none. Even IBM had to make a specialized chipset to connect FSBs together in a Hub/Spoke arrangement.
Can Opteron use a specialized chipset for interconnect? Yes, it can use one to boost connectivity. Cray's Red Storm allows the use of 40K of them to connect 40K CPUs. Those 40K CPUs still have 80K active I/O HTT ports on them. That's over 1PB/s (1000TB/s) of BW. What Intel host system has that much I/O scalability? Answer, none. Heck, Opteron could run programs without any chipset attached at all, just memory. Intel Pentium/Xeons can't do that. They need at least 1 chip to work, one with a memory controller.
Pete