Wbmw:
You claimed that I was the one to talk about 2.5GHz, yet I posted a post from February. Talk about false claims. You made it.
Yonah will not top the performance of A64 X2, flagship to flagship. It won't run 64 bit software. Some applications don't offer all the functionality in 32 bit mode that they do in 64 bit mode. There is some software that doesn't even come in an x86-32 flavor, but does exist in AMD64. Of course we are hearing more about the future chip because the current one is found wanting.
LOL, Pete. So you want to compare the lowest leakage parts available on AMD's process with a random one on Intel's? In other words, you think because AMD can produce a small subset of chips at minimal leakage, that they can go ahead and release an entire line of chips at that lowest leakage level? Absurd, Pete. How about comparing Yonah's lowest leakage parts? You might find a few 10W Yonahs floating around the market in a few months. But then you already knew that....
Yonah's TDP spec'd from Intel is not the maximum power it can dissipate, as you well know (and if you don't, you have no business making any TDP claims). Dothan with a 21W TDP(typ) (400MHz FSB version) uses up to 28W. The 27W TDP(typ)(533MHz version) Dothan uses up to 36W. This is in Intel's datasheet if you look hard enough. The Yonah datasheet isn't public yet. Given Intel's track record, the 31W TDP(typ) Yonah is likely to use up to 42W when used with a thermal virus like Prime95 (two copies, one for each core).
As to the claim that every Yonah can work under 31W, that is just as much bunk. The ones that make it into the bin test below 31W on Intel's suite. It makes for a self-fulfilling prophecy. That does not mean that every good Yonah off the production line will make it under that power. Some will require you to slow down the clock to make it. Not every one can make 2.16GHz under 31W TDP(typ). That's binning. So the ones that make 2.16GHz are in fact cherry picked to begin with. And every CPU manufacturer does this. That's what they mean by yield at a given bin. That percentage is the number of dies that successfully test into that bin divided by all the dies made, with the result multiplied by 100%.
So if 2GHz Yonah has a 2% yield and 2GHz DC Opteron has a 3% yield in the 35W TDP area, the Opteron is better yielding than Yonah.
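The yield-per-bin definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the die counts are hypothetical, picked so the output reproduces the 2% and 3% figures used in the example, and they are not real production data.

```python
def bin_yield_percent(dies_in_bin: int, dies_made: int) -> float:
    """Yield for one bin: dies that test into the bin / all dies made, times 100."""
    if dies_made <= 0:
        raise ValueError("dies_made must be positive")
    return 100.0 * dies_in_bin / dies_made


# Hypothetical counts, chosen only to match the 2% and 3% example figures.
yonah_2ghz = bin_yield_percent(dies_in_bin=20, dies_made=1000)
opteron_2ghz = bin_yield_percent(dies_in_bin=30, dies_made=1000)

print(f"Yonah 2GHz bin yield: {yonah_2ghz:.1f}%")
print(f"DC Opteron 2GHz bin yield: {opteron_2ghz:.1f}%")
print("Opteron yields better in this bin:", opteron_2ghz > yonah_2ghz)
```

Note that the comparison only means something when both bins are defined at the same clock and the same power limit.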
And that rating is an upper bound as AMD doesn't test any lower TDPmax bins. And yes, Yonah's limit is also an upper bound. Only testing can show which actually uses less. And only testing can show what the actual performance per watt is, which requires both performance and watts used to be tested at the same time; otherwise the results are meaningless.
So before we continue this debate, we should wait for someone like Lost Circuits to test the actual usage and performance per watt of Yonah like they did with Venice, Winchester, Newcastle, Clawhammer, Toledo, Manchester, Prescott and Smithfield.
Pete
Wbmw:
It was on the INTC board: http://www.investorshub.com/boards/read_msg.asp?message_id=5328456
2/4/2005 is far before my post.
The rest of your post was filled with lies
The only one lying is you to yourself.
You don't like looking up references that make you out to be wrong: http://www.hardforum.com/showthread.php?t=985544&page=5
Just got an Opty 170, its a CCBWE 0543TPMW. What is everyone maxt case? I seem to have gotten a chip with the lowest maxt listed on AMD's site . I havent OC'd it yet but i hope to get 2.8ghz on water at least.
AMD64 TCaseMax - v1.14
----------------------
CPU Information (CPU #1):
Standard CPUID: Family: F, Model: 3, Stepping: 2
Extended CPUID: Family: F, Model: 23, Stepping: 2
CPUID String: 20F32
Processor APIC: 0
Processor: Dual Core AMD Opteron 170 (Toledo)
CPU Speed: 2010.30 (201.03 x 10.0)
Revision: JH9-E6
Platform: Socket 939
Startup VCore: 1.350v
Maximum Case Temperature = 49C
TDP: 35.0 Watts
Written by: Arthur Liberman.
Idea by: Petr Koc.
More info at: http://www.thecoolest.zerobrains.co...wtopic.php?t=83
Here is that 49C 35W TDP Opteron 170 (2x1MB 2GHz).
You didn't think AMD could produce a CPU that does 2.0GHz under 35W TDPmax. It's out in the field in customers' hands before Yonah is available to the public. And it runs in any socket 939 MB, not some special socket on some special MB.
Only when Yonah comes out and someone can actually do the tests at the VRM input, using some software that pushes Yonah to the limit, can we check Yonah's usage and how it performs at that power usage.
Till then, here is a 2.5GHz Opteron 165 35W TDP:
Well my 165: CCBWE0541RPMW seems to be ok at 2486 MHz, with stock Vcore (which would be 1.35V according to PC Probe II). Should I take the Vcore higher? Load temps are at 46C atm on my XP120 with Panaflo M1A. Suggestions are welcome...
onto TCaseMax
AMD64 TCaseMax - v1.14
----------------------
CPU Information (CPU #1):
Standard CPUID: Family: F, Model: 3, Stepping: 2
Extended CPUID: Family: F, Model: 23, Stepping: 2
CPUID String: 20F32
Processor APIC: 0
Processor: Dual Core AMD Opteron 165 (Toledo)
CPU Speed: 1809.22 (201.02 x 9.0)
Revision: JH9-E6
Platform: Socket 939
Startup VCore: 1.350v
Maximum Case Temperature = 49C
TDP: 35.0 Watts
Amazing what a few weeks of process improvements can do for AMD. No wonder why you don't want to compare Yonah to what AMD produced at the same time.
Pete
Wbmw:
Nice try, Pete. If you look up this "prediction" of mine, it was based off an Inquirer article which said 2.5GHz. Considering that Yonah is a dual core device, I would not have expected the intro frequency to be greater than Intel's 90nm parts, but it should outscale 90nm in spite of dual core as the process matures. Sure enough, there is a 2.33GHz part on the Intel roadmaps. You can doubt Intel roadmaps if you want, but since you are eager to parade AMD's roadmaps, in addition to the assumption that AMD will beat their roadmaps, you are hardly in a position to be critical of Intel's.
You stated that Yonah was to be >=2.5GHz 6-12 months ago. It didn't pan out and they will only give 2.16GHz at intro. You and your Intel worshippers were saying that Montecito would be 2+GHz and be out Summer 2005. Then it was Q4/05 and now it may not get above 1.6GHz when it comes out in Summer 2006. Montecito is late and slow (like all Itanium releases). It's your predictions that turned out to be both very high and early compared to reality.
That's bull. Yonah may be speed path limited at 2.16GHz, rather than power limited. Some additional reviews with power and overclocking measurements should shed some light on this in January.
The only bull is your rosy painting over of facts. If it is speed path limited, then the power vs speed curve is even steeper than K8. That's good for lowering performance, but not increasing it. It does increase performance per watt to go slower and that's what I stated in my response. But you with your blinders failed to see that.
Except they didn't say this. The mobile architect that spoke in the article you are referring to said that they made the decision to not implement 64-bit in Yonah because 64-bits was not perceived to be a benefit to mobile users, while the power tradeoffs did not justify it. Given that Yonah was most likely architected 3-4 years ago, it would have placed it well before Opteron's introduction, at a time when Intel did not perceive a need for 64-bits. Additionally, this does not imply a "lot" of power increase. It could have simply been a couple of watts, which would still be rendered unnecessary if the architects did not believe 64-bits was necessary.
Sorry, Yonah was not architected 3-4 years ago. Back then it was P4 to the metal and IPF for 64 bits. Yonah wasn't even planned until they hit the power wall with Prescott and it wasn't an easy fix. And the quote I used was from a designer saying that adding 64 bits added substantial power, not the 2 watts you suggest. >25% is substantial, not 5-7% as you suggest. If Intel could have added 64 bit and stayed under the power budget, they would have.
The measurement at the VRM is irrelevant if you have the power at the wall socket. We are comparing systems with identical hardware configurations, except that Intel based systems need a full northbridge and southbridge, while AMD systems use a PCI-Express tunnel that is usually integrated into the southbridge. Yet in spite of this power advantage for AMD, the Yonah system still drew far less power.
The MBs were for very different configurations. An SLI MB uses quite a bit more power than a small uATX MB type strictly made for low power with fewer slots and only one graphics slot. Besides, if you looked at the article, the power measurements were taken from the August 1st, 2005 article, not from any new board and CPU for AMD. You like to say that in six months Intel will have a more mature process and lower power use at the same performance, yet do not give AMD any of that benefit of time. I noticed that they didn't use a 939 low power MB and mobile chipset for the AMD like ATI Mobility 280 or Nforce 6150. But give all the breaks to Intel and then claim it was unbiased. Take a look before you make such a leap.
LOL, you misread the spec sheet. We discussed this on the forum here a few weeks ago. Why don't you link to an article which verifies this unbelievably low power models? You will not find an Opteron out there that comes close to the low power that every single Yonah chip is guaranteed to stay below.
I didn't misread any datasheet. That's your bailiwick. The Opterons have variable TDP. It is obtained by using the temp profile and the associated max die temp on that profile. A 49C using profile G means a TDP of 35W. And it might be lower, but the markings don't go lower on that profile. If it actually used 25W, it would still have the same markings. Check the CPUID and the external markings duplicated internally. The table is in the Opteron Thermal datasheet (9). I submitted the post of that Opteron 170, but of course you didn't see it with your poor comprehension of stuff that shows you are wrong.
LOL, Anandtech did a power test. You need more proof than this? Power specs are complex and do not accurately paint a picture of how much power a given chip will actually dissipate under real world conditions, but reviews remove the mystery by actually testing real world systems using real world stress tests, and Yonah demonstrated incredibly low power levels. As more reviews come out in the coming months, more tests will come out and we can see how consistent the results are between different lots.
More lies and misrepresentations from you. Anandtech's tests were not of the same configuration, but widely separated by time as well as in many other ways. Tests like them require more information than they were willing to part with (as they noted). Comparisons are even more fraught with errors. Wall plug tests only work with all else the same, which means same MB, cards, PS, memory, etc. This was not done. The way Lost Circuits did it from the VRMs at least removes many of the variables, but not all. They at least tested performance and power use simultaneously. And they got performance per watt for a couple of applications. Yonah has not been tested with the same standards, because it isn't available yet. But you knew that.
Pete
Wbmw:
You have no credibility predicting future speed of Intel CPUs. Remember the Prescott's going to 4+GHz at Intro and how its going to have very high performance, etc. Totally blowing 130nm Opterons and A64s out of the water. Now we know how that prediction turned out. It was slower than Northwood at intro and didn't get past 3.8GHz, ever. And it was slower per clock than the already bad Northwood. It got steamrolled in performance by A64 and Opterons in 130nm. 90nm K8s just blew it away.
Spin forward a few quarters. You were one of those who waxed poetic about Yonah getting to 2.5GHz and up. Well, it's not going to make even 2.2GHz on introduction in 2006 with Intel's super duper 65nm process. The 2.16GHz will be less than Dothan's 2.26GHz (if it gets that far on anything but a select few). That's one to two speed bins down even getting a 65nm upgrade. Seems Intel's curve is just as steep, if not steeper. And that might be a good thing, if it wasn't so slow to begin with.
Now spin forward 6 months. Now its Conroe, Woodcrest and Merom who will get 3+GHz at intro in 2006. We will see where they go when (and if) they arrive.
As to Merom easily overclocking Yonah, you don't seem to listen to Intel when they say adding 64 bit adds a lot of power. Prescott got a lot more stages in the pipeline and it clocked slower. And the additional stages could be for more than just increasing clock. Some may be used to help the four wide decoder. One or two may help the fusion and scheduling process. One or two may be needed for 64 bit operations. And without these added stages, it might not clock even as much as Yonah. So adding stages doesn't automatically mean adding clock.
And we do know that increasing clock equals increasing performance on the same core. I have never stated that performance scales 100% with clock. Doubling clock doesn't make the hard drive spin twice as fast.
Lastly, no one has measured how much power off the 12V VRM Yonah uses. We are all treated to some configuration wall socket power use, with different configs and large time differentials between the sides. We do know of at least one Opteron 170 (2GHz DC 2x1MB Rev E 939) that is rated by AMD at 35W using their TDPmax. Intel doesn't want to say what the TDPmax of a 2GHz Yonah is, just some TDPtyp that's harder to duplicate and is exceeded on a regular basis by independent third party testers.
Until a Yonah can be bought at retail along with an A64 X2 (or Turion DC if it shows up) purchased at the same time using the same configurations (no laptop uATX MB versus a highend ATX desktop MB), measuring both performance and at the same time actual CPU power use (12V VRM input comparison is far better than wall socket), we won't be able to say who is ahead. Now we know that K8 is ahead.
Pete
Wbmw:
The problem is that everyone is neglecting known numbers for power versus speed. AMD's K8 to DC has shown that a 50% power reduction translates to two speed bins down. Thus a 2.8GHz SC uses the same power as a 2.4GHz DC. The same will be true of Intel.
Thus a 2.33GHz Merom with double the power becomes a 2.5-2.66GHz Conroe (more like 1.9x) and a 2.66-2.83GHz Woodcrest (more like 2.2x). Of course that assumes that Intel will meet that target for Merom, which hasn't happened yet. Since the transition to 64 bit from 32 bit by AMD netted no increase in clock even though power went up, Merom is more likely to hit just 2.16GHz at introduction. Not only do they have the 32 to 64 bit transition, but wider issue, wider decode and more powerful FPUs. All of those will reduce clock relative to the 32 bit only Yonah at the same power. Intel has already stated that the 64 bit move alone would increase power by a very substantial amount.
So if they do improve their process enough to make up for the clock speed loss due to the upgrades (giving Intel the benefit of the doubt), Merom is more likely to intro at 2.16GHz. That puts Conroe at 2.33-2.5GHz and Woodcrest at 2.5-2.66GHz. That puts them below near future Turion, A64 X2 and Opteron performance. Socket F, M2 and S1 will kick AMD one to two bins higher and 65nm will add another bin. That's 2.4GHz DC Turions and 3.0-3.2GHz A64 X2s and Opterons.
Given this, Intel will still be behind in all three: performance, performance per watt and performance per dollar. Maybe not so far back as with P4, but still behind. The slowest clocked single cores, as shown by testing, will have the highest performance per watt. But given the above, everyone should have known that.
Pete
Dear Chipdesigner:
Outperforms on that one Anand 32-bit database test, again, comparing one speed-grade below AMD's current max on Rev E, on AMD's current platform vs Intel's March 06 fastest part.
Rev F, on Socket F, with an x85 or x90 will be different.
And using a >1 year old Opteron MB, and not the current and faster Tyan 2895 with Nforce4 2200+2050+8131 MB with two PCIe x16 slots, 2 64 bit PCI-X 133, 1 64 bit PCI-X 100, 2 Gbe ports, 4 SATA2 ports and dual channel LSI 320 SCSI controller.
Given the power consumption of the Xeons, you could likely get 4 Opteron 880s on a 4S AMD MB and still get lower power and near double the performance. And I don't know if Bensley can even go to 4 sockets.
Pete
Wbmw:
Then why did Intel say they have increasing inventories in their mid quarter update? They also stated that they were in "hand to hand combat" with AMD. And if things kept going like they were, that they would lose market share. Sounds more like Intel is having demand problems rather than supply ones. Well, you could think that being behind in performance, performance per dollar and performance per watt is a supply problem, but most here would put that down as a management failure.
Pete
Wbmw:
Before going out on that limb of better performance/watt, you better see what is the current situation (09/20/05):
http://www.lostcircuits.com/cpu/amd_x2-3800+/11.shtml
Manchester gets 8678 to P4 670's 31536 in 3D rendering joules. Obviously Manchester gets 3.6 times the rendering per joule of the P4 670. Even the best DC P4, the 820D, needs 22016 rendering joules. Manchester still gets 2.5 times the rendering per joule of the P4 820D. The best AMD CPU is the single core A64 3800+ Venice at 8176. That's 3.9 times more rendering per joule than the P4 670 and 2.7 times that of the P4 820D.
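For anyone who wants to check the ratios, here is a small Python sketch using only the rendering joule figures quoted above from the Lost Circuits test. The dictionary labels and the function name are illustrative; the ratio is simply the baseline's joules divided by the other CPU's joules, since the same rendering job is done in each case.

```python
# Rendering joules as quoted from the Lost Circuits test (lower is better).
rendering_joules = {
    "A64 3800+ Venice": 8176,
    "A64 X2 3800+ Manchester": 8678,
    "P4 820D": 22016,
    "P4 670": 31536,
}


def rendering_per_joule_ratio(cpu: str, baseline: str) -> float:
    """How many times more rendering per joule `cpu` delivers than `baseline`."""
    return rendering_joules[baseline] / rendering_joules[cpu]


for amd in ("A64 X2 3800+ Manchester", "A64 3800+ Venice"):
    for intel in ("P4 670", "P4 820D"):
        ratio = rendering_per_joule_ratio(amd, intel)
        print(f"{amd} vs {intel}: {ratio:.1f}x the rendering per joule")
```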
The results are similar using 3D Mark 2005:
http://www.lostcircuits.com/cpu/amd_x2-3800+/12.shtml
Intel is way behind in performance per watt (joule).
Pete
Wbmw:
Those aren't facts. What are the MB and PS used for the Yonah? What does the MB have on it? Was C&Q used on the AMD system? We know that the test of that A64 X2 3800+ got 144W under load on an Asus SLI Deluxe with known stuff on it, done prior to August 2005 (that's right, the MB, chipset and all other stuff tested were shipped prior to the published date). An A64 X2 cut down Toledo (4200+) (tested at 2GHz) at the same time got 145W on the same setup.
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2484&p=2
Given the similarity, I figure the 3800+ was a sample Manchester given that the tests at Xbit Labs and Lost Circuits clearly show a substantial power advantage between a true Toledo and a newer Manchester.
http://www.xbitlabs.com/articles/cpu/display/athlon64-x2-3800_3.html
Here the difference between Manchester under load and Manchester at idle is (65.1W-5.8W) 59.3W, not Anandtech's 35W. So either the idle measurements were too high for AMD or the load power was too low. That probably means that C&Q was not enabled on that SLI MB and/or the load was too light for testing purposes. Also notice that there is no mention of what software was used in making that "load" test.
http://www.lostcircuits.com/cpu/amd_x2-3800+/10.shtml
In this site's power usage graphs, they tried a number of high use programs but stated that Prime95 pushed the CPUs tested the highest on consumption, with the dual cores needing two copies to reach their highest. Xbit Labs stated: "The workload was created by the special S&M 1.7.2 utility."
Of course you didn't check into the A64 X2 configuration before making such wild assumptions. And very little was given for the Yonah setup. All we know is that it has a 945 series NB, has one PCIe x16 slot and supposedly is a desktop MB using DDR2-667 memory. No other setup information or anything resembling the amount required to duplicate the results.
Even Anandtech copied their results from the August 1, 2005 tests into this one including the consumption ones. Yonah on an unknown board with an unknown setup 5 months at least later got less performance and somewhat lower power, but as Anandtech themselves stated in the earlier article, the consumption power results work for comparisons because "all else was equal." That is definitely not true of the comparison to Yonah in this preview.
For your comparison to be even remotely applicable, the CPU chosen should have been an Opteron 170 with a low Max Tcase on a uATX MB with a low power chipset like the mobile Ati Xpress 200 with C&Q enabled. That would have made the performance comparisons closer to what Taylor will be and make the power comparison more apples to apples. And allow the test to be duplicated which is needed for peer review.
But you sweep all of that under the rug and proudly proclaim Yonah to be terrific. Sorry, this is likely another case where Yonah doesn't live up to its hype. And given all the breaks, it still came up short. Well, the competition has gotten better and will continue to improve. AMD's CPUs have gotten better over the last six months; this preview missed all of those improvements to every component used for the A64 X2 3800+ tests.
As scientists often state "the devil is in the details!" You missed all of those devilish details.
Pete
Dear DrJohn:
I am not implying that Anand did a bad job. He used what he could. My problem is with wbmw trying to make unsubstantiated claims about power consumption from that preview. As for the rest, they are valid data points as far as they go. But extrapolating them to compare a CPU that will come out next year with CPUs shipping at the same time given the data points in that article, is fraught with errors.
I think we should wait and see just how good Yonah is when it can be purchased by third parties and run against other CPUs purchased at the same time. The good sites will try to do apples to apples comparisons (as much as possible). We'll have good enough data then to compare various lines.
Pete
Wbmw:
You are not testing flagship to flagship, but flagship to an AMD bin that is at the same clock speed. Yonah will be further behind than what was shown. The top bin (supposing that Intel actually ships it when Yonah is released (not a good bet)) is supposed to be 2.16GHz versus 2.6GHz now known for Opteron and A64 (x85 SE shipping now). Yonah will be lucky to get beyond 90% of K8 DC performance when it ships. It's likely to be under 80% of flagship performance.
And any abandoning of reality is what you are doing. Face it! Yonah isn't what it was hyped to be. It's going to be behind when it ships and continue to be behind both in performance and performance per watt. I guess you'll just have to CYFHO (cry your foolish head off).
Pete
Keith:
Ignoring something that shows you were wrong is a symptom of childishness. It would have been better to admit your mistake. I guess you aren't decent enough to do that.
I don't ignore anyone here. I just discount them.
Pete
Wbmw:
What you don't understand is extremely large. You are putting up system power from cherry picked CPUs not shipping yet and a cherry picked motherboard also not shipping yet against a retail MB with more on board than Intel's MB and either a pre release A64 X2 3800+ or one bought in a store long ago. The A64 X2 3800+s shipping now have tweaks not in the one tested and average lower in TDPmax than the one tested.
Just look at these quotes from that preview:
The Platform - Yet Another Socket
Well it wasn't tested on a current shipping desktop MB.
It's got a single PCIe x16 slot, meaning you don't have to rely on integrated graphics
Our test configuration is identical to what we used in our Athlon 64 X2 3800+ review, however we can't disclose the motherboard used for the Yonah platform. We can say that it used the Intel 945G chipset and was outfitted with 2 x 512MB DDR2-533 DIMMs; the rest of the configuration remained the same as the AMD and Intel systems.
Notice that they aren't allowed to talk about the MB. It's likely to be some special one off prototype test MB using a 945G as the NB. All we know is that it has just one PCIe x16 slot. The AMD configuration is known:
Date: August 1st, 2005 So the A64 X2 3800+ had to be bought or sample given before this.
Socket-939 Athlon 64 CPUs
2 x 512MB OCZ PC3200 EL Dual Channel DIMMs 2-2-2-7
ASUS A8N-SLI Deluxe
ATI Radeon X850 XT PCI Express
So you are comparing an SLI deluxe desktop ATX MB to some micro BTX MB with just one graphics slot.
Here is the A64 X2 3800+ article:
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2484&p=1
On page two the consumptions are only different by 1W under load. Likely the A64 X2 3800+ is a Toledo core with half the cache disabled and not a true Manchester.
ASUS A8N-SLI Deluxe has:
ATX MB - Retail / NVIDIA nForce4 SLI chipset / 939-pin / up to 4GB DDR400 ECC/Non-ECC support / 8-ch Audio/ Dual Gigabit LAN/ 2 PCIe x16/ 3 PCI & 2 PCIe x1 / SATA RAID / 1394 Firewire.
Let's see, 2 GbE ports, 8 SATA, 2 IDE CH, 8 Audio, 2 GPU slots, 2 PCIe x1s, 3 PCI slots. Much more than that Intel Yonah test MB. The Asus supports overclocked A64 with >110W TDPs, but I don't think the Intel one could support near those power requirements. It likely wouldn't support even an 840EE. We don't even know if the PS is the same between the Yonah system and the AMD system.
So system power comparisons between AMD and Yonah are flawed greatly. No wonder Intel doesn't want the public to see the MB and measure the 12V VRM input to get the true power usage of the Yonah. Maybe it's that Yonah doesn't look so good when it actually comes out. Like so many Intel CPUs of late.
Pete
Wbmw:
So you think being behind by <5% is the same as missing 10 laps in the Daytona 500? LOL.
Well, you ridiculed a 5% performance loss as being "too little". 5% of a 200 lap race is 10 laps. When a car is running 10 laps behind the leader when the leader takes the checkered flag, that owner and driver don't think it's "too little". Neither do those who pay big bucks for the best. For you to take such a position just causes me to ROTFLMAO! You were CYFHO instead of LOL.
In each of these, the X2 got a far greater benefit with 10% more frequency than they got micro-architecturally over Yonah. When you also consider that production Yonah will use faster (lower latency) memory and potentially come with a faster front side bus than the one reviewed at Anandtech, then there are reasons to believe that top bin production parts will fare better.
And Anand didn't test with the fastest memory for A64 X2 either. They can use PC4300 memory right now out of the box. They will get DDR2 (800MHz to 1066MHz) early next year. And faster HT as well. So those future speed ups also apply to Turion along with its K8 brethren. Also, Anand tested an old A64 X2 3800+, not the ones likely to be shipping when Yonah ships. And there already is an Opteron 170 939 (Dual core 2GHz 2x1MB exclusive L2) in the field with a TDPmax under 35W. And that uses AMD's far stricter (higher) TDPmax than Intel's TDP.
Yonah's top bin 2.26GHz will be much slower than DC K8s at 2.6GHz. That's more than 10% behind. And don't try that "Yonah's future bin is higher" stuff; DC K8 will also be faster in the future. We know AMD can ship 35W MT class DC Turions right now. So the timing of DC Turion (Taylor) is up to AMD. They have surged forward before with dual cores.
Pete
Dear Ixse:
This is the link to the Opteron Thermal datasheet: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30417.pdf
This is a link to some Tcases found by users in the field: http://www.thecoolest.zerobrains.com/forums/viewtopic.php?t=96
http://www.xtremesystems.org/forums/showthread.php?t=71044&page=9
Manchesters appear to use profile F, 1xx and A64 X2 4400+ & 4800+ Toledos profile I and Opteron 2xx and 8xx Toledos profile G.
Pete
Wbmw:
QoQ (Q2->Q3/05) IPF server shipments declined, but revenue went slightly up per Gartner. QoQ Opteron server shipments went up substantially with revenue even more so. Shipments going down shows IPF losing market share.
Pete
Wbmw:
5% is a lot when it comes to performance between clock bins. A 2.13GHz Dothan is not 6.7% faster than a 2GHz Dothan on the average. You need about a 10-15% clock rate advantage (all else being equal) to get a 5% average speed increase. That is two to three clock speed bins. Given that the flagship bin is about 2 to 3 times the price of the second to third lower bins, that much more performance is highly valued.
By historical standards, having your flagship 2 to 3 speed bins back is a recipe for disaster. So what you are saying is that a 2.33GHz or 2.5GHz Yonah being required to match that 2GHz A64 X2 3800+ overall is just a little off. 5% of the 200 lap Daytona 500 is 10 laps. Any race car driver would be quite ashamed to come in that far behind. That reasoning is just what someone who is that far behind would want to say.
Pete
Wbmw:
In this case you are plain flat wrong. There are four items in that PDF on Opteron thermal power. The common one is the 42C ambient used in the testing. The next is based on the thermal profile used, which is simply the thermal resistance (e.g. 0.25 C/W). The third item is the Tcontrol temperature, which is the maximum allowed Tcase temperature no matter the cooling solution used. Max Tcase is the one item that is unique to each CPU packaged. It tells the maximum Tcase that will be seen on that particular CPU using the HSF with the given thermal resistance and ambient of the specified thermal profile. Yes, it is simple to work up the TDPmax for that particular CPU (just reverse the equation and plug in the known numbers).
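Reversing the equation can be sketched in Python as follows. This is a rough illustration, assuming the usual steady state relation Tcase = Tambient + TDP x thermal resistance, with the resistance in degrees C per watt. The 42C ambient and the 0.25 figure are the example values mentioned above; the real resistance for a given thermal profile has to be read from the table in the datasheet, and the function name is made up for this sketch.

```python
def tdp_max_watts(max_tcase_c: float, ambient_c: float, thermal_resistance_c_per_w: float) -> float:
    """Solve Tcase = Tambient + TDP * R for TDP, given the CPU's stored Max Tcase."""
    if thermal_resistance_c_per_w <= 0:
        raise ValueError("thermal resistance must be positive")
    return (max_tcase_c - ambient_c) / thermal_resistance_c_per_w


AMBIENT_C = 42.0   # ambient used in the thermal profiles, per the discussion above
EXAMPLE_R = 0.25   # example thermal resistance only; each profile has its own value

# Max Tcase values as reported by users in the field (49C lowest, 67C highest).
for max_tcase in (49.0, 55.0, 61.0, 67.0):
    watts = tdp_max_watts(max_tcase, AMBIENT_C, EXAMPLE_R)
    print(f"Max Tcase {max_tcase:.0f}C -> TDPmax about {watts:.0f}W")
```

With a different profile resistance the same Max Tcase maps to a different wattage, which is why the profile has to be identified before the number means anything.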
This goes a few steps farther than Intel on giving people information on their CPU(s). Intel gives some typical power use for all its CPUs in a clock, VID and model bin. AMD will give you the maximum power used by this one particular CPU in the worst case conditions (outside of Tcase and VID). This is much tighter than the bounds supplied by Intel.
So if you want low power, get a CPU with a low Max Tcase. For those who overclock, they seem to want high Max Tcase (given their posts). The thinking is that those CPUs leak more showing the transistors probably have higher max frequency and thus can overclock higher than those with low Max Tcase, lower leakage and thus, lower max frequency. I think there are too many variables to make any of those claims given just VID and Max Tcase (at least without obtaining a much larger and/or more rigorously obtained sample set).
Pete
Dear Keith:
Yet I was proven right! There are sub 35W DC Opterons in the public domain sold to real customers. Yonah uses more power than that. No test has yet been performed to see how much the Yonah actually used (just the system power in one preview prior to any actual sales) even by measuring what that sample Yonah (probably was cherry picked for reviewers given Intel's past track record) actually pulled from the 12V VRM input. Yonah isn't even out yet and you compare it to desktop CPUs made much earlier and on a MB with much higher capabilities than the Yonah MB (it uses a different 479pin socket so even the MB it uses isn't out yet).
No you are the one who is making laughable comments that didn't prove true. Like 89W Opterons aren't good enough for laptop use. Wrong! Your logic failed! Your so called evidence is refuted by real world results. Thus your points must have been idiotic. There are A64 X2 3800+s (and higher) with TDPmax less than 35W in the field bought by real customers from real retail shops in the real world.
Your points are lacking in credibility. They failed in the present and that is the real test. You can get 939 pin DC Opterons and A64 X2s in laptops using less than 35W TDPmax right now. Since 35W (and 25W) Turions are in T&L laptops, AMD could make DC Turions right now (just place the same die (the DC Opteron or A64 X2) on 754 pin mobile packages) and those would fit T&L laptops. The timing of the DC Turion release is up to AMD since, technically, they can do it right now, your incredulity notwithstanding.
Pete
Dear Alan81:
This is not what he is saying. Each AMD CPU REV E and later has a part of the CPUID holding the Max Tcase you will see during that CPU's life given the profile's thermal resistance and ambient temperatures. In essence AMD now gives you the maximum TDP for that particular CPU (probably written during or immediately after the testing process). This rating is thus variable to any given CPU within the processor model and/or family. Thus MB/system/HSF designers must use the Maximum Tcase (Tcontrol) and TDP for their designs as they could get a CPU that requires the maximum values. But most CPUs are below that.
Essentially, once you boot the CPU up on a system, it can tell you its Max Tcase under the given thermal profile (from the OPN and CPU model/family) and using the table given, converts that Max Tcase into the TDPmax for that CPU. That does not mean it can't run when at the Tcontrol temperature, it just means that given the HSF and ambient conditions in the thermal profile used, it will not get hotter than the Max Tcase value stored.
Thus if you just so happen to get very lucky and get a CPU with a Max Tcase of 49C, your CPU will use the least power for that model. If you were very unlucky, your CPU (67C Max Tcase) will use the most power with that model. The typical values (from reports) are from 55C to 61C. A 49C Opteron 170 uses less than 28W TDPmax. You could put that into a laptop and have good battery life.
Pete
Wbmw:
Do you happen to have a count of the number of Linux based production shipping 64-bit consumer applications that are currently available for Linux?
Hmm, that's easy. Anything running on PA-RISC, SPARC, Power, PPC, Cray, SGI, among lots of others. These have 64 bit Linux OSes and all the appropriate applications of those platforms, many of which are custom vertically integrated systems, the kind that VARs and software houses make. 100K+ would be a good order of magnitude count of production ready 64 bit systems on Linux.
Evidently you forget that Linux runs on platforms other than x86. Porting any of these to AMD64 (quite a few already are) is usually a recompile and then retesting. Moving applications from one 64 bit environment to another is a job that I have done many times before (endian and byte order changes notwithstanding, although with proper portability coding this is not that much harder). It generally took less than a week (automated making and testing are a great help). Going multithreaded, well, that's a lot harder.
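One common form of "proper portability coding" for the endian issue is to state the byte order explicitly whenever data is written out or read back, rather than relying on the host CPU's native layout. A minimal Python sketch of the idea follows; the two-field record and the function names are hypothetical, chosen only for illustration.

```python
import struct
from typing import Tuple

# ">QQ" means two unsigned 64-bit integers in big-endian order,
# regardless of the byte order of the machine running the code.
RECORD_FORMAT = ">QQ"

def encode_record(record_id: int, length: int) -> bytes:
    """Serialize a (hypothetical) record with an explicit byte order."""
    return struct.pack(RECORD_FORMAT, record_id, length)

def decode_record(data: bytes) -> Tuple[int, int]:
    """Read the record back; works the same on little- and big-endian hosts."""
    return struct.unpack(RECORD_FORMAT, data)

payload = encode_record(1, 4096)
print(payload.hex())
print(decode_record(payload))
```

Code written this way moves between 64 bit platforms without byte order surprises; code that dumps raw native structures to disk is where the retesting time goes.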
Pete
Chipguy:
Yes, but those applications will include some desired ones in 64 bit, which can't be run on Yonah. Face it, dual core will be a premium high end type product for laptops. That implies that it should do anything you throw at it. Well, there already exists software you can't run on it. In a year, there will be OSes and applications that you may want to run, but can't.
And those who just get their laptops for bragging rights won't be satisfied. They can already get dual core 64 bit laptops that can take any latest and greatest software out there and will for a long time. And Turion clocks faster and at lower power than Yonah. Those legacy applications will run faster on Turion for the same power.
Do note that AMD released an Opteron 175HE (2.2GHz dual core, 2x1MB L2, 55W max electrical, a far stricter standard than the loose Intel TDP). It has dual channel ODMC and all of those 64 bit niceties. Turion X2 (Taylor) will fit that into a 35W max electrical envelope plus have a dual channel DDR2 ODMC.
And anyone buying one of those 32 bit dual core jobs will say, "Sh##, I shoulda got the Turion X2!"
Pete
Wbmw:
The only fools are those who buy 32 bit only dual cores. The only good reason for dual core in a laptop is running those server type applications and OSes. Legacy software runs better on single core. The problem for Intel is that it can't run those latest and greatest OSes and applications on Centrino. Turion can, however. And there are mobile 64 bit dual cores that can. And they will require less power in the near future.
Pete
Chipguy:
Think about what you are saying. Dual core is really only used in multitasking, which implies server type loads. If it can't run server class SMP software, there's no really good reason for dual core in that laptop. Single core is enough and AMD already can run that server software on their single core.
So Yonah loses in two ways, either it can't run latest and greatest software or it won't be bought because it doesn't give you enough in the laptop software out there, especially for the price.
And before you say encoding, there is a maker of encoding software already that won't make a production 32 bit version because the 64 bit version has more than 50% greater performance. In fact they note a better than 100% improvement in some cases. So a dual core 32 bit will be slower than a single core 64 bit at the same clock. Why pay for more expensive, slower 32 bit dual cores?
Pete
Dear Alan81:
To make it easier, this was from an answer by Robert Rivet to analyst Mark Edelstone in the Q3-05 earnings Q&A session:
So this will comprehend the incremental depreciation associated with next year for Fab 36, and the decreasing depreciation for Fab 30. So I'll kind of give you the all-in net number for the microprocessor business, including the back end manufacturing operations. Think of a number of about $150 million of incremental depreciation net, for next year, compared to this year, for the microprocessor business. Obviously it'll be heavier loaded in the back half of the year than the front half of the year, as we continue to deploy more tools to run manufacturing. I hope that'll help a little bit, Mark.
From: http://epscontest.com/transcripts/05q3_amd_page2.htm
So extra depreciation for 2006 will be about $150 million more than what it was in 2005. Given that COGS outside of depreciation should go down with the ramp of 300mm wafers, COGS will likely be a wash, with R&D being somewhat higher in H1 versus H2 and depreciation being the opposite.
If AMD ships a higher percentage of dual core, look for ASPs to rise quickly as these will be high value server, mobile and mainstream desktop, but not value CPUs. Average ASPs should rise faster than COGS due to larger dies. Dual core will likely increase ASPs and after tax profits.
Pete
Dear Jules:
You can tell how high the HSF is because the fan that sits on top is 40x40x10mm (see review text on same page). If you use that as the baseline, the HSF is about 25mm which does not include the height of the socket, chip, PCB or backplate. Adding all of those in, you get to 35mm which is 1.4". Now in a T&L laptop, you have the thickness of the LCD display portion, the keyboard and the back panel plus the vent for the fan. You can see that if the notebook is thinner than 2", all of that just won't fit. That is not a notebook heatsink fan. The fan in a NB is placed to one side with either a heat pipe or thick metal plate to get the fins to one side with a blower type fan blowing air across the heatsink. If the thickness of what goes over the CPU die is more than a few mm thick, it just won't fit in a 1-1.5" T&L notebook.
You can look at the MSI notebook review or the Acer Ferrari notebook review for an example of a notebook heatpipe HSF.
Pete
Wbmw:
The data is missing on how much work they did during the test. A few sites have found that certain Centrino notebooks seem to notice that a battery life test is running and cut the processor speed down to the minimum. So time is not a good measurement for work completed. When they reloaded Windows from scratch, the good run times disappeared.
Your argument about the X40 is of the same quality, totally worthless, as it is likely one of those notebooks that cheat when running MobileMark's battery life test. This site doesn't test for actual work done per unit time when the tested notebook is running on battery. Another test like DVD watching can misrepresent CPU work per watt hour, as some GPUs include a HW MPEG2 decoder which removes most of the CPU work required. My Nvidia video card has such a decoding accelerator. It uses 1/5 the CPU load of the non accelerated video card I had before. So with it I get 3-4 times the DVD play work versus a non accelerated GPU, which likely works out to a 25-50% increase in play time for the whole system.
You would likely credit all of that to Centrino when the boost actually comes from the GPU. And you would be wrong. That's why trying to make the notebooks nearly identical keeps a lot of these biases out.
Pete
Wbmw:
Battery life/(weight * volume) is a terrible metric. It penalizes large batteries used to get more time between charges. Let's look at two notebooks that use the same exact power, but with one having a battery with twice the energy. The metric gives the smaller battery twice the score of the one with the larger battery. That is nuts!
It should be life/sqrt(weight*volume) at the least as that gets rid of the double penalty for longer run times. The better metric is number of work units divided by sqrt(power*weight). Then a notebook that recognizes the benchmark is running and puts the CPU to the lowest power-speed state to maximize life does not gain an advantage. Now the work unit could be anything like number of times a script completes, the number of prime95 runs, the number of Moldyn runs, etc. It should be a well defined reproducible constant amount of work covering many CPU families.
So the metric fails on three counts: it penalizes larger batteries with the same energy and volume densities, it is not well defined as to work done per unit of life, and it allows for quite a bit of cheating.
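A quick numeric check of the two-notebook case can be sketched in Python. The figures below are hypothetical, and the sketch assumes the weight and volume in the metric scale with the battery (same energy density), so doubling the battery energy doubles life, weight and volume together.

```python
import math

def life_per_weight_volume(life_h: float, weight_kg: float, volume_l: float) -> float:
    """The criticized metric: battery life / (weight * volume)."""
    return life_h / (weight_kg * volume_l)

def life_per_sqrt_weight_volume(life_h: float, weight_kg: float, volume_l: float) -> float:
    """The proposed minimum fix: battery life / sqrt(weight * volume)."""
    return life_h / math.sqrt(weight_kg * volume_l)

# Hypothetical notebooks drawing the same power; the second battery holds
# twice the energy, so it lasts twice as long and is twice as heavy and bulky.
small = dict(life_h=3.0, weight_kg=0.4, volume_l=0.25)
large = dict(life_h=6.0, weight_kg=0.8, volume_l=0.5)

ratio_plain = life_per_weight_volume(**small) / life_per_weight_volume(**large)
ratio_sqrt = life_per_sqrt_weight_volume(**small) / life_per_sqrt_weight_volume(**large)

print(f"life/(w*v):     small scores {ratio_plain:.2f}x the large battery")
print(f"life/sqrt(w*v): small scores {ratio_sqrt:.2f}x the large battery")
```

Under these assumptions the plain metric scores the small battery at twice the large one, while the square root version scores them the same. Neither version says anything about work done, which is why a work unit count is still the better numerator.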
Pete
Dear Mmoy:
To mount Windows partitions in Linux you can use lines like the ones below in /etc/fstab:
# NOTE: Windows partitions
/dev/hda1 /c vfat auto 0 0
/dev/hdb1 /d vfat auto 0 0
/dev/hdb5 /e vfat auto 0 0
/dev/hda7 /l ntfs auto 0 0
The first column is the partition device name, the second is where you want the Windows partition mounted, the third is the filesystem type and the fourth is the mount options (auto means it gets mounted at boot). The fifth is the dump flag and the sixth is the fsck pass number; 0 in both means the partition is neither dumped nor checked at boot.
The mount command for the "drive C" windows partition is
"mount -t vfat /dev/hda1 /c".
You can unmount it by simply doing "umount /c".
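If you want to script this, the six fstab columns map directly onto the manual mount command. Here is a minimal Python sketch that parses entries like the ones above and builds the equivalent command line without running it. The names FstabEntry, parse_fstab and mount_command are made up for this example, and the parser only handles plain whitespace-separated entries.

```python
from typing import List, NamedTuple

SAMPLE = """\
# NOTE: Windows partitions
/dev/hda1 /c vfat auto 0 0
/dev/hdb1 /d vfat auto 0 0
/dev/hdb5 /e vfat auto 0 0
/dev/hda7 /l ntfs auto 0 0
"""

class FstabEntry(NamedTuple):
    device: str       # column 1: partition device name
    mount_point: str  # column 2: where it gets mounted
    fs_type: str      # column 3: filesystem type
    options: str      # column 4: mount options
    dump: int         # column 5: dump flag
    fsck_pass: int    # column 6: fsck pass number (0 = skip)

def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab-style text, skipping blank lines and comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # incomplete entry; ignored in this sketch
        fields += ["0"] * (6 - len(fields))  # columns 5 and 6 default to 0
        device, mount_point, fs_type, options, dump, fsck_pass = fields[:6]
        entries.append(FstabEntry(device, mount_point, fs_type, options,
                                  int(dump), int(fsck_pass)))
    return entries

def mount_command(entry: FstabEntry) -> List[str]:
    """Build (but do not run) the manual mount command for one entry."""
    return ["mount", "-t", entry.fs_type, entry.device, entry.mount_point]

entries = parse_fstab(SAMPLE)
for e in entries:
    print(" ".join(mount_command(e)))
```

The first line printed is the same "mount -t vfat /dev/hda1 /c" command given above.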
Pete
Wbmw:
Go away! People are buying the processor that delivers, Turion, in greater and greater quantities. People are wising up. They see that Turion can run cutting edge software and Centrino can't. That's why Turion is growing faster than Centrino. And it's getting worse for Centrino over time. Real people do not like getting less than what they were promised and even less when they don't get what they paid for.
If you want the rosy scenarios, go away to the Intel thread! They wallow in the FUD, hype and promises. Here you get the hard reality. Crying about it just makes you seem like the three year old cry baby next door.
Pete
Wbmw:
Only Intel calls it a thermal virus. AMD just calls it a program and they still run it at full speed. Their TDP allows any program to run, whereas Intel only allows use of programs that don't work the computer hard.
But don't they charge a 50% premium for that last 10% of performance? For that extra money, they should allow you to get what you paid for. But because they take that away silently (look at how much trouble Sony got into by doing things silently), they are now providing a reason for their customers to switch away.
And you decry the bearers of the bad news, instead of those (including yourself) who promoted the lies and disinformation in the first place.
Pete
Wbmw:
Ignoring the obvious lack of similarity between laptop designs in that review, Turion does a nice job of competing with Dothan. It's an advantage that will be short lived, however.
Yes, the Turion and its successors will trounce Dothan and its successors.
<sarcasm>It's amazing that you would say they lacked similarity, in trials where the two laptops were made as similar in features as possible so that the comparison would be as close to apples to apples as it gets.</sarcasm> Face it, a 35W ML Turion beat a 27W Dothan in battery run time doing a constant load. Heaven forfend what a 25W MT Turion of the same clock rate and cache would have done.
Of course, if done at the same price, the Turion laptop would have been upgraded in clock, dropped to a 25W MT, given more memory, more disk and a larger battery (likely with a spare), and still been cheaper than the Dothan laptop. Saving $200-300 can buy a lot.
Pete
Wbmw:
Trouble is that YOU don't know how TDP works. You failed to see that they measured the temperature by the on die diode. You failed to recognize that designing for Intel's TDP is a recipe for failure when a "thermal virus" is run. You failed to realize that some laptop manufacturers fix Centrino overheating by adding software to throttle the CPU during normal operation. You failed to realize that desktop performance does not relate to what the CPU gets when run on batteries.
You failed to realize that to get a true picture, one must look at how much was done versus the power used doing it. Using an open air desktop for the work done number and a closed air, software throttled CPU for battery run time and average power use doesn't get you work per battery charge. The only check is to see work done while running on battery.
Dell probably did the unforgivable and used Intel's TDP numbers when designing the cooling solution. When hit with a "thermal virus" program, it overheated and then throttled. Just like what I and many others stated would happen. But no, you didn't believe us who actually design thermal solutions, didn't believe people who stated the prudent designer would use the maximum power plus some additional margin at elevated ambients instead of some typical TDP in a cool room.
As to a desktop cooler being low profile enough, I beg to differ. You forget that the height allowance for a HSF solution must subtract from the laptop height, the LCD display, the keyboard, the case, the MB height (the pins and solder bumps on the backside are not flat) and the socket height (most CPUs are not soldered to the MB). I don't know of any desktop CPU HSFs low enough to be used in a less than 1" thick T&L laptop.
Your disbelief does not make it untrue. Your attacks on the testing fell flat on their face. Your assumptions were dashed. Proper empirical testing beats papers, PR and FUD.
Pete
Wbmw:
The 32 way set associative cache in the older StrongARM and Xscale is implemented with 16 or 32 CAMs, one for each set. And that was done on a 350nm process (StrongARM)! My transistor estimates for the 1,023 way 64 byte wide 64KB L2 include all 35K SRAM bits plus all gates needed for the CAM portion. Every DRAM chip has hundreds of 10 bit plus 1 row decoders that use about a fourth of the above for each one. The LRU doubly linked list section takes about the same amount all told. That is not too much die area when you think that each cache line has 768 SRAM bits that store the data, ECC and predecode bits. Adding the equivalent of another 70-80 SRAM bits per line does not make it that much larger.
500K transistors is not that much compared to the 5 million used to store the data. The L1D and L1I caches are about the same size of 5 million transistors in each. The K8 core has quite a bit more transistors. 5.5 million is far less than the 72 million used in the 1MB L2 16 way SAC.
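The orders of magnitude here can be checked with a few lines of Python. This is a back-of-envelope sketch, not a process-accurate count: it assumes a standard 6 transistor SRAM cell (an assumption not stated above) and uses the 768 bits per 64 byte line and the 500K overhead figure from the post.

```python
TRANSISTORS_PER_SRAM_BIT = 6   # assumption: conventional 6T SRAM cell
BITS_PER_LINE = 768            # data + ECC + predecode bits per line (from the post)
LINE_BYTES = 64

def data_array_transistors(cache_bytes: int) -> int:
    """Transistors needed just to store the cache lines."""
    lines = cache_bytes // LINE_BYTES
    return lines * BITS_PER_LINE * TRANSISTORS_PER_SRAM_BIT

fac_64k = data_array_transistors(64 * 1024)      # the 64KB fully associative design
sac_1mb = data_array_transistors(1024 * 1024)    # a 1MB L2 with the same line format

CAM_AND_LRU_OVERHEAD = 500_000                   # estimate from the post
usable_lines = 64 * 1024 // LINE_BYTES - 1       # 1,023 lines; one slot is the list master
overhead_bits_per_line = CAM_AND_LRU_OVERHEAD / (usable_lines * TRANSISTORS_PER_SRAM_BIT)

print(f"64KB data array: {fac_64k / 1e6:.1f}M transistors")
print(f"1MB data array:  {sac_1mb / 1e6:.1f}M transistors")
print(f"overhead per line: about {overhead_bits_per_line:.0f} SRAM bit equivalents")
print(f"overhead vs data:  {CAM_AND_LRU_OVERHEAD / fac_64k:.1%}")
```

With these assumptions the 64KB data array comes to about 4.7 million transistors and a 1MB array to about 75 million, the same ballpark as the 5 million and 72 million figures quoted, and the 500K overhead works out to roughly 80 SRAM bit equivalents per line.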
As to it being bunk, then most all CPUs are bunk because Wbmw thinks that they waste too many transistors. I didn't hear you complain when Itanium went beyond 1 billion transistors most of which are in the caches.
I think you are just digging yourself in deeper.
Pete
Jokerman:
The joke's on you. LRUs can be implemented many different ways. A doubly linked list just has some advantages which greatly assist LRU speed. StrongARM and Xscale use round robin replacement due to its simplicity.
Pete
Chipguy:
You still do not know the difference between a fully associative cache and a set associative cache. A banked cache is a set associative cache, dummy! They did implement two 32 entry FACs for the TLBs (just like most L1 TLBs made today).
They did not implement a 1024 way 32 byte wide FAC. They built a cache of 32 sets, each with 32 ways of 32 bytes. They do not check all banks simultaneously!
In fact the author refers to this as "The chip features separate 16 kByte, 32-way set associative virtual caches for instructions and data. Each cache is implemented as 16 fully associative blocks." Every bank is implemented as a set. So they have 16 sets, each with 32 ways, in which any given set can be checked in a single cycle.
The author(s) interchanges sets and banks and you couldn't figure it out. That's your problem, not mine. Everyone else who read that paper and cited it, saw it as 32 way set associative. Everyone at Intel did too. That's why they refer to it as 32 way SET ASSOCIATIVE!
That paper does miss a few critical pieces which would allow anyone to see what they were doing. The Xscale documents show that while the caches use CAM to check all ways of a set in a single cycle, only one set (bank) is checked at any given time (to lower power).
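To make the set (bank) selection concrete, here is a minimal Python sketch of how an address splits into tag, set index and byte offset for the 16 kByte, 32 way, 32 byte line cache quoted above. Which physical address bits pick the bank in the real chip is not given in this thread; using the bits just above the line offset is an assumption made for illustration.

```python
def number_of_sets(cache_bytes: int, ways: int, line_bytes: int) -> int:
    """Sets = capacity / (ways * line size)."""
    return cache_bytes // (ways * line_bytes)

def split_address(addr: int, sets: int, line_bytes: int):
    """Return (tag, set_index, offset), assuming the set index sits just above the offset."""
    offset = addr % line_bytes
    set_index = (addr // line_bytes) % sets
    tag = addr // (line_bytes * sets)
    return tag, set_index, offset

sets = number_of_sets(16 * 1024, ways=32, line_bytes=32)
tag, set_index, offset = split_address(0x1234, sets, line_bytes=32)

print(f"{sets} sets of 32 ways")
print(f"address 0x1234 -> tag {tag}, set {set_index}, offset {offset}")
# Only the 32 tags held in the selected set go to the CAM for comparison;
# the other 15 sets (banks) stay idle for this lookup.
```

The set index narrows the search to one bank before any tag is compared, which is what makes the design set associative rather than fully associative.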
Pete
Wbmw:
You are still wrong. Each set may be fully associative, but the whole cache isn't. By your definition each set in every SAC is fully associative if it can check all ways in a set at once. Sorry, that is not the definition of a fully associative cache. It would make all direct mapped caches fully associative, period. And that is complete nonsense.
A FAC must check all its cache lines for a match. A SAC only checks those in a set. Bingo, that is exactly what Intel is doing. It just does it in one cycle. So does every direct mapped cache, the special case where there is only one tag per set.
That they now have to do that check in more than one cycle for speed/power reasons still makes it a SAC.
My design does that for all cache lines, that would be 1,023 lines (one is used for the list master record, making zero the marker for end of list). Thus it's a FAC.
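The distinction comes down to which lines a lookup has to compare against. A minimal Python sketch, using a simple modulo set index as an assumed mapping:

```python
def candidate_lines(line_address: int, total_lines: int, ways: int) -> range:
    """Indices of the cache lines whose tags must be compared for this address."""
    sets = total_lines // ways
    set_index = line_address % sets          # assumed mapping: modulo on the line address
    return range(set_index * ways, (set_index + 1) * ways)

TOTAL = 1024  # e.g. 32KB of 32 byte lines

direct_mapped = candidate_lines(5, TOTAL, ways=1)        # one tag per set
set_assoc = candidate_lines(5, TOTAL, ways=32)           # 32 sets of 32 ways
fully_assoc = candidate_lines(5, TOTAL, ways=TOTAL)      # a single set holding every line

print(len(direct_mapped), len(set_assoc), len(fully_assoc))
```

Checking every way of one set in a single cycle does not change the count: the set associative lookup still ignores all the other sets, while the fully associative lookup compares against every line in the cache.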
Pete
Wbmw:
The 4-5 cycles is just what it takes for the update to the LRU algorithm. It's going to be much greater when you take the access in its entirety, and that may again defeat the benefits of the added hit rate.
Wrong again!
The first cycle checks all cache lines at once! A hit is known before the end of the first cycle, no matter how many cache lines are in the cache. With the correct number of registers and columns, you can remove the cache line from the linked list on cycle two. You can then add the cache line to the head of the list during cycle three. On cycle four you can be doing another address lookup in the cache.
Result is that you can check the cache in one cycle to know if there is a hit. If no cache lines match the requested address, on the second cycle a request is made of the next level (either L3 or memory). Two cycles are used to remove the tail entry, place it at the head and ready it to be filled with the new data from outside. This part could be delayed until the first word of the new data comes. It takes between 4 and 8 cycles to fully obtain the data anyway depending on SRQ and XBAR word size.
The five cycle figure was to allow a smaller amount of logic in the LRU update (going from two cycles to four). If cache misses occur then the throughput and latency of the L2 cache become 1 cycle. The latency increases to 3 on cache hits (or 5 using less update logic) and can go higher on cache misses that get data, but the cache will still be available to do another check one cycle before the cache line is filled, as cache data is fully available after the next probable hit.
In fact now that I think about it, since the LRU update is isolated from the address check, you could start a new address check on the last cycle of the LRU update. Thus the throughput is one check every two (or 4 with the smaller logic) cycles (except on cache filling misses).
As to loading it into an FPGA, sorry, that idea isn't patentable as it's been standard practice in software for decades. I don't think the USPTO could be that dumb (although they have made such silly mistakes before).
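The bookkeeping described here (match, unlink, relink at the head) can be modeled in software. The Python sketch below is a behavioral model only, not the hardware design: a dictionary stands in for the CAM that matches all lines at once, the class and method names are made up for this example, and the "cycle" comments simply mirror the steps described above. Slot 0 plays the list master record, so index 0 marks the end of the list and a 1,024 slot array gives 1,023 usable lines.

```python
class FullyAssociativeLRU:
    """Fully associative cache with LRU order kept as a doubly linked list."""

    def __init__(self, slots: int = 1024):
        self.capacity = slots - 1          # slot 0 is the list master record
        self.tag = [None] * slots
        # Circular list through slot 0: next[0] is the head (most recent),
        # prev[0] is the tail (least recent).
        self.next = [(i + 1) % slots for i in range(slots)]
        self.prev = [(i - 1) % slots for i in range(slots)]
        self.where = {}                    # tag -> slot; stands in for the CAM

    def _unlink(self, slot: int) -> None:
        p, n = self.prev[slot], self.next[slot]
        self.next[p] = n
        self.prev[n] = p

    def _push_head(self, slot: int) -> None:
        head = self.next[0]
        self.next[0] = slot
        self.prev[slot] = 0
        self.next[slot] = head
        self.prev[head] = slot

    def access(self, tag) -> bool:
        """Look up a tag; on a miss, reuse the least recently used line."""
        slot = self.where.get(tag)         # "cycle 1": compare against all lines
        hit = slot is not None
        if not hit:
            slot = self.prev[0]            # tail entry is the replacement victim
            old = self.tag[slot]
            if old is not None:
                del self.where[old]
            self.tag[slot] = tag
            self.where[tag] = slot
        self._unlink(slot)                 # "cycle 2": remove from the list
        self._push_head(slot)              # "cycle 3": add at the head
        return hit

cache = FullyAssociativeLRU(slots=4)       # tiny cache: 3 usable lines
results = [cache.access(t) for t in ["a", "b", "c", "a", "d", "b", "a"]]
print(results)
```

In the tiny example, "b" is the least recently used line when "d" arrives, so the later access to "b" misses while "a" still hits. The list update touches only a handful of pointers no matter how many lines the cache holds, which is why it can be kept separate from the address check.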
Pete
Chipguy:
Page 6 paragraph 3:
All requests that "miss" the instruction cache generate a 32-byte read request to external memory
That's a 32 byte wide cache line. 32 ways times 32 bytes per cache line means 1KB of cache per set. There are 32 sets of 32 ways each in the instruction cache.
Page 6 paragraph 4:
Each cache has a line size of 32 bytes, ...
So the data cache is the same except for:
Page 6 paragraph 5:
The processor core enables software applications to re-configure up to 28 Kbytes of the data cache and use it as high speed RAM
At no time is either cache fully associative (having 1,024 ways of 32 bytes each). It is unknown whether the remaining 4KB of data cache is composed of 4 sets of 32 ways each or 32 sets of 4 ways each. The latter is more likely due to tag size usually being fixed.
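The arithmetic behind those numbers can be laid out in a short Python sketch. It uses only the figures quoted above (32 byte lines, 32 ways, 32 sets, 28 Kbytes reconfigurable); the two candidate layouts for the leftover cache are the ones named in the post, and nothing here settles which one the chip actually uses.

```python
LINE_BYTES = 32
WAYS = 32
SETS = 32

set_bytes = WAYS * LINE_BYTES              # bytes covered by one set
cache_bytes = SETS * set_bytes             # total cache size
remaining_bytes = cache_bytes - 28 * 1024  # left over after 28 Kbytes become RAM
remaining_lines = remaining_bytes // LINE_BYTES

# The two candidate organizations for what is left, as (sets, ways).
candidates = [(4, 32), (32, 4)]
valid = [(s, w) for s, w in candidates if s * w == remaining_lines]

print(f"{set_bytes} bytes per set, {cache_bytes // 1024}KB total")
print(f"{remaining_bytes // 1024}KB left = {remaining_lines} lines")
print("layouts that fit:", valid)
# A fully associative version of the whole cache would need this many ways:
print("lines in the whole cache:", cache_bytes // LINE_BYTES)
```

Both layouts account for the remaining 128 lines, so capacity alone cannot tell them apart; the fixed tag size argument is what favors 32 sets of 4 ways.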
This is not what I say about it, but what Intel says about it. Are you calling Intel a liar? No, here they do not lie. You do not comprehend what they are saying.
Situation normal, Chipguy is wrong!
Pete
Wbmw:
Neither Dell nor IBM have them currently in stock, try again.
Pete