New HT 2.0 article
Just about to read it:
http://www.devx.com/amd/Article/26985
Prescott 2M Max Tc
Note too that the max allowed case temperature (Tcase) for Prescott 2M w/ EM64T has dropped from 72.7C to 70.7C.
Obviously the design is "tighter". Dropping the max Tcase by 2C makes the cooling solution even harder than it already is, and it's already tough enough. What a nightmare.
AMD more reliable than Intel
That doyen of hardware sites, known for its impartial view on intel v. AMD (not!), is running stress tests on systems.
Seems that the intel system leaves a bit to be desired:
http://www20.tomshardware.com/stresstest/index.html
They'll probably start dropping coffee on the AMD system any time now. Anything to keep their intel ad revenue.
Semiconductor Breakthrough: Processor 24 times faster
Gotta love these journalists!
On Google news no less.
http://www.earthtimes.org/articles/show/852.html
Scc on SIMD
Often you can avoid setting condition codes by using xors, mins and maxes. What are you trying to do? I generally find that the video code I have worked on can avoid any jcc.
Hi mmoy - nice assembler optimization
As promised, here's a nice trick you can do with assembler - it's one of the best optimizations I have come across, yet it's not in any software optimization guide I've seen, certainly not AMD's.
Many loops have a general form:
top:
[do_loop_body]
[test_for_some_condition, if true break]
repeat_n_times
endloop:
This is generally encoded as a test, jcc, dec, jnz
But there is a better way if the test, jcc sets & tests the carry flag.
Note that a) ja jumps if CF==0 && ZF==0
and b) dec doesn't change the CF
so you can do this:
test_condition ; sets CF if break
dec reg ; doesn't change CF
ja top ; jumps only if neither CF nor ZF is set
Compared to some of the points in AMD's Software Optimization Guide this is a much bigger win than most. One of the big issues in the Guide is the number of jumps in a cache line; this trick often helps with that problem.
Enjoy!
Hi mmoy:
Took a wander over to moox and saw that you're active there.
I presume you profiled Firefox, what are the critical sections?
Do you do any assembler or is it all C stuff? If you do assembler I have some tricks you may be interested in.
All agreed here.
Yup, no question about it, everything is just hunky and there's not a cloud in the sky.
Intel's just like a bunch of happy beavers.
Whistle-while-you-work, oh yes, whistle while you work. .
Do-dah-de-dah, hmmmmm.
About credibility.
Does anyone else feel that the multitude of statements that intel's execution problems have been "solved" leaves just a teensy-weensy bit of disbelief?
I mean it was just a few weeks ago that Barrett exploded with rage in the widely circulated memo about intel really not doing their stuff.
What happened in those few short days? I mean, did the archangel Gabriel come down and anoint all the researchers? There's hardly been time for any results to shore up the "it's fixed" claims.
More likely a whole lot of foils have been adjusted, schedules pulled in on critical paths - whether it can be done or not.
Meanwhile there's a whole lot of senior foreign scientists doing a "Berners-Lee" (the inventor of the WWW, who left MIT for grayer climes on the other side of the pond). My neighbor told me last Friday that researchers were leaving Amgen in Thousand Oaks, CA in quite large numbers to return to their countries of birth. Something to do with the megalomania coming out of DC, I believe.
So it's pull in the schedule, dust off the resume, call mom in Bangalore and start making preps for the trip.
Well we'll see if intel's firing on "all 8 cylinders" won't we.
On chip memory controller.
Trouble is that intel has had a lot of trouble with memory controllers.
Let's hope they don't use the same circuitry as they did in the MTH.
It's a long way from foil to product. Another recall would be rather embarrassing.
"No serious developer uses a laptop"
Well I can tell you that you're wrong!
I need a 64-bit portable
If I'm developing software for AMD64 desktops or servers I need a 64-bit laptop on which to work, don't I?
It's not just gamers.
"directx on servers?"
DirectX is used for more than games.
Take a look at Sun's JRE (Java runtime). I just had an issue with a machine that threw a Kmode_Exception in D3DX. When you ran Java apps it would trigger the same exception.
I know there's a way to tell the Java runtime not to use DirectX, but the default is to use it.
I believe java has some use on servers, no?
This is just an example.
No DirectX on Itanic
Just going over the SDK notes for DirectX 9.0c and there's this line:
"The is no support for the IA64 bit platforms. "
Yeah, itanic takes over the world.
Now THAT was funny.
I actually guffawed! Thanks.
Nvidia SLI question
Can someone explain to me please why this Nvidia SLI (dual video card motherboard for one display) is such a great thing?
Surely it makes more sense just to make a video card with 2 GPUs?
Slightly OT: New Google service
This looks really good:
http://scholar.google.com/
Returns search results that are papers & publications.
Quote "Finally, OoO is of very little benefit for most FP intensive code."
Where do you come up with this BS?
I told you exactly my experience of writing for OOO and non-OOO procs. Do you actually read what's posted or do you just replay the party line? I stated, and I will state again, that coding for fp-intensive operations is much easier on the Athlon because the proc takes care of the micro-level optimization.
Non-OOO procs are a royal pain. IMHO they have no place in high-end machines because the results are so unpredictable.
One butterfly flap of the wings and the run time goes to hell.
. . . and a corollary
Thinking about it - since non-OOO fp procs are highly sensitive to micro-optimizations - it would seem reasonable to me that compiler writers would put lots of effort into the codegen for oft-used SPEC sequences.
But of course intel would never tweak benchmarks to favor their own processors, would they????
Effects of Compiler, #regs & OOO on fp performance
I've watched this piece of fud about hidden regs vs visible regs with some amusement.
With itanic the optimization is done by the compiler. Since there is no out-of-order facility in the proc there is little opportunity to optimize at run time.
I experienced this when I moved from the K6-III to Athlon in my heavily hand-tweaked assembler. The K6-III fp unit had no opportunity to do OOO execution on fp code, like the itanic. When you were in a complex section just changing the order of a couple of lines could result in pretty large speed changes. What a pain! You even had to put nops in to align code on boundaries.
In contrast Athlon optimizes the fp code at run time by moving the ops around as resources come available. Thus micro-optimization by the coder is pointless - you really don't see any difference by tweaking the odd line here or there - or sprinkling nops around.
It looks to me, from articles on the Athlon64, that the fp OOO has been improved quite a bit. Of course just having 16 SSE regs is the bee's knees.
Saying the itanic's registers are "better" because they are visible is just silly. They have to be visible or the proc can do nothing worthwhile with them. It's compile-time optimization that is the whole basis of the EPIC design. I'd rather have intelligent run-time hardware that maximises the resources available.
There are lots and lots of studies that analyze the benefit of increasing the # of regs. Of course it's diminishing returns. 16 seems optimal right now given software tech and hardware design.
In so many ways x86 is broken. AMD64 makes it worthwhile for the first time.
'It looks like the answer is "no".'
No, it looks like the answer is being debated.
Probably between the intel marketeers and the people who actually do things like CFD (as I used to, a lot).
My conclusion, FWIW, is that there's a lot more to running analysis than just spouting some magic number.
Infiniband, Pathscale & Clusters
Article on how Pathscale is using Hypertransport, incl. the new HTX connector, with Infiniband for cluster interconnects and the benefits therein.
http://www.devx.com/amd/Article/22534
SPEC, cache & memory
From the same site as the Opteron fp unit discussion, another article about the brouhaha over spec scores:
Quote: "The SPEC 2000 benchmarks are subject to much debate in the scientific community. Are they broken? Do they just depend on memory bandwidth? Do they fit entirely in the cache? "
http://www.chip-architect.com/news/2003_08_29_Cache_efficiency_for_SPEC2000.html
Note the comment in the final para:
"The memory footprint of the SPEC2000 benchmarks is less then 200 MByte to be able to run on systems with 256 MByte DRAM. Heavier applications using multiple Gigabyte structures are likely to see much greater degradations. AMD's distributed memory solution based on HyperTransfer links is likely to pay of in these cases. A four processor 2200 MHz Opteron may reach a similar SPEC2000_rate performance as a four way 1500 MHz Itanium 2 even though the latter has a much higher single processor score. Again, larger floating point memory footprints may skew the results even further. "
Lack of 90nm in channel
These amazing power consumption figures strongly suggest that:
a) 90nm chips are going to preferred customers (SUNW especially)
b) AMD wants to clear 130nm chips out of disties
That's why, for example, Reseller Mike sees few 90nm chips coming down the line.
Low power consumption
The low power consumption, as shown in the chip.de and tomshardware articles, bodes extremely well for dual cores next year.
AMD has got quite a process there to match its matchless design.
Large cache & real world problems.
I've done a lot of fluid dynamics. I've published (15 years ago now) in this field. When I started CFD 25 years ago it was really in its infancy.
You always start with a small number of steps (finite-difference based methods) or a small number of mesh points (finite element methods). The governing parameter is the speed of solution - it used to be the size of virtual memory, but that constraint is now relaxed on workstations.
You want to have really detailed modeling of the system - i.e. lots of points - but you always hit the limit of run time. It's no fun waiting all day for your solution only to find out it blew up.
So there is no magic "ideal" working set. More is always better if you have a fast enough system.
The specfp tests have a significant working set but it's a fixed set. Cynical manufacturers could make processors where all of the code+data fits on-chip. Of course when the real user then tries to extend the simulation, and the code+data spills into physical memory, he/she is going to be disappointed. There is another point where the spillage from physical to virtual (i.e. disk) kills you, but physical memory is cheap and relatively easily extended. On-chip cache is neither cheap nor easily extended.
That's why you don't take these specfp results only at face value. Take a trip to somewhere like comp.arch and see what people think about the itanic there. They will say exactly the same thing as me: these specfp results are all about false benchmarking.
Ho ho ho specfp & itanic
Just had to laugh.
I told you how itanic does well on specfp because of its large cache, a factor that does not extend to real world scenarios.
Now intel ups the cache to 9MB and it does really, really well - and you crow about it.
Thanks for making my point even more clear.
I hope intel just keeps pushing that boat out.
"Super platinum member"
I remember the "platinum" call. My wife went all gushy over that one (she answered the phone). I think that was about $350K. That was the day that AMD hit $90+.
Balance that against the day when AMD hit $3 and she sat on the bed sobbing as I got the margin call. Then she tried to hit me with a frying pan and told me I was stupid to believe in that @@*&$#@ AM whatever D company. Fortunately I had the cash to cover the call.
Super platinum is defined by a slight swagger in the walk, an ear-to-ear grin and an i-told-you-so expression. As this is potentially a family board I can't tell you about the member.
"AMD really isn't that much different than it was 6 months ago"
Yup, that's why the stock boards have been bursting with messages and why I was accumulating at 7.6, 9.3, 12.02 (a LOT), 16.
Feels pretty good today. Waiting for my call from Schwab - you know the one "you are now a super-platinum member, congratulations"
When to sell.
No, now is NOT the time to exit; the runup has just started. A lot of people look to TA for an exit point. That's well and good, but the fact here is the momentum. Just take a look at the 3 month chart. It's awesome. There will be some settling along the way but this puppy won't be fully valued until its cap > $20bn, and that's a long way from here. Of course by the time it reaches $20bn we will certainly have Solaris 10 and probably Windows XP64. Then we'll be looking for $30bn.
Just my 2c
Looking at the specfp scores it doesn't seem that the change from v7.1 to v8 made that much difference (see the SGI Altix lines where you can directly compare).
There are so few results for the itanic and, strangely, very few where you can break out the effect of cache versus clock.
Looks to me like the weasel is firmly in intel's camp.
Changes to k8 fp execution unit with E0 stepping
Sorry not to get back to you on this but been busy.
I did find this summary of what to expect as SSE3 arrives in the core:
"Kevin McGrath, chief architect of the AMD "Hammer" line, recently gave a presentation at Stanford University detailing the forthcoming changes in the next revision of the Athlon 64 and Opteron. Apparently both processor lines will feature full compatibility with SSE3. In fact, it may actually be somewhat better than Intel's SSE3, as the AMD chip will dynamically translate some of the SSE instructions into operations specifically tailored to the "Hammer" design, in some cases lowering latency down to as little as one cycle. Intel's latest "Prescott" iteration of the Pentium 4 design requires many more cycles to complete the same work due to its higher clock speed and deeper pipelines."
http://www.geek.com/news/geeknews/2004Mar/bch20040303024101.htm
Looks to me like a significant revamp of the fp unit.
Cache usage on specfp
Uh, your link points to memory, not cache usage. The executing portion of the program << total program size.
See how the itanic2 scales with cache (from spec site):
Dell PowerEdge 3250 (1.4GHz/1.5MB, Itanium2) | 256KB L2 (I+D) on chip | 1.5MB L3 (I+D) on chip | base/peak 1444/1444 | Sep-2003
Dell PowerEdge 3250 (1.4GHz/3MB, Itanium2) | 256KB L2 (I+D) on chip | 3MB L3 (I+D) on chip | base/peak 1868/1868 | Apr-2004
Dell PowerEdge 3250 (1.5GHz/6MB, Itanium2) | 256KB L2 (I+D) on chip | 6MB L3 (I+D) on chip | base/peak 1875/1875 | Aug-2003
So at the same clock (1.4GHz) the fp score goes up > 400points as we double the cache from 1.5MB to 3MB!
Now increase the clock to 1.5GHz and double the cache to 6MB and the score goes up 7 (yes that's SEVEN) points.
Duh, doesn't that look like the score is highly cache-sensitive, with a used cache size somewhere between 1.5 and 3MB? Once the app is bigger than the cache I suggest that the 1444 score is the accurate figure (add another seven on for the clock speed, say 1451).
At 1451 it's a bit inferior to the Athlon64s, as you no doubt realize. Typical results for the Opteron 150 run 1528-1644 with a 1MB cache.
Itanic2 blade
Going to be some sorry customers if they don't do their homework:
http://story.news.yahoo.com/news?tmpl=story&u=/nf/20041109/bs_nf/28270
Itanic2 does well on specfp because the spec routines fit in the on-chip cache. Most real numeric tasks are not in-cache and those specfp results don't pan out in the real world.
intel has hardly made a secret of its give-aways on itanic.
Maybe it's time you faced the obvious.
Still, I'm happy to see intel carrying on pushing the itanic for all it's worth.
If you want a different response there's always the intel board where you could post. I'm sure you will get a more favorable audience there.
IPF
You don't think that intel has taken to seeding that market with free itanics? It's a highly visible list. I wouldn't bank on breakthrough itanic sales on the basis of this if I were you. Once intel starts actually charging for the chips I doubt we'll see many new entries.
Meanwhile Cray & Sun are just starting to put out their Opteron products.
We'll see!
Maintaining specific code
That's why I prefer a macro assembler! The relevant bits can be extracted into included files.
You make an important point about pentiums. The only thing they have in common is the branding. I have a Pentium III here in a laptop, and I could not use that to develop Willamette or Northwood code.
While it makes sense for corporations to be "all p3" or "all p4" there is no logical basis for being "all intel" except for the "intel inside" sticker. There was a basis for using intel chipsets instead of via's, but that has passed now that we have HT and nForce.
Vendor specific code
That's the beauty of plug-ins. You can let the user specify what processor they have. In fact my code checks the CPUID and determines which are valid choices so a k6 user can't specify SSE for example.
I did 3 versions from one codebase:
3dnow/MMX for k6
3dnow/MMX enhanced for athlon
3dnow/SSE/MMX for athlon xp
MMX for intel chips (less accurate)
Someone else did an x87 version that was as accurate as my 3dnow but much, much slower.
This was highly time critical code. An early version became part of DivX but then they went proprietary. I don't know what they use there now.
I liked 3DNow except for one thing: you couldn't specify the rounding mode and it was not defined as part of the standard.
"Maybe that wouldn't be practical."
No, I don't think it would be. The main issue is that, AFAIK, MMX is not supported for 64-bit progs. So I have to convert MMX to SSE.
A lot of the code is macro driven so, with luck, that's write-once etc. It was very tight code - all I could do to avoid register spillage. Now that there are 16 regs that should be less of an issue.
I also have a section that is 3DNow; now that will be interesting. This code (it's freeware, distributed as part of a major freeware package) is run a lot and on large video datasets. So how many people knew that 3DNow ended up being so widely used?
I think you're mixing up two threads that I was commenting on simultaneously.
The E0 change in the fp unit has nothing to do with reliability.
I pointed out my own experience with early VIA Athlon chipsets as the reason, or maybe a reason, why intel did well with the P4 when Win2000 was being adopted. It was certainly sobering for me, and I reported that the virtualdub author had knowledge of similar experiences. He is now using an Athlon64. Virtualdub is very important to video enthusiasts.
Any improvement in the speed of SSE processing will be very welcome in video - this from my own profiling of video-related code. It looks like the Athlon64 will become the preferred platform for video, especially as 64-bit apps arrive. I see video as the future of home computing.
Sorry that my comments on two unrelated matters caused confusion. Now, when will I get time to port my code (assembler) to AMD64?