
Re: None

Sunday, 12/11/2005 6:25:12 AM


Post# of 5827
Google engineer on power costs. The link has all the charts and follow-up pages. Interesting to note that at Google, performance per watt has stayed virtually the same over multiple generations of hardware, while articles about AMD and Intel x86 processors usually claim rapidly accelerating performance per watt over recent years (decades?). Who is right? Maybe cooling costs are greatly misjudged?

*****************************
http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=330

The Price of Performance
ACM Queue vol. 3, no. 7 - September 2005
by Luiz André Barroso, Google

An Economic Case for Chip Multiprocessing
Cost

In the late 1990s, our research group at DEC was one of a growing number of teams advocating the CMP (chip multiprocessor) as an alternative to highly complex single-threaded CPUs. We were designing the Piranha system [1], which was a radical point in the CMP design space in that we used very simple cores (similar to the early RISC designs of the late '80s) to provide a higher level of thread-level parallelism. Our main goal was to achieve the best commercial workload performance for a given silicon budget.

Today, in developing Google's computing infrastructure, our focus is broader than performance alone. The merits of a particular architecture are measured by answering the following question: Are you able to afford the computational capacity you need? The high computational demands inherent in most of Google's services have led us to develop a deep understanding of the overall cost of computing, and to continually look for hardware/software designs that optimize performance per unit of cost.

This article addresses some of the cost trends in a large-scale Internet service infrastructure and highlights the challenges and opportunities for CMP-based systems to improve overall computing platform cost efficiency.

UNDERSTANDING SYSTEM COST

The systems community has developed an arsenal of tools to measure, model, predict, and optimize performance. The community's appreciation and understanding of cost factors, however, remain less developed. Without thorough consideration and understanding of cost, the true merits of any one technology or product remain unproven.


We can break down the TCO (total cost of ownership) of a large-scale computing cluster into four main components: price of the hardware, power (recurring and initial data-center investment), recurring data-center operations costs, and cost of the software infrastructure.
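The four-component breakdown above can be sketched as a trivial cost model; the figures plugged in below are illustrative assumptions for a hypothetical cluster, not Google's numbers.

```python
# Hypothetical sketch of the TCO breakdown described above: hardware
# price, power (recurring plus data-center investment), data-center
# operations, and software infrastructure. All dollar amounts are
# made-up assumptions for illustration.
def cluster_tco(hardware, power, operations, software):
    """Sum the four main TCO components of a large-scale cluster."""
    return hardware + power + operations + software

total = cluster_tco(hardware=3_000_000, power=1_200_000,
                    operations=500_000, software=800_000)
print(total)  # 5500000
```

The point of modeling it this way is that the components can be compared directly: as the article goes on to argue, the power term grows relative to the hardware term when performance per watt stays flat.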

Often the major component of TCO for commercial deployments is software. A cursory inspection of the price breakdown for systems used in TPC-C benchmark filings shows that per-CPU costs of just operating systems and database engines can range from $4,000 to $20,000 [2]. Once the license fees for other system software components, applications, and management software are added up, they can dwarf all other components of cost. This is especially true for deployments using mid- and low-end servers, since those tend to have larger numbers of less expensive machines but can incur significant software costs because of still-commonplace per-CPU or per-server license-fee policies.

Google's choice to produce its own software infrastructure in-house and to work with the open source community changes that cost distribution by greatly reducing software costs (software development costs still exist, but are amortized over large CPU deployments). As a result, it needs to pay special attention to the remaining components of cost. Here I will focus on cost components that are more directly affected by system-design choice: hardware and power costs.

Figure 1 shows performance, performance-per-server price, and performance-per-watt trends from three successive generations of Google server platforms. Google's hardware solutions include the use of low-end servers [3]. Such systems are based on high-volume, PC-class components and thus deliver increasing performance for roughly the same cost over successive generations, resulting in the upward trend of the performance-per-server price curve. Google's fault-tolerant software design methodology enables it to deliver highly available services based on these relatively less-reliable building blocks.



Nevertheless, performance per watt has remained roughly flat over time, even after significant efforts to design for power efficiency. In other words, every gain in performance has been accompanied by a proportional inflation in overall platform power consumption. The result of these trends is that power-related costs are an increasing fraction of the TCO.

Such trends could have a significant impact on how computing costs are factored. The following analysis ignores other indirect power costs and focuses solely on the cost of energy. A typical low-end x86-based server today can cost about $3,000 and consume an average of 200 watts (peak consumption can reach over 300 watts). Typical power delivery inefficiencies and cooling overheads will easily double that energy budget. If we assume a base energy cost of nine cents per kilowatt hour and a four-year server lifecycle, the energy costs of that system today would already be more than 40 percent of the hardware costs.
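The back-of-the-envelope estimate above can be checked directly. The inputs come straight from the article's text ($3,000 server, 200 W average draw doubled by delivery and cooling overheads, $0.09 per kWh, four-year lifecycle); the code is just the arithmetic made explicit.

```python
# Reproduce the article's energy-cost estimate. All constants are
# taken from the text, not measured values.
SERVER_PRICE = 3000.0     # dollars
AVG_POWER_W = 200.0       # average server draw in watts
OVERHEAD_FACTOR = 2.0     # power-delivery inefficiency + cooling
RATE_PER_KWH = 0.09       # dollars per kilowatt-hour
YEARS = 4                 # server lifecycle

hours = YEARS * 365 * 24
kwh = (AVG_POWER_W * OVERHEAD_FACTOR / 1000.0) * hours
energy_cost = kwh * RATE_PER_KWH

print(round(energy_cost, 2))                 # 1261.44
print(round(energy_cost / SERVER_PRICE, 3))  # 0.42, i.e. >40% of hardware cost
```

The effective 400 W draw over four years works out to roughly 14,000 kWh, or about $1,260 in energy, matching the article's "more than 40 percent of the hardware costs."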

And it gets worse. If performance per watt is to remain constant over the next few years, power costs could easily overtake hardware costs, possibly by a large margin. Figure 2 depicts this extrapolation assuming four different annual rates of performance and power growth. For the most aggressive scenario (50 percent annual growth rates), power costs by the end of the decade would dwarf server prices (note that this doesn't account for the likely increases in energy costs over the next few years). In this extreme situation, in which keeping machines powered up costs significantly more than the machines themselves, one could envision bizarre business models in which the power company will provide you with free hardware if you sign a long-term power contract.
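The extrapolation behind Figure 2 can be sketched as follows: if performance per watt stays flat while deployed capacity grows at some annual rate, energy spend compounds at that same rate. The base annual energy figure (~$315/year) follows from the article's $3,000-server example; the growth rates and five-year horizon are illustrative assumptions, not the article's exact curve.

```python
# Sketch of the Figure 2 extrapolation under flat performance per watt.
# BASE_ANNUAL_ENERGY derives from the article's example: ~$1,261 of
# energy over a 4-year lifecycle. Growth rates are assumed scenarios.
BASE_ANNUAL_ENERGY = 1261.44 / 4   # dollars per year
SERVER_PRICE = 3000.0

five_year_energy = {}
for growth in (0.10, 0.20, 0.30, 0.50):
    # Energy cost compounds with capacity growth year over year.
    five_year_energy[growth] = sum(
        BASE_ANNUAL_ENERGY * (1 + growth) ** year for year in range(5))
    print(f"{growth:.0%} annual growth: "
          f"${five_year_energy[growth]:,.0f} energy over 5 years "
          f"vs ${SERVER_PRICE:,.0f} server price")
```

Under the most aggressive 50 percent scenario, five-year energy spend (~$4,160) already exceeds the server price, consistent with the article's claim that power costs could overtake hardware costs by the end of the decade.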



The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing, not to mention the overall health of the planet. It should be noted that although the CPUs are responsible for only a fraction of the total system power budget, that fraction can easily reach 50 percent to 60 percent in low-end server platforms.

