8 November 2013 by Jonathan Koomey
Jonathan Koomey, research fellow at the Steyer-Taylor Center for Energy Policy and Finance at Stanford University, says predictive modeling can unlock stranded capacity and identify practices for higher efficiency and reliability
Jonathan Koomey will be speaking at DatacenterDynamics Converged London on Day 2, November 21. The two-day event is held on November 20 and 21 at the ExCel London ICC.
More than three decades ago, engineers designing PCs used simple physical models to prevent computing devices from overheating when operated. They built models using cardboard or other convenient materials, placing fans, chips and other key components inside the box to understand airflow, heat transfer and temperature variations. Because of the expense and complexity of this crude approach, they could only test a modest number of design variations before deciding on which worked best.
At some point, the industry realized computer simulation of the heat and airflow within the box was possible, but it wasn’t simple. The physics of heat transfer was well understood in the abstract, but when dealing with manufactured components installed in complex systems, modeling tools and measurements had to be combined in the right way to make accurate predictions possible. Such tools allowed a designer to simulate the effects of swapping out components without having to physically modify the system. Because simulation was fast and cheap, designers could assess hundreds or thousands of combinations before deciding on a final design.
Data center designers face an even more complex challenge than early PC designers. It is as if a PC had hundreds or thousands of fans, power supplies, chips and disk drives, and all these components were moving around in ways almost impossible to predict. Adding to the complexity is the fact that each data center is unique, so a model specific to each facility must be developed and calibrated. And the IT arrangement within the facility changes significantly over time, unlike the interior of a mass-produced device such as a computer.
If data centers were built and operated as designed, the electricity use (and thus the cooling demand) of the computing equipment would be relatively predictable, but in real facilities the installation and operation of that equipment rarely match the assumptions used in the original design. The result is that cooling and power capacity becomes fragmented because of the imperfect and unpredictable nature of all real computing deployments, stranding infrastructure capital that could otherwise be delivering power to additional servers.
Businesses install computers to solve business problems, but because of time constraints, technical complexity and the fragmented nature of most data center organizations, the way computing devices are chosen, deployed and operated is far from optimal. That means the expensive capital represented by the data center itself is in most facilities being used at far below its maximum potential. This stranded capital is often one-third or more of the total, so a significant part of assets costing tens or hundreds of millions of dollars is generating no financial return.
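To make the scale concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `stranded_capital` and the $90 million facility cost are assumptions chosen for the example, and the one-third fraction is the rough figure cited above, not a measurement from any particular site.

```python
def stranded_capital(total_capital_usd: float, stranded_fraction: float) -> float:
    """Return the capital tied up in capacity that cannot be used.

    Both arguments are illustrative inputs, not measured values.
    """
    if not 0.0 <= stranded_fraction <= 1.0:
        raise ValueError("stranded_fraction must be between 0 and 1")
    return total_capital_usd * stranded_fraction


# Hypothetical facility: $90 million of infrastructure, one-third stranded.
idle = stranded_capital(90_000_000, 1 / 3)
print(f"Capital generating no return: ${idle:,.0f}")
```

A real assessment would start from a calibrated model of the facility rather than a single assumed fraction.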
The data center industry is only now beginning to face up to this challenge. Predictive modeling tools now exist that allow each data center to track its equipment, match that equipment inventory with measurements of airflow, temperature and energy use, and analyze in advance how different possible configurations of equipment would affect stranded capacity. If such tools are used properly, a significant fraction of stranded capacity can be recaptured, but it will take a sea change in the way many of these facilities are designed, built and operated. It also requires management changes: the problem is not just (or even primarily) a technical one.
Most existing data centers have significant stranded capacity because they haven’t taken advantage of the new tools and changed their management practices. In fact, it’s the change in institutional structure that is the most critical factor in driving lower cost and higher efficiency, but this is rarely within the power of data center operators to affect. That is why it’s imperative that senior management in corporations with significant data center operations understand what is at stake, so they can drive these changes from the top down.
Predictive modeling allows operators to identify stranded capacity and unlock it, thus allowing installation of more value-generating computing equipment and reducing the cost-per-compute-cycle (which, rather than energy savings, is the ultimate goal). To accomplish this goal, organizations require the consolidation of budgets and authority under one manager and one department, the development of a common language around data center investments, the creation of an accurate and calibrated model for each data center facility, the use of such models to analyze alternative investment scenarios and guide computing deployments, and a single-minded focus by all parties on minimizing the total cost-per-compute-cycle delivered.
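A rough sketch of why unlocking stranded capacity lowers cost-per-compute-cycle: the facility's infrastructure cost is largely fixed, so every additional server it can host spreads that cost over more computing. The function `cost_per_compute_unit` and every figure below are hypothetical, chosen only to show the direction of the effect; "compute unit" stands in for whatever measure of delivered computing an organization adopts.

```python
def cost_per_compute_unit(
    infra_cost: float,
    cost_per_server: float,
    servers: int,
    compute_per_server: float,
) -> float:
    """Total annual cost divided by total compute delivered (illustrative only)."""
    if servers <= 0 or compute_per_server <= 0:
        raise ValueError("servers and compute_per_server must be positive")
    total_cost = infra_cost + cost_per_server * servers
    total_compute = servers * compute_per_server
    return total_cost / total_compute


# Hypothetical facility designed for 6,000 servers with one-third of its
# capacity stranded, so only 4,000 servers can actually be deployed.
INFRA_COST = 10_000_000  # assumed annualized infrastructure cost, USD
SERVER_COST = 2_000      # assumed annualized cost per server, USD

before = cost_per_compute_unit(INFRA_COST, SERVER_COST, 4_000, 1.0)
after = cost_per_compute_unit(INFRA_COST, SERVER_COST, 6_000, 1.0)

print(f"With stranded capacity:  ${before:,.2f} per compute unit")
print(f"With capacity recovered: ${after:,.2f} per compute unit")
```

The comparison deliberately ignores the cost of the modeling effort and of the organizational changes themselves, which a real investment analysis would need to include.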
Achieving low cost-per-computation requires the use of predictive modeling to minimize stranded capacity. Haphazard installation of computing equipment and poor coordination between facilities and information technology departments will inevitably result in poorly utilized capital assets, and this waste can only be stopped when the size of the potential waste is quantified and management changes are implemented to stamp it out. Predictive modeling is the key to accurately assessing and addressing this waste, and all modern data center organizations need to adopt it, post haste.