Wednesday, July 15, 2009
If you had asked this question 15 years ago, you would have gotten a different answer from many people than you would today.
Data Center siting ca. 1994 was about proximity and network. You needed to be close enough for your labor force to have easy access and to connect to two to three networks at a reasonable cost. In those days, reasonable cost meant "zero mile" circuits which were typically in downtown areas.
Today, network is ubiquitous and cheap, so it is no longer the limiting factor. Data centers are also increasingly evolving into infrastructure facilities that require much less support from a company's labor force.
So the major determinant today is the cost of the data center. Over the last few years that meant the cost of power, but today it is a much more holistic synthesis of power cost, construction cost, and labor cost. In some cases, network has re-emerged as a factor as people have looked to remote regions of the world that lack ample and competitive networks, but again this is primarily a cost factor.
The data center business is good right now, and it is expected to continue in that direction, but I think that we will increasingly see an emphasis on cost, including the development of a range of new metrics to compare facilities.
Friday, July 10, 2009
So far this month we have seen major and prolonged outages in Toronto, Seattle, and Dallas, and, unfortunately, this is a trend that is likely to continue.
Why did it happen now?
Each of the situations was different, but this is generally the time of year when failures happen. Because external temperatures are high, electrical loads are at their peak. Although the critical load itself does not vary seasonally, the amount of power used for cooling the data center, and the rest of the building if it is a multiple-use facility, is at a peak.
Why this year?
A combination of factors. First, the age of the equipment. Second, potential lack of maintenance (although I am not pointing any specific fingers). Third, and probably most significant, increased critical load. We have been seeing critical loads in telco hotels and multi-user data centers grow 5 to 10%, and sometimes more, each year. Facilities that were once lightly loaded are now edging close to or past the comfort level of 80-85% of rated load. The unfortunate corollary is that we can expect more failures next summer without significant intervention.
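The arithmetic behind that corollary is simple compound growth. As an illustration only, with hypothetical facility numbers, here is a sketch of how quickly steady load growth consumes the headroom below an 85% comfort threshold:

```python
import math

def years_to_threshold(current_load_kw, rated_kw, annual_growth, threshold=0.85):
    """Years until critical load reaches the given fraction of rated capacity,
    assuming steady compound growth. Returns 0.0 if already at or past it."""
    target_kw = threshold * rated_kw
    if current_load_kw >= target_kw:
        return 0.0
    # Solve current * (1 + g)^n = target for n.
    return math.log(target_kw / current_load_kw) / math.log(1 + annual_growth)

# Hypothetical facility: 1,000 kW of critical load on 1,600 kW rated capacity,
# growing 8% per year (within the 5-10% range noted above).
print(round(years_to_threshold(1000, 1600, 0.08), 1))  # about 4 years of headroom
```

The point of the sketch is that even a facility running at a comfortable 62% of rated load today can cross the 85% line in only a few years at these growth rates, which is why the intervention has to start now rather than after the next failure.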
What do we do?
To some extent it is starting from the beginning again. I am not pointing fingers at any of these recent failures, but in reviewing facilities and facility failures we often find that the initial commissioning of the facility was either inadequate or nonexistent. To keep a facility online, operators should:
1. Review commissioning and design records to make sure that they are adequately developed and that maintenance, monitoring, and operating procedures are adequately defined.
2. Make sure that all critical system maintenance, including such items as torque checks on bus duct, cycling of breakers, load testing, etc., is current.
3. Know the loads on critical parts of the system and where those are relative to safe working capacity - make sure that maintenance is also appropriate to loads - manage loads down or add additional capacity if required.
4. Make sure that disaster planning and recovery scenarios address a wide range of failure scenarios, including destruction of proximate equipment.
These and related procedures will not eliminate all failures, but they will eliminate many.