Wednesday, December 2, 2009

The Significance and Challenge of Tier Ratings

The Tier Rating System, developed and implemented by The Uptime Institute (which itself is now part of The 451 Group), has been the standard measure bandied about by clients seeking data center space.

Over time it has been supplemented by broad interpretations, no doubt to the occasional dismay of its developers. These interpretations have been something of a necessity, given the lack of flexibility written into the standard.

For example, each tier specifies a minimum height for raised floor. This leads some people to the erroneous conclusion that data centers without raised floor are somehow less reliable. This interpretation is incorrect. Raised floor allows for different design and operational implementations, but neither those implementations nor the raised floor itself imparts more reliability. In fact, in high seismic zones just the opposite is true: slab has a higher calculated reliability than raised floor.

Calculated reliability is where we need to go as an industry. We need to thoroughly analyze design and operating standards to identify single points of failure, sources of cascading failures, maintenance considerations, and a range of other concerns that determine reliability. Where appropriate, we need to back these up with detailed and validated calculations.
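As a minimal sketch of what "calculated reliability" can look like, the following Python fragment estimates the availability of a simple power path from assumed component availabilities. The component names and figures are hypothetical, not drawn from any particular standard or vendor data; a real study would use validated failure and repair rates.

# Minimal sketch of a calculated-reliability comparison (hypothetical numbers).
# Availability of components in series is the product of their availabilities;
# for a redundant pair, the path fails only if both units fail.

def series(*availabilities):
    """Availability of components that must all work (series path)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def redundant_pair(a):
    """Availability of two identical units where either one is sufficient."""
    return 1.0 - (1.0 - a) ** 2

# Hypothetical annual availabilities, for illustration only.
utility, ups, pdu = 0.9997, 0.9999, 0.99995

single_path = series(utility, ups, pdu)
dual_ups_path = series(utility, redundant_pair(ups), pdu)

print(f"Single-path availability:   {single_path:.6f}")
print(f"Redundant-UPS availability: {dual_ups_path:.6f}")

Even a toy comparison like this makes it clear where redundancy actually buys availability and where it does not, which is exactly the kind of analysis a tier label by itself cannot provide.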

Monday, November 23, 2009

Equinix Purchase of Switch and Data - Decline of the Massive Data Center

Three years ago, Equinix would never have seriously considered buying Switch and Data. Switch and Data facilities were of a smaller scale, lower power density, and more broadly distributed. All of these points were the converse of Equinix's stated strategic tenets at the time.

So did Switch and Data reinvent itself, or did Equinix radically alter its strategy? In truth a little of both happened, but even more, this is an indicator of a significant shift in market direction influenced by a number of factors.

What seemed like an insatiable drive for higher power density and ever bigger and more reliable data centers has slowed, if not altogether abated. Although motivated by the drive to reduce energy costs, the shift to virtual environments is also starting to significantly impact data center procurement.

The trend in developing a hardware platform for a virtual or cloud implementation is to seek the optimum price point for hardware, which often means stopping well short of top-of-the-line high-density blades. Architecture is also advancing such that many clients are achieving distributed and self-healing computing environments. Even though we have seen some celebrated "cloud failures," an increasing number of customers are seeking lower physical infrastructure reliability and lower price points.

I haven't even touched on potentially shrinking footprints with full-scale virtualization, but suffice it to say that Equinix apparently views it wiser to invest more in broadening its footprint and product line and (perhaps) less in building new mega data centers.

Saturday, November 21, 2009

Colocation in Central Europe

Last week I had the pleasure of reviewing a number of colocation facilities in Hungary, Romania, Poland, and the Czech Republic. We saw facilities at many different stages of development and product offering.

In some facilities it was a trip back to the early 90s. Some of these facilities had literally hundreds of tower servers on utility shelves like you would use for storage in your basement. Apparently, this has remained a low-cost hosting option in Central Europe. Most of these facilities are fully lights-out. If the customer cannot fix their server remotely, it is removed from the data center for them to work on it. Customers are not allowed on the data center floor.

There is also a considerable amount of hosting by the U, managed services, and cabinet colocation. Most communication is via VPN or dedicated telco circuit, with very little metro Ethernet.

Internet connectivity is almost exclusively to the west, with Frankfurt, Vienna, London, and Amsterdam noted as popular interconnection points.

Monday, September 21, 2009

Why ASPs, SaaS, and other Service Providers Should Think Twice Before Building a Dedicated Data Center

In my view of the shared data center market, there are the following five broad classes of service providers:

• Colocation
• Hosting
• Managed Services
• Software-as-a-Service (SaaS)
• Cloud Computing Services and Storage

Colocation and hosting are, in general, space-intensive services that justify construction of a stand-alone data center. The remaining three generally can be launched and grown to profitability with a relatively modest amount of space.

Further complicating the decision of how big a data center should be is the reality that larger data centers cost less per rack than smaller ones. These issues often create internal conflicts for smaller-scale service providers.
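To make the scale effect concrete, here is a rough, purely illustrative calculation. The fixed and variable cost figures below are assumptions invented for the example, not market data; the point is only that spreading a large fixed cost over more racks drives the unit cost down.

# Illustrative-only comparison of cost per rack at two build sizes.
# Assumes a fixed cost component (land, shell, engineering, switchgear)
# plus a variable cost per rack; all figures are hypothetical.

def cost_per_rack(fixed_cost, variable_cost_per_rack, racks):
    return (fixed_cost + variable_cost_per_rack * racks) / racks

small = cost_per_rack(fixed_cost=5_000_000, variable_cost_per_rack=20_000, racks=200)
large = cost_per_rack(fixed_cost=12_000_000, variable_cost_per_rack=20_000, racks=2_000)

print(f"Small build: ${small:,.0f} per rack")
print(f"Large build: ${large:,.0f} per rack")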

Building too big a data center can be counterproductive for a service provider if its own needs do not consume the space and it is forced into new product offerings, such as colocation, that do not fit with its core competencies.

Wednesday, July 15, 2009

What Makes a Good Data Center Location?

If you were to ask this question 15 years ago, you would have gotten a very different answer from many people than you would today.

Data center siting ca. 1994 was about proximity and network. You needed to be close enough for your labor force to have easy access, and to connect to two or three networks at a reasonable cost. In those days, reasonable cost meant "zero mile" circuits, which were typically in downtown areas.

Today, network is ubiquitous and cheap, so it is no longer nearly as limiting a factor. Data centers are also increasingly evolving into infrastructure facilities that require much less support from a company's labor force.

So the major determinant today is the cost of the data center. Over the last few years that meant the cost of power, but today it is a much more holistic synthesis of power cost, construction cost, and labor cost. In some cases, network has re-emerged as a factor as people have looked to remote regions of the world that lack ample and competitive networks, but again this is primarily a cost factor.
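One way to make that synthesis concrete is a simple annualized cost roll-up per candidate site. The sketch below uses hypothetical sites and figures purely to show the shape of the comparison; a real siting study would include many more inputs (taxes, incentives, risk, network build costs, and so on).

# Hypothetical annualized cost comparison for candidate data center sites.
# All figures are invented for illustration only.

def annual_cost(power_rate_kwh, it_load_kw, pue,
                construction_cost, amortization_years, annual_labor_cost):
    hours = 8760  # hours per year
    energy = power_rate_kwh * it_load_kw * pue * hours
    construction = construction_cost / amortization_years
    return energy + construction + annual_labor_cost

sites = {
    "Metro site":  annual_cost(0.12, 2000, 1.8, 40_000_000, 15, 2_500_000),
    "Remote site": annual_cost(0.05, 2000, 1.6, 30_000_000, 15, 1_800_000),
}

for name, cost in sites.items():
    print(f"{name}: ${cost:,.0f} per year")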

The data center business is good right now, and it is expected to continue in that direction, but I think that we will increasingly see an emphasis on cost, including the development of a range of new metrics to compare facilities.

Friday, July 10, 2009

Why are there so many data center outages?

So far this month we have seen major and prolonged outages in Toronto, Seattle, and Dallas, and, unfortunately, this is a trend that is likely to continue.

Why did it happen now?

Each of the situations was different, but this is generally the time of year when failures happen. Since external temperatures are warm, electrical loads are at their peak. Although the critical load itself does not vary seasonally, the amount of power used to cool the data center (and, in a multi-use facility, the rest of the building) is at its peak.

Why this year?

A combination of factors. First, the age of the equipment. Second, a potential lack of maintenance (although I am not pointing any specific fingers). Third, and probably most significant, increased critical load. We have been seeing critical loads in telco hotels and multi-user data centers go up 5 to 10%, and sometimes more, each year. Once lightly loaded facilities are now edging close to, or past, the comfort level of 80-85% of rated load. The unfortunate corollary is that we can expect more failures next summer without significant intervention.
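The arithmetic behind that corollary is simple compounding. The sketch below projects how quickly a facility crosses an 85% comfort threshold; the starting utilizations and growth rates are assumptions chosen only to illustrate the point.

# Sketch of how quickly compounding load growth exhausts headroom.
# Starting utilizations and growth rates are illustrative assumptions.

def years_to_threshold(utilization, annual_growth, threshold=0.85):
    """Years of compounding growth until utilization reaches the threshold."""
    years = 0
    while utilization < threshold:
        utilization *= 1.0 + annual_growth
        years += 1
    return years

for start in (0.60, 0.70, 0.80):
    for growth in (0.05, 0.10):
        print(f"Start at {start:.0%}, grow {growth:.0%}/yr: "
              f"{years_to_threshold(start, growth)} years to reach 85%")

A facility that was comfortably loaded a few years ago can reach its comfort limit in only a handful of summers at these growth rates, which is why intervention cannot wait.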

What do we do?

To some extent it is a matter of starting from the beginning again. I am not pointing fingers at any of these recent failures, but in reviews of facilities and facility failures we often find that the initial commissioning of the facility was either inadequate or nonexistent. To keep a facility online, operators should:

1. Review commissioning and design records to make sure that they are adequately developed and that maintenance, monitoring, and operating procedures are adequately defined.
2. Make sure that all critical system maintenance, including such items as torque checks on bus duct, cycling of breakers, and load testing, is current.
3. Know the loads on critical parts of the system and where they stand relative to safe working capacity; make sure that maintenance is appropriate to those loads, and manage loads down or add capacity if required.
4. Make sure that disaster planning and recovery scenarios address a wide range of failure scenarios, including destruction of proximate equipment.

These and related procedures will not eliminate all failures, but they will eliminate many.