The Atomic Unit of Compute

This post is part of a series based on a presentation I did to the London VMware User Group on February 25th, 2010 about the reality of Enterprise scale internal cloud platforms. To find other posts in the series, just look for the tag “Bringing the Cloud Down to Earth”.

_Bonus content:_ When I first started putting together my VMUG presentation, it was solely focused on this topic. I'll link to the original presentation at the end, as I think it's better if you read the post first.

Another of the challenges you'll face on the road to Cloud is how to measure exactly what it is you're offering. Having a look at what the industry is doing won't give you much help… as with so many things in IT, there is no standard. Amazon have their EC2 Compute Unit, which they state is roughly the equivalent of 1.0-1.2GHz of a 2007 Opteron or Xeon CPU. With Azure, Microsoft haven't gone down the same path – their indicative pricing/sizing shows a base compute unit of 1.6GHz with no indication as to what is underneath. Rackspace flip the whole thing on its head by deciding that memory is the primary resource constraint, so they'll just charge for that and presumably give you as much CPU as you want (but again with no indication as to the characteristics of the underlying CPU). Which way should you go? IMHO, none of the above.

The problem, of course, is that we all know 1.0GHz of a 2007 Opteron is _not_ the same as 1.0GHz of a 2007 Xeon, and a few months back posts like this were getting a lot of attention as people started spinning EC2 instances up and down in a kind of lucky dip that ended only when they got the CPU architecture they wanted. So full points to Amazon for trying, but they didn't quite hit the mark – they may as well have said nothing about the underlying CPU, like Microsoft and Rackspace. After all, this is the Cloud, right? None of that hardware stuff should matter to the apps, right?

The words “utility” and “cloud” are thrown around almost interchangeably a lot of the time – heck, I've done it myself. But do actual utilities work like this? Imagine if one electricity provider charged you in kilowatt hours (kWh) while another charged you in Pferdestärkenstunden (PSh). We need a standard unit of compute that applies to virtual _and_ physical, new hardware and old, irrespective of AMD or Intel (or even SPARC or Power). And of course it's not all just about GHz, because all GHz are most definitely not equal and yes, it _does_ matter to applications. And let's not forget the power needed to deliver those GHz.
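
The conversion between the two is trivial – a metric horsepower is defined as exactly 735.49875 watts – which is rather the point: the confusion is in the unit, not the physics. A quick sketch:

```python
# 1 metric horsepower (Pferdestärke) = 75 kgf·m/s = 735.49875 W exactly,
# so the two billing units describe the same energy at different scales.
KW_PER_PS = 0.73549875

def psh_to_kwh(psh):
    """Convert Pferdestärkenstunden (PSh) to kilowatt hours (kWh)."""
    return psh * KW_PER_PS

print(round(psh_to_kwh(100), 1))  # 100 PSh is roughly 73.5 kWh
```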

Luckily for you and me, there's a group out there who have already solved this problem – or at the very least, taken a bloody good shot at it. They're called Ideas International, and their shot is the Relative Performance Estimate 2 (RPE2). In a nutshell, RPE2 is a composite benchmark consisting of several industry standards (SAP SD 2-tier, TPC-C, TPC-H, SPECjbb, SPECint_rate, SPECfp_rate) which can be used to provide a standard, objective measure of compute power across all hardware, irrespective of chip architecture. Want to stack a SPARC-based Sun M5000 up against an HP DL785? Go for it! Compare a 3-4 year old HP DL385 G2 with a shiny new HP BL460c G6? No problem! And most importantly, compare any of them with a VM. It's all possible. This is the kind of standard unit we need in order to point definitively at that old 2-socket piece of tin that's had <10% average utilisation and <40% average peak utilisation over the past 4 years and say "thou shalt be virtualised" when it comes time for a hardware refresh. And no one can argue with you, because you have the performance comparison data in black and white.
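
Ideas International's exact methodology and weightings are their own, but to make the idea concrete, here's a minimal sketch of how a composite relative-performance number could be derived: normalise each benchmark result against a reference machine, then take the geometric mean of the ratios. The reference figures and the scored box below are made-up illustrations, not real RPE2 data.

```python
from math import prod  # Python 3.8+

# Hypothetical reference scores -- illustrative only, NOT real RPE2
# inputs or weightings. Each machine's result on a benchmark is
# normalised against the reference, then the geometric mean collapses
# the six ratios into a single relative-performance number.
REFERENCE = {
    "sap_sd_2tier": 1000, "tpc_c": 50000, "tpc_h": 2000,
    "specjbb": 150000, "specint_rate": 100, "specfp_rate": 90,
}

def relative_score(results):
    ratios = [results[bench] / ref for bench, ref in REFERENCE.items()]
    return prod(ratios) ** (1 / len(ratios))

# A box scoring roughly twice the reference on every benchmark
# lands at roughly 2.0 on the composite scale.
new_box = {
    "sap_sd_2tier": 2100, "tpc_c": 98000, "tpc_h": 4100,
    "specjbb": 310000, "specint_rate": 195, "specfp_rate": 185,
}
print(round(relative_score(new_box), 2))  # ~2.03
```

The geometric mean is the natural choice here because it rewards balanced performance: a machine can't buy a high composite score by being spectacular on one benchmark and dreadful on the rest.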

"BUT…" I hear you say. "That may be all well and good for within an Enterprise, but what about the industry at large? With different hypervisors, with different schedulers? And different overheads of single vCPU vs multi-vCPU? And different contention ratios? And all the other shit you haven't even thought of? Someone tell him he’s dreaming!“.

Maybe I am dreaming. But you and I have the power to influence change, and if you decide to adopt RPE2 as your internal standard, there's absolutely no reason why you shouldn't request that any external Cloud provider you talk to do the same. Otherwise, how can you move workloads around with confidence? How can you actually compare what you're buying with what you provide internally, or compare what you're getting from one Cloud vendor with another?

And it wouldn't be the first time someone has taken multiple fuzzy variables like those and turned them into a single constant – in fact, the UK government did exactly that with The Gas (Calculation of Thermal Energy) Regulations 1996. Unlike water or electricity, gas consumption is difficult to measure because external factors like temperature and pressure (i.e. height above sea level) have a large effect on volume, which is how gas is metered. And since that measurement takes place at the meter in your home, there is great potential for variation in temperature and pressure from one meter to the next. It would be entirely impractical to calculate these variables individually for every meter, so the UK government mandated that a constant be used, based on the average height above sea level and average temperature of a region. And that was the end of the matter – the playing field was largely levelled.

But they went one better – they also mandated that gas would not be charged by a volumetric unit at all. A little extra science (multiplying the corrected volume by the gas's calorific value) means gas is billed in kilowatt hours, just like electricity! So now in the UK there is a _single_ unit that household energy is billed in. And for some strange reason, I think that's pretty fucking cool.
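
To see how neatly it all collapses, here's the sum as it appears on a UK gas bill, sketched in code. The correction factor and calorific value below are typical figures, not universal ones – the actual values for your region are printed on the bill itself:

```python
# The sum on a UK gas bill: metered volume -> kilowatt hours.
# The figures below are typical; the actual correction factor and
# calorific value for your region are printed on the bill itself.
CORRECTION_FACTOR = 1.02264  # mandated constant for temperature/pressure
CALORIFIC_VALUE = 39.5       # energy content of the gas, MJ per cubic metre
MJ_PER_KWH = 3.6             # 1 kWh is exactly 3.6 megajoules

def gas_kwh(cubic_metres):
    """Thermal energy consumed, in kWh, from the metered gas volume."""
    return cubic_metres * CORRECTION_FACTOR * CALORIFIC_VALUE / MJ_PER_KWH

print(round(gas_kwh(100), 1))  # 100 m³ of gas is roughly 1122.1 kWh
```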

Someone at the last CloudCamp in London was postulating about the role of Government in the Cloud. While I'm sure they didn't have something like this in mind, perhaps the Government does have a role to play – as much as I'd like to think "if the government can do it, we can do it even better!", I'm doubtful the industry could come to an agreement like this on its own. In order to achieve true utility, we sure as hell need a standard unit.

That wraps it up for this post – I hope you found it thought-provoking. You can download the original presentation here.


