Jane’s Dilemma

In the midst of some continued recent debate around the primary (and somewhat unanswerable) question of “whether organizations will realize significant savings from cloud computing”, I took a slightly different approach than those who delved into Jevons Paradox and tweeted this simple response:

Enterprises will struggle to realize significant cost savings from cloud computing because they simply can’t afford it

Subsequent to that comment, there was some immediate stirring within the #clouderati and, since I tweeted this cataclysmic follow-up…..

Enterprises don’t have $$$ for greenfield development, just $$$ for maintenance, so opportunistic is the best we can hope for

…..I have been subjected to some profoundly aghast virtual staring along with several pointed questions asking me to clarify “what precisely do you mean by that ?”

I hope this post will shed some light on how I arrived at that conclusion, but let me start by moving off the topic of cloud and reiterating an answer I recently posted on Quora, in response to the question of “What is holding back the business adoption of VDI?”

With a hat-tip to all the answers on this, which are very valid indeed, I’ll take a slightly different approach and offer up my thought that what is really holding VDI back is basically one “simple” thing. Applications.

To make VDI make any sense at all from a commercial perspective (unless you happen to be one of the major financial institutions where money is literally no object and one-to-one assigned desktops are considered de-facto in VDI) an organization must accept the basic proposition that “dynamically assembled desktops and applications” is the way to go. This is simply because it takes fewer resources in the data center overall, and this becomes even more critical, cost-wise, if your data center(s) is colocated.

The utopian promise of VDI has always been “desktop + profile + applications” delivered on demand to the user. That’s all great and relatively achievable until you come to the applications piece. To have the dynamic applications, you really have to get deep into virtualization of the application, whether that be Citrix Streaming or MS App-V (or VMware ThinApp if you are truly mad). Only when you have ALL your applications virtualized (except maybe one or two you keep in the base desktop “gold” image) can you really attain that utopia.

Unfortunately, large organizations have hundreds and in some cases, thousands of applications, a mix of client, client/server and web apps. Assuming that there are significant numbers of non-web apps (in my experience, this is definitely the case) then there is literally a mountain to climb and an associated cost, of course, to turn all those applications into “virtualized packages”. While it wouldn’t be impossible to put together a business case to do this, I am not sure that I’ve seen a ready ROI calculator that takes all this into the equation (along with all the other components) and spits out an easily consumable, decision-leading analysis.

The answer I posted was very deliberately intended to beg a slightly rhetorical question:

“Does it make sense to spend money on testing, remediation, re-architecting, certifying and re-deploying applications compared to what the overall savings might be ?”

Irrespective of the answer you arrive at in the context of the VDI discussion (which will of course differ on individual circumstance) I believe that the question is equally applicable to organizations (especially traditional enterprises) who are either considering, or are in the process of planning, the movement of their existing application workloads to cloud. In both cases, there is a potentially significant impact on the financial models and budgets of the Line of Business (LoB) application owners, many of whom are embedded in the business and not “controlled” by central IT. It is not unusual, in my experience, for large enterprises to have the budgets for LoB applications driven by the business, and focused largely on “keeping the lights on” rather than allocating funds for major initiatives.

By way of illustration, let me give another example from The <redacted> Company of Large Persuasion and take a look at what the example enterprise LoB spend may look like on a yearly basis.

The <redacted> Company of Large Persuasion has 55 LoB applications. 29 of them are In-House Applications (Homegrown) and 26 of them are Commercial Applications (Shrinkwrapped). Together, they form 100% of the LoB application portfolio. The application architectures range from client only, client / server through to web apps. The majority of the underlying technologies have been those employed to “build for behind the firewall”. In this example, the basis is an exploration of what it might take for the organization to plan to move application workloads to the cloud and is focused on public cloud services (IaaS and PaaS), but not a comparison of moving the LoB portfolio to pure SaaS as the types of business processes automated by the LoB portfolio are not readily available as SaaS offerings today.

The organization tracks the spend in 5 main cost categories: Product Enhancements & New Apps, Support, Maintenance, License Cost & Amortization and Management. Below is the calculated breakdown of the overall spend.

The graphic shows the percentage of the overall spend, broken down into the 5 main cost categories. For the purposes of this illustration, the overall spend in monetary value is not the focal point. The “stand out” figure which I find particularly interesting is that there is less than 10% of the total spend in each of the In-House and Commercial Applications columns available for “Product Enhancements & New Apps”. Or put another way:

Only a maximum of 7.23% of the total LoB spend for In-House Applications and 9.39% for Commercial Applications is available for the organization to use for re-working today’s portfolio. This is the organization’s “cloud opportunity”.

Digging deeper, it would be logical to assess each of the other 4 main cost categories to ascertain whether investing non-budgeted funds to cloud-enable applications would pay significant dividends by reducing costs in those areas. My experience in assessing exactly this question is that there might be movement downwards in certain areas, but counter-intuitively, there may be movement upward in certain areas, depending on the cloud service “target” environments. Of the 4 main cost category areas (excluding Product Enhancement & New Apps) the area of Support is the most concerning. Today, I am not convinced by the argument that moving to IaaS (with Application workloads on top) will reduce the number of support resources required to efficiently operate and support the workload.

If the example above holds true for other enterprises (and I know of many, many more that have far larger and far more complex LoB portfolios than The <redacted> Company of Large Persuasion) then there is a fundamental question to be answered way before any promises of cost savings are made, let alone recognized:

Can we afford the cost of change ?

This challenge reminds me fondly of a one-time personal situation I found myself in – I will call it “Jane’s Dilemma”. My wife, Jane, once asked me to spend the equivalent of a small fortune on booking a month-long holiday to India. Her logic was that despite the significant upfront cost, it would be “cheap when we got there”. Is this the same question facing the enterprise ? Kind of. The only difference for those facing Jane’s Dilemma today in the business sense is that I could gather pretty solid empirical data on the “running costs” for the holiday, once the hairball of upfront cost was swallowed. The same might not be quite as true for the cloud. (Did someone say AWS bandwidth charges..or am I hearing voices again..)

So, with less than 10% available for “traditional” approaches to moving the enterprise forward and the assumption that asking for a blank check to re-write your LoB portfolio for “the cloud” might be met with more than a mocking glance, your innovation will have to come from elsewhere. It may lie within your organization taking a different approach to unlocking the value of the data you have within your environment today…..but that’s another story for another day.

The Efficient White Elephant

Jevons Paradox. Commercial Airlines. The Cloud.

You may be wondering what those three things have in common, and why I would be writing about them in the context of an Enterprise IT (hmm, I nearly said “Private Cloud”) experience ? Well, my focus in this post is on efficiency, demand and supply – three things which I believe are inherent to the examples above and more interestingly, are at the core of Enterprise IT’s challenges as more and more organizations move toward virtualization and beyond.

As recently as last week, there have been three excellent blog posts on the above. First, Andrew McAfee brilliantly explores Jevons Paradox. Second, Bernard Golden asks some thought-provoking questions on cloud capacity planning. Finally, in his own inimitable style, Carl Brooks challenges some thinking around the assumption that cloud automatically equals cost savings.

So, what has this got to do with the growing number of moving targets facing Enterprise IT ?

Let’s start with THE key word. Efficiency. If you take the time to focus on the content in the blog posts above, at some point early in your research, you will encounter the word “efficiency” as a fundamental, yet recurring tenet. For some time, I have been mulling over the meaning of “efficiency” as it relates to the transition of server-based application workloads from physical to virtual to cloud and wondering whether IT shops, and more specifically, CIOs, have a true feel for what efficiency means and what it takes to reach their efficiency goals – assuming that they actually have them in the first place ?

According to Dictionary.com, the word “efficient” is an adjective and means:

Performing or functioning in the best possible manner with the least waste of time and effort; The ratio of the [resource] developed to the [resource] supplied.

In my experience, people often confuse efficient with effective. Being effective means being good enough to accomplish a purpose. So while it may be effective to paint your house with an artist’s brush, it certainly is not time efficient. To use an Enterprise IT analogy, while it may be effective to deploy applications on individual physical servers, it may not be the best use of those resources, and hence, it may not be considered the most efficient.

It doesn’t take a genius (like Jevons) to figure out how the promise of greater efficiency can and does provide the Enterprise CIO with a big draw toward server virtualization. Every cloud expert will tell you unequivocally that while server virtualization alone doesn’t necessarily make a cloud, it is generally agreed upon that it is a critical enabling component of many of today’s clouds, whether private, public or a combination of the two. (There is a relatively uncommon train of thought that it is possible to build a cloud offering without virtualization – the example being companies like newservers.com who purport to offer the “bare metal” cloud). In fact, just a simple search of your favorite search engine will yield a slew of results like these:

To address efficiency, IT managers must look at a variety of issues ranging from the smallest piece of silicon to the entire datacentre. To effectively address the increasing datacentre efficiency concerns, many companies have chosen to virtualize their environments.

As a result, server virtualization has become the cornerstone technology used to increase efficiencies and add dynamic capabilities to the datacentre. Complemented with storage virtualization, the efficiencies can be increased even more dramatically. Together, they are the key to a highly efficient and dynamic datacentre.

Virtualization is an end-to-end strategy that can profoundly affect nearly every aspect of the IT infrastructure management lifecycle. It can drive greater efficiencies, flexibility, and cost effectiveness throughout an organization.

I wouldn’t vehemently disagree with the statements above, yet based on my experience, there is a significant piece missing from the equation – “application minimum specifications” – yes, just those three little words. This is a piece whose impact might not seem immediately obvious to all, as it has been able to hide amongst a multitude of other sins in the physical world, but it is serious enough to have a major impact on the overall efficiency of any virtualized environment, especially in those Enterprise IT shops making the brave march toward private cloud, waving the “efficiency” banner as they go.

At risk of making a massive over-generalization, I think this is down to a mysterious phenomenon that is rooted in the development teams. I have often wondered how developers truly arrive at their application minimum specifications. Are they the result of scientific analysis based on a lean set of code / instructions tied to the threading capability of the target environment twinned with load-testing results and garnished with a little “room” for any potential spikes ? Or are they nothing more than an over-cautious guesstimate that will ensure that, to the best of their knowledge, the application will never suffer performance issues ?

As a result, I wonder how many organizations are seeing their virtualized computing capacity hugely over-provisioned, with sufficient “headroom” for plenty more cycles, but unable to squeeze maximum efficiency out of their new environments ? Would a commercial airline fly an Airbus A380 between Washington Dulles and Newark for a commuter route that carries around 50 people per flight ?

I don’t think this is an issue solely about the goal of attaining increased virtual machine density (which according to IDG back in 2009 would become “the new measure of IT efficiency”) as my view is that this goal, without sufficient contextual understanding, could actually break effectiveness. Nor do I think it is tied to the more philosophical “virtualization stall” questions posed by CA’s Andi Mann in this blog post. No, this is an issue that has been around for a long time in the physical world, but it does (and will continue to) manifest itself quickly in the virtualized world, ultimately prompting questions around the viability of deploying legacy application architectures (and programming languages) in the private cloud environment. This is far more fundamental a concern than the “masking” tricks such as memory ballooning and dynamic resource scheduling could ever hope to fix.

In 1921, Fred R. Barnard was credited with the first use of the adage “a picture speaks a thousand words”, so let me try to wrap this up with a real world example of how a real organization could feel like they have made significant strides in virtualization, yet within those perceived (and tangible) benefits lie the opportunities to drive further efficiencies.

The <redacted> Company of Large Persuasion has spent a significant amount of time concentrating on virtualizing their entire server estate. Much of that focus has been on deploying line of business applications on top of a new hypervisor platform.

Over time, The <redacted> Co. has reached > 90% virtualized, successfully running more than 3000 VMs in production. Most of the line of business applications were “migrated” using the P2V methodologies and toolsets available to the IT operations team. To ensure “continued performance” of the line of business application portfolio, the decision was made to move the workloads using “like for like” resources, in terms of memory allocation and storage, from the physical to the virtual environment.

Today, it is considered rare for organizations to have reached this level of success in virtualizing their entire estate.

Here is a snapshot of how their current environment may look.

What is really interesting here is that the chart uses “% Free”, not “% Used”, as its primary metric. Upon closer inspection, the number of VMs that have over-allocated resources is staggering. To give it an even more in-depth context, consider the following data:

Total RAM Installed = 3,223GB
Average RAM installed per server = 3.83GB
RAM Installed but not utilized = 2,210GB
Number of servers un-utilized RAM could support = 577

Total Storage allocated = 202,112GB
Average Storage per server = 240GB
Storage allocated but not utilized = 96,833GB
Number of servers un-utilized storage could support = 403
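
The "could support" figures above are simple divisions of the idle capacity by the average allocation per server. As a minimal sketch (the function names are mine, and rounding down to whole servers is an assumption about how the original figures were derived), the arithmetic can be reproduced like this:

```python
# Figures copied from the data above; variable and function names are illustrative only.
total_ram_gb = 3223
avg_ram_per_server_gb = 3.83
unused_ram_gb = 2210

total_storage_gb = 202112
avg_storage_per_server_gb = 240
unused_storage_gb = 96833


def servers_supported(unused_gb, avg_per_server_gb):
    """Number of average-sized servers the idle capacity could host (rounded down)."""
    return int(unused_gb // avg_per_server_gb)


def percent_unused(unused_gb, total_gb):
    """Share of the installed or allocated resource that sits idle, as a percentage."""
    return round(100 * unused_gb / total_gb, 1)


extra_by_ram = servers_supported(unused_ram_gb, avg_ram_per_server_gb)
extra_by_storage = servers_supported(unused_storage_gb, avg_storage_per_server_gb)

print(f"Idle RAM: {percent_unused(unused_ram_gb, total_ram_gb)}% "
      f"-> room for {extra_by_ram} more average servers")
print(f"Idle storage: {percent_unused(unused_storage_gb, total_storage_gb)}% "
      f"-> room for {extra_by_storage} more average servers")
```

Put as percentages, roughly two thirds of the installed RAM and a little under half of the allocated storage sit unused in this example.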

It would be easy to jump to the obvious conclusion that the organization did not exercise enough due diligence before the P2V activities, leading to the gulf between usage and resource provisioning. Wrong. A great deal of planning went into the entire 3-year project. The developers provided the application minimum specifications that are simply honored by the operations teams. Juggling the mechanics of a move to virtualization isn’t easy and above all else, “performance” must be guaranteed.

In the case above, despite the obvious benefits gained over the physical environment, there remain some unanswered questions around the accuracy of the initial baselines. I suppose this may serve as a good warning to other organizations approaching this exact set of challenges. The demand in a virtual environment is no less than in a physical one, yet with over-specification waiting to eat away at the capability to quickly supply, perhaps there is a worrying trend developing.

This inefficiency is also waiting to help Enterprise IT blindly feed the coffers of IaaS providers like AWS – remember that an over-specification is a bad thing in IaaS to anyone except the provider. You want a mega-jiga-super-hoojamaflip-instance to run a simple app ? OK, here it is. AWS will give it to you all day long for $$$. Why did you want it ? Oh, that’s right, the specs demanded it. This really isn’t even as sophisticated as the great DevOps debate.

In a logical play-out of these scenarios, applications will become tightly aligned from a resource requirements perspective. That implies that they will scale, by default, by design. Ideally, they would only ever be granted a little more resource than they need. If (and it’s a big if) Enterprise IT “gets there” with new application architectures, it will also have an interesting twist – better scale will likely mean more servers. More servers will likely mean more management, more administration, but at least it will be truly efficient.

Jevons proposed that technological progress that increases the efficiency with which a resource is used tends to increase (rather than decrease) the rate of consumption of that resource. If the resource in question is a VM, then….

I’d say he had it spot on.

Not surprisingly, Jevons was British.

Uptime Girl ?

Friday 14th January marked an interesting, yet somewhat unheralded day in Cloud with this announcement from Google that they are to become “the first major cloud provider to eliminate maintenance windows from their service level agreement.” To paraphrase, the folks in Google’s operations teams are focusing on a target of zero downtime, planned or otherwise, and in doing so, seem to be making the statement that “continuous uptime” is something they hope will become a key differentiator in their bid to own the multi-gazillion dollar SaaS-based productivity apps market. I wonder if they’ll coax Billy Joel into a re-work of his 80s classic, taking the title of this post, with the hope that the video goes viral on YouTube ?

Following this announcement, Cloudave’s Krish Subramanian posted a tantalizing blog entry which skilfully posed the following question:

Can We Take Availability Off Cloud Concerns List ?

Although it’s hard to tell whether the question posed was a call to arms or a rhetorical acceptance of the current state of enterprise thinking, Subramanian articulates a very salient point, and one that in my experience, doesn’t seem to be far from the uncomfortable front of the minds of many CIOs considering cloud services as truly viable solutions – is it better or worse than my current environment?

I know, before you say it, that there is clearly a problem with that question.

As Chris Hoff has implied many times via his Rational Survivability blog and specifically in his excellent presentation entitled “Cloudifornication”, the question of “is the cloud more secure?” can only be answered by the question “more secure than what?”. In a parallel universe, the question “is it better or worse than my current environment?” can only be answered with “how bad is your current environment?”. Quid Pro Quo.

Diving a little deeper in my thought process, I am not sure if the key word troubling CIOs is “availability” or “reliability” or both. Following an incredibly short but detailed twitterburst of activity from the ever-willing (and incredibly smart) members of the Clouderati, I began to wonder if those two words (along with the cringe-inducing notion of the SLA) are actually taking on different contexts as we look to find the right balance between the new breed and old school of business-enabling, agile, cost-efficient and predictable services.

Let’s take a look at the areas.

SLA

I’ll start by taking an unadulterated swipe at the utterly ridiculous notion of the SLA. Irrespective of whether the SLA is external (supplier to customer) or internal (IT to business) the very premise of the traditional SLA makes little practical sense. In the external example, the legalese is (understandably) biased on the side of the supplier and it’s likely the “guarantee” will not cover some of the things you would like to see “measured” as part of the service. In the internal case, unless IT is an outsourced environment (which I would suggest is customer to supplier), wouldn’t it be better to simply gather key metrics and have them available as a monthly report ? I can’t imagine there is an organization in the world who hasn’t got better things to do than create an over-complex, unrealistic set of objectives that ultimately have little or no consequence if the targets are not met. Puzzling.

Having had the honor of formulating and managing some extremely large IT service contracts over the years, and managing operational activities for a very large global organization, I can not even begin to calculate the number of hours (and therefore $$$) that have gone into the review from internal and external legal teams, and for what exactly ? A guarantee of “the best we will ever perform” as a provider with a credit / remuneration clause that is in no way comparable to any potential loss of revenue / earnings. No provider on the planet is going to align their SLA terms with any true-to-life “unavailability to loss” scenario. Ideally, we would want the contractual clauses to be on a par with the concepts of liquidated damages. Sadly, that’s never going to happen.

We would be far better off replacing the “A” for “Agreement” with “E” for “Expectation”. That way, we could retain transparency and create freedom of choice without the need for an investment in a legal sign-off for something that is so fundamentally misaligned that it is utterly worthless.

In summary, I don’t see how an SLA, especially an uptime SLA, can be a major differentiator. It’s about performance, not promises.

Availability

Continuing with the “on-premise versus off-premise” email scenario as described by Subramanian, I believe this area is where a significant amount of confusion (maybe even FUD) lies when discussing the merits and pitfalls of cloud. Consider the organization that steadfastly sticks to their internal deployment of MS Exchange versus the organization that moves their entire workforce email to, say, Google Apps – let’s look at some numbers.

“In 2010, Gmail was available 99.984 percent of the time, for both business and consumer users. 99.984 percent translates to seven minutes of downtime per month over the last year. Seven minutes of downtime compares very favorably with on-premises email, which is subject to much higher rates of interruption that hurt employee productivity. Our calculations suggest that Gmail is 32 times more reliable than the average email system, and 46 times more available than Microsoft Exchange.” (Source : Google)

Confused ? Yeah, me too. 32 times more reliable and 46 times more available. But here’s a thought – if you lose your network connectivity from “the premise” then both systems are pretty much useless as far as their core function of sending and receiving external email is concerned. No network, no SMTP, no email. What’s the difference between available and reliable in the wording above ?

Like it or not, global inter-company email is predicated upon what used to be depicted as THE cloud with funky yellow lightning bolts. The public internet. If you can’t connect to it, you’re going to suffer, even if your email is within your four walls. Therefore, I would suggest that the above is an availability discussion, rooted firmly in the same architectural and operational DNA as the “availability of central IT services at the branch office via the company WAN” and is far more likely to be a point-of-failure discussion at the “consumer” end, rather than “provider” end. I am not sure cloud can be pointed to as the only example of where this becomes the key consideration.
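
As a side check on the Gmail figure quoted earlier, an uptime percentage converts to downtime minutes with one line of arithmetic. The sketch below assumes an "average" month of one twelfth of a 365-day year; Google does not state which month length it used, so treat the constant and the function name as my own assumptions.

```python
# Assumption: an average month is 1/12 of a 365-day year (43,800 minutes).
MINUTES_PER_MONTH = 365 * 24 * 60 / 12


def downtime_minutes_per_month(uptime_percent):
    """Convert an uptime percentage into expected minutes of downtime per month."""
    return (100 - uptime_percent) / 100 * MINUTES_PER_MONTH


gmail_2010 = downtime_minutes_per_month(99.984)
three_nines = downtime_minutes_per_month(99.9)

print(f"99.984% uptime is about {gmail_2010:.1f} minutes of downtime per month")
print(f"99.9% uptime is about {three_nines:.1f} minutes of downtime per month")
```

The quoted "seven minutes" is consistent with the 99.984 percent figure. Note that the calculation says nothing about where the downtime occurs, which is the point made above about the consumer end of the connection.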

Reliability

Unlike the availability discussion above, I do believe that the question of reliability is much more in the realm of responsibility of the provider and therefore, the enterprise CIO may be *wiser* to focus time and energy, where possible, in comparing the performance metrics of competitors in a given “service space” (whichever aaS) with that of available benchmarks from his or her own environment.

One of the major problems facing cloud providers is the fallout from the loss of service to multiple tenants (customers) when something catastrophic happens. For the bigger players, this usually includes bad press and in some cases, reputation damage – both of which add to the reluctance of organizations to move to cloud as it is further deemed “unreliable”. What is often forgotten is that this type of catastrophic outage affects private cloud environments (I know because I live it every day) in a similar way, as multiple business units become multi-tenant configurations and are served via one or more global data centers. Unless there is true “hot standby” in the private cloud (which I am yet to see) then the effect of a catastrophic outage is felt across more than a single set of “ringfenced” users.

And then, of course, there is the “traditional data center”, with all its inherent problems and operational headaches. They tend to enjoy life “under the radar” today as outages, although prolonged in cases, only affect certain parts of the business and certain applications. Although these outages can be very damaging to productivity, they do not receive anything like the negative publicity of their cloudy counterparts.

There is an unfortunately macabre analogy that helps bring this thought process together:

The number of US highway deaths in a typical six month period – around 21,000 – roughly equals all commercial jet fatalities worldwide since the dawn of jet aviation over four decades ago. In fact, fewer people have died in commercial airplane accidents in America over the past 60 years than are killed in US automobile accidents in any typical three-month time period. (Source : Boeing Corporation)

It’s very infrequent to hear of a road crash (the traditional data center) make national news, but in the event of a commercial jet crash (the cloud) then it’s guaranteed to make headlines. Perhaps this is simply due to the number of people affected on board the airliner at a single time during the incident ?

So, assuming that the network connectivity at the “consumer” end remained available, then the problems with reliability are likely to be confined to the “provider” end (private or public).

It is (and will be) interesting to see what happens when enterprises struggle with “issues” arising from the movement of application workloads from traditional data centers (where they may or may not be virtualized) to IaaS providers. As organizations move forward, this will almost certainly occur and can not be attributed to a problem with the availability or reliability of the consumer or provider. This is a phenomenon that I have nicknamed “I2” (Instability & Incompatibility) – more to follow on that in a later post.

In summary, and with a hat tip to some help from Simon Wardley and Ruv Cohen, I arrived at the following closeout definition as a guideline only.

Availability may be considered in the same way as the traditional thinking around MTTR (Mean Time To Recover / Repair) metrics

Reliability may be considered in the same way as the traditional thinking around MTBF (Mean Time Between Failure) metrics
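
One common textbook way to tie the two metrics together is steady-state availability, MTBF / (MTBF + MTTR). That formula is not part of the guideline definitions above; it is a standard reliability-engineering relationship, and the two services and their numbers below are hypothetical. The sketch shows why the distinction matters: two services can report identical availability while failing at very different frequencies.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the service is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical service A: fails often, but recovers quickly.
frequent_short = availability(mtbf_hours=99.0, mttr_hours=1.0)

# Hypothetical service B: fails rarely, but takes a long time to recover.
rare_long = availability(mtbf_hours=990.0, mttr_hours=10.0)

print(f"Service A availability: {frequent_short:.2%}")
print(f"Service B availability: {rare_long:.2%}")
```

Both come out at 99%, yet service B is ten times more reliable in the MTBF sense, while service A recovers ten times faster. Which profile is preferable depends on the workload, which is why a single uptime percentage hides more than it reveals.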

Both of the above will matter, do matter, during the endless discussions, debate and deliberation around movement to cloud. I believe they are equally important depending on the context and timeline of your individual discussion, but I do not believe they are equally important and equally weighted considerations at all points during the challenge of the cloud conundrum.

If you’re after a great single pane of glass view on public cloud status and benchmarks, then the folks at Cloudharmony.com have a pretty good dashboard which details many public cloud providers.

My final piece of advice ? Be methodical, be thorough, be data-driven but when you feel that you’re picking your way through shark infested waters, don’t ever let the SLA be your navigational beacon.

Chargeback, Showback, Brokeback ?

On January 12th, the twitterverse momentarily offered up the topic of “IT Chargeback” and more interestingly, bandied around a relatively new term “IT Showback”, touting both as being “critical for the success of private cloud”.

To the untrained eye, it was really nothing more than a brief exchange of views between several well respected members of the #clouderati, but this topic struck a chord with me, having been involved with these kinds of discussions before with many organizations including our own. Of course, this isn’t a new topic, but it is interesting to see the stark focus being placed on the cost management of IT as it relates to “cloud” – and almost insinuating that to have a successful “private cloud”, an organization must provide an equivalent method of consumption-based billing as that provided by public cloud providers such as AWS, Rackspace, et al.

In a blog post from May 2009, Bernard Golden, CEO of Hyperstratus and CIO.com contributor, offers this simple, yet brilliant definition:

Chargeback: The ability to charge according to resources consumed rather than a fixed amount of assigned capacity. The concept of chargeback being an integral part of private clouds is controversial. Some maintain that very granular pricing must be in place for cloud computing to operate effectively; others maintain that the mode of payment is incidental to the scalability and flexibility of commitment that are the true hallmarks of cloud computing. The typical practice in place for internal IT today is that, upon request for compute resource (e.g., a four-processor machine with 16 GB of memory, 50GB of storage, and 8 NICs) a capital charge is applied, meaning that an overall cost for the acquisition of the system (perhaps $8,000 in this case) is transferred from the requesting group to the IT organization; in addition, the overhead costs for the IT group (headcount, facilities, etc., etc., down to the donuts put out every Friday) are applied, pro rata, to IT-consuming organizations.

In my experience, and as Golden alludes to in a more recent blog post on the topic of chargeback, it is incredibly difficult, not to mention time consuming and laborious, for enterprise IT shops (irrespective of whether they have a private cloud or not) to get to grips with their true running costs and accurately calculate and advertise the “per hour” cost of compute, the per-GB cost of storage, or indeed any other units of measure, while managing the “cash flow” cycle and netting a zero sum total as a non-profit center. Above all, and this is perhaps the most controversial comment I will make, even if this were easily achievable, it would be largely pointless, as it simply doesn’t drive the right user behavior and, in some cases, drives an underground “renegade” culture that brings inherent risks, both operational and security related.
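To make the “zero sum” arithmetic concrete, here is a minimal sketch in Python that contrasts a fixed pro rata split with a consumption-based rate. Everything in it is an assumption made for illustration: the function names, the department names, the headcounts, the usage hours and the cost figure are all invented, and a real model would have to cope with shared infrastructure, idle capacity and costs that move mid-year, which is precisely where the difficulty lies.

```python
from fractions import Fraction


def pro_rata_allocation(total_cost, headcount_by_dept):
    """Fixed allocation: split the total cost by share of headcount.

    Hypothetical helper. Headcount is just one possible allocation key.
    """
    total_heads = sum(headcount_by_dept.values())
    return {
        dept: Fraction(total_cost) * heads / total_heads
        for dept, heads in headcount_by_dept.items()
    }


def consumption_chargeback(total_cost, usage_by_dept):
    """Consumption-based allocation that recovers exactly total_cost.

    Hypothetical helper. The unit rate is derived from the cost to be
    recovered, so the charges always net to zero against that cost.
    """
    total_usage = sum(usage_by_dept.values())
    rate = Fraction(total_cost) / total_usage
    charges = {dept: rate * used for dept, used in usage_by_dept.items()}
    return rate, charges


# Illustrative numbers only: none of these come from a real organization.
annual_cost = 120_000
headcount = {"sales": 50, "finance": 30, "engineering": 20}
compute_hours = {"sales": 1_000, "finance": 3_000, "engineering": 6_000}

fixed = pro_rata_allocation(annual_cost, headcount)
rate, metered = consumption_chargeback(annual_cost, compute_hours)

for dept in headcount:
    print(f"{dept}: pro rata {float(fixed[dept]):.2f}, metered {float(metered[dept]):.2f}")
print(f"rate per compute hour: {float(rate):.2f}")
```

Note that the rate is only known once total usage is known. Advertising a “per hour” price up front means forecasting that usage, and any miss in the forecast breaks the zero sum.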

My overwhelming feeling is that the majority of users simply do not care about the resources they use because they aren’t spending their own money. I would argue that in some organizations, this rolls up to the LoB owners even if chargeback is employed. It’s the company funds, so who cares ? If every user handed their personal credit card to the IT department and was charged for what they used, then of course they would be more frugal, but when was the last time a user sent you an email saying “I was really worried about the cost impact of the 100GB of old data I had stored on the file server, so I stayed late last night to trawl through it and delete most of what I didn’t need” ? That’s right. It’s never happened, has it ?

A classic case of driving the wrong behavior is often attributed to email. I know of many organizations that limit the size of user mailboxes, but, bizarrely, not the size of file share storage. It’s not uncommon to have a limit of, say, 500MB for an individual mailbox with the “penalty” of additional charges if you go over that size limit. So, as Joe User, will I happily “pay for what I use” over and above the 500MB ? Highly unlikely. What I’m probably going to do is create a series of .pst files and store them on the “free” file shares, ensuring I stay under the 500MB limit, but creating a ticking time-bomb for the operations staff and, indirectly, the legal department.

There are other organizations that set high internal costs for storage (some around $3 per GB per month) as they look to recover their huge capital outlays on expensive SAN infrastructures and roll up maintenance, labor, and other associated costs into a handy single unit. The problem here ? It makes them glaringly uncompetitive against the likes of box.net, Dropbox and AWS S3. Do users understand (or again, do they care) about “what I get for my $3” internally, versus “what I get for my $0.15” externally ? Or do they just see it as “cheaper than IT can provide” ? In some cases, along with that “cheaper” might come “more easily accessible”. It’s not hard to see how that could keep CIOs awake at night.
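For a sense of scale, the gap can be worked through in a few lines of Python. This is only a sketch under stated assumptions: it treats both figures above as per GB per month, borrows the 100GB of old data from the earlier example, and uses a helper name, `monthly_storage_cost`, invented purely for illustration. It says nothing about what each price actually includes, which is the real question.

```python
from decimal import Decimal


def monthly_storage_cost(gigabytes, rate_per_gb_month):
    """Monthly cost of holding `gigabytes` at a flat per-GB rate.

    Hypothetical helper. Decimal keeps the currency arithmetic exact.
    """
    return Decimal(gigabytes) * Decimal(rate_per_gb_month)


# Assumption: both rates are per GB per month, as round figures.
internal = monthly_storage_cost(100, "3.00")
external = monthly_storage_cost(100, "0.15")

print(f"internal: ${internal} per month")
print(f"external: ${external} per month")
print(f"internal is {internal / external}x the external price")
```

On those assumptions the internal rate is 20 times the external one, and that multiple is all a cost-conscious user will see unless IT explains what the rolled-up unit pays for.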

I’m still trying to figure out whether “IT Showback” is a new cloudybuzzyword and is being touted as a way to drive user behavior at the expense of good, old-fashioned “transparency”. Enterprise IT has spent a considerable amount of time and effort trying to align itself with the business and a huge part of that (in my experience) has been a focus on transparency. We built our private cloud successfully using a business alignment model (i.e. proactive, not reactive) and a mechanism to track the total cost to deliver and maintain the new generation of IT services provided to the business. No AWS-style chargeback, no complex algorithms, just openness, honesty and integrity. Our costs are clear and the value is obvious.

So, going back to the fleeting imagery of the initial twitterverse exchange, do I think that “IT Chargeback” or “IT Showback” is “critical to the success of private cloud” ? No. I’m absolutely convinced it isn’t. What I firmly believe, however, is that without transparency or alignment, any private cloud effort is just the “next new mess” waiting to happen and will leave the unfortunate IT department that attempts it firmly labeled in the “cowboy” category.

The Crossover Cloud

I’m guessing that most of you who read this will never have stared quizzically into the front grille of a midnight blue Ford Edge and thought “a-ha, that’s the perfect analogy”. True ? I thought so.

I am, of course, assuming that you all know what a Ford Edge is. If you don’t, you can see one here > http://www.ford.com/crossovers/edge/

There’s not much about the car itself that makes me particularly excited except for one thing…its genre. The Ford Edge is a “Crossover”. Allow me to share a somewhat bizarre set of thoughts, but ones that hopefully make sense in terms of the ultimate message.

Over the last 18 months or so I, like many others, have struggled with the classification and explanation of the “Hybrid Cloud” model. I dislike the word intensely because I dislike its connotation. In today’s world we are continually bombarded with the promise of the Hybrid Vehicle and its great potential to save money, be more efficient and be better for the green footprint of the planet (none of which I am arguing with and most of which sounds like the promise of “cloud”) but there’s just one bit of the Hybrid Vehicle construction that I can’t let go of when I hear the Hybrid Cloud mentioned – it’s two completely different technologies, a conventional internal combustion engine (ICE) propulsion system with an electric propulsion system alongside, but both operate within the same physical mass of metal. In other words, you may have different technologies providing the combined solution, but they’ve been provided by the same manufacturer and they’re both under your roof (or hood, or bonnet, or whatever you call it).

Of course, the magic of the Hybrid Vehicle all works fine in an automobile after years and years of R&D, but when you put this in an enterprise computing context, it just makes me nervous. It may simply be that the automatic, threshold-managed movement between the two main propulsion systems happens without breaking the fundamental purpose of the vehicle (i.e. keeping it moving at a different efficiency level) but, even though that efficiency alone may be a worthy draw for the public cloud, I am not sure today how that same smooth movement back and forth could be achieved with enterprise IT workloads without significant investment in one’s own R&D. Indeed, it may just be that in the enterprise context, even two seemingly similar technologies (for powering on premise and off premise workloads) can differ just enough where the rubber meets the road (pun intended) to equal incompatibilities and that’s REALLY not where you want to be when choosing and explaining your comprehensive cloud strategy to your CIO.

So, what has this got to do with the Ford Edge ? Let’s look at the definition of the “Crossover” :

“A Crossover is a vehicle built on a car platform and combining, in variable degrees, features of a traditional sport utility vehicle (SUV) with features from a passenger vehicle, especially those of a station wagon or hatchback.

Using the unibody construction typical of passenger vehicles, the crossover combines SUV design features such as tall interior packaging, high H-point seating, high ground-clearance or all-wheel-drive capability — with design features from an automobile such as a passenger vehicle’s platform, independent rear suspension, car-like handling and fuel economy.”

With a little artistic license, let’s rephrase the above. I am definitely not in the business of trying to invent yet another moniker, but just for kicks….

“A Crossover Cloud is a solution built on a true platform and combining, in variable degrees, features of a cloud service vehicle (CSV) with features from a traditional data center, especially those of compatibility and security.

Using the unibody construction typical of enterprise data centers, the crossover cloud combines CSV design features such as pay-per-use computing, elastic capacity, industry accreditations (SAS-70, FISMA, ISO 27001), extensible networking and true on-demand capability — with design features from a traditional data center such as certified application stacks, high availability, familiar operations and round-the-clock support”

It may well be a far-fetched analogy, but there are three things I’d like to point out in the hope that the message may become a little clearer.

  • The “unibody construction” as the common denominator across on premise and off premise workloads. If the low-level technology is common and compatible, the vehicle (your workloads) will run and give you the option to add “optional extras” (provider service offerings) depending on your requirements.
  • Familiarity is an important factor. Not all repair shops are geared up to fix Hybrid Vehicles and not all enterprise support organizations are geared up to support complex, alien cloud infrastructures that need special tools or skills that aren’t readily available.
  • Passengers (end-users) should not notice any difference. The early Hybrid Vehicles drew attention to themselves because they looked and felt different than the automobiles that their drivers were used to driving. Crossover genre vehicles are barely different from today’s “usual” automobile. Think of that as a metaphor for user interface, usability, performance and end-user acceptance.

It’s not hard to understand and appreciate why 2011 may indeed turn out to be a very positive year for the “Crossover Cloud”. It is likely to be a year in which organizations continue to seek opportunities and demonstrate the power of combining on-premise and off-premise services, including running such unimaginable workloads as VDI (yes, why not?) in the off-premise infrastructure and moving HSM-driven, lower-value storage to public providers, both utilizing some kinds of “unibody” components that make the difference to the user (and support groups) hard to quantify from their respective stakeholder viewpoints.

The challenge, as I try to articulate above, will be to ensure that everything functions back and forth as seamlessly as the switch between the propulsion systems of the Hybrid Vehicle, while keeping the warm and comfortable feeling of the traditional automobile, knowing that if anything were to go wrong, you could always call on Joe and have him fix it using the tools he’s had for many, many years.
