Tuesday, July 29, 2008

VMware Accelerates Cloud with Free ESX

The new CEO of VMware, Paul Maritz, seems committed to establishing VMware technology as the basis for emerging compute cloud offerings that enable shared, scalable infrastructure as a service via hypervisor virtualization. With Amazon EC2 – the poster child for a successful compute cloud offering – built upon the competing Xen technology from Citrix, Maritz is losing no time staking a claim to other potential providers by meeting the Xen price requirement: zero, zilch, nada, zip. I love it. Low cost drives adoption, and free is as good as it gets when it comes to low cost and adoption.

As the economics of servers tilt more and more toward larger systems with multi-core CPUs, the hypervisor is going to become a requirement for getting value from the newer, larger systems. Developers simply do not write code that scales effectively across lots of CPUs on a single system. The coding trend is toward service-oriented architectures that deliver functions as small, atomic applications running on one or two CPUs, with multiple units deployed to achieve scalability. Couple the bigger-server trend with the SOA trend with the virtualization trend with the cloud trend, and you have a pretty big set of table stakes that VMware does not want to miss. If a hypervisor is a requirement, why not use VMware's hypervisor if it is free?

The only challenge with free in the case of VMware is going to be the lack of freedom. Xen currently offers both a free price and freedom because of its open source heritage. If I run into a problem with VMware's ESX, my only recourse is to depend on the good will of VMware to fix it. With Xen, I have the option of fixing the problem myself if I am so inclined and capable. It will be interesting to watch the hypervisor choices people make as they build their cloud infrastructures, both internally and for commercial consumption, based upon the successful Amazon EC2 architecture.

Thursday, July 24, 2008

The CIO is the Last to Know

A recent Goldman Sachs survey of CIOs indicates that these executives do not plan to spend much money on cloud computing in the coming year. Indeed, most of their stated plans involve reducing the amount of consulting services and hardware that they are buying. I'm certain the predictions are accurate, and this scenario will lead to even more rapid growth in cloud computing. And the CIO will be the last to know.

How does this work? If Goldman has correctly measured the intentions of the CIOs, then they will not be spending money on cloud computing. Instead, it will be the business units that they are supposed to serve that will be spending the money, because the service level of the IT department will not meet their needs. Recall the reduction in consultants and service personnel? When a fixed income group at an investment banking house needs to stand up 50 servers to run a set of Monte Carlo simulations to test a hypothesis, the over-stressed IT department's response is going to be “we'll get to that request after we fill the 25 that are in line ahead of it. It will probably be next quarter.”

The “swoosh” sound you just heard is the developer of the simulation code swiping his credit card to set up his Amazon Web Services account. Three days later, he has 100 systems standing up on Amazon's Elastic Compute Cloud pumping back the information he needs to help his traders make money. The credit card bill is only about $5,000 per month – much cheaper than the IT chargeback for similar capability. The head of fixed income hears about the profits due to the extra simulation capacity, and the developer gets a promotion and is encouraged to spin up another 100 to 200 machines to get even more aggressive with the strategy. Relative to the millions in profit, the cost is peanuts, and the IT department just can't respond to these types of requests anyway. The CIO is the last to know.
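For the curious, here is a minimal sketch of that “swoosh,” assuming the open source boto library for Python; the AMI ID, key pair name, and instance count are hypothetical, not anyone's actual setup:

    # Hypothetical sketch: renting simulation capacity on EC2 with boto.
    # Credentials come from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY in the
    # environment; the image and key pair names are made up.
    import boto

    conn = boto.connect_ec2()

    reservation = conn.run_instances(
        'ami-12345678',            # hypothetical image with the simulation code baked in
        min_count=100,
        max_count=100,
        instance_type='m1.small',  # the $0.10-per-hour instance size
        key_name='montecarlo',
    )

    for instance in reservation.instances:
        print(instance.id, instance.state)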

It always happens this way with new technology. As the leader of North America sales for Red Hat in 2002, I remember calling on the CIO of a company in the financial services industry that processed millions of transactions daily in support of the equities market. I sat in his office while he explained to me that his operation was mission critical – the markets depend on this operation. He would never consider using Linux and open source. “Why don't we take a tour of the datacenter?” he asked. I was game, so I replied, “Sure.”

As we walked the floor, I noticed several machine consoles indicating they were attached to Red Hat Linux 7.1 servers. Here is the conversation that ensued:

Billy: What's this?

CIO: Huh? I don't know. Steve, what's this all about?

Steve the Admin: Yeah, we're running Red Hat Linux for most of our network services.

CIO: What do you mean?

Steve the Admin: You know, Apache, BIND, SendMail, a few transaction servers and log crunchers mixed in here and there.

CIO: How many of these are we running in this datacenter?

Steve the Admin: About 25% of the machines, I would guess. About 800 servers in total.

Billy: Why don't we go back to your office and have another conversation about how much value you are getting out of Linux and open source and how Red Hat can help you?

The CIO is always the last to know about new technology. The head of engineering brought UNIX into the enterprise for CAD/CAM and analysis applications, and the CIO was the last to know. Department managers brought in PCs and Windows for personal productivity and desktop publishing, and the CIO was the last to know. System administrators brought in Linux for network services, and the CIO was the last to know. The sales force brought in salesforce.com and introduced the enterprise to SaaS, and the CIO was the last to know. Developers in the business units will use cloud computing, and the CIO will be the last to know.

The good news is that CIOs know where their bread is buttered, and eventually supporting the business units becomes the top priority. In this case, I would guess that all of the spending that Goldman noted as earmarked for virtualization will pave the way for a hybrid approach to cloud computing. The enterprise IT function will begin to model the services that it provides after Amazon, with hypervisor virtualization as the basis of the compute capacity. Then, with a single, corporate architecture for cloud computing, applications will be able to scale seamlessly across the internal cloud infrastructure and also out into external clouds when necessary for extra capacity. In this scenario, everyone gets what they want, and the CIO is a hero for reducing the fixed costs and operating budget associated with datacenter capacity. Being the last to know isn't necessarily a bad thing.

Friday, July 18, 2008

Citrix Management Land Grab - Project Kensho

In an effort to secure the management technology high ground as hypervisors proliferate and become more of a ubiquitous commodity than a premium point product, Citrix has announced Project Kensho. The strategy of enabling portability and scalability across heterogeneous hypervisors is so obvious and correct that my only question after reading the release was “What is the provenance of the name 'Kensho'?”

Given Citrix's headquarters deep in Florida, my first speculation was that it was a phonetically correct rendering of an affirmative response with a southern accent:

“I ken sho' yew zackly how dis new technology iz gonna be betta dan anything yew eva saw in yo' life.”

It seemed reasonable at first, given my own tendency to display a southern flair. But given the likelihood of a strong influence by Simon Crosby, an Englishman who is the Citrix CTO for all things virtualization, perhaps the name is not based at all upon the southern heritage of Citrix. I sent Simon a note asking about the provenance of the name, and he replied:

“Kensho is a Zen Buddhist term (pun on Xen) for enlightenment experiences . . . . Now go deep into your Zen mind and figure those out!”

With the mystery of the name solved, let me offer some commentary on the obvious. I have no doubt that the future of application release and lifecycle management is going to be based upon virtual machine images. By releasing and managing applications as virtual machines, it is possible to define the application independently of the infrastructure upon which it runs. In so doing, applications can be deployed and scaled on demand – on any virtualized cloud of machines – without complex, costly, ad hoc setup procedures. More importantly, you can de-scale on one infrastructure and re-scale on a different infrastructure as demand fluctuates, because the setup and configuration information is not unique to the infrastructure. Unlike VMware's VADK technology, which is unique to the VMware hypervisor, Project Kensho is aiming higher by embracing this obvious, management-focused, hypervisor-independent architecture for cloud computing.
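To make the architecture concrete, here is a minimal sketch of the idea in Python – hypothetical names throughout, not Citrix's or VMware's actual tooling: the application is defined once as an image plus metadata, and thin per-hypervisor adapters handle deployment.

    # Hypothetical sketch of hypervisor-independent application release:
    # one appliance definition, multiple deployment targets.
    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class ApplianceSpec:
        name: str
        image_url: str   # hypervisor-neutral disk image (e.g., an OVF package)
        cpus: int
        memory_mb: int
        network: str     # logical network, resolved per target at deploy time

    def deploy_to_xen(spec: ApplianceSpec, count: int) -> None:
        print(f"XenServer: starting {count} x {spec.name} from {spec.image_url}")

    def deploy_to_esx(spec: ApplianceSpec, count: int) -> None:
        print(f"ESX: starting {count} x {spec.name} from {spec.image_url}")

    # The adapters differ; the application definition does not.
    ADAPTERS: Dict[str, Callable[[ApplianceSpec, int], None]] = {
        "xen": deploy_to_xen,
        "esx": deploy_to_esx,
    }

    if __name__ == "__main__":
        app = ApplianceSpec("log-cruncher", "http://images.example.com/log-cruncher.ova",
                            cpus=2, memory_mb=2048, network="dmz")
        ADAPTERS["xen"](app, count=10)   # scale up on Xen today...
        ADAPTERS["esx"](app, count=10)   # ...and re-scale on ESX tomorrow

In practice the payload would be something like an OVF package rather than a Python object, but the point stands: the definition travels with the application, not with any one hypervisor.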

Well, I say “Welcome to the party, Citrix!” The more voices we have proclaiming the benefits of this new architecture for cloud computing, the better. I have spoken with no fewer than 12 CTOs and CIOs over the last 3 weeks who have proclaimed to me the importance of multi-hypervisor support for any application release and lifecycle management system based upon virtual machines. The ability to scale, de-scale, and re-scale seamlessly and repeatably across multiple infrastructure targets is critical if cloud computing is to move from promising hype to bankable reality. Kudos to Citrix for moving the ball down the field on this critical goal with Project Kensho.

Thursday, July 10, 2008

Thank you, Diane Greene

“Hello, this is Diane Greene.” Such was my introduction to Diane back in 1998, when she joined a conference call with Matthew Szulik and me. I had just reviewed the VMware technology with one of VMware's business development managers, Reza Malekzadeh, as part of a partnership opportunity between Red Hat and VMware. Red Hat, although still a very small company with only 70 employees and about $12M in revenue, was a hot target for alliances, and VMware wanted us to distribute their product with our Red Hat Linux product as part of the “extras” CD. Our engineers thought the technology was “very cool,” so shipping it as part of the CD made sense because it would create more demand for our product. I also thought it was cool, but at the time I was very skeptical of the business model.

Reza had shown me a diagram of the different permutations of how someone might use VMware. He indicated that it would be used immediately as a host environment atop an existing OS (such as Windows or Linux) to enable developers to rapidly develop and test for many platforms atop their workstations. But, he indicated to me that the big vision was for VMware to be the bottom layer, right against the hardware, with multiple other OS implementations running as guests atop that layer. My response: “I don't understand why anyone would ever want to do that.” Now we understand why Diane got SDForum's visionary award a few weeks back and Billy Marshall was lucky to be on the guest list.

Not one to be left behind, I needed only 6 years to determine that this new approach indeed represents one of the biggest opportunities to improve the efficiency and capability of information technology. As the hypervisor replaces the general purpose OS as the layer that exposes the hardware, the applications that ride atop that layer become much more portable, and the datacenter resources become much more efficient. As Diane leaves VMware to explore her next opportunity, I owe her a big debt of gratitude for shining so much bright light on this revolutionary approach to computing. Thank you, Diane Greene, for giving all of us that play in this market an opportunity to do something wonderful for our customers.

Monday, July 07, 2008

Shut Down the Datacenter

Or at least power down significant pieces of it during periods of low demand. This message always draws funny looks from IT types when I suggest a seemingly simple answer to the problem of extreme costs for datacenter resources. I push on:

Billy – If utilization is around 20 – 30%, aren't there periods of time when you could just shut down about 50% of the systems? Or at least 25%?

IT – We can't just shut the systems down. . .

Billy – Why not? You aren't using them.

IT – You don't understand.

Billy – What am I missing?

IT – Well, it just doesn't work that way.

Billy – How does it work?

IT – It takes a long time to lay the application down atop a production server.

Billy – Why?

IT – Setup is complicated. Laying down the application and bringing it online can take anywhere from several days to several weeks – typically 2 to 4.

Billy – So part of the application definition is described by the physical system it runs on?

IT – Yes, that's right. If I shut down the physical system, I lose part of the definition and configuration of the application.

And therein lies the culprit. The “last mile” of application release engineering and deployment is a black art. Applications become tightly coupled to the physical hosts upon which they are deployed, and the physical hosts cannot be powered down without losing the definition of a stable application. Bringing the application back up is expensive due to the high cost of expert administration resources, and it is fraught with peril because the process is not repeatable. Enterprises are spending billions of dollars on datacenter operating costs because the savings of taking applications off-line are not worth the risk of bringing them back on-line.

Of course, I blame most of this mess on the faulty architecture of the One Size Fits All General Purpose Operating System (OSFAGPOS). OSFAGPOS is typically deployed along with the physical hosts because OSFAGPOS provides the drivers that enable the applications to access the hardware resources. To get an application to run correctly on OSFAGPOS, system administrators then need to “fiddle with it” to adjust it to the needs of any given application. This “fiddling” is where things run amok. It's hard to document “fiddling,” and it is therefore difficult to repeat “fiddling.” The “fiddle” period can last for up to 30 days, depending on the complexity of the “fiddling” required.

So how do we get away from all of this “fiddling” around and deploy an architecture that allows the datacenter to scale up and down based on actual demand? Start with a bare metal hypervisor as the layer that provides access to the hardware. Then extend release engineering discipline to include the OS by releasing applications as virtual machines with Just Enough OS (JeOS, or “juice”) in lieu of OSFAGPOS, complete with all of the “metadata” required to access the appropriate resources (memory, CPU, data, network, authentication services, etc.). By decoupling the definition of the application from the physical hosts, a world of flexibility becomes possible for datacenter resources. Starting up applications becomes fast, cheap, and reliable. As an added bonus, embracing cloud capacity such as that provided by Amazon's EC2 becomes a reality. Instead of standing up all application capacity in-house, certain peak-demand workloads can be deployed on demand with a variable cost model (in the case of Amazon, it starts at about $0.10/CPU/hr).
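As a back-of-the-envelope illustration of that variable cost model – the rate is the roughly $0.10 per CPU-hour figure above, while the workload shape is purely hypothetical:

    # Hypothetical arithmetic for on-demand burst capacity at ~$0.10/CPU/hr.
    RATE_PER_CPU_HOUR = 0.10

    def burst_cost(machines, hours, rate=RATE_PER_CPU_HOUR):
        """Cost of renting single-CPU instances for a burst workload."""
        return machines * hours * rate

    print(burst_cost(100, 72))       # a 3-day, 100-machine burst: $720
    print(burst_cost(100, 24 * 30))  # the same 100 machines all month: $7,200

A burst costs hundreds of dollars; buying, powering, and cooling the same capacity year-round just to cover the occasional burst is where the idle money goes.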

With oil trading at around $140 per barrel, the cost of allowing datacenter resources to “idle” during slow demand periods is becoming a real burden. “Fiddling around” with applications to get them deployed on OSFAGPOS is no longer just good clean fun for system administrators. It is serious money.
