This is a re-issue of a post from my old blog that people have asked for.
SR-71 Blackbird. Photo from Wikimedia Commons.
The SR-71 Blackbird, a global reconnaissance aircraft developed by Lockheed for the United States Air Force, first flew in 1964, and was in service with the Air Force from 1966 to 1998 (NASA flew it until 1999). Even at the time of its retirement, it represented relatively advanced technology as compared with most aircraft. In 1974, a Blackbird set the speed record between New York and London at just under 2 hours. The Blackbird’s speed record for manned air-breathing aircraft, set in 1976, still stands, although manned rocket-powered aircraft and at least one unmanned air-breather have gone faster.
The Blackbird was designed to operate under extreme conditions: Flying long missions at over 85,000 feet at speeds up to around Mach 3.2, the airframe is subjected to heat and pressure extremes no conventional jet would be able to survive. It is not enough to say the Blackbird can fly high and fast; it would be fair to say it must fly high and fast. Some of the engineering that makes it possible for the Blackbird to operate as it does (or as it did, while in service) also makes it problematic for the aircraft to operate at "normal" altitudes and speeds.
Some of these engineering details are described in a Wikipedia article about the SR-71. For example, the skin of the aircraft was made of titanium. Titanium not only tolerates the high temperatures and pressure of Mach 3+ flight, but tests of operational aircraft showed that the metal grew stronger as a result of the heat treatment it received during flight. The structure of the aircraft used 85% titanium and 15% composite materials. To allow for expansion under high heat and pressure, key surface areas of the inboard wing skin were corrugated rather than smooth. A conventional aircraft’s smooth skin would split or curl under the operating conditions for which the Blackbird is designed.
To account for expansion, and also because there was no fuel sealing system that could survive the high pressure and temperature at speed and altitude, the fuselage panels were designed to fit loosely. Once the aircraft was at altitude and speed, thermal expansion and air pressure held the fuselage together. On the ground, fuel leaked out. The aircraft would take off and climb rapidly, then be fueled again in flight before moving up to its operational altitude and speed.
The fuel that pooled on the runway prior to take-off was special, too. It is called JP-7. Its chemical composition is designed to produce exhaust gases that are hard to detect, which is one aspect of the aircraft’s stealth design. JP-7 also has a high flash point, so that it can be used safely at high temperatures. This characteristic also made JP-7 useful as a coolant and (because it is very slippery) hydraulic fluid during flight, prior to its being burned. The JP-7 would not even ignite unless it was lit by injections of triethylborane, which ignites on contact with air. To start the engines initially, and to light the afterburners, triethylborane was injected into the fuel flow by the pilot. That was important, because the special jet engines on the Blackbird are designed to fly on afterburners continuously.
That’s not the only unique characteristic of the Pratt & Whitney J58-P4 engines. The engine is a hybrid that operates as a turbojet at "normal" speeds and as a ramjet at high speeds. The Wikipedia article cited previously, as well as other published sources, describes this aspect of the design much better than I can. It’s quite impressive, really.
There are many more fascinating details about the special design of this aircraft. For my purposes here, the point is that in order to create an aircraft capable of operating at the speed and altitude of the SR-71 Blackbird, it was necessary to build an aircraft that could not, for all practical purposes, operate under "normal" conditions. Loose panels, leaky fuel, and corrugated skin wouldn’t last very long flying low and slow, or even moderately high and fast. The fleet of retired Blackbirds can never be repurposed as passenger aircraft, fighter-bombers, fire-fighting aircraft, or crop-dusters. They can only function at Mach 3+ and 85,000+ feet, or as museum pieces.
And that brings us around to the subject of commercial "enterprise-class" software products, and when it makes sense to choose them over Open Source or home-grown alternatives.
Information technology professionals love to argue, and it sometimes seems as if they especially love to engage in passionate, circular arguments that have no chance of ever resulting in a useful outcome. One such argument that has been popping up here and there in cyberspace recently is the debate between those who favor commercial software solutions and those who favor Open Source and home-grown solutions. Proponents of both sides of the argument appear to believe that there is exactly one rational answer to the question of commercial vs. Open Source software: Their own. Proponents of both sides of the argument declare proponents of the other side of the argument to be suffering from a cargo-cult mentality.
I’m pleased to report that they are both correct about their own opinions, and wrong about one another’s opinions.
Enterprise-class software has certain characteristics that make it suitable for the largest of large-scale processing requirements. Many managers choose products in this performance class because they reckon any software that can handle extremely high loads with good performance and high availability must certainly be able to handle moderate loads with reasonable performance and acceptable availability. Managers also choose such products because they tend to be optimistic and forward-looking in their assessment of their own company’s needs. It may be true that we aren’t as big as the largest companies in the world today, but we are well on our way; so let’s buy the same tools as the largest companies use.
It all makes perfect sense, on a certain level. However, there are four common problems with the line of reasoning many managers take when deciding whether to go with a high-end commercial product.
First, the assumption that a high-end product will easily be able to handle less-extreme loads is not necessarily accurate. Just as the SR-71 Blackbird cannot operate as a crop-duster, a high-end enterprise-class product doesn’t support moderate processing loads very smoothly. Like the Blackbird, enterprise-class products are specially designed to function well under extreme operating conditions. The design features that enable them to do so also introduce significant internal overhead at low and moderate loads — in their eagerness and readiness to crunch a big workload, they thrash.
The second problem is that many managers have an incorrect notion of just how big "big" is. They think the operational loads their company must support are big, when in reality they may only be moderate, in the grand scheme of things.
Third, when the company really does need an enterprise-class solution in one area of the business, many managers assume they need such solutions in all areas of the business. It’s more likely that the company has extreme needs in just one or two categories of IT support, and has moderate requirements otherwise.
Finally, many managers underestimate the total cost of ownership of enterprise-class products. Products of this kind are complicated to configure and administer. Each such product requires a dedicated team of specialists who know the product deeply. Professionals of that calibre are rare and expensive. Apart from the personnel costs, the majority of high-end enterprise-class products were designed monolithically. They usually do not lend themselves to frequent modification, and they usually do not fit nicely into a software delivery work flow that calls for multiple concurrent development initiatives. That means multiple code repositories, multiple test environments, multiple staging environments, and extended lead times for any project that involves configuration changes or customization of an enterprise-class product. Incorporating a product of this kind into an IT environment and maintaining it is not the same as purchasing a copy of Microsoft Office and loading it onto a laptop.
A couple of years ago, I had the opportunity to work with a certain enterprise-class software product from IBM called WebSphere Commerce. It is a tool for supporting a high-volume e-commerce website. Extremely large e-commerce operations such as Amazon or Lands’ End can make use of a product in this class to support their online stores. I don’t know that these particular companies actually use WebSphere Commerce; I mention them only to establish a frame of reference for judging a company’s real need for scalability, performance, and availability. I understand IBM has one competitor in this space, ATG. There are so few customers that legitimately require a product of this kind that there isn’t room in the market for many vendors.
I was extremely impressed by the technical architecture of IBM WebSphere Commerce. The level of detail the engineers reached in tailoring the product for each supported platform is downright amazing. It runs on IBM z-Series and i-Series machines as well as on AIX boxes and on commodity Intel hardware running Red Hat Enterprise Linux, SUSE Linux Enterprise, or Microsoft Windows. Although the product is based on a cross-platform technology (Java EE), it does not run on "any" Java Virtual Machine. This is because it has been heavily tailored to take full advantage of the hardware/software architecture of the handful of supported platforms on which it runs.
One could say, if one were inclined to say such things, that WebSphere Commerce’s Pratt & Whitney J58-P4 engines are explicitly designed to operate only under specific conditions, and to operate very well indeed under those specific conditions. The architecture of WebSphere Commerce is heavily customized to support high performance, high volume, high availability, and high reliability under heavy transaction loads. If you fly this bird at "normal" altitudes and speeds, its internal mechanisms to support high loads start to thrash, and it starts to leak virtual JP-7 all over the server room floor. You won’t get the pay-back you might have expected in exchange for the high cost of ownership. It will work, but there’s more to running a cost-effective IT operation than just that. A lot of solutions will work. The question is whether the benefit is worth the cost in a particular situation.
I’ve mentioned cost of ownership a couple of times already. Let’s explore that angle a bit more. Obviously, a commercial product comes with a price tag. That is only the beginning of cost of ownership, though. An enterprise-class product such as IBM WebSphere Commerce isn’t sold outright like a car or a pack of chewing gum; customers pay licensing fees. Products in this class are designed to be very difficult to learn, configure, administer, customize, and operate. Therefore, customers also pay for support, consulting, training, books, certification programs, specialized development tools that "know" how to interact with the product, vendor-sponsored publications, and user group memberships. Apart from the costs paid directly to the vendor and its business partners, customers also pay for facilities, utilities, personnel, insurance, and anything else necessary to keep the system operational.
Hey, wait a minute: Why would a software company design their flagship products to be difficult to learn, configure, administer, customize, and operate? Remember the small size of the market, as I mentioned earlier. The big software companies invest a significant amount in research and development of their enterprise-class products. If you doubt that, then I invite you to read up on the architecture of IBM WebSphere Commerce. I’ve worked with several other enterprise-class products over the years, as well; not only from IBM but from other vendors. They all share the characteristics that they are hard to learn, configure, administer, customize, and operate. And all are architected carefully to maximize their scalability, performance, and availability.
To recoup the development cost and start reaping profits, the software companies have to sell more copies of the product than there are customers who objectively need an enterprise-class solution. They are selling solutions that are appropriate for Fortune 100 companies to thousands upon thousands of mid-sized and small companies. In addition, they need to sell secondary products and services to try and squeeze a bit more revenue into the picture: Training classes, certification programs, consulting services, progressive levels of support, update subscriptions, custom development tools, and anything else they can dream up. If the products were as easy to learn and live with as Open Source solutions, the software companies would not be able to sell all this extra stuff. They need the revenue because software engineers with the skills necessary to design and build products in this class are exceedingly rare, and significant research and development facilities are needed.
There’s another reason for the high cost of ownership and the relatively high complexity of commercial products. Because the software companies must sell as many copies of their products as they can, the products must be flexible enough to accommodate the needs of a wide range of customers. Flexible software is configurable, customizable, and extensible. Configurable software can be made to operate differently by setting options in configuration files. Customizable software allows for the replacement of functional elements with custom versions of those elements. Extensible software offers a way to add functionality beyond what comes out of the box, through a plug-in architecture, user exits, or some other mechanism. Software vendors need their products to be highly flexible so that they can work for many different customers. If they limited the flexibility of the product, they would also be limiting their potential market.
From the point of view of any one customer, however, all that flexibility is merely unnecessary complexity that adds no value. Any single customer is interested in only one configuration of the product. But there is no way to get the scalability, performance, and availability characteristics of enterprise-class software without paying for the flexibility as well; and that is of benefit only to the vendor.
It sounds as if I’m weighing in on the side of the Open Source aficionados, doesn’t it? Not so fast! There’s more to the picture than cost. We have to balance cost against revenue generation. One of the sales people from the IBM business partner that sold WebSphere Commerce to the small firm where I was engaged told me a story of a client of theirs that runs a large WebSphere Commerce operation bringing in $35 million in revenue annually, against costs paid directly to IBM of around $500,000 per year. I would surmise they are paying another $500,000 in non-IBM-related operating costs to support the operation, as well. So, they are enjoying a 35:1 return. (Of course, this ignores other costs such as the cost of the merchandise they sell, but you get the picture.) I think that’s a fantastic deal. They have software that is really capable of supporting a high-volume, high-availability e-commerce operation over the web; they get support and help from IBM (in my experience, the responsiveness and quality of IBM support are very good); they get the training they need to babysit the thing (no easy task). It’s well worth the cost for this company, because they obtain a significant return on investment.
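The arithmetic behind that ratio is simple enough to spell out. Here is a minimal sketch in Python that uses only the figures from the story. The $500,000 of non-IBM operating cost is my own guess, as noted above, and the function name is purely illustrative.

```python
def revenue_to_cost_ratio(annual_revenue, *annual_costs):
    """Return annual revenue divided by the sum of the given annual costs."""
    total_cost = sum(annual_costs)
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return annual_revenue / total_cost


REVENUE = 35_000_000    # annual revenue quoted by the sales person
IBM_COSTS = 500_000     # annual costs paid directly to IBM, as quoted
OTHER_COSTS = 500_000   # assumption: my estimate of other operating costs

# Ratio against all software-related operating costs (the 35:1 figure).
ratio_all = revenue_to_cost_ratio(REVENUE, IBM_COSTS, OTHER_COSTS)

# Ratio against the IBM fees alone, which is how a sales pitch might frame it.
ratio_ibm = revenue_to_cost_ratio(REVENUE, IBM_COSTS)

print(f"All operating costs: {ratio_all:.0f}:1")
print(f"IBM fees only:       {ratio_ibm:.0f}:1")
```

The same function makes it easy to plug in your own company’s revenue and see how quickly the ratio collapses when the costs stay fixed and the revenue is one or two orders of magnitude smaller.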
The sales person who told the story of the happy client wanted to give the impression any company that wishes to earn $35 million a year from its e-commerce site ought to pony up and start paying IBM $500,000 a year as soon as possible. This sort of reversal of cause and effect is typical of the sales pitch that seems to be so effective in overselling Blackbirds to companies that will never use more than 10% of the products’ capabilities. You don’t build your business to the level of $35 million in revenue by throwing your money away in the early years. The message I take from the story is this: If your company has a genuine need for enterprise-class solutions, then you will have the necessary cash flow and personnel to support those solutions. If you don’t have the latter, then you’d better think of an alternative to the former until you can build your business.
I’m hard-pressed to imagine the client I was working with at the time ever reaching that level of e-commerce traffic. I could be mistaken, of course. If I’m not mistaken, then this could turn out to be an example of The Blackbird Effect.
What do I mean by The Blackbird Effect? It’s the phenomenon whereby companies sign up for very expensive, enterprise-class software products when they don’t really have an objective business case for them.
WebSphere Commerce is only one example. This isn’t about any one software company or any single product. All enterprise-class software has the same general characteristics that result in high cost of ownership. All enterprise-class software can perform at a level beyond the capabilities of Open Source alternatives; and in most cases, the Open Source alternatives perform better at low and moderate levels. Those highly scalable and performant products are Blackbirds. I’ve had to become intimate with several such products from various vendors over the years. I assure you, there’s nothing wrong with enterprise-class software as a category. The key questions are whether your firm actually needs that level of performance and has the cash flow to cover the operating costs. Some do and some don’t.
There’s also an emotional factor at play. Many managers imagine their companies are very large and have very significant processing requirements. Some of them are right. Most of them are not; they simply haven’t seen how big Big is. Even so, I think it’s a positive indicator. You can’t achieve an ambitious goal unless you are able to visualize success, whether the goal is to qualify for your country’s Olympic team or to learn to play the banjo. People who are trying to build a small or mid-sized company to the size of a Fortune 500 firm have to visualize success every day. So, when a software salesman says to them, "If you want to play with the big boys, you have to buy big boy toys," it’s only natural that their first question is, simply, "Where do I sign?" They’re really eager to get to the top. Sometimes they forget they haven’t arrived yet.
A few years back I was working at a company that I would consider mid-sized. It had about 33,000 employees altogether, and had operations in six US states. The IT department comprised about 1,300 people, of whom roughly 300 were software developers. Like any corporate IT department, this group spent around 80% of its budget on operations, infrastructure, and ongoing application support. The remaining 20% was deemed "development," but most of the development work consisted of integrating COTS packages. As a financial holding company, the firm had a number of subsidiaries such as mortgage lenders, banks, and investment firms. For selected business operations, this company genuinely needed enterprise-class solutions. When managers forget the part about "selected business operations," they may become susceptible to The Blackbird Effect. That is exactly what transpired in this case.
A positive example at that financial services company is image processing. They take in a huge volume of paper documents every day, from loan applications to pay stubs to hand-written checks. To get these documents into electronic form quickly enough to support business requirements, they invested well into the 8-figure range in high-end imaging equipment. To consume the output from this equipment, they purchased IBM Content Manager and hired enough staff to customize and support the product properly. I don’t mean they bought one copy of Content Manager (or whatever it’s named these days). They bought a suite of products around Content Manager, and implemented it all with heavy support for scalability and continuous availability. All of this is very expensive to live with, as you can imagine. But it’s a no-brainer for this company. The volume of work they do easily generates enough cash flow to cover the costs. In context, the costs aren’t burdensome at all. I don’t believe we could have supported the processing requirements of that particular operation by cobbling together a home-grown solution out of Open Source building blocks and long weekends. The company is much better off partnering with a major software vendor that has the resources to support them properly.
Another positive example is the company’s ETL processing. That stands for extract-transform-load. It’s a very typical sort of requirement in companies that have substantial legacy systems that were built decades ago, were designed to be monolithic, and reside on a range of disparate, incompatible platforms. ETL is also important for companies whose business operations include ingesting data from numerous external sources. The company purchased an enterprise-class ETL package, and it solved a multitude of annoying and time-wasting problems in moving data between systems. Yes, they could have built something out of Open Source parts and sweat equity, but it seems very unlikely to me that the result would have been as useful as the commercial solution they chose. Does every company need an enterprise-class ETL facility? Of course not. But this company definitely needed it.
So far, it sounds as if this company made wise choices regarding enterprise-class software products. And they did…sometimes. Sadly, there are many more negative examples than positive ones. Management decided they needed to build a world-beating technical infrastructure that included the "best of breed" products in every category, just in case the need for them might arise in future. I was either directly involved with, or had visibility into, four of these: WebMethods (which is apparently owned by Software AG now); the Blaze Rules Engine from Fair Isaac; a workflow automation engine made by a local company that may or may not still be in business; and Microsoft BizTalk, a service orchestration platform.
We did have a business case for one of the WebMethods products: The integration server. It was very useful to us. It’s really an excellent product. I feel I must reiterate: It’s an excellent product, if you really, really need an enterprise-class solution in this category. We did. You might not. Be careful. However, in their quest to have the best-of-breed across the board, management signed up for a multi-million dollar subscription to the full suite of WebMethods products. We really didn’t have any business use cases for the other products. So we paid a high annual fee for the right to use software we didn’t need. For shelfware. For management ego.
The rules engine seemed like a good idea at the time. I was one of the people who attended the training course to learn to support the product. What I learned (among other things) was that the real value of a rules engine is the software called the inference engine. To use the inference engine, business rules have to be independent of one another. If the rules are dependent on each other and have to be checked in a specific order, then you set the product to operate in sequential mode. What that means is you’ve got an if-else structure embedded in the rules engine where most developers won’t understand it, and you’re side-stepping the inference engine altogether. So, you’re getting no value for your money, and you’re actually adding needless complexity to your solution. I’ll give you three guesses: Were our business rules independent of each other? Very good, you got it on the first guess. Another interesting point is that even if you could invoke the inference engine, the overhead it incurs to build its node structure before it can start evaluating the rules will take more processing time than it’s worth unless you have at least 1,000 rules to feed into it. Our worst-case process had 15 rules.
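To make the sequential-mode point concrete, here is a minimal sketch in plain Python. It is not code for any rules engine product, and the rule thresholds and the loan-application fields are all hypothetical. It only illustrates that a small set of order-dependent rules is an ordinary if/elif chain that any developer can read and test.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Hypothetical loan application; the fields are invented for illustration."""
    credit_score: int
    annual_income: float
    requested_amount: float


def decide(app: Application) -> str:
    """Evaluate order-dependent rules top to bottom; the first match wins."""
    # Each later rule assumes the earlier ones did not fire, so the order matters.
    # That dependency is what forces a rules engine into sequential mode.
    if app.credit_score < 580:
        return "decline"
    elif app.requested_amount > 5 * app.annual_income:
        return "decline"
    elif app.credit_score < 680:
        return "manual review"
    else:
        return "approve"


print(decide(Application(credit_score=720, annual_income=80_000, requested_amount=100_000)))
```

With a dozen or so rules of this kind, nothing is left for an inference engine to do, and the plain function carries no start-up overhead at all.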
After some months with no usage of the workflow automation product, the manager who had signed the purchase order demanded that the next development project to come along had to incorporate the product into whatever solution they built. He didn’t care what the solution might be. He needed to show senior management that the purchase had not been foolhardy. The development team tried to comply. They ended up jamming a complete CRUD application into the "custom in-box" of the workflow product. As they added functionality iteration by iteration, it became more and more cumbersome to do anything with the "custom in-box" code. When I started to work with this team as a coach, this was one of the first problems I saw. I encouraged the team to take action, and eventually the technical team lead and the project manager went to the manager who had demanded the use of the product and explained to him that they were removing it from the solution because it was preventing delivery of useful results to the internal customer. The application they were delivering simply was not a workflow automation solution. The manager reluctantly agreed.
In those days, at that company, there was a dream that we would one day have a robust service-oriented infrastructure. To that end, but long before there were any services to orchestrate, the company purchased Microsoft BizTalk. In a replay of the workflow automation example, after some time of non-usage the manager who had signed the purchase order demanded that the next project to come along had to incorporate the product into whatever solution they built. There was no need for such a product in that project. Eventually the team was able to fake the use of BizTalk by writing a user exit and passing request and response messages through that thin piece of code, bypassing all BizTalk functionality. BizTalk was present on the server and appeared to be active. Microsoft even published an article in one of their corporate magazines to describe this highly successful and exemplary implementation of BizTalk. But we weren’t actually using any capabilities of the product.
My point is not to bash any of these products or companies. All the products mentioned are very good. My point is that a product is "good" only when it is used for its intended purpose and at its intended scale. The examples I gave of enterprise-class products at the financial company include two high-flying Blackbirds and four museum pieces. Just like the real Blackbirds, enterprise-class software can’t function in other roles. It’s either the high end of the spectrum or nothing.
Are there Open Source alternatives to these products that may be appropriate for companies whose level of operations doesn’t rise to the stratosphere? In most cases, yes. And in most cases, there are even scalable and highly available Open Source alternatives. You could build quite a robust e-commerce solution using Open Source web servers, app servers, and web frameworks. Having spent my fair share of time struggling with configuration of enterprise-class products, I’m confident that a good development team could produce a usable solution in less time than it usually takes to get a complicated commercial product up and running properly in production. Open Source products usually aren’t heavily customized to take advantage of specific platforms, however. For that reason, they won’t be able to compare with high-end enterprise-class products for truly high-volume workloads. One point I’m trying to make here is that most companies don’t have truly high-volume workloads. They think that, say, 5 million transactions per day is high-volume. High-volume would be, maybe, 20 times that. This misperception makes them vulnerable to the "big boys" sales pitch, and to The Blackbird Effect.
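Those daily figures are easier to judge when converted to a per-second rate. The following is a rough sketch in Python. It assumes the load is spread evenly over 24 hours, which real traffic never is, so treat the results as averages and not as peaks. The "20 times" multiplier is only my ballpark figure from above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tps(transactions_per_day):
    """Average transactions per second, assuming an evenly spread load."""
    return transactions_per_day / SECONDS_PER_DAY


moderate_per_day = 5_000_000          # what many companies call high-volume
high_per_day = 20 * moderate_per_day  # ballpark for genuinely high-volume

print(f"{moderate_per_day:,} per day is about {average_tps(moderate_per_day):.1f} per second")
print(f"{high_per_day:,} per day is about {average_tps(high_per_day):.1f} per second")
```

Seen this way, 5 million transactions per day averages out to fewer than 60 per second, while 100 million per day is well over a thousand per second, sustained around the clock.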
This isn’t about commercial software vs. Open Source software. It’s about assessing your real needs and understanding the cost-benefit balance before making significant financial decisions. Don’t try to dust your crops with a Blackbird. All you will do is crash. Manage the growth of your company intelligently, and before you know it you’ll be getting invitations to play golf with the big boys. Overspend in an attempt to look and act like one of the big boys prematurely, and you’ll be playing Putt-Putt, and liking it.