SmoothSpan Blog

For Executives, Entrepreneurs, and other Digerati who need to know about SaaS and Web 2.0.

Degrees of Multi-Tenancy (Degrees of Green Crystals)

Posted by Bob Warfield on June 20, 2008

Phil Wainewright is hosting a great discussion on multi-tenancy that now spans two postings.  I encourage you to read through both if you have an interest in SaaS or multi-tenancy.  The discussions really underscore how much confusion there is around the term.  I wanted to make a couple of points to try to dispel some of the, um, cloudiness around this holiest of cloud computing/SaaS tenets.

First, you have to keep in mind that multi-tenancy is much more of a marketing event than a technology event.  Whoa!  That sort of thing will get me excommunicated from the SaaS Church of Benioff.  Well, I’m sorry, but it’s true.  Multi-tenancy is all about what we used to call “green crystals marketing” at Borland, a term I first heard from my friend (and then VP of Product Management) Rob Dickerson.

What is Green Crystals Marketing?  When you’re having a hard time differentiating, you find something unique and make it your green crystals.  They provide a reason to believe your offering is better even if they aren’t the whole reason or even most of the reason.  In those days, VROOM (Virtual Real Time Object Oriented Memory) was Borland’s Green Crystals.  It reached a hilarious level of success when Bill Gates was left sputtering at one user group presentation when a member of the audience suggested he needed to license VROOM from Borland if he ever expected Windows to run on the machines of the day.  VROOM was in fact a very sophisticated overlay and memory manager, and a neat piece of technology, but its marketing presence was far larger than its technology reality.  It was written by Istvan Cseri (now runs a big part of MSFT SQL Server) and other really bright people, so I don’t mean to take anything away from it.  It delivered real value, but not necessarily as much as the hype would imply.

BTW, it’s called green crystals marketing due to soap advertising.  Why is our soap better?  Because it has green crystals. 

Multi-tenancy was Marc Benioff’s Green Crystals for SaaS.  He had to differentiate his offering from the ASPs of the day, and we again see the ASP curse word applied to companies who do not sufficiently comply with the vision of multi-tenancy.

Now let’s move on to the technology and a little more hard edged view of the realities.  We’re going to leave the marketing aside.

Multi-tenancy is ultimately about cost, when we look at what it delivers to the business.  It is more cost-effective in two ways.  First, it reduces machine resource requirements: CPU, memory, and disk.  Second, it reduces operational costs (but it isn’t the silver bullet many have claimed).  Because its goal is to reduce the number of instances, and to align everyone’s schemas, it becomes cheaper and easier to manage, and fewer admins are required.  Let’s look at each in turn, and focus on different variants of “multi-tenancy”, including some that many may not regard as “pure” enough to be called multi-tenant.

On the machine resource side, one can look at the various components, CPU, memory, and disk, and reach some conclusions.  Let’s start out with putting a single customer on one or more machines.  This is the one everyone agrees is not multi-tenant.  If you use this model, you’re not sharing any resources and every customer needs enough resources for their maximum usage.  It also means a lot more machines for administrators to touch in order to keep things humming along, do backups, do upgrades, and whatever else comes up.  Take this one as a baseline we can improve upon.

Next up is to employ virtualization.  It still looks like every customer has their own complete set of software, but we’re able to put multiple customers on a single configuration (could be multiple machines if we run clustering) and thereby share some resources.  Note that this sharing will largely be variable cost sharing.  Fixed costs will still mount up.  What do I mean by fixed versus variable?  Assume every customer gets a copy of MySQL or Oracle.  There is a fixed cost to bring up an empty MySQL or Oracle schema that is charged against every customer.  However, variable costs are easily shared among the various virtual instances that exist on a single configuration, so costs are lower.  Virtualization helps administrators a bit, given that there are fewer physical boxes to touch, but there are still a lot of instances to keep up with.

Okay, let’s jump to one of the “pure” multi-tenant models.  The classic one.  In this model, we have multi-tenancy right down to the tables.  Let’s say we have an “Accounts” table that lists companies in a CRM system.  Each row corresponds to a company.  There is a column that designates which tenant owns the row.  Software is carefully written so that the column is always checked and no tenant can see another tenant’s rows (you can see there is some potential for a mistake here though).  Efficiency is much greater because we eliminate the fixed cost overhead.  All the tenants that can run on a single instance of MySQL or Oracle share those fixed costs instead of incurring them over and over again.  There really is just one schema for administrators to look after, so the model is a lot cheaper.
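
To make the classic model concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL or Oracle.  The table layout, the tenant_id column name, and the accounts_for helper are my assumptions for illustration, not anything a particular vendor has published.  The idea is simply that every read goes through a code path that filters on the tenant column.

```python
import sqlite3

# In-memory database standing in for the shared MySQL/Oracle instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (tenant_id INTEGER NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts (tenant_id, name) VALUES (?, ?)",
    [(1, "Acme"), (1, "Globex"), (2, "Initech")],
)

def accounts_for(conn, tenant_id):
    """Return company names visible to one tenant (hypothetical helper).

    The tenant filter lives here, in one place, so that individual
    features do not have to remember to add it.
    """
    rows = conn.execute(
        "SELECT name FROM accounts WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [name for (name,) in rows]

tenant_one = accounts_for(conn, 1)
tenant_two = accounts_for(conn, 2)
```

Leave the WHERE clause out of a single hand-written query and one tenant sees another tenant’s companies, which is the potential for a mistake mentioned above.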

Is this the best possible model?  Perhaps.  It does have a drawback or two. For example, the cost of the column to identify the tenant is now being charged on every single row of every table.  Very likely it isn’t a big cost, but it is there.  Tables will get bigger too, as all the tenants are piled in.  Presumably this can lead to scaling issues sooner.  We can federate the tables by breaking them apart into sub-tables that still have groups of tenants.  Another important consideration is that if we ever needed to do reporting on data from multiple tenants, that’s pretty easy.   We may even use our notion of “tenant” to include the divisions or business units of a larger organization.
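
The cross-tenant reporting point is easy to see in SQL.  A sketch, again using sqlite3 and a hypothetical accounts table with a tenant_id column (both are assumptions on my part): a single GROUP BY covers every tenant at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (tenant_id INTEGER NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO accounts (tenant_id, name) VALUES (?, ?)",
    [(1, "Acme"), (1, "Globex"), (2, "Initech")],
)

def accounts_per_tenant(conn):
    """Count rows per tenant in one pass over the shared table."""
    query = (
        "SELECT tenant_id, COUNT(*) FROM accounts "
        "GROUP BY tenant_id ORDER BY tenant_id"
    )
    return conn.execute(query).fetchall()

report = accounts_per_tenant(conn)
```

If the tenants are divisions of one larger organization, this is a roll-up report with no extra plumbing.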

One last model I want to mention:  multiple-schemas-on-a-server.  In this model, we don’t commingle tenants within a table.  Each tenant has their own set of tables.  Scaling is easy: we can just move the tables onto new servers.  There are some fixed costs to having more tables, but they’re often less than the cost of the extra column, and they’re way less than virtualization-style fixed costs because we are still stacking multiple tenants within a database.  This is actually a pretty powerful model.  It gives the ability to manage scaling pretty easily.  It is slightly harder to roll out changes because you roll them out to a bunch of tables rather than a single table, but that still is not too bad.   This model can also be done with less fundamental rearchitecting than a columnar multi-tenant model.
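
Here is a rough sketch of the multi-table model, once more with sqlite3 standing in for the real database.  The accounts_t<N> naming scheme and the provision and roll_out helpers are hypothetical, made up for this example.  It shows both halves of the trade-off: no tenant column is needed, but a schema change becomes a loop over every tenant’s table.

```python
import sqlite3

def table_for(tenant_id):
    """Name of one tenant's accounts table (hypothetical naming scheme).

    Table names cannot be bound as SQL parameters, so the name is built
    only from a value that has been forced to an integer.
    """
    return f"accounts_t{int(tenant_id)}"

def provision(conn, tenant_id):
    """Create the private table for a new tenant."""
    conn.execute(f"CREATE TABLE {table_for(tenant_id)} (name TEXT NOT NULL)")

def roll_out(conn, tenant_ids, column_ddl):
    """Apply the same ADD COLUMN change to every tenant's table."""
    for tenant_id in tenant_ids:
        conn.execute(
            f"ALTER TABLE {table_for(tenant_id)} ADD COLUMN {column_ddl}"
        )

conn = sqlite3.connect(":memory:")
tenants = [1, 2, 3]
for tenant_id in tenants:
    provision(conn, tenant_id)

# No tenant_id column: the table itself identifies the owner.
conn.execute(f"INSERT INTO {table_for(1)} (name) VALUES (?)", ("Acme",))

# One logical change, three physical ALTER TABLE statements.
roll_out(conn, tenants, "region TEXT")
```

As long as the loop is automated, rolling a change out to a bunch of tables costs little more than rolling it out to one.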

This brings me to a definition for multi-tenancy that is the only one that makes much sense to me:

Multi-tenancy is software that makes it transparent to the third-party server that there is more than one tenant running there.

My database server has no idea whether I have 1, 20, or 200 tenants, whether I run columnar or multi-table.  Hence I see it as multi-tenant.  Virtualization, which may be just fine economically, is not really multi-tenant because we’re just sharing the hardware, not the software.  I don’t see these two models in terms of “purity” or “degree” (Phil Wainewright has First, Second, and Lesser Degrees in his discussion) because I can show you advantages for either of these two over the other, but both of these have significant advantages over the other models I’ve seen.

So what’s cheaper?  The latter two models, either columnar or multi-table multi-tenancy, will be cheaper unless you run so few tenants per machine it doesn’t matter.  This is likely a function of the size of the deals you’re closing.  Salesforce averages 20-odd seats per deal, so they want to cram a lot of tenants onto a single schema.  Others may run large enough deals that virtualization is fine, and I have certainly talked to some such.
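
A back-of-the-envelope model shows why deal size matters.  All the numbers below are invented for illustration (300 MB of fixed overhead per database instance, 50 MB or 5,000 MB of variable footprint per tenant); only the shape of the result is the point.

```python
from math import ceil

def footprint_mb(tenants, tenants_per_instance, fixed_mb, variable_mb):
    """Total memory when each instance hosts up to tenants_per_instance."""
    instances = ceil(tenants / tenants_per_instance)
    return instances * fixed_mb + tenants * variable_mb

def saving(tenants, fixed_mb, variable_mb):
    """Fraction saved by one shared instance versus one instance per tenant."""
    isolated = footprint_mb(tenants, 1, fixed_mb, variable_mb)
    shared = footprint_mb(tenants, tenants, fixed_mb, variable_mb)
    return 1 - shared / isolated

FIXED_MB = 300  # hypothetical cost of an empty database instance

# Many small tenants on a machine: the fixed cost dominates.
small_deals = saving(tenants=200, fixed_mb=FIXED_MB, variable_mb=50)

# A handful of large tenants on a machine: the fixed cost is noise.
large_deals = saving(tenants=4, fixed_mb=FIXED_MB, variable_mb=5000)
```

With these made-up inputs, sharing one instance saves about 85 percent for the small-deal vendor and about 4 percent for the large-deal vendor.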

There’s just one problem with all this:  the machine resources, fixed and variable, are not the lion’s share of the cost to deliver a service.  It’s Operations headcount.  While these models do somewhat ameliorate those costs, they are not the final word.  The final word is relentless automation of operations.  Facebook manages to administer 1800 MySQL servers per DBA.  I would venture to say most SaaS vendors are nowhere close to that level of efficiency regardless of which model they run.  I certainly haven’t talked to anyone who was.  If you had sufficiently automated your operations, you could run any of the models I mentioned and still get relatively cheap costs.  This automation is the real driver of SaaS efficiency, but it isn’t sexy.  It isn’t green crystals, so nobody talks about it much.
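
The headcount arithmetic is simple enough to sketch.  The 1800-servers-per-DBA ratio is the Facebook figure quoted above; the 50-servers-per-DBA ratio is a number I made up to stand for a lightly automated shop.

```python
from math import ceil

def dbas_needed(servers, servers_per_dba):
    """Headcount required to look after a fleet at a given ratio."""
    return ceil(servers / servers_per_dba)

FLEET = 1800  # servers, chosen to line up with the quoted ratio

heavily_automated = dbas_needed(FLEET, 1800)  # the quoted Facebook ratio
lightly_automated = dbas_needed(FLEET, 50)    # hypothetical ratio
```

Under those assumptions the same fleet needs one DBA in the first case and 36 in the second, a gap that no choice of tenancy model closes on its own.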

15 Responses to “Degrees of Multi-Tenancy (Degrees of Green Crystals)”

  1. danhardy said

    Good article, Bob.

    I’d like to suggest one additional benefit of the multiple-schemas-on-a-server model. A good portion of business apps will have a requirement that customization capabilities are present, and table-stakes for that is likely at least custom field capabilities. If each customer has their own set of tables, you can actually go ahead and give them custom fields by dynamically altering tables to add/remove columns (either altering the core tables, or an extension table that is joined in with the “base” table). This can be completely automated.

    This is a nice option to have, because custom fields can more easily become first-class fields in all respects – working seamlessly with other parts of the architecture such as API access, business rules processing, reporting, etc. I’ve seen some solutions where custom fields are clearly second class — core fields are referenced using something like $summary, while custom fields are instead referenced by some convoluted syntax like $cf.{mycustomfieldname}! Yuck.

    Dealing with custom fields as first-class constructs may be more difficult to do in an all-customer-in-one-schema model. You might find you have to go with other approaches (extension tables with N columns of each type, or many ‘custom value’ rows per record in an extension table, etc. — all of which seem less elegant).

    Dan

  2. smoothspan said

    Great point Dan. Almost everyone wants to be able to add custom fields. Trying to do this without going to a second table is cumbersome at best without a scheme such as you’ve suggested.

    Cheers,

    BW

  3. kalqlate said

    I think the degree to which custom fields appear and operate as first or second-class has mostly to do with the initial design considerations and goals of the provider.

    If custom fields are considered a first-class feature from inception then they should and will be given first-class treatment all around. The useability and programmability of native vs custom fields then becomes a non-issue as they were provisioned for throughout the entire modeling, UI design, and development processes. Certainly, on the data architecture and code side there may be issues of additional, yet surmountable, complexity. But on the front end, second-class handling of custom fields need not and should not be the case.

    The remaining benefits of multiple-schemas-on-a-server are easier backup and restore, and a greater “sense” of “assured” client data privacy. The drawback to multiple-schema vs shared-schema, as mentioned in your article, is cost of maintenance in terms of time and people resources.

    If poor useability and programmability of custom fields is a show-stopper for the shared-schema scenario, then a certain upcoming PaaS that I’m aware of will make some a bit more comfortable in that regard.

    I believe that supporting and making available all modes of multi-tenancy, and automating the process of migrating clients to and from these modes according to the desire of the client or need of the system, will be of benefit to all. The PaaS vendor will be more attractive as they can offer more points of entry and handling of concerns. Prospective and current clients will be comforted by the flexibility to choose their fit, cost and security-wise, with migration options to another tenancy mode when and if the need/desire arises.

    David

  4. kalqlate said

    I also believe that Bungee’s client-housed and hosted virtual application server appliance is another option for PaaS vendors to consider in terms of catering to comfort of potential clients.

    (BTW, the quote in my first post was intended to be “The final word is relentless automation of operations.”)

  5. […] multi-tenancy. Those postings in turn prompted fellow-Enterprise Irregulars blogger Bob Warfield to liken multi-tenancy to the green crystals apocryphally used in soap marketing: “… multi-tenancy is much more of a marketing event than a technology event. Whoa! That sort of […]

  6. […] Comments Software as Services… on Degrees of Multi-Tenancy (Degr…Social Media Adoptio… on Do You Read the SaaS Curmudgeo…Rob Mathewson on Kindle: Seth […]

  7. johnfmartin said

    Bob, you hit the nail on the head with this issue: multitenancy is really about having a single software version across all customers (for efficient support and maintenance), and shared infrastructure. Here are some thoughts on the impact of these factors on the Building SaaS blog: http://buildingsaas.typepad.com/blog/2006/08/salesforcecom_i.html

    The schema-per-client approach can work well, but only as long as the schemas are the same across customers so the software is exactly the same across customers. I disagree with the commenter Danhardy above about customer-specific schema changes – this approach requires customer-specific support and perhaps customer-specific software changes, which over time will erode the efficiency benefits for SaaS providers. Our SaaS offering is entering its 10th year and if we had built client-specific schema extensions over the years (instead of a metadata-driven approach that works across customers), our development and support efficiency would be a lot lower than it is today.

    – John Martin, IQNavigator CTO

  8. smoothspan said

    John, for the most part I agree that schema changes should not be customer specific. There are ways to implement custom fields and the like without requiring custom schemas. I can imagine scenarios though where some custom aspects make sense. Even there, though, I would implement them via metadata and other automated facilities and not allow manual customization.

    Cheers,

    BW

  9. […] bloggers unfold with a focus on multi-tenancy. A while back, Bob Warfield over at Smoothspan, posted some interesting commentary on multi-tenancy being more of a marketing gimmick (not so black and white, but the gist was there) […]

  10. […] causes significant miscommunication. From a marketing perspective, vendors are sucked into the Green Crystals Marketing described by Bob Warfield a couple of years ago. Most cloud vendors are touting that they are multi-tenant; they want you to […]

  11. […] use a somewhat limited definition of what constitutes multi-tenancy. Not that I want to veer into a ‘green crystals’ debate here about whose multi-tenancy follows the one true path. But the fact remains that there are ways […]

  12. […] a rather singular clarification of what constitutes multi-tenancy. Not that we wish to curve into a ‘green crystals’ debate here about whose multi-tenancy follows a one loyal path. But a fact stays that there are ways of […]

  13. […] use a somewhat limited definition of what constitutes multi-tenancy. Not that I want to veer into a ‘green crystals’ debate here about whose multi-tenancy follows the one true path. But the fact remains that there are ways […]

  14. […] providers. The emphasis on well designed multi-tenancy is important. We do not wish to re-visit the green crystals debate, but a good multi-tenant architecture will facilitate automation in a host of areas, while […]

  15. […] providers. The emphasis on well designed multi-tenancy is important. We do not wish to re-visit the green crystals debate, but a good multi-tenant architecture will facilitate automation in a host of areas, while […]
