Database as a Service (DBaaS) is not only a relatively new term but also a surprisingly generic one. Various companies, products, and services have claimed to offer a DBaaS, and this has led to a fair amount of confusion.
What Is Database as a Service?
As the name implies, DBaaS is a database that is offered to the user as a service. But what does that really mean?
Does it, for example, imply that the DBaaS is involved in the storage and retrieval of data, and the processing of queries? Does the DBaaS perform activities such as data validation, backups, and query optimization and deliver such capabilities as high availability, replication, failover, and automatic scaling?
One way to answer these questions is to decompose a DBaaS into its two constituent parts, namely, the database and the service.
The Database
There was a time when the term database was used synonymously with relational database management system (RDBMS). That is no longer the case. Today the term is used to refer equally to RDBMS and NoSQL database technologies.
A database management system is a piece of technology, sometimes only software, sometimes with customized and specialized hardware, that allows users to store and retrieve data. The Free Online Dictionary of Computing defines a database management system as "A suite of programs which typically manage large structured sets of persistent data, offering ad hoc query facilities to many users."
The Service
Looking now at the other half, "as a Service," we can see that its very essence is the emphasis on the delivery of the service rather than the service being delivered.
In other words, Something as a Service makes it easier for an operator to provide the Something for consumption while offering the consumer quick access to, and the benefit of, the Something in question.
For example, consider that Email as a Service offerings from a number of vendors, including Google's Gmail and Microsoft's Office 365, make it easy for end users to consume e-mail services without the challenges of installing and managing servers and e-mail software.
The Service as a Category
The most common use of the term "as a Service" occurs when referring to the broad category of Software as a Service (SaaS). This term is often used to refer to applications as a service, like the Salesforce.com customer relationship management (CRM) software, which is offered as a hosted, online service. It also includes Infrastructure as a Service (IaaS) offerings like AWS and Platform as a Service (PaaS) solutions like Cloud Foundry or Engine Yard.
DBaaS is a specific example of SaaS and inherits some of the attributes of SaaS. These include the fact that DBaaS is typically centrally hosted and made available to its consumers on a subscription basis; users only pay for what they use, and when they use it.
DBaaS Defined
One can therefore broadly define a DBaaS to be a technology that
Offers database servers on demand;
Provisions those database servers;
Configures those database servers, or groups of database servers, potentially in complex topologies;
Automates the management of database servers and groups of database servers;
Scales the provided database capacity automatically in response to system load; and
Optimizes the utilization of the supporting infrastructure resources dynamically.
Clearly, these are very broad definitions of capabilities, and different offerings may provide each of them to a different degree.
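To make the capabilities listed above more concrete, here is a minimal sketch of a toy, in-memory DBaaS control plane in Python. Everything in it is hypothetical and exists only for illustration: the names `ToyDBaaS`, `DatabaseServer`, `provision`, `configure`, and `scale_to_load`, and the notion of a fixed load capacity per server, are assumptions and do not correspond to the API of any real product. It covers only provisioning, configuration, and load-driven scaling.

```python
import math
from dataclasses import dataclass, field
from itertools import count


@dataclass
class DatabaseServer:
    """One provisioned database server (hypothetical model)."""
    server_id: int
    datastore: str                      # for example "mysql" or "mongodb"
    config: dict = field(default_factory=dict)


class ToyDBaaS:
    """In-memory stand-in for a DBaaS control plane; not a real API."""

    def __init__(self, max_load_per_server=100.0):
        # Assumed capacity of one server, in arbitrary load units.
        self.max_load_per_server = max_load_per_server
        self.servers = []
        self._ids = count(1)

    def provision(self, datastore):
        """Offer and provision a database server on demand."""
        server = DatabaseServer(next(self._ids), datastore)
        self.servers.append(server)
        return server

    def configure(self, server, **settings):
        """Apply configuration settings to a provisioned server."""
        server.config.update(settings)

    def scale_to_load(self, datastore, load):
        """Add or remove servers of one datastore to match the load.

        Returns the number of servers of that datastore after scaling.
        """
        # Servers needed to carry the load, never fewer than one.
        needed = max(1, math.ceil(load / self.max_load_per_server))
        current = [s for s in self.servers if s.datastore == datastore]
        while len(current) < needed:
            current.append(self.provision(datastore))
        while len(current) > needed:
            # Release the most recently provisioned server first.
            self.servers.remove(current.pop())
        return len(current)


service = ToyDBaaS(max_load_per_server=100.0)
primary = service.provision("mysql")
service.configure(primary, max_connections=200)
print(service.scale_to_load("mysql", load=250.0))  # scales out to 3 servers
print(service.scale_to_load("mysql", load=80.0))   # scales back in to 1 server
```

A real offering would perform each of these steps against actual infrastructure and, as noted above, might provide only some of them, or provide them to differing degrees.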
Just as Amazon offers EC2 as a compute service on its AWS public cloud, it also offers a number of DBaaS products. In particular, it provides Relational Database Service (RDS) for relational databases like MySQL or Oracle, a data warehouse as a service in Redshift, and a couple of NoSQL options in DynamoDB and SimpleDB.
OpenStack is a software platform that allows cloud operators and businesses alike to deliver cloud services to their users. It includes Nova, a computing service similar to Amazon's EC2, and Swift, an object storage service similar to Amazon's S3, as well as numerous other services. One of these additional services is Trove, OpenStack's DBaaS solution.
Unlike Amazon's DBaaS offerings, which are database specific, Trove allows you to launch a database from a list of popular relational and nonrelational databases. For each of these databases, Trove provides a variety of benefits including simplified management, configuration, and maintenance throughout the life cycle of the database.
The Challenge Databases Pose to IT Organizations
Databases, and the hardware they run on, continue to be a significant part of the cost and burden of operating an IT infrastructure. Database servers are often the most powerful machines in a data center, and they rely on extremely high performance from nearly all of a computer's subsystems.
The interactions with client applications are network intensive, query processing is memory intensive, indexing is compute intensive, retrieving data requires extremely high random disk access rates, and data loads and bulk updates require that disk writes be processed quickly. Traditional databases also do not tend to scale across machines very well, meaning that all of this horsepower must be centralized into a single computer or redundant pair with massive amounts of resources.
Of course, new database technologies like NoSQL and NewSQL are changing these assumptions, but they also present new challenges. They may scale out across machines more easily, reducing the oversized hardware requirements, but the coordination of distributed processing can tax network resources to an even greater degree.
The proliferation of these new database technologies also presents another challenge. Managing any particular database technology can require a great deal of specialized technical expertise. Because of this, IT organizations have typically developed expertise in only one database technology or, in some cases, a few. As a result, they have generally offered their customers support for only a limited number of database technologies. In some cases, this was justified, or rationalized, as being a corporate standard.
In recent years, however, development teams and end users have realized that not all databases are created equal. There are now databases that are specialized to particular access patterns like key-value lookup, document management, map traversal, or time series indexing. As a result, there is increasing demand for technologies with which IT has limited experience.
Starting in the latter part of the 2000s, there was an explosion in the so-called NoSQL databases. While it was initially possible to resist these technologies, their benefits and popularity made this extremely difficult.