1. Why NoSQL?
NoSQL databases are the group of databases that are not based on the relational database model. Relational databases such as Oracle Database, MySQL, and DB2 store data in tables that are related to one another, and use SQL (Structured Query Language) to access and query those tables. NoSQL databases, in contrast, use a storage and query mechanism that is predominantly based on a non-relational, non-SQL data model.
NoSQL databases do not share one fixed data model; what they have in common is that they do not use the relational, tabular model of SQL-based databases. Most NoSQL databases use no SQL at all, but NoSQL does not imply that absolutely no SQL is used, which is why NoSQL is also read as "not only SQL." Some examples of NoSQL databases are listed in Table 1-1.
Table 1-1. NoSQL Databases
NoSQL Database | Database Type | Data Model | Support for SQL-like query language |
---|---|---|---|
Couchbase Server | Document | Key-Value pairs in which the value is a JSON (JavaScript Object Notation) document. | Supports N1QL, which is an SQL-like query language. |
Apache Cassandra | Columnar | Key-Value pairs stored in a column family (table). | Cassandra Query Language (CQL) is an SQL-like query language. |
MongoDB | Document | Key-Value pairs in which the value is a Binary JSON (BSON) document. | The MongoDB query language is based on JSON-style query documents rather than SQL-like syntax. |
Oracle NoSQL Database | Key-Value | Key-Value pairs. The value is a byte array with no fixed data structure. The value could be a simple fixed-format string or a complex data structure such as a JSON document. | SQL query support from an Oracle database External Table. |
This chapter covers the following topics.
What is JSON?
What is wrong with SQL?
Advantages of NoSQL Databases
What has Big Data got to do with NoSQL?
NoSQL is not without Drawbacks
Why Couchbase Server?
Who Uses Couchbase Server and for what?
What Is JSON?
As mentioned in Table 1-1, the Couchbase Server data model is based on key-value pairs in which the value is a JSON (JavaScript Object Notation) document. JSON is a data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. The JSON text format is independent of any programming language but uses conventions familiar from commonly used languages such as Java, C, and JavaScript.
Essentially, a JSON document is an object: a collection of name/value pairs enclosed in curly braces {}. Each name is followed by a colon (:) and its value, and successive name/value pairs are separated by a comma (,). The following JSON document specifies the attributes of a catalog as name/value pairs.
{
  "journal": "Oracle Magazine",
  "publisher": "Oracle Publishing",
  "edition": "January February 2013"
}
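As a minimal sketch of what "easy to parse and generate by a machine" means in practice, the catalog document can be read and written with Python's standard `json` module. Python is used here only for illustration, and the variable names (`catalog_text`, `catalog`, `round_trip`) are assumptions, not part of any Couchbase API.

```python
import json

# The catalog document from the text, held as a JSON string.
catalog_text = """
{
  "journal": "Oracle Magazine",
  "publisher": "Oracle Publishing",
  "edition": "January February 2013"
}
"""

# json.loads parses a JSON object into a Python dict.
catalog = json.loads(catalog_text)

# Each name/value pair becomes a dict key and its value.
for name, value in catalog.items():
    print(f"{name}: {value}")

# json.dumps generates JSON text again from the dict.
round_trip = json.dumps(catalog)
```

Parsing the generated text a second time yields the same dictionary, which is what makes JSON convenient as an interchange format.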
The name in each name/value pair must be enclosed in double quotes (""). A string value must also be enclosed in double quotes, even when it is empty. The value may have one of the types listed in Table 1-2.
Table 1-2. JSON Data Types
Type | Description | Example |
---|---|---|
string | A string literal, which must be enclosed in "" . A string may contain any Unicode character, except that " , \ , and control characters must be escaped. A " or \ is escaped by preceding it with a \ . | Valid: { "c1":"v1", "c2":"v2" } Not valid, because " and \ are unescaped: { "c1":""", "c2":"\" } Valid, with both characters escaped: { "c1":"\"", "c2":"\\" } |
number | A number may be positive or negative, integer or decimal. | { "c1": 1, "c2": -2.5, "c3":0 } |
array | An array is a list of values enclosed in [] . | { "c1":[1,2,3,4,5,"v1","v2"], "c2":[-1,2.5,"v1",0] } |
true, false | The value may be a Boolean true or false. | { "c1":true, "c2":false } |
null | The value may be null. | { "c1":null, "c2":null } |
object | The value may be another JSON object. | { "c1":{"a1":"v1", "a2":"v2", "a3":[1,2,3]}, "c2":{"a1":1, "A2":null, "a3":true}, "c3":{} } |
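The types in Table 1-2 can be checked with a short script. The sketch below uses Python's standard `json` module to show which Python type each JSON type is parsed into, and to test the valid and invalid string examples from the table. The helper `is_valid_json` and the sample field names are hypothetical, chosen for this illustration.

```python
import json

# One value of each JSON type; the raw string keeps the JSON escapes intact.
document = json.loads(r'''
{
  "s": "a \"quoted\" word and a backslash \\",
  "n": -2.5,
  "a": [1, 2.5, "v1"],
  "t": true,
  "f": false,
  "z": null,
  "o": {"a1": "v1", "a3": [1, 2, 3]}
}
''')

# Record the Python type that each JSON value was parsed into.
type_names = {name: type(value).__name__ for name, value in document.items()}
print(type_names)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The escaped and unescaped string examples from Table 1-2.
valid_text = r'{ "c1":"\"", "c2":"\\" }'
invalid_text = r'{ "c1":""", "c2":"\" }'
print(is_valid_json(valid_text), is_valid_json(invalid_text))
```

The parser rejects the unescaped form and accepts the escaped one, matching the rule stated for string literals.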
The JSON document model is well suited to storing unstructured data, because JSON objects can be nested hierarchically to build complex JSON documents. For example, the following is a valid JSON document consisting of a hierarchy of JSON objects.
{
  "c1": "v1",
  "c2": {
    "c21": [1, 2, 3],
    "c22": {
      "c221": "v221",
      "c222": {
        "c2221": "v2221"
      },
      "c223": {
        "c2231": "v2231"
      }
    }
  }
}
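To make the hierarchy concrete, the following sketch walks the nested document and lists the dotted path to every leaf value. It is an illustration written in Python with the standard `json` module; the function `leaf_paths` and the dotted-path notation are assumptions made for this example and are not a Couchbase feature.

```python
import json

nested = json.loads("""
{
  "c1": "v1",
  "c2": {
    "c21": [1, 2, 3],
    "c22": {
      "c221": "v221",
      "c222": {"c2221": "v2221"},
      "c223": {"c2231": "v2231"}
    }
  }
}
""")


def leaf_paths(value, prefix=""):
    """Yield (path, leaf) pairs. Nested objects are walked recursively;
    arrays and scalar values are treated as leaves."""
    if isinstance(value, dict):
        for name, child in value.items():
            path = f"{prefix}.{name}" if prefix else name
            yield from leaf_paths(child, path)
    else:
        yield prefix, value


paths = dict(leaf_paths(nested))
for path, leaf in paths.items():
    print(path, "=", leaf)
```

The deepest values in this document sit four levels down, for example at the path c2.c22.c222.c2221.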
What Is Wrong with SQL?
NoSQL databases were developed as a solution to the following requirements of applications:
An increase in the volume of data stored about users and objects, also termed big data.
An increase in the rate at which new big data arrives.
An increase in the frequency at which the data is accessed.
Fluctuations in data usage.
Increased processing and performance needed to handle big data.
Ultra-high availability.
Data that is unstructured or semi-structured.
SQL-based relational databases were not designed to handle the scalability, agility, and performance requirements of modern applications that access and process big data in real time. While most relational databases provide scalability and high availability as features, Couchbase Server provides higher levels of both. For example, while most relational databases provide replication within a datacenter, Couchbase Server provides Cross Datacenter Replication (XDCR), which replicates data to multiple, geographically distributed datacenters. XDCR is discussed in more detail in a later section. Couchbase Server also provides rack awareness, which traditional relational databases don't.
Big data is growing exponentially. Concurrent users of applications running on the Web have grown from a few hundred or a few thousand to several million. Nor is the load static: new data keeps being added after the initial data has been stored, and a web application accessed by millions of users today is not guaranteed the same number of users for any predictable period. The number of users could drop to a few thousand within a day or a few days.
A relational database is typically based on a single-server architecture, and a single database server is a single point of failure (SPOF). For a highly available database, data must be distributed across a cluster of servers instead of relying on a single server. NoSQL databases provide the distributed, scalable architecture required for big data. "Distributed" implies that data in a NoSQL database is distributed across a cluster of servers; if one server becomes unavailable, another server is used. The "distributed" feature is a provision and not a requirement for a NoSQL database: a small-scale NoSQL database may consist of only one server.
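The idea of distributing data across servers and falling back to another server on failure can be sketched in a few lines. The toy in-memory model below is written in Python under stated assumptions: the class `TinyCluster`, the server names, and the placement rule (CRC32 hash of the key for the primary, the next server in the list for the replica) are all hypothetical and do not describe how Couchbase Server or any other product actually places data.

```python
import zlib


class TinyCluster:
    """Hypothetical in-memory cluster: each key lives on a primary server
    and on one replica server."""

    def __init__(self, server_names):
        self.names = list(server_names)
        self.data = {name: {} for name in self.names}
        self.available = set(self.names)

    def owners(self, key):
        # Hash the key to pick a primary; the next server holds the replica.
        index = zlib.crc32(key.encode("utf-8")) % len(self.names)
        primary = self.names[index]
        replica = self.names[(index + 1) % len(self.names)]
        return primary, replica

    def put(self, key, value):
        # Write to every owner that is currently available.
        for name in self.owners(key):
            if name in self.available:
                self.data[name][key] = value

    def get(self, key):
        # Try the primary first, then fall back to the replica.
        for name in self.owners(key):
            if name in self.available and key in self.data[name]:
                return self.data[name][key]
        raise KeyError(key)

    def fail(self, name):
        # Mark a server as unavailable.
        self.available.discard(name)


# Usage: the document stays readable after its primary server fails.
cluster = TinyCluster(["server-a", "server-b", "server-c"])
cluster.put("catalog1", {"journal": "Oracle Magazine"})
primary, replica = cluster.owners("catalog1")
cluster.fail(primary)
print(cluster.get("catalog1"))
```

The same class also works with a single server name, in which case the primary and the replica are the same server and there is no failover, which mirrors the point that distribution is a provision rather than a requirement.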