NoSQL Databases Venkata Suraj Kongara(K00327390) Parasuram Reddy kalluri(K00379540)

Dinesh Javvaji(K00381837)

 

Introduction:

NoSQL (not only SQL) refers to a class of non-relational database management
systems. They are used for fast data retrieval and are portable. NoSQL can be
applied to unstructured, semi-structured and schema-less data. Many NoSQL
databases are open source, distributed in nature, and offer high performance
because they scale horizontally. A non-relational database does not enforce a
fixed schema and does not organize its data in related tables (i.e., data is
stored in a non-normalized way). Distributed means that data is spread across
multiple machines and managed by those machines, which relies on the idea of
data replication.
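
The idea of spreading data across machines and keeping replicas can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular NoSQL product works: the node names, the replication factor of 2, and the helper functions `owners`, `put` and `get` are all assumptions made for the example.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2"]  # hypothetical machine names
REPLICATION_FACTOR = 2                  # assumed number of copies per key


def owners(key, nodes=NODES, rf=REPLICATION_FACTOR):
    """Return the nodes that hold a copy of `key`, primary first."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


def put(cluster, key, value):
    """Write the value to every replica that owns the key."""
    for node in owners(key):
        cluster[node][key] = value


def get(cluster, key, down=()):
    """Read from the first owning replica that is not marked as down."""
    for node in owners(key):
        if node not in down:
            return cluster[node].get(key)
    raise RuntimeError("all replicas for this key are unavailable")


cluster = {node: {} for node in NODES}  # each node is just a dict here
put(cluster, "user:42", "Asha")
primary = owners("user:42")[0]
# The value is still readable when the primary copy is unreachable.
value_after_failure = get(cluster, "user:42", down={primary})
```

Adding a machine to `NODES` increases capacity, which is the horizontal scaling described above; a real system would use a scheme such as consistent hashing so that adding a node moves as few keys as possible.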

CAP Theorem:

Eric Brewer's CAP theorem provided much of the theoretical motivation
behind NoSQL databases. CAP stands for Consistency, Availability and
Partition Tolerance. The theorem claims that “in a distributed system when there is an inevitable
network partition (and the cluster breaks into two or more “islands”), you
can’t guarantee both availability (for updates) and consistency.” According to this, no distributed
system can guarantee C, A and P simultaneously.

·     Consistency: All the nodes in the distributed
system see the same data. The system is said to be consistent if a
transaction (read or write) starts with the system in a consistent state and
ends with the system in a consistent state. In this model, if an error pushes
the system into an inconsistent state during a transaction, the system is
rolled back to a consistent state. Examples are SQL databases such as MySQL
and PostgreSQL.

·     Availability: In a distributed system, if the
system is 100% operational all the time then we have achieved availability. Every
client gets a response regardless of the state of any individual node in the
system. Examples are SQL databases such as MySQL and PostgreSQL. So, we can
say that relational databases come under the CA category. Document-oriented
databases like Elasticsearch also fall under the CA category.

·     Partition Tolerance: If a system is partition tolerant, it can sustain
any amount of network failure that does not result in failure of the entire
network. Data is replicated across combinations of nodes and networks to keep
the system running through network failures. Examples of storage systems that
come under the CP umbrella are Redis and MongoDB. The storage systems that
come under the AP umbrella are Cassandra, CouchDB and DynamoDB.
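
The trade-off between the CP and AP categories can be made concrete with a toy model. The sketch below is an assumption-laden illustration and not the behavior of any real database: `Replica`, `Cluster`, and the `mode` and `partitioned` attributes are hypothetical names, and the "cluster" is only two in-memory dictionaries.

```python
class Replica:
    """One node of the toy cluster, holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two replicas that behave as 'CP' or 'AP' when partitioned."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP" (assumed labels)
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False    # True simulates a network partition

    def write(self, replica, key, value):
        peer = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the update rather than let replicas diverge.
                raise RuntimeError("unavailable: peer unreachable")
            # AP: accept the update locally; the replicas now disagree.
            replica.data[key] = value
            return
        # No partition: the update reaches both replicas.
        replica.data[key] = value
        peer.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


ap = Cluster("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap.write(ap.a, "x", 2)  # accepted, but replica b still holds the old value
```

In the AP cluster the write succeeds and the two replicas return different values for `x` until the partition heals; the same write on a `Cluster("CP")` raises an error, which is the loss of availability the theorem describes.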

 

Types of NoSQL Databases:

·     Key-Value Databases: This is the simplest of all the types of NoSQL.
Data is stored in the form of key-value pairs. Stored values can be of any
type, such as JSON, a string, or a text document, and each value is accessed
by its key. Every value must have a unique key, which is a drawback of
key-value databases because a unique key has to be generated for every value.
In terms of the CAP theorem, many of these databases favor availability and
partition tolerance at the cost of consistency. Examples are Redis, Riak, and
BerkeleyDB.

·     Document Store Databases: Data is stored in the form of documents.
These are semi-structured data stores. The data is stored as key-value pairs,
similar to key-value data stores; the only difference is that the stored
values have some structural encoding such as BSON (a binary encoding of
JSON), JSON (JavaScript Object Notation), or XML. Because of this structure,
data can be retrieved by key or by the fields inside a document. Examples are
Couchbase and MongoDB.

·     Column Store Databases: Data is stored by column rather than by row.
In column-oriented databases, related columns and their values are grouped
together into column families. This lets queries over large tables run
faster, since only the columns that are needed are read. Examples are
Cassandra, HBase, and Google Bigtable.

·     Graph Databases: Data is represented as a graph. The data is stored in
nodes, and edges are used to connect the nodes. Because of this graph
structure, it supports richer representations of data relationships. Both
nodes and relationships have defined properties, and the relationships
between nodes are shown by directed edges. Examples are IBM Graph, Neo4j and
Titan.
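
To compare the four types side by side, the same small dataset can be laid out in each model using plain Python structures. This is a simplified sketch: the keys, field names, and the helpers `cities` and `followed_by` are made up for illustration, and real products add indexing, persistence and distribution on top of these shapes.

```python
import json

# Key-value: an opaque value that can only be fetched by its unique key.
kv_store = {"user:1": '{"name": "Asha", "city": "Pune"}'}

# Document: the value is structured, so fields inside it can be inspected.
doc_store = {"user:1": {"name": "Asha", "city": "Pune"}}

# Column-oriented (simplified): values of the same column are kept together.
column_store = {
    "name": {"user:1": "Asha", "user:2": "Ravi"},
    "city": {"user:1": "Pune", "user:2": "Delhi"},
}

# Graph: nodes with properties plus directed, labelled edges.
nodes = {"user:1": {"name": "Asha"}, "user:2": {"name": "Ravi"}}
edges = [("user:1", "FOLLOWS", "user:2")]


def cities(store):
    """Scan a single column without touching the other columns."""
    return sorted(set(store["city"].values()))


def followed_by(user):
    """Follow outgoing FOLLOWS edges from one node."""
    return [dst for src, label, dst in edges
            if src == user and label == "FOLLOWS"]


# The key-value store has to decode the whole value to see one field.
kv_city = json.loads(kv_store["user:1"])["city"]
# The document store can address the field directly.
doc_city = doc_store["user:1"]["city"]
```

The difference lies in what each layout makes cheap: lookup by key, access to fields within a value, scanning one column across many records, or traversing relationships.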

SQL vs NoSQL:

·     Speed: SQL requires a higher degree of normalization,
i.e. the data is broken down into small relational tables to avoid
redundancy and duplication of data. This helps manage data efficiently, but
having several tables reduces the performance of data processing because
queries must join them. In NoSQL, data is stored in a denormalized way: it
may be duplicated, and it is rarely split across tables but is stored as a
single entity. So read and write operations on a single entity are easier and
faster.

·     DB types: SQL databases can be open source or closed source, depending
on the commercial vendor. NoSQL databases can be categorized by the way they
store data, as key-value store, document store, column store or graph store
databases.

·     Data Recovery: In the event of a failure, NoSQL databases can recover
data easily, because the data is unstructured and stored in the form of
self-contained documents.
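
The speed argument above can be illustrated by reading the same information from a normalized layout and from a single stored entity. This is a minimal sketch using Python's built-in `sqlite3` module for the relational side and a plain dictionary standing in for a document store; the table names, keys and sample data are assumptions for the example, and it does not measure actual performance.

```python
import json
import sqlite3

# Normalized (SQL): customer and orders live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item TEXT
    );
    INSERT INTO customers VALUES (1, 'Asha');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'monitor');
""")

# Reading one customer with their orders requires a join.
rows = conn.execute(
    """
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE c.id = ?
    ORDER BY o.id
    """,
    (1,),
).fetchall()

# Denormalized (NoSQL style): the whole entity is stored under one key.
store = {
    "customer:1": json.dumps({"name": "Asha",
                              "orders": ["keyboard", "monitor"]}),
}

# Reading the same information is a single lookup.
entity = json.loads(store["customer:1"])
```

The denormalized read avoids the join, but the customer name would have to be updated in every entity that duplicates it, which is the redundancy that normalization is designed to prevent.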

Conclusion:

            Due to the tremendous rise in the
use of the internet, companies such as Google and Facebook faced real-time
problems while handling huge amounts of data. We are entering an era of
polyglot persistence, an approach that uses different data storage
technologies to handle varying data storage needs. Polyglot persistence can
apply across an enterprise or within a single application.