Apache Cassandra is a top-level open-source project of the Apache Software Foundation. It was developed at Facebook to power Inbox Search, and Facebook released it as open source in July 2008. Apache accepted it into the Incubator in March 2009 and promoted it to a top-level project in February 2010.
What is a NoSQL database?
A NoSQL ("Not Only SQL") database is designed to handle large amounts of data without the fixed tabular schema that relational databases require. NoSQL databases typically make it easy to replicate data across multiple nodes, support both structured and unstructured data, and provide effective fault tolerance.
What is Apache Cassandra?
Apache Cassandra is one of the most popular NoSQL databases for managing large amounts of structured, semi-structured and unstructured data across multiple data centers and the cloud. Cassandra is a write-friendly NoSQL database that provides linear scalability, continuous availability and operational simplicity across many commodity servers, with no single point of failure.
Some of the most popular and widely used NoSQL databases are Apache HBase, MongoDB and Apache Cassandra.
Features of Cassandra
As a widely used, open-source and highly scalable NoSQL database, Apache Cassandra has the following features:
1) Data distribution
Distributing data across a number of servers (nodes) is at the core of NoSQL databases, and Cassandra does it well: data is automatically distributed among all the nodes participating in a ring, or cluster.
2) Data partitioning
Apache Cassandra makes it easy to partition data across multiple nodes. Partitioning is driven by the structure of the column family: a partition key is defined when the column family (table) is created, and it determines which node stores each row.
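To make the idea of a partition key concrete, here is a minimal Python sketch of how a key could be mapped to a node on a ring. It is an illustration under stated assumptions, not Cassandra's implementation: MD5 stands in for Cassandra's default Murmur3 partitioner, each node owns a single token derived from its name (real clusters often use many virtual nodes per server), and the node names and the `Ring` class are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a 64-bit token (MD5 is a stand-in, not Murmur3)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """A toy token ring: every node owns exactly one token."""

    def __init__(self, nodes):
        # Sort nodes by token so the ring can be searched with bisect.
        self._entries = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, partition_key: str) -> str:
        """Return the first node whose token is >= the key's token."""
        index = bisect.bisect_left(self._tokens, token(partition_key))
        # Past the last token, wrap around to the first node on the ring.
        return self._entries[index % len(self._entries)][1]


ring = Ring(["node-a", "node-b", "node-c"])  # hypothetical node names
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.owner(key))
```

The same key always hashes to the same token, so any node can work out where a row lives without consulting a central directory.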
3) Data replication
Apache Cassandra provides customizable data replication: redundant copies of the same data are kept on two or more nodes in the cluster. If a node goes down, the request can be served from a replica on another node.
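The failover behaviour described above can be sketched in a few lines of Python. This is a simplified model, assuming replicas are placed on consecutive nodes clockwise around the ring (similar in spirit to Cassandra's SimpleStrategy), with MD5 standing in for the real partitioner. The function names, node names and the `down` set are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a 64-bit token (MD5 is a stand-in, not Murmur3)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replicas(partition_key, nodes, replication_factor):
    """Return the nodes holding a copy of the key, walking the ring clockwise."""
    ring = sorted((token(name), name) for name in nodes)
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token(partition_key))
    # A cluster cannot hold more distinct replicas than it has nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def first_live_replica(partition_key, nodes, replication_factor, down):
    """Return the first replica that is not in the `down` set, or None."""
    for node in replicas(partition_key, nodes, replication_factor):
        if node not in down:
            return node
    return None


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
copies = replicas("user:42", nodes, replication_factor=3)
print("replicas:", copies)
# Simulate the primary replica going down: another copy serves the request.
print("served by:", first_live_replica("user:42", nodes, 3, down={copies[0]}))
```

With a replication factor of 3, a read still succeeds when one or two of the replicas are unavailable, which is the fault tolerance the paragraph describes.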
4) Linear Scalability
Cassandra provides linear scalability: the capacity of the cluster increases linearly as more servers (nodes) are added. If a two-node cluster handles 10K requests/sec, then four nodes will handle 20K and eight nodes 40K requests/sec.
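The arithmetic behind those numbers is a simple proportion, shown here as a small Python sketch. It models the idealized case from the text, where every node contributes the same throughput. The function name is hypothetical, and actual results depend on workload and hardware.

```python
def projected_throughput(base_nodes, base_requests_per_sec, target_nodes):
    """Scale a measured throughput linearly to a different cluster size."""
    per_node = base_requests_per_sec / base_nodes
    return per_node * target_nodes


# Baseline from the text: 2 nodes handle 10K requests/sec.
for node_count in (2, 4, 8):
    total = int(projected_throughput(2, 10_000, node_count))
    print(node_count, "nodes ->", total, "requests/sec")
# prints:
# 2 nodes -> 10000 requests/sec
# 4 nodes -> 20000 requests/sec
# 8 nodes -> 40000 requests/sec
```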
5) Easy operations
Cassandra ships with a command-line shell (cqlsh), and language drivers are available for most commonly used programming languages, such as Python, C#/.NET, C++, Ruby, Java, Go and many more.
The Cassandra shell supports CQL (Cassandra Query Language), which has a simple, SQL-like syntax for creating, updating and deleting keyspaces, column families and rows in the cluster.
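As a rough illustration of what CQL looks like, the Python sketch below lists a handful of statements that create, update and delete a keyspace, a table and a row. The `demo` keyspace and `users` table are hypothetical names chosen for this example. To keep the sketch runnable without a cluster, the statements are sent to a small stand-in `RecordingSession` class rather than a live connection. With the DataStax Python driver, the session would instead come from `Cluster(["127.0.0.1"]).connect()`, assuming a node is listening on localhost.

```python
# CQL statements covering the life cycle of a keyspace, a table and a row.
CQL_STATEMENTS = [
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}",
    "CREATE TABLE IF NOT EXISTS demo.users (user_id int PRIMARY KEY, name text)",
    "INSERT INTO demo.users (user_id, name) VALUES (1, 'Alice')",
    "UPDATE demo.users SET name = 'Alicia' WHERE user_id = 1",
    "SELECT user_id, name FROM demo.users WHERE user_id = 1",
    "DELETE FROM demo.users WHERE user_id = 1",
    "DROP KEYSPACE demo",
]


class RecordingSession:
    """Stand-in for a driver session: records statements instead of running them."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)


def run_all(session, statements):
    """Send each CQL statement to the session, in order."""
    for statement in statements:
        session.execute(statement)


# With the real driver (assumption: cassandra-driver installed, node on localhost):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect()
session = RecordingSession()
run_all(session, CQL_STATEMENTS)
print(len(session.executed), "statements sent")
```

The same statements can be typed directly into the Cassandra shell, each terminated with a semicolon.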
From an architectural point of view, Cassandra is built for fast, heavy write workloads and a comparatively low number of queries.