Goal

Goal: How can we build an ultra-high performance, low-latency multi-table database that can grow incrementally to handle massive-scale data?

Cassandra is a large-scale open-source NoSQL wide-column distributed database. It was developed at Facebook and its design was influenced by Amazon Dynamo and Google Bigtable.

image

Abstract

Introduction

image

Cassandra explicitly chooses not to implement operations that require cross-partition coordination as they are typically slow and hard to provide highly available global semantics. For example, Cassandra does not support:

Wide-columns storage

NoSQL databases are often designed as wide-column data stores.

The concept of wide columns changes the way columns are used in a data store. With a traditional database, the programmer is aware of the columns (fields) within a table. A wide-column table, on the other hand, may contain different columns for different rows and the programmer may need to iterate over their names.

image

For example a user might have multiple phone numbers that, instead of being stored in a separate table with foreign keys that identify the owner of each number, are co-located with the user’s information. Each number is a new column (e.g., phone-1, phone-2, …). There’s also no limit to the number of columns in a row.

image

Wide-column data stores such as Bigtable and Cassandra essentially have no limit on the number of columns that can be associated with a row. In the case of Cassandra, the limit is essentially the storage capacity of the node.

image