Apache Cassandra

Wide Column Database Schema Design

Software Engineering

Because relational database management systems have dominated IT for over 30 years, it is hardly surprising that most of the vast literature on database schema design focuses on those traditional systems. However, as NoSQL systems have risen in popularity since 2007, knowledge of relational schema design is no longer sufficient. In this post I am going to outline the differences and the aspects to consider when designing schemas for wide column database systems (especially Apache Cassandra). Before going into any details, I will provide a brief recap of data modelling in general as well as of techniques targeting traditional RDBMS. After reading this post, you will understand why data modelling for NoSQL systems like Apache Cassandra follows different rules and what is necessary to design schemas for those systems.

Software Engineering

From my previous blog post you already know that data written to Apache Cassandra is eventually persisted to so-called Sorted String Table (SSTable) files. In this article I'm going to explain the Apache Cassandra data directory and the SSTable format in more detail.

Software Engineering

As you know, top use cases for Apache Cassandra include user activity tracking, Internet of Things (IoT) applications, product catalogue lookups and time-series-based applications in general. For those use cases in particular, though not only for them, Apache Cassandra can achieve high throughput rates as a database management system. But how is that possible? Let me explain.