Cassandra Architecture and Data Modeling

Cassandra solves a variety of problems in the space of real-time distributed systems. My recent work has centered on real-time event ingestion, monitoring and log collection, and other time-series data. These problems exhibit read and write patterns that line up with some of Cassandra’s natural use cases. In particular, I needed to satisfy heavy write requirements without compromising read efficiency. This does not come for free, however. Although Cassandra can be a very good fit for scaling out your persistence store with relatively low devops overhead, I found that it comes at the cost of more complicated data modeling and application code. With this in mind, and given that I had a team ready to dive into this set of problems, I gave a talk on Cassandra. It was a fairly technical talk that primarily covered architecture and data modeling.