Surge 2011 ~ Hybrid data storage: finding balance.

Posted in Databases, Development, Conferences on February 22, 2012



Over the past several years, Clearspring has developed custom distributed processing and storage systems to handle the billions of views our web products receive per day. A central part of these systems is a tree-based storage structure that fills a useful middle ground between the data-model-centric view of row-oriented databases and the query-centric view more common with column-oriented ones. This presentation will cover the key components of our architecture and some of the trade-offs we have faced: sharding/scatter-gather approaches versus more sophisticated distributed approaches, statistical approximation versus exact answers, custom solutions versus adapting off-the-shelf components, and latency of updates versus just about everything. In particular, we hope to share how these trade-offs have changed and how we have adapted over time.
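The abstract does not describe the tree structure in detail, so the following is only a rough, hypothetical sketch of the general idea of a tree-based aggregate store, not Clearspring's actual implementation. It assumes each event is a path of dimensions (here date, domain, and page, all made up for illustration) and that every node along the path keeps a running count. The names `Node`, `record`, and `lookup` are invented for this example.

```python
from collections import defaultdict


class Node:
    """One tree node: a running count plus children keyed by the next path element."""

    def __init__(self):
        self.count = 0
        self.children = defaultdict(Node)


def record(root, path):
    """Increment the count on the root and on every node along `path`."""
    node = root
    node.count += 1
    for key in path:
        node = node.children[key]
        node.count += 1


def lookup(root, path):
    """Return the count stored at `path`, or 0 if that path was never recorded."""
    node = root
    for key in path:
        # Check membership first so a read never creates empty nodes.
        if key not in node.children:
            return 0
        node = node.children[key]
    return node.count


# Hypothetical view events: (date, domain, page).
root = Node()
events = [
    ("2011-09-28", "example.com", "/home"),
    ("2011-09-28", "example.com", "/about"),
    ("2011-09-28", "example.org", "/home"),
    ("2011-09-29", "example.com", "/home"),
]
for event in events:
    record(root, event)

views_on_day = lookup(root, ("2011-09-28",))
views_on_day_for_domain = lookup(root, ("2011-09-28", "example.com"))
```

In a layout like this, the order of the path levels is chosen around the queries one expects to ask: any prefix of the path is answered by a single walk down the tree, while a question that cuts across the hierarchy (say, all views of `/home` regardless of date) would need a full traversal or a second tree with a different level order.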


Tags: Big Data, Apache Cassandra, Kafka, JVM, surge, conference, OmniTI