Posts

Showing posts from September, 2013

Real time data processing with Cassandra, Part 1

This is the first part of a getting-started guide to real-time data processing with Cassandra. In this part I describe how to configure Hadoop, Hive and Cassandra, and run some ad hoc queries using the new CqlStorageHandler. In the second part I will show how to use Shark and Spark for fast, real-time data processing with Cassandra.

I was encouraged by a blog post from DataStax, which you can find here. All the credit goes to the author of the cassandra-handler library and to Alex Lui for developing the CQLCassandraStorage. Of course, you can use DataStax Enterprise for the first part, since it has built-in support for Hive and Hadoop, but in this post I will use only native Apache products. If you are interested in real-time data processing, please check this blog.

In this first part I will use the following products:
1) Hadoop 1.2.1 (single-node cluster)
2) Hive 0.9.0
3) Cassandra 1.2.6 (single-node cluster)
4) cassandra-handler 1.2.6 (depends on Hive