Hadoop is a great tool for large-scale data processing, but it brings a lot of complexity to the table,
from hardware requirements to the mental model needed for writing jobs and day-to-day tasks like testing and deployment.
Using Clojure and Amazon EMR offers a great path to overcoming these challenges. This talk will cover:
* Main motivation: why Clojure + Hadoop will make you work faster.
* Clojure Hadoop library: cutting out boilerplate.
* Amazon EMR: an introduction and its main benefits.
* Using Lemur for job launching.
* Performance tuning and benchmarking (using Criterium).
* Cascalog: a declarative query engine.
* Main pitfalls and tips.