As someone who once cut my teeth in quantitative finance, I can tell you that a huge fraction of 'data science' is data collection. Yes, it's unsexy, but it's hugely important: if your data is bad, you have no chance of consistently drawing good conclusions from it. That was the thought behind the Fluentd project: we realized that the reality of data collection (especially for log data) is messy, and we wanted to clean it up.
In this talk, I plan to code live onstage to show how easy it is to get started with Fluentd by creating a useful data collection pipeline.
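To give a flavor of what "getting started" looks like, here is a minimal sketch of a Fluentd pipeline that tails an application log and prints each event to stdout. The file path, pos_file location, and `app.access` tag are hypothetical placeholders, not part of the talk itself:

```
# Read new lines from an application log as they are written.
<source>
  @type tail
  path /var/log/app/app.log          # hypothetical log file
  pos_file /var/log/fluentd/app.pos  # remembers read position across restarts
  tag app.access
  <parse>
    @type json                       # assumes one JSON object per line
  </parse>
</source>

# Route everything tagged app.* to stdout for inspection.
<match app.**>
  @type stdout
</match>
```

Once events flow through a pipeline like this, swapping the `stdout` output for a real destination (a database, object storage, a search index) is a matter of changing the `<match>` block.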