Me on Hadoop setup at StyleFeeder, part 2
This post on the StyleFeeder tech blog is a HOWTO for taking a Cloudera Hadoop distribution in the 0.20 series, patching it for yourself, and running a Hadoop cluster on EC2 based on it.
My heartfelt congratulations to my clients and colleagues at StyleFeeder, which is being acquired by Time, Inc. Time is getting a tremendous asset, technology that will give them an edge, and top talent.
My colleagues and clients at StyleFeeder are good enough to let me post on their tech blog from time to time. I'm exploring Hadoop on their behalf, as partially described here: http://seventhfloor.whirlycott.com/2010/01/14/hadoop-for-the-lone-analyst/. That post is basically a HOWTO for Hadoop 0.20 + Apache logs + MySQL on EC2, with tips on streaming, compression, Pig, Red Hat/CentOS, and the Cloudera Python scripts for EC2.