Installing Cloudera’s distribution of Hadoop on a MacBook Pro running OS X is not difficult if you get the steps right.
Go to: https://ccp.cloudera.com/display/SUPPORT/CDH3+Downloadable+Tarballs and download the Hadoop tarball.
We are going to run Hadoop in pseudo-distributed mode, where every daemon runs on the one machine, which is handy for a dev environment.
Open up a Terminal window, and run:
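Something along these lines should do it. This is only a sketch: the tarball name, the download location, and the `~/hadoop` install directory are assumptions, so substitute whatever you actually downloaded and wherever you want it to live. `unpack_hadoop` is just a helper name made up for this example.

```shell
# Assumed paths: change these to match your download and preferred location.
TARBALL="${TARBALL:-$HOME/Downloads/hadoop-0.20.2-cdh3u0.tar.gz}"
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"

# Unpack a tarball ($1) into a directory ($2), dropping the archive's
# top-level folder so that bin/ and conf/ sit directly under $2.
unpack_hadoop() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2" --strip-components=1
}

if [ -f "$TARBALL" ]; then
  unpack_hadoop "$TARBALL" "$HADOOP_HOME"
  cd "$HADOOP_HOME"
else
  echo "Tarball not found: $TARBALL" >&2
fi
```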
Now we need to edit two files so that Hadoop knows where to write its data. First, decide where that data should live. I did:
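For example, assuming a directory called `hadoop-data` in your home folder (the name is purely illustrative; any directory you can write to will do):

```shell
# Assumed location for Hadoop's data; pick any writable directory.
HADOOP_DATA_DIR="${HADOOP_DATA_DIR:-$HOME/hadoop-data}"
mkdir -p "$HADOOP_DATA_DIR"
```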
Edit core-site.xml (in the conf directory):
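A minimal sketch of what the file could contain, written out from the shell. The data directory and port 8020 are assumptions, and note that this overwrites any existing `core-site.xml`, so you may prefer to paste the two properties in by hand. `fs.default.name` tells clients where the NameNode is, and `hadoop.tmp.dir` is the base directory under which Hadoop keeps its files by default.

```shell
# Assumed paths: conf/ inside the unpacked Hadoop directory, data in ~/hadoop-data.
CONF_DIR="${CONF_DIR:-conf}"
HADOOP_DATA_DIR="${HADOOP_DATA_DIR:-$HOME/hadoop-data}"

mkdir -p "$CONF_DIR"
# Overwrites core-site.xml with a two-property configuration.
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$HADOOP_DATA_DIR</value>
  </property>
</configuration>
EOF
```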
Next, edit hdfs-site.xml, also in the conf directory:
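Again as a sketch only, with the same assumed data directory: on a single machine there is just one DataNode, so a replication factor of 1 makes sense, and `dfs.name.dir` and `dfs.data.dir` pin down where the NameNode and DataNode store their files. The `name` and `data` subdirectory names are my own choice here. As before, this overwrites the existing file.

```shell
# Assumed paths: conf/ inside the unpacked Hadoop directory, data in ~/hadoop-data.
CONF_DIR="${CONF_DIR:-conf}"
HADOOP_DATA_DIR="${HADOOP_DATA_DIR:-$HOME/hadoop-data}"

mkdir -p "$CONF_DIR"
# Overwrites hdfs-site.xml with single-node settings.
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>$HADOOP_DATA_DIR/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>$HADOOP_DATA_DIR/data</value>
  </property>
</configuration>
EOF
```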
Finally, format HDFS and start up the daemons:
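Roughly like this, assuming Hadoop was unpacked into `~/hadoop` (adjust `HADOOP_HOME` if not). `start_hadoop` is a made-up wrapper for this example; the two commands inside it are the ones that matter. Formatting is a one-time step, and if the NameNode directory already exists it will ask for confirmation before wiping it.

```shell
# Assumed install location; change if you unpacked Hadoop elsewhere.
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"

# Format HDFS, then start all daemons, from the Hadoop directory given as $1.
start_hadoop() {
  (
    cd "$1" &&
    bin/hadoop namenode -format &&  # initialise the NameNode's storage directory
    bin/start-all.sh                # start the HDFS and MapReduce daemons
  )
}

if [ -x "$HADOOP_HOME/bin/hadoop" ]; then
  start_hadoop "$HADOOP_HOME"
else
  echo "No Hadoop install found at $HADOOP_HOME" >&2
fi
```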
If you are typing in your password a lot, try this (assuming you have your SSH keys set up):
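One way to do this, sketched under the assumption that your public key is `~/.ssh/id_rsa.pub`, is to authorize your own key for logins to localhost, since the start scripts use ssh to launch each daemon. `authorize_key` is a hypothetical helper name; it appends the key only if it is not already listed.

```shell
# Assumed key locations; adjust if your key pair lives elsewhere.
PUBKEY="${PUBKEY:-$HOME/.ssh/id_rsa.pub}"
AUTH_KEYS="${AUTH_KEYS:-$HOME/.ssh/authorized_keys}"

# Append the public key in $1 to the authorized_keys file in $2, once.
authorize_key() {
  mkdir -p "$(dirname "$2")"
  touch "$2"
  # -x -F: match the whole key line literally; append only when absent.
  grep -qxF -f "$1" "$2" || cat "$1" >> "$2"
  chmod 600 "$2"
}

if [ -f "$PUBKEY" ]; then
  authorize_key "$PUBKEY" "$AUTH_KEYS"
else
  echo "No public key found at $PUBKEY" >&2
fi
```

After that, `ssh localhost` should log you in without asking for your account password, provided Remote Login is enabled in the Sharing preferences.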
If you have upgraded to OS X Lion (v10.7), you might see an extra message printed every time you run a Hadoop command. You can ignore it. It seems to have something to do with Kerberos authentication, but I don’t yet have a way to get rid of it.