Hadoop cheat sheet

Installing and configuring Hadoop

  • Installing Hadoop on Ubuntu Linux, an excellent tutorial by Michael Noll.
  • Installing Hadoop on Mac OS X. I tried this on Snow Leopard on a PowerBook and it worked fine, with the following caveats: check it against Michael's tutorial and configure core-site.xml (referred to as hadoop-site.xml), mapred-site.xml, and hdfs-site.xml. I forgot the last two, and only the data node and secondary name node started. Remember that you can use the jps command from the shell to see which Java processes are running.

A list of useful Hadoop-related commands that I keep forgetting

  • Starting core as hadoop user: $<HADOOP INSTALL DIR>/bin/start-all.sh
  • Stopping core as hadoop user: $<HADOOP INSTALL DIR>/bin/stop-all.sh
  • Starting ZooKeeper: $<ZOOKEEPER INSTALL DIR>/bin/zkServer.sh start
  • Stopping ZooKeeper: $<ZOOKEEPER INSTALL DIR>/bin/zkServer.sh stop
  • Starting HBase: $<HBASE INSTALL DIR>/bin/start-hbase.sh
  • Stopping HBase: $<HBASE INSTALL DIR>/bin/stop-hbase.sh
  • Useful command to see what is running: $jps
  • (Re)formatting HDFS: $<HADOOP INSTALL DIR>/bin/hadoop namenode -format
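
As a quick sanity check after start-all.sh, the jps output can be compared against the daemons you expect to see. This is only a minimal sketch and not part of my original notes: missing_daemons is a made-up helper name, and the daemon list in the usage comment assumes a single-node setup where NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker should all be up.

```shell
# missing_daemons is a hypothetical helper, not something Hadoop ships.
# $1 is jps-style output ("<pid> <name>" per line); the remaining
# arguments are the daemon names expected to be running.
missing_daemons() {
  jps_output=$1
  shift
  missing=""
  for daemon in "$@"; do
    # Compare against the second column only, matching the whole name,
    # so that NameNode is not satisfied by SecondaryNameNode.
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF "$daemon"; then
      missing="$missing $daemon"
    fi
  done
  # Strip the leading space left by the loop.
  printf '%s\n' "${missing# }"
}

# Typical use (assumes jps is on the PATH):
#   missing_daemons "$(jps)" NameNode DataNode SecondaryNameNode JobTracker TaskTracker
```

An empty result means everything on the list showed up. In the situation described above, where I forgot mapred-site.xml and hdfs-site.xml, it would have listed the daemons that never started.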

Important paths

Hadoop class path

Ran into problems today trying to run a jar in Hadoop that needed to reference the HBase jar and the ZooKeeper jar. To get it to work I had to edit hadoop-env.sh. The change goes just below the comment line

# Extra Java CLASSPATH elements.  Optional

and the change was

export HADOOP_CLASSPATH=/usr/local/hbase/hbase-0.20.2.jar:/usr/local/zookeeper/zookeeper-3.2.2.jar

which simply adds the paths of the HBase jar and the ZooKeeper jar to Hadoop's class path.
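
If the list of jars grows, typing the colon-separated string by hand gets error prone. Here is a small sketch of one way to build it. join_classpath is a made-up helper name, and the jar paths are simply the ones from my install, so adjust them to yours.

```shell
# join_classpath is a hypothetical helper, not something Hadoop ships.
join_classpath() {
  # Inside the subshell, "$*" joins the arguments using the first
  # character of IFS, which is set to ':' here.
  ( IFS=:; printf '%s\n' "$*" )
}

# Jar locations taken from the export line above.
HADOOP_CLASSPATH=$(join_classpath \
  /usr/local/hbase/hbase-0.20.2.jar \
  /usr/local/zookeeper/zookeeper-3.2.2.jar)
export HADOOP_CLASSPATH

echo "$HADOOP_CLASSPATH"
```

The subshell keeps the IFS change from leaking into the rest of hadoop-env.sh.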

Building a Hadoop-compatible jar from NetBeans

Right, I couldn't build a jar that would work with Hadoop from NetBeans 6.7. After consulting the most knowledgeable Java developer I know, we got to the bottom of it, sort of. It turns out that the default manifest NetBeans creates seems to cause issues, including all sorts of inexplicable errors. The solution was simply to delete manifest.mf from the root of the project. Phew. I might look into it further at some point; I am still working on a workflow based on NetBeans.
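
One thing worth checking when a jar misbehaves is what its manifest actually declares. The hadoop jar command uses the Main-Class entry if the manifest has one, and otherwise takes the class name from the command line, so a manifest pointing at an unexpected class could plausibly produce confusing errors. Whether that was the cause in my case is a guess. The sketch below only shows how to look; main_class_from_manifest is a made-up helper name and dist/MyJob.jar is a hypothetical path.

```shell
# main_class_from_manifest is a hypothetical helper.
# It reads manifest text on stdin and prints the Main-Class value,
# or nothing if the manifest does not declare one.
main_class_from_manifest() {
  # Manifests may use CRLF line endings, so drop carriage returns first.
  tr -d '\r' | sed -n 's/^Main-Class: *//p'
}

# Typical use (assumes unzip is installed; the jar path is hypothetical):
#   unzip -p dist/MyJob.jar META-INF/MANIFEST.MF | main_class_from_manifest
```

If this prints a class you did not expect, that is at least a lead to follow before deleting the manifest outright.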

Important port numbers

ZooKeeper client port = 2181

HBase master port = 60000

