Hadoop cheat sheet
Installing and configuring Hadoop
- Installing Hadoop on Ubuntu Linux, an excellent tutorial by Michael Noll.
- Installing Hadoop on Mac OS X. I tried this on Snow Leopard on a PowerBook and it worked fine, with the following caveats: check it against Michael's tutorial and configure core-site.xml (referred to as hadoop-site.xml), mapred-site.xml, and hdfs-site.xml. I forgot the last two and only the data node and secondary name node started. Remember that you can use the jps command from the shell to see which Java processes are running.
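To make the jps check less of an eyeball job, here is a minimal sketch of a helper that reads jps-style output and prints the daemons that are missing. The function name missing_daemons is my own invention, and the list of five daemons is an assumption for a single-node setup of this Hadoop generation, so adjust it to your install.

```shell
# Print each expected Hadoop daemon that does NOT appear in jps-style
# output read from stdin (lines of "<pid> <ClassName>").
# Assumption: the daemon list below matches a single-node setup.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second column exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Run it as: jps | missing_daemons. No output means all five names were found.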
A list of useful Hadoop-related commands that I keep forgetting
- Starting core as hadoop user: $<HADOOP INSTALL DIR>/bin/start-all.sh
- Stopping core as hadoop user: $<HADOOP INSTALL DIR>/bin/stop-all.sh
- Starting ZooKeeper: $<ZOOKEEPER INSTALL DIR>/bin/zkServer.sh start
- Stopping ZooKeeper: $<ZOOKEEPER INSTALL DIR>/bin/zkServer.sh stop
- Starting HBase: $<HBASE INSTALL DIR>/bin/start-hbase.sh
- Stopping HBase: $<HBASE INSTALL DIR>/bin/stop-hbase.sh
- Useful command to see what is running: $jps
- (Re)formatting HDFS: $<HADOOP INSTALL DIR>/bin/hadoop namenode -format (this wipes the existing HDFS metadata, so anything stored there is lost)
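Since I keep forgetting the commands, a small wrapper helps. This is only a sketch: the function name stack_commands and the three *_DIR variables are made up for illustration, the default paths are the ones used elsewhere in this post, and the order (Hadoop, then ZooKeeper, then HBase, reversed when stopping) is my assumption based on HBase needing the other two. It only prints the commands, so you can review them before running anything.

```shell
# Install locations; the variable names are illustrative.
HADOOP_DIR=${HADOOP_DIR:-/usr/local/hadoop}
ZOOKEEPER_DIR=${ZOOKEEPER_DIR:-/usr/local/zookeeper}
HBASE_DIR=${HBASE_DIR:-/usr/local/hbase}

# Print the commands for "start" or "stop", one per line, without running them.
stack_commands() {
    case "$1" in
        start)
            echo "$HADOOP_DIR/bin/start-all.sh"
            echo "$ZOOKEEPER_DIR/bin/zkServer.sh start"
            echo "$HBASE_DIR/bin/start-hbase.sh"
            ;;
        stop)
            # Reverse order: HBase first, Hadoop last.
            echo "$HBASE_DIR/bin/stop-hbase.sh"
            echo "$ZOOKEEPER_DIR/bin/zkServer.sh stop"
            echo "$HADOOP_DIR/bin/stop-all.sh"
            ;;
        *)
            echo "usage: stack_commands start|stop" >&2
            return 1
            ;;
    esac
}
```

Once the printed list looks right, pipe it to a shell as the hadoop user: stack_commands start | sh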
Important paths
- Hadoop install path: /usr/local/hadoop
- Web UI for MapReduce job tracker(s)
- Web UI for task tracker(s)
- Web UI for HDFS name node
Hadoop class path
I ran into problems today trying to run a jar in Hadoop that needed to reference the HBase jar and the ZooKeeper jar. To get it to work I had to change hadoop-env.sh. The change was to the entry under the comment
# Extra Java CLASSPATH elements. Optional
and the change was
export HADOOP_CLASSPATH=/usr/local/hbase/hbase-0.20.2.jar:/usr/local/zookeeper/zookeeper-3.2.2.jar
which simply adds the paths of the HBase jar and the ZooKeeper jar to Hadoop's class path.
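A typo in one of those paths fails quietly until a job cannot find a class, so here is a minimal sketch of a helper that joins jar paths with colons and warns about any file that does not exist. The function name build_classpath is hypothetical; it is not part of Hadoop.

```shell
# Join the given jar paths with ':' and warn on stderr about missing files.
build_classpath() {
    classpath=""
    for jar in "$@"; do
        [ -f "$jar" ] || echo "warning: $jar not found" >&2
        # Add a ':' separator only when classpath is already non-empty.
        classpath="${classpath:+$classpath:}$jar"
    done
    printf '%s\n' "$classpath"
}
```

With the function defined in hadoop-env.sh, the line above could then be written as: export HADOOP_CLASSPATH=$(build_classpath /usr/local/hbase/hbase-0.20.2.jar /usr/local/zookeeper/zookeeper-3.2.2.jar)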
Building a Hadoop-compatible jar from NetBeans
Right, I couldn’t build a jar that would work with Hadoop from NetBeans 6.7. After consulting the most knowledgeable Java developer I know, we got to the bottom of it, sort of. It turns out the default manifest that NetBeans creates seems to cause issues, all sorts of inexplicable errors in fact. So the solution was simply to delete manifest.mf from the root of the project. Phew. I might look into it further at some point; I’m still working on a workflow based on NetBeans.
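I have not confirmed the cause, but one thing worth checking is whether the jar's manifest names a main class: when it does, hadoop jar runs that class and treats everything after the jar name as program arguments, which can look very confusing if you also pass a class name. The sketch below pulls the Main-Class value out of manifest text; the function name manifest_main_class is made up, and it does not handle values wrapped onto a continuation line.

```shell
# Read manifest text on stdin and print the Main-Class value, if any.
manifest_main_class() {
    # Manifests may use CRLF line endings, so strip carriage returns first.
    tr -d '\r' | sed -n 's/^Main-Class: *//p'
}
```

Feed it the manifest from a built jar, for example: unzip -p dist/MyJob.jar META-INF/MANIFEST.MF | manifest_main_class (the jar name here is just a placeholder). No output means no main class is set.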
Important port numbers
ZooKeeper client port = 2181
HBase master port = 60000
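A quick way to see whether anything is listening on those ports is sketched below. It assumes bash (it relies on bash's /dev/tcp redirection) and that the services run on localhost; the function names port_open and report_ports are my own.

```shell
# Return 0 if a TCP connection to host:port succeeds (bash-only: /dev/tcp).
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Report the state of each "name:port" pair given as an argument.
report_ports() {
    for pair in "$@"; do
        name=${pair%%:*}
        port=${pair##*:}
        if port_open localhost "$port"; then
            echo "$name ($port): open"
        else
            echo "$name ($port): closed"
        fi
    done
}
```

For the two ports above: report_ports zookeeper:2181 hbase-master:60000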
About this entry
You’re currently reading “Hadoop cheat sheet,” an entry on random()
- Published: 24.01.10 / 6pm
- Category: Hadoop