Thursday, May 12, 2011

How to install user-provided jars to Hadoop

If you write a MapReduce program and compile it to a jar, you usually run it with the following command:

./bin/hadoop jar your_jar_name

If you want your jar to be loaded when Hadoop itself starts (e.g. you are adding a new service that Hadoop should initialize and start), you can follow the steps below.

In the file bin/hadoop-config.sh, you can find the following snippet:

for f in $HADOOP_COMMON_HOME/hadoop-*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done

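So this loop only adds jars in $HADOOP_COMMON_HOME whose names start with "hadoop-" to the classpath; a jar with any other name in that directory is not picked up by default.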
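To check which jars that pattern would match in a given directory, something like the following sketch may help. The function name list_default_jars is made up for illustration, and the fallback to the current directory when $HADOOP_COMMON_HOME is unset is an assumption, not something Hadoop does.

```shell
# Hypothetical helper: print the jars in directory $1 that match hadoop-*.jar,
# i.e. the same glob used by the loop in bin/hadoop-config.sh.
list_default_jars() {
  for f in "$1"/hadoop-*.jar; do
    # When nothing matches, the shell leaves the pattern as-is; skip it.
    [ -e "$f" ] && echo "$f"
  done
  return 0
}

# Assumption: fall back to the current directory if HADOOP_COMMON_HOME is unset.
list_default_jars "${HADOOP_COMMON_HOME:-.}"
```

A jar named, say, my_service.jar in the same directory would not appear in the output, which is why it needs the extra line described below.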

Copy your jar into the Hadoop directory ($HADOOP_COMMON_HOME in the line below), then edit bin/hadoop-config.sh and add:

CLASSPATH=${CLASSPATH}:$HADOOP_COMMON_HOME/your_jar_name
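If you prefer the script to complain when the jar is missing rather than silently adding a dead path, a slightly more defensive variant of that line could look like the sketch below. The function name add_jar_to_classpath is hypothetical, and your_jar_name is still a placeholder for your own file name.

```shell
# Hypothetical helper: append jar $1 to CLASSPATH only if the file exists.
add_jar_to_classpath() {
  if [ -f "$1" ]; then
    CLASSPATH="${CLASSPATH}:$1"
  else
    # Warn on stderr instead of adding a path that does not exist.
    echo "warning: $1 not found, not added to CLASSPATH" >&2
  fi
}

# your_jar_name is a placeholder; replace it with the actual jar file name.
add_jar_to_classpath "$HADOOP_COMMON_HOME/your_jar_name"
```

The effect on CLASSPATH is the same as the one-line version when the jar is present; the only difference is the warning when it is not.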
