In this post, I will explain how to configure Hadoop and run MapReduce programs.

 

First, install the Java SDK (JDK). You can download the SDK here.

Once Java is installed, set the JAVA_HOME environment variable, then check that Java is installed properly by running java -version.
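As a rough sketch, the two steps above might look like this in a terminal. The path /usr/lib/jvm/default-java is only an assumption (it is a common location on Debian-style systems), so replace it with wherever your JDK actually lives.

```shell
# Assumed JDK location; replace with the path where your JDK is installed.
# If JAVA_HOME is already set, the existing value is kept.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export PATH="$JAVA_HOME/bin:$PATH"

# Print the Java version if the java binary is reachable.
if command -v java >/dev/null 2>&1; then
  java -version
else
  echo "java not found on PATH; check JAVA_HOME" >&2
fi
```

To make the setting permanent, you would typically put the two export lines in your shell profile.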

Download Apache Hadoop 1.1.1 here

Open your terminal and navigate to the folder where you downloaded the Hadoop build.

Create a working directory named Hadoop and copy the build into it.

Now extract the build.
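The copy and extract steps can be sketched as a small shell helper. The function name unpack_hadoop is hypothetical, and the sketch assumes the download is a gzipped tarball such as hadoop-1.1.1.tar.gz.

```shell
# Hypothetical helper: copy a Hadoop tarball into a working directory and unpack it there.
# Usage: unpack_hadoop <tarball> <workdir>
unpack_hadoop() {
  tarball=$1
  workdir=$2
  mkdir -p "$workdir" || return 1          # create the working directory
  cp "$tarball" "$workdir/" || return 1    # copy the build into it
  tar -xzf "$workdir/$(basename "$tarball")" -C "$workdir"   # extract in place
}
```

For example, unpack_hadoop hadoop-1.1.1.tar.gz "$HOME/Hadoop" should leave the extracted build in $HOME/Hadoop/hadoop-1.1.1.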

 


Once the files are extracted, change your directory to hadoop-1.1.1.

Open hadoop-env.sh, which is in the conf/ directory.

 

 

In hadoop-env.sh, set JAVA_HOME to your Java installation path and save the file (:w! in vi).
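For illustration, the relevant line in hadoop-env.sh might end up looking like the excerpt below. The JDK path is an assumption, so use the same value as your own JAVA_HOME.

```shell
# conf/hadoop-env.sh (excerpt)
# Assumed JDK path; use the same value as your own JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/default-java
```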

 

That’s it. Now you can run your MapReduce jobs!

General method to run a MapReduce job with Hadoop Streaming:

bin/hadoop jar contrib/streaming/hadoop-streaming-1.1.1.jar -mapper script/<mapper-file> -reducer script/<reducer-file> -input <input_directory> -output <output_directory>
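With streaming, the mapper and reducer can be any programs that read lines from standard input and write tab-separated key/value lines to standard output. Here is a minimal word-count sketch. The mapper and reducer function names are hypothetical, and the final pipeline is only a local dry run that imitates the map, sort, and reduce stages without a cluster.

```shell
# Hypothetical word-count mapper: emit "word<TAB>1" for every word on stdin.
mapper() {
  tr -s '[:space:]' '\n' | awk 'NF { print $0 "\t1" }'
}

# Hypothetical reducer: input arrives sorted by key, so the counts for one
# word are adjacent; sum them and print "word<TAB>total" when the key changes.
reducer() {
  awk -F '\t' '
    NR > 1 && ($1 "") != key { print key "\t" sum; sum = 0 }
    { key = $1 ""; sum += $2 }
    END { if (NR > 0) print key "\t" sum }
  '
}

# Local dry run: map, then sort (stands in for the shuffle), then reduce.
printf 'a b a\nb a\n' | mapper | sort | reducer
```

To run the same logic on Hadoop, you would save each function body as its own executable script (for example under script/) and pass those files to -mapper and -reducer in the command above.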