Hadoop – How to Run a MapReduce job
In this post, I will explain how to configure Hadoop and run MapReduce programs.
First, install the Java JDK. You can download it here.
Once Java is installed, set the JAVA_HOME environment variable, then check that Java is installed properly by running java -version.
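For example, on a typical Linux install the steps above look like this (the JDK path below is only an assumption; substitute the path where your JDK actually lives):

```shell
# Point JAVA_HOME at your JDK install directory (example path; adjust for your system).
export JAVA_HOME=/usr/lib/jvm/java-6-sun

# Verify the installation; this should print the JDK version.
java -version
```

Adding the export line to your ~/.bashrc (or equivalent) makes the setting survive new terminal sessions.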
Download Apache Hadoop 1.1.1 here
Open your terminal and navigate to the folder where you downloaded the Hadoop build.
Create a working directory, hadoop, and copy the build into it.
Now extract the build.
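The steps above can be sketched as the following commands (assuming the tarball landed in ~/Downloads and the working directory lives in your home folder; adjust both paths to taste):

```shell
# Create the working directory and copy the downloaded build into it.
mkdir -p ~/hadoop
cp ~/Downloads/hadoop-1.1.1.tar.gz ~/hadoop/

# Extract the build.
cd ~/hadoop
tar -xzf hadoop-1.1.1.tar.gz
```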
Once you have extracted the files, change your directory to hadoop-1.1.1.
Open hadoop-env.sh, found in the conf/ directory.
In hadoop-env.sh, set JAVA_HOME to your JDK path and save the file (:w! in vi).
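Concretely, the edit is a one-line change: find the commented-out JAVA_HOME line in conf/hadoop-env.sh, uncomment it, and point it at your JDK (the path below is just an example):

```shell
# conf/hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```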
That’s it; you can now run your MapReduce jobs!
General method to run a MapReduce job (via Hadoop Streaming):
bin/hadoop jar contrib/streaming/hadoop-streaming-1.1.1.jar -mapper script/<mapper-file> -reducer script/<reducer-file> -input <input_directory> -output <output_directory>
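Because Streaming mappers and reducers simply read stdin and write stdout, you can sanity-check the same logic locally with an ordinary pipe before submitting the job. Here is a minimal word-count sketch using standard Unix tools (no Hadoop required; tr stands in for the mapper, sort for the shuffle, and uniq -c for the reducer):

```shell
# Simulate map -> shuffle/sort -> reduce locally with a pipe.
echo "the quick brown fox jumps over the lazy dog" \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c
```

Each output line is a count followed by a word, with "the" counted twice; any pair of scripts that behaves correctly in such a pipe can then be passed to -mapper and -reducer above.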