MAP REDUCE LIFE CYCLE
- DRIVER
- MAPPER
- COMBINER (If Any)
- PARTITIONER
- REDUCER
- INPUTFORMAT
- OUTPUTFORMAT
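The way these components hand data to each other can be sketched in plain Java, without Hadoop on the classpath. This is only an illustrative word-count simulation: the class name LifeCycleSketch and its methods are hypothetical, a list of strings stands in for the records an InputFormat would supply, and the returned map stands in for what an OutputFormat would write. The partition formula mirrors Hadoop's default hash partitioning, but it is applied here to String keys rather than Writable keys.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free simulation of the life cycle listed above.
final class LifeCycleSketch {

    // MAPPER: turn one input record (a line) into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // COMBINER and REDUCER: sum all values seen for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // PARTITIONER: choose which reduce task receives a key.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // DRIVER: wire the stages together; each line plays the role of one map task.
    static Map<String, Integer> run(List<String> lines, int numReduceTasks) {
        // One bucket per reduce task; TreeMap keeps keys sorted, like the shuffle/sort step.
        List<Map<String, List<Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            buckets.add(new TreeMap<>());
        }
        for (String line : lines) {
            // Map phase: group this task's output by key.
            Map<String, List<Integer>> local = new TreeMap<>();
            for (Map.Entry<String, Integer> e : map(line)) {
                local.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
            // Combine locally, then route each partial sum to its partition.
            for (Map.Entry<String, List<Integer>> e : local.entrySet()) {
                int combined = reduce(e.getKey(), e.getValue());
                int p = partition(e.getKey(), numReduceTasks);
                buckets.get(p).computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(combined);
            }
        }
        // Reduce phase: each bucket is processed independently.
        Map<String, Integer> output = new TreeMap<>();
        for (Map<String, List<Integer>> bucket : buckets) {
            for (Map.Entry<String, List<Integer>> e : bucket.entrySet()) {
                output.put(e.getKey(), reduce(e.getKey(), e.getValue()));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c"), 2)); // prints {a=2, b=2, c=1}
    }
}
```

In a real job the framework performs the grouping, sorting and routing; the driver only declares which classes fill each role.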
Syntax for configuring a job using JobConf (the older org.apache.hadoop.mapred API):
// Create a new JobConf
JobConf job = new JobConf(new Configuration(), MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
FileInputFormat.setInputPaths(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
job.setMapperClass(MyJob.MyMapper.class);
job.setCombinerClass(MyJob.MyReducer.class);
job.setReducerClass(MyJob.MyReducer.class);
job.setInputFormat(SequenceFileInputFormat.class);
job.setOutputFormat(SequenceFileOutputFormat.class);
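The configuration above registers the same class, MyJob.MyReducer, as both combiner and reducer. That is generally safe only when the reduce operation gives the same answer whether it is applied once to all values or in stages to partial results (a sum, for example), because Hadoop may run the combiner zero, one, or several times. The following is a minimal plain-Java sketch of that point; the class CombinerSketch and the sample values are hypothetical and do not use the Hadoop API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: why a sum can double as a combiner but a mean cannot.
final class CombinerSketch {

    static double sum(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    static double mean(List<Double> values) {
        return sum(values) / values.size();
    }

    public static void main(String[] args) {
        // Values for one key, split across two imaginary map tasks.
        List<Double> mapTask1 = Arrays.asList(1.0, 2.0, 3.0);
        List<Double> mapTask2 = Arrays.asList(10.0);
        List<Double> all = Arrays.asList(1.0, 2.0, 3.0, 10.0);

        // Sum: combining per map task and then reducing matches reducing everything at once.
        double sumStaged = sum(Arrays.asList(sum(mapTask1), sum(mapTask2)));
        System.out.println("sum:  " + sumStaged + " vs " + sum(all));   // sum:  16.0 vs 16.0

        // Mean: a mean of partial means is not the overall mean.
        double meanStaged = mean(Arrays.asList(mean(mapTask1), mean(mapTask2)));
        System.out.println("mean: " + meanStaged + " vs " + mean(all)); // mean: 6.0 vs 4.0
    }
}
```

For an operation like the mean, a common workaround is to have the combiner emit partial (sum, count) pairs and let the reducer do the final division.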
Skeleton of the MAP REDUCE job:
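import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class SamplePgm {
    public static void main(String[] args) throws Exception {
        // Create configuration
        Configuration conf = new Configuration();

        // Create job
        // (this constructor is deprecated in Hadoop 2.x and later, where
        // Job.getInstance(conf, "Sample Pgm") is the replacement)
        Job job = new Job(conf, "Sample Pgm");
        job.setJarByClass(SamplePgm.class);

        // Setup MapReduce job
        // The base Mapper and Reducer classes pass records through unchanged;
        // replace them with your own subclasses in a real job
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        // Set only 1 reduce task
        job.setNumReduceTasks(1);

        // Specify key / value
        // (TextInputFormat supplies LongWritable byte offsets as keys and Text lines as values)
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // Input
        FileInputFormat.addInputPath(job, new Path(args[0]));
        job.setInputFormatClass(TextInputFormat.class);

        // Output
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setOutputFormatClass(TextOutputFormat.class);

        // Execute job and exit with 0 on success, 1 on failure
        int code = job.waitForCompletion(true) ? 0 : 1;
        System.exit(code);
    }
}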
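The skeleton sets a single reduce task, so every key ends up at the same reducer. The plain-Java sketch below shows how the number of reduce tasks changes that assignment. It uses the same formula as Hadoop's default HashPartitioner, but the class PartitionSketch, the sample keys, and the use of String.hashCode() (rather than the hash of a Writable key such as Text) are assumptions made for illustration.

```java
// Hypothetical illustration of hash partitioning across reduce tasks.
final class PartitionSketch {

    // Clear the sign bit so the result is never negative, then take the remainder.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry"};
        for (int numReduceTasks : new int[] {1, 3}) {
            for (String key : keys) {
                // With one reduce task the partition is always 0.
                System.out.println(numReduceTasks + " reduce task(s): " + key
                        + " -> partition " + partitionFor(key, numReduceTasks));
            }
        }
    }
}
```

With more than one reduce task, each reducer writes its own output file, so the job's output is split by partition rather than collected in one place.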