
MAP REDUCE LIFE CYCLE

  • DRIVER
  • MAPPER 
  • COMBINER (If Any)
  • PARTITIONER
  • REDUCER
  • INPUTFORMAT
  • OUTPUTFORMAT
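
To make the roles of these components concrete, here is a toy in-memory word count written in plain Java. It is only a sketch of the data flow and does not use the Hadoop API: the class name LifeCycleSketch and its methods are made up for illustration, each input string stands in for one input split, and the partition rule shown (hash code modulo the number of reducers) mirrors what a hash-based partitioner typically does.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the MapReduce data flow. Hypothetical names; NOT the Hadoop API.
final class LifeCycleSketch {

    // MAPPER: emit one (word, 1) pair for every word in the input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // COMBINER and REDUCER: sum the values of each key (keys come out sorted).
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }

    // PARTITIONER: pick the reducer that receives a given key.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // DRIVER: wire the stages together and return one output map per reducer.
    static List<Map<String, Integer>> run(List<String> splits, int numReducers) {
        List<List<Map.Entry<String, Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String split : splits) {
            // Map the split, then combine locally before anything is shuffled.
            Map<String, Integer> combined = sumByKey(map(split));
            for (Map.Entry<String, Integer> e : combined.entrySet()) {
                int target = partition(e.getKey(), numReducers);
                buckets.get(target).add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue()));
            }
        }
        List<Map<String, Integer>> output = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> bucket : buckets) {
            output.add(sumByKey(bucket)); // reduce phase
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("a b a", "b c");
        System.out.println(run(splits, 1)); // prints [{a=2, b=2, c=1}]
        System.out.println(run(splits, 2)); // prints [{b=2}, {a=2, c=1}]
    }
}
```

With one reducer every key lands in the same output; with two reducers the partitioner splits the keys between them, which is why a real job writes one output file per reduce task.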

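Configuring a job with JobConf (old org.apache.hadoop.mapred API):

    // MyJob, MyMapper and MyReducer are placeholders for your own classes.

    // Create a new JobConf
    JobConf job = new JobConf(new Configuration(), MyJob.class);

    // Specify various job-specific parameters
    job.setJobName("myjob");

    // Input and output locations
    FileInputFormat.setInputPaths(job, new Path("in"));
    FileOutputFormat.setOutputPath(job, new Path("out"));

    // Mapper, combiner and reducer; here the reducer doubles as the combiner
    job.setMapperClass(MyJob.MyMapper.class);
    job.setCombinerClass(MyJob.MyReducer.class);
    job.setReducerClass(MyJob.MyReducer.class);

    // Read and write sequence files
    job.setInputFormat(SequenceFileInputFormat.class);
    job.setOutputFormat(SequenceFileOutputFormat.class);

    // Submit the job and block until it finishes
    JobClient.runJob(job);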

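Skeleton of a MapReduce job (new org.apache.hadoop.mapreduce API):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class SamplePgm {

    public static void main(String[] args) throws Exception {

        // Create configuration
        Configuration conf = new Configuration();

        // Create job (Job.getInstance replaces the deprecated Job constructor in Hadoop 2.x)
        Job job = Job.getInstance(conf, "Sample Pgm");
        job.setJarByClass(SamplePgm.class);

        // Set up the MapReduce job; the base Mapper and Reducer classes
        // pass records through unchanged, so replace them with your own
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        // Use a single reduce task
        job.setNumReduceTasks(1);

        // Specify output key / value types
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // Input: path from the first argument, read as lines of text
        FileInputFormat.addInputPath(job, new Path(args[0]));
        job.setInputFormatClass(TextInputFormat.class);

        // Output: path from the second argument, written as text
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setOutputFormatClass(TextOutputFormat.class);

        // Execute job and exit with 0 on success, 1 on failure
        int code = job.waitForCompletion(true) ? 0 : 1;
        System.exit(code);
    }
}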
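
The skeleton above declares LongWritable keys and Text values because TextInputFormat hands the mapper one record per line: the key is the byte offset at which the line starts and the value is the line itself. Since the base Mapper and Reducer pass records through unchanged, the job's output is essentially those same pairs. The plain-Java sketch below illustrates how such offsets come about. It is only an approximation for illustration and does not call Hadoop: TextRecordSketch is a made-up name, and the input is assumed to be UTF-8 text with "\n" line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only (hypothetical class); Hadoop's TextInputFormat is not used here.
final class TextRecordSketch {

    // Returns (byte offset of line start -> line text) in file order.
    // Assumes UTF-8 content with "\n" line endings.
    static Map<Long, String> records(String fileContents) {
        Map<Long, String> out = new LinkedHashMap<>();
        if (fileContents.isEmpty()) {
            return out;
        }
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            out.put(offset, line);
            // Advance past the line's bytes plus its trailing newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // Print each record as key<TAB>value, similar to a text output file.
        for (Map.Entry<Long, String> r : records("hello world\nfoo\nbar baz\n").entrySet()) {
            System.out.println(r.getKey() + "\t" + r.getValue());
        }
    }
}
```

For the three-line input in main, the keys are 0, 12 and 16: "hello world" occupies 11 bytes plus a newline, and "foo" occupies 3 bytes plus a newline.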

