public class InputSizeReducerEstimator extends Object implements PigReducerEstimator
For example, suppose this is your Pig script:
a = load '/data/a';
b = load '/data/b';
c = join a by $0, b by $0;
store c into '/tmp';
If the size of /data/a is 1000*1000*1000 bytes and the size of /data/b is 2*1000*1000*1000 bytes, then with 1000*1000*1000 bytes of input per reducer the estimated number of reducers is (1000*1000*1000+2*1000*1000*1000)/(1000*1000*1000)=3.
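The arithmetic in the example can be sketched in plain Java. This is an illustration only, not Pig's implementation: the class and method names are hypothetical, and rounding up, the lower bound of 1, and the cap of 999 are assumptions that the text above does not state.

```java
/** Hypothetical helper for illustration; not part of Pig's API. */
final class ReducerEstimateSketch {

    /**
     * Divides the total input size by the bytes-per-reducer setting,
     * rounds up, and clamps the result to the range [1, maxReducers].
     * The rounding and clamping are assumptions of this sketch.
     */
    static int estimate(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        if (totalInputBytes < 0 || bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and limits positive");
        }
        // Ceiling division without risking overflow on very large inputs.
        long quotient = totalInputBytes / bytesPerReducer;
        long needed = (totalInputBytes % bytesPerReducer == 0) ? quotient : quotient + 1;
        // Keep at least one reducer and never exceed the configured cap.
        long clamped = Math.max(1L, Math.min(needed, (long) maxReducers));
        return (int) clamped;
    }

    public static void main(String[] args) {
        long sizeOfA = 1000L * 1000 * 1000;      // size of /data/a in bytes
        long sizeOfB = 2L * 1000 * 1000 * 1000;  // size of /data/b in bytes
        long bytesPerReducer = 1000L * 1000 * 1000;
        int maxReducers = 999;                   // assumed cap for this sketch
        System.out.println(estimate(sizeOfA + sizeOfB, bytesPerReducer, maxReducers)); // prints 3
    }
}
```

The `long` literals matter here: 2*1000*1000*1000 and the combined size of 3*1000*1000*1000 both overflow a Java `int`.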
Fields inherited from interface PigReducerEstimator:
BYTES_PER_REDUCER_PARAM, DEFAULT_BYTES_PER_REDUCER, DEFAULT_MAX_REDUCER_COUNT_PARAM, MAX_REDUCER_COUNT_PARAM

| Constructor and Description |
|---|
| InputSizeReducerEstimator() |

| Modifier and Type | Method and Description |
|---|---|
| int | estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job, MapReduceOper mapReduceOper) Determines the number of reducers to be used. |
| static long | getTotalInputFileSize(org.apache.hadoop.conf.Configuration conf, List<POLoad> lds, org.apache.hadoop.mapreduce.Job job) |
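The summary lists getTotalInputFileSize as returning a single long for a list of load operators. As a rough analogy only, the sketch below sums file sizes on the local filesystem with java.nio.file. It does not use Pig or Hadoop classes, the class and method names are hypothetical, and how Pig actually resolves the size of each load is not shown here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical local-filesystem analog; not part of Pig's API. */
final class LocalInputSizeSketch {

    /** Sums the sizes in bytes of all regular files at or under each input path. */
    static long totalSize(List<Path> inputs) throws IOException {
        long total = 0L;
        for (Path input : inputs) {
            // Files.walk yields the path itself when it is a regular file,
            // or every entry beneath it when it is a directory.
            try (Stream<Path> entries = Files.walk(input)) {
                for (Path entry : (Iterable<Path>) entries::iterator) {
                    if (Files.isRegularFile(entry)) {
                        total += Files.size(entry);
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("inputs");
        Path a = Files.write(dir.resolve("a"), new byte[1000]);
        Path b = Files.write(dir.resolve("b"), new byte[2000]);
        System.out.println(totalSize(Arrays.asList(a, b))); // prints 3000
    }
}
```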
public int estimateNumberOfReducers(org.apache.hadoop.mapreduce.Job job,
MapReduceOper mapReduceOper)
throws IOException
Specified by:
estimateNumberOfReducers in interface PigReducerEstimator
Parameters:
job - job instance
mapReduceOper -
Throws:
IOException

public static long getTotalInputFileSize(org.apache.hadoop.conf.Configuration conf,
List<POLoad> lds,
org.apache.hadoop.mapreduce.Job job)
throws IOException
Throws:
IOException

Copyright © 2007-2017 The Apache Software Foundation