public class AVG extends EvalFunc<Double> implements Algebraic, Accumulator<Double>
This class implements Algebraic, so where possible the computation will be
performed in a distributed fashion.
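
An average can be computed in a distributed fashion because it can be rebuilt from partial (sum, count) pairs calculated on separate slices of the data. The following is a minimal sketch of that idea in plain Java. The class `AlgebraicAvgSketch`, its `Partial` type, and its method names are hypothetical stand-ins for the Initial, Intermediate, and Final stages, not Pig's actual implementation, and returning null for empty input is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this does not use or reproduce Pig's classes.
class AlgebraicAvgSketch {
    // Partial result passed between stages: a running sum and a running count.
    static final class Partial {
        final double sum;
        final long count;
        Partial(double sum, long count) { this.sum = sum; this.count = count; }
    }

    // Initial stage: wrap a single input value as a partial result.
    static Partial initial(double value) {
        return new Partial(value, 1L);
    }

    // Intermediate stage: merge several partial results into one.
    static Partial intermediate(List<Partial> partials) {
        double sum = 0.0;
        long count = 0L;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    // Final stage: merge the remaining partials and divide once.
    // Returning null for an empty input is an assumption of this sketch.
    static Double finalStage(List<Partial> partials) {
        Partial total = intermediate(partials);
        if (total.count == 0) {
            return null;
        }
        return total.sum / total.count;
    }

    public static void main(String[] args) {
        // Two slices of the data, each reduced independently.
        Partial left = intermediate(Arrays.asList(initial(1), initial(2)));
        Partial right = intermediate(Arrays.asList(initial(3), initial(4)));
        System.out.println(finalStage(Arrays.asList(left, right))); // prints 2.5
    }
}
```

Carrying the count alongside the sum is what makes the merge step safe: averaging the two slice averages directly would only be correct when the slices happen to be the same size.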
AVG can operate on any numeric type. It can also operate on bytearrays, which it will cast to doubles. It expects a bag of tuples of one record each. If Pig knows from the schema that this function will be passed a bag of integers or longs, it will use a specially adapted version of AVG that uses integer arithmetic for summing the data. The return type of AVG will always be double, regardless of the input type.
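
The combination of integer summation and a double result can be pictured roughly as follows. This is a sketch in plain Java under stated assumptions: `IntegerSumAvgSketch` and its `average` method are hypothetical names, the input is a plain `int[]` rather than a Pig bag, and returning null for empty input is a choice made here, not documented behavior.

```java
// Hypothetical illustration only; not Pig's integer-specialized AVG.
class IntegerSumAvgSketch {
    // Sums ints in a long accumulator, then converts to double only for the division.
    static Double average(int[] values) {
        if (values.length == 0) {
            return null; // assumption of this sketch
        }
        long sum = 0L;
        for (int v : values) {
            sum += v; // long arithmetic, so two large ints do not overflow
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        System.out.println(average(new int[] {1, 2})); // prints 1.5
        // The sum exceeds the int range but fits comfortably in a long.
        System.out.println(average(new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE}));
    }
}
```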
AVG implements the Accumulator interface as well. While this will never be
the preferred method of usage, it is available in case the combiner cannot be
used for a given calculation.
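
The accumulate / getValue / cleanup cycle can be sketched in plain Java as shown below. `AccumulatorAvgSketch` is a hypothetical class that borrows the three method names but takes a `double[]` batch instead of a Pig `Tuple`; it illustrates the calling pattern only and is not Pig's code. Returning null when nothing has been accumulated is an assumption of the sketch.

```java
// Hypothetical illustration only; mirrors the Accumulator method names, not Pig's types.
class AccumulatorAvgSketch {
    private double sum = 0.0;
    private long count = 0L;

    // Called once per batch of values for the current key.
    void accumulate(double[] batch) {
        for (double v : batch) {
            sum += v;
            count++;
        }
    }

    // Called after every batch for the current key has been passed in.
    Double getValue() {
        if (count == 0) {
            return null; // assumption of this sketch
        }
        return sum / count;
    }

    // Called after getValue() to reset state before the next key.
    void cleanup() {
        sum = 0.0;
        count = 0L;
    }

    public static void main(String[] args) {
        AccumulatorAvgSketch acc = new AccumulatorAvgSketch();
        acc.accumulate(new double[] {1.0, 2.0});
        acc.accumulate(new double[] {3.0});
        System.out.println(acc.getValue()); // prints 2.0
        acc.cleanup();
        acc.accumulate(new double[] {10.0});
        System.out.println(acc.getValue()); // prints 10.0
    }
}
```

Only the running sum and count are held between batches, so memory use does not grow with the number of values seen for a key.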
| Modifier and Type | Class and Description |
|---|---|
| static class | AVG.Final |
| static class | AVG.Initial |
| static class | AVG.Intermediate |
Nested classes/interfaces inherited from class EvalFunc: EvalFunc.SchemaType

Fields inherited from class EvalFunc: log, pigLogger, reporter, returnType

| Constructor and Description |
|---|
| AVG() |
| Modifier and Type | Method and Description |
|---|---|
| void | accumulate(Tuple b): Pass tuples to the UDF. |
| void | cleanup(): Called after getValue() to prepare processing for next key. |
| protected static Tuple | combine(DataBag values) |
| protected static long | count(Tuple input) |
| Double | exec(Tuple input): This callback method must be implemented by all subclasses. |
| List<FuncSpec> | getArgToFuncMapping(): Allow a UDF to specify type specific implementations of itself. |
| String | getFinal(): Get the final function. |
| String | getInitial(): Get the initial function. |
| String | getIntermed(): Get the intermediate function. |
| Double | getValue(): Called when all tuples from current key have been passed to accumulate. |
| Schema | outputSchema(Schema input): Report the schema of the output of this UDF. |
| protected static Double | sum(Tuple input) |
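Methods inherited from class EvalFunc: allowCompileTimeCalculation, finish, getCacheFiles, getInputSchema, getLoadCaster, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, needEndOfAllInputProcessing, progress, setEndOfAllInput, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn

public Double exec(Tuple input) throws IOException
Description copied from class: EvalFunc
This callback method must be implemented by all subclasses.
Specified by: exec in class EvalFunc<Double>
Parameters: input - the Tuple to be processed.
Throws: IOException

public String getInitial()
Description copied from interface: Algebraic
Get the initial function.
Specified by: getInitial in interface Algebraic

public String getIntermed()
Description copied from interface: Algebraic
Get the intermediate function.
Specified by: getIntermed in interface Algebraic

public String getFinal()
Description copied from interface: Algebraic
Get the final function.

protected static Tuple combine(DataBag values) throws ExecException
Throws: ExecException

protected static long count(Tuple input) throws ExecException
Throws: ExecException

protected static Double sum(Tuple input) throws ExecException, IOException
Throws: ExecException, IOException

public Schema outputSchema(Schema input)
Description copied from class: EvalFunc
Report the schema of the output of this UDF. The default implementation interprets the OutputSchema annotation, if one is present. Otherwise, it returns null (no known output schema).
Overrides: outputSchema in class EvalFunc<Double>
Parameters: input - Schema of the input

public List<FuncSpec> getArgToFuncMapping() throws FrontendException
Description copied from class: EvalFunc
Allow a UDF to specify type specific implementations of itself.
Overrides: getArgToFuncMapping in class EvalFunc<Double>
Throws: FrontendException

public void accumulate(Tuple b) throws IOException
Description copied from interface: Accumulator
Pass tuples to the UDF.
Specified by: accumulate in interface Accumulator<Double>
Parameters: b - A tuple containing a single field, which is a bag. The bag will contain the set of tuples being passed to the UDF in this iteration.
Throws: IOException

public void cleanup()
Description copied from interface: Accumulator
Called after getValue() to prepare processing for next key.
Specified by: cleanup in interface Accumulator<Double>

public Double getValue()
Description copied from interface: Accumulator
Called when all tuples from current key have been passed to accumulate.
Specified by: getValue in interface Accumulator<Double>

Copyright © 2007-2017 The Apache Software Foundation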