Package org.apache.nutch.crawl
Class URLPartitioner
- java.lang.Object
  - org.apache.hadoop.mapreduce.Partitioner<Text,Writable>
    - org.apache.nutch.crawl.URLPartitioner
- All Implemented Interfaces:
Configurable
public class URLPartitioner extends Partitioner<Text,Writable> implements Configurable
Partitions URLs by host, domain name, or IP address, depending on the value of the parameter 'partition.url.mode', which can be 'byHost', 'byDomain', or 'byIP'.
-
-
Field Summary
Fields (Modifier and Type, Field):
- static String PARTITION_MODE_DOMAIN
- static String PARTITION_MODE_HOST
- static String PARTITION_MODE_IP
- static String PARTITION_MODE_KEY
-
Constructor Summary
Constructors:
- URLPartitioner()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- Configuration getConf()
- int getPartition(Text key, Writable value, int numReduceTasks): Hash by host or domain name or IP address.
- void setConf(Configuration conf)
-
-
-
Field Detail
-
PARTITION_MODE_KEY
public static final String PARTITION_MODE_KEY
- See Also:
- Constant Field Values
-
PARTITION_MODE_HOST
public static final String PARTITION_MODE_HOST
- See Also:
- Constant Field Values
-
PARTITION_MODE_DOMAIN
public static final String PARTITION_MODE_DOMAIN
- See Also:
- Constant Field Values
-
PARTITION_MODE_IP
public static final String PARTITION_MODE_IP
- See Also:
- Constant Field Values
-
-
Method Detail
-
setConf
public void setConf(Configuration conf)
- Specified by:
setConf in interface Configurable
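
The class description says the partitioning mode comes from the 'partition.url.mode' parameter. The following is a minimal sketch of how such a mode could be read and validated, not the Nutch implementation. It makes several assumptions: a plain Map stands in for Hadoop's Configuration, the constant values are inferred from the class description rather than from the Constant Field Values page, and the fallback to 'byHost' for a missing or unrecognized value is hypothetical. The class name PartitionModeSketch and the method resolveMode are invented for illustration.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: a Map stands in for Hadoop's Configuration.
class PartitionModeSketch {
    // Values inferred from the class description (assumption).
    static final String PARTITION_MODE_KEY = "partition.url.mode";
    static final String PARTITION_MODE_HOST = "byHost";
    static final String PARTITION_MODE_DOMAIN = "byDomain";
    static final String PARTITION_MODE_IP = "byIP";

    private static final Set<String> VALID_MODES =
        Set.of(PARTITION_MODE_HOST, PARTITION_MODE_DOMAIN, PARTITION_MODE_IP);

    /**
     * Returns the configured mode. Falls back to byHost when the value is
     * missing or not one of the three known modes (hypothetical default).
     */
    static String resolveMode(Map<String, String> conf) {
        String mode = conf.getOrDefault(PARTITION_MODE_KEY, PARTITION_MODE_HOST);
        boolean valid = mode != null && VALID_MODES.contains(mode);
        return valid ? mode : PARTITION_MODE_HOST;
    }

    public static void main(String[] args) {
        System.out.println(resolveMode(Map.of(PARTITION_MODE_KEY, "byDomain")));
        System.out.println(resolveMode(Map.of(PARTITION_MODE_KEY, "unknown")));
        System.out.println(resolveMode(Map.of()));
    }
}
```

Validating the mode once at configuration time keeps the per-record partitioning path free of string checks.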
-
getConf
public Configuration getConf()
- Specified by:
getConf in interface Configurable
-
getPartition
public int getPartition(Text key, Writable value, int numReduceTasks)
Hash by host or domain name or IP address.
- Specified by:
getPartition in class Partitioner<Text,Writable>
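
To illustrate what "hash by host" can mean in practice, here is a minimal sketch that uses only the Java standard library. It is not the Nutch implementation: the class HostPartitionSketch and the method partitionByHost are hypothetical, plain strings replace the Text key, and only the host mode is shown. The fallback to hashing the whole string when no host can be parsed is also an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Illustrative sketch only: hashes the host part of a URL into a partition.
class HostPartitionSketch {

    /** Maps a URL string to a partition index in [0, numReduceTasks). */
    static int partitionByHost(String urlString, int numReduceTasks) {
        // Assumed fallback: hash the whole string if no host can be extracted.
        String hashKey = urlString;
        try {
            String host = new URI(urlString).getHost();
            if (host != null) {
                // Host names are case-insensitive, so normalize before hashing.
                hashKey = host.toLowerCase(Locale.ROOT);
            }
        } catch (URISyntaxException e) {
            // Malformed input: keep the fallback key.
        }
        // Clear the sign bit so the modulo result is never negative.
        return (hashKey.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int a = partitionByHost("http://example.com/page1", 8);
        int b = partitionByHost("http://EXAMPLE.com/page2", 8);
        System.out.println("same partition: " + (a == b));
    }
}
```

Because every URL from one host hashes to the same index, all of that host's records reach the same reduce task, which is the property a per-host partitioner needs to provide.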
-
-