Class WebGraph.OutlinkDb
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.scoring.webgraph.WebGraph.OutlinkDb
- All Implemented Interfaces: Configurable
- Enclosing class: WebGraph
public static class WebGraph.OutlinkDb extends Configured
OutlinkDb creates a database of all outlinks. Outlinks to internal URLs, whether on the same domain or the same host, can be ignored. The number of outlinks to a given page or domain can also be limited.
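The filtering rules described above can be illustrated with a small, self-contained sketch. This is not the Nutch implementation: the class OutlinkFilterSketch, its methods, and its parameters (ignoreHost, ignoreDomain, maxPerDomain) are hypothetical names chosen for illustration, and the "domain" is approximated naively as the last two labels of the host name.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical, not part of Nutch.
final class OutlinkFilterSketch {

    // Returns the lower-cased host of a URL, or "" if the URL has no host.
    static String host(String url) {
        String h = URI.create(url).getHost();
        return h == null ? "" : h.toLowerCase();
    }

    // Naive domain: the last two labels of the host (assumption; a real
    // crawler would consult a public-suffix list instead).
    static String domain(String url) {
        String h = host(url);
        String[] labels = h.split("\\.");
        int n = labels.length;
        return n < 2 ? h : labels[n - 2] + "." + labels[n - 1];
    }

    // Keeps outlinks of fromUrl, optionally dropping links to the same host
    // or the same domain, and keeping at most maxPerDomain links per target domain.
    static List<String> filter(String fromUrl, List<String> outlinks,
                               boolean ignoreHost, boolean ignoreDomain,
                               int maxPerDomain) {
        Map<String, Integer> seenPerDomain = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String to : outlinks) {
            if (ignoreHost && host(to).equals(host(fromUrl))) {
                continue; // internal link by host
            }
            if (ignoreDomain && domain(to).equals(domain(fromUrl))) {
                continue; // internal link by domain
            }
            int count = seenPerDomain.merge(domain(to), 1, Integer::sum);
            if (count > maxPerDomain) {
                continue; // limit for this target domain already reached
            }
            kept.add(to);
        }
        return kept;
    }
}
```

With ignoreHost set and ignoreDomain unset, a link from www.example.com to blog.example.com is kept, because only the host differs; setting ignoreDomain drops it as well.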
-
Nested Class Summary
Nested Classes
- static class WebGraph.OutlinkDb.OutlinkDbMapper: Passes through existing LinkDatum objects from an existing OutlinkDb and maps out new LinkDatum objects from a new crawl's ParseData.
- static class WebGraph.OutlinkDb.OutlinkDbReducer
-
Field Summary
Fields
- static String URL_FILTERING
- static String URL_NORMALIZING
-
Constructor Summary
Constructors
- OutlinkDb(): Default constructor.
- OutlinkDb(Configuration conf): Configurable constructor.
-
Field Detail
-
URL_NORMALIZING
public static final String URL_NORMALIZING
- See Also:
- Constant Field Values
-
URL_FILTERING
public static final String URL_FILTERING
- See Also:
- Constant Field Values
-
Constructor Detail
-
OutlinkDb
public OutlinkDb()
Default constructor.
-
OutlinkDb
public OutlinkDb(Configuration conf)
Configurable constructor.
- Parameters:
conf - a populated Configuration