Package org.apache.nutch.tools
Class CommonCrawlFormatFactory
java.lang.Object
    org.apache.nutch.tools.CommonCrawlFormatFactory

public class CommonCrawlFormatFactory extends Object
Factory class that creates new CommonCrawlFormat objects (a.k.a. formatters) that map crawled files to the CommonCrawl format.

Constructor Summary
Constructor    Description
CommonCrawlFormatFactory()

Method Summary
Modifier and Type    Method    Description
static CommonCrawlFormat    getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config)    Deprecated.
static CommonCrawlFormat    getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config)

Method Detail
getCommonCrawlFormat
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) throws IOException
Deprecated.
Returns a new CommonCrawlFormat object of the specified formatter type.
Parameters:
formatType - the type of formatter to be created.
url - the URL.
content - the content.
metadata - the metadata.
nutchConf - the configuration.
config - the CommonCrawl output configuration.
Returns:
the new CommonCrawlFormat object.
Throws:
IOException - If any I/O error occurs.

getCommonCrawlFormat
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) throws IOException
Throws:
IOException
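
The static-factory idea behind getCommonCrawlFormat can be illustrated with a minimal, self-contained sketch. It does not use the Nutch classes: DemoFormat, DemoJsonFormat, DemoPlainFormat, getFormat and the format-type strings "json" and "plain" are all hypothetical stand-ins, and the sketch omits the Configuration and CommonCrawlConfig arguments and the IOException that the real methods declare. It only shows how a format-type string might select a formatter implementation.

```java
import java.util.Locale;

// Illustrative sketch only: every type below is a hypothetical stand-in,
// not part of org.apache.nutch.tools.
final class FormatFactorySketch {

    /** Hypothetical formatter contract (NOT the Nutch CommonCrawlFormat interface). */
    interface DemoFormat {
        String render(String url, String body);
    }

    /** Hypothetical JSON-like formatter; performs no escaping, for illustration only. */
    static final class DemoJsonFormat implements DemoFormat {
        @Override
        public String render(String url, String body) {
            return "{\"url\":\"" + url + "\",\"body\":\"" + body + "\"}";
        }
    }

    /** Hypothetical plain-text formatter: URL on the first line, body on the second. */
    static final class DemoPlainFormat implements DemoFormat {
        @Override
        public String render(String url, String body) {
            return url + "\n" + body;
        }
    }

    private FormatFactorySketch() { }

    /** Static factory method: maps a case-insensitive format-type string to a formatter. */
    static DemoFormat getFormat(String formatType) {
        if (formatType == null) {
            throw new IllegalArgumentException("formatType must not be null");
        }
        switch (formatType.toLowerCase(Locale.ROOT)) {
            case "json":
                return new DemoJsonFormat();
            case "plain":
                return new DemoPlainFormat();
            default:
                throw new IllegalArgumentException("Unknown format type: " + formatType);
        }
    }

    public static void main(String[] args) {
        // The caller depends only on the DemoFormat contract, not on a concrete class.
        DemoFormat format = getFormat("json");
        System.out.println(format.render("http://example.com/", "hello"));
    }
}
```

Callers of such a factory hold only the interface type, so a new formatter can be added by extending the switch without changing call sites. How the real factory handles a null or unrecognized formatType is not stated on this page, so the exceptions thrown here are an assumption of the sketch.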