Uses of Class
org.apache.nutch.util.NutchTool
Packages that use NutchTool
Package    Description
org.apache.nutch.crawl    Crawl control code and tools to run the crawler.
org.apache.nutch.fetcher    The Nutch multi-threaded fetching module
org.apache.nutch.indexer    Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index.
org.apache.nutch.parse    The Parse interface and related classes.
org.apache.nutch.service.impl
org.apache.nutch.tools    Miscellaneous tools.
Uses of NutchTool in org.apache.nutch.crawl
Subclasses of NutchTool in org.apache.nutch.crawl
Modifier and Type    Class    Description
class    CrawlDb    This class takes the output of the fetcher and updates the crawldb accordingly.
class    DeduplicationJob    Generic deduplicator which groups fetched URLs with the same digest and marks all of them as duplicate except the one with the highest score (based on the score in the crawldb, which is not necessarily the same as the score indexed).
class    Generator    Generates a subset of a crawl db to fetch.
class    Injector    Injector takes a flat text file of URLs (or a folder containing text files) and merges ("injects") these URLs into the CrawlDb.
class    LinkDb    Maintains an inverted link map, listing incoming links for each url.
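The DeduplicationJob description above (group by digest, keep only the highest-scoring entry) can be illustrated with a minimal in-memory sketch. It is not the Nutch implementation: `DedupSketch`, `Entry`, and `markDuplicates` are hypothetical names introduced here, and the tie-breaking rule (the first entry seen wins on equal scores) is an assumption, since the description does not say how ties are resolved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

    /** Hypothetical stand-in for a crawldb entry; not a Nutch class. */
    public static final class Entry {
        public final String url;
        public final String digest;
        public final float score;
        public boolean duplicate;

        public Entry(String url, String digest, float score) {
            this.url = url;
            this.digest = digest;
            this.score = score;
        }
    }

    /**
     * Groups entries by digest and flags every entry in a group as duplicate
     * except the one with the highest score.
     * Assumption: on equal scores the entry seen first is kept.
     */
    public static void markDuplicates(List<Entry> entries) {
        // Best (highest-scoring) entry seen so far for each digest.
        Map<String, Entry> best = new HashMap<>();
        for (Entry e : entries) {
            Entry current = best.get(e.digest);
            if (current == null) {
                best.put(e.digest, e);
            } else if (e.score > current.score) {
                // The newcomer wins; the previous best becomes a duplicate.
                current.duplicate = true;
                best.put(e.digest, e);
            } else {
                e.duplicate = true;
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("http://a.example/", "d1", 0.5f));
        entries.add(new Entry("http://b.example/", "d1", 0.9f));
        entries.add(new Entry("http://c.example/", "d2", 0.1f));
        markDuplicates(entries);
        for (Entry e : entries) {
            System.out.println(e.url + " duplicate=" + e.duplicate);
        }
    }
}
```

In this sample only the first URL is flagged, because it shares digest `d1` with a higher-scoring entry. The actual job runs as a distributed job over the crawldb rather than over an in-memory list.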
Uses of NutchTool in org.apache.nutch.fetcher
Subclasses of NutchTool in org.apache.nutch.fetcher
Modifier and Type    Class    Description
class    Fetcher    A queue-based fetcher.
Uses of NutchTool in org.apache.nutch.indexer
Subclasses of NutchTool in org.apache.nutch.indexer
Modifier and Type    Class    Description
class    IndexingJob    Generic indexer which relies on the plugins implementing IndexWriter
Uses of NutchTool in org.apache.nutch.parse
Subclasses of NutchTool in org.apache.nutch.parse
Modifier and Type    Class    Description
class    ParseSegment
Uses of NutchTool in org.apache.nutch.service.impl
Methods in org.apache.nutch.service.impl that return NutchTool
Modifier and Type    Method    Description
NutchTool    JobFactory.createToolByClassName(String className, Configuration conf)
NutchTool    JobFactory.createToolByType(JobManager.JobType type, Configuration conf)
Constructors in org.apache.nutch.service.impl with parameters of type NutchTool
Constructor    Description
JobWorker(JobConfig jobConfig, Configuration conf, NutchTool tool)    To initialize JobWorker thread with the Job Configurations provided by user.
ServiceWorker(ServiceConfig serviceConfig, NutchTool tool)
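A factory method that takes a class name and a configuration and returns a tool instance is commonly built on reflection. The sketch below shows that general pattern only; it makes no claim about how JobFactory is actually implemented. `ToolFactorySketch`, `Tool`, and `EchoTool` are hypothetical stand-ins, and a plain `Map<String, String>` replaces the Hadoop `Configuration` type so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

public class ToolFactorySketch {

    /** Hypothetical stand-in for a tool base class; not the Nutch NutchTool. */
    public abstract static class Tool {
        protected Map<String, String> conf = new HashMap<>();

        public void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        public abstract String name();
    }

    /** Hypothetical concrete tool used only to exercise the factory. */
    public static class EchoTool extends Tool {
        @Override
        public String name() {
            return "echo";
        }
    }

    /**
     * Loads the named class, checks that it is a Tool subclass, instantiates it
     * through its no-argument constructor, and hands it the configuration.
     */
    public static Tool createToolByClassName(String className, Map<String, String> conf) {
        try {
            Class<? extends Tool> toolClass = Class.forName(className).asSubclass(Tool.class);
            Tool tool = toolClass.getDeclaredConstructor().newInstance();
            tool.setConf(conf);
            return tool;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create tool: " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.key", "example-value"); // hypothetical setting
        Tool tool = createToolByClassName(EchoTool.class.getName(), conf);
        System.out.println(tool.name());
    }
}
```

Wrapping the reflective failures in a single unchecked exception keeps callers free of checked-exception handling; whether the real factory does the same is not stated on this page.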
Uses of NutchTool in org.apache.nutch.tools
Subclasses of NutchTool in org.apache.nutch.tools
Modifier and Type    Class    Description
class    CommonCrawlDataDumper    The Common Crawl Data Dumper tool enables one to reverse generate the raw content from Nutch segment data directories into a common crawling data format, consumed by many applications.