| Interface | Description |
|---|---|
| AuthenticationCredentials | This interface describes immutable classes that represent authentication information for all kinds of authentication. |
| FormData | This interface describes the form data gleaned from an HTML page. |
| FormDataElement | This interface describes individual form data elements, for form submission. |
| IDiscoveredLinkHandler | This interface describes the functionality needed by a link extractor to note a discovered link. |
| IHTMLHandler | This interface describes the functionality needed by an HTML processor in order to handle an HTML document. |
| IMetaTagHandler | This interface describes the functionality needed by a parser to handle metadata tags. |
| IRedirectionHandler | This interface describes the functionality needed by a redirection processor in order to handle a redirection. |
| IThrottledConnection | This interface represents an established connection to a URL. |
| IXMLHandler | This interface describes the functionality needed by an XML processor in order to handle an XML document. |
| LoginCookies | This interface describes cookies obtained during sequential authentication. |
| LoginParameters | This interface describes login parameters to be used to submit a page during sequential authentication. |
| PageCredentials | This interface describes immutable classes that represent authentication information for page-based authentication. |
| SequenceCredentials | This interface describes immutable classes that represent authentication information for sequence-based authentication. |

| Class | Description |
|---|---|
| AbortChecker | This class furnishes an abort signal whenever the job activity says it should. |
| CookieManager | This class manages the database table into which we write cookies. |
| CookieManager.CookiesCacheClass | Cache class for robots. |
| CookieManager.CookiesDescription | This is the object description for a session key object. |
| CookieManager.CookiesExecutor | This is the executor object for locating cookies session objects. |
| CookieManager.DynamicCookieSet | This is a set of cookies, built dynamically. |
| CookieSet | This class represents a collection of cookies. |
| CredentialsDescription | This class describes credential information pulled from a configuration. |
| CredentialsDescription.BasicCredential | Basic type credentials |
| CredentialsDescription.CredentialsItem | Class representing an individual credential item. |
| CredentialsDescription.LoginParameterIterator | LoginParameter iterator |
| CredentialsDescription.NTLMCredential | NTLM-style credentials |
| CredentialsDescription.SessionCredential | Session credentials |
| CredentialsDescription.SessionCredentialItem | Session credential helper class |
| CredentialsDescription.SessionCredentialParameter | Session credential parameter class |
| DataCache | This class is a cache of a specific URL's data. |
| DataCache.DocumentData | This class represents everything we need to know about a document that's getting passed from the getDocumentVersions() phase to the processDocuments() phase. |
| DNSManager | This class manages the database table into which we write DNS entries for hosts. |
| DNSManager.DNSCacheClass | Cache class for robots. |
| DNSManager.DNSInfo | This is a cached data item. |
| DNSManager.HostDescription | This is the object description for a robots host object. |
| DNSManager.HostExecutor | This is the executor object for locating robots host objects. |
| FindContentHandler | This class is the handler for HTML content grepping during state transitions. |
| FindHandler | This class is used to discover links in a session login context. |
| FindHTMLFormHandler | This class is the handler for HTML form parsing during state transitions. |
| FindHTMLHrefHandler | This class is the handler for HTML parsing during state transitions. |
| FindPreferredRedirectionHandler | This class is the handler for redirection handling during state transitions. |
| FindRedirectionHandler | This class is the handler for redirection parsing during state transitions. |
| FormDataAccumulator | This class accumulates form data and allows overrides. |
| FormDataAccumulator.FormItemIterator | Iterator over FormItems |
| FormItem | This class provides an individual data item. |
| FormParseState | This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags. |
| LinkParseState | This class recognizes and interprets all links. |
| Messages | |
| MetaParseState | This class recognizes and interprets all meta tags. |
| RobotsManager | This class manages the database table into which we write robots.txt files for hosts. |
| RobotsManager.HostDescription | This is the object description for a robots host object. |
| RobotsManager.HostExecutor | This is the executor object for locating robots host objects. |
| RobotsManager.Record | This class represents a record in a robots.txt file. |
| RobotsManager.RobotsCacheClass | Cache class for robots. |
| RobotsManager.RobotsData | This is a cached data item. |
| ScriptParseState | This class interprets the tag stream generated by the HTMLParseState class, and causes script sections to be skipped. |
| ThrottleDescription | This class describes complex throttling criteria pulled from a configuration. |
| ThrottleDescription.ThrottleItem | Class representing an individual throttle item. |
| ThrottledFetcher | This class uses HttpClient to fetch content from web servers. |
| ThrottledFetcher.ConnectionPool | Each connection pool has identical connections we can draw on. |
| ThrottledFetcher.ConnectionPoolKey | Connection pool key |
| ThrottledFetcher.ExecuteMethodThread | This thread does the actual socket communication with the server. |
| ThrottledFetcher.LaxBrowserCompatSpecProvider | Class to create a cookie spec. |
| ThrottledFetcher.OurBasicCookieStore | |
| ThrottledFetcher.ThrottledConnection | Throttled connections. |
| ThrottledFetcher.ThrottledInputstream | This class throttles an input stream based on the specified byte rate parameters. |
| TrustsDescription | This class describes trust information pulled from a configuration. |
| TrustsDescription.TrustsItem | Class representing an individual credential item. |
| WebcrawlerConfig | Constants for the Webcrawler connector configuration. |
| WebcrawlerConnector | This is the Web Crawler implementation of the IRepositoryConnector interface. |
| WebcrawlerConnector.CanonicalizationPolicies | Class representing a list of canonicalization rules |
| WebcrawlerConnector.CanonicalizationPolicy | Class representing a URL regular expression match, for the purposes of determining canonicalization policy |
| WebcrawlerConnector.EvaluatorToken | Evaluator token. |
| WebcrawlerConnector.EvaluatorTokenStream | Token stream. |
| WebcrawlerConnector.FetchStatus | |
| WebcrawlerConnector.MappingRule | Class representing a mapping rule |
| WebcrawlerConnector.MappingRules | Class that represents all mappings |
| WebcrawlerConnector.NameValue | Name/value class |
| WebURL | Replacement class for java.net.URI, which is broken in many ways. |

| Exception | Description |
|---|---|
| ThrottledFetcher.PoolException | Pool exception class |
| ThrottledFetcher.WaitException | Wait exception class |
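
The ThrottledFetcher.ThrottledInputstream entry above describes throttling an input stream by byte rate. As a rough illustration of that general idea only, the following is a minimal sketch. It is not the connector's actual code: the class name `ThrottledStreamSketch`, the `bytesPerSecond` parameter, and the helper `requiredDelayMillis` are all hypothetical, and the real class may use different parameters and a different algorithm.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Hypothetical sketch; NOT the ManifoldCF ThrottledFetcher.ThrottledInputstream class.
class ThrottledStreamSketch extends FilterInputStream {
    private final double bytesPerSecond;            // assumed parameter: maximum average rate
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    ThrottledStreamSketch(InputStream in, double bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    // Milliseconds still to wait so that bytesRead bytes stay within the rate,
    // given that elapsedMillis have already passed since the stream was opened.
    static long requiredDelayMillis(long bytesRead, double bytesPerSecond, long elapsedMillis) {
        long earliestMillis = (long) Math.ceil(bytesRead * 1000.0 / bytesPerSecond);
        return Math.max(0L, earliestMillis - elapsedMillis);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pause(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pause(n);
        }
        return n;
    }

    // Record the bytes just read, then sleep until the running total is within the rate.
    private void pause(int justRead) throws IOException {
        bytesRead += justRead;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long delay = requiredDelayMillis(bytesRead, bytesPerSecond, elapsedMillis);
        if (delay > 0) {
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Read 200 in-memory bytes at an assumed limit of 1000 bytes per second.
        try (InputStream in = new ThrottledStreamSketch(new ByteArrayInputStream(new byte[200]), 1000.0)) {
            byte[] buf = new byte[64];
            long total = 0;
            int n;
            while ((n = in.read(buf, 0, buf.length)) > 0) {
                total += n;
            }
            System.out.println("bytes read: " + total);
        }
    }
}
```

The sketch delays after each read rather than before it, which keeps the arithmetic in one pure function that can be checked without any real waiting.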