Interface | Description |
---|---|
AuthenticationCredentials | This interface describes immutable classes that represent authentication information for all kinds of authentication. |
FormData | This interface describes the form data gleaned from an HTML page. |
FormDataElement | This interface describes individual form data elements, for form submission. |
IDiscoveredLinkHandler | This interface describes the functionality needed by a link extractor to note a discovered link. |
IHTMLHandler | This interface describes the functionality needed by an HTML processor in order to handle an HTML document. |
IMetaTagHandler | This interface describes the functionality needed by a parser to handle metadata tags. |
IRedirectionHandler | This interface describes the functionality needed by a redirection processor in order to handle a redirection. |
IThrottledConnection | This interface represents an established connection to a URL. |
IXMLHandler | This interface describes the functionality needed by an XML processor in order to handle an XML document. |
LoginCookies | This interface describes cookies obtained during sequential authentication. |
LoginParameters | This interface describes login parameters to be used to submit a page during sequential authentication. |
PageCredentials | This interface describes immutable classes that represent authentication information for page-based authentication. |
SequenceCredentials | This interface describes immutable classes that represent authentication information for sequence-based authentication. |
Class | Description |
---|---|
AbortChecker | This class furnishes an abort signal whenever the job activity says it should. |
CookieManager | This class manages the database table into which we write cookies. |
CookieManager.CookiesCacheClass | Cache class for cookies. |
CookieManager.CookiesDescription | This is the object description for a session key object. |
CookieManager.CookiesExecutor | This is the executor object for locating cookie session objects. |
CookieManager.DynamicCookieSet | This is a set of cookies, built dynamically. |
CookieSet | This class represents a collection of cookies. |
CredentialsDescription | This class describes credential information pulled from a configuration. |
CredentialsDescription.BasicCredential | Basic-type credentials. |
CredentialsDescription.CredentialsItem | Class representing an individual credential item. |
CredentialsDescription.LoginParameterIterator | LoginParameter iterator. |
CredentialsDescription.NTLMCredential | NTLM-style credentials. |
CredentialsDescription.SessionCredential | Session credentials. |
CredentialsDescription.SessionCredentialItem | Session credential helper class. |
CredentialsDescription.SessionCredentialParameter | Session credential parameter class. |
DataCache | This class is a cache of a specific URL's data. |
DataCache.DocumentData | This class represents everything we need to know about a document that is passed from the getDocumentVersions() phase to the processDocuments() phase. |
DNSManager | This class manages the database table into which we write DNS entries for hosts. |
DNSManager.DNSCacheClass | Cache class for DNS entries. |
DNSManager.DNSInfo | This is a cached data item. |
DNSManager.HostDescription | This is the object description for a DNS host object. |
DNSManager.HostExecutor | This is the executor object for locating DNS host objects. |
FindContentHandler | This class is the handler for HTML content grepping during state transitions. |
FindHandler | This class is used to discover links in a session login context. |
FindHTMLFormHandler | This class is the handler for HTML form parsing during state transitions. |
FindHTMLHrefHandler | This class is the handler for HTML parsing during state transitions. |
FindPreferredRedirectionHandler | This class is the handler for redirection handling during state transitions. |
FindRedirectionHandler | This class is the handler for redirection parsing during state transitions. |
FormDataAccumulator | This class accumulates form data and allows overrides. |
FormDataAccumulator.FormItemIterator | Iterator over FormItems. |
FormItem | This class provides an individual data item. |
FormParseState | This class interprets the tag stream generated by the BasicParseState class, and keeps track of the form tags. |
LinkParseState | This class recognizes and interprets all links. |
Messages | |
MetaParseState | This class recognizes and interprets all meta tags. |
RobotsManager | This class manages the database table into which we write robots.txt files for hosts. |
RobotsManager.HostDescription | This is the object description for a robots host object. |
RobotsManager.HostExecutor | This is the executor object for locating robots host objects. |
RobotsManager.Record | This class represents a record in a robots.txt file. |
RobotsManager.RobotsCacheClass | Cache class for robots. |
RobotsManager.RobotsData | This is a cached data item. |
ScriptParseState | This class interprets the tag stream generated by the HTMLParseState class, and causes script sections to be skipped. |
ThrottleDescription | This class describes complex throttling criteria pulled from a configuration. |
ThrottleDescription.ThrottleItem | Class representing an individual throttle item. |
ThrottledFetcher | This class uses httpclient to fetch content from web servers. |
ThrottledFetcher.ConnectionPool | Each connection pool has identical connections we can draw on. |
ThrottledFetcher.ConnectionPoolKey | Connection pool key. |
ThrottledFetcher.ExecuteMethodThread | This thread does the actual socket communication with the server. |
ThrottledFetcher.LaxBrowserCompatSpecProvider | Class to create a cookie spec. |
ThrottledFetcher.OurBasicCookieStore | |
ThrottledFetcher.ThrottledConnection | Throttled connections. |
ThrottledFetcher.ThrottledInputstream | This class throttles an input stream based on the specified byte rate parameters. |
TrustsDescription | This class describes trust information pulled from a configuration. |
TrustsDescription.TrustsItem | Class representing an individual trust item. |
WebcrawlerConfig | Constants for the Webcrawler connector configuration. |
WebcrawlerConnector | This is the Web Crawler implementation of the IRepositoryConnector interface. |
WebcrawlerConnector.CanonicalizationPolicies | Class representing a list of canonicalization rules. |
WebcrawlerConnector.CanonicalizationPolicy | Class representing a URL regular expression match, for the purposes of determining canonicalization policy. |
WebcrawlerConnector.EvaluatorToken | Evaluator token. |
WebcrawlerConnector.EvaluatorTokenStream | Token stream. |
WebcrawlerConnector.FetchStatus | |
WebcrawlerConnector.MappingRule | Class representing a mapping rule. |
WebcrawlerConnector.MappingRules | Class that represents all mappings. |
WebcrawlerConnector.NameValue | Name/value class. |
WebURL | Replacement class for java.net.URI, which is broken in many ways. |
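
The `ThrottledFetcher.ThrottledInputstream` entry describes throttling an input stream by byte rate. The sketch below illustrates that general idea only. The class name `RateLimitedInputStream`, its constructor parameter `maxBytesPerSecond`, and the sleep-based pacing are all assumptions made for illustration; they are not taken from the connector, whose class may work quite differently.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical illustration of byte-rate throttling.
 * This is NOT the ManifoldCF ThrottledInputstream class.
 */
class RateLimitedInputStream extends FilterInputStream {
    private final long maxBytesPerSecond; // assumed configuration parameter
    private final long startNanos;        // when pacing began
    private long totalBytes = 0;          // bytes delivered so far

    RateLimitedInputStream(InputStream in, long maxBytesPerSecond) {
        super(in);
        if (maxBytesPerSecond <= 0) {
            throw new IllegalArgumentException("maxBytesPerSecond must be positive");
        }
        this.maxBytesPerSecond = maxBytesPerSecond;
        this.startNanos = System.nanoTime();
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pace(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pace(n);
        }
        return n;
    }

    /** Sleeps until the average rate since construction is within the limit. */
    private void pace(int justRead) throws IOException {
        totalBytes += justRead;
        // Earliest time at which totalBytes may have been delivered.
        long dueNanos = startNanos + (long) (totalBytes * 1e9 / maxBytesPerSecond);
        long waitNanos = dueNanos - System.nanoTime();
        if (waitNanos > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(waitNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

Wrapping any `InputStream` in this class, for example `new RateLimitedInputStream(source, 1000)`, makes reads block long enough that the average delivery rate stays at or below the given limit. A production throttle would likely also need to coordinate limits across several connections, which this sketch does not attempt.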
Exception | Description |
---|---|
ThrottledFetcher.PoolException | Pool exception class. |
ThrottledFetcher.WaitException | Wait exception class. |