Class FindHTMLFormHandler
- java.lang.Object
  - org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
    - org.apache.manifoldcf.crawler.connectors.webcrawler.FindHTMLFormHandler
-
- All Implemented Interfaces:
IDiscoveredLinkHandler, IHTMLHandler, IMetaTagHandler
public class FindHTMLFormHandler extends FindHandler implements IHTMLHandler
This class is the handler for HTML form parsing during state transitions.
-
Field Summary
Fields
Modifier and Type | Field | Description
protected FormDataAccumulator | currentFormData |
protected FormDataAccumulator | discoveredFormData |
protected java.util.regex.Pattern | formNamePattern |
-
Fields inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
parentURI, targetURI
-
Constructor Summary
Constructors
Constructor | Description
FindHTMLFormHandler(java.lang.String parentURI, java.util.regex.Pattern formNamePattern) |
-
Method Summary
All Methods
Modifier and Type | Method | Description
void | applyFormOverrides(LoginParameters lp) |
void | finishUp() | Done with the document.
FormData | getFormData() |
void | noteAHREF(java.lang.String rawURL) | Note discovered href
void | noteBASEHREF(java.lang.String rawURL) | Note discovered base href
void | noteFormEnd() | Note the end of a form
void | noteFormInput(java.util.Map inputAttributes) | Note an input tag
void | noteFormStart(java.util.Map formAttributes) | Note the start of a form
void | noteFRAMESRC(java.lang.String rawURL) | Note discovered FRAME SRC
void | noteIMGSRC(java.lang.String rawURL) | Note discovered IMG SRC
void | noteLINKHREF(java.lang.String rawURL) | Note discovered href
void | noteMetaTag(java.util.Map metaAttributes) | Note a meta tag
void | noteTextCharacter(char textCharacter) | Note a character of text.
-
Methods inherited from class org.apache.manifoldcf.crawler.connectors.webcrawler.FindHandler
getTargetURI, noteDiscoveredBase, noteDiscoveredLink
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.manifoldcf.crawler.connectors.webcrawler.IDiscoveredLinkHandler
noteDiscoveredBase, noteDiscoveredLink
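-
The form-related methods in the summary (noteFormStart, noteFormInput, noteFormEnd, getFormData) suggest an event-style flow in which a parser reports tags and the handler accumulates the inputs of a form whose name matches formNamePattern. This page does not document that logic, so the following is only a minimal sketch under assumptions: the class name FormCallbackSketch, the use of plain maps instead of FormDataAccumulator and FormData, the "name" and "value" attribute keys, and the rule that the first matching form wins are all hypothetical and are not taken from ManifoldCF.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; this is NOT the ManifoldCF class.
class FormCallbackSketch {
    private final Pattern formNamePattern;
    // Inputs of the matching form currently being parsed, or null when outside one.
    private Map<String, String> current;
    // Inputs of the first matching form that was fully parsed, or null if none yet.
    private Map<String, String> discovered;

    FormCallbackSketch(Pattern formNamePattern) {
        this.formNamePattern = formNamePattern;
    }

    void noteFormStart(Map<String, String> formAttributes) {
        String name = formAttributes.getOrDefault("name", "");
        // Assumption: a form matches when the pattern occurs anywhere in its name.
        if (discovered == null && formNamePattern.matcher(name).find()) {
            current = new LinkedHashMap<>();
        }
    }

    void noteFormInput(Map<String, String> inputAttributes) {
        String name = inputAttributes.get("name");
        // Inputs outside a matching form, or without a name, are ignored.
        if (current != null && name != null) {
            current.put(name, inputAttributes.getOrDefault("value", ""));
        }
    }

    void noteFormEnd() {
        if (current != null) {
            discovered = current;
            current = null;
        }
    }

    // Returns the accumulated inputs, or null when no matching form was seen.
    Map<String, String> getFormData() {
        return discovered;
    }

    public static void main(String[] args) {
        FormCallbackSketch handler = new FormCallbackSketch(Pattern.compile("login"));
        Map<String, String> form = new LinkedHashMap<>();
        form.put("name", "loginForm");
        handler.noteFormStart(form);
        Map<String, String> input = new LinkedHashMap<>();
        input.put("name", "user");
        input.put("value", "alice");
        handler.noteFormInput(input);
        handler.noteFormEnd();
        System.out.println(handler.getFormData()); // prints {user=alice}
    }
}
```

Keeping separate "current" and "discovered" state mirrors the two FormDataAccumulator fields listed in the Field Summary, although how the real class uses them is an assumption here.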
-
Field Detail
-
formNamePattern
protected final java.util.regex.Pattern formNamePattern
-
discoveredFormData
protected FormDataAccumulator discoveredFormData
-
currentFormData
protected FormDataAccumulator currentFormData
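-
Because formNamePattern is a java.util.regex.Pattern, the result of testing a form name depends on which Matcher method is called. This page does not say which one FindHTMLFormHandler uses, so the sketch below only contrasts the two standard options; the class name FormNameMatchSketch and the sample form names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative helper, not part of ManifoldCF.
class FormNameMatchSketch {
    // Partial match: true if the pattern occurs anywhere in the form name.
    static boolean findsIn(Pattern pattern, String formName) {
        return pattern.matcher(formName).find();
    }

    // Full match: true only if the entire form name matches the pattern.
    static boolean matchesWhole(Pattern pattern, String formName) {
        return pattern.matcher(formName).matches();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("login", Pattern.CASE_INSENSITIVE);
        System.out.println(findsIn(pattern, "LoginForm"));      // true
        System.out.println(matchesWhole(pattern, "LoginForm")); // false
        System.out.println(matchesWhole(pattern, "LOGIN"));     // true
    }
}
```

If whole-name matching is wanted under find() semantics, anchoring the expression (for example "^login$") gives the same effect.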
-
Method Detail
-
applyFormOverrides
public void applyFormOverrides(LoginParameters lp) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
getFormData
public FormData getFormData()
-
noteTextCharacter
public void noteTextCharacter(char textCharacter) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a character of text. Structured this way to keep overhead low for handlers that don't use text.
- Specified by:
noteTextCharacter in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
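-
One reading of the note about overhead is that delivering text one character at a time lets a handler that ignores text do nothing at all, while a handler that needs text chooses its own buffering. The sketch below illustrates that reading only; CharSink, IgnoreText, CollectText, and CharSinkDemo are hypothetical names, and the real IHTMLHandler method also declares ManifoldCFException, which is omitted here to keep the example self-contained.

```java
// Hypothetical names; a sketch of the per-character callback style, not ManifoldCF code.
interface CharSink {
    void noteTextCharacter(char textCharacter);
}

// A handler that does not use text: the callback is a no-op and allocates nothing.
class IgnoreText implements CharSink {
    @Override
    public void noteTextCharacter(char textCharacter) {
        // Intentionally empty.
    }
}

// A handler that does use text: it decides for itself how to buffer characters.
class CollectText implements CharSink {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void noteTextCharacter(char textCharacter) {
        text.append(textCharacter);
    }

    String text() {
        return text.toString();
    }
}

class CharSinkDemo {
    // Plays the parser's role: pushes characters to the sink one at a time.
    static void feed(CharSink sink, String s) {
        for (int i = 0; i < s.length(); i++) {
            sink.noteTextCharacter(s.charAt(i));
        }
    }

    public static void main(String[] args) {
        CollectText collector = new CollectText();
        feed(collector, "Sign in");
        System.out.println(collector.text()); // prints Sign in
        feed(new IgnoreText(), "ignored");    // no output and no buffering
    }
}
```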
-
noteMetaTag
public void noteMetaTag(java.util.Map metaAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note a meta tag
- Specified by:
noteMetaTag in interface IMetaTagHandler
- Parameters:
metaAttributes - are the attributes that belong to the tag.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFormStart
public void noteFormStart(java.util.Map formAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the start of a form
- Specified by:
noteFormStart in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFormInput
public void noteFormInput(java.util.Map inputAttributes) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note an input tag
- Specified by:
noteFormInput in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFormEnd
public void noteFormEnd() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note the end of a form
- Specified by:
noteFormEnd in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteBASEHREF
public void noteBASEHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered base href
- Specified by:
noteBASEHREF in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteAHREF
public void noteAHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered href
- Specified by:
noteAHREF in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteLINKHREF
public void noteLINKHREF(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered href
- Specified by:
noteLINKHREF in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteIMGSRC
public void noteIMGSRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered IMG SRC
- Specified by:
noteIMGSRC in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteFRAMESRC
public void noteFRAMESRC(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Note discovered FRAME SRC
- Specified by:
noteFRAMESRC in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
finishUp
public void finishUp() throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Description copied from interface: IHTMLHandler
Done with the document.
- Specified by:
finishUp in interface IHTMLHandler
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-