Package net.bioclipse.managers
Class JSoupManager
java.lang.Object
net.bioclipse.managers.JSoupManager
- All Implemented Interfaces:
IBactingManager, net.bioclipse.managers.business.IBioclipseManager
Manager for JSoup functionality to parse HTML content.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
doi()
    Lists the DOIs of the articles associated to this manager.
org.jsoup.nodes.Document parse(String htmlFile)
    Parses a file with HTML content from the workspace into the JSoup Document.
org.jsoup.nodes.Document parseString(String htmlString)
    Parses a string with HTML content into the JSoup Document.
removeHTMLTags(String htmlString)
    Takes an HTML string and removes all tags.
org.jsoup.select.Elements select(Element doc, String cssSelector)
    Selects a subsection of the Document and returns it as an Elements object.
-
Constructor Details
-
JSoupManager
Creates a new JSoupManager.
- Parameters:
workspaceRoot - location of the workspace, e.g. "."
-
-
Method Details
-
parseString
Parses a string with HTML content into the JSoup Document.
- Parameters:
htmlString - the HTML as String
- Returns:
the HTML content as Document
-
parse
public org.jsoup.nodes.Document parse(String htmlFile) throws net.bioclipse.core.business.BioclipseException
Parses a file with HTML content from the workspace into the JSoup Document.
- Parameters:
htmlFile - the name of the HTML file in the workspace
- Returns:
the HTML content as Document
- Throws:
net.bioclipse.core.business.BioclipseException - when the file could not be read
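The constructor takes a workspace root and parse takes the name of a file inside that workspace. As a rough, JDK-only sketch of that lookup, assuming the file name is simply resolved against the root, the reading step might look as follows. WorkspaceReadSketch, readHtml, and SketchException are hypothetical names that are not part of this package; SketchException merely stands in for BioclipseException, and the sketch stops at the raw file content rather than building a JSoup Document.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not the JSoupManager implementation.
final class WorkspaceReadSketch {

    // Checked exception standing in for BioclipseException (assumption).
    static final class SketchException extends Exception {
        SketchException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final Path workspaceRoot;

    // workspaceRoot: location of the workspace, e.g. "."
    WorkspaceReadSketch(String workspaceRoot) {
        this.workspaceRoot = Paths.get(workspaceRoot);
    }

    // Resolves htmlFile against the workspace root and returns its content as UTF-8 text.
    String readHtml(String htmlFile) throws SketchException {
        Path file = workspaceRoot.resolve(htmlFile);
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Mirrors the documented behaviour: unreadable files surface as an exception.
            throw new SketchException("Could not read " + file, e);
        }
    }

    public static void main(String[] args) {
        WorkspaceReadSketch workspace = new WorkspaceReadSketch(".");
        try {
            String html = workspace.readHtml("index.html"); // file name chosen for illustration
            System.out.println("Read " + html.length() + " characters");
        } catch (SketchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Wrapping the low-level IOException in a single checked exception keeps callers from having to distinguish between a missing file and any other read failure, which matches the single Throws entry documented above.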
-
removeHTMLTags
Takes an HTML string and removes all tags.
- Parameters:
htmlString - the HTML as String
- Returns:
the text content of the HTML, with all tags removed
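To make the input and output of tag removal concrete, here is a rough JDK-only sketch. It is not the manager's implementation, which presumably delegates to JSoup; TagStripSketch and stripTags are hypothetical names, and the regular expression is a deliberately simple stand-in that does not decode entities or treat comments and script content specially.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in, not part of net.bioclipse.managers.
final class TagStripSketch {

    // Matches anything that looks like a tag: '<', any run of non-'>' characters, '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Returns the input with all tag-like spans removed; the text between tags is kept as-is.
    static String stripTags(String htmlString) {
        return TAG.matcher(htmlString).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b></p>")); // prints "Hello world"
    }
}
```

A real HTML parser is the safer choice for untrusted or malformed markup, which is the reason to prefer the manager's method over a pattern like this one.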
-
select
Selects a subsection of the Document and returns it as an Elements object.
- Parameters:
doc - JSoup document to select from as Element
cssSelector - String with a Cascading Style Sheets (CSS) selector instruction
- Returns:
the selected content
-
getManagerName
- Specified by:
getManagerName in interface net.bioclipse.managers.business.IBioclipseManager
-
doi
Description copied from interface: IBactingManager
Lists the DOIs of the articles associated to this manager.
- Specified by:
doi in interface IBactingManager
- Returns:
a List of String with DOIs
-