Class JSoupManager

java.lang.Object
net.bioclipse.managers.JSoupManager
All Implemented Interfaces:
IBactingManager, net.bioclipse.managers.business.IBioclipseManager

public class JSoupManager extends Object implements IBactingManager
Manager for JSoup functionality to parse HTML content.
  • Constructor Details

    • JSoupManager

      public JSoupManager(String workspaceRoot)
      Creates a new JSoupManager.
      Parameters:
      workspaceRoot - location of the workspace, e.g. "."
  • Method Details

    • parseString

      public org.jsoup.nodes.Document parseString(String htmlString)
      Parses a string of HTML content into a JSoup Document.
      Parameters:
      htmlString - the HTML as String
      Returns:
      the HTML content as Document
    • parse

      public org.jsoup.nodes.Document parse(String htmlFile) throws net.bioclipse.core.business.BioclipseException
      Parses a file with HTML content from the workspace into a JSoup Document.
      Parameters:
      htmlFile - the name of the HTML file in the workspace
      Returns:
      the HTML content as Document
      Throws:
      net.bioclipse.core.business.BioclipseException - when the file could not be read
    • removeHTMLTags

      public String removeHTMLTags(String htmlString)
      Takes an HTML string and removes all tags.
      Parameters:
      htmlString - the HTML as String
      Returns:
      the text content of the HTML, with all tags removed
    • select

      public org.jsoup.select.Elements select(org.jsoup.nodes.Element doc, String cssSelector)
      Selects the elements of the Document that match a CSS selector and returns them as an Elements object.
      Parameters:
      doc - the JSoup Document, or any other Element, to select from
      cssSelector - a CSS (Cascading Style Sheets) selector expression, e.g. "a[href]"
      Returns:
      the selected content
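
The methods documented so far can be combined as in the following minimal sketch. It assumes that the Bacting JSoup manager and the jsoup library are both on the classpath, and that select accepts the usual jsoup selector syntax. The class JSoupManagerExample, the helper linkTargets, and the sample HTML are hypothetical names and data introduced only for illustration.

```java
import net.bioclipse.managers.JSoupManager;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class JSoupManagerExample {

    // Hypothetical helper: collects the href attribute of every link in the given HTML.
    static List<String> linkTargets(String html) {
        // "." is the example workspace root given in the constructor documentation.
        JSoupManager jsoup = new JSoupManager(".");

        // Parse the HTML string, then select all <a> elements that carry an href attribute.
        Document doc = jsoup.parseString(html);
        Elements links = jsoup.select(doc, "a[href]");

        List<String> targets = new ArrayList<>();
        for (Element link : links) {
            targets.add(link.attr("href"));
        }
        return targets;
    }

    public static void main(String[] args) {
        // Sample input, made up for this sketch.
        String html = "<ul>"
            + "<li><a href=\"https://example.org/a\">First</a></li>"
            + "<li><a href=\"https://example.org/b\">Second</a></li>"
            + "<li>No link here</li>"
            + "</ul>";

        // Print the link targets found by the selector.
        System.out.println(linkTargets(html));

        // Print the same markup with all tags removed.
        System.out.println(new JSoupManager(".").removeHTMLTags(html));
    }
}
```

Because select takes an Element, the same call also works on a single element taken from an earlier selection, which allows a query to be narrowed step by step.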
    • getManagerName

      public String getManagerName()
      Specified by:
      getManagerName in interface net.bioclipse.managers.business.IBioclipseManager
    • doi

      public List<String> doi()
      Description copied from interface: IBactingManager
      Lists the DOIs of the articles associated with this manager.
      Specified by:
      doi in interface IBactingManager
      Returns:
      a List of DOI strings
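
Since doi() returns bare DOI strings, a caller that wants clickable references could prefix each one with the doi.org resolver address. The sketch below uses only the Java standard library. The class DoiLinks and the helper toUrls are hypothetical, and the DOI in main is a placeholder rather than a value reported by this manager.

```java
import java.util.ArrayList;
import java.util.List;

public class DoiLinks {

    // Hypothetical helper: prefixes each DOI with the doi.org resolver address.
    static List<String> toUrls(List<String> dois) {
        List<String> urls = new ArrayList<>();
        for (String doi : dois) {
            urls.add("https://doi.org/" + doi);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Placeholder input; in practice this would be the list returned by doi().
        List<String> dois = List.of("10.1000/182");
        for (String url : toUrls(dois)) {
            System.out.println(url);
        }
    }
}
```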