net.datacrow.util
Class HtmlUtils
java.lang.Object
net.datacrow.util.HtmlUtils
public class HtmlUtils
extends java.lang.Object
Method Summary
static org.w3c.dom.Document getDocument(java.lang.String html)
static org.w3c.dom.Document getDocument(java.net.URL url, int cleanupLevel)
static org.w3c.dom.Document getDocument(java.net.URL url, java.lang.String charset)
static org.w3c.dom.Document getDocument(java.net.URL url, java.lang.String charset, int cleanupLevel)
static java.lang.String getHtmlCleaned(java.net.URL url, java.lang.String charset, int cleanupLevel)
static java.lang.String toPlainText(java.lang.String html)
static java.lang.String toPlainText(java.lang.String html, java.lang.String charset)
    Clean the string of any unwanted characters
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HtmlUtils
public HtmlUtils()
getDocument
public static org.w3c.dom.Document getDocument(java.net.URL url,
int cleanupLevel)
throws java.lang.Exception
- Throws:
java.lang.Exception
getDocument
public static org.w3c.dom.Document getDocument(java.net.URL url,
java.lang.String charset)
throws java.lang.Exception
- Throws:
java.lang.Exception
getDocument
public static org.w3c.dom.Document getDocument(java.net.URL url,
java.lang.String charset,
int cleanupLevel)
throws java.lang.Exception
- Throws:
java.lang.Exception
getDocument
public static org.w3c.dom.Document getDocument(java.lang.String html)
throws java.lang.Exception
- Throws:
java.lang.Exception
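Every getDocument overload returns a standard org.w3c.dom.Document, so the result can be queried with the JDK's XPath API. The following is a minimal sketch under stated assumptions, not Data Crow code: DocumentQuerySketch, parseWellFormed and selectText are hypothetical names, and the JDK's DocumentBuilder stands in for HtmlUtils.getDocument(String) so the example runs without Data Crow on the classpath. The JDK parser accepts only well-formed markup, so real-world HTML would probably still need to go through HtmlUtils.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DocumentQuerySketch {

    // Hypothetical stand-in for HtmlUtils.getDocument(String).
    // Assumption: the input is well-formed (XHTML-like) markup.
    static Document parseWellFormed(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    // Returns the trimmed text content of every node matched by the XPath expression.
    static List<String> selectText(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>";
        Document doc = parseWellFormed(page);
        System.out.println(selectText(doc, "//p")); // prints [First, Second]
    }
}
```

With Data Crow available, the call to parseWellFormed would presumably be replaced by HtmlUtils.getDocument(html). That method declares throws java.lang.Exception, so the caller has to catch or propagate it.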
getHtmlCleaned
public static java.lang.String getHtmlCleaned(java.net.URL url,
java.lang.String charset,
int cleanupLevel)
throws java.lang.Exception
- Throws:
java.lang.Exception
toPlainText
public static java.lang.String toPlainText(java.lang.String html)
toPlainText
public static java.lang.String toPlainText(java.lang.String html,
java.lang.String charset)
- Clean the string of any unwanted characters
- Parameters:
html - string to clean
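The page does not say which characters toPlainText treats as unwanted. The sketch below illustrates one plausible reading (drop markup, decode a few common entities, collapse whitespace). It is an assumption for illustration only and not Data Crow's implementation. PlainTextSketch and stripTags are hypothetical names, and the charset parameter is not modelled because its role is not documented here.

```java
class PlainTextSketch {

    // Hypothetical helper: reduces an HTML fragment to plain text.
    // Assumption: "unwanted characters" means tags, entities and extra whitespace.
    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        // Remove script and style elements together with their contents.
        String text = html.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        // Replace every remaining tag with a space so adjacent words stay separated.
        text = text.replaceAll("(?s)<[^>]+>", " ");
        // Decode a handful of common entities; &amp; goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        // Collapse runs of whitespace and trim the ends.
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b> &amp; co</p>")); // prints Hello world & co
    }
}
```

A regex-based approach like this is fragile on malformed markup. Real code would more likely extract text from the parsed org.w3c.dom.Document returned by getDocument.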