04-25-2010 08:50 AM
Hello everyone! I'm trying to parse HTML tags.
I found this example showing how to parse XML:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(conn.openInputStream());
DocumentBuilder seems to parse only XML, because when I try it with HTML I get an error at this line:
Document doc = docBuilder.parse(conn.openInputStream()); // error here; it works fine with XML
So if anyone knows how to parse HTML tags on BB, please help!
Thanks in advance
04-25-2010 09:39 AM
And the error is?
How do you create the input stream?
That said, HTML does not have to be as strict as XML. For example, many people write <p> but never write the closing </p> tag. That makes an XML parser fail with a parsing exception, whereas most browsers cope with it.
So if you are going to be parsing HTML over which you have no control, then I think you will have to do it yourself. If you are going to be parsing HTML that you do control and that you know will be well formed, then the XML parser may work for you.
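To see the difference for yourself, here is a minimal sketch that feeds two snippets to an XML DocumentBuilder and reports whether each parses. It is only an illustration under some assumptions: it uses the standard Java SE javax.xml.parsers classes and an in-memory stream rather than your connection, and WellFormedCheck and isWellFormed are made-up names. On the BlackBerry the parser classes live in a different package and the exception details may differ, so treat it as a desktop demonstration of the idea.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any BlackBerry or JDK API.
class WellFormedCheck {

    // Returns true if the markup parses as XML, false if the parser rejects it.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                    markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Thrown when the input is not well-formed XML, e.g. an unclosed <p>.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Every tag is closed, so the XML parser accepts it.
        System.out.println(isWellFormed(
                "<html><body><p>one</p><p>two</p></body></html>"));
        // The <p> tags are never closed: fine for a browser, fatal for an XML parser.
        System.out.println(isWellFormed(
                "<html><body><p>one<p>two</body></html>"));
    }
}
```

If a check like this returns false for the pages you need to read, the XML parser is not an option for them and you are back to handling the markup yourself.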