Welcome to the official BlackBerry Support Community Forums.

Java Development

Developer
sandeepkumar03
Posts: 117
Registered: ‎02-12-2009
My Device: Not Specified
Accepted Solution

Issue with XML parsing with html content as part of node

Hi All,

 

I am facing an issue with XML parsing when a node has HTML content; if the node has no HTML content, parsing works fine. Similar parsing logic works in standard Java but not with the BlackBerry APIs. I am required not to strip the HTML content before parsing, because it will be displayed in a browser field.

 

    // XML whose <disclaimer> node carries entity-escaped HTML
    String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" +
            "<disclaimer>&lt;p&gt; Dummy content &lt;/p&gt; " +
            "</disclaimer>" +
            "</root>";
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder;
    Document doc;
    InputStream bis = new ByteArrayInputStream(xml.getBytes("UTF-8"));

    dBuilder = dbFactory.newDocumentBuilder();

    doc = dBuilder.parse(bis);
    doc.getDocumentElement().normalize();

    NodeList nList = doc.getElementsByTagName("disclaimer");

    Element activeTagElmnt = (Element) nList.item(0);
    NodeList activeTag = activeTagElmnt.getChildNodes();

    // Read the value of the first child node only
    String tagValue = ((Node) activeTag.item(0)).getNodeValue();
    System.out.println(tagValue); // This prints "<" with HTML content. If I remove the HTML then "Dummy content" is printed

If I have to remove the HTML content, I would then need to reformat the content myself to match the HTML, for example by inserting a line break wherever the HTML had one.

 

Any suggestions would help.

 

Thank you,

Sandeep

Developer
peter_strange
Posts: 19,601
Registered: ‎07-14-2008
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

What issue are you facing? 

 

If you are getting errors indicating that the HTML is not well-formed XML, that is perfectly normal and to be expected. HTML is not as rigorous as XML.

 

Can you give us a sample of the html you are trying to parse?  Does not need to be real data, just small sample data that works for Standard Java and does not work on BlackBerry. 

Developer
sandeepkumar03
Posts: 117
Registered: ‎02-12-2009
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

Thanks Peter for your reply,

 

I am not getting any exception. It's just that when HTML content is present in the node, the full value of the content is not printed. Below is the sample XML.

 

String xml="<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" +
"<disclaimer>&lt;p&gt; Dummy content &lt;/p&gt; " +
"</disclaimer>" +
"</root>";

Thanks and regards,

Sandeep

Developer
peter_strange
Posts: 19,601
Registered: ‎07-14-2008
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

There is no html in that sample.  So I'm guessing it works correctly.  Can we have one with html that fails?

Developer
sandeepkumar03
Posts: 117
Registered: ‎02-12-2009
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

Hi Peter,

 

The HTML is present in the XML string, but its angle brackets are escaped as &lt; and &gt;

 

String xml_html="<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" +
"<disclaimer>&lt;p&gt; Dummy content &lt;/p&gt;" +
"</disclaimer>" +
"</root>";

String xml_no_html="<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" +
"<disclaimer>Dummy content" +
"</disclaimer>" +
"</root>";

 

When the XML is edited, the HTML markup is escaped with these entities. I expected that parsing the disclaimer node in xml_html would give &lt;p&gt; Dummy content &lt;/p&gt;, but it prints just "<". If we parse xml_no_html instead, it parses correctly and prints "Dummy content".
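
One possible explanation, not confirmed anywhere in this thread, is that the BlackBerry parser hands back the node's text in several pieces, split at each entity reference, so that item(0) only holds the first piece ("<"). Under that assumption, a minimal sketch of a workaround is to concatenate every text child instead of reading only the first one. The class and method names (DisclaimerText, collectText, firstElementText) are made up for illustration, and the imports are the Java SE ones, which would need to be swapped for the BlackBerry equivalents.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DisclaimerText {

    /** Concatenates every text and CDATA child of the element, not just the first one. */
    static String collectText(Element element) {
        StringBuffer text = new StringBuffer();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    /** Parses the XML and returns the full text of the first element with the given tag name. */
    static String firstElementText(String xml, String tagName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return collectText((Element) doc.getElementsByTagName(tagName).item(0));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML: " + e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>"
                + "<disclaimer>&lt;p&gt; Dummy content &lt;/p&gt;</disclaimer></root>";
        System.out.println(firstElementText(xml, "disclaimer"));
    }
}
```

On standard Java this prints <p> Dummy content </p>. Whether it behaves the same on a device depends on the assumption above about how the BlackBerry parser splits text nodes.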

 

Thanks and regards,

Sandeep

Developer
kamal_nigam
Posts: 434
Registered: ‎07-23-2012
My Device: Blackberry 10
My Carrier: Orange

Re: Issue with XML parsing with html content as part of node

I feel the XML parser is mature enough to handle these characters. If you are still getting the error, explore CDATA; it might help you.

Thanks
-------------------------------------------------------------------------------------
Press the Accept as solution Button when u got the Solution
Press Kudo to say thank to developer.
-------------------------------------------------------------------------------------.
Developer
kamal_nigam
Posts: 434
Registered: ‎07-23-2012
My Device: Blackberry 10
My Carrier: Orange

Re: Issue with XML parsing with html content as part of node

But before that, check that your XML is well formed and valid.
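
A rough way to run the well-formedness part of that check is to parse the string and see whether the parser rejects it. This is only a sketch against the Java SE parser: the class name WellFormedCheck and the method isWellFormed are invented for illustration, and it does not check validity, which would need a DTD or schema that this thread does not mention.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    /** Returns true if the parser accepts the string as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document, e.g. a mismatched or unclosed tag.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable or input unreadable: " + e);
        }
    }

    public static void main(String[] args) {
        // Escaped HTML inside the node is still well-formed XML.
        System.out.println(isWellFormed(
                "<root><disclaimer>&lt;p&gt; Dummy content &lt;/p&gt;</disclaimer></root>"));
        // Raw HTML with an unclosed <br> is not.
        System.out.println(isWellFormed(
                "<root><disclaimer><p>Dummy content<br></p></disclaimer></root>"));
    }
}
```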

Thanks
Developer
peter_strange
Posts: 19,601
Registered: ‎07-14-2008
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

Apologies, the symbols you are describing as HTML are in fact also XML, so I didn't realise that you meant those.

 

However, I think you may need to encode the HTML string before adding it to the XML string.

 

I think the value of the disclaimer node as currently specified will be returned as

<p> Dummy content </p>

 

If you expect to get

&lt;p&gt; Dummy content &lt;/p&gt;

 

then you need to pass in, to your XML parser, something like:

&amp;lt;p&amp;gt; Dummy content &amp;lt;/p&amp;gt;

 

Hopefully you will get a chance to test this before I do, if not, I will try to test it later. 
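
A quick way to test this on standard Java is sketched below. It assumes the Java SE DOM, where Node.getTextContent() is available (that may not hold for the BlackBerry API), and the names EntityDecodingDemo and disclaimerText are invented for illustration. It compares what the parser returns for the single-encoded and the double-encoded forms.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EntityDecodingDemo {

    /** Wraps the given body in <root><disclaimer>...</disclaimer></root> and returns the parsed text. */
    static String disclaimerText(String disclaimerBody) {
        String xml = "<root><disclaimer>" + disclaimerBody + "</disclaimer></root>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return doc.getElementsByTagName("disclaimer").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML: " + e);
        }
    }

    public static void main(String[] args) {
        // Single-encoded: the parser decodes the entities back to real angle brackets.
        System.out.println(disclaimerText("&lt;p&gt; Dummy content &lt;/p&gt;"));
        // Double-encoded: one level of decoding leaves the entity text in place.
        System.out.println(disclaimerText("&amp;lt;p&amp;gt; Dummy content &amp;lt;/p&amp;gt;"));
    }
}
```

On standard Java the first call returns <p> Dummy content </p> and the second returns &lt;p&gt; Dummy content &lt;/p&gt;.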

 

Developer
sandeepkumar03
Posts: 117
Registered: ‎02-12-2009
My Device: Not Specified

Re: Issue with XML parsing with html content as part of node

Thanks Peter, Kamal for your replies.

 

Peter, I tried the encoding and it prints "&". It seems Kamal's pointer parses correctly: if we wrap the node's content in a CDATA section, the BB APIs parse it properly. If I use

 

String xml="<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" +
    "<disclaimer><![CDATA[&lt;p&gt; Dummy content &lt;/p&gt;]]>" +
    "</disclaimer>" +
    "</root>";

 

It works. The same content is also used by other teams. I hope it will not impact their parsing if we ask the XML content provider to wrap the content in CDATA. In standard Java this was working without the CDATA section.
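
On the question of impact for other consumers, one thing worth checking is that text inside a CDATA section is not entity-decoded. A consumer on standard Java would therefore see the literal &lt;p&gt; if the provider wraps already-escaped content, but the same <p> as before if the provider wraps the raw HTML. The sketch below illustrates this with the Java SE DOM; CdataDemo, wrapInCdata and disclaimerText are hypothetical names, and the "]]>" splitting is the usual trick for content that itself contains the CDATA terminator.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class CdataDemo {

    /** Wraps text in a CDATA section, splitting any "]]>" so it cannot end the section early. */
    static String wrapInCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    /** Wraps the given body in <root><disclaimer>...</disclaimer></root> and returns the parsed text. */
    static String disclaimerText(String disclaimerBody) {
        String xml = "<root><disclaimer>" + disclaimerBody + "</disclaimer></root>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            // getTextContent() concatenates text and CDATA children of the element.
            return doc.getElementsByTagName("disclaimer").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML: " + e);
        }
    }

    public static void main(String[] args) {
        // Escaped HTML, no CDATA: entities are decoded.
        System.out.println(disclaimerText("&lt;p&gt; Dummy content &lt;/p&gt;"));
        // Raw HTML inside CDATA: same text as the line above.
        System.out.println(disclaimerText(wrapInCdata("<p> Dummy content </p>")));
        // Escaped HTML inside CDATA: entities stay literal.
        System.out.println(disclaimerText(wrapInCdata("&lt;p&gt; Dummy content &lt;/p&gt;")));
    }
}
```

So if the other teams rely on getting real angle brackets back, it may be safer to ask the provider for raw HTML inside the CDATA section rather than escaped HTML inside it.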

 

Thank you.
