12-08-2009 07:50 AM
I am using the W3C DOM parser and noticed that the getNodeValue() method on an element object does not seem to be able to parse & correctly. The XML used is as follows:
<description><SPAN style="FONT-SIZE: 9pt; LINE-HEIGHT: 120%; FONT-FAMILY: Times-Roman; LETTER-SPACING: 0.05pt; mso-bidi-font-size: 12.0pt; mso-bidi-font-family: Times-Roman; mso-font-width: 103%"><FONT face="Times New Roman">The most common concerns for many countries in the West Asia and North Africa (WANA) region are related to resources and the environment. Whether it is oil, water, energy, human resources or climate change most of the countries suffer from mismanagement or the lack of such resources or even both.<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p></FONT></SPAN></description>
The result returned by getNodeValue() was "<". It is truncating the string on seeing the "<".
What causes this?
If anyone knows why, please reply.
12-08-2009 08:36 AM
Sorry, I'm not familiar with the DOM model (I generally use SAX), but in this case, in XML terms, the <description> element has no text directly associated with it. It just has an enclosed SPAN element, which itself has an enclosed FONT element. So if you want the text, you will have to ask for the FONT text.
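For illustration only, here is a minimal sketch of that structure in DOM terms. It assumes the standard JAXP classes (javax.xml.parsers and org.w3c.dom) rather than the BlackBerry-specific parser packages, which may behave differently, and it uses a cut-down stand-in for the posted XML. The class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ElementValueDemo {
    // Simplified stand-in for the posted XML (attributes dropped for brevity).
    static final String XML =
        "<description><SPAN><FONT>The most common concerns</FONT></SPAN></description>";

    // Assumption: standard JAXP parser; the BlackBerry API may differ.
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element description = parseRoot(XML);
        // An element node has no value of its own in the DOM.
        System.out.println(description.getNodeValue()); // prints null
        Node span = description.getFirstChild();        // the SPAN element
        Node font = span.getFirstChild();               // the FONT element
        Node text = font.getFirstChild();               // the text node inside FONT
        System.out.println(text.getNodeValue());        // prints The most common concerns
    }
}
```

The point is only that the text hangs off a text node under FONT, not off <description> itself.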
Have I missed the point?
12-11-2009 06:24 AM - edited 12-11-2009 06:56 AM
The first thing I would look at is changing the process that generates this so that it doesn't include the formatting stuff in a document that you are sending to a BlackBerry to be interpreted as text. That is the easiest and best option.
As noted, I am not familiar with the DOM parser. But if you know that there will be FONT and SPAN elements, then it would surprise me if you can't go down the element tree to find the details you want.
Re getting the entire element, sorry, once it has been parsed, I have no idea. Hopefully someone else has.
Edit: But when you talk about getting the whole element back, you don't actually mean the whole element; you mean the text that exists between your <description> and </description> tags, don't you?
12-11-2009 06:36 AM
Is the XML fragment you posted actually correct? Why is the xml tag in the middle of the document, and not at the start? For which node are you getting "<" back as the value?
12-16-2009 12:17 AM
Thanks, peter_strange, for your concern,
but I can't change the process that generates it. I am getting this from the web service of a news channel, and the FONT and SPAN elements come only with this news channel, and even then not always. So I can't rely on going down to get the text in between these elements.
So what I can try, or what could help me, is to get whatever is inside the description tag.
Thanks once again...
12-16-2009 04:01 AM
As noted, I don't use the DOM parser at all; I use SAX. If I had to parse what you might or might not get using that parser, I would code some intelligence into the processing of the description tag: set a state variable when the description tag starts, so that the processing knows it is inside a description. Then, in the FONT and SPAN tag processing, I would check this state and, if inside a description, attempt to extract the text associated with that tag.
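A rough sketch of that state-variable idea, assuming the standard javax.xml.parsers SAX classes (the BlackBerry packages may differ) and a non-namespace-aware parser, so that qName is the plain tag name. The class name DescriptionTextHandler and the helper extract() are made up for the example.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DescriptionTextHandler extends DefaultHandler {
    // State variable: true while the parser is inside <description>.
    private boolean inDescription = false;
    private final StringBuffer text = new StringBuffer();

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("description".equals(qName)) {
            inDescription = true;
        }
    }

    public void endElement(String uri, String localName, String qName) {
        if ("description".equals(qName)) {
            inDescription = false;
        }
    }

    // Called for text in any element, so FONT, SPAN or no wrapper at all
    // are handled the same way as long as the flag is set.
    public void characters(char[] ch, int start, int length) {
        if (inDescription) {
            text.append(ch, start, length);
        }
    }

    String getText() {
        return text.toString();
    }

    // Hypothetical helper: parse a string and return the description text.
    static String extract(String xml) throws Exception {
        DescriptionTextHandler handler = new DescriptionTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), handler);
        return handler.getText();
    }
}
```

Checking the flag in characters() rather than per tag means the code does not need to know which wrapper elements a given feed uses. It does assume <description> elements are not nested inside each other.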
I presume you can do the same sort of thing with the DOM model, i.e. look for some text in the description, if you don't find it there look at the enclosed FONT and SPAN elements (or whatever other elements you find).
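Something along these lines might work on the DOM side. It is a sketch under assumptions: it uses the standard JAXP and org.w3c.dom classes, not the BlackBerry ones, and the names DescriptionText, collectText and descriptionText are invented here. Instead of looking for FONT or SPAN by name, it walks everything under <description> and concatenates every text node it finds.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class DescriptionText {
    // Appends the value of every text or CDATA node below 'node', in document order.
    static void collectText(Node node, StringBuffer out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                collectText(child, out); // descend into SPAN, FONT or anything else
            }
        }
    }

    // Hypothetical helper: returns the text under the first <description>, or "" if none.
    static String descriptionText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        Node description = doc.getElementsByTagName("description").item(0);
        StringBuffer out = new StringBuffer();
        if (description != null) {
            collectText(description, out);
        }
        return out.toString();
    }
}
```

Because it concatenates all text nodes rather than reading only the first child, the same loop should also cope with a parser that delivers the content as several adjacent text nodes.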
Can't think of anything better than that, sorry.