Java Development

Developer
Posts: 1,415
Registered: ‎07-30-2008
My Device: Not Specified

want to extract all the links from an html file - easiest way? XMLParser?

Hi,

I've got a BrowserField that I can load from faked HTTP connections using local data. I have a thread that downloads various pieces of HTML and keeps them cached in anticipation of later need. While I could create a new RenderingSession and RenderingApplication to fetch all the required resources, I thought it would be quicker to simply parse the HTML and extract the required links. Is there an easy way to do that on the phone? I guess I could create a page on a server to handle the request, with some server-side logic to feed me the URLs, but it seems it should be possible to do on the phone, and I have XML/SOAP/RSS applications contemplated that I would like to integrate using a similar concept.

Thanks.

BlackBerry Development Advisor
Posts: 15,784
Registered: ‎07-09-2008
My Device: BlackBerry PRIV
My Carrier: Bell

Re: want to extract all the links from an html file - easiest way? XMLParser?

There are no APIs in the BlackBerry API set that apply specifically to HTML parsing, but there are XML parsers.

You can also open the browser with raw HTML. The following link explains how this can be done.

How To - Invoke the browser with raw HTML
Article Number: DB-00573

http://www.blackberry.com/knowledgecenterpublic/livelink.exe/fetch/2000/348583/800332/800440/How_To_...

Mark Sohm
BlackBerry Development Advisor

Developer
Posts: 1,415
Registered: ‎07-30-2008
My Device: Not Specified

Re: want to extract all the links from an html file - easiest way? XMLParser?

Thanks, but that article doesn't help, because I want this to work on 4.x and I just want to cache all the images during dead time for rendering later.

I'll see what I can do with either XML or brute-force code.

Developer
Posts: 31
Registered: ‎07-22-2008
My Device: Not Specified

Re: want to extract all the links from an html file - easiest way? XMLParser?

Hi marchywka,

I've found from experience that this sort of problem is a good fit for a very simple "look for the <tagname characters" type of parser. Anything else is going to incur the cost of DOM creation, plus all the extra nodes that you don't care about.

 

Not much of an answer, I know, but that's what I always end up doing.
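
To make the "look for the <tagname characters" idea concrete, here is a minimal sketch of such a scanner. It is an illustration only, not code from this thread: the class name LinkScanner and the choice of collecting href and src values are assumptions, and it is written as plain Java with java.util.List, so on a 4.x device you would probably swap in Vector. It does not handle comments, script bodies, or a '>' inside a quoted attribute value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal scanner: collects href/src values without building a DOM. */
final class LinkScanner {

    // Attributes treated as links here (an assumption; extend as needed).
    private static final String[] ATTRS = { "href", "src" };

    /** Returns href/src attribute values in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<String>();
        int pos = 0;
        while ((pos = html.indexOf('<', pos)) >= 0) {
            int end = html.indexOf('>', pos);
            if (end < 0) {
                end = html.length(); // unterminated tag: scan to end of input
            }
            for (int i = 0; i < ATTRS.length; i++) {
                String value = attributeValue(html, ATTRS[i], pos, end);
                if (value != null && value.length() > 0) {
                    links.add(value);
                }
            }
            pos = end; // continue after this tag
        }
        return links;
    }

    /** Finds name=value inside the tag spanning [start, end), or returns null. */
    private static String attributeValue(String html, String name, int start, int end) {
        for (int i = start + 1; i + name.length() < end; i++) {
            // Require leading whitespace so "data-src" does not match "src".
            if (!isSpace(html.charAt(i - 1))
                    || !html.regionMatches(true, i, name, 0, name.length())) {
                continue;
            }
            int j = skipSpaces(html, i + name.length(), end);
            if (j >= end || html.charAt(j) != '=') {
                continue;
            }
            j = skipSpaces(html, j + 1, end);
            if (j >= end) {
                return null;
            }
            char quote = html.charAt(j);
            if (quote == '"' || quote == '\'') {
                int close = html.indexOf(quote, j + 1);
                return (close < 0 || close > end) ? null : html.substring(j + 1, close);
            }
            int k = j; // unquoted value: runs to the next space or the tag end
            while (k < end && !isSpace(html.charAt(k))) {
                k++;
            }
            return html.substring(j, k);
        }
        return null;
    }

    private static int skipSpaces(String s, int from, int end) {
        while (from < end && isSpace(s.charAt(from))) {
            from++;
        }
        return from;
    }

    private static boolean isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }
}
```

Calling LinkScanner.extractLinks(html) on a cached page gives the raw attribute values; relative URLs would still need to be resolved against the page's own URL before fetching.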

 

Jimmy

Developer
Posts: 19,636
Registered: ‎07-14-2008
My Device: Not Specified

Re: want to extract all the links from an html file - easiest way? XMLParser?

I agree completely with jjthrash. Also, when I last looked, there were a number of lightweight HTML parsers around that you might be able to hook into. However, the point I really wanted to make is that I definitely would not use any XML parser on HTML: it will barf on any 'invalid' XML (e.g. a <p> without a closing </p>), and I don't think I have found a single HTML page that is not invalid in some way.
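
The unclosed <p> case is easy to reproduce. The sketch below is an illustration, not code from this thread: it uses the standard javax.xml.parsers DOM API from desktop Java, and the class name StrictXmlCheck is made up. The parser classes available on a given device may differ, but any conforming XML parser has to reject markup that is not well-formed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reports whether markup is accepted by a strict XML parser. */
final class StrictXmlCheck {

    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed XML
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }
}
```

With this helper, "<html><body><p>one</p></body></html>" is accepted, while "<html><body><p>one<p>two</body></html>", which most browsers render without complaint, is rejected.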