New Developer
Posts: 19
Registered: ‎12-16-2009
My Device: 8310

how to analyze a html web, sort and get related tag and content.

How can we analyze an HTML page, specifically Wikipedia content? The page has a 'Did you know' section with 10 items, and we want to extract and sort them. How should we do this?

 

Developer
Posts: 16,987
Registered: ‎07-29-2008
My Device: Z10 LE, Z30, Passport
My Carrier: O2 Germany

Re: how to analyze a html web, sort and get related tag and content.

If the section has a fixed name, you could download the page, extract this section, and then analyze it in detail.
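
A minimal sketch of the "extract this section, then analyze it" step, using only the C++ standard library on text that has already been downloaded. The helper names `extractBetween` and `sortedListItems` are hypothetical, and the sketch assumes the section is delimited by fixed marker strings and that its items are plain `<li>` tags without attributes or nested lists. Real Wikipedia markup may need different markers.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the text between the first `startMarker` and the next `endMarker`,
// or an empty string if either marker is missing.
std::string extractBetween(const std::string& text,
                           const std::string& startMarker,
                           const std::string& endMarker) {
    std::size_t begin = text.find(startMarker);
    if (begin == std::string::npos) return "";
    begin += startMarker.size();
    std::size_t end = text.find(endMarker, begin);
    if (end == std::string::npos) return "";
    return text.substr(begin, end - begin);
}

// Collects the contents of every <li>...</li> pair in `section` and sorts them.
// Assumption: plain "<li>" tags without attributes and no nested lists.
std::vector<std::string> sortedListItems(const std::string& section) {
    const std::string open = "<li>";
    const std::string close = "</li>";
    std::vector<std::string> items;
    std::size_t pos = 0;
    while (true) {
        std::size_t begin = section.find(open, pos);
        if (begin == std::string::npos) break;
        begin += open.size();
        std::size_t end = section.find(close, begin);
        if (end == std::string::npos) break;
        items.push_back(section.substr(begin, end - begin));
        pos = end + close.size();
    }
    std::sort(items.begin(), items.end());
    return items;
}
```

For example, calling `extractBetween(page, "<h2>Did you know</h2>", "<h2>")` would return everything up to the next heading, and `sortedListItems` would then give the items in alphabetical order.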
----------------------------------------------------------
feel free to press the like button on the right side to thank the user that helped you.
please mark posts as solved if you found a solution.
@SimonHain on twitter
New Developer
Posts: 19
Registered: ‎12-16-2009
My Device: 8310

Re: how to analyze a html web, sort and get related tag and content.

Thank you for the guidance.

So how do we download an HTML page? Which API or function should we use? (I am asking on behalf of a colleague.)
Developer
Posts: 16,987
Registered: ‎07-29-2008
My Device: Z10 LE, Z30, Passport
My Carrier: O2 Germany

Re: how to analyze a html web, sort and get related tag and content.

QNetworkAccessManager
New Developer
Posts: 19
Registered: ‎12-16-2009
My Device: 8310

Re: how to analyze a html web, sort and get related tag and content.

Thank you for the guidance. We will give it a try and get back to you.

Developer
Posts: 16,987
Registered: ‎07-29-2008
My Device: Z10 LE, Z30, Passport
My Carrier: O2 Germany

Re: how to analyze a html web, sort and get related tag and content.

For a good starting point, I suggest the appropriate samples. Most likely you can copy and paste quite a bit of code from there (after understanding what it does, of course).
New Developer
Posts: 19
Registered: ‎12-16-2009
My Device: 8310

Re: how to analyze a html web, sort and get related tag and content.

Hi BB team and other BB10 tech geeks, we still have a problem with this issue.

 

Our background and needs are:

1. Get the field marked in red, so we can extract the content for our own purposes.

2. Then rewrite the content, for example change 'July 14 Wikipedia featured article' into '2014 July 14 Wikipedia featured article'.

 

 

<!DOCTYPE html>
  <html lang="en" dir="ltr" class="client-nojs">
  <head>
  <meta charset="UTF-8" />
  <title>July 14 Wikipedia featured article - Wikipedia, the free encyclopedia</title>
   
  <div class="pre-content">
  <h1 id="section_0">July 14 Wikipedia featured article</h1> </div>
  <div id="content" class="content" lang="en" dir="ltr"> <div style="float: left; margin: 0.5em 0.9em 0.4em 0em;"><a href="/wiki/File:Gelderland1601-1603_Lophopsittacus_mauritianus.jpg" class="image" title="1601 sketch of the broad-billed parrot"><img alt="1601 sketch of the broad-billed parrot" src="//upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Gelderland1601-1603_Lophopsittacus_mauritianus.j..." width="133" height="86" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Gelderland1601-1603_Lophopsittacus_mauritianus.jpg... 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Gelderland1601-1603_Lophopsittacus_mauritianus.jpg... 2x" data-file-width="3624" data-file-height="2344" /></a></div>
  <p>The <b><a href="/wiki/Broad-billed_parrot" title="Broad-billed parrot">broad-billed parrot</a></b> is a large extinct <a href="/wiki/Parrot" title="Parrot">parrot</a> in the <a href="/wiki/Family_(biology)" title="Family (biology)">family</a> <a href="/wiki/Psittaculidae" title="Psittaculidae">Psittaculidae</a> that was endemic to the island of <a href="/wiki/Mauritius" title="Mauritius">Mauritius</a> in the <a href="/wiki/Indian_Ocean" title="Indian Ocean">Indian Ocean</a>. It has been classified as a member of the <a href="/wiki/Tribe_(biology)" title="Tribe (biology)">tribe</a> <a href="/wiki/Psittaculini" title="Psittaculini">Psittaculini</a>, and may have been closely related to the <a href="/wiki/Rodrigues_parrot" title="Rodrigues parrot">Rodrigues parrot</a>. The broad-billed parrot had a large head in proportion to its body, a distinct crest of feathers on the front of the head, and a very large beak that would have enabled it to crack hard seeds. <a href="/wiki/Subfossil" title="Subfossil">Subfossil</a> bones indicate that the species exhibited greater <a href="/wiki/Sexual_dimorphism" title="Sexual dimorphism">sexual dimorphism</a> in overall size and head size than any living parrot. A contemporary description indicates that it had a blue head, a greyish or blackish body, and perhaps a red beak. The broad-billed parrot was first referred to as the "Indian raven" in Dutch ships' journals from 1598 onwards. It was first scientifically described from a subfossil mandible in 1866, but this was not linked to the few brief contemporary descriptions until the rediscovery of a detailed 1601 sketch <i>(pictured)</i>. The bird became extinct in the 17th century owing to a combination of <a href="/wiki/Deforestation" title="Deforestation">deforestation</a>, predation by introduced <a href="/wiki/Invasive_species" title="Invasive species">invasive species</a>, and probably also because of hunting.
(<a href="/wiki/Broad-billed_parrot" title="Broad-billed parrot"><b>Full&#160;article...</b></a>)
  </p><p>Recently featured: <a href="/wiki/Joel_Brand" title="Joel Brand">Joel Brand</a>&#160;– <a href="/wiki/M-185_(Michigan_highway)" title="M-185 (Michigan highway)">M-185 (Michigan highway)</a>&#160;– <a href="/wiki/Babe_Ruth" title="Babe Ruth">Babe Ruth</a>
  </p>
  <div style="text-align: right;" class="noprint"><b><a href="/wiki/Wikipedia:Today%27s_featured_article/July_2014" title="Wikipedia:Today's featured article/July 2014">Archive</a></b> – <b><a href="https://lists.wikimedia.org/mailman/listinfo/daily-article-l" class="extiw" title="mail:daily-article-l">By email</a></b> – <b><a href="/wiki/Wikipedia:Featured_articles" title="Wikipedia:Featured articles">More featured articles...</a></b></div>
  <noscript><img src="//en.wikipedia.org/wiki/Special:CentralAutoLogin/start?type=1x1" alt="" title="" width="1" height="1" style="border: none; position: absolute;" /></noscript><div id="page-secondary-actions"></div> </div>
  </div>
  <div id="footer">
  <ul class="footer-info">
  <li id="footer-info-mobile-switcher"><h2><img src="//bits.wikimedia.org/static-1.24wmf13/extensions/MobileFrontend/images/logo-copyright-en.png" alt="Wikipedia ®" /></h2>
  <ul>
  <li>Mobile&zwnj;</li><li><a id="mw-mf-display-toggle" href="http://en.wikipedia.org/w/index.php?title=Special:FeedItem/featured/20140714000000/en&amp;mobileacti...">Desktop</a></li>
  </ul></li><li id="footer-info-mobile-license">Content is available under <a class="external" rel="nofollow" href="//creativecommons.org/licenses/by-sa/3.0/">CC BY-SA 3.0</a> unless otherwise noted.</li> </ul>
  <ul class="footer-places">
  <li id="footer-places-terms-use"><a href="//m.wikimediafoundation.org/wiki/Terms_of_use">Terms of Use</a></li><li id="footer-places-privacy"><a href="//wikimediafoundation.org/wiki/Privacy_policy" title="wikimedia:Privacy policy">Privacy</a></li> </ul>
  </div>
  </div>
  </div>
  <script>if(window.mw){
  mw.config.set({"wgBackendResponseTime":123,"wgHostname":"mw1168"});
  }</script><script>/*<![CDATA[*/window.jQuery && jQuery.ready();/*]]>*/</script><script>if(window.mw){
  mw.loader.state({"user.groups":"ready"});
  }</script>
  <script>if(window.mw){
  mw.loader.load(["mediawiki.user","mediawiki.hidpi","mobile.search","mobile.startup","mobile.stable","mobile.notifications","mobile.issues","mobile.editor","mobile.languages","mobile.newusers","mobile.toggling","mobile.loggingSchemas","ext.eventLogging.subscriber","ext.navigationTiming","schema.UniversalLanguageSelector"],null,true);
  }</script>
  </body>
  </html>

 

 

 

Our approach is as follows:

We use QNetworkAccessManager to fetch the HTML content and try to locate tags such as <h1 id="section_0"> to get the related content, but this fails.

We use the QDomDocument API to parse the HTML code, and the call to setContent fails immediately, as shown below:
 
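.pro
QT += network
QT += xml

________________
.cpp

#include <QDebug>
#include <QDomDocument>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrl>

// In the ApplicationUI constructor: create the manager and connect the slot once.
networkAccessManager = new QNetworkAccessManager(this);
connect(networkAccessManager, SIGNAL(finished(QNetworkReply*)), this,
        SLOT(receiveValue(QNetworkReply*)));

void ApplicationUI::requestURL() {
    QString myurl; // the URL value was omitted in the original post
    networkAccessManager->get(QNetworkRequest(QUrl(myurl)));
}

void ApplicationUI::receiveValue(QNetworkReply* reply) {
    if (reply == NULL) {
        return; // check the pointer before using it
    }
    reply->deleteLater(); // the receiver is responsible for releasing the reply
    if (reply->error() != QNetworkReply::NoError) {
        qDebug() << "network error::" << reply->errorString();
        return;
    }
    QByteArray value = reply->readAll();

    // QDomDocument is an XML parser, so setContent fails on HTML that is not well-formed XML.
    QDomDocument document;
    QString errorMsg;
    int errorLine = 0, errorColumn = 0;
    if (!document.setContent(value, &errorMsg, &errorLine, &errorColumn)) {
        qDebug() << "errorMsg::" << errorMsg << "line" << errorLine << "column" << errorColumn;
        return;
    }

    // QDomDocument::elementById() always returns a null element, so match the id attribute by hand.
    QDomNodeList divs = document.elementsByTagName("div");
    for (int i = 0; i < divs.count(); ++i) {
        QDomElement element = divs.at(i).toElement();
        if (element.attribute("id") == "content") {
            qDebug() << "content::" << element.text();
            break;
        }
    }
}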
 
So how can we implement our use case and solve this problem?
Or which API should we use to parse the HTML code and get tags such as <h1 id="section_0">, the HTML body, or div elements?
New Developer
Posts: 19
Registered: ‎12-16-2009
My Device: 8310

Re: how to analyze a html web, sort and get related tag and content.

Could any expert help us with this issue? Thank you.

Developer
Posts: 508
Registered: ‎01-19-2011
My Device: My Trusty Red Plane
My Carrier: Outer Space

Re: how to analyze a html web, sort and get related tag and content.

Geek? Careful what you say ;) Also, IMO there are very few BB people here; mostly it's just us 'normal' developers.

 

When you load the data from the web, why don't you just save it as a string first, parse it as XML, change what you want to change, save it as a string again, and then parse it as HTML? Or modify the string directly?
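
A rough sketch of the "modify the string directly" idea in standard C++, applied to the title rewrite asked about earlier in the thread. The helper name `replaceAll` is hypothetical, and the sketch assumes the downloaded page has already been converted to a `std::string`. With Qt, `QString::replace` would do the same job.

```cpp
#include <string>

// Replaces every occurrence of `from` in `text` with `to` and returns the result.
std::string replaceAll(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) return text; // nothing to search for
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size(); // skip the inserted text so it is not matched again
    }
    return text;
}
```

Calling `replaceAll(html, "July 14 Wikipedia featured article", "2014 July 14 Wikipedia featured article")` would change both the `<title>` and the `<h1>` text, because the replacement is purely textual and does not know about tags.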

-----------------------------------------------------------------------
I'm a bird from outer space. But I'm not flappy o.o
Developer
Posts: 16,987
Registered: ‎07-29-2008
My Device: Z10 LE, Z30, Passport
My Carrier: O2 Germany

Re: how to analyze a html web, sort and get related tag and content.

If QDomDocument does not help, you could always use substring operations.

Another option could be QWebPage and QWebElement.

Or you could try an XML parser.
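
The substring option could look roughly like the sketch below, written in standard C++ so it does not depend on any Qt parser. The helper names `innerHtmlById` and `stripTags` are hypothetical. The sketch assumes the id attribute is written with double quotes and that the element has no nested element of the same tag; for the `content` div of the real page, which contains nested divs, the match would stop at the first `</div>`.

```cpp
#include <string>

// Returns the markup inside the first <tag ... id="id" ...> element,
// or an empty string if no such element is found.
std::string innerHtmlById(const std::string& html, const std::string& tag, const std::string& id) {
    const std::string idAttr = "id=\"" + id + "\"";
    const std::string closeTag = "</" + tag + ">";
    std::size_t attrPos = html.find(idAttr);
    while (attrPos != std::string::npos) {
        std::size_t open = html.rfind('<', attrPos);     // start of the enclosing tag
        std::size_t startEnd = html.find('>', attrPos);  // end of the start tag
        if (open != std::string::npos && startEnd != std::string::npos &&
            html.compare(open + 1, tag.size(), tag) == 0) {
            std::size_t close = html.find(closeTag, startEnd + 1);
            if (close == std::string::npos) return "";
            return html.substr(startEnd + 1, close - startEnd - 1);
        }
        attrPos = html.find(idAttr, attrPos + 1); // id belongs to another tag, keep looking
    }
    return "";
}

// Drops everything between '<' and '>' so that only the text remains.
std::string stripTags(const std::string& markup) {
    std::string text;
    bool inTag = false;
    for (std::size_t i = 0; i < markup.size(); ++i) {
        char c = markup[i];
        if (c == '<') inTag = true;
        else if (c == '>') inTag = false;
        else if (!inTag) text += c;
    }
    return text;
}
```

For the page quoted above, `innerHtmlById(html, "h1", "section_0")` would be the call that pulls out the heading text.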