ruby - Need help to locate the text of an element with class?


I have a file that I got using the command page.css("table.vc_result span a"), but I am not able to get the second and third span elements of the file:

file

    <table border="0" bgcolor="#ffffff" onmouseout="resdef(this)" onmouseover="resemp(this)" class="vc_result">
    <tbody>
      <tr>
        <td width="260" valign="top">
          <table>
            <tbody>
              <tr>
                <td width="40%" valign="top"><span><a class="caddname" href="/usa/illinois/chicago/yellow+page+advertising+and+telephone+directory+publica/gateway-megatech_13478733">
                gateway megatech</a></span><br>
                <span class="caddtext">p.o. box 99682, chicago il 60696</span></td>
              </tr>
              <tr>
                <td><span class="caddtext">cook county illinois</span></td>
              </tr>
              <tr>
                <td><span class="caddcategory">yellow page advertising , telephone
                directory publica chicago</span></td>
              </tr>
            </tbody>
          </table>
        </td>
        <td width="260">
          <table align="center">
            <tbody>
              <tr>
                <td>
                  <table>
                    <tbody>
                      <tr>
                        <td>
                          <div style=
                          "background: url('images/listings.png');background-position: -0px -0px; width: 16px; height: 16px">
                          </div>
                        </td>
                        <td><font style="font-weight:bold">847-506-7800</font></td>
                      </tr>
                    </tbody>
                  </table>
                </td>
              </tr>
              <tr>
                <td>
                  <table>
                    <tbody>
                      <tr>
                        <td>
                          <div style=
                          "background: url('images/listings.png');background-position: -0px -78px; width: 16px; height: 16px">
                          </div>
                        </td>
                        <td><a href=
                        "/usa/illinois/chicago/yellow+page+advertising+and+telephone+directory+publica/gateway-megatech_13478733"
                        class="caddnearby">businesses near 60696</a></td>
                      </tr>
                    </tbody>
                  </table>
                </td>
              </tr>
              <tr>
                <td>
                  <table>
                    <tbody>
                      <tr>
                        <td></td>
                      </tr>
                    </tbody>
                  </table>
                </td>
              </tr>
            </tbody>
          </table>
        </td>
      </tr>
    </tbody>
    </table>

...this is not the complete file; there are plenty more span entries in the file.

The code I am using is able to locate the exact text, but I am not able to associate it with the text of the span that follows the nested span a element.

    require 'rubygems'
    require 'nokogiri'
    require 'open-uri'

    name  = "yellow"
    city  = "chicago"
    state = "il"

    burl = "http://www.sitename.com/"
    url  = "#{burl}business_listings.php?name=#{name}&city=#{city}&state=#{state}&current=1&submit=search"
    # URI.open is the open-uri entry point on current Ruby versions
    page = Nokogiri::HTML(URI.open(url))

    rows = page.css("table.vc_result span a")
    rows.each do |arow|
      if arow.text == "gateway megatech"
        puts(arow.next_element.text)
        puts("capturing next span text")
        found = "got it"
        break
      else
        puts("found nothing")
        found = "none"
      end
    end

Assuming each business is a new <tr> inside the top table you have supplied, the following code gives you an array of hashes with the values:

    require 'nokogiri'

    # html is assumed to hold the markup shown in the question
    doc = Nokogiri.HTML(html)

    business_rows = doc.css('table.vc_result > tbody > tr')
    details = business_rows.map do |tr|
      # Inside the first <td> of the row, find the <td> with a.caddname in it
      # (.//a keeps the test relative to that <td> instead of the whole document)
      business = tr.at_xpath('td[1]//td[.//a[@class="caddname"]]')
      name     = business.at_css('a.caddname').text.strip
      address  = business.at_css('.caddtext').text.strip

      # Inside the second <td> of the row, find the first <font> tag
      phone    = tr.at_xpath('td[2]//font').text.strip

      # Return a hash of values for this row, using the capitalization requested
      { name:name, address:address, phone:phone }
    end

    p details
    #=> [
    #=>   {
    #=>     :name=>"gateway megatech",
    #=>     :address=>"p.o. box 99682, chicago il 60696",
    #=>     :phone=>"847-506-7800"
    #=>   }
    #=> ]

This is pretty fragile, but it works for what you've given, and there do not seem to be many semantic items to hang onto in this insane, horrific abuse of HTML.

