ruby - How do I correctly deal with non-breaking spaces using Nokogiri?
I am using Nokogiri to parse an HTML page, and I am having odd problems with non-breaking spaces. I have tried different encodings, replacing whitespace, and a few other headache-inducing attempts.
Here is the HTML snippet in question:
    <td>amount 15,300 at dollars</td>

Note the change in the &nbsp; representation after I use Nokogiri:

    <td>amount 15,300 at dollars</td>

And outputting the inner_text:

    amount 15,300 at dollars

This is the base Nokogiri grab I am doing. I did try a few alternatives to solve this, but they failed miserably:
    doc = Nokogiri::HTML(open(url))

And then I doc.search for the item in question.
Note that if I look at doc, the line shows the &nbsp; on that line.
Clarification: I do not think I clearly stated the difficulty I am having. I can't get inner_text to show the text without the strange symbol.
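I know this is old, but it took me an hour to find out how to solve this problem, and it is easy once you know. Pass your string to this function and it will be "de-nbsp-fied".

    def strip_html(str)
      # Let Nokogiri decode the entity so we match the exact character it produces
      nbsp = Nokogiri::HTML("&nbsp;").text
      str.gsub(nbsp, '')
    end

You could replace it with a space instead if you wished. May many of you find this answer!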
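For anyone who wants the "replace it with a space" variant, here is a minimal sketch. It assumes the character Nokogiri hands back for &nbsp; is U+00A0 in a UTF-8 string, so it matches that code point directly instead of asking Nokogiri to decode the entity. The names NBSP and normalize_nbsp are made up for this example and are not part of Nokogiri.

```ruby
# Hypothetical helper, not part of Nokogiri.
# Assumption: the non-breaking space arrives as U+00A0 in a UTF-8 string.
NBSP = "\u00A0"

# Replace every non-breaking space with `replacement` (a plain space by default).
# Pass '' as the replacement to drop the character entirely.
def normalize_nbsp(str, replacement = " ")
  str.gsub(NBSP, replacement)
end

text = "amount\u00A015,300\u00A0at\u00A0dollars"
puts normalize_nbsp(text)  # prints "amount 15,300 at dollars" with ordinary spaces
```

Replacing with a space rather than an empty string keeps the words from running together, which is probably what you want for text like the table cell in the question.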