ruby - How do I correctly deal with non-breaking spaces using Nokogiri?
I am using Nokogiri to parse an HTML page, and I am having odd problems with non-breaking spaces. I have tried different encodings, replacing whitespace, and a few other headache-inducing attempts.
Here is the HTML snippet in question:

    <td>amount 15,300 at dollars</td>

Note the change in representation after I use Nokogiri:

    <td>amount 15,300 at dollars</td>

And outputting the inner_text:

    amount 15,300 at dollars

This is my base Nokogiri grab. I did try a few alternatives to solve this, but failed miserably:

    doc = Nokogiri::HTML(open(url))

After that I do a doc.search for the item in question.
Note that if I look at the doc, the line shows &nbsp; on that line.
Clarification: I do not think I clearly stated the difficulty I am having. I can't get the inner_text to show up without the strange symbol.
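One possible cause of a stray symbol like this, offered as an assumption since the question does not show the page's encoding: a non-breaking space is the character U+00A0, which UTF-8 stores as two bytes. If those bytes are later read as ISO-8859-1, each byte becomes its own character, and an extra character appears in front of the space. A minimal sketch of that effect, using only core Ruby (the variable names are illustrative):

```ruby
# The character that the &nbsp; entity decodes to.
nbsp = "\u00A0"

# In UTF-8 it is stored as two bytes.
p nbsp.bytes.map { |b| format('%02X', b) }  # prints ["C2", "A0"]

# Reinterpret those same two bytes as ISO-8859-1, then convert back to UTF-8.
# Each byte is now treated as a separate character.
misread = nbsp.dup.force_encoding('ISO-8859-1').encode('UTF-8')

p misread.length                                       # prints 2
p misread.codepoints.map { |c| format('U+%04X', c) }   # prints ["U+00C2", "U+00A0"]
```

U+00C2 is the letter A with a circumflex, so if this is what happens on the page in question, the output would show that letter before every non-breaking space.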
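I know this is old, but it took me an hour to find out how to solve this problem, and it is easy once you know. Pass your string to this function and it will be "de-nbsp-fied".

    def strip_html(str)
      # Let Nokogiri decode the entity so we get the actual non-breaking space character
      nbsp = Nokogiri::HTML("&nbsp;").text
      str.gsub(nbsp, '')
    end

You could also replace it with a space if you wished. May many of you find this answer!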
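A variation on the function above, as a minimal sketch that skips parsing a throwaway document. It assumes the string is UTF-8 and that the non-breaking space in the inner_text is the single character U+00A0, which is what the &nbsp; entity decodes to. The names NBSP and replace_nbsp are hypothetical and not part of Nokogiri.

```ruby
# U+00A0 NO-BREAK SPACE: the character that the &nbsp; entity decodes to.
NBSP = "\u00A0"

# Hypothetical helper: swap every non-breaking space for `replacement`
# (an ordinary space by default, or '' to drop it entirely).
def replace_nbsp(str, replacement = ' ')
  str.gsub(NBSP, replacement)
end

sample = "amount\u00A015,300\u00A0at\u00A0dollars"
puts replace_nbsp(sample)      # prints: amount 15,300 at dollars
puts replace_nbsp(sample, '')  # prints: amount15,300atdollars
```

This may also be why replacing whitespace had no effect in the question: as far as I know, Ruby's \s and String#strip only cover ASCII whitespace, so U+00A0 has to be matched explicitly or with [[:space:]].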