module Loofah::TextBehavior
Overrides text in HTML::Document and HTML::DocumentFragment, and mixes in to_text.
Public Instance Methods
text(options = {})
Returns a plain-text version of the markup contained by the document, with HTML entities encoded.
This method is significantly faster than to_text, but isn’t clever about whitespace around block elements.
Loofah.document("<h1>Title</h1><div>Content</div>").text # => "TitleContent"
By default, the returned text will have HTML entities escaped. If you want unescaped entities, and you understand that the result is unsafe to render in a browser, then you can pass an argument as shown:
frag = Loofah.fragment("&lt;script&gt;alert('EVIL');&lt;/script&gt;")
# ok for browser:
frag.text                                 # => "&lt;script&gt;alert('EVIL');&lt;/script&gt;"
# decidedly not ok for browser:
frag.text(:encode_special_chars => false) # => "<script>alert('EVIL');</script>"
# File lib/loofah/instance_methods.rb, line 95
def text(options = {})
  result = if serialize_root
    serialize_root.children.reject(&:comment?).map(&:inner_text).join("")
  else
    ""
  end
  if options[:encode_special_chars] == false
    result # possibly dangerous if rendered in a browser
  else
    encode_special_chars result
  end
end
Also aliased as: inner_text, to_str
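To illustrate why the default, entity-encoded result of text is safe to place in a page while the unencoded one is not, here is a minimal sketch that uses Ruby's standard-library CGI.escapeHTML as a stand-in for the encoding step. This is an assumption for illustration only: Loofah applies its own encode_special_chars, whose exact output may differ (for example in how quotes are handled), and the variable names are hypothetical.

```ruby
require "cgi"

# Untrusted text as it might come back from text(:encode_special_chars => false).
untrusted = "<script>alert('EVIL');</script>"

# Stand-in for the encoding step (assumption: Loofah uses its own encoder, not CGI).
escaped = CGI.escapeHTML(untrusted)

# The escaped form contains no literal tag delimiters, so a browser shows it as text.
puts escaped

# Unescaping recovers the original, potentially dangerous string.
puts CGI.unescapeHTML(escaped) == untrusted
```

The escaped string can be interpolated into markup; the unescaped one should only go to non-HTML destinations such as logs or plain-text email.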
to_text(options = {})
Returns a plain-text version of the markup contained by the fragment, with HTML entities encoded.
This method is slower than text, but is clever about whitespace around block elements and line break elements.
Loofah.document("<h1>Title</h1><div>Content<br>Next line</div>").to_text # => "\nTitle\n\nContent\nNext line\n"
# File lib/loofah/instance_methods.rb, line 121
def to_text(options = {})
  Loofah.remove_extraneous_whitespace self.dup.scrub!(:newline_block_elements).text(options)
end