trans edited this page Sep 13, 2010 · 3 revisions

Comparing Sanitize to Loofah

Facts

  • Written by Ryan Grove
  • Depends on Nokogiri
  • NOT Regexp based

Usage

By default Sanitize removes all tags.

    require 'sanitize'
    html = '<b><a href="http://foo.com/">foo</a></b><img src="http://foo.com/bar.jpg" />'

    Sanitize.clean(html) # => 'foo'

You can pass one of the built-in configuration constants to Sanitize.clean to select the type of filtering you want.

    Sanitize.clean(html, Sanitize::Config::RELAXED)
    # => '<b><a href="http://foo.com/">foo</a></b><img src="http://foo.com/bar.jpg" />'

You can supply your own custom configuration through an options hash instead of a built-in constant.

Benchmarks

In the run below, Loofah is about 2x faster than Sanitize on a large HTML document, roughly even on a small fragment, and slower on a short text snippet, where Sanitize takes about 20% less time.

    HeadToHeadSanitizerSanitize
      Large document, 98282 bytes (x100)
                                       total    single    rel
                       Loofah :strip  15.132 (0.151318)     -
                      Sanitize.clean  31.295 (0.312947)  2.07x

      Small fragment, 3178 bytes (x1000)
                                       total    single    rel
                       Loofah :strip   6.887 (0.006887)     -
                      Sanitize.clean   6.681 (0.006681)  0.97x

      Text snippet, 58 bytes (x10000)
                                       total    single    rel
                       Loofah :strip   5.798 (0.000580)     -
                      Sanitize.clean   4.580 (0.000458)  0.79x