Safe DOM manipulation with the Sanitizer API

Jack J

Applications deal with untrusted strings all the time, but safely rendering that content as part of an HTML document can be tricky. Without sufficient care, you may accidentally create opportunities for cross-site scripting (XSS) that malicious attackers may exploit.

To mitigate that risk, the new Sanitizer API proposal aims to build a robust processor for arbitrary strings to be safely inserted into a page.

// Expanded safely
$div.setHTML(`<em>hello world</em><img src="" onerror=alert(0)>`, { sanitizer: new Sanitizer() })

Escape user input

When inserting user input, query strings, cookie contents, and more, into the DOM, the strings must be escaped properly. Particular attention should be paid to DOM manipulation with .innerHTML, where unescaped strings are a typical source of XSS.

const user_input = `<em>hello world</em><img src="" onerror=alert(0)>`
$div.innerHTML = user_input

If you escape the HTML special characters in the input string, or insert it using .textContent, alert(0) isn't executed. However, the <em> added by the user is then also rendered as literal text, so this approach can't be used when you want to keep the user's text formatting.
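As a minimal sketch of what escaping means here, the helper below replaces the characters that are significant in HTML with their entities. The name escapeHTML is hypothetical and not part of any browser API; it only illustrates why the markup survives as text rather than as formatting.

```javascript
// Hypothetical helper (not a browser API): replaces the five characters
// that are significant in HTML with their entities.
function escapeHTML(input) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(input).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const escaped = escapeHTML('<em>hello world</em><img src="" onerror=alert(0)>');
console.log(escaped);
// &lt;em&gt;hello world&lt;/em&gt;&lt;img src=&quot;&quot; onerror=alert(0)&gt;
```

The handler can no longer run, but the <em> is shown to the reader as the literal characters "<em>" instead of emphasizing the text.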

The best thing to do here is not escaping, but sanitizing.

Sanitize user input

Escaping refers to replacing special HTML characters with HTML Entities.

Sanitizing refers to removing semantically harmful parts (such as script execution) from HTML strings.

Example

In the previous example, <img onerror> causes the error handler to be executed, but if the onerror handler was removed, it would be possible to safely expand it in the DOM while leaving <em> intact.

// XSS 🧨
$div.innerHTML = `<em>hello world</em><img src="" onerror=alert(0)>`
// Sanitized ⛑
$div.innerHTML = `<em>hello world</em><img src="">`

To sanitize correctly, it is necessary to parse the input string as HTML, omit tags and attributes that are considered harmful, and keep the harmless ones.
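The walk described above can be sketched on a toy tree. Everything in this example is an assumption made for illustration: the parsed document is modeled as plain objects ({ tag, attrs, children }) with strings as text nodes, and the two deny lists are deliberately tiny. It is not how any browser or library implements sanitization, and it is not a real security boundary.

```javascript
// Toy model: elements are { tag, attrs, children }, text nodes are strings.
// The lists below are illustrative only and far from complete.
const HARMFUL_TAGS = new Set(['script', 'iframe', 'object', 'embed']);
const isHarmfulAttr = (name) => name.toLowerCase().startsWith('on');

function sanitizeTree(node) {
  if (typeof node === 'string') return node; // text is always kept
  if (HARMFUL_TAGS.has(node.tag)) return null; // omit element and its subtree

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (!isHarmfulAttr(name)) attrs[name] = value; // keep harmless attributes
  }
  const children = (node.children || [])
    .map(sanitizeTree)
    .filter((child) => child !== null);
  return { tag: node.tag, attrs, children };
}

// <div><em>hello world</em><img src="" onerror=alert(0)></div>
const parsedInput = {
  tag: 'div',
  attrs: {},
  children: [
    { tag: 'em', attrs: {}, children: ['hello world'] },
    { tag: 'img', attrs: { src: '', onerror: 'alert(0)' }, children: [] },
  ],
};
const cleaned = sanitizeTree(parsedInput);
console.log(JSON.stringify(cleaned.children[1]));
// {"tag":"img","attrs":{"src":""},"children":[]}
```

The <em> element passes through untouched, while the onerror attribute is dropped from <img>.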

The proposed Sanitizer API specification aims to provide such processing as a standard API for browsers.

Note: Internet Explorer had implemented window.toStaticHTML() for this purpose, but it was never standardized.

Sanitizer API

The Sanitizer API is used in the following way:

const $div = document.querySelector('div')
const user_input = `<em>hello world</em><img src="" onerror=alert(0)>`
$div.setHTML(user_input, { sanitizer: new Sanitizer() }) // <div><em>hello world</em><img src=""></div>

However, { sanitizer: new Sanitizer() } is the default argument, so it can be omitted:

$div.setHTML(user_input) // <div><em>hello world</em><img src=""></div>

It's worth noting that setHTML() is defined on Element. Being a method of Element, the context to parse is self-explanatory (<div> in this case), the parsing is done once internally, and the result is directly expanded into the DOM.

To get the sanitized result as a string, read .innerHTML after calling setHTML().

const $div = document.createElement('div')
$div.setHTML(user_input)
$div.innerHTML // <em>hello world</em><img src="">

Customize with configuration

The Sanitizer API is configured by default to remove strings that would trigger script execution. However, you can also add your own customizations to the sanitization process with a configuration object.

const config = {
  allowElements: [],
  blockElements: [],
  dropElements: [],
  allowAttributes: {},
  dropAttributes: {},
  allowCustomElements: true,
  allowComments: true
};
// sanitized result is customized by configuration
new Sanitizer(config)

The following options specify how the sanitization result should treat the specified element.

allowElements: Names of elements that the sanitizer should retain.

blockElements: Names of elements the sanitizer should remove, while retaining their children.

dropElements: Names of elements the sanitizer should remove, along with their children.

const str = `hello <b><i>world</i></b>`

$div.setHTML(str)
// <div>hello <b><i>world</i></b></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowElements: [ "b" ]}) })
// <div>hello <b>world</b></div>

$div.setHTML(str, { sanitizer: new Sanitizer({blockElements: [ "b" ]}) })
// <div>hello <i>world</i></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowElements: []}) })
// <div>hello world</div>
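To make the difference between the three element options concrete, here is a toy model that reproduces the outcomes shown in the examples above. It assumes a simplified tree of plain objects ({ tag, children }) with strings as text nodes, and the function names are hypothetical. It is a sketch of the observable behavior, not the algorithm from the specification.

```javascript
// Toy model of allowElements / blockElements / dropElements.
// Returns a list of nodes because a removed element may leave children behind.
function filterElements(node, config) {
  if (typeof node === 'string') return [node];
  const { allowElements, blockElements = [], dropElements = [] } = config;

  if (dropElements.includes(node.tag)) return []; // remove along with children

  const children = node.children.flatMap((child) => filterElements(child, config));
  const blocked = blockElements.includes(node.tag);
  const notAllowed = allowElements !== undefined && !allowElements.includes(node.tag);
  if (blocked || notAllowed) return children; // remove element, keep children

  return [{ tag: node.tag, children }];
}

function serialize(nodes) {
  return nodes
    .map((n) => (typeof n === 'string' ? n : `<${n.tag}>${serialize(n.children)}</${n.tag}>`))
    .join('');
}

// hello <b><i>world</i></b>
const helloTree = ['hello ', { tag: 'b', children: [{ tag: 'i', children: ['world'] }] }];
const run = (config) => serialize(helloTree.flatMap((n) => filterElements(n, config)));

console.log(run({ allowElements: ['b'] })); // hello <b>world</b>
console.log(run({ blockElements: ['b'] })); // hello <i>world</i>
console.log(run({ dropElements: ['b'] })); // "hello " (the whole <b> subtree is removed)
console.log(run({ allowElements: [] })); // hello world
```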

You can also control whether the sanitizer will allow or deny specified attributes with the following options:

allowAttributes
dropAttributes

allowAttributes and dropAttributes properties expect attribute match lists—objects whose keys are attribute names, and values are lists of target elements or the * wildcard.

const str = `<span id=foo class=bar style="color: red">hello</span>`

$div.setHTML(str)
// <div><span id="foo" class="bar" style="color: red">hello</span></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowAttributes: {"style": ["span"]}}) })
// <div><span style="color: red">hello</span></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowAttributes: {"style": ["p"]}}) })
// <div><span>hello</span></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowAttributes: {"style": ["*"]}}) })
// <div><span style="color: red">hello</span></div>

$div.setHTML(str, { sanitizer: new Sanitizer({dropAttributes: {"id": ["span"]}}) })
// <div><span class="bar" style="color: red">hello</span></div>

$div.setHTML(str, { sanitizer: new Sanitizer({allowAttributes: {}}) })
// <div><span>hello</span></div>
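The lookup that an attribute match list implies can be sketched as follows. This is a toy model under stated assumptions: attributes are a plain object, the helper names matches and filterAttributes are hypothetical, and the built-in baseline that the real API always enforces is left out.

```javascript
// Toy model of attribute match lists: keys are attribute names, values are
// lists of element names or the "*" wildcard.
function matches(matchList, attrName, elementName) {
  const targets = matchList[attrName];
  return Array.isArray(targets) &&
    (targets.includes('*') || targets.includes(elementName));
}

function filterAttributes(elementName, attrs, config) {
  const { allowAttributes, dropAttributes = {} } = config;
  const kept = {};
  for (const [name, value] of Object.entries(attrs)) {
    // An explicit drop always removes the attribute.
    if (matches(dropAttributes, name, elementName)) continue;
    // When an allow list is given, anything it does not match is removed.
    if (allowAttributes !== undefined && !matches(allowAttributes, name, elementName)) continue;
    kept[name] = value;
  }
  return kept;
}

const spanAttrs = { id: 'foo', class: 'bar', style: 'color: red' };
console.log(filterAttributes('span', spanAttrs, { allowAttributes: { style: ['span'] } }));
// { style: 'color: red' }
console.log(filterAttributes('span', spanAttrs, { allowAttributes: { style: ['p'] } }));
// {}
console.log(filterAttributes('span', spanAttrs, { dropAttributes: { id: ['span'] } }));
// { class: 'bar', style: 'color: red' }
```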

allowCustomElements is the option to allow or deny custom elements. If they're allowed, other configurations for elements and attributes still apply.

const str = `<custom-elem>hello</custom-elem>`

$div.setHTML(str)
// <div></div>

const sanitizer = new Sanitizer({
  allowCustomElements: true,
  allowElements: ["div", "custom-elem"]
})
$div.setHTML(str, { sanitizer })
// <div><custom-elem>hello</custom-elem></div>

Note: The Sanitizer API is designed to be safe by default. This means that no matter how you set it up, it will never allow constructs that are known XSS targets. For example, allowElements: ["script"] won't actually allow <script>, because the built-in baseline configuration cannot be overridden. The purpose of customization is to override default settings if your application has special needs.

API surface

Comparison with DOMPurify

DOMPurify is a well-known library that offers sanitization functionality. The main difference between the Sanitizer API and DOMPurify is that DOMPurify returns the result of the sanitization as a string, which you need to write into a DOM element with .innerHTML.

const user_input = `<em>hello world</em><img src="" onerror=alert(0)>`
const sanitized = DOMPurify.sanitize(user_input)
$div.innerHTML = sanitized
// `<em>hello world</em><img src="">`

DOMPurify can serve as a fallback when the Sanitizer API is not implemented in the browser.
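One way such a fallback could be wired up is sketched below. The helper name setSanitizedHTML and its return values are hypothetical. The purifier is passed in as a parameter and is only assumed to expose a sanitize(string) method, as DOMPurify does.

```javascript
// Hypothetical helper: prefers Element.setHTML() and falls back to a
// DOMPurify-style object supplied by the caller.
function setSanitizedHTML(element, html, purifier) {
  if (typeof element.setHTML === 'function') {
    element.setHTML(html); // sanitized with the default configuration
    return 'sanitizer-api';
  }
  if (purifier && typeof purifier.sanitize === 'function') {
    element.innerHTML = purifier.sanitize(html); // string round trip
    return 'dompurify';
  }
  element.textContent = html; // last resort: render the input as plain text
  return 'text';
}
```

In a page that loads DOMPurify, a call might look like setSanitizedHTML($div, user_input, DOMPurify). The return value only reports which path was taken, which can help when debugging.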

The DOMPurify approach has a couple of downsides. Because it returns a string, the input is parsed twice: once by DOMPurify and once by .innerHTML. This double parsing not only wastes processing time, but can also lead to subtle vulnerabilities in cases where the result of the second parse differs from the first.

HTML also needs context to be parsed. For example, <td> makes sense in <table>, but not in <div>. Since DOMPurify.sanitize() only takes a string as an argument, the parsing context has to be guessed.

The Sanitizer API improves upon the DOMPurify approach and is designed to eliminate the need for double parsing and to clarify the parsing context.

API status and browser support

The Sanitizer API is under discussion in the standardization process and Chrome is in the process of implementing it.

Step	Status
1. Create explainer	Complete
2. Create specification draft	Complete
3. Gather feedback and iterate on design	Complete
4. Chrome origin trial	Complete
5. Launch	Intent to Ship on M105

Mozilla: Considers this proposal worth prototyping, and is actively implementing it.

WebKit: See the response on the WebKit mailing list.

How to enable the Sanitizer API


Chrome is in the process of implementing the Sanitizer API. In Chrome 93 or later, you can try out the behavior by enabling the about://flags/#enable-experimental-web-platform-features flag. In earlier versions of the Chrome Canary and Dev channels, you can enable it with --enable-blink-features=SanitizerAPI. Check out the instructions for how to run Chrome with flags.

Firefox

Firefox also implements the Sanitizer API as an experimental feature. To enable it, set the dom.security.sanitizer.enabled flag to true in about:config.

Feature detection

if (window.Sanitizer) {
  // Sanitizer API is enabled
}

Feedback

If you try this API and have some feedback, we'd love to hear it. Share your thoughts on Sanitizer API GitHub issues and discuss with the spec authors and folks interested in this API.

If you find any bugs or unexpected behavior in Chrome's implementation, file a bug to report it. Select the Blink>SecurityFeature>SanitizerAPI components and share details to help implementers track the problem.

Demo

To see the Sanitizer API in action, check out the Sanitizer API Playground by Mike West.
