
Boldly link where no one has linked before: Text Fragments

Text Fragments let you specify a text snippet in the URL fragment. When navigating to a URL with such a text fragment, the browser can emphasize and/or bring it to the user's attention.


Fragment Identifiers

Chrome 80 was a big release. It contained a number of highly anticipated features like ECMAScript Modules in Web Workers, nullish coalescing, optional chaining, and more. The release was, as usual, announced through a blog post on the Chromium blog. You can see an excerpt of the blog post in the screenshot below.

Chromium blog post with red boxes around elements with an id attribute.

You are probably asking yourself what all the red boxes mean. They are the result of running the following snippet in DevTools. It highlights all elements that have an id attribute.

document.querySelectorAll('[id]').forEach((el) => {
  el.style.border = 'solid 2px red';
});

I can place a deep link to any element highlighted with a red box thanks to the fragment identifier which I then use in the hash of the page's URL. Assuming I wanted to deep link to the Give us feedback in our Product Forums box in the aside, I could do so by handcrafting the URL https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1. As you can see in the Elements panel of the Developer Tools, the element in question has an id attribute with the value HTML1.

Dev Tools showing the id of an element.

If I parse this URL with JavaScript's URL() constructor, the different components are revealed. Notice the hash property with the value #HTML1.

new URL('https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1');
/* Creates a new `URL` object
URL {
hash: "#HTML1"
host: "blog.chromium.org"
hostname: "blog.chromium.org"
href: "https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1"
origin: "https://blog.chromium.org"
password: ""
pathname: "/2019/12/chrome-80-content-indexing-es-modules.html"
port: ""
protocol: "https:"
search: ""
searchParams: URLSearchParams {}
username: ""
}
*/

That I had to open the Developer Tools to find the id of an element, though, says a lot about how likely it is that the author of the blog post meant this particular section of the page to be linked to.

What if I want to link to something without an id? Say I want to link to the ECMAScript Modules in Web Workers heading. As you can see in the screenshot below, the <h1> in question does not have an id attribute, meaning there is no way I can link to this heading. This is the problem that Text Fragments solve 🎉.

Dev Tools showing a heading without an id.

Text Fragments

The Text Fragments proposal adds support for specifying a text snippet in the URL hash. When navigating to a URL with such a text fragment, the user agent can emphasize and/or bring it to the user's attention.

Browser compatibility

The Text Fragments feature is supported in version 80 and beyond of Chromium-based browsers. At the time of writing, Safari and Firefox have not publicly signaled an intent to implement the feature. See Related links for pointers to the Safari and Firefox discussions.

Note that these links currently do not work when served across client-side redirects that some common services like Twitter use. Regular HTTP redirects work fine.

textStart

In its simplest form, the syntax of Text Fragments is as follows: The hash symbol # followed by :~:text= and finally textStart, which represents the percent-encoded text I want to link to.

#:~:text=textStart

For example, say that I want to link to the ECMAScript Modules in Web Workers heading in the blog post announcing features in Chrome 80, the URL in this case would be:

https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers

If you click the link in a supporting browser like Chrome, the text fragment is highlighted and scrolled into view:

Text fragment scrolled into view and highlighted.
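
As a minimal sketch of this simplest form, the snippet below percent-encodes a piece of text with the standard encodeURIComponent() function and appends it to a page URL. The helper name textFragmentUrl is my own invention for illustration and not part of any browser API. Dashes are additionally escaped, since they carry meaning in the text directive syntax.

```javascript
// Hypothetical helper, not part of any browser API.
function textFragmentUrl(pageUrl, text) {
  // encodeURIComponent() already escapes spaces, commas, and ampersands;
  // the dash is escaped by hand because it is syntax in a text directive.
  const encoded = encodeURIComponent(text).replace(/-/g, '%2D');
  // Drop any existing fragment before appending the text directive.
  const base = pageUrl.split('#')[0];
  return `${base}#:~:text=${encoded}`;
}

const headingUrl = textFragmentUrl(
  'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html',
  'ECMAScript Modules in Web Workers'
);
console.log(headingUrl);
```

For the heading used above, this should produce the same URL as the handcrafted one.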

textStart and textEnd

Now what if I want to link to the entire section titled ECMAScript Modules in Web Workers, not just its heading? Percent-encoding the entire text of the section would make the resulting URL impracticably long.

Luckily there is a better way. Rather than the entire text, I can frame the desired text using the textStart,textEnd syntax. Therefore, I specify a couple of percent-encoded words at the beginning of the desired text, and a couple of percent-encoded words at the end of the desired text, separated by a comma ,.

That looks like this:

https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules%20in%20Web%20Workers,ES%20Modules%20in%20Web%20Workers.

For textStart, I have ECMAScript%20Modules%20in%20Web%20Workers, then a comma , followed by ES%20Modules%20in%20Web%20Workers. as textEnd. When you click through on a supporting browser like Chrome, the whole section is highlighted and scrolled into view:

Text fragment scrolled into view and highlighted.

Now you may wonder about my choice of textStart and textEnd. Actually, the slightly shorter URL https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript%20Modules,Web%20Workers. with only two words on each side would have worked, too. Compare textStart and textEnd with the previous values.

If I take it one step further and now use only one word for both textStart and textEnd, you can see that I am in trouble. The URL https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=ECMAScript,Workers. is even shorter now, but the highlighted text fragment is no longer the originally desired one. The highlighting stops at the first occurrence of the word Workers., which is correct, but not what I intended to highlight. The problem is that the desired section is not uniquely identified by the current one-word textStart and textEnd values:

Non-intended text fragment scrolled into view and highlighted.
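
The ambiguity can be reproduced with a rough model of range matching: find the first occurrence of textStart, then the first occurrence of textEnd after it. The function matchRange and the sample string are assumptions made up for this sketch. Real browsers match on word boundaries in the rendered page, which this plain case-insensitive substring search does not attempt.

```javascript
// Simplified model of textStart,textEnd matching on a plain ASCII string.
function matchRange(pageText, textStart, textEnd) {
  const haystack = pageText.toLowerCase();
  const from = haystack.indexOf(textStart.toLowerCase());
  if (from === -1) return null;
  // textEnd is searched for only after the end of textStart.
  const endAt = haystack.indexOf(textEnd.toLowerCase(), from + textStart.length);
  if (endAt === -1) return null;
  return pageText.slice(from, endAt + textEnd.length);
}

// Made-up stand-in for the section: a heading followed by two sentences.
const section =
  'ECMAScript Modules in Web Workers ' +
  'Module scripts now work in dedicated Workers. ' +
  'Chrome 80 adds ES Modules in Web Workers.';

// One word on each side: stops at the first "Workers." after the start.
console.log(matchRange(section, 'ECMAScript', 'Workers.'));
// Longer values on each side: covers the whole sample section.
console.log(matchRange(section, 'ECMAScript Modules in Web Workers', 'ES Modules in Web Workers.'));
```

In this sample the one-word pair ends at "dedicated Workers." rather than at the end of the section, which mirrors the problem described above.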

prefix- and -suffix

Using long enough values for textStart and textEnd is one solution for obtaining a unique link. In some situations, however, this is not possible. On a side note, why did I choose the Chrome 80 release blog post as my example? The answer is that in this release Text Fragments were introduced:

Blog post text: Text URL Fragments.
    Users or authors can now link to a specific portion of a page
    using a text fragment provided in a URL.
    When the page is loaded, the browser highlights the text and scrolls the fragment into view.
    For example, the URL below loads a wiki page for 'Cat'
    and scrolls to the content listed in the `text` parameter.
Text Fragments announcement blog post excerpt.

Notice how in the screenshot above the word "text" appears four times. The fourth occurrence is written in a green code font. If I wanted to link to this particular word, I would set textStart to text. Since the word "text" is, well, only one word, there cannot be a textEnd. What now? The URL https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=text matches at the first occurrence of the word "Text" already in the heading:

Text Fragment matching at the first occurrence of "Text".

Caution: Note that text fragment matching is case-insensitive.

Luckily there is a solution. In cases like this, I can specify a prefix- and a -suffix. The word before the green code font "text" is "the", and the word after is "parameter". None of the other three occurrences of the word "text" has the same surrounding words. Armed with this knowledge, I can tweak the previous URL and add the prefix- and the -suffix. Like the other parameters, they, too, need to be percent-encoded and can contain more than one word. The resulting URL is https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=the-,text,-parameter. To allow the parser to clearly identify the prefix- and the -suffix, they need to be separated from the textStart and the optional textEnd with a dash -.

Text Fragment matching at the desired occurrence of "text".
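
To illustrate how the surrounding words narrow down the candidates, here is a small sketch that counts case-insensitive whole-word matches in a plain string, optionally requiring a word before and after. The function countMatches and the excerpt constant (the announcement text retyped without markup) are assumptions for this example. The regular expression is only an approximation of what a browser does.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(value) {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count whole-word, case-insensitive matches of textStart that are
// directly preceded by prefix and directly followed by suffix (if given).
function countMatches(pageText, { prefix, textStart, suffix }) {
  let source = `\\b${escapeRegExp(textStart)}\\b`;
  if (prefix) source = `(?<=\\b${escapeRegExp(prefix)}\\s+)${source}`;
  if (suffix) source = `${source}(?=\\s+${escapeRegExp(suffix)}\\b)`;
  return [...pageText.matchAll(new RegExp(source, 'gi'))].length;
}

const excerpt =
  'Text URL Fragments. Users or authors can now link to a specific portion ' +
  'of a page using a text fragment provided in a URL. When the page is ' +
  'loaded, the browser highlights the text and scrolls the fragment into ' +
  "view. For example, the URL below loads a wiki page for 'Cat' and " +
  'scrolls to the content listed in the text parameter.';

console.log(countMatches(excerpt, { textStart: 'text' })); // 4
console.log(countMatches(excerpt, { prefix: 'the', textStart: 'text' })); // 2
console.log(countMatches(excerpt, { prefix: 'the', textStart: 'text', suffix: 'parameter' })); // 1
```

In this excerpt the prefix alone still leaves two candidates, because "the text" also occurs in "highlights the text". Only the combination with the suffix is unique.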

The full syntax

The full syntax of Text Fragments is shown below. (Square brackets indicate an optional parameter.) The values for all parameters need to be percent-encoded. This is especially important for the dash -, ampersand &, and comma , characters, so that they are not interpreted as part of the text directive syntax.

#:~:text=[prefix-,]textStart[,textEnd][,-suffix]
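
The full syntax can be assembled mechanically. The following sketch uses a hypothetical buildTextDirective helper (the name and the shape of its options object are my own). It percent-encodes each parameter, escaping the dash by hand because encodeURIComponent() leaves it untouched.

```javascript
// Percent-encode one parameter, including the dash.
function encodePart(value) {
  return encodeURIComponent(value).replace(/-/g, '%2D');
}

// Hypothetical helper that assembles
// text=[prefix-,]textStart[,textEnd][,-suffix]
function buildTextDirective({ prefix, textStart, textEnd, suffix }) {
  if (!textStart) throw new TypeError('textStart is required');
  const parts = [];
  if (prefix) parts.push(`${encodePart(prefix)}-`);
  parts.push(encodePart(textStart));
  if (textEnd) parts.push(encodePart(textEnd));
  if (suffix) parts.push(`-${encodePart(suffix)}`);
  return `text=${parts.join(',')}`;
}

console.log(buildTextDirective({ prefix: 'the', textStart: 'text', suffix: 'parameter' }));
// text=the-,text,-parameter
console.log(buildTextDirective({ textStart: 'ECMAScript Modules', textEnd: 'Web Workers.' }));
// text=ECMAScript%20Modules,Web%20Workers.
```

The result is meant to be appended to a page URL after #:~:.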

Each of prefix-, textStart, textEnd, and -suffix will only match text within a single block-level element, but full textStart,textEnd ranges can span multiple blocks. For example, :~:text=The quick,lazy dog will fail to match in the following example, because the starting string "The quick" does not appear within a single, uninterrupted block-level element:

<div>The<div> </div>quick brown fox</div>
<div>jumped over the lazy dog</div>

It does, however, match in this example:

<div>The quick brown fox</div>
<div>jumped over the lazy dog</div>

Creating Text Fragment URLs with a browser extension

Creating Text Fragments URLs by hand is tedious, especially when it comes to making sure they are unique. If you really want to, the specification has some tips and lists the exact steps for generating Text Fragment URLs. We provide a browser extension called Link to Text Fragment that lets you link to any text by selecting it, and then clicking "Copy Link to Selected Text" in the context menu.

Link to Text Fragment browser extension.

Multiple text fragments in one URL

Note that multiple text fragments can appear in one URL. The particular text fragments need to be separated by an ampersand character &. Here is an example link with three text fragments: https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#:~:text=Text%20URL%20Fragments&text=text,-parameter&text=:~:text=On%20islands,%20birds%20can%20contribute%20as%20much%20as%2060%25%20of%20a%20cat's%20diet.

Three text fragments in one URL.
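
Going in the other direction, a URL string with several text directives can be taken apart again. The sketch below is a simplified reading of the syntax, not the specification's parsing algorithm, and the function name parseTextDirectives is an assumption. It splits on the ampersand, then on commas, and recognizes the prefix and suffix by their dashes.

```javascript
// Simplified parser for the part of a URL fragment after ":~:".
// decodeURIComponent() throws on malformed percent-encoding; a real
// implementation would need to handle that case.
function parseTextDirectives(fragment) {
  const delimiter = ':~:';
  const at = fragment.indexOf(delimiter);
  if (at === -1) return [];
  return fragment
    .slice(at + delimiter.length)
    .split('&')
    .filter((directive) => directive.startsWith('text='))
    .map((directive) => {
      const parts = directive.slice('text='.length).split(',');
      const result = {};
      // A leading part that ends with a dash is the prefix.
      if (parts.length > 1 && parts[0].endsWith('-')) {
        result.prefix = decodeURIComponent(parts.shift().slice(0, -1));
      }
      // A trailing part that starts with a dash is the suffix.
      if (parts.length > 1 && parts[parts.length - 1].startsWith('-')) {
        result.suffix = decodeURIComponent(parts.pop().slice(1));
      }
      result.textStart = decodeURIComponent(parts[0]);
      if (parts.length > 1) result.textEnd = decodeURIComponent(parts[1]);
      return result;
    });
}

const directives = parseTextDirectives(
  '#:~:text=Text%20URL%20Fragments&text=the-,text,-parameter'
);
console.log(directives);
```

For this input, the sketch yields two entries: one with only a textStart, and one with a prefix, a textStart, and a suffix.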

Mixing element and text fragments

Traditional element fragments can be combined with text fragments. It is perfectly fine to have both in the same URL, for example, to provide a meaningful fallback in case the original text on the page changes, so that the text fragment does not match anymore. The URL https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html#HTML1:~:text=Give%20us%20feedback%20in%20our%20Product%20Forums. linking to the Give us feedback in our Product Forums section contains both an element fragment (HTML1), as well as a text fragment (text=Give%20us%20feedback%20in%20our%20Product%20Forums.):

Linking with both element fragment and text fragment.

The fragment directive

There is one element of the syntax I have not explained yet: the fragment directive :~:. To avoid compatibility issues with existing URL element fragments as shown above, the Text Fragments specification introduces the fragment directive. The fragment directive is a portion of the URL fragment delimited by the code sequence :~:. It is reserved for user agent instructions, such as text=, and is stripped from the URL during loading so that author scripts cannot directly interact with it. User agent instructions are also called directives. In the concrete case, text= is therefore called a text directive.
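
The role of the :~: delimiter can be shown by splitting a URL string by hand. The helper splitFragment below is hypothetical and operates on a plain string. In a supporting browser the directive is stripped during loading, so a page's own scripts would not see it in the hash.

```javascript
// Split the fragment of a URL string into the element fragment and the
// fragment directive, using ":~:" as the delimiter.
function splitFragment(rawUrl) {
  const hashAt = rawUrl.indexOf('#');
  if (hashAt === -1) return { elementId: '', directive: '' };
  const fragment = rawUrl.slice(hashAt + 1);
  const delimiterAt = fragment.indexOf(':~:');
  if (delimiterAt === -1) return { elementId: fragment, directive: '' };
  return {
    elementId: fragment.slice(0, delimiterAt),
    directive: fragment.slice(delimiterAt + ':~:'.length),
  };
}

const mixed = splitFragment(
  'https://blog.chromium.org/2019/12/chrome-80-content-indexing-es-modules.html' +
    '#HTML1:~:text=Give%20us%20feedback%20in%20our%20Product%20Forums.'
);
console.log(mixed.elementId); // HTML1
console.log(mixed.directive); // text=Give%20us%20feedback%20in%20our%20Product%20Forums.
```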

Feature detection

To detect support, test for the read-only fragmentDirective property on Location.prototype. The fragment directive is a mechanism for URLs to specify instructions directed to the browser rather than the document. It is meant to avoid direct interaction with author script, so that future user agent instructions can be added without fear of introducing breaking changes to existing content. One potential example of such future additions could be translation hints.

if ('fragmentDirective' in Location.prototype) {
  // Text Fragments is supported.
}

Caution: This property might move to document.fragmentDirective in the future. For details see https://crbug.com/1057795.

Feature detection is mainly intended for cases where links are dynamically generated (for example by search engines) to avoid serving text fragments links to browsers that do not support them.
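
A more defensive variant of the check might look at both locations mentioned above and only then attach a text directive to a generated link. This is a sketch: the names supportsTextFragments and linkFor are invented, and the globalObject parameter exists only so that the check can be exercised outside a browser.

```javascript
// Returns true if either known location of fragmentDirective is present.
function supportsTextFragments(globalObject = globalThis) {
  const { Location, document } = globalObject;
  return Boolean(
    (Location && 'fragmentDirective' in Location.prototype) ||
      (document && 'fragmentDirective' in document)
  );
}

// Append the text directive only when it will be understood.
function linkFor(pageUrl, directive, supported = supportsTextFragments()) {
  return supported ? `${pageUrl}#:~:${directive}` : pageUrl;
}

console.log(linkFor('https://example.com/article', 'text=Hello%20world'));
```

In an environment without either property, the plain page URL is returned unchanged.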

Polyfillability

The Text Fragments feature can be polyfilled to some extent. There is early work in progress to create an extension for browsers that do not support Text Fragments natively where the functionality is implemented in JavaScript.

Security

Text fragment directives are invoked only on full (non-same-page) navigations that are the result of a user activation. Additionally, navigations originating from a different origin than the destination will require the navigation to take place in a noopener context, such that the destination page is known to be sufficiently isolated. Text fragment directives are only applied to the main frame. This means that text will not be searched inside iframes, and iframe navigation will not invoke a text fragment.

Privacy

It is important that implementations of the Text Fragments specification do not leak whether a text fragment was found on a page or not. While element fragments are fully under the control of the original page author, text fragments can be created by anyone. Remember how in my example above there was no way to link to the ECMAScript Modules in Web Workers heading, since the <h1> did not have an id, but how anyone, including me, could just link to anywhere by carefully crafting the text fragment?

Imagine I ran an evil ad network evil-ads.example.com. Further imagine that in one of my ad iframes I dynamically created a hidden cross-origin iframe to dating.example.com with a Text Fragment URL dating.example.com#:~:text=Log%20Out once the user interacts with the ad. If the text "Log Out" is found, I know the victim is currently logged in to dating.example.com, which I could use for user profiling. Since a naive Text Fragments implementation might decide that a successful match should cause a focus switch, on evil-ads.example.com I could listen for the blur event and thus know when a match occurred. In Chrome, we have implemented Text Fragments in such a way that the above scenario cannot happen.

Another attack might be to exploit network traffic based on scroll position. Assume I had access to my victim's network traffic logs, for example as the admin of a company intranet. Now imagine there existed a long human resources document What to Do If You Suffer From… and then a list of conditions like burn out, anxiety, etc. I could place a tracking pixel next to each item on the list. If I then determine that loading the document temporally co-occurs with the loading of the tracking pixel next to, say, the burn out item, I can then, as the intranet admin, determine that an employee has clicked through on a text fragment link with :~:text=burn%20out that the employee may have assumed was confidential and not visible to anyone. Since this example is somewhat contrived to begin with and since its exploitation requires very specific preconditions to be met, the Chrome security team evaluated the risk of implementing scroll on navigation to be manageable. Other user agents may decide to show a manual scroll UI element instead.

For sites that still wish to opt out, we have proposed a Document Policy header value that they can send, so user agents will not process Text Fragment URLs. Since Document Policy is not yet shipped, we are running an origin trial to apply this policy as an intermediate solution. The ForceLoadAtTop origin trial is running from Chrome version 83 to 84.

Document-Policy: force-load-at-top
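
How the header gets sent depends on the server. As one possible sketch, assuming a Node.js-style response object with a setHeader() method, a hypothetical helper could look like this:

```javascript
// Hypothetical helper; `response` is assumed to offer setHeader(name, value),
// as the response object of Node.js's built-in HTTP server does.
function sendForceLoadAtTop(response) {
  response.setHeader('Document-Policy', 'force-load-at-top');
}

// Stand-in response object that simply records the headers it is given.
const recordedHeaders = {};
sendForceLoadAtTop({
  setHeader(name, value) {
    recordedHeaders[name] = value;
  },
});
console.log(recordedHeaders);
```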

For some searches, the search engine Google provides a quick answer or summary with a content snippet from a relevant website. These featured snippets are most likely to show up when a search is in the form of a question. Clicking a featured snippet takes the user directly to the featured snippet text on the source web page. This works thanks to automatically created Text Fragments URLs.

Google search engine results page showing a featured snippet. The status bar shows the Text Fragments URL.
After clicking through, the relevant section of the page is scrolled into view.

Conclusion

Text Fragment URLs are a powerful feature for linking to arbitrary text on webpages. The scholarly community can use them to provide highly accurate citation or reference links. Search engines can use them to deep link to text results on pages. Social networking sites can use them to let users share specific passages of a webpage rather than inaccessible screenshots. I hope you start using Text Fragment URLs and find them as useful as I do. Be sure to install the Link to Text Fragment browser extension.

Acknowledgements

Text Fragments was implemented and specified by Nick Burris and David Bokan, with contributions from Grant Wang. Thanks to Joe Medley for the thorough review of this article. Hero image by Greg Rakozy on Unsplash.
