foolip commented Nov 28, 2025

  • Remove the inert document from the HTML fragment parser
  • Add a target argument to the HTML/XML fragment parsing algorithms
  • At least two implementers are interested (and none opposed):
  • Tests are written and can be reviewed and commented upon at:
  • Implementation bugs are filed:
    • Chromium: …
    • Gecko: …
    • WebKit: …
    • Deno (only for timers, structured clone, base64 utils, channel messaging, module resolution, web workers, and web storage): …
    • Node.js (only for timers, structured clone, base64 utils, channel messaging, and module resolution): …
  • Corresponding HTML AAM & ARIA in HTML issues & PRs:
  • MDN issue is filed: …
  • The top of this comment includes a clear commit message to use.

(See WHATWG Working Mode: Changes for more details.)


/dynamic-markup-insertion.html ( diff )
/parsing.html ( diff )
/xhtml.html ( diff )

foolip changed the title from "foolip/fragment parser no inert doc" to "Remove the inert document from the HTML fragment parser" on Nov 28, 2025
foolip changed the title from "Remove the inert document from the HTML fragment parser" to "Remove the inert document from the HTML fragment parsing algorithm" on Nov 28, 2025
foolip commented Nov 28, 2025

This is speculative editing following #11669 (comment) to see what it would mean to remove the inert document from the HTML fragment parsing algorithm.

There are two questions, one small and one big.

Small: Is it necessary to put something on the stack of open elements to avoid violating assumptions elsewhere? At least Chromium and WebKit put a DocumentFragment on the stack of open elements, but that's not an element. In a quick survey of uses of "stack of open elements" I couldn't find anything that would break if the stack were empty. If something does depend on it, perhaps the context element, or a shallow copy of it, could be placed on the stack of open elements.

Big: What were the side effects of using an inert document that implementations might have achieved in some other way, and that also need to be spec'd?

The main reason for exploring this is to pave the way for streamHTMLUnsafe() to simply insert directly into the target node. It's not strictly necessary, though: the inert document could be kept in the definitions of the existing APIs if changing them is too risky.

cc @zcorpan

annevk left a comment

I think I agree this is what we need to do, but I don't want to lose sight of the requirements for streamHTML() (and setHTML()) while we do this. For those cases we do still want to create in a separate document (and then maybe mutate) before moving things over.

data-x="concept-tree-child">children</span>, in <span>tree order</span>.</p></li>
<li><p><span data-x="concept-node-append">Append</span> the resulting <code>Document</code>
node's <span>document element</span>'s <span data-x="concept-tree-child">children</span> to
<var>target</var>, in <span>tree order</span>.</p></li>
Member

This will create the wrong mutation records.
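
One possible reading of this concern, which the comment itself does not spell out: in the DOM, inserting a DocumentFragment queues a single tree mutation record that lists all of the fragment's children as added nodes, whereas appending nodes one at a time queues one record per node. The sketch below is a toy model of that difference, not the real DOM API; `ToyParent`, `MutationRecordLike`, and the node names are hypothetical.

```typescript
// Toy model of tree mutation records. Nothing here is a real DOM interface.
interface MutationRecordLike {
  target: string;
  addedNodes: string[];
}

class ToyParent {
  name: string;
  children: string[] = [];
  records: MutationRecordLike[] = [];

  constructor(name: string) {
    this.name = name;
  }

  // Models appending a single node: one record per call.
  append(node: string): void {
    this.children.push(node);
    this.records.push({ target: this.name, addedNodes: [node] });
  }

  // Models inserting a fragment: all nodes are added, and a single
  // record lists them together.
  appendFragment(nodes: string[]): void {
    this.children.push(...nodes);
    this.records.push({ target: this.name, addedNodes: [...nodes] });
  }
}

// Hypothetical result of parsing some markup.
const parsed = ["p", "#text", "span"];

const viaFragment = new ToyParent("target");
viaFragment.appendFragment(parsed);

const oneByOne = new ToyParent("target");
for (const node of parsed) {
  oneByOne.append(node);
}

console.log(viaFragment.records.length, oneByOne.records.length);
```

Both parents end up with the same children, but `viaFragment` holds one record with three added nodes while `oneByOne` holds three records with one added node each. A MutationObserver on the target would see that difference.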

<li><p>Let <var>document</var> be a <code>Document</code> node whose <span
data-x="concept-document-type">type</span> is "<code data-x="">html</code>".</p></li>
<li><p>Let <var>parser</var> be a new <span>HTML parser</span> associated with
<var>context</var>'s <span>node document</span>.</p></li>
Member

This seems wrong, as it would mean the parser could potentially manipulate that document, though exactly how this is layered today is unclear.

Member Author

Here are the bits that I could find beyond inserting nodes:

I'll have to check how implementations deal with these cases.

foolip commented Dec 1, 2025

> I think I agree this is what we need to do, but I don't want to lose sight of the requirements for streamHTML() (and setHTML()) while we do this. For those cases we do still want to create in a separate document (and then maybe mutate) before moving things over.

Do you mean for the sanitizer, or are there other reasons to use an intermediate document? My thinking was that we'd integrate the sanitizer into the parser so that it streams, which is needed to support streamHTML(), and then setHTML() could probably just use the same setup.
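
The two approaches being contrasted here can be sketched with a toy model. This is only an illustration under loud assumptions: `ToyNode`, `toyParse`, `streamInto`, `stageThenMove`, and the allow-list are all hypothetical, the "parser" is flat (no nesting), and nothing below reflects how streamHTML(), setHTML(), or the sanitizer are actually specified.

```typescript
// Toy model only: these types and functions are hypothetical and are not
// part of the HTML Standard or the Sanitizer API.
type ToyNode = { tag: string; text: string };

const allowedTags: Record<string, boolean> = { p: true, b: true };

function isAllowed(node: ToyNode): boolean {
  return allowedTags[node.tag] === true;
}

// Stand-in for a parser that hands over one node at a time.
function toyParse(input: ToyNode[], onNode: (node: ToyNode) => void): void {
  for (const node of input) {
    onNode(node);
  }
}

// Streaming: each node is checked as it is emitted and, if allowed,
// appended to the target immediately. Returns the target size after
// each emitted node.
function streamInto(target: ToyNode[], input: ToyNode[]): number[] {
  const sizes: number[] = [];
  toyParse(input, (node) => {
    if (isAllowed(node)) {
      target.push(node);
    }
    sizes.push(target.length);
  });
  return sizes;
}

// Intermediate container: parse everything, filter, then move it over.
function stageThenMove(target: ToyNode[], input: ToyNode[]): void {
  const staging: ToyNode[] = [];
  toyParse(input, (node) => {
    staging.push(node);
  });
  target.push(...staging.filter(isAllowed));
}

const input: ToyNode[] = [
  { tag: "p", text: "hello" },
  { tag: "script", text: "alert(1)" },
  { tag: "b", text: "world" },
];

const streamed: ToyNode[] = [];
const sizes = streamInto(streamed, input);

const batched: ToyNode[] = [];
stageThenMove(batched, input);
```

In this model both targets end up with the same nodes. The difference is that the streaming target grows while parsing is still in progress, whereas the staged target is untouched until parsing and filtering have finished.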
