- From: Benjamin Aster <notifications@github.com>
- Date: Sun, 30 Jul 2023 03:53:58 -0700
- To: whatwg/dom <dom@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/dom/issues/1217@github.com>
I think there are valid use cases for DOM APIs like `DOMParser`, `XMLSerializer`, `document.implementation.createDocument()` etc. to be available in web workers. I don't mean having direct access to the current document (that wouldn't make sense, of course), but being able to parse, create, modify and serialize "offscreen" documents. Use cases for this include:

- **Parsing & serializing XML files off the main thread**: For example, I'm currently working on a web-based rich text editor and Microsoft Word alternative, and I'm planning to add DOCX (Microsoft Word document file) support to it in the future. A DOCX file basically consists of a bunch of XML files zipped into a compressed archive. I can [(un)compress the zip file with the help of (De)CompressionStream](https://dev.to/ndesmic/writing-a-simple-browser-zip-file-decompressor-with-compressionstreams-5che) and then parse the XML files with `DOMParser` or create them with `XMLSerializer`. Currently, this has to be done on the main thread, which makes the page unresponsive while reading/writing DOCX files. Some projects like @jakearchibald's [SVGOMG](https://github.com/jakearchibald/svgomg), an SVG optimizer & minifier based on [SVGO](https://npmjs.com/package/svgo), currently even use XML parsing libraries like [Sax](https://www.npmjs.com/package/@trysound/sax) instead of the browser's `DOMParser`, among other reasons to make them work in web workers.
- **Generating HTML files off the main thread**: Applications that generate HTML files (be it website builders, [math document editors](https://github.com/BenjaminAster/PAMM), Markdown to HTML transpilers, etc.) could profit immensely from being able to convert their internal representations to HTML off the main thread.

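A minimal sketch of the XML use case above, assuming `DOMParser` and `XMLSerializer` were exposed in workers as proposed. The helper names `hasOffscreenDom` and `roundTripXml` and the `data-processed` attribute are hypothetical and only serve as illustration. Since workers do not expose these interfaces today, the code feature-detects them and returns `null` when they are missing.

```javascript
// Hypothetical helper (not part of any spec): checks whether the given global
// scope exposes the two interfaces this sketch relies on.
function hasOffscreenDom(scope = globalThis) {
  return typeof scope.DOMParser === "function" &&
    typeof scope.XMLSerializer === "function";
}

// Hypothetical helper: parses an XML string, marks the root element,
// and serializes the document back to a string.
function roundTripXml(xmlText, scope = globalThis) {
  // In today's workers the interfaces are missing, so this returns null.
  if (!hasOffscreenDom(scope)) return null;

  const doc = new scope.DOMParser().parseFromString(xmlText, "application/xml");

  // XML parse errors are reported as a <parsererror> element, not thrown.
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new SyntaxError("Could not parse XML");
  }

  // Modify the offscreen document with ordinary DOM methods.
  doc.documentElement.setAttribute("data-processed", "true");

  return new scope.XMLSerializer().serializeToString(doc);
}
```

In a worker, the result could be sent back with `self.postMessage(roundTripXml(event.data))` from a `message` handler, with the main thread falling back to its own `DOMParser` when it receives `null`.
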
As of a few months ago, all three major browser engines support worker modules and `OffscreenCanvas`, so I think websites are starting to do more and more expensive stuff off the main thread, with people like @surma having [advocated](https://www.youtube.com/watch?v=7Rrv9qFMWNM) for that for years.

From a technical perspective, my proposal is that e.g. a global `self.document` property is exposed in workers, which is a stripped-down version of `Document` containing only the following properties and functions:

- `self.document.implementation`
- `self.document.createAttribute()`
- `self.document.createAttributeNS()`
- `self.document.createCDATASection()`
- `self.document.createComment()`
- `self.document.createDocumentFragment()` (?)
- `self.document.createElement()`
- `self.document.createElementNS()`
- `self.document.createEvent()`
- `self.document.createExpression()`
- `self.document.createProcessingInstruction()`
- `self.document.createRange()` (?)
- `self.document.createTextNode()`

Additionally, the following interfaces should be exposed in workers:

- `Document` & `XMLDocument`
- `DocumentType`
- `DOMImplementation`
- `DocumentFragment`
- [`DOMParser`](https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#dom-parsing-and-serialization)
- [`XMLSerializer`](https://w3c.github.io/DOM-Parsing/#the-xmlserializer-interface)
- `XSLTProcessor`
- [`Sanitizer`](https://wicg.github.io/sanitizer-api/#sanitizer-api)
- `Node`
- `ParentNode`
- `Attr`
- `CharacterData`
- `Text`
- `CDATASection`
- `Element`
- `Comment`
- [`HTMLElement`](https://html.spec.whatwg.org/multipage/dom.html#htmlelement) and all HTML element interfaces
- [`SVGElement`](https://svgwg.org/svg2-draft/types.html#InterfaceSVGElement) and all SVG element interfaces
- [`MathMLElement`](https://w3c.github.io/mathml-core/#dom-and-javascript)
- `NodeList`
- `HTMLCollection`
- `AbstractRange`, `StaticRange` & `Range`
- `MutationObserver` & `MutationRecord` (?)
- `NamedNodeMap`
- `ProcessingInstruction`
- `XPathResult`, `XPathExpression` & `XPathEvaluator`

One could then use `new DOMParser().parseFromString()` or `self.document.implementation.{createDocument(), createHTMLDocument()}` to create a new document, modify it with all the usual and beloved DOM methods, and stringify it with `new XMLSerializer().serializeToString()` or `myOffscreenDocument.documentElement.outerHTML`.

Things like [`Element.prototype.getClientRects()`](https://drafts.csswg.org/cssom-view/#ref-for-dom-element-getclientrects) or [`Element.prototype.computedStyleMap()`](https://drafts.css-houdini.org/css-typed-om/#ref-for-dom-element-computedstylemap) don't make sense with offscreen documents, of course, but that is already the case with documents created on the main thread with `DOMParser` or `document.implementation.createHTMLDocument()`.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/dom/issues/1217
Received on Sunday, 30 July 2023 10:54:04 UTC