From f52ede23aa5399120d3271605cc7ea0cd28baa18 Mon Sep 17 00:00:00 2001
From: Luke Wilde
Date: Tue, 11 Apr 2023 19:19:30 +0100
Subject: LibWeb: Return from "the end" during HTML fragment parsing

This examines the algorithm known as "the end" from the HTML specification, which executes when parsing of HTML markup has completed, and its potential to observably run script or change certain attributes.

"The end" currently executes in our engine in four cases: when parsing HTML received from the internet during navigation, when using document.{open,write,close}, when setting the innerHTML attribute, and when using DOMParser. The latter two are only possible by executing script.

Running it for those two cases has been causing some issues in our engine, which will be shown later, so we are considering removing the call to "the end" for them.

Spoiler: the implications of running "the end" for DOMParser will be considered in the future. After this commit, it is the only remaining script-created HTML/XML parser that uses "the end", including its XML variant implemented as XMLDocumentBuilder::document_end().

This commit focuses only on setting the innerHTML attribute, which falls under "HTML fragment parsing". That algorithm starts here in the specification:
https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments
https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L3491

You may notice that our HTMLParser::parse_html_fragment returns `void` and, given our use of `WebIDL::ExceptionOr` and `JS::ThrowCompletionOr` elsewhere, assume this means no scripts are executed. However, dispatched events execute arbitrary script via a callback, catch any exceptions, report them and do not propagate them. This means that a function can still potentially execute script even though it does not return an exception type.
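To illustrate the point about `void` return types, here is a minimal, self-contained sketch in plain C++. It is not LibWeb code: `EventTarget`, `dispatch_event`, `reported_errors` and `parse_fragment` are hypothetical names invented for this example, and `std::function` callbacks stand in for script-provided event listeners.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an event target; this is NOT LibWeb's actual API.
struct EventTarget {
    std::vector<std::function<void()>> listeners;
    std::vector<std::string> reported_errors;

    // Returns void: anything thrown by a listener is caught and reported here,
    // so the signature gives the caller no hint that arbitrary code ran.
    void dispatch_event()
    {
        for (auto& listener : listeners) {
            try {
                listener();
            } catch (std::exception const& error) {
                // Report the error and carry on; it is not propagated.
                reported_errors.push_back(error.what());
            }
        }
    }
};

// Hypothetical void-returning "parser" step that fires an event internally.
void parse_fragment(EventTarget& target)
{
    target.dispatch_event();
}
```

In this sketch, a listener that throws leaves an entry in `reported_errors`, and a listener with side effects still runs, yet `parse_fragment` returns normally in both cases.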
A breakdown of the steps of "the end" in the context of HTML fragment parsing, and the observability of each, follows:
https://html.spec.whatwg.org/multipage/parsing.html#the-end
https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L221

1. No-op, as we don't currently have speculative HTML parsing. Even if we did, we would return immediately after stopping the speculative HTML parser anyway.
2. No-op, as document.{open,write,close} are not accessible from the temporary document.
3. No-op, as document.readyState, window.navigation.timing and the readystatechange event are not accessible from the created temporary document.
4. This is presumably done so that reentrant invocation of the HTML parser from document.{write,close} during the firing of the events after step 4 ends up parsing from a clean state. This is a no-op, as the events after step 4 do not fire and are not accessible.
5. No-op, as we set HTMLScriptElement::m_already_started to true when creating a script element whilst parsing an HTML fragment. This causes HTMLScriptElement::prepare_script to bail immediately, meaning `scripts_to_execute_when_parsing_has_finished` is always empty.
6. No-op, as tasks are considered not runnable when the document does not have a browsing context, which is always the case in fragment parsing. Additionally, window.navigation.timing and the DOMContentLoaded event aren't reachable from the temporary document.
7. Almost a no-op, as `scripts_to_execute_as_soon_as_possible` is always empty for the same reason as step 5. However, this step uses an unconditional `spin_until` call, which _is_ observable and causes one of the issues alluded to above, which is discussed later.
8. No-op, as delaying the load event has no purpose in this case: the task in step 9 will set the current document readiness to "complete" and then return immediately, as the temporary document has no browsing context, skipping the Window load event.
However, this step causes another of the issues alluded to above, which is discussed later.
9. No-op, for the same reason as step 6. Additionally, document.readyState is not accessible from the temporary document and the temporary document has no browsing context, so navigation timing, the Window load event, the pageshow event, the Document load event and the `