serenity - The Serenity Operating System 🐞

Age	Commit message	Author
2022-03-28	LibWeb: Load X(HT)ML documents and transform them into HTML DOM	Ali Mohammad Pur

2022-03-26	LibWeb: Move HTML dimension value parsing from CSS to HTML namespace	Andreas Kling
	These are part of HTML, not CSS, so let's not confuse things.
2022-03-24	LibWeb: Rename PARSER_DEBUG => HTML_PARSER_DEBUG	Idan Horowitz
	Since this macro was created we gained a couple more parsers in the system :^)
2022-03-24	LibWeb: Remove inheritance of FormAssociatedElement from HTMLElement	Timothy Flynn
	HTMLObjectElement will need to be both a FormAssociatedElement and a BrowsingContextContainer. Currently, both of these classes inherit from HTMLElement. This can work in C++, but is generally frowned upon, and doesn't play particularly well with the rest of LibWeb.
	Instead, we can essentially revert commit 3bb5c62 to remove HTMLElement from FormAssociatedElement's hierarchy. This means that objects such as HTMLObjectElement individually inherit from FormAssociatedElement and HTMLElement now.
	Some caveats are:
	* FormAssociatedElement still needs to know when the HTMLElement is inserted into and removed from the DOM. This hook is automatically injected via a macro now, while still allowing classes like HTMLInputElement to also know when the element is inserted.
	* Casting from a DOM::Element to a FormAssociatedElement is now a sideways cast, rather than directly following an inheritance chain. This means static_cast cannot be used here; but we can safely use dynamic_cast since the only 2 instances of this already use RTTI to verify the cast.
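The sideways cast described in the second caveat can be illustrated with a minimal sketch. The class names below are simplified stand-ins for the LibWeb classes named in the commit message, and `as_form_associated` is a hypothetical helper, not a LibWeb function.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for the LibWeb classes.
struct Element {
    virtual ~Element() = default;
};

struct HTMLElement : Element { };

// A mixin that does not derive from HTMLElement (or Element).
struct FormAssociatedElement {
    virtual ~FormAssociatedElement() = default;
    bool is_form_associated() const { return true; }
};

// Inherits from both bases individually.
struct HTMLObjectElement final
    : HTMLElement
    , FormAssociatedElement { };

struct HTMLDivElement final : HTMLElement { };

// Element and FormAssociatedElement are unrelated types, so a static_cast
// between them does not compile. dynamic_cast performs the cross-cast at
// run time and yields nullptr when the object lacks that base.
inline FormAssociatedElement* as_form_associated(Element& element)
{
    return dynamic_cast<FormAssociatedElement*>(&element);
}
```

Calling `as_form_associated` on an `HTMLObjectElement` returns a usable pointer, while calling it on an `HTMLDivElement` returns `nullptr`.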
2022-03-21	LibTextCodec: Don't allocate Strings on encoding normalisation	Hendiadyoin1
	This ripples down to LibWeb's HTML and XHR decoders, which therefore become less allocation heavy.
2022-03-21	LibWeb: Implement "has element in select scope" per-spec	Simon Wanner
	The HTML Specification is quite tricky in this case. Usually "have a particular element in <x> scope" mentions "consisting of the following element types:", but in this case it's "consisting of all element types except the following:".
	Thanks to @AtkinsSJ for spotting this difference.
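A rough sketch of the inverted scope check follows. It assumes the stack of open elements is modelled as a plain vector of tag names, and that the excepted element types for select scope are optgroup and option, as in the HTML specification. All function and type names here are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Simplified stand-in: the stack of open elements holds tag names only,
// with the most recently opened element at the back.
using StackOfOpenElements = std::vector<std::string>;

// Walks from the current node upwards. Reaching the target is a match;
// reaching a boundary element first is a failure.
template<typename IsBoundary>
bool has_in_scope_impl(StackOfOpenElements const& stack, std::string_view target, IsBoundary is_boundary)
{
    for (auto it = stack.rbegin(); it != stack.rend(); ++it) {
        if (*it == target)
            return true;
        if (is_boundary(*it))
            return false;
    }
    return false;
}

// Select scope inverts the usual list: every element type is a boundary
// except optgroup and option.
inline bool has_in_select_scope(StackOfOpenElements const& stack, std::string_view target)
{
    return has_in_scope_impl(stack, target, [](std::string const& tag) {
        return tag != "optgroup" && tag != "option";
    });
}
```

With this shape, the other scope variants would differ only in the predicate they pass, which lists the boundary elements directly instead of negating a list.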
2022-03-20	LibWeb: Implement the rest of the Adoption Agency Algorithm	Simon Wanner
	This gets us 2 points on html5test.com :^)
	- Before: https://html5te.st/4cf57659bc08272e (208)
	- After: https://html5te.st/fb8a9259bda1c115 (210)
2022-03-19	LibWeb: Only delay "load" event for script elements that load something	Andreas Kling
	We shouldn't delay the load event for scripts that we're completely refusing to run anyway. Also, for scripts that have inline text content, we don't need to delay them either, as they will become ready before returning from "prepare script". This makes the "load" event finally fire on lots of websites, including Wikipedia. :^)
2022-03-19	LibWeb: Don't delay document "load" event for unclosed script tags	Andreas Kling
	We previously had a bug where markup with unclosed script tags caused the document load event to be delayed indefinitely. Fix this by only marking script elements as delaying the load event once we encounter the script end tag.
2022-03-17	Libraries: Use default constructors/destructors in LibWeb	Lenny Maiorani
	https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#cother-other-default-operation-rules "The compiler is more likely to get the default semantics right and you cannot implement these functions better than the compiler."
2022-03-14	LibWeb: Use inline script tag source line as javascript line offset	Idan Horowitz
	This makes JS exception line numbers meaningful for inline script tags.
2022-03-08	LibWeb: Move Window from DOM directory & namespace to HTML	Linus Groh
	The Window object is part of the HTML spec. :^) https://html.spec.whatwg.org/multipage/window-object.html
2022-03-02	LibWeb: Fix issue where double-quoted doctype system ID was not captured	Andreas Kling
	We were storing double-quoted system IDs in the public ID field. 1% progression on ACID3. :^)
2022-03-01	LibWeb: Associate form elements with a form in parsing and dynamically	Luke Wilde
	This makes it available for all form associated elements and not just select and input elements. It also makes it more spec compliant, especially around the form attribute. The main thing missing is re-associating form elements with a form attribute when the form attribute changes or an element with an ID is inserted/removed or has its ID changed.
2022-02-21	LibWeb: Make document.write() work while document is parsing	Andreas Kling
	This necessitated making HTMLParser ref-counted, and having it register itself with Document when created. That makes it possible for scripts to add new input at the current parser insertion point. There is now a reference cycle between Document and HTMLParser. This cycle is explicitly broken by calling Document::detach_parser() at the end of HTMLParser::run(). This is a huge progression on ACID3, from 31% to 49%! :^)
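The reference cycle and its explicit break can be sketched as below. This uses std::shared_ptr in place of the ref-counting used in the actual code base, and `Document`, `Parser`, and `create_parser` are simplified, hypothetical stand-ins. Only the name `detach_parser()` comes from the commit message.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Parser;

struct Document {
    std::shared_ptr<Parser> parser;

    // Drops the Document -> Parser edge, breaking the cycle.
    void detach_parser() { parser.reset(); }
};

struct Parser {
    explicit Parser(std::shared_ptr<Document> document)
        : document(std::move(document))
    {
    }

    // The caller must hold its own reference to the parser while this runs,
    // since detach_parser() releases the one held by the document.
    void run()
    {
        // ... tokenization and tree construction would happen here ...
        document->detach_parser();
    }

    std::shared_ptr<Document> document;
};

// Creates a parser and registers it with the document, forming the cycle.
inline std::shared_ptr<Parser> create_parser(std::shared_ptr<Document> const& document)
{
    auto parser = std::make_shared<Parser>(document);
    document->parser = parser;
    return parser;
}
```

Without the `detach_parser()` call at the end of `run()`, the two objects would keep each other alive after all outside references are gone.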
2022-02-21	LibWeb: Add basic support for dynamic markup insertion	Lorenz Steinert
	This implements basic support for dynamic markup insertion, adding:
	* Document::open()
	* Document::write(Vector<String> const&)
	* Document::writeln(Vector<String> const&)
	* Document::close()
	The HTMLParser is modified to make it possible to create a script-created parser which initially only contains an HTMLTokenizer without any data. Additionally, the HTMLParser::run method gains an overload which does not modify the Document and does not run HTMLParser::the_end(), so that we can reenter the parser at a later time. Furthermore, all FIXMEs that concern the insertion point, which is defined in the HTMLTokenizer, are implemented. Additionally, the following member variables of the HTMLParser are now exposed by getter functions:
	* m_tokenizer
	* m_aborted
	* m_script_nesting_level
	The HTMLTokenizer is modified so that it contains an insertion point which keeps track of where the next input from the Document::write functions will be inserted. The insertion point is implemented as the character offset into m_decoded_input and a boolean describing whether the insertion point is defined. Functions to update, check and {re}store the insertion point are also added. The function HTMLTokenizer::insert_eof is added to tell a script-created parser that Document::close() was called and HTMLParser::the_end() should be called. Lastly, an explicit default constructor is added to HTMLTokenizer to create an empty HTMLTokenizer into which data can be inserted.
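The insertion point representation described above (a character offset plus a "defined" flag) might look roughly like the following sketch. `TokenizerInput` and its member functions are hypothetical names chosen for illustration, and std::string stands in for the decoded input buffer. Only `m_decoded_input` is taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

class TokenizerInput {
public:
    explicit TokenizerInput(std::string input = {})
        : m_decoded_input(std::move(input))
    {
    }

    void set_insertion_point(std::size_t offset)
    {
        assert(offset <= m_decoded_input.size());
        m_insertion_point_offset = offset;
        m_insertion_point_defined = true;
    }

    void undefine_insertion_point() { m_insertion_point_defined = false; }
    bool is_insertion_point_defined() const { return m_insertion_point_defined; }
    std::size_t insertion_point_offset() const { return m_insertion_point_offset; }

    // Inserts new input at the insertion point and advances the point past
    // it, so consecutive writes appear in the order they were made.
    void insert_at_insertion_point(std::string_view input)
    {
        assert(m_insertion_point_defined);
        m_decoded_input.insert(m_insertion_point_offset, input.data(), input.size());
        m_insertion_point_offset += input.size();
    }

    std::string const& decoded_input() const { return m_decoded_input; }

private:
    std::string m_decoded_input;
    std::size_t m_insertion_point_offset { 0 };
    bool m_insertion_point_defined { false };
};
```

Advancing the offset after each insertion is what lets two successive writes land one after the other rather than in reverse order.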
2022-02-21	LibWeb: Fix 'Comment end state' in HTML Tokenizer	Adam Hodgen
	Also, update the expected hash in the LibWeb TestHTMLTokenizer regression test. This is due to the "This comment has a few too many dashes." comment token being updated.
2022-02-21	LibWeb: Implement tokenization newline preprocessing	Adam Hodgen
	Newline normalization will replace \r and \r\n with \n. The spec specifically states:
	> Before the tokenization stage, the input stream must be preprocessed
	> by normalizing newlines.
	whereas this change performs the processing during tokenization itself. This should still exhibit the same behaviour, while keeping the tokenization logic in the same place.
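For reference, the normalization rule itself can be written as a standalone pass. This is only a sketch of the rule: the commit applies it inside the tokenizer rather than as a separate preprocessing step, and `normalize_newlines` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Replaces every \r\n pair and every lone \r with a single \n.
inline std::string normalize_newlines(std::string_view input)
{
    std::string output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '\r') {
            output.push_back('\n');
            // Skip the \n of a \r\n pair so it is not emitted twice.
            if (i + 1 < input.size() && input[i + 1] == '\n')
                ++i;
        } else {
            output.push_back(input[i]);
        }
    }
    return output;
}
```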
2022-02-21	LibWeb: Fix off by one error in HTML Tokenizer	Adam Hodgen
	In 'NamedCharacterReference' we attempt to look up the code point by an identifier, e.g. apos; becomes '. This is done by passing the entire rest of the document to the `HTML::code_points_from_entity` function. However, before this change we didn't send the final character, which meant that if the document ended in a named character reference, the lookup would fail.
2022-02-20	LibWeb: Handle markers when reconstructing active formatting elements	Luke Wilde
	The entry we get from the active formatting elements list during the Rewind step of "reconstruct the active formatting elements" can be a marker. Previously we assumed it was not a marker, which can trigger an assertion failure with certain malformed HTML. If the entry in this step is a marker, the spec simply ignores it. This is step 6 of the algorithm. This also makes the index unsigned, as this algorithm is a no-op if the list is empty. Additionally, this also adds spec comments to this algorithm. Fixes #12668.
2022-02-19	LibWeb: Use Vector::clear_with_capacity() in HTMLTokenizer	Andreas Kling
	This avoids constantly reallocating the Vector<HTMLToken>.
2022-02-15	LibWeb: Fail gracefully when reaching the unimplemented part of the AAA	Linus Groh
	Pages such as https://html5test.com are testing all sorts of weird, incomplete, and wrong HTML but can be useful or at least interesting for development - let's try to avoid crashing the process.
2022-02-15	LibWeb: Implement state switch for "[CDATA[" in HTML parser	Linus Groh

2022-02-15	LibWeb: Add an optional pointer to an HTMLParser to the HTMLTokenizer	Linus Groh
	This is needed to access the 'adjusted current node' in the 'Markup declaration open state'. We don't want to create a full parser for something like syntax highlighting, so it's optional (null) by default.
2022-02-15	LibWeb: Remove unused HTMLParser function declaration	Linus Groh
	There is no implementation of this function: HTMLParser::stack_of_open_elements_has_element_with_tag_name_in_scope
2022-02-15	LibWeb: Add spec links to each HTML tokenizer state section	Linus Groh
	I didn't add full spec comments this time, but this is better than nothing :^)
2022-02-15	LibWeb: Add spec comments to the StackOfOpenElements class	Andreas Kling

2022-02-15	LibWeb: Rename element_before() => element_immediately_above()	Andreas Kling
	This matches the spec terminology around the "stack of open elements".
2022-02-15	LibWeb: Add spec comments to find_appropriate_place_for_inserting_node()	Andreas Kling

2022-02-14	LibWeb: Don't emit current token on EOF in HTML Tokenizer	Karol Kosek
	Emitting tokens on EOF caused an infinite loop, freezing the app, which could be a bit annoying when writing an HTML comment at the end of the file in Text Editor. :^)
2022-02-14	LibWeb: Fix highlighting HTML comments	Karol Kosek
	Commit b193351a99 caused the HTML comments to flash when changing the text cursor. Also, when double-clicking on a comment, the selection started from the beginning of the file instead.
	The following message was displaying when `TOKENIZER_TRACE_DEBUG` was enabled:
	(Tokenizer::nth_last_position) Invalid position requested: 4th-last of 4. Returning (0-0).
	Changing the `nth_last_position` to 3 fixes this. I'm guessing that's because the parser is at that moment on the second hyphen of the `<!--` string, so it has to go back only by three characters.
2022-02-13	LibWeb: Fix off-by-one in HTMLTokenizer::restore_to()	MacDue
	The difference should be between m_utf8_iterator and the new position; if m_prev_utf8_iterator is used, one fewer source position is popped than required. This issue was not apparent on most pages, since restore_to is used for tokens such as <!doctype> that are normally followed by a newline that resets the column to zero, but it can be seen on pages with minified HTML.
2022-02-08	LibWeb: Introduce the Environment Settings Object	Luke Wilde
	The environment settings object is effectively the context a piece of script is running under, for example, it contains the origin, responsible document, realm, global object and event loop for the current context. This effectively replaces ScriptExecutionContext, but it cannot be removed in this commit as EventTarget still depends on it. https://html.spec.whatwg.org/multipage/webappapis.html#environment-settings-object
2021-12-10	LibWeb: Fix off-by-one error when highlighting unquoted HTML attributes	Sam Atkins
	This fixes #11166
2021-12-05	LibWeb: Cast unused smart-pointer return values to void	Sam Atkins

2021-11-11	Everywhere: Pass AK::StringView by value	Andreas Kling

2021-10-17	LibWeb: Implement Attribute closer to the spec and with an IDL file	Timothy Flynn
	Note our Attribute class is what the spec refers to as just "Attr". The main differences between the existing implementation and the spec are just that the spec defines more fields.
	Attributes can contain namespace URIs and prefixes. However, note that these are not parsed in HTML documents unless the document content-type is XML. So for now, these are initialized to null. Web pages are able to set the namespace via JavaScript (setAttributeNS), so these fields may be filled in when the corresponding APIs are implemented.
	The main change to be aware of is that an attribute is a node. This has implications on how attributes are stored in the Element class. Nodes are non-copyable and non-movable because these constructors are deleted by the EventTarget base class. This means attributes cannot be stored in a Vector or HashMap as these containers assume copyability / movability. So for now, the Vector holding attributes is changed to hold RefPtrs to attributes instead. This might change when attribute storage is implemented according to the spec (by way of NamedNodeMap).
2021-10-10	LibWeb: Remove dead "outer loop" code in adoption agency algorithm	Brian Gianforcaro

2021-10-01	LibWeb: Check for HTML integration points in the tree constructor	Luke Wilde
	This particularly implements these two points:
	- "If the adjusted current node is an HTML integration point and the token is a start tag"
	- "If the adjusted current node is an HTML integration point and the token is a character token"
	This also adds spec comments to the tree constructor.
2021-09-26	LibWeb: Add the PageTransitionEvent interface and fire "pageshow" events	Andreas Kling
	We now fire "pageshow" events at the appropriate time during document loading (done by the parser.) Note that there are no corresponding "pagehide" events yet.
2021-09-26	LibWeb: Add a "page showing" flag to documents	Andreas Kling
	This will be used to determine whether "pageshow" and "pagehide" events are appropriate. We won't actually make use of it until we implement more of history traversal and document unloading.
2021-09-26	LibWeb: Implement "update the current document readiness" from spec	Andreas Kling
	The only difference from what we were already doing is that setting the same ready state twice no longer fires a "readystatechange" event. I don't think that could happen in practice though.
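The behaviour change described above amounts to an early return when the state is unchanged. A minimal sketch, assuming a three-value ready state and a callback in place of real event dispatch (`ReadyStateTracker` and its members are hypothetical names):

```cpp
#include <cassert>
#include <functional>
#include <utility>

enum class ReadyState {
    Loading,
    Interactive,
    Complete,
};

class ReadyStateTracker {
public:
    explicit ReadyStateTracker(std::function<void(ReadyState)> on_ready_state_change)
        : m_on_ready_state_change(std::move(on_ready_state_change))
    {
    }

    void update_readiness(ReadyState new_state)
    {
        // Setting the same state again is a no-op and fires no event.
        if (m_ready_state == new_state)
            return;
        m_ready_state = new_state;
        if (m_on_ready_state_change)
            m_on_ready_state_change(new_state);
    }

    ReadyState ready_state() const { return m_ready_state; }

private:
    ReadyState m_ready_state { ReadyState::Loading };
    std::function<void(ReadyState)> m_on_ready_state_change;
};
```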
2021-09-26	LibWeb: Store HTML document ready state as an enum	Andreas Kling

2021-09-26	LibWeb: Allow HTML parser to delay delivery of the document "load" event	Andreas Kling
	We will now spin in "the end" until there are no more "things delaying the load event". Of course, nothing actually uses this yet, and there are a lot of things that need to.
2021-09-26	LibWeb: Implement more of HTMLParser::the_end() and bring closer to spec	Andreas Kling

2021-09-26	LibWeb: Split out "The end" from the HTML parsing spec to a function	Andreas Kling
	Also add a spec link and some comments.
2021-09-25	LibWeb: Rename HTMLDocumentParser => HTMLParser	Andreas Kling

2021-09-21	Libraries: Use AK::Variant default initialization where appropriate	Ben Wiederhake

2021-09-20	LibWeb: Make <script src> loads partially async (by following the spec)	Andreas Kling
	Instead of firing up a network request and synchronously blocking for it to finish via a nested event loop, we now start an asynchronous request when encountering <script src>. Once the script load finishes (or fails), it gets executed at one of the synchronization points in the HTML parser. This solves some long-standing issues with random unexpected events getting dispatched in the middle of parsing.
2021-09-20	LibWeb: Pop entire stack of open elements at the end of parsing	Andreas Kling