serenity - The Serenity Operating System 🐞

Age	Commit message	Author
2022-12-06	AK+Everywhere: Rename String to DeprecatedString	Linus Groh
	We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)
2022-11-22	LibJS: Reduce AST memory usage by shrink-wrapping source range info	Andreas Kling
	Before this change, each AST node had a 64-byte SourceRange member. This SourceRange had the following layout:
		filename: StringView (16 bytes)
		start: Position (24 bytes)
		end: Position (24 bytes)
	The Position structs have { line, column, offset }, all members size_t.
	To reduce memory consumption, AST nodes now only store the following:
		source_code: NonnullRefPtr<SourceCode> (8 bytes)
		start_offset: u32 (4 bytes)
		end_offset: u32 (4 bytes)
	SourceCode is a new ref-counted data structure that keeps the filename and original parsed source code in a single location, and all AST nodes have a pointer to it. The start_offset and end_offset can be turned into (line, column) when necessary by calling SourceCode::range_from_offsets(). This will walk the source code string and compute line/column numbers on the fly, so it's not necessarily fast, but it should be rare since this information is primarily used for diagnostics and exception stack traces.
	With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a ~23% reduction in memory usage when loading twitter.com/awesomekling (330 MiB before, 253 MiB after!) :^)
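The on-demand offset-to-position walk described above can be sketched roughly as follows. This is a minimal illustration in standard C++, not the LibJS implementation: `LineColumn` and `position_from_offset` are hypothetical names, and only '\n' is treated as a line terminator, whereas a real JavaScript lexer has to recognize more.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for a (line, column) pair; not the LibJS Position type.
struct LineColumn {
    std::size_t line { 1 };
    std::size_t column { 1 };
};

// Walks the source text from the start and counts newlines up to `offset`.
// Each call costs O(offset), which is acceptable if it only runs when a
// diagnostic or stack trace is actually produced.
inline LineColumn position_from_offset(std::string_view source, std::uint32_t offset)
{
    assert(offset <= source.size());
    LineColumn position;
    for (std::uint32_t i = 0; i < offset; ++i) {
        if (source[i] == '\n') {
            // Assumption: '\n' is the only line terminator in this sketch.
            ++position.line;
            position.column = 1;
        } else {
            ++position.column;
        }
    }
    return position;
}
```

Storing two 32-bit offsets and computing positions lazily trades a little CPU time on the rare diagnostic path for a smaller node on the common path.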
2022-07-12	Everywhere: Add sv suffix to strings relying on StringView(char const*)	sin-ack
	Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.
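As a rough analogy in standard C++ (using std::string_view rather than AK::StringView), the difference between the char const* constructor and a literal suffix looks like this. The names `from_c_string` and `with_embedded_nul` are made up for the example.

```cpp
#include <cstddef>
#include <string_view>

using namespace std::string_view_literals;

// The char const* constructor has to scan for the terminating NUL to find
// the length, so it stops at the first '\0' it sees.
inline std::string_view from_c_string(char const* text)
{
    return std::string_view(text);
}

// The literal suffix receives the length from the compiler, so no scan is
// needed and embedded NUL bytes are preserved.
constexpr std::string_view with_embedded_nul = "a\0b"sv;
static_assert(with_embedded_nul.size() == 3);
```

Here `from_c_string("a\0b")` yields a view of length 1, while the suffixed literal keeps all 3 characters.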
2021-12-29	LibJS: Detect invalid unicode and stop lexing at that point	davidot
	Previously, we might swallow an invalid Unicode code point, which could skip valid ASCII characters. This could be dangerous, as we might skip a '"' and thus fail to close a string where we should. This might have been exploitable, as it would not have been clear what code gets executed when looking at a script. Another approach would be to simply replace all invalid characters with the replacement character (this is what V8 does), but our lexer and parser are currently not set up for such a change.
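One way to picture "stop at the first invalid sequence" is a validator that reports the offset of the first malformed UTF-8 sequence without consuming the bytes after it. The sketch below is an assumption-laden illustration in standard C++, not the LibJS lexer; `first_invalid_utf8` is a hypothetical helper.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Returns the byte offset of the first malformed UTF-8 sequence, or
// std::nullopt if the whole input is well-formed.
inline std::optional<std::size_t> first_invalid_utf8(std::string_view input)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static constexpr unsigned minimum[] = { 0, 0, 0x80, 0x800, 0x10000 };

    std::size_t i = 0;
    while (i < input.size()) {
        auto lead = static_cast<unsigned char>(input[i]);
        std::size_t length = 0;
        unsigned code_point = 0;
        if (lead < 0x80) {
            length = 1;
            code_point = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            length = 2;
            code_point = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            length = 3;
            code_point = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            length = 4;
            code_point = lead & 0x07;
        } else {
            return i; // stray continuation byte or invalid lead byte
        }

        if (i + length > input.size())
            return i; // sequence is cut off by the end of the input

        for (std::size_t j = 1; j < length; ++j) {
            auto byte = static_cast<unsigned char>(input[i + j]);
            // A non-continuation byte (for example a '"') must not be
            // swallowed as part of the broken sequence.
            if ((byte & 0xC0) != 0x80)
                return i;
            code_point = (code_point << 6) | (byte & 0x3Fu);
        }

        bool is_overlong = code_point < minimum[length];
        bool is_surrogate = code_point >= 0xD800 && code_point <= 0xDFFF;
        if (is_overlong || is_surrogate || code_point > 0x10FFFF)
            return i;

        i += length;
    }
    return std::nullopt;
}
```

With this shape, a lead byte followed directly by a quote character is reported at the lead byte's offset, so the quote is never skipped.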
2021-11-11	Everywhere: Pass AK::StringView by value	Andreas Kling

2021-09-18	LibJS: Make Lexer::s_keywords store keywords as FlyString	Andreas Kling
	This allows O(1) comparison against lexed keywords, since we lex to FlyString.
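The O(1) comparison comes from interning: equal strings share one stored copy, so equality reduces to a pointer comparison. The following is a minimal sketch of that idea in standard C++; `InternedString` is a hypothetical class, not AK::FlyString, and unlike a real implementation it never frees pooled strings and is not thread-safe.

```cpp
#include <string>
#include <string_view>
#include <unordered_set>

class InternedString {
public:
    // Looks the text up in the shared pool, inserting it if it is new.
    // Element addresses in std::unordered_set stay valid across rehashing.
    explicit InternedString(std::string_view text)
        : m_data(&*pool().emplace(text).first)
    {
    }

    // Equal text implies the same pool entry, so comparing pointers suffices.
    bool operator==(InternedString const& other) const { return m_data == other.m_data; }

    std::string const& text() const { return *m_data; }

private:
    static std::unordered_set<std::string>& pool()
    {
        static std::unordered_set<std::string> s_pool;
        return s_pool;
    }

    std::string const* m_data;
};
```

The cost of hashing is paid once at construction; every later comparison against a keyword is a single pointer check.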
2021-09-10	LibJS: Share "parsed identifiers" between copied JS::Lexer instances	Andreas Kling
	When we save/load state in the parser, we preserve the lexer state by simply making a copy of it. This was made extremely heavy by the lexer keeping a cache of all parsed identifiers. It keeps the cache to ensure that StringViews into parsed Unicode escape sequences don't become dangling views when the Token goes out of scope. This patch solves the problem by replacing the Vector<FlyString> which was used to cache the identifiers with a ref-counted HashTable<FlyString> instead. Since the purpose of the cache is just to keep FlyStrings alive, it's fine for all Lexer instances to share the cache. And as a bonus, using a HashTable instead of a Vector replaces the O(n) accesses with O(1) ones. This makes a 1.9 MiB JavaScript file parse in 0.6s instead of 24s. :^)
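The sharing scheme can be illustrated with a reference-counted hash set that every copy of the lexer points to. This is a sketch in standard C++ with hypothetical names (`SketchLexer`, `remember_identifier`); it uses std::shared_ptr and std::unordered_set in place of the AK types named above.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_set>
#include <utility>

class SketchLexer {
public:
    SketchLexer()
        : m_identifiers(std::make_shared<std::unordered_set<std::string>>())
    {
    }

    // The implicit copy constructor copies only the shared_ptr, so saving
    // and restoring lexer state no longer duplicates the whole cache.

    // Keeps the identifier alive and returns a view that remains valid for
    // as long as any lexer copy still references the shared cache.
    std::string_view remember_identifier(std::string identifier)
    {
        return *m_identifiers->insert(std::move(identifier)).first;
    }

    std::size_t cached_identifier_count() const { return m_identifiers->size(); }

    bool shares_cache_with(SketchLexer const& other) const
    {
        return m_identifiers == other.m_identifiers;
    }

private:
    std::shared_ptr<std::unordered_set<std::string>> m_identifiers;
};
```

Because the cache only exists to keep strings alive, it does not matter that an identifier added through one copy is visible through all the others.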
2021-08-24	LibJS: Fix some small remaining issues with parsing unicode escapes	davidot
	Added a test to ensure the behavior stays the same. We now throw a specific error on direct usage of an escaped keyword, to make the problem clearer to the user.
2021-08-19	LibJS: Allow Unicode escape sequences in identifiers	Timothy Flynn
	For example, "property.br\u{6f}wn" should resolve to "property.brown". To support this behavior, this commit changes the Token class to hold both the evaluated identifier name and a view into the original source for the unevaluated name. There are some contexts in which identifiers are not allowed to contain Unicode escape sequences; for example, export statements of the form "export {} from foo.js" forbid escapes in the identifier "from". The test file is added to .prettierignore because prettier will replace all escaped Unicode sequences with their unescaped value.
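Evaluating an identifier that contains escapes can be sketched as a small decoder that turns each escape into UTF-8 while copying everything else through. This is an illustrative sketch in standard C++, not the LibJS code; `decode_identifier_escapes` and its helpers are hypothetical, and the sketch does not check that the decoded code points are valid identifier characters.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

inline int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

inline void append_utf8(std::string& out, unsigned cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decodes the braced and the four-digit escape forms. Returns std::nullopt
// for malformed escapes, out-of-range values, and lone surrogates.
inline std::optional<std::string> decode_identifier_escapes(std::string_view raw)
{
    std::string result;
    std::size_t i = 0;
    while (i < raw.size()) {
        if (raw[i] != '\\') {
            result += raw[i++];
            continue;
        }
        if (i + 1 >= raw.size() || raw[i + 1] != 'u')
            return std::nullopt;
        i += 2;

        unsigned code_point = 0;
        std::size_t digits = 0;
        if (i < raw.size() && raw[i] == '{') {
            ++i;
            while (i < raw.size() && raw[i] != '}') {
                int value = hex_value(raw[i++]);
                if (value < 0)
                    return std::nullopt;
                code_point = code_point * 16 + static_cast<unsigned>(value);
                if (code_point > 0x10FFFF)
                    return std::nullopt;
                ++digits;
            }
            if (i >= raw.size() || digits == 0)
                return std::nullopt; // missing closing brace or no digits
            ++i; // consume the closing brace
        } else {
            for (; digits < 4; ++digits) {
                if (i >= raw.size())
                    return std::nullopt;
                int value = hex_value(raw[i++]);
                if (value < 0)
                    return std::nullopt;
                code_point = code_point * 16 + static_cast<unsigned>(value);
            }
        }
        if (code_point >= 0xD800 && code_point <= 0xDFFF)
            return std::nullopt;
        append_utf8(result, code_point);
    }
    return result;
}
```

Since U+006F is the letter o, decoding the raw text br\u{6f}wn this way produces "brown", while the original view into the source keeps the unevaluated spelling.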
2021-08-16	LibJS: Correctly handle Unicode characters in JS source text	davidot
	Also recognize additional white space characters.
2021-08-16	LibJS: Force the lexer to parse a regex when expecting a statement	davidot

2021-08-15	LibJS: Add a mode to parse JS as a module	davidot
	In a module, strict mode is enabled from the start of parsing, and import and export statements are allowed.
2021-06-26	LibJS+LibCrypto: Allow '_' as a numeric literal separator :^)	Andreas Kling
	This patch adds support for the NumericLiteralSeparator concept from the ECMAScript grammar.
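The core rule for separators is that a '_' may only appear between two digits. A minimal sketch of that rule for plain decimal integer literals, in standard C++, might look like this; `strip_numeric_separators` is a hypothetical helper, and the full ECMAScript grammar (other radixes, fractions, exponents, BigInt suffixes) has further cases that are not modelled here.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Returns the digits with separators removed, or std::nullopt if a
// separator is leading, trailing, or doubled, or if a non-digit appears.
inline std::optional<std::string> strip_numeric_separators(std::string_view literal)
{
    std::string digits;
    bool previous_was_digit = false;
    for (char c : literal) {
        if (c >= '0' && c <= '9') {
            digits += c;
            previous_was_digit = true;
        } else if (c == '_' && previous_was_digit) {
            // A separator is only accepted directly after a digit.
            previous_was_digit = false;
        } else {
            return std::nullopt;
        }
    }
    // Rejects the empty string and a trailing separator.
    if (!previous_was_digit)
        return std::nullopt;
    return digits;
}
```

For example, 1_000_000 is accepted and yields 1000000, while 1__0, _1 and 1_ are rejected.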
2021-06-13	Userland: Allow building SerenityOS with -funsigned-char	Gunnar Beutner
	Some of the code assumed that chars were always signed, which is not the case on ARM hosts. Also, some of the code tried to use EOF (-1) in a way similar to what fgetc() does; however, instead of storing the characters in an int variable, a char was used. While this seemed to work, it also meant that character 0xFF would be incorrectly seen as an end-of-file. A careful reading of the fgetc() documentation reveals that fgetc() stores character data in an int where valid characters are in the range of 0-255 and the EOF value is explicitly outside of that range (usually -1).
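The 0xFF-versus-EOF confusion can be reproduced in isolation. The sketch below uses an explicit `signed char` so that its behavior does not depend on the platform's default char signedness; both function names are hypothetical.

```cpp
#include <cstdio>

// Problematic pattern: the value is narrowed to a signed 8-bit type before
// the comparison. Byte 0xFF becomes -1 on common platforms and is then
// indistinguishable from EOF when EOF is -1.
inline bool confused_with_eof_in_signed_char(unsigned char byte)
{
    signed char stored = static_cast<signed char>(byte);
    return stored == EOF;
}

// fgetc()-style pattern: the value stays in an int, where valid bytes
// occupy 0..255 and EOF is a negative value outside that range.
inline bool confused_with_eof_in_int(unsigned char byte)
{
    int stored = byte;
    return stored == EOF;
}
```

With an unsigned char type the opposite failure appears: the stored value is 255, never compares equal to a negative EOF, and the end of input is missed. Keeping the value in an int avoids both problems.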
2021-05-29	Everywhere: Use s.unverwerth@serenityos.org :^)	Stephan Unverwerth

2021-04-22	Everything: Move to SPDX license identifiers in all files.	Brian Gianforcaro
	SPDX License Identifiers are a more compact / standardized way of representing file license information.
	See: https://spdx.dev/resources/use/#identifiers
	This was done with the `ambr` search and replace tool:
		ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-03-01	LibJS: Keep track of file names, lines and columns inside the AST	Jean-Baptiste Boric

2021-01-12	Libraries: Move to Userland/Libraries/	Andreas Kling