serenity - The Serenity Operating System 🐞

Age	Commit message	Author
2021-09-01	LibUnicode: Resolve the most likely territory alias when there are many	Timothy Flynn

2021-09-01	LibUnicode: Perform complex Unicode locale alias substitution	Timothy Flynn

2021-09-01	LibUnicode: Canonicalize calendar subtags	Timothy Flynn
	Calendar subtags are a bit of an odd-man-out in that we must match the variants "ethiopic-amete-alem" in that order, without any other variant in the locale. So a separate method is needed for this, and we now defer sorting the variant list until after other canonicalization is done.
2021-09-01	LibUnicode: Canonicalize timezone subtags	Timothy Flynn

2021-09-01	LibUnicode: Canonicalize the subtag "imperial" to "uksystem"	Timothy Flynn

2021-09-01	LibUnicode: Canonicalize the subtag "primary" and "tertiary" to "levelN"	Timothy Flynn

2021-09-01	LibUnicode: Canonicalize the subtag "names" to "prprname"	Timothy Flynn

2021-09-01	LibUnicode: Canonicalize the subtag "yes" to "true"	Timothy Flynn

2021-09-01	LibUnicode: Substitute Unicode locale aliases during canonicalization	Timothy Flynn
	Unicode TR35 defines how locale subtag aliases should be substituted when converting a locale to canonical form. For most subtags, it is a simple substitution. Language subtags depend on context; for example, the language "sh" should become "sr-Latn", but if the original locale has a script subtag already ("sh-Cyrl"), then only the language subtag of the alias should be taken ("sr-Cyrl"). To facilitate this, we now make two passes when canonicalizing a locale. In the first pass, we convert the LocaleID structure to canonical syntax (where the conversions all happen in-place). In the second pass, we form the canonical string based on the canonical syntax.
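The context-dependent language rule above can be sketched roughly as follows. This is a minimal illustration using standard C++ types, not LibUnicode's code: `SimpleLocale`, `substitute_language_alias` and `to_canonical_string` are hypothetical names, and the only alias handled is the "sh" example from the commit message.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for a parsed locale (not LibUnicode's LocaleID).
struct SimpleLocale {
    std::string language;
    std::optional<std::string> script;
};

// First pass: substitute the alias in place. The alias for "sh" is "sr-Latn",
// but the script part is only taken when the locale has no script of its own.
inline void substitute_language_alias(SimpleLocale& locale)
{
    if (locale.language != "sh")
        return;
    locale.language = "sr";
    if (!locale.script.has_value())
        locale.script = "Latn";
}

// Second pass: form the canonical string from the canonicalized structure.
inline std::string to_canonical_string(SimpleLocale const& locale)
{
    std::string result = locale.language;
    if (locale.script.has_value())
        result += "-" + *locale.script;
    return result;
}
```

Under these assumptions, a locale parsed from "sh" would come out as "sr-Latn", while one parsed from "sh-Cyrl" keeps its script and comes out as "sr-Cyrl".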
2021-09-01	LibJS+LibUnicode: Store parsed Unicode locale data as full strings	Timothy Flynn
	Originally, it was convenient to store the parsed Unicode locale data as views into the original string being parsed. But implementing locale aliases will require mutating the data that was parsed. To prepare for that, store the parsed data as proper strings.
2021-09-01	Tests: Convert remaining LibC tests to LibTest	Andrew Kaster
	Convert them to use outln instead of printf at the same time.
2021-09-01	Everywhere: Use my cool new @serenityos.org email address	Peter Elliott

2021-08-31	Tests: Test LibMarkdown against commonmark test suite	Peter Elliott
	TestCommonmark runs the CommonMark test suite (https://spec.commonmark.org/0.30/spec.json) against LibMarkdown. Currently 44/652 tests pass.
2021-08-31	AK: Add FixedPoint arithmetic helper	Hediadyoin1
	Co-authored-by: Hendiadyoin1 <leon2002.la@gmail.com> Co-authored-by: kleines Filmröllchen <malu.bertsch@gmail.com>
2021-08-30	Tests/LibWasm: Handle all stream errors in parse_webassembly_module	Ali Mohammad Pur

2021-08-30	Tests/LibWasm: Add support for javascript bigint values	Ali Mohammad Pur
	Some i64 values will not fit in normal doubles, and these values _are_ tested by the test suite. This makes the test runtime capable of handling them correctly.
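As a small illustration of why some i64 values cannot pass through a double, the following sketch checks whether a value survives a round trip. The helper name `survives_double_round_trip` is hypothetical and is not part of the test runtime.

```cpp
#include <cstdint>

// Returns true if `value` is exactly representable as a double.
inline bool survives_double_round_trip(std::int64_t value)
{
    auto as_double = static_cast<double>(value);
    // Values that round up to 2^63 cannot be converted back to int64_t.
    if (as_double >= 9223372036854775808.0)
        return false;
    return static_cast<std::int64_t>(as_double) == value;
}
```

A double has a 53-bit significand, so 2^53 round-trips while 2^53 + 1 does not; values like the latter are the ones that need a BigInt on the JavaScript side.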
2021-08-30	LibUnicode: Canonicalize locale private use extensions	Timothy Flynn

2021-08-30	LibUnicode: Canonicalize locale extensions	Timothy Flynn

2021-08-30	LibUnicode: Parse locale private use extensions	Timothy Flynn

2021-08-30	LibUnicode: Parse locale extensions of the other extension form	Timothy Flynn

2021-08-30	LibUnicode: Parse locale extensions of the transformed extension form	Timothy Flynn

2021-08-30	LibUnicode: Parse locale extensions of the Unicode locale extension form	Timothy Flynn

2021-08-30	AK: Return early from swap() when swapping the same object	Timothy Flynn
	When swapping the same object, we could end up with a double-free error. This was found while quick-sorting a Vector of Variants holding complex types, reproduced by the new swap_same_complex_object test case.
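The early return can be sketched like this in standard C++. It is an illustration under assumed names (`guarded_swap` is hypothetical), not AK's actual swap().

```cpp
#include <memory>
#include <utility>

template<typename T>
void guarded_swap(T& a, T& b)
{
    // Swapping an object with itself is a no-op. Returning early avoids
    // moving out of and back into the same storage.
    if (std::addressof(a) == std::addressof(b))
        return;
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}
```

A quick sort that picks a pivot equal to the current element can end up calling swap on the same slot, which is the situation the commit message describes.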
2021-08-30	LibRegex: Allow null bytes in pattern	Ali Mohammad Pur
	That check was rather pointless as the input is a StringView which knows its own bounds. Fixes #9686.
2021-08-26	LibUnicode: Implement grammar validators for Unicode TR-35	Timothy Flynn
	ECMA-402 requires validating user input against the EBNF grammar for Unicode locales described in TR-35: https://www.unicode.org/reports/tr35 This commit adds validators for that grammar, as well as other helpers to e.g. canonicalize a locale string.
2021-08-26	AK: Implement method to convert a String/StringView to title case	Timothy Flynn
	This implementation preserves consecutive spaces in the original string.
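A minimal sketch of a title-case conversion that keeps runs of spaces intact, assuming ASCII-only input. The function below is written against the standard library for illustration and is not the AK implementation.

```cpp
#include <cctype>
#include <string>
#include <string_view>

inline std::string to_titlecase(std::string_view input)
{
    std::string result;
    result.reserve(input.size());
    bool at_word_start = true;
    for (char c : input) {
        auto uc = static_cast<unsigned char>(c);
        if (std::isspace(uc)) {
            // Whitespace is copied through unchanged, so consecutive spaces survive.
            at_word_start = true;
            result.push_back(c);
        } else if (at_word_start) {
            result.push_back(static_cast<char>(std::toupper(uc)));
            at_word_start = false;
        } else {
            result.push_back(static_cast<char>(std::tolower(uc)));
        }
    }
    return result;
}
```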
2021-08-26	Tests: Test setjmp/sigsetjmp LibC functions	Jean-Baptiste Boric
	Since there are no real users of these functions in Serenity's userland and this is my third attempt at this... This time, the great LibTest test suite will make sure that I do it right!
2021-08-21	LibSQL: Introduce Serializer as a mediator between Heap and client code	Jan de Visser
	Classes reading and writing to the data heap would communicate directly with the Heap object, and transfer ByteBuffers back and forth with it. This makes things like caching and locking hard. Therefore all data persistence activity will be funneled through a Serializer object which in turn submits it to the Heap. Introducing this unfortunately resulted in a huge amount of churn, in which a number of smaller refactorings got caught up as well.
2021-08-21	LibSQL+SQLServer: Bare bones INSERT and SELECT statements	Jan de Visser
	This patch provides very basic, bare bones implementations of the INSERT and SELECT statements. They are very limited:
	- The only variant of the INSERT statement that currently works is INSERT INTO schema.table (column1, column2, ...) VALUES (value11, value21, ...), (value12, value22, ...), ... where the values are literals.
	- The SELECT statement is even more limited, and is only provided to allow verification of the INSERT statement. The only form implemented is: SELECT * FROM schema.table
	These statements required a bit of change in the Statement::execute API. Originally execute only received a Database object as parameter. This is not enough; we now pass an ExecutionContext object which contains the Database, the current result set, and the last Tuple read from the database. This object will undoubtedly evolve over time. This API change dragged SQLServer::SQLStatement into the patch.
	Another API addition is Expression::evaluate. This method is, unsurprisingly, used to evaluate expressions, like the values in the INSERT statement.
	Finally, a new test file is added: TestSqlStatementExecution, which tests the currently implemented statements. As the number and flavour of implemented statements grows, this test file will probably have to be restructured.
2021-08-21	LibSQL: Redesign Value implementation and add new types	Jan de Visser
	The implementation of the Value class was based on lambda member variables implementing type-dependent behaviour. This was done to ensure that Values can be used as stack-only objects; the simplest alternative, virtual methods, forces them onto the heap. The problem with the lambda approach is that it bloats the Values (which are supposed to be lightweight objects) quite considerably, because every object contains more than a dozen function pointers.
	The solution to address both problems (we want Values to be able to live on the stack and be as lightweight as possible) chosen here is to encapsulate type-dependent behaviour and state in an implementation class, and let the Value be an AK::Variant of those implementation classes. All methods of Value are now basically straight delegates to the implementation object using the Variant::visit method.
	One issue complicating matters is the addition of two aggregate types, Tuple and Array, which each contain a Vector of Values. At this point Tuples and Arrays (and potential future aggregate types) can't contain these aggregate types. This is limiting and needs to be addressed.
	Another area that needs attention is the nomenclature of things; it's a bit of a tangle of 'ValueBlahBlah' and 'ImplBlahBlah'. It makes sense right now, I think, but I admit we can probably do better.
	Other things included here:
	- Added the Boolean and Null types (and Tuple and Array, see above).
	- to_string now always succeeds and returns a String instead of an Optional. This had some impact on other sources.
	- Added a lot of tests.
	- Started moving the serialization mechanism more towards where I want it to be, i.e. a 'DataSerializer' object which just takes serialization and deserialization requests and knows for example how to store long strings out-of-line.
	One last remark: There is obviously a naming clash between the Tuple class and the Tuple Value type. This is intentional; I plan to make the Tuple class a subclass of Value (and hence Key and Row as well).
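The variant-of-implementation-classes design can be sketched with std::variant and std::visit standing in for AK::Variant and Variant::visit. All class names below (`NullImpl`, `BooleanImpl`, `IntegerImpl`) are hypothetical and much smaller than the real LibSQL types.

```cpp
#include <string>
#include <variant>

// Each implementation class holds the state and behaviour for one type.
struct NullImpl {
    std::string to_string() const { return "(null)"; }
};
struct BooleanImpl {
    bool value;
    std::string to_string() const { return value ? "true" : "false"; }
};
struct IntegerImpl {
    int value;
    std::string to_string() const { return std::to_string(value); }
};

// Value lives on the stack and carries no function pointers of its own;
// every method delegates to whichever implementation the variant holds.
class Value {
public:
    Value() : m_impl(NullImpl {}) {}
    explicit Value(bool b) : m_impl(BooleanImpl { b }) {}
    explicit Value(int i) : m_impl(IntegerImpl { i }) {}

    std::string to_string() const
    {
        return std::visit([](auto const& impl) { return impl.to_string(); }, m_impl);
    }

private:
    std::variant<NullImpl, BooleanImpl, IntegerImpl> m_impl;
};
```

The size of such a Value is roughly that of its largest alternative plus a discriminator, rather than a dozen or more function pointers per object.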
2021-08-21	LibSQL: Make TupleDescriptor a shared pointer instead of a stack object	Jan de Visser
	Tuple descriptors are basically the same for, say, all rows in a table, so it makes sense to share them instead of copying them for every single row.
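A rough sketch of the sharing idea, using std::shared_ptr in place of Serenity's reference-counted pointers. The `TupleDescriptor` and `Row` structs and the `make_rows` helper are hypothetical simplifications, with plain ints standing in for values.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct TupleDescriptor {
    std::vector<std::string> column_names;
};

struct Row {
    // Every row points at the same descriptor instead of owning a copy.
    std::shared_ptr<TupleDescriptor const> descriptor;
    std::vector<int> values;
};

inline std::vector<Row> make_rows(std::shared_ptr<TupleDescriptor const> descriptor, std::size_t count)
{
    std::vector<Row> rows;
    rows.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        rows.push_back(Row { descriptor, std::vector<int>(descriptor->column_names.size(), 0) });
    return rows;
}
```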
2021-08-20	LibRegex: Treat pattern string characters as unsigned	Timothy Flynn
	For example, consider the following pattern: new RegExp('\ud834\udf06', 'u') With this pattern, the regex parser should insert the UTF-8 encoded bytes 0xf0, 0x9d, 0x8c, and 0x86. However, because these characters are currently treated as normal char types, they have a negative value since they are all > 0x7f. Then, due to sign extension, when these characters are cast to u64, the sign bit is preserved. The result is that these bytes are inserted as 0xfffffffffffffff0, 0xffffffffffffff9d, etc. Fortunately, there are only a few places where we insert bytecode with the raw characters. In these places, be sure to treat the bytes as u8 before they are cast to u64.
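The sign-extension effect described above can be reproduced in isolation. This is a standalone sketch with hypothetical helper names, assuming the usual two's-complement representation; it is not the LibRegex bytecode code.

```cpp
#include <cstdint>

// Widening through a signed 8-bit type sign-extends bytes above 0x7f.
inline std::uint64_t widen_as_signed(unsigned char byte)
{
    return static_cast<std::uint64_t>(static_cast<signed char>(byte));
}

// Going through an unsigned 8-bit type first keeps the value in 0x00..0xff.
inline std::uint64_t widen_as_unsigned(unsigned char byte)
{
    return static_cast<std::uint64_t>(static_cast<std::uint8_t>(byte));
}
```

With the byte 0xf0 from the example pattern, the signed path yields 0xfffffffffffffff0 and the unsigned path yields 0xf0.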
2021-08-20	LibCore: Make Core::File::open() return OSError in case of failure	Andreas Kling

2021-08-19	LibRegex: Allow Unicode escape sequences in capture group names	Timothy Flynn
	Unfortunately, this requires a slight divergence in the way the capture group names are stored. Previously, the generated byte code would simply store a view into the regex pattern string, so no string copying was required. Now, the escape sequences are decoded into a new string, and a vector of all parsed capture group names is stored in the parser result structure. The byte code then stores a view into the corresponding string in that vector.
2021-08-19	AK: Add GenericLexer API to consume an escaped Unicode code point	Timothy Flynn
	This parsing is already duplicated between LibJS and LibRegex, and will shortly be needed in more places in those libraries. Move it to AK to prevent further duplication. This API will consume escaped Unicode code points of the form:
	\u{code point}
	\unnnn (where each n is a hexadecimal digit)
	\unnnn\unnnn (where the two escaped values are a surrogate pair)
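The three escape forms could be consumed along the following lines. This is a self-contained sketch over std::string_view; the names `parse_hex` and `consume_escaped_code_point` are hypothetical, and the signature does not mirror the actual GenericLexer API.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Parse a run of hex digits; rejects empty input, non-hex characters and
// values above U+10FFFF.
inline std::optional<std::uint32_t> parse_hex(std::string_view digits)
{
    if (digits.empty())
        return std::nullopt;
    std::uint32_t value = 0;
    for (char c : digits) {
        std::uint32_t digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            return std::nullopt;
        value = value * 16 + digit;
        if (value > 0x10ffff)
            return std::nullopt;
    }
    return value;
}

// On success, returns the code point and advances `pos` past the escape.
// On failure, returns nullopt and leaves `pos` untouched.
inline std::optional<std::uint32_t> consume_escaped_code_point(std::string_view input, std::size_t& pos)
{
    if (pos > input.size() || input.substr(pos, 2) != "\\u")
        return std::nullopt;
    std::size_t cursor = pos + 2;

    // Form 1: \u{code point}
    if (cursor < input.size() && input[cursor] == '{') {
        std::size_t close = input.find('}', cursor);
        if (close == std::string_view::npos)
            return std::nullopt;
        auto value = parse_hex(input.substr(cursor + 1, close - cursor - 1));
        if (!value)
            return std::nullopt;
        pos = close + 1;
        return value;
    }

    // Form 2: \unnnn
    if (input.size() - cursor < 4)
        return std::nullopt;
    auto high = parse_hex(input.substr(cursor, 4));
    if (!high)
        return std::nullopt;
    cursor += 4;

    // Form 3: a high surrogate immediately followed by an escaped low surrogate.
    if (*high >= 0xd800 && *high <= 0xdbff && input.size() - cursor >= 6 && input.substr(cursor, 2) == "\\u") {
        auto low = parse_hex(input.substr(cursor + 2, 4));
        if (low && *low >= 0xdc00 && *low <= 0xdfff) {
            pos = cursor + 6;
            return 0x10000 + ((*high - 0xd800) << 10) + (*low - 0xdc00);
        }
    }

    pos = cursor;
    return high;
}
```

In this sketch a lone surrogate is returned as-is rather than rejected; whether that matches what each caller wants is a design choice left open here.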
2021-08-18	LibRegex: Ensure the GoBack operation decrements the code unit index	Timothy Flynn
	This was missed in commit 27d555bab0d84913599cea3c4a6b0a0ed2a15b66.
2021-08-18	LibRegex: In non-Unicode mode, parse \u{4} as a repetition pattern	Timothy Flynn

2021-08-15	LibJS: Add a mode to parse JS as a module	davidot
	In a module, strict mode should be enabled at the start of parsing, and import and export statements are allowed.
2021-08-15	LibRegex: Implement and use a REPEAT operation for bytecode repetition	Timothy Flynn
	Currently, when we need to repeat an instruction N times, we simply add that instruction N times in a for-loop. This doesn't scale well with extremely large values of N, and ECMA-262 allows up to N = 2^53 - 1. Instead, add a new REPEAT bytecode operation to defer this loop from the parser to the runtime executor. This allows the parser to complete sans any loops (for this instruction), and allows the executor to bail early if the repeated bytecode fails. Note: The templated ByteCode methods are to allow the Posix parsers to continue using u32 because they are limited to N = 2^20.
2021-08-15	LibRegex+LibJS: Combine named and unnamed capture groups in MatchState	Timothy Flynn
	Combining these into one list helps reduce the size of MatchState, and as a result, reduces the amount of memory consumed during execution of very large regex matches. Doing this also allows us to remove a few regex byte code instructions: ClearNamedCaptureGroup, SaveLeftNamedCaptureGroup, and NamedReference. Named groups now behave the same as unnamed groups for these operations. Note that SaveRightNamedCaptureGroup still exists to cache the matched group name. This also removes the recursion level from the MatchState, as it can exist as a local variable in Matcher::execute instead.
2021-08-15	LibRegex: Disallow unescaped quantifiers in Unicode mode	Timothy Flynn

2021-08-15	LibRegex: Use correct source characters for Unicode identity escapes	Timothy Flynn

2021-08-15	LibRegex: Implement legacy octal escape parsing closer to the spec	Timothy Flynn
	The grammar for the ECMA-262 CharacterEscape is:
	CharacterEscape[U, N] ::
	    ControlEscape
	    c ControlLetter
	    0 [lookahead ∉ DecimalDigit]
	    HexEscapeSequence
	    RegExpUnicodeEscapeSequence[?U]
	    [~U]LegacyOctalEscapeSequence
	    IdentityEscape[?U, ?N]
	It's important to parse the standalone "\0 [lookahead ∉ DecimalDigit]" before parsing LegacyOctalEscapeSequence. Otherwise, all standalone "\0" patterns are parsed as octal, which is disallowed in Unicode mode. Further, LegacyOctalEscapeSequence should also be parsed while parsing character classes.
2021-08-15	LibRegex: Convert LibRegex tests to use StringView in place of C-strings	Timothy Flynn
	A subsequent commit will add tests that require a string containing only "\0". As a C-string, this will be interpreted as the null terminator. To make the diff for that commit easier to grok, this commit converts all tests to use StringView without any other functional changes.
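The difference is easy to see with standard C++ types. The sketch below uses std::string_view as a stand-in for AK's StringView, and the two helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

using namespace std::literals;

// As a C-string, the embedded NUL is taken as the terminator, so the
// string appears to be empty.
inline std::size_t length_as_c_string()
{
    return std::strlen("\0");
}

// A view literal carries its own length, so the NUL byte is kept as data.
inline std::size_t length_as_view()
{
    return "\0"sv.size();
}
```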
2021-08-15	LibRegex: Ensure escaped hexadecimals are exactly 2 digits in length	Timothy Flynn

2021-08-15	LibRegex: Ensure escaped code points are exactly 4 digits in length	Timothy Flynn

2021-08-15	LibRegex: Fix ECMA-262 parsing of invalid identity escapes	Timothy Flynn
	* Only alphabetic (A-Z, a-z) characters may be escaped with \c. The loop currently parsing \c includes code points between the upper/lower case groups.
	* In Unicode mode, all invalid identity escapes should cause a parser error, even in browser-extended mode.
	* Avoid an infinite loop when parsing the pattern "\c" on its own.
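The first point can be illustrated by comparing a single contiguous range check with a check of the two letter ranges. Both functions are hypothetical, assume an ASCII execution character set, and are not the LibRegex parser code.

```cpp
// Single range from 'A' to 'z': also accepts the six characters that sit
// between 'Z' and 'a' in ASCII, such as '[' and '_'.
inline bool is_control_letter_single_range(char c)
{
    return c >= 'A' && c <= 'z';
}

// Two separate ranges: accepts only A-Z and a-z.
inline bool is_control_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```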
2021-08-15	AK: Add Time::is_negative() to detect negative time values	Brian Gianforcaro

2021-08-14	Tests: Re-enable UserspaceEmulator tests on the Clang build	Daniel Bertalan
	Now that the problems that made UE crash have been fixed, these tests should pass.
2021-08-14	Tests: Add regression tests for the LibCpp preprocessor	Itamar
	Similarly to the LibCpp parser regression tests, these tests run the preprocessor on the .cpp test files under Userland/LibCpp/Tests/preprocessor, and compare the output with existing .txt ground truth files.