LibRegex: Treat pattern string characters as unsigned

For example, consider the following pattern:

    new RegExp('\ud834\udf06', 'u')

With this pattern, the regex parser should insert the UTF-8 encoded
bytes 0xf0, 0x9d, 0x8c, and 0x86. However, because these characters are
currently treated as normal char types, they have a negative value since
they are all > 0x7f. Then, due to sign extension, when these characters
are cast to u64, the sign bit is preserved. The result is that these
bytes are inserted as 0xfffffffffffffff0, 0xffffffffffffff9d, etc.

Fortunately, there are only a few places where we insert bytecode with
the raw characters. In these places, be sure to treat the bytes as u8
before they are cast to u64.
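The sign extension described above can be reproduced in isolation. The following is a minimal sketch, not code from LibRegex: the helper names widen_signed and widen_unsigned are hypothetical, and the parameter is declared as signed char so the behavior does not depend on whether plain char is signed on a given platform.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they are not part of LibRegex.

// Widening a signed byte directly: a byte above 0x7f is negative,
// so the conversion to a 64-bit unsigned value sign-extends it.
constexpr std::uint64_t widen_signed(signed char ch)
{
    return static_cast<std::uint64_t>(ch);
}

// Converting to an unsigned 8-bit value first: the byte is in the
// range 0..255, so the conversion to 64 bits zero-extends it.
constexpr std::uint64_t widen_unsigned(signed char ch)
{
    return static_cast<std::uint64_t>(static_cast<std::uint8_t>(ch));
}

// -0x10 as a signed char has the bit pattern 0xf0, the first UTF-8 byte
// from the commit message.
static_assert(widen_signed(static_cast<signed char>(-0x10)) == 0xfffffffffffffff0ull);
static_assert(widen_unsigned(static_cast<signed char>(-0x10)) == 0xf0);
```

The second helper mirrors the idea in the commit message: go through an unsigned 8-bit type before widening to 64 bits, so the upper bits stay zero.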
author: Timothy Flynn <trflynn89@pm.me> 2021-08-20 10:22:23 -0400
committer: Andreas Kling <kling@serenityos.org> 2021-08-20 19:16:33 +0200
commit: 562d4e497b286335b0ea956e4839e6dd9f140673 (patch)
tree: 24861043c7ecb441a67599389f2ebae80d9e17b0 /Tests/LibRegex/Regex.cpp
parent: 7c54b6bd45efbf3ba933e615131b3df6bbffef9f (diff)
download: serenity-562d4e497b286335b0ea956e4839e6dd9f140673.zip
1 files changed, 2 insertions, 0 deletions
diff --git a/Tests/LibRegex/Regex.cpp b/Tests/LibRegex/Regex.cpp
index 7a14f30eb9..d7630fb25d 100644
--- a/Tests/LibRegex/Regex.cpp
+++ b/Tests/LibRegex/Regex.cpp
@@ -687,6 +687,8 @@ TEST_CASE(ECMA262_unicode_match)
         ECMAScriptFlags options {};
     };
     _test tests[] {
+        { "\xf0\x9d\x8c\x86"sv, "abcdef"sv, false, ECMAScriptFlags::Unicode },
+        { "[\xf0\x9d\x8c\x86]"sv, "abcdef"sv, false, ECMAScriptFlags::Unicode },
         { "\\ud83d"sv, "😀"sv, true },
         { "\\ud83d"sv, "😀"sv, false, ECMAScriptFlags::Unicode },
         { "\\ude00"sv, "😀"sv, true },