author | Timothy Flynn <trflynn89@pm.me> | 2021-07-20 10:46:53 -0400 |
---|---|---|
committer | Andreas Kling <kling@serenityos.org> | 2021-07-22 09:10:44 +0200 |
commit | 0c42aece362edfbd71f3b149601c065b5c675e80 (patch) | |
tree | 7813607db2d9eed60c3f0dc0d4fa277221a5d87d /Userland/Libraries/LibCards | |
parent | 0e25d2393f2a7f49ded730d4a11643005ae9b468 (diff) | |
download | serenity-0c42aece362edfbd71f3b149601c065b5c675e80.zip |
LibJS: Transcode UTF-8 strings to UTF-16 and add UTF-16 accessors
LibJS parses JavaScript as UTF-8, so when creating a string, we must
transcode it to UTF-16 to handle encoded surrogate pairs.
For example, consider the following string:
"\ud83d\ude00"
The UTF-8 encoding of this surrogate pair is:
0xf0 0x9f 0x98 0x80
However, LibJS will currently store the two surrogates individually as
UTF-8 encoded bytes, rather than combining the pair:
0xed 0xa0 0xbd, 0xed 0xb8 0x80
These are not equivalent. So, as String.prototype becomes UTF-16 aware,
this encoding will no longer work for abstractions like strict equality.
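
The difference between the two byte sequences can be checked with a short Python sketch. This is only an illustration and not LibJS code: the helper names `encode_code_unit_as_utf8`, `combine_surrogates`, and `utf16_code_units` are hypothetical, and Python's `surrogatepass` error handler stands in for whatever transcoding LibJS performs internally.

```python
# Illustration only: hypothetical helpers, not part of LibJS.

def encode_code_unit_as_utf8(unit):
    """Encode a single 16-bit code unit (0x0800..0xFFFF) as three UTF-8-style bytes."""
    assert 0x0800 <= unit <= 0xFFFF
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])

def combine_surrogates(high, low):
    """Combine a high/low surrogate pair into one code point."""
    assert 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

def utf16_code_units(data):
    """Transcode UTF-8 bytes (lone surrogates tolerated) to a list of UTF-16 code units."""
    text = data.decode("utf-8", errors="surrogatepass")
    raw = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

HIGH, LOW = 0xD83D, 0xDE00  # the escapes in "\ud83d\ude00"

# Each surrogate encoded on its own: six bytes.
separate = encode_code_unit_as_utf8(HIGH) + encode_code_unit_as_utf8(LOW)

# The pair combined into U+1F600 first, then encoded: four bytes.
combined = chr(combine_surrogates(HIGH, LOW)).encode("utf-8")

print(separate.hex(" "))  # ed a0 bd ed b8 80
print(combined.hex(" "))  # f0 9f 98 80

# The byte sequences differ, but both transcode to the same UTF-16 code units.
print(separate == combined)                                      # False
print(utf16_code_units(separate) == utf16_code_units(combined))  # True
```

Comparing the UTF-16 code units rather than the raw UTF-8 bytes is what makes the two spellings of the same string compare equal.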
Diffstat (limited to 'Userland/Libraries/LibCards')
0 files changed, 0 insertions, 0 deletions