author | Luke Wilde <lukew@serenityos.org> | 2023-02-28 03:47:40 +0000 |
---|---|---|
committer | Sam Atkins <atkinssj@gmail.com> | 2023-02-28 08:46:06 +0000 |
commit | e864444fe3b1fcbb2839f05df20463a99efb3850 (patch) | |
tree | decd5ca19341e5c9b886b4aa8990fafd86c8d4f2 | |
parent | 1c918e826c337bb46277cb224e29107ce576eeab (diff) | |
LibTextCodec/Latin1: Iterate over input string with u8 instead of char
Using char causes bytes greater than or equal to 0x80 to be treated as
negative values, which produce incorrect results when implicitly cast to
u32.
For example, `atob` in LibWeb uses this decoder to convert non-ASCII
values to UTF-8, but non-ASCII values are >= 0x80, so it produces
incorrect results in such cases:
```js
Uint8Array.from(atob("u660"), c => c.charCodeAt(0));
```
This used to produce [253, 253, 253] instead of [187, 174, 180].
Required by Cloudflare's IUAM challenges.
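
To illustrate the failure mode described in the commit message, here is a minimal standalone sketch in standard C++. It is not SerenityOS code: `widen_via_signed_char` and `widen_via_u8` are hypothetical names, and `std::uint8_t`/`std::uint32_t` stand in for AK's `u8`/`u32`. It assumes a platform where plain `char` is signed, which is the case in which `auto ch` would deduce a signed type.

```cpp
#include <cstdint>

// Mimics the old loop on a platform where plain `char` is signed (assumption):
// the byte is first seen as a signed char, then implicitly widened to 32 bits.
constexpr std::uint32_t widen_via_signed_char(std::uint8_t byte)
{
    signed char ch = static_cast<signed char>(byte); // 0xBB becomes -69
    return static_cast<std::uint32_t>(ch);           // sign-extended to 0xFFFFFFBB
}

// Mimics the fixed loop: the byte stays unsigned, so widening zero-extends.
constexpr std::uint32_t widen_via_u8(std::uint8_t byte)
{
    return byte;
}

// ASCII bytes are unaffected, which is why the bug only shows up for >= 0x80.
static_assert(widen_via_signed_char(0x41) == 0x41u, "ASCII survives either way");
static_assert(widen_via_signed_char(0xBB) == 0xFFFFFFBBu, "high byte is sign-extended");
static_assert(widen_via_u8(0xBB) == 0xBBu, "high byte is zero-extended");
```

The sign-extended value 0xFFFFFFBB lies far outside the Unicode range (maximum 0x10FFFF). The observed 253 is consistent with each such value being replaced by U+FFFD, whose low byte is 0xFD, although the commit message does not spell out that step.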
-rw-r--r-- | Userland/Libraries/LibTextCodec/Decoder.cpp | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/Userland/Libraries/LibTextCodec/Decoder.cpp b/Userland/Libraries/LibTextCodec/Decoder.cpp
index 79c3d8f760..958e7f0839 100644
--- a/Userland/Libraries/LibTextCodec/Decoder.cpp
+++ b/Userland/Libraries/LibTextCodec/Decoder.cpp
@@ -353,7 +353,7 @@ ErrorOr<String> UTF16LEDecoder::to_utf8(StringView input)
 ErrorOr<void> Latin1Decoder::process(StringView input, Function<ErrorOr<void>(u32)> on_code_point)
 {
-    for (auto ch : input) {
+    for (u8 ch : input) {
         // Latin1 is the same as the first 256 Unicode code_points, so no mapping is needed, just utf-8 encoding.
         TRY(on_code_point(ch));
     }
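
As a rough illustration of what the fixed loop achieves end to end, the following sketch decodes Latin-1 bytes to UTF-8 using only the standard library. `latin1_to_utf8` is a hypothetical helper, not part of LibTextCodec; the real decoder hands each code point to an `on_code_point` callback rather than encoding inline.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not LibTextCodec API): decode Latin-1 bytes to UTF-8.
std::string latin1_to_utf8(const std::string& input)
{
    std::string out;
    // Iterate as unsigned char, mirroring `u8 ch` in the patch, so that
    // bytes >= 0x80 are zero-extended rather than sign-extended.
    for (unsigned char ch : input) {
        std::uint32_t code_point = ch; // a Latin-1 byte equals its Unicode code point
        if (code_point < 0x80) {
            // ASCII maps to a single UTF-8 byte.
            out.push_back(static_cast<char>(code_point));
        } else {
            // Code points 0x80..0xFF always take exactly two UTF-8 bytes.
            out.push_back(static_cast<char>(0xC0 | (code_point >> 6)));
            out.push_back(static_cast<char>(0x80 | (code_point & 0x3F)));
        }
    }
    return out;
}
```

For the bytes from the commit message, 0xBB 0xAE 0xB4, this yields the UTF-8 sequences C2 BB, C2 AE, and C2 B4, that is, U+00BB, U+00AE, and U+00B4.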