Age | Commit message (Collapse) | Author |
|
This implements a method to get a Utf8CodepointIterator at a specified
byte offset.
|
|
This patch implements a Unicode-safe substring method, which can be used
when offset and length should be specified in actual characters instead
of bytes.
This can be used to mitigate issues where a string is split in the
middle of a UTF-8 multi-byte character, which leads to invalid UTF-8.
Furthermore, it implements to common shorthands for substring methods
which take only an offset and return the substring until the end of the
string.
|
|
This changes the return type of Utf8View::byte_length and the argument
types of substring_view from int to size_t.
|
|
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
|
|
Unlike String/StringView::starts_with this compares utf8 code points
instead of "characters" (bytes), which is important when handling
aribtary utf-8 input that could include overlong characters.
|
|
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
|
|
|
|
This was causing WindowServer and Taskbar to crash sometimes when the
stars aligned and we tried cutting off a string ending with "..." right
on top of an emoji. :^)
|
|
Problem:
- `typedef` is a keyword which comes from C and carries with it old
syntax that is hard to read.
- Creating type aliases with the `using` keyword allows for easier
future maintenance because it supports template syntax.
- There is inconsistent use of `typedef` vs `using`.
Solution:
- Use `clang-tidy`'s checker called `modernize-use-using` to update
the syntax to use the newer syntax.
- Remove unused functions to make `clang-tidy` happy.
- This results in consistency within the codebase.
|
|
This enables use of these classes in templated code.
|
|
This time, without trailing 's'. Ran:
git grep -l 'codepoint' | xargs sed -ie 's/codepoint/code_point/g
|
|
This reverts commit ea9ac3155d1774f13ac4e9a96605c0e85a8f299e.
It replaced "codepoint" with "code_points", not "code_point".
|
|
Unicode calls them "code points" so let's follow their style.
|
|
|
|
|
|
|
|
|
|
This changes copyright holder to myself for the source code files that I've
created or have (almost) completely rewritten. Not included are the files
that were significantly changed by others even though it was me who originally
created them (think HtmlView), or the many other files I've contributed code to.
|
|
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
|
|
This allows you to retrieve the length (in bytes) of the codepoint the
iterator is currently pointing at.
|
|
There's some overload ambiguity when doing Utf8View("literal")
|
|
|
|
Utf8View wraps a StringView and implements begin() and end() that
return a Utf8CodepointIterator, which parses UTF-8-encoded Unicode
codepoints and returns them as 32-bit integers.
This is the first step towards supporting emojis in Serenity ^)
https://github.com/SerenityOS/serenity/issues/490
|