serenity - The Serenity Operating System 🐞

Age	Commit message	Author
2023-03-08	AK+LibUnicode: Implement String::equals_ignoring_case without allocating	Timothy Flynn
	We currently fully casefold the left- and right-hand sides to compare two strings with case-insensitivity. Now, we casefold one code point at a time, storing the result in a view for comparison, until we exhaust both strings.
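The lazy comparison described above can be sketched in Python. This is only an illustration of the idea, not the AK/LibUnicode code: `folded_code_points` and this `equals_ignoring_case` are hypothetical helpers, and Python's built-in `str.casefold()` stands in for the generated case folding data.

```python
from itertools import zip_longest


def folded_code_points(text):
    """Yield the case-folded code points of `text`, one source code point at a time."""
    for code_point in text:
        # A single code point may fold to several (e.g. "ß" folds to "ss").
        yield from code_point.casefold()


def equals_ignoring_case(lhs, rhs):
    """Compare two strings caselessly without building a folded copy of either whole string."""
    exhausted = object()  # marks one stream running out before the other
    pairs = zip_longest(folded_code_points(lhs), folded_code_points(rhs), fillvalue=exhausted)
    return all(a == b for a, b in pairs)
```

Because `all` stops at the first mismatch, strings that differ early are rejected without folding the rest of either side.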
2023-03-05	LibUnicode: Detect ZWJ sequences when filtering by emoji presentation	Timothy Flynn
	This was preventing some unqualified emoji sequences from rendering properly, such as the custom SerenityOS flag. We rendered the flag correctly when given the fully qualified sequence:
	    U+1F3F3 U+FE0F U+200D U+1F41E
	but were not detecting the unqualified sequence as an emoji when also filtering for emoji-presentation sequences:
	    U+1F3F3 U+200D U+1F41E
2023-02-28	LibUnicode: Allow ignoring text presentation emoji in sequence detection	Timothy Flynn
	This adds an option to only detect emoji that should always present as emoji. For example, the copyright symbol (unless followed by an emoji presentation selector) should render as text.
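A minimal Python sketch of the rule in the copyright example, under stated assumptions: `DEFAULT_EMOJI_PRESENTATION` is a hypothetical two-entry stand-in for the generated emoji data, and `presents_as_emoji` is an illustrative name, not a LibUnicode API.

```python
EMOJI_PRESENTATION_SELECTOR = "\uFE0F"

# Hypothetical stand-in for generated emoji data: maps a base code point
# to whether it presents as emoji by default.
DEFAULT_EMOJI_PRESENTATION = {
    "\u00A9": False,      # copyright sign: text presentation by default
    "\U0001F600": True,   # grinning face: emoji presentation by default
}


def presents_as_emoji(sequence):
    """Return True if the sequence should always be rendered as an emoji."""
    base = sequence[:1]
    if base not in DEFAULT_EMOJI_PRESENTATION:
        return False
    if DEFAULT_EMOJI_PRESENTATION[base]:
        return True
    # A text-presentation base only counts when the selector follows it.
    return sequence[1:2] == EMOJI_PRESENTATION_SELECTOR
```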
2023-02-25	LibUnicode: Skip over emoji sequences in grapheme boundary segmentation	Timothy Flynn
	Emoji sequences in the grapheme segmentation spec are a bit tricky:
	    \p{Extended_Pictographic} Extend* ZWJ × \p{Extended_Pictographic}
	Our current strategy of tracking a boolean to indicate if we are in an emoji sequence was causing us to break up emoji made of multiple sub-sequences. For example, in the "family: man, woman, girl, boy" sequence:
	    U+1F468 U+200D U+1F469 U+200D U+1F467 U+200D U+1F466
	We would break at indices 0 (correctly) and 6 (incorrectly).
	Instead of tracking a boolean, it's quite a bit simpler to reason about emoji sequences by just skipping past them entirely. Note that in cases like the above emoji, we skip one sub-sequence at a time.
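The skipping strategy can be sketched in Python as follows. This is a heavily simplified illustration, not the LibUnicode implementation: it handles only the emoji rule quoted above plus "do not break before Extend or ZWJ", and the `EXTENDED_PICTOGRAPHIC` and `EXTEND` sets are hypothetical hand-picked stand-ins for the real Unicode property tables.

```python
ZWJ = 0x200D
# Hypothetical stand-ins for the generated Unicode property tables.
EXTENDED_PICTOGRAPHIC = {0x1F468, 0x1F469, 0x1F467, 0x1F466, 0x1F3F3, 0x1F41E}
EXTEND = {0xFE0F}


def skip_emoji_sub_sequence(code_points, i):
    """If `ExtPict Extend* ZWJ ExtPict` starts at i, return the index of the
    trailing ExtPict; otherwise return None."""
    if code_points[i] not in EXTENDED_PICTOGRAPHIC:
        return None
    j = i + 1
    while j < len(code_points) and code_points[j] in EXTEND:
        j += 1
    if (j + 1 < len(code_points) and code_points[j] == ZWJ
            and code_points[j + 1] in EXTENDED_PICTOGRAPHIC):
        return j + 1
    return None


def grapheme_boundaries(code_points):
    """Return boundary indices for a list of code points (simplified rules only)."""
    if not code_points:
        return []
    boundaries = [0]
    i = 0
    while i < len(code_points):
        # Skip one emoji sub-sequence at a time instead of tracking a flag.
        skipped_to = skip_emoji_sub_sequence(code_points, i)
        if skipped_to is not None:
            i = skipped_to
            continue
        i += 1
        # Never break before Extend or ZWJ.
        while i < len(code_points) and (code_points[i] in EXTEND or code_points[i] == ZWJ):
            i += 1
        boundaries.append(i)
    return boundaries
```

For the family sequence this yields boundaries only at index 0 and at the end of the sequence, with no break before the final U+1F466.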
2023-02-24	LibUnicode: Add a method to check if a code point could start an emoji	Timothy Flynn

2023-02-24	LibUnicode: Generate the path to emoji images alongside emoji data	Timothy Flynn
	This will provide for quicker emoji lookups, rather than having to discover and allocate these paths at runtime before we find out if they even exist.
2023-02-16	LibUnicode: Remove non-iterative text segmentation algorithms	Timothy Flynn
	They are now unused.
2023-02-16	LibUnicode: Use iterative text segmentation algorithms for titlecasing	Timothy Flynn

2023-02-15	LibUnicode: Fix typos causing text segmentation on mid-word punctuation	Timothy Flynn
	For example, the words "can't" and "32.3" should not have boundaries detected on the "'" and "." code points, respectively. The String test cases fixed here are because "b'ar" is now considered one word.
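A rough Python sketch of why "can't" and "32.3" stay whole. It is a deliberate simplification rather than the real word segmentation rules: only letters, digits, and the two mid-word characters from the example are classified, and `kind`, `no_break_between`, and `word_boundaries` are hypothetical helper names.

```python
MID_WORD = {"'", "."}  # only the two punctuation marks from the example


def kind(ch):
    """Classify a code point into a coarse, illustrative category."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch in MID_WORD:
        return "mid"
    return "other"


def no_break_between(text, i):
    """True if there is no word boundary between text[i - 1] and text[i]."""
    prev, cur = kind(text[i - 1]), kind(text[i])
    word_kinds = ("letter", "digit")
    if prev in word_kinds and cur in word_kinds:
        return True
    # letter x mid letter, or digit x mid digit
    if cur == "mid" and prev in word_kinds:
        return i + 1 < len(text) and kind(text[i + 1]) == prev
    # letter mid x letter, or digit mid x digit
    if prev == "mid" and cur in word_kinds:
        return i >= 2 and kind(text[i - 2]) == cur
    return False


def word_boundaries(text):
    """Return the boundary indices of `text`, including 0 and len(text)."""
    if not text:
        return []
    inner = [i for i in range(1, len(text)) if not no_break_between(text, i)]
    return [0] + inner + [len(text)]
```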
2023-02-15	LibUnicode: Support finding the next/previous text segmentation boundary	Timothy Flynn

2023-02-15	LibUnicode: Allow iterating over text segmentation boundaries	Timothy Flynn
	This will be useful for e.g. finding the next boundary after a specific index - we can just stop iterating once a condition is satisfied.
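The "stop iterating once a condition is satisfied" idea maps naturally onto a generator. In this Python sketch, `boundaries` is a hypothetical stand-in segmenter that breaks wherever whitespace starts or ends; `next_boundary` and `previous_boundary` are illustrative names, not the LibUnicode API.

```python
def boundaries(text):
    """Hypothetical stand-in segmenter: yield a boundary at the start, at the
    end, and wherever the text switches between whitespace and non-whitespace."""
    if not text:
        return
    yield 0
    for i in range(1, len(text)):
        if text[i].isspace() != text[i - 1].isspace():
            yield i
    yield len(text)


def next_boundary(text, index):
    """Return the first boundary strictly after `index`, or None."""
    # The generator is abandoned as soon as a match is found.
    return next((b for b in boundaries(text) if b > index), None)


def previous_boundary(text, index):
    """Return the last boundary strictly before `index`, or None."""
    previous = None
    for b in boundaries(text):
        if b >= index:
            break  # no need to segment the rest of the text
        previous = b
    return previous
```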
2023-02-15	LibUnicode: Implement text segmentation algorithms for all UTF encodings	Timothy Flynn
	Similar to commit 6d710eeb431d4fc729e4692ac8db4270183cd039. Rather than picking and choosing what to support, let's just support all encodings now, as it is trivial. For example, LibGUI will want the UTF-32 overloads.
2023-02-15	LibUnicode+LibJS: Move text segmentation algorithms to their own files	Timothy Flynn
	These algorithms are quite chonky, and more APIs around them are to be added, so let's move them to their own files for a bit of organization.
2023-02-08	Everywhere: Use ReadonlySpan<T> instead of Span<T const>	MacDue

2023-01-18	AK+LibUnicode: Provide Unicode-aware caseless String matching	Timothy Flynn
	The Unicode spec defines much more complicated caseless matching algorithms in its Collation spec. This implements the "basic" case folding comparison.
2023-01-18	LibUnicode: Parse and generate case folding code point data	Timothy Flynn
	Case folding rules have a similar mapping style as special casing rules, where one code point may map to zero or more case folding rules. These will be used for case-insensitive string comparisons. To see how case folding can differ from other casing rules, consider "ß" (U+00DF):
	    >>> "ß".lower()
	    'ß'
	    >>> "ß".upper()
	    'SS'
	    >>> "ß".title()
	    'Ss'
	    >>> "ß".casefold()
	    'ss'
2023-01-18	LibUnicode: Update out-of-date spec links	Timothy Flynn
	And remove links that aren't adding much value but will often get out of date (i.e. links to UCD files, which are already all listed in unicode_data.cmake).
2023-01-16	AK+LibUnicode: Provide Unicode-aware String titlecase transformation	Timothy Flynn

2023-01-16	LibUnicode: Support full case folding for titlecasing a string	Timothy Flynn
	Unicode declares that to titlecase a string, the first cased code point after each word boundary should be transformed to its titlecase mapping. All other code points are transformed to their lowercase mapping.
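The titlecasing rule can be sketched in Python. Assumptions: the caller supplies the word boundaries, `is_cased` is only an approximation of the Unicode Cased property built from Python's string predicates, and Python's per-character `title()` and `lower()` stand in for the generated mappings. Context-sensitive special casing is ignored.

```python
def is_cased(ch):
    """Approximate the Unicode Cased property with Python's str predicates."""
    return ch.isupper() or ch.islower() or ch.istitle()


def to_titlecase(text, word_boundaries):
    """Titlecase the first cased code point after each boundary; lowercase the rest."""
    starts = set(word_boundaries)
    result = []
    awaiting_cased = False
    for i, ch in enumerate(text):
        if i in starts:
            awaiting_cased = True
        if awaiting_cased and is_cased(ch):
            result.append(ch.title())  # may expand, e.g. "ß" becomes "Ss"
            awaiting_cased = False
        else:
            result.append(ch.lower())
    return "".join(result)
```

Note that uncased code points at the start of a word, such as digits, are passed over until the first cased code point is found.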
2023-01-16	LibUnicode: Generate simple case folding mappings for titlecase	Timothy Flynn
	Note we already generate the special case foldings for titlecase.
2023-01-16	LibUnicode: Add an overload of word segmentation for UTF-8 strings	Timothy Flynn

2023-01-15	LibUnicode: Return a String from Unicode normalization	Timothy Flynn

2023-01-09	AK+LibUnicode: Provide Unicode-aware String case transformations	Timothy Flynn
	Since AK can't refer to LibUnicode directly, the strategy here is that if you need case transformations, you can link LibUnicode and receive them. If you try to use either of these methods without linking it, then you'll of course get a linker error (note we don't do any fallbacks to e.g. ASCII case transformations). If you don't need these methods, you don't have to link LibUnicode.
2023-01-09	LibUnicode: Move Unicode-aware case transformations to a helper file	Timothy Flynn
	These will be needed by AK::String as well, so move them to a helper file where they can be re-used.
2023-01-09	LibUnicode+LibJS: Propagate OOM from Unicode normalization	Timothy Flynn

2023-01-09	LibUnicode+LibJS+LibWeb: Propagate OOM from Unicode case transformations	Timothy Flynn

2022-12-14	LibUnicode: Fix compilation when the UCD download is disabled	Timothy Flynn

2022-12-06	Everywhere: Rename to_{string => deprecated_string}() where applicable	Linus Groh
	This will make it easier to support both string types at the same time while we convert code, and to track down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.
2022-12-06	AK+Everywhere: Rename String to DeprecatedString	Linus Groh
	We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)
2022-11-06	Meta+LibUnicode: Avoid relocations for static unicode data	Gunnar Beutner
	Previously the s_decomposition_mappings variable would refer to other data in s_decomposition_mappings_data. This would cause thousands of avoidable relocations at load time. This saves about 128kB RAM for each process which uses LibUnicode.
2022-11-01	Everywhere: Mark dependencies of most targets as PRIVATE	Tim Schumacher
	Otherwise, we end up propagating those dependencies into targets that link against that library, which creates unnecessary link-time dependencies. Also included are changes to re-add now-missing dependencies to tools that actually need them.
2022-10-17	Lagom+CMake: Propagate dependencies for generated custom targets	Andrew Kaster
	We have logic for serenity_generated_sources which works well for source files that are specified in GENERATED_SOURCES prior to calling serenity_lib or serenity_bin. However, code generated with invoke_generator, and the LibWeb generators do not always follow the pattern of the IDL and GML files. For the LibWeb generators, we can just add_dependencies to LibWeb at the time we declare the generate_Foo custom target. However for LibLocale, LibTimeZone, and LibUnicode, we don't have the name of the target available, so export the name in a variable to set into GENERATED_SOURCES. To make this work for Lagom, we need to make sure that lagom_lib and serenity_bin in Lagom/CMakeLists.txt call serenity_generated_sources on the target. This enables the Xcode generator on macOS hosts, at least for Lagom.
2022-10-07	LibUnicode: Fix Hangul syllable composition for specific cases	matcool
	This fixes `combine_hangul_code_points` which would try to combine a LVT syllable with a trailing consonant, resulting in a wrong character. Also added a test for this specific case.
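For context, the pairwise Hangul composition arithmetic from the Unicode Standard can be sketched in Python. The function below is an illustration and not the LibUnicode source; it borrows the name from the commit for readability only. The `% T_COUNT == 0` check is what restricts the second case to LV syllables, so an LVT syllable is never combined with another trailing consonant.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT  # 11172 precomposed syllables


def combine_hangul_code_points(a, b):
    """Return the composition of code points a and b, or None if they do not combine."""
    # Leading consonant + vowel -> LV syllable.
    if L_BASE <= a < L_BASE + L_COUNT and V_BASE <= b < V_BASE + V_COUNT:
        l_index, v_index = a - L_BASE, b - V_BASE
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
    # LV syllable + trailing consonant -> LVT syllable.
    is_syllable = S_BASE <= a < S_BASE + S_COUNT
    is_lv = is_syllable and (a - S_BASE) % T_COUNT == 0  # rejects LVT syllables
    if is_lv and T_BASE < b < T_BASE + T_COUNT:
        return a + (b - T_BASE)
    return None
```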
2022-10-06	LibUnicode: Add to-and-from string converters for NormalizationForm	Timothy Flynn

2022-10-06	LibUnicode: Add decomposition mappings and Unicode normalization	matcool
	The mappings are exposed via `Unicode::code_point_decomposition(u32)` and `Unicode::code_point_decompositions()`, the latter being useful for reverse searching a code point from its decomposition. The normalization code does not make use of `Quick_Check` props (https://www.unicode.org/reports/tr44/#Decompositions_and_Normalization), meaning no quick check optimizations.
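To illustrate what a decomposition mapping and a reverse search look like, here is a Python sketch built on the standard `unicodedata` module. It is an analogy only: the Python helpers below are hypothetical and merely mirror the shape of the C++ functions named above.

```python
import unicodedata


def code_point_decomposition(code_point):
    """Return (compatibility tag or None, list of decomposed code points)."""
    fields = unicodedata.decomposition(chr(code_point)).split()
    tag = fields[0] if fields and fields[0].startswith("<") else None
    mapping = [int(field, 16) for field in fields if not field.startswith("<")]
    return tag, mapping


def find_code_point_for(decomposition, search_range=range(0x110000)):
    """Reverse search: find a code point whose mapping equals `decomposition`."""
    for code_point in search_range:
        if code_point_decomposition(code_point)[1] == decomposition:
            return code_point
    return None
```

For example, U+00E9 maps to U+0065 U+0301, and searching for that pair finds U+00E9 again.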
2022-09-11	LibUnicode: Parse and generate custom emoji added for SerenityOS	Timothy Flynn
	Parse emoji from emoji-serenity.txt to allow displaying their names and grouping them together in the EmojiInputDialog. This also adds an "Unknown" value to the EmojiGroup enum. This will be useful for emoji that aren't found in the UCD, or for when UCD downloads are disabled.
2022-09-08	LibUnicode: Parse and generate emoji code point data	Timothy Flynn
	According to TR #51, the "best definition of the full set [of emojis] is in the emoji-test.txt file". This defines not only the emoji themselves, but the order in which they should be displayed, and what "group" of emojis they belong to.
2022-09-05	LibLocale: Move locale source files to the LibLocale library	Timothy Flynn
	Everything is now set up to create the LibLocale library and link it where needed.
2022-09-05	LibUnicode: Generate a separate Locale enumeration for special casing	Timothy Flynn
	The UCD only cares about a few locales for special casing rules (az, lt, and tr). Unfortunately, LibUnicode cannot use LibLocale once the libraries are separate because LibLocale will need to use LibUnicode for many more things; thus there would be a circular dependency. Instead, just generate the small enum needed for this one use case.
2022-09-05	LibLocale: Move locale source files to the LibLocale folder	Timothy Flynn
	These are still included in LibUnicode, but this updates their location and the include paths of other files which include them.
2022-09-05	Userland: Move files destined for LibLocale to the Locale namespace	Timothy Flynn

2022-09-05	LibUnicode+LibJS: Move Unicode::get_available_currencies() to Locale.h	Timothy Flynn
	This is generated by GenerateLocaleData, which will soon be in the Locale namespace. Move it out of CurrencyCode.h, as that will continue to live in the Unicode namespace.
2022-09-05	LibLocale+LibUnicode: Move generated CLDR data files to LibLocale folder	Timothy Flynn
	They are still included into LibUnicode, but this moves their generated location to be under LibLocale.
2022-09-05	LibUnicode+Userland: Migrate generated CLDR data to LibLocaleData	Timothy Flynn
	Currently, LibUnicodeData contains the generated UCD and CLDR data. Move the UCD data to the main LibUnicode library, and rename LibUnicodeData to LibLocaleData. This is another preparatory change to migrate to LibLocale.
2022-09-05	LibUnicode: Move CLDR data generators to a LibLocale subfolder	Timothy Flynn
	To prepare for placing all CLDR generated data in a new library, LibLocale, this moves the code generators for the CLDR data to the LibLocale subfolder.
2022-09-05	LibUnicode: Fully qualify use of AK::Variant in Locale.h	Timothy Flynn
	The generated locale data contains an enum also named Variant, as variants are part of locale strings. This hasn't been an issue, but as includes are reordered, the order in which the enum and AK::Variant are included may cause an ambiguity error.
2022-08-25	LibUnicode: Fix compilation when ENABLE_UNICODE_DATABASE_DOWNLOAD is OFF	Timothy Flynn

2022-07-21	LibUnicode: Generate per-locale data for the "noon" fixed day period	Timothy Flynn
	Note that not all locales have this day period.
2022-07-20	LibUnicode: Implement the range pattern processing algorithm	Timothy Flynn
	This algorithm is to inject spacing around the range separator under certain conditions. For example, in en-US, the range [3, 5] should be formatted as "3–5" if unitless, but as "$3 – $5" for currency.
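A small Python sketch of the spacing decision. The condition used here (add spaces when the lower string does not end in a digit or the upper string does not begin with one) is an assumption based on the CLDR range pattern heuristic, not a transcription of the LibUnicode code, and `format_range` is a hypothetical helper. It also ignores the part of the heuristic that checks whether the pattern already contains whitespace.

```python
RANGE_SEPARATOR = "\u2013"  # en dash, as in the en-US example above


def format_range(lower, upper, separator=RANGE_SEPARATOR):
    """Join two already-formatted numbers, spacing the separator when needed."""
    # Assumed heuristic: space the separator unless digits touch it on both sides.
    needs_spacing = not lower[-1:].isdigit() or not upper[:1].isdigit()
    if needs_spacing:
        return f"{lower} {separator} {upper}"
    return f"{lower}{separator}{upper}"
```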
2022-07-20	LibUnicode: Generate per-locale approximately & range separator symbols	Timothy Flynn