path: root/AK/String.cpp
Age | Commit message | Author
2023-03-13 | AK: Rename Stream::read_entire_buffer to Stream::read_until_filled | Tim Schumacher
No functional changes.
2023-03-09 | AK: Make FlyString::hash() use the cached hash in StringData if possible | Andreas Kling
This avoids rehashing the string every time.
2023-03-08 | AK: Make String::contains(code_point) handle non-ASCII | Timothy Flynn
We currently only accept a char, instead of a full code point.
2023-03-08 | AK: Make String::{starts,ends}_with(code_point) handle non-ASCII | Timothy Flynn
We currently pass the code point to StringView::{starts,ends}_with, which actually accepts a single char, thus cannot handle non-ASCII code points.
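The fix described above can be illustrated with a standalone sketch: encode the code point as UTF-8 first, then compare whole byte sequences instead of a single char. This uses the standard library rather than AK, and the names `encode_utf8`, `starts_with_code_point`, and `ends_with_code_point` are hypothetical helpers, not the functions from the commit.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: encode one Unicode scalar value as UTF-8 bytes.
std::string encode_utf8(char32_t code_point)
{
    // Surrogates and values above U+10FFFF are not valid scalar values.
    assert(code_point <= 0x10FFFF && !(code_point >= 0xD800 && code_point <= 0xDFFF));
    std::string out;
    if (code_point < 0x80) {
        out += static_cast<char>(code_point);
    } else if (code_point < 0x800) {
        out += static_cast<char>(0xC0 | (code_point >> 6));
        out += static_cast<char>(0x80 | (code_point & 0x3F));
    } else if (code_point < 0x10000) {
        out += static_cast<char>(0xE0 | (code_point >> 12));
        out += static_cast<char>(0x80 | ((code_point >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code_point & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (code_point >> 18));
        out += static_cast<char>(0x80 | ((code_point >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((code_point >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (code_point & 0x3F));
    }
    return out;
}

// Compare the full encoded sequence against the start of the string.
bool starts_with_code_point(std::string_view utf8, char32_t code_point)
{
    std::string needle = encode_utf8(code_point);
    return utf8.substr(0, needle.size()) == needle;
}

// Compare the full encoded sequence against the end of the string.
bool ends_with_code_point(std::string_view utf8, char32_t code_point)
{
    std::string needle = encode_utf8(code_point);
    return utf8.size() >= needle.size()
        && utf8.substr(utf8.size() - needle.size()) == needle;
}
```

With this approach, a string beginning with U+00E9 (bytes C3 A9) matches U+00E9 but not U+00C3, even though its first byte is 0xC3. A comparison against a single char cannot make that distinction.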
2023-03-03 | AK: Ensure short String instances are valid UTF-8 | Timothy Flynn
We are currently only validating long strings.
2023-03-03 | AK: Add String::ends_with{,_bytes}() | Linus Groh
2023-02-28 | AK: Add two starts_with{bytes,}() APIs to String | Ali Mohammad Pur
2023-02-21 | AK: Ensure that we fill the whole String when reading from a Stream | Tim Schumacher
2023-02-21 | AK: Add String::from_stream method | Andrew Kaster
The caller is responsible for determining how long the string is that they want to read.
2023-02-21 | AK: Make String const-correct internally | Andreas Kling
2023-02-18 | AK: Fix 64-bit alignment issue in shared-superstring substrings | Andreas Kling
Thanks to Timothy Flynn for the test! Fixes #17141
2023-01-28 | AK: Add String::trim | Timothy Flynn
2023-01-28 | AK: Add String::join | Timothy Flynn
2023-01-27 | AK: Add an overload of String::find_byte_offset for StringView | Timothy Flynn
2023-01-24 | AK: Add convenience substring wrappers to String to exclude a length | Timothy Flynn
These overloads exist on other string classes and are used throughout the code base.
2023-01-24 | AK: Add a method to create a String with a repeated code point | Timothy Flynn
2023-01-24 | AK: Add a method to find the byte offset of a code point | Timothy Flynn
2023-01-22 | AK: Reduce String's allocated data by one byte | Timothy Flynn
This was copied from allocation_size_for_stringimpl, which had to ensure the string is null-terminated. String makes no such guarantee.
2023-01-22 | AK: Change String's default constructor to be constant | Timothy Flynn
This allows creating expressions such as: constexpr Array<String, 10> {};
2023-01-21 | AK: Add `split()` for `String` | martinfalisse
2023-01-20 | AK: Support creating known short string literals at compile time | Timothy Flynn
In cases where we know a string literal will fit in the short string storage, we can do so at compile time without needing to handle error propagation. If the provided string literal is too long, a compilation error will be emitted due to the failed VERIFY statement being a non-constant expression.
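The compile-time check described above can be sketched in standard C++. The sketch below substitutes `assert` for AK's VERIFY, and `ShortString` and `make_short_string` are hypothetical names chosen for illustration. The idea is that a failing check inside a `constexpr` function is not a constant expression, so initializing a `constexpr` variable from an over-long literal does not compile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for short string storage: one byte less than a pointer.
struct ShortString {
    static constexpr std::size_t max_length = sizeof(void*) - 1;
    unsigned char bytes[max_length] {};
    std::uint8_t length { 0 };
};

template<std::size_t N>
constexpr ShortString make_short_string(char const (&literal)[N])
{
    // N includes the terminating NUL. If this check fails during constant
    // evaluation, the call is no longer a constant expression and the
    // program is rejected at compile time. (The sketch assumes assertions
    // are enabled; with NDEBUG the check disappears.)
    assert(N - 1 <= ShortString::max_length);
    ShortString result {};
    for (std::size_t i = 0; i + 1 < N; ++i)
        result.bytes[i] = static_cast<unsigned char>(literal[i]);
    result.length = static_cast<std::uint8_t>(N - 1);
    return result;
}

// Fits in the short storage on both 32-bit and 64-bit platforms.
constexpr ShortString ok = make_short_string("abc");
static_assert(ok.length == 3, "literal fits in the short storage");

// Uncommenting the next line should fail to compile, because the assertion
// fails inside a constant expression:
// constexpr ShortString bad = make_short_string("far too long to fit");
```

Because the check happens during constant evaluation, the caller gets a plain value back and there is no error path to propagate.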
2023-01-15 | AK: Add String::contains | Timothy Flynn
2023-01-15 | AK: Add a somewhat naive implementation of String::reverse | Timothy Flynn
This will reverse the String's code points (i.e. not just its bytes), but is not aware of grapheme clusters.
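A minimal sketch of reversing by code point rather than by byte, written against the standard library and assuming the input is already valid UTF-8. `reverse_code_points` is a hypothetical name and this is not the AK implementation. Like the commit, it knows nothing about grapheme clusters, so a base character and a following combining mark end up in swapped order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Reverse the order of code points in a valid UTF-8 string.
std::string reverse_code_points(std::string_view utf8)
{
    std::string out;
    out.reserve(utf8.size());
    std::size_t end = utf8.size();
    while (end > 0) {
        // Walk back over continuation bytes (10xxxxxx) to find the lead byte.
        std::size_t start = end - 1;
        while (start > 0 && (static_cast<unsigned char>(utf8[start]) & 0xC0) == 0x80)
            --start;
        // Copy the whole encoded code point, keeping its bytes in order.
        out.append(utf8.substr(start, end - start));
        end = start;
    }
    // Reordering code points never changes the total byte count.
    assert(out.size() == utf8.size());
    return out;
}
```

A plain byte reversal would instead emit continuation bytes before their lead byte and produce invalid UTF-8 for any non-ASCII input.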
2023-01-12 | AK: Implement FlyString for the new String class | Timothy Flynn
This implements a FlyString that will de-duplicate String instances. The FlyString will store the raw encoded data of the String instance: If the String is a short string, FlyString holds the String::ShortString bytes; otherwise FlyString holds a pointer to the Detail::StringData. FlyString itself does not know about String's storage or how to refcount its Detail::StringData. It defers to String to implement these details.
2023-01-02 | Everywhere: Remove unused includes of AK/Memory.h | Ben Wiederhake
These instances were detected by searching for files that include AK/Memory.h, but don't match the regex: \\b(fast_u32_copy|fast_u32_fill|secure_zero|timing_safe_compare)\\b This regex is pessimistic, so there might be more files that don't actually use any memory function. In theory, one might use LibCPP to detect things like this automatically, but let's do this one step after another.
2022-12-11 | AK: Change the moved-from String state to the empty short string | kleines Filmröllchen
The previous moved-from state was the null string. This violates both our invariant that String is never null, and also the C++ contract that the moved-from state must be valid but unspecified. The empty short string state is of course valid, so it satisfies both invariants. It also allows us to remove any extra checks for the null state. The reason this change is made is primarily because swap() requires moved-from objects to be reassignable (C++ allows this). Because the move assignment of String would not check the null state, it crashed trying to increment the data reference count (nullptr signals a non-short string). This meant that e.g. quick_sort'ing String would crash immediately.
2022-12-09 | AK: Unref old m_data in String's move assignment | Maciej
We were overriding the data pointer without unreffing it, causing a memory leak when assigning a String.
2022-12-06 | AK: Introduce the new String, replacement for DeprecatedString | Andreas Kling
DeprecatedString (formerly String) has been with us since the start, and it has served us well. However, it has a number of shortcomings that I'd like to address.
Some of these issues are hard if not impossible to solve incrementally inside of DeprecatedString, so instead of doing that, let's build a new String class and then incrementally move over to it instead.
Problems in DeprecatedString:
- It assumes string allocation never fails. This makes it impossible to use in allocation-sensitive contexts, and is the reason we had to ban DeprecatedString from the kernel entirely.
- The awkward null state. DeprecatedString can be null. It's different from the empty state, although null strings are considered empty. All code is immediately nicer when using Optional<DeprecatedString> but DeprecatedString came before Optional, which is how we ended up like this.
- The encoding of the underlying data is ambiguous. For the most part, we use it as if it's always UTF-8, but there have been cases where we pass around strings in other encodings (e.g. ISO8859-1)
- operator[] and length() are used to iterate over DeprecatedString one byte at a time. This is done all over the codebase, and will *not* give the right results unless the string is all ASCII.
How we solve these issues in the new String:
- Functions that may allocate now return ErrorOr<String> so that ENOMEM errors can be passed to the caller.
- String has no null state. Use Optional<String> when needed.
- String is always UTF-8. This is validated when constructing a String. We may need to add a bypass for this in the future, for cases where you have a known-good string, but for now: validate all the things!
- There is no operator[] or length(). You can get the underlying data with bytes(), but for iterating over code points, you should be using an UTF-8 iterator.
Furthermore, it has two nifty new features:
- String implements a small string optimization (SSO) for strings that can fit entirely within a pointer. This means up to 3 bytes on 32-bit platforms, and 7 bytes on 64-bit platforms. Such small strings will not be heap-allocated.
- String can create substrings without making a deep copy of the substring. Instead, the superstring gets +1 refcount from the substring, and it acts like a view into the superstring. To make substrings like this, use the substring_with_shared_superstring() API.
One caveat:
- String does not guarantee that the underlying data is null-terminated like DeprecatedString does today. While this was nifty in a handful of places where we were calling C functions, it did stand in the way of shared-superstring substrings.
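The shared-superstring idea can be modelled in a few lines of standard C++. This is only a sketch under assumptions: `SharedString` and its members are hypothetical, `std::shared_ptr` stands in for AK's own reference counting, and no UTF-8 validation or small string optimization is modelled. A substring keeps the original buffer alive and stores just an offset and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical model: a string is a (buffer, offset, length) triple.
class SharedString {
public:
    explicit SharedString(std::string text)
        : m_buffer(std::make_shared<std::string>(std::move(text)))
        , m_offset(0)
        , m_length(m_buffer->size())
    {
    }

    // No bytes are copied: the substring takes one more reference on the buffer.
    SharedString substring(std::size_t start, std::size_t length) const
    {
        assert(start <= m_length && length <= m_length - start);
        return SharedString(m_buffer, m_offset + start, length);
    }

    // The view ends wherever the substring ends, so it is generally not
    // followed by a NUL byte.
    std::string_view view() const
    {
        return std::string_view(*m_buffer).substr(m_offset, m_length);
    }

    long buffer_use_count() const { return m_buffer.use_count(); }

private:
    SharedString(std::shared_ptr<std::string const> buffer, std::size_t offset, std::size_t length)
        : m_buffer(std::move(buffer))
        , m_offset(offset)
        , m_length(length)
    {
    }

    std::shared_ptr<std::string const> m_buffer;
    std::size_t m_offset;
    std::size_t m_length;
};
```

The model also shows why the caveat follows: a substring that stops in the middle of the shared buffer cannot be null-terminated without copying or mutating the superstring.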
2022-12-06 | AK+Everywhere: Rename String to DeprecatedString | Linus Groh
We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)
2022-10-24 | AK: Add SplitBehavior::KeepTrailingSeparator with tests | demostanis
2022-10-24 | AK+Everywhere: Turn bool keep_empty to an enum in split* functions | demostanis
2022-10-23 | AK: Add to_{double, float} convenience functions to all string types | davidot
These are guarded with #ifndef KERNEL, since doubles (and floats) are not allowed in KERNEL mode. In StringUtils there is convert_to_floating_point which does have a template parameter in case you have a templated type.
2022-07-12 | AK: Remove String <-> char const* comparison operators | sin-ack
During the removal of StringView(char const*), all users of these functions were removed, and they are of dubious value (relying on implicit StringView conversion).
2022-07-12 | Everywhere: Add sv suffix to strings relying on StringView(char const*) | sin-ack
Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.
2022-05-26 | AK: Add invert_case() and invert_case(StringView) | huttongrabiel
In the given String, invert_case() swaps lowercase characters with uppercase ones and vice versa.
2022-04-20 | AK: Explicitly instantiate String::to_uint<unsigned long{, long}>() | Ali Mohammad Pur
Instead of just to_uint<u64>().
2022-04-01 | Everywhere: Run clang-format | Idan Horowitz
2022-02-25 | AK: Add String::split_view(Function<bool(char)>) | Andreas Kling
This allows you to split around a custom separator, and enables expressive code like this: string.split_view(is_ascii_space);
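A standalone sketch of predicate-based splitting, using `std::string_view` and `std::function` in place of the AK types. The free functions `split_view` and `is_ascii_space` below are hypothetical stand-ins, and dropping empty parts is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>
#include <vector>

// Hypothetical ASCII whitespace predicate.
bool is_ascii_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

// Split `input` wherever the predicate returns true. The returned views
// point into `input`, so `input` must outlive them.
std::vector<std::string_view> split_view(std::string_view input, std::function<bool(char)> const& is_separator)
{
    assert(is_separator); // an empty std::function cannot be called
    std::vector<std::string_view> parts;
    std::size_t start = 0;
    for (std::size_t i = 0; i <= input.size(); ++i) {
        if (i == input.size() || is_separator(input[i])) {
            if (i > start) // assumption: empty parts are dropped
                parts.push_back(input.substr(start, i - start));
            start = i + 1;
        }
    }
    return parts;
}
```

Calling `split_view("a  b\tc", is_ascii_space)` yields the three views `a`, `b`, and `c`. Any callable taking a `char` works, so the same function can also split on digits or punctuation.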
2022-01-29 | AK: Implement String's comparison operators in terms of StringView's | Daniel Bertalan
2022-01-24 | Everywhere: Convert ByteBuffer factory methods from Optional -> ErrorOr | Sam Atkins
Apologies for the enormous commit, but I don't see a way to split this up nicely. In the vast majority of cases it's a simple change. A few extra places can use TRY instead of manual error checking though. :^)
2022-01-16 | AK: Fix logic in String::operator>(const String&) | Matt Jacobson
Null strings should not compare greater than non-null strings. Add tests for >, <, >=, and <= comparison involving null strings.
2021-11-17 | AK: Convert AK::Format formatting helpers to returning ErrorOr<void> | Andreas Kling
This isn't a complete conversion to ErrorOr<void>, but a good chunk. The end goal here is to propagate buffer allocation failures to the caller, and allow the use of TRY() with formatting functions.
2021-11-11 | Everywhere: Pass AK::StringView by value | Andreas Kling
2021-11-10 | AK+Everywhere: Stop including Vector.h from StringView.h | Andreas Kling
Preparation for using Error.h from Vector.h. This required moving some things out of line.
2021-09-12 | AK: Escape '"' in escape_html_entities | Peter Elliott
2021-09-11 | AK: Replace the mutable String::replace API with an immutable version | Idan Horowitz
This removes the awkward String::replace API, which was the only String API that mutated the String, and replaces it with a new immutable version that returns a new String with the replacements applied. This also fixes a couple of UAFs that were caused by the use of this API. As an optimization, an equivalent StringView::replace API was also added to remove unnecessary String allocations of the form: `String { view }.replace(...);`
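The shape of an immutable replace can be sketched with the standard library. `replace_all` is a hypothetical name, and the sketch always replaces every occurrence, which may differ from the parameters of the AK API. It takes views as input, so no temporary string is needed just to call it, and it returns a fresh string, so views into the original stay valid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Return a copy of `input` with every occurrence of `needle` replaced.
// The input is never modified.
std::string replace_all(std::string_view input, std::string_view needle, std::string_view replacement)
{
    // This sketch treats an empty needle as a caller error.
    assert(!needle.empty());
    std::string out;
    std::size_t pos = 0;
    while (true) {
        std::size_t hit = input.find(needle, pos);
        if (hit == std::string_view::npos)
            break;
        out.append(input.substr(pos, hit - pos)); // text before the match
        out.append(replacement);
        pos = hit + needle.size(); // continue after the match
    }
    out.append(input.substr(pos)); // remaining tail
    return out;
}
```

Because the original buffer is left untouched, a view taken before the call cannot dangle afterwards. That is the kind of use-after-free a mutating replace makes easy to write.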
2021-09-11 | AK: Make String::count not use strstr and take a StringView | Idan Horowitz
This was needlessly copying StringView arguments, and was also using strstr internally, which meant it was doing a bunch of unnecessary strlen calls on it. This also moves the implementation to StringUtils to allow API consistency between String and StringView.
2021-09-06 | Everywhere: Make ByteBuffer::{create_*,copy}() OOM-safe | Ali Mohammad Pur
2021-09-01 | AK: Pass AK::Format TypeErasedFormatParams by reference in AK::String | Brian Gianforcaro
This silences an overeager warning in sonar cloud, warning that slicing could occur with `VariadicFormatParams` which derives from `TypeErasedFormatParams`. Reference: https://sonarcloud.io/project/issues?id=SerenityOS_serenity&issues=AXuVPBW3k92xXUF3qXTE&open=AXuVPBW3k92xXUF3qXTE This is a continuation of f0b3aa033134b788a28fe8cf8ff6028d0e7941e8.
2021-08-26 | AK: Implement method to convert a String/StringView to title case | Timothy Flynn
This implementation preserves consecutive spaces in the original string.
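One way to get title casing that preserves consecutive spaces is to transform the input character by character instead of splitting it into words and rejoining them. The sketch below is ASCII-only, treats only the space character as a word boundary, and uses the hypothetical names `ascii_upper`, `ascii_lower`, and `to_titlecase`. It illustrates the behaviour and is not the code from the commit.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical ASCII-only case helpers.
char ascii_upper(char c) { return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c; }
char ascii_lower(char c) { return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c; }

// Uppercase the first character and every character that follows a space;
// lowercase everything else. Spaces are copied through unchanged.
std::string to_titlecase(std::string_view input)
{
    std::string out;
    out.reserve(input.size());
    bool next_is_upper = true;
    for (char c : input) {
        out += next_is_upper ? ascii_upper(c) : ascii_lower(c);
        next_is_upper = (c == ' ');
    }
    // Every input character produces exactly one output character,
    // so runs of spaces survive intact.
    assert(out.size() == input.size());
    return out;
}
```

For example, `to_titlecase("hello  wORLD")` returns `Hello  World` with both spaces kept.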