path: root/Userland/Libraries/LibRegex/RegexMatcher.h
Age | Commit message | Author
2023-03-06 | Everywhere: Remove NonnullOwnPtr.h includes | Andreas Kling
2023-01-27 | LibRegex: Remove declarations for non-existent methods | Sam Atkins
2022-12-06 | Everywhere: Rename to_{string => deprecated_string}() where applicable | Linus Groh
This will make it easier to support both string types at the same time while we convert code, and tracking down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.
2022-12-06 | AK+Everywhere: Rename String to DeprecatedString | Linus Groh
We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)
2022-02-05 | LibRegex: Do not return an Optional from Regex::Matcher::execute | Timothy Flynn
The code path that could return an optional no longer exists as of commit: a962ee020a6310b2d7c7479aa058c15484127418
2021-12-15 | LibRegex: Merge alternations based on blocks and not instructions | Ali Mohammad Pur
The instructions can have dependencies (e.g. Repeat), so only unify equal blocks instead of consecutive instructions. Fixes #11247. Also adds the minimal test case(s) from that issue.
2021-11-11 | Everywhere: Pass AK::StringView by value | Andreas Kling
2021-09-16 | LibRegex: Pass RegexStringView and Vector<RegexStringView> by reference | Brian Gianforcaro
Flagged by pvs-studio, it looks like these were intended to be passed by reference originally, but it was missed. This avoids excessive argument copy when searching / matching in the regex API.
Before:
    Command: /usr/Tests/LibRegex/Regex --bench
    Average time: 5998.29 ms (median: 5991, stddev: 102.18)
After:
    Command: /usr/Tests/LibRegex/Regex --bench
    Average time: 5623.2 ms (median: 5623, stddev: 86.25)
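The by-value versus by-reference difference described above can be sketched roughly as follows. This is an illustration only: `View`, `g_copies`, `total_length_by_value` and `total_length_by_ref` are hypothetical names, not LibRegex types or functions, and the copy counter exists purely to make the cost visible.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counter, incremented on every copy of a View.
static int g_copies = 0;

// Hypothetical stand-in for a view-like type whose copies we want to count.
struct View {
    std::string text;
    explicit View(std::string t) : text(std::move(t)) {}
    View(View const& other) : text(other.text) { ++g_copies; }
    View& operator=(View const&) = default;
};

// Taking the vector by value copies every element on each call.
std::size_t total_length_by_value(std::vector<View> views)
{
    std::size_t total = 0;
    for (auto const& view : views)
        total += view.text.size();
    return total;
}

// Taking the vector by const reference performs no element copies.
std::size_t total_length_by_ref(std::vector<View> const& views)
{
    std::size_t total = 0;
    for (auto const& view : views)
        total += view.text.size();
    return total;
}
```

Both functions compute the same result; only the by-value variant pays for a copy of each element when called with an lvalue.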
2021-09-13 | LibRegex: Add a basic optimization pass | Ali Mohammad Pur
This currently tries to convert forking loops to atomic groups, and unify the left side of alternations.
2021-08-19 | AK: Move FormatParser definition from header to implementation file | Timothy Flynn
This is primarily to be able to remove the GenericLexer include out of Format.h as well. A subsequent commit will add AK::Result to GenericLexer, which will cause naming conflicts with other structures named Result. This can be avoided (for now) by preventing nearly every file in the system from implicitly including GenericLexer. Other changes in this commit are to add the GenericLexer include to files where it is missing.
2021-08-15 | LibRegex: Remove (mostly) unused regex::MatchOutput | Timothy Flynn
This struct holds a counter for the number of executed operations, and vectors for matches, capture groups, and named capture groups. Each of the vectors is unused. Remove the struct and just keep a separate counter for the executed operations.
2021-08-15 | LibRegex+LibJS: Combine named and unnamed capture groups in MatchState | Timothy Flynn
Combining these into one list helps reduce the size of MatchState, and as a result, reduces the amount of memory consumed during execution of very large regex matches. Doing this also allows us to remove a few regex byte code instructions: ClearNamedCaptureGroup, SaveLeftNamedCaptureGroup, and NamedReference. Named groups now behave the same as unnamed groups for these operations. Note that SaveRightNamedCaptureGroup still exists to cache the matched group name. This also removes the recursion level from the MatchState, as it can exist as a local variable in Matcher::execute instead.
2021-08-02 | LibRegex: Make Matcher<>::match(Vector<>) take a reference to the vector | Ali Mohammad Pur
It was previously copying the entire vector every time, which is not a nice thing to do. :^)
2021-08-02 | LibRegex: Make Fork{Jump,Stay} non-recursive | Ali Mohammad Pur
This makes very fork-heavy expressions (like `(aa)*`) not run out of stack space when matching very long strings.
2021-07-30 | LibRegex: Allow separately parsing patterns and creating Regex objects | Timothy Flynn
Adds a static method to parse a regex pattern and return the result, and a constructor to accept a parse result. This is to allow LibJS to parse the pattern string of a RegExpLiteral once and hand off regex objects any number of times thereafter.
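The parse-once, construct-many pattern described above might look roughly like this. All names here (`ParseResult`, `parse_pattern`, `LiteralMatcher`, `g_parse_calls`) are hypothetical and the "parser" only handles literal text; the sketch does not reflect the actual LibRegex signatures.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical counter showing how often the pattern is parsed.
static int g_parse_calls = 0;

// Hypothetical parse result; a real one would hold bytecode and error state.
struct ParseResult {
    std::string literal;
    bool valid;
};

// Stand-in for a static "parse the pattern" function.
ParseResult parse_pattern(std::string pattern)
{
    ++g_parse_calls;
    bool valid = !pattern.empty();
    return ParseResult { std::move(pattern), valid };
}

// Stand-in for a regex object built from an existing parse result.
class LiteralMatcher {
public:
    explicit LiteralMatcher(ParseResult const& parsed)
        : m_parsed(parsed)
    {
    }

    bool contains_match(std::string_view input) const
    {
        return m_parsed.valid && input.find(m_parsed.literal) != std::string_view::npos;
    }

private:
    ParseResult m_parsed;
};
```

A caller can then parse a literal's pattern once and build any number of matcher objects from the stored result without invoking the parser again.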
2021-07-30 | LibRegex: Take ownership of pattern string and fix move operations | Timothy Flynn
The Regex object created a copy of the pattern string anyways, so tweak the constructor to allow callers to move() pattern strings into the regex. The Regex move constructor and assignment operator currently result in memory corruption. The Regex object stores a Matcher object, which holds a reference to the Regex object. So when the Regex object is moved, that reference is no longer valid. To fix this, the reference stored in the Matcher must be updated when the Regex is moved.
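A minimal sketch of the back-reference problem and its fix, under stated assumptions: `Pattern` and `Engine` are hypothetical stand-ins for the Regex and Matcher classes, the back-reference is modeled as a pointer so it can be reseated, and the owned object is held in a `std::unique_ptr`. The real classes may be laid out differently.

```cpp
#include <memory>
#include <string>
#include <utility>

class Pattern;

// Hypothetical stand-in for the matcher: it keeps a pointer back to its owner.
class Engine {
public:
    explicit Engine(Pattern const* owner) : m_owner(owner) {}
    void reset_owner(Pattern const* owner) { m_owner = owner; }
    Pattern const* owner() const { return m_owner; }

private:
    Pattern const* m_owner;
};

// Hypothetical stand-in for the regex object that owns the matcher.
class Pattern {
public:
    // Takes the pattern string by value so callers can move() it in.
    explicit Pattern(std::string source)
        : m_source(std::move(source))
        , m_engine(std::make_unique<Engine>(this))
    {
    }

    // After stealing the engine, point it at the new owner; otherwise it
    // would still refer to the moved-from object.
    Pattern(Pattern&& other)
        : m_source(std::move(other.m_source))
        , m_engine(std::move(other.m_engine))
    {
        if (m_engine)
            m_engine->reset_owner(this);
    }

    Pattern& operator=(Pattern&& other)
    {
        if (this != &other) {
            m_source = std::move(other.m_source);
            m_engine = std::move(other.m_engine);
            if (m_engine)
                m_engine->reset_owner(this);
        }
        return *this;
    }

    bool engine_points_at_me() const { return m_engine && m_engine->owner() == this; }
    std::string const& source() const { return m_source; }

private:
    std::string m_source;
    std::unique_ptr<Engine> m_engine;
};
```

A defaulted move constructor would transfer the engine but leave its owner pointer aimed at the old object, which is the kind of dangling reference the commit message describes.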
2021-07-23 | LibRegex: Switch to east-const style | Ali Mohammad Pur
2021-07-09 | LibRegex: Break from execution loop when the sticky flag is set | Timothy Flynn
If the sticky flag is set, the regex execution loop should break immediately even if the execution was a failure. The specification for several RegExp.prototype methods (e.g. exec and @@split) relies on this behavior.
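The behavior can be sketched with a simple search loop. This is an assumption-laden illustration: `matches_at` and `find_match` are hypothetical helpers that match a literal needle, not the LibRegex execution loop.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: does `needle` occur at exactly `start`?
bool matches_at(std::string_view input, std::string_view needle, std::size_t start)
{
    return input.substr(start, needle.size()) == needle;
}

// Tries successive start positions; with `sticky` set, only the first
// position is attempted and a failure there ends the search.
std::optional<std::size_t> find_match(std::string_view input, std::string_view needle,
    std::size_t start_index, bool sticky)
{
    assert(start_index <= input.size());
    for (std::size_t start = start_index; start <= input.size(); ++start) {
        if (matches_at(input, needle, start))
            return start;
        if (sticky)
            break; // do not advance to later positions
    }
    return std::nullopt;
}
```

Without the early `break`, a sticky search would keep scanning forward and could report a match at a later position than the requested start index.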
2021-06-30 | LibRegex: Make regex::Regex move-constructible and move-assignable | Andrew Kaster
For some reason the default move constructor and default move-assign operator were deleted, so we explicitly default them instead.
2021-05-21 | Revert "Userland: static vs non-static constexpr variables" | Linus Groh
This reverts commit 800ea8ea969835297dc7e7da345a45b9dc5e751a. Booting the system no longer worked after these changes.
2021-05-21 | Userland: static vs non-static constexpr variables | Lenny Maiorani
Problem:
- `static` variables consume memory and sometimes are less optimizable.
- `static const` variables can be `constexpr`, usually.
- `static` function-local variables require an initialization check every time the function is run.
Solution:
- If a global `static` variable is only used in a single function then move it into the function and make it non-`static` and `constexpr`.
- Make all global `static` variables `constexpr` instead of `const`.
- Change function-local `static const[expr]` variables to be just `constexpr`.
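The function-local rewrite listed in the solution could look like the following. The function names and the constant are hypothetical, and the sketch shows only the shape of the transformation; it makes no claim about its effect on code generation or on the boot failure that led to the revert.

```cpp
#include <cstddef>

// Before: a function-local `static const` variable.
std::size_t table_size_before()
{
    static const std::size_t size = 64;
    return size;
}

// After: the same value as a plain `constexpr` local.
std::size_t table_size_after()
{
    constexpr std::size_t size = 64;
    static_assert(size == 64, "usable in constant expressions");
    return size;
}
```

Both functions return the same value; the difference is only in the storage class of the local constant.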
2021-04-22 | Everything: Move to SPDX license identifiers in all files. | Brian Gianforcaro
SPDX License Identifiers are a more compact / standardized way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
    ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-01-12 | Libraries: Move to Userland/Libraries/ | Andreas Kling