path: root/AK/URL.cpp
Age | Commit message | Author
2022-10-09AK+Everywhere: Fix data corruption due to code-point-to-char conversionBen Wiederhake
In particular, StringView::contains(char) is often used with a u32 code point. When this is done, the compiler will for some reason allow data corruption to occur silently. In fact, this is one of two reasons for the following OSS Fuzz issue: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49184 This is probably a very old bug. In the particular case of URLParser, AK::is_url_code_point got confused: return /* ... */ || "!$&'()*+,-./:;=?@_~"sv.contains(code_point); If code_point is a large code point that happens to have the correct lower bytes, AK::is_url_code_point is then convinced that the given code point is okay, even if it is actually problematic. This commit fixes *only* the silent data corruption due to the erroneous conversion, and does not fully resolve OSS-Fuzz#49184.
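The truncation described above can be reproduced outside of AK. The following is a minimal sketch using std::string_view rather than AK::StringView; the function names contains_truncating and contains_code_point are hypothetical and are not part of AK.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Models the bug: the 32-bit code point is narrowed to a single char,
// so only its lowest byte takes part in the search.
inline bool contains_truncating(std::string_view haystack, std::uint32_t code_point)
{
    char narrowed = static_cast<char>(code_point);
    return haystack.find(narrowed) != std::string_view::npos;
}

// One possible fix (an assumption, not necessarily what the commit does):
// refuse anything outside ASCII before narrowing.
inline bool contains_code_point(std::string_view haystack, std::uint32_t code_point)
{
    if (code_point > 0x7F)
        return false;
    return haystack.find(static_cast<char>(code_point)) != std::string_view::npos;
}
```

For example, U+0121 has the low byte 0x21, which is '!'. The truncating variant therefore reports a match against "!$&'()*+,-./:;=?@_~", while the checked variant does not.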
2022-07-12Everywhere: Add sv suffix to strings relying on StringView(char const*)sin-ack
Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.
2022-06-10AK: Make URL ApplicationXWWWFormUrlencoded encoding closer to specKarol Kosek
The implementation was based mostly on a spec note that described only the allowed characters. Instead of leaving some special characters unescaped, we escaped every special character except the disallowed characters marked as 'new in this encode set' in the spec definition.
2022-06-10AK: Append correct number of port characters when serializing a URLKarol Kosek
Instead of formatting the port as a string, the serializer appended bytes from the stack, using the port number as the length (so for port 8000 it appended 8000 bytes).
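The intended behavior can be sketched as follows, using std::string instead of AK's string builder. The function name serialize_host_and_port is hypothetical; the point is only that the port number is formatted as decimal digits rather than being used as a byte count.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends ":<port>" to the host, formatting the port as decimal text.
// The port value is never interpreted as a length.
inline std::string serialize_host_and_port(std::string const& host, std::uint16_t port)
{
    std::string out = host;
    out += ':';
    out += std::to_string(port); // port 8000 contributes the 4 characters "8000"
    return out;
}
```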
2022-04-21AK: Add `URL::create_with_help_scheme` helper functionForLoveOfCats
2022-04-10AK: Don't destructively re-encode query strings in the URL parserAndreas Kling
We were decoding and then re-encoding the query string in URLs. This round-trip caused us to lose information about plus ('+') ASCII characters encoded as "%2B".
2022-04-10AK+LibWeb: Encode ' ' as '+' in application/x-www-form-urlencodedAndreas Kling
This matches what the URL and HTML specifications ask us to do.
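As a rough illustration of the rule, here is a minimal byte-wise encoder written against the standard library, not AK. It assumes the application/x-www-form-urlencoded convention of leaving ASCII alphanumerics and '*', '-', '.', '_' untouched, writing a space as '+', and percent-encoding everything else. The name form_urlencode is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Encodes each byte of the input for use in a form-urlencoded body or query.
inline std::string form_urlencode(std::string_view input)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        bool keep = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '*' || c == '-' || c == '.' || c == '_';
        if (c == ' ') {
            out += '+'; // space becomes a plus sign
        } else if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%'; // everything else, including a literal '+', is percent-encoded
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

Because a literal '+' becomes "%2B" while a space becomes '+', the two stay distinguishable, which is the information the decode-then-re-encode round trip could lose.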
2022-04-08AK+LibHTTP: Revert prior change to percent encode plus signsGeekFiftyFive
An earlier change percent-encoded plus signs in order to fix an issue with the Google cookie consent page. Unfortunately, that treated a symptom of the problem rather than its root cause, and it is incorrect behavior.
2022-04-02AK+LibHTTP: Ensure plus signs are percent encoded in query stringGeekFiftyFive
Adds a new optional parameter 'reserved_chars' to AK::URL::percent_encode. This new optional parameter allows the caller to specify custom characters to be percent encoded. This is then used to percent encode plus signs by HttpRequest::to_raw_request.
2021-11-11Everywhere: Pass AK::StringView by valueAndreas Kling
2021-09-14AK: Make URL::m_port an Optional<u16>, Expose raw port getterIdan Horowitz
Our current way of signalling a missing port with m_port == 0 was lacking, as 0 is a valid port number in URLs.
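The difference between "no port" and "port 0" can be sketched with std::optional, which plays the same role here as AK's Optional. The helper name port_or_default is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the explicit port if one was given, otherwise the scheme default.
// An explicit port of 0 is a real value and is returned as-is.
inline std::uint16_t port_or_default(std::optional<std::uint16_t> port, std::uint16_t default_port)
{
    return port.value_or(default_port);
}
```

With a plain integer and 0 as the sentinel, the two calls in the usage below could not be told apart: port_or_default(std::nullopt, 80) yields 80, while port_or_default(std::uint16_t{0}, 80) yields 0.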
2021-09-14AK: Accept optional url and state override parameters in URLParserIdan Horowitz
These are required by the specification and used by the web's URL built-in. This commit also removes the Badge<AK::URL> from URLParser so that other classes that need to call the parser directly, such as the web's URL built-in, can do so.
2021-09-14AK: Add URL::serialize_origin based on HTML's origin definitionIdan Horowitz
2021-06-30AK: Remove the LexicalPath::is_valid() APIMax Wipfli
Since this is always set to true on the non-default constructor and subsequently never modified, it is somewhat pointless. Furthermore, there are arguably no invalid relative paths.
2021-06-03Everywhere: Replace ctype.h to avoid narrowing conversionsMax Wipfli
This replaces ctype.h with CharacterType.h everywhere I could find issues with narrowing conversions. While using it will probably make sense almost everywhere in the future, the most critical places should have been addressed.
2021-06-01AK: Move identity check from URL::operator==() to equals()Max Wipfli
2021-06-01AK: Use correct constness in URL class methodsMax Wipfli
This changes the URL class to use the correct constness for getters, setters and other methods. It also changes the entire class to use east const style.
2021-06-01AK: Add hostname parameter to URL::create_with_file_scheme()Max Wipfli
This adds a hostname parameter as the third parameter to URL::create_with_file_scheme(). If the hostname is "localhost", it will be ignored (as per the URL specification). This can for example be used by ls(1) to create more conforming file URLs.
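The "localhost" rule can be sketched as below. This is a simplified stand-in built on std::string, not the AK implementation: the function name make_file_url and its two-parameter signature are assumptions for illustration (the real helper takes the hostname as its third parameter), and no percent-encoding of the path is attempted.

```cpp
#include <cassert>
#include <string>

// Builds a file URL from an absolute path and an optional hostname.
// A hostname of "localhost" is dropped, as is an empty hostname.
inline std::string make_file_url(std::string const& path, std::string const& hostname = {})
{
    std::string url = "file://";
    if (!hostname.empty() && hostname != "localhost")
        url += hostname;
    url += path;
    return url;
}
```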
2021-06-01AK: Rewrite URL::compute_validity() to conform to new parserMax Wipfli
This rewrites the URL validation check to be more specific, so it can more accurately detect if a user of URL class constructs invalid URLs by hand.
2021-06-01AK: Remove deprecated m_path member variable from URLMax Wipfli
The m_path member variable has been superseded by m_paths. Thus, it has been removed. The path() getter will continue to exist as a convenience method for getting the path joined together as a string.
2021-06-01AK: Replace URL::to_string() with new serialize() implementationMax Wipfli
2021-06-01AK: Replace old URL parser with new URLParser::parse()Max Wipfli
This replaces the old URL::parse() and URL::complete_url() parsing mechanisms with the new spec-compliant URLParser::parse().
2021-06-01AK: Add spec-compliant URL serialization methodsMax Wipfli
This adds URL serialization methods which are more in line with the specification. The serialize_for_display() method should be used e.g. in the browser address bar, and as per the spec should not display username and password. Furthermore, it could decode most percent-encoded code points, although that is not implemented yet.
2021-06-01AK: Add helper functions and private data URL constructor to URLMax Wipfli
This adds a few helper functions and a private constructor to instantiate a data URL to the URL class. These will be needed by the upcoming URL parser.
2021-06-01AK: Add member variables to the URL classMax Wipfli
This adds the m_username, m_password, m_paths and m_cannot_be_a_base_url member variables to the URL class. These are necessary for the upcoming new URL parser. The deprecated m_path variable shadows the m_paths variable if it is non-null. This behavior will be removed once the old URL parser has been removed.
2021-06-01AK+Everywhere: Replace usages of URLParser::urlencode() and urldecode()Max Wipfli
This replaces all occurrences of those functions with the newly implemented functions URL::percent_encode() and URL::percent_decode(). The old functions will be removed in a further commit.
2021-06-01AK: Implement more conforming URL percent encode/decode mechanismMax Wipfli
This adds a few new functions to percent encode/decode strings according to the URL specification. The functions allow specifying a PercentEncodeSet, which is defined by the specification. It will be used to replace the current urlencode() and urldecode() functions in a further commit. This commit adds a few duplicate helper functions in the URL class, such as is_digit() and is_ascii_digit(). This will be cleaned up as soon as the upcoming new URL parser will replace the current one.
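A minimal sketch of set-driven percent encoding and decoding follows. It is written against the standard library and only models two of the URL specification's sets (C0 control and fragment); the names PercentEncodeSet, percent_encode and percent_decode mirror the commit message, but the signatures here are assumptions rather than the AK API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

enum class PercentEncodeSet { C0Control, Fragment };

// True if the byte must be percent-encoded under the given set.
inline bool in_encode_set(unsigned char c, PercentEncodeSet set)
{
    if (c <= 0x1F || c > 0x7E) // C0 controls and everything above '~'
        return true;
    if (set == PercentEncodeSet::Fragment)
        return c == ' ' || c == '"' || c == '<' || c == '>' || c == '`';
    return false;
}

inline std::string percent_encode(std::string_view input, PercentEncodeSet set)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        if (in_encode_set(c, set)) {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

inline int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes every well-formed "%XX" triplet; malformed sequences are copied verbatim.
inline std::string percent_decode(std::string_view input)
{
    std::string out;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            int hi = hex_value(input[i + 1]);
            int lo = hex_value(input[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += input[i];
    }
    return out;
}
```

Passing the set as a parameter means the same encoder can serve several URL components; only the membership test changes.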
2021-06-01AK: Internally rename protocol to scheme in URLMax Wipfli
This renames all references to protocol to scheme, which is the name used by the URL standard (https://url.spec.whatwg.org/). Externally, all methods referencing "protocol" were duplicated with "scheme". The old methods still exist for compatibility.
2021-06-01AK: Omit unnecessary function parameter names in URLMax Wipfli
This patch removes unnecessary function parameter names in declarations of the URL class. It also changes parameter types from String to StringView where applicable.
2021-04-22Everything: Move to SPDX license identifiers in all files.Brian Gianforcaro
SPDX License Identifiers are a more compact / standardized way of representing file license information. See: https://spdx.dev/resources/use/#identifiers This was done with the `ambr` search and replace tool:
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-18AK: Add default ports for Websockets to the URL classDexesTTP
2021-03-07AK: Add optional fragment parameter to create_with_file_protocol()speles
Now that we use fragment for specifying starting selection in FileManager we would benefit from providing it as argument instead of setting it each time separately.
2020-12-12AK::URL: Fix setting the port number in the case it was the last element of the URLxspager
2020-11-04AK::URL: Check if URL requires a port set to be considered a valid URLBrendan Coles
`AK::URL` will now check if the URL requires a port to be set using `AK::URL.protocol_requires_port(protocol)`. If the URL does not specify a port, and no default port for the URL protocol is found with `AK::URL.default_port_for_protocol(protocol)`, the URL is considered to be invalid.
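The validity rule reads roughly like the sketch below. The scheme lists are assumptions chosen for illustration (only the gemini default of 1965 appears elsewhere in this log), "myproto" is a made-up scheme used to show the failing case, and the function names are hypothetical stand-ins for the helpers named in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical default-port table; deliberately incomplete.
inline std::optional<std::uint16_t> default_port_for_scheme(std::string_view scheme)
{
    if (scheme == "http") return std::uint16_t{80};
    if (scheme == "https") return std::uint16_t{443};
    if (scheme == "gemini") return std::uint16_t{1965};
    return std::nullopt;
}

// Hypothetical list of schemes that cannot work without a port.
inline bool scheme_requires_port(std::string_view scheme)
{
    return scheme == "http" || scheme == "https" || scheme == "gemini" || scheme == "myproto";
}

// A URL passes the port check if its scheme needs no port, or if a port
// is available either explicitly or through the default table.
inline bool has_usable_port(std::string_view scheme, std::optional<std::uint16_t> explicit_port)
{
    if (!scheme_requires_port(scheme))
        return true;
    return explicit_port.has_value() || default_port_for_scheme(scheme).has_value();
}
```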
2020-10-08AK: Use new format functions.asynts
2020-08-24AK: Add URL::create_with_data() to create data URLsAnotherTest
2020-06-12AK: Make string-to-number conversion helpers return OptionalAndreas Kling
Get rid of the weird old signature:
- int StringType::to_int(bool& ok) const
And replace it with sensible new signature:
- Optional<int> StringType::to_int() const
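The same shape can be sketched in standard C++ with std::optional and std::from_chars. This free function to_int is an illustration of the signature style, not the AK implementation; it assumes that the whole input must be consumed for the conversion to succeed.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Returns the parsed value, or an empty optional if the text is not
// entirely a valid int. No out-parameter is needed to report failure.
inline std::optional<int> to_int(std::string_view text)
{
    int value = 0;
    char const* begin = text.data();
    char const* end = begin + text.size();
    auto [ptr, ec] = std::from_chars(begin, end, value);
    if (ec != std::errc() || ptr != end)
        return std::nullopt;
    return value;
}
```

A caller can then write `if (auto port = to_int(text); port.has_value())` instead of declaring a separate bool and checking it afterwards.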
2020-06-10AK: URL should urldecode data: URL payloadsAndreas Kling
Otherwise we can end up with percent-encoded nonsense in base64 data which does not decode correctly.
2020-06-07AK: Don't try to complete relative data: URLsAndreas Kling
2020-05-26AK: Rename FileSystemPath -> LexicalPathSergey Bugaev
And move canonicalized_path() to a static method on LexicalPath. This is to make it clear that FileSystemPath/canonicalized_path() only perform *lexical* canonicalization.
2020-05-23AK: Fix URL::complete_url behaviour for when a fragment is passedFalseHonesty
Previously, passing a fragment string ("#section3") to the complete_url method would result in a URL that looked like "file:///home/anon/www/#section3" which was obviously incorrect. Now the result looks like "file:///home/anon/www/afrag.html#section3".
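The corrected behavior for fragment-only references can be sketched as below. This covers only the case from the commit message, a relative string beginning with '#'; anything else is returned unchanged, which is a simplification of this sketch. The function name complete_with_fragment is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Resolves a fragment-only reference against a base URL by keeping the
// base's full path and replacing (or adding) the fragment.
inline std::string complete_with_fragment(std::string_view base, std::string_view relative)
{
    if (relative.empty() || relative.front() != '#')
        return std::string(relative); // other kinds of references are out of scope here
    // Drop any fragment the base already has; find() returns npos if there is none,
    // in which case substr() keeps the whole base.
    std::string out(base.substr(0, base.find('#')));
    out.append(relative);
    return out;
}
```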
2020-05-17AK: Make sure URL retains trailing slash if present in complete_urlConrad Pankoff
2020-05-17AK: Set default port in URL to 1965 for gemini protocolConrad Pankoff
2020-05-16AK: Handle "protocol relative URLs" in URL::complete_url()Linus Groh
2020-05-10AK: Add support for about: URLsAndreas Kling
2020-05-09AK: Unbreak parsing of file:// URLs with no hostAndreas Kling
We should still accept file:/// in the URL parser. :^)
2020-05-09AK: Allow file:// URLs to have a hostnameAndreas Kling
2020-05-05AK: Add URL::basename()Andreas Kling
2020-04-26AK: Make URL::to_string() produce a data URL for data URLs :^)Andreas Kling
2020-04-26AK: Teach URL how to parse data: URLs :^)Andreas Kling