path: root/Userland/Libraries/LibJS/Parser.h
Age | Commit message | Author
2022-12-06 | Everywhere: Rename to_{string => deprecated_string}() where applicable | Linus Groh
This will make it easier to support both string types at the same time while we convert code, and to track down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.
2022-12-06 | AK+Everywhere: Rename String to DeprecatedString | Linus Groh
We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)
2022-11-27 | LibJS: Remove m_first_invalid_property_range from ObjectExpression | davidot
This was state only used by the parser to output an error with appropriate location. This shrinks the size of ObjectExpression from 120 bytes down to just 56. This saves roughly 2.5 MiB when loading twitter.
2022-11-23 | LibJS: Make FunctionNode::Parameter be a standalone FunctionParameter | Andreas Kling
This will allow us to forward declare it and avoid including AST.h in a number of places.
2022-11-23 | LibJS: Make Parser::Error a standalone ParserError class | Andreas Kling
This allows us to forward declare it and reduce the number of things that need to include Parser.h.
2022-11-22 | LibJS: Reduce AST memory usage by shrink-wrapping source range info | Andreas Kling
Before this change, each AST node had a 64-byte SourceRange member. This SourceRange had the following layout:

    filename: StringView (16 bytes)
    start: Position (24 bytes)
    end: Position (24 bytes)

The Position structs have { line, column, offset }, all members size_t.

To reduce memory consumption, AST nodes now only store the following:

    source_code: NonnullRefPtr<SourceCode> (8 bytes)
    start_offset: u32 (4 bytes)
    end_offset: u32 (4 bytes)

SourceCode is a new ref-counted data structure that keeps the filename and original parsed source code in a single location, and all AST nodes have a pointer to it.

The start_offset and end_offset can be turned into (line, column) when necessary by calling SourceCode::range_from_offsets(). This will walk the source code string and compute line/column numbers on the fly, so it's not necessarily fast, but it should be rare since this information is primarily used for diagnostics and exception stack traces.

With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a ~23% reduction in memory usage when loading twitter.com/awesomekling (330 MiB before, 253 MiB after!) :^)
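The offset-to-position lookup described above could look roughly like the following. This is a minimal sketch using standard C++ types rather than AK ones; the class shape, member names, and 1-based line/column convention are assumptions for illustration, not the actual LibJS code. Note that std::shared_ptr is wider than the 8-byte NonnullRefPtr mentioned in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the types named in the commit message.
struct Position {
    std::size_t line { 1 };
    std::size_t column { 1 };
    std::size_t offset { 0 };
};

struct SourceRange {
    Position start;
    Position end;
};

class SourceCode {
public:
    SourceCode(std::string filename, std::string code)
        : m_filename(std::move(filename))
        , m_code(std::move(code))
    {
    }

    std::string const& filename() const { return m_filename; }

    // Walks the source text once and records (line, column) at both offsets.
    // Linear in end_offset, so it is only meant for diagnostics.
    SourceRange range_from_offsets(std::uint32_t start_offset, std::uint32_t end_offset) const
    {
        assert(start_offset <= end_offset);
        assert(end_offset <= m_code.size());

        SourceRange range;
        Position current;
        for (std::size_t i = 0; i <= end_offset; ++i) {
            current.offset = i;
            if (i == start_offset)
                range.start = current;
            if (i == end_offset) {
                range.end = current;
                break;
            }
            if (m_code[i] == '\n') {
                ++current.line;
                current.column = 1;
            } else {
                ++current.column;
            }
        }
        return range;
    }

private:
    std::string m_filename;
    std::string m_code;
};

// What each AST node would keep: one shared pointer plus two 32-bit offsets.
struct CompactNodeRange {
    std::shared_ptr<SourceCode const> source_code;
    std::uint32_t start_offset { 0 };
    std::uint32_t end_offset { 0 };
};
```

The trade-off is the one the commit describes: nodes stay small, and the cost of computing line and column is paid only when a diagnostic or stack trace actually needs them.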
2022-10-24 | AK+Everywhere: Turn bool keep_empty to an enum in split* functions | demostanis
2022-09-02 | LibJS: Allow anonymous functions as default exports | davidot
This requires a special case with names as the default function is supposed to have a unique name ("*default*" in our case) but when checked should have name "default".
2022-08-23 | LibJS: Replace GlobalObject with VM in remaining AOs [Part 19/19] | Linus Groh
2022-08-17 | LibJS: Allow invalid string in tagged template literals | davidot
Since tagged template literals can inspect the raw string it is not a syntax error to have invalid escapes. However the cooked value should be `undefined`. We accomplish this by tracking whether parse_string_literal fails and then using a NullLiteral (since UndefinedLiteral is not a thing) and finally converting null in tagged template execution to undefined.
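As an illustration of the raw/cooked split, here is a minimal sketch in standard C++. It models the missing cooked value with std::optional instead of the NullLiteral approach the commit describes, and it only understands `\n`, `\\` and `\xHH`; real template literals also have `\u`, octal and line-continuation rules that are not modelled here. The function names are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

namespace {

bool is_hex_digit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return c - 'A' + 10;
}

}

// Returns the cooked string, or std::nullopt when an escape is malformed.
// A tagged template would then expose `undefined` as the cooked value while
// still handing the untouched raw text to the tag function.
std::optional<std::string> cook_template_string(std::string_view raw)
{
    std::string cooked;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] != '\\') {
            cooked.push_back(raw[i]);
            continue;
        }
        if (++i == raw.size())
            return std::nullopt; // Backslash with nothing after it.
        switch (raw[i]) {
        case 'n':
            cooked.push_back('\n');
            break;
        case 'x':
            // \x must be followed by exactly two hex digits.
            if (i + 2 >= raw.size() || !is_hex_digit(raw[i + 1]) || !is_hex_digit(raw[i + 2]))
                return std::nullopt;
            cooked.push_back(static_cast<char>(hex_value(raw[i + 1]) * 16 + hex_value(raw[i + 2])));
            i += 2;
            break;
        default:
            // Simplification: treat everything else as an identity escape.
            cooked.push_back(raw[i]);
            break;
        }
    }
    return cooked;
}
```

For an untagged template the caller would turn the empty optional into a syntax error; for a tagged one it would keep going and pass the raw text along.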
2022-07-12 | Everywhere: Add sv suffix to strings relying on StringView(char const*) | sin-ack
Each of these strings would previously rely on StringView's char const* constructor overload, which would call __builtin_strlen on the string. Since we now have operator ""sv, we can replace these with much simpler versions. This opens the door to being able to remove StringView(char const*). No functional changes.
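The same idea exists in the C++ standard library, which makes for a convenient illustration. The sketch below uses std::string_view and the standard `operator""sv`, not AK's StringView; the names `from_c_string` and `keyword` are made up for the example.

```cpp
#include <string_view>

using namespace std::literals::string_view_literals;

// Constructing from a bare pointer has to scan for the terminating NUL
// to find the length, just like a StringView(char const*) constructor.
inline std::string_view from_c_string(char const* text)
{
    return std::string_view(text);
}

// With the literal suffix the length is part of the literal itself,
// so nothing needs to be scanned at run time.
constexpr std::string_view keyword = "function"sv;
static_assert(keyword.size() == 8, "length is known at compile time");
```

Both forms produce equal views for ordinary literals, which matches the commit's note that the change is not functional.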
2022-07-06 | LibJS: Properly compute the line for source location hints | DexesTTP
These were obviously wrong uses of the old default "only first occurrence" parameter that was used in String::replace.
2022-07-06 | AK: Use an enum instead of a bool for String::replace(all_occurences) | DexesTTP
This commit has no behavior changes. In particular, this does not fix any of the wrong uses of the previous default parameter (which used to be 'false', meaning "only replace the first occurrence in the string"). It simply replaces the default uses by String::replace(..., ReplaceMode::FirstOnly), leaving them incorrect.
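To show why an enum reads better than a defaulted bool at the call site, here is a small sketch over std::string. The free function `replaced` and the `ReplaceMode::All` enumerator are assumptions for illustration; only `ReplaceMode::FirstOnly` is named in the commit message, and this is not the AK implementation.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

enum class ReplaceMode {
    All,
    FirstOnly,
};

// Returns a copy of `haystack` with `needle` replaced by `replacement`.
// The mode has no default, so every caller must state which one it wants.
std::string replaced(std::string_view haystack, std::string_view needle,
    std::string_view replacement, ReplaceMode mode)
{
    if (needle.empty())
        return std::string(haystack);

    std::string result;
    std::size_t position = 0;
    while (true) {
        std::size_t found = haystack.find(needle, position);
        if (found == std::string_view::npos)
            break;
        result.append(haystack.substr(position, found - position));
        result.append(replacement);
        position = found + needle.size();
        if (mode == ReplaceMode::FirstOnly)
            break;
    }
    result.append(haystack.substr(position));
    return result;
}
```

A call such as `replaced(line, "\t", " ", ReplaceMode::FirstOnly)` makes it visible at a glance that later tabs are left alone, which is the kind of mistake the companion LibJS commit had to fix.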
2022-04-11 | LibJS: Add missing steps and spec comments to PerformEval | Luke Wilde
While adding spec comments to PerformEval, I noticed we were missing multiple steps. Namely, these were:
- Checking if the host will allow us to compile the string (allowing LibWeb to perform CSP for eval)
- The parser's initial state depending on the environment around us on direct eval:
  - Allowing new.target via eval in functions
  - Allowing super calls and super properties via eval in classes
  - Disallowing the use of the arguments object in class field initializers at eval's parse time
- Setting ScriptOrModule of eval's execution context

The spec allows us to apply the additional parsing steps in any order. The method I have gone with is passing in a struct to the parser's constructor, which overrides the parser's initial state to (dis)allow the things stated above from the get-go.
2022-04-01 | Everywhere: Run clang-format | Idan Horowitz
2022-02-16 | LibJS: Fix mixing of logical and coalescing operators | Anonymous
The same expression is not allowed to contain both the logical && and || operators, and the coalescing ?? operator. This patch changes how "forbidden" tokens are handled, using a finite set instead of a Vector. This supports much more efficient merging of the forbidden tokens when propagating forward, and allows forbidden tokens to be returned to parent contexts.
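One way to picture a finite forbidden-token set is a small bitmask, where merging two sets is a single OR. The sketch below is an assumption-laden illustration: the token list, the class layout and the `after_operator` helper are hypothetical and do not reproduce the LibJS class.

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical subset of token kinds that can be forbidden in a context.
enum class TokenType : std::uint8_t {
    In,
    LogicalAnd,
    LogicalOr,
    NullishCoalescing,
};

class ForbiddenTokens {
public:
    constexpr ForbiddenTokens() = default;
    constexpr ForbiddenTokens(std::initializer_list<TokenType> tokens)
    {
        for (TokenType token : tokens)
            m_bits = static_cast<std::uint8_t>(m_bits | bit(token));
    }

    constexpr bool allows(TokenType token) const { return (m_bits & bit(token)) == 0; }

    // Union of two sets: constant time, no allocation.
    constexpr ForbiddenTokens merge(ForbiddenTokens other) const
    {
        ForbiddenTokens merged;
        merged.m_bits = static_cast<std::uint8_t>(m_bits | other.m_bits);
        return merged;
    }

    // Set to use for the rest of the expression once `consumed` was parsed:
    // && and || forbid a following ??, and ?? forbids a following && or ||.
    constexpr ForbiddenTokens after_operator(TokenType consumed) const
    {
        switch (consumed) {
        case TokenType::LogicalAnd:
        case TokenType::LogicalOr:
            return merge({ TokenType::NullishCoalescing });
        case TokenType::NullishCoalescing:
            return merge({ TokenType::LogicalAnd, TokenType::LogicalOr });
        default:
            return *this;
        }
    }

private:
    static constexpr std::uint8_t bit(TokenType token)
    {
        return static_cast<std::uint8_t>(1u << static_cast<unsigned>(token));
    }

    std::uint8_t m_bits { 0 };
};
```

Because the set is a plain value, a parsing function can both receive one from its caller and return an updated one, which is the "return to parent contexts" part of the commit.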
2022-02-15 | LibJS: Fix cases where we incorrectly allowed 'in' in for loops | Anonymous
We needed to propagate the forbidden token set to all parsing functions that can call back into parse_expression.
2022-02-13 | LibJS: Make more use of Token::flystring_value() | Andreas Kling
This patch makes check_identifier_name_for_assignment_validity() take a FlyString instead of a StringView. We then exploit this by passing FlyString in more places via flystring_value(). This gives a ~1% speedup when parsing the largest Discord JS file.
2022-02-09 | LibJS: Replace uses of MarkedValueList with MarkedVector<Value> | Linus Groh
This is effectively a drop-in replacement.
2022-01-22 | LibJS: Make parsing import and export entries follow the spec | davidot
The big changes are:
- Allow strings as Module{Export, Import}Name
- Properly track declarations in default export statements

However, the spec is a little strange in that it allows function and class declarations without a name in default export statements. This is quite hard to fully implement without rewriting more of the parser so for now this behavior is emulated by faking things with function and class expressions. See the comments in parse_export_statement for details on the hacks and where it goes wrong.
2022-01-19 | LibJS: Capture source text of FunctionNode and ClassExpression | Linus Groh
2022-01-16 | LibJS: Implement create_dynamic_function() according to the spec | Linus Groh
The three major changes are:
- Parsing parameters, the function body, and then the full assembled function source all separately. This is required by the spec, as function parameters and body must be valid each on their own, which cannot be guaranteed if we only ever parse the full function.
- Returning an ECMAScriptFunctionObject instead of a FunctionExpression that needs to be evaluated separately. This vastly simplifies the {Async,AsyncGenerator,Generator,}Function constructor implementations. Drop '_node' from the function name accordingly.
- The prototype is now determined via GetPrototypeFromConstructor and passed to OrdinaryFunctionCreate.
2022-01-06 | LibJS: Replace the custom unwind mechanism with completions :^) | Linus Groh
This includes:
- Parsing proper LabelledStatements with try_parse_labelled_statement()
- Removing LabelableStatement
- Implementing the LoopEvaluation semantics via loop_evaluation() in each IterationStatement subclass; and IterationStatement evaluation via {For,ForIn,ForOf,ForAwaitOf,While,DoWhile}Statement::execute()
- Updating ReturnStatement, BreakStatement and ContinueStatement to return the appropriate completion types
- Basically reimplementing TryStatement and SwitchStatement according to the spec, using completions
- Honoring result completion types in AsyncBlockStart and OrdinaryCallEvaluateBody
- Removing any uses of the VM unwind mechanism - most importantly, VM::throw_exception() now exclusively sets an exception and no longer triggers any unwinding mechanism. However, we already did a good job updating all of LibWeb and userland applications to not use it, and the few remaining uses elsewhere don't rely on unwinding AFAICT.
2021-12-21 | LibJS: Parse assert clauses of in- and export statements | davidot
Based on proposal: https://tc39.es/proposal-import-assertions
Since imports are not supported yet this is not functional.
2021-11-30 | LibJS: Split parsing program to script and module separately | davidot
This allows us to only perform checks like export bindings existing only for modules. Also this makes it easier to set strict and other state variables with TemporaryChanges.
2021-11-30 | LibJS: Rename in_async_function_context to await_expression_is_valid | davidot
Since await can be valid in module code which is not an async function the old name is not really representative for the usage.
2021-11-30 | LibJS: Parse dynamic import calls 'import()' and 'import.meta' | davidot
For now both just throw when executing but this can be implemented when modules are implemented :^).
2021-11-29 | LibJS: Implement parsing and executing for-await-of loops | davidot
2021-11-21 | LibJS: Parse async arrow functions | davidot
2021-11-11 | Everywhere: Pass AK::StringView by value | Andreas Kling
2021-11-10 | LibJS: Add support for await expressions | Idan Horowitz
2021-11-10 | LibJS: Add support for async functions | Idan Horowitz
This commit adds support for the most bare bones version of async functions. Support for async generator functions, async arrow functions and await expressions is still TODO.
2021-10-20 | LibJS: Add parsing and evaluation of private fields and methods | davidot
2021-10-15 | LibJS: Do not save state for peeking at the next token from the lexer | davidot
This saves having to save and load the parser state. This could give an incorrect token in some cases where the parser communicates to the lexer. However this is not applicable in any of the current usages and this would require one to parse the current token as normal which is exactly what you don't want to do in that scenario.
2021-10-08 | LibJS: Propagate "contains direct call to eval()" flag from parser | Andreas Kling
We now propagate this flag to FunctionDeclaration, and then also into ECMAScriptFunctionObject. This will be used to disable optimizations that aren't safe in the presence of direct eval().
2021-10-08 | LibJS: Add missing initializer for ParserState::m_current_scope_pusher | Andreas Kling
2021-10-05 | LibJS: Add an optimization to avoid needless arguments object creation | Linus Groh
This gives FunctionNode a "might need arguments object" boolean flag and sets it based on the simplest possible heuristic for this: if we don't encounter an identifier called "arguments" or "eval" up to the next (nested) function declaration or expression, we won't need an arguments object. Otherwise, we *might* need one - the final decision is made in the FunctionDeclarationInstantiation AO.

Now, this is obviously not perfect. Even if you avoid eval, something like `foo.arguments` will still trigger a false positive - but it's a start and already massively cuts down on needlessly allocated objects, especially in real-world code that is often minified, and so a full "arguments" identifier will be an actual arguments object more often than not.

To illustrate the actual impact of this change, here's the number of allocated arguments objects during a full test-js run:

Before:
- Unmapped arguments objects: 78765
- Mapped arguments objects: 2455

After:
- Unmapped arguments objects: 18
- Mapped arguments objects: 37

This results in a ~5% speedup of test-js on my Linux host machine, and about 3.5% on i686 Serenity in QEMU (warm runs, average of 5).

The following microbenchmark (calling an empty function 1M times) runs 25% faster on Linux and 45% on Serenity:

    function foo() {}
    for (var i = 0; i < 1_000_000; ++i)
        foo();

test262 reports no changes in either direction, apart from a speedup :^)
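The heuristic can be sketched as a per-function flag that the parser flips whenever it sees one of the two names. Everything below (the `FunctionTracker` class, its method names, the stack layout) is hypothetical and only meant to illustrate the idea; arrow functions, which have no arguments object of their own, are not modelled.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

struct FunctionInfo {
    // Stays false unless "arguments" or "eval" shows up in this function.
    bool might_need_arguments_object { false };
};

class FunctionTracker {
public:
    // Called when the parser starts a function declaration or expression.
    void enter_function() { m_functions.push_back({}); }

    // Called when the function body has been fully parsed.
    FunctionInfo leave_function()
    {
        assert(!m_functions.empty());
        FunctionInfo info = m_functions.back();
        m_functions.pop_back();
        return info;
    }

    // Called for every identifier token. Only the innermost function is
    // marked, so a nested function does not affect its parent.
    void saw_identifier(std::string_view name)
    {
        if (m_functions.empty())
            return;
        if (name == "arguments" || name == "eval")
            m_functions.back().might_need_arguments_object = true;
    }

private:
    std::vector<FunctionInfo> m_functions;
};
```

Since the check is purely name-based, a property access like `foo.arguments` sets the flag as well if the parser reports it as an identifier, which is the false positive the commit message mentions.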
2021-10-03 | Everywhere: Use my awesome new serenityos email :^) | davidot
2021-09-30 | LibJS: Make scoping follow the spec | davidot
Before this we used an ad-hoc combination of references and 'variables' stored in a hashmap. This worked in most cases but is not spec like. Additionally hoisting, dynamically naming functions and scope analysis was not done properly.

This patch fixes all of that by:
- Implementing BindingInitialization for destructuring assignment.
- Implementing a new ScopePusher which tracks the lexical and var scoped declarations. This hoists functions to the top level if no lexical declaration name overlaps. Furthermore we do checking of redeclarations in the ScopePusher now, requiring fewer checks all over the place.
- Adding methods for parsing the directives and statement lists instead of having that code duplicated in multiple places. This allows declarations to be pushed to the appropriate scope more easily.
- Removing the non spec way of storing 'variables' in DeclarativeEnvironment and making Reference follow the spec instead of checking both the bindings and 'variables'.
- Removing all scoping related things from the Interpreter and instead using environments as specified by the spec. This also includes fixing that NativeFunctions did not produce a valid FunctionEnvironment, which could cause issues with callbacks and eval. All FunctionObjects now have a valid NewFunctionEnvironment implementation.
- Removing execute_statements from Interpreter and instead using ASTNode::execute everywhere. This simplifies AST.cpp as you no longer need to worry about which method to call.
- Making ScopeNodes set up their own environment. This uses four different methods specified by the spec, {Block, Function, Eval, Global}DeclarationInstantiation, with the annexB extensions.
- Implementing and using NamedEvaluation where specified.

Additionally there are fixes to things exposed by these changes to eval, {for, for-in, for-of} loops and assignment. Finally it also fixes some tests in test-js which were passing before but not now that we have correct behavior :^).
2021-09-30 | LibJS: Handle escaped keywords in more cases and handle 'await' labels | davidot
2021-09-30 | LibJS: Allow multiple labels on the same statement | davidot
Since there are only a number of statements where labels can actually be used we now also only store labels when necessary. Also now tracks the first continue usage of a label since this might not be valid but that can only be determined after we have parsed the statement. Also ensures the correct error does not get wiped by load_state.
2021-09-30 | LibJS: Allow member expressions in binding patterns | davidot
Also allows literal string and numbers as property names in object binding patterns.
2021-09-14 | LibJS: Implement parsing and execution of optional chains | Ali Mohammad Pur
2021-09-11 | AK: Replace the mutable String::replace API with an immutable version | Idan Horowitz
This removes the awkward String::replace API, which was the only String API that mutated the String, and replaces it with a new immutable version that returns a new String with the replacements applied. This also fixes a couple of UAFs that were caused by the use of this API. As an optimization, an equivalent StringView::replace API was also added to remove unnecessary String allocations of the form: `String { view }.replace(...);`
2021-09-03 | Everywhere: Prevent risky implicit casts of (Nonnull)RefPtr | Daniel Bertalan
Our existing implementation did not check the element type of the other pointer in the constructors and move assignment operators. This meant that some operations that would require explicit casting on raw pointers were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` => `Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>` in LibIMAP/Client.cpp)

This, of course, allows gross violations of the type system, and makes the need to type-check less obvious before downcasting. Luckily, while adding the `static_ptr_cast`s, only two truly incorrect usages were found; in the other instances, our casts just needed to be made explicit.
2021-09-01 | LibJS: Add support for public fields in classes | davidot
2021-09-01 | LibJS: Fix small issues in parser | davidot
- Fix some places where escaped keywords are (not) allowed.
- Be more strict about parameters for functions with 'use strict'.
- Fix that expressions statements allowed functions and classes.
- Fix that class expressions were not allowed.
- Added a new next_token() method for checking the look ahead.
- Fix that continue labels could jump to non iterating targets.
- Fix that generator functions cannot be declared in if statements.
2021-08-24 | LibJS: Disallow yield expression correctly in formal parameters | davidot
And add ZERO WIDTH NO BREAK SPACE to valid whitespace.
2021-08-16 | LibJS: Correctly handle Unicode characters in JS source text | davidot
Also recognize additional white space characters.
2021-08-16 | LibJS: Check that 'let' is followed by declaration before matching it | davidot
Since 'let' is a valid variable name (in non-strict mode), 'let' may not be the start of a declaration but just an identifier.
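A one-token lookahead is enough to illustrate the check. The sketch below uses a made-up, heavily reduced `TokenType` enum and a hypothetical helper name; it ignores strict mode and contextual keywords such as `yield` and `await`, which a real parser also has to consider.

```cpp
// Hypothetical, reduced set of token kinds for the example.
enum class TokenType {
    Let,
    Identifier,
    BracketOpen, // [
    CurlyOpen,   // {
    Equals,
    ParenOpen,
    Semicolon,
};

// 'let' only begins a declaration when the next token can start a binding:
// a name (`let x`), an array pattern (`let [a, b]`) or an object pattern
// (`let {a}`). Otherwise it is an ordinary identifier, as in `let = 5;`.
constexpr bool let_starts_declaration(TokenType current, TokenType next)
{
    if (current != TokenType::Let)
        return false;
    return next == TokenType::Identifier
        || next == TokenType::BracketOpen
        || next == TokenType::CurlyOpen;
}
```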