Age | Commit message | Author |
|
|
|
If we can easily communicate failure, let's avoid asserting and report
failure instead.
|
|
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
`Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
in LibIMAP/Client.cpp)
This, of course, allows gross violations of the type system, and makes
it less obvious that a type check is needed before downcasting. Luckily, while
adding the `static_ptr_cast`s, only two truly incorrect usages were
found; in the other instances, our casts just needed to be made
explicit.
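As a rough illustration of the kind of conversion involved (using
std::shared_ptr and std::static_pointer_cast as stand-ins for RefPtr and
static_ptr_cast; the class names are just examples):

    #include <memory>

    struct Inode {
        virtual ~Inode() = default;
    };
    struct ProcFSDirectoryInode : Inode { };

    int main()
    {
        std::shared_ptr<Inode> base = std::make_shared<ProcFSDirectoryInode>();

        // Implicit downcast: a type-checked smart pointer rejects this at
        // compile time, just like a raw pointer assignment would.
        // std::shared_ptr<ProcFSDirectoryInode> directory = base;

        // The cast now has to be spelled out explicitly instead:
        auto directory = std::static_pointer_cast<ProcFSDirectoryInode>(base);
        return 0;
    }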
|
|
We generally don't use Optional with nullable types, since they already
have an empty state, and having multiple empty states is confusing.
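For illustration (with std types standing in for AK::Optional and our
smart pointers):

    #include <memory>
    #include <optional>

    struct Widget { };

    // Confusing: two empty states, std::nullopt and an engaged optional
    // that holds a null pointer.
    std::optional<std::shared_ptr<Widget>> find_widget_confusing(int id);

    // Preferred: the null pointer itself already means "not found".
    std::shared_ptr<Widget> find_widget(int id);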
|
|
Classes reading from and writing to the data heap used to communicate
directly with the Heap object, transferring ByteBuffers back and forth
with it. This made things like caching and locking hard. Therefore all data
persistence activity will be funneled through a Serializer object which
in turn submits it to the Heap.
Introducing this unfortunately resulted in a huge amount of churn, in
which a number of smaller refactorings got caught up as well.
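A minimal sketch of the funneling idea (illustrative names only, not the
actual LibSQL classes):

    #include <cstdint>
    #include <vector>

    using ByteBuffer = std::vector<uint8_t>;

    class Heap {
    public:
        ByteBuffer read_block(uint32_t pointer);
        void write_block(uint32_t pointer, ByteBuffer const&);
    };

    // Instead of every class exchanging ByteBuffers with the Heap directly,
    // all reads and writes go through a single object:
    class Serializer {
    public:
        explicit Serializer(Heap& heap)
            : m_heap(heap)
        {
        }

        ByteBuffer const& load(uint32_t pointer)
        {
            // Single choke point: caching and locking can be added here.
            m_buffer = m_heap.read_block(pointer);
            return m_buffer;
        }

        void store(uint32_t pointer) { m_heap.write_block(pointer, m_buffer); }

    private:
        Heap& m_heap;
        ByteBuffer m_buffer;
    };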
|
|
This patch provides very basic, bare bones implementations of the
INSERT and SELECT statements. They are *very* limited:
- The only variant of the INSERT statement that currently works is
INSERT INTO schema.table (column1, column2, ...) VALUES
(value11, value21, ...), (value12, value22, ...), ...
where the values are literals.
- The SELECT statement is even more limited, and is only provided to
allow verification of the INSERT statement. The only form implemented
is: SELECT * FROM schema.table
These statements required a bit of change in the Statement::execute
API. Originally execute only received a Database object as parameter.
This is not enough; we now pass an ExecutionContext object which
contains the Database, the current result set, and the last Tuple read
from the database. This object will undoubtedly evolve over time.
This API change dragged SQLServer::SQLStatement into the patch.
Another API addition is Expression::evaluate. This method is,
unsurprisingly, used to evaluate expressions, like the values in the
INSERT statement.
Finally, a new test file is added: TestSqlStatementExecution, which
tests the currently implemented statements. As the number and flavour of
implemented statements grows, this test file will probably have to be
restructured.
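Sketched out, the new API has roughly this shape (a paraphrase for
illustration, not the exact LibSQL declarations):

    class Database;
    class SQLResult;
    class Tuple;
    class Value;
    template<typename T> class RefPtr; // AK smart pointer, declared only for the sketch

    struct ExecutionContext {
        Database& database;             // storage to read from and write to
        SQLResult* result { nullptr };  // result set being built
        Tuple* current_row { nullptr }; // last tuple read from the database
    };

    class Statement {
    public:
        virtual ~Statement() = default;
        // Previously this only received a Database& parameter.
        virtual RefPtr<SQLResult> execute(ExecutionContext&) const = 0;
    };

    class Expression {
    public:
        virtual ~Expression() = default;
        // Used to evaluate e.g. the literals in an INSERT's VALUES clause.
        virtual Value evaluate(ExecutionContext&) const = 0;
    };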
|
|
These are standard SQL concepts which columns should be aware of.
|
|
The implementation of the Value class was based on lambda member variables
implementing type-dependent behaviour. This was done to ensure that
Values can be used as stack-only objects; the simplest alternative,
virtual methods, forces them onto the heap. The problem with the
lambda approach is that it bloats the Values (which are supposed to be
lightweight objects) quite considerably, because every object contains
more than a dozen function pointers.
The solution to address both problems (we want Values to be able to live
on the stack and be as lightweight as possible) chosen here is to
encapsulate type-dependent behaviour and state in an implementation
class, and let the Value be an AK::Variant of those implementation
classes. All methods of Value are now basically straight delegates to
the implementation object using the Variant::visit method.
One issue complicating matters is the addition of two aggregate types,
Tuple and Array, which each contain a Vector of Values. At this point
Tuples and Arrays (and potential future aggregate types) can't contain
these aggregate types. This is limiting and needs to be addressed.
Another area that needs attention is the nomenclature of things; it's
a bit of a tangle of 'ValueBlahBlah' and 'ImplBlahBlah'. It makes sense
right now, I think, but I admit we can probably do better.
Other things included here:
- Added the Boolean and Null types (and Tuple and Array, see above).
- to_string now always succeeds and returns a String instead of an
Optional. This had some impact on other sources.
- Added a lot of tests.
- Started moving the serialization mechanism more towards where I want
it to be, i.e. a 'DataSerializer' object which just takes
serialization and deserialization requests and knows for example how
to store long strings out-of-line.
One last remark: There is obviously a naming clash between the Tuple
class and the Tuple Value type. This is intentional; I plan to make the
Tuple class a subclass of Value (and hence Key and Row as well).
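The pattern looks roughly like this (sketched with std::variant standing
in for AK::Variant; the real implementation classes differ):

    #include <string>
    #include <variant>

    struct NullImpl {
        std::string to_string() const { return "(null)"; }
    };

    struct IntegerImpl {
        int value { 0 };
        std::string to_string() const { return std::to_string(value); }
    };

    struct TextImpl {
        std::string value;
        std::string to_string() const { return value; }
    };

    class Value {
    public:
        Value() = default;
        explicit Value(int value)
            : m_impl(IntegerImpl { value })
        {
        }
        explicit Value(std::string value)
            : m_impl(TextImpl { std::move(value) })
        {
        }

        // Every operation is a straight delegate to the active
        // implementation object:
        std::string to_string() const
        {
            return std::visit([](auto const& impl) { return impl.to_string(); }, m_impl);
        }

    private:
        // The whole object lives on the stack; its size is that of the
        // largest implementation class plus the variant discriminator,
        // instead of a dozen function pointers per instance.
        std::variant<NullImpl, IntegerImpl, TextImpl> m_impl;
    };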
|
|
Tuple descriptors are basically the same for, for example, all rows in
a table. It makes sense to share them instead of copying them for every
single row.
|
|
This was exposed to the user by mistake, and even accumulated a bunch of
users that only avoided blowing up out of sheer luck.
|
|
Problem:
- Clang ToT generates warnings due to user-declared functions causing
the implicitly generated assignment operator to not be generated.
Solution:
- Declare the default constructor `= default`.
- Remove the default copy constructor declaration.
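In sketch form (illustrative, not the actual class):

    struct Example {
        // Before:
        //     Example() { }
        //     Example(Example const&) = default; // user-declared => Clang warns
        //
        // A user-declared copy constructor means the implicitly generated
        // copy assignment operator is deprecated, which Clang ToT flags.

        // After: no copy constructor declaration at all, and a defaulted
        // default constructor; the compiler generates the copy constructor
        // and copy assignment operator implicitly.
        Example() = default;
    };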
|
|
This patch introduces the SQLServer system server. This service is
supposed to be the only process/application talking to database storage.
This makes things like locking and caching more reliable, easier to
implement, and more efficient.
In LibSQL we added a client component that does the ugly IPC nitty-
gritty for you. All that's needed is setting a number of event handler
lambdas and you can connect to databases and execute statements on them.
Applications that wish to use this SQLClient class obviously need to
link LibSQL and LibIPC.
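Usage then looks roughly like the sketch below; the member and method
names here are made up for illustration and may not match the actual
SQLClient API:

    #include <functional>
    #include <string>

    class SQLClient {
    public:
        std::function<void(int statement_id)> on_execution_success;
        std::function<void(int statement_id, std::string error)> on_execution_error;

        int connect(std::string const& database_name);
        int prepare_statement(int connection_id, std::string const& statement_text);
        void execute_statement(int statement_id);
    };

    void run_query(SQLClient& client)
    {
        client.on_execution_success = [](int) {
            // fetch and display result rows here
        };
        client.on_execution_error = [](int, std::string error) {
            // report the error to the user
        };

        auto connection_id = client.connect("MyDatabase");
        auto statement_id = client.prepare_statement(connection_id,
            "SELECT * FROM default.mytable;");
        client.execute_statement(statement_id);
    }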
|
|
This patch introduces the ability to execute parsed SQL statements. The
abstract AST Statement node now has a virtual 'execute' method. This
method takes a Database object as parameter and returns a SQLResult
object.
Also introduced here is the CREATE SCHEMA statement. Tables live in a
schema, and if no schema is present in a table reference the 'default'
schema is implied. This schema is created if it doesn't yet exist when
a Database object is created.
Finally, as a proof of concept, the CREATE SCHEMA and CREATE TABLE
statements received an 'execute' implementation. The CREATE TABLE
method is not yet able to create tables from SQL queries.
|
|
The Order enum is used in the Meta component of LibSQL. Using this enum
meant having to include the monster AST/AST.h include file. Furthermore,
enums like this are sort of basic and therefore can live in the general SQL
namespace. Moved to LibSQL/Type.h.
Also introduced a new class, SQLResult, which is needed in future
patches.
|
|
SQL was standardized before consensus on sane language syntax
constructs had evolved. The language is mostly case-insensitive, with
unquoted text converted to upper case. Identifiers can include lower
case characters and other 'special' characters by enclosing the
identifier with double quotes. A double quote is escaped by doubling it.
Likewise, a single quote in a literal string is escaped by doubling it.
All this means that the strategy used in the lexer, where a token's
value is a StringView 'window' on the source string, does not work,
because the value needs to be massaged before being handed to the
parser. Therefore a token now has a String containing its value. Given
the limited lifetime of a token, this is acceptable overhead.
Not doing this means that for example quote removal and double quote
escaping would need to be done in the parser or in AST node
construction, which would spread lexing basically all over the place.
Which would be suboptimal.
There was some impact on the sql utility and SyntaxHighlighter component
which was addressed by storing the token's end position together with
the start position in order to properly highlight it.
Finally, reviewing the tests for parsing numeric literals revealed an
inconsistency in which tokens we accept or reject: `1a` is accepted but
`1e` is rejected. Related to this is the fate of `0x`. Added a FIXME
reminding us to address this.
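To illustrate the value massaging mentioned above: collapsing doubled
quotes is the kind of transformation a StringView window cannot express
(minimal sketch, not the lexer's actual code):

    #include <cstddef>
    #include <string>

    // 'source' includes the surrounding quote characters, e.g. a quoted
    // identifier "my ""quoted"" name" or a literal 'it''s'.
    std::string unescape_quoted(std::string const& source, char quote)
    {
        std::string result;
        for (std::size_t i = 1; i + 1 < source.size(); ++i) {
            if (source[i] == quote && source[i + 1] == quote) {
                result += quote; // "" -> "  and  '' -> '
                ++i;             // skip the second quote of the pair
            } else {
                result += source[i];
            }
        }
        return result;
    }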
|
|
The SQL engine is expected to be a fairly sizeable piece of software.
Therefore we're starting to restructure the codebase for growth.
|
|
This patch implements the beginnings of a database API allowing for the
creation of tables, inserting rows in those tables, and retrieving those
rows.
|
|
This patch implements a basic hash index. It uses the extendible hashing
algorithm. Also includes a test file.
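The lookup side of extendible hashing, in miniature (illustrative only;
bucket splitting and directory doubling are omitted, and this is not the
LibSQL HashIndex code):

    #include <cstdint>
    #include <memory>
    #include <vector>

    struct Bucket {
        unsigned local_depth { 1 };
        std::vector<uint32_t> keys;
    };

    struct Directory {
        // 2^global_depth entries; several entries may point at one bucket.
        unsigned global_depth { 1 };
        std::vector<std::shared_ptr<Bucket>> entries {
            std::make_shared<Bucket>(), std::make_shared<Bucket>()
        };

        Bucket& bucket_for(uint32_t hash)
        {
            // Index by the low global_depth bits of the key's hash; when a
            // bucket overflows, it is split and the directory doubles.
            auto index = hash & ((1u << global_depth) - 1);
            return *entries[index];
        }
    };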
|
|
Unfortunately this patch is quite large.
The main pieces of functionality included are a BTree index
implementation and the Heap class, which manages persistent storage.
Also included is a Key subclass of the Tuple class, which is a
specialization for index key tuples. This "dragged in" the Meta layer,
which has classes defining SQL objects like tables and indexes.
|
|
This patch adds the basic dynamic value classes used by the SQL Storage
layer. The most elementary class is Value, which holds a typed value
that can be converted to standard C++ types. A Tuple is a collection
of Values described by a TupleDescriptor, which specifies the names,
types, and ordering of the elements in the Tuple.
Tuples and Values can be serialized and deserialized to and from
ByteBuffers. This is the mechanism used to save them to disk.
Tuples are used as keys in SQL indexes and rows in SQL tables.
Also included is a test file.
|
|
Some of the code assumed that chars were always signed while that is
not the case on ARM hosts.
Also, some of the code tried to use EOF (-1) in a way similar to what
fgetc() does; however, instead of storing the characters in an int
variable, a char was used.
While this seemed to work, it also meant that character 0xFF would be
incorrectly seen as an end-of-file.
Careful reading of fgetc() reveals that fgetc() stores character
data in an int where valid characters are in the range of 0-255 and
the EOF value is explicitly outside of that range (usually -1).
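The distinction in isolation (a generic fgetc() example, not the patched
code itself):

    #include <cstdio>

    void read_all(std::FILE* file)
    {
        // Wrong: char ch = std::fgetc(file);
        // With unsigned char (e.g. on ARM) the EOF comparison never
        // succeeds; with signed char, a 0xFF byte compares equal to EOF
        // and truncates the input early.
        int ch; // valid characters are 0..255, EOF is outside that range
        while ((ch = std::fgetc(file)) != EOF) {
            // process the byte in 'ch'
        }
    }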
|
|
SQLite hasn't documented a limit on https://www.sqlite.org/limits.html
for the maximum number of nested subqueries. However, its parser is
generated with Yacc and has an internal limit of 100 for general nested
statements.
Fixes https://crbug.com/oss-fuzz/35022.
|
|
And use them to highlight JavaScript in HTML source.
This commit also changes how TextDocumentSpan::data is interpreted,
as it used to be an opaque pointer, but everyone stuffed an enum value
inside it, which made the values not unique to each highlighter;
that field is now a u64 serial id.
The syntax highlighters don't need to change their ways of stuffing
token types into that field, but a highlighter that calls another
nested highlighter needs to register the nested types for use with
token pairs.
|
|
According to the definition at https://sqlite.org/lang_expr.html, SQL
expressions could be infinitely deep. For practicality, SQLite enforces
a maximum expression tree depth of 1000. Apply the same limit in
LibSQL to avoid stack overflow in the expression parser.
Fixes https://crbug.com/oss-fuzz/34859.
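One way to enforce such a limit (a sketch with illustrative names; the
actual parser's bookkeeping differs):

    #include <cstddef>

    constexpr std::size_t maximum_expression_tree_depth = 1000;

    // RAII guard bumping the parser's current nesting depth.
    struct DepthGuard {
        explicit DepthGuard(std::size_t& depth)
            : m_depth(depth)
        {
            ++m_depth;
        }
        ~DepthGuard() { --m_depth; }

        bool exceeded() const { return m_depth > maximum_expression_tree_depth; }

        std::size_t& m_depth;
    };

    // Every recursive entry point of the expression parser then starts
    // with something like:
    //
    //     DepthGuard guard(m_current_expression_depth);
    //     if (guard.exceeded())
    //         return syntax_error("Exceeded maximum expression tree depth");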
|
|
This changes the SQL SyntaxHighlighter to conform to the now-fixed
rendering of syntax highlighting spans in GUI::TextEditor.
|
|
This changes SyntaxHighlighter.{cpp,h} to use east const style. It also
removes two unused headers and simplifies a loop.
|
|
Rather than aborting when a LIMIT clause of the form 'LIMIT expr, expr'
is encountered, fail the parser with a syntax error. This will be nicer
for the user and fixes the following fuzzer bug:
https://crbug.com/oss-fuzz/34837
|
|
SQL::CommonTableExpressionList is required to be non-empty. Return an
error if zero common table expressions were parsed.
Fixes #7627
|
|
|
|
|
|
|
|
Right now RefPtr<T> is way more lenient than it should be. That might
change in the future though.
|
|
As many macros as possible are moved to Macros.h, while the
macros to create a test case are moved to TestCase.h. TestCase is now
the only user-facing header for creating a test case. TestSuite and its
helpers have moved into a .cpp file. Instead of requiring a TEST_MAIN
macro to be instantiated into the test file, a TestMain.cpp file is
provided instead that will be linked against each test. This has the
side effect that, if we wanted to have test cases split across multiple
files, it's as simple as adding them all to the same executable.
The test main should be portable to kernel mode as well, so if
there's a set of tests that should be run in self-test mode in kernel
space, we can accommodate that.
A new serenity_test CMake function streamlines adding a new test with
arguments for the test source file, the subdirectory under /usr/Tests in
which to install the test application, and an optional list of libraries
to link against the test application. To accommodate future tests where the
provided TestMain.cpp is not suitable (e.g. test-js), a CUSTOM_MAIN
parameter can be passed to the function to not link against the
boilerplate main function.
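A test source file then boils down to something like this (using the
TEST_CASE and EXPECT macros from LibTest's TestCase.h; there is no main()
in the test file itself, it comes from the linked TestMain.cpp):

    #include <LibTest/TestCase.h>

    TEST_CASE(addition)
    {
        EXPECT_EQ(1 + 1, 2);
    }

    TEST_CASE(truthiness)
    {
        EXPECT(true);
    }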
|
|
There are 4 forms an ALTER TABLE statement can take, and each is very
distinct, so they each get their own AST node class.
|
|
This also migrates parsing of conflict resolution to a helper method,
since both INSERT and UPDATE need it.
|
|
This also adds missing '&' on a couple AST getter methods.
|
|
|
|
|
|
This also moves testing of common-table-expression to its own test case.
|
|
|
|
The EXISTS expression is a bit of an odd-man-out because it can appear
as any of the following forms:
EXISTS (select-stmt)
NOT EXISTS (select-stmt)
(select-stmt)
Which makes it the only keyword expression that doesn't require its
keyword to actually be used. The consequence is that we might come
across an EXISTS expression while parsing another expression type;
NOT would have triggered a unary operator expression, and an opening
parenthesis would have triggered an expression chain.
|
|
Currently, every parse_*_statement method ends by parsing a semi-colon.
This will prevent nested statements, e.g. a SELECT statement may be
nested in a CREATE TABLE statement. Move the semi-colon expectation up
and out of the individual statement parsers.
|
|
Another common semantic is parsing an identifier of the form
"schema_name.table_name" / "table_name". Add a helper to do this work.
This helper does not parse any optional alias after the table name.
Some syntaxes specify an alias using the AS keyword, some let the AS
keyword be optional, and others just parse it as an identifier. So
callers to this helper will just continue parsing the alias however
they require.
|
|
A quite common semantic emerged for parsing comma-separated expressions:
    consume(TokenType::ParenOpen);
    do {
        // do something
        if (!match(TokenType::Comma))
            break;
        consume(TokenType::Comma);
    } while (!match(TokenType::Eof));
    consume(TokenType::ParenClose);
Add a helper to do the bulk of the while loop.
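The helper can then look roughly like this (illustrative names, not the
exact Parser method):

    #include <functional>

    enum class TokenType { ParenOpen, ParenClose, Comma, Eof /* ... */ };

    struct ParserSketch {
        // match()/consume() stand in for the real parser's token handling.
        bool match(TokenType) const;
        void consume(TokenType);

        void parse_comma_separated_list(std::function<void()> parse_element)
        {
            consume(TokenType::ParenOpen);
            do {
                parse_element();
                if (!match(TokenType::Comma))
                    break;
                consume(TokenType::Comma);
            } while (!match(TokenType::Eof));
            consume(TokenType::ParenClose);
        }
    };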
|
|
In some syntaxes, using the 'AS' keyword to define an alias is optional.
But if it does appear, an identifier better appear afterwards.
|
|
This makes it more symmetrical with adopt_own() (which is used to
create a NonnullOwnPtr from the result of a naked new.)
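Minimal usage sketch (Foo is a hypothetical ref-counted class):

    #include <AK/NonnullOwnPtr.h>
    #include <AK/NonnullRefPtr.h>
    #include <AK/RefCounted.h>

    struct Foo : public RefCounted<Foo> { };

    void example()
    {
        auto owned = adopt_own(*new Foo);  // NonnullOwnPtr<Foo>
        auto shared = adopt_ref(*new Foo); // NonnullRefPtr<Foo>
    }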
|
|
This doesn't yet parse join clauses, windowing functions, or compound
SELECT statements.
|
|
Statements like SELECT, INSERT, and UPDATE also optionally include this
list, so move its parsing out of parse_delete_statement(). Since it will
appear before the actual statement, parse it first in next_statement();
then only parse for statements that are allowed to include the list.
|
|
Misread the graph: In the "WITH [RECURSIVE] common-table-expression"
section, common-table-expression is actually a repeating list. This
changes the parser to correctly parse this section as a list. Create a
new AST node, CommonTableExpressionList, to store both this list and the
boolean RECURSIVE attribute (because every statement that uses this list
also includes the RECURSIVE attribute beforehand).
|
|
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
|