Age | Commit message | Author |
checking functions
|
Okay, so this is kind of a mega-commit of a lot of performance-related changes
to rlua, some of which are pretty complicated.
There are some small improvements here and there, but most of the benefits of
this change are from a few big changes. The simplest big change is that there
is now `protect_lua` as well as `protect_lua_call`, which allows skipping a
lightuserdata parameter and some stack manipulation in some cases. Second
simplest is the change to use Vec instead of VecDeque for MultiValue, and to
have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti /
FromLuaMulti still work correctly.
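A rough sketch of the "backwards-only" Vec idea, not the actual rlua code: the values are stored in reverse, so the logical front of the sequence is the end of the Vec and both prepending and taking the first value are plain `push` / `pop` calls. The type `RevMulti` and all of its methods are hypothetical names made up for this illustration.

```rust
/// Hypothetical stand-in for a multi-value container (not the rlua type).
/// Values are kept in reverse, so the logical first value is the LAST
/// element of the underlying Vec.
#[derive(Debug)]
pub struct RevMulti<T> {
    rev: Vec<T>,
}

impl<T> RevMulti<T> {
    /// Creates an empty container.
    pub fn new() -> Self {
        RevMulti { rev: Vec::new() }
    }

    /// Builds the container from values given in logical (front-first) order.
    pub fn from_vec(mut values: Vec<T>) -> Self {
        values.reverse();
        RevMulti { rev: values }
    }

    /// Prepends a value; amortized O(1) because it is a Vec push.
    pub fn push_front(&mut self, value: T) {
        self.rev.push(value);
    }

    /// Removes the logical first value; O(1) because it is a Vec pop.
    pub fn pop_front(&mut self) -> Option<T> {
        self.rev.pop()
    }

    /// Number of stored values.
    pub fn len(&self) -> usize {
        self.rev.len()
    }

    /// Consumes the container and returns the values in logical order.
    pub fn into_vec(mut self) -> Vec<T> {
        self.rev.reverse();
        self.rev
    }
}
```

With this layout the front-end operations that VecDeque was presumably used for stay cheap, while the storage is a single contiguous Vec.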
The most complex change, though, is a change to the way LuaRef works, so that
LuaRef can optionally point into the Lua stack instead of only registry values.
At state creation a set number of stack slots is reserved for the first N LuaRef
types (currently 16), and space for these is also allocated separately at
callback time. There is a huge breaking change here, which is that
now any LuaRef types MUST only be used with the Lua on which they were created,
and CANNOT be used with any other Lua callback instance. This mostly will
affect people using LuaRef types from inside a scope callback, but hopefully in
those cases `Function::bind` will be a suitable replacement. On the plus side,
the rules for LuaRef types are easier to state now.
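To make the slot scheme above a bit more concrete, here is a toy model with no real Lua state involved: a pool hands out one of a fixed number of reserved stack slots while any are free, and falls back to a registry-style key otherwise. `RefSlot`, `RefPool`, and the slot reuse on release are assumptions for the sake of illustration, not a description of how rlua actually implements this.

```rust
/// Where a handle's value lives in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RefSlot {
    /// Index into the reserved area at the start of the stack.
    Stack(usize),
    /// Key in a registry-like fallback store.
    Registry(usize),
}

/// Hypothetical allocator for handle slots.
pub struct RefPool {
    /// Reserved stack slots that are currently unused.
    free_stack: Vec<usize>,
    /// Next fallback key to hand out.
    next_registry: usize,
}

impl RefPool {
    /// Reserves `reserved` stack slots up front (the commit mentions 16).
    pub fn new(reserved: usize) -> Self {
        RefPool {
            // Reversed so that slot 0 is handed out first.
            free_stack: (0..reserved).rev().collect(),
            next_registry: 0,
        }
    }

    /// Prefers a cheap stack slot, falling back to the registry when
    /// all reserved slots are taken.
    pub fn acquire(&mut self) -> RefSlot {
        match self.free_stack.pop() {
            Some(index) => RefSlot::Stack(index),
            None => {
                let key = self.next_registry;
                self.next_registry += 1;
                RefSlot::Registry(key)
            }
        }
    }

    /// Returns a stack slot to the pool; registry keys are simply dropped
    /// in this sketch.
    pub fn release(&mut self, slot: RefSlot) {
        if let RefSlot::Stack(index) = slot {
            self.free_stack.push(index);
        }
    }
}
```

A slot index like this only makes sense relative to the stack it was reserved on, which is one way to see why handles can no longer be moved between a Lua instance and a callback instance.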
There is probably more easy-ish perf on the table here, but here are the
preliminary results, based on my very limited benchmarks:
create table              time:   [314.13 ns 315.71 ns 317.44 ns]
                          change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05)
create array 10           time:   [2.9731 us 2.9816 us 2.9901 us]
                          change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05)
                          Performance has improved.
create string table 10    time:   [5.6904 us 5.7164 us 5.7411 us]
                          change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05)
                          Performance has improved.
call add function 3 10    time:   [5.1134 us 5.1222 us 5.1320 us]
                          change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05)
                          Performance has improved.
call callback add 2 10    time:   [5.4408 us 5.4480 us 5.4560 us]
                          change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05)
                          Performance has improved.
call callback append 10   time:   [9.8243 us 9.8410 us 9.8586 us]
                          change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05)
                          Performance has improved.
create registry 10        time:   [3.7005 us 3.7089 us 3.7174 us]
                          change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05)
                          Performance has improved.
I think that a lot of these benchmarks are too "easy", and most API usage is
going to be more like the 'create string table 10' benchmark, where there are a
lot of handles and tables and strings, so I think that 25%-50% improvement is a
good guess for most use cases.
|
|
The expected change is always zero, because stack_guard / stack_err_guard are
always used at `rlua` entry / exit points.
|
|
Previously, on an internal panic, the Lua stack would be reset before panicking
in an attempt to make sure that such panics would not cause stack leaks or leave
the stack in an unknown state. Now, such panic handling is done in stack_guard
and stack_err_guard instead, and this is for a few reasons:
1) The previous approach did NOT handle user-triggered panics that were outside
   of `rlua`, such as a panic in a ToLua / FromLua implementation. This is
   especially bad since most other panics would be indicative of an internal bug
   anyway, so the utility of keeping `rlua` types usable after such panics was
   questionable. It is much more sensible to ensure that `rlua` types are
   usable after *user generated* panics.
2) Every entry point into `rlua` should be guarded by a stack_guard or
   stack_err_guard anyway, so this should restore the Lua stack on exiting back
   to user code in all cases.
3) The method of stack restoration no longer *clears* the stack, only resets it
   to what it previously was. This allows us, potentially, to keep values at
   the beginning of the Lua stack long term and know that panics will not
   clobber them. There may be a way of dramatically speeding up ref types by
   using a small static area at the beginning of the stack instead of only the
   registry, so this may be important.
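The restore-on-panic behavior described above can be sketched roughly as follows, using a plain Vec as a stand-in for the Lua stack. This is only an illustration under that assumption: `stack_guard_sketch` and its `expected_change` parameter are hypothetical, and the real stack_guard operates on a Lua state rather than a Vec.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Runs `op` against a toy stack.
///
/// On normal exit, checks that the stack height changed by exactly
/// `expected_change`. On panic, truncates the stack back to its entry
/// height (without clearing what was below it) and then resumes the panic.
pub fn stack_guard_sketch<R>(
    stack: &mut Vec<i64>,
    expected_change: isize,
    op: impl FnOnce(&mut Vec<i64>) -> R,
) -> R {
    let entry_height = stack.len();
    // AssertUnwindSafe is acceptable here because the stack is repaired
    // below before anyone else can observe it.
    match panic::catch_unwind(AssertUnwindSafe(|| op(&mut *stack))) {
        Ok(result) => {
            let change = stack.len() as isize - entry_height as isize;
            assert_eq!(change, expected_change, "unexpected stack height change");
            result
        }
        Err(payload) => {
            // Drop only what was pushed after entry; values that were
            // already on the stack stay where they were. (Values popped
            // below the entry height cannot be recovered by this sketch.)
            stack.truncate(entry_height);
            panic::resume_unwind(payload)
        }
    }
}
```

Because the guard resets the height instead of emptying the stack, anything kept at the bottom of the stack survives a panic in user code, which is the property point 3 relies on.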
|
This cannot be accomplished without using unsafe code, which justifies this addition in my opinion.
Also changes "null" to "nul" to be in sync with `std::ffi` docs. Naming is derived from `CStr::to_bytes_with_nul`, using `as_*` instead of `to_*` since this isn't doing any computation.
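For reference, the naming convention mentioned above looks like this in practice. `CStr::to_bytes` and `CStr::to_bytes_with_nul` are the real `std::ffi` methods; `NulBuf` is a hypothetical type invented for this sketch (it is not the rlua string type) and only shows how an `as_*` pair can hand out borrowed views of an already nul-terminated buffer without copying.

```rust
use std::ffi::CStr;

/// Hypothetical owner of a nul-terminated byte buffer.
/// Invariant: the last byte of `bytes` is always 0.
pub struct NulBuf {
    bytes: Vec<u8>,
}

impl NulBuf {
    /// Copies `content` and appends the nul terminator.
    pub fn new(content: &[u8]) -> Self {
        let mut bytes = content.to_vec();
        bytes.push(0);
        NulBuf { bytes }
    }

    /// Borrowed view of the content without the trailing nul.
    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes[..self.bytes.len() - 1]
    }

    /// Borrowed view of the content including the trailing nul.
    pub fn as_bytes_with_nul(&self) -> &[u8] {
        &self.bytes
    }
}

/// The std counterparts: returns (length without nul, length with nul).
pub fn cstr_lengths(s: &CStr) -> (usize, usize) {
    (s.to_bytes().len(), s.to_bytes_with_nul().len())
}
```

Both `as_*` methods only re-slice existing memory, which matches the reasoning for choosing `as_*` over `to_*`.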
|
|
Tests are also moved to the new string.rs file to ensure related functionality is in one place.
|
|
|