mlua/src/util.rs

723 lines
25 KiB
Rust
Raw Normal View History

2017-05-21 18:50:59 -05:00
use std::any::Any;
2018-08-05 08:51:39 -05:00
use std::ffi::CStr;
2017-05-21 18:50:59 -05:00
use std::os::raw::{c_char, c_int, c_void};
use std::panic::{catch_unwind, resume_unwind, AssertUnwindSafe};
2018-08-05 08:51:39 -05:00
use std::sync::Arc;
use std::{mem, ptr};
2017-05-21 18:50:59 -05:00
2017-10-23 15:42:20 -05:00
use error::{Error, Result};
2018-08-05 08:51:39 -05:00
use ffi;
2017-05-21 18:50:59 -05:00
// Checks that Lua has enough free stack space for future stack operations. On failure, this will
// panic with an internal error message.
pub unsafe fn assert_stack(state: *mut ffi::lua_State, amount: c_int) {
// TODO: This should only be triggered when there is a logic error in `rlua`. In the future,
// when there is a way to be confident about stack safety and test it, this could be enabled
// only when `cfg!(debug_assertions)` is true.
rlua_assert!(
Another major API change, out of stack space is not an Err It, ahem "should not" be possible to exhaust lua stack space in normal usage, and causing stack errors to be Err is slightly obnoxious. I have been wanting to make this change for a while, and removing the callback API from tables makes this sensible *I think*. I can think of a couple of ways that this is not technically true, but I think that they are acceptable, or should be handled differently. One, you can make arbitrarily sized LuaVariadic values. I think this is maybe a bug already, because there is an argument limit in Lua which is lower than the stack limit. I'm not sure what happens there, but if it is a stack based panic, (or any panic?) it is a bug. Two, I believe that if you recurse over and over between lua -> rust -> lua -> rust etc, and call rlua API functions, you might get a stack panic. I think for trusted lua code, this is morally equivalent to a regular stack overflow in plain rust, which is already.. well it's not a panic but it's some kind of safe crash I'm not sure, so I think this is acceptable. For *untrusted* lua code, this could theoretically be a problem if the API provided a callback that would call back into lua, then some lua script could force a stack based panic. There are so many concerns with untrusted lua code, and this library is NOT safe enough yet for untrusted code (it doesn't even provide an option to limit lua to the safe API subset yet!), so this is not currently an issue. When the library provides support for "safe lua", it should come with big warnings anyway, and being able to force a stack panic is pretty minor in comparison. I think if there are other ways to cause unbounded stack usage, that it is a bug, or there can be an error just for that situation, like argument count limits. This commit also fixes several stupid bugs with tests, stack checking, and panics.
2017-06-25 15:52:32 -05:00
ffi::lua_checkstack(state, amount) != 0,
"out of stack space"
);
}
// Similar to `assert_stack`, but returns `Error::StackError` on failure.
pub unsafe fn check_stack(state: *mut ffi::lua_State, amount: c_int) -> Result<()> {
if ffi::lua_checkstack(state, amount) == 0 {
Err(Error::StackError)
} else {
Ok(())
}
}
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
pub struct StackGuard {
state: *mut ffi::lua_State,
top: c_int,
}
Another major API change, out of stack space is not an Err It, ahem "should not" be possible to exhaust lua stack space in normal usage, and causing stack errors to be Err is slightly obnoxious. I have been wanting to make this change for a while, and removing the callback API from tables makes this sensible *I think*. I can think of a couple of ways that this is not technically true, but I think that they are acceptable, or should be handled differently. One, you can make arbitrarily sized LuaVariadic values. I think this is maybe a bug already, because there is an argument limit in Lua which is lower than the stack limit. I'm not sure what happens there, but if it is a stack based panic, (or any panic?) it is a bug. Two, I believe that if you recurse over and over between lua -> rust -> lua -> rust etc, and call rlua API functions, you might get a stack panic. I think for trusted lua code, this is morally equivalent to a regular stack overflow in plain rust, which is already.. well it's not a panic but it's some kind of safe crash I'm not sure, so I think this is acceptable. For *untrusted* lua code, this could theoretically be a problem if the API provided a callback that would call back into lua, then some lua script could force a stack based panic. There are so many concerns with untrusted lua code, and this library is NOT safe enough yet for untrusted code (it doesn't even provide an option to limit lua to the safe API subset yet!), so this is not currently an issue. When the library provides support for "safe lua", it should come with big warnings anyway, and being able to force a stack panic is pretty minor in comparison. I think if there are other ways to cause unbounded stack usage, that it is a bug, or there can be an error just for that situation, like argument count limits. This commit also fixes several stupid bugs with tests, stack checking, and panics.
2017-06-25 15:52:32 -05:00
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
impl StackGuard {
// Creates a StackGuard instance with wa record of the stack size, and on Drop will check the
// stack size and drop any extra elements. If the stack size at the end is *smaller* than at
// the beginning, this is considered a fatal logic error and will result in an abort.
pub unsafe fn new(state: *mut ffi::lua_State) -> StackGuard {
StackGuard {
state,
top: ffi::lua_gettop(state),
}
}
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
}
Another major API change, out of stack space is not an Err It, ahem "should not" be possible to exhaust lua stack space in normal usage, and causing stack errors to be Err is slightly obnoxious. I have been wanting to make this change for a while, and removing the callback API from tables makes this sensible *I think*. I can think of a couple of ways that this is not technically true, but I think that they are acceptable, or should be handled differently. One, you can make arbitrarily sized LuaVariadic values. I think this is maybe a bug already, because there is an argument limit in Lua which is lower than the stack limit. I'm not sure what happens there, but if it is a stack based panic, (or any panic?) it is a bug. Two, I believe that if you recurse over and over between lua -> rust -> lua -> rust etc, and call rlua API functions, you might get a stack panic. I think for trusted lua code, this is morally equivalent to a regular stack overflow in plain rust, which is already.. well it's not a panic but it's some kind of safe crash I'm not sure, so I think this is acceptable. For *untrusted* lua code, this could theoretically be a problem if the API provided a callback that would call back into lua, then some lua script could force a stack based panic. There are so many concerns with untrusted lua code, and this library is NOT safe enough yet for untrusted code (it doesn't even provide an option to limit lua to the safe API subset yet!), so this is not currently an issue. When the library provides support for "safe lua", it should come with big warnings anyway, and being able to force a stack panic is pretty minor in comparison. I think if there are other ways to cause unbounded stack usage, that it is a bug, or there can be an error just for that situation, like argument count limits. This commit also fixes several stupid bugs with tests, stack checking, and panics.
2017-06-25 15:52:32 -05:00
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
impl Drop for StackGuard {
fn drop(&mut self) {
unsafe {
let top = ffi::lua_gettop(self.state);
if top > self.top {
ffi::lua_settop(self.state, self.top);
} else if top < self.top {
rlua_panic!("{} too many stack values popped", self.top - top);
}
}
}
}
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
// Call a function that calls into the Lua API and may trigger a Lua error (longjmp) in a safe way.
// Wraps the inner function in a call to `lua_pcall`, so the inner function only has access to a
// limited lua stack. `nargs` is the same as the the parameter to `lua_pcall`, and `nresults` is
// always LUA_MULTRET. Internally uses 2 extra stack spaces, and does not call checkstack.
// Provided function must *never* panic.
pub unsafe fn protect_lua(
state: *mut ffi::lua_State,
nargs: c_int,
f: unsafe extern "C" fn(*mut ffi::lua_State) -> c_int,
) -> Result<()> {
let stack_start = ffi::lua_gettop(state) - nargs;
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
ffi::lua_pushcfunction(state, error_traceback);
ffi::lua_pushcfunction(state, f);
if nargs > 0 {
ffi::lua_rotate(state, stack_start + 1, 2);
}
let ret = ffi::lua_pcall(state, nargs, ffi::LUA_MULTRET, stack_start + 1);
ffi::lua_remove(state, stack_start + 1);
if ret == ffi::LUA_OK {
Ok(())
2017-05-21 18:50:59 -05:00
} else {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
Err(pop_error(state, ret))
2017-05-21 18:50:59 -05:00
}
}
// Call a function that calls into the Lua API and may trigger a Lua error (longjmp) in a safe way.
// Wraps the inner function in a call to `lua_pcall`, so the inner function only has access to a
// limited lua stack. `nargs` and `nresults` are similar to the parameters of `lua_pcall`, but the
// given function return type is not the return value count, instead the inner function return
// values are assumed to match the `nresults` param. Internally uses 3 extra stack spaces, and does
// not call checkstack. Provided function must *not* panic, and since it will generally be
// lonjmping, should not contain any values that implement Drop.
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
pub unsafe fn protect_lua_closure<F, R>(
state: *mut ffi::lua_State,
nargs: c_int,
nresults: c_int,
f: F,
) -> Result<R>
where
F: Fn(*mut ffi::lua_State) -> R,
R: Copy,
{
union URes<R: Copy> {
uninit: (),
init: R,
}
struct Params<F, R: Copy> {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
function: F,
result: URes<R>,
nresults: c_int,
2017-12-02 17:37:17 -06:00
}
unsafe extern "C" fn do_call<F, R>(state: *mut ffi::lua_State) -> c_int
where
R: Copy,
F: Fn(*mut ffi::lua_State) -> R,
{
let params = ffi::lua_touserdata(state, -1) as *mut Params<F, R>;
ffi::lua_pop(state, 1);
(*params).result.init = ((*params).function)(state);
if (*params).nresults == ffi::LUA_MULTRET {
ffi::lua_gettop(state)
} else {
(*params).nresults
}
}
2017-12-02 17:37:17 -06:00
let stack_start = ffi::lua_gettop(state) - nargs;
2017-12-02 17:37:17 -06:00
ffi::lua_pushcfunction(state, error_traceback);
ffi::lua_pushcfunction(state, do_call::<F, R>);
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
if nargs > 0 {
ffi::lua_rotate(state, stack_start + 1, 2);
}
2017-12-02 17:37:17 -06:00
let mut params = Params {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
function: f,
result: URes { uninit: () },
nresults,
};
2017-12-02 17:37:17 -06:00
ffi::lua_pushlightuserdata(state, &mut params as *mut Params<F, R> as *mut c_void);
let ret = ffi::lua_pcall(state, nargs + 1, nresults, stack_start + 1);
ffi::lua_remove(state, stack_start + 1);
2017-12-02 17:37:17 -06:00
if ret == ffi::LUA_OK {
// LUA_OK is only returned when the do_call function has completed successfully, so
// params.result is definitely initialized.
Ok(params.result.init)
} else {
Err(pop_error(state, ret))
2017-12-02 17:37:17 -06:00
}
}
2017-12-02 17:37:17 -06:00
// Pops an error off of the stack and returns it. The specific behavior depends on the type of the
// error at the top of the stack:
// 1) If the error is actually a WrappedPanic, this will continue the panic.
// 2) If the error on the top of the stack is actually a WrappedError, just returns it.
// 3) Otherwise, interprets the error as the appropriate lua error.
// Uses 2 stack spaces, does not call lua_checkstack.
pub unsafe fn pop_error(state: *mut ffi::lua_State, err_code: c_int) -> Error {
rlua_debug_assert!(
err_code != ffi::LUA_OK && err_code != ffi::LUA_YIELD,
"pop_error called with non-error return code"
);
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
if let Some(err) = get_wrapped_error(state, -1).as_ref() {
ffi::lua_pop(state, 1);
err.clone()
} else if is_wrapped_panic(state, -1) {
let panic = get_userdata::<WrappedPanic>(state, -1);
if let Some(p) = (*panic).0.take() {
resume_unwind(p);
} else {
rlua_panic!("panic was resumed twice")
}
} else {
let err_string = gc_guard(state, || {
if let Some(s) = ffi::lua_tostring(state, -1).as_ref() {
CStr::from_ptr(s).to_string_lossy().into_owned()
} else {
"<unprintable error>".to_owned()
}
});
ffi::lua_pop(state, 1);
match err_code {
ffi::LUA_ERRRUN => Error::RuntimeError(err_string),
ffi::LUA_ERRSYNTAX => {
Error::SyntaxError {
// This seems terrible, but as far as I can tell, this is exactly what the
// stock Lua REPL does.
incomplete_input: err_string.ends_with("<eof>"),
message: err_string,
2017-06-24 21:26:35 -05:00
}
}
ffi::LUA_ERRERR => {
// The Lua manual documents this error wrongly: It is not raised when a message
// handler errors, but rather when some specific situations regarding stack
// overflow handling occurs. Since it is not very useful do differentiate
// between that and "ordinary" runtime errors, we handle them the same way.
Error::RuntimeError(err_string)
}
ffi::LUA_ERRMEM => {
// This should be impossible, as we set the lua allocator to one that aborts
// instead of failing.
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
rlua_abort!("impossible Lua allocation error")
}
ffi::LUA_ERRGCMM => Error::GarbageCollectorError(err_string),
_ => rlua_panic!("unrecognized lua error code"),
}
}
}
// Internally uses 4 stack spaces, does not call checkstack
2018-09-30 14:42:04 -05:00
pub unsafe fn push_string<S: ?Sized + AsRef<[u8]>>(
state: *mut ffi::lua_State,
s: &S,
) -> Result<()> {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
protect_lua_closure(state, 0, 1, |state| {
2018-09-30 14:42:04 -05:00
let s = s.as_ref();
ffi::lua_pushlstring(state, s.as_ptr() as *const c_char, s.len());
})
2017-05-21 18:50:59 -05:00
}
// Internally uses 4 stack spaces, does not call checkstack
pub unsafe fn push_userdata<T>(state: *mut ffi::lua_State, t: T) -> Result<()> {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
let ud = protect_lua_closure(state, 0, 1, move |state| {
ffi::lua_newuserdata(state, mem::size_of::<T>()) as *mut T
})?;
ptr::write(ud, t);
Ok(())
}
pub unsafe fn get_userdata<T>(state: *mut ffi::lua_State, index: c_int) -> *mut T {
let ud = ffi::lua_touserdata(state, index) as *mut T;
rlua_debug_assert!(!ud.is_null(), "userdata pointer is null");
ud
}
// Pops the userdata off of the top of the stack and returns it to rust, invalidating the lua
// userdata.
pub unsafe fn take_userdata<T>(state: *mut ffi::lua_State) -> T {
// We set the metatable of userdata on __gc to a special table with no __gc method and with
// metamethods that trigger an error on access. We do this so that it will not be double
// dropped, and also so that it cannot be used or identified as any particular userdata type
// after the first call to __gc.
get_destructed_userdata_metatable(state);
ffi::lua_setmetatable(state, -2);
let ud = ffi::lua_touserdata(state, -1) as *mut T;
rlua_debug_assert!(!ud.is_null(), "userdata pointer is null");
ffi::lua_pop(state, 1);
ptr::read(ud)
}
// Populates the given table with the appropriate members to be a userdata metatable for the given
// type. This function takes the given table at the `metatable` index, and adds an appropriate __gc
// member to it for the given type and a __metatable entry to protect the table from script access.
// The function also, if given a `members` table index, will set up an __index metamethod to return
// the appropriate member on __index. Additionally, if there is already an __index entry on the
// given metatable, instead of simply overwriting the __index, instead the created __index method
// will capture the previous one, and use it as a fallback only if the given key is not found in the
// provided members table. Internally uses 6 stack spaces and does not call checkstack.
pub unsafe fn init_userdata_metatable<T>(
state: *mut ffi::lua_State,
metatable: c_int,
members: Option<c_int>,
) -> Result<()> {
// Used if both an __index metamethod is set and regular methods, checks methods table
// first, then __index metamethod.
unsafe extern "C" fn meta_index_impl(state: *mut ffi::lua_State) -> c_int {
ffi::luaL_checkstack(state, 2, ptr::null());
ffi::lua_pushvalue(state, -1);
ffi::lua_gettable(state, ffi::lua_upvalueindex(2));
if ffi::lua_isnil(state, -1) == 0 {
ffi::lua_insert(state, -3);
ffi::lua_pop(state, 2);
1
} else {
ffi::lua_pop(state, 1);
ffi::lua_pushvalue(state, ffi::lua_upvalueindex(1));
ffi::lua_insert(state, -3);
ffi::lua_call(state, 2, 1);
1
}
}
let members = members.map(|i| ffi::lua_absindex(state, i));
ffi::lua_pushvalue(state, metatable);
if let Some(members) = members {
push_string(state, "__index")?;
ffi::lua_pushvalue(state, -1);
let index_type = ffi::lua_rawget(state, -3);
if index_type == ffi::LUA_TNIL {
ffi::lua_pop(state, 1);
ffi::lua_pushvalue(state, members);
} else if index_type == ffi::LUA_TFUNCTION {
ffi::lua_pushvalue(state, members);
protect_lua_closure(state, 2, 1, |state| {
ffi::lua_pushcclosure(state, meta_index_impl, 2);
})?;
} else {
rlua_panic!("improper __index type {}", index_type);
}
protect_lua_closure(state, 3, 1, |state| {
ffi::lua_rawset(state, -3);
})?;
}
push_string(state, "__gc")?;
ffi::lua_pushcfunction(state, userdata_destructor::<T>);
protect_lua_closure(state, 3, 1, |state| {
ffi::lua_rawset(state, -3);
})?;
push_string(state, "__metatable")?;
ffi::lua_pushboolean(state, 0);
protect_lua_closure(state, 3, 1, |state| {
ffi::lua_rawset(state, -3);
})?;
ffi::lua_pop(state, 1);
Ok(())
}
pub unsafe extern "C" fn userdata_destructor<T>(state: *mut ffi::lua_State) -> c_int {
callback_error(state, || {
take_userdata::<T>(state);
Ok(0)
})
}
2017-06-24 21:26:35 -05:00
// In the context of a lua callback, this will call the given function and if the given function
// returns an error, *or if the given function panics*, this will result in a call to lua_error (a
// longjmp). The error or panic is wrapped in such a way that when calling pop_error back on
2017-06-24 21:26:35 -05:00
// the rust side, it will resume the panic.
2017-05-21 18:50:59 -05:00
pub unsafe fn callback_error<R, F>(state: *mut ffi::lua_State, f: F) -> R
2017-06-15 09:26:39 -05:00
where
F: FnOnce() -> Result<R>,
2017-05-21 18:50:59 -05:00
{
match catch_unwind(AssertUnwindSafe(f)) {
2017-05-21 18:50:59 -05:00
Ok(Ok(r)) => r,
Ok(Err(err)) => {
ffi::lua_settop(state, 0);
ffi::luaL_checkstack(state, 2, ptr::null());
push_wrapped_error(state, err);
2017-05-21 18:50:59 -05:00
ffi::lua_error(state)
}
Err(p) => {
ffi::lua_settop(state, 0);
if ffi::lua_checkstack(state, 2) == 0 {
rlua_abort!("not enough stack space to propagate panic");
}
push_wrapped_panic(state, p);
2017-05-21 18:50:59 -05:00
ffi::lua_error(state)
}
}
}
// Takes an error at the top of the stack, and if it is a WrappedError, converts it to an
// Error::CallbackError with a traceback, if it is some lua type, prints the error along with a
// traceback, and if it is a WrappedPanic, does not modify it. This function should never panic or
// trigger a error (longjmp).
pub unsafe extern "C" fn error_traceback(state: *mut ffi::lua_State) -> c_int {
// I believe luaL_traceback requires this much free stack to not error.
const LUA_TRACEBACK_STACK: c_int = 11;
if ffi::lua_checkstack(state, 2) == 0 {
// If we don't have enough stack space to even check the error type, do nothing
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
} else if let Some(error) = get_wrapped_error(state, 1).as_ref() {
let traceback = if ffi::lua_checkstack(state, LUA_TRACEBACK_STACK) != 0 {
gc_guard(state, || {
ffi::luaL_traceback(state, state, ptr::null(), 0);
});
let traceback = CStr::from_ptr(ffi::lua_tostring(state, -1))
.to_string_lossy()
.into_owned();
ffi::lua_pop(state, 1);
traceback
} else {
"not enough stack space for traceback".to_owned()
};
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
let error = error.clone();
ffi::lua_pop(state, 1);
push_wrapped_error(
state,
Error::CallbackError {
traceback,
cause: Arc::new(error),
},
);
} else if !is_wrapped_panic(state, 1) {
if ffi::lua_checkstack(state, LUA_TRACEBACK_STACK) != 0 {
gc_guard(state, || {
let s = ffi::lua_tostring(state, 1);
let s = if s.is_null() {
cstr!("<unprintable lua error>")
} else {
s
};
ffi::luaL_traceback(state, state, s, 0);
ffi::lua_remove(state, -2);
});
}
}
1
}
2017-05-21 18:50:59 -05:00
// A variant of pcall that does not allow lua to catch panic errors from callback_error
pub unsafe extern "C" fn safe_pcall(state: *mut ffi::lua_State) -> c_int {
ffi::luaL_checkstack(state, 2, ptr::null());
let top = ffi::lua_gettop(state);
if top == 0 {
ffi::lua_pushstring(state, cstr!("not enough arguments to pcall"));
ffi::lua_error(state);
} else if ffi::lua_pcall(state, top - 1, ffi::LUA_MULTRET, 0) != ffi::LUA_OK {
2017-06-24 21:26:35 -05:00
if is_wrapped_panic(state, -1) {
2017-05-21 18:50:59 -05:00
ffi::lua_error(state);
}
ffi::lua_pushboolean(state, 0);
ffi::lua_insert(state, -2);
2
} else {
ffi::lua_pushboolean(state, 1);
ffi::lua_insert(state, 1);
ffi::lua_gettop(state)
2017-05-21 18:50:59 -05:00
}
}
// A variant of xpcall that does not allow lua to catch panic errors from callback_error
pub unsafe extern "C" fn safe_xpcall(state: *mut ffi::lua_State) -> c_int {
unsafe extern "C" fn xpcall_msgh(state: *mut ffi::lua_State) -> c_int {
ffi::luaL_checkstack(state, 2, ptr::null());
2017-06-24 21:26:35 -05:00
if is_wrapped_panic(state, -1) {
2017-05-21 18:50:59 -05:00
1
} else {
ffi::lua_pushvalue(state, ffi::lua_upvalueindex(1));
ffi::lua_insert(state, 1);
ffi::lua_call(state, ffi::lua_gettop(state) - 1, ffi::LUA_MULTRET);
ffi::lua_gettop(state)
}
}
ffi::luaL_checkstack(state, 2, ptr::null());
let top = ffi::lua_gettop(state);
if top < 2 {
ffi::lua_pushstring(state, cstr!("not enough arguments to xpcall"));
ffi::lua_error(state);
}
2017-05-21 18:50:59 -05:00
ffi::lua_pushvalue(state, 2);
ffi::lua_pushcclosure(state, xpcall_msgh, 1);
ffi::lua_copy(state, 1, 2);
ffi::lua_replace(state, 1);
2017-05-21 18:50:59 -05:00
let res = ffi::lua_pcall(state, ffi::lua_gettop(state) - 2, ffi::LUA_MULTRET, 1);
if res != ffi::LUA_OK {
2017-06-24 21:26:35 -05:00
if is_wrapped_panic(state, -1) {
2017-05-21 18:50:59 -05:00
ffi::lua_error(state);
}
ffi::lua_pushboolean(state, 0);
ffi::lua_insert(state, -2);
2
} else {
ffi::lua_pushboolean(state, 1);
ffi::lua_insert(state, 2);
ffi::lua_gettop(state) - 1
2017-05-21 18:50:59 -05:00
}
}
// Does not call lua_checkstack, uses 1 stack space.
pub unsafe fn main_state(state: *mut ffi::lua_State) -> *mut ffi::lua_State {
ffi::lua_rawgeti(state, ffi::LUA_REGISTRYINDEX, ffi::LUA_RIDX_MAINTHREAD);
let main_state = ffi::lua_tothread(state, -1);
ffi::lua_pop(state, 1);
main_state
}
// Pushes a WrappedError::Error to the top of the stack. Uses two stack spaces and does not call
// lua_checkstack.
pub unsafe fn push_wrapped_error(state: *mut ffi::lua_State, err: Error) {
gc_guard(state, || {
let ud = ffi::lua_newuserdata(state, mem::size_of::<WrappedError>()) as *mut WrappedError;
ptr::write(ud, WrappedError(err))
});
2017-05-21 18:50:59 -05:00
get_error_metatable(state);
ffi::lua_setmetatable(state, -2);
}
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
// Checks if the value at the given index is a WrappedError, and if it is returns a pointer to it,
// otherwise returns null. Uses 2 stack spaces and does not call lua_checkstack.
pub unsafe fn get_wrapped_error(state: *mut ffi::lua_State, index: c_int) -> *const Error {
let userdata = ffi::lua_touserdata(state, index);
if userdata.is_null() {
return ptr::null();
}
if ffi::lua_getmetatable(state, index) == 0 {
return ptr::null();
}
get_error_metatable(state);
let res = ffi::lua_rawequal(state, -1, -2) != 0;
ffi::lua_pop(state, 2);
if res {
&(*get_userdata::<WrappedError>(state, -1)).0
2017-06-24 21:26:35 -05:00
} else {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
ptr::null()
2017-06-24 21:26:35 -05:00
}
}
// Runs the given function with the Lua garbage collector disabled. `rlua` assumes that all
// allocation failures are aborts, so when the garbage collector is disabled, 'm' functions that can
// cause either an allocation error or a a `__gc` metamethod error are prevented from causing errors
// at all. The given function should never panic or longjmp, because this could inadverntently
// disable the gc. This is useful when error handling must allocate, and `__gc` errors at that time
// would shadow more important errors, or be extremely difficult to handle safely.
pub unsafe fn gc_guard<R, F: FnOnce() -> R>(state: *mut ffi::lua_State, f: F) -> R {
if ffi::lua_gc(state, ffi::LUA_GCISRUNNING, 0) != 0 {
ffi::lua_gc(state, ffi::LUA_GCSTOP, 0);
let r = f();
ffi::lua_gc(state, ffi::LUA_GCRESTART, 0);
r
} else {
f()
}
}
// Initialize the error, panic, and destructed userdata metatables.
pub unsafe fn init_error_metatables(state: *mut ffi::lua_State) {
assert_stack(state, 8);
// Create error metatable
unsafe extern "C" fn error_tostring(state: *mut ffi::lua_State) -> c_int {
ffi::luaL_checkstack(state, 2, ptr::null());
callback_error(state, || {
A lot of performance changes. Okay, so this is kind of a mega-commit of a lot of performance related changes to rlua, some of which are pretty complicated. There are some small improvements here and there, but most of the benefits of this change are from a few big changes. The simplest big change is that there is now `protect_lua` as well as `protect_lua_call`, which allows skipping a lightuserdata parameter and some stack manipulation in some cases. Second simplest is the change to use Vec instead of VecDeque for MultiValue, and to have MultiValue be used as a sort of "backwards-only" Vec so that ToLuaMulti / FromLuaMulti still work correctly. The most complex change, though, is a change to the way LuaRef works, so that LuaRef can optionally point into the Lua stack instead of only registry values. At state creation a set number of stack slots is reserved for the first N LuaRef types (currently 16), and space for these are also allocated separately allocated at callback time. There is a huge breaking change here, which is that now any LuaRef types MUST only be used with the Lua on which they were created, and CANNOT be used with any other Lua callback instance. This mostly will affect people using LuaRef types from inside a scope callback, but hopefully in those cases `Function::bind` will be a suitable replacement. On the plus side, the rules for LuaRef types are easier to state now. There is probably more easy-ish perf on the table here, but here's the preliminary results, based on my very limited benchmarks: create table time: [314.13 ns 315.71 ns 317.44 ns] change: [-36.154% -35.670% -35.205%] (p = 0.00 < 0.05) create array 10 time: [2.9731 us 2.9816 us 2.9901 us] change: [-16.996% -16.600% -16.196%] (p = 0.00 < 0.05) Performance has improved. create string table 10 time: [5.6904 us 5.7164 us 5.7411 us] change: [-53.536% -53.309% -53.079%] (p = 0.00 < 0.05) Performance has improved. call add function 3 10 time: [5.1134 us 5.1222 us 5.1320 us] change: [-4.1095% -3.6910% -3.1781%] (p = 0.00 < 0.05) Performance has improved. call callback add 2 10 time: [5.4408 us 5.4480 us 5.4560 us] change: [-6.4203% -5.7780% -5.0013%] (p = 0.00 < 0.05) Performance has improved. call callback append 10 time: [9.8243 us 9.8410 us 9.8586 us] change: [-26.937% -26.702% -26.469%] (p = 0.00 < 0.05) Performance has improved. create registry 10 time: [3.7005 us 3.7089 us 3.7174 us] change: [-8.4965% -8.1042% -7.6926%] (p = 0.00 < 0.05) Performance has improved. I think that a lot of these benchmarks are too "easy", and most API usage is going to be more like the 'create string table 10' benchmark, where there are a lot of handles and tables and strings, so I think that 25%-50% improvement is a good guess for most use cases.
2018-03-11 22:20:10 -05:00
if let Some(error) = get_wrapped_error(state, -1).as_ref() {
let error_str = error.to_string();
gc_guard(state, || {
ffi::lua_pushlstring(
state,
error_str.as_ptr() as *const c_char,
error_str.len(),
)
});
ffi::lua_remove(state, -2);
Ok(1)
} else {
panic!("userdata mismatch in Error metamethod");
}
})
}
ffi::lua_pushlightuserdata(
state,
&ERROR_METATABLE_REGISTRY_KEY as *const u8 as *mut c_void,
);
ffi::lua_newtable(state);
ffi::lua_pushstring(state, cstr!("__gc"));
ffi::lua_pushcfunction(state, userdata_destructor::<WrappedError>);
ffi::lua_rawset(state, -3);
ffi::lua_pushstring(state, cstr!("__tostring"));
ffi::lua_pushcfunction(state, error_tostring);
ffi::lua_rawset(state, -3);
ffi::lua_pushstring(state, cstr!("__metatable"));
ffi::lua_pushboolean(state, 0);
ffi::lua_rawset(state, -3);
ffi::lua_rawset(state, ffi::LUA_REGISTRYINDEX);
// Create panic metatable
ffi::lua_pushlightuserdata(
state,
&PANIC_METATABLE_REGISTRY_KEY as *const u8 as *mut c_void,
);
ffi::lua_newtable(state);
ffi::lua_pushstring(state, cstr!("__gc"));
ffi::lua_pushcfunction(state, userdata_destructor::<WrappedPanic>);
ffi::lua_rawset(state, -3);
ffi::lua_pushstring(state, cstr!("__metatable"));
ffi::lua_pushboolean(state, 0);
ffi::lua_rawset(state, -3);
ffi::lua_rawset(state, ffi::LUA_REGISTRYINDEX);
// Create destructed userdata metatable
unsafe extern "C" fn destructed_error(state: *mut ffi::lua_State) -> c_int {
ffi::luaL_checkstack(state, 2, ptr::null());
push_wrapped_error(state, Error::CallbackDestructed);
ffi::lua_error(state)
}
ffi::lua_pushlightuserdata(
state,
&DESTRUCTED_USERDATA_METATABLE as *const u8 as *mut c_void,
);
ffi::lua_newtable(state);
for &method in &[
cstr!("__add"),
cstr!("__sub"),
cstr!("__mul"),
cstr!("__div"),
cstr!("__mod"),
cstr!("__pow"),
cstr!("__unm"),
cstr!("__idiv"),
cstr!("__band"),
cstr!("__bor"),
cstr!("__bxor"),
cstr!("__bnot"),
cstr!("__shl"),
cstr!("__shr"),
cstr!("__concat"),
cstr!("__len"),
cstr!("__eq"),
cstr!("__lt"),
cstr!("__le"),
cstr!("__index"),
cstr!("__newindex"),
cstr!("__call"),
cstr!("__tostring"),
cstr!("__pairs"),
cstr!("__ipairs"),
] {
ffi::lua_pushstring(state, method);
ffi::lua_pushcfunction(state, destructed_error);
ffi::lua_rawset(state, -3);
}
ffi::lua_rawset(state, ffi::LUA_REGISTRYINDEX);
}
struct WrappedError(pub Error);
2019-09-26 14:14:36 -05:00
struct WrappedPanic(pub Option<Box<dyn Any + Send>>);
// Pushes a WrappedError::Panic to the top of the stack. Uses two stack spaces and does not call
// lua_checkstack.
2019-09-26 14:14:36 -05:00
unsafe fn push_wrapped_panic(state: *mut ffi::lua_State, panic: Box<dyn Any + Send>) {
gc_guard(state, || {
let ud = ffi::lua_newuserdata(state, mem::size_of::<WrappedPanic>()) as *mut WrappedPanic;
ptr::write(ud, WrappedPanic(Some(panic)))
});
get_panic_metatable(state);
ffi::lua_setmetatable(state, -2);
}
// Checks if the value at the given index is a WrappedPanic. Uses 2 stack spaces and does not call
// lua_checkstack.
2017-12-02 16:47:00 -06:00
unsafe fn is_wrapped_panic(state: *mut ffi::lua_State, index: c_int) -> bool {
2017-06-24 21:26:35 -05:00
let userdata = ffi::lua_touserdata(state, index);
if userdata.is_null() {
return false;
}
Big API incompatible error change, remove dependency on error_chain The current situation with error_chain is less than ideal, and there are lots of conflicting interests that are impossible to meet at once. Here is an unorganized brain dump of the current situation, stay awhile and listen! This change was triggered ultimately by the desire to make LuaError implement Clone, and this is currently impossible with error_chain. LuaError must implement Clone to be a proper lua citizen that can live as userdata within a lua runtime, because there is no way to limit what the lua runtime can do with a received error. Currently, this is solved by there being a rule that the error will "expire" if the error is passed back into rust, and this is very sub-optimal. In fact, one could easily imagine a scenario where lua is for example memoizing some function, and if the function has ever errored in the past the function should continue returning the same error, and this situation immediately fails with this restriciton in place. Additionally, there are other more minor problems with error_chain which make the API less good than it could be, or limit how we can use error_chain. This change has already solved a small bug in a Chucklefish project, where the conversion from an external error type (Borrow[Mut]Error) was allowed but not intended for user code, and was accidentally used. Additionally, pattern matching on error_chain errors, which should be common when dealing with Lua, is less convenient than a hand rolled error type. So, if we decide not to use error_chain, we now have a new set of problems if we decide interoperability with error_chain is important. The first problem we run into is that there are two natural bounds for wrapped errors that we would pick, (Error + Send + Sync), or just Error, and neither of them will interoperate well with error_chain. (Error + Send + Sync) means we can't wrap error chain errors into LuaError::ExternalError (they're missing the Sync bound), and having the bounds be just Error means the opposite, that we can't hold a LuaError inside an error_chain error. We could just decide that interoperability with error_chain is the most important qualification, and pick (Error + Send), but this causes a DIFFERENT set of problems. The rust ecosystem has the two primary error bounds as Error or (Error + Send + Sync), and there are Into impls from &str / String to Box<Error + Send + Sync> for example, but NOT (Error + Send). This means that we are forced to manually recreate the conversions from &str / String to LuaError rather than relying on a single Into<Box<Error + Send + Sync>> bound, but this means that string conversions have a different set of methods than other error types for external error conversion. I have not been able to figure out an API that I am happy with that uses the (Error + Send) bound. Box<Error> is obnoxious because not having errors implement Send causes needless problems in a multithreaded context, so that leaves (Error + Send + Sync). This is actually a completely reasonable bound for external errors, and has the nice String Into impls that we would want, the ONLY problem is that it is a pain to interoperate with the current version of error_chain. It would be nice to be able to specify the traits that an error generated by the error_chain macro would implement, and this is apparently in progress in the error_chain library. This would solve both the problem with not being able to implement Clone and the problems with (Error + Send) bounds. I am not convinced that this library should go back to using error_chain when that functionality is in stable error_chain though, because of the other minor usability problems with using error_chain. In that theoretical situation, the downside of NOT using error_chain is simply that there would not be automatic stacktraces of LuaError. This is not a huge problem, because stack traces of lua errors are not extremely useful, and for external errors it is not too hard to create a different version of the LuaExternalResult / LuaExternalError traits and do conversion from an error_chain type into a type that will print the stacktrace on display, or use downcasting in the error causes. So in summary, this library is no longer using error_chain, and probably will not use it again in the future. Currently this means that to interoperate with error_chain, you should use error_chain 0.8.1, which derives Sync on errors, or wait for a version that supports user defined trait derives. In the future when error_chain supports user defined trait derives, users may have to take an extra step to make wrapped external errors print the stacktrace that they capture. This change works, but is not entirely complete. There is no error documentation yet, and the change brought to a head an ugly module organization problem. There will be more commits for documentation and reorganization, then a new stable version of rlua.
2017-06-24 17:11:56 -05:00
2017-06-24 21:26:35 -05:00
if ffi::lua_getmetatable(state, index) == 0 {
return false;
2017-05-21 18:50:59 -05:00
}
2017-06-24 21:26:35 -05:00
get_panic_metatable(state);
let res = ffi::lua_rawequal(state, -1, -2) != 0;
ffi::lua_pop(state, 2);
res
2017-05-21 18:50:59 -05:00
}
unsafe fn get_error_metatable(state: *mut ffi::lua_State) {
2017-06-15 09:26:39 -05:00
ffi::lua_pushlightuserdata(
state,
&ERROR_METATABLE_REGISTRY_KEY as *const u8 as *mut c_void,
);
ffi::lua_rawget(state, ffi::LUA_REGISTRYINDEX);
2017-05-21 18:50:59 -05:00
}
2017-06-24 21:26:35 -05:00
unsafe fn get_panic_metatable(state: *mut ffi::lua_State) {
2017-06-24 21:26:35 -05:00
ffi::lua_pushlightuserdata(
state,
&PANIC_METATABLE_REGISTRY_KEY as *const u8 as *mut c_void,
);
ffi::lua_rawget(state, ffi::LUA_REGISTRYINDEX);
2017-06-24 21:26:35 -05:00
}
unsafe fn get_destructed_userdata_metatable(state: *mut ffi::lua_State) {
ffi::lua_pushlightuserdata(
state,
&DESTRUCTED_USERDATA_METATABLE as *const u8 as *mut c_void,
);
ffi::lua_rawget(state, ffi::LUA_REGISTRYINDEX);
}
static ERROR_METATABLE_REGISTRY_KEY: u8 = 0;
static PANIC_METATABLE_REGISTRY_KEY: u8 = 0;
static DESTRUCTED_USERDATA_METATABLE: u8 = 0;