What do you mean by VM?
Well, first we must understand how Lua works and what happens when you write code in Lua. The CPU executes your code, but how?
All roads lead to the CPU!
Whenever you write code in Lua, Python, C, C++, Swift, or whatever it may be, the CPU is what ultimately acts upon your code. But how?
Well, Lua is an interpreted language, meaning that unlike languages such as C and C++, it is not compiled directly to machine code, which is what the CPU executes.
Since Lua is an interpreted language, it has what is known as a Virtual Machine.
What is the Virtual Machine?
The Virtual Machine is what executes your code. The compiler turns your code into bytecode, which is what the Virtual Machine reads and executes. We do not write our programs in bytecode directly because it would be very hard to write and understand.
You can think of it like this: Roblox takes the code you write and hands it to the compiler, which compiles it to bytecode. The bytecode is then given to the VM (Virtual Machine), which reads it and executes it.
What is bytecode?
As I said previously, bytecode is your compiled code, and it is what the VM reads and executes. Raw bytecode is binary data, so it is not readable on its own. When bytecode is disassembled (by a disassembler tool, not by the VM itself), it becomes much easier to understand: disassembled bytecode is a direct, human-readable representation of the bytecode. You can think of disassembled bytecode as “a list of tasks to do”.
I’ll give you an example. Take a look at this source code:
local function add(a, b)
return (a+b);
end;
add(1, 2);
Very simple: just a function that adds two numbers together. But what does this look like under the hood?
Raw bytecode is very hard to read and understand, so we will look at the disassembled bytecode instead, which looks like this:
Function 0 (add)
ADD R2 R0 R1
RETURN R2 1
Function 1 ()
DUPCLOSURE R0 K0 [add]
MOVE R1 R0
LOADN R2 1
LOADN R3 2
CALL R1 2 0
RETURN R0 0
So, when your source gets compiled, the compiler wraps all of your code in what is called the main proto (MAIN_PROTO). We can best visualize it by showing what it would look like in Lua.
This is our source:
print("Hello, world!");
Conceptually, when it's compiled:
local function main(...)
print("Hello, world!");
return;
end;
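The wrapping described above can be observed from plain Lua: compiling a chunk of source text gives you back a function that accepts `...`. This is only a sketch, and it assumes `loadstring` is available (it is disabled by default in Roblox and is called `load` in newer Lua versions). The chunk text is made up for illustration.

```lua
-- `loadstring` is the Lua 5.1 / Luau name; newer Lua versions call it `load`.
local compile = loadstring or load

-- The chunk has no `function` keyword, yet compiling it yields a function
-- that can read its arguments through `...`, just like the `main` shown above.
local main = compile("local first = ...; return first, select('#', ...)")

print(type(main)) --> function

local first, count = main("a", "b", "c")
print(first, count) --> a	3
```
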
The VM starts execution at main. If we look at the bytecode from earlier, we can piece together what is happening.
‘Function 1’ in that listing is the main proto we just talked about, and the lines below it are what we call instructions.
What are instructions?
Put simply, instructions tell the VM what to do. There are all kinds of instructions in Lua; keep in mind that Luau has its own instructions that are not seen in Lua.
One thing you may notice is the repeated R, like ‘R1’, in the disassembled bytecode. You may ask: what are those?
Those are known as registers. Registers are slots that the VM works with to store data. You can think of them as slots in a Lua array, which is an oversimplified analogy, but it works. There is more than one register, but they are limited in number!
To help you get a better idea, let's look at an example. The code below is our source code:
print("Hello, world!");
The disassembled bytecode looks like so:
GETIMPORT R0 1 [print] ; Loads the "print" function into register 0
LOADK R1 K2 ['Hello, world!'] ; Loads the "Hello, world!" string from our source into register 1
CALL R0 1 0 ; Calls the function in register 0 (print). The 1 is the number of arguments we passed (our "Hello, world!" string) and the 0 is the number of results we keep, because our source ignores what print returns
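How a VM walks a list of instructions like these can be sketched with a toy register machine. This is not the real Luau VM or its bytecode format: the instruction tables, the `run` function, and the three opcodes it understands are simplified assumptions, loosely modelled on the `add` listing from earlier.

```lua
-- Toy register machine: NOT the real Luau VM or its bytecode format.
-- Each instruction is a table of the form { opcode, a, b, c }.
local function run(instructions)
    local registers = {} -- R0, R1, ... stored by index
    local pc = 1         -- program counter: index of the next instruction

    while true do
        local inst = instructions[pc]
        local op, a, b, c = inst[1], inst[2], inst[3], inst[4]
        pc = pc + 1

        if op == "LOADN" then
            registers[a] = b                           -- load a number into R[a]
        elseif op == "ADD" then
            registers[a] = registers[b] + registers[c] -- R[a] = R[b] + R[c]
        elseif op == "RETURN" then
            return registers[a]                        -- stop and hand back R[a]
        else
            error("unknown opcode: " .. tostring(op))
        end
    end
end

-- Roughly "return 1 + 2", written as a list of tasks for the toy VM.
local program = {
    { "LOADN", 0, 1 },
    { "LOADN", 1, 2 },
    { "ADD", 2, 0, 1 },
    { "RETURN", 2 },
}

local result = run(program)
print(result) --> 3
```

A real VM does the same fetch, decode, and execute loop, only with packed binary instructions and many more opcodes.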
Keep in mind that code marked with --!native or @native (Native Code Generation) goes one step further: the bytecode is translated into machine code that the CPU executes directly, rather than being interpreted by the VM.
What are Function Environments?
A function environment is the table a function in your script looks its globals up in, which is where names like print come from. The best way to learn this is by playing with the code yourself.
Try running this code in Studio and see what happens!
local function test_func()
return print("Hello, world!");
end;
setfenv(test_func, {print = warn});
test_func();
In this example, we created a new function test_func which prints ‘Hello, world!’ when called. Then we called setfenv (Set Function Environment) to set the environment of test_func to our table {print = warn}.
If you ran it, you would notice that print actually called warn! That is because we gave test_func a new function environment: a table whose print key holds the warn function.
Setting a new environment for the function replaces the old one!
local function test_func()
if (workspace ~= nil) then
print("Workspace exists");
else
print("Workspace doesnt exist!", workspace);
print(typeof(workspace));
end;
return;
end;
setfenv(test_func, {print = warn; typeof = typeof});
test_func();
As you can see, print was warn again because, like last time, we defined a new table with the string key print set to warn. You will also have noticed that inside the function the workspace global was nil: our new environment has no workspace key, and the old environment is no longer consulted.
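If you want to override one global without losing all the others, one common approach is to let the new environment fall back to the old one through a metatable. This is only a sketch under a few assumptions: it needs setfenv and getfenv (Luau or Lua 5.1), and `fake_warn` and `captured` are hypothetical stand-ins for warn so that it also runs outside Roblox.

```lua
-- Requires setfenv/getfenv: Luau or Lua 5.1 (they were removed in Lua 5.2+).
local captured = {}

-- Hypothetical stand-in for `warn`, so the sketch also runs outside Roblox.
local function fake_warn(message)
    table.insert(captured, message)
end

local function test_func()
    -- `print` is overridden below; `math` and `tostring` are not.
    print("math still exists: " .. tostring(math ~= nil))
end

-- Only `print` is overridden; every other global lookup falls through
-- to the old environment via __index.
local old_env = getfenv(test_func)
local new_env = setmetatable({ print = fake_warn }, { __index = old_env })

setfenv(test_func, new_env)
test_func()

print(captured[1]) --> math still exists: true
```
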
local function test_func(): ()
print("game?", game);
print(typeof(game));
print(game.Name);
return;
end;
setfenv(test_func, {game = workspace; print = print; typeof = typeof});
test_func();
If you ran it, you would see that the game global is swapped out for workspace, because we defined that in our function environment {game = workspace; print = print; typeof = typeof}.
We can also get the environment of a function with getfenv (Get Function Environment):
local function test_func(): ()
print("Called");
return;
end;
local our_new_env: {[string]: any} = {print = warn};
setfenv(test_func, our_new_env);
print(getfenv(test_func) == our_new_env); --> true
getfenv(test_func).print = print;
test_func();
If you ran the script, you would observe that print actually called the real print, even though we defined {print = warn} as the new environment for the function. This is because we later got the function environment with getfenv and set getfenv(test_func).print = print, pointing it back at the real print. Keep in mind that we changed ONLY the environment of test_func, so anything outside test_func still uses its original environment.
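The point that setfenv only affects the one function you pass to it can be checked with a small sketch. The names `record`, `changed`, and `untouched` are made up for illustration, and the code again assumes setfenv is available (Luau or Lua 5.1).

```lua
-- Requires setfenv: Luau or Lua 5.1.
local log = {}

-- Hypothetical replacement for print that stores messages instead of showing them.
local function record(message)
    table.insert(log, message)
end

-- Both functions simply hand back whatever `print` means inside them.
local function changed()
    return print
end

local function untouched()
    return print
end

-- Swap the environment of `changed` only.
setfenv(changed, { print = record })

print(changed() == record)  --> true
print(untouched() == print) --> true
```
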
Why does this matter at all for this project?
As I’ve said, the project uses a pure Luau compiler and the Fiu Virtual Machine to execute and run custom code from within Luau. This project is a modified fork of that VM.
Since we can control the VM, we can add some pretty cool stuff to it.