Welcome back to Bytecode Tuesday! In previous articles we created smart contracts with different kinds of logic. This week we're going to implement functions that can be called by any Ethereum account or smart contract.
It is important to understand how function calls work under the hood in order to write better and safer smart contracts, optimize them, and integrate with the tooling available today. It also prepares us for future upgrades that improve the EVM.
We are going to create a smart contract that has two functions as shown in the following pseudocode:
MyContract {
    a() {
        return 4
    }
    b() {
        return 5
    }
}
But before writing it we need to understand a concept introduced into the Ethereum ecosystem very early on: Function Signature Hashing.
Function signatures are a standard, canonical way to express function names and parameter types. For example, the following are the ERC20 function signatures. Notice that there are no spaces and that return values are not expressed.
name()
symbol()
decimals()
totalSupply()
balanceOf(address)
transfer(address,uint256)
transferFrom(address,address,uint256)
approve(address,uint256)
allowance(address,address)
However, like any bytecode language, the EVM has no native string type; it only has bytes. So in order to express function names in bytecode, the early Ethereum community decided to hash the function signature with Keccak-256 and keep the first 4 bytes of the hash. Those 4 bytes, also known as the function selector, represent the function inside EVM bytecode contracts. The following are the 4-byte values for the ERC20 function signatures.
name(): 0x06fdde03
symbol(): 0x95d89b41
decimals(): 0x313ce567
totalSupply(): 0x18160ddd
balanceOf(address): 0x70a08231
transfer(address,uint256): 0xa9059cbb
transferFrom(address,address,uint256): 0x23b872dd
approve(address,uint256): 0x095ea7b3
allowance(address,address): 0xdd62ed3e
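To make the hashing step concrete, here is a minimal sketch in pure Python that derives the 4 bytes from a signature string. It is not how any particular tool implements it; the helper names (rol64, keccak_f1600, keccak256, selector) are our own. We write Keccak-256 by hand because Python's built-in hashlib.sha3_256 implements the later NIST SHA-3 standard, whose padding differs from the original Keccak used by Ethereum, so it returns different digests.

```python
MASK64 = (1 << 64) - 1

def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64

def keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 matrix of lanes."""
    lfsr = 1  # generates the round constants on the fly
    for _ in range(24):
        # theta: mix each column's parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate every lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR the round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes

def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (0x01 ... 0x80 padding), as used by Ethereum."""
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data)
    padded += b"\x01" + b"\x00" * (rate - len(data) % rate - 1)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for start in range(0, len(padded), rate):
        for i in range(rate // 8):
            word = padded[start + 8 * i : start + 8 * i + 8]
            lanes[i % 5][i // 5] ^= int.from_bytes(word, "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))

def selector(signature: str) -> str:
    """First 4 bytes of the Keccak-256 hash of the signature, as hex."""
    return "0x" + keccak256(signature.encode("ascii"))[:4].hex()

for sig in ["name()", "balanceOf(address)", "transfer(address,uint256)"]:
    print(sig, selector(sig))
```

The printed values can be compared against the table above. In practice you would use the hashing helper that ships with your tooling rather than a hand-written Keccak.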
This standard has been adopted by high-level languages (Solidity, Vyper), indexers (Etherscan, Blockscout) and JS tooling (ethers, viem). It has been rightfully criticized for being too gas expensive: 4 bytes can express more than 4 billion functions per contract, whereas a single byte could identify 256 functions, more than enough for most, if not all, contracts deployed on Ethereum. That being said, this format has made it possible to build a lot of tooling and provide a good developer experience, because it has been widely adopted across the EVM ecosystem.
The standard way for any Ethereum account or smart contract to call a function on another smart contract is to send it data in the following format:
<4 byte signature><32 byte Param1><32 byte Param2><32 byte Param3>...
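As a quick illustration of this layout, the sketch below builds the call data for transfer(address,uint256) using the 4-byte value from the table above. The encode_call helper, the recipient address and the amount are hypothetical and only for illustration. It only covers parameters that fit in a single 32-byte word, such as address and uint256.

```python
def encode_call(selector_hex: str, *params: int) -> bytes:
    """Concatenate a 4-byte signature hash with 32-byte big-endian words."""
    if selector_hex.startswith("0x"):
        selector_hex = selector_hex[2:]
    selector = bytes.fromhex(selector_hex)
    if len(selector) != 4:
        raise ValueError("the signature hash must be exactly 4 bytes")
    # Each parameter is left-padded with zeros to a full 32-byte word.
    return selector + b"".join(p.to_bytes(32, "big") for p in params)

# Hypothetical values, for illustration only.
recipient = 0x1111111111111111111111111111111111111111
amount = 1000

calldata = encode_call("0xa9059cbb", recipient, amount)  # transfer(address,uint256)
print(calldata.hex())
print(len(calldata), "bytes")  # 4 + 32 + 32
```

A 20-byte address ends up with 12 leading zero bytes inside its word, which is why the call data is always 4 bytes plus a multiple of 32 for parameters like these.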
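Data received can be read with the opcode CALLDATALOAD (aka 0x35), which takes a byte offset from the stack and pushes the 32 bytes of call data starting at that offset, padded with zeros past the end of the data.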
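The behaviour of CALLDATALOAD can be modelled in a few lines. This is only a sketch of the opcode's semantics, with a function name of our own choosing: it always yields a full 32-byte word, so a 4-byte call arrives followed by 28 zero bytes.

```python
def calldataload(calldata: bytes, offset: int) -> bytes:
    """Return the 32-byte word at `offset`, zero-padded past the end of the data."""
    return calldata[offset:offset + 32].ljust(32, b"\x00")

calldata = bytes.fromhex("0dbe671f")  # a call to a() with no parameters
word = calldataload(calldata, 0)
print(word.hex())

# The 4-byte value sits in the most significant bytes of the word, so it can be
# isolated by shifting the word right by 28 bytes (224 bits).
signature = int.from_bytes(word, "big") >> 224
print(hex(signature))
```

Keep this zero padding in mind: it determines what value the bytecode has to compare against.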
We now know everything we need to create the contract we will be showcasing step-by-step.
MyContract {
    a() {
        return 4
    }
    b() {
        return 5
    }
}
The pseudocode smart contract entry (function dispatcher):
signature = CALLDATALOAD(0) // read the 32-byte word at offset 0; it starts with the 4-byte signature hash
if signature == 0x0DBE671F // this is the a() function signature
    goto a();
else
    goto b();
Notice that we are not using params, and we assume that if a() was not called, b() will be called. This is not normally the case in smart contracts, but understanding this simple example will give you a solid foundation for understanding more complex contracts later on.
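The consequences of this simplification are easy to see in a small Python model of the dispatcher. It is a sketch under one assumption: the whole 32-byte word returned by CALLDATALOAD is compared against the a() signature padded with zeros. The function and constant names are ours.

```python
# a() signature hash followed by 28 zero bytes, i.e. a full 32-byte word.
A_WORD = bytes.fromhex("0dbe671f") + bytes(28)

def calldataload(calldata: bytes, offset: int) -> bytes:
    """Model of CALLDATALOAD: 32 bytes from `offset`, zero-padded."""
    return calldata[offset:offset + 32].ljust(32, b"\x00")

def dispatch(calldata: bytes) -> int:
    """Simplified dispatcher: a() on an exact match, b() for everything else."""
    if calldataload(calldata, 0) == A_WORD:
        return 4  # a()
    return 5      # b(), also reached by empty or unrecognized call data

print(dispatch(bytes.fromhex("0dbe671f")))    # call to a()
print(dispatch(b""))                          # empty call data
print(dispatch(bytes.fromhex("deadbeef")))    # unknown 4 bytes
print(dispatch(bytes.fromhex("0dbe671f01")))  # a() followed by a non-zero byte
```

Only the first call reaches a(). Everything else, including the last one where extra non-zero data follows the 4 bytes, falls through to b(). A stricter dispatcher would also check for b() explicitly and reject anything it does not recognize.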
The following is the smart contract runtime bytecode implementation:
5F // PUSH0, which is a more gas efficient way of performing PUSH1(0)
35 // CALLDATALOAD
7F 0DBE671F00000000000000000000000000000000000000000000000000000000 // PUSH32 the a() function signature, right-padded with zeros to 32 bytes
14 // EQ, if the top 2 stack elements are equal they get replaced by 0x01, otherwise by 0x00
60 2F // PUSH1(0x2F) this is the function a() code destination, at byte 47
57 // JUMPI pops the destination and the EQ result, and jumps to byte 47 (0x2F) if the result is not zero
// This is now function b() code
60 05 // PUSH1(0x05) the b() return value is now on the stack
5F // PUSH0 we will place it on the memory to return it
52 // MSTORE 0x05 is now in memory
60 20 // PUSH1(0x20) Return size is 32 bytes, 0x20
5F // PUSH0 pushes the offset of memory returned
F3 // RETURN
// This is function a() code, very similar to b()
5B // JUMPDEST signals a safe landing for JUMPI
60 04 // PUSH1(0x04) Pushes the a() return value
5F // PUSH0 Memory location where the number 0x04 will be placed
52 // MSTORE The return value is now on memory
60 20 // PUSH1(0x20) Return value is 32 bytes, 0x20
5F // PUSH0 The location of the value returned, in memory
F3 // RETURN
And the following is the full string of bytes.
5F357F0DBE671F0000000000000000000000000000000000000000000000000000000014602F5760055F5260205FF35B60045F5260205FF3
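If you want to step through the bytecode yourself, the following sketch is a tiny interpreter that supports only the opcodes used in this contract. It is a teaching aid under simplifying assumptions (no gas, no stack limits, no other opcodes), not a real EVM. The runtime bytes are assembled from the annotated listing above, and the run function is our own.

```python
RUNTIME = bytes.fromhex(
    "5F35"                          # PUSH0, CALLDATALOAD
    "7F" + "0DBE671F" + "00" * 28   # PUSH32 a() signature, zero-padded
    + "14602F57"                    # EQ, PUSH1 0x2F, JUMPI
    + "60055F5260205FF3"            # b(): return 5
    + "5B60045F5260205FF3"          # a(): JUMPDEST, return 4
)

def run(code: bytes, calldata: bytes) -> bytes:
    """Execute `code` against `calldata` and return the RETURN data."""
    stack, memory, pc = [], bytearray(), 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x5F:                       # PUSH0
            stack.append(0)
        elif 0x60 <= op <= 0x7F:             # PUSH1..PUSH32
            size = op - 0x5F
            stack.append(int.from_bytes(code[pc:pc + size], "big"))
            pc += size
        elif op == 0x35:                     # CALLDATALOAD
            offset = stack.pop()
            word = calldata[offset:offset + 32].ljust(32, b"\x00")
            stack.append(int.from_bytes(word, "big"))
        elif op == 0x14:                     # EQ
            stack.append(int(stack.pop() == stack.pop()))
        elif op == 0x57:                     # JUMPI
            dest, cond = stack.pop(), stack.pop()
            if cond:
                if dest >= len(code) or code[dest] != 0x5B:
                    raise RuntimeError("invalid jump destination")
                pc = dest
        elif op == 0x5B:                     # JUMPDEST (no effect)
            pass
        elif op == 0x52:                     # MSTORE
            offset, value = stack.pop(), stack.pop()
            if len(memory) < offset + 32:
                memory.extend(bytes(offset + 32 - len(memory)))
            memory[offset:offset + 32] = value.to_bytes(32, "big")
        elif op == 0xF3:                     # RETURN
            offset, size = stack.pop(), stack.pop()
            return bytes(memory[offset:offset + size])
        else:
            raise NotImplementedError(hex(op))
    return b""

result_a = run(RUNTIME, bytes.fromhex("0dbe671f"))  # call a()
result_other = run(RUNTIME, b"")                    # anything else lands in b()
print(int.from_bytes(result_a, "big"), int.from_bytes(result_other, "big"))
```

Adding a print(pc, hex(op), stack) line at the top of the loop gives a step-by-step trace of the stack, and checking RUNTIME[0x2F] confirms that the jump target really is the JUMPDEST byte.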
But it's easier to understand with the following animation.
We hope you now have more clarity on how EVM smart contract functions work. This knowledge is fundamental for security auditors, builders and users who want to get the best results out of the EVM and prepare for future upgrades.
Thanks for reading, see you next Tuesday for more bytes!