


Subscribe to Convergence Boy

Have you ever encountered a JUMPDEST opcode in Ethereum bytecode and wondered how execution can jump there? Look at these delicious 5Bs, so many places the code can run from!

If there is a 5B you can always jump there, right?…
Wrong!
And the answer has to do with something called instruction boundaries.
When the Ethereum Virtual Machine (EVM) processes bytecode, it reads it byte by byte and determines where each instruction starts and ends. These positions are known as instruction boundaries. Most opcodes occupy a single byte; the exception is the PUSH family. There are 32 such opcodes, PUSH1 through PUSH32 (bytes 60 to 7F), and each one is followed by 1 to 32 bytes of immediate data, so a single instruction takes more than one byte of bytecode. When the EVM encounters a byte in the 60-7F range at an instruction boundary, it works out which PUSH it is and therefore the size of the accompanying payload, treats the opcode and its payload as a single instruction, and continues at the first byte after the payload.
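To make the boundary rule concrete, here is a minimal sketch in TypeScript. The helper names (`pushDataSize`, `hexToBytes`, `instructionBoundaries`) are made up for this illustration and do not come from any library; the only fact it relies on is that PUSH1..PUSH32 sit at 0x60..0x7F and carry 1..32 payload bytes.

```typescript
// Number of immediate data bytes that follow an opcode.
// PUSH1..PUSH32 occupy 0x60..0x7f; every other opcode is treated as having no payload.
function pushDataSize(opcode: number): number {
  return opcode >= 0x60 && opcode <= 0x7f ? opcode - 0x5f : 0;
}

// Parse a hex string (with or without a 0x prefix) into an array of bytes.
function hexToBytes(hex: string): number[] {
  const clean = hex.replace(/^0x/, "");
  if (clean.length % 2 !== 0 || /[^0-9a-fA-F]/.test(clean)) {
    throw new Error("invalid hex string");
  }
  const bytes: number[] = [];
  for (let i = 0; i < clean.length; i += 2) {
    bytes.push(parseInt(clean.slice(i, i + 2), 16));
  }
  return bytes;
}

// Offsets at which an instruction starts, found by walking the code sequentially.
function instructionBoundaries(code: number[]): number[] {
  const starts: number[] = [];
  let pc = 0;
  while (pc < code.length) {
    starts.push(pc);
    pc += 1 + pushDataSize(code[pc]); // skip the opcode and its payload
  }
  return starts;
}

// PUSH1 0x80, PUSH1 0x40, MSTORE
console.log(instructionBoundaries(hexToBytes("6080604052"))); // [ 0, 2, 4 ]
```

Offsets 1 and 3 are never reported because they hold PUSH1 payload bytes, not instructions.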
But what does this have to do with JUMPDESTs? A 5B is a 5B in the end! You see it in the code - you jump!

Not really. If you’ve seen the Ethereum Yellow Paper, there’s a separate section there about the Validity of Jump Destinations (9.4.3):

And it explicitly states that "all [JUMP] positions must be on valid instruction boundaries, rather than sitting in the data portion of PUSH operations."
For example, consider the following bytecode: 5b600055
At first glance, this bytecode looks like a JUMPDEST opcode followed by a PUSH1 00 instruction and an SSTORE opcode, which could potentially write something to storage. However, if we look at the bytes around it, the JUMPDEST is actually in the middle of a PUSH instruction:

So it’s not really a JUMPDEST, but part of the number “5b600055” being pushed onto the stack. Therefore, this 5B byte is not on an instruction boundary and cannot be jumped to.
That code is from Seaport, and it’s actually a string containing 5B that is pushed onto the stack with PUSH32:
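The same check can be sketched in a few lines of TypeScript. This is an illustration under assumptions, not client code: `validJumpDests` is a hypothetical helper, and the PUSH4 (0x63) prefix in the second and third inputs is chosen only to keep the example short.

```typescript
const JUMPDEST = 0x5b;

// Returns the offsets of every 5B byte that sits on an instruction boundary.
// 5B bytes inside PUSH data are skipped together with the rest of the payload.
function validJumpDests(hex: string): number[] {
  const code: number[] = [];
  for (let i = 0; i + 1 < hex.length; i += 2) {
    code.push(parseInt(hex.slice(i, i + 2), 16));
  }
  const dests: number[] = [];
  let pc = 0;
  while (pc < code.length) {
    const op = code[pc];
    if (op === JUMPDEST) dests.push(pc);
    // PUSH1..PUSH32 (0x60..0x7f) carry op - 0x5f bytes of data
    pc += op >= 0x60 && op <= 0x7f ? 1 + (op - 0x5f) : 1;
  }
  return dests;
}

console.log(validJumpDests("5b600055"));     // [ 0 ]  standalone: 5B is a real JUMPDEST
console.log(validJumpDests("635b600055"));   // []     same bytes as PUSH4 data: no valid destination
console.log(validJumpDests("635b6000555b")); // [ 5 ]  only the 5B after the PUSH4 counts
```

The bytes are identical in the first two calls; only the preceding PUSH opcode decides whether the 5B counts as a destination.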

And the string is “ConsiderationItem[]”, or whatever that means (the “[” character is the byte 5B in ASCII):
![Consider a ConsiderationItem[] consideratio](https://img.paragraph.com/cdn-cgi/image/format=auto,width=3840,quality=85/https://storage.googleapis.com/papyrus_images/73799fe51badf81f0b6f6ae52778bddb3da33a7af09f72d9fbc8e0fa5ec03bfc.png)
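A quick way to see where the 5B comes from is to hex-encode the string yourself. The sketch below uses a hypothetical `asciiToHex` helper and only the “ConsiderationItem[]” fragment quoted above, not the full PUSH32 payload.

```typescript
// Hex-encode an ASCII string, two hex digits per character.
function asciiToHex(text: string): string {
  let hex = "";
  for (let i = 0; i < text.length; i++) {
    const h = text.charCodeAt(i).toString(16);
    hex += h.length === 1 ? "0" + h : h;
  }
  return hex;
}

const typeString = "ConsiderationItem[]";
const typeStringHex = asciiToHex(typeString);

console.log(typeStringHex);
// "[" is character 17 of the string, so its byte sits at hex offset 34
console.log(typeString.indexOf("["), typeStringHex.slice(34, 36)); // 17 5b
```

So the “JUMPDEST” is just an opening square bracket in a type string.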
So the next time you see a 5B byte in the middle of a PUSH instruction's data, you'll know you can't jump there.
And if you want to read and translate Ethereum bytecode into something more readable, just remember to process it sequentially and take PUSH instructions and their data into account.
It’s actually quite easy; here’s some sample TypeScript code that does exactly that:
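A minimal sketch of such a disassembler, not necessarily the author's original code, might look like this. The names `MNEMONICS`, `byteHex` and `disassemble` are hypothetical, and the mnemonic table is deliberately a small subset of the EVM opcode list; anything outside it is printed as UNKNOWN.

```typescript
// Partial mnemonic table: an illustrative subset, not the full EVM opcode list.
const MNEMONICS: { [opcode: number]: string } = {
  0x00: "STOP", 0x01: "ADD", 0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE",
  0x54: "SLOAD", 0x55: "SSTORE", 0x56: "JUMP", 0x57: "JUMPI", 0x5b: "JUMPDEST",
  0xf3: "RETURN", 0xfd: "REVERT",
};

// Format one byte as two lowercase hex digits.
function byteHex(b: number): string {
  const h = b.toString(16);
  return h.length === 1 ? "0" + h : h;
}

// Walk the bytecode sequentially and emit one "offset: MNEMONIC [data]" line per instruction.
function disassemble(hex: string): string[] {
  const clean = hex.replace(/^0x/, "").toLowerCase();
  const code: number[] = [];
  for (let i = 0; i + 1 < clean.length; i += 2) {
    code.push(parseInt(clean.slice(i, i + 2), 16));
  }

  const lines: string[] = [];
  let pc = 0;
  while (pc < code.length) {
    const op = code[pc];
    let text: string;
    let size = 1;
    if (op >= 0x60 && op <= 0x7f) {
      // PUSHn: the next n bytes are data, not instructions
      const n = op - 0x5f;
      const data = code.slice(pc + 1, pc + 1 + n).map((b) => byteHex(b)).join("");
      text = "PUSH" + n + " 0x" + data; // a truncated trailing PUSH prints only the bytes present
      size += n;
    } else if (op >= 0x80 && op <= 0x8f) {
      text = "DUP" + (op - 0x7f);
    } else if (op >= 0x90 && op <= 0x9f) {
      text = "SWAP" + (op - 0x8f);
    } else {
      text = MNEMONICS[op] || "UNKNOWN(0x" + byteHex(op) + ")";
    }
    lines.push(pc + ": " + text);
    pc += size;
  }
  return lines;
}

console.log(disassemble("5b600055"));
// [ '0: JUMPDEST', '1: PUSH1 0x00', '3: SSTORE' ]
console.log(disassemble("635b6000555b"));
// [ '0: PUSH4 0x5b600055', '5: JUMPDEST' ]
```

Because the loop advances past each PUSH payload in one step, a 5B inside the data never shows up as a JUMPDEST line.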

And here’s a bonus cherry on top - a fully RegExp-based disassembler for EVM bytecode:
https://twitter.com/0x796/status/1608039943582142464
The regex strings are linked in the tweet reply - take a look! It only works in the PCRE2 flavor of regex (Perl family), so you may need to apply some tricks if your language doesn't support it.
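As one possible trick for JavaScript-flavored engines, the pattern can be generated instead of hand-written. The sketch below is not the regex from the tweet: it only splits a hex string into instruction-sized tokens (it does not name the opcodes), and `buildInstructionRegex` and `tokenize` are hypothetical helpers.

```typescript
// One alternative per PUSH width: the opcode byte followed by up to n payload bytes,
// plus a catch-all alternative for every single-byte instruction.
function buildInstructionRegex(): RegExp {
  const alternatives: string[] = [];
  for (let n = 32; n >= 1; n--) {
    const opcode = (0x5f + n).toString(16); // "7f" down to "60"
    alternatives.push(opcode + "(?:[0-9a-f]{2}){0," + n + "}");
  }
  alternatives.push("[0-9a-f]{2}");
  // Sticky flag: each match must start exactly where the previous one ended,
  // so tokens stay aligned to instruction boundaries.
  return new RegExp(alternatives.join("|"), "y");
}

// Split hex bytecode into one token per instruction (opcode plus any PUSH data).
function tokenize(hex: string): string[] {
  const re = buildInstructionRegex();
  const input = hex.replace(/^0x/, "").toLowerCase();
  const tokens: string[] = [];
  while (re.lastIndex < input.length) {
    const m = re.exec(input);
    if (m === null) break; // stop at the first position that is not valid hex
    tokens.push(m[0]);
  }
  return tokens;
}

console.log(tokenize("5b600055"));     // [ '5b', '6000', '55' ]
console.log(tokenize("635b6000555b")); // [ '635b600055', '5b' ]
```

The greedy `{0,n}` quantifier swallows the PUSH payload, which is the regex equivalent of skipping the data bytes in the sequential loop.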
So, in the end, we are safe and no jumps into the middle of revert strings are possible. You can breathe out and make yourself your favorite hot drink.
If you like the stuff I write - subscribe, collect, follow me on Twitter and spread the word!