In this article, we are going to inspect OpenZeppelin's Strings library. Most articles cover core topics such as oracles, launchpads, and tokens. I wanted to write about something that is useful but gets far less attention. Let’s start!
At the top of the library, we have two private constants:
bytes16 private constant _HEX_SYMBOLS = "0123456789abcdef";
uint8 private constant _ADDRESS_LENGTH = 20;
We will use these constants in our toHexString functions.
The first function, toString, converts a uint256 to its ASCII string decimal representation.
First of all, let’s look at the function signature:
function toString(uint256 value) internal pure returns (string memory)
It takes a uint256 parameter.
It’s an internal function. Strings is a library, so the function can only be called from the library itself or from a contract that uses the library (the compiler includes the code in the calling contract).
It’s a pure function. That means it neither reads nor modifies the state.
And lastly, it returns a string.
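To make the signature more concrete, here is a minimal sketch of how a contract could call toString. The TokenLabel contract and its label function are hypothetical names that I made up for illustration, and I am assuming the library is installed under the usual @openzeppelin/contracts import path:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "@openzeppelin/contracts/utils/Strings.sol";

// Hypothetical contract, used only to illustrate the call.
contract TokenLabel {
    // Attach the library's functions to uint256 values.
    using Strings for uint256;

    // For id = 42 this returns "Token #42".
    function label(uint256 id) external pure returns (string memory) {
        return string(abi.encodePacked("Token #", id.toString()));
    }
}
```

Because toString is internal, its code is compiled into TokenLabel itself. There is no separate library deployment to call.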
The method used in this function does not work for the number 0: the digit-counting loop below would never run, and the result would be an empty string. So, if the input is zero, the function returns "0" right away:
if (value == 0) {
    return "0";
}
Next, we copy the input into another variable, temp, and keep dividing it by 10 to count how many digits the input has. We work on a copy because we still need the original value later.
uint256 temp = value;
uint256 digits;
while (temp != 0) {
    digits++;
    temp /= 10;
}
Let’s assume that we sent 100,042 to this function. Now, temp is equal to 0, and digits is equal to 6.
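If you want to play with the digit-counting step on its own, it can be pulled out into a small helper. This is only a sketch: DigitCount and countDigits are hypothetical names and are not part of OpenZeppelin.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper library, not part of OpenZeppelin.
library DigitCount {
    // Counts the decimal digits of value by dividing by 10 until nothing is left.
    // countDigits(100042) returns 6.
    // countDigits(0) returns 0, which is why toString handles zero separately.
    function countDigits(uint256 value) internal pure returns (uint256 digits) {
        // value is a local copy, so the caller's variable is not changed.
        while (value != 0) {
            digits++;
            value /= 10;
        }
    }
}
```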
Now, we create a bytes buffer with one byte per digit and fill it from right to left:
bytes memory buffer = new bytes(digits);
while (value != 0) {
    digits -= 1;
    buffer[digits] = bytes1(uint8(48 + uint256(value % 10)));
    value /= 10;
}
return string(buffer);
At the beginning, buffer is equal to 0x000000000000. After each iteration of the loop, it looks like this:
0x000000000032
0x000000003432
0x000000303432
0x000030303432
0x003030303432
0x313030303432
If you look up each byte in the ASCII table, you get 0 for 0x30, 1 for 0x31, 2 for 0x32, and so on. The bytes 0x30 to 0x39 are the characters 0 to 9.
Let’s break down the confusing line:
value % 10 gets the least significant digit.
uint8(48 + uint256(value % 10)) adds 48, the ASCII code of the character 0, and narrows the result to uint8 so that it can be converted to bytes1 without losing data.
bytes1(uint8(48 + uint256(value % 10))) turns that number into a single byte that we can store in our buffer.
Let’s run the first loop to understand what is going on in this line:
value % 10 gives us 2. I am not going to explain the modulo operation here. If you don’t know how we get 2 from 100,042 % 10, search for “what is mod in programming”.
uint8(48 + uint256(value % 10)) gives us 50. You might think: “what the hell is this number?”. And you’ll be right. The 48 is there because it is the ASCII code of the character 0, so adding a digit to it gives the ASCII code of that digit. If you convert 50 to hexadecimal, you’ll get 32. And 0x32 is the character 2 in the ASCII table. Wow! What a conversion, huh?
bytes1(uint8(48 + uint256(value % 10))) is the conversion required for our buffer, which is a bytes variable.
Lastly, we convert our bytes to a string. I love this part. If we read the same bytes as a uint, we would get a totally different number. But by casting to string we are saying: “interpret these bytes as text, I want to see their corresponding characters in the ASCII table”.
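The digit-to-character trick can be isolated in a few lines. The following is a sketch with hypothetical names (AsciiDigit, toAsciiDigit), and the require check is my own addition, not something the library does:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper library, not part of OpenZeppelin.
library AsciiDigit {
    // Maps a single decimal digit (0 to 9) to its ASCII character.
    // toAsciiDigit(2) returns 0x32, which is the character "2".
    function toAsciiDigit(uint8 digit) internal pure returns (bytes1) {
        // Added for this sketch only: reject anything that is not one digit.
        require(digit < 10, "not a decimal digit");
        // 48 is the ASCII code of "0", so 48 + digit is the code of the digit.
        return bytes1(uint8(48 + digit));
    }
}
```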
Before we dive deep into toHexString, you have to know that there are 3 different toHexString functions in this library. If you don’t know function overloading, you might be thinking “how can they use 3 functions with the same name?”. So, it’ll be better for you to check out function overloading before continuing this article.
Let’s look at our three functions:
function toHexString(uint256 value, uint256 length) internal pure returns (string memory)
function toHexString(uint256 value) internal pure returns (string memory)
function toHexString(address addr) internal pure returns (string memory)
As you can see, they have the same name, but they all take different kinds of parameters. Because of that, the compiler can tell them apart by the types of the arguments in each call. So, Solidity is fine with that.
Please think of the functions in this order. For example, number 1, the first function, is the one that takes two uint256 parameters.
But you should know that the second and third functions call the first function. It means that if you call the second or the third function, your input ends up in the first function at the end of the day.
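If overloading is new to you, here is a tiny sketch of the same pattern: one function does the real work and an overload with fewer parameters delegates to it. The Area library is a made-up example and has nothing to do with OpenZeppelin.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical library that only demonstrates overloading and delegation.
library Area {
    // The two-parameter version does the real work.
    // area(3, 4) returns 12.
    function area(uint256 width, uint256 height) internal pure returns (uint256) {
        return width * height;
    }

    // The one-parameter overload treats the shape as a square
    // and forwards to the two-parameter version.
    // area(5) returns 25.
    function area(uint256 side) internal pure returns (uint256) {
        return area(side, side);
    }
}
```

The compiler picks the version whose parameter list matches the arguments of the call, which is how the three toHexString functions can live side by side.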
Enough talking. Let’s look at the third function’s code:
return toHexString(uint256(uint160(addr)), _ADDRESS_LENGTH);
It has only one line of code. It converts our address input to uint256 and sends the result, together with _ADDRESS_LENGTH, which is 20, to our first function.
Let’s convert my address, 0x000000000042bAA586DD7161dC0EB8f0CB4a9fBE, to uint256. We’ll get this huge number: 346477235235092042596196240046333886.
We’ll look at the other steps in a minute when we talk about the first function.
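The double cast may look odd, so here it is on its own. AddressCast and toUint are hypothetical names for this sketch:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper library, not part of OpenZeppelin.
library AddressCast {
    // Converts an address to a number in two steps.
    function toUint(address addr) internal pure returns (uint256) {
        // Step 1: address -> uint160. Both are 20 bytes wide, so only the type changes.
        // Step 2: uint160 -> uint256. The value is widened with leading zeros.
        return uint256(uint160(addr));
    }
}
```

The intermediate uint160 is needed because Solidity 0.8 does not allow a direct conversion between address and uint256: that would change the type and the width in a single step.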
Okay, now it is time for our second function. If the input value is 0, it returns "0x00".
if (value == 0) {
    return "0x00";
}
If it is not zero, then we calculate its length in bytes:
uint256 temp = value;
uint256 length = 0;
while (temp != 0) {
    length++;
    temp >>= 8;
}
I think only the temp >>= 8; line is confusing. Let’s send our number 100,042 to this function again and see what happens in the loop:
In the first iteration, length will be 1 and temp will be 390. What? Ok. Now we have to go to the dark side of coding: the binary world. 100,042 in binary is 11000011011001010 and 390 is 110000110. Did you notice the similarity between the two? The 9 bits of 390 are exactly the leftmost 9 bits of 100,042. >> means shift the bits to the right. If we had used << instead, the result would be 25610752, which is 1100001101100101000000000 in binary. Okay, I believe things are clearer now: >> 8 drops the 8 least significant bits, and << 8 appends 8 zero bits. So, our loop basically calculates the length of our value in bytes (8 bits are equal to 1 byte). We will use this length in our first function.
In the second iteration, temp starts as 110000110. Deleting its least significant 8 bits (the last 8 bits) by hand leaves binary 1, which is decimal 1. And length is equal to 2 now.
The third iteration is the last one for our value. Shifting 1 to the right by 8 bits gives 0, so temp is equal to 0, length is equal to 3, and the while loop does not run anymore. Now, we send 100,042 and length to the first function.
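The byte-counting loop can also be tried out in isolation. Here is a minimal sketch with hypothetical names (ByteLength, byteLength):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper library, not part of OpenZeppelin.
library ByteLength {
    // Counts how many bytes are needed to hold value.
    // byteLength(100042) returns 3 (100042 -> 390 -> 1 -> 0).
    // byteLength(255) returns 1 and byteLength(256) returns 2.
    function byteLength(uint256 value) internal pure returns (uint256 length) {
        while (value != 0) {
            length++;
            // Drop the lowest 8 bits, i.e. one whole byte.
            value >>= 8;
        }
    }
}
```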
So far so good! Only one function is left, the first function. It starts by defining a bytes variable named buffer whose length is 2 * length + 2. Why is that? Because every byte is written as 2 hexadecimal characters. For example, the byte 0 is 00 in hexadecimal. And since we are building the string ourselves, the 0x prefix is not there automatically, so we have to add it manually. That is where the extra 2 comes from.
bytes memory buffer = new bytes(2 * length + 2);
buffer[0] = "0";
buffer[1] = "x";
After that, we determine the hexadecimal characters one by one, from right to left:
for (uint256 i = 2 * length + 1; i > 1; --i) {
    buffer[i] = _HEX_SYMBOLS[value & 0xf];
    value >>= 4;
}
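The two operations inside this loop can be separated into small helpers to make them easier to test by hand. This is a sketch, and NibbleSketch, lowestHexChar and dropLowestHexDigit are names I made up:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper library, not part of OpenZeppelin.
library NibbleSketch {
    bytes16 private constant HEX_SYMBOLS = "0123456789abcdef";

    // Returns the hex character for the lowest 4 bits of value.
    // lowestHexChar(100042) returns "a", because 100042 & 0xf is 10.
    function lowestHexChar(uint256 value) internal pure returns (bytes1) {
        return HEX_SYMBOLS[value & 0xf];
    }

    // Drops the lowest 4 bits, which is exactly one hex digit.
    // dropLowestHexDigit(100042) returns 6252.
    function dropLowestHexDigit(uint256 value) internal pure returns (uint256) {
        return value >> 4;
    }
}
```

The loop in toHexString simply alternates these two steps, writing each character into the buffer from the last position down to index 2.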
We have sent two inputs to this function, do you remember? One is (100042, 3) and the other one is (0x000000000042bAA586DD7161dC0EB8f0CB4a9fBE, 20). Let’s run the first one, (100042, 3), step by step:
In the first step, we evaluate value & 0xf. It is a bitwise AND operation that keeps only the last 4 bits. We sent 100042, which is 11000011011001010 in binary. So, value & 0xf gives us 1010, which is decimal 10. The character at index 10 of _HEX_SYMBOLS is a. After that, we delete the least significant 4 bits of value. Now we have 1100001101100, which is decimal 6252. buffer is equal to 0x3078000000000061. The first 2 bytes, 3078, come from the 0x that we added before the loop. Do you remember that 0x30 is 0 in the ASCII table? The same goes for x and a: 0x78 is x and 0x61 is a. If you skip the bytes that are still empty and read the buffer as a string, you’ll get 0xa.
In the second step, we evaluate 1100001101100 & 0xf, which gives us decimal 12, or hexadecimal c. buffer is equal to 0x3078000000006361, as a string 0xca.
In the third step, our number is 110000110, because we deleted the least significant 4 bits again. The bitwise AND gives us 0110, which is 6. buffer is equal to 0x3078000000366361, as a string 0x6ca.
In the fourth step, our number is 11000, and we get 8. After the shift, value is equal to 1. buffer is 0x3078000038366361, as a string 0x86ca.
In the fifth step, our number is 1 in binary, and it gives us hexadecimal 1. After the shift, value is 0. buffer is 0x3078003138366361, as a string 0x186ca.
In the sixth step, value is already 0. Because of that, we get a 0 in hexadecimal. buffer is equal to 0x3078303138366361, as a string 0x0186ca. And this was the last step.
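We gave (100042, 3) to this function and it returned 0x0186ca. After the loop, the function also checks require(value == 0, "Strings: hex length insufficient"), which passes here because value ended at 0. You can convert these numbers using an online converter and you’ll see the result is correct.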
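To see what happens when the given length is too small, you could try something like the sketch below. HexDemo and its two functions are hypothetical names. I am also assuming that the imported library is the same version as the code shown in this article, since other versions may report the failure differently.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "@openzeppelin/contracts/utils/Strings.sol";

// Hypothetical contract, used only to try out the fixed-length overload.
contract HexDemo {
    // 100042 fits into 3 bytes, so this returns "0x0186ca".
    function fits() external pure returns (string memory) {
        return Strings.toHexString(uint256(100042), 3);
    }

    // With length 2 the loop only writes 4 hex characters (86ca).
    // value is still 1 afterwards, so the require in the library fails
    // and the call reverts instead of returning a truncated string.
    function tooShort() external pure returns (string memory) {
        return Strings.toHexString(uint256(100042), 2);
    }
}
```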
I said I would also inspect 346477235235092042596196240046333886 with length 20. But I am not a computer, and I get bored doing the same thing over and over. So, you can do it yourself. Here is the full code of the library that I inspected. I am sharing it because it may change in the future:
// SPDX-License-Identifier: MIT
// OpenZeppelin Contracts v4.4.1 (utils/Strings.sol)
pragma solidity ^0.8.0;

/**
 * @dev String operations.
 */
library Strings {
    bytes16 private constant _HEX_SYMBOLS = "0123456789abcdef";
    uint8 private constant _ADDRESS_LENGTH = 20;

    /**
     * @dev Converts a `uint256` to its ASCII `string` decimal representation.
     */
    function toString(uint256 value) internal pure returns (string memory) {
        // Inspired by OraclizeAPI's implementation - MIT licence
        // https://github.com/oraclize/ethereum-api/blob/b42146b063c7d6ee1358846c198246239e9360e8/oraclizeAPI_0.4.25.sol
        if (value == 0) {
            return "0";
        }
        uint256 temp = value;
        uint256 digits;
        while (temp != 0) {
            digits++;
            temp /= 10;
        }
        bytes memory buffer = new bytes(digits);
        while (value != 0) {
            digits -= 1;
            buffer[digits] = bytes1(uint8(48 + uint256(value % 10)));
            value /= 10;
        }
        return string(buffer);
    }

    /**
     * @dev Converts a `uint256` to its ASCII `string` hexadecimal representation.
     */
    function toHexString(uint256 value) internal pure returns (string memory) {
        if (value == 0) {
            return "0x00";
        }
        uint256 temp = value;
        uint256 length = 0;
        while (temp != 0) {
            length++;
            temp >>= 8;
        }
        return toHexString(value, length);
    }

    /**
     * @dev Converts a `uint256` to its ASCII `string` hexadecimal representation with fixed length.
     */
    function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
        bytes memory buffer = new bytes(2 * length + 2);
        buffer[0] = "0";
        buffer[1] = "x";
        for (uint256 i = 2 * length + 1; i > 1; --i) {
            buffer[i] = _HEX_SYMBOLS[value & 0xf];
            value >>= 4;
        }
        require(value == 0, "Strings: hex length insufficient");
        return string(buffer);
    }

    /**
     * @dev Converts an `address` with fixed length of 20 bytes to its not checksummed ASCII `string` hexadecimal representation.
     */
    function toHexString(address addr) internal pure returns (string memory) {
        return toHexString(uint256(uint160(addr)), _ADDRESS_LENGTH);
    }
}
