# Introduction to Walrus: A Note on How the Red Stuff Protocol Works

**Published by:** [ContributionDAO](https://paragraph.com/@contributiondaoblog/)
**Published on:** 2025-02-17
**URL:** https://paragraph.com/@contributiondaoblog/introduction-to-walrus-a-note-on-how-the-red-stuff-protocol-works

## 1. Introduction

So far, we’ve discussed two decentralized storage networks: Celestia [1] and Espresso [2]. Unlike traditional decentralized storage systems that rely on full replication, where every node stores a complete copy of the original data, Celestia and Espresso use erasure coding. This method splits data into encoded fragments, allowing reconstruction from only a subset of these fragments. By requiring nodes to store only a small fraction of the encoded data (much smaller than the full original data), these systems significantly reduce per-node storage overhead.

Despite these advantages, recovering data from failed nodes remains costly. Each failed node must download an amount of data proportional to the size of a blob (the original data), leading to a recovery cost of O(blob size) per node. Additionally, most decentralized storage systems utilizing erasure codes rely on Reed-Solomon (RS) codes, which involve polynomial arithmetic over finite fields. While effective, these operations introduce significant computational overhead during encoding and decoding.

These challenges highlight the limitations of current approaches, especially in permissionless environments, where nodes frequently experience faults, lose data, or store corrupted data due to network instability or adversarial behavior. As a result, frequent data recovery is required, further increasing costs.

To address these challenges, Mysten Labs introduces Walrus, a novel approach to decentralized storage. By leveraging a two-dimensional encoding technique called Red Stuff, Walrus reduces computational overhead and lowers data recovery costs to levels proportional to the amount of lost data.
It ensures efficient data reconstruction, regeneration, and resilience, making it well-suited for permissionless environments. In short, Walrus provides a scalable and cost-effective solution for decentralized data storage.

## 2. Before Jumping into Technicals

Walrus, as a decentralized storage system, consists of many components, including Red Stuff, economics and incentives, and committee reconfiguration. While it’s impossible to cover all these aspects in detail here, this article focuses on the core innovation that makes Walrus unique: Red Stuff. However, to provide some context, I’ll briefly touch on other key components so you can explore them further if you're interested.

Let’s start with economics and incentives. As a permissionless decentralized storage system, Walrus needs mechanisms to ensure all participants act honestly. To achieve this, it uses a delegated proof-of-stake (dPoS) protocol. Like other dPoS-based systems, Walrus has its own token, WAL, which users can delegate to secure the network. Storage nodes handle data in proportion to their staked WAL tokens. However, unlike many decentralized storage systems, staking, governance, and payment handling don’t occur directly on Walrus. Instead, these functions are managed on Sui, a high-performance blockchain. This approach offers several benefits:

- **Data storage focus** – Walrus can focus solely on data storage, while Sui handles metadata management and the control plane.
- **Sui-native interoperability** – Proofs that data is stored on Walrus (proofs of availability) exist as objects on Sui, making them programmable and easy to integrate with smart contracts.
- **Efficiency** – By leveraging Sui, Walrus avoids the complexity of building a custom blockchain or VM, benefiting from Sui’s speed for economic and governance processes.

Next, committee reconfiguration: since Walrus operates permissionlessly, the set of storage nodes (the storage committee) is not static and evolves over time.
Walrus operates in epochs, similar to Sui, where each epoch assumes a fixed set of storage nodes. At the end of an epoch, committee reconfiguration occurs: the staked WAL tokens of each node are recalculated, a new committee is formed, and the previous committee transfers its data to the new one.

These explanations provide a general overview of Walrus, excluding its encoding process. Other aspects, such as ensuring that storage nodes reliably store assigned data or determining storage prices and payments, aren’t covered here. If you’re interested in diving deeper, I encourage you to explore the Walrus whitepaper [3]. Now that we’ve set the stage, let’s jump into the heart of Walrus: the Red Stuff protocol.

## 3. Definition of Erasure Code

An erasure code is a method that adds redundant data to an original message, extending it into a longer message called a codeword, such that only a subset of the codeword is needed to reconstruct the original message. More formally, a message is divided into $$k$$ smaller pieces, called message symbols. The erasure code then transforms these $$k$$ symbols into $$n$$ encoded symbols, where $$n > k$$, ensuring that any subset of $$k'$$ symbols is sufficient to recover the message.

In optimal cases, such as Reed-Solomon (RS) codes, which are widely used in decentralized storage systems, the minimum subset required for reconstruction is exactly $$k$$ symbols ($$k' = k$$), guaranteeing full recovery. However, encoding and decoding in RS codes are computationally expensive because they rely on polynomial arithmetic over finite fields. Walrus instead adopts RaptorQ [5], a Fountain code, for its encoding algorithm. RaptorQ uses XOR operations, making it significantly faster and more efficient than RS codes. Although RaptorQ’s encoding protocol is probabilistic—meaning it requires $$k' \geq k$$ symbols for successful decoding—the probability of recovery approaches 1 very quickly as $$k'$$ slightly exceeds $$k$$.
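To make the definition concrete, here is a toy erasure code of my own (purely illustrative, not Walrus's actual encoder): $$k = 2$$ message symbols are extended to $$n = 3$$ encoded symbols by appending an XOR parity symbol, and any 2 of the 3 encoded symbols recover the message ($$k' = k = 2$$).

```python
# Toy erasure code: k = 2 message symbols, n = 3 encoded symbols.
# The third symbol is the XOR parity of the first two, so ANY two of
# the three codeword positions suffice to reconstruct the message.
def encode(a: int, b: int) -> list[int]:
    return [a, b, a ^ b]  # codeword = message symbols + XOR parity

def decode(symbols: dict[int, int]) -> tuple[int, int]:
    """Recover (a, b) from any 2 of the 3 codeword positions {0, 1, 2}."""
    if 0 in symbols and 1 in symbols:
        return symbols[0], symbols[1]
    if 0 in symbols and 2 in symbols:           # b = a XOR parity
        return symbols[0], symbols[0] ^ symbols[2]
    return symbols[1] ^ symbols[2], symbols[1]  # a = b XOR parity

codeword = encode(0x5A, 0x3C)
# Lose position 1; positions 0 and 2 still reconstruct the message:
assert decode({0: codeword[0], 2: codeword[2]}) == (0x5A, 0x3C)
```

This toy code also hints at why XOR-based schemes like RaptorQ are cheap: encoding and decoding are a handful of XORs rather than finite-field polynomial arithmetic.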
As a result, it is generally safe to assume $$k' = k$$ in practice.

In summary, both Reed-Solomon and RaptorQ follow the same fundamental principle: they take $$k$$ original symbols, extend them to $$n$$ encoded symbols, and allow the original message to be reconstructed from any $$k$$ symbols. The key difference lies in how these operations are performed—Reed-Solomon relies on polynomial arithmetic over finite fields, while RaptorQ is XOR-based, making it computationally lightweight. For simplicity, rather than focusing on specific encoding protocols, from this point onwards I will discuss everything in the broader context of erasure coding without explicitly referring to the underlying algorithm.

## 4. The Idea Behind Walrus: Red Stuff

### 4.1 The Straightforward Approach to Distributed Storage

Red Stuff is a 2D erasure code, meaning an original message is structured in matrix form before encoding. But why do we need 2D encoding—is 1D encoding not enough? While 1D encoding is possible, it is not optimal for failed nodes recovering their missing data. Let’s see why.

Assume we are designing a decentralized storage system with $$n$$ nodes and want to store a blob $$B$$ of size $$|B|$$, where $$B$$ can be divided into $$f + 1$$ symbols (meaning $$k = f + 1$$). We then encode $$B$$ into $$n$$ symbols, where $$n = 3f + 1$$. Each symbol is then distributed to a separate node. From this setup, we observe that the total storage required across the network is about three times the original message size (since $$(3f + 1)/(f + 1) \approx 3$$), regardless of the number of nodes. This is far more efficient than full replication, where the storage requirement scales linearly with the number of nodes.

*Figure 1: Writing Process in the Straightforward 1D Erasure Approach*

To retrieve the message, a data reader requests symbols from nodes in the network until it collects $$f + 1$$ symbols, at which point the entire message can be reconstructed.
The amount of data required for recovery is equal to the size of the original message, $$O(|B|)$$. This is reasonable for data readers.

*Figure 2: Reading Process in the Straightforward 1D Erasure Approach*

However, this method is not optimal for failed nodes that need to recover their lost symbols. Since failed nodes follow the same recovery process as data readers but discard the symbols they don’t need, they end up downloading significantly more data than necessary. In the worst-case scenario, where up to $$f$$ nodes fail, the total bandwidth cost for the network is $$O(n|B|)$$, which scales linearly with the number of nodes—making it inefficient for large decentralized systems.

*Figure 3: Recovery Process in the Straightforward 1D Erasure Approach*

### 4.2 Red Stuff: Improving Recovery with the Twin-Code Framework

Fortunately, the Twin-Code framework [4] significantly reduces the cost for a failed node to recover its lost data. Instead of downloading data equal to the blob size, $$O(|B|)$$, Twin-Code allows failed nodes to recover their missing data by downloading only $$O(|B|/n)$$—approximately the size of the data they need to restore.

In this subsection, I’ll provide a high-level overview of the Red Stuff protocol, which reduces recovery costs by incorporating the Twin-Code framework into erasure coding. I will avoid mathematical details as much as possible; the next section covers the full mathematical breakdown.

Assume we have a blob $$B$$ of size $$|B|$$ as a message. The key idea behind Twin-Code is to apply 2D encoding to $$B$$, producing two encoded messages, which we refer to as the primary encoding and the secondary encoding. As always, we assume a network of $$n$$ nodes, where $$n = 3f + 1$$. For simplicity, we assume that each node has equal power, meaning it is responsible for managing the same amount of data. However, this assumption does not hold in a permissionless dPoS system, where a node’s power depends on its staked WAL tokens.
As a result, nodes must manage data in proportion to their stake rather than sharing it equally.

#### 4.2.1 Encoding Process

1. Partition the message into a 2D matrix with $$f + 1$$ rows and $$2f + 1$$ columns, creating an $$(f + 1) \times (2f + 1)$$ matrix where each matrix element is a symbol.
2. Compute the primary encoded message by encoding each column of the original message matrix, extending its $$f + 1$$ elements to $$n = 3f + 1$$ elements, resulting in a $$(3f + 1) \times (2f + 1)$$ matrix.
3. Compute the secondary encoded message by encoding each row of the original message matrix, extending its $$2f + 1$$ elements to $$n = 3f + 1$$ elements, forming an $$(f + 1) \times (3f + 1)$$ matrix.
4. Partition the encoded messages:
   - Primary encoded message → divided row-wise into $$3f + 1$$ chunks, each called a primary sliver.
   - Secondary encoded message → divided column-wise into $$3f + 1$$ chunks, each called a secondary sliver.
5. Distribute the encoded data:
   - Sliver pair $$i$$ consists of the $$i$$-th primary and secondary slivers.
   - Since there are $$n = 3f + 1$$ sliver pairs, each node receives a unique pair.

*Figure 4: Red Stuff - Encoding phase of the writing process & Distributing phase of the writing process*

#### 4.2.2 Reading Process

The primary encoded message serves the same purpose as in 1D encoding—it is used for the read protocol. A data reader can request slivers from nodes in the network until it obtains $$f + 1$$ primary slivers, allowing it to reconstruct the original message. The secondary encoded message can also be used to reconstruct the original message. Although it requires $$2f + 1$$ secondary slivers, the recovery cost remains the same as in the primary case.
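The encoding steps above can be sketched for a tiny network. This is my own illustration: a systematic polynomial (Lagrange) code over the rationals stands in for the linear erasure code Walrus actually uses, and $$f = 1$$ and the blob values are arbitrary.

```python
from fractions import Fraction

# Red Stuff encoding sketch for f = 1, n = 3f + 1 = 4 nodes.
f = 1
n = 3 * f + 1

def lagrange_at(pts, x):
    """Evaluate the unique polynomial through pts = [(xi, yi), ...] at x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(pts):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(pts):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def extend(symbols, m):
    """Systematic erasure encode: len(symbols) symbols -> m symbols."""
    return [lagrange_at(list(enumerate(symbols)), x) for x in range(m)]

# Step 1: partition the blob into an (f+1) x (2f+1) matrix of symbols.
M = [[3, 1, 4],
     [1, 5, 9]]

# Step 2: primary encoding -- extend each COLUMN from f+1 to n symbols.
cols = [extend([row[c] for row in M], n) for c in range(2 * f + 1)]
E_p = [[cols[c][r] for c in range(2 * f + 1)] for r in range(n)]  # n x (2f+1)

# Step 3: secondary encoding -- extend each ROW from 2f+1 to n symbols.
E_s = [extend(row, n) for row in M]                               # (f+1) x n

# Steps 4-5: node i stores sliver pair i =
# (row i of the primary encoding, column i of the secondary encoding).
slivers = [(E_p[i], [E_s[r][i] for r in range(f + 1)]) for i in range(n)]

# The code is systematic: the first f+1 primary slivers ARE the blob rows.
assert E_p[: f + 1] == M
```

Each of the $$n$$ nodes thus holds one primary sliver of $$2f+1$$ symbols and one secondary sliver of $$f+1$$ symbols.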
The total cost of recovering data using one type of sliver equals the number of required slivers of that type multiplied by the size of each sliver of that type:

- **Primary slivers:** each primary sliver contains $$2f+1$$ symbols, and $$f+1$$ primary slivers are required, so the total cost is $$(2f+1) \times (f+1)$$.
- **Secondary slivers:** each secondary sliver contains $$f+1$$ symbols, and $$2f+1$$ secondary slivers are required, so the total cost is $$(f+1) \times (2f+1)$$.

The downsides of using the secondary encoded message are:

- Each storage node must store slightly more data, as it stores both primary and secondary slivers rather than only one type of sliver.
- A data reader needs to interact with twice as many nodes compared to using the primary encoded message.

#### 4.2.3 Recovery Process

The real advantage of having both encoded messages is that one type of encoded message allows efficient recovery of the other. More specifically, a failed node can:

- Request $$2f + 1$$ nodes, where each node responds with a scalar calculated from its stored secondary sliver, allowing the failed node to recover its missing primary sliver.
- Request $$f + 1$$ nodes, where each node responds with a scalar calculated from its stored primary sliver, allowing the failed node to recover its missing secondary sliver.

Thus, a failed node needs to download only $$(2f + 1) + (f + 1) = 3f + 2 = O(n)$$ scalars in total to recover its missing sliver pair.

*Figure 5a: Red Stuff - Secondary sliver reconstruction*
*Figure 5b: Red Stuff - Primary sliver reconstruction*

Since the primary encoded message contains $$(3f + 1) \times (2f + 1) = O(n^2)$$ symbols (where $$n = 3f + 1$$), and the same applies to the secondary encoded message, the system stores a total of $$O(n^2)$$ symbols.
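The recovery exchange above can be sketched at the symbol level. Again this is my own illustration, assuming a systematic polynomial (Lagrange) code over the rationals as a stand-in for the real erasure code; $$f$$, the blob values, the failed node $$F$$, and the helper set are arbitrary choices.

```python
from fractions import Fraction

# Each helper derives ONE scalar from its secondary sliver; 2f+1 such
# scalars let the failed node rebuild its missing primary sliver.
f = 1
n = 3 * f + 1                                       # 4 nodes

def lagrange_at(pts, x):
    """Evaluate the unique polynomial through pts = [(xi, yi), ...] at x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(pts):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(pts):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def extend(symbols, m):
    """Systematic erasure encode: len(symbols) symbols -> m symbols."""
    return [lagrange_at(list(enumerate(symbols)), x) for x in range(m)]

M = [[3, 1, 4], [1, 5, 9]]                          # (f+1) x (2f+1) blob
E_s = [extend(row, n) for row in M]                 # columns = secondary slivers

# Node F lost its primary sliver (row F of the primary encoding).
# Each helper l column-encodes its own secondary sliver and sends only
# the F-th symbol of the result: one scalar per helper.
F, helpers = 3, [0, 2, 3]                           # any 2f+1 helper nodes
scalars = [(l, extend([E_s[r][l] for r in range(f + 1)], n)[F])
           for l in helpers]

# Decoding those 2f+1 scalars yields primary sliver F exactly:
recovered = [lagrange_at(scalars, c) for c in range(2 * f + 1)]
expected = [extend([M[0][c], M[1][c]], n)[F] for c in range(2 * f + 1)]
assert recovered == expected
```

The scalar sent by helper $$l$$ is precisely the symbol the 2D code would place at position $$(F, l)$$ of the fully extended matrix, which is why either encoding can regenerate the other.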
Because the original message has size $$|B|$$ and there are $$n$$ nodes, we have:

- The size of each symbol: $$O(|B|/n^2)$$
- The amount of data each node stores: $$O(|B|/n)$$

As a scalar has the same size as a symbol, the total bandwidth cost for a failed node equals the size of each scalar multiplied by the number of scalars needed to recover a sliver pair: $$O\left(\frac{|B|}{n^2} \times n\right) = O\left(\frac{|B|}{n}\right)$$ which is the same as the amount of data each node stores.

That’s it! This is a high-level overview of the Red Stuff protocol and how it improves the failed-node recovery cost. If this explanation feels incomplete, don’t worry—in the next section, we’ll dive into the full mathematical details together.

## 5. Red Stuff Encoding Protocol: The Math Version

This section provides an overview of the math behind Red Stuff. I’ll keep it concise—the math won’t be rigorously formal but will be sufficient to convey the key ideas. We’ll introduce notation, walk through the encoding and decoding processes, and explain how a failed node can efficiently recover its sliver pair using only $$O(|B|/n)$$ download bandwidth.

We assume the encoding algorithm is a linear code (both Reed-Solomon and RaptorQ are linear codes). A linear code is an error-correcting code in which every codeword is a linear combination of symbols from an input message. Formally, a linear code can be represented as a linear transformation using matrix notation: $$E = GM$$ where:

- $$G$$ is the generator matrix.
- $$M$$ is the input message matrix (the original message, or blob).
- $$E$$ is the encoded matrix (a codeword).

This equation shows that the encoded message $$E$$ is obtained by applying the transformation $$G$$ to the message $$M$$. Note: in standard coding theory, encoding is often written as $$E = MG$$.
However, to maintain consistency with the matrix definitions in the Walrus whitepaper, we use the $$E = GM$$ notation in this article.

### 5.1 Encoding Process

Now, let’s define our parameters:

- Message/blob $$B$$: an $$(f+1) \times (2f+1)$$ matrix, denoted $$M_p$$.
- Transposed message $$M_s$$: a $$(2f+1) \times (f+1)$$ matrix, where $$M_s = M_p^T$$ (used for secondary encoding, with $$T$$ denoting the transpose operation).
- Primary encoded message $$E_p$$: a $$(3f+1) \times (2f+1)$$ matrix.
- Secondary encoded message $$E_s$$: a $$(3f+1) \times (f+1)$$ matrix.
- Generator matrix for primary encoding $$G_p$$: a $$(3f+1) \times (f+1)$$ matrix.
- Generator matrix for secondary encoding $$G_s$$: a $$(3f+1) \times (2f+1)$$ matrix.

Additionally, we define the following row-wise notation:

- The $$l$$-th row of $$E_p$$: a $$1 \times (2f+1)$$ matrix, denoted $$e_p^l$$.
- The $$l$$-th row of $$E_s$$: a $$1 \times (f+1)$$ matrix, denoted $$e_s^l$$.
- The $$l$$-th row of $$G_p$$: a $$1 \times (f+1)$$ matrix, denoted $$g_p^l$$.
- The $$l$$-th row of $$G_s$$: a $$1 \times (2f+1)$$ matrix, denoted $$g_s^l$$.
- The $$l$$-th column of $$M_p$$: an $$(f+1) \times 1$$ matrix, denoted $$\bar{m}_p^l$$.

Without loss of generality, let node $$l$$ store sliver pair $$l$$, which consists of:

- The $$l$$-th primary sliver: the $$l$$-th row of $$E_p$$, denoted $$e_p^l$$.
- The $$l$$-th secondary sliver: the $$l$$-th row of $$E_s$$, denoted $$e_s^l$$.

Since $$E_p$$ can be written in matrix form as $$E_p = G_p M_p$$, we can rewrite this equation using the vector notation introduced earlier:

$$E_p = \begin{pmatrix} e_p^1 \\ e_p^2 \\ e_p^3 \\ \vdots \end{pmatrix}, \quad G_p = \begin{pmatrix} g_p^1 \\ g_p^2 \\ g_p^3 \\ \vdots \end{pmatrix}, \quad M_p = \begin{pmatrix} \bar{m}_p^1 & \bar{m}_p^2 & \bar{m}_p^3 & \cdots \end{pmatrix}$$

Here, each $$ e_p^l $$ and $$ g_p^l $$ represents a row vector, and each $$
\bar{m}_p^l $$ represents a column vector, leading to:

$$\begin{pmatrix} e_p^1 \\ e_p^2 \\ e_p^3 \\ \vdots \end{pmatrix} = \begin{pmatrix} g_p^1 \\ g_p^2 \\ g_p^3 \\ \vdots \end{pmatrix} \begin{pmatrix} \bar{m}_p^1 & \bar{m}_p^2 & \bar{m}_p^3 & \cdots \end{pmatrix} = \begin{pmatrix} g_p^1 \bar{m}_p^1 & g_p^1 \bar{m}_p^2 & g_p^1 \bar{m}_p^3 & \cdots \\ g_p^2 \bar{m}_p^1 & g_p^2 \bar{m}_p^2 & g_p^2 \bar{m}_p^3 & \cdots \\ g_p^3 \bar{m}_p^1 & g_p^3 \bar{m}_p^2 & g_p^3 \bar{m}_p^3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

$$\begin{pmatrix} e_p^1 \\ e_p^2 \\ e_p^3 \\ \vdots \end{pmatrix} = \begin{pmatrix} g_p^1 \begin{pmatrix} \bar{m}_p^1 & \bar{m}_p^2 & \bar{m}_p^3 & \cdots \end{pmatrix} \\ g_p^2 \begin{pmatrix} \bar{m}_p^1 & \bar{m}_p^2 & \bar{m}_p^3 & \cdots \end{pmatrix} \\ g_p^3 \begin{pmatrix} \bar{m}_p^1 & \bar{m}_p^2 & \bar{m}_p^3 & \cdots \end{pmatrix} \\ \vdots \end{pmatrix} = \begin{pmatrix} g_p^1 M_p \\ g_p^2 M_p \\ g_p^3 M_p \\ \vdots \end{pmatrix}$$

From this, we see that the $$l$$-th row of $$E_p$$ is obtained by applying the $$l$$-th row of $$G_p$$ to the entire message $$M_p$$. The same process can be applied to $$E_s$$, and the results can be represented in row-wise form as $$e_p^l = g_p^l M_p$$ and $$e_s^l = g_s^l M_s$$.
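This row-wise identity can be checked numerically. The Vandermonde generator matrix below is my own illustrative choice (any generator matrix whose $$(f+1)$$-row submatrices are invertible would do), with exact rational arithmetic standing in for the code's finite-field arithmetic.

```python
from fractions import Fraction

# Check: the l-th row of E_p = G_p M_p equals g_p^l M_p, row by row.
f = 1
n = 3 * f + 1

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

M_p = [[3, 1, 4], [1, 5, 9]]                                  # (f+1) x (2f+1)
G_p = [[Fraction(l) ** j for j in range(f + 1)]               # Vandermonde
       for l in range(n)]                                     # (3f+1) x (f+1)

E_p = matmul(G_p, M_p)              # full primary encoding, n x (2f+1)
for l in range(n):                  # e_p^l = g_p^l M_p for every row l
    assert [E_p[l]] == matmul([G_p[l]], M_p)
```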
Hence, each node $$l$$ stores the pair:

$$(e_p^l, e_s^l) = (g_p^l M_p, g_s^l M_s), \quad \text{where} \quad l \in \{1, 2, \dots, n = 3f+1\}$$

### 5.2 Reading Process: Recovering the Original Message

A reader who wants to retrieve the original message $$M_p$$ can do so by requesting slivers from nodes in the network until it collects at least $$f+1$$ primary slivers, where $$e_p^l = g_p^l M_p$$. Assume, without loss of generality, that the reader receives $$f+1$$ primary slivers from nodes $$1$$ to $$f+1$$, meaning it collects the subset:

$$E'_p = \begin{pmatrix} e_p^1 \\ e_p^2 \\ \vdots \\ e_p^{f+1} \end{pmatrix} = \begin{pmatrix} g_p^1 \\ g_p^2 \\ \vdots \\ g_p^{f+1} \end{pmatrix} M_p = G'_p M_p$$

Note that although the reader only receives primary slivers (i.e., $$E'_p$$), it is assumed to have access to $$g_p^l$$ and can construct $$G'_p$$ using a predefined deterministic mechanism and shared parameters. Here, $$G'_p$$ is an $$(f+1) \times (f+1)$$ matrix, where each row of $$G'_p$$ corresponds to a unique row of $$G_p$$. Due to the properties of linear encoding, $$f+1$$ distinct primary slivers are sufficient to decode $$E'_p$$ and recover $$M_p$$. This means the reader can solve the linear equation $$E'_p = G'_p M_p$$ by inverting $$G'_p$$, allowing it to recover the entire message:

$$M_p = (G'_p)^{-1} E'_p$$

The same decoding method can be applied to $$E'_s$$, but it requires $$2f+1$$ secondary slivers. The downsides of this approach were discussed in Section 4.2 above. However, as we’ll see next, secondary slivers are critical for failed nodes to recover missing data.

### 5.3 Recovery Process: How a Failed Node Recovers Its Sliver Pair

If a node $$F$$ fails, it needs to recover the sliver pair it is supposed to store:

$$(e_p^F, e_s^F) = (g_p^F M_p, g_s^F M_s)$$

**Step 1: Request Scalars from Other Nodes.** The failed node $$F$$ begins the recovery process by requesting information from other nodes.
Unlike the reading process, where a node $$l$$ responds to a reader with its primary sliver $$g_p^l M_p$$, here node $$l$$ first multiplies its primary sliver by the transpose of $$g_s^F$$ before responding. Since each node is assumed to have access to $$g_s^i$$ for any $$i$$ via a predefined deterministic mechanism, this results in a scalar (a $$1 \times 1$$ matrix):

$$g_p^l M_p ({g_s}^F)^T$$

**Step 2: Construct a Recovery Equation.** Once the failed node collects $$f+1$$ scalars from distinct nodes (assume, for simplicity, that these come from nodes $$1, 2, \dots, f+1$$), it constructs the matrix $$\tilde{E}_p$$ as:

$$\tilde{E}_p = \begin{pmatrix} g_p^1 M_p ({g_s}^F)^T \\ g_p^2 M_p ({g_s}^F)^T \\ \vdots \\ g_p^{f+1} M_p ({g_s}^F)^T \end{pmatrix}$$

Rearranging the equation:

$$\tilde{E}_p = \begin{pmatrix} g_p^1 \\ g_p^2 \\ \vdots \\ g_p^{f+1} \end{pmatrix} M_p ({g_s}^F)^T$$

Here, the failed node knows $$\begin{pmatrix} g_p^1 \\ g_p^2 \\ \vdots \\ g_p^{f+1} \end{pmatrix}$$ and $$\tilde{E}_p$$, but needs to solve for $$M_p ({g_s}^F)^T$$.

**Step 3: Solve for Missing Data.** Define:

$$\mu = M_p ({g_s}^F)^T$$

Then the equation simplifies to $$\tilde{E}_p = G'_p \mu$$. Since this has the same form as the earlier decoding equation $$E'_p = G'_p M_p$$, the failed node solves for $$\mu$$ by inverting $$G'_p$$:

$$\mu = (G'_p)^{-1} \tilde{E}_p$$

**Step 4: Recover the Secondary Sliver.** Once the failed node has $$\mu$$, it transposes the result:

$$\mu^T = (M_p ({g_s}^F)^T)^T = g_s^F M_p^T = g_s^F M_s = e_s^F$$

Thus, the failed node successfully recovers its secondary sliver.

**Step 5: Recover the Primary Sliver.** Similarly, the failed node requests scalars of the form $$g_s^l M_s ({g_p}^F)^T$$ from $$2f+1$$ nodes and follows Steps 2-4 to reconstruct its primary sliver: $$e_p^F = g_p^F M_p$$.

### 5.4 Efficiency of Red Stuff vs. Traditional Methods

The key advantage of Red Stuff is its low-bandwidth recovery process.
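The reading equation $$M_p = (G'_p)^{-1} E'_p$$ and recovery Steps 1-4 can be traced end to end in code. This is a sketch under my own assumptions: a small Vandermonde generator matrix over exact rationals stands in for the real generator matrices, and $$f$$, the blob values, and the node indices are arbitrary.

```python
from fractions import Fraction

# Reading and recovery sketch for f = 1, n = 3f + 1 = 4 nodes.
f = 1
n = 3 * f + 1

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def invert(A):
    """Gauss-Jordan inversion over exact rationals."""
    m = len(A)
    aug = [[Fraction(x) for x in row] +
           [Fraction(1 if i == j else 0) for j in range(m)]
           for i, row in enumerate(A)]
    for i in range(m):
        piv = next(r for r in range(i, m) if aug[r][i] != 0)
        aug[i], aug[piv] = aug[piv], aug[i]
        p = aug[i][i]
        aug[i] = [x / p for x in aug[i]]
        for r in range(m):
            if r != i:
                fac = aug[r][i]
                aug[r] = [x - fac * y for x, y in zip(aug[r], aug[i])]
    return [row[m:] for row in aug]

M_p = [[3, 1, 4], [1, 5, 9]]                        # (f+1) x (2f+1) blob
M_s = [list(col) for col in zip(*M_p)]              # M_s = M_p^T
gp = lambda l: [Fraction(l) ** j for j in range(f + 1)]       # row g_p^l
gs = lambda l: [Fraction(l) ** j for j in range(2 * f + 1)]   # row g_s^l

# Reading: any f+1 primary slivers e_p^l = g_p^l M_p recover M_p.
readers = [2, 3]
G_read = [gp(l) for l in readers]                   # G'_p
E_read = matmul(G_read, M_p)                        # collected slivers E'_p
assert matmul(invert(G_read), E_read) == M_p        # M_p = (G'_p)^{-1} E'_p

# Recovery, Step 1: helpers send scalars g_p^l M_p (g_s^F)^T.
F = 3                                               # the failed node
gsF_T = [[x] for x in gs(F)]                        # column vector (g_s^F)^T
helpers = [0, 1]                                    # any f+1 nodes
scalars = [matmul([gp(l)], matmul(M_p, gsF_T))[0][0] for l in helpers]

# Steps 2-3: solve E~_p = G'_p mu for mu = M_p (g_s^F)^T.
G_prime = [gp(l) for l in helpers]
mu = matmul(invert(G_prime), [[s] for s in scalars])

# Step 4: mu^T is exactly the missing secondary sliver e_s^F = g_s^F M_s.
assert [row[0] for row in mu] == matmul([gs(F)], M_s)[0]
```

Step 5 is symmetric: the same solve with $$G_s$$, $$M_s$$, and $$2f+1$$ helpers yields the primary sliver.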
By following the previously mentioned process, node $$F$$ can recover its sliver pair by requesting:

- $$f+1$$ scalars of the form $$g_p^l M_p ({g_s}^F)^T$$ to reconstruct its secondary sliver.
- $$2f+1$$ scalars of the form $$g_s^l M_s ({g_p}^F)^T$$ to reconstruct its primary sliver.

Each scalar is the size of a single symbol, $$O(|B|/n^2)$$, and the failed node downloads $$O((f+1) + (2f+1)) = O(n)$$ of them, so the total download bandwidth per failed node is:

$$O\left(\frac{|B|}{n^2} \times n\right) = O(|B|/n)$$

which is significantly more efficient than the $$O(|B|)$$ bandwidth required in traditional erasure-coded storage systems.

### 5.5 Connection with the White Paper

Figure 3 in the white paper might seem counterintuitive at first. It illustrates a situation where a node can recover missing symbols of one type by requesting encoded slivers of the other type, even though those symbols do not appear to exist. Below is a description of Figure 3 from the white paper:

In a network with four nodes, node 4 is missing symbols $$S_{41}$$, $$S_{42}$$, $$S_{43}$$, $$S_{14}$$, and $$S_{24}$$, so it attempts to recover them. It requests $$S_{34}$$ from node 3, even though $$S_{34}$$ does not explicitly exist in the encoding process. However, node 4 successfully reconstructs $$S_{24}$$ using $$S_{14}$$ and $$S_{34}$$. This raises a crucial question: how does node 3 generate $$S_{34}$$ if it was never directly encoded?

Let’s connect this observation with the mathematical framework explained earlier. Recall that node $$l$$ can assist node $$F$$ in recovering its secondary sliver by sending:

$$g_p^l M_p (g_s^F)^T = e^l_p (g_s^F)^T.$$

Rearranging this equation slightly, we obtain:

$$e^l_p (g_s^F)^T = (g_s^F (e^l_p)^T)^T = (g_s^F \Gamma)^T,$$

where $$\Gamma = (e^l_p)^T$$. Since we know that $$g_p^l M_p (g_s^F)^T = (g_s^F \Gamma)^T$$ is a scalar and remains invariant under transposition, we can ignore the transpose and focus only on the term $$g_s^F \Gamma$$.
Viewing $$\Gamma$$ as a message being encoded by $$G_s$$, we get:

$$E_s(\Gamma) = G_s \Gamma.$$

Thus, the $$F$$-th row of $$E_s(\Gamma)$$, denoted $$e^F_s(\Gamma)$$, is simply:

$$e^F_s(\Gamma) = g^F_s \Gamma = g_s^F (e^l_p)^T.$$

This equation tells us that node $$l$$ encodes the transpose of its primary sliver $$e^l_p$$ using the generator matrix $$G_s$$, extracts only the $$F$$-th row from the result, and sends it to node $$F$$. Returning to the example in the white paper, node 3 follows these steps to assist node 4:

1. Compute the transpose of its primary sliver: $$(e^3_p)^T$$.
2. Encode it using the secondary generator matrix $$G_s$$: $$G_s (e^3_p)^T$$.
3. Extract only the 4th row of the encoded result: $$g^4_s (e^3_p)^T$$.
4. Send $$g^4_s (e^3_p)^T$$ to node 4.

Thus, one can interpret this process as node 3 encoding the transpose of its primary sliver before extracting only the 4th row and sending it to node 4.

## 6. Summary

Most traditional data availability solutions rely on RS codes, which impose high computational overhead and bandwidth costs for data recovery. Walrus introduces Red Stuff, a novel encoding technique that leverages the Twin-Code framework and a Fountain code to address these inefficiencies. With Red Stuff, nodes benefit from lightweight computational overhead and efficient data recovery, with bandwidth usage proportional to the amount of lost data rather than the total dataset size. These advantages make Walrus a cost-efficient, reliable, and scalable solution, positioning it as one of the most attractive decentralized data storage systems.

## References

[1] ContributionDAO, “Celestia: A Summary of How Fraud Proofs and Data Availability Proofs Work.” https://mirror.xyz/contributiondaoblog.eth/_VknA2qiU0AF2aHcfCRxNjNQ2zLgOOWDnDXjs7fQR1o

[2] ContributionDAO, “Espresso System.” https://x.com/contributedao/status/1813444809442644098

[3] Mysten Labs Team, “Walrus: An Efficient Decentralized Storage Network,” Version 1.0, September 16, 2024.
https://docs.walrus.site/walrus.pdf

[4] K. V. Rashmi, N. B. Shah, and P. V. Kumar, “Enabling Node Repair in Any Erasure Code for Distributed Storage,” IEEE International Symposium on Information Theory (ISIT), July 2011. https://ieeexplore.ieee.org/document/6033732/

[5] Qualcomm Incorporated, “RaptorQ™ Technical Overview,” 2010. https://www.qualcomm.com/content/dam/qcomm-martech/dm-assets/documents/RaptorQ_Technical_Overview.pdf