Let's talk about another storage project developed by the @SuiNetwork team, @WalrusProtocol 🧐🧐
Mysten Labs, the company behind Sui, has built a data layer project called @WalrusProtocol.
Walrus focuses on data storage and data availability. After researching it, my reaction fits in one word: "awesome." This is the best storage project I have seen.
On to the main text.
Decentralized storage projects mainly fall into two categories.
The first category uses full replication: each node stores a complete copy of the data, trading efficiency for security. Representative projects are @Filecoin and Arweave.
The second category uses Reed-Solomon erasure coding, which stores the original data as slices. Representative projects are @Storj and Sia.
————————————————————————————————
Explaining erasure codes in layman's terms
Erasure-coded storage needs some explanation. To be precise, it splits the original file into f+1 original slices and generates 2f additional repair slices, for 3f+1 slices in total. Each storage node saves a different slice, and any f+1 slices can reconstruct the original file.
If that is too technical, skip it and follow the example below.
Suppose we want to save 4 important numbers: [3, 7, 2, 5]. These 4 numbers are our "original slices."
Next, we need to generate additional slices:
Repair slice 1 = 3 + 7 + 2 + 5 = 17
Repair slice 2 = 3×1 + 7×2 + 2×3 + 5×4 = 43
Repair slice 3 = 3×1² + 7×2² + 2×3² + 5×4² = 129
Now we have 7 slices: [3, 7, 2, 5, 17, 43, 129], right?
Assuming the system has 7 nodes, we distribute them:
Zhang San: 3
Li Si: 7
Wang Wu: 2
Zhao Liu: 5
Qian Qi: 17
Sun Ba: 43
Zhou Jiu: 129
Suppose Li Si, Zhao Liu, and Zhou Jiu lose their data. We are left with: [3, _, 2, _, 17, 43, _].
So how do we recover the original data?
Remember the formulas for the additional slices? That's right, we solve a system of linear equations, with X and Y standing for the two lost numbers.
3 + X + 2 + Y = 17
3×1 + X×2 + 2×3 + Y×4 = 43
We get X=7, Y=5.
Of course, this is just a very simple example.
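The toy example above can be checked in a few lines of Python. This is only a sketch of the arithmetic in this post, not real Reed-Solomon coding (which works over finite fields); the function names `repair_slice` and `recover_two` are made up for illustration.

```python
from fractions import Fraction

data = [3, 7, 2, 5]  # the four original slices

def repair_slice(values, power):
    """Weighted sum with weights 1^power, 2^power, 3^power, 4^power."""
    return sum(v * (i + 1) ** power for i, v in enumerate(values))

# Repair slices 1, 2 and 3 use powers 0, 1 and 2.
repairs = [repair_slice(data, p) for p in range(3)]

def recover_two(known, r0, r1):
    """Recover two lost originals from repair slices 1 and 2.

    `known` maps 0-based position -> surviving value; exactly two of the
    four positions are expected to be missing.
    """
    a, b = [i for i in range(4) if i not in known]
    # Move the known terms to the right-hand side of each equation.
    s0 = r0 - sum(known.values())
    s1 = r1 - sum(v * (i + 1) for i, v in known.items())
    # Solve  x + y = s0  and  (a+1)*x + (b+1)*y = s1  by elimination.
    wa, wb = a + 1, b + 1
    y = Fraction(s1 - wa * s0, wb - wa)
    x = s0 - y
    return {a: x, b: y}

# Li Si (position 1) and Zhao Liu (position 3) lost their numbers.
recovered = recover_two({0: 3, 2: 2}, repairs[0], repairs[1])
print(repairs, recovered)
```

The third repair slice is not needed here because only two originals are missing; it would come into play if a third original were lost as well.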
You only need to remember what erasure codes achieve: the data stays recoverable as long as more than 1/3 of the nodes are functioning.
In other words, in an erasure-coded system each node stores only a data slice, and the data can be recovered as long as more than 1/3 of the nodes are operating. The nodes do need to be stable, though, because replacing one is expensive.
In a fully replicated system, by contrast, there must be full nodes that download a complete copy of all the data.
The former gives up some security for lower cost, while the latter buys security and stability at the price of heavy redundancy.
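To put rough numbers on that trade-off, here is a small sketch of the storage overhead of each approach. It assumes the f+1 / 2f split described earlier and that every slice is the same size; the helper names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Full replication: every one of `copies` nodes holds the whole file."""
    return float(copies)

def erasure_overhead(f: int) -> float:
    """Erasure coding: 3f+1 slices are stored, each 1/(f+1) of the file."""
    total_slices = 3 * f + 1
    slices_per_file = f + 1
    return total_slices / slices_per_file

print(replication_overhead(301))        # 301 full copies
print(round(erasure_overhead(100), 2))  # roughly 3x the file size
```

Under these assumptions the erasure-coded overhead stays close to 3x however many nodes join, while full replication grows with the number of copies.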
————————————————————————————————
Walrus's two-dimensional (2D) erasure code innovation
Walrus's approach finds a middle ground between the two. It still uses erasure coding at its core, but with a new encoding scheme called Red Stuff.
Red Stuff employs a more clever encoding method for data slicing. Remember the previous example of erasure codes?
To save 4 important numbers: [3, 7, 2, 5], we need to generate additional slices and finally solve the linear equation.
Let's explain Red Stuff with this example. The Red Stuff encoding method is a two-dimensional (2D) encoding algorithm, which you can think of as "Sudoku."
In Red Stuff encoding, [3, 7, 2, 5] is arranged as a 2×2 grid:
[3 7]
[2 5]
Assuming the encoding rules are,
Column 3 = Column 1 + Column 2
Column 4 = Column 1×2 + Column 2×2
Row 3 = Row 1 + Row 2
Row 4 = Row 1×2 + Row 2×2
With the additional slices, the grid becomes
[3 7 10 20]
[2 5 7 14]
[5 12 17 34]
[10 24 34 68]
Next, we distribute them to nodes by rows and columns:
Zhang San: 3 7 10 20, which is the first row
Li Si: 2 5 7 14, the second row
Wang Wu: 5 12 17 34, …
Zhao Liu: 10 24 34 68, …
Qian Qi: 3 2 5 10, the first column
Sun Ba: 7 5 12 24, …
Zhou Jiu: 10 7 17 34, …
Zheng Shi: 20 14 34 68, …
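Suppose Wang Wu loses his data, meaning the third row is lost. To rebuild the third number in that row, for example, he only needs to ask Zhang San (first row) and Li Si (second row) for the numbers 10 and 7. Since Row 3 = Row 1 + Row 2, the lost number is 10 + 7 = 17.
The other positions are rebuilt the same way, again by solving simple linear equations.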
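The grid example can be reproduced with a short sketch. It uses the toy rules from this post only, which are not a real erasure code (column 4 is just twice column 3) and not the actual Red Stuff encoding; `extend`, `encode_grid`, `from_column`, and `from_row` are illustrative names.

```python
def extend(pair):
    """Toy rule: from [a, b] produce [a, b, a + b, 2a + 2b]."""
    a, b = pair
    return [a, b, a + b, 2 * a + 2 * b]

def encode_grid(block):
    """Extend each row of a 2x2 block to 4 entries, then each column."""
    rows = [extend(r) for r in block]               # 2 rows x 4 columns
    cols = [extend(list(c)) for c in zip(*rows)]    # 4 columns x 4 entries
    return [list(r) for r in zip(*cols)]            # back to 4 rows

grid = encode_grid([[3, 7], [2, 5]])

def from_column(grid, c):
    """Rebuild the row-3 entry of column c from rows 1 and 2 (Row 3 rule)."""
    return grid[0][c] + grid[1][c]

def from_row(grid, r):
    """Rebuild the column-3 entry of row r from columns 1 and 2 (Column 3 rule)."""
    return grid[r][0] + grid[r][1]

for row in grid:
    print(row)
# The same lost symbol (row 3, column 3) can be rebuilt along either dimension.
print(from_column(grid, 2), from_row(grid, 2))
```

Both calls on the last line rebuild the same position, once from the two numbers above it and once from the two numbers to its left.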
From this simple, not fully rigorous example, we can summarize the characteristics of Red Stuff:
When recovering data, there is no need for complete rows or columns; only specific position data is needed. This feature can be called "locality."
Additionally, a number can be recovered from both row and column dimensions, which is "information reuse."
Furthermore, for complex data, one can first recover the more "easily" computable dimensions and then use the recovered data to compute the more difficult dimensions, which is "progressiveness."
In practice, suppose a file is encoded into 301 slices.
In a typical erasure code system, recovering 1 slice requires downloading 101 slices, while in Red Stuff, recovering a pair of slices requires only about 200 individual symbols.
Say we store a 1GB file on a system with 301 nodes. In a typical erasure code system, a node that fails has to download about 1GB to recover its slice. In Red Stuff, each node stores a main slice (3.3MB) plus a secondary slice (3.3MB), 6.6MB in total.
During recovery, only about 10MB of symbol data needs to be downloaded, saving about 99% of the bandwidth.
This design allows Walrus to maintain a large-scale decentralized storage network at an extremely low bandwidth cost, reducing recovery costs from O(|blob|) to O(|blob|/n). This is why Red Stuff is called "self-healing."
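The figures in this example can be sanity-checked with simple arithmetic. This sketch takes the post's numbers at face value (1GB treated as 1000MB, about 10MB downloaded during recovery) and assumes the 301 nodes follow the 3f+1 layout described earlier; it is not a model of Walrus's real slice sizes.

```python
n_nodes = 301
f = (n_nodes - 1) // 3            # 3f + 1 = 301
slices_needed = f + 1             # slices a typical erasure code must fetch

file_mb = 1000                    # 1GB file, taken as 1000MB
slice_mb = file_mb / n_nodes      # the post's per-slice figure
per_node_mb = 2 * slice_mb        # main slice + secondary slice

recovery_mb = 10                  # the post's approximate Red Stuff download
saving_pct = 100 * (file_mb - recovery_mb) / file_mb

print(f, slices_needed)
print(round(slice_mb, 1), round(per_node_mb, 1))
print(saving_pct)
```

The 101 slices quoted for a typical erasure code are simply f+1 with f = 100, and 10MB out of 1000MB is where the 99% saving comes from.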
Walrus also adds many security features. For example, it is the first protocol to support storage challenges in asynchronous networks.
The "challenges" here resemble an optimistic mechanism: nodes are spot-checked to confirm that they are actually storing the data.
Red Stuff also attaches verifiable cryptographic commitments to each slice, so each symbol can be verified independently.
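To make the idea of commitments and spot-checks concrete, here is a minimal sketch. The post does not describe Walrus's actual commitment scheme or challenge protocol, so this uses plain per-symbol SHA-256 hashes as a stand-in, and `commit`, `verify_symbol`, and `spot_check` are hypothetical names.

```python
import hashlib
import random

def commit(symbols):
    """Publish one SHA-256 digest per symbol (stand-in for a real commitment)."""
    return [hashlib.sha256(s).hexdigest() for s in symbols]

def verify_symbol(commitments, index, symbol):
    """Check a single symbol against its published digest."""
    return hashlib.sha256(symbol).hexdigest() == commitments[index]

def spot_check(commitments, node_storage, rng):
    """Challenge a node on one random position; it passes only if it can
    return a symbol that matches the commitment."""
    index = rng.randrange(len(commitments))
    symbol = node_storage.get(index)
    return symbol is not None and verify_symbol(commitments, index, symbol)

# Zhang San's row from the grid example, one byte per symbol.
symbols = [bytes([v]) for v in [3, 7, 10, 20]]
commitments = commit(symbols)

honest_node = dict(enumerate(symbols))
lazy_node = {}                      # stores nothing

rng = random.Random(0)
print(spot_check(commitments, honest_node, rng))
print(spot_check(commitments, lazy_node, rng))
```

Because every symbol has its own digest, a node helping with recovery cannot hand over a forged symbol without being caught, which is the "each symbol can be independently verified" property in miniature.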
To summarize the features:
1) First with asynchronous security: solving the trust issue of nodes in distributed storage;
2) Self-verifying: built-in anti-counterfeiting mechanism;
3) Progressive: handling dynamic changes in nodes;
4) Scalable: supporting hundreds to thousands of nodes;
In short, Walrus looks for the best balance between security and efficiency.
(This is the first part of the article)