St. Petersburg State University
The network storage systems are treated, organized into several groups of devices with data redundancy capability. We consider two schemes for managing data reconstruction process, namely the declustered Redundant Array of Independent Disks (RAID) technique and, more generally, the Reed-Solomon (RS) error correction coding within the framework of the Welch-Berlekamp algorithm. These approaches essentially exploit the properties of circulant or Hankel matrices.
The reconstruction of all the units in the failed device causes a certain read/write load onto the survived devices. One of the major requirements to the storage device array is to manage the chunk group distribution across devices to produce a balanced read/write load on survived devices regardless of the location of the failed device. Construction of the data layout with the aid of an appropriate circulant matrix provides one with such an opportunity. This surprisingly relates to the classical Graph Coloring problem.
For the problem of the error locator polynomial construction in the RS-coding, we propose an effective algorithm for recursive computation of the potential candidates in the form of Hankel polynomials.
Data storage developers RAIDIX, XINNOR, . . .
RAID (Redundant Array of Independent Discs), Error correcting codes (Reed-Solomon)
Declustered RAID
Matrix formalism: Vandermonde, Hankel, circulant (циклические) matrices
RAID-group: $G$ binary blocks (chunks), $\ge 4 $KB, $G=G_{I}+G_{P} $ where $G_I$ stands for the number of information chunks, $G_P\in\{1,2,3\} $ the number of parity (checksums) chunks. For RAID5 configuration: $$C_1\oplus C_2 \oplus \dots \oplus C_{G-1} = C_G \, . $$ For $G=3$ $$ C_1\oplus C_2\oplus C_3=\mathbb O $$ All the chunks of the RAID-group are located in different disks. In case of failure of any disk, the lost chunks are reconstructed with the aid of the chunks of their RAID-groups remained undamaged. These chunks are XOR'ed, with the result first written to the spare chunk (rebuild). On completing this procedure for all the lost chunks, the chunks from spares are to be rewritten to the damaged disk (rebalance) if it is finally repaired.
General techological needs in storage array composition
(1) balancing load (Read-Write) on disks in the process of reconstruction of the failed disks
and/or
(2) maximally parallelize the reconstruction.
For any failed disk IDs.
Storage array configuration: each stripe contains $K=3$ RAID-groups with $G=3$ chunks in each, plus $ 1 $ spare chunk: $$ M= \left(\begin{array}{cccccccccc} S & {\color{Red}A_1} & {\color{Red}A_2} & {\color{Blue} B_1} & {\color{Green} C_1} & {\color{Blue} B_2} & {\color{Red}A_3} & {\color{Blue} B_3} & {\color{Green} C_2} & {\color{Green} C_3} \\ {\color{Magenta} D_1} & {\color{Salmon} E_1} & {\color{Purple} F_1} & {\color{Purple} F_2} & S & {\color{Magenta} D_2} & {\color{Magenta} D_3} & {\color{Salmon} E_2} & {\color{Purple} F_3} & {\color{Salmon} E_3} \\ 3 & 2 & {\color{Purple} G_1} & 2 & 3 & 3 & S & {\color{Purple} G_2} & {\color{Purple} G_3} & 2 \\ \vdots & & & & & & & & & \vdots \end{array} \right) $$ In each row (stripe) the numbers denoting RAID-groups are independent $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & 2 & 3 & 2 & 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 3 & 0 & 1 & 1 & 2 & 3 & 2 \\ 3 & 2 & 1 & 2 & 3 & 3 & 0 & 1 & 1 & 2 \end{array} \right) $$ If the $ 8 $th disk is lost $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & 2 & 3 & 2 & 1 & {\color{Pink} 2} & 3 & 3 \\ 1 & 2 & 3 & 3 & 0 & 1 & 1 & {\color{Pink} 2} & 3 & 2 \\ 3 & 2 & 1 & 2 & 3 & 3 & 0 & {\color{Pink} 1} & 1 & 2 \end{array} \right) $$ for reconstruction of its chunks it is necessary to read the chunks of the involved RAID-groups from the survived disks (colored in red): $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & {\color{Red} 2} & 3 & {\color{Red} 2} & 1 & {\color{Pink} 2} & 3 & 3 \\ 1 & {\color{Red} 2} & 3 & 3 & 0 & 1 & 1 & {\color{Pink} 2} & 3 & {\color{Red} 2} \\ 3 & 2 & {\color{Red} 1} & 2 & 3 & 3 & 0 & {\color{Pink} 1} & {\color{Red} 1} & 2 \end{array} \right) $$ All these chunks are located in different disks (columns). Therefore, the reconstruction of $ 3 $ failed chunks could be managed in parallel mode.
We call a bundle (пучок) such a set of rows. If it contain rows with IDs $ j_1, j_2,\dots $, we denote it as $$ \langle j_1, j_2,\dots \rangle $$
Unfortunately, this trick does not work for the case of failure of the $2$nd disk (column): $$ M= \left(\begin{array}{cccccccccc} 0 & {\color{Pink} 1} & {\color{Red} 1} & 2 & 3 & 2 & {\color{Red} 1} & 2 & 3 & 3 \\ 1 & {\color{Pink} 2} & 3 & 3 & 0 & 1 & 1 & {\color{Red} 2} & 3 & {\color{Red} 2} \\ 3 & {\color{Pink} 2} & 1 & {\color{Red} 2} & 3 & 3 & 0 & 1 & 1 & {\color{Red} 2} \end{array} \right) $$ To reconstruct its chunks, one need to read two chunks from the $10$th column.
Problems statement. Given a matrix $ M $ with $N=G\times K+1 $ columns and each row containing the numbers $$ 0,\underbrace{1,1,\dots,1}_G,\underbrace{2,2,\dots,2}_G,\dots,\underbrace{K,K,\dots,K}_G $$ permuted somehow; $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 2 & 1 & 3 & 3& 1& 3& 2& 2 \\ 3 & 0 & 1 & 2 & 1 & 2& 3 &3 & 2 & 1 \\ 3 & 2 & 0 & 1 & 2 & 3 & 2 & 1 & 1 & 3 \\ 3 &1& 3 & 0 & 1 & 1 & 2 & 3 & 2 & 2 \\ 1 & 3 & 2 & 2 & 0 & 1 & 2 & 3 & 1 & 3 \\ 3 & 3 & 3 & 2 & 2 & 0 & 1 & 1 & 1 & 2 \\ 2 & 2 & 1 & 3 & 3 & 1 & 0 & 1 & 2 & 3 \\ 2 & 3 & 2 & 3 & 3 & 2 &1 &0 &1 & 1 \\ 2 & 1 & 3 & 3 & 2 &3 & 2 & 1 & 0 & 1 \\ 1 &2 & 2 & 3 & 1 & 3 & 1 & 3 & 2 & 0 \end{array}\right) $$ find clusterization of its rows into the set of disjoined bundles…
…For any ID of the failed column (disk)…
…With minimal possible number of bundles.
$$ \begin{array}{c|c|c|c|c} \mbox{Lost column ID} & 0 & 1 & 2 & \dots \\ \hline \begin{array}{c} \mbox{Clusterization} \\ \mbox{ into bundles} \end{array} & \left\{\begin{array}{c} \langle 1, 4, 5\rangle \\ \langle 2, 3, 6, 8\rangle \\ \langle 7, 9\rangle \end{array} \right\} & \left\{\begin{array}{c} \langle 0, 3, 4, 5\rangle \\ \langle 2, 6. 8\rangle \\ \langle 7, 9\rangle \end{array} \right\} & \left\{\begin{array}{c} \langle 1, 5, 6, 7\rangle \\ \langle 2, 4, 9, 10\rangle \\ \langle 8\rangle\end{array} \right\} & \dots \end{array} $$ It can be proved that for any lost disk ID, the minimal number of bundles equals $ 3 $. And it is the minimal possible number for any configuration of the matrix constructed for these parameters $ G=3,K=3 $.
How to decluster the given matrix?
How to find an optimal matrix for given $G,K$?
$$ \mathfrak C= \left(\begin{array}{lllll} a_1 & a_2 & a_3 & \dots & a_n \\ a_n & a_1 & a_2 & \dots & a_{n-1} \\ a_{n-1} & a_n & a_1 & \dots & a_{n-2} \\ \vdots & & \ddots & \ddots & \vdots \\ a_2 & a_3 & a_4 & \dots & a_1 \end{array} \right)=\left[ a_{j-k+1 \pmod{n}}\right]_{j,k=1}^n \ . $$ The first row generates the whole matrix via cyclic permutation. $$ (0,a_1,\dots,a_{G \cdot K}) \, . $$ $G=3,K=3$ $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 \\ 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 \\ 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 \\ 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 \\ 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 \\ 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 \\ 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 \\ 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 \\ 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 \end{array} \right) $$
$$ \begin{array}{c|c|c|c|c} \mbox{Lost column ID} & 0 & 1 & 2 & \dots \\ \hline \begin{array}{c} \mbox{Clusterization} \\ \mbox{ into bundles} \end{array} & \left\{\begin{array}{c} \langle 1,3,8\rangle \\ \langle 2,4,5\rangle \\ \langle 6,7,9 \rangle \end{array} \right\} & \left\{\begin{array}{c} \langle 2,4,9\rangle \\ \langle 3,5,6\rangle \\ \langle 7,8,0 \rangle \end{array} \right\} & \left\{\begin{array}{c} \langle 3,5,0\rangle \\ \langle 4,6,7\rangle \\ \langle 8,9,1 \rangle \end{array} \right\} & \dots \end{array} $$
Можно доказать, что если циклическая матрица кластеризуется на определенное количество пучков при порче какого-то ее столбца, то она кластеризуется на столько же пучков тех же размеров при порче любого другого столбца. И правило нумерации строк, входящих в пучки также определяется циклическим сдвигом.
Теорема 1. Если пучок
$$ \langle i_1,i_2, \dots, i_{\ell} \rangle \, , $$ допустим для случая порчи $0$-столбца, то пучок $$ \langle i_1,i_2, \dots, i_{\ell} \rangle + h \langle 1,1, \dots, 1 \rangle \pmod{N} $$ допустим в случае порчи $h$-столбца.
Template of a RAID-group is a row vector containing $ G $ numbers of the columns containing the chunks of this group.
Шаблоном RAID-группы назовем строку из $ G $ номеров столбцов (дисков), содержащих чанки группы. Всего в циклической матрице содержится $ N G $ шаблонов. Каждое число из $\{0,1,\dots, N-1\} $ содержится ровно в $ N-1 $ шаблоне. Каждый шаблон «привязан» к номеру строки матрицы, в которой располагается его RAID-группа.
For the circulant matrix $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 \\ 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 \\ 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 \\ 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 \\ 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 \\ 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 \\ 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 \\ 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 \\ 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 \end{array} \right) $$ all the templates containing the entries of the $0$-column are as follows $$ \begin{array}{c|c|c|c|c|c|c|c|c|c} \mbox{rows} &1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \mbox{templates} & {\color{Blue} {(0,5,8)}} & (0,7,8) & {\color{Blue} {(0,2,7)}} & (0,2,9) & (0,1,3) & {\color{Blue} {(0,3,5)}} & (0,8,9) & (0,1,9) & (0,1,2) \end{array} $$ We call the templates (and correspomding rows) as being in conflict if they contain any common number other than $ 0 $.
General rule for the bundle composition: rows can be adjoint into a bundle iff the corresponding templates do not possess common number other than $0$.
Общее правило формирования пучков: в одном пучке могут быть только те строки, у шаблонов которых нет общих чисел, отличных от $0$.
How to manage the optimal clusterization into bundles, i.e. the one with minimum possible number of bundles?
For the circulant matrix $$ M= \left(\begin{array}{cccccccccc} 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 \\ 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 \\ 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 & 3 \\ 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 & 2 \\ 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 & 2 \\ 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 & 3 \\ 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 & 1 \\ 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 & 1 \\ 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 & 1 \\ 1 & 1 & 1 & 3 & 2 & 2 & 3 & 2 & 3 & 0 \end{array} \right) $$ assume the $0$-column is lost. We need to cluster into bundles the rows $ 1,2,3,\dots,9 $. Place this numbers into the plane and draw the line connecting any two numbers for which corresponding templates have common number other than $0$:
Continue…
The obtained figure is known as a graph.
THe circled $ \{1,2,\dots,9\} $ are its vertices (вершины графа) , green lines are its edges (ребра графа). Two verices joined with tha common edge are
adjacent vertices (смежные вершины).
Graph coloring problem: color the graph vertices with $ k $ different colors in such a way that any two of adjacent vertices are colored in different colors.
Задача о раскраске графа. Закрасить вершины графа с помощью $ k $ различных красок так, чтобы любые две смежные вершины были окрашены разными цветами.
For the graph of our example, the solution of the problem is as follows
with $ k=3 $ colors. And is a solution to the bundle clusterization problem.
The minimal possible number of colors = the minimal possible number of bundles
It is known as the chromatic number of a graph.
General problem: NP-hard.
Graphs of our particular problem, i.e. generated by a circulant matrix, possess some specific properties.
We have developed an algorithm that takes into account this specifics. It is optimal with high probalility, i.e. for the configurations of $ G $ and $ K $ where it is possible to find all the graphs with minimal possible chromatic number, application of the algorithm results in less than $10$ % erroneous conclusions.
Теорема 2. The number of bundles in any clusterization $ \ge G $ (the width of a RAID-group).
For some configurations of $ G $ and $ K $, $$ G,K \in \{3,3\}, \{3,4\}, \{4,4\}, \dots $$ there exist clusterization into $ G $ bundles. For some other configuration does not. For the case $G\ge 5, K\ge 5 $ not found (the existence or non-existence is not establised).
The real hard problem: for given $ G $ and $ K $ how to find the ideal circulant matrix, i.e. the matrix with minimal possible clusterization into bundles?
$G=4,K=4 $: only $ 2 $ ideal matrices.
To what limits have we reached? $$ G\times K \le 200 $$ (disks in array).
Combinatorics. Balanced Incomplete Block Design (BIBD).
Alternatives?