Abstract:
Locally repairable codes (LRCs) are error correcting codes used in distributed data storage. A traditional approach is to look for codes which simultaneously maximize error tolerance and minimize storage space consumption. However, this tends to yield codes for which error correction requires an unrealistic amount of communication between storage nodes. LRCs solve this problem by allowing errors to be corrected locally. This thesis reviews previous results on the subject presented in [1]. These include that every almost affine LRC induces a matroid such that the essential properties of the code are determined by the matroid. Also, the generalized Singleton bound for LRCs can be extended to matroids as well. Then, matroid theory can be used to find classes of matroids that either achieve the bound, meaning they are optimal in a certain sense, or at least come close to the bound. This thesis presents an improvement to the results of [1] in both of these cases. [1] T. Westerb\"ack, R. Freij, T. Ernvall and C. Hollanti, "On the Combinatorics of Locally Repairable Codes via Matroid Theory", arXiv:1501.00153 [cs.IT], 2014.

Abstract:
Petabyte-scale distributed storage systems are currently transitioning to erasure codes to achieve higher storage efficiency. Classical codes like Reed-Solomon are highly sub-optimal for distributed environments due to their high overhead in single-failure events. Locally Repairable Codes (LRCs) form a new family of codes that are repair efficient. In particular, LRCs minimize the number of nodes participating in single node repairs during which they generate small network traffic. Two large-scale distributed storage systems have already implemented different types of LRCs: Windows Azure Storage and the Hadoop Distributed File System RAID used by Facebook. The fundamental bounds for LRCs, namely the best possible distance for a given code locality, were recently discovered, but few explicit constructions exist. In this work, we present an explicit and optimal LRCs that are simple to construct. Our construction is based on grouping Reed-Solomon (RS) coded symbols to obtain RS coded symbols over a larger finite field. We then partition these RS symbols in small groups, and re-encode them using a simple local code that offers low repair locality. For the analysis of the optimality of the code, we derive a new result on the matroid represented by the code generator matrix.

Abstract:
Constructions of optimal locally repairable codes (LRCs) in the case of $(r+1) \nmid n$ and over small finite fields were stated as open problems for LRCs in [I. Tamo \emph{et al.}, "Optimal locally repairable codes and connections to matroid theory", \emph{2013 IEEE ISIT}]. In this paper, these problems are studied by constructing almost optimal linear LRCs, which are proven to be optimal for certain parameters, including cases for which $(r+1) \nmid n$. More precisely, linear codes for given length, dimension, and all-symbol locality are constructed with almost optimal minimum distance. `Almost optimal' refers to the fact that their minimum distance differs by at most one from the optimal value given by a known bound for LRCs. In addition to these linear LRCs, optimal LRCs which do not require a large field are constructed for certain classes of parameters.

Abstract:
Distributed storage systems for large-scale applications typically use replication for reliability. Recently, erasure codes were used to reduce the large storage overhead, while increasing data reliability. A main limitation of off-the-shelf erasure codes is their high-repair cost during single node failure events. A major open problem in this area has been the design of codes that {\it i)} are repair efficient and {\it ii)} achieve arbitrarily high data rates. In this paper, we explore the repair metric of {\it locality}, which corresponds to the number of disk accesses required during a {\color{black}single} node repair. Under this metric we characterize an information theoretic trade-off that binds together locality, code distance, and the storage capacity of each node. We show the existence of optimal {\it locally repairable codes} (LRCs) that achieve this trade-off. The achievability proof uses a locality aware flow-graph gadget which leads to a randomized code construction. Finally, we present an optimal and explicit LRC that achieves arbitrarily high data-rates. Our locality optimal construction is based on simple combinations of Reed-Solomon blocks.

Abstract:
In this paper, locally repairable codes with all-symbol locality are studied. Methods to modify already existing codes are presented. Also, it is shown that with high probability, a random matrix with a few extra columns guaranteeing the locality property, is a generator matrix for a locally repairable code with a good minimum distance. The proof of this also gives a constructive method to find locally repairable codes. Constructions are given of three infinite classes of optimal vector-linear locally repairable codes over an alphabet of small size, not depending on the size of the code.

Abstract:
In this paper, locally repairable codes with all-symbol locality are studied. Methods to modify already existing codes are presented. Also, it is shown that with high probability, a random matrix with a few extra columns guaranteeing the locality property, is a generator matrix for a locally repairable code with a good minimum distance. The proof of this gives also a constructive method to find locally repairable codes.

Abstract:
The locally repairable codes (LRCs) were introduced to correct erasures efficiently in distributed storage systems. LRCs are extensively studied recently. In this paper, we first deal with the open case remained in \cite{q} and derive an improved upper bound for the minimum distances of LRCs. We also give an explicit construction for LRCs attaining this bound. Secondly, we consider the constructions of LRCs with any locality and availability which have high code rate and minimum distance as large as possible. We give a graphical model for LRCs. By using the deep results from graph theory, we construct a family of LRCs with any locality $r$ and availability $2$ with code rate $\frac{r-1}{r+1}$ and optimal minimum distance $O(\log n)$ where $n$ is the length of the code.

Abstract:
This paper presents a construction for several families of optimal binary locally repairable codes (LRCs) with small locality (2 and 3). This construction is based on various anticodes. It provides binary LRCs which attain the Cadambe-Mazumdar bound. Moreover, most of these codes are optimal with respect to the Griesmer bound.

Abstract:
Locally repairable codes (LRCs) are a class of codes designed for the local correction of erasures. They have received considerable attention in recent years due to their applications in distributed storage. Most existing results on LRCs do not explicitly take into consideration the field size $q$, i.e., the size of the code alphabet. In particular, for the binary case, only a few results are known. In this work, we present an upper bound on the minimum distance $d$ of linear LRCs with availability, based on the work of Cadambe and Mazumdar. The bound takes into account the code length $n$, dimension $k$, locality $r$, availability $t$, and field size $q$. Then, we study binary linear LRCs in three aspects. First, we focus on analyzing the locality of some classical codes, i.e., cyclic codes and Reed-Muller codes, and their modified versions, which are obtained by applying the operations of extend, shorten, expurgate, augment, and lengthen. Next, we construct LRCs using phantom parity-check symbols and multi-level tensor product structure, respectively. Compared to other previous constructions of binary LRCs with fixed locality or minimum distance, our construction is much more flexible in terms of code parameters, and gives various families of high-rate LRCs, some of which are shown to be optimal with respect to their minimum distance. Finally, availability of LRCs is studied. We investigate the locality and availability properties of several classes of one-step majority-logic decodable codes, including cyclic simplex codes, cyclic difference-set codes, and $4$-cycle free regular low-density parity-check (LDPC) codes. We also show the construction of a long LRC with availability from a short one-step majority-logic decodable code.

Abstract:
Distributed storage systems need to store data redundantly in order to provide some fault-tolerance and guarantee system reliability. Different coding techniques have been proposed to provide the required redundancy more efficiently than traditional replication schemes. However, compared to replication, coding techniques are less efficient for repairing lost redundancy, as they require retrieval of larger amounts of data from larger subsets of storage nodes. To mitigate these problems, several recent works have presented locally repairable codes designed to minimize the repair traffic and the number of nodes involved per repair. Unfortunately, existing methods often lead to codes where there is only one subset of nodes able to repair a piece of lost data, limiting the local repairability to the availability of the nodes in this subset. In this paper, we present a new family of locally repairable codes that allows different trade-offs between the number of contacted nodes per repair, and the number of different subsets of nodes that enable this repair. We show that slightly increasing the number of contacted nodes per repair allows to have repair alternatives, which in turn increases the probability of being able to perform efficient repairs. Finally, we present pg-BLRC, an explicit construction of locally repairable codes with multiple repair alternatives, constructed from partial geometries, in particular from Generalized Quadrangles. We show how these codes can achieve practical lengths and high rates, while requiring a small number of nodes per repair, and providing multiple repair alternatives.