.. _chapter-codes: ************************ Linear codes and ciphers ************************ Codes ===== A linear code of length :math:`n` is a finite dimensional subspace of :math:`GF(q)^n`. Sage can compute with linear error-correcting codes to a limited extent. It basically has some wrappers to GAP and GUAVA 2.8 commands (GUAVA 2.8 does not include any of Leon's C code which was included in previous versions of GUAVA). GUAVA 2.8 is included with Sage's install of GAP 4.4.7 and later. .. index: pair: codes; linear pair: codes; Hamming Sage can compute Hamming codes :: sage: C = HammingCode(3,GF(3)) sage: C Linear code of length 13, dimension 10 over Finite Field of size 3 sage: C.minimum_distance() 3 sage: C.gen_mat() [1 0 0 0 0 0 0 0 2 0 0 2 1] [0 1 0 0 0 0 0 0 2 0 0 2 0] [0 0 1 0 0 0 0 0 2 0 0 2 2] [0 0 0 1 0 0 0 0 2 0 0 1 0] [0 0 0 0 1 0 0 0 2 0 0 1 2] [0 0 0 0 0 1 0 0 2 0 0 1 1] [0 0 0 0 0 0 1 0 2 0 0 0 2] [0 0 0 0 0 0 0 1 2 0 0 0 1] [0 0 0 0 0 0 0 0 0 1 0 2 2] [0 0 0 0 0 0 0 0 0 0 1 2 1] .. index:: pair: codes; Golay the four Golay codes :: sage: C = ExtendedTernaryGolayCode() sage: C Linear code of length 12, dimension 6 over Finite Field of size 3 sage: C.minimum_distance() 6 sage: C.gen_mat() [1 0 0 0 0 0 2 0 1 2 1 2] [0 1 0 0 0 0 1 2 2 2 1 0] [0 0 1 0 0 0 1 1 1 0 1 1] [0 0 0 1 0 0 1 1 0 2 2 2] [0 0 0 0 1 0 2 1 2 2 0 1] [0 0 0 0 0 1 0 2 1 2 2 1] as well as binary Reed-Muller codes, quadratic residue codes, quasi-quadratic residue codes, "random" linear codes, and a code generated by a matrix of full rank (using, as usual, the rows as the basis). .. index:: pair: codes; check matrix pair: codes; generator matrix pair: codes; dual For a given code, :math:`C`, Sage can return a generator matrix, a check matrix, and the dual code: :: sage: C = HammingCode(3,GF(2)) sage: Cperp = C.dual_code() sage: C; Cperp Linear code of length 7, dimension 4 over Finite Field of size 2 Linear code of length 7, dimension 3 over Finite Field of size 2 sage: C.gen_mat() [1 0 0 1 0 1 0] [0 1 0 1 0 1 1] [0 0 1 1 0 0 1] [0 0 0 0 1 1 1] sage: C.check_mat() [1 0 0 1 1 0 1] [0 1 0 1 0 1 1] [0 0 1 1 1 1 0] sage: C.dual_code() Linear code of length 7, dimension 3 over Finite Field of size 2 sage: C = HammingCode(3,GF(4,'a')) sage: C.dual_code() Linear code of length 21, dimension 3 over Finite Field in a of size 2^2 For :math:`C` and a vector :math:`v\in GF(q)^n`, Sage can try to decode :math:`v` (i.e., find the codeword :math:`c\in C` closest to :math:`v` in the Hamming metric) using syndrome decoding. As of yet, no special decoding methods have been implemented. :: sage: C = HammingCode(3,GF(2)) sage: MS = MatrixSpace(GF(2),1,7) sage: F = GF(2); a = F.gen() sage: v1 = [a,a,F(0),a,a,F(0),a] sage: C.decode(v1) (1, 0, 0, 1, 1, 0, 1) sage: v2 = matrix([[a,a,F(0),a,a,F(0),a]]) sage: C.decode(v2) (1, 0, 0, 1, 1, 0, 1) sage: v3 = vector([a,a,F(0),a,a,F(0),a]) sage: c = C.decode(v3); c (1, 0, 0, 1, 1, 0, 1) To plot the (histogram of) the weight distribution of a code, one can use the matplotlib package included with Sage: :: sage: C = HammingCode(4,GF(2)) sage: C Linear code of length 15, dimension 11 over Finite Field of size 2 sage: w = C.weight_distribution(); w [1, 0, 0, 35, 105, 168, 280, 435, 435, 280, 168, 105, 35, 0, 0, 1] sage: J = range(len(w)) sage: W = IndexedSequence([ZZ(w[i]) for i in J],J) sage: P = W.plot_histogram() Now type ``show(P)`` to view this. There are several coding theory functions we are skipping entirely. Please see the reference manual or the file ``coding/linear_codes.py`` for examples. Sage can also compute algebraic-geometric codes, called AG codes, via the Singular interface ยง sec:agcodes. One may also use the AG codes implemented in GUAVA via the Sage interface to GAP ``gap_console()``. See the GUAVA manual for more details. {GUAVA} Ciphers ======= LFSRs ----- A special type of stream cipher is implemented in Sage, namely, a linear feedback shift register (LFSR) sequence defined over a finite field. Stream ciphers have been used for a long time as a source of pseudo-random number generators. {linear feedback shift register} S. Golomb {G} gives a list of three statistical properties a sequence of numbers :math:`{\bf a}=\{a_n\}_{n=1}^\infty`, :math:`a_n\in \{0,1\}`, should display to be considered "random". Define the autocorrelation of :math:`{\bf a}` to be .. math:: C(k)=C(k,{\bf a})=\lim_{N\rightarrow \infty} \frac{1}{N}\sum_{n=1}^N (-1)^{a_n+a_{n+k}}. In the case where :math:`a` is periodic with period :math:`P` then this reduces to .. math::C(k)=\frac{1}{P}\sum_{n=1}^P (-1)^{a_n+a_{n+k}}. Assume :math:`a` is periodic with period :math:`P`. - balance: :math:`|\sum_{n=1}^P(-1)^{a_n}|\leq 1`. - low autocorrelation: .. math:: C(k)= \left\{ \begin{array}{cc} 1,& k=0,\\ \epsilon, & k\not= 0. \end{array} \right. (For sequences satisfying these first two properties, it is known that :math:`\epsilon=-1/P` must hold.) - proportional runs property: In each period, half the runs have length :math:`1`, one-fourth have length :math:`2`, etc. Moveover, there are as many runs of :math:`1`'s as there are of :math:`0`'s. A sequence satisfying these properties will be called pseudo-random. {pseudo-random} A general feedback shift register is a map :math:`f:{\bf F}_q^d\rightarrow {\bf F}_q^d` of the form .. math:: \begin{array}{c} f(x_0,...,x_{n-1})=(x_1,x_2,...,x_n),\\ x_n=C(x_0,...,x_{n-1}), \end{array} where :math:`C:{\bf F}_q^d\rightarrow {\bf F}_q` is a given function. When :math:`C` is of the form ..math:: C(x_0,...,x_{n-1})=c_0x_0+...+c_{n-1}x_{n-1}, for some given constants :math:`c_i\in {\bf F}_q`, the map is called a linear feedback shift register (LFSR). The sequence of coefficients :math:`c_i` is called the key and the polynomial .. math::C(x) = 1+ c_0x +...+c_{n-1}x^n .. index:: pair: ciphers; connection polynomial is sometimes called the connection polynomial. Example: Over :math:`GF(2)`, if :math:`[c_0,c_1,c_2,c_3]=[1,0,0,1]` then :math:`C(x) = 1 + x + x^4`, .. math::x_n = x_{n-4} + x_{n-1},\ \ \ n\geq 4. The LFSR sequence is then .. math:: \begin{array}{c} 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, \\ 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, ...\ . \end{array} The sequence of :math:`0,1`'s is periodic with period :math:`P=2^4-1=15` and satisfies Golomb's three randomness conditions. However, this sequence of period 15 can be "cracked" (i.e., a procedure to reproduce :math:`g(x)`) by knowing only 8 terms! This is the function of the Berlekamp-Massey algorithm {M}, implemented as ``lfsr_connection_polynomial`` (which produces the reverse of ``berlekamp_massey``). :: sage: F = GF(2) sage: o = F(0) sage: l = F(1) sage: key = [l,o,o,l]; fill = [l,l,o,l]; n = 20 sage: s = lfsr_sequence(key,fill,n); s [1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0] sage: lfsr_autocorrelation(s,15,7) 4/15 sage: lfsr_autocorrelation(s,15,0) 8/15 sage: lfsr_connection_polynomial(s) x^4 + x + 1 sage: berlekamp_massey(s) x^4 + x^3 + 1 Classical ciphers ----------------- has a type for cryptosystems (created by David Kohel, who also wrote the examples below), implementing classical cryptosystems. The general interface is as follows: :: sage: S = AlphabeticStrings() sage: S Free alphabetic string monoid on A-Z sage: E = SubstitutionCryptosystem(S) sage: E Substitution cryptosystem on Free alphabetic string monoid on A-Z sage: K = S([ 25-i for i in range(26) ]) sage: e = E(K) sage: m = S("THECATINTHEHAT") sage: e(m) GSVXZGRMGSVSZG Here's another example: :: sage: S = AlphabeticStrings() sage: E = TranspositionCryptosystem(S,15); sage: m = S("THECATANDTHEHAT") sage: G = E.key_space() sage: G Symmetric group of order 15! as a permutation group sage: g = G([ 3, 2, 1, 6, 5, 4, 9, 8, 7, 12, 11, 10, 15, 14, 13 ]) sage: e = E(g) sage: e(m) EHTTACDNAEHTTAH The idea is that a cryptosystem is a map :math:`E: KS \to \text{Hom}_\text{Set}(MS,CS)` where :math:`KS`, :math:`MS`, and :math:`CS` are the key space, plaintext (or message) space, and ciphertext space, respectively. :math:`E` is presumed to be injective, so ``e.key()`` returns the pre-image key.