|
2 | 2 | \section{Lattice-based Crypto and LWE} |
3 | 3 | \label{sec:lattices} |
4 | 4 |
|
| 5 | +\paragraph{Motivation.} |
| 6 | +Existing asymmetric encryption algorithms rely heavily on the assumption that |
| 7 | +certain problems, namely integer factorization and discrete logarithms, are |
| 8 | +computationally intractable to solve for sufficiently large inputs. While no |
| 9 | +polynomial-time algorithm (over the number of bits in the input) is known to |
| 10 | +exist for either of these two problems on a classical computer, there do exist |
| 11 | +polynomial-time algorithms for both of these problems on a quantum computer, the |
| 12 | +most famous of such perhaps being Shor's algorithm. |
5 | 13 |
|
6 | | -Given lattice basis $\textbf{B} = [\vecb_1,\ldots,\vecb_n] \in \mathbb{Z}^n$, |
7 | | -and $r \in \mathbb{Q}$, determine whether $\lambda_1(\calL(\textbf{B})) \le r$ |
8 | | -or $\lambda_1(\calL(\textbf{B})) > \gamma \cdot r$. |
| 14 | +While no physical quantum computer known to exist today contains enough qubits |
| 15 | +with enough stability to actually perform Shor's algorithm on real-world sized |
| 16 | +inputs, the fact that the problem is a physical or technological one rather than |
| 17 | +a mathematical one is sufficient cause for concern. This provides a motivation |
| 18 | +for creating an asymmetric encryption algorithm that is resistant against the |
| 19 | +capabilities of quantum computers. Such algorithms may colloquially be known as |
| 20 | +post-quantum cryptography, or PQC. Lattice-based cryptography provides a basis |
| 21 | +for one such algorithm. |
9 | 22 |
|
| 23 | +\paragraph{Definitions.} |
| 24 | +\begin{newitemize} |
| 25 | + \item |
| 26 | + Given a space $\mathbb{R}^n$ and a set of linearly independent vectors |
| 27 | + $\textbf{B} = \{\vecb_1,\ldots,\vecb_n\} \subset \mathbb{R}^n$ known as a |
| 28 | + \textbf{basis}, a \textbf{lattice} $\calL$ on that basis is the set of points |
| 29 | + in $\mathbb{R}^n$ that can be made by summing together integer multiples of |
| 30 | + the basis vectors. More formally, $\calL(\textbf{B}) = \{\vecv = |
| 31 | + \sum_{i=1}^{n}x_i\vecb_i \Colon x_1,\ldots,x_n \in \mathbb{Z}\}$. |
| 32 | + |
| 33 | + \begin{newitemize} |
| 34 | + \item |
| 35 | + For example, $\calL((0,1),(1,0))$ and $\calL((1,1),(2,1))$ are both |
| 36 | + equal to $\mathbb{Z}^2$. |
| 37 | + |
| 38 | + \item |
| 39 | + Pedantically, the number of vectors in the basis need not equal the |
| 40 | + dimension of the space, but cryptographic applications typically set the |
| 41 | + two equal to each other (giving a \textbf{full-rank lattice}). |
| 42 | + \end{newitemize} |
| 43 | + |
| 44 | + \item |
| 45 | + The shortest vector length in a lattice, denoted $\lambda_1(\calL)$, is the |
| 46 | + length of the nonzero vector in the lattice with the smallest magnitude |
| 47 | + (more formally, $\min_{\vecv \in \calL \setminus \{0\}}{\norm{\vecv}}$). For |
| 48 | + example, $\lambda_1(\calL((1,1),(2,1))) = 1$ despite neither basis vector |
| 49 | + having length 1, as (among others) the point $(0,1)$ can be made by summing |
| 50 | + integer multiples of the basis vectors, and thus $(0,1)$ exists in the |
| 51 | + lattice. The \textbf{Shortest Vector Problem (SVP-$\gamma$)} given a basis |
| 52 | + $\textbf{B}$ is to find a nonzero lattice vector $\vecv \in |
| 53 | + \calL(\textbf{B})$ whose magnitude is no more than $\gamma$ times the |
| 54 | + shortest vector length (more formally, $0 < \norm{\vecv} \le \gamma \cdot |
| 55 | + \lambda_1(\calL(\textbf{B}))$). |
| 56 | + |
| 57 | + \item |
| 58 | + The \textbf{Gap Shortest Vector Problem (GapSVP-$\gamma$)} given a basis |
| 59 | + $\textbf{B}$ and some distance $r \in \mathbb{R}$ is to determine whether |
| 60 | + the shortest vector length in $\calL(\textbf{B})$ is less than or equal to |
| 61 | + $r$, or whether it is greater than $\gamma r$. (If |
| 62 | + $\lambda_1(\calL(\textbf{B}))$ is in fact between these two values, then any |
| 63 | + answer is considered correct, thus making the problem easier the larger |
| 64 | + $\gamma$ is.) |
| 65 | +\end{newitemize} |
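| | + |
| | +As a quick check of the lattice example above (a worked sketch using only the |
| | +bases already mentioned), both standard basis vectors are integer combinations |
| | +of $(1,1)$ and $(2,1)$: |
| | +\begin{align*} |
| | + (1,0) &= -1 \cdot (1,1) + 1 \cdot (2,1) \\ |
| | + (0,1) &= 2 \cdot (1,1) - 1 \cdot (2,1) |
| | +\end{align*} |
| | +Since $(1,1)$ and $(2,1)$ are themselves integer vectors, this gives |
| | +$\calL((1,1),(2,1)) = \mathbb{Z}^2$ and hence $\lambda_1 = 1$. |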
| 66 | + |
| 67 | +\paragraph{Computational difficulty.} |
| 68 | +It is straightforward to show that SVP-$\gamma$ is at least as difficult as |
| 69 | +GapSVP-$\gamma$: given an SVP-$\gamma$ solution $\vecv$ for a lattice (so that |
| 70 | +$\norm{\vecv} \le \gamma \cdot \lambda_1(\calL(\textbf{B}))$), one solves |
| 71 | +GapSVP-$\gamma$ by answering ``$\lambda_1 \le r$'' if $\norm{\vecv} \le |
| 72 | +\gamma r$ and ``$\lambda_1 > \gamma r$'' otherwise. |
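| | + |
| | +One way to see why a single comparison is enough (a sketch, writing $\vecv$ for |
| | +the returned SVP-$\gamma$ solution and $\lambda_1$ for |
| | +$\lambda_1(\calL(\textbf{B}))$) is to check the two promised cases against the |
| | +threshold $\gamma r$: |
| | +\begin{align*} |
| | + \lambda_1 \le r &\implies \norm{\vecv} \le \gamma \lambda_1 \le \gamma r \\ |
| | + \lambda_1 > \gamma r &\implies \norm{\vecv} \ge \lambda_1 > \gamma r |
| | +\end{align*} |
| | +so testing whether $\norm{\vecv} \le \gamma r$ separates the two cases. |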
| 73 | + |
| 74 | +As noted in the GapSVP definition above, the difficulty of solving |
| 75 | +SVP-$\gamma$ increases as $\gamma$ decreases. If $\gamma$ is allowed to be |
| 76 | +$2^{kn}$ for some constant $k$, then the best known algorithms can solve |
| 77 | +SVP-$\gamma$ in time polynomial in $n$ [LLL '82, Schnorr '87]. |
| 78 | +Meanwhile if $\gamma$ is restricted to $n$, then the best known algorithms |
| 79 | +need time exponential in $n$, i.e. $2^{O(n)}$ [Ajtai '96, ...]. But |
| 80 | +critically, no faster algorithm is known for quantum computers either. Thus |
| 81 | +if breaking a cryptosystem can be shown to be at least as hard as solving |
| 82 | +SVP-$\gamma$ for such $\gamma$ and large $n$, then the cryptosystem is |
| 83 | +conjectured to be resistant against both classical and quantum |
| 84 | +algorithms. |
| 85 | + |
| 86 | +For certain vague definitions of intuitiveness, one can attempt to grasp the |
| 87 | +difficulty of (Gap)SVP-$\gamma$ by noting that randomly chosen high-dimensional |
| 88 | +points tend to clump together at the same magnitude and/or distance, making |
| 89 | +finding one with low magnitude difficult. Additionally, if the basis vectors |
| 90 | +$\vecb_i$ are of large magnitude, there is no guarantee that the actual shortest |
| 91 | +vector length is anywhere near the known basis vectors or their lengths. |
| 92 | +\scribenote{This is just my guess but it comes from the ML classes I've taken |
| 93 | +and the ``curse of dimensionality'' problem there. Is it relevant?} |
| 94 | + |
| 95 | +\paragraph{Learning with errors (LWE).} |
| 96 | +Bridging the gap (pun totally intended) between SVP problems and a real |
| 97 | +cryptosystem is the learning with errors problem or LWE. At a very high level, |
| 98 | +solving LWE is at least as hard as solving SVP-$\gamma$ with $\gamma = n$, making LWE-based |
| 99 | +cryptosystems suitable candidates for post-quantum cryptography. Slightly more |
| 100 | +detail on this is provided in a later paragraph. |
| 101 | + |
| 102 | +\paragraph{Terminology.} |
| 103 | +\begin{newitemize} |
| 104 | + \item |
| 105 | + Let $\vecs$ be an $n$-dimensional vector such that all entries are integers |
| 106 | + modulo $q$ (more formally, $\vecs \in \mathbb{Z}^n_q$). Call this the |
| 107 | + \textbf{secret vector}. |
| 108 | + |
| 109 | + \item |
| 110 | + Let $\textbf{A}$ be a publicly known and reusable $m \times n$ matrix such |
| 111 | + that all matrix entries are also integers modulo $q$ (formally, $\textbf{A} |
| 112 | + \getsr \mathbb{Z}^{m \times n}_q$). |
| 113 | + |
| 114 | + \item |
| 115 | + Let $\vece$ be an $m$-dimensional vector of integers whose entries are drawn |
| 116 | + from a discrete Gaussian distribution (formally, $\vece \getsr \chi^m$). |
| 117 | + Call this the error vector or noise vector. |
| 118 | + |
| 119 | + \item |
| 120 | + Let $\vecy$ be the publicly transmitted $m$-dimensional vector equal to |
| 121 | + $\textbf{A}\vecs + \vece$. |
| 122 | + |
| 123 | + \item |
| 124 | + Let the \textbf{search problem} be the goal of reconstructing $\vecs$ given |
| 125 | + known $\textbf{A}$ and $\vecy$. Note that this is effectively a system of |
| 126 | + $m$ linear equations in $m+n$ unknowns. |
| 127 | + |
| 128 | + \item |
| 129 | + Let the \textbf{decision problem} be the goal of distinguishing between the |
| 130 | +``real'' and ``random'' worlds given $\textbf{A}$ and a vector $\textbf{Z}$, |
| 131 | + where $\textbf{Z} = \vecy$ in the ``real'' world and $\textbf{Z}$ is just a |
| 132 | + random vector of $m$ integers mod $q$ in the ``random'' world ($\textbf{Z} |
| 133 | + \getsr \mathbb{Z}^m_q$). |
| 134 | +\end{newitemize} |
| 135 | + |
| 136 | +While reducing the decision problem to the search problem is straightforward |
| 137 | +(given a candidate $\vecs$, check whether $\textbf{Z} - \textbf{A}\vecs$ is |
| 138 | +small), it turns out that the search problem can also be reduced to the decision problem (with the same $m$, $n$, $q$, and $\chi$ parameters) [BFKL '94, Regev '05, |
| 139 | +Peikert '09, ...]. A proof of this latter reduction is omitted here. |
| 140 | +\scribenote{I'm guessing from the ``surprising'' descriptor on the slides that |
| 141 | +such a proof is not going to be very easy to draw up from scratch for an |
| 142 | +MEng student? :(} |
| 143 | + |
| 144 | +\paragraph{Intuitions.} |
| 145 | +If $\vece$ were to be drawn from uniformly random integers mod $q$ (instead of |
| 146 | +being drawn from an uneven distribution), $\vecs$ would be completely impossible |
| 147 | +to recover. The intuition here is similar to the reasoning why the one-time pad |
| 148 | +is unbreakable. |
| 149 | + |
| 150 | +If $\vece = \textbf{0}$, then the problem degenerates into $\textbf{A}\vecs = |
| 151 | +\vecy$. If also $m \ge n$ and $\textbf{A}$ has full column rank, then the unique |
| 152 | +$\vecs$ can be solved for exactly and efficiently via row reduction. |
| 153 | + |
| 154 | +If one attempts to use row reduction despite the presence of a nonzero $\vece$, |
| 155 | +then it turns out that the linear combination operations repeatedly performed on |
| 156 | +every row cause the errors or noise to accumulate. This makes finding a likely |
| 157 | +$\vecs$ intractable. |
| 158 | +\scribenote{Is the proof for this very difficult?} |
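| | + |
| | +A rough way to see the accumulation (a sketch, with $\textbf{a}_i$ as a name |
| | +introduced here for the $i$-th row of $\textbf{A}$, $y_i$ and $e_i$ for the |
| | +$i$-th entries of $\vecy$ and $\vece$, and $c_i \in \mathbb{Z}_q$ for the row |
| | +multipliers chosen by the elimination): combining rows gives |
| | +\begin{align*} |
| | + \sum_i c_i y_i = \Big(\sum_i c_i \textbf{a}_i\Big)\vecs + \sum_i c_i e_i \pmod{q} |
| | +\end{align*} |
| | +and since the multipliers $c_i$ can be arbitrary elements of $\mathbb{Z}_q$, |
| | +the combined error $\sum_i c_i e_i$ is generally no longer small compared to $q$. |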
| 159 | + |
| 160 | +(The difficulty of row reduction with errors present might be thought of as |
| 161 | +somewhat related to the difficulty of row reduction on a physical computer when |
| 162 | +lossy floating-point values are involved. As row reduction strongly relies on |
| 163 | +driving various matrix elements to exactly 0, a matrix element that ends up very |
| 164 | +close to 0 may appear as the denominator of a division operation, causing other |
| 165 | +values in the matrix to end up extremely large, with equally large error bars |
| 166 | +that end up spreading to other non-large matrix elements.) |
| 167 | +\scribenote{Is this relevant or just a barely-related trivium?} |
| 168 | + |
| 169 | +\paragraph{Relation to GapSVP.} |
| 170 | +As hinted at earlier, it turns out it is possible to reduce GapSVP to both the |
| 171 | +decision-LWE and search-LWE problems (that is, an efficient solver for either |
| 172 | +LWE problem would yield an efficient solver for GapSVP), either with a quantum |
| 173 | +algorithm [Regev '05] or with a classical algorithm (the latter as long as |
| 174 | +$q \ge 2^n$) [Peikert '09]. With GapSVP conjectured to be hard for even a |
| 175 | +quantum computer to solve, this suggests that any cryptosystem relying on the |
| 176 | +hardness of decision-LWE and/or search-LWE is plausibly quantum-resistant. |
| 177 | + |
| 178 | +\paragraph{Regev's encryption scheme.} |
| 179 | +This is one example of an asymmetric encryption scheme that uses the hardness of |
| 180 | +decision-LWE. In a nutshell, $\vecs$ as defined above is used as the secret key |
| 181 | +$\textnormal{sk}$, while $\vecy$ or $\textbf{A}\vecs + \vece$ is used as the |
| 182 | +public key $\textnormal{pk}$. (Also as defined earlier, matrix $\textbf{A}$ is |
| 183 | +completely public and may be reused across keys.) |
| 184 | + |
| 185 | +To encrypt a single-bit message $\overline{m}$, first generate a random $m$-bit |
| 186 | +string $\vecu$, then calculate $c_1 = \vecu^T\textbf{A}$ and $c_2 = |
| 187 | +\vecu^T\textnormal{pk}+\overline{m}\lceil q/2 \rceil$. (Note that $c_1$ is an |
| 188 | +$n$-dimensional vector of integers mod $q$, while $c_2$ is a single integer mod |
| 189 | +$q$.) Now $(c_1, c_2)$ is sent to the recipient. The recipient decrypts by |
| 190 | +calculating $\overline{m'} = c_2 - c_1\vecs \bmod q$, then recovering |
| 191 | +$\overline{m} = 1$ if $q/4 < \overline{m'} < 3q/4$ (i.e. $\overline{m'}$ is closer to $\lceil q/2 \rceil$ than to $0$ modulo $q$) and $\overline{m} = 0$ otherwise. |
| 192 | + |
| 193 | +To observe the effect of the random error $\vece$ on the decryption process, |
| 194 | +substitute the appropriate derivations for $c_1$ and $c_2$: |
10 | 195 |
|
11 | 196 | \begin{align*} |
12 | | - m' &= c_2 - c_1\vecs\\ |
13 | | - &= \vecu^T\textnormal{pk} + m\lceil q/2\rceil - \vecu^T\textbf{A}\vecs \\ |
14 | | - &= \vecu^T\textbf{A}\vecs + \vecu^T\vece + m\lceil q/2\rceil - \vecu^T\textbf{A}\vecs \\ |
15 | | - &= \vecu^T\vece + m\lceil q/2\rceil |
| 197 | + \overline{m'} |
| 198 | + &= c_2 - c_1\vecs\\ |
| 199 | + &= \vecu^T\textnormal{pk} + \overline{m}\lceil q/2\rceil - |
| 200 | + \vecu^T\textbf{A}\vecs \\ |
| 201 | + &= \vecu^T\textbf{A}\vecs + \vecu^T\vece + \overline{m}\lceil q/2\rceil - |
| 202 | + \vecu^T\textbf{A}\vecs \\ |
| 203 | + &= \vecu^T\vece + \overline{m}\lceil q/2\rceil |
16 | 204 | \end{align*} |
17 | 205 |
|
| 206 | +The latter addend $\overline{m}\lceil q/2 \rceil$ will be either $0$ or $\lceil |
| 207 | +q/2 \rceil$, so it can be identified through noise as long as $\vecu^T\vece$ is |
| 208 | +small enough as to not drag $\overline{m'}$ too close to the other value. This |
| 209 | +will be okay as long as $\lvert\vecu^T\vece\rvert < q/4$, so the Gaussian distribution that |
| 210 | +draws $\vece$ must have a variance low enough that $\lvert\vecu^T\vece\rvert \ge q/4$ is |
| 211 | +extremely unlikely to occur. (If it did occur, corruption of this bit of the |
| 212 | +message on the receiver's end would result.) |
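| | + |
| | +As a sanity check, the derivation can be traced on a toy instance. The |
| | +parameters below ($q = 17$, $n = 2$, $m = 3$, and the specific $\textbf{A}$, |
| | +$\vecs$, $\vece$, $\vecu$) are hypothetical values chosen only for |
| | +illustration and are far too small to be secure. Take $\vecs = (3,5)^T$, |
| | +$\vece = (1,-1,0)^T$, $\vecu = (1,0,1)^T$, $\overline{m} = 1$, and let |
| | +$\textbf{A}$ have rows $(1,2)$, $(4,7)$, $(6,0)$. With all arithmetic mod $17$: |
| | +\begin{align*} |
| | + \textnormal{pk} &= \textbf{A}\vecs + \vece \equiv (13,13,1)^T + (1,-1,0)^T = (14,12,1)^T \\ |
| | + c_1 &= \vecu^T\textbf{A} = (7,2) \\ |
| | + c_2 &= \vecu^T\textnormal{pk} + \overline{m}\lceil q/2 \rceil = 15 + 9 \equiv 7 \\ |
| | + \overline{m'} &= c_2 - c_1\vecs \equiv 7 - 14 \equiv 10 = \vecu^T\vece + \overline{m}\lceil q/2 \rceil = 1 + 9 |
| | +\end{align*} |
| | +Since $10$ is closer to $\lceil q/2 \rceil = 9$ than to $0$, the recipient |
| | +recovers $\overline{m} = 1$. |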
| 213 | + |
| 214 | +\paragraph{Reduction.} |
| 215 | +We show that the security of Regev's encryption scheme reduces to the hardness |
| 216 | +of decision-LWE by playing the role of an adversary that wants to break |
| 217 | +decision-LWE. We are given $\textbf{A}$ and an $m$-dimensional vector |
| 218 | +$\textbf{Z}_b$, and our goal is to distinguish whether this vector is |
| 219 | +$\textbf{Z}_1$ (i.e. $\textbf{A}\vecs + \vece$), or $\textbf{Z}_0$ (a random |
| 220 | +draw from $\mathbb{Z}_q^m$). We have access to an adversary able to break Regev's encryption scheme. |
| 221 | + |
| 222 | +Now we use the provided $\textbf{Z}_b$ as the public key to encrypt some |
| 223 | +arbitrary message, and we pass the ciphertext, $\textbf{A}$, and $\textbf{Z}_b$ |
| 224 | +to our Regev's encryption scheme breaker. If it successfully returns our |
| 225 | +message, we return $\textbf{Z}_b = \textbf{Z}_1$. If it fails, we return |
| 226 | +$\textbf{Z}_b = \textbf{Z}_0$. |
| 227 | + |
| 228 | +To complete this demonstration, we must show that no adversary able to break |
| 229 | +Regev's encryption scheme could possibly have done so if given a random public |
| 230 | +key and a ciphertext encrypted by this random public key. To do this, we use the |
| 231 | +leftover hash lemma: when $m$ is sufficiently large relative to $n \lg q$, the |
| 232 | +pair $(\vecu^T\textbf{A}, \vecu^T\textbf{Z}_0)$ is statistically close to |
| 233 | +uniform over $\mathbb{Z}_q^n \times \mathbb{Z}_q$, even given $\textbf{A}$ and |
| 234 | +$\textbf{Z}_0$. The first addend of our $c_2$, namely $\vecu^T\textbf{Z}_0$, therefore acts as a fresh random pad that hides (any constant multiple |
| 235 | +of) the message $\overline{m}$ that is added to it, using the same intuition as |
| 236 | +the ``one-time pad'' intuition discussed earlier. |
| 237 | +\scribenote{This paragraph is nearly copied word for word from the slides and |
| 238 | +the lecture; I don't think I understood it too well. Is it close?} |
| 239 | + |
| 240 | +\paragraph{Practicality.} |
| 241 | +As is, Regev's LWE encryption scheme has non-optimal space and time |
| 242 | +requirements, with a public key of size $O(\lg(q)n^2)$, a ciphertext of size |
| 243 | +$O(\lg(q)n)$ per message bit, and the fact that it can only encrypt one bit at a |
| 244 | +time. (If one lets $q = 2^n$ [Peikert '09], then $\lg q = n$ and the above two |
| 245 | +space requirements further balloon to $O(n^3)$ and $O(n^2)$/bit respectively.) |
| 246 | +\scribenote{Is using $\lg$ as a shorthand specifically for $\log_2$ (because the |
| 247 | +word log is being abbreviated to 2 letters) a stupid notation or a common |
| 248 | +one? I recall some texts using it all the time and other texts seemingly |
| 249 | +recoiling from it.} |
| 250 | + |
| 251 | +Recent developments in this space include Ring-LWE [Lyubashevsky, Peikert, Regev |
| 252 | +2010], which uses a ring of polynomials over a finite field instead of just the |
| 253 | +$\mathbb{Z}_q$ group, and allows $n$ message bits to be encrypted at a time |
| 254 | +instead of just 1. This was made more concrete by a later system of the name |
| 255 | +NewHope-KEM [Braithwaite, 2016], which uses key sizes of around 2--4 KBytes to |
| 256 | +achieve 1024 bits of security. While this is a constant factor of around 30 |
| 257 | +times larger than optimal, it is at least much more practical than the |
| 258 | +equivalent $\sim$1 Gbit key size that the original encryption scheme would have |
| 259 | +demanded. Criticisms of this approach include that it relies on a new assumption |
| 260 | +that has a less clear reduction to GapSVP than LWE does. |
| 261 | +\scribenote{This paragraph is also nearly copied word for word from the slides |
| 262 | +and lecture, and I didn't understand it too well.} |
| 263 | + |
| 264 | +\paragraph{Exercise 1.} |
| 265 | +To demonstrate that the shortest vector in a lattice may have a length |
| 266 | +arbitrarily smaller than the lengths of the basis vectors, give a method to |
| 267 | +construct a lattice $\calL(\vecb_1, \vecb_2) \subseteq \mathbb{Z}^2$ such that |
| 268 | +$N\lambda_1(\calL) < \norm{\vecb_1} \wedge N\lambda_1(\calL) < \norm{\vecb_2}$ |
| 269 | +for any arbitrary positive integer $N$. |
| 270 | +\scribenote{Let $\vecb_1 = (N,N)$ and $\vecb_2 = (N+1,N)$. Then $\lambda_1(\calL) = 1$ |
| 271 | +(the $(1,0) = \vecb_2 - \vecb_1$ vector) while $\norm{\vecb_1} > N$ and $\norm{\vecb_2} > N$ regardless of |
| 272 | +how big $N$ is.} |
| 273 | + |
| 274 | +\paragraph{Exercise 2.} |
| 275 | +You are attempting to send a message to someone using Regev's encryption scheme, |
| 276 | +but when your recipient constructed their public key $\textbf{A}\vecs + \vece$, |
| 277 | +they drew the elements of $\vece$ uniformly from $\mathbb{Z}_q$ instead of from |
| 278 | +a Gaussian. This is because they wanted their $\vecs$ to be absolutely unrecoverable |
| 279 | +instead of just computationally intractably unrecoverable. When your ciphertext |
| 280 | +arrives, what difficulty will the recipient encounter in trying to decrypt it? |
| 281 | +Why? |
| 282 | +\scribenote{The last step of the decryption process involves the equation |
| 283 | +$\overline{m'} = \vecu^T\vece + \overline{m}\lceil q/2 \rceil$; normally, |
| 284 | +the second addend can be distinguished between $0$ and $\lceil q/2 \rceil$ |
| 285 | +because the first addend $\vecu^T\vece$ involves a Gaussian and will almost |
| 286 | +always have magnitude less than $q/4$. But now, it is effectively uniformly randomly |
| 287 | +distributed across all of $\mathbb{Z}_q$. In effect, the whole message has become |
| 288 | +corrupted into random bits and is unrecoverable.} |
| 289 | + |
| 290 | +\paragraph{Exercise 3.} |
| 291 | +A recipient is attempting to receive messages via Regev's encryption scheme, but |
| 292 | +when they constructed $\vece$, instead of drawing $m$ independent values from a |
| 293 | +Gaussian distribution they drew 1 value and reused it $m$ times. (I.e. instead |
| 294 | +of $\vece \getsr \chi^m$, the recipient performed $\vece \getsr \chi |
| 295 | +\textbf{1}^m$.) Furthermore, their $\textbf{A}$ has $m$ and $n$ set such that |
| 296 | +$m=n+1$. Give an efficient attack to recover $\vecs$ given the recipient's |
| 297 | +public key. |
| 298 | +\scribenote{Instead of $m$ linear equations in $m+n$ unknowns, the recipient has |
| 299 | +effectively created $m$ linear equations in just $1+n$ unknowns, or $m$ |
| 300 | +linear equations in $m$ unknowns. This is easy to solve for via direct row |
| 301 | +reduction.} |
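| | + |
| | +One possible way to carry out that elimination (a sketch, with $\textbf{a}_i$ |
| | +denoting the $i$-th row of $\textbf{A}$, $y_i$ the $i$-th entry of the public |
| | +key, and $e$ the single reused error value; these names are introduced here): |
| | +every entry has the form $y_i = \textbf{a}_i\vecs + e$, so subtracting the |
| | +first entry from the others cancels the error: |
| | +\begin{align*} |
| | + y_i - y_1 = (\textbf{a}_i - \textbf{a}_1)\vecs \pmod{q}, \qquad i = 2,\ldots,m |
| | +\end{align*} |
| | +This leaves $m - 1 = n$ error-free linear equations in the $n$ unknown entries |
| | +of $\vecs$, which row reduction solves whenever they are linearly independent. |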