[Operating System Concepts搬運工] 關於加密，驗證, SSL / TLS3.0的基礎知識

02-12

寫在前面：

作為一個初學的弱渣，寫這篇東西完全是在當筆記做。在知乎大神這麼多地方，尤其不敢造次。

我覺得我們這門課雖然叫Operating System Concepts,到不如說是安全基本知識入門（所以學啥完全取決於老師的方向和自己）。這部分大概預習了（霧。。）有一周，還是懵懂，不對的地方希望大家多多指教。課本用得是很經典的Abraham Sliberschatz，Peter Bear Galvin 和Greg Gagne 的 Operating System Concepts 9th Edition（也就是恐龍書）. （然而弱渣的我基本上是看中譯第七版，翻譯得一塌糊塗）剛好看到知乎最近也在升級Https，感覺沒有很好的答案去解釋這些概念，所以就把自己在書上看到的記在這裡。

中文是我自己翻譯的，不對準確性負責，自己的陋見用的是斜體。（時間有限，所以就選一部分啦，請參見原書Chapter 15。）

版權歸書籍原作者所有。

關於這個專欄:

啊，大概會不定期更新，看心情嘍。主題什麼的也是看心情。

而且這並不是一個技術專欄（笑

下周是西浦的考試周，所以最近比較壓抑，只能發點兒這類的東西，希望大家還是能支持下......畢竟我內心還是一個文科僧的（逃

有機會的話也會在github上架個Jekyll 模板上blog但是現在沒空...所以假期的話也該會做（flag已立）,現在的話就只是純靜態的個人介紹頁面。

Encryption 加密

Because it solves a wide variety of communication security problems, encryption
is used frequently in many aspects of modern computing. It is used to send
messages securely across a network, as well as to protect database data,

files, and even entire disks from having their contents read by unauthorized
entities.
由於加密能解決很多通信領域的安全問題，它被廣泛使用在現代計算的很多方面，包括在不同的網路間安全地發送消息以及避免數據，文件乃至整個磁碟的內容被不合法的實體讀取。

所以加密是用來解決安全問題的一種手段，關於安全問題，前面一小節有提過：

Security, on the other hand, requires not only an adequate protection
system but also consideration of the external environment within which the
system operates.

也就是說安全既要考慮內部的保護問題（簡單的說就是系統內部的訪問許可權控制）也要考慮系統外部環境的操作。反過來對系統的攻擊（attack）集中在：

Breach of confidentiality（保密性）,
Breach of integrity（完整性）,
Breach of availability（可用性，簡單的說就是unauthorized destruction未被許可的破壞）,
Theft of service（偷竊服務, 即 unauthorized use of resources私自佔有使用資源）,

Denial of service（拒絕服務）.

回到加密這個話題，

An encryption algorithm enables the sender of a message to ensure that only a computer possessing a certain key can read the message, or ensure that the writer of data is the only reader of that data.
加密演算法能夠讓一個消息的發送者確定只有那些擁有特定密鑰的計算機能夠讀取這個消息，或者僅僅只有數據的寫入者才能讀取它。
An encryption algorithm consists of the following components:
? A set K of keys.
? A set M of messages.
? A set C of ciphertexts.
? An encrypting function E : K → (M→C). That is, for each k ∈ K, $E_{k}$ is a
function for generating ciphertexts from messages. Both E and $E_{k}$ for any k

should be efficiently computable functions. Generally, $E_{k}$ is a randomized
mapping from messages to ciphertexts.
? A decrypting function D : K → (C → M). That is, for each k ∈ K, $D_{k}$ is a
function for generating messages from ciphertexts. Both D and $D_{k}$ for any
k should be efficiently computable functions.
加密演算法由下面部分構成：
? 一個密鑰集合K
?一個消息集合M
?一個密文集合C
? 一個加密函數 E : K → (M→C). 即，對於每個密鑰 k ∈ K, $E_{k}$ 是一個用消息生成密文的函數。對任意k來說， E 和 $E_{k}$ 都是高效的且可計算的函數。大致來說， $E_{k}$ 是從消息到密文的一個隨機映射。

? 一個解密函數 E : K → (C→M). 即，對於每個密鑰 k ∈ K, $D_{k}$ 是一個用密文生成消息的函數。對任意k來說， D 和 $D_{k}$ 都是高效的且可計算的函數。大致來說， $D_{k}$ 是從密文到消息的一個隨機映射。
An encryption algorithm must provide this essential property: given a
ciphertext c ∈ C, a computer can compute m such that $E_{k}$ (m) = c only if
it possesses k. Thus, a computer holding k can decrypt ciphertexts to the
plaintexts used to produce them, but a computer not holding k cannot decrypt
ciphertexts. Since ciphertexts are generally exposed (for example, sent on a
network), it is important that it be infeasible to derive k from the ciphertexts.
There are two main types of encryption algorithms: symmetric and
asymmetric.
一個加密演算法必須提供這個必要屬性：給定一個密文c ∈ 密文集合C，只有計算機掌握k的時候才能通過計算加密函數 $E_{k}$ (m) = c得到消息m。因此，只有掌握密鑰k的計算機可以解密密文，但是沒有掌握k的計算機卻不可以。因為密文常常是被暴露出來的（比如在網路中發送的時候），所以很重要的一點是讓從密文推出密鑰k變得不可能。一共有兩種加密：對稱加密和非對稱加密。

Symmetric encryption algorithm and Asymmetric encryption algorithm

對稱加密：

In a symmetric encryption algorithm, the same key is used to encrypt and to
decrypt. Therefore, the secrecy of k must be protected.
對稱加密演算法使用同樣的密鑰來加密和解密，因此，必須使密鑰k保持機密。
該圖展示了用對稱密鑰加密的過程。

幾種對稱加密演算法：

data-encryption standard (DES)：64-bit value,a 56-bit key, performing a series of transformations that are based on substitution and permutation operations （基於替換和排列變換）. Work on a block of bits at a time, is known as a block cipher.

triple DES: DES algorithm is repeated three times (two encryptions and one decryption) on the same plaintext using two or three keys—for example, c = $E_{k3}$ ( $D_{k2}$ ( $E_{k1}$ (m))). When three keys are used, the effective key length is 168 bits (i.e. 56*3).

advanced encryption standard (AES)：another block cipher, use key lengths of 128, 192, or 256 bits and works on 128-bit blocks. //這裡block cipher應該是按照固定bit長度加密的意思

RC4: stream cipher based (encrypt and decrypt a stream of bytes or bits rather than a block). This is useful when the length of a communication would make a block cipher
too slow.

非對稱加密：

In an asymmetric encryption algorithm, there are different encryption and
decryption keys.... Any sender can use that key to encrypt a communication,

but only the key creator can decrypt the communication. This scheme, known
as public-key encryption, was a breakthrough in cryptography.
在非對稱加密演算法，加密和解密密鑰是不同的。...任何發送者能均夠能使用那個密鑰（即公鑰）加密通訊，但是只有密匙創建者能解密通訊。這種模式，被稱為公鑰加密，曾是密碼學的一個突破。
Example: RSA Algorithm 舉例：RSA 加密演算法
In RSA, $k_{e}$ is the public key, and $k_{d}$ is the private key. N is the product of
two large, randomly chosen prime numbers p and q (for example, p and q are
512 bits each). It must be computationally infeasible to derive $k_{d,N}$ from $k_{e,N}$ , so
that $k_{e}$ need not be kept secret and can be widely disseminated. The encryption

algorithm is $E_{k_{e} }$ ,N(m) = $m^{k_{e} }$ mod N, where $k_{e}$ satisfies $k_{e} k_{d}$ mod (p?1)(q?1) =
1. The decryption algorithm is then $D_{k_{d} }$ ,N(c) = $c^{k_{d}}$ mod N.
在RSA中， $k_{e}$ 是公鑰， $k_{d}$ 是私鑰, N是兩個較大的隨機選擇的素數之積（比如，p,q每個都是512位長）。從 $k_{d,N}$ 到 $k_{e,N}$ 一定是不能計算出的,因此 $k_{e}$ 不必保持機密並可以被廣泛傳播。
加密演算法是 $E_{k_{e} }$ , N(m) = $m^{k_{e} }$ mod N, $k_{e}$ 滿足 $k_{e} k_{d}$ mod (p?1)(q?1) = 1.
接著解密演算法是 $D_{k_{d} }$ ，N(c) = $c^{k_{d}}$ mod N
An example using small values is shown in Figure 15.8. In this example, we
make p = 7 and q = 13.We then calculate N = 7?13 = 91 and (p?1)(q?1) = 72.
We next select $k_{e}$ relatively prime to 72 and < 72, yielding 5. Finally, we calculate
$k_{d}$ such that $k_{e} k_{d}$ mod 72 = 1, yielding 29. We now have our keys: the public
key, $k_{e}$ ,N = 5, 91, and the private key, $k_{d}$ ,N = 29, 91. Encrypting the message 69

with the public key results in the message 62, which is then decoded by the
receiver via the private key.
圖15.8一個比較小一點的例子。我們讓 p = 7, q =13。然後計算N = 7*13 且 (p?1)(q?1) = 72。接著我們相應的給 $k_{e}$ 選擇一個小於72的素數，得到5。最後我們通過 $k_{e} k_{d}$ mod 72=1計算得到 $k_{d}$ ，為29。因此現在 $k_{e}$ ,N = 5, 91；私鑰 $k_{d}$ ,N = 29, 91。用公鑰加密消息69得到密文結果62,然後通過私鑰解密。

關於RSA演算法用到的數學知識，請參見

阮一峰大神的教程I(前置數學知識)

及

阮一峰大神的教程II（最後一部分:有關演算法正確性的推導)

The use of asymmetric encryption begins with the publication of the public
key of the destination. For bidirectional communication, the source also must
publish its public key. 「Publication」 can be as simple as handing over an
electronic copy of the key, or it can be more complex. The private key (or 「secret
key」) must be zealously guarded, as anyone holding that key can decrypt any
message created by the matching public key.
非對稱加密的使用從公鑰的發布開始。對於雙向通信，消息源還必須發布它的公鑰（譯者註：注意直接把公鑰發布出去會出問題，下面會提到）。發布可以簡單的像傳遞這個公鑰的電子拷貝一樣，也可以使得它變得更複雜一點。私鑰必須被積極保護起來。因為任何持有這個私鑰的人，都可以解密由和這個私鑰相匹配的那個公鑰加密的任何信息。
We should note that the seemingly small difference in key use between
asymmetric and symmetric cryptography is quite large in practice. Asymmetric
cryptography is much more computationally expensive to execute. It is much
faster for a computer to encode and decode ciphertext by using the usual
symmetric algorithms than by using asymmetric algorithms. Why, then, use
an asymmetric algorithm? In truth, these algorithms are not used for general purpose
encryption of large amounts of data. However, they are used not
only for encryption of small amounts of data but also for authentication,
confidentiality, and key distribution, as we show in the following sections.
我們應該注意到非對稱加密和對稱加密在密鑰使用上的微小不同其實導致在實踐中二者的差別是非常大的。非對稱加密計算會消耗更多的資源。通常來說，使用對稱加密和解密密文比非對稱加密密文更快。那麼為啥我們還要用非對稱加密演算法呢？事實上，這些（非對稱加密）演算法不是為通常意義上大量數據的加密而準備的。非對稱加密不僅僅被使用在少量數據的加密中也被使用在驗證，保密和密鑰分發的過程，就像下面幾個小節所述的一樣。

Authentication認證

We have seen that encryption offers a way of constraining the set of possible
receivers of a message. Constraining the set of potential senders of a message
is called authentication. Authentication is thus complementary to encryption.
加密可以提供一種限制信息接受者範圍的途徑（譯者註：即有密鑰(非對稱中是私鑰)才能解密沒有則不能）。限制信息的發送者範圍叫做驗證。驗證是加密的補充。
Authentication is also useful for proving that a message has not been modified.
驗證也能確保信息不被更改。
An authentication algorithm using symmetric keys consists of the following
components:
? A set K of keys.
? A set M of messages.
? A set A of authenticators.
? A function S : K → (M → A). That is, for each k ∈ K, $S_{k}$ is a function for
generating authenticators from messages. Both S and $S_{k}$ for any k should
be efficiently computable functions.
? A function V : K → (M×A→{true, false}). That is, for each k ∈ K, $V_{k}$
is a function for verifying authenticators on messages. Both V and $V_{k}$ for
any k should be efficiently computable functions.
使用對稱密鑰的認證演算法由下面的部分構成：
? 一個密鑰集合K
?一個消息集合M
?一個驗證器集合A
? 一個驗證器生成函數 S : K → (M→A). 即，對於每個密鑰 k ∈ K, $S_{k}$ 是一個用消息生成驗證器的函數。對任意k來說， S和 $S_{k}$ 都是高效的且可計算的函數。
? 一個驗證器驗證函數 E : K → (M×A→{true, false}). 即，對於每個密鑰 k ∈ K, $V_{k}$ 是一個用來驗證特定消息的驗證器的函數。對任意k來說， V 和 $V_{k}$ 都是高效的且可計算的函數。
The critical property that an authentication algorithm must possess is this:
for a message m, a computer can generate an authenticator a ∈ A such
that $V_{k}$ (m, a) = true only if it possesses k. Thus, a computer holding k can generate authenticators on messages so that any computer possessing k can
verify them. However, a computer not holding k cannot generate authenticators
on messages that can be verified using $V_{k}$ . Since authenticators are generally
exposed (for example, sent on a network with the messages themselves), it
must not be feasible to derive k from the authenticators. Practically, if $V_{k}$ (m, a)
= true, then we know that m has not been modified, and that the sender of
the message has k. If we share k with only one entity, then we know that the
message originated from k.
驗證演算法的重要屬性是：對一個消息m,只有計算機掌握k的時候可以生成一個驗證器 a ∈ A 使得驗證函數 $V_{k}$ (m, a) = true 。因此，任何持有密鑰k的計算機可以生成關於消息m的驗證器，這個驗證器可被其它任意一台任何持有密鑰k的計算機通過 $V_{k}$ 來驗證。因為驗證器通常是被暴露的（比如，在網路上和消息一起被發送），所以很重要的一點是讓從驗證器推出密鑰k變得不可能。實際上，如果 $V_{k}$ (m, a) = true,我們就可以知道消息沒被更改過。如果我們只把k分享給過一個實體，那麼我們就能知道消息源自最初的發布者。
Just as there are two types of encryption algorithms, there are two main varieties of authentication algorithms.
和兩種加密演算法相同，也有兩種認證演算法。

Hash函數

The first step in understanding these algorithms is to explore hash functions. A hash function H(m) creates a small, fixed-sized block of data, known as a message digest or hash value, from a message m. Hash functions work by taking a message, splitting it into blocks, and processing the blocks to produce an n-bit hash. H must be collision resistant —that is, it must be infeasible to find an $m^{}$ = m such that H(m) = H( $m^{}$ ). Now, if H(m) = H( $m^{}$ ), we know that m = $m^{}$ — that is, we know that the message has not been modified. Common message-digest functions include MD5, now considered insecure, which produces a 128-bit hash, and SHA-1, which outputs a 160-bit hash. Message digests are useful for detecting changed messages but are not useful as authenticators. For example, H(m) can be sent along with a
message; but if H is known, then someone could modify m to $m^{}$ and recompute H( $m^{}$ ), and the message modification would not be detected. Therefore, we must authenticate H(m).
理解這兩種演算法的第一步是探索hash函數。hash函數H(m)用一條消息創建一塊小的且固定大小的數據，被稱作消息摘要或者hash值。它是這樣工作的：取一條消息,將其拆分成塊，並且處理這些小塊來產生n位的hash。H函數是拒絕碰撞的，也就是說，如果 $m^{}$ =m，則H(m) = H( $m^{}$ ) 是不成立的。所以現在，如果H(m) = H( $m^{}$ ), 那麼一定有m = $m^{}$ ，也就是說，我們可以確定這條消息m沒有被更改。常見的消息摘要函數包括產生128位hash摘要的MD5（現在已經被認為是不安全的）和產生160位hash摘要的SHA-1。消息摘要對檢測消息是否被更改十分有用但是對驗證驗證器來說是沒用的。比如，H(m)可以和一條消息一起被發送，但是如果H函數是已知的其他人可以將m修改成 $m^{}$ 然後重新計算H( $m^{}$ ),並且這種更改不能被檢測出來。因此，我們必須驗證H(m).

message-authentication code (MAC)消息驗證碼

? uses symmetric encryption 使用對稱加密
? a cryptographic checksum is generated from the message using a secret key 消息的加密校驗和通過一個密鑰來被生成。
? k is needed to compute both $S_{k}$ and $V_{k}$ , so anyone able to compute one can compute the other. k用來計算 $S_{k}$ 和 $V_{k}$ ,所以任何能計算一個的人也能計算另外一個。

digital-signature algorithm數字簽名演算法

? the authenticators thus produced are called digital signatures 這種演算法的驗證器也被叫做數字簽名。
? Digital signatures are very useful in that they enable anyone to verify the authenticity of the message.數字簽名使得任何人可以驗證消息的真實性。
? $k_{v}$ is the public key, and $k_{s}$ is the private key. $k_{v}$ 是公鑰， $k_{s}$ 是私鑰。
? infeasible to derive $k_{s}$ from $k_{v}$ 不可從 $k_{v}$ 推導出 $k_{s}$
? Example: RSA digital-signature algorithm，similar to the RSA encryption algorithm, but the key use is reversed. The digital signature of a message is derived by computing $S_{ks}$ (m) = H $(m)^{k_{s}}$ mod N.The key $k_{s}$ again is a pair <d, N>, where N is the product of two large, randomly chosen prime numbers p and q. The verification algorithm is then
$V_{k_{v}} = ?$ ( $a^{k_{v}}$ mod N = H(m)), where $k_{v}$ satisfies $k_{v} k_{s}$ mod (p ? 1)(q ? 1) = 1.
例子：RSA 數字簽名演算法, 和RSA加密演算法類似，但是key的使用是相反的(譯者註：即私鑰算出驗證器，公鑰驗證驗證器)。一個消息的數字簽名 $S_{ks}$ (m)是通過計算 $S_{ks}$ (m) =H $(m)^{k_{s}}$ mod N 得到的。密鑰 $k_{s}$ 同樣是有序數對<d,N>,N同樣是兩個巨大的且隨機選出的素數p和q之積。驗證演算法是 $V_{k_{v}} = ?$ ( $a^{k_{v}}$ mod N =H(m)),同樣 $k_{v}$ 滿足 $k_{v} k_{s}$ mod (p ? 1)(q ? 1) = 1.

（先寫到這裡好了，回來將後面SSL的部分補上。）

[Operating System Concepts搬運工] 關於加密，驗證, SSL / TLS3.0的基礎知識

Encryption 加密

Authentication認證