How should one understand the geometric or real-world meaning of matrix multiplication?

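For example, 5*3=15 can be understood as 3 bags with 5 apples in each, giving 15 apples in total. Whether a quadratic equation in one variable has a solution can be understood as whether the graph of the function in Cartesian coordinates intersects the X axis. So how should one understand the geometric or real-world meaning of matrix multiplication?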


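A matrix is a representation of a linear transformation, and a product of matrices can be viewed as a composition of linear transformations.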


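I also thought about this question for a long time: how to make the transition from high-school mathematics to university linear algebra. I happened upon an article, and by combining it with a little of what I learned from MIT's linear algebra course and the matrix theory course at Northwestern Polytechnical University (西工大), I can finally write the answer out in detail and understand why it works. (Corrections are welcome.)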

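Let us start with a vector operation from high school: the inner product. Let A=(x1,y1) and B=(x2,y2), so that A·B=x1*x2+y1*y2, as shown in the figure below.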

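Now draw a perpendicular from point A to the line through B. The point where the perpendicular meets that line is the projection of A onto B. If we write the inner product in another familiar form, A·B=|A||B|cos(a), where a is the angle between A and B, we see that the inner product of A and B equals the signed length of the projection of A onto B multiplied by the norm of B. Going one step further, if |B|=1, this becomes A·B=|A|cos(a).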
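The relation between the inner product and the projection can be checked numerically. The following is a minimal sketch in plain Python; the helper names `dot`, `norm` and `scalar_projection` and the sample vectors are chosen here for illustration and do not come from the original answer.

```python
import math

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def scalar_projection(a, b):
    """Signed length of the projection of a onto the direction of b, i.e. |a|cos(angle)."""
    return dot(a, b) / norm(b)

A = (3.0, 2.0)
B = (4.0, 0.0)                            # lies along the x axis, |B| = 4
B_unit = tuple(c / norm(B) for c in B)    # same direction, length 1

proj = scalar_projection(A, B)

print(dot(A, B), proj * norm(B))   # inner product = projection length * |B|
print(dot(A, B_unit), proj)        # with |B| = 1 the inner product is the projection itself
```

With a unit vector for B, the inner product alone gives the coordinate of A along that direction, which is the fact the rest of the answer builds on.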

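We continue to discuss vectors in two-dimensional space. A pair of numbers such as (3,2) does not by itself pin down a vector; it implicitly says that the projection on the X axis is 3 and the projection on the Y axis is 2. More formally, the vector (x,y) can be written as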

$x\left(\begin{array}{c}1 \\ 0\end{array}\right) + y\left(\begin{array}{c}0 \\ 1\end{array}\right)$, where (1,0) and (0,1) are called a basis of the two-dimensional space.

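We default to (1,0) and (0,1) as the basis simply because it is convenient: they are the unit vectors along the positive x and y axes, so point coordinates in the plane and vectors correspond one to one. In fact, any two linearly independent two-dimensional vectors can serve as a basis; in the plane, linear independence can be understood intuitively as two vectors that do not lie on the same line.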

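For example, (1,1) and (-1,1) can also form a basis. In general we want the basis vectors to have norm 1, because, as the meaning of the inner product shows, we can then take the dot product of a vector with each basis vector and directly obtain its coordinates in the new basis (this works when the basis vectors are also mutually perpendicular, as they are here). For any vector we can always find a vector of norm 1 in the same direction by dividing each component by the norm. The basis above then becomes $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ and $\left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$. (Note: basis vectors are all column vectors, following the advice in the MIT professor's lectures, but for ease of notation I mix the two forms.) Now we want the coordinates of (3,2) in the new basis, that is, its signed projections onto the two directions.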
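The normalization step and the dot-product shortcut can be sketched as follows. This is an illustrative check in plain Python, not part of the original answer; the names `normalize`, `e1`, `e2` and `coords` are assumptions made for the example, and the shortcut relies on the two basis vectors being perpendicular as well as of unit length.

```python
import math

def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    """Return the vector of length 1 that points in the same direction as u."""
    length = math.sqrt(dot(u, u))
    return tuple(c / length for c in u)

e1 = normalize((1.0, 1.0))     # ( 1/sqrt(2), 1/sqrt(2))
e2 = normalize((-1.0, 1.0))    # (-1/sqrt(2), 1/sqrt(2))
v = (3.0, 2.0)

# Coordinates of v in the new basis: one dot product per basis vector.
coords = (dot(v, e1), dot(v, e2))

# Sanity check: recombining the basis vectors with these coordinates gives v back.
rebuilt = tuple(coords[0] * a + coords[1] * b for a, b in zip(e1, e2))

print(coords)
print(rebuilt)
```

The two coordinates come out as 5/sqrt(2) and -1/sqrt(2), up to floating-point rounding.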

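According to the definition on page 11 of Cheng Yunpeng's 矩陣論 (Matrix Theory): let x1,x2,...,xn be the old basis of Vn and y1,y2,...,yn the new basis. Then, by the definition of a basis, we can write (y1,y2,...,yn)=(x1,x2,...,xn)C (here the y's and x's are all column vectors),

where C is called the transition matrix.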

So we have Y=X*C, where Y is $\left(\begin{array}{cc}\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{array}\right)$ and X is $\left(\begin{array}{cc}1 & 0 \\ 0 & 1\end{array}\right)$, which gives C as $\left(\begin{array}{cc}\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{array}\right)$.

Then, following Cheng Yunpeng's 矩陣論 (Matrix Theory), the coordinates in the new basis are $C^{-1}\left(\begin{array}{c}3 \\ 2\end{array}\right) = \left(\begin{array}{c}\frac{5}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}}\end{array}\right)$.

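We can see that C is clearly an orthogonal matrix, and an orthogonal matrix satisfies $C^{-1} = C^{T}$:

the inverse of an orthogonal matrix equals its transpose. So the basis vector that was the first column of C becomes the first row of $C^{-1}$ after inversion, and likewise for the other columns.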

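This gives the following formula: $\left(\begin{array}{cc}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{array}\right)\left(\begin{array}{c}3 \\ 2\end{array}\right) = \left(\begin{array}{c}\frac{5}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}}\end{array}\right)$.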
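Both the orthogonality of C and the resulting coordinates can be checked with a few lines of plain Python. This is only a sketch; the helpers `transpose` and `matmul` are written here for illustration and matrices are stored as lists of rows.

```python
import math

def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

s = 1 / math.sqrt(2)
C = [[s, -s],
     [s,  s]]                 # columns are the new basis vectors
Ct = transpose(C)             # rows are the new basis vectors

identity_check = matmul(Ct, C)    # should be the 2x2 identity up to rounding
v = [[3.0], [2.0]]                # the vector (3,2) as a column
new_coords = matmul(Ct, v)        # C^T v, which equals C^{-1} v for an orthogonal C

print(identity_check)
print(new_coords)
```

Because the transpose already acts as the inverse, no explicit matrix inversion is needed: stacking the basis vectors as rows and multiplying is enough.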

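In general, if we have M vectors of dimension N and want to transform them into a new space represented by R vectors of dimension N, we first arrange the R basis vectors as the rows of a matrix A and then arrange the vectors as the columns of a matrix B. The product AB is the transformed result, where the m-th column of AB is the transformed m-th column of B.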

In mathematical notation:
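One way to write this out is shown below. The symbols are assumptions made here, not taken from the original: $p_i$ denotes the i-th basis vector written as a row, and $a_j$ denotes the j-th vector to be transformed, written as a column, so each entry $p_i a_j$ is an inner product.

```latex
\left(\begin{array}{c} p_1 \\ p_2 \\ \vdots \\ p_R \end{array}\right)
\left(\begin{array}{cccc} a_1 & a_2 & \cdots & a_M \end{array}\right)
=
\left(\begin{array}{cccc}
p_1 a_1 & p_1 a_2 & \cdots & p_1 a_M \\
p_2 a_1 & p_2 a_2 & \cdots & p_2 a_M \\
\vdots  & \vdots  & \ddots & \vdots  \\
p_R a_1 & p_R a_2 & \cdots & p_R a_M
\end{array}\right)
```

The left factor is R×N, the right factor is N×M, and the result is R×M: column j holds the R coordinates of $a_j$ with respect to the new basis.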

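Note in particular that R can be smaller than N, and that R determines the dimension of the transformed data. In other words, we can transform N-dimensional data into a lower-dimensional space, and the resulting dimension depends on the number of basis vectors. This form of matrix multiplication can therefore also express a dimension-reducing transformation.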
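A small numerical case with R smaller than N makes the dimension reduction concrete. The sketch below is an illustration in plain Python; the matrices `A` and `B` and the helper `matmul` are made up for the example.

```python
import math

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

s = 1 / math.sqrt(2)

# R = 1 basis vector of dimension N = 2, stored as the single row of A.
A = [[s, s]]

# M = 3 data vectors of dimension N = 2, stored as the columns of B:
# (1,1), (2,2) and (3,4).
B = [[1.0, 2.0, 3.0],
     [1.0, 2.0, 4.0]]

AB = matmul(A, B)    # shape R x M = 1 x 3

print(len(AB), len(AB[0]))    # 1 3
print(AB)
```

Each 2-dimensional column of B has been replaced by a single number, its coordinate along the one basis direction kept in A.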

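Finally, the analysis above also gives matrix multiplication a physical interpretation: multiplying two matrices transforms each column vector of the right-hand matrix into the space whose basis is the row vectors of the left-hand matrix. More abstractly, a matrix can represent a linear transformation. Many students find the rule for matrix multiplication strange when they first learn linear algebra, but once its physical meaning is understood, the rule makes sense at a glance.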

References:

1. PCA數學原理 (The Mathematical Principles of PCA)

2. Cheng Yunpeng (程雲鵬), 矩陣論 (Matrix Theory)

3. Gilbert Strang, Introduction to Linear Algebra


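After thinking about it for a long time, I finally understand. A matrix is a linear transformation: it stretches and transforms a vector, and it does so through the matrix's transformation basis. Take the row vectors of the matrix as the transformation basis (this reading assumes the vector is written as a row and multiplied on the left of the matrix). For example, the x-axis basis vector is responsible for transforming the vector's x component (x,0), and the y-axis basis vector is responsible for transforming the y component (0,y). If a transformation basis vector is a unit vector, the length of that component is unchanged; if not, the length changes. The hard part to grasp is that any vector (x,y) can be written as (x,0)+(0,y). So a linear transformation essentially uses the matrix's transformation basis to transform each component of the vector.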


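Put simply (the algebraic meaning), matrix multiplication is the process of substituting one linear function into another. Take two linear functions of one variable, y=kx and x=pz. Substituting x=pz into y=kx gives y=kpz; no problem so far, right? If x, y and z are vectors and k and p are matrices, then kp is where the product of two matrices comes from. You can verify this with two systems of two linear equations in two unknowns: the result is exactly matrix multiplication.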
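The suggested verification with two 2×2 systems can be sketched as follows. The matrices `K` and `P` and the vector `z` are arbitrary numbers picked for illustration, and `matvec` and `matmul` are small helpers written for this example.

```python
def matvec(m, v):
    """Apply the linear system with coefficient matrix m to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

K = [[1, 2],
     [3, 4]]     # y = K x
P = [[0, 1],
     [5, 6]]     # x = P z
z = [7, 8]

x = matvec(P, z)                        # substitute: first compute x from z
y_two_steps = matvec(K, x)              # then compute y from x
y_one_step = matvec(matmul(K, P), z)    # or apply the single matrix KP directly

print(y_two_steps)    # [174, 356]
print(y_one_step)     # [174, 356]
```

The row-times-column rule is exactly what is needed for the one-step and two-step results to agree for every z.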

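Its geometric meaning is the composition of two linear transformations. For example, if matrix A represents a rotation and matrix B represents a stretch, then AB is the combined transformation: B stretches first, then A rotates, and the product applies both as a single transformation.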
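A concrete rotation and stretch show the composition at work. In this sketch the specific matrices (a 90-degree counterclockwise rotation and a stretch of the x direction by 2) are chosen for illustration only.

```python
def matvec(m, v):
    """Apply matrix m to the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[0, -1],
     [1,  0]]    # rotate 90 degrees counterclockwise
B = [[2, 0],
     [0, 1]]     # stretch the x direction by a factor of 2

v = [1, 1]

stretched_then_rotated = matvec(A, matvec(B, v))    # apply B first, then A
combined = matvec(matmul(A, B), v)                  # apply AB in one step
other_order = matvec(matmul(B, A), v)               # BA rotates first, then stretches

print(stretched_then_rotated)    # [-1, 2]
print(combined)                  # [-1, 2]
print(other_order)               # [-2, 1]
```

The last line is a reminder that the order matters: AB and BA are in general different transformations.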

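As a real-world example, a robot arm on a car production line has several joints. The rotation of each joint can be regarded as a spatial rotation matrix, and the final position of the end of the arm is the result of multiplying all the joint matrices together (the joints acting in concert).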
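The robot-arm idea can be illustrated with a deliberately simplified model. The sketch below assumes a planar arm with two joints and uses 3×3 homogeneous matrices so that each joint's rotation and link length fit in one matrix; the function `joint`, the link lengths and the angles are all hypothetical choices for the example, not a description of any real manipulator.

```python
import math

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def joint(theta, length):
    """Homogeneous 2-D transform: rotate by theta, then move `length` along the rotated x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, length * c],
            [s,  c, length * s],
            [0.0, 0.0, 1.0]]

theta1, len1 = math.pi / 2, 1.0     # first joint: 90 degrees, link of length 1
theta2, len2 = -math.pi / 2, 1.0    # second joint: -90 degrees relative to the first link

# Chain the joints by multiplying their matrices, base joint on the left.
T = matmul(joint(theta1, len1), joint(theta2, len2))

# The last column of T holds the position of the arm's tip.
tip = (T[0][2], T[1][2])
print(tip)    # close to (1.0, 1.0)
```

The tip position agrees with the closed-form expression for a two-link planar arm, x = len1·cos(theta1) + len2·cos(theta1+theta2) and the analogous formula for y, which is the sense in which the end position is the product of all the joint matrices.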


------------------------------------------------------------------------------------------------------------------------------------------

I happened to see this question today, so I am copying my answer from Quora. Please be kind; this is just my own understanding. Thanks!

---------------------------------------------------------Let me add a divider of my own-----------------------------------------------------------

Here is my understanding. The basic idea of a matrix, with its rows and columns, resembles an X-Y coordinate system, that is, a 2-dimensional space. So we can loosely treat rows and columns as two separate parts. If you are more interested in the row relation (say, the X-axis), you look at the problem from the perspective of the X-axis. In other words, imagine you are standing somewhere on the X-axis, between the origin and the positive end (NOTICE: the X-axis is a 1-dimensional space), and you would rather tackle your problem with 1-dimensional methods. Don't worry! Let me describe it more vividly.

The problem lies in a 2-dimensional space, and you want to solve it from a 1-dimensional perspective. Why should we do that (Question 1)? How can we do it (Question 2)? The answer to Question 1 is that, in our experience, a 1-dimensional problem is usually easier to solve than a 2-dimensional one. The answer to Question 2 is more complex. A matrix is a smart way to compress a 2-dimensional question into a 1-dimensional one. The approach, of course, is to distinguish rows from columns. Keep imagining that you are standing on the X-axis with a 1-dimensional point of view, ready to tackle the problem in front of you. You can safely ignore the Y-axis, whatever happens on it, because you have no way of encountering anything in the second dimension (the Y-axis). That is a simple approach for a problem solver.

Staying with Question 2 :)

  • Imagine a compressor squeezing a square biscuit along one direction; see Figure 1.

  • You are more interested in the X-axis, and you don't care what happens on the Y-axis. Some solvers first assume that each column contains homogeneous elements, so that the column can be compressed. We then use the Compressing and Decreasing Dimension Method (I named it CDDM); see Figure 2.

  • For this reason, you are in effect walking along the X-axis from point O(0,0) to point A(a,0), where 'a' is a real number on the X-axis, and considering a 1-dimensional problem instead. Figure 3 shows how the question has been replaced and simplified.

Did you get the idea of the Compressing and Decreasing Dimension Method (CDDM)? Let me draw a simple conclusion. A matrix is a 2-dimensional problem. We use CDDM to simplify it; that is, we choose either rows or columns when calculating or when proving a theorem. I think of this approach as reducing the dimension. The rule is that we assume the elements contained in each row or column share coordinated, corresponding properties.

Yes, a matrix is a container! If you are more interested in the row relation and imagine walking along the X-axis, you act as if there were no columns, so you compress each column into a single element. You then walk from the origin O(0,0) to point A(a,0), where 'a' is a real number on the X-axis. It is easier to find the row relation in the matrix this way, isn't it?

Now, I suggest you think again about the form of the matrix below in Figure 4. Why do we write it like that?

I hope I have explained Question 2 clearly enough, but English is my second language and I am not very fluent in it. What I want to highlight in the end is that rows and columns are equivalent, which I think is the main relationship between them. The only differences are the angle from which we tackle a problem and the way we understand a piece of knowledge. CDDM is a useful attitude in real life too.


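A matrix is a representation of a linear transformation, and multiplying a matrix by a vector applies to that vector the linear transformation the matrix represents. The transformation is carried out by transforming the basis: the columns of the matrix are the new basis vectors after the transformation.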

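Multiplying two matrices, AB, takes the "new basis" represented by the columns of B and passes it through the linear transformation represented by A, producing a "new new basis". In effect it is the composition of the linear transformation B and the linear transformation A.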
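The column-by-column reading of AB can be checked directly. In the sketch below the matrices are arbitrary examples, and `column` is a small helper introduced here to pull one column out of a matrix stored as a list of rows.

```python
def matvec(m, v):
    """Apply matrix m to the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Matrix product: entry (i, j) is the inner product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def column(m, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in m]

A = [[0, -1],
     [1,  0]]    # the outer transformation
B = [[2, 1],
     [0, 1]]     # its columns (2,0) and (1,1) are the "new basis"

AB = matmul(A, B)

# Each column of AB is A applied to the matching column of B.
for j in range(2):
    print(column(AB, j), matvec(A, column(B, j)))
```

Both printed pairs match, so the columns of AB are the "new new basis": the basis given by B, transformed once more by A.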

In short, a large part of linear algebra comes down to working with bases.


Bilibili has 3blue1brown's channel, which includes the Essence of Linear Algebra series; it is worth watching for reference.


I took a quick look and find it very interesting, but my English is not great. Is there a translation?

