谷歌的 Tensor Flow 和微軟的 DMTK，兩套開源機器學習系統各有哪些特點？

01-06

Tensor Flow 主頁：http://www.tensorflow.org/
Tensor Flow GitHub 主頁：tensorflow/tensorflow · GitHub
DMTK 主頁：Distributed Machine Learning Toolkit
DMTK GitHub 主頁：Microsoft/DMTK · GitHub

話說TensorFlow現在已經支持分散式了

Deep Learning with Spark and TensorFlow

tensorflow/tensorflow/core/distributed_runtime at master · tensorflow/tensorflow · GitHub

記得有人做過benchmark測試了，tensorflow效率並不算高。不過講真谷歌業界影響力太大了。

看了DMTK，其中的multiverso的實現參考的是parameter server，現在開源的paramerer server好像只有微軟的、豆瓣的和@李沐的那個吧，應該還有eric xing實驗室的那個。TF還沒有具體了解，等看了之後再來補充。

微軟和TensorFlow對應的難道不應該是CNTK嗎？DMTK不是針對神經網路演算法的吧。

關於CNTK和TensorFlow的比較，可以參考Quora：

https://www.quora.com/How-do-you-compare-Microsoft-CNTK-and-Google-Tensorflow

CNTK更快、底層cuDNN的版本也更新，但TF顯然社區牛叉得多。Google技術影響力發揮作用了。

更詳細全面的深度學習框架比較，包括Torch、Caffe、Theano，在這裡：deepframeworks/README.md at master · zer0n/deepframeworks · GitHub

Yan Lecun推薦過 https://www.facebook.com/yann.lecun/posts/10153293997352143 他是這麼總結的：

Torch has an almost perfect rating on all counts. Theano and TensorFlow lack speed, Tensorflow and Caffe lack flexibility.

挖個圖看看，

微軟的DMTK

谷歌的Tensor Flow

微軟的是分散式的谷歌的是單機的單機的有個卵用

我知道他們的首頁有區別，

DMTK連個Fork me on git都不放，所以他們的star差了2個數量級，fork差了1個。

——

已經加了

不管是什麼技術，google目前遠超MS是肯定的。如果你不懂，看到這裡就可以了。

好笑的是MSer，無知的說自己是分布而別人是單機的。

Cross-Device Communication

這個特徵，我想MSer，大概都沒有聽說過，當MS徹底輸在伺服器端和分布結構後，努力想改變自己的單機思維，趕超google概念，這次他們強調自己是Distrubited machine。所以說人是玩單機的，我只好笑笑。

這個領域上，MS依舊是玩google玩剩下的。Google提出的是各台機器獨立思維分散執行

（Distributed Execution），然後把結果總結交流 Cross-Device Communication。和MS的那種開始就開會討論然後亂鬨哄的運算一通的分散式哪個更高明呢？光從設計概念來看高明一大截了。

當然最後誰高誰低，單從文章來看是無法斷定的，我的建議是你們最好是先去讀懂了資料再來發言。純粹外行的話，不明覺厲也可以的。不要看到一個單詞就來否定別人，事實上人家可能比你高明得去了。

你要是英文無法快速閱讀，那麼我貼這一段給你看看。

3.3 Distributed Execution

Distributed execution of a graph is very similar to multi-

device execution. After device placement, a subgraph is

created per device. Send/Receive node pairs that com-

municate across worker processes use remote communi-

cation mechanisms such as TCP or RDMA to move data

across machine boundaries.

Fault Tolerance

Failures in a distributed execution can be detected in a

variety of places. The main ones we rely on are (a) an

error in a communication between a Send and Receive

node pair, and (b) periodic health-checks from the master

process to every worker process.

When a failure is detected, the entire graph execution

is aborted and restarted from scratch. Recall however

that Variable nodes refer to tensors that persist across ex-

ecutions of the graph. We support consistent checkpoint-

ing and recovery of this state on a restart. In partcular,

each Variable node is connected to a Save node. These

Save nodes are executed periodically, say once every N

iterations, or once every N seconds. When they execute,

the contents of the variables are written to persistent stor-

age, e.g., a distributed file system. Similarly each Vari-

able is connected to a Restore node that is only enabled

in the first iteration after a restart. See Section 4.2 for

details on how some nodes can only be enabled on some

executions of the graph.

這根本不是單機，而是有規劃且自由獨立的，並能夠容錯的機制。民主自由平等的概念都在裡面了，還說人是單機。

我來這裡只是告訴你一個真理——不管世界如何改變，MSer和MS的粉兒的low是不會變的。

————————————————————

您在問題谷歌的 Tensor Flow 和微軟的 DMTK，兩套開源機器學習系統各有哪些特點？中的回答已被建議修改，原因是「不友善內容」

還要去舉報，MSer，你們沒有本事，就玩這個？

開發人員必須了解的14種機器學習工具

我準備好好研究一下CNTK，尋找志同道合者。cntk吧_百度貼吧

等等。。。那微軟之前那個cntk是個啥

Google發布的第二代深度學習系統TensorFlow不支持分散式，請參考謝澎濤，CMU機器學習在知乎的評論如何評價Google發布的第二代深度學習系統TensorFlow? - 你如何評價 X 。我相信谷歌在後續的版本中會開源分散式的TensorFlow版本。

微軟DMTK更專註於NLP