How can I build a supercomputer out of the 20-odd old PCs I have on hand?

Background: 1. I have 20-odd old desktop PCs: Pentium 4 CPUs, 512 MB of RAM, onboard graphics and sound, and discrete network cards.
2. Cloud computing took off a couple of years ago and my school built a cloud-computing center. I tried it once and found it exciting, though the experience was no different from a remote login.
3. [Deleted; see the note at the bottom.]
4. I once attended a talk by an academician of the two national academies and heard about the concept of distributed computing.
5. I have heard that Google's machine rooms are in a situation similar to mine, using old machines to share the server load.
Question: can I use these junk PCs to build a supercomputer? It doesn't need to be very powerful — I just want my electrical-design software to run without lagging.
Thank you.
Addendum, 2016-12-28 11:10:13: I deleted the description of my supervisor. I am actually grateful to him for giving me these conditions and the chance to ask this question. Please don't derail the thread — let's discuss the technology.


Clearly, the school leader you describe possesses at most one of the three qualities of intelligence, honesty, and kindness — and quite possibly none of them.

The chart below compares the compute performance (CPU Mark) of a P4 3.0 GHz against today's common mid- and high-end CPUs. Even if you went to enormous trouble to lash those 20-odd CPUs together, and even if, miraculously, there were no communication overhead or efficiency loss at all, the combined compute power of those 20-odd machines would barely match a single i7-2600 box. Taobao tells me such a box currently costs about ¥4,000.

If that kind of "supercomputer" is what you are after, or if you want to use the project to deepen your understanding of parallel computing, by all means play with it. Otherwise the best advice I can give is: sell all the machines on Taobao at ¥200 each and spend the roughly ¥5,000 you make on one genuinely fast new box. I can't resist asking: does your leader really not know that IT equipment depreciates to zero in five years?


I really don't think twenty dogs can plow a field as well as one ox.


With leaders, you have to speak a language they understand: twenty P4 boxes deliver very little performance, but their power draw is no lower than a modern PC's.
Assume each machine actually draws about 150 W on average (under full load it would be more):
150 W × 24 h × 365 days / 1000 × 20 machines = 26,280 kWh. At ¥0.5 per kWh, the twenty machines burn ¥13,140 of electricity a year.
And what kind of PC would ¥13,140 buy? A 6850K build with money to spare; stretch a little and even a 6900K build is within reach.
Its performance would casually leave those twenty machines combined far behind.
So energy saving isn't a slogan... it's a real, tangible benefit.
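For anyone who wants to reuse the argument, here is a minimal sketch of the arithmetic above (our own illustration; the 150 W average draw and ¥0.5/kWh tariff are the assumptions stated in this answer, not measured values):

/* Back-of-the-envelope electricity cost for the 20 old P4 machines. */
#include <stdio.h>

int main(void) {
    const int machines = 20;
    const double watts_per_machine = 150.0;   /* assumed average draw, not peak */
    const double hours_per_year = 24.0 * 365.0;
    const double price_per_kwh = 0.5;         /* CNY, assumed tariff */

    double kwh  = machines * watts_per_machine * hours_per_year / 1000.0;
    double cost = kwh * price_per_kwh;

    printf("Annual consumption: %.0f kWh\n", kwh);   /* ~26280 kWh */
    printf("Annual cost:        %.0f CNY\n", cost);  /* ~13140 CNY */
    return 0;
}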

Subscription account: OrangetechA


Connect these 20-odd machines to a single switch — and the switch must not touch the outside network, remember that.
Then load them up with junk software, ideally the kind that endlessly sprays garbage packets around the LAN —
I don't really understand this stuff, I'm a layman — then lock the door and keep the power on.
Come back in three to five years: whichever machine still runs is the Gu King. This is the Miao art of breeding Gu poison:
that machine will have absorbed the essence of all the others' hardware and become unmatched under heaven.


I can't give professional advice on the computers themselves — not my field. But whatever you do, do not sell these machines. Absolutely not! Judging from the leader's reply, they are probably still on the books as state-owned assets, and disposing of state assets is a genuine minefield: the slightest procedural slip and someone can hang "loss of state assets", "tunneling of benefits", "misappropriation of state assets", "abuse of power for personal gain" and so on around your neck — a disciplinary sanction if you're lucky, dismissal if you're not. The best way to handle this junk is to file a written report stating that the equipment is no longer usable and asking your superiors how to deal with it. You probably won't get any written instruction back, but procedurally you will have reported the matter, so if it is ever inspected you can show you did your part (no guarantee, but at least you'll have an explanation). Then find somewhere to stack them with dust and moisture protection, draw up an asset inventory recording each machine's unusable condition, and check and update that register at least once a year, until the finance and asset-management departments finally write the junk off the books. After that you can leave it all sitting there and forget about it. But remember: do not dispose of it. Do not dispose of it. Do not dispose of it. Just let it rot where it is.


Please don't mock other people's ideas. People abroad have already built a four-node compute cluster, so the OP's 20 machines should be doable as well. The material is in English; I found a partial Chinese translation and skimmed through it. It looks feasible, but I did not fully understand the details, so I'd appreciate help from someone who can translate it properly. (Sorry — I just noticed this overlaps with @高超's answer, but I honestly found these sources before reading his, heh; just treat this as a supplement to his answer.) Zhihu really is full of experts; I'll read the answers more carefully from now on.

Sources — official site: Microwulf: A Personal, Portable Beowulf Cluster
Chinese translation: 【個人小超算】實戰資料彙編
First, a picture to wow everyone:

Below is the translated material I found. I copied it over directly and couldn't bring the images along, so bear with it — or read the Chinese source site linked above.

A personal computer array

I. About the authors:

Joel Adams is a professor of computer science at Calvin College. He received his Ph.D. from the University of Pittsburgh in 1988; his research focuses on supercomputer interconnects; he is the author of several programming textbooks; and he has twice been a Fulbright Scholar
(Mauritius 1998, Iceland 2005).
Tim Brom is a computer science graduate student at Carnegie Mellon University; he received his B.S. in computer science from Calvin College in May 2007.

II. Overview:

This little supercomputer delivers more than 26 Gflops, costs less than $2,500, weighs less than 31 pounds, and measures 11" x 12" x 17" — just small enough to sit on a desk or a shelf.

Update: as of August 1, 2007 it can be built for $1,256, which brings its price/performance to about $48/Gflop — and at that price you could add more chips to push the performance closer to that of an early-2000s supercomputer.

This little supercomputer was designed and built by Joel Adams, professor of computer science at Calvin College, and his teaching assistant Tim Brom.


Below is the original article's table of contents (clickable):

  • Design
  • Hardware
  • Software and construction notes
  • Pictures
  • Performance
  • Computational efficiency
  • Cost efficiency
  • Power consumption
  • Press coverage
  • Related systems

III. Introduction

As a typical supercomputer user, I have to queue at the computing center, and the compute resources I may use are rationed — a real nuisance when developing new distributed software. So I wanted a machine of my own.
My dream little supercomputer would be small enough to sit on my desk like an ordinary PC, run from an ordinary wall socket, and need no special cooling to operate at room temperature...

In late 2006, two hardware developments brought this dream close to reality:

  1. Multi-core CPUs became mainstream.
  2. Gigabit Ethernet hardware became commodity.

So I sketched out a small, four-node system with a multi-core chip in each node and the nodes linked by fast Ethernet.


In the fall of 2006, Calvin College's computer science department gave us a small grant — $2,500 — to build such a system. Our goals at the time:

  • Cost under $2,500, so that ordinary people could afford one and the idea could spread.
  • Small enough to sit on my desk and fit into a travel case.
  • Light enough to carry by hand and take along in my car.
  • Strong performance, at least 20 Gflops measured:
    • for my own research,
    • for the high-performance computing course I teach,
    • for talks at professional conferences, high-school visits, and so on.
  • A single power cord plugged into an ordinary 120 V outlet.
  • Able to run at room temperature.


As far as we knew at the time, a few small or notably cost-effective supercomputers already existed, and they gave us useful reference points:

  • Little Fe
  • The Ultimate Linux Lunchbox


Below are the price/performance champions of past years:

  • 2005: Kronos
  • 2003: KASY0
  • 2002: Green Destiny
  • 2001: The Stone Supercomputer
  • 2000: KLAT2
  • 2000: bunyip
  • 1998: Avalon

Around the same time there were other cheaper or more cost-effective clusters, but those records were rewritten in 2007: the most cost-effective was the little machine described below ($94.10/Gflop in January 2007), and its own record fell half a year later ($47.84/Gflop in August 2007).


Design:

The usual way to build a personal mini-supercomputer is to pack multi-core chips into a small volume with a shared power arrangement. (If you could fabricate your own boards the volume could shrink further — a Raspberry Pi board is tiny, but its chip is weak, which is why you need so many of them to reach what one commodity PC chip delivered in 2007.)

In the late 1960s, Gene Amdahl proposed a design rule of thumb known as "Amdahl's Other Law", which says roughly:


For a balanced system, the following should be kept in proportion:

  • CPU speed,
  • memory capacity,
  • bandwidth.


High-performance computing generally has three bottlenecks: CPU speed, the memory the computation needs, and bandwidth; in this little machine, bandwidth means network bandwidth. Our budget was $2,500, so once we had fixed the memory per core and the bandwidth per core, we simply wanted the fastest CPUs we could still afford.

Using Gigabit Ethernet (GigE) internally means 1 Gbps of bandwidth per link; something faster, such as Myrinet, would have blown the budget. So roughly 1 GHz of CPU + 1 GB of RAM + 1 Gbps of network per core looks like a nicely balanced target, heh. The final decision was 2.0 GHz dual-core chips with 1 GB of RAM per core.
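As a rough illustration of that balance rule (our own sketch, not part of the original article), you can sanity-check the chosen parts by computing what each core actually gets:

/* Per-core balance check in the spirit of "Amdahl's Other Law".
 * Figures are the ones chosen in the text: 2.0 GHz dual-core CPUs,
 * 1 GB RAM per core, two GigE links per board shared by two cores. */
#include <stdio.h>

int main(void) {
    const double ghz_per_core  = 2.0;
    const double gb_per_core   = 1.0;
    const double gbps_per_core = 2.0 / 2.0;   /* 2 x 1 Gbps NICs / 2 cores */

    printf("per core: %.1f GHz : %.1f GB : %.1f Gbps\n",
           ghz_per_core, gb_per_core, gbps_per_core);
    return 0;
}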


CPUs: AMD Athlon 64 X2 3800+ (socket AM2). At $165 apiece in January 2007, these 2.0 GHz dual-core chips were the best price/performance we could find.
(By August 2007 they were much cheaper, only $65.00 each.)

To keep the volume down, we chose MSI Micro-ATX motherboards. They are small (9.6" by 8.2") and have an AM2
socket that supports AMD's multi-core Athlon chips. In fact, given the means, one could drop AMD's quad-core Athlon64
CPUs in place of these dual-cores, and the rest of the system would not need to change.

To do so, we use motherboards with
a smaller form-factor
(like Little Fe)
than the usual ATX size,
and we space them using threaded rods
(like this cluster)
and scrap plexiglass, to minimize "packaging" costs.


By building a "double decker sandwich" of
four microATX motherboards, each with
a dual core CPU and
2 GB RAM (1 GB/core),
we can build a 4-node, 8-core, 8GB multiprocessor
small enough to fit on one's desktop,
powerful enough to do useful work,
and inexpensive enough that anyone can afford one.



Each motherboard has one on-board Gigabit NIC plus a PCI-e expansion slot; we put a second GigE NIC ($41) in the PCI-e slot to balance CPU speed against network bandwidth. Across the four
boards that gives four on-board NICs plus four PCI-e NICs — eight Gigabit channels in total — all cabled into an 8-port Gigabit switch ($100).


Our intent was to provide sufficient bandwidth for each
core to have its own GigE channel,
to make our system less imbalanced with respect to
CPU speed (two x 2 GHz cores) and network bandwidth (two x 1 Gbps adaptors).
This arrangement also let us
experiment with channel bonding the two adaptors,
experiment with HPL using various MPI libraries
using one vs two NICs,
experiment with using one adaptor for "computational" traffic
and the other for "administrative/file-service" traffic,
and so on.)


Each board carries two 1 GB sticks, 2 GB per board; the 8 GB of RAM ate up 40% of the budget!!

To keep things small, the machine has no case at all; it is a completely open frame, like Little Fe and
similar clusters: the motherboards are mounted directly onto plexiglass sheets, which are held apart by thin threaded rods to form a stack. (Any ordinary hardware shop could make such a frame; using a thermally conductive aluminum or steel tray would spread the heat better and make concentrated cooling easier.)

At the very bottom, a gap between two plexiglass sheets holds the 8-port switch, the DVD drive, and the 250 GB hard disk.

The structure is shown below:

Microwulf's hardware layout

As the figure shows, the top layer carries its board on its underside, the middle layer carries boards on both faces, and the bottom layer carries its board on top — all to keep the overall height as low as possible.


Since each of our four motherboards is facing another motherboard,
which is upside-down with respect to it, the CPU/heatsink/fan assembly
on one motherboard
lines up with the PCI-e slots in the motherboard facing it.
As we were putting a GigE NIC in one of these PCI-e slots,
we adjusted the spacing between the Plexiglas pieces so as to leave a 0.5" gap
between the top of the fan on the one motherboard and the top of the NIC on the
opposing motherboard. The result is a spacing of about 6" between the motherboards, as shown here:

The spacing between motherboards


(Note: each motherboard has a free PCI-e x16 slot, left so that a GPU can be added later if more performance is wanted.)


Power comes from four 350 W supplies (one per motherboard), fixed to the plexiglass with double-sided tape; the power strip sits on the top sheet, as shown here:

Microwulf's power supplies and fans


(The hard disk, DVD drive, and switch are glued in place.)

The board nearest the bottom compartment serves as the head node — the master board, to which the hard disk, the (optional) DVD drive and so on are attached; booting, shutdown, and restarts are driven from it. The other boards act as compute nodes and boot over the network via PXE.

The head node at the bottom gets special treatment: it hosts the 250 GB disk, which holds the boot partition, and a DVD drive (used mainly for the initial OS install — nowadays you would simply use a USB stick instead).
An extra 10/100 NIC in its PCI slot connects the cluster to the outside network.

The top three nodes are diskless,
and NFS is used to export the space on the 250 GB drive to them.


The figure below shows how the parts of the machine are connected (node 0 is the hub: it has the disk, the DVD drive, and the interface to the outside world; at the center of the internal network sits the Gigabit switch that links the other nodes):

Note: every node has two independent links connecting it to the network switch.



With four CPUs blowing hot air into such a small volume, we thought we should
keep the air moving through Microwulf.
To accomplish this, we decided to purchase four Zalman 120mm case fans
($8 each) and grills ($1.50 each).
Using scavenged twist-ties, we mounted two fans
-- one for intake and one for exhaust --
on opposing sides of each pair of facing motherboards.
This keeps air moving across the boards and NICs;
Figure Five shows the two exhaust fans:

Figure Five: Two of Microwulf's (Exhaust) Fans


So far, this arrangement has worked very well: under load,
the on-board temperature sensors report
temperatures about 4 degrees above room temperature.


Last, we grounded each component (motherboards, hard drive, etc.)
by wiring them to one of the power supplies.


The operating system is Ubuntu Linux.

Open MPI automatically detects the network adapters on each node and round-robins traffic across them.
To try to help Open MPI spread the load on both the
sending and receiving side, we configured the on-board
adaptors to be part of a 192.168.2.x subnet,
and the PCI-e adaptors to be part of a 192.168.3.x subnet.
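As a hedged aside (ours, not the authors'), a quick way to confirm that Open MPI really is launching work on all eight cores across the four boards is the classic "hello" test; it would be compiled with mpicc and started with mpirun, assuming a hostfile named hosts that lists the four nodes:

/* Minimal MPI check: each rank reports its rank and host name.
 * Build: mpicc -o hello_mpi hello_mpi.c
 * Run (example): mpirun -np 8 --hostfile hosts ./hello_mpi */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &namelen);

    printf("rank %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}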

Price reference (January 2007):

Part           Product                                                  Unit price        Qty   Total
Motherboard    MSI K9N6PGM-F MicroATX                                   $80.00            4     $320.00
CPU            AMD Athlon 64 X2 3800+ AM2                               $165.00           4     $660.00
RAM            Kingston DDR2-667 1 GB                                   $124.00           8     $992.00
Power supply   Echo Star 325W MicroATX Power Supply                     $19.00            4     $76.00
NIC            Intel PRO/1000 PT PCI-Express NIC (node to switch)       $41.00            4     $164.00
NIC            Intel PRO/100 S PCI NIC (head node to outside network)   $15.00            1     $15.00
Switch         Trendware TEG-S80TXE 8-port Gigabit Ethernet Switch      $75.00            1     $75.00
Hard disk      Seagate 7200 rpm 250 GB SATA                             $92.00            1     $92.00
DVD drive      Liteon SHD-16S1S 16X                                     $19.00            1     $19.00
Case fans      Zalman ZM-F3 120mm Case Fans                             $8.00             4     $32.00
Fan grills     Generic NET12 Fan Grill (120mm)                          $1.50 + shipping  4     $10.00
Support rods   36" x 0.25" threaded rods                                $1.68             3     $5.00
Fasteners      Lots of 0.25" nuts and washers                           --                --    $10.00
Frame          12" x 11" plexiglass (scrap from our physics lab)        $0.00             4     $0.00
Total                                                                                           $2,470.00

Optional hardware

Part        Product               Unit price  Qty   Total
KVM switch  Linkskey LKV-S04ASK   $50.00      1     $50.00
Total                                               $50.00


Apart from the mounting hardware (bought at Lowes) and the fans and grills (bought at
newegg.com),
everything was purchased from (volume discount, heh):

N F P Enterprises
1456 10 Mile Rd NE
Comstock Park, MI 49321-9666
(616) 887-7385

So we were able to keep the price for the whole system to just under $2,500.
That"s 8 cores with 8 GB of memory and 8 GigE NICs for under $2,500,
or about $308.75 per core.


Build and configuration:

Click here: software and system construction notes; a detailed document is available for download — anyone who wants to build one should download it and follow it step by step.

The devil is in the details


First, which Linux distribution to use: we used Gentoo for a while, but Gentoo consumes too much energy (the system administrator's as well as the machine's), so we
tried Ubuntu. We first installed the 6.10 desktop release (kernel 2.6.17), but unfortunately the driver for the on-board
NIC was only added in 2.6.18, so for the first two months the cluster ran the 7.04 beta (kernel 2.6.20), until the stable release came out and we switched
to it.

The other three compute nodes run the Ubuntu server edition, since they do not need a desktop.

That is: one Ubuntu desktop install plus three Ubuntu server installs.


We also tried several cluster-management packages: ROCKS,
Oscar, and Warewulf. ROCKS and Oscar do not support diskless nodes. Warewulf worked well, but this cluster is so small that it showed no real advantage. Prompted by this paper we considered iSCSI, but to get the cluster running quickly we settled on NFSroot, which is very simple to configure: just edit /etc/initramfs.conf so that it generates an initial ramdisk that does NFSroot, and then set up DHCP/TFTP/PXELinux
on the head node, as you would for any diskless boot situation.


We did configure the network adaptors differently:
we gave each onboard NIC an address on a 192.168.2.x subnet,
and gave each PCI-e NIC an address on a 192.168.3.x subnet.
Then we routed the NFS traffic over the 192.168.2.x subnet,
to try to separate "administrative" traffic from computational traffic.
It turns out that OpenMPI will use both network interfaces (see below),
so this served to spread communication across both NICs.


One of the problems we encountered is that the on-board NICs
(Nvidia) present some difficulties. After our record setting run
(see the next section) we started to have trouble with the on-board NIC.
After a little googling, we added the following option to the forcedeth
module options:

forcedeth max_interrupt_work=35


The problem got better, but didn't go away. Originally we had the onboard Nvidia
GigE adaptor mounting the storage.
Unfortunately, when the Nvidia adaptor started to act up, it reset itself,
killing the NFS mount and hanging the "compute" nodes.
We"re still working on fully resolving this problem,
but it hasn"t kept us from benchmarking Microwulf.

Pictures: follow the table-of-contents links above to view them.

Performance:

Measured performance


Once Microwulf was built and functioning it's fairly obvious that we
wanted to find out how "fast" it was.
Fast can have many meanings, depending upon your definition.
But since the HPL benchmark is the standard used for the Top500 list,
we decided to use it as our first measure of performance.
Yes, you can argue and disagree with us, but we needed to start somewhere.


We installed the development tools for Ubuntu (gcc-4.1.2) and then built
both Open MPI and
MPICH.
Initially we used OpenMPI as our MPI library of choice and we had both GigE NICs
configured (the on-board adaptor and the Intel PCI-e NIC that was in the x16 PCIe slot).


Then we built the
GOTO BLAS
library, and
HPL,
the High Performance Linpack benchmark.


The Goto BLAS library built fine, but when we tried to build HPL
(which uses BLAS),
we got a linking error indicating that someone had left
a function named main() in a module named main.f in /usr/lib/libgfortranbegin.a.
This conflicted with main() in HPL.
Since a library should not need a main() function,
we used ar to remove the offending module from /usr/lib/libgfortranbegin.a,
after which everything built as expected.


Next, we started to experiment with the various parameters for
running HPL - primarily problem size and process layout.
We varied PxQ between {1x8, 2x4}, varied NB between
{100, 120, 140, 160, 180, 200}, and used increasing values of N
(problem size) until we ran out of memory.
As an example of the tests we did, Figure Six below is a plot of the HPL performance
in GFLOPS versus the problem size N.

Figure Six: Microwulf Results for HPL WR00R2R24 (NB=160)


For Figure Six we chose PxQ=2x4, NB=160, and varied
N from a very small number up to 30,000.
Notice that above N=10,000, Microwulf achieves 20 GFLOPS,
and with N greater than 25,000, it exceeds 25 GFLOPS.
Anything above N=30,000 produced "out of memory" errors.
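A hedged aside of ours: the N ≈ 30,000 ceiling is roughly what you would predict from HPL's memory use, since the benchmark factors a single N x N matrix of 8-byte doubles; the 80% usable-RAM figure below is a common rule of thumb, not a number from the article:

/* Estimate the largest HPL problem size N for a given amount of RAM.
 * Build: cc -o hplsize hplsize.c -lm */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double total_ram_bytes = 8.0 * 1024 * 1024 * 1024;  /* 8 GB */
    const double usable_fraction = 0.8;                       /* rule of thumb */
    double n_max = sqrt(total_ram_bytes * usable_fraction / 8.0);

    printf("suggested maximum N: about %.0f\n", n_max);  /* ~29300 */
    return 0;
}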


We did achieve a peak performance of 26.25 GFLOPS.
The theoretical peak performance for Microwulf is 32 GLFOPS.
(Eight cores x 2 GHz x 2 double-precision units per core.)
This means we have hit about 82% efficiency (which we find remarkable).
Note that one of the reasons we assume that we achieved such a high efficiency
is due to Open MPI, which will use both GigE interfaces.
It will round-robin data transfers over the various interfaces
unless you explicitly tell it to just use certain interfaces.
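For reference, the peak and efficiency figures quoted above can be reproduced in a few lines (our sketch; the two double-precision operations per cycle are the figure given in the text):

/* Theoretical peak vs. measured HPL performance:
 * 8 cores x 2.0 GHz x 2 double-precision FLOP/cycle. */
#include <stdio.h>

int main(void) {
    const double cores = 8, ghz = 2.0, flop_per_cycle = 2.0;
    const double measured_gflops = 26.25;   /* best HPL run */

    double peak = cores * ghz * flop_per_cycle;          /* 32 GFLOPS */
    double efficiency = 100.0 * measured_gflops / peak;  /* ~82% */

    printf("peak %.1f GFLOPS, measured %.2f GFLOPS, efficiency %.0f%%\n",
           peak, measured_gflops, efficiency);
    return 0;
}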


It"s important to note that this performance occurred
using the default system and Ethernet settings.
In particular, we did not tweak any of the Ethernet parameters mentioned in

Doug Eadline and Jeff Layton's article on cluster optimization.
We were basically using "out of the box" settings for these runs.


To assess how well our NICs were performing, Tim did some followup HPL runs,
and used netpipe to gauge our NICs latency.
Netpipe reported 16-20 usecs (microseconds) latency on the onboard NICs,
and 20-25 usecs latency on the PCI-e NICs,
which was lower (better) than we were expecting.


As a check on performance we also tried another experiment.
We channel bonded the
two GigE interfaces to produce, effectively, a single interface.
We then used MPICH2 with the channel bonded interface and used the
same HPL parameters we found to be good for Open-MPI.
The best performance we achieved was 24.89 GFLOPS (77.8% efficiency).
So it looks like Open MPI and multiple interfaces beats MPICH2 and a
bonded interface.


Another experiment we tried was to use Open MPI and just the PCI-e GigE NIC.
Using the same set of HPL parameters we have been using we achieved a
performance of 26.03 GFLOPS (81.3% efficiency).
This is fairly close to the performance we obtained when using both interfaces.
This suggests that the on-board NIC isn't doing as much work as we thought.
We plan to investigate this more in the days ahead.


Here is how this little machine's performance would have ranked on past Top500 lists:

  • Nov. 1993: #6
  • Nov. 1994: #12
  • Nov. 1995: #31
  • Nov. 1996: #60
  • Nov. 1997: #122
  • Nov. 1998: #275
  • June 1999: #439
  • Nov. 1999: dropped off the list

In November 1993 this little machine would have ranked 6th in the world; in June 1999 it would still rank 439th. Considering that a typical supercomputer fills a large machine room and needs a huge number of processors, this 4-CPU, 8-core cluster measuring only 11" x 12" x 17" acquits itself remarkably well.

Digging a little deeper into the list: on the November 1993 list, 5th place was a Thinking Machines CM-5/512 with 512 processors, reaching 30 Gflops. Our four chips stand in for 512 of that era's, heh.


In November 1996 the little machine would rank 60th, just ahead of a 256-processor Cray T3D MC256-8. Eight of today's cores outperform 256 cores from 11 years ago — and that is before mentioning price: the T3D cost over a million dollars!

Supercomputer performance is generally measured in floating-point operations per second (flops). Early supercomputer performance was measured in megaflops (Mflops: 10^6 flops). Hardware advances pushed later machines to gigaflops (Gflops: 10^9 flops). Today's massively parallel supercomputers are measured in teraflops (Tflops: 10^12 flops), and tomorrow's systems will be measured in petaflops (Pflops: 10^15 flops).


When discussing supercomputer performance, you must also
distinguish between

  • peak performance -- the theoretical maximum, and
  • measured performance -- what a benchmark program actually reports.

Vendors usually advertise peak performance, but measured performance typically comes to only 50-60% or so of peak.

Another thing to watch is precision: high-performance computing figures are normally double precision, so be careful not to mix up single- and double-precision numbers.


The standard benchmark (i.e., used by the
top500.org supercomputer list)
for measuring supercomputer performance is

high performance Linpack (aka HPL),
a program that exercises and reports a supercomputer's
double-precision floating point performance.
To install and run HPL, you must first install a version of the
Basic Linear Algebra Subprograms (BLAS) libraries,
since HPL depends on them.


In March 2007, we benchmarked Microwulf using HPL and
Goto BLAS.
After compiling and installing each package,
we ran the standard, double-precision version of HPL,
varying its parameter values as follows:
We varied PxQ between {1x8, 2x4};
varied NB between {100, 120, 140, 160, 180, 200};
and used increasing values of N, starting with 1,000.
For the following parameter values:

PxQ = 2x4; NB = 160; N = 30,000

HPL reported 26.25 Gflops on its WR00R2R4 operation.
Microwulf also exceeded 26 Gflops on other operations,
but 26.25 Gflops was our maximum.

For comparison, on the Top500 list the 1996 Cray T3D-256 reached only about 25.3 Gflops, so our 26.25 Gflops is enough to do plenty of useful work.



Since we benchmarked Microwulf,
Advanced
Clustering Technologies
has published a convenient

web-based calculator that removes much of the trial and error from
tuning HPL.

Price/performance:


When you have measured a supercomputer's performance using HPL,
and know its price, you can measure its cost efficiency
by computing its price/performance ratio.
By computing the number of dollars you are paying for each
floating point operation (flop),
you can compare one supercomputer's cost-efficiency against others.


With a price of just $2470
and performance of 26.25 Gflops,
Microwulf"s price/performance ratio (PPR)
is $94.10/Gflop, or less than $0.10/Mflop!

This makes Microwulf
the first general-purpose Beowulf cluster to break
the $100/Gflop (or $0.10/Mflop) threshold
for measured double-precision floating point performance.


The following list puts that price/performance figure in perspective:

  • In 1976,
    the Cray-1
    cost more than 8 million dollars
    and had a peak (theoretical maximum) performance of 250 Mflops,
    making its PPR more than $32,000/Mflop.
    Since peak performance exceeds measured performance,
    its PPR using measured performance
    (estimated at 160 Mflops) would be much higher.
  • In 1985,
    the Cray-2
    cost more than 17 million dollars
    and had a peak performance of 3.9 Gflops,
    making its PPR more than $4,350/Mflop ($4,358,974/Gflop).
  • In 1997, IBM's Deep Blue, which defeated world chess champion Garry Kasparov,
    cost 5 million dollars and delivered 11.38 Gflops, for a PPR of about $439,367/Gflop.
  • In 2003, the U. of Kentucky's Beowulf cluster
    KASY0
    cost $39,454 to build,
    and produced 187.3 Gflops on the double-precision version of HPL,
    giving it a PPR of about $210/Gflop.
  • Also in 2003, the University of Illinois at Urbana-Champaign's
    National Center for Supercomputing Applications built
    the PS 2 Cluster
    for about $50,000.
    No measured performance numbers are available;
    which isn"t surprising, since the PS-2 has no hardware support for double
    precision floating point operations.
    This cluster"s theoretical peak performance is about 500 Gflops
    (single-precision); however,
    one study
    showed that the PS-2's double-precision performance took
    over 17 times as long as its single-precision performance.
    Even using the inflated single-precision peak performance value,
    its PPR is more than $100/Gflop;
    it"s measured double-precision performance is probably more than 17 times that.
  • In 2004, Virginia Tech built
    System X,
    which cost 5.7 million dollars,
    and produced 12.25 Tflops of measured performance,
    giving it a PPR of about $465/Gflop.
  • In 2007, Sun's Sparc Enterprise M9000
    with a base price of $511,385,
    produced 1.03 Tflops of measured performance,
    making its PPR more than $496/Gflop.
    (The base price is for the 32 cpu model,
    the benchmark was run using a 64 cpu model,
    which is presumably more expensive.)


At $94.10/Gflop, our little machine is arguably the most cost-effective supercomputer around. It cannot, of course, deliver petaflops; if you need that, perhaps the same approach could push performance much further while staying below this price point.

Efficiency — a world record; power consumption:

At January 2007 prices, the machine cost $2,470 and delivered 26.25 Gflops, or $94.10/Gflop — a new world record.

Energy has also become a sensitive topic lately, so power efficiency (power consumed per unit of performance) is worth measuring too. It matters a great deal for clusters, above all for large farms of them (think Google's server farms). Our measurements:

  • about 250 W at idle (roughly 30 W per core),
  • about 450 W under load,

which works out to roughly 17.14 W/Gflop when running.

Compare that with other supercomputers.
Green Destiny, a machine designed expressly for low power, used very efficient chips and needed little cooling; its 240 processors drew 3.2 kW for 101 Gflops, i.e. about 31 W/Gflop — nearly twice our home-built machine's figure, so Microwulf is about twice as power-efficient!!!
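A quick check of the wattage figures quoted above (our own sketch, using the 450 W load measurement and the 26.25 Gflops HPL result):

/* Power efficiency under load. */
#include <stdio.h>

int main(void) {
    const double watts_under_load = 450.0;
    const double gflops = 26.25;

    printf("%.2f W/GFLOP\n", watts_under_load / gflops);            /* ~17.14 */
    printf("%.2f MFLOPS/W\n", gflops * 1000.0 / watts_under_load);  /* ~58.3  */
    return 0;
}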


Another interesting comparison is to the Orion Multisystems clusters.
Orion is no longer around, but a few years ago they sold two commercial
clusters: a 12-node desktop cluster (the DS-12) and
a 96-node deskside cluster (the DS-96).
Both machines used Transmeta CPUs.
The DS-12 used 170W under load, and its performance was about 13.8 GFLOPS.
This gives it a performance/power ratio of 12.31 W/GFLOP
(much better than Microwulf).
The DS-96 consumed 1580W under load, with a performance of 109.4 GFLOPS.
This gives it a performance/power ratio of 14.44W/GFLOP,
which again beats Microwulf.


Another way to look at power consumption and price is to use the metric
from Green 500.
Their metric is MFLOPS/Watt (the bigger the number the better).
Microwulf comes in at 58.33, the DS-12 is 81.18, and the deskside unit is 69.24.
So using the Green 500 metric we can see that the Orion systems are
more power efficient than Microwulf.
But let"s look a little deeper at the Orion systems.


The Orion systems look great at Watts/GFLOP and considering the age of
the Transmeta chips, that is no small feat.
But let"s look at the price/performance metric.
The DS-12 desktop model had a list price of about $10,000,
giving it a price/performance ratio of $724/GFLOP.
The DS-96 deskside unit had a list price of about $100,000,
so it"s price/performance is about $914/GFLOP.
That is, while the Orion systems were much more power efficient,
their price per GFLOP is much higher than that of Microwulf,
making them much less cost efficient than Microwulf.


Since Microwulf is better than the Orion systems in price/performance,
and the Orion systems are better than Microwulf in power/performance,
let"s try some experiments with metrics to see if we can find a useful
way to combine the metrics.
Ideally we"d like a single metric that encompasses
a system"s price, performance, and power usage.
As an experiment, let"s compute MFLOP/Watt/$.
It may not be perfect, but at least it combines all 3 numbers into a
single metric, by extending the Green 500 metric to include price.
You want a large MFLOP/Watt to get the most processing power per
unit of power as possible.
We also want price to be as small as possible
so that means we want the inverse of price to be as large as possible.
This means that we want MFLOP/Watt/$ to be as large as possible.
With this in mind, let's see how Microwulf and Orion did.

  • Microwulf: 0.02362
  • Orion DS-12: 0.00812
  • Orion DS-96: 0.00069


From these numbers (even though they are quite small),
Microwulf is almost 3 times better than the DS-12
and almost 35 times better than the DS-96 using this metric.
We have no idea if this metric is truly meaningful but it gives us
something to ponder.
It's basically the performance per unit power per unit cost.
(OK, that's a little strange, but we think it could be a useful
way to compare the overall efficiency of different systems.)
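For readers who want to reproduce the numbers, here is a small sketch of this combined metric (ours, not the authors'; the performance, power, and price figures are the ones quoted in the text):

/* MFLOPS per watt per dollar for the three systems discussed above. */
#include <stdio.h>

int main(void) {
    struct { const char *name; double mflops, watts, dollars; } sys[] = {
        { "Microwulf",    26250.0,  450.0,   2470.0 },
        { "Orion DS-12",  13800.0,  170.0,  10000.0 },
        { "Orion DS-96", 109400.0, 1580.0, 100000.0 },
    };
    for (int i = 0; i < 3; i++)
        printf("%-12s %.5f MFLOPS/W/$\n", sys[i].name,
               sys[i].mflops / sys[i].watts / sys[i].dollars);
    return 0;   /* prints ~0.02362, ~0.00812, ~0.00069 */
}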


We might also compute the inverse of the MFLOP/Watt/$ metric:
-- $/Watt/MFLOP --
where you want this number to be as small as possible.
(You want price to be small and you want Watt/MFLOP to be small).
So using this metric we can see the following:

  • Microwulf: 144,083
  • Orion DS-12: 811,764
  • Orion DS-96: 6,924,050


This metric measures the price per unit power per unit performance.
Comparing Microwulf to the Orion systems, we find that Microwulf is
about 5.63 times better than the DS-12,
and 48 times better than the DS-96.
It"s probably a good idea to stop here,
before we drive ourselves nuts with metrics.



While most clusters publicize their performance data,
very few clusters publicize their power consumption data.


Some notable exceptions are:

  • Green
    Destiny,
    an experimental blade cluster built at Los Alamos National Labs in 2002.
    Green Destiny was built expressly to minimize power consumption,
    using 240 Transmeta TM560 CPUs.
    Green Destiny consumed 3.2 kilowatts and produced 101 Gflops
    (on Linpack), yielding a power/performance ratio of 31 watts/Gflop.
    Microwulf"s 17.14 watts/Gflop is much better.
  • The (apparently defunct)
    Orion
    Multisystems DS-12 and DS-96 systems:
  • The DS-12 "desktop" system consumed 170 watts under load,
    and produced 13.8 Gflops (Linpack),
    for a power/performance ratio of 12.31 watts/Gflop.
    (The DS-12's list price was about $10,000,
    making its price/performance ratio $724/Gflop.)
  • The DS-96 "under desk" system consumed 1580 watts under load,
    and produced 109.4 Gflops (Linpack),
    for a power/performance ratio of 14.44 watts/Gflop.
    (The DS-96's list price was about $100,000,
    making its price/performance ratio about $914/Gflop.)
  • Our little machine's price/performance far outstrips these commercial systems,
    and its power efficiency is also near the front of the pack.


The Green500 list is derived from the Top500 (our little machine is not on it, heh) and ranks systems by operations per watt. Microwulf's 17.14 W/Gflop converts as follows:

1 / 17.14 W/Gflop * 1000 Mflops/Gflop = 58.34 Mflops/W

As of August 2007 that would edge out the #2 system on the Green500, MareNostrum (58.23 Mflops/W) -- though it is still a long way from the #1 BlueGene/L (112.24 Mflops/W).

Conclusions

This little supercomputer is a 4-CPU, 8-core cluster measuring 11" x 12" x 17" — small enough for a desktop, and small enough to pack and take on a plane.

Besides being small, it measures 26.25 Gflops on HPL at a total cost of $2,470 (January 2007), a price/performance ratio of $94.10/Gflop.



Microwulf gets its punch from three things:

  • Multi-core CPUs have gone mainstream, which lets the system stay small.
  • RAM prices have fallen sharply:
    memory was the most expensive part of this build, but prices keep dropping fast; 8 GB ought to be enough, right??
  • Gigabit Ethernet is commodity: on-board GigE adaptors, inexpensive GigE NICs,
    and inexpensive GigE switches allow Microwulf to offer
    enough network bandwidth to avoid starving a parallel computation
    with respect to communication.


We have no intention of keeping any of this a secret; we hope everyone will give it a try, and in fact most of the parts can be substituted.

For example, as SSD prices fall, it would be interesting to swap the mechanical hard disk for an SSD and see how that affects performance.


Or the memory: with prices falling, the 1 GB sticks could be swapped for 2 GB sticks, giving 2 GB per core. Recalling that HPL kept running out of memory when we increased N above 30,000,
it would be interesting to see how many more FLOPS one could eke out with
more RAM.
The curve in Figure Six suggests that performance is beginning to plateau,
but there still looks to be room for improvement there.


Or the motherboard and CPU: this MSI board's AM2 socket also accepts AMD's newer quad-core Athlon64 chips, so the dual-core chips above could be replaced to turn the whole system into a
16-core machine with considerably more punch. Anyone interested could measure how much performance that adds, how the price/performance changes, how the Gigabit internal network copes, and so on...

And so on... especially today, several years later (2012), nearly every item on this list could be swapped for something newer.

Parts prices, August 2007:


Component prices fall quickly: CPUs, memory, networking, disks have all dropped a lot. Prices at Newegg as of August 2007:

Part           Product                                                      Unit price  Qty   Total
Motherboard    MSI K9N6PGM-F MicroATX                                       $50.32      4     $201.28
CPU            AMD Athlon 64 X2 3800+ AM2                                   $65.00      4     $260.00
RAM            Corsair DDR2-667 2 x 1 GB                                    $75.99      4     $303.96
Power supply   LOGISYS Computer PS350MA MicroATX 350W Power Supply          $24.53      4     $98.12
NIC            Intel PRO/1000 PT PCI-Express NIC (node to switch)           $34.99      4     $139.96
NIC            Intel PRO/100 S PCI NIC (head node to outside network)       $15.30      1     $15.30
Switch         SMC SMCGS8 10/100/1000Mbps 8-port Unmanaged Gigabit Switch   $47.52      1     $47.52
Hard disk      Seagate 7200 rpm 250 GB SATA                                 $64.99      1     $64.99
DVD drive      Liteon SHD-16S1S 16X                                         $23.83      1     $23.83
Case fans      Zalman ZM-F3 120mm Case Fans                                 $14.98      4     $59.92
Fan grills     Generic NET12 Fan Grill (120mm)                              $6.48       4     $25.92
Support rods   36" x 0.25" threaded rods                                    $1.68       3     $5.00
Fasteners      Lots of 0.25" nuts and washers                               --          --    $10.00
Frame          12" x 11" plexiglass (scrap from the physics lab)            $0.00       4     $0.00
Total                                                                                         $1,255.80


(Prices today should be lower still, and the achievable performance correspondingly higher!!!)

So by August 2007 the price/performance had reached $47.84/Gflop, breaking the $50/Gflop barrier!!!!!


The power efficiency stays the same.


Folding price, performance, and power together, the combined metric comes to 0.04645 Mflops/Watt/$, twice the original build's; the inverse-style metric works out to 73,255, also a two-fold improvement over the original build.


Applications:


Like any other supercomputer, this little machine runs parallel software — software written specifically to exploit the system's parallel computing capability.

Such software generally uses MPI (Message Passing Interface) or PVM (Parallel Virtual Machine). These libraries provide the most basic machinery of distributed computing: they let processes communicate and synchronize across the network, and they provide a mechanism for running copies of a program on every node and gathering the results at the end.


Many applications already run on clusters like this one, most written by domain scientists to solve specific problems:

  • CFD
    codes,
    an assortment of programs for computational fluid dynamics
  • DPMTA,
    a tool for computing N-body interactions
  • fastDNAml,
    a program for computing phylogenetic trees from DNA sequences
  • Parallel finite element analysis (FEA) programs, including:

    • Adventure,
      the ADVanced ENgineering analysis Tool
      for Ultra large REal world,
      a library of 20+ FEA modules
    • deal.II,
      a C++ program library providing computational solutions
      for partial differential equations using adaptive finite elements
    • DOUG,
      Domain decomposition On Unstructured Grids
    • GeoFEM,
      a multi-purpose/multi-physics parallel finite element
      simulation/platform for solid earth

    • ParaFEM,
      a general parallel finite element message passing library
  • Parallel
    FFTW,
    a program for computing fast Fourier transforms (FFT)
  • GADGET,
    a cosmological N-body simulator
  • GAMESS,
    a system for ab initio quantum chemistry computations
  • GROMACS,
    a molecular dynamics program for modeling molecular interactions,
    especially those from biochemistry
  • MDynaMix,
    a molecular dynamics program for simulating mixtures
  • mpiBLAST,
    a program for comparing gene sequences
  • NAMD,
    a molecular dynamics program
    for simulating large biomolecular systems
  • NPB 2,
    the NASA Advanced Supercomputing Division's Parallel Benchmarks
    suite. These include:

    • BT, a computational fluid dynamics simulation
    • CG, a sparse linear system solver
    • EP, an embarrassingly parallel floating point solver
    • IS, a sorter for large lists of integers
    • LU, a different CFD simulation
    • MG, a 3D scalar Poisson-equation solver
    • SP, yet another (different) CFD simulation
  • ParMETIS,
    a library of operations on graphs, meshes, and sparse matrices

  • PVM-POV,
    a ray-tracer/renderer
  • SPECFEM3D,
    a global and regional seismic wave simulator
  • TPM,
    a collisionless N-body (dark matter) simulator


Here is how we use Microwulf:

  • for undergraduate research projects at Calvin College
  • As a high performance computing resource for
    CS 374:
    High Performance Computing
  • in progress:

    • building a few for local high schools, to raise students' interest in computing
    • taking it to conferences as a working example of a personal supercomputer
  • When not being used for these tasks,
    Microwulf runs the client for Stanford's
    Folding@Home project,
    which helps researchers better understand protein folding,
    which in turn helps them understand the causes of
    (and hopefully the cures for) genetic diseases.
    Excess CPU cycles on a Beowulf cluster like Microwulf can be
    devoted to pretty much any
    distributed
    computing project.

Frequently asked questions:

  1. Will Microwulf run [insert favorite program/game] faster?
    Unless the program has been written specifically to run in
    parallel across a network
    (i.e., it has been written using a parallel library like

    message passing interface (MPI)), probably not.


    A normal computer with a multicore CPU is a shared memory
    multiprocessor
    , since programs/threads running on the
    different cores can communicate with one another through
    the memory each core shares with the others.


    On a Beowulf cluster like Microwulf, each motherboard/CPU has its own
    local memory, so there is no common/shared memory through which
    programs running on the different CPUs can communicate.
    Instead, such programs communicate through the network,
    using a communication library like

    MPI.
    Since its memory is distributed among the cluster's CPUs,
    a cluster is a distributed memory multiprocessor.


    Many companies only began writing their programs for shared-memory
    multiprocessors (i.e., using multithreading) in 2006
    when dual core CPUs began to appear.
    Very few companies are writing programs for distributed memory
    multiprocessors (but there are some).
    So a game (or other program) will only run faster on Microwulf
    if it has been parallelized to run on a distributed multiprocessor.

  2. Can Windows be used to run a little supercomputer like this?
    The key to making any cluster work is the availability of
    a software library that will in parallel run a copy of a program
    on each of the cluster's cores,
    and let those copies communicate across the network.
    The most commonly used library today is

    MPI.


    There are several versions of MPI available for Windows.
    (To find them, just google "windows mpi".)
    So you can build a cluster using Windows.
    But it will no longer be a Beowulf cluster,
    which, by definition, uses an open source operating system.
    Instead, it will be a Windows cluster.


    Microsoft is very interested in high performance computing --
    so interested, they have released a special version of Windows called
    Windows
    Compute Cluster Server (Windows CCS),
    specifically for building Windows clusters.
    It comes with all the software you need to build a Windows cluster,
    including MPI.
    If you are interested in building a Windows cluster,
    Windows CCS is your best bet.

  3. I want to build a little supercomputer of my own. Where can I learn how?
    There are many websites that describe how.
    Here are a few of them:

    • Building a Beowulf System, by Jan Lindheim,
      provides a quick overview
    • Jacek Radajewski and Douglas Eadline's HowTo
      provides a more detailed overview
    • Kurt Swendson's HowTo
      provides step-by-step instructions for
      building a cluster using Redhat Linux and LAM-MPI
    • Engineering a Beowulf-style Compute Cluster,
      by Robert Brown, is an online book on building Beowulf clusters,
      with lots of useful information.
    • The Beowulf mailing list FAQ,
      by Don Becker, et al, is a list of answers to questions
      frequently posted to the

      Beowulf.org mailing list,
      which has a

      searchable Archive.

    • Beowulf.org's Projects page
      provides a list of links to the first hundred or so Beowulf
      cluster project sites.
      Many of these sites provide information that is useful to
      someone building a Beowulf cluster.
  4. How did you mount the motherboards to the plexiglas?
    Our vendor supplied
    screws and brass standoffs
    with our motherboards.
    The standoffs have a male/screw end, normally screwed into the case;
    and a female/nut end, to which the motherboard is screwed.
    To use these to mount the motherboards,
    we just had to:

    1. drill holes in the plexiglass pieces
      in the same positions as the motherboard mounting holes;
    2. screw the brass standoffs into the holes in the
      plexiglass pieces; and
    3. screw the motherboards to the standoffs.


    To prepare each plexiglass piece, we laid a motherboard on top of it
    and then used a marker to color the plexiglass through the
    motherboard"s mounting holes.
    The only tricky parts are:

    • one piece of plexiglass has motherboards on both its top
      and its bottom, so you have to mark both sides; and
    • two motherboards hang upside down, and two sit right-side up,
      so you have to take that into account when marking the holes.


    We used a red marker to mark the positions of the
    holes on motherboards facing up, and a blue marker to mark
    the positions of the holes on motherboards facing down.


    With the plexiglass pieces marked,
    we took them to our campus machine shop
    and used a drill press to drill holes in each piece of plexiglass.


    When all the motherboard holes were drilled, we stacked the
    plexiglass pieces as they would appear in Microwulf and
    drilled holes in their corners for the threaded rods.


    We then screwed the standoffs into the plexiglass,
    taking care not to overtighten them.
    Being made of soft brass, they are very easy to shear off.
    If this happens to you, just take the piece of plexiglass back to the
    drill press and drill out the bit of brass screw that's in the hole.
    (Or, if this is the only one, you can just leave it there and
    use one fewer screws to mount the motherboard.)


    With the standoffs in place,
    we then placed the motherboards on the standoffs,
    and used screws to secure them in place.
    That"s it!


    The only other detail worth mentioning is that before we screwed
    each motherboard tight to the standoffs, we chose one standoff
    on each motherboard to ground that motherboard against static.
    To do this grounding, we got some old phone wire,
    looped one end to the standoff,
    and then tightened the screw for that standoff.
    We then grounded each wire to one of the threaded rods,
    and grounded that threaded rod to one of the power supplies.

  5. Is this little supercomputer a product? Can you sell me one?
    No — mainly because neither of us knows anything about business.


    But we are trying to build an endowment to provide in-house
    funding for student projects like Microwulf,
    so if you"ve found this site to be useful,
    please consider making a (tax-deductible) donation to it:

    CS Hardware Endowment Fund
    Department of Computer Science
    Calvin College
    3201 Burton SE
    Grand Rapids, MI 49546

    Thanks!

A user who has actually tried this commented:
That was many years ago.....

The point is not that the OS is Ubuntu Linux.
The point is:
you need to know how to assemble the hardware; how to tune the OS; how to configure a pile of services — NFS (for the diskless setup), NIS for user accounts, MPI (Gaussian can run without this parallel environment), network tuning, tuning the communication between the machines.
If you only understand the hardware, and your Linux experience stops at 3D desktop eye candy, then getting this whole system working
will be fairly hard.
I have done it myself, with only two machines, also diskless, using the distribution I know best, RHEL 5.3.
The author's build notes are written for people who have administered Linux systems and are familiar with Linux networking;
for someone who has never done network administration or network services, it will be heavy going.
What he wrote is a design outline, not a step-by-step how-to.
Anyone who is interested should give it a try!
With a diskless setup like this, performance depends to a large extent on your disk performance!
Note that this setup is suited to paral…


It's not that hard — it's just that, as the top answer says, the performance will be truly, truly poor. And you don't need anything as involved as the second answer. All you need to do is:
1. Connect them all to a router (nothing fast; the machines are slow anyway).
2. Install Debian on each machine. Only one needs a GUI; Xfce or LXDE will keep it lean.
3. Read the MPICH documentation and set up an MPI environment on the machines.
4. Go write your own MPI programs and have fun!! (a starter sketch follows below)
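Here is a hedged sketch of the kind of starter program step 4 has in mind — the classic exercise of estimating pi by numerical integration, with the intervals split across ranks and the partial sums combined with MPI_Reduce. It is our illustration, not part of the original answer, and it runs the same way under MPICH or Open MPI (the hostfile name is an assumption):

/* Classic first MPI program: estimate pi = integral of 4/(1+x^2) on [0,1].
 * Build: mpicc -o cpi cpi.c    Run: mpirun -np 8 --hostfile hosts ./cpi */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    const long n = 10000000;          /* number of intervals */
    double h, local = 0.0, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)n;
    for (long i = rank; i < n; i += size) {      /* each rank takes every size-th slice */
        double x = h * ((double)i + 0.5);
        local += 4.0 / (1.0 + x * x);
    }
    local *= h;

    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi is approximately %.12f\n", pi);

    MPI_Finalize();
    return 0;
}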


There is no practical point to building this, but it is a decent learning platform. The OP could try standing up a small Hadoop cluster, or an MPI cluster, and get a "taste" of internet-style cloud computing.
That said, the P4 3.0 era was exactly when the P4's power consumption peaked, so this is not an economical toy. Then again, the school pays for the electricity, so it doesn't hurt your own wallet.


I don't know what drives the people who keep sneering at this idea, but I suspect the OP wants to attempt something rather than merely weigh the economics. Build it with Hadoop — and may the OP one day own a real supercomputing cluster.


你以為"主原料是沙子的CPU"能進行運算是魔法?
你以為用網線把20台電腦捆起來,就能並行計算?
+++++++++
從另外一個角度,說一下這個問題。
現在的並行計算技術,即使是業界最前沿的Spark, Hadoop YARN等也只能對特定的計算任務進行並行化。這裡說的特定任務主要指的是,可以拆解並行化的任務,而且這個「拆解」也是需要人工進行。像題主說的「做電設的軟體」對於並行計算是非常複雜的一項任務,即使能進行並行化,人工的費用也是極不划算的。


[Photos] US college students build a cheap desktop supercomputer
The news link has the details:
The builder, Tim Brom, gives the full build instructions on his homepage; anyone interested can try it themselves.
Related links:
http://www.calvin.edu/~adams/research/microwulf/
http://www.clustermonkey.net//content/view/211/1/

Hardware list:
Microwulf: Hardware Manifest


It mainly depends on what you need:

  1. For big-data processing, consider Hadoop.
  2. For storage, consider a cloud-storage system such as Ceph or Swift.

Raw compute power is probably not their strong suit.


Turning them into a supercomputer is of course not feasible, but a simple cluster is. In fact, many small and mid-sized research projects abroad run on clusters of a few dozen CPUs. Before China's aviation research institutes had supercomputers, they built clusters out of a dozen or a few dozen desktop PCs for large computations. Note, though, that a cluster like this needs purpose-written software; it won't make ordinary office software run faster. Writing that kind of software demands strong programming skills and a deep understanding of the hardware — and if you are capable of writing it, you would be better off putting that skill to more practical use. In short, don't waste the effort.


Sell them to a scrap dealer: ¥20 apiece, ¥400 in total.

An x58 knock-off board plus CPU costs under ¥500.

Add some RAM and you will own a 6-core, 12-thread "super" PC — personally tested to run 20 LoL gold-farming bots at once.

If you still want more little boxes in the CPU usage graph, go for a dual-socket x58 board: two CPUs give you 12 cores and 24 threads, enough to beat 98% of PCs in a Master Lu benchmark run.

And of course, if you are a devoted AMD believer, try a dual Opteron setup, top up your faith, and receive AMD's blessing.

Punches the E3, kicks the i7 — x58, the king of price/performance (just kidding).


Please ask the principal to try building an 80386 out of a hundred thousand abacuses.


Many people may no longer realize it, but these machines are probably slower than your phone.


This is a pretty good idea.

The school pays the electricity bill, and the leadership won't care.

Hands-on practice is exactly how you improve.


I teach at a vocational school and, like you, have a pile of junk hardware. The one advantage is that electricity costs are not my problem. I put Linux on four machines, set up Hadoop, then ran web crawlers day and night collecting weather data and doing haphazard statistics on it.
Found absolutely nothing.
But:
doing it sent my data-analysis skills way up. Almost three years after moving from engineer to teacher, I recently took third prize in the city trade union's data-cleaning competition — the first prize anyone from our institution has won in several editions of the event.
As for the highly-voted answers that dwell on energy efficiency, I reserve judgment.


Someone has built a parallel-computing platform out of 64 Raspberry Pis; it is worth a look.


Ignore the people laughing at you and go do it. Even if you abandon it halfway, the knowledge and skills you pick up will be enough to land you a job paying ten times as much. Worst case you break the machines — and then the leadership has to pay for new ones.

