How do you do engineering-scale development with functional programming?

12-27

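In OO you can design with UML. In functional programming, Haskell and Lisp have different standards for object orientation. So I would like to know: how do you do engineering-scale development with functional programming?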

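Don Stewart's series of slides is worth consulting: https://donsbot.wordpress.com/papers/. Several of them cover experience building large projects in Haskell. One more that is not on that page is his Google Tech Talk 2015 talk "Haskell in the Large", about the engineering experience of developing in Haskell at Standard Chartered. Haskell in the Large [pdf]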

A good summary on Stack Overflow is "Large-scale design in Haskell?", also answered by Don Stewart. I'll take the liberty of pasting it here:

I talk a bit about this in Engineering Large Projects in Haskell and in the Design and Implementation of XMonad. Engineering in the large is about managing complexity. The primary code structuring mechanisms in Haskell for managing complexity are:

The type system

Use the type system to enforce abstractions, simplifying interactions.

Enforce key invariants via types

(e.g. that certain values cannot escape some scope)

That certain code does no IO, does not touch the disk

Enforce safety: checked exceptions (Maybe/Either), avoid mixing concepts (Word,Int,Address)

Good data structures (like zippers) can make some classes of testing needless, as they rule out e.g. out of bounds errors statically.
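As an illustration added here (not part of the quoted answer), the points about keeping concepts apart and making failure visible in types could be sketched roughly as follows. All names (`Address`, `Len`, `advance`, `safeDiv`, `average`) are hypothetical, and only the Prelude is assumed.

```haskell
-- Hypothetical names throughout; only the Prelude is used.

-- Distinct newtypes keep an address and a length from being mixed up,
-- even though both are plain Ints at runtime.
newtype Address = Address Int deriving (Show, Eq)
newtype Len     = Len Int     deriving (Show, Eq)

-- Advance an address by a length. Swapping the two arguments is a type error.
advance :: Address -> Len -> Address
advance (Address a) (Len n) = Address (a + n)

-- Failure is visible in the type: callers have to handle Nothing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- No IO appears in this type, so this function cannot touch the disk.
average :: [Int] -> Maybe Int
average xs = safeDiv (sum xs) (length xs)
```

Loaded into GHCi, `advance (Len 5) (Address 10)` is rejected by the type checker, while `average []` evaluates to `Nothing` instead of raising a division error.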

The profiler

Provide objective evidence of your program's heap and time profiles.

Heap profiling, in particular, is the best way to ensure no unnecessary memory use.

Purity

Reduce complexity dramatically by removing state. Purely functional code scales, because it is compositional. All you need is the type to determine how to use some code -- it won't mysteriously break when you change some other part of the program.

Use lots of "model/view/controller" style programming: parse external data as soon as possible into purely functional data structures, operate on those structures, then once all work is done, render/flush/serialize out. Keeps most of your code pure.
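A minimal sketch of that parse, operate, render shape (added here as an illustration, not taken from the quoted answer). The `Order` type, the "name quantity" line format, and every function name are assumptions made for the example.

```haskell
import Data.Maybe (mapMaybe)

-- Hypothetical domain type: one order line such as "apple 2".
data Order = Order { item :: String, qty :: Int } deriving (Show, Eq)

-- Parse external text into a pure data structure as early as possible.
-- Lines that do not match "name quantity" are rejected with Nothing.
parseOrder :: String -> Maybe Order
parseOrder line = case words line of
  [name, n] | [(q, "")] <- reads n -> Just (Order name q)
  _                                -> Nothing

-- Operate on the pure structures.
totalQty :: [Order] -> Int
totalQty = sum . map qty

-- Render only once all the work is done.
render :: Int -> String
render n = "total: " ++ show n ++ "\n"

-- The whole pipeline is a pure String -> String function.
process :: String -> String
process = render . totalQty . mapMaybe parseOrder . lines

-- The only impure piece: connect the pure pipeline to stdin/stdout.
run :: IO ()
run = interact process
```

Everything except `run` can be tested without any IO, for instance by evaluating `process "apple 2\npear 3\n"` in GHCi.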

Testing

QuickCheck + Haskell Code Coverage, to ensure you are testing the things you can't check with types.

GHC +RTS is great for seeing if you're spending too much time doing GC.

QuickCheck can also help you identify clean, orthogonal APIs for your modules. If the properties of your code are difficult to state, they're probably too complex. Keep refactoring until you have a clean set of properties that can test your code, that compose well. Then the code is probably well designed too.
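To make "properties that are easy to state" concrete, here is a small sketch added as an illustration (not part of the quoted answer). To stay free of external packages it does not use QuickCheck itself: the properties are ordinary `Bool` functions checked over a small fixed sample, which is only a stand-in for QuickCheck's random generation. With the QuickCheck package installed, the same `prop_` functions could be handed to its `quickCheck` function. The run-length encoder is a hypothetical module under test.

```haskell
import Data.List (group)

-- Hypothetical code under test: run-length encoding.
encode :: String -> [(Char, Int)]
encode s = [(c, length g) | g@(c:_) <- group s]

decode :: [(Char, Int)] -> String
decode = concatMap (\(c, n) -> replicate n c)

-- Property: decoding an encoded string gives back the original.
prop_roundTrip :: String -> Bool
prop_roundTrip s = decode (encode s) == s

-- Property: two adjacent runs never carry the same character.
prop_noAdjacentDuplicates :: String -> Bool
prop_noAdjacentDuplicates s = and (zipWith (/=) cs (drop 1 cs))
  where cs = map fst (encode s)

-- Stand-in for random generation: a small fixed sample (an assumption
-- of this sketch; QuickCheck would generate many inputs instead).
samples :: [String]
samples = ["", "a", "aaabbc", "abab", "zzzz"]

checkAll :: Bool
checkAll = all prop_roundTrip samples && all prop_noAdjacentDuplicates samples
```

Both properties fit in one line each, which in the sense of the paragraph above is a hint that the `encode`/`decode` API is reasonably orthogonal.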

Monads for Structuring

Monads capture key architectural designs in types (this code accesses hardware, this code is a single-user session, etc.)

E.g. the X monad in xmonad, captures precisely the design for what state is visible to what components of the system.
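The idea of a monad fixing which state is visible to which code can be sketched with a small hand-rolled monad (an illustration added here; it is not xmonad's actual definition of `X`, and `Config`, `Session` and the helper names are hypothetical). Code running in `Session` can read the configuration and update a window counter, and nothing else.

```haskell
-- Read-only configuration visible to Session code.
data Config = Config { maxWindows :: Int }

-- A Session computation reads a Config and threads an Int window count.
newtype Session a = Session { runSession :: Config -> Int -> (a, Int) }

instance Functor Session where
  fmap f (Session g) = Session $ \cfg n ->
    let (a, n') = g cfg n in (f a, n')

instance Applicative Session where
  pure a = Session $ \_ n -> (a, n)
  Session f <*> Session g = Session $ \cfg n ->
    let (h, n')  = f cfg n
        (a, n'') = g cfg n'
    in (h a, n'')

instance Monad Session where
  Session g >>= k = Session $ \cfg n ->
    let (a, n') = g cfg n in runSession (k a) cfg n'

-- The only primitive operations the rest of the program is given.
askMax :: Session Int
askMax = Session $ \cfg n -> (maxWindows cfg, n)

windowCount :: Session Int
windowCount = Session $ \_ n -> (n, n)

setWindowCount :: Int -> Session ()
setWindowCount n = Session $ \_ _ -> ((), n)

-- Open a window unless the configured limit has been reached.
openWindow :: Session Bool
openWindow = do
  limit <- askMax
  n     <- windowCount
  if n < limit
    then setWindowCount (n + 1) >> return True
    else return False
```

If a module exports only `Session` (abstractly) together with these helpers, any function of type `Session a` is known from its type alone to be unable to perform IO or change the configuration.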

Type classes and existential types

Use type classes to provide abstraction: hide implementations behind polymorphic interfaces.
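A rough sketch of hiding implementations behind a polymorphic interface (added as an illustration, not from the quoted answer). The `Store` class, its two instances and `countWords` are hypothetical names; it covers type classes only, not existential types.

```haskell
-- Hypothetical interface: a minimal key-value store.
class Store s where
  emptyStore :: s
  insert     :: String -> Int -> s -> s
  lookupKey  :: String -> s -> Maybe Int

-- Implementation 1: an association list.
newtype ListStore = ListStore [(String, Int)]

instance Store ListStore where
  emptyStore                  = ListStore []
  insert k v (ListStore kvs)  = ListStore ((k, v) : kvs)
  lookupKey k (ListStore kvs) = lookup k kvs

-- Implementation 2: a lookup function.
newtype FunStore = FunStore (String -> Maybe Int)

instance Store FunStore where
  emptyStore               = FunStore (const Nothing)
  insert k v (FunStore f)  = FunStore (\k' -> if k' == k then Just v else f k')
  lookupKey k (FunStore f) = f k

-- Client code is written against the interface only and works
-- unchanged with either implementation.
countWords :: Store s => [String] -> s
countWords = foldl bump emptyStore
  where bump s w = insert w (maybe 1 (+ 1) (lookupKey w s)) s
```

The caller picks the implementation with a type annotation, for example `lookupKey "a" (countWords ["a", "b", "a"] :: ListStore)`.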

Concurrency and parallelism

Sneak par into your program to beat the competition with easy, composable parallelism.

Refactor

You can refactor in Haskell a lot. The types ensure your large scale changes will be safe, if you're using types wisely. This will help your codebase scale. Make sure that your refactorings will cause type errors until complete.

Use the FFI wisely

The FFI makes it easier to play with foreign code, but that foreign code can be dangerous.

Be very careful in assumptions about the shape of data returned.

Meta programming

A bit of Template Haskell or generics can remove boilerplate.

Packaging and distribution

Use Cabal. Don't roll your own build system.

Use Haddock for good API docs

Tools like graphmod can show your module structures.

Rely on the Haskell Platform versions of libraries and tools, if at all possible. It is a stable base.

Warnings

Use -Wall to keep your code clean of smells. You might also look at Agda, Isabelle or Catch for more assurance. For lint-like checking, see the great hlint, which will suggest improvements.

With all these tools you can keep a handle on complexity, removing as many interactions between components as possible. Ideally, you have a very large base of pure code, which is really easy to maintain, since it is compositional. That's not always possible, but it is worth aiming for.

In general: decompose the logical units of your system into the smallest referentially transparent components possible, then implement them in modules. Global or local environments for sets of components (or inside components) might be mapped to monads. Use algebraic data types to describe core data structures. Share those definitions widely.

Beyond the Haskell side, here is a good resource on the F# side: F# for fun and profit. Three slide decks on that site are well done:
Functional Programming Design Patterns
Domain Driven Design
Railway Oriented Programming

I think one could write up a set of design patterns for functional programming.

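1. Modular System (MS)
Modular design is not an empty principle. First, it rests on the module support each language provides: ML's module system, Haskell's type classes, Java/C#'s OO and namespaces, and so on. Beyond that, it takes continual practice and reflection, building up experience through the cycle of interface design, coding, testing, and refactoring.

2. Type System
Make full use of the expressive power of polymorphic types;
use type information to catch errors;
...

3. Control Mutability
For example, Scala uses val/var in variable declarations to separate immutable from mutable bindings, and provides separate immutable and mutable collections libraries.

4. Customized Control Flow
Use macros, continuations and the like to build custom control flow, but limit the scope in which they apply.

5. Function Builder
Use higher-order functions and function composition to reuse existing functions.

6. Lazy by Need
Be lazy on demand.

7. Filter-Fmap-Fold (FFF)
(Fmap here means map; fold is the same as reduce.)

...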
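A small sketch of the last three patterns in the list (Function Builder, laziness, and Filter-Fmap-Fold), under the assumption that plain Haskell lists are a fair stand-in for whatever collection is in use. The function names are made up for illustration.

```haskell
square :: Int -> Int
square x = x * x

-- Function Builder and Filter-Fmap-Fold in one definition: the new
-- function is composed from existing ones (filter, then map, then fold).
sumOfSquaresOfOdds :: [Int] -> Int
sumOfSquaresOfOdds = foldr (+) 0 . map square . filter odd

-- Laziness: the source list [1 ..] is infinite, but only as many
-- elements as `take k` demands are ever computed.
firstSquaresOver :: Int -> Int -> [Int]
firstSquaresOver bound k = take k (filter (> bound) (map square [1 ..]))
```

For example, `sumOfSquaresOfOdds [1 .. 5]` evaluates to `35` (1 + 9 + 25), and `firstSquaresOver 10 3` evaluates to `[16,25,36]`.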

In my humble opinion, react.js + flux can be taken as a case of functional programming applied at engineering scale.

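There is an unverified rumor. Back in the day, Joe Armstrong could no longer put up with Smalltalk on UNIX spending a whole afternoon on a single GC, so he ordered a Smalltalk Machine. The shipping company took several months to deliver it, and by then Joe Armstrong and his colleagues had already used Prolog to build the prototype of what later became Erlang. Later the machine was, for whatever reason, handed over to Ivar Jacobson to play with. Building on Smalltalk, Ivar Jacobson and two other elderly gentlemen invented UML. UML was a wonderful thing, and when Ericsson later developed its next generation of products it decided to rely mainly on C++ and UML. After N years of development nothing had come out of it. In desperation, as a last-ditch attempt, they decided to try Erlang, and the project actually got finished. That gave us the AXD301, which actually sold and even made money.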

But the above is not the point. What I want to remind you of is that Erlang is not a functional language. Erlang is an OO language.

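Only the APL family counts as functional languages. Obviously they do not need to worry about engineering problems, because any program in APL takes just one line. Since it is only one line, you can simply memorize it. You commonly use fewer than 100 programs anyway, so memorize them all while you are young and your memory is good, and type one in again whenever you need it. It is only one line, after all.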

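What does UML have to do with engineering practice?
Haskell and Lisp are not object-oriented at all, so what "different standards" are there to speak of?
Any language that supports modularization can be used for engineering-scale development.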

By the way, what is "engineering-scale" supposed to mean?

If large projects can be built in other languages, why not in functional ones?

A language is just a tool. What stands behind it is your thinking: how to design, how to test, how to extend...

Whatever works in other languages works just the same in functional ones.

I'll flesh this out another day. For now, just a gripe: you say that as if OO could really be engineered at scale.

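Today, let us analyze in detail how functional and procedural languages differ in their effect on the complexity of software engineering.
In our experience, functional languages are better suited to small projects, and procedural languages are better suited to large projects. To analyze this qualitatively, we borrow the notion of complexity from algorithms and define the complexity of a software project as:
$O\left(n^{x}\right)$
From this we can see:

Although the most basic statement forms of functional languages are the same as those of procedural languages, when it comes to handling logic, functional languages save a great many definition and bookkeeping statements and focus on the main problem. Therefore, the lower-order terms of procedural languages are larger than those of functional languages.
However, as the system grows more complex, the statements a functional language uses to express relations among variables can no longer be as simple as in a procedural language. Meanwhile, once variables and procedures are defined in a procedural language, the cost of using them from one another is actually lower than in a functional language, whereas a functional language, lacking such definitions, often has to thread values through complicated flows. Therefore, the constant factor of functional languages is larger than that of procedural languages.
From this reasoning we find that when the project is small, functional languages have lower complexity, and when the project is large, procedural languages have lower complexity, which successfully confirms our experience.
What, you ask what n and x are? n is the headcount of your project, and x is the set of software engineering measures you use.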

As long as you can describe the modules clearly and pin down the interfaces they use, that is enough.