Haskell等語言中的模式匹配在C++中如何實現？

01-03

如果僅僅是字面量的匹配，用條件語句還是可以比較容易地實現。但如果是List，Tuple之類的匹配，如Haskell趣學指南中的例子

這種應該如何在c++中較為簡便地實現？
List類型可對應vector，也可對應鏈表。

如果有了解Haskell編譯器的知友，希望能從Haskell編譯器實現的角度來分析一下。

Scott encoding 告訴我們，代數數據類型/模式匹配是可以用基本類型和一等函數實現的語法糖。C++ 有 lambda，有 std::function，可以用同樣的思路來搞。

比較麻煩的情況是遞歸的代數數據類型。C++ 沒有 isorecursive type，換句話說，不能直接 typedef 一個 std::function ，然後讓他的參數列表裡提到他自己。不過可以用繼承的方式來 workaround 一下。

以下是 C++ 中使用 Scott encoding 編碼單鏈表的例子。所有地方 pass by value，不考慮空間效率和缺乏尾遞歸優化的問題：

#include & #include &


template &

struct SList

    : std::function&, std::function&)&>)&> {

  template &

  SList(F f)

      : std::function&, std::function&)&>)&>(

            f){};

};
template &

SList& Nil = SList&([](auto n, auto c) { return n(); });
template &

SList& Cons(T x, SList& xs) {

  return SList&([=](auto fx, auto fxs) { return fxs(x, xs); });

}

int main() { auto l = Nil&; for (;;) { int x; std::cin &>&> x; if (x &> 0) l = Cons&(x, l); else break; } std::function& fx = []() { std::cout &<&< "End of list!" &<&< std::endl; }; std::function&)&> fxs = [](auto x, auto xs) { std::cout &<&< "List element " &<&< x &<&< std::endl; xs(fx, fxs); }; l(fx, fxs); return 0; }

以上代碼使用 g++ 6.3.0 -std=c++14 編譯通過。代數數據類型 SList 的兩個模板參數 T 和 R 分別代表列表元素類型，以及在這個列表上進行計算的最終返回類型。定義了兩個構造器 Nil 和 Cons，可以用於組裝列表；要對 SList 進行模式匹配，只需要傳入兩個 std::function，分別匹配兩種構造器，而構造器 Cons 的兩個參數，將被綁定到後一個 std::function 的參數上。

基本原理就是這樣。scope-safe，type-safe，除了特別丑和慢以外他是能 work 的。實現函數式語言的模式匹配的話，使用 Scott encoding 是可以考慮的。不過 C++ 還是想辦法用 tagged union 是正經。

關於用 encoding 表示代數數據類型和模式匹配的原理，可以看我的另一個回答：https://www.zhihu.com/question/39930042/answer/137582076

std::variant 已經確定要進C++17了，至於原生的 lvariant 和 pattern matching 支持也有提案，不過估計得趕C++23了

Pattern Matching and Language Variants: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0095r1.html

CppCon 2016: David Sankel 「Variants: Past, Present, and Future": https://www.youtube.com/watch?v=k3O4EKX4z1c

// This lvariant implements a value representing the various commands // available in a hypothetical shooter game. lvariant command { std::size_t set_score; // Set the score to the specified value std::monostate fire_missile; // Fire a missile unsigned fire_laser; // Fire a laser with the specified intensity double rotate; // Rotate the ship by the specified degrees. };

// Output a human readable string corresponding to the specified "cmd" command // to the specified "stream". std::ostream operator&<&<( std::ostream stream, const command cmd ) { return inspect( cmd ) { set_score value =&> stream &<&< "Set the score to " &<&< value &<&< ". " fire_missile m =&> stream &<&< "Fire a missile. " fire_laser intensity: stream &<&< "Fire a laser with " &<&< intensity &<&< " intensity. " rotate degrees =&> stream &<&< "Rotate by " &<&< degrees &<&< " degrees. " }; } // Create a new command "cmd" that sets the score to "10". command cmd = command::set_score( 10 );

C++可以用模板元編程黑魔法造，比如boost hana裡面的一個給boost::any提供模式匹配支持的東西，用起來是這樣的：

boost::any a = "x"; auto r = switch_(a)( case_&([](auto) -&> int { return 1; }), case_&([](auto) -&> long { return 2l; }), default_([]() -&> long long { return 3ll; }) ); // r is inferred to be a long long static_assert(std::is_same&{}, ""); assert(r == 2ll);

實現在這裡

還有@霧雨魔理沙用C++造出的幾（dai）何（shu）數據類型，帶模式匹配，用起來是這樣的

typedef algebraic_data_type&< size_t, recursive_indicator, std::tuple&< recursive_indicator, recursive_indicator &> &> exp; DECLARE_CONSTRUCTOR( exp, 0, var, T ); DECLARE_CONSTRUCTOR( exp, 1, abs, T ); DECLARE_CONSTRUCTOR( exp, 2, app, T );

exp update_index( const exp self, size_t i, size_t current_depth ) { return self.match( with( var( arg ), []( size_t ii ) { return var( ii &> current_depth ? ii + i - 1 : ii ); } ), with( app( arg, arg ), []( const exp l, const exp r ) { return app( update_index( l, i, current_depth ), update_index( r, i, current_depth ) ); } ), with( abs( arg ), []( const exp ex ) { return abs( update_index( ex, i, current_depth + 1 ) ); } ) ); }

實現在這裡

MarisaKirisame/algebraic_data_type

這可是C++，沒有什麼是模板黑魔法解決不了的，如果有，就再糊一層宏

感覺如果在 C++ 里匹配的就不是表面上的整個模式而是分開的部分(全部)欄位了…

類似於 scala編譯成位元組碼之後的

這種匹配是運行時才能決定的，放到C++裡面自然是用if語句。而且C++搞這個具體的例子更方便，更直接。跟Haskell不一樣的是，std::list獲取size也是O(1)，所以也可以這麼寫。

void Fuck(const vector& xs) { switch (xs.size()) { case 0: ... break; case 1: ... break; case 2: ... break; default: ... } }

沒法實現。

C++ 和 Haskell，完全就是不同的語言物種。

C++ 是 ADT 嗎？C++ 完全 immutable 嗎？C++ 有 Laziness 嗎？

如果這些都沒有，就是給硬寫一個長得像的乞丐版的 pattern-matching，能用嗎？

程序（語言）設計里有些思想是相通的，但絕大多數特性是不相通的，跨類類比時要多想想前者，而不要在後者上做糾結及浪費時間。

可以用Visitor設計模式來模擬模式匹配，Java代碼（C++類似）如下：

interface ListVisitor& { O Nil(); O Cons(E x, List& xs); } interface List& { & O accept(ListVisitor& v); } class Nil& implements List& { public & O accept(ListVisitor& v) { return v.Nil(); } } class Cons& implements List& { E x; List& xs; public & O accept(ListVisitor& v) { return v.Cons(x, xs); } } class Length& implements ListVisitor& { public Integer Nil() { return 0; } public Integer Cons(E x, List& xs) { return 1 + xs.accept(this); } }

局限之處在於一個具體Visitor只能進行一層匹配，對於需要深層匹配的函數（如問題中的tell）則需要多個Visitor來模擬，在沒有語法糖的支持下比較繁瑣。Visitor模式的理論基礎與其它回答中提到的Scott encoding相近，為Parigot encoding。其區別可參考 https://ifl2014.github.io/submissions/ifl2014_submission_13.pdf 該文用Haskell給出了List在幾種encoding下的實現

llvm中的，不知道這個算不算

==========

ps：歪樓

看到@孫明琦那個c++hbb，想到這個c#版本

ImageMacro image = new Lolcat("I made you a cookie"); Pattern.Match(image). Case& (c =&> Print("Lolcat says "" + c + """)). Case& (b =&> Print("I has " + b + " buckets")). Case& (() =&> Print("O RLY?"));

代碼來自ML-style Pattern Matching in C# ，不過怎麼看都像是DSL啥的

solodon4/Mach7

在C++ 是這樣的，經過parser 把整段代碼去，再把token變成你定義好的各類型的CELL，再組成s expression 的Abstract Data Tree。

有了這ADT 加上C++ 的特性，把語言的操作符如+,-, * , / 等，overload 去支持自己定義的eval。或者簡單直接用自己定義的操作符

這樣就可以動態支持，像Haskell 的模式匹對。這其實是Functional programming 的天性。