Exploring the Performance of a Distributed Search Database
A distributed search database serves many enterprise scenarios that demand high-speed retrieval: second-level search and full-text retrieval over massive single tables, full SQL support, insert/update/delete/query operations, and compatibility with the MySQL/PostgreSQL protocols, making it an enterprise-grade answer to the problem of searching massive data sets.
Environment Preparation
The following configuration files must be modified first, otherwise the cluster will not start properly.
```
vi /etc/sysctl.conf
vm.max_map_count=262144

sysctl -p

vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536

vi /etc/profile
export CRATE_HEAP_SIZE=100g
```
Create the data storage directories and a `searchdb` user; the CrateDB cluster is installed under the `searchdb` user.
```
wget https://cdn.crate.io/downloads/releases/crate-2.1.5.tar.gz -P /opt/

tar -zxvf crate-2.1.5.tar.gz

mkdir /disk0{1..4}/searchdb

useradd searchdb
chown searchdb:searchdb /disk0{1..4}/searchdb
chown searchdb:searchdb -R /opt/crate-2.1.5

rm -rf /disk0{1..4}/searchdb/*
```
Note: a JDK must be installed and JAVA_HOME configured in advance. Pay particular attention to the JDK version: jdk1.8.0_45 or later is required.
Cluster Installation
For the three-node cluster, edit the `crate.yml` file on each node. Points to note: `network.host` is set to each node's own hostname, `node.name` is filled in with the hostname, and the cluster name (`my_cluster` here) must be unique.
```
cluster.name: my_cluster

node.name: node1

discovery.zen.ping.unicast.hosts:
  - node1:4300
  - node2:4300
  - node3:4300

discovery.zen.minimum_master_nodes: 2

gateway:
  recover_after_nodes: 2
  recover_after_time: 5m
  expected_nodes: 2

path.data: /disk01/searchdb,/disk02/searchdb,/disk03/searchdb,/disk04/searchdb

network.host: node1

psql.enabled: true
psql.port: 5432
```
- Start the cluster
```
bin/crate -d
```
- Access the cluster
```
bin/crash

cr> \c node1:4200
+-------------------+-----------+---------+-----------+---------+
| server_url        | node_name | version | connected | message |
+-------------------+-----------+---------+-----------+---------+
| http://node1:4200 | node1     | 2.1.5   | TRUE      | OK      |
+-------------------+-----------+---------+-----------+---------+
```
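Once connected, it is worth confirming that all three nodes actually joined. A minimal check against the built-in `sys` tables (not part of the original write-up):

```
-- every node that has joined the cluster
select id, name, hostname from sys.nodes;

-- cluster-level view; master_node should point at a stable master
select name, master_node from sys.cluster;
```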
- Quick test
```
create table tweets (
  created_at timestamp,
  id string primary key,
  retweeted boolean,
  source string INDEX using fulltext,
  text string INDEX using fulltext,
  user_id string
);

insert into tweets values (1394182937, '1', true, 'web', 'Dont panic', 'Douglas');

insert into tweets
  values (
    1394182938,
    '2',
    true,
    'web',
    'Time is an illusion. Lunchtime doubly so',
    'Ford'
  );

insert into tweets values (1394182937, '3', true, 'address', '中國,北京', '北京');

select * from tweets where id = '2';

select user_id, _score from tweets where match(text, 'is') order by _score desc;

select user_id, _score from tweets where match(text, '北京') order by _score desc;

select user_id, _score from tweets where match(text, '京城') order by _score desc;

DELETE FROM tweets where id = '3';
```
This quick test exercises some basic SQL, covering full-text search and tokenization via MATCH; UPDATE and DELETE of data are also supported.
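The block above shows DELETE but not UPDATE; for completeness, a minimal UPDATE against the same table might look like this (values are hypothetical):

```
-- flip the retweeted flag on a single row
update tweets set retweeted = false where id = '1';
```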
Performance Testing
Generate test data for the `books` table: roughly 100 GB in total, averaging 6.0K per record, then convert the text data to JSON with a Python script.
```
nohup java -cp dbgen-1.0-jar.jar DBGen -p ./data -b 100 &
# Total Time: 2610 seconds

cat books | python csv2json.py --columns id:integer isbn:string category:string publish_date:string publisher:string price:float > books.json
```
Sample data:
```
$ head data/books/books
0|6-20386-216-4|STUDY-AIDS|1998-05-31|Gakken|166.99
1|0-60558-466-8|JUVENILE-NONFICTION|1975-02-12|Holtzbrinck|128.99
2|3-16551-636-9|POETRY|1988-01-24|Oxford University Press|155.99
3|4-75505-741-2|COMICS-GRAPHIC-NOVELS|1992-02-24|Saraiva|101.99
4|3-32982-589-8|PERFORMING-ARTS|2011-03-09|Cambridge University Press|183.99
```
Basic commands:
```
./crash --help
./crash --hosts node1 --sysinfo

./crash --hosts node1 -c "show tables"
```
Create the table schema and import the data.
```
CREATE TABLE books (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
);

COPY books FROM '/disk01/data/books/books.json';
```
The commands above can generate test data at different orders of magnitude; the parameters control the resulting table size.
Test Scenario 1
- Table scale: tens of millions of rows
- Ingest rate: 15m/s (per node)
- JSON size: 6.6 GB
- Size in CrateDB: 5.2 GB
- Row count: 47,582,950
- Shards: 6
- Replicas: 2
- Memory: 16 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Network: gigabit
The tests below focus on queries against a single table.
```
select category, count(*) from books group by category limit 100; -- 3.137 sec

select category, count(*) as num from books group by category order by num limit 100; -- 2.929 sec

select category, count(*) as num from books where category = 'SCIENCE' group by category order by num limit 100; -- 0.143 sec

select count(*) from books where category = 'SCIENCE' limit 100; -- 0.022 sec

select count(distinct category) from books limit 100; -- 2.990 sec

select distinct category from books limit 100; -- 3.032 sec
```
Change `number_of_shards` to see whether it improves performance:
```
ALTER TABLE books SET (number_of_shards = 48);

OPTIMIZE table books; -- quite useful: optimizing the table noticeably improves query performance

SELECT count(*) as num_shards, sum(num_docs) as num_docs FROM sys.shards WHERE schema_name = 'doc' AND table_name = 'books';
```
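For reference, OPTIMIZE also takes options; one variant (not run in the original tests, shown here as a sketch) forces each shard down to a single segment, paying a heavier one-off merge cost so that later reads touch fewer segments:

```
-- merge each shard down to one segment; expensive once, cheaper scans afterwards
OPTIMIZE TABLE books WITH (max_num_segments = 1);
```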
Test Scenario 2
- Table scale: hundreds of millions of rows
- Import time: 17,285.888 sec
- JSON size: 33 GB
- Size in CrateDB: 27 GB
- Row count: 235,265,838
- Nodes: 3
- Shards: 1024
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Ingest rate: 13,610 rows/s
- Network: gigabit
Create the table and import the data.
```
CREATE TABLE books_t1 (
  id integer,
  isbn string,
  category string INDEX using fulltext,
  publish_date string,
  publisher string INDEX using fulltext,
  price float
) CLUSTERED BY (category) INTO 1024 SHARDS with (number_of_replicas = 2, refresh_interval = 10000);

COPY books_t1 FROM '/disk01/data/books/books.json'; -- COPY OK, 235265838 rows affected (17285.888 sec)
```
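A bulk-load pattern worth trying here (my suggestion, not something the original test did) is to drop replicas and disable the periodic refresh for the duration of the COPY, then restore both, since replicating every incoming batch is a large share of import cost:

```
-- before the bulk COPY: no replicas, refresh disabled
ALTER TABLE books_t1 SET (number_of_replicas = 0, refresh_interval = 0);

-- ... run the COPY here ...

-- afterwards: restore replicas and a periodic refresh (milliseconds)
ALTER TABLE books_t1 SET (number_of_replicas = 2, refresh_interval = 10000);
```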
Run the performance tests.
```
OPTIMIZE table books_t1;

select category, count(*) from books_t1 group by category limit 100; -- 2.556 sec

select category, count(*) as num from books_t1 group by category order by num limit 100; -- 2.763 sec
-- Problem: Error! SQLActionException[SQLParseException: Cannot GROUP BY 'category': grouping on analyzed/fulltext columns is not possible]

select count(*) from books_t1 where match(category, 'PERFORMING-ARTS'); -- limit 100; -- 0.256 sec

select * from books_t1 where match(category, 'ARTS'); -- limit 100; -- 0.928 sec
```
Note: columns with fulltext indexes cannot be used in aggregation/GROUP BY operations at all; they can only be queried with full-text MATCH. The data was re-imported without fulltext indexes and tested again (Scenario 3).
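One way around this trade-off, if both capabilities are needed, is CrateDB's named fulltext index defined on top of a plain column: the raw column stays groupable while MATCH targets the named index. A sketch (the table name `books_t1b` is hypothetical):

```
CREATE TABLE books_t1b (
    id integer,
    isbn string,
    category string,                               -- plain column: GROUP BY still works
    INDEX category_ft using fulltext (category),   -- named index: MATCH works against this
    publish_date string,
    publisher string,
    price float
);

-- full-text search through the named index
select count(*) from books_t1b where match(category_ft, 'PERFORMING-ARTS');

-- aggregation through the raw column
select category, count(*) from books_t1b group by category limit 100;
```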
Test Scenario 3
- Table scale: hundreds of millions of rows
- Import time: 5,662.132 sec
- JSON size: 33 GB
- Size in CrateDB: 25.3 GB
- Row count: 235,265,838
- Nodes: 3
- Shards: 1024
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Ingest rate: 13,610 rows/s
- Network: gigabit
Create the table and insert the data.
```
CREATE TABLE books_t2 (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
) CLUSTERED BY (category) INTO 1024 SHARDS;

COPY books_t2 FROM '/disk01/data/books/books.json';

insert into books_t2 select * from books_t1; -- INSERT OK, 235265838 rows affected (5662.132 sec)
```
Performance tests.
```
OPTIMIZE table books_t2;

select category, count(*) from books_t2 group by category limit 100; -- 3.994 sec

select category, count(*) as num from books_t2 group by category order by num limit 100; -- 4.159 sec

select category, count(*) as num from books_t2 where category = 'SCIENCE' group by category order by num limit 100; -- 1.731 sec

select count(*) from books_t2 where category = 'SCIENCE' limit 100; -- 0.001 sec

select count(distinct category) from books_t2 limit 100; -- 4.677 sec

select distinct category from books_t2 limit 100; -- 3.914 sec

select id, price, publisher from books_t2 where publish_date = '1999-02-02' and category = 'SCIENCE' limit 100; -- 0.014 sec
```
Note: with this many shards, heap usage stays persistently high, reaching 57%; the table was created with 1024 shards and global indexes on two columns.
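Heap pressure of this kind can be watched from SQL through `sys.nodes`; object columns are addressed with bracket syntax (this query is added here for illustration):

```
-- per-node heap usage, converted from bytes to GB
select name,
       heap['used'] / 1024.0 / 1024 / 1024 as heap_used_gb,
       heap['max']  / 1024.0 / 1024 / 1024 as heap_max_gb
from sys.nodes;
```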
Test Scenario 4
- Table scale: 1.5 billion rows
- Ingest rate: 15m/s (per node)
- JSON size: 215 GB
- Size in CrateDB: 175.6 GB
- Row count: 1,551,303,191
- Shards: 500
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB), 4 nodes
- Network: gigabit
Convert the generated text data to JSON.
```
nohup cat /disk01/searchdb/data/books/books | python csv2json.py --columns id:integer isbn:string category:string publish_date:string publisher:string price:float > /disk02/searchdb/books.json &
```
Split the single 215 GB data file into 22 files of roughly 10 GB each for parallel import.
```
# note: -b splits on raw byte counts and can cut a JSON line in half;
# split -C 10000m would keep whole lines together
split -b 10000m /disk02/searchdb/books.json -d -a 3 split_file
```
Create the table.
```
CREATE TABLE books (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
) CLUSTERED BY (category) INTO 500 SHARDS;
```
Bulk-import the data.
```
/opt/crate-2.1.7/bin/crash --hosts node1 -c "COPY books FROM '/disk01/searchdb/split_file000'"
```
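Instead of issuing one COPY per split file, CrateDB's COPY FROM also accepts a glob, so a single statement picks up every `split_file*` a node finds under its local path; the `bulk_size` value shown is an assumption (it is the default):

```
-- each node imports whichever split files exist locally at this path
COPY books FROM '/disk01/searchdb/split_file*' WITH (bulk_size = 10000);
```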
The main focus of the remaining tests is query performance against the 1.5-billion-row single table.
Test Scenario 5
The test SQL is listed below; it targets single-table performance, with no partitioning.
```
OPTIMIZE table books;

select category, count(*) from books group by category limit 100; -- 37.878 sec

select category, count(*) as num from books group by category order by num limit 100; -- 46.603 sec

select category, count(*) as num from books where category = 'SCIENCE' group by category order by num limit 100; -- 11.808 sec

select count(*) from books where category = 'SCIENCE' limit 100; -- 0.002 sec

select count(distinct category) from books limit 100; -- 44.924 sec

select distinct category from books limit 100; -- 44.335 sec

select id, price, publisher from books where publish_date = '1999-02-02' and category = 'SCIENCE' limit 100; -- 0.347 sec

select price, count(publisher) from books where publish_date = '1999-02-02' and category = 'SCIENCE' group by price order by price desc limit 100; -- 0.981 sec

select price, category from books where publisher = 'Kyowon' group by price, category order by price limit 100; -- 3.602 sec

select price, category, count(*) from books where publisher = 'Kyowon' group by price, category order by price limit 100; -- 1.406 sec
```
- Scenario 1: rows: 496,928,035; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 56.2 GB; shards: 500; average: 6.0K/record; network: gigabit
- Scenario 2: rows: 993,106,194; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 112 GB; shards: 500; average: 6.0K/record; network: gigabit
- Scenario 3: rows: 1,551,303,103; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 174.4 GB; shards: 500; average: 6.0K/record; network: gigabit
Note: the tests above made no serious attempt at tuning for concurrency; apart from heap memory, every parameter was left at its default.
At 1.5B+ rows in a single table, five partitions, 4 servers, and a gigabit network, the performance on display is strong, especially for single-table statistics and analysis combined with full-text retrieval; it is fair to call this a distributed search database. Under the hood it draws on many search-engine storage techniques: inverted indexes, data sharding, heavy use of memory, and fine-grained indexing. CPU is not fully utilized while disk I/O and memory run at capacity, so multi-instance deployments would help if supported. Because every column carries an index, ingestion is relatively slow and a single import machine becomes the bottleneck; importing from several machines avoids piling I/O onto one box. Under excessive load a node can crash and drop out of the cluster. JOIN performance was not explored in depth, so no verdict there.
- Full-text search and tokenization, exposed through dedicated SQL syntax (MATCH); search results can feed complex, unified analysis.
- Geo search: complex algorithmic analysis over geo-location data, with fast responses.
- Partitioned tables, so retrieval and analysis can be scoped to a partition (see the sketch after this list).
- Multiple replicas, giving built-in fault tolerance and spreading query load for better performance.
- Real-time ingestion of massive data, with real-time retrieval and complex statistical analysis.
- BLOB (binary large object) storage and analysis.
- Only JSON/CSV can be imported.
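As a sketch of the partitioning support mentioned above (the table is hypothetical and assumes a pre-computed `publish_year` column):

```
-- rows sharing a publish_year land in the same partition; queries
-- filtering on publish_year only touch that partition's shards
CREATE TABLE books_part (
    id integer,
    isbn string,
    category string,
    publish_date string,
    publish_year string,
    publisher string,
    price float
) CLUSTERED BY (category) INTO 100 SHARDS PARTITIONED BY (publish_year);
```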
A distributed search database that supports standard SQL and JDBC, able to stand in for ES on full-text retrieval while also handling complex statistical analysis, has real practical value.
The largest ES cluster I have been involved with is only 60 nodes with 30B+ docs (3 master nodes; two 6TB SATA disks per node), around 400 TB of data; there is also a small cluster of 20 nodes with 7B+ docs (3 masters; two 4TB SSDs). Performance largely meets requirements, holding roughly the last 3 years of data, with historical data kept in HDFS as backup.