Exploring the Performance of a Distributed Search Database
A distributed search database serves many enterprise scenarios that demand high-speed retrieval: second-level search and full-text retrieval over massive single tables, full SQL support, insert/update/delete/query operations, and compatibility with the MySQL/PostgreSQL protocols, making it an enterprise-grade answer to the problem of searching massive data sets.
Environment Preparation
The following configuration files must be modified first, otherwise the cluster will not start properly.
```
vi /etc/sysctl.conf
vm.max_map_count=262144

sysctl -p

vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536

vi /etc/profile
export CRATE_HEAP_SIZE=100g
```
Create the data storage directories and a `searchdb` user; the CrateDB cluster is installed under the `searchdb` user.
```
wget https://cdn.crate.io/downloads/releases/crate-2.1.5.tar.gz -P /opt/

tar -zxvf crate-2.1.5.tar.gz

mkdir /disk0{1..4}/searchdb

useradd searchdb
chown searchdb:searchdb /disk0{1..4}/searchdb
chown searchdb:searchdb -R /opt/crate-2.1.5

rm -rf /disk0{1..4}/searchdb/*
```
Note: a JDK must be installed and JAVA_HOME configured in advance. Pay particular attention to the JDK version: jdk1.8.0_45 or later is required.
Cluster Installation
For the three-node cluster, edit the `crate.yml` file on each node. Points to note: `network.host` is set to each node's own hostname, `node.name` is filled in with the hostname, and the cluster name (`my_cluster` here) must be unique.
```
cluster.name: my_cluster

node.name: node1

discovery.zen.ping.unicast.hosts:
  - node1:4300
  - node2:4300
  - node3:4300

discovery.zen.minimum_master_nodes: 2

gateway:
  recover_after_nodes: 2
  recover_after_time: 5m
  expected_nodes: 2

path.data: /disk01/searchdb,/disk02/searchdb,/disk03/searchdb,/disk04/searchdb

network.host: node1

psql.enabled: true
psql.port: 5432
```
- Start the cluster
```
bin/crate -d
```
- Access the cluster
```
bin/crash

cr> \c node1:4200
+-------------------+-----------+---------+-----------+---------+
| server_url        | node_name | version | connected | message |
+-------------------+-----------+---------+-----------+---------+
| http://node1:4200 | node1     | 2.1.5   | TRUE      | OK      |
+-------------------+-----------+---------+-----------+---------+
```
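Once connected, it is worth confirming that all three nodes actually joined. A minimal check against the built-in `sys` tables (not part of the original write-up):

```
-- every node that has joined the cluster
select id, name, hostname from sys.nodes;

-- cluster-level view; master_node should point at a stable master
select name, master_node from sys.cluster;
```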
- Quick test
```
create table tweets (
  created_at timestamp,
  id string primary key,
  retweeted boolean,
  source string INDEX using fulltext,
  text string INDEX using fulltext,
  user_id string
);

insert into tweets values (1394182937, '1', true, 'web', 'Dont panic', 'Douglas');

insert into tweets
  values (
    1394182938,
    '2',
    true,
    'web',
    'Time is an illusion. Lunchtime doubly so',
    'Ford'
  );

insert into tweets values (1394182937, '3', true, 'address', '中國,北京', '北京');

select * from tweets where id = '2';

select user_id, _score from tweets where match(text, 'is') order by _score desc;

select user_id, _score from tweets where match(text, '北京') order by _score desc;

select user_id, _score from tweets where match(text, '京城') order by _score desc;

DELETE FROM tweets where id = '3';
```
This quick test exercises some basic SQL, covering full-text search and tokenization via MATCH; UPDATE and DELETE of data are also supported.
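The block above shows DELETE but not UPDATE; for completeness, a minimal UPDATE against the same table might look like this (values are hypothetical):

```
-- flip the retweeted flag on a single row
update tweets set retweeted = false where id = '1';
```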
Performance Testing
Generate test data for the `books` table: roughly 100 GB in total, averaging 6.0K per record, then convert the text data to JSON with a Python script.
```
nohup java -cp dbgen-1.0-jar.jar DBGen -p ./data -b 100 &
# Total Time: 2610 seconds

cat books | python csv2json.py --columns id:integer isbn:string category:string publish_date:string publisher:string price:float > books.json
```
Sample data:
```
$ head data/books/books
0|6-20386-216-4|STUDY-AIDS|1998-05-31|Gakken|166.99
1|0-60558-466-8|JUVENILE-NONFICTION|1975-02-12|Holtzbrinck|128.99
2|3-16551-636-9|POETRY|1988-01-24|Oxford University Press|155.99
3|4-75505-741-2|COMICS-GRAPHIC-NOVELS|1992-02-24|Saraiva|101.99
4|3-32982-589-8|PERFORMING-ARTS|2011-03-09|Cambridge University Press|183.99
```
Basic commands:
```
./crash --help
./crash --hosts node1 --sysinfo

./crash --hosts node1 -c "show tables"
```
Create the table schema and import the data.
```
CREATE TABLE books (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
);

COPY books FROM '/disk01/data/books/books.json';
```
The commands above can generate test data at different orders of magnitude; the parameters control the resulting table size.
Test Scenario 1
- Table scale: tens of millions of rows
- Ingest rate: 15m/s (per node)
- JSON size: 6.6 GB
- Size in CrateDB: 5.2 GB
- Row count: 47,582,950
- Shards: 6
- Replicas: 2
- Memory: 16 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Network: gigabit
The tests below focus on queries against a single table.
```
select category, count(*) from books group by category limit 100; -- 3.137 sec

select category, count(*) as num from books group by category order by num limit 100; -- 2.929 sec

select category, count(*) as num from books where category = 'SCIENCE' group by category order by num limit 100; -- 0.143 sec

select count(*) from books where category = 'SCIENCE' limit 100; -- 0.022 sec

select count(distinct category) from books limit 100; -- 2.990 sec

select distinct category from books limit 100; -- 3.032 sec
```
Change `number_of_shards` to see whether it improves performance:
```
ALTER TABLE books SET (number_of_shards = 48);

OPTIMIZE table books; -- quite useful: optimizing the table noticeably improves query performance

SELECT count(*) as num_shards, sum(num_docs) as num_docs FROM sys.shards WHERE schema_name = 'doc' AND table_name = 'books';
```
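For reference, OPTIMIZE also takes options; one variant (not run in the original tests, shown here as a sketch) forces each shard down to a single segment, paying a heavier one-off merge cost so that later reads touch fewer segments:

```
-- merge each shard down to one segment; expensive once, cheaper scans afterwards
OPTIMIZE TABLE books WITH (max_num_segments = 1);
```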
Test Scenario 2
- Table scale: hundreds of millions of rows
- Import time: 17,285.888 sec
- JSON size: 33 GB
- Size in CrateDB: 27 GB
- Row count: 235,265,838
- Nodes: 3
- Shards: 1024
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Ingest rate: 13,610 rows/s
- Network: gigabit
Create the table and import the data.
```
CREATE TABLE books_t1 (
  id integer,
  isbn string,
  category string INDEX using fulltext,
  publish_date string,
  publisher string INDEX using fulltext,
  price float
) CLUSTERED BY (category) INTO 1024 SHARDS with (number_of_replicas = 2, refresh_interval = 10000);

COPY books_t1 FROM '/disk01/data/books/books.json'; -- COPY OK, 235265838 rows affected (17285.888 sec)
```
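A bulk-load pattern worth trying here (my suggestion, not something the original test did) is to drop replicas and disable the periodic refresh for the duration of the COPY, then restore both, since replicating every incoming batch is a large share of import cost:

```
-- before the bulk COPY: no replicas, refresh disabled
ALTER TABLE books_t1 SET (number_of_replicas = 0, refresh_interval = 0);

-- ... run the COPY here ...

-- afterwards: restore replicas and a periodic refresh (milliseconds)
ALTER TABLE books_t1 SET (number_of_replicas = 2, refresh_interval = 10000);
```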
Run the performance tests.
```
OPTIMIZE table books_t1;

select category, count(*) from books_t1 group by category limit 100; -- 2.556 sec

select category, count(*) as num from books_t1 group by category order by num limit 100; -- 2.763 sec
-- Problem: Error! SQLActionException[SQLParseException: Cannot GROUP BY 'category': grouping on analyzed/fulltext columns is not possible]

select count(*) from books_t1 where match(category, 'PERFORMING-ARTS'); -- limit 100; -- 0.256 sec

select * from books_t1 where match(category, 'ARTS'); -- limit 100; -- 0.928 sec
```
Note: columns with fulltext indexes cannot be used in aggregation/GROUP BY operations at all; they can only be queried with full-text MATCH. The data was re-imported without fulltext indexes and tested again (Scenario 3).
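One way around this trade-off, if both capabilities are needed, is CrateDB's named fulltext index defined on top of a plain column: the raw column stays groupable while MATCH targets the named index. A sketch (the table name `books_t1b` is hypothetical):

```
CREATE TABLE books_t1b (
    id integer,
    isbn string,
    category string,                               -- plain column: GROUP BY still works
    INDEX category_ft using fulltext (category),   -- named index: MATCH works against this
    publish_date string,
    publisher string,
    price float
);

-- full-text search through the named index
select count(*) from books_t1b where match(category_ft, 'PERFORMING-ARTS');

-- aggregation through the raw column
select category, count(*) from books_t1b group by category limit 100;
```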
Test Scenario 3
- Table scale: hundreds of millions of rows
- Import time: 5,662.132 sec
- JSON size: 33 GB
- Size in CrateDB: 25.3 GB
- Row count: 235,265,838
- Nodes: 3
- Shards: 1024
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB)
- Ingest rate: 13,610 rows/s
- Network: gigabit
Create the table and insert the data.
```
CREATE TABLE books_t2 (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
) CLUSTERED BY (category) INTO 1024 SHARDS;

COPY books_t2 FROM '/disk01/data/books/books.json';

insert into books_t2 select * from books_t1; -- INSERT OK, 235265838 rows affected (5662.132 sec)
```
Performance tests.
```
OPTIMIZE table books_t2;

select category, count(*) from books_t2 group by category limit 100; -- 3.994 sec

select category, count(*) as num from books_t2 group by category order by num limit 100; -- 4.159 sec

select category, count(*) as num from books_t2 where category = 'SCIENCE' group by category order by num limit 100; -- 1.731 sec

select count(*) from books_t2 where category = 'SCIENCE' limit 100; -- 0.001 sec

select count(distinct category) from books_t2 limit 100; -- 4.677 sec

select distinct category from books_t2 limit 100; -- 3.914 sec

select id, price, publisher from books_t2 where publish_date = '1999-02-02' and category = 'SCIENCE' limit 100; -- 0.014 sec
```
Note: with this many shards, heap usage stays persistently high, reaching 57%; the table was created with 1024 shards and global indexes on two columns.
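Heap pressure of this kind can be watched from SQL through `sys.nodes`; object columns are addressed with bracket syntax (this query is added here for illustration):

```
-- per-node heap usage, converted from bytes to GB
select name,
       heap['used'] / 1024.0 / 1024 / 1024 as heap_used_gb,
       heap['max']  / 1024.0 / 1024 / 1024 as heap_max_gb
from sys.nodes;
```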
Test Scenario 4
- Table scale: 1.5 billion rows
- Ingest rate: 15m/s (per node)
- JSON size: 215 GB
- Size in CrateDB: 175.6 GB
- Row count: 1,551,303,191
- Shards: 500
- Replicas: 2
- Memory: 100 GB
- vCPUs: 24
- Storage: 1.4 TB (4 × 280 GB), 4 nodes
- Network: gigabit
Convert the generated text data to JSON.
```
nohup cat /disk01/searchdb/data/books/books | python csv2json.py --columns id:integer isbn:string category:string publish_date:string publisher:string price:float > /disk02/searchdb/books.json &
```
Split the single 215 GB data file into 22 files of roughly 10 GB each for parallel import.
```
# note: -b splits on raw byte counts and can cut a JSON line in half;
# split -C 10000m would keep whole lines together
split -b 10000m /disk02/searchdb/books.json -d -a 3 split_file
```
Create the table.
```
CREATE TABLE books (
  id integer,
  isbn string,
  category string,
  publish_date string,
  publisher string,
  price float
) CLUSTERED BY (category) INTO 500 SHARDS;
```
Bulk-import the data.
```
/opt/crate-2.1.7/bin/crash --hosts node1 -c "COPY books FROM '/disk01/searchdb/split_file000'"
```
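Instead of issuing one COPY per split file, CrateDB's COPY FROM also accepts a glob, so a single statement picks up every `split_file*` a node finds under its local path; the `bulk_size` value shown is an assumption (it is the default):

```
-- each node imports whichever split files exist locally at this path
COPY books FROM '/disk01/searchdb/split_file*' WITH (bulk_size = 10000);
```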
The main focus of the remaining tests is query performance against the 1.5-billion-row single table.
Test Scenario 5
The test SQL is listed below; it targets single-table performance, with no partitioning.
```
OPTIMIZE table books;

select category, count(*) from books group by category limit 100; -- 37.878 sec

select category, count(*) as num from books group by category order by num limit 100; -- 46.603 sec

select category, count(*) as num from books where category = 'SCIENCE' group by category order by num limit 100; -- 11.808 sec

select count(*) from books where category = 'SCIENCE' limit 100; -- 0.002 sec

select count(distinct category) from books limit 100; -- 44.924 sec

select distinct category from books limit 100; -- 44.335 sec

select id, price, publisher from books where publish_date = '1999-02-02' and category = 'SCIENCE' limit 100; -- 0.347 sec

select price, count(publisher) from books where publish_date = '1999-02-02' and category = 'SCIENCE' group by price order by price desc limit 100; -- 0.981 sec

select price, category from books where publisher = 'Kyowon' group by price, category order by price limit 100; -- 3.602 sec

select price, category, count(*) from books where publisher = 'Kyowon' group by price, category order by price limit 100; -- 1.406 sec
```
- Scenario 1: rows: 496,928,035; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 56.2 GB; shards: 500; average: 6.0K/record; network: gigabit
- Scenario 2: rows: 993,106,194; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 112 GB; shards: 500; average: 6.0K/record; network: gigabit
- Scenario 3: rows: 1,551,303,103; 4 nodes (mem: 128 GB, vcores: 24, storage: 4 × 250 GB); size: 174.4 GB; shards: 500; average: 6.0K/record; network: gigabit
Note: the tests above made no serious attempt at tuning for concurrency; apart from heap memory, every parameter was left at its default.
At 1.5B+ rows in a single table, five partitions, 4 servers, and a gigabit network, the performance on display is strong, especially for single-table statistics and analysis combined with full-text retrieval; it is fair to call this a distributed search database. Under the hood it draws on many search-engine storage techniques: inverted indexes, data sharding, heavy use of memory, and fine-grained indexing. CPU is not fully utilized while disk I/O and memory run at capacity, so multi-instance deployments would help if supported. Because every column carries an index, ingestion is relatively slow and a single import machine becomes the bottleneck; importing from several machines avoids piling I/O onto one box. Under excessive load a node can crash and drop out of the cluster. JOIN performance was not explored in depth, so no verdict there.
- Full-text search and tokenization, exposed through dedicated SQL syntax (MATCH); search results can feed complex, unified analysis.
- Geo search: complex algorithmic analysis over geo-location data, with fast responses.
- Partitioned tables, so retrieval and analysis can be scoped to a partition (see the sketch after this list).
- Multiple replicas, giving built-in fault tolerance and spreading query load for better performance.
- Real-time ingestion of massive data, with real-time retrieval and complex statistical analysis.
- BLOB (binary large object) storage and analysis.
- Only JSON/CSV can be imported.
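As a sketch of the partitioning support mentioned above (the table is hypothetical and assumes a pre-computed `publish_year` column):

```
-- rows sharing a publish_year land in the same partition; queries
-- filtering on publish_year only touch that partition's shards
CREATE TABLE books_part (
    id integer,
    isbn string,
    category string,
    publish_date string,
    publish_year string,
    publisher string,
    price float
) CLUSTERED BY (category) INTO 100 SHARDS PARTITIONED BY (publish_year);
```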
A distributed search database that supports standard SQL and JDBC, able to stand in for ES on full-text retrieval while also handling complex statistical analysis, has real practical value.
The largest ES cluster I have been involved with is only 60 nodes with 30B+ docs (3 master nodes; two 6TB SATA disks per node), around 400 TB of data; there is also a small cluster of 20 nodes with 7B+ docs (3 masters; two 4TB SSDs). Performance largely meets requirements, holding roughly the last 3 years of data, with historical data kept in HDFS as backup.