TiDB Brings Distributed Scalability to SQL

26 Apr 2017 1:00am, by Susan Hall

A rash of new databases has emerged, such as Google Spanner, FaunaDB, Cockroach and TimeScaleDB, that focus on solving the scaling problems that plague standard SQL. Now another entrant, the open source TiDB project from Beijing, China-based PingCAP, aims to make SQL as scalable as NoSQL systems while maintaining ACID transactions.

Its support for the MySQL protocol means users can reuse many MySQL tools and greatly reduce migration costs, according to PingCAP co-founder and CEO Qi (Max) Liu. You can use it to replace MySQL for applications without changing any code in most cases. And it scales horizontally; increase the capacity simply by adding more machines.

Liu presented TiDB at the Percona Live conference in Amsterdam last October. The project was in beta then; it has since reached release candidate status. On Thursday, PingCAP co-founder and Chief Technology Officer Edward Huang will be speaking about TiDB at the Percona Live event in Santa Clara, Calif.

The founders say TiDB offers the best of both the SQL and NoSQL worlds. They focused on making it:

  • easy to use;
  • reliable, so that no data is ever lost and the system heals itself after failures;
  • cross-platform, able to run in any environment;
  • and open source.

It also allows online schema changes, so the schema can evolve with your requirements. You can add new columns and indices without stopping or affecting operations in progress.

As an open source project, it has more than 100 contributors, Liu said in an email interview.

PingCAP drew inspiration for TiDB from Google's F1 distributed database and Spanner. Google built Spanner atop its own proprietary systems and it is not open source, which some consider a downside.

"With Spanner, you're making a commitment to running the service in Google Compute Engine (GCE) and probably running it there for the service's lifetime. You're not going to have an off-ramp if you choose to run your own stack," Spencer Kimball, CEO of Cockroach Labs, told The New Stack previously.

Keeping Track of All the Bikes

TiDB takes a loose coupling approach. It consists of a MySQL Server layer and the SQL layer. Its foundation is the open source distributed transactional key-value database TiKV, another PingCAP project, which uses the programming language Rust and the distributed protocol Raft. TiDB is written in Go. Inside TiKV are MVCC (multi-version concurrency control) and Raft; for local key-value storage, it uses RocksDB. It also uses the Spark Connector.
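
To make the MVCC idea concrete, here is a minimal Python sketch of a multi-version key-value store. It is an illustration only and does not reflect how TiKV is implemented: the class name `MVCCStore`, its methods, and the plain integer timestamps are assumptions made up for this example.

```python
import bisect
from collections import defaultdict


class MVCCStore:
    """Toy multi-version key-value store (hypothetical, not TiKV's API)."""

    def __init__(self):
        # For each key: parallel lists of commit timestamps (sorted) and values.
        self._timestamps = defaultdict(list)
        self._values = defaultdict(list)

    def put(self, key, value, commit_ts):
        """Add a new version of key; older versions stay readable."""
        idx = bisect.bisect_right(self._timestamps[key], commit_ts)
        self._timestamps[key].insert(idx, commit_ts)
        self._values[key].insert(idx, value)

    def get(self, key, read_ts):
        """Return the newest version committed at or before read_ts."""
        idx = bisect.bisect_right(self._timestamps.get(key, []), read_ts)
        if idx == 0:
            return None  # no version existed yet at read_ts
        return self._values[key][idx - 1]
```

A reader holding an older timestamp keeps seeing the older value even after a newer one is written, which is what lets reads proceed without blocking writes in this kind of design.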

TiDB differs from Spanner in two ways, Liu said:

While the bottom layer of Spanner relies on Google's Colossus distributed file system, TiDB ensures that the log is safely stored in the Raft layer. TiDB does not depend on any distributed file system, which greatly lowers write latency.

"We also see great potential in the SQL optimizer, but Google didn't seem to go deep into this aspect in its F1 paper. When designing our project, we aimed to explore the optimizer's capability," he said.

Spanner gained attention for its use of atomic clocks to synchronize time among geographically distributed data centers. TiDB does not use atomic and GPS clocks. Instead, it relies on the Timestamp Allocator introduced in Percolator, a paper published by Google in 2010.
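
The idea of a central timestamp allocator can be sketched in a few lines of Python. This is a simplified illustration, not PingCAP's or Google's implementation: the class `TimestampAllocator` and its method `next_timestamp` are hypothetical names, and a real allocator would also need persistence and failover, which are left out here.

```python
import threading


class TimestampAllocator:
    """Toy central allocator handing out strictly increasing timestamps."""

    def __init__(self, start=0):
        self._last = start
        self._lock = threading.Lock()

    def next_timestamp(self):
        """Return a timestamp larger than any returned before."""
        with self._lock:
            self._last += 1
            return self._last
```

Because every transaction asks the same allocator for its timestamps, their order is decided in one place rather than by comparing clocks on different machines. The trade-off in such a design is an extra network round trip to the allocator.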

It supports popular container platforms such as Docker. The team is also working to make it run on Kubernetes, though Liu pointed out difficulties with that work to the Amsterdam audience.

The biggest problem they're working on now is latency, especially between geographically distributed data centers, he said. It is one he hopes to have resolved in the near future.

PingCAP was founded in April 2015 by Huang, a senior distributed system engineer; Cui Qiu, a senior system engineer; and Liu, also an infrastructure engineer. It has 48 engineers working in Beijing and others working remotely from elsewhere in China.

Its clients include mobile gaming provider GAEA, which uses TiDB to support its cross-platform real-time advertising system. That system requires high-volume data capacity and experiences peak loads during certain periods. TiDB supports automatic sharding, and the bottom layer, TiKV, automatically distributes data among the cluster, which helps GAEA cut the cost of operation and maintenance, Liu said.
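
One common way to shard automatically is to keep keys in sorted ranges and split a range once it grows too large. The Python sketch below illustrates that general technique under stated assumptions; the article does not describe TiKV's actual mechanism, and `RangeShardedStore`, `max_keys`, and the split rule are all hypothetical.

```python
import bisect


class RangeShardedStore:
    """Toy range-sharded map that splits a shard once it exceeds max_keys."""

    def __init__(self, max_keys=4):
        self.max_keys = max_keys
        # split_points[i] is the smallest key that belongs to shards[i + 1].
        self.split_points = []
        self.shards = [{}]

    def _shard_index(self, key):
        """Find the shard whose key range contains key."""
        return bisect.bisect_right(self.split_points, key)

    def put(self, key, value):
        idx = self._shard_index(key)
        shard = self.shards[idx]
        shard[key] = value
        if len(shard) > self.max_keys:
            self._split(idx)

    def get(self, key):
        return self.shards[self._shard_index(key)].get(key)

    def _split(self, idx):
        """Cut shard idx in two at its median key."""
        keys = sorted(self.shards[idx])
        mid = keys[len(keys) // 2]
        left = {k: v for k, v in self.shards[idx].items() if k < mid}
        right = {k: v for k, v in self.shards[idx].items() if k >= mid}
        self.shards[idx:idx + 1] = [left, right]
        self.split_points.insert(idx, mid)
```

In a real cluster each shard would live on a different machine and could be moved to balance load; here the shards are plain dictionaries so the splitting logic stays visible.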

Another customer is the cashless and station-free bike sharing platform Mobike, which uses TiDB for data analysis and to replace a MySQL database for online orders, which now number more than 400 million.

thenewstack.io/tidb-bri

