[Reading LevelDB with Jim] 10. Delete and Snapshot
The destructor lives in db_impl.cc, lines 149-176:
DBImpl::~DBImpl() {
  // Wait for background work to finish
  mutex_.Lock();
  // shutting_down_ is atomic; it marks whether the database is closing
  shutting_down_.Release_Store(this);  // Any non-NULL value is ok
  while (bg_compaction_scheduled_) {
    bg_cv_.Wait();
  }
  mutex_.Unlock();

  if (db_lock_ != NULL) {
    env_->UnlockFile(db_lock_);
  }
  // From here on it is only deletes, freeing memory
  delete versions_;
  if (mem_ != NULL) mem_->Unref();
  if (imm_ != NULL) imm_->Unref();
  delete tmp_batch_;
  delete log_;
  delete logfile_;
  delete table_cache_;

  if (owns_info_log_) {
    delete options_.info_log;
  }
  if (owns_cache_) {
    delete options_.block_cache;
  }
}
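I had expected LevelDB to do some housekeeping while closing, such as tidying up the manifest. In fact it does nothing of the sort: it waits for any scheduled background compaction, releases the file lock, and frees memory. My guess is that this is about keeping the system available for as long as possible. A database that is closing can still serve reads and writes, and every extra second counts. Waiting for a background compaction does not hurt availability, but rewriting the manifest would. Another consideration is that an interrupted compaction would make recovery more expensive on the next start.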
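To make the waiting part concrete, here is a minimal sketch of the same shutdown pattern written with the standard library instead of LevelDB's port::Mutex and port::CondVar. MiniDB, ScheduleBackgroundWork, BackgroundWork and the completed counter are names I made up for illustration, and the sleep stands in for a compaction. Unlike DBImpl, this toy owns its worker thread and therefore has to join it.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical toy class: it only mirrors the shape of DBImpl's shutdown wait.
class MiniDB {
 public:
  explicit MiniDB(std::atomic<int>* completed) : completed_(completed) {
    assert(completed_ != nullptr);
  }

  ~MiniDB() {
    {
      std::unique_lock<std::mutex> lock(mutex_);
      // Flag the shutdown so that no new background job gets scheduled.
      shutting_down_.store(true, std::memory_order_release);
      // Block until the scheduled job (if any) reports that it is done.
      bg_cv_.wait(lock, [this] { return !bg_scheduled_; });
    }
    if (worker_.joinable()) worker_.join();
    // From here on, a real database would only release resources.
  }

  // Starts at most one background job per object (a simplification).
  void ScheduleBackgroundWork() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (bg_scheduled_ || worker_.joinable() ||
        shutting_down_.load(std::memory_order_acquire)) {
      return;
    }
    bg_scheduled_ = true;
    worker_ = std::thread(&MiniDB::BackgroundWork, this);
  }

 private:
  void BackgroundWork() {
    // Simulated compaction.
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    std::lock_guard<std::mutex> lock(mutex_);
    completed_->fetch_add(1);
    bg_scheduled_ = false;
    bg_cv_.notify_all();  // wake a destructor that may be waiting
  }

  std::mutex mutex_;
  std::condition_variable bg_cv_;
  std::atomic<bool> shutting_down_{false};
  bool bg_scheduled_ = false;  // guarded by mutex_
  std::atomic<int>* completed_;
  std::thread worker_;
};
```

If a MiniDB goes out of scope right after ScheduleBackgroundWork() is called, the destructor returns only once the job has bumped the counter, which is the same guarantee the while loop around bg_cv_.Wait() gives DBImpl.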
------
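Earlier posts mentioned that a snapshot is tied to the SequenceNumber. Every write in LevelDB is assigned a unique SequenceNumber (think of it as a task ID) that goes up by 1 each time. The counter is 56 bits wide, so it will last more or less until the Earth explodes. An operation holding a smaller SequenceNumber cannot see data written with a larger SequenceNumber.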
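The visibility rule is easy to see in a toy model. The sketch below is not LevelDB's data layout; ToyVersionedStore and its methods are names invented for this post. It keeps every version of a key indexed by sequence number, and a read at a given snapshot returns the newest version whose sequence number does not exceed the snapshot.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy store, not LevelDB's real data structures.
class ToyVersionedStore {
 public:
  // Every write gets the next sequence number.
  uint64_t Put(const std::string& key, const std::string& value) {
    ++last_sequence_;
    versions_[key][last_sequence_] = value;
    return last_sequence_;
  }

  // A "snapshot" is simply the latest sequence number at the time of the call.
  uint64_t GetSnapshot() const { return last_sequence_; }

  // Fills *value with the newest version whose sequence number is
  // <= snapshot. Returns false if no such version exists.
  bool Get(const std::string& key, uint64_t snapshot,
           std::string* value) const {
    auto it = versions_.find(key);
    if (it == versions_.end()) return false;
    const std::map<uint64_t, std::string>& by_seq = it->second;
    // First version that is too new for this snapshot.
    auto upper = by_seq.upper_bound(snapshot);
    if (upper == by_seq.begin()) return false;  // every version is too new
    --upper;
    *value = upper->second;
    return true;
  }

 private:
  uint64_t last_sequence_ = 0;
  std::map<std::string, std::map<uint64_t, std::string>> versions_;
};
```

Writing "v1", taking a snapshot, and then writing "v2" leaves the snapshot reading "v1" while a read at the latest sequence number sees "v2".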
Wrap a SequenceNumber in a thin class and you have the Snapshot object; see snapshot.h, lines 17-29:
class SnapshotImpl : public Snapshot {
 public:
  SequenceNumber number_;  // const after creation

 private:
  friend class SnapshotList;

  // SnapshotImpl is kept in a doubly-linked circular list
  SnapshotImpl* prev_;
  SnapshotImpl* next_;

  SnapshotList* list_;  // just for sanity checks
};
Snapshots are linked to one another and stored in a SnapshotList (lines 31-63):
class SnapshotList {
 public:
  SnapshotList() {
    list_.prev_ = &list_;
    list_.next_ = &list_;
  }

  bool empty() const { return list_.next_ == &list_; }
  SnapshotImpl* oldest() const { assert(!empty()); return list_.next_; }
  SnapshotImpl* newest() const { assert(!empty()); return list_.prev_; }

  const SnapshotImpl* New(SequenceNumber seq) {
    SnapshotImpl* s = new SnapshotImpl;
    s->number_ = seq;
    s->list_ = this;
    s->next_ = &list_;
    s->prev_ = list_.prev_;
    s->prev_->next_ = s;
    s->next_->prev_ = s;
    return s;
  }

  void Delete(const SnapshotImpl* s) {
    assert(s->list_ == this);
    s->prev_->next_ = s->next_;
    s->next_->prev_ = s->prev_;
    delete s;
  }

 private:
  // Dummy head of doubly-linked list of snapshots
  SnapshotImpl list_;  // a dummy Snapshot, used to eliminate NULL checks
};
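A dummy Snapshot sits at the head so that the list operations do not need an "if NULL" check at every turn. This is probably one of the classic ways to tell whether a programmer is experienced. There is also the widely told story about using a pointer to a pointer to get rid of the if, which I usually cannot follow on the spot. The dummy node is probably the easiest variant to understand.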
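For comparison, here is roughly what the pointer-to-pointer trick looks like on a singly linked list. This is my own minimal sketch of the commonly told technique, not code from LevelDB; Node and Remove are hypothetical names. The point is that the head pointer and every next field are treated the same way, so removing the first node needs no special case.

```cpp
#include <cassert>

// Hypothetical singly linked node, for illustration only.
struct Node {
  int value;
  Node* next;
};

// Unlinks the first node holding `value` and returns it, or returns nullptr
// if there is no such node. `link` always points at the pointer that leads
// to the current node: first the head pointer itself, later some node's
// `next` field. Removing the head and removing an inner node are therefore
// the same assignment.
Node* Remove(Node** head, int value) {
  Node** link = head;
  while (*link != nullptr && (*link)->value != value) {
    link = &(*link)->next;
  }
  Node* found = *link;
  if (found != nullptr) {
    *link = found->next;  // works for the head and for inner nodes alike
    found->next = nullptr;
  }
  return found;
}
```

The remaining if only handles "not found". The dummy head in SnapshotList reaches the same goal differently: because the list is circular and never truly empty, prev_ and next_ are always valid, so New and Delete need no checks at all.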
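The Snapshots are chained together so that LevelDB can tell what the smallest SequenceNumber held by any Snapshot is; call it MIN_SEQ. During compaction, a KV whose SequenceNumber is larger than MIN_SEQ can never be discarded, because a live snapshot, or a read of the latest state, may still need that version.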
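Here is the compaction snippet from db_impl.cc, lines 956-961. Note smallest_snapshot: that is MIN_SEQ (when no snapshot is held, LevelDB falls back to the latest sequence number).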
if (last_sequence_for_key <= compact->smallest_snapshot) {
  // Hidden by an newer entry for same user key
  drop = true;    // (A)
} else if (ikey.type == kTypeDeletion &&
           ikey.sequence <= compact->smallest_snapshot &&
           compact->compaction->IsBaseLevelForKey(ikey.user_key)) {
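To see what these two conditions do to a run of versions of the same key, here is a simplified sketch. Entry and KeepAfterCompaction are hypothetical names, the is_base_level flag stands in for IsBaseLevelForKey() and is assumed to be the same for every key, and I am assuming, as the snippet suggests, that entries arrive sorted by user key and then by descending sequence number, with last_sequence_for_key reset whenever the user key changes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified entry; LevelDB packs these into internal keys.
struct Entry {
  std::string user_key;
  uint64_t sequence;
  bool is_deletion;
};

// Returns the entries that survive. Input order: by user key, and for equal
// keys by descending sequence number.
std::vector<Entry> KeepAfterCompaction(const std::vector<Entry>& input,
                                       uint64_t smallest_snapshot,
                                       bool is_base_level) {
  const uint64_t kNoNewerEntry = UINT64_MAX;  // sentinel chosen for the sketch
  std::vector<Entry> kept;
  std::string current_key;
  bool has_current_key = false;
  uint64_t last_sequence_for_key = kNoNewerEntry;
  for (const Entry& e : input) {
    if (!has_current_key || e.user_key != current_key) {
      // First entry of a new user key: nothing newer has been seen yet.
      current_key = e.user_key;
      has_current_key = true;
      last_sequence_for_key = kNoNewerEntry;
    }
    bool drop = false;
    if (last_sequence_for_key <= smallest_snapshot) {
      // (A) A newer entry is already visible to the oldest snapshot,
      // so no reader can ever reach this one.
      drop = true;
    } else if (e.is_deletion && e.sequence <= smallest_snapshot &&
               is_base_level) {
      // A deletion marker with nothing older left for it to hide.
      drop = true;
    }
    last_sequence_for_key = e.sequence;
    if (!drop) kept.push_back(e);
  }
  return kept;
}
```

With versions 9, 7, 4, 2 of one key and smallest_snapshot = 5, the versions 9 and 7 survive because they are larger than MIN_SEQ, 4 survives because it is what the oldest snapshot reads, and only 2 is dropped.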
So both reads and compaction in LevelDB are constrained by snapshots.