`data_cf` distributed compaction needs `DB::Get` from the default cf in its `CompactionFilter`, so we bulk load all hash keys from the default cf and save them to a shared filesystem for reading in dcompact workers.
When the compaction input of `data_cf` is small but the hash keys in the default cf are large, this is a very big waste --- this is likely to happen when compacting the upper levels of `data_cf`.
So we should check the data size in the default cf before starting a remote compaction; if it reaches a threshold percentage of the compaction input of `data_cf`, we should fall back to local compaction.
- Use `Compaction::column_family_data()` to get the default cf handle and the DB ptr; this needs a global `std::map`
- Use `DB::GetApproximateSizes()` to get the size in the default cf.
Thus a customized `CompactionExecutorFactory` should be defined: it should reference the DcompactEtcd factory and forward its methods; the key point is to override `ShouldRunLocal(compaction)`.