Are You Sure You Want to Use MMAP in Your Database Management System? (A. Crotty, CIDR 2022) #6

@benclmnt

Link to paper

Main Claim

mmap is not a suitable replacement for a traditional buffer pool in which the DBMS issues read and write syscalls itself, even though mmap appears to offer lower complexity and better efficiency.
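To make the contrast concrete, here is a minimal sketch of the two access paths in Python. The 4 KiB `PAGE_SIZE`, the helper names, and the demo file layout are assumptions made for illustration; none of them come from the paper.

```python
import mmap
import os
import tempfile

PAGE_SIZE = 4096  # assumption: 4 KiB database pages


def read_page_with_syscall(path, page_no):
    """Buffer-pool style: the DBMS issues the read itself and owns the copy."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_SIZE)
        return f.read(PAGE_SIZE)


def read_page_with_mmap(path, page_no):
    """mmap style: the file is mapped and the OS pages data in on access."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            start = page_no * PAGE_SIZE
            return mapped[start:start + PAGE_SIZE]


def make_demo_file(num_pages=4):
    """Hypothetical helper: page i is filled with the byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i]) * PAGE_SIZE)
    return path
```

Both functions return the same bytes. The difference is who decides when I/O happens: in the first the DBMS does, in the second the OS does, which is where the problems below come from.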

Main problems

  1. Transparent Paging: the OS may flush a dirty mmap-ed page to disk at any time, regardless of whether the page contains changes from an uncommitted transaction.
    • (3.1) To ensure transactional safety, the DBMS needs to implement either copy-on-write (maintaining an extra copy of the database file as a private workspace) together with a WAL, or shadow paging.
    • (3.3) To ensure data integrity, the DBMS needs to validate a checksum on every page access. Even worse, there is no mechanism to ensure that a page is not corrupted before it is written to secondary storage.
  2. (3.2) Since the page cache may fill up at any time, read-only queries can trigger unexpected page faults, causing I/O spikes.
  3. (3.2) mmap doesn't support asynchronous reads.
  4. (3.4) mmap-based file I/O suffers from page table contention, single-threaded page eviction and TLB shootdowns.
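The copy-on-write workaround from (3.1) can be sketched with a private mapping. This is only an illustration, assuming Python's `mmap.ACCESS_COPY` as the private workspace and a 4 KiB page size; `private_update` is a hypothetical helper, and a real DBMS would still need a WAL and a step that applies committed changes to the primary copy.

```python
import mmap

PAGE_SIZE = 4096  # assumption: 4 KiB database pages


def private_update(path, page_no, payload):
    """Write into a private (copy-on-write) mapping of the file.

    Returns (bytes seen in the private workspace, bytes still on disk).
    """
    start = page_no * PAGE_SIZE
    with open(path, "r+b") as f:
        # ACCESS_COPY: writes change the mapped memory only and are never
        # propagated to the underlying file by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as workspace:
            workspace[start:start + len(payload)] = payload
            in_workspace = bytes(workspace[start:start + len(payload)])
    # Re-read through the file API to see what the file actually holds.
    with open(path, "rb") as f:
        f.seek(start)
        on_disk = f.read(len(payload))
    return in_workspace, on_disk
```

Because the mapping is private, uncommitted changes cannot reach the database file through OS write-back. The trade-off is that the DBMS must now propagate committed changes itself.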
