[RFC] Introduce Page Cache in NGO #267

Open
4 tasks
lucassong-mh opened this issue Jun 21, 2022 · 0 comments

  • Feature Name: Introduce Page Cache
  • Start Date: 2022-06-21

Summary

page-cache is a newly designed and implemented crate that will be added to NGO.

Page Cache provides a caching mechanism for block devices. Like the Linux buffer cache, its goal is to minimize disk I/O by keeping data (at page/block granularity) in physical memory so that it does not have to be fetched from disk.

In NGO, Page Cache sits between Async FS and the block device and caches the data read and written between them. It also uses Rust's asynchronous programming to improve performance.

Background

Please refer to:

Design doc: #238

Async Filesystem: #265

High-level Design

page-cache provides two main types to its users (filesystems): PageCache, a struct that implements the LRU strategy, and CachedDisk, a wrapper struct that makes the cache convenient to use.

page-cache

API

Public types:

PageCache<K: PageKey, A: PageAlloc>: Manages the cached pages for a domain of keys (e.g., block IDs). It is built mainly on LruCache.

PageState: Indicates the state of a cached page.

PageHandle<K: PageKey, A: PageAlloc>: The handle to a cached page acquired from the page cache. Any further operation on the page (such as changing its state or reading/writing its content) must first call lock() to get the corresponding PageHandleGuard.

FixedSizePageAlloc: A page allocator with a fixed total size.

CachedDisk<A: PageAlloc>: A virtual disk made of a backing disk and a page cache. It lets filesystems access the page cache just as they would access the disk. It defines CachedDiskFlusher, which implements PageCacheFlusher, for the inner page cache.
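To make the relationship between PageCache, PageState and PageHandle more concrete, here is a minimal synchronous sketch. It is not the crate's actual API: the PageState variants, the acquire() method, the fixed usize key and the use of std's Mutex in place of an async lock are all assumptions made for illustration. The generic parameters K and A are left out to keep it short.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex, MutexGuard};

const BLOCK_SIZE: usize = 4096;

// Hypothetical states; the real PageState may define different variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PageState {
    Uninit,
    UpToDate,
    Dirty,
}

// State and content of one page, only reachable through PageHandle::lock().
struct PageInner {
    state: PageState,
    data: Vec<u8>,
}

#[derive(Clone)]
struct PageHandle {
    key: usize,
    inner: Arc<Mutex<PageInner>>,
}

impl PageHandle {
    fn key(&self) -> usize {
        self.key
    }

    // Every state change or read/write of the content goes through the guard.
    fn lock(&self) -> MutexGuard<'_, PageInner> {
        self.inner.lock().unwrap()
    }
}

struct PageCache {
    capacity: usize,
    pages: HashMap<usize, PageHandle>,
    lru: VecDeque<usize>, // front = least recently used
}

impl PageCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        PageCache { capacity, pages: HashMap::new(), lru: VecDeque::new() }
    }

    // Returns the handle for `key`, plus the handle evicted to make room (if any).
    // A real cache would flush an evicted page that is still Dirty.
    fn acquire(&mut self, key: usize) -> (PageHandle, Option<PageHandle>) {
        if let Some(handle) = self.pages.get(&key).cloned() {
            self.touch(key);
            return (handle, None);
        }
        let evicted = if self.pages.len() >= self.capacity {
            match self.lru.pop_front() {
                Some(old) => self.pages.remove(&old),
                None => None,
            }
        } else {
            None
        };
        let inner = PageInner { state: PageState::Uninit, data: vec![0; BLOCK_SIZE] };
        let handle = PageHandle { key, inner: Arc::new(Mutex::new(inner)) };
        self.pages.insert(key, handle.clone());
        self.lru.push_back(key);
        (handle, evicted)
    }

    // Moves `key` to the most-recently-used end of the LRU order.
    fn touch(&mut self, key: usize) {
        if let Some(pos) = self.lru.iter().position(|k| *k == key) {
            self.lru.remove(pos);
        }
        self.lru.push_back(key);
    }
}
```

A caller would acquire a handle, call lock(), fill the page from disk when its state is Uninit, and mark it Dirty after a write. Because handles are reference-counted, an evicted page stays valid for anyone still holding it.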

Private types:

Page<A: PageAlloc>: A block of memory (same size as BLOCK_SIZE) obtained from an allocator which implements PageAlloc.

PageEvictor<K: PageKey, A: PageAlloc>: Spawns a task that flushes and evicts pages from all instances of PageCache<K, A> when A is low on memory.

Traits:

PageAlloc: A trait for a page allocator that can monitor the amount of free memory.

PageCacheFlusher: Allows the owner of a page cache (CachedDisk or a filesystem) to supply its own I/O logic for flushing the dirty pages of the page cache.

PageKey: A trait that defines the domain of keys for a page cache.
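As a rough illustration of what "a page allocator that can monitor the amount of free memory" could look like, the sketch below defines simplified PageKey and PageAlloc traits and a fixed-size allocator. The method names (alloc_page, dealloc_page, free_pages, is_memory_low), the FixedAlloc type and its low-watermark parameter are hypothetical and are not taken from the crate.

```rust
use std::fmt::Debug;
use std::hash::Hash;
use std::sync::atomic::{AtomicUsize, Ordering};

const BLOCK_SIZE: usize = 4096;

// Hypothetical bound set: a key only needs to be cheap to copy and hashable.
trait PageKey: Copy + Eq + Hash + Debug + Send + Sync + 'static {}
impl PageKey for usize {}

// Hypothetical allocator interface that exposes its free-memory level.
trait PageAlloc {
    fn alloc_page(&self) -> Option<Vec<u8>>;
    fn dealloc_page(&self, page: Vec<u8>);
    fn free_pages(&self) -> usize;
    fn is_memory_low(&self) -> bool;
}

// A fixed budget of pages, tracked with an atomic counter.
struct FixedAlloc {
    free: AtomicUsize,
    low_watermark: usize,
}

impl FixedAlloc {
    fn new(total_pages: usize, low_watermark: usize) -> Self {
        FixedAlloc { free: AtomicUsize::new(total_pages), low_watermark }
    }
}

impl PageAlloc for FixedAlloc {
    fn alloc_page(&self) -> Option<Vec<u8>> {
        // Decrement the counter only if at least one page is left.
        self.free
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .ok()?;
        Some(vec![0u8; BLOCK_SIZE])
    }

    fn dealloc_page(&self, page: Vec<u8>) {
        drop(page);
        self.free.fetch_add(1, Ordering::AcqRel);
    }

    fn free_pages(&self) -> usize {
        self.free.load(Ordering::Acquire)
    }

    // An evictor task could poll this to decide when to flush and evict.
    fn is_memory_low(&self) -> bool {
        self.free_pages() <= self.low_watermark
    }
}
```

Keeping the low-memory check on the allocator rather than on each cache matches the description of PageEvictor above: one evictor can serve every PageCache<K, A> that shares the same A.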

Detailed explanation

See the cargo doc output of page-cache.

Performance improvement

fio-pagecache

AFS+PageCache outperforms SEFS by 110.9% on average.
The seq-write result stands out thanks to the batch write-back optimization in CachedDisk's flush().
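One plausible way to batch write-back, shown here only as a sketch, is to sort the dirty block IDs and merge consecutive ones into runs, so that each run becomes a single write request. Whether CachedDisk's flush() works exactly this way is an assumption; MockDisk, coalesce_runs and flush_batched are hypothetical names.

```rust
use std::collections::BTreeMap;

const BLOCK_SIZE: usize = 4096;

// Groups block IDs into (start, length) runs of consecutive blocks.
fn coalesce_runs(mut dirty: Vec<usize>) -> Vec<(usize, usize)> {
    dirty.sort_unstable();
    dirty.dedup();
    let mut runs: Vec<(usize, usize)> = Vec::new();
    for id in dirty {
        if let Some(last) = runs.last_mut() {
            if last.0 + last.1 == id {
                // `id` directly follows the current run: extend it.
                last.1 += 1;
                continue;
            }
        }
        runs.push((id, 1));
    }
    runs
}

// Stand-in for a block device that counts how many write requests it receives.
struct MockDisk {
    blocks: BTreeMap<usize, Vec<u8>>,
    write_requests: usize,
}

impl MockDisk {
    fn write_blocks(&mut self, start: usize, data: &[u8]) {
        assert_eq!(data.len() % BLOCK_SIZE, 0);
        for (i, chunk) in data.chunks(BLOCK_SIZE).enumerate() {
            self.blocks.insert(start + i, chunk.to_vec());
        }
        self.write_requests += 1;
    }
}

// Writes all dirty pages back, one request per run; returns the request count.
fn flush_batched(dirty: &BTreeMap<usize, Vec<u8>>, disk: &mut MockDisk) -> usize {
    let runs = coalesce_runs(dirty.keys().copied().collect());
    for &(start, len) in &runs {
        let mut buf = Vec::with_capacity(len * BLOCK_SIZE);
        for id in start..start + len {
            buf.extend_from_slice(&dirty[&id]);
        }
        disk.write_blocks(start, &buf);
    }
    runs.len()
}
```

With sequential writes nearly all dirty blocks are adjacent, so the number of requests collapses from one per block to one per run, which would be consistent with seq-write benefiting the most.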

Future work

  • Batch-read blocks from the block device in CachedDisk's read().
  • Two-list strategy (aka LRU/2): addresses the only-used-once failure, where pages touched a single time push hot pages out of a plain LRU cache. It keeps two lists, the active list and the inactive list. Pages on the active list are considered "hot" and are not available for eviction; pages on the inactive list are.
  • Implement BlockDevice for CachedDisk.
  • Try to integrate memory-mapped files.
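
The two-list strategy from the list above can be sketched as follows. This is a simplified model that tracks keys only and is not a proposed implementation; TwoListLru, max_active and the promote-on-second-access rule are assumptions chosen to keep the example short.

```rust
use std::collections::VecDeque;

struct TwoListLru {
    active: VecDeque<usize>,   // hot pages, front = oldest
    inactive: VecDeque<usize>, // eviction candidates, front = oldest
    max_active: usize,
}

impl TwoListLru {
    fn new(max_active: usize) -> Self {
        TwoListLru { active: VecDeque::new(), inactive: VecDeque::new(), max_active }
    }

    // Records an access to `key`.
    fn access(&mut self, key: usize) {
        if let Some(pos) = self.active.iter().position(|k| *k == key) {
            // Already hot: refresh its position on the active list.
            self.active.remove(pos);
            self.active.push_back(key);
        } else if let Some(pos) = self.inactive.iter().position(|k| *k == key) {
            // Second access: promote to the active list.
            self.inactive.remove(pos);
            self.active.push_back(key);
            // Keep the active list bounded by demoting its oldest page.
            if self.active.len() > self.max_active {
                if let Some(demoted) = self.active.pop_front() {
                    self.inactive.push_back(demoted);
                }
            }
        } else {
            // First access: the page starts out as an eviction candidate.
            self.inactive.push_back(key);
        }
    }

    // Only pages on the inactive list can be evicted.
    fn evict(&mut self) -> Option<usize> {
        self.inactive.pop_front()
    }

    fn is_active(&self, key: usize) -> bool {
        self.active.contains(&key)
    }
}
```

In this model a one-off sequential scan only ever populates the inactive list, so a page that was accessed twice before the scan survives it.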