net/disk_cache/sql/README.md - codesearch/chromium/src - Git at Google

 # SQL disk cache backend

 This directory contains an experimental SQL-based implementation of the disk
 cache (`disk_cache::Backend`). It uses SQLite to store cache entries and is
 designed to be more robust and performant than the default cache backends
 (block file backend on Windows and simple cache backend on other OSes),
 especially in scenarios with a large number of small entries.

 The implementation is sharded to improve concurrency, with each shard managing
 its own SQLite database file.

 ## Key Classes and Components

 ### Core Backend Logic

 *   **`SqlBackendImpl`**: This is the main entry point and public-facing class
     for the SQL-based disk cache. It implements the `disk_cache::Backend`
     interface, handling requests from the network stack to open, create, doom,
     and enumerate cache entries. It owns and coordinates the
     `SqlPersistentStore`.

 *   **`SqlEntryImpl`**: Implements the `disk_cache::Entry` interface,
     representing a single entry in the cache. It manages the data streams
     (header and body) and metadata (`last_used` time, key, etc.) for an entry.
     All I/O operations are passed to the `SqlBackendImpl`, which then delegates
     them to the persistence layer.

 ### Persistence Layer

 *   **`SqlPersistentStore`**: This class serves as the primary interface to the
     persistence layer, abstracting away the details of database sharding and
     asynchronous operations. It owns multiple `BackendShard` instances and
     distributes operations among them based on the cache entry's key.

 *   **`SqlPersistentStore::BackendShard`**: Manages a single shard of the cache.
     Each shard has its own SQLite database file, a dedicated background task
     runner for database operations, and an in-memory index of its entries. It
     owns a `SequenceBound<SqlPersistentStore::Backend>`.

 *   **`SqlPersistentStore::Backend`**: This class encapsulates all direct
     interactions with a single SQLite database. It runs entirely on a
     background sequence to avoid blocking the main network thread. Its
     responsibilities include:
     *   Executing all SQL queries (CREATE, READ, UPDATE, DELETE).
     *   Managing the database schema and transactions.
     *   Handling database initialization and error recovery.

 ### Data Structures and Utilities

 *   **`CacheEntryKey`**: A memory-efficient wrapper for the cache key string.
     Since cache keys can be long and are stored in multiple in-memory data
     structures, this class uses a `scoped_refptr<base::RefCountedString>` to
     share the underlying string data, reducing memory overhead.

 *   **`SqlPersistentStoreInMemoryIndex`**: An in-memory index that maps a hash
     of a `CacheEntryKey` to its `ResId` (the primary key in the database). This
     allows for fast, synchronous checks to see if an entry is likely to exist in
     the cache without needing to perform a slow, asynchronous database query. It
     is highly optimized for memory usage.

 *   **`ExclusiveOperationCoordinator`**: A synchronization primitive that
     serializes access to resources. It ensures that "exclusive" operations (like
     cache-wide eviction or cleanup) do not run concurrently with "normal"
     operation (like reading or writing a single entry). Normal operations on
     *different* cache keys can run in parallel.

 *   **`EvictionCandidateAggregator`**: A thread-safe helper class used during
     cache eviction. Each shard independently generates a list of its least
     recently used entries as eviction candidates. This class aggregates these
     lists from all shards, performs a final sort, and selects the global set of
     entries to be evicted to bring the cache size back under its limit.

 *   **`InFlightEntryModification`**: A mechanism to queue metadata updates
     (`last_used` time, headers, body size) for a cache entry that is not
     currently active (i.e., not held open as a `SqlEntryImpl` object). When an
     operation modifies an entry, it records the change as an in-flight
     modification. If the entry is opened again before the background database
     write completes, these queued modifications are applied to the entry's data
     as it is read from disk. This ensures that the in-memory representation of
     an entry is always consistent with pending operations, even with fully
     asynchronous database writes.

 ### How It Works

 1.  **Initialization**: `SqlBackendImpl` creates a `SqlPersistentStore`, which
     in turn creates a number of `BackendShard` instances (e.g., 3), each with
     its own background task runner. Each shard initializes its SQLite database.

 2.  **Entry Operations (Create/Open)**:
     *   A request to open or create an entry arrives at `SqlBackendImpl`.
     *   The entry's key is hashed to determine which `BackendShard` is
         responsible for it.
     *   The operation is posted to the shard's background task runner.
     *   The `SqlPersistentStore::Backend` for that shard executes the necessary
         SQL commands to find or create the entry in its database.
     *   The result (a `SqlEntryImpl` or an error) is returned to the main thread
         via a callback.

 3.  **Data I/O**: Reading and writing data to a `SqlEntryImpl` follows a similar
     pattern, with operations being posted to the appropriate shard's background
     task runner.

 4.  **Eviction**:
     *   When the cache size exceeds a certain threshold, `SqlBackendImpl`
         initiates eviction.
     *   It posts an exclusive "start eviction" task to the `SqlPersistentStore`.
     *   Each shard queries its database for a list of its least recently used
         entries.
     *   The `EvictionCandidateAggregator` collects these lists, selects the
         entries to be removed, and sends the list of doomed entries back to each
         shard to be deleted from the database.

 5.  **Coordination**: The `ExclusiveOperationCoordinator` ensures that
     operations like eviction, which affect the entire cache, do not conflict
     with ongoing reads and writes to individual entries. When an exclusive
     operation is requested, the coordinator waits for all active normal
     operations to complete, runs the exclusive operation, and then resumes
     queued normal operations.

 ## Database Schema

 Each shard of the SQL disk cache uses a SQLite database with the following
 schema:

 ### Tables

 *   **`resources`**: Stores the main metadata for each cache entry.
     *   `res_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the
         resource.
     *   `last_used` (INTEGER): Timestamp for LRU.
     *   `body_end` (INTEGER): End offset of the body.
     *   `bytes_usage` (INTEGER): Total bytes consumed by the entry.
     *   `doomed` (INTEGER): Flag for entries pending deletion (0 for live, 1 for
         doomed).
     *   `check_sum` (INTEGER): The checksum `crc32(head + cache_key_hash)`.
     *   `cache_key_hash` (INTEGER): The hash of `cache_key`.
     *   `cache_key` (TEXT): The full cache key string.
     *   `head` (BLOB): Serialized response headers.

 *   **`blobs`**: Stores the data chunks of the cached body.
     *   `blob_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the blob.
     *   `res_id` (INTEGER): Foreign key to `resources.res_id`.
     *   `start` (INTEGER): Start offset of this blob chunk.
     *   `end` (INTEGER): End offset of this blob chunk.
     *   `check_sum` (INTEGER): The checksum `crc32(blob + cache_key_hash)`.
     *   `blob` (BLOB): The actual data chunk.

 ### Indexes

 *   **`index_resources_cache_key_hash_doomed`**:
     `ON resources(cache_key_hash,doomed)`
     *   Speeds up lookups for live entries (`doomed=0`) by `cache_key_hash`.
         Crucial for `OpenEntry` and similar operations.

 *   **`index_live_resources_last_used_bytes_usage`**:
     `ON resources(last_used, bytes_usage) WHERE doomed=0`
     *   A covering index on `last_used` and `bytes_usage` for live entries.
         Essential for efficient eviction logic, which targets the least recently
         used entries without needing to access the `resources` table directly.

 *   **`index_blobs_res_id_start`**: (`UNIQUE`) `ON blobs(res_id, start)`
     *   A unique index on `(res_id, start)` in the `blobs` table. Ensures quick
         retrieval of data blobs for a given entry at a specific offset and
         maintains data integrity by preventing overlapping blobs for the same
         entry.
	# SQL disk cache backend

	This directory contains an experimental SQL-based implementation of the disk
	cache (`disk_cache::Backend`). It uses SQLite to store cache entries and is
	designed to be more robust and performant than the default cache backends
	(block file backend on Windows and simple cache backend on other OSes),
	especially in scenarios with a large number of small entries.

	The implementation is sharded to improve concurrency, with each shard managing
	its own SQLite database file.

	## Key Classes and Components

	### Core Backend Logic

	* `SqlBackendImpl`: This is the main entry point and public-facing class
	for the SQL-based disk cache. It implements the `disk_cache::Backend`
	interface, handling requests from the network stack to open, create, doom,
	and enumerate cache entries. It owns and coordinates the
	`SqlPersistentStore`.

	* `SqlEntryImpl`: Implements the `disk_cache::Entry` interface,
	representing a single entry in the cache. It manages the data streams
	(header and body) and metadata (`last_used` time, key, etc.) for an entry.
	All I/O operations are passed to the `SqlBackendImpl`, which then delegates
	them to the persistence layer.

	### Persistence Layer

	* `SqlPersistentStore`: This class serves as the primary interface to the
	persistence layer, abstracting away the details of database sharding and
	asynchronous operations. It owns multiple `BackendShard` instances and
	distributes operations among them based on the cache entry's key.

	* `SqlPersistentStore::BackendShard`: Manages a single shard of the cache.
	Each shard has its own SQLite database file, a dedicated background task
	runner for database operations, and an in-memory index of its entries. It
	owns a `SequenceBound<SqlPersistentStore::Backend>`.

	* `SqlPersistentStore::Backend`: This class encapsulates all direct
	interactions with a single SQLite database. It runs entirely on a
	background sequence to avoid blocking the main network thread. Its
	responsibilities include:
	* Executing all SQL queries (CREATE, READ, UPDATE, DELETE).
	* Managing the database schema and transactions.
	* Handling database initialization and error recovery.

	### Data Structures and Utilities

	* `CacheEntryKey`: A memory-efficient wrapper for the cache key string.
	Since cache keys can be long and are stored in multiple in-memory data
	structures, this class uses a `scoped_refptr<base::RefCountedString>` to
	share the underlying string data, reducing memory overhead.

	* `SqlPersistentStoreInMemoryIndex`: An in-memory index that maps a hash
	of a `CacheEntryKey` to its `ResId` (the primary key in the database). This
	allows for fast, synchronous checks to see if an entry is likely to exist in
	the cache without needing to perform a slow, asynchronous database query. It
	is highly optimized for memory usage.

	* `ExclusiveOperationCoordinator`: A synchronization primitive that
	serializes access to resources. It ensures that "exclusive" operations (like
	cache-wide eviction or cleanup) do not run concurrently with "normal"
	operation (like reading or writing a single entry). Normal operations on
	different cache keys can run in parallel.

	* `EvictionCandidateAggregator`: A thread-safe helper class used during
	cache eviction. Each shard independently generates a list of its least
	recently used entries as eviction candidates. This class aggregates these
	lists from all shards, performs a final sort, and selects the global set of
	entries to be evicted to bring the cache size back under its limit.

	* `InFlightEntryModification`: A mechanism to queue metadata updates
	(`last_used` time, headers, body size) for a cache entry that is not
	currently active (i.e., not held open as a `SqlEntryImpl` object). When an
	operation modifies an entry, it records the change as an in-flight
	modification. If the entry is opened again before the background database
	write completes, these queued modifications are applied to the entry's data
	as it is read from disk. This ensures that the in-memory representation of
	an entry is always consistent with pending operations, even with fully
	asynchronous database writes.

	### How It Works

	1. Initialization: `SqlBackendImpl` creates a `SqlPersistentStore`, which
	in turn creates a number of `BackendShard` instances (e.g., 3), each with
	its own background task runner. Each shard initializes its SQLite database.

	2. Entry Operations (Create/Open):
	* A request to open or create an entry arrives at `SqlBackendImpl`.
	* The entry's key is hashed to determine which `BackendShard` is
	responsible for it.
	* The operation is posted to the shard's background task runner.
	* The `SqlPersistentStore::Backend` for that shard executes the necessary
	SQL commands to find or create the entry in its database.
	* The result (a `SqlEntryImpl` or an error) is returned to the main thread
	via a callback.

	3. Data I/O: Reading and writing data to a `SqlEntryImpl` follows a similar
	pattern, with operations being posted to the appropriate shard's background
	task runner.

	4. Eviction:
	* When the cache size exceeds a certain threshold, `SqlBackendImpl`
	initiates eviction.
	* It posts an exclusive "start eviction" task to the `SqlPersistentStore`.
	* Each shard queries its database for a list of its least recently used
	entries.
	* The `EvictionCandidateAggregator` collects these lists, selects the
	entries to be removed, and sends the list of doomed entries back to each
	shard to be deleted from the database.

	5. Coordination: The `ExclusiveOperationCoordinator` ensures that
	operations like eviction, which affect the entire cache, do not conflict
	with ongoing reads and writes to individual entries. When an exclusive
	operation is requested, the coordinator waits for all active normal
	operations to complete, runs the exclusive operation, and then resumes
	queued normal operations.

	## Database Schema

	Each shard of the SQL disk cache uses a SQLite database with the following
	schema:

	### Tables

	* `resources`: Stores the main metadata for each cache entry.
	* `res_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the
	resource.
	* `last_used` (INTEGER): Timestamp for LRU.
	* `body_end` (INTEGER): End offset of the body.
	* `bytes_usage` (INTEGER): Total bytes consumed by the entry.
	* `doomed` (INTEGER): Flag for entries pending deletion (0 for live, 1 for
	doomed).
	* `check_sum` (INTEGER): The checksum `crc32(head + cache_key_hash)`.
	* `cache_key_hash` (INTEGER): The hash of `cache_key`.
	* `cache_key` (TEXT): The full cache key string.
	* `head` (BLOB): Serialized response headers.

	* `blobs`: Stores the data chunks of the cached body.
	* `blob_id` (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique ID for the blob.
	* `res_id` (INTEGER): Foreign key to `resources.res_id`.
	* `start` (INTEGER): Start offset of this blob chunk.
	* `end` (INTEGER): End offset of this blob chunk.
	* `check_sum` (INTEGER): The checksum `crc32(blob + cache_key_hash)`.
	* `blob` (BLOB): The actual data chunk.

	### Indexes

	* `index_resources_cache_key_hash_doomed`:
	`ON resources(cache_key_hash,doomed)`
	* Speeds up lookups for live entries (`doomed=0`) by `cache_key_hash`.
	Crucial for `OpenEntry` and similar operations.

	* `index_live_resources_last_used_bytes_usage`:
	`ON resources(last_used, bytes_usage) WHERE doomed=0`
	* A covering index on `last_used` and `bytes_usage` for live entries.
	Essential for efficient eviction logic, which targets the least recently
	used entries without needing to access the `resources` table directly.

	* `index_blobs_res_id_start`: (`UNIQUE`) `ON blobs(res_id, start)`
	* A unique index on `(res_id, start)` in the `blobs` table. Ensures quick
	retrieval of data blobs for a given entry at a specific offset and
	maintains data integrity by preventing overlapping blobs for the same
	entry.