TinyLlama.cpp 1.0
A lightweight C++ implementation of the TinyLlama language model
Key-Value cache for a single transformer layer.
#include <model.h>

Public Attributes

    std::vector<float> k
    std::vector<float> v
Key-Value cache for a single transformer layer.
Stores the key and value tensors for the attention mechanism, with optional CUDA support for GPU acceleration.
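For orientation, a minimal sketch of the struct as documented on this page; this is an assumption about the shape of the definition at model.h lines 131-132, not the verbatim source, and any CUDA-side members implied by the description above are omitted.

    #include <vector>

    // Sketch of KVCacheLayer as documented: one key buffer and one value
    // buffer per transformer layer, both flat float vectors on the CPU path.
    struct KVCacheLayer {
        std::vector<float> k; // cached keys for this layer
        std::vector<float> v; // cached values for this layer
    };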
std::vector<float> KVCacheLayer::k
Definition at line 131 of file model.h.
Referenced by attention_batch_cpu(), attention_batch_cpu_sequence_aware(), TinyLlamaModel::forward(), update_kv_cache_batch_cpu(), and update_kv_cache_batch_cpu_sequence_aware().
std::vector<float> KVCacheLayer::v
Definition at line 132 of file model.h.
Referenced by attention_batch_cpu(), attention_batch_cpu_sequence_aware(), TinyLlamaModel::forward(), update_kv_cache_batch_cpu(), and update_kv_cache_batch_cpu_sequence_aware().
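To illustrate how both members typically grow during decoding, the sketch below appends one token's key and value vectors to the cache. The helper name append_kv and the flat per-token layout are illustrative assumptions; in the actual code this work is done by update_kv_cache_batch_cpu() and update_kv_cache_batch_cpu_sequence_aware().

    #include <vector>

    struct KVCacheLayer {
        std::vector<float> k;
        std::vector<float> v;
    };

    // Hypothetical append step: extend each flat buffer by one token's worth
    // of key/value data, keeping k and v the same length at all times.
    void append_kv(KVCacheLayer& cache,
                   const std::vector<float>& k_t,  // new token's key vector
                   const std::vector<float>& v_t)  // new token's value vector
    {
        cache.k.insert(cache.k.end(), k_t.begin(), k_t.end());
        cache.v.insert(cache.v.end(), v_t.begin(), v_t.end());
    }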