TinyLlama.cpp 1.0
A lightweight C++ implementation of the TinyLlama language model
block_q8_K Struct Reference

8-bit K-quantized block structure with block sums

#include <quantization.h>

Public Attributes

uint16_t d

int8_t qs [GGML_QK_K]

int16_t bsums [GGML_QK_K/16]

Detailed Description

8-bit K-quantized block structure with block sums

Definition at line 111 of file quantization.h.
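
For orientation, the member list above can be written out as a standalone struct. This is only a sketch that mirrors the documented members and their order. The value of GGML_QK_K is not stated on this page, so 256 is an assumption here (it is the value used by upstream ggml), and kPayloadBytes is a hypothetical name introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 256 values per block, as in upstream ggml.
// This page does not state the value of GGML_QK_K.
constexpr int GGML_QK_K = 256;

// Mirror of the documented members, in the documented order.
struct block_q8_K {
    uint16_t d;                      // block scale
    int8_t   qs[GGML_QK_K];          // quantized values
    int16_t  bsums[GGML_QK_K / 16];  // block sums
};

// Hypothetical helper constant: bytes occupied by the members themselves.
// With GGML_QK_K == 256 this is 2 + 256 + 32 = 290.
constexpr std::size_t kPayloadBytes =
    sizeof(uint16_t) + GGML_QK_K * sizeof(int8_t) +
    (GGML_QK_K / 16) * sizeof(int16_t);

// The compiler may add padding, so only a lower bound is checked.
static_assert(sizeof(block_q8_K) >= kPayloadBytes,
              "struct must hold all documented members");
```

Under that assumption, one block stores 256 quantized values plus 34 bytes of metadata (the scale and 16 block sums).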

Member Data Documentation

◆ bsums

int16_t block_q8_K::bsums[GGML_QK_K/16]

Block sums for fast dot product

Definition at line 114 of file quantization.h.

Referenced by vec_dot_q6_k_q8_k_cpu().
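
The page does not spell out how bsums is filled or used. The sketch below assumes the convention used by upstream ggml: each entry is the sum of 16 consecutive values of qs. It also shows one way such sums can speed up a dot product, by folding a constant per-group offset on the weight side into a single multiplication. The helpers compute_bsums and group_dot_minus_offset are hypothetical names, not functions from quantization.h, and GGML_QK_K = 256 is an assumption.

```cpp
#include <cstdint>

constexpr int GGML_QK_K = 256;  // assumption, not stated on this page

struct block_q8_K {
    uint16_t d;
    int8_t   qs[GGML_QK_K];
    int16_t  bsums[GGML_QK_K / 16];
};

// Hypothetical helper: store the sum of each group of 16 quantized values.
// Each sum lies in [-2048, 2032], so it fits in int16_t.
inline void compute_bsums(block_q8_K& b) {
    for (int g = 0; g < GGML_QK_K / 16; ++g) {
        int sum = 0;
        for (int i = 0; i < 16; ++i) {
            sum += b.qs[16 * g + i];
        }
        b.bsums[g] = static_cast<int16_t>(sum);
    }
}

// Hypothetical helper: dot product of group g with weights stored as
// unsigned codes w[i] that represent (w[i] - offset). It uses the identity
//   sum((w[i] - offset) * q[i]) = sum(w[i] * q[i]) - offset * bsums[g],
// so the offset costs one multiplication per group instead of 16 subtractions.
inline int group_dot_minus_offset(const uint8_t* w, int offset,
                                  const block_q8_K& b, int g) {
    int acc = 0;
    for (int i = 0; i < 16; ++i) {
        acc += static_cast<int>(w[i]) * b.qs[16 * g + i];
    }
    return acc - offset * b.bsums[g];
}
```

Whether vec_dot_q6_k_q8_k_cpu() applies exactly this identity is not confirmed by this page; the sketch only illustrates why precomputed group sums are useful.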

◆ d

uint16_t block_q8_K::d

Block scale

Definition at line 112 of file quantization.h.

Referenced by dequantize_q8_k(), and quantize_fp32_to_q8_K().

◆ qs

int8_t block_q8_K::qs[GGML_QK_K]

Quantized values

Definition at line 113 of file quantization.h.

Referenced by dequantize_q8_k(), quantize_fp32_to_q8_K(), vec_dot_q4_k_q8_k_cpu(), and vec_dot_q6_k_q8_k_cpu().
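
As a rough illustration of how qs and the block scale relate, 8-bit block formats typically recover each value as the scale multiplied by the quantized integer. The sketch below assumes that convention. It takes the scale as an already decoded float, because this page does not say how the uint16_t field d is encoded. The function dequantize_qs_sketch is a hypothetical name and is not the library's dequantize_q8_k(); GGML_QK_K = 256 is likewise an assumption.

```cpp
#include <cstdint>

constexpr int GGML_QK_K = 256;  // assumption, not stated on this page

// Hypothetical helper: expand one block of quantized values to floats.
// `scale` is the block scale after decoding; how `d` maps to a float
// is not documented here, so the decoding step is left out.
inline void dequantize_qs_sketch(const int8_t* qs, float scale, float* out) {
    for (int i = 0; i < GGML_QK_K; ++i) {
        out[i] = scale * static_cast<float>(qs[i]);
    }
}
```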


The documentation for this struct was generated from the following file: