TinyLlama.cpp 1.0
A lightweight C++ implementation of the TinyLlama language model
SafeTensorsLoader Class Reference

Main class for loading tensors from SafeTensors format files (single or sharded).

#include <safetensors_loader.h>


Classes

struct  TensorInfo
 Information about a tensor stored in the SafeTensors file(s).
 

Public Member Functions

 SafeTensorsLoader (const std::string &model_load_path)
 Constructs a SafeTensorsLoader.
 
 ~SafeTensorsLoader ()
 Destructor. Cleans up all memory-mapped shards.
 
 SafeTensorsLoader (const SafeTensorsLoader &)=delete
 
SafeTensorsLoader & operator= (const SafeTensorsLoader &)=delete
 
std::vector< std::string > tensor_names () const
 Get a list of all tensor names available in the loaded model.
 
std::vector< uint8_t > get_tensor_bytes (const std::string &name) const
 Get the raw bytes for a tensor, converting to FP32 if needed.
 
const TensorInfo & get_tensor_info (const std::string &name) const
 Get information about a specific tensor.
 
std::map< std::string, std::vector< uint8_t > > load_all_tensors_parallel () const
 Load all tensors in parallel.
 

Static Public Member Functions

static bool load_model_config_from_json (const std::string &model_path_or_dir, ModelConfig &config_to_populate)
 Loads model configuration from a JSON file corresponding to a .safetensors model path.
 

Private Member Functions

void load_from_directory (const std::string &directory_path)
 Load tensors from a directory, handling index files and multiple shards.
 
void load_single_file (const std::string &file_path, const std::string &shard_key_override="")
 Load a single .safetensors file as a shard.
 
void parse_shard_metadata (Shard &shard, const std::string &shard_key)
 Parse the metadata of a shard and populate tensor information.
 
std::vector< uint8_t > convert_tensor_data (const uint8_t *data, size_t size, const std::string &dtype) const
 Convert raw tensor data to FP32 if needed.
 
const Shard * get_shard_for_tensor (const std::string &tensor_name) const
 Get the Shard object for a given tensor name.
 

Private Attributes

std::string model_load_path_
 
bool is_sharded_ = false
 
std::map< std::string, TensorInfo > tensors_
 
std::map< std::string, std::unique_ptr< Shard > > loaded_shards_
 
std::map< std::string, std::string > tensor_name_to_shard_key_map_
 

Detailed Description

Main class for loading tensors from SafeTensors format files (single or sharded)

Supports both single-file and multi-shard (sharded) SafeTensors models. Handles memory mapping, tensor metadata parsing, and provides efficient access to tensor data. Can load models from a single .safetensors file, a directory containing multiple shards, or a directory with an index file.

Definition at line 120 of file safetensors_loader.h.

Constructor & Destructor Documentation

◆ SafeTensorsLoader() [1/2]

SafeTensorsLoader::SafeTensorsLoader ( const std::string &  model_load_path)
explicit

Constructs a SafeTensorsLoader.

The path can be to a single .safetensors file, or a directory containing .safetensors file(s) and potentially an index.json.

Parameters
model_load_path  Path to the model file or directory.
Exceptions
std::runtime_error  if files cannot be opened, are invalid, or sharding info is inconsistent.

Definition at line 283 of file safetensors_loader.cpp.

284 : model_load_path_(model_load_path), is_sharded_(false) {
285 Logger::info("SafeTensorsLoader: Initializing for path: " + model_load_path_);
286 std::filesystem::path path_obj(model_load_path_);
287
288 if (!std::filesystem::exists(path_obj)){
289 throw std::runtime_error("SafeTensorsLoader: Provided model_load_path does not exist: " + model_load_path_);
290 }
291
292 if (std::filesystem::is_directory(path_obj)) {
293 Logger::info("SafeTensorsLoader: Path is a directory. Attempting to load from directory.");
294 load_from_directory(model_load_path_);
295 } else if (std::filesystem::is_regular_file(path_obj)) {
296 Logger::info("SafeTensorsLoader: Path is a single file. Loading single file.");
297 std::string file_key = path_obj.filename().string();
298 load_single_file(model_load_path_, file_key);
299 is_sharded_ = false;
300 } else {
301 throw std::runtime_error("SafeTensorsLoader: model_load_path is not a valid file or directory: " + model_load_path_);
302 }
303
304 if (tensors_.empty() && loaded_shards_.empty()) {
305 Logger::warning("SafeTensorsLoader: Initialization complete, but no tensors were loaded and no shards mapped. Check model path and format: " + model_load_path_);
306 } else {
307 Logger::info("SafeTensorsLoader: Initialization complete. Total unique tensors mapped: " + std::to_string(tensors_.size()) +
308 " from " + std::to_string(loaded_shards_.size()) + " shard(s).");
309 }
310}

References Logger::info(), is_sharded_, load_from_directory(), load_single_file(), loaded_shards_, model_load_path_, tensors_, and Logger::warning().

◆ ~SafeTensorsLoader()

SafeTensorsLoader::~SafeTensorsLoader ( )

Destructor. Cleans up all memory-mapped shards.

Definition at line 312 of file safetensors_loader.cpp.

312 {
313 Logger::info("SafeTensorsLoader: Destructing. Clearing " + std::to_string(loaded_shards_.size()) + " loaded shards.");
314 loaded_shards_.clear();
315 Logger::info("SafeTensorsLoader: All shards cleared.");
316}

References Logger::info(), and loaded_shards_.

◆ SafeTensorsLoader() [2/2]

SafeTensorsLoader::SafeTensorsLoader ( const SafeTensorsLoader & )
delete

Member Function Documentation

◆ convert_tensor_data()

std::vector< uint8_t > SafeTensorsLoader::convert_tensor_data ( const uint8_t *  data,
size_t  size,
const std::string &  dtype 
) const
private

Convert raw tensor data to FP32 if needed.

Handles conversion from F16/BF16 to FP32 as required by the tensor's dtype.

Parameters
data  Pointer to the raw tensor data.
size  Size of the data in bytes.
dtype  Data type string (e.g., "F32", "F16", "BF16").
Returns
Converted tensor data as a vector of bytes (FP32 format).

Definition at line 580 of file safetensors_loader.cpp.

580 {
581 if (dtype_str_upper == "F32") {
582 return std::vector<uint8_t>(data_ptr, data_ptr + n_bytes);
583 } else if (dtype_str_upper == "F16") {
584 size_t num_elements = n_bytes / 2;
585 std::vector<float> f32_vec(num_elements);
586 const uint16_t* f16_ptr = reinterpret_cast<const uint16_t*>(data_ptr);
587 for (size_t i = 0; i < num_elements; ++i) {
588 f32_vec[i] = cpu_f16_to_float32(f16_ptr[i]);
589 }
590 std::vector<uint8_t> bytes_out(num_elements * sizeof(float));
591 memcpy(bytes_out.data(), f32_vec.data(), bytes_out.size());
592 return bytes_out;
593 } else if (dtype_str_upper == "BF16") {
594 size_t num_elements = n_bytes / 2;
595 std::vector<float> f32_vec(num_elements);
596 const uint16_t* bf16_ptr = reinterpret_cast<const uint16_t*>(data_ptr);
597 for (size_t i = 0; i < num_elements; ++i) {
598 f32_vec[i] = cpu_bf16_to_float32(bf16_ptr[i]);
599 }
600 std::vector<uint8_t> bytes_out(num_elements * sizeof(float));
601 memcpy(bytes_out.data(), f32_vec.data(), bytes_out.size());
602 return bytes_out;
603 }
604 throw std::runtime_error("SafeTensorsLoader: Unsupported tensor dtype for conversion: " + dtype_str_upper);
605}

References cpu_bf16_to_float32(), and cpu_f16_to_float32().

Referenced by get_tensor_bytes().
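
The two narrow formats differ only in how their 16 bits map onto IEEE-754 float32: BF16 is simply the upper half of a float32, while F16 needs its 5-bit exponent rebiased from 15 to 127. The helpers below are a stdlib-only sketch of that widening, illustrative stand-ins for the project's `cpu_bf16_to_float32()` and `cpu_f16_to_float32()`, whose actual implementations are not shown on this page:

```cpp
#include <cstdint>
#include <cstring>

// BF16 is the top 16 bits of an IEEE-754 float32, so widening is a shift.
static float bf16_to_f32(uint16_t raw) {
    uint32_t bits = static_cast<uint32_t>(raw) << 16;
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// F16 (IEEE-754 binary16): 1 sign, 5 exponent, 10 mantissa bits.
static float f16_to_f32(uint16_t raw) {
    uint32_t sign = (raw >> 15) & 0x1;
    uint32_t exp  = (raw >> 10) & 0x1F;
    uint32_t mant =  raw        & 0x3FF;
    uint32_t bits;
    if (exp == 0) {              // zero or subnormal
        if (mant == 0) {
            bits = sign << 31;
        } else {                 // renormalize the subnormal for float32
            exp = 127 - 15 + 1;
            while (!(mant & 0x400)) { mant <<= 1; --exp; }
            mant &= 0x3FF;       // drop the now-implicit leading bit
            bits = (sign << 31) | (exp << 23) | (mant << 13);
        }
    } else if (exp == 0x1F) {    // infinity / NaN
        bits = (sign << 31) | (0xFFu << 23) | (mant << 13);
    } else {                     // normal: rebias exponent 15 -> 127
        bits = (sign << 31) | ((exp + 112) << 23) | (mant << 13);
    }
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}
```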

◆ get_shard_for_tensor()

const Shard * SafeTensorsLoader::get_shard_for_tensor ( const std::string &  tensor_name) const
private

Get the Shard object for a given tensor name.

Looks up the shard key for the tensor and returns a pointer to the corresponding Shard.

Parameters
tensor_name  Name of the tensor.
Returns
Pointer to the Shard containing the tensor.
Exceptions
std::logic_error  if the shard is not found.

Definition at line 514 of file safetensors_loader.cpp.

514 {
515 auto map_it = tensor_name_to_shard_key_map_.find(tensor_name);
516 std::string determined_shard_key;
517
518 if (map_it != tensor_name_to_shard_key_map_.end()){
519 determined_shard_key = map_it->second;
520 } else {
521 const auto& tensor_info_direct = get_tensor_info(tensor_name);
522 determined_shard_key = tensor_info_direct.shard_key;
523 }
524
525 if (determined_shard_key.empty()){
526 throw std::logic_error("Internal inconsistency: Could not determine shard key for tensor '" + tensor_name + "'.");
527 }
528
529 auto shard_it = loaded_shards_.find(determined_shard_key);
530 if (shard_it == loaded_shards_.end()) {
531 throw std::logic_error("Internal inconsistency: Shard key '" + determined_shard_key + "' for tensor '" + tensor_name + "' not found in loaded_shards_ map. Tensors map has it, but shard object itself is missing.");
532 }
533 return shard_it->second.get();
534}

References get_tensor_info(), loaded_shards_, and tensor_name_to_shard_key_map_.

Referenced by get_tensor_bytes().

◆ get_tensor_bytes()

std::vector< uint8_t > SafeTensorsLoader::get_tensor_bytes ( const std::string &  name) const

Get the raw bytes for a tensor, converting to FP32 if needed.

Parameters
name  Name of the tensor to load.
Returns
Vector of bytes containing the tensor data (FP32 format).
Exceptions
std::runtime_error  if tensor not found or conversion fails.

Definition at line 536 of file safetensors_loader.cpp.

536 {
537 const TensorInfo& info = get_tensor_info(name);
538 const Shard* shard = get_shard_for_tensor(name);
539
540 const uint8_t* raw_data_ptr = shard->get_tensor_raw_data(info.data_offset, info.nbytes);
541 return convert_tensor_data(raw_data_ptr, info.nbytes, info.dtype);
542}

References convert_tensor_data(), SafeTensorsLoader::TensorInfo::data_offset, SafeTensorsLoader::TensorInfo::dtype, get_shard_for_tensor(), get_tensor_info(), Shard::get_tensor_raw_data(), and SafeTensorsLoader::TensorInfo::nbytes.

◆ get_tensor_info()

const SafeTensorsLoader::TensorInfo & SafeTensorsLoader::get_tensor_info ( const std::string &  name) const

Get information about a specific tensor.

Parameters
name  Name of the tensor.
Returns
Reference to the tensor's information.
Exceptions
std::runtime_error  if tensor not found.

Definition at line 506 of file safetensors_loader.cpp.

506 {
507 auto it = tensors_.find(name);
508 if (it == tensors_.end()) {
509 throw std::runtime_error("Tensor not found in SafeTensorsLoader metadata: " + name);
510 }
511 return it->second;
512}

References tensors_.

Referenced by get_shard_for_tensor(), and get_tensor_bytes().

◆ load_all_tensors_parallel()

std::map< std::string, std::vector< uint8_t > > SafeTensorsLoader::load_all_tensors_parallel ( ) const

Load all tensors in parallel.

Returns
Map of tensor names to their data (FP32 format).

Definition at line 544 of file safetensors_loader.cpp.

544 {
545 std::map<std::string, std::vector<uint8_t>> result_map;
546 if (tensors_.empty()) {
547 Logger::debug("SafeTensorsLoader::load_all_tensors_parallel: No tensors to load.");
548 return result_map;
549 }
550
551 std::vector<std::future<std::pair<std::string, std::vector<uint8_t>>>> futures;
552 unsigned int n_threads = std::max(1u, std::thread::hardware_concurrency());
553 n_threads = std::min(n_threads, static_cast<unsigned int>(tensors_.size()));
554 if (n_threads > 16) n_threads = 16;
555
556 ThreadPool pool(n_threads);
557 Logger::info("SafeTensorsLoader: Loading all " + std::to_string(tensors_.size()) + " tensors in parallel using " + std::to_string(n_threads) + " threads.");
558
559 for (const auto& pair : tensors_) {
560 const std::string& tensor_name = pair.first;
561 futures.push_back(pool.submit([this, tensor_name]() {
562 std::vector<uint8_t> data = this->get_tensor_bytes(tensor_name);
563 return std::make_pair(tensor_name, std::move(data));
564 }));
565 }
566
567 for (auto& fut : futures) {
568 try {
569 std::pair<std::string, std::vector<uint8_t>> tensor_pair = fut.get();
570 result_map[tensor_pair.first] = std::move(tensor_pair.second);
571 } catch (const std::exception& e) {
572 Logger::error("SafeTensorsLoader: Error loading a tensor in parallel task: " + std::string(e.what()));
573 throw;
574 }
575 }
576 Logger::info("SafeTensorsLoader: Finished loading all tensors in parallel.");
577 return result_map;
578}

References Logger::debug(), Logger::error(), Logger::info(), ThreadPool::submit(), and tensors_.
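
The fan-out/fan-in pattern above (submit one task per tensor, then collect futures, rethrowing any task exception at `get()`) can be sketched with `std::async` in place of the project's `ThreadPool`. This is a minimal stand-in: `fake_load` substitutes for `get_tensor_bytes()`, and no thread-count cap is applied:

```cpp
#include <future>
#include <map>
#include <string>
#include <utility>
#include <vector>
#include <cstdint>

// Stand-in for get_tensor_bytes(): each "tensor" is just a filled buffer
// whose length equals the name length.
static std::vector<uint8_t> fake_load(const std::string& name) {
    return std::vector<uint8_t>(name.size(), 0xAB);
}

// Fan out one task per tensor, then fan in to an ordered map.
static std::map<std::string, std::vector<uint8_t>>
load_all_parallel(const std::vector<std::string>& names) {
    std::vector<std::future<std::pair<std::string, std::vector<uint8_t>>>> futures;
    futures.reserve(names.size());
    for (const auto& n : names) {
        futures.push_back(std::async(std::launch::async, [n]() {
            return std::make_pair(n, fake_load(n));
        }));
    }
    std::map<std::string, std::vector<uint8_t>> out;
    for (auto& f : futures) {
        auto p = f.get();                 // rethrows any task exception here
        out[p.first] = std::move(p.second);
    }
    return out;
}
```

The real loader bounds concurrency at min(hardware_concurrency, tensor count, 16), which matters because each task touches a memory-mapped shard and unbounded `std::async` may oversubscribe.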

◆ load_from_directory()

void SafeTensorsLoader::load_from_directory ( const std::string &  directory_path)
private

Load tensors from a directory, handling index files and multiple shards.

If an index file is found, parses it and loads the referenced shards. Otherwise, scans for .safetensors files and loads them as individual shards.

Parameters
directory_path  Path to the directory containing model files.

Definition at line 318 of file safetensors_loader.cpp.

318 {
319 Logger::debug("SafeTensorsLoader::load_from_directory for '" + directory_path_str + "'.");
320 std::filesystem::path dir_p(directory_path_str);
321 std::filesystem::path index_json_path_v1 = dir_p / "model.safetensors.index.json";
322 std::filesystem::path index_json_path_v2 = dir_p / "pytorch_model.bin.index.json";
323 std::filesystem::path actual_index_path;
324
325 bool index_found = false;
326 if (std::filesystem::exists(index_json_path_v1) && std::filesystem::is_regular_file(index_json_path_v1)) {
327 actual_index_path = index_json_path_v1;
328 index_found = true;
329 } else if (std::filesystem::exists(index_json_path_v2) && std::filesystem::is_regular_file(index_json_path_v2)) {
330 actual_index_path = index_json_path_v2;
331 index_found = true;
332 }
333
334 if (index_found) {
335 Logger::info("SafeTensorsLoader: Found index file: " + actual_index_path.string());
336 is_sharded_ = true;
337 std::ifstream f(actual_index_path.string());
338 if (!f.is_open()) {
339 throw std::runtime_error("SafeTensorsLoader: Failed to open index file: " + actual_index_path.string());
340 }
341 nlohmann::json index_json_data;
342 try {
343 index_json_data = nlohmann::json::parse(f);
344 } catch (const nlohmann::json::parse_error& e) {
345 f.close();
346 throw std::runtime_error("SafeTensorsLoader: Failed to parse index JSON from " + actual_index_path.string() + ": " + e.what());
347 }
348 f.close();
349
350 if (index_json_data.count("weight_map") && index_json_data["weight_map"].is_object()) {
351 // First pass: populate tensor_name_to_shard_key_map_ and identify unique shards to load
352 std::map<std::string, std::string> unique_shards_to_load; // shard_filename -> full_path
353 for (auto const& [tensor_name, shard_filename_json] : index_json_data["weight_map"].items()) {
354 if (!shard_filename_json.is_string()) {
355 Logger::warning("SafeTensorsLoader: Shard filename for tensor '" + tensor_name + "' in index is not a string. Skipping.");
356 continue;
357 }
358 std::string shard_filename = shard_filename_json.get<std::string>();
359 tensor_name_to_shard_key_map_[tensor_name] = shard_filename;
360 if (unique_shards_to_load.find(shard_filename) == unique_shards_to_load.end()) {
361 unique_shards_to_load[shard_filename] = (dir_p / shard_filename).string();
362 }
363 }
364
365 // Second pass: load each unique shard and parse its metadata
366 for(const auto& pair : unique_shards_to_load){
367 const std::string& shard_filename = pair.first;
368 const std::string& full_shard_path = pair.second;
369 if (loaded_shards_.find(shard_filename) == loaded_shards_.end()) {
370 Logger::info("SafeTensorsLoader: Loading and parsing shard (from index): " + full_shard_path + " (key:"+ shard_filename + ")");
371 load_single_file(full_shard_path, shard_filename);
372 } else {
373 Logger::debug("SafeTensorsLoader: Shard '" + shard_filename + "' already loaded/parsed (should not happen if unique_shards logic is correct).");
374 }
375 }
376
377 } else {
378 throw std::runtime_error("SafeTensorsLoader: Index file " + actual_index_path.string() + " does not contain a valid 'weight_map'.");
379 }
380 } else {
381 Logger::info("SafeTensorsLoader: No index file found in " + directory_path_str + ". Scanning for *.safetensors files.");
382 std::vector<std::filesystem::path> shard_files;
383 for (const auto& entry : std::filesystem::directory_iterator(dir_p)) {
384 if (entry.is_regular_file() && entry.path().extension() == ".safetensors") {
385 shard_files.push_back(entry.path());
386 }
387 }
388
389 if (shard_files.empty()) {
390 Logger::warning("SafeTensorsLoader: No .safetensors files found directly in directory: " + directory_path_str + ". Checking for model.safetensors as last resort.");
391 std::filesystem::path single_model_file = dir_p / "model.safetensors";
392 if(std::filesystem::exists(single_model_file) && std::filesystem::is_regular_file(single_model_file)){
393 Logger::info("SafeTensorsLoader: Found 'model.safetensors' in directory, loading it as a single non-sharded model.");
394 load_single_file(single_model_file.string(), single_model_file.filename().string());
395 is_sharded_ = false;
396 } else {
397 Logger::info("SafeTensorsLoader: No .safetensors files or index.json found in directory: " + directory_path_str + ". No model weights will be loaded from this path directly.");
398 }
399 } else if (shard_files.size() == 1) {
400 Logger::info("SafeTensorsLoader: Found single .safetensors file: " + shard_files[0].string() + ". Loading as non-sharded.");
401 load_single_file(shard_files[0].string(), shard_files[0].filename().string());
402 is_sharded_ = false;
403 } else {
404 Logger::info("SafeTensorsLoader: Found " + std::to_string(shard_files.size()) + " .safetensors files (no index). Loading all as individual shards.");
405 is_sharded_ = true;
406 for (const auto& p : shard_files) {
407 load_single_file(p.string(), p.filename().string());
408 }
409 }
410 }
411}

References Logger::debug(), Logger::info(), is_sharded_, load_single_file(), loaded_shards_, tensor_name_to_shard_key_map_, and Logger::warning().

Referenced by SafeTensorsLoader().
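
The resolution order implemented above (prefer `model.safetensors.index.json`, then `pytorch_model.bin.index.json`, otherwise scan for `*.safetensors` shards) can be sketched with `std::filesystem` alone. `resolve_shards` is a hypothetical helper for illustration, not part of the loader's API, and it stops at path discovery rather than parsing the index's `weight_map`:

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Returns the index file if one exists (sharded, index-driven load),
// otherwise every *.safetensors file found directly in the directory.
static std::vector<fs::path> resolve_shards(const fs::path& dir) {
    for (const char* idx : {"model.safetensors.index.json",
                            "pytorch_model.bin.index.json"}) {
        fs::path p = dir / idx;
        if (fs::is_regular_file(p)) {
            // The real code parses the index's "weight_map" here and loads
            // each unique shard it names exactly once.
            return {p};
        }
    }
    std::vector<fs::path> shards;
    for (const auto& e : fs::directory_iterator(dir)) {
        if (e.is_regular_file() && e.path().extension() == ".safetensors")
            shards.push_back(e.path());
    }
    return shards;
}
```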

◆ load_model_config_from_json()

bool SafeTensorsLoader::load_model_config_from_json ( const std::string &  model_path_or_dir,
ModelConfig &  config_to_populate 
)
static

Loads model configuration from a JSON file corresponding to a .safetensors model path.

Given the path to a .safetensors model or directory, this method attempts to find a "config.json" in the same directory. If found, it parses the JSON and populates the provided ModelConfig object.

Parameters
model_path_or_dir  Path to the .safetensors model file or directory.
config_to_populate  Reference to a ModelConfig object to be filled.
Returns
True if config.json was found and successfully parsed, false otherwise.

Definition at line 607 of file safetensors_loader.cpp.

607 {
608 std::filesystem::path model_fs_path(model_path_or_dir_str);
609 std::filesystem::path config_json_path;
610
611 if (std::filesystem::is_directory(model_fs_path)) {
612 config_json_path = model_fs_path / "config.json";
613 } else if (std::filesystem::is_regular_file(model_fs_path)) {
614 config_json_path = model_fs_path.parent_path() / "config.json";
615 } else {
616 Logger::error("SafeTensorsLoader::load_model_config_from_json: Provided model path is not a valid file or directory: " + model_path_or_dir_str);
617 return false;
618 }
619 std::string config_json_path_str = config_json_path.string();
620
621 std::ifstream f(config_json_path_str);
622 if (!f.is_open()) {
623 Logger::warning("SafeTensorsLoader: config.json not found at: " + config_json_path_str);
624 return false;
625 }
626
627 try {
628 nlohmann::json data = nlohmann::json::parse(f);
629 f.close();
630
631 config_to_populate.hidden_size = data.value("hidden_size", 0);
632 config_to_populate.intermediate_size = data.value("intermediate_size", 0);
633 config_to_populate.num_attention_heads = data.value("num_attention_heads", 0);
634 config_to_populate.num_key_value_heads = data.value("num_key_value_heads", config_to_populate.num_attention_heads);
635 config_to_populate.num_hidden_layers = data.value("num_hidden_layers", 0);
636 config_to_populate.vocab_size = data.value("vocab_size", 0);
637 config_to_populate.max_position_embeddings = data.value("max_position_embeddings", 2048);
638 config_to_populate.rms_norm_eps = data.value("rms_norm_eps", 1e-5f);
639 config_to_populate.rope_theta = data.value("rope_theta", 10000.0f);
640 config_to_populate.bos_token_id = data.value("bos_token_id", 1);
641 config_to_populate.eos_token_id = data.value("eos_token_id", 2);
642 config_to_populate.pad_token_id = data.value("pad_token_id", -1);
643 config_to_populate.unk_token_id = data.value("unk_token_id", 0);
644
645 if (data.contains("architectures") && data["architectures"].is_array() && !data["architectures"].empty()) {
646 config_to_populate.architecture = data["architectures"][0].get<std::string>();
647 } else {
648 config_to_populate.architecture = data.value("model_type", "unknown");
649 }
650 config_to_populate.model_name = data.value("model_type", config_to_populate.architecture);
651
652 bool is_llama3_vocab_size_json = (config_to_populate.vocab_size == 128256);
653 bool is_llama3_arch_hint_json = (config_to_populate.architecture.find("LlamaForCausalLM") != std::string::npos &&
654 config_to_populate.architecture.find("Llama2") == std::string::npos);
655
656 if (is_llama3_vocab_size_json && is_llama3_arch_hint_json) {
657 config_to_populate.tokenizer_family = ModelConfig::TokenizerFamily::LLAMA3_TIKTOKEN;
658 if (config_to_populate.rope_theta == 10000.0f) {
659 float llama3_rope_candidate = data.value("rope_theta", 500000.0f);
660 if (llama3_rope_candidate > 10000.0f) {
661 config_to_populate.rope_theta = llama3_rope_candidate;
662 } else if (config_to_populate.rope_theta == 10000.0f) {
663 config_to_populate.rope_theta = 500000.0f;
664 }
665 }
666 } else if (config_to_populate.vocab_size == 32000 || config_to_populate.architecture.find("Llama") != std::string::npos) {
667 config_to_populate.tokenizer_family = ModelConfig::TokenizerFamily::LLAMA_SENTENCEPIECE;
668 } else {
669 config_to_populate.tokenizer_family = ModelConfig::TokenizerFamily::UNKNOWN;
670 }
671 config_to_populate.is_gguf_file_loaded = false;
672
673 Logger::info("SafeTensorsLoader: Successfully loaded and parsed model config from: " + config_json_path_str);
674 return true;
675
676 } catch (const nlohmann::json::exception& e) {
677 Logger::error("SafeTensorsLoader: Failed to parse config.json: " + config_json_path_str + ". Error: " + e.what());
678 return false;
679 }
680 return false;
681}

References ModelConfig::architecture, ModelConfig::bos_token_id, ModelConfig::eos_token_id, Logger::error(), ModelConfig::hidden_size, Logger::info(), ModelConfig::intermediate_size, ModelConfig::is_gguf_file_loaded, ModelConfig::LLAMA3_TIKTOKEN, ModelConfig::LLAMA_SENTENCEPIECE, ModelConfig::max_position_embeddings, ModelConfig::model_name, ModelConfig::num_attention_heads, ModelConfig::num_hidden_layers, ModelConfig::num_key_value_heads, ModelConfig::pad_token_id, ModelConfig::rms_norm_eps, ModelConfig::rope_theta, ModelConfig::tokenizer_family, ModelConfig::unk_token_id, ModelConfig::UNKNOWN, ModelConfig::vocab_size, and Logger::warning().

Referenced by TinyLlamaModel::TinyLlamaModel(), and tinyllama::TinyLlamaSession::TinyLlamaSession().
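
The config.json location logic described above reduces to a single `std::filesystem` branch: look inside the directory itself, or next to a single `.safetensors` file. `config_path_for` below is a hypothetical stand-in for that first half of the method; the JSON parsing step (which uses nlohmann::json) is omitted:

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// config.json is expected alongside the model: in the model directory, or
// in the parent directory of a single .safetensors file.
static fs::path config_path_for(const fs::path& model_path) {
    if (fs::is_directory(model_path))
        return model_path / "config.json";
    return model_path.parent_path() / "config.json";
}
```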

◆ load_single_file()

void SafeTensorsLoader::load_single_file ( const std::string &  file_path,
const std::string &  shard_key_override = "" 
)
private

Load a single .safetensors file as a shard.

Memory-maps the file and parses its metadata to populate tensor information.

Parameters
file_path  Path to the .safetensors file.
shard_key_override  Optional key to use for this shard (e.g., filename).

Definition at line 413 of file safetensors_loader.cpp.

413 {
414 std::string key_to_use = shard_key_override.empty() ? std::filesystem::path(file_path).filename().string() : shard_key_override;
415 if (key_to_use.empty()) key_to_use = file_path;
416
417 if (loaded_shards_.count(key_to_use)) {
418 Logger::debug("SafeTensorsLoader: Shard/file '" + key_to_use + "' (path: " + file_path + ") already processed/loaded.");
419 return;
420 }
421 Logger::info("SafeTensorsLoader: Loading single file/shard: " + file_path + " with key: " + key_to_use);
422 try {
423 auto shard = std::make_unique<Shard>(file_path);
424 parse_shard_metadata(*shard, key_to_use);
425 loaded_shards_[key_to_use] = std::move(shard);
426 } catch (const std::exception& e) {
427 throw std::runtime_error("SafeTensorsLoader: Error processing file/shard '" + file_path + "' (key: " + key_to_use + "): " + e.what());
428 }
429}

References Logger::debug(), Logger::info(), loaded_shards_, and parse_shard_metadata().

Referenced by load_from_directory(), and SafeTensorsLoader().

◆ operator=()

SafeTensorsLoader & SafeTensorsLoader::operator= ( const SafeTensorsLoader & )
delete

◆ parse_shard_metadata()

void SafeTensorsLoader::parse_shard_metadata ( Shard &  shard,
const std::string &  shard_key 
)
private

Parse the metadata of a shard and populate tensor information.

Reads the metadata JSON from the shard and adds entries to the tensors_ map.

Parameters
shard  Reference to the Shard object.
shard_key  Key identifying this shard (e.g., filename).

Definition at line 431 of file safetensors_loader.cpp.

431 {
432 Logger::debug("SafeTensorsLoader: Parsing metadata for shard: " + shard_key + " (file: " + shard.file_path + ")");
433 if (!shard.metadata_ptr || shard.metadata_size == 0) {
434 throw std::runtime_error("Shard metadata is not available for parsing (nullptr or zero size): " + shard.file_path);
435 }
436 std::string metadata_json_str;
437 try {
438 metadata_json_str.assign(reinterpret_cast<const char*>(shard.metadata_ptr), shard.metadata_size);
439 } catch (const std::length_error& le) {
440 throw std::runtime_error("Error constructing metadata string for shard " + shard.file_path + ": " + le.what());
441 }
442
443 nlohmann::json metadata_root;
444 try {
445 metadata_root = nlohmann::json::parse(metadata_json_str);
446 } catch (const nlohmann::json::parse_error& e) {
447 throw std::runtime_error("Failed to parse metadata JSON for shard " + shard.file_path + " (key: " + shard_key + ") at offset 8, metadata_size: " +
448 std::to_string(shard.metadata_size) + ". Error: " + e.what() +
449 "\nJSON content snippet (first 200 chars): " + metadata_json_str.substr(0, 200));
450 }
451
452 size_t tensors_in_this_shard_count = 0;
453 for (auto const& [tensor_name_str, info_json] : metadata_root.items()) {
454 if (tensor_name_str == "__metadata__") continue;
455
456 TensorInfo tensor_info;
457 tensor_info.name = tensor_name_str;
458 try {
459 tensor_info.dtype = info_json.at("dtype").get<std::string>();
460 std::transform(tensor_info.dtype.begin(), tensor_info.dtype.end(), tensor_info.dtype.begin(),
461 [](unsigned char c){ return static_cast<char>(std::toupper(c)); });
462
463 for (const auto& dim : info_json.at("shape")) {
464 tensor_info.shape.push_back(dim.get<size_t>());
465 }
466 const auto& data_offsets_json = info_json.at("data_offsets");
467 if (!data_offsets_json.is_array() || data_offsets_json.size() != 2) {
468 throw std::runtime_error("Tensor '" + tensor_name_str + "' 'data_offsets' must be an array of two numbers.");
469 }
470 size_t start_offset_in_data_block = data_offsets_json[0].get<size_t>();
471 size_t end_offset_in_data_block = data_offsets_json[1].get<size_t>();
472
473 tensor_info.data_offset = start_offset_in_data_block;
474 tensor_info.nbytes = end_offset_in_data_block - start_offset_in_data_block;
475 tensor_info.shard_key = shard_key;
476
477 if (tensors_.count(tensor_info.name)) {
478 Logger::warning("SafeTensorsLoader: Duplicate tensor name '" + tensor_info.name + "' encountered. " +
479 "Previous shard key: '" + tensors_[tensor_info.name].shard_key + "', New shard key: '" + shard_key + "'. " +
480 "Overwriting with info from current shard being parsed. This can happen with unindexed multi-file loads or inconsistent index files.");
481 }
482 tensors_[tensor_info.name] = tensor_info;
483 if (tensor_name_to_shard_key_map_.find(tensor_info.name) == tensor_name_to_shard_key_map_.end()){
484 tensor_name_to_shard_key_map_[tensor_info.name] = shard_key;
485 }
486
487 tensors_in_this_shard_count++;
488
489 } catch (const nlohmann::json::exception& e) {
490 throw std::runtime_error("Failed to parse tensor info for '" + tensor_name_str + "' in shard " +
491 shard.file_path + " (key: " + shard_key + "): " + e.what());
492 }
493 }
494 Logger::debug("SafeTensorsLoader: Finished parsing metadata for shard: " + shard_key + ". Parsed " + std::to_string(tensors_in_this_shard_count) + " tensor entries from this shard.");
495}

References SafeTensorsLoader::TensorInfo::data_offset, Logger::debug(), SafeTensorsLoader::TensorInfo::dtype, Shard::file_path, Shard::metadata_ptr, Shard::metadata_size, SafeTensorsLoader::TensorInfo::name, SafeTensorsLoader::TensorInfo::nbytes, SafeTensorsLoader::TensorInfo::shape, SafeTensorsLoader::TensorInfo::shard_key, tensor_name_to_shard_key_map_, tensors_, and Logger::warning().

Referenced by load_single_file().
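
For context on what this method reads: a SafeTensors file starts with an 8-byte little-endian u64 giving the length of the JSON metadata block, followed by the metadata itself and then the raw tensor data; the `data_offsets` parsed above are relative to the start of that data block. A minimal stdlib-only sketch of the prologue parse, where `read_metadata` is a hypothetical stand-in for the mapping and slicing done by `Shard`:

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Parse the fixed SafeTensors prologue from an in-memory buffer:
// bytes [0, 8)     = little-endian u64 metadata length N
// bytes [8, 8 + N) = JSON metadata
// bytes [8 + N, …) = tensor data block (data_offsets are relative to it)
static std::string read_metadata(const std::vector<uint8_t>& file) {
    if (file.size() < 8)
        throw std::runtime_error("truncated header");
    uint64_t n = 0;
    for (int i = 7; i >= 0; --i)          // assemble little-endian u64
        n = (n << 8) | file[i];
    if (file.size() < 8 + n)
        throw std::runtime_error("truncated metadata");
    return std::string(reinterpret_cast<const char*>(file.data() + 8), n);
}
```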

◆ tensor_names()

std::vector< std::string > SafeTensorsLoader::tensor_names ( ) const

Get a list of all tensor names available in the loaded model.

Returns
Vector of tensor names.

Definition at line 497 of file safetensors_loader.cpp.

497 {
498 std::vector<std::string> names;
499 names.reserve(tensors_.size());
500 for (const auto& pair : tensors_) {
501 names.push_back(pair.first);
502 }
503 return names;
504}

References tensors_.

Member Data Documentation

◆ is_sharded_

bool SafeTensorsLoader::is_sharded_ = false
private

True if model is loaded from multiple shard files

Definition at line 195 of file safetensors_loader.h.

Referenced by load_from_directory(), and SafeTensorsLoader().

◆ loaded_shards_

std::map<std::string, std::unique_ptr<Shard> > SafeTensorsLoader::loaded_shards_
private

Map of shard keys (e.g., filenames) to Shard objects

Definition at line 198 of file safetensors_loader.h.

Referenced by get_shard_for_tensor(), load_from_directory(), load_single_file(), SafeTensorsLoader(), and ~SafeTensorsLoader().

◆ model_load_path_

std::string SafeTensorsLoader::model_load_path_
private

Original path provided to constructor (file or directory)

Definition at line 194 of file safetensors_loader.h.

Referenced by SafeTensorsLoader().

◆ tensor_name_to_shard_key_map_

std::map<std::string, std::string> SafeTensorsLoader::tensor_name_to_shard_key_map_
private

Maps each tensor name to the key of the shard that contains it (populated from the index file's weight_map or during shard metadata parsing)

Referenced by get_shard_for_tensor(), load_from_directory(), and parse_shard_metadata().

◆ tensors_

std::map<std::string, TensorInfo> SafeTensorsLoader::tensors_
private

Global map of tensor names to their comprehensive info

Definition at line 197 of file safetensors_loader.h.

Referenced by get_tensor_info(), load_all_tensors_parallel(), parse_shard_metadata(), SafeTensorsLoader(), and tensor_names().


The documentation for this class was generated from the following files:

safetensors_loader.h
safetensors_loader.cpp