Advanced Chunk Processing Library 0.2.0
A comprehensive C++ library for advanced data chunking strategies and processing operations
Information theory based chunking using mutual information.
#include <sophisticated_chunking.hpp>
Public Member Functions
| | MutualInformationChunking (size_t context_size=5, double mi_threshold=0.3) | Constructor for mutual information based chunking. |
| std::vector< std::vector< T > > | chunk (const std::vector< T > &data) const | Chunk data based on mutual information analysis. |
| size_t | get_context_size () const | Get the size of the context window. |
| double | get_mi_threshold () const | Get the threshold for mutual information. |
| void | set_context_size (size_t size) | Set the size of the context window. |
| void | set_mi_threshold (double threshold) | Set the threshold for mutual information. |
Private Member Functions
| double | calculateMutualInformation (const std::vector< T > &segment1, const std::vector< T > &segment2) const | Calculate mutual information between adjacent segments. |
Private Attributes
| size_t | context_size_ |
| double | mi_threshold_ |
Information theory based chunking using mutual information.
Template Parameters
| T | The type of elements to be chunked |
Definition at line 175 of file sophisticated_chunking.hpp.
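sophisticated_chunking::MutualInformationChunking< T >::MutualInformationChunking (size_t context_size=5, double mi_threshold=0.3) [inline]
Constructor for mutual information based chunking.
Parameters
| context_size | Size of the context window |
| mi_threshold | Threshold for mutual information |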
Definition at line 230 of file sophisticated_chunking.hpp.
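double sophisticated_chunking::MutualInformationChunking< T >::calculateMutualInformation (const std::vector< T > &segment1, const std::vector< T > &segment2) const [inline, private]
Calculate mutual information between adjacent segments.
Parameters
| segment1 | First segment |
| segment2 | Second segment |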
Definition at line 186 of file sophisticated_chunking.hpp.
Referenced by sophisticated_chunking::MutualInformationChunking< T >::chunk().
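The page does not state how the mutual information of two segments is estimated. As a rough illustration only, the sketch below uses a standard plug-in estimator: it pairs the segments element by element, builds empirical marginal and joint distributions, and sums p(x,y) * log2(p(x,y) / (p(x) p(y))). The function name `mutual_information_sketch`, the position-wise pairing, and the choice of bits as the unit are all assumptions made for this example and may differ from the library's private helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not the library's private
// calculateMutualInformation(). Returns a plug-in estimate of I(X;Y) in bits,
// treating (segment1[i], segment2[i]) as paired samples of (X, Y).
template <typename T>
double mutual_information_sketch(const std::vector<T>& segment1,
                                 const std::vector<T>& segment2) {
    // Only positions present in both segments can be paired.
    const std::size_t n = std::min(segment1.size(), segment2.size());
    if (n == 0) return 0.0;  // no paired samples: report no shared information

    std::map<T, double> px;                 // empirical marginal of X
    std::map<T, double> py;                 // empirical marginal of Y
    std::map<std::pair<T, T>, double> pxy;  // empirical joint of (X, Y)
    const double weight = 1.0 / static_cast<double>(n);
    for (std::size_t i = 0; i < n; ++i) {
        px[segment1[i]] += weight;
        py[segment2[i]] += weight;
        pxy[std::make_pair(segment1[i], segment2[i])] += weight;
    }

    // I(X;Y) = sum over observed (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y))).
    double mi = 0.0;
    for (const auto& entry : pxy) {
        const double p = entry.second;
        mi += p * std::log2(p / (px.at(entry.first.first) * py.at(entry.first.second)));
    }
    assert(mi > -1e-9);  // mutual information is non-negative up to rounding
    return std::max(mi, 0.0);
}
```

Under this estimator, two identical segments that alternate between two values share one bit of information, while segments whose paired values look independent score zero. `T` must support `operator<` because the sketch stores counts in `std::map`.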
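std::vector< std::vector< T > > sophisticated_chunking::MutualInformationChunking< T >::chunk (const std::vector< T > &data) const [inline]
Chunk data based on mutual information analysis.
Parameters
| data | Input data to be chunked |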
Definition at line 238 of file sophisticated_chunking.hpp.
References sophisticated_chunking::MutualInformationChunking< T >::calculateMutualInformation(), sophisticated_chunking::MutualInformationChunking< T >::context_size_, and sophisticated_chunking::MutualInformationChunking< T >::mi_threshold_.
Referenced by demonstrate_mutual_information_chunking(), TEST_F(), and TEST_F().
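The references above show that chunk() relies on calculateMutualInformation(), context_size_, and mi_threshold_, but the boundary rule itself is not documented here. The sketch below shows one plausible way these three pieces could fit together: slide over the data, compare the window of context_size elements before each position with the window that follows it, and start a new chunk when their mutual information falls below the threshold. The names `window_mi` and `chunk_by_mi_sketch`, the splitting rule, and the minimum chunk length are assumptions for illustration, not the library's implementation; only the default values 5 and 0.3 come from the documented constructor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the private helper: plug-in mutual information
// (in bits) of two equal-length windows, pairing elements by position.
template <typename T>
double window_mi(const std::vector<T>& left, const std::vector<T>& right) {
    assert(!left.empty() && left.size() == right.size());
    std::map<T, double> px, py;
    std::map<std::pair<T, T>, double> pxy;
    const double weight = 1.0 / static_cast<double>(left.size());
    for (std::size_t i = 0; i < left.size(); ++i) {
        px[left[i]] += weight;
        py[right[i]] += weight;
        pxy[std::make_pair(left[i], right[i])] += weight;
    }
    double mi = 0.0;
    for (const auto& e : pxy) {
        mi += e.second *
              std::log2(e.second / (px.at(e.first.first) * py.at(e.first.second)));
    }
    return mi;
}

// Hypothetical free function with the same input and output types as the
// documented chunk(). Assumed rule: split at position i when the context_size
// elements before i and the context_size elements from i onward have mutual
// information below mi_threshold.
template <typename T>
std::vector<std::vector<T>> chunk_by_mi_sketch(const std::vector<T>& data,
                                               std::size_t context_size = 5,
                                               double mi_threshold = 0.3) {
    assert(context_size > 0);
    std::vector<std::vector<T>> chunks;
    if (data.empty()) return chunks;

    const auto at = [&data](std::size_t i) {
        return data.begin() + static_cast<std::ptrdiff_t>(i);
    };
    std::size_t start = 0;  // first index of the chunk currently being built
    // Only positions with a full window on both sides are candidate boundaries.
    for (std::size_t i = context_size; i + context_size <= data.size(); ++i) {
        if (i - start < context_size) continue;  // keep chunks at least one window long
        const std::vector<T> left(at(i - context_size), at(i));
        const std::vector<T> right(at(i), at(i + context_size));
        if (window_mi(left, right) < mi_threshold) {
            chunks.emplace_back(at(start), at(i));
            start = i;
        }
    }
    chunks.emplace_back(at(start), data.end());  // remaining tail
    return chunks;
}
```

With this rule, a strictly periodic input such as 0, 1, 0, 1, ... stays in one chunk because neighbouring windows predict each other, while a switch to a different pattern triggers a split. A side effect of the plug-in estimator is that constant runs carry zero mutual information, so the sketch also splits them into window-sized pieces. Inputs shorter than two windows are returned as a single chunk.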
size_t sophisticated_chunking::MutualInformationChunking< T >::get_context_size () const [inline]
Get the size of the context window.
Definition at line 274 of file sophisticated_chunking.hpp.
References sophisticated_chunking::MutualInformationChunking< T >::context_size_.
double sophisticated_chunking::MutualInformationChunking< T >::get_mi_threshold () const [inline]
Get the threshold for mutual information.
Definition at line 282 of file sophisticated_chunking.hpp.
References sophisticated_chunking::MutualInformationChunking< T >::mi_threshold_.
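void sophisticated_chunking::MutualInformationChunking< T >::set_context_size (size_t size) [inline]
Set the size of the context window.
Parameters
| size | Size of the context window |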
Definition at line 290 of file sophisticated_chunking.hpp.
References sophisticated_chunking::MutualInformationChunking< T >::context_size_.
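void sophisticated_chunking::MutualInformationChunking< T >::set_mi_threshold (double threshold) [inline]
Set the threshold for mutual information.
Parameters
| threshold | Threshold for mutual information |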
Definition at line 300 of file sophisticated_chunking.hpp.
References sophisticated_chunking::MutualInformationChunking< T >::mi_threshold_.
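size_t sophisticated_chunking::MutualInformationChunking< T >::context_size_ [private]
double sophisticated_chunking::MutualInformationChunking< T >::mi_threshold_ [private]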