Installation

Before you begin, ensure your environment meets the following requirements:

  • Python 3.9, 3.10, or 3.11: PureCPP supports these Python versions.
  • Linux/WSL support: The library is fully compatible with Linux-based systems and Windows Subsystem for Linux (WSL).
  • pip: Ensure pip is installed and updated to the latest version.
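
The requirements above can be checked with a short script before installing. This is a minimal sketch that uses only the Python standard library; the helper name check_environment and its parameters are hypothetical and are not part of PureCPP. It relies on the fact that WSL reports itself as "Linux" through platform.system().

```python
import importlib.util
import platform
import sys

# Python versions listed in the requirements above.
SUPPORTED_VERSIONS = {(3, 9), (3, 10), (3, 11)}


def check_environment(version=None, system=None, pip_available=None):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper, not part of PureCPP. Arguments default to the
    values detected from the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version[:2])
    if system is None:
        system = platform.system()  # WSL also reports "Linux"
    if pip_available is None:
        pip_available = importlib.util.find_spec("pip") is not None

    problems = []
    if version not in SUPPORTED_VERSIONS:
        problems.append(f"Python {version[0]}.{version[1]} is not a listed version")
    if system != "Linux":
        problems.append(f"{system} is not Linux or WSL")
    if not pip_available:
        problems.append("pip is not installed for this interpreter")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment OK" if not issues else "\n".join(issues))
```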

To install the package, run:

pip install purecpp_chunks_clean
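
To confirm that the installation succeeded, the installed distribution can be looked up by the name used in the pip command. The sketch below uses importlib.metadata from the standard library; the helper name installed_version is hypothetical, and the import name of the package may differ from its distribution name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent.

    Hypothetical helper for illustration; it queries pip's metadata and does
    not import the package itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("purecpp_chunks_clean")
    print(found if found is not None else "purecpp_chunks_clean is not installed")
```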

Chunking Modules

The library includes four main chunking modules: ChunkDefault, ChunkCount, ChunkQuery, and ChunkSimilarity.

Module            Description
ChunkDefault      Splits large texts into smaller chunks while maintaining context through overlap.
ChunkCount        Segments text based on a specific count pattern.
ChunkQuery        Filters and retrieves chunks most relevant to a given query.
ChunkSimilarity   Splits and ranks chunks based on their similarity.
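
To illustrate what "maintaining context through overlap" means in the ChunkDefault description, here is a minimal pure-Python sketch of overlap-based splitting. It does not use or reproduce the PureCPP API: the function name split_with_overlap and the parameters chunk_size and overlap are assumptions made for this example, and the library's actual implementation may differ.

```python
def split_with_overlap(text, chunk_size=100, overlap=20):
    """Split text into character chunks where each chunk repeats the last
    `overlap` characters of the previous one.

    Illustrative only; this is not the ChunkDefault implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in split_with_overlap("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

With chunk_size=4 and overlap=2, the string "abcdefghij" yields "abcd", "cdef", "efgh", and "ghij": each chunk starts with the final two characters of the one before it, so text that falls on a chunk boundary still appears intact in at least one chunk.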