Build with PureCPP

Quickstart Guide

Welcome to the Quickstart Guide for PureCPP, a library for building Retrieval-Augmented Generation (RAG) pipelines. This guide walks you through installing the package and loading your first documents.

Prerequisites

Before you begin, ensure your environment meets the following requirements:

Python 3.9, 3.10, or 3.11: PureCPP supports these Python versions.
Linux/WSL support: The library is fully compatible with Linux-based systems and Windows Subsystem for Linux (WSL).
pip: Ensure pip is installed and updated to the latest version.
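The requirements above can be checked with a short script before installing. This is a minimal sketch, not part of PureCPP: the function name `environment_problems` is hypothetical, and it only inspects the interpreter version and platform string. It relies on the fact that Python reports `sys.platform` as `"linux"` under WSL as well as on native Linux.

```python
import sys

# Versions listed in this guide's prerequisites.
SUPPORTED_MINORS = {(3, 9), (3, 10), (3, 11)}


def environment_problems(version_info=None, platform=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; arguments default to the
    running interpreter so the function is easy to test.
    """
    version_info = sys.version_info if version_info is None else version_info
    platform = sys.platform if platform is None else platform
    problems = []

    major_minor = (version_info[0], version_info[1])
    if major_minor not in SUPPORTED_MINORS:
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is not in the supported list"
        )

    # WSL reports itself as "linux", so one check covers both cases.
    if not platform.startswith("linux"):
        problems.append(f"platform '{platform}' is not Linux or WSL")

    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print("Warning:", problem)
```

If the script prints nothing, the interpreter and platform match the list above.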

Install the purecpp Python package:

pip install purecpp

Introduction - Data Loader

Data loaders convert raw data into the standardized PureAI format, ensuring consistency across different data sources. Each loader follows a unified structure, offering a consistent set of methods and a seamless usage experience.
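To illustrate what a unified loader structure can look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `Document`, `Loader`, `InMemoryLoader`, and `total_characters` are hypothetical names, not PureCPP classes. Only the `load` method and the `page_content` and `metadata` attribute names are borrowed from the WebLoader example in this guide.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class Loader(Protocol):
    """Hypothetical common interface: every loader exposes load()."""

    def load(self, source: str) -> list[Document]: ...


class InMemoryLoader:
    """Toy loader that serves text from a dict instead of real files."""

    def __init__(self, store: dict[str, str]):
        self._store = store

    def load(self, source: str) -> list[Document]:
        text = self._store[source]
        return [Document(page_content=text, metadata={"source": source})]


def total_characters(loader: Loader, sources: list[str]) -> int:
    """Works with any object that follows the Loader interface."""
    return sum(len(doc.page_content) for s in sources for doc in loader.load(s))
```

Because `total_characters` depends only on the shared interface, a different loader could be swapped in without changing the calling code, which is the benefit a consistent set of methods is meant to provide.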

Loaders

The library currently includes four loaders: WebLoader, TXTLoader, DOCXLoader, and PDFLoader.

Document Loader	Description
WebLoader	Loads and processes HTML web pages.
TXTLoader	Loads and processes text (.txt) files.
PDFLoader	Loads and processes text from PDF documents.
DOCXLoader	Loads and processes text from Microsoft Word (.docx) files.
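When a pipeline receives mixed inputs, the table above can be turned into a small lookup. The sketch below is an assumption, not a PureCPP feature: `pick_loader_name` and `LOADER_BY_SUFFIX` are hypothetical names, and the function returns the loader's name as a string rather than instantiating anything from the library.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File suffix -> loader name, taken from the table in this guide.
LOADER_BY_SUFFIX = {
    ".txt": "TXTLoader",
    ".pdf": "PDFLoader",
    ".docx": "DOCXLoader",
}


def pick_loader_name(source: str) -> str:
    """Return the name of the loader that matches a URL or file path.

    Hypothetical helper for illustration only.
    """
    # Any http(s) URL goes to WebLoader, whatever its path ends with.
    if urlparse(source).scheme in ("http", "https"):
        return "WebLoader"

    suffix = PurePosixPath(source).suffix.lower()
    try:
        return LOADER_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"no loader listed for {source!r}") from None
```

Checking the URL scheme first is a design choice in this sketch: a link such as `https://example.com/file.pdf` is treated as a web page, so adjust the order if remote PDFs should be handled differently.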

Quickstart Guide

WEB Loader

Get started with the WebLoader to process HTML web pages:

from purecpp import WebLoader

# Initialize the WebLoader
loader = WebLoader()

# Load content from a URL
url = "https://example.com"
documents = loader.load(url)

# Process the loaded documents
for doc in documents:
    print(f"Title: {doc.metadata['title']}")
    print(f"Content: {doc.page_content[:200]}...")
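The loop above indexes `doc.metadata['title']` directly, which would fail for a page without a title if `metadata` behaves like a dictionary. The sketch below shows one defensive way to build the same preview. It is an assumption rather than PureCPP code: `preview` is a hypothetical helper, and the `SimpleNamespace` objects in the usage lines stand in for real loaded documents.

```python
from types import SimpleNamespace


def preview(doc, limit=200):
    """Return 'title: snippet' for a document, tolerating a missing title.

    Hypothetical helper; assumes doc has `metadata` and `page_content`
    attributes as in the example above.
    """
    try:
        title = doc.metadata["title"]
    except (KeyError, TypeError):
        title = "(untitled)"

    content = doc.page_content
    # Only add the ellipsis when text was actually cut off.
    snippet = content if len(content) <= limit else content[:limit] + "..."
    return f"{title}: {snippet}"


# Stand-in documents for demonstration; not produced by PureCPP.
docs = [
    SimpleNamespace(page_content="Hello world", metadata={"title": "Home"}),
    SimpleNamespace(page_content="No title here", metadata={}),
]
for doc in docs:
    print(preview(doc))
```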

Next Steps

Explore more loaders and advanced features:

WebLoader: Load and process HTML web pages.
TXTLoader: Process plain text files.
PDFLoader: Extract content from PDF documents.
DOCXLoader: Process Microsoft Word documents.
