Data Loaders
Introduction - Data Loader
Data loaders convert raw data into the standardized PureAI format, ensuring consistency across different data sources. Each loader follows a unified structure, offering a consistent set of methods and a seamless usage experience.
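To make the idea of a unified structure concrete, here is a minimal sketch of how loaders can share one interface. This section does not show PureCPP's actual class or method names, so everything below (`Document`, `BaseLoader`, `load`, and the two toy loaders) is hypothetical and only illustrates the pattern.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical standardized record; NOT the real PureAI format."""
    content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class BaseLoader(ABC):
    """Hypothetical shared interface: every loader exposes the same load()."""

    def __init__(self, source: str) -> None:
        self.source = source

    @abstractmethod
    def load(self) -> List[Document]:
        """Return the source as a list of standardized documents."""


class InMemoryTextLoader(BaseLoader):
    """Toy loader that treats `source` as the raw text itself."""

    def load(self) -> List[Document]:
        return [Document(content=self.source.strip(), metadata={"loader": "text"})]


class InMemoryLineLoader(BaseLoader):
    """Toy loader that emits one Document per non-empty line of `source`."""

    def load(self) -> List[Document]:
        return [
            Document(content=line.strip(), metadata={"loader": "lines", "line": str(i)})
            for i, line in enumerate(self.source.splitlines())
            if line.strip()
        ]


def load_all(loaders: List[BaseLoader]) -> List[Document]:
    """Calling code never branches on loader type; it only calls load()."""
    docs: List[Document] = []
    for loader in loaders:
        docs.extend(loader.load())
    return docs
```

Because each loader returns the same record type, downstream code can treat output from different sources identically.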
Install
Before you begin, ensure your environment meets the following requirements:
- Python 3.9, 3.10, or 3.11: PureCPP supports these Python versions.
- Linux/WSL support: The library is fully compatible with Linux-based systems and Windows Subsystem for Linux (WSL).
- pip: Ensure pip is installed and updated to the latest version.
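The requirements above can be checked with a short script before installing. This is a sketch that uses only the Python standard library; `check_environment` is a hypothetical helper written for this example and is not part of PureCPP. It assumes that WSL reports itself as `Linux` through `platform.system()`.

```python
import importlib.util
import platform
import sys
from typing import List, Tuple

# Versions listed in the requirements above.
SUPPORTED_PYTHONS = {(3, 9), (3, 10), (3, 11)}


def check_environment(python_version: Tuple[int, int], system: str, has_pip: bool) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    if tuple(python_version) not in SUPPORTED_PYTHONS:
        major, minor = python_version
        problems.append(f"Python {major}.{minor} is not one of 3.9, 3.10, 3.11")
    if system != "Linux":
        problems.append(f"Platform {system!r} is not Linux or WSL")
    if not has_pip:
        problems.append("pip is not installed")
    return problems


if __name__ == "__main__":
    found = check_environment(
        sys.version_info[:2],
        platform.system(),
        importlib.util.find_spec("pip") is not None,
    )
    print("Environment OK" if not found else "\n".join(found))
```

The script does not check whether pip is up to date; it only confirms that pip can be imported.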
run:
Loaders
The library currently includes four loaders: WebLoader, TXTLoader, DOCXLoader, and PDFLoader.
Document Loader | Description |
---|---|
WebLoader | Loads and processes HTML web pages. |
TXTLoader | Loads and processes text (.txt) files. |
PDFLoader | Loads and processes text from PDF documents. |
DOCXLoader | Loads and processes text from Microsoft Word (.docx) files. |
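The table above maps naturally onto a small dispatch helper that picks a loader by source type. The sketch below returns only the loader name as a string, because the constructors and import paths of the real classes are not shown in this section; `pick_loader_name` and `LOADER_BY_EXTENSION` are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# File extensions taken from the table above.
LOADER_BY_EXTENSION = {
    ".txt": "TXTLoader",
    ".pdf": "PDFLoader",
    ".docx": "DOCXLoader",
}


def pick_loader_name(source: str) -> str:
    """Return the name of the loader from the table that matches `source`."""
    # Simplification: any http(s) URL is treated as a web page,
    # even if the URL happens to end in .pdf or .docx.
    if urlparse(source).scheme in ("http", "https"):
        return "WebLoader"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix not in LOADER_BY_EXTENSION:
        raise ValueError(f"No loader listed for {source!r}")
    return LOADER_BY_EXTENSION[suffix]
```

For example, `pick_loader_name("notes/report.PDF")` resolves to `"PDFLoader"` because the extension comparison is case-insensitive, while an unlisted extension such as `.csv` raises `ValueError`.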