Installation
To use the WebLoader, you first need to install the purecpp_extract Python package, for example with pip install purecpp_extract.
Initialization
You can initialize the WebLoader by providing a single URL.
Load
Once initialized, use the Load() method to fetch and extract content from the webpage. This method returns a list containing one Document object.
Each Document contains the following attributes:
- metadata: A dictionary with metadata about the document
- page_content: The full text content of the webpage

Since only a single URL is allowed per instance, the returned list will always contain exactly one Document.
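
The flow described above (initialize with one URL, call Load(), read the single Document) can be sketched as follows. This is an illustrative sketch, not the package's reference usage. The helpers load_single, summarize, and demo are hypothetical names introduced here. The import path from purecpp_extract import WebLoader and the one-argument constructor are assumptions based on the description above. Only Load(), metadata, and page_content come from this page.

```python
from typing import Any, Protocol


class SupportsLoad(Protocol):
    """Anything exposing a Load() method that returns a list of documents."""

    def Load(self) -> list: ...


def load_single(loader: SupportsLoad) -> Any:
    """Call loader.Load() and return the one Document it is documented to yield."""
    docs = loader.Load()
    # The page states the list always holds exactly one Document; fail loudly otherwise.
    if len(docs) != 1:
        raise ValueError(f"expected exactly one Document, got {len(docs)}")
    return docs[0]


def summarize(doc: Any, preview_chars: int = 200) -> str:
    """Build a short printable summary from the metadata and page_content attributes."""
    preview = doc.page_content[:preview_chars]
    return f"metadata={doc.metadata!r}\npreview={preview!r}"


def demo(url: str) -> None:
    # Assumption: WebLoader is importable from the top level of purecpp_extract
    # and takes the URL as its only constructor argument.
    from purecpp_extract import WebLoader

    loader = WebLoader(url)
    print(summarize(load_single(loader)))
```

Calling demo("https://example.com") with the package installed would print the metadata dictionary and the first 200 characters of the page text. The length check in load_single is optional given the single-URL guarantee, but it makes a violated assumption visible immediately instead of surfacing later as an index error.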