Installation
To use the WebLoader, you first need to install the purecpp_extract Python package, for example with pip install purecpp_extract.
Initialization
You can initialize the WebLoader by providing a single URL.
Load
Once initialized, use the Load() method to fetch and extract content from the webpage. This method returns a list containing one Document object.
Each Document contains the following attributes:
- metadata: A dictionary with metadata about the document
- page_content: The full text content of the webpage
Since only a single URL is allowed per instance, the returned list will always contain exactly one Document.
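The flow described above (one URL in, a one-item list out) can be sketched as follows. This is a minimal illustration, not the library's implementation: Document and StubWebLoader here are hypothetical stand-ins that only mimic the interface described in this section so the example runs offline, the "source" metadata key is an assumption, and the import path shown in the comment is assumed rather than confirmed. The helper load_single_document is also hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for the Document described above (not the library's class)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class StubWebLoader:
    """Hypothetical offline stand-in that mimics the WebLoader interface."""

    def __init__(self, url: str) -> None:
        self.url = url  # one URL per instance, as described above

    def Load(self) -> List[Document]:
        # A real loader would fetch and extract the page; this stub
        # returns placeholder text. The "source" key is an assumption.
        return [
            Document(
                page_content=f"Placeholder text for {self.url}",
                metadata={"source": self.url},
            )
        ]


def load_single_document(loader: Any) -> Any:
    """Call Load() and return the only Document, checking the one-item contract."""
    documents = loader.Load()
    if len(documents) != 1:
        raise ValueError(f"expected exactly one Document, got {len(documents)}")
    return documents[0]


# With the package installed, the stub would be replaced by something like:
#   from purecpp_extract import WebLoader   # assumed import path
#   loader = WebLoader("https://example.com")
loader = StubWebLoader("https://example.com")
doc = load_single_document(loader)

print(doc.metadata)            # dictionary with metadata about the document
print(doc.page_content[:80])   # first 80 characters of the page text
```

Unpacking the list through a small helper keeps the "exactly one Document" assumption explicit, so a change in loader behavior surfaces as an error instead of silently dropping data.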