This data loader allows loading webpages from the internet.
To use WebLoader, you first need to install the purecpp_extract Python package.
Create a WebLoader by providing a single URL.
Call the Load() method to fetch and extract content from the webpage. This method returns a list containing one Document object.
Each Document contains the following attributes:
metadata: A dictionary with metadata about the document
page_content: The full text content of the webpage
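As a rough illustration of working with the returned list, the sketch below uses a stand-in Document dataclass that mirrors the two attributes listed above. The stand-in is hypothetical and is defined here only so the example runs without the package or network access. The helper name describe_documents and the "source" metadata key are likewise assumptions for illustration, not part of purecpp_extract.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in mirroring the two documented attributes."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def describe_documents(docs: List[Document], preview_chars: int = 40) -> List[str]:
    """Return one summary line per document: metadata keys plus a text preview."""
    lines = []
    for doc in docs:
        # Sorted keys give a stable, readable listing; "(none)" marks empty metadata.
        keys = ", ".join(sorted(doc.metadata)) or "(none)"
        # Flatten newlines so the preview stays on a single line.
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"metadata keys: {keys} | preview: {preview}")
    return lines


# In real use, `docs` would be the list returned by the loader's Load() method.
# Here a hand-built list with one Document stands in for it (assumed values).
docs = [
    Document(
        page_content="Example Domain\nThis domain is for use in examples.",
        metadata={"source": "https://example.com"},
    )
]
for line in describe_documents(docs):
    print(line)
```

Because Load() returns a list even for a single URL, iterating over the result (rather than indexing the first element directly) keeps the same code usable if a loader ever yields more than one Document.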