Skip to main content

WebPage documentation

The WebPage service is responsible for analyzing, parsing, and indexing HTML pages.

tip

Access to the WebPage service is limited, please get in touch with us for more information.

Analysis

Indexed HTML pages can be analyzed through the Ferret Query Language (FQL) and the parsed data is returned in JSON format. The analysis performance is very high and can reach up to 5.000 requests per second through requests streaming (gRPC).

Query Language

The query Language used to parse and extract data from HTML pages is FQL.

Example

The indexed HTML page is as follows:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Offline</title>
</head>
<body>
<h1>Hello world</h1>
</body>
</html>

The specified FQL query is as follows:

LET page = PARSE(@document)
LET title = ELEMENT(page, "title")

RETURN {title: title}

The returned value will then be as follows:

{
"title": "Offline"
}