Interface LinkExtractor
All Known Implementing Classes:
ContentInternalLinks, IndexLinks, RichTextInternalLinks
public interface LinkExtractor
A link extractor fetches all hyperlinks from a JSON content response that point to other parts of the Site API of the same site, so that crawling can continue with them. Link extractors should ignore external URLs and URLs pointing to assets.
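
The filtering rule described above (keep same-site links, drop external URLs and asset URLs) could be sketched roughly as follows. This is only an illustration using the JDK: the class `InternalLinkFilter`, the site host, and the asset path prefix are assumptions made for the example and are not part of the documented API.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.stream.Stream;

/** Hypothetical helper, not part of the documented API. */
final class InternalLinkFilter {

    private final String siteHost;    // assumed host name of the crawled site
    private final String assetPrefix; // assumed path prefix under which assets are served

    InternalLinkFilter(String siteHost, String assetPrefix) {
        this.siteHost = siteHost;
        this.assetPrefix = assetPrefix;
    }

    /** Keeps only links that stay on the same site and do not point to assets. */
    Stream<String> filter(Stream<String> candidates) {
        return candidates.filter(this::isInternalContentLink);
    }

    private boolean isInternalContentLink(String link) {
        URI uri;
        try {
            uri = new URI(link);
        } catch (URISyntaxException e) {
            return false; // values that do not parse as a URI are not followed
        }
        if (uri.isOpaque()) {
            return false; // e.g. mailto: links have no path to crawl
        }
        String host = uri.getHost();
        // Relative links (no host) are treated as belonging to the same site.
        boolean sameSite = host == null || host.equalsIgnoreCase(siteHost);
        String path = uri.getPath();
        boolean asset = path != null && path.startsWith(assetPrefix);
        return sameSite && !asset;
    }
}
```

An implementation of getLinks might apply a filter of this kind to the values it reads from the JSON document before returning them.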

Method Summary
Modifier and Type, Method, Description:
boolean accept(java.lang.String suffix)
Returns true if the link extractor accepts the given suffix (processor mapped to this suffix).
java.util.stream.Stream<java.lang.String> getLinks(com.jayway.jsonpath.DocumentContext jsonPathContext)
Retrieves links from the JSON document via JSON path.

Method Detail
accept
boolean accept(java.lang.String suffix)
Returns true if the link extractor accepts the given suffix, i.e. it supports the JSON response of the processor mapped to this suffix.
Parameters:
suffix - the suffix
Returns:
true if the JSON response of this processor is supported

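As a rough illustration of how accept might be used, a caller could pick the extractors that support a given suffix before asking them for links. The sketch below uses a stand-in interface `SuffixAware` that declares only the accept method, so it compiles without the JsonPath library. `SuffixAware`, `SingleSuffixExtractor`, `ExtractorSelection`, and the suffix values are all hypothetical names chosen for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Stand-in that declares only the accept method documented above. */
interface SuffixAware {
    boolean accept(String suffix);
}

/** Hypothetical extractor that supports exactly one suffix. */
final class SingleSuffixExtractor implements SuffixAware {

    private final String supportedSuffix;

    SingleSuffixExtractor(String supportedSuffix) {
        this.supportedSuffix = supportedSuffix;
    }

    @Override
    public boolean accept(String suffix) {
        // equals(null) is false, so a null suffix is simply not accepted
        return supportedSuffix.equals(suffix);
    }
}

/** Hypothetical helper that selects the extractors accepting a suffix. */
final class ExtractorSelection {

    private ExtractorSelection() {
    }

    static <T extends SuffixAware> List<T> forSuffix(List<T> extractors, String suffix) {
        return extractors.stream()
                .filter(extractor -> extractor.accept(suffix))
                .collect(Collectors.toList());
    }
}
```

For example, given one extractor created with the assumed suffix "documents" and another with "pages", `ExtractorSelection.forSuffix(extractors, "pages")` would return only the second one.
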
getLinks
java.util.stream.Stream<java.lang.String> getLinks(com.jayway.jsonpath.DocumentContext jsonPathContext)
Retrieves links from the JSON document via JSON path.
Parameters:
jsonPathContext - the document context
Returns:
a stream of link URLs