| Interface | Description |
|---|---|
| URLExemptionFilter | Interface used to allow exemptions to external domain resources by overriding db.ignore.external.links. |
| URLFilter | Interface used to limit which URLs enter Nutch. |
| URLNormalizer | Interface used to convert URLs to normal form and optionally perform substitutions. |
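To make the idea of a filter that limits which URLs enter a crawl concrete, here is a minimal, self-contained sketch. It does not use the Nutch API: `SimpleUrlFilter` and `SuffixUrlFilter` are hypothetical names, and the convention that a filter returns the URL to accept it and `null` to reject it is an assumption made for this illustration.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a URL filter (not the Nutch interface).
// Assumed convention: return the URL to keep it, or null to drop it.
interface SimpleUrlFilter {
    String filter(String url);
}

// Rejects any URL whose lowercased form ends with one of the blocked suffixes.
final class SuffixUrlFilter implements SimpleUrlFilter {
    private final List<String> blockedSuffixes;

    SuffixUrlFilter(List<String> blockedSuffixes) {
        this.blockedSuffixes = List.copyOf(blockedSuffixes);
    }

    @Override
    public String filter(String url) {
        if (url == null) {
            return null;
        }
        String lower = url.toLowerCase(Locale.ROOT);
        for (String suffix : blockedSuffixes) {
            if (lower.endsWith(suffix)) {
                return null; // rejected: the URL does not enter the crawl
            }
        }
        return url; // accepted unchanged
    }
}

final class SuffixUrlFilterDemo {
    public static void main(String[] args) {
        SimpleUrlFilter filter = new SuffixUrlFilter(List.of(".jpg", ".css"));
        System.out.println(filter.filter("http://example.org/index.html"));
        System.out.println(filter.filter("http://example.org/logo.JPG"));
    }
}
```

A real plugin would implement the `URLFilter` interface from this package instead of the stand-in above; consult that interface's own documentation for its exact contract.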

| Class | Description |
|---|---|
| URLExemptionFilters | Creates and caches URLExemptionFilter implementing plugins. |
| URLFilterChecker | Checks one given filter or all filters. |
| URLFilters | Creates and caches URLFilter implementing plugins. |
| URLNormalizerChecker | Checks one given normalizer or all normalizers. |
| URLNormalizers | This class uses a "chained filter" pattern to run defined normalizers. |
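The "chained filter" pattern mentioned for URLNormalizers can be sketched as follows: each normalizer receives the output of the previous one, in a fixed order. This is an illustration only, not the Nutch implementation; `SimpleUrlNormalizer`, `NormalizerChain`, and the two sample normalizers are hypothetical names, and stopping the chain on a `null` result is an assumption of this sketch.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a URL normalizer (not the Nutch interface).
interface SimpleUrlNormalizer {
    String normalize(String url);
}

// Runs the configured normalizers in order, feeding each one the
// result of the previous one.
final class NormalizerChain {
    private final List<SimpleUrlNormalizer> normalizers;

    NormalizerChain(List<SimpleUrlNormalizer> normalizers) {
        this.normalizers = List.copyOf(normalizers);
    }

    String normalize(String url) {
        String current = url;
        for (SimpleUrlNormalizer normalizer : normalizers) {
            if (current == null) {
                break; // assumption: a null result ends the chain early
            }
            current = normalizer.normalize(current);
        }
        return current;
    }
}

final class NormalizerChainDemo {
    // Lowercases the scheme part (everything before "://").
    static final SimpleUrlNormalizer LOWERCASE_SCHEME = url -> {
        int idx = url.indexOf("://");
        return idx < 0 ? url : url.substring(0, idx).toLowerCase(Locale.ROOT) + url.substring(idx);
    };

    // Drops the fragment ("#...") if one is present.
    static final SimpleUrlNormalizer STRIP_FRAGMENT = url -> {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    };

    public static void main(String[] args) {
        NormalizerChain chain = new NormalizerChain(List.of(LOWERCASE_SCHEME, STRIP_FRAGMENT));
        // prints "http://example.org/page"
        System.out.println(chain.normalize("HTTP://example.org/page#top"));
    }
}
```

Keeping each normalizer small and composing them in a list makes the order of substitutions explicit, which matters when one rewrite depends on another having already run.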

| Exception | Description |
|---|---|
| URLFilterException | filters and normalizers. |

Copyright © 2021 The Apache Software Foundation