| Interface | Description |
|---|---|
| SegmentMergeFilter | Interface used to filter segments during segment merge. |
| Class | Description |
|---|---|
| ContentAsTextInputFormat | An input format that takes Nutch Content objects and converts them to text while converting newline endings to spaces. |
| SegmentChecker | Checks whether a segment is valid, or has a certain status (generated, fetched, parsed), or can be used safely for a certain processing step (e.g., indexing). |
| SegmentMergeFilters | This class wraps all SegmentMergeFilter extensions in a single object so it is easier to operate on them. |
| SegmentMerger | This tool takes several segments and merges their data together. |
| SegmentMerger.ObjectInputFormat | Wraps inputs in an MetaWrapper, to permit merging different types in reduce and use additional metadata. |
| SegmentMerger.SegmentMergerMapper | |
| SegmentMerger.SegmentMergerReducer | NOTE: in selecting the latest version we rely exclusively on the segment name (not all segment data contain time information). |
| SegmentMerger.SegmentOutputFormat | |
| SegmentPart | Utility class for handling information about segment parts. |
| SegmentReader | Dump the content of a segment. |
| SegmentReader.InputCompatMapper | |
| SegmentReader.InputCompatReducer | |
| SegmentReader.SegmentReaderStats | |
| SegmentReader.TextOutputFormat | Implements a text output format. |
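
The note on SegmentMerger.SegmentMergerReducer says the latest version of a record is chosen purely by segment name. The following is a minimal, self-contained sketch of that idea, not the reducer's actual code. It assumes segment names are timestamp-like strings (for example `yyyyMMddHHmmss`), so that lexicographic order matches chronological order; the `Tagged` class and the `latest` method are hypothetical names introduced only for illustration.

```java
import java.util.List;

public class LatestSegmentSketch {

    /** Hypothetical stand-in for a record tagged with the segment it came from. */
    static final class Tagged {
        final String segmentName;
        final String payload;

        Tagged(String segmentName, String payload) {
            this.segmentName = segmentName;
            this.payload = payload;
        }
    }

    /**
     * Returns the record whose segment name sorts last lexicographically,
     * or null when the list is empty. Assumes timestamp-like segment names.
     */
    static Tagged latest(List<Tagged> versions) {
        Tagged best = null;
        for (Tagged candidate : versions) {
            // Keep the candidate if its segment name is greater than the current best.
            if (best == null || best.segmentName.compareTo(candidate.segmentName) < 0) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Tagged> versions = List.of(
                new Tagged("20210105093000", "second fetch"),
                new Tagged("20210101120000", "first fetch"),
                new Tagged("20210110180000", "third fetch"));
        System.out.println(latest(versions).payload); // prints "third fetch"
    }
}
```

Because only the name is compared, a segment that was renamed to something that does not sort chronologically would be ranked by its new name, not by when its data was actually fetched.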
Copyright © 2021 The Apache Software Foundation