public class CommonCrawlFormatFactory extends Object
Factory class that creates CommonCrawlFormat objects (a.k.a. formatters) that map crawled files to CommonCrawl format.

| Constructor and Description |
|---|
| CommonCrawlFormatFactory() |

| Modifier and Type | Method and Description |
|---|---|
| static CommonCrawlFormat | getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) |
| static CommonCrawlFormat | getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) Deprecated. |
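
The factory methods listed above select a formatter from a format-type string plus configuration objects. The following is a minimal sketch of that pattern only, not the Nutch implementation: `FormatFactorySketch`, `Format`, `Settings`, the `"JSON"` and `"TEXT"` type names, and the `separator` key are all hypothetical stand-ins chosen for illustration. Throwing `IOException` for an unknown type is a choice made for this sketch and may differ from what the real factory does.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-ins, NOT the Nutch classes: they only mirror the shape
// of a static factory that picks a formatter by its type name.
final class FormatFactorySketch {

    /** Stand-in for CommonCrawlFormat: turns one crawled record into a string. */
    interface Format {
        String format(String url, String content) throws IOException;
    }

    /** Stand-in for the configuration arguments (nutchConf, config). */
    static final class Settings {
        final Map<String, String> values;

        Settings(Map<String, String> values) {
            this.values = values;
        }
    }

    /** Static factory keyed by a case-insensitive format-type string. */
    static Format getFormat(String formatType, Settings settings) throws IOException {
        if (formatType == null) {
            throw new IOException("formatType must not be null");
        }
        switch (formatType.toUpperCase(Locale.ROOT)) {
            case "JSON": {
                // Naive JSON for illustration only: the URL is not escaped.
                return (url, content) ->
                        "{\"url\":\"" + url + "\",\"length\":" + content.length() + "}";
            }
            case "TEXT": {
                // The separator is read once from the settings, defaulting to a tab.
                String sep = settings.values.getOrDefault("separator", "\t");
                return (url, content) -> url + sep + content.length();
            }
            default:
                throw new IOException("Unknown format type: " + formatType);
        }
    }

    public static void main(String[] args) throws IOException {
        Settings settings = new Settings(Map.of("separator", " | "));
        Format json = getFormat("json", settings);
        System.out.println(json.format("http://example.org/", "hello"));
        Format text = getFormat("TEXT", settings);
        System.out.println(text.format("http://example.org/", "hello"));
    }
}
```

Callers depend only on the returned interface, so a new output format can be added in the factory without changing the code that consumes the formatter.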
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) throws IOException

Returns a CommonCrawlFormat object specifying the type of formatter.

Parameters:

formatType - the type of formatter to be created.

url - the url.

content - the content.

metadata - the metadata.

nutchConf - the configuration.

config - the CommonCrawl output configuration.

Returns:

a CommonCrawlFormat object.

Throws:

IOException - If any I/O error occurs.

public static CommonCrawlFormat getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) throws IOException

Throws:

IOException

Copyright © 2021 The Apache Software Foundation