org.apache.hadoop.mapred
Class FixedLengthInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
org.apache.hadoop.mapred.FixedLengthInputFormat
- All Implemented Interfaces:
- InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>, JobConfigurable
@InterfaceAudience.Public
@InterfaceStability.Stable
public class FixedLengthInputFormat
- extends FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
- implements JobConfigurable
FixedLengthInputFormat is an input format for reading input files
that contain fixed-length records. A record need not be text; it can be
arbitrary binary data. Users must configure the record length
property by calling:
FixedLengthInputFormat.setRecordLength(conf, recordLength);
or
conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
- See Also:
FixedLengthRecordReader
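The two configuration calls above can be put in context with a minimal job-setup sketch. It assumes the Hadoop client libraries are on the classpath; the class name FixedLengthJobSetup and the parameters inputDir and recordLength are hypothetical names chosen for illustration, not part of this API.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FixedLengthInputFormat;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical helper class, for illustration only.
public class FixedLengthJobSetup {

    // Builds a JobConf that reads fixed-length records from inputDir.
    public static JobConf configure(String inputDir, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        JobConf conf = new JobConf(FixedLengthJobSetup.class);

        // Keys arrive as LongWritable and values as BytesWritable.
        conf.setInputFormat(FixedLengthInputFormat.class);

        // Equivalent to conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength).
        FixedLengthInputFormat.setRecordLength(conf, recordLength);

        FileInputFormat.setInputPaths(conf, new Path(inputDir));

        // getRecordLength returns zero when no length has been set.
        if (FixedLengthInputFormat.getRecordLength(conf) == 0) {
            throw new IllegalStateException("record length was not configured");
        }
        return conf;
    }
}
```

The mapper, output format, and job submission are omitted; they depend on the job and are unaffected by the choice of input format.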
Method Summary
void | configure(JobConf conf): Initializes a new instance from a JobConf.
static int | getRecordLength(org.apache.hadoop.conf.Configuration conf): Get record length value.
RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit.
protected boolean | isSplitable(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file): Is the given filename splitable? Usually true, but if the file is stream compressed, it will not be.
static void | setRecordLength(org.apache.hadoop.conf.Configuration conf, int recordLength): Set the length of each record.
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
FIXED_RECORD_LENGTH
public static final String FIXED_RECORD_LENGTH
- See Also:
- Constant Field Values
FixedLengthInputFormat
public FixedLengthInputFormat()
setRecordLength
public static void setRecordLength(org.apache.hadoop.conf.Configuration conf,
int recordLength)
- Set the length of each record
- Parameters:
conf - configuration
recordLength - the length of a record
getRecordLength
public static int getRecordLength(org.apache.hadoop.conf.Configuration conf)
- Get record length value
- Parameters:
conf - configuration
- Returns:
- the record length, zero means none was set
configure
public void configure(JobConf conf)
- Description copied from interface: JobConfigurable
- Initializes a new instance from a JobConf.
- Specified by:
configure in interface JobConfigurable
- Parameters:
conf - the configuration
getRecordReader
public RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable> getRecordReader(InputSplit genericSplit,
JobConf job,
Reporter reporter)
throws IOException
- Description copied from interface: InputFormat
- Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect
record boundaries while processing the logical split to present a
record-oriented view to the individual task.
- Specified by:
getRecordReader in interface InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
- Specified by:
getRecordReader in class FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
- Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
- Returns:
- a RecordReader
- Throws:
IOException
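The requirement that a RecordReader respect record boundaries can be illustrated with plain arithmetic. The sketch below is not the FixedLengthRecordReader implementation; it assumes a record belongs to the split that contains its first byte, and the class and method names are hypothetical. It computes how many bytes a reader would skip to reach the first record boundary at or after the split start, and how many records begin inside a split.

```java
// Hypothetical illustration of aligning a split to fixed-length record boundaries.
public class RecordBoundarySketch {

    // Bytes to skip from splitStart so that reading begins on a record boundary.
    public static long bytesToSkip(long splitStart, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        long partial = splitStart % recordLength;
        return partial == 0 ? 0 : recordLength - partial;
    }

    // Number of records whose first byte lies in [splitStart, splitStart + splitLength).
    public static long recordsStartingInSplit(long splitStart, long splitLength, int recordLength) {
        long first = splitStart + bytesToSkip(splitStart, recordLength);
        long end = splitStart + splitLength;
        if (first >= end) {
            return 0;
        }
        // Round up: a record that starts in the split may extend past its end.
        return (end - first + recordLength - 1) / recordLength;
    }

    public static void main(String[] args) {
        // A 300-byte file of 100-byte records, cut into splits of 128, 128 and 44 bytes.
        System.out.println(recordsStartingInSplit(0, 128, 100));
        System.out.println(recordsStartingInSplit(128, 128, 100));
        System.out.println(recordsStartingInSplit(256, 44, 100));
    }
}
```

Under this assumption every record is counted by exactly one split, even though the split offsets do not fall on record boundaries.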
isSplitable
protected boolean isSplitable(org.apache.hadoop.fs.FileSystem fs,
org.apache.hadoop.fs.Path file)
- Description copied from class: FileInputFormat
- Reports whether the given file can be split. This is usually true, but a
stream-compressed file cannot be split.
FileInputFormat implementations can override this and return
false to ensure that individual input files are never split up,
so that each Mapper processes an entire file.
- Overrides:
isSplitable in class FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
- Parameters:
fs - the file system that the file is on
file - the file name to check
- Returns:
- true if the file can be split, false otherwise
Copyright © 2014 Apache Software Foundation. All Rights Reserved.