Package org.apache.parquet.hadoop.api
Class ReadSupport<T>
- java.lang.Object
  - org.apache.parquet.hadoop.api.ReadSupport&lt;T&gt;

Type Parameters:
  T - the type of the materialized record

Direct Known Subclasses:
  DelegatingReadSupport, GroupReadSupport
public abstract class ReadSupport<T> extends Object
Abstraction used by the ParquetInputFormat to materialize records
-
-
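A minimal sketch of how a ReadSupport implementation is typically wired into a MapReduce job via ParquetInputFormat. GroupReadSupport is the concrete subclass shipped with parquet-hadoop; the job setup and input path here are illustrative, not from this page.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.parquet.hadoop.ParquetInputFormat;
import org.apache.parquet.hadoop.example.GroupReadSupport;

public class ParquetJobSetup {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    // ParquetInputFormat delegates record materialization to a ReadSupport
    job.setInputFormatClass(ParquetInputFormat.class);
    // GroupReadSupport materializes each record as a parquet example Group
    ParquetInputFormat.setReadSupportClass(job, GroupReadSupport.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));  // illustrative path
  }
}
```

Custom record types are handled the same way: subclass ReadSupport&lt;T&gt; (see prepareForRead below) and pass that class to setReadSupportClass.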
Nested Class Summary
Nested Classes:
  static class ReadSupport.ReadContext - information to read the file
-
Field Summary
Fields:
  static String PARQUET_READ_SCHEMA - configuration key for a parquet read projection schema
-
Constructor Summary
Constructors:
  ReadSupport()
-
Method Summary
Methods:

static MessageType getSchemaForRead(MessageType fileMessageType, String partialReadSchemaString)
    attempts to validate and construct a MessageType from a read projection schema

static MessageType getSchemaForRead(MessageType fileMessageType, MessageType projectedMessageType)

ReadSupport.ReadContext init(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, MessageType fileSchema)
    Deprecated. override init(InitContext) instead

ReadSupport.ReadContext init(InitContext context)
    called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

abstract RecordMaterializer<T> prepareForRead(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, MessageType fileSchema, ReadSupport.ReadContext readContext)
    called in RecordReader.initialize(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext) in the back end; the returned RecordMaterializer will materialize the records and add them to the destination
-
-
-
Field Detail
-
PARQUET_READ_SCHEMA
public static final String PARQUET_READ_SCHEMA
configuration key for a parquet read projection schema

See Also:
  Constant Field Values
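A short sketch of setting this key on a Hadoop Configuration to request a column projection. The message name and fields below are illustrative; only the columns listed in the projection are read from the file.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.hadoop.api.ReadSupport;

public class ProjectionConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // request only two columns; columns not in the projection are skipped on read
    conf.set(ReadSupport.PARQUET_READ_SCHEMA,
        "message projection { required int64 id; optional binary name (UTF8); }");
  }
}
```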
-
-
Method Detail
-
getSchemaForRead
public static MessageType getSchemaForRead(MessageType fileMessageType, String partialReadSchemaString)
attempts to validate and construct a MessageType from a read projection schema

Parameters:
  fileMessageType - the typed schema of the source
  partialReadSchemaString - the requested projection schema
Returns:
  the typed schema that should be used to read
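A sketch of calling this method directly, assuming MessageTypeParser from parquet-column to build the file schema; the schemas themselves are made up for illustration.

```java
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class SchemaForReadExample {
  public static void main(String[] args) {
    MessageType fileSchema = MessageTypeParser.parseMessageType(
        "message file { required int64 id; required binary name (UTF8); required double score; }");
    // keep only id and name; the projection is validated against the file schema
    MessageType readSchema = ReadSupport.getSchemaForRead(fileSchema,
        "message projection { required int64 id; required binary name (UTF8); }");
    System.out.println(readSchema);
  }
}
```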
-
getSchemaForRead
public static MessageType getSchemaForRead(MessageType fileMessageType, MessageType projectedMessageType)
-
init
@Deprecated public ReadSupport.ReadContext init(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, MessageType fileSchema)
Deprecated. override init(InitContext) instead

called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

Parameters:
  configuration - the job configuration
  keyValueMetaData - the app specific metadata from the file
  fileSchema - the schema of the file
Returns:
  the readContext that defines how to read the file
-
init
public ReadSupport.ReadContext init(InitContext context)
called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

Parameters:
  context - the initialisation context
Returns:
  the readContext that defines how to read the file
-
prepareForRead
public abstract RecordMaterializer<T> prepareForRead(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, MessageType fileSchema, ReadSupport.ReadContext readContext)
called in RecordReader.initialize(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext) in the back end; the returned RecordMaterializer will materialize the records and add them to the destination

Parameters:
  configuration - the job configuration
  keyValueMetaData - the app specific metadata from the file
  fileSchema - the schema of the file
  readContext - returned by the init method
Returns:
  the recordMaterializer that will materialize the records
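A minimal subclass sketch tying init and prepareForRead together. MyRecord and MyRecordMaterializer (a RecordMaterializer&lt;MyRecord&gt;) are hypothetical user-defined types, not part of this API.

```java
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.hadoop.api.InitContext;
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.io.api.RecordMaterializer;
import org.apache.parquet.schema.MessageType;

public class MyReadSupport extends ReadSupport<MyRecord> {

  @Override
  public ReadContext init(InitContext context) {
    // front end: decide what to read; here we simply request the full file schema
    return new ReadContext(context.getFileSchema());
  }

  @Override
  public RecordMaterializer<MyRecord> prepareForRead(
      Configuration configuration,
      Map<String, String> keyValueMetaData,
      MessageType fileSchema,
      ReadContext readContext) {
    // back end: build a materializer for the schema chosen in init();
    // MyRecordMaterializer is a hypothetical RecordMaterializer<MyRecord>
    return new MyRecordMaterializer(readContext.getRequestedSchema());
  }
}
```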
-
-