Package org.apache.parquet.hadoop
Class UnmaterializableRecordCounter

java.lang.Object
  org.apache.parquet.hadoop.UnmaterializableRecordCounter

public class UnmaterializableRecordCounter extends Object
Tracks the number of records that cannot be materialized and throws ParquetDecodingException if the rate of errors crosses a limit. These errors are meant to be recoverable record-conversion errors, such as a union missing a value or a schema mismatch. It is not meant to recover from corruption in the Parquet columns themselves. The intention is to skip over very rare file corruption, or bugs where the write path has allowed invalid records into the file, while still catching large numbers of failures. Not turned on by default (by default, no errors are tolerated).
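The threshold semantics described above can be illustrated with a minimal, self-contained sketch. This is not the real class (which throws ParquetDecodingException and takes a RecordMaterializationException cause); the class name and boolean return here are illustrative only:

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch (not the actual implementation) of the check
 * UnmaterializableRecordCounter performs: count conversion failures
 * and fail once failures / totalNumRecords exceeds the threshold.
 */
class BadRecordCounterSketch {
    private final double errorThreshold;   // tolerated fraction of bad records, e.g. 0.01
    private final long totalNumRecords;
    private final AtomicLong numErrors = new AtomicLong();

    BadRecordCounterSketch(double errorThreshold, long totalNumRecords) {
        this.errorThreshold = errorThreshold;
        this.totalNumRecords = totalNumRecords;
    }

    /**
     * Returns true while the error rate stays within the threshold.
     * The real incErrors(...) instead throws ParquetDecodingException
     * when the rate is exceeded.
     */
    boolean incErrors() {
        double errorRate = numErrors.incrementAndGet() / (double) totalNumRecords;
        return errorRate <= errorThreshold;
    }
}
```

With the default threshold of 0, the very first call would already exceed the limit, matching the documented default of tolerating no errors.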

Field Summary

  Modifier and Type    Field
  static String        BAD_RECORD_THRESHOLD_CONF_KEY

Constructor Summary

  UnmaterializableRecordCounter(double errorThreshold, long totalNumRecords)
  UnmaterializableRecordCounter(org.apache.hadoop.conf.Configuration conf, long totalNumRecords)
  UnmaterializableRecordCounter(ParquetReadOptions options, long totalNumRecords)

Method Summary

  Modifier and Type    Method
  void                 incErrors(org.apache.parquet.io.api.RecordMaterializer.RecordMaterializationException cause)

Field Detail

BAD_RECORD_THRESHOLD_CONF_KEY

  public static final String BAD_RECORD_THRESHOLD_CONF_KEY

  See Also:
    Constant Field Values

Constructor Detail

UnmaterializableRecordCounter

  public UnmaterializableRecordCounter(org.apache.hadoop.conf.Configuration conf, long totalNumRecords)

UnmaterializableRecordCounter

  public UnmaterializableRecordCounter(ParquetReadOptions options, long totalNumRecords)

UnmaterializableRecordCounter

  public UnmaterializableRecordCounter(double errorThreshold, long totalNumRecords)
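The Configuration-based constructor presumably derives its threshold from the BAD_RECORD_THRESHOLD_CONF_KEY setting, defaulting to 0 (no errors tolerated) when the key is absent. A stdlib sketch of that lookup, using a plain Map in place of Hadoop's Configuration; the key string shown is an assumption, since this page does not state the constant's value:

```java
import java.util.Map;

/**
 * Hypothetical sketch of how a threshold might be read from
 * configuration. The key string is an assumed value of
 * BAD_RECORD_THRESHOLD_CONF_KEY, not confirmed by this page.
 */
class ThresholdLookupSketch {
    static final String BAD_RECORD_THRESHOLD_CONF_KEY =
            "parquet.read.bad.record.threshold"; // assumption

    /** Missing key means the default of 0.0: tolerate no errors. */
    static double thresholdFrom(Map<String, String> conf) {
        String raw = conf.get(BAD_RECORD_THRESHOLD_CONF_KEY);
        return raw == null ? 0.0 : Double.parseDouble(raw);
    }
}
```

In a real job you would set the key on the Hadoop Configuration (or via ParquetReadOptions) before opening the reader, rather than calling a helper like this directly.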