Package org.apache.parquet.hadoop
Class ParquetFileWriter
- java.lang.Object
-
- org.apache.parquet.hadoop.ParquetFileWriter
-
public class ParquetFileWriter extends Object
Internal implementation of the Parquet file writer as a block container
-
-
Nested Class Summary
Nested Classes
static class ParquetFileWriter.Mode
-
Field Summary
Fields
static int CURRENT_VERSION
static String EF_MAGIC_STR
static byte[] EFMAGIC
static byte[] MAGIC
static String MAGIC_STR
protected org.apache.parquet.io.PositionOutputStream out
static String PARQUET_COMMON_METADATA_FILE
static String PARQUET_METADATA_FILE
-
Constructor Summary
Constructors
ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file)
    Deprecated. will be removed in 2.0.0
ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode)
    Deprecated. will be removed in 2.0.0
ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize)
    Deprecated. will be removed in 2.0.0
ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize)
    Deprecated. will be removed in 2.0.0
ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled)
ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled, FileEncryptionProperties encryptionProperties)
ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled, InternalFileEncryptor encryptor)
-
Method Summary
Methods
void appendColumnChunk(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.io.SeekableInputStream from, ColumnChunkMetaData chunk, org.apache.parquet.column.values.bloomfilter.BloomFilter bloomFilter, org.apache.parquet.internal.column.columnindex.ColumnIndex columnIndex, org.apache.parquet.internal.column.columnindex.OffsetIndex offsetIndex)
void appendFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file)
    Deprecated. will be removed in 2.0.0; use appendFile(InputFile) instead
void appendFile(org.apache.parquet.io.InputFile file)
void appendRowGroup(org.apache.hadoop.fs.FSDataInputStream from, BlockMetaData rowGroup, boolean dropColumns)
    Deprecated. will be removed in 2.0.0; use appendRowGroup(SeekableInputStream,BlockMetaData,boolean) instead
void appendRowGroup(org.apache.parquet.io.SeekableInputStream from, BlockMetaData rowGroup, boolean dropColumns)
void appendRowGroups(org.apache.hadoop.fs.FSDataInputStream file, List<BlockMetaData> rowGroups, boolean dropColumns)
    Deprecated. will be removed in 2.0.0; use appendRowGroups(SeekableInputStream,List,boolean) instead
void appendRowGroups(org.apache.parquet.io.SeekableInputStream file, List<BlockMetaData> rowGroups, boolean dropColumns)
void end(Map<String,String> extraMetaData)
    ends a file once all blocks have been written.
void endBlock()
    ends a block once all column chunks have been written
void endColumn()
    ends a column (once all repetition levels, definition levels and data have been written)
InternalFileEncryptor getEncryptor()
ParquetMetadata getFooter()
long getNextRowGroupSize()
long getPos()
static ParquetMetadata mergeMetadataFiles(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.conf.Configuration conf)
    Deprecated. metadata files are not recommended and will be removed in 2.0.0
static ParquetMetadata mergeMetadataFiles(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.conf.Configuration conf, KeyValueMetadataMergeStrategy keyValueMetadataMergeStrategy)
    Deprecated. metadata files are not recommended and will be removed in 2.0.0
void start()
    starts the file
void startBlock(long recordCount)
    starts a block
void startColumn(org.apache.parquet.column.ColumnDescriptor descriptor, long valueCount, org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName)
    starts a column inside a block
void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding)
    Deprecated.
void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, long rowCount, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding)
    Writes a single page
void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, long rowCount, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding, org.apache.parquet.format.BlockCipher.Encryptor metadataBlockEncryptor, byte[] pageHeaderAAD)
    Writes a single page
void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding)
    Deprecated. this method does not support writing column indexes; use writeDataPage(int, int, BytesInput, Statistics, long, Encoding, Encoding, Encoding) instead
void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding, org.apache.parquet.format.BlockCipher.Encryptor metadataBlockEncryptor, byte[] pageHeaderAAD)
    writes a single page
void writeDataPageV2(int rowCount, int nullCount, int valueCount, org.apache.parquet.bytes.BytesInput repetitionLevels, org.apache.parquet.bytes.BytesInput definitionLevels, org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.BytesInput compressedData, int uncompressedDataSize, org.apache.parquet.column.statistics.Statistics<?> statistics)
    Writes a single v2 data page
void writeDictionaryPage(org.apache.parquet.column.page.DictionaryPage dictionaryPage)
    writes a dictionary page
void writeDictionaryPage(org.apache.parquet.column.page.DictionaryPage dictionaryPage, org.apache.parquet.format.BlockCipher.Encryptor headerBlockEncryptor, byte[] AAD)
static void writeMergedMetadataFile(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path outputPath, org.apache.hadoop.conf.Configuration conf)
    Deprecated. metadata files are not recommended and will be removed in 2.0.0
static void writeMetadataFile(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path outputPath, List<Footer> footers)
    Deprecated. metadata files are not recommended and will be removed in 2.0.0
static void writeMetadataFile(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path outputPath, List<Footer> footers, ParquetOutputFormat.JobSummaryLevel level)
    Deprecated. metadata files are not recommended and will be removed in 2.0.0
-
-
-
Field Detail
-
PARQUET_METADATA_FILE
public static final String PARQUET_METADATA_FILE
- See Also:
- Constant Field Values
-
MAGIC_STR
public static final String MAGIC_STR
- See Also:
- Constant Field Values
-
MAGIC
public static final byte[] MAGIC
-
EF_MAGIC_STR
public static final String EF_MAGIC_STR
- See Also:
- Constant Field Values
-
EFMAGIC
public static final byte[] EFMAGIC
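This page does not list the constant values. In the Parquet file format, a file with a plaintext footer begins and ends with the 4-byte ASCII marker PAR1, and a file with an encrypted footer uses PARE instead. The sketch below is a stdlib-only illustration of checking those markers on a byte buffer; the class name MagicCheck, the method classify, and the hard-coded marker values are assumptions made for the example, not members of ParquetFileWriter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MagicCheck {
    // Assumed values, taken from the Parquet format rather than from this page.
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    static final byte[] EFMAGIC = "PARE".getBytes(StandardCharsets.US_ASCII);

    /** Classifies a buffer by the 4-byte markers at its head and tail. */
    static String classify(byte[] file) {
        if (file.length < 8) {
            return "unknown"; // too short to hold both markers
        }
        byte[] head = Arrays.copyOfRange(file, 0, 4);
        byte[] tail = Arrays.copyOfRange(file, file.length - 4, file.length);
        if (Arrays.equals(head, MAGIC) && Arrays.equals(tail, MAGIC)) {
            return "plain";
        }
        if (Arrays.equals(head, EFMAGIC) && Arrays.equals(tail, EFMAGIC)) {
            return "encrypted-footer";
        }
        return "unknown";
    }
}
```

A check like this only confirms the framing markers; it says nothing about whether the footer between them is valid.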
-
PARQUET_COMMON_METADATA_FILE
public static final String PARQUET_COMMON_METADATA_FILE
- See Also:
- Constant Field Values
-
CURRENT_VERSION
public static final int CURRENT_VERSION
- See Also:
- Constant Field Values
-
out
protected final org.apache.parquet.io.PositionOutputStream out
-
-
Constructor Detail
-
ParquetFileWriter
@Deprecated public ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file) throws IOException
Deprecated. will be removed in 2.0.0
- Parameters:
configuration - Hadoop configuration
schema - the schema of the data
file - the file to write to
- Throws:
IOException - if the file cannot be created
-
ParquetFileWriter
@Deprecated public ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode) throws IOException
Deprecated. will be removed in 2.0.0
- Parameters:
configuration - Hadoop configuration
schema - the schema of the data
file - the file to write to
mode - file creation mode
- Throws:
IOException - if the file cannot be created
-
ParquetFileWriter
@Deprecated public ParquetFileWriter(org.apache.hadoop.conf.Configuration configuration, org.apache.parquet.schema.MessageType schema, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize) throws IOException
Deprecated. will be removed in 2.0.0
- Parameters:
configuration - Hadoop configuration
schema - the schema of the data
file - the file to write to
mode - file creation mode
rowGroupSize - the row group size
maxPaddingSize - the maximum padding
- Throws:
IOException - if the file cannot be created
-
ParquetFileWriter
@Deprecated public ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize) throws IOException
Deprecated. will be removed in 2.0.0
- Parameters:
file - OutputFile to create or overwrite
schema - the schema of the data
mode - file creation mode
rowGroupSize - the row group size
maxPaddingSize - the maximum padding
- Throws:
IOException - if the file cannot be created
-
ParquetFileWriter
public ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled) throws IOException
- Parameters:
file - OutputFile to create or overwrite
schema - the schema of the data
mode - file creation mode
rowGroupSize - the row group size
maxPaddingSize - the maximum padding
columnIndexTruncateLength - the length to which the writer tries to truncate the min/max values in column indexes
statisticsTruncateLength - the length to which the writer tries to truncate the min/max values in row groups
pageWriteChecksumEnabled - whether to write out page level checksums
- Throws:
IOException - if the file cannot be created
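The two truncate-length parameters say the writer "tries" to truncate min/max values, which suggests truncation can fail. A minimal sketch of why, under the assumption that values are compared as unsigned byte strings: a minimum can simply be cut to a prefix, but a maximum must be cut and then incremented so it still bounds the original from above. The class BoundTruncation and its methods are hypothetical and are not the library's implementation, which may apply type-specific rules.

```java
import java.util.Arrays;

public class BoundTruncation {
    /** A prefix sorts at or before the original, so it remains a valid lower bound. */
    static byte[] truncateMin(byte[] min, int length) {
        return min.length <= length ? min : Arrays.copyOf(min, length);
    }

    /**
     * Cuts to a prefix, then increments the last byte that is not 0xFF and drops
     * everything after it, so the result sorts after the original.
     * Returns the original when no shorter upper bound exists.
     */
    static byte[] truncateMax(byte[] max, int length) {
        if (max.length <= length) {
            return max;
        }
        byte[] prefix = Arrays.copyOf(max, length);
        for (int i = length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                prefix[i]++;
                return Arrays.copyOf(prefix, i + 1);
            }
        }
        return max; // every prefix byte is 0xFF: truncation is not possible
    }
}
```

For example, truncating the maximum "abcdef" to three bytes yields "abd" rather than "abc", because "abc" would sort before the original value.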
-
ParquetFileWriter
public ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled, FileEncryptionProperties encryptionProperties) throws IOException
- Throws:
IOException
-
ParquetFileWriter
public ParquetFileWriter(org.apache.parquet.io.OutputFile file, org.apache.parquet.schema.MessageType schema, ParquetFileWriter.Mode mode, long rowGroupSize, int maxPaddingSize, int columnIndexTruncateLength, int statisticsTruncateLength, boolean pageWriteChecksumEnabled, InternalFileEncryptor encryptor) throws IOException
- Throws:
IOException
-
-
Method Detail
-
start
public void start() throws IOException
starts the file
- Throws:
IOException - if there is an error while writing
-
getEncryptor
public InternalFileEncryptor getEncryptor()
-
startBlock
public void startBlock(long recordCount) throws IOException
starts a block
- Parameters:
recordCount - the record count in this block
- Throws:
IOException - if there is an error while writing
-
startColumn
public void startColumn(org.apache.parquet.column.ColumnDescriptor descriptor, long valueCount, org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName) throws IOException
starts a column inside a block
- Parameters:
descriptor - the column descriptor
valueCount - the value count in this column
compressionCodecName - a compression codec name
- Throws:
IOException - if there is an error while writing
-
writeDictionaryPage
public void writeDictionaryPage(org.apache.parquet.column.page.DictionaryPage dictionaryPage) throws IOException
writes a dictionary page
- Parameters:
dictionaryPage - the dictionary page
- Throws:
IOException - if there is an error while writing
-
writeDictionaryPage
public void writeDictionaryPage(org.apache.parquet.column.page.DictionaryPage dictionaryPage, org.apache.parquet.format.BlockCipher.Encryptor headerBlockEncryptor, byte[] AAD) throws IOException
- Throws:
IOException
-
writeDataPage
@Deprecated public void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding) throws IOException
Deprecated.
writes a single page
- Parameters:
valueCount - count of values
uncompressedPageSize - the size of the data once uncompressed
bytes - the compressed data for the page without header
rlEncoding - encoding of the repetition level
dlEncoding - encoding of the definition level
valuesEncoding - encoding of values
- Throws:
IOException - if there is an error while writing
-
writeDataPage
@Deprecated public void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding) throws IOException
Deprecated. this method does not support writing column indexes; use writeDataPage(int, int, BytesInput, Statistics, long, Encoding, Encoding, Encoding) instead
writes a single page
- Parameters:
valueCount - count of values
uncompressedPageSize - the size of the data once uncompressed
bytes - the compressed data for the page without header
statistics - statistics for the page
rlEncoding - encoding of the repetition level
dlEncoding - encoding of the definition level
valuesEncoding - encoding of values
- Throws:
IOException - if there is an error while writing
-
writeDataPage
public void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, long rowCount, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding) throws IOException
Writes a single page
- Parameters:
valueCount - count of values
uncompressedPageSize - the size of the data once uncompressed
bytes - the compressed data for the page without header
statistics - the statistics of the page
rowCount - the number of rows in the page
rlEncoding - encoding of the repetition level
dlEncoding - encoding of the definition level
valuesEncoding - encoding of values
- Throws:
IOException - if any I/O error occurs while writing the file
-
writeDataPage
public void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, long rowCount, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding, org.apache.parquet.format.BlockCipher.Encryptor metadataBlockEncryptor, byte[] pageHeaderAAD) throws IOException
Writes a single page
- Parameters:
valueCount - count of values
uncompressedPageSize - the size of the data once uncompressed
bytes - the compressed data for the page without header
statistics - the statistics of the page
rowCount - the number of rows in the page
rlEncoding - encoding of the repetition level
dlEncoding - encoding of the definition level
valuesEncoding - encoding of values
metadataBlockEncryptor - encryptor for block data
pageHeaderAAD - pageHeader AAD
- Throws:
IOException - if any I/O error occurs while writing the file
-
writeDataPage
public void writeDataPage(int valueCount, int uncompressedPageSize, org.apache.parquet.bytes.BytesInput bytes, org.apache.parquet.column.statistics.Statistics statistics, org.apache.parquet.column.Encoding rlEncoding, org.apache.parquet.column.Encoding dlEncoding, org.apache.parquet.column.Encoding valuesEncoding, org.apache.parquet.format.BlockCipher.Encryptor metadataBlockEncryptor, byte[] pageHeaderAAD) throws IOException
writes a single page
- Parameters:
valueCount - count of values
uncompressedPageSize - the size of the data once uncompressed
bytes - the compressed data for the page without header
statistics - statistics for the page
rlEncoding - encoding of the repetition level
dlEncoding - encoding of the definition level
valuesEncoding - encoding of values
metadataBlockEncryptor - encryptor for block data
pageHeaderAAD - pageHeader AAD
- Throws:
IOException - if there is an error while writing
-
writeDataPageV2
public void writeDataPageV2(int rowCount, int nullCount, int valueCount, org.apache.parquet.bytes.BytesInput repetitionLevels, org.apache.parquet.bytes.BytesInput definitionLevels, org.apache.parquet.column.Encoding dataEncoding, org.apache.parquet.bytes.BytesInput compressedData, int uncompressedDataSize, org.apache.parquet.column.statistics.Statistics<?> statistics) throws IOException
Writes a single v2 data page
- Parameters:
rowCount - count of rows
nullCount - count of nulls
valueCount - count of values
repetitionLevels - repetition level bytes
definitionLevels - definition level bytes
dataEncoding - encoding for data
compressedData - compressed data bytes
uncompressedDataSize - the size of uncompressed data
statistics - the statistics of the page
- Throws:
IOException - if any I/O error occurs while writing the file
-
endColumn
public void endColumn() throws IOException
ends a column (once all repetition levels, definition levels and data have been written)
- Throws:
IOException - if there is an error while writing
-
endBlock
public void endBlock() throws IOException
ends a block once all column chunks have been written
- Throws:
IOException - if there is an error while writing
-
appendFile
@Deprecated public void appendFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file) throws IOException
Deprecated. will be removed in 2.0.0; use appendFile(InputFile) instead
- Parameters:
conf - a configuration
file - path of the file whose contents are appended to this file
- Throws:
IOException - if there is an error while reading or writing
-
appendFile
public void appendFile(org.apache.parquet.io.InputFile file) throws IOException
- Throws:
IOException
-
appendRowGroups
@Deprecated public void appendRowGroups(org.apache.hadoop.fs.FSDataInputStream file, List<BlockMetaData> rowGroups, boolean dropColumns) throws IOException
Deprecated. will be removed in 2.0.0; use appendRowGroups(SeekableInputStream,List,boolean) instead
- Parameters:
file - a file stream to read from
rowGroups - row groups to copy
dropColumns - whether to drop columns from the file that are not in this file's schema
- Throws:
IOException - if there is an error while reading or writing
-
appendRowGroups
public void appendRowGroups(org.apache.parquet.io.SeekableInputStream file, List<BlockMetaData> rowGroups, boolean dropColumns) throws IOException
- Throws:
IOException
-
appendRowGroup
@Deprecated public void appendRowGroup(org.apache.hadoop.fs.FSDataInputStream from, BlockMetaData rowGroup, boolean dropColumns) throws IOException
Deprecated. will be removed in 2.0.0; use appendRowGroup(SeekableInputStream,BlockMetaData,boolean) instead
- Parameters:
from - a file stream to read from
rowGroup - row group to copy
dropColumns - whether to drop columns from the file that are not in this file's schema
- Throws:
IOException - if there is an error while reading or writing
-
appendRowGroup
public void appendRowGroup(org.apache.parquet.io.SeekableInputStream from, BlockMetaData rowGroup, boolean dropColumns) throws IOException
- Throws:
IOException
-
appendColumnChunk
public void appendColumnChunk(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.io.SeekableInputStream from, ColumnChunkMetaData chunk, org.apache.parquet.column.values.bloomfilter.BloomFilter bloomFilter, org.apache.parquet.internal.column.columnindex.ColumnIndex columnIndex, org.apache.parquet.internal.column.columnindex.OffsetIndex offsetIndex) throws IOException
- Parameters:
descriptor - the descriptor for the target column
from - a file stream to read from
chunk - the column chunk to be copied
bloomFilter - the bloomFilter for this chunk
columnIndex - the column index for this chunk
offsetIndex - the offset index for this chunk
- Throws:
IOException
-
end
public void end(Map<String,String> extraMetaData) throws IOException
ends a file once all blocks have been written, then closes the file.
- Parameters:
extraMetaData - the extra metadata to write in the footer
- Throws:
IOException - if there is an error while writing
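Taken together, the descriptions of start, startBlock, startColumn, the page-writing methods, endColumn, endBlock and end imply a nesting order: pages inside a column, columns inside a block, blocks inside a file. The sketch below models that order as a small state machine using only the standard library. The class WriterLifecycle, its State enum, and the no-argument methods are hypothetical stand-ins for illustration; they are not the library's implementation, and this page does not state how ParquetFileWriter reacts to out-of-order calls.

```java
import java.util.ArrayList;
import java.util.List;

public class WriterLifecycle {
    // Hypothetical states; the names are assumptions made for this sketch.
    enum State { NOT_STARTED, STARTED, BLOCK, COLUMN, ENDED }

    private State state = State.NOT_STARTED;
    private final List<String> calls = new ArrayList<>();

    // Allows the call only in the expected state, then moves to the next state.
    private void move(State expected, State next, String call) {
        if (state != expected) {
            throw new IllegalStateException(call + " is not valid in state " + state);
        }
        state = next;
        calls.add(call);
    }

    void start()         { move(State.NOT_STARTED, State.STARTED, "start"); }
    void startBlock()    { move(State.STARTED, State.BLOCK, "startBlock"); }
    void startColumn()   { move(State.BLOCK, State.COLUMN, "startColumn"); }
    void writeDataPage() { move(State.COLUMN, State.COLUMN, "writeDataPage"); }
    void endColumn()     { move(State.COLUMN, State.BLOCK, "endColumn"); }
    void endBlock()      { move(State.BLOCK, State.STARTED, "endBlock"); }
    void end()           { move(State.STARTED, State.ENDED, "end"); }

    List<String> calls() { return calls; }
}
```

In this model a valid sequence is start, startBlock, startColumn, one or more writeDataPage calls, endColumn, endBlock, end; blocks and columns may repeat before end is called.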
-
getFooter
public ParquetMetadata getFooter()
-
mergeMetadataFiles
@Deprecated public static ParquetMetadata mergeMetadataFiles(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.conf.Configuration conf) throws IOException
Deprecated. metadata files are not recommended and will be removed in 2.0.0
Given a list of metadata files, merge them into a single ParquetMetadata. Requires that the schemas be compatible, and the extraMetadata be exactly equal.
- Parameters:
files - a list of files to merge metadata from
conf - a configuration
- Returns:
- merged parquet metadata for the files
- Throws:
IOException - if there is an error while writing
-
mergeMetadataFiles
@Deprecated public static ParquetMetadata mergeMetadataFiles(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.conf.Configuration conf, KeyValueMetadataMergeStrategy keyValueMetadataMergeStrategy) throws IOException
Deprecated. metadata files are not recommended and will be removed in 2.0.0
Given a list of metadata files, merge them into a single ParquetMetadata. Requires that the schemas be compatible, and the extraMetadata be exactly equal.
- Parameters:
files - a list of files to merge metadata from
conf - a configuration
keyValueMetadataMergeStrategy - strategy to merge values for same key, if there are multiple
- Returns:
- merged parquet metadata for the files
- Throws:
IOException - if there is an error while writing
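The merge description leaves open what happens when the same extra-metadata key carries different values in different files. One possible policy, shown here purely as an illustration, is a strict merge that accepts a repeated key only when its value is identical everywhere and fails otherwise. The class StrictKeyValueMerge and its merge method are hypothetical; they do not implement KeyValueMetadataMergeStrategy, whose interface is not shown on this page.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StrictKeyValueMerge {
    /** Merges key-value maps, failing if any key maps to two different values. */
    static Map<String, String> merge(List<Map<String, String>> inputs) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> input : inputs) {
            for (Map.Entry<String, String> entry : input.entrySet()) {
                // putIfAbsent returns the earlier value when the key was already present.
                String previous = merged.putIfAbsent(entry.getKey(), entry.getValue());
                if (previous != null && !previous.equals(entry.getValue())) {
                    throw new IllegalArgumentException(
                        "conflicting values for key " + entry.getKey());
                }
            }
        }
        return merged;
    }
}
```

A more lenient policy could instead collect the distinct values per key and join them; which behavior is appropriate depends on how readers of the merged metadata interpret each key.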
-
writeMergedMetadataFile
@Deprecated public static void writeMergedMetadataFile(List<org.apache.hadoop.fs.Path> files, org.apache.hadoop.fs.Path outputPath, org.apache.hadoop.conf.Configuration conf) throws IOException
Deprecated. metadata files are not recommended and will be removed in 2.0.0
Given a list of metadata files, merge them into a single metadata file. Requires that the schemas be compatible, and the extraMetaData be exactly equal. This is useful when merging 2 directories of parquet files into a single directory, as long as both directories were written with compatible schemas and equal extraMetaData.
- Parameters:
files - a list of files to merge metadata from
outputPath - path to write merged metadata to
conf - a configuration
- Throws:
IOException - if there is an error while reading or writing
-
writeMetadataFile
@Deprecated public static void writeMetadataFile(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path outputPath, List<Footer> footers) throws IOException
Deprecated. metadata files are not recommended and will be removed in 2.0.0
writes a _metadata and _common_metadata file
- Parameters:
configuration - the configuration to use to get the FileSystem
outputPath - the directory to write the _metadata file to
footers - the list of footers to merge
- Throws:
IOException - if there is an error while writing
-
writeMetadataFile
@Deprecated public static void writeMetadataFile(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path outputPath, List<Footer> footers, ParquetOutputFormat.JobSummaryLevel level) throws IOException
Deprecated. metadata files are not recommended and will be removed in 2.0.0
writes _common_metadata file, and optionally a _metadata file depending on the ParquetOutputFormat.JobSummaryLevel provided
- Parameters:
configuration - the configuration to use to get the FileSystem
outputPath - the directory to write the _metadata file to
footers - the list of footers to merge
level - level of summary to write
- Throws:
IOException - if there is an error while writing
-
getPos
public long getPos() throws IOException
- Returns:
- the current position in the underlying file
- Throws:
IOException - if there is an error while getting the current stream's position
-
getNextRowGroupSize
public long getNextRowGroupSize() throws IOException
- Throws:
IOException
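This page does not describe what getNextRowGroupSize() returns or how the rowGroupSize and maxPaddingSize constructor parameters interact. One plausible reading, offered as an assumption rather than documented behavior, is that the writer aligns row groups to a boundary: if the bytes remaining before the next boundary fit within maxPaddingSize, it pads up to the boundary and starts a full-size row group; otherwise the next row group is limited to the remaining space. The class RowGroupPadding, its methods, and the alignment parameter are hypothetical names used only for this sketch.

```java
public class RowGroupPadding {
    /** Bytes of padding to write at position pos, or 0 if the gap exceeds the limit. */
    static long paddingBytes(long pos, long alignment, int maxPaddingSize) {
        long remaining = alignment - (pos % alignment);
        return remaining <= maxPaddingSize ? remaining : 0;
    }

    /** Target size for the next row group under the assumed alignment policy. */
    static long nextRowGroupSize(long pos, long alignment, long rowGroupSize, int maxPaddingSize) {
        if (maxPaddingSize <= 0) {
            return rowGroupSize; // padding disabled: always use the full size
        }
        long remaining = alignment - (pos % alignment);
        if (remaining <= maxPaddingSize) {
            return rowGroupSize; // the gap will be padded, so a full row group follows
        }
        return Math.min(remaining, rowGroupSize); // fit the row group before the boundary
    }
}
```

With an alignment and row group size of 100 and a padding limit of 10, a writer at position 95 would pad 5 bytes and start a full row group, while a writer at position 50 would target a 50-byte row group instead.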
-
-