public class GoogleHadoopSyncableOutputStream extends OutputStream implements org.apache.hadoop.fs.Syncable
GoogleHadoopSyncableOutputStream implements the Syncable interface by composing objects created in separate underlying streams for each hsync() call.
Prior to the first hsync(), sync() or close() call, this channel will behave the same way as a basic non-syncable channel, writing directly to the destination file.
On the first call to hsync()/sync(), the destination file is committed and a new temporary file is created. Its name uses a hidden-file prefix (underscore) plus an additional suffix that differs for each subsequent temporary file in the series. During this time readers can read the data committed to the destination file, but not the bytes written to the temporary file since the last hsync() call.
On each subsequent hsync()/sync() call, the temporary file is closed, composed onto the destination file, and then deleted; a new temporary file is then opened under a new filename for further writes.
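The sequence described above can be sketched with a small in-memory model. This is a toy illustration only, not the connector's implementation: the class `SyncableStreamModel`, its methods, and the `_tmp_N` naming are hypothetical, and the model assumes that readers see only committed bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of the compose-on-sync lifecycle. Hypothetical, not the connector's code. */
class SyncableStreamModel {
    private final StringBuilder committed = new StringBuilder(); // bytes visible to readers
    private final StringBuilder pending = new StringBuilder();   // bytes written since the last sync
    private final List<String> operations = new ArrayList<>();   // record of simulated storage operations
    private String currentTempName = null;                       // null until the first hsync()
    private int tempCounter = 0;                                  // stands in for the per-file suffix

    void write(String data) {
        pending.append(data);
    }

    void hsync() {
        if (currentTempName == null) {
            // First sync: the bytes were written straight to the destination, so commit it.
            operations.add("commit destination");
        } else {
            // Later syncs: close the temporary file, compose it onto the destination, delete it.
            operations.add("close " + currentTempName);
            operations.add("compose " + currentTempName);
            operations.add("delete " + currentTempName);
        }
        committed.append(pending);
        pending.setLength(0);
        // Open a fresh temporary file under a new (hypothetical) name for further writes.
        tempCounter++;
        currentTempName = "_tmp_" + tempCounter;
        operations.add("open " + currentTempName);
    }

    String readerView() {
        return committed.toString();
    }

    List<String> operations() {
        return operations;
    }
}

class SyncableStreamModelDemo {
    public static void main(String[] args) {
        SyncableStreamModel stream = new SyncableStreamModel();
        stream.write("a");
        stream.hsync();
        stream.write("b");
        System.out.println(stream.readerView()); // prints "a": "b" is not yet synced
        stream.hsync();
        System.out.println(stream.readerView()); // prints "ab"
        System.out.println(stream.operations());
    }
}
```

The operation log makes the per-sync cost visible: every sync after the first maps to several sequential storage requests in this model.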
Caveat: each hsync()/sync() requires many underlying read and mutation requests occurring sequentially, so latency is expected to be fairly high.
If errors occur mid-stream, one or more temporary files may fail to be cleaned up, and manual intervention is then required to discover and delete any such unused files. Data written prior to the most recent successful hsync() is persistent and safe in such a case.
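As a rough aid for that manual cleanup, the sketch below filters a listing of object paths for names that start with a temporary-file prefix. It is only an illustration: the helper `findCandidates`, the sample paths, and the prefix `_SYNC_TMP_` are all hypothetical, and the real prefix should be taken from TEMPFILE_PREFIX. It also assumes the prefix applies to the last path component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that picks out likely leftover temporary files from a listing. */
class LeftoverTempFileFinder {
    static List<String> findCandidates(List<String> objectPaths, String tempPrefix) {
        List<String> candidates = new ArrayList<>();
        for (String path : objectPaths) {
            // Compare against the last path component only.
            String baseName = path.substring(path.lastIndexOf('/') + 1);
            if (baseName.startsWith(tempPrefix)) {
                candidates.add(path);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Sample listing with made-up names; "_SYNC_TMP_" is an assumed prefix.
        List<String> listing = Arrays.asList(
            "logs/events.log",
            "logs/_SYNC_TMP_events.log.1",
            "logs/_SUCCESS");
        System.out.println(findCandidates(listing, "_SYNC_TMP_")); // prints [logs/_SYNC_TMP_events.log.1]
    }
}
```

Candidates found this way should still be checked by hand before deletion, since an active writer may own a matching file.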
If multiple writers attempt to write to the same destination file, generation IDs used with low-level precondition checks will cause all but one writer to fail their precondition checks during writes, and the single remaining writer will safely occupy the stream.
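The effect of a generation-based precondition can be illustrated with a compare-and-set on a counter. This is a simplified model under stated assumptions, not the storage service's API: `GenerationCheckedStore` and `ModelWriter` are hypothetical names, and the model assumes each successful write advances the generation by one.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an object whose writes are guarded by a generation precondition. */
class GenerationCheckedStore {
    private final AtomicLong generation = new AtomicLong(0);

    /** Applies the write only if the caller's expected generation is still current. */
    boolean writeIfGenerationMatches(long expectedGeneration) {
        return generation.compareAndSet(expectedGeneration, expectedGeneration + 1);
    }

    long currentGeneration() {
        return generation.get();
    }
}

/** Hypothetical writer that remembers the generation produced by its own last write. */
class ModelWriter {
    private final GenerationCheckedStore store;
    private long knownGeneration;

    ModelWriter(GenerationCheckedStore store) {
        this.store = store;
        this.knownGeneration = store.currentGeneration();
    }

    boolean write() {
        if (store.writeIfGenerationMatches(knownGeneration)) {
            knownGeneration++; // our write succeeded, so we know the new generation
            return true;
        }
        return false; // precondition failed: another writer changed the object
    }
}

class GenerationCheckDemo {
    public static void main(String[] args) {
        GenerationCheckedStore store = new GenerationCheckedStore();
        ModelWriter first = new ModelWriter(store);
        ModelWriter second = new ModelWriter(store);
        System.out.println(first.write());  // prints true
        System.out.println(second.write()); // prints false: its generation is stale
        System.out.println(first.write());  // prints true: the first writer keeps the stream
    }
}
```

In this model the losing writer never recovers, because it never learns the winner's generation, which matches the described outcome of a single remaining writer.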
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `TEMPFILE_PREFIX` |
| Constructor and Description |
|---|
| `GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions, SyncableOutputStreamOptions options)` Creates a new GoogleHadoopSyncableOutputStream. |
| Modifier and Type | Method and Description |
|---|---|
| `void` | `close()` |
| `void` | `hflush()` There is no way to flush data so that it becomes available to readers without a full-fledged hsync(). If the output stream is only syncable, this method is a no-op. |
| `void` | `hsync()` |
| `void` | `sync()` |
| `void` | `write(byte[] b, int offset, int len)` |
| `void` | `write(int b)` |
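The method details for hflush() state that, when rate limited, it does nothing, whereas hsync() tries to acquire the permits and blocks. The contrast can be sketched with a `java.util.concurrent.Semaphore` standing in for the rate limiter. This is a hedged illustration: `RateLimitedSyncModel`, `refill()`, and the boolean return value of `hflush()` are hypothetical, and the model assumes the stream is configured so that hflush() attempts a sync at all.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical model: hflush() is skipped when no permit is free, hsync() waits for one. */
class RateLimitedSyncModel {
    private final Semaphore permits; // stands in for the rate limiter's permits
    private int commits = 0;         // number of syncs actually performed

    RateLimitedSyncModel(int initialPermits) {
        this.permits = new Semaphore(initialPermits);
    }

    /** Non-blocking: returns false and does nothing if rate limited. */
    boolean hflush() {
        if (!permits.tryAcquire()) {
            return false;
        }
        commits++;
        return true;
    }

    /** Blocking: waits until a permit is available, then syncs. */
    void hsync() {
        permits.acquireUninterruptibly();
        commits++;
    }

    /** Makes one more permit available, as the passage of time would in a real limiter. */
    void refill() {
        permits.release();
    }

    int commits() {
        return commits;
    }
}

class RateLimitedSyncDemo {
    public static void main(String[] args) {
        RateLimitedSyncModel stream = new RateLimitedSyncModel(1);
        stream.hsync();                       // consumes the only permit
        System.out.println(stream.hflush());  // prints false: rate limited, nothing happens
        stream.refill();
        System.out.println(stream.hflush());  // prints true: a permit was available
        System.out.println(stream.commits()); // prints 2
    }
}
```

A caller that needs data to be durable at a specific point should therefore rely on hsync() rather than hflush().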
Methods inherited from class OutputStream: flush, write

public static final String TEMPFILE_PREFIX

public GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions, SyncableOutputStreamOptions options) throws IOException
Throws: IOException

public void write(int b) throws IOException
write in class OutputStream
Throws: IOException

public void write(byte[] b, int offset, int len) throws IOException
write in class OutputStream
Throws: IOException

public void close() throws IOException
close in interface Closeable
close in interface AutoCloseable
close in class OutputStream
Throws: IOException

public void sync() throws IOException
Throws: IOException

public void hflush() throws IOException
If rate limited, this method does nothing, unlike hsync(), which tries to acquire the permits and blocks.
hflush in interface org.apache.hadoop.fs.Syncable
Throws: IOException

public void hsync() throws IOException
hsync in interface org.apache.hadoop.fs.Syncable
Throws: IOException

Copyright © 2024. All rights reserved.