When a program reads data from a file continuously and sequentially in a fixed sequence of data types (e.g. three integers, then two doubles, repeatedly), the programmer may find MappedByteBuffer from the Java library useful.
MappedByteBuffer is useful because its content is a memory-mapped region of a file, so the program can directly access any position within the mapped region. It also optimizes loading: part of the content may be loaded into memory before the read is actually performed. But when will the data be loaded? That is unspecified and depends on the operating system. This is the first problem with MappedByteBuffer: is the data buffered?
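For reference, here is a minimal sketch of the kind of sequential read described above, done with MappedByteBuffer. The class name MappedRecordDemo, the helper names and the record layout (three ints followed by two doubles) are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedRecordDemo {
    // Hypothetical record layout: three ints followed by two doubles
    static final int RECORD_BYTES = 3 * Integer.BYTES + 2 * Double.BYTES;

    // Write `count` sample records to a temporary file and return its path
    static Path writeSample(int count) throws IOException {
        Path file = Files.createTempFile("records", ".bin");
        ByteBuffer out = ByteBuffer.allocate(count * RECORD_BYTES);
        for (int i = 0; i < count; i++) {
            out.putInt(i).putInt(i * 10).putInt(i * 100);
            out.putDouble(i + 0.5).putDouble(i + 0.25);
        }
        Files.write(file, out.array());
        return file;
    }

    // Map the whole file read-only and sum the first int of every record
    static long sumFirstInts(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (mapped.remaining() >= RECORD_BYTES) {
                sum += mapped.getInt();   // first int of the record
                mapped.getInt();          // second int, skipped
                mapped.getInt();          // third int, skipped
                mapped.getDouble();       // first double, skipped
                mapped.getDouble();       // second double, skipped
            }
            return sum;
        }
    }
}
```

Nothing in this code says when the operating system actually pages the file content into memory, which is exactly the uncertainty mentioned above.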
The second problem arises when the programmer uses MappedByteBuffer to read a big file. A single MappedByteBuffer can cover at most Integer.MAX_VALUE bytes (2,147,483,647). To read beyond that, the program must re-map the file by calling FileChannel.map and passing in the correct file position. The programmer needs to be very careful to pass the correct position, because keeping track of it can get confusing.
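To show the bookkeeping that re-mapping requires, here is a rough sketch that reads a file of longs window by window. The names WindowedMapDemo, mapWindow, sumLongs and writeLongs are hypothetical, and the window size is a parameter so the logic can be tried on a small file; real code would use a window close to Integer.MAX_VALUE.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WindowedMapDemo {
    // Write the longs 1..count to a temporary file and return its path
    static Path writeLongs(int count) throws IOException {
        Path file = Files.createTempFile("longs", ".bin");
        ByteBuffer out = ByteBuffer.allocate(count * Long.BYTES);
        for (long i = 1; i <= count; i++) {
            out.putLong(i);
        }
        Files.write(file, out.array());
        return file;
    }

    // Map up to windowSize bytes starting at `position`, never past end of file
    static MappedByteBuffer mapWindow(FileChannel channel, long position, int windowSize)
            throws IOException {
        long length = Math.min(windowSize, channel.size() - position);
        return channel.map(FileChannel.MapMode.READ_ONLY, position, length);
    }

    // Sum every long in the file by re-mapping one window after another
    static long sumLongs(Path file, int windowSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long sum = 0;
            long position = 0;
            while (position < channel.size()) {
                MappedByteBuffer window = mapWindow(channel, position, windowSize);
                while (window.remaining() >= Long.BYTES) {
                    sum += window.getLong();
                }
                // Advance only by the bytes actually consumed, so a value that
                // straddles the window boundary is read from the next window
                int consumed = window.position();
                if (consumed == 0) {
                    break; // fewer than Long.BYTES bytes left in the file
                }
                position += consumed;
            }
            return sum;
        }
    }
}
```

The easy mistake is to advance the position by the window size instead of by the bytes consumed, which silently skips or splits a value at every window boundary.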
To overcome these two problems, we can write a new class that ensures reads are buffered for the best performance and also allows continuous reading without re-mapping the FileChannel. First, let’s write out the pseudo code for the read action:
[java]
// get data (e.g. get integer)
// ensure the memory buffer has enough data for the requested data type
// if not enough, read data from file
// read the data from buffer and return them in the requested data type
[/java]
Below is an example implementation:
[java]
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class FileChannelReader {
    private static final int BUFFER_SIZE = 64 * 1024; // 64 KB

    private final ByteBuffer readerBuffer;
    private final FileChannel readerFileChannel;

    public FileChannelReader(FileChannel fileChannel) {
        readerBuffer = ByteBuffer.allocate(BUFFER_SIZE);
        readerFileChannel = fileChannel;
        // Start with an empty buffer in read mode (position = limit = 0)
        readerBuffer.clear();
        readerBuffer.flip();
    }

    // Ensure the buffer holds at least `size` readable bytes
    private void ensureData(int size) throws IOException {
        if (readerBuffer.remaining() < size) {
            // Keep the unread bytes and switch to write mode
            readerBuffer.compact();
            // A single read may return fewer bytes than needed, so loop
            while (readerBuffer.position() < size) {
                if (readerFileChannel.read(readerBuffer) <= 0)
                    throw new IOException("Unexpected end-of-stream");
            }
            readerBuffer.flip();
        }
    }

    // Get current position
    public long position() throws IOException {
        return readerFileChannel.position() - readerBuffer.remaining();
    }

    // Set current position
    public void position(long position) throws IOException {
        readerFileChannel.position(position);
        // Discard buffered data; it belongs to the old position
        readerBuffer.clear();
        readerBuffer.flip();
    }

    // Get integer
    public int getInt() throws IOException {
        ensureData(Integer.SIZE / 8);
        return readerBuffer.getInt();
    }

    // Get long
    public long getLong() throws IOException {
        ensureData(Long.SIZE / 8);
        return readerBuffer.getLong();
    }

    // And so on …
}
[/java]
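The compact-then-flip step in ensureData is what keeps a value that was only partly buffered from being lost. Here is a small sketch of that step in isolation; the class name CompactFlipDemo is made up for illustration, and a plain put stands in for the FileChannel.read call.

```java
import java.nio.ByteBuffer;

class CompactFlipDemo {
    // Returns {remaining before refill, remaining after refill, int read after refill}
    static int[] refill() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.put(new byte[] {0, 0, 0, 1, 0, 0}); // one whole int plus half of the next
        buffer.flip();                             // read mode: 6 bytes readable
        buffer.getInt();                           // consume the first int
        int before = buffer.remaining();           // 2 unread bytes are left

        buffer.compact();                          // move the unread bytes to the front, write mode
        buffer.put(new byte[] {0, 7});             // stands in for FileChannel.read(buffer)
        buffer.flip();                             // read mode again
        int after = buffer.remaining();            // leftover bytes plus the new ones

        return new int[] {before, after, buffer.getInt()};
    }
}
```

Calling clear() instead of compact() at that point would discard the two leftover bytes, and the next getInt() would return a value assembled from the wrong offset.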
I’m using this technique in my programming projects and it performs pretty well. The data to be read is always buffered, and the class reduces the complexity of writing a program that reads data sequentially.