How to – Implement Random File Access

by Retired 05-17-2011 02:13 PM - edited 05-17-2011 02:16 PM

Random file access is the ability to move to an arbitrary position within a file while reading or writing. This article discusses two ways to change the current position within a file: the new API available in BlackBerry® Device Software 5.0, and an implementation that also works on earlier BlackBerry Device Software versions. Full source code of the implementation is also provided.


Random File Access in BlackBerry Device Software 5.0

Starting in BlackBerry Device Software 5.0, a new interface for changing the current position becomes available. This interface offers two methods, getPosition() and setPosition(long), to query and change the current position. Using the methods is quite straightforward, but it is important to note that the interface is only implemented by DataInputStream. Therefore, the interface doesn’t offer any benefit for writing files, or for reading bytes directly from an InputStream. The implementation shared in the rest of this article covers those cases through an alternate approach.


Random File Access using FileConnection

The basic requirements for changing the current position in a file have long been available through FileConnection and the related InputStream and OutputStream classes. Given a FileConnection named file, the InputStream to read from the file is returned by file.openInputStream(). This opens the stream at the start of the file. The skip(long) method on the InputStream skips a number of bytes in the file. The number must be non-negative, but the reset() method returns the position to the beginning of the file. By combining these methods, it’s possible to move to any location within the file while reading.
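
The skip-and-reset approach described above can be sketched as a small helper. This is a minimal sketch rather than the implementation linked to this article: the class name StreamSeeker and its methods are hypothetical, it relies only on standard InputStream methods, and it assumes that reset() returns the stream to the start of the file as described above. A ByteArrayInputStream stands in for the stream returned by file.openInputStream() so the example can run off the device.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: tracks the current position of an InputStream and
// moves it to an absolute offset using only skip() and reset().
class StreamSeeker {
    private final InputStream in;
    private long position = 0; // bytes consumed since the start of the stream

    StreamSeeker(InputStream in) {
        this.in = in;
    }

    // Moves to an absolute offset. Moving backwards needs reset(), which is
    // assumed here to return the stream to the start of the file.
    void seek(long target) throws IOException {
        if (target < 0) {
            throw new IllegalArgumentException("negative offset");
        }
        if (target < position) {
            in.reset();
            position = 0;
        }
        // skip() may skip fewer bytes than requested, so loop until done.
        while (position < target) {
            long skipped = in.skip(target - position);
            if (skipped <= 0) {
                throw new IOException("could not skip to offset " + target);
            }
            position += skipped;
        }
    }

    // Reads one byte at the current position, or returns -1 at end of stream.
    int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            position++;
        }
        return b;
    }

    long currentPosition() {
        return position;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50};
        StreamSeeker seeker = new StreamSeeker(new ByteArrayInputStream(data));
        seeker.seek(3);
        System.out.println(seeker.read()); // byte at offset 3
        seeker.seek(1);                    // backwards: reset, then skip
        System.out.println(seeker.read()); // byte at offset 1
    }
}
```

Moving backwards costs a reset plus a skip from the start of the file, so a caller that mostly reads forward pays very little for this design.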


For writing to a file, the OutputStream doesn’t offer the same methods; instead, the file.openOutputStream(long position) method must be used. This means that a new OutputStream must be opened each time a different position in the file is desired for writing. In situations where numerous writes may occur at different positions, opening OutputStreams can significantly degrade performance, as this is a costly operation. The following section will examine a sample that buffers output to reduce the number of OutputStreams that need to be opened.
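
To illustrate why unbuffered random writes are costly, the following sketch opens one stream per write position. It is an illustration under stated assumptions, not device code: OffsetWritable is a hypothetical interface that mirrors only the openOutputStream(long) method of FileConnection, and MemoryFile is a hypothetical in-memory stand-in that counts how many streams were opened.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for the single FileConnection method used here.
interface OffsetWritable {
    OutputStream openOutputStream(long byteOffset) throws IOException;
}

// In-memory "file" so the sketch runs without a device; counts stream opens.
class MemoryFile implements OffsetWritable {
    final byte[] contents;
    int streamsOpened = 0;

    MemoryFile(int size) {
        contents = new byte[size];
    }

    public OutputStream openOutputStream(final long byteOffset) {
        streamsOpened++;
        return new OutputStream() {
            private int pos = (int) byteOffset;

            public void write(int b) throws IOException {
                if (pos >= contents.length) {
                    throw new IOException("past end of sketch file");
                }
                contents[pos++] = (byte) b;
            }
        };
    }
}

class PositionedWriter {
    // Writes data at an absolute offset by opening a fresh stream each time.
    static void writeAt(OffsetWritable file, long offset, byte[] data)
            throws IOException {
        OutputStream out = file.openOutputStream(offset);
        try {
            out.write(data);
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        MemoryFile file = new MemoryFile(16);
        writeAt(file, 4, new byte[] {1, 2});
        writeAt(file, 12, new byte[] {3});
        // One stream is opened per distinct write position.
        System.out.println("streams opened: " + file.streamsOpened);
    }
}
```

On a device, each call to writeAt would pay the cost of opening a new OutputStream, which is the overhead that buffering aims to reduce.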


Buffered Random File Access

Given the API that is available, it is possible to implement very simple random file access on top of FileConnection. However, examining the performance of this approach while doing random writes shows a need to reduce the number of calls to openOutputStream(long). Using a buffer, it’s possible to read in a section of the file, make changes to that area, then write out the whole section at once, and repeat for another area as needed. Buffering is a poor choice for read calls, however, because it tends to read more data than is needed. For reads it’s better to read and skip directly within an InputStream rather than buffer. The source code linked to this article puts those performance considerations into practice by buffering for writing but, by default, bypassing the buffer when reading.
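
The read-modify-write idea can be sketched as follows. This is a simplified illustration and not the BufferedRandomFile class itself: SectionStore, MemoryStore, and BufferedSectionWriter are hypothetical names, the buffer is assumed to be aligned to multiples of its own size, and write positions are assumed to fall inside the file. On a device, readSection would map to skipping and reading through an InputStream, and writeSection to a single openOutputStream(long) call.

```java
// Hypothetical storage abstraction for reading and writing whole sections.
interface SectionStore {
    void readSection(long offset, byte[] dest);
    void writeSection(long offset, byte[] src);
}

// In-memory stand-in for a file; counts how often a section is written back.
class MemoryStore implements SectionStore {
    final byte[] contents;
    int sectionWrites = 0;

    MemoryStore(int size) {
        contents = new byte[size];
    }

    public void readSection(long offset, byte[] dest) {
        int n = (int) Math.min(dest.length, contents.length - offset);
        System.arraycopy(contents, (int) offset, dest, 0, n);
    }

    public void writeSection(long offset, byte[] src) {
        int n = (int) Math.min(src.length, contents.length - offset);
        System.arraycopy(src, 0, contents, (int) offset, n);
        sectionWrites++;
    }
}

class BufferedSectionWriter {
    private final SectionStore store;
    private final byte[] buffer;
    private long bufferStart = -1; // offset of the buffered section, -1 if none
    private boolean dirty = false;

    BufferedSectionWriter(SectionStore store, int bufferSize) {
        this.store = store;
        this.buffer = new byte[bufferSize];
    }

    // Changes one byte; loads the surrounding section only when it differs
    // from the section that is already buffered.
    void write(long position, byte value) {
        long start = (position / buffer.length) * buffer.length;
        if (start != bufferStart) {
            flush();
            store.readSection(start, buffer);
            bufferStart = start;
        }
        buffer[(int) (position - start)] = value;
        dirty = true;
    }

    // Writes the buffered section back in one operation if it was modified.
    void flush() {
        if (dirty) {
            store.writeSection(bufferStart, buffer);
            dirty = false;
        }
    }

    public static void main(String[] args) {
        MemoryStore store = new MemoryStore(8192);
        BufferedSectionWriter writer = new BufferedSectionWriter(store, 4096);
        writer.write(10, (byte) 1);   // three writes land in the first section
        writer.write(20, (byte) 2);
        writer.write(30, (byte) 3);
        writer.write(5000, (byte) 4); // second section: first one is flushed
        writer.flush();
        System.out.println("section writes: " + store.sectionWrites);
    }
}
```

Writes that fall within the same section share one write-back, which is why the proximity of subsequent writes matters for how much buffering helps.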


Random File Access Performance Testing

The following section discusses the performance testing that was done using the BufferedRandomFile class, in comparison to the use of FileConnection and its InputStream or OutputStream methods. Each of the operations tested in the sample is discussed below with recommendations for use, as well as the parameters of the test to be aware of. Overall, it’s important to note that performance numbers are highly sensitive to other factors on the device. The BlackBerry smartphone is a multi-process, multi-threaded device with many services running simultaneously, and other running processes can affect the speed of a performance test and skew the results.


File Creation or Sequential Writing

The BufferedRandomFile class adds some unnecessary overhead when writing large amounts of sequential data. There’s really no need to buffer data when creating a file for the first time. For peak performance, don’t use the BufferedRandomFile class; simply open the FileConnection, get the OutputStream, and write out the bytes. The performance loss is not overwhelming though, so for ease of implementation it is possible to use the buffered class.


Random Read Access

The random access tests perform a configurable number of reads at positions within the file to test the speed of the buffer. The random read locations are not connected in any way, so there is none of the temporal or spatial locality that typically exists when accessing data. Buffer performance is therefore likely to be average (neither best case nor worst case) and may have little connection to any particular application’s expected performance. However, it seems clear that the buffering process, particularly with large buffers, simply adds overhead. Skipping is a fast operation in the file system, so buffering only to read will typically just increase the amount of data read, which slows performance. The sample code avoids this by reading directly from the InputStream by default, and only reading from the buffer when the requested position falls within the currently buffered section. This provides the best average performance. Note that due to the differences between file systems, the performance loss is much more significant on the internal flash than on the SDCard. On the SDCard, performance is fairly similar with and without buffering, and in some situations a buffer can improve it, depending on the number of reads and the size of the buffer used.


Random Write Access

This is the area where buffering provides a significant gain. In this test the random writes are again randomly spread through the file, so temporal and spatial factors are averaged out. Without buffering, only 8 random writes in a 512KB file were tested, because the test took so much longer than the buffered approach. In one test, unbuffered writes took an average of about 2 seconds per write on the internal flash and 600ms on the SDCard. The buffered approach, however, was able to write in approximately 200-300ms on the SDCard, and in as little as one tenth of that time on the internal flash. These numbers are based on a BlackBerry® Bold™ 9700 smartphone, so they are not necessarily representative of older hardware. The testing was done using a 512KB file and cache sizes between 1 and 128KB. Tests of 8 random writes and 32 random writes were done for comparison with the various cache sizes. Performance can be very different between runs, but in these tests the best performance was typically seen with a cache size of 1-8KB, and usually around 4KB. This is not expected to be the same for every application, and individual testing should be done to determine the best settings in each case. The number of writes, the size of the records, the proximity of subsequent writes, and the cache size will all be factors in how the buffering system performs.