Java - Binary file operations

Discussion in 'Programmer's Corner' started by WBahn, Jun 6, 2015.

  1. WBahn

    Thread Starter Moderator

    Mar 31, 2012
    17,737
    4,789
    Does anyone know of a good resource for working with binary files in Java?

    Most of the resources, including texts, seem to be of the opinion that binary file operations are abstracted away and completely portable. I have a hard time buying that. I can believe that binary files created and read by Java programs are portable across platforms because the underlying JVM makes them so, but I don't buy that this applies in general. For instance, most binary file formats, such as BMP and WAV, explicitly specify the endianness of values stored in them. Unless you are using a Java library class to access them (so that the library class knows this information), there is no way for the JVM to know how to deal with the endianness of an arbitrary binary file format.

    I've seen some disjointed references on how to get at the endianness of the underlying hardware, but that is irrelevant (unless the JVM implementation makes it relevant).

    What I'm looking for is a class that reads data from a file into a variable (or memory buffer), and vice versa, in the same byte order that it has in the file.

    I can certainly play around and figure out how it works on my machine, but I want a means that is portable to other platforms.
     
  2. vpoko

    Member

    Jan 5, 2012
    258
    47
    I think you're looking at things at a different level of abstraction than Java is. When you write a stream to a file, you're just writing a series of bytes. Whether those represent individual bytes, multi-byte types (whether big-endian or little-endian), or complex objects, isn't relevant to the IO libraries. How you arrange the bytes in a binary file is for you to decide, using existing specs if you're implementing a well known file format.
     
  3. WBahn

    Thread Starter Moderator

    Mar 31, 2012
    17,737
    4,789
    But if I write a 4-byte integer to a file on a big-endian machine and read a 4-byte integer from that file on another machine that is little-endian, will I get the same value back? If the answer is yes, that means that Java is storing the data in the file in a format independent of the endianness of the underlying hardware. I can easily see the JVM doing this (I don't know if it does or not, and I don't have access to any big-endian machines to do tests on). But since I have seen several references that state or imply that endianness is abstracted away (for files written and read using Java applications), I'm assuming that the JVM must be doing just that.
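
    One way to check this without a big-endian machine is to look at the bytes directly. The sketch below assumes the integer is written with java.io.DataOutputStream, which the thread does not name; its writeInt method is documented to write the high byte first on every platform. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamOrderDemo {
    // Serializes one int with DataOutputStream and returns the raw bytes.
    static byte[] writeInt(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(value); // always high byte first, whatever the CPU
        } catch (IOException e) {
            // Not expected for an in-memory sink; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : writeInt(0x12345678)) {
            System.out.printf("%02X ", b);
        }
        System.out.println(); // prints: 12 34 56 78
    }
}
```

    Because the byte order is fixed by the class rather than by the hardware, a file written this way reads back to the same value through DataInputStream.readInt on any machine. It does not help with little-endian formats such as BMP or WAV.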

    In order for me to arrange the bytes as I want to arrange them, I need to be able to use some kind of a buffer that I know is going to be written to the file in the order I want. The only thing I can think of is using a byte array and translating between the integer variables (or whatever) and the byte array. I'm pretty sure I can force that to work, but it seems odd that there aren't library routines to facilitate that.
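
    The byte-array approach described above can be sketched as follows. This is only an illustration of the manual translation, with hypothetical class and method names, and it handles little-endian 32-bit integers only.

```java
class ManualByteOrder {
    // Splits an int into 4 bytes, least significant byte first.
    static byte[] toLittleEndian(int v) {
        return new byte[] {
            (byte) v,
            (byte) (v >>> 8),
            (byte) (v >>> 16),
            (byte) (v >>> 24)
        };
    }

    // Rebuilds an int from 4 little-endian bytes starting at offset off.
    // The & 0xFF masks undo the sign extension of Java's signed bytes.
    static int fromLittleEndian(byte[] b, int off) {
        return (b[off] & 0xFF)
             | ((b[off + 1] & 0xFF) << 8)
             | ((b[off + 2] & 0xFF) << 16)
             | ((b[off + 3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] raw = toLittleEndian(0x12345678);
        for (byte b : raw) {
            System.out.printf("%02X ", b);
        }
        System.out.println(); // prints: 78 56 34 12
        System.out.println(Integer.toHexString(fromLittleEndian(raw, 0))); // prints: 12345678
    }
}
```

    The shifts operate on the integer's value, not on its layout in memory, so the result is the same on any hardware.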
     
    Last edited: Jun 9, 2015
  4. vpoko

    Member

    Jan 5, 2012
    258
    47
    The Java NIO libraries (java.nio) let you specify endianness on a ByteBuffer through the instance's order() method. Java cannot infer the endianness when reading from a generic binary file. If you were to save an integer to a file and look at it in a hex editor, there would only be 4 bytes there (ordered either big- or little-endian), with no metadata to tell you the right way to read it. The file format spec is what tells you that. It's not strictly related to the platform. Though CPUs have an endianness for how they represent words, there's no requirement that files be stored with bytes in that order; it depends only on the file format.
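
    A minimal sketch of that ByteBuffer usage, assuming a made-up 6-byte little-endian header (a 4-byte sample rate followed by a 2-byte channel count) rather than any real file format. The class name, method names, and temp file are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Path;

class ByteBufferOrderDemo {
    // Packs the two fields in little-endian order, as the (made-up) spec requires.
    static byte[] encodeHeader(int sampleRate, short channels) {
        ByteBuffer buf = ByteBuffer.allocate(6).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(sampleRate).putShort(channels);
        return buf.array();
    }

    // Reads the 4-byte sample rate back from the start of the header.
    static int decodeSampleRate(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("header", ".bin");
        try {
            Files.write(file, encodeHeader(44100, (short) 2));
            byte[] raw = Files.readAllBytes(file);
            System.out.println(decodeSampleRate(raw)); // prints: 44100
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

    A newly created ByteBuffer is big-endian by default, so the order(...) call is needed only for little-endian formats; setting it explicitly either way keeps the code tied to the file spec rather than to a default.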
     
  5. WBahn

    Thread Starter Moderator

    Mar 31, 2012
    17,737
    4,789
    Thanks.

    This is the first reference I've seen to a "ByteBuffer" class, but it looks like what I think I need.

    I definitely understand that the endianness of data stored in a file is properly a matter for the file specification and that, in general, an application reading the file can't determine it unless some kind of metadata is present. One example is a field holding an integer with a known value: the application reads it and, if it doesn't get the expected value, tries the other endianness. Of course, that can be avoided if the application reads the file as a direct memory transfer into a buffer and then deals with endianness there according to the file spec.
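
    The known-value check can be sketched like this. The magic number 0x1A2B3C4D and the class and method names are invented for illustration and do not come from any real format; the value only has to read differently when its bytes are reversed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MagicOrderProbe {
    // Hypothetical marker that the writer stores as the first 4 bytes of the file.
    static final int MAGIC = 0x1A2B3C4D;

    // Tries both byte orders on the first 4 bytes and returns the one that
    // yields MAGIC. Expects at least 4 bytes; getInt(0) throws otherwise.
    static ByteOrder detectOrder(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.order(ByteOrder.BIG_ENDIAN).getInt(0) == MAGIC) {
            return ByteOrder.BIG_ENDIAN;
        }
        if (buf.order(ByteOrder.LITTLE_ENDIAN).getInt(0) == MAGIC) {
            return ByteOrder.LITTLE_ENDIAN;
        }
        throw new IllegalArgumentException("unrecognized magic number");
    }

    public static void main(String[] args) {
        byte[] header = { 0x4D, 0x3C, 0x2B, 0x1A };
        System.out.println(detectOrder(header)); // prints: LITTLE_ENDIAN
    }
}
```

    The returned ByteOrder can then be passed to order(...) on the buffer used for the rest of the file.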
     