It's surprising how little documentation java.io.InputStream has for a class of its complexity and importance. Almost every Java application uses it in one way or another, but how to use it correctly is not completely spelled out.
Naturally, the documentation allows most of the methods to throw an IOException if "an I/O error occurs." Since no attempt is made to define "I/O error", an exception could be thrown for any reason at all. The only methods that are restrictive about their exceptions are mark, markSupported, and reset. Neither mark nor markSupported can throw an IOException at all, and reset only throws one under refreshingly specific circumstances. Surprisingly, even available may throw an IOException on the slightest whim.
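As a minimal sketch of how those three methods fit together, here is a one-byte peek. It assumes a stream that supports marking; ByteArrayInputStream is used only because it is a convenient in-memory stream that does. The class name MarkResetSketch and the method peek are made up for illustration and are not part of java.io.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of java.io.
final class MarkResetSketch {

    // Returns the next byte without consuming it, or -1 at the end of the stream.
    static int peek(InputStream in) throws IOException {
        if (!in.markSupported()) {      // declared without "throws IOException"
            throw new IOException("mark/reset not supported by " + in.getClass().getName());
        }
        in.mark(1);                     // declared without "throws IOException"
        int b = in.read();              // may throw IOException for any "I/O error"
        in.reset();                     // may throw IOException, e.g. if the mark was invalidated
        return b;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30});
        System.out.println(peek(in));   // prints 10
        System.out.println(in.read());  // prints 10 again, because peek rewound the stream
    }
}
```

Even in this tiny helper, two of the four calls can still fail with an IOException that the caller has no way to predict.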
It's not hard to assume that IOExceptions are like natural disasters, able to come crashing down upon an application at any time and totally beyond control. This is probably necessary because we are interacting with physical devices that may be affected by unpredictable forces, perhaps even actual natural disasters. There are no usage guidelines that will protect your application from the disk it is reading being destroyed by an earthquake. Even so, I wish the documentation made some sort of promise about when exceptions can occur, rather than letting us wonder whether some combination of reads might cause an IOException where a different combination of reads could have gotten the same bytes without the exception. With the documentation as it is, a conforming InputStream might throw an exception upon any attempt to read an odd number of bytes.
The skip method is the most worrying of all. Even in comparison to the other methods of InputStream, skip makes very few promises about what it will do. At least read promises to read at least one byte unless the end of the stream has been reached. On the other hand, skip might skip 0 bytes because of "any number of conditions." With read you can use a loop to wait for everything you need to read, and each time through the loop you are guaranteed to get at least one byte. With skip you have no guarantee that the loop will ever end, or even that skip will do any blocking to prevent your loop from consuming your CPU. Is it good practice to use skip in a loop with a Thread.sleep and a Thread.interrupted check, so the rest of the application can abort the read if it seems like the skip loop might go on forever? Or would the better practice be to read one byte between skips, to guarantee that the loop will block when waiting for input and to check for the end of the stream?
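To make the second option concrete, here is a rough sketch of both loops: a read loop that relies on the "at least one byte" promise, and a skip loop that falls back to a single-byte read whenever skip reports no progress. The class and method names (SkipLoopSketch, readFully, skipFully) are invented for this example, and the sketch only shows one way the loop could be written; it does not settle which practice is better.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helpers for illustration; not part of java.io.
final class SkipLoopSketch {

    // Fills buf completely. A read with a non-zero length either returns at
    // least one byte or returns -1, so every iteration makes progress.
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new EOFException("stream ended after " + off + " bytes");
            }
            off += n;
        }
    }

    // Skips exactly n bytes. When skip() makes no progress, a one-byte read is
    // used instead: it blocks while waiting for input and reports end of stream.
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() < 0) {
                throw new EOFException(remaining + " bytes left to skip at end of stream");
            } else {
                remaining--;    // the one-byte read consumed a byte
            }
        }
    }
}
```

A call such as SkipLoopSketch.skipFully(in, 16) then either consumes exactly 16 bytes or throws. It still cannot tell you why skip returned 0, only that the loop will not spin without blocking.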
I have seen people, even in tutorials, use skip to skip over just a few bytes that aren't needed in a stream, ignoring the return value and the minefield of possible behaviours that skip could have. There is even a suggestion that skip might throw an IOException when an equivalent read would not: the skip documentation says that it throws an IOException "if the stream does not support seek, or if some other I/O error occurs." Since there is no apparent way to check whether seek is supported, does that mean that the correct way to use skip is something like this?
try { skipped = in.skip(n); } catch(IOException e) { skipped = in.read(new byte[n]); }
Naturally that won't work if n is very large, but it's much worse than that: it ignores an IOException that might be caused by something more serious than seek not being supported, and we have no way of being sure that the read would rethrow the same exception instead of reading the wrong data, for example if half the bytes had been skipped by skip before it threw the exception.
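One way to sidestep both problems, sketched here under the assumption that speed matters less than predictability, is to never call skip at all and instead discard bytes by reading them into a small scratch buffer. The names DiscardSketch and discard, and the 8192-byte buffer size, are arbitrary choices made for this illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of java.io.
final class DiscardSketch {

    private static final int BUFFER_SIZE = 8192;   // arbitrary; any small size works

    // Consumes exactly n bytes using read() only, so memory use stays bounded
    // no matter how large n is, and no exception from skip() is ever swallowed.
    static void discard(InputStream in, long n) throws IOException {
        byte[] scratch = new byte[(int) Math.min(BUFFER_SIZE, Math.max(n, 0))];
        long remaining = n;
        while (remaining > 0) {
            int want = (int) Math.min(scratch.length, remaining);
            int got = in.read(scratch, 0, want);
            if (got < 0) {
                throw new EOFException(remaining + " bytes left to discard at end of stream");
            }
            remaining -= got;
        }
    }
}
```

This gives up whatever time a real seek would have saved, but every IOException now comes from read and is propagated rather than ignored.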
On one hand you naturally want to save time and memory by skipping bytes that you don't need, but on the other hand you are faced with the possibility that trying to skip bytes may cause an IOException to derail the reading of a stream that would otherwise have been read successfully.