Without reading the whole file, you cannot select every line with equal probability. One reason is that you have no way of knowing whether the portions of the file you did not read consist of one long line each, or a whole bunch of little tiny lines each.