Hi everyone,

I'm currently working on a cross-platform project to copy data from the computer onto a USB drive quickly.

Main problem: EOF in the middle of the file

It's going well so far, but I stumbled across a problem in my copy_file function:

int copy_file(char *source_path, char *destination_path) {

    FILE *input;
    FILE *output;

    char temp1[MAX_PATH] = {0};
    strcpy(temp1, source_path);

    char temp2[MAX_PATH] = {0};
    strcpy(temp2, destination_path);

    if ( (input = fopen(temp1, "r")) != NULL ) {

        if ( (output = fopen(temp2, "w+")) != NULL ) {

            /* Copying source to destination: */
            char c;
            while ( (c = getc(input)) != EOF ) {
                putc(c, output);
            }

            fclose(input);
            fclose(output);

        } else {

            /* If the output file can't be opened: */
            fclose(input);
            return 1;
        }

    } else {

        /* If the input file can't be opened: */
        return 1;

    }

    return 0;

}

It works fine for regular formatted files such as Word documents and text files, but with movies it stops after about 705 bytes, even though the file is 42 MB. I think it's because there is an EOF character in the middle of the file which breaks out of the while loop. Does anyone know how to solve this?

Secondary issue: speed

In regard to speed, I need to write code that gets the job done as fast as possible. This was the simplest copy function I could think of, but I don't know whether it is the fastest. Which is faster, fread/fwrite or getc/putc? Or is there another standard function that can copy even faster?

I'm also not sure what the bottleneck is in regard to speed, as a portable drive takes more time to write to than, for example, internal memory. If I write a function that compresses files before they are copied, would that increase speed? Or would it just slow the read/write process down?

Thanks in advance,
~G

long fileSize;
fseek(input, 0L, SEEK_END); //Go to end of file
fileSize = ftell(input); //Current position is the file length
rewind(input); //Rewind to beginning

Then just do a for loop.

As for speed, your getc is killing you. Single-character reads are too slow.
Once you know your file size, malloc a buffer and fread into it.
Then fwrite it out.
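
Putting that together, a minimal sketch of the whole-file approach might look like this (the name copy_whole_file and the bare-bones error handling are mine, and it assumes the entire file fits in memory):

#include <stdio.h>
#include <stdlib.h>

int copy_whole_file(const char *src, const char *dst) {

    FILE *in = fopen(src, "rb");        /* binary mode so nothing is translated */
    if (in == NULL) return 1;

    fseek(in, 0L, SEEK_END);            /* go to end of file */
    long fileSize = ftell(in);          /* current position is the file length */
    rewind(in);                         /* back to the beginning */

    char *buf = malloc(fileSize);
    if (buf == NULL) { fclose(in); return 1; }

    FILE *out = fopen(dst, "wb");
    if (out == NULL) { free(buf); fclose(in); return 1; }

    fread(buf, 1, fileSize, in);        /* one big read... */
    fwrite(buf, 1, fileSize, out);      /* ...and one big write */

    free(buf);
    fclose(in);
    fclose(out);
    return 0;
}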

It works fine for regular formatted files such as Word documents and text files, but with movies it stops after about 705 bytes, even though the file is 42 MB. I think it's because there is an EOF character in the middle of the file which breaks out of the while loop. Does anyone know how to solve this?
~G

EOF is not stored in the file; it is returned by the operating system when the end of the file is encountered.

http://faq.cprogramming.com/cgi-bin/smartfaq.cgi?answer=1048865140&id=1043284351

Something tells me you are on a Windows system. There is a distinction between text mode and binary mode. In text mode the ^Z character (0x1A) serves as an end-of-file marker. Open your files in binary mode: pass "rb" as the second argument to fopen. The same goes for writing ("wb").
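
Applied to your loop, that would look something like this (a sketch only; keep the NULL checks from your original code). Note also that getc() returns an int, and EOF is an int value that does not fit reliably in a plain char, so c should be declared int for the comparison to work:

FILE *input  = fopen(temp1, "rb");   /* "rb": read in binary mode */
FILE *output = fopen(temp2, "wb");   /* "wb": write in binary mode */

int c;                               /* int, not char: EOF is an int */
while ( (c = getc(input)) != EOF ) {
    putc(c, output);
}

fclose(input);
fclose(output);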


Secondary issue: speed

In regard to speed, I need to write code that gets the job done as fast as possible. This was the simplest copy function I could think of, but I don't know whether it is the fastest. Which is faster, fread/fwrite or getc/putc? Or is there another standard function that can copy even faster?

The stdio library buffers data in the same-sized chunks either way, so the performance difference between fread and getc comes down to the number of function calls made. Of course fread would be called far fewer times than getc, but I suspect the speed gain would be small compared with the actual I/O time.
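
If you want to experiment with that chunk size, it can be tuned with setvbuf() right after opening the stream (the 64 KB figure here is just an example value):

/* Give the stream a larger, fully buffered stdio buffer.
   Must be called after fopen() and before the first read. */
setvbuf(input, NULL, _IOFBF, 64 * 1024);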

I'm also not sure what the bottleneck is in regard to speed, as a portable drive takes more time to write to than, for example, internal memory. If I write a function that compresses files before they are copied, would that increase speed? Or would it just slow the read/write process down?

The real bottleneck is transferring data to and from the peripheral device. You certainly want to minimize the amount of data transferred, so yes, compression is a standard way to increase performance.

I've tried adjusting the mode for file reading/writing, but it did not solve the problem: the file copy still stops after 705 bytes.

If I understand correctly, the speed of fread/fwrite and getc/putc is about the same? But isn't getc opening the file each time it gets a character and then reopening the stream, unlike fread, which would only need to access it once? I think I'll try the fread/fwrite method in this case.

Assuming I find a solution to the EOF problem (probably using fseek, as suggested by DeanM), will fread work correctly with really large files (e.g. 3 GB+)? It would allocate an absurd amount of virtual memory if it reads everything at once. Would it be better to read/write every 50 MB or so? Or should I base it on the available virtual memory (and if so, how do I measure that with standard C functions)?

Edit: gerard, if EOF is a standard value returned when the read reaches the end of the file, why does it stop after 705 bytes?

~G

I've tried adjusting the mode for file reading/writing, but it did not solve the problem: the file copy still stops after 705 bytes.

Then you didn't fix the problem properly.

If I understand correctly, the speed of fread/fwrite and getc/putc is about the same?

You understand incorrectly. getc()/putc() hand you one byte per call, so you pay the call overhead for every single byte. fread()/fwrite() move a whole block per call, so they're much faster.

But isn't getc opening the file each time it gets a character and then reopening the stream, unlike fread, which would only need to access it once? I think I'll try the fread/fwrite method in this case.

No, if getc() opened the stream every time, you'd only ever be able to read the first byte of the file.

Assuming I find a solution to the EOF problem (probably using fseek, as suggested by DeanM),...

Worthless, and it makes your code too complex. Just read a block, write a block, read a block, write a block... Keep it up until fread() returns 0 (fread() returns the number of items read, not EOF).

... will fread work correctly with really large files (e.g. 3 GB+)? It would allocate an absurd amount of virtual memory if it reads everything at once. Would it be better to read/write every 50 MB or so? Or should I base it on the available virtual memory (and if so, how do I measure that with standard C functions)?

Read the file in blocks small enough that
1) they don't overload memory, and
2) you don't need dynamic (virtual) memory.

When first programming this task, test with short files (100-200 bytes) and set your read block to 50 bytes. That way you can follow what's going on in the debugger without going crazy. Once it seems to be working, increase your buffer and file sizes.
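
A block-at-a-time loop along those lines might look like this (copy_blocks and BUFSIZE are names I made up; start BUFSIZE at 50 for debugging and raise it later):

#include <stdio.h>

#define BUFSIZE 50                  /* small for debugging; raise later */

int copy_blocks(FILE *in, FILE *out) {

    char buf[BUFSIZE];
    size_t n;

    /* fread() returns the number of bytes actually read;
       0 means end of file (or a read error). */
    while ( (n = fread(buf, 1, sizeof buf, in)) > 0 ) {
        if (fwrite(buf, 1, n, out) != n)
            return 1;               /* write error */
    }

    return ferror(in) ? 1 : 0;      /* distinguish EOF from a read error */
}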

Edit: gerard, if EOF is a standard value returned when the read reaches the end of the file, why does it stop after 705 bytes?

You did something wrong.


Thanks for the thorough replies, they really pointed me in the right direction for this program :)
