text files and binary files

Question

anumash 0 Newbie Poster

12 Years Ago

I am trying to open a text file which contains a dictionary of English words. Each word and its definition are on the same line, and the entries are delimited by a newline. Now, my question is: if you open a text file using fopen() in "rt" mode, do the newlines show up as \r\n or just \n? In binary mode, does the newline get interpreted as \r\n or just \n? Massive confusion!

c


rubberman 1,355 Nearly a Posting Virtuoso

12 Years Ago

From your question, I assume you are using a Windows system? Do you know if the files are in MS, or in Unix/Linux format?

rubberman 1,355 Nearly a Posting Virtuoso

12 Years Ago

OK. On Windows, a text newline ('\n') is stored on disk as a carriage-return + linefeed ('\r\n') pair, but in text mode the C library translates it for you, so your program still reads and writes a single '\n'. You would only need to deal with the '\r\n' representation yourself if you were reading the file in binary mode or on a Unix/Linux system. I.e., don't sweat it unless you are reading a file from one system type on another and have not passed the file through a filter to convert newlines accordingly, which a tool like ftp will normally do for you if the transfer is specified as text-type. There are also other tools which will convert newlines for you; this is a very common problem.

So, if you execute the function fprintf(outfile, "Hello World.\n"); on Windows (with outfile opened in text mode), the file will contain a '\r\n' terminator on the line. On Linux/Unix, it would contain only a linefeed ('\n'). Reading back, the same code should work appropriately on either system, which makes it much easier to write applications that are intended to work on both types of systems. Again, problems only occur when you are processing data written on one system type on the other.

And welcome to cross-platform programming and all the little warts you will encounter in that endeavor! :-)
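To see the translation described above for yourself, here is a minimal sketch: it writes the same line in text mode, then reopens the file in binary mode and counts the bytes that were really stored. The function name bytes_on_disk and the file name passed to it are made up for illustration. The expectation is 13 bytes on Linux/Unix and 14 on Windows, because of the extra '\r'.

```c
#include <stdio.h>

/* Writes one line in text mode, then reopens the file in binary mode and
   returns the number of bytes actually stored on disk (-1 on error).
   bytes_on_disk is a hypothetical helper name, not a library function. */
long bytes_on_disk(const char *path)
{
    FILE *out = fopen(path, "w");      /* text mode */
    if (out == NULL)
        return -1;
    fprintf(out, "Hello World.\n");    /* 13 characters as the program sees them */
    fclose(out);

    FILE *in = fopen(path, "rb");      /* binary mode: no newline translation */
    if (in == NULL)
        return -1;

    long count = 0;
    while (fgetc(in) != EOF)
        count++;
    fclose(in);
    return count;
}
```

Calling bytes_on_disk("newline_demo.txt") from main() and printing the result is enough to compare the two platforms.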

Adak 419 Nearly a Posting Virtuoso

12 Years Ago

When you have a string of words (here a word, and then its definition, on the same line), you want to use fgets() and put the entire line into a char array (I use "buffer") all at once.

The newline will be included at the end of the buffer (space permitting), so strlen(buffer) now gives you the full length of the line, newline included. Easy peasy.

while((fgets(buffer, sizeof(buffer), filePointer))!= NULL) {
   //your other code in here
}

Remember to make buffer longer than any possible line of text, and you're good to go. A word, plus a definition, may be a line longer than 200 chars - so think 500 for starters.
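A slightly fuller sketch of that loop, under the assumption that no line is longer than 511 characters (a longer line would simply arrive in several pieces). The function name longest_line is hypothetical; it strips the newline that fgets() leaves in the buffer and returns the length of the longest line it saw.

```c
#include <stdio.h>
#include <string.h>

/* Returns the length of the longest line in the file, newline excluded,
   or -1 if the file cannot be opened. longest_line is an illustrative name. */
long longest_line(const char *path)
{
    char buffer[512];                  /* assumed to be longer than any line */
    FILE *fp = fopen(path, "r");       /* text mode: a line ends in '\n' */
    if (fp == NULL)
        return -1;

    long longest = 0;
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t len = strcspn(buffer, "\n");   /* index of '\n', or of '\0' if none */
        buffer[len] = '\0';                   /* drop the newline */
        if ((long)len > longest)
            longest = (long)len;
    }
    fclose(fp);
    return longest;
}
```

strcspn() is used instead of strlen(buffer) - 1 so that a final line with no newline is handled the same way as the others.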


anumash 0 Newbie Poster · Answer 1 · 2013-02-20T14:19:29+00:00

I am using a Windows system, the file is a .txt file

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 2 · 2013-02-20T14:57:16+00:00

In text mode the newline sequence will be converted to '\n'; this is true for any platform. In binary mode you're on your own: no translation will occur, so on Windows you need to look for and handle newlines in the form of CRLF.

But it's problematic because you can get a text file formatted using POSIX newlines (just LF rather than CRLF). So if you just look for CRLF or rely on text mode translation the lines might not be split correctly. Fun, huh? ;)

anumash 0 Newbie Poster · Answer 3 · 2013-02-20T15:03:59+00:00

What should I do if I want to detect a newline?

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 4 · 2013-02-20T15:20:58+00:00

In text mode, look for '\n'. In binary mode, look for '\r' followed immediately by '\n'.
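If the file might use either convention, one option is to accept both endings when trimming a line. This is only a sketch; strip_line_ending is a made-up helper, and it assumes the line was already read into a NUL-terminated buffer (for example with fgets() on a file opened in binary mode).

```c
#include <string.h>

/* Removes a trailing "\n" or "\r\n" from line and returns the new length.
   A '\r' is only removed when it sits directly before the '\n'.
   strip_line_ending is a hypothetical helper, not a standard function. */
size_t strip_line_ending(char *line)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        len--;                                   /* drop the LF */
        if (len > 0 && line[len - 1] == '\r')
            len--;                               /* drop the CR of a CRLF pair */
    }
    line[len] = '\0';
    return len;
}
```

With this, a CRLF file and an LF file both produce the same trimmed text, whichever mode the file was opened in.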

anumash 0 Newbie Poster · Answer 5 · 2013-02-20T15:24:58+00:00

I did that and I keep getting stuck in an infinite loop... I'll post my code in a minute.

anumash 0 Newbie Poster · Answer 6 · 2013-02-20T15:58:01+00:00

/* Program used to determine the number of characters, and in turn the number of bytes, in an
   alphabet entry, i.e. the number of bytes in 'A', 'B', etc.
   This program also searches for the longest entry in the database, i.e. the maximum
   number of bytes for a given word and its definition, which are found on the same line.
   The program gets stuck in an infinite loop and I don't know why.
*/
#include <stdio.h>
#include <stdlib.h>

int main() {
    FILE *fp;
    fp = fopen("database.txt", "rt");
    if (fp == NULL) {
        printf("Error opening file!");
        exit(1);
    }                           // File open and error checking
    char ch;
    ch = fgetc(fp);
    char alphabet = 'A';
    unsigned long countal[26];  // to store the number of bytes for a particular entry (dictionary is sorted)
    int size1 = 0;
    short i = 0;
    int size = 0;
    while (alphabet <= 'Z') {   // A through Z, looping through the entire file until EOF
        unsigned long chars = 0;
        if (ch == alphabet) {   // if found, increment the number of bytes and check the size of a given entry
            while (ch != '\n') {    // infinite loop??
                chars++;
                size++;
                ch = fgetc(fp);
            }
        } else {
            while (ch != '\n') {
                size++;
                ch = fgetc(fp);
            }
        }

        if (size > size1)
            size1 = size;
        size = 0;
        ch = fgetc(fp);
        countal[i] = chars;
        i++;
        if (ch == EOF) {
            alphabet++;
            rewind(fp);
        }
    }

    printf("Largest directory entry: %d\n", size1);
    char abcd = 'A';
    for (i = 0; i < 26; i++) {
        printf("%c= ", abcd);
        printf("%lu bytes\n", countal[i]);
        abcd++;
    }
    fclose(fp);
    return 0;
}
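As far as I can tell, three things in the program above can make it hang or misbehave. First, the inner while(ch!='\n') loops never test for EOF, so a last line without a trailing newline spins forever. Second, ch is a char, so EOF cannot be reliably told apart from data. Third, countal[i] is indexed by line number rather than by letter, so it runs past the 26 slots. Below is a hedged single-pass sketch of what the program seems to be aiming for. I am assuming the goal is the total bytes per starting letter plus the longest line; count_entries and record_line are names invented here.

```c
#include <ctype.h>
#include <stdio.h>

/* Adds one finished line to the totals. first is the line's first
   character (EOF for an empty line), len its length without the '\n'. */
static void record_line(int first, unsigned long len,
                        unsigned long counts[26], unsigned long *longest)
{
    if (len > *longest)
        *longest = len;
    if (first != EOF) {
        int upper = toupper(first);
        if (upper >= 'A' && upper <= 'Z')    /* assumes a contiguous A..Z range */
            counts[upper - 'A'] += len;
    }
}

/* Returns 0 on success, -1 if the file cannot be opened. */
int count_entries(const char *path, unsigned long counts[26],
                  unsigned long *longest)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    for (int i = 0; i < 26; i++)
        counts[i] = 0;
    *longest = 0;

    int ch;                  /* int, not char, so EOF can be told apart from data */
    int first = EOF;         /* first character of the current line */
    unsigned long len = 0;   /* length of the current line so far */

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            record_line(first, len, counts, longest);
            first = EOF;
            len = 0;
        } else {
            if (len == 0)
                first = ch;
            len++;
        }
    }
    if (len > 0)             /* last line had no trailing newline */
        record_line(first, len, counts, longest);

    fclose(fp);
    return 0;
}
```

Every loop here ends when fgetc() returns EOF, and the file is read once instead of 26 times, so rewind() is not needed.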

anumash 0 Newbie Poster · Answer 7 · 2013-02-25T06:10:23+00:00

Hey thanks for your valuable suggestion! :):)

anumash 0 Newbie Poster · Answer 8 · 2013-02-25T06:20:40+00:00

Does the function fgets() increment the file pointer internally to point to the next line??

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 9 · 2013-02-25T12:48:54+00:00

Does the function fgets() increment the file pointer internally to point to the next line??

Yes, all of the standard I/O functions adjust the file position accordingly.
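A small sketch that makes this visible: it records what ftell() reports after each fgets() call. The function name positions_after_fgets is made up for this example, and the file is opened in binary mode so the offsets are plain byte counts.

```c
#include <stdio.h>

/* Reads up to max lines with fgets() and stores the file position reported
   by ftell() after each read. Returns the number of lines read, or -1 if
   the file cannot be opened. The function name is hypothetical. */
int positions_after_fgets(const char *path, long positions[], int max)
{
    char buffer[256];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    int n = 0;
    while (n < max && fgets(buffer, sizeof buffer, fp) != NULL)
        positions[n++] = ftell(fp);    /* position is now just past the line read */

    fclose(fp);
    return n;
}
```

For a file containing "one\ntwo\n" the recorded positions are 4 and 8: each call leaves the position at the start of the next line, with no extra bookkeeping in your own code.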