I am trying to imitate an early stage of the preprocessor, which is to remove comments from a *.c file.

The idea is that the program reads a *.c file and creates another file, *.c1, which is an exact copy of the *.c file but without comments (C and C++ style).

Example (in Linux): > ./myprog name.c
produces: name.c1

I wrote the following code, but when it runs and creates the *.c1 file, I can see (when I view the file in Linux) that the beginning of the file is OK but the rest is garbage for some unknown reason.

Please help me solve this issue.

Thanks, all!

The code:

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum status {OUT , IN_STRING , LEFT_SLASH , IN_COMMENT , RIGHT_STAR , IN_CPP_COM};

int main(int argc , char *argv[])
{
	FILE *fd , *new_fd;	/* fd -> *.c ; new_fd -> *.c1 */
	int ch;
	int state = OUT;	
	int new_file_string_len = strlen(argv[1])+2; /*num of chars in name.c +1 for name.c1*/
	char *new_file_name;
	
	new_file_name=(char *) malloc ((new_file_string_len)*sizeof(char)); 
	
	strcpy(new_file_name , argv[1]);	
	new_file_name[new_file_string_len-1] = '\0'; 
	new_file_name[new_file_string_len-2] = '1';

	if( !(fd = fopen (argv[1],"r") ) )
	{
	 fprintf(stderr,"cannot open file !\n");
	 exit(0);
	}

	if( !(new_fd = fopen (new_file_name,"w+") ) )
	{
	 fprintf(stderr,"cannot open file !\n");
	 exit(0);
	}

	while ( (ch=fgetc(fd)) != (feof(fd)) )
		switch (state)
		{
			case OUT:
				if (ch=='/')
					state = LEFT_SLASH;
				else
				{
					fputc(ch,new_fd);
					if (ch=='\"')
						state = IN_STRING;
				}
				break; /*OUT*/

			case LEFT_SLASH:
				if (ch=='*')
					state = IN_COMMENT;
				else if (ch=='/')
					state = IN_CPP_COM;
				else
				{
					fputc('/',new_fd);
					fputc(ch,new_fd);
					state = OUT;
				}
				break; /*LEFT_SLASH*/

			case IN_COMMENT:
				if (ch=='*')
					state = RIGHT_STAR;
				break; /*IN_COMMENT*/

			case IN_CPP_COM:
				if (ch=='\n')
				{
					state = OUT;
					fputc('\n',new_fd);
				}
				break; /*IN_CPP_COM*/

			case RIGHT_STAR:
				if (ch=='/')
					state = OUT;
				else if (ch!='*')
					state = IN_COMMENT;
				break; /*RIGHT_STAR*/

			case IN_STRING:
				if (ch=='\"')
					state = OUT;
				fputc(ch,new_fd);
				break; /*IN_STRING*/
		} /*switch*/

	fclose(fd);
	fclose(new_fd);
	return 0; /*dummy*/

} /*main()*/

You are not taking account of escaped characters in strings. For example, how would your code cope with the string "\"//"?

You are not taking account of character constants. How would your software cope with '"'?

This loop condition, while ( (ch=fgetc(fd)) != (feof(fd)) ), is wrong and might well result in reading past the end of the file. I suggest you read up on the return values of the two functions called.
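
As a rough sketch of how the first two points might be handled, here is one possible variation of the state machine from the question. The IN_CHAR state, the escaped flag, and the function name strip_comments are my own additions and not part of the original code. It also pushes the character after a lone '/' back with ungetc so that a quote following a division sign is still seen. Trigraphs and backslash-newline splices are ignored here, so treat it as an illustration rather than a complete solution.

```c
#include <stdio.h>

/* State names follow the question's enum; IN_CHAR is a hypothetical addition. */
enum state { OUT, IN_STRING, IN_CHAR, LEFT_SLASH, IN_COMMENT, RIGHT_STAR, IN_CPP_COM };

/* Copies `in` to `out`, dropping C and C++ style comments. */
void strip_comments(FILE *in, FILE *out)
{
    int ch;                 /* int, so that EOF can be told apart from real characters */
    enum state st = OUT;
    int escaped = 0;        /* non-zero when the previous literal character was a backslash */

    while ((ch = fgetc(in)) != EOF) {
        switch (st) {
        case OUT:
            if (ch == '/') {
                st = LEFT_SLASH;
            } else {
                fputc(ch, out);
                if (ch == '"')
                    st = IN_STRING;
                else if (ch == '\'')
                    st = IN_CHAR;
            }
            break;

        case IN_STRING:
        case IN_CHAR:
            fputc(ch, out);
            if (escaped)
                escaped = 0;            /* an escaped character cannot close the literal */
            else if (ch == '\\')
                escaped = 1;
            else if ((st == IN_STRING && ch == '"') || (st == IN_CHAR && ch == '\''))
                st = OUT;
            break;

        case LEFT_SLASH:
            if (ch == '*') {
                st = IN_COMMENT;
            } else if (ch == '/') {
                st = IN_CPP_COM;
            } else {
                fputc('/', out);        /* it was an ordinary slash */
                ungetc(ch, in);         /* let the OUT state look at this character */
                st = OUT;
            }
            break;

        case IN_COMMENT:
            if (ch == '*')
                st = RIGHT_STAR;
            break;

        case RIGHT_STAR:
            if (ch == '/')
                st = OUT;
            else if (ch != '*')
                st = IN_COMMENT;
            break;

        case IN_CPP_COM:
            if (ch == '\n') {
                fputc('\n', out);       /* keep the line break that ends the comment */
                st = OUT;
            }
            break;
        }
    }

    if (st == LEFT_SLASH)
        fputc('/', out);                /* the input ended right after a lone slash */
}
```

In the original program this would be called as strip_comments(fd, new_fd) in place of the while loop.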

How can I change the while so that it works? We learned that to read a file until EOF we need to use != feof(fd), and that it will work.

Can you be more specific? I have worked on this for hours with no result so far... :-(

the function feof returns

A non-zero value is returned in the case that the End-of-File indicator associated with the stream is set.
Otherwise, a zero value is returned.

A non-zero value can be anything other than zero; 1 is a common possibility, and -1 might be as well.

At the end of the stream fgetc returns EOF. EOF is defined as

[any] negative integral constant expression

So EOF can be any negative value; quite often it is -1.

Before the end of the file, feof returns 0, so your loop only stops early if it happens to read a character with the value 0. At the end of the file, while ( (ch=fgetc(fd)) != (feof(fd)) ) is equivalent to while ( <any negative integral value, often -1> != <any integral value that is not 0, possibly -1> ).

There are many cases where this inequality is true and the loop continues, and each extra pass can write the EOF value to the output as a garbage character.

You should use either the return value of fgetc or the return value of feof, but you should not compare the return values of the two functions with each other: while ( (ch=fgetc(fd)) != EOF ) or while ( !feof(fd) ). Note that the second version does not actually read anything from the stream, so the loop body still has to call fgetc and check what it returned.
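
To make the two loop forms concrete, here is a minimal sketch of each. The function names copy_until_eof and copy_with_feof are made up for illustration. Both simply copy one stream to another, so in your program the fputc call would be replaced by the switch statement.

```c
#include <stdio.h>

/* Hypothetical helper: copies every character from `in` to `out`.
   Returns the number of characters copied, or -1 on a read error. */
long copy_until_eof(FILE *in, FILE *out)
{
    int ch;             /* must be int, not char, so EOF stays distinguishable */
    long count = 0;

    while ((ch = fgetc(in)) != EOF) {
        fputc(ch, out);
        count++;
    }
    return ferror(in) ? -1 : count;
}

/* Same behaviour with feof in the loop condition.
   feof only reports a previous failed read, so fgetc is still checked. */
long copy_with_feof(FILE *in, FILE *out)
{
    long count = 0;

    while (!feof(in)) {
        int ch = fgetc(in);
        if (ch == EOF)
            break;      /* end of file or read error */
        fputc(ch, out);
        count++;
    }
    return ferror(in) ? -1 : count;
}
```

The first form is the usual choice because the read and the end-of-file test happen in one place.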
