Hey,

I have a file of RGB values, with the components separated by : so that each value counts as a single line.

I'm importing these into a string vector and sorting them so they come out ordered from 0:0:0 to 99:99:99.

vector<string> sV;
ifstream in("Images\\Stage1RBGList.txt", std::ios::binary);
string word;
cout << "Starting Compression..." << endl;
while(in >> word)
    sV.push_back(word);

sort( sV.begin(), sV.end() );

ofstream myfile2;
myfile2.open("Images\\Stage2SortedRBGList.txt", std::ios::binary);
cout << "Starting Sorting..." << endl;
for(int i = 0; i < sV.size(); i++)
    myfile2 << sV[i] << endl;

myfile2.close();
cout << "Sorting Ended" << endl;

This works fine.

Next I'm trying to count the duplicates in the file, such that I can replace them with a small symbol to reduce size.

e.g.

0:0:0
0:0:0
0:0:0
0:0:0
0:0:0
0:0:0
0:0:0
0:0:10
0:0:12
0:0:16

goes to:

a=0:0:0
7a
0:0:10
0:0:12
0:0:16
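
Because the list is already sorted, identical values sit next to each other, so the counting step can be done in one pass by comparing each line with the run currently being counted. The following is only a rough sketch of that idea, not the code from this thread; the helper name countRuns and its return type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): collapses a SORTED list
// of lines into (value, repeat count) pairs. It relies on duplicates being
// adjacent, which is what sorting guarantees.
std::vector<std::pair<std::string, std::size_t>>
countRuns(const std::vector<std::string>& sortedLines)
{
    std::vector<std::pair<std::string, std::size_t>> runs;
    for (const std::string& line : sortedLines) {
        if (!runs.empty() && runs.back().first == line)
            ++runs.back().second;        // same value as the current run
        else
            runs.emplace_back(line, 1);  // a new value starts a new run
    }
    return runs;
}
```

For the ten example lines above this would give four runs, the first being 0:0:0 with a count of 7. Assigning symbols such as "a" to the long runs would be a separate step on top of this.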

I'm sure I'm missing something obvious here, but my code never seems to count them properly, or get past the first counter.

string uniqueString[10];
int counter[10];

ifstream inTxtFile("Images\\Stage2SortedRBGList.txt", std::ios::binary);
string lineRead;
bool found = false;

inTxtFile >> lineRead;
uniqueString[0] = lineRead;
counter[0] = 1;
cout << uniqueString[0] << endl;


while(inTxtFile >> lineRead) 
{
	for(int arraycell = 0; arraycell < 10; arraycell++)
	{
		found = false;
		if (uniqueString[arraycell] == lineRead)
		{
			if (found == true)
			{
				break;
			}
			else
			{
			//cout << "Match Found @ " << arraycell << " " << lineRead << endl;
			counter[arraycell] = counter[arraycell] + 1;
			found = true;
			break;
			}
		}
	}
}

This should:

1) Populate the first uniqueString[] with the first symbol
2) for every line in the file
3) Scan uniqueString[]
4) If match found, increase counter array by 1
5) If not found by end of the uniqueString[] , add to last spot
6) Once whole uniqueString[] full up, assign symbols and replace those lines in File1.

From what I've debugged, it adds the first symbol 0:0:0 to uniqueString[0].

It then scans through the rest, breaking out when the line is the same 0:0:0, but the counter doesn't increase, and once it gets past 0:0:0 it doesn't add the next string, 0:0:10.

I'm sure I'm just being stupid here, but is there anything I'm blatantly missing? I know point 6 won't work, as that's not implemented yet.

Thanks Lilly

Your algorithm is incorrect. When you read in a string, you need to check
the whole array to see whether the string exists. I suggest you use a map to
store the value as a key; then you won't have too much work, since the map
automatically does not add elements with the same key. What do you think?
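
A minimal sketch of the map suggestion, assuming one value per line as in the sorted file. The function name countOccurrences is made up for this example; std::map and its operator[] are standard.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: counts how often each whitespace-separated token
// (one "R:G:B" value per line in this thread's file) appears in the stream.
std::map<std::string, int> countOccurrences(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string lineRead;
    while (in >> lineRead)
        ++counts[lineRead];  // operator[] inserts a zero count for a new key
    return counts;
}
```

Since it takes a std::istream&, the same function should work with the std::ifstream used in the thread or with a std::istringstream for a quick test. The map keeps one entry per distinct value, so there is no fixed array size to outgrow.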

Mmm why do you think it doesn't? O.o

for(int arraycell = 0; arraycell < 10; arraycell++)	{

Should search from uniqueString[0] to uniqueString[9], which would be the entirety of the array, or am I being stupid?

Thanks Lilly

* PS - Edit: or is it that second break statement?

* Edit 2 - In terms of the map, the problem I had was that if it automatically ignores a repeated value, I wouldn't be able to increase the counter. As the sorted file is purely for counting, I need to edit the original file by replacing the "most" common values.
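
On the counter concern: a map only refuses to insert a second copy of a key; the value stored under that key can still be changed, so it can serve as the counter. Assuming the counts have already been collected in a std::map<std::string, int>, the most common values could then be picked out roughly like this. The name mostCommon and the parameter n are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns up to n (value, count) pairs, most frequent
// first. Values with equal counts keep the map's key order (stable sort).
std::vector<std::pair<std::string, int>>
mostCommon(const std::map<std::string, int>& counts, std::size_t n)
{
    // Copy the map entries into a vector so they can be reordered by count.
    std::vector<std::pair<std::string, int>> byCount(counts.begin(), counts.end());
    std::stable_sort(byCount.begin(), byCount.end(),
        [](const std::pair<std::string, int>& a,
           const std::pair<std::string, int>& b) {
            return a.second > b.second;  // higher count sorts first
        });
    if (byCount.size() > n)
        byCount.resize(n);               // keep only the top n entries
    return byCount;
}
```

The returned values would be the candidates for symbol replacement in the original file.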

int counter[10] = 0;  //initialize all counters to zero
int numUnique = 0;  //to keep track of unique strings found so far
int arrayCell;  //declare outside loops to keep in scope throughout loops

while(inTxtFile >> lineRead) 
{	
   for(arraycell = 0; arraycell < numUnique; arraycell++)	
   {			
     if (uniqueString[arraycell] == lineRead)		
     {			
        break;			
     }			
   }	
   
    //after for loop completed evaluate value of arrayCell
    if(arrayCell == numUnique)
    {
       uniqueString[arraycell] = lineRead;
    }
    
    //increase counter of appropriate element
    counter[arraycell] = counter[arraycell] + 1;
}

Hey, if you output the entire uniqueString array at the end, it only has a single value. This is the same problem mine was having :<

No idea why though >.<

Thankies Lilly

Post the code used to display uniqueString[] and indicate whether it is in main() or in a function. Also, where is uniqueString[] declared relative to where it's being displayed, in the same function or a different one?

Between lines 6 and 7 of my post (at the top of the while loop body, just before the for loop) you could output lineRead to be sure each read actually returns something, so the loop has data to work with.

string uniqueString[10];

ifstream inTxtFile("Images\\Stage2SortedRBGList.txt", std::ios::binary);
string lineRead;
bool found = false;

int counter[10] = {0};  //initialize all counters to zero
int numUnique = 0;  //to keep track of unique strings found so far
int arrayCell;  //declare outside loops to keep in scope throughout loops 

uniqueString[0] = lineRead;
counter[0] = 1;

while(inTxtFile >> lineRead) 
{	   
	for(arrayCell = 0; arrayCell < numUnique; arrayCell++)	   
	{			     
		if (uniqueString[arrayCell] == lineRead)		    
		{	
			break;			     
		}			   
	}	     //after for loop completed evaluate value of arrayCell    
	if(arrayCell == numUnique)    
	{       
		uniqueString[arrayCell] = lineRead;  
	}     //increase counter of appropriate element    
	counter[arrayCell] = counter[arrayCell] + 1;
	//cout << "Counter + 1" << endl;
	
}

for(int i = 0; i < 10; i++)	
{
cout << uniqueString[i] ;
}
system("pause");

That is the entire code for this part. It's in its own function, but it's the only one being called.

I had to change a few spellings from the code you pasted, as it was complaining about some Caps/Lowercase, but it's all the same code.

Not sure if I'm just being dumb or what >.<

Thanks Lilly

Line 32 (the final for loop that prints uniqueString) shouldn't use 10 as the terminating value of the loop, as uniqueString could hold a single unique string if all inputs are the same, 10 unique strings if the inputs are all different, or somewhere in between. You only want to display the actual number of unique strings.

Between lines 15 and 16 (at the top of the while loop body, just before the for loop) put an output statement and a pause to see what the actual input is.

cout << "this line is " << lineRead << endl;
cin.get();

This is called debugging code and should be removed once you've isolated the problem. It is a very powerful technique for trying to answer questions like this. If you are comfortable using a debugger the process is (somewhat) automated for you.

Sorry, maybe I explained this badly. I do debug the code here; I just remove all my couts when posting code.

The read-in works fine: every line of the file is read through, and there are roughly 55,000 duplicated lines (out of 200k+ total lines), so the 10 was just a test to see if the code was working.

The problem is that the array that's set to store the value of how many times a single line is repeated only ever stores the one value.

Whether this is 0:0:0 or 99:99:99 it will only ever store one.

This is why I was referring to the same problem I had with my code as yours, as it's not filling up the array with the different duplicated codes, only the "last" duplicated.

Hope that explains it a little better.

Thanks Lilly

Duh, my bad. If a unique string is identified, then numUnique needs to be increased in addition to the unique string being entered into uniqueString. In post #4, between lines 17 and 19 (inside the if(arrayCell == numUnique) block, alongside the assignment to uniqueString), put a line that looks something like this:

++numUnique;
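
Putting the pieces of the thread together, the loop with that increment in place might look roughly like the sketch below. Wrapping it in a function called countUnique, the MAX_UNIQUE constant, and skipping new values once the arrays are full are all assumptions added for illustration. Note that nothing is stored in uniqueString[0] before the loop, because the first line read is picked up as "not found" like any other new value.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

const int MAX_UNIQUE = 10;  // same test capacity as the arrays in the thread

// Hypothetical wrapper around the loop from the thread. Fills uniqueString[]
// and counter[] and returns how many unique strings were found.
int countUnique(std::istream& inTxtFile,
                std::string uniqueString[MAX_UNIQUE],
                int counter[MAX_UNIQUE])
{
    int numUnique = 0;  // unique strings found so far
    for (int i = 0; i < MAX_UNIQUE; i++)
        counter[i] = 0;

    std::string lineRead;
    while (inTxtFile >> lineRead)
    {
        int arrayCell;
        for (arrayCell = 0; arrayCell < numUnique; arrayCell++)
        {
            if (uniqueString[arrayCell] == lineRead)
                break;  // already stored at arrayCell
        }

        if (arrayCell == numUnique)  // loop ran to the end: not found
        {
            if (numUnique == MAX_UNIQUE)
                continue;  // arrays full: skip this value (an assumption)
            uniqueString[arrayCell] = lineRead;
            ++numUnique;   // the increment that was missing
        }

        counter[arrayCell] = counter[arrayCell] + 1;
    }
    return numUnique;
}
```

The return value can then serve as the upper bound of the display loop in place of the hard-coded 10.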
