Member Avatar for Griff0527

I've searched the C++ forums and cannot find anything that relates to my question. I am writing a program (for class) that opens a txt file and reads doubles from the file, then computes the standard deviation of the numbers in the file. In order to do this, I am opening the file, reading the numbers in and computing the average, then closing the file, outputting the average to the screen, reopening the file (in order to pull the data from the beginning again) and then computing standard deviation. My code "LOOKS" 100% correct to me, but in running

if (inStream.fail())

the second time at lines 37 and 38, I get a failure opening the file after I have closed it at line 32. Can someone help me figure out what my error is? I appreciate it.

#include <fstream>
#include <iostream>
#include <cstdlib>
#include <cmath>
using namespace std;

int main()
{
	cout << "This program reads a list of numbers from a text file\n"
		<< "and then displays the average and the standard deviation\n"
		<< "of the numbers in the file.  If any errors occur in the\n"
		<< "reading of the file, you will be notified.\n\n"
		<< "Starting data read now......\n\n"; // General text to describe program

	ifstream inStream("numbersIN.txt"); //set input stream name and opens it
	if (inStream.fail()) //test for file open failure
	{
		cout << "Opening input file failed.\n"; // output to screen if file does not open
		exit(1);
	}

	double next; // variable to hold next number
	int counter = 0; // counter to store number of numbers read
	double total = 0; // variable to hold total of all numbers

	while (inStream >> next) // while there is a next number to be read (not EOF)
	{
		total += next; // add numbers from file and store in total
		counter++; // add 1 to counter
	}

	inStream.close( ); // close input file
	double avg = (total/counter); // calculate the average

	cout << "The Average is " << avg << endl;

	inStream.open("numbersIN.txt"); // reopen file at the beginning
	if (inStream.fail()) //test to ensure file opens correctly again
	{
		cout << "Opening input file second time failed.\n";
		exit(1);
	}
	double stdDev; // variable to hold the standard deviation
	double dev;  // placeholding variable to store deviation of number being read from file
	double devSqrd; // variable to hold squared deviations
	double totalDevSqrd = 0; // variable to hold sum of all squared deviations
	while (inStream >> next);
	{
		dev = next - avg; // calculate deviation of number by subtracting the average from the number
		devSqrd = dev * dev; // square the result of the deviation
		totalDevSqrd += devSqrd; // adds the result to totalDevSqrd
	} // exit while loop
	inStream.close(); // close the input file
	stdDev = sqrt((totalDevSqrd / (avg - 1))); /* standard deviation equals the square root of 
										 the total square deviations divided by the average minus 1 */
	cout.setf(ios::fixed | ios::showpoint | ios::precision(6);

	cout << "The average of the numbers in the file is " << avg << endl;
	cout << "The standard deviation of the numbers in the file is " << stdDev << endl;

	return (0);
}

Use the .is_open() method on lines 16 and 38 rather than fail(). fail() checks something different: it is also set when, for example, your stream is trying to read in doubles and encounters a letter in the file, or when a read runs off the end of the file. Your first loop ends exactly that way, and on pre-C++11 standard libraries close() and open() do not reset the flag, so fail() can still report an error even though the second open succeeded. See http://www.cplusplus.com/reference/iostream/ios/fail/

Also, you could just read the data into an array (as long as it's not overwhelmingly huge) and do both of your calculations off that (unless your assignment precludes this).
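
For what it's worth, here is a minimal sketch of the read-it-once-into-memory idea, assuming the numbers fit comfortably in memory. It uses a std::vector instead of a raw array, and the function names (readNumbers, mean, sampleStdDev) are made up for illustration; they are not from the original program.

```cpp
#include <cmath>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read every number from the named file into memory.
// Returns an empty vector if the file cannot be opened or holds no numbers.
std::vector<double> readNumbers(const std::string& path)
{
    std::vector<double> values;
    std::ifstream in(path.c_str());
    double next;
    while (in >> next)          // stops at end of file or at the first non-number
        values.push_back(next);
    return values;
}

// Arithmetic mean. The caller must pass at least one value.
double mean(const std::vector<double>& v)
{
    double total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total / v.size();
}

// Sample standard deviation (divides by n - 1). Needs at least two values.
double sampleStdDev(const std::vector<double>& v)
{
    const double avg = mean(v);
    double totalDevSqrd = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const double dev = v[i] - avg;   // deviation of this value from the mean
        totalDevSqrd += dev * dev;
    }
    return std::sqrt(totalDevSqrd / (v.size() - 1));
}
```

With the data held in a vector, the file is opened only once, so the question of reopening or rewinding the stream does not come up at all.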

Hello

Could it be that you have mixed up counter and avg in:

stdDev = sqrt((totalDevSqrd / (avg - 1))); /* standard deviation equals the square root of the total square deviations divided by the average minus 1 */

// shouldn't it be?
	stdDev = sqrt((totalDevSqrd / (counter - 1))); //... divided by counter-1

-- tesu

By the way, it is not necessary to read the data file twice, because the average and the standard deviation can be calculated together within the first loop. You can extend your code as follows:

// After         double total = 0; // variable to hold total of all numbers

// add:
   double qtotal = 0;   // variable to hold the total squares of all numbers


// After 	total += next; // add numbers from file and store in total

// add:
   qtotal += next*next; // add the squares of the numbers


// After        double avg = (total/counter); // calculate the average

// add:
   double std = sqrt((qtotal - counter * avg)/(counter - 1)); // calculate the standard deviation
   cout << "The standard deviation is " << std; 

/*
The identity 

     sqrt(SUM(avg - next)^2/(counter - 1)) == sqrt((qtotal - counter * avg)/(counter - 1))

can easily be shown by multiplying out the summation of (avg - next)^2. 

If a large quantity of data is processed this way and the standard deviation is very small, the term under the square root could fall slightly below zero because of rounding errors. If that happens, take its absolute value before calling sqrt. And of course, if there is only one number (counter = 1), std is not defined.
*/

Try this new formula; it saves reading the file twice and removes the second loop.

-- tesu
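
On the rounding concern mentioned in the comment above: one commonly used alternative for a single pass is Welford's online algorithm, which updates the mean and the sum of squared deviations incrementally instead of subtracting two large sums at the end. To be clear, this is a different technique from the one tesu proposes, shown only as a sketch. The names RunningStats and summarize are hypothetical, and returning 0.0 when fewer than two values were read is an assumption made here for convenience (the standard deviation is really undefined in that case).

```cpp
#include <cmath>
#include <cstddef>
#include <istream>

// Running mean and spread, updated one value at a time (Welford's method).
struct RunningStats {
    std::size_t count;  // number of values seen so far
    double mean;        // mean of the values seen so far
    double m2;          // sum of squared deviations from the current mean

    RunningStats() : count(0), mean(0.0), m2(0.0) {}

    void add(double x)
    {
        ++count;
        const double delta = x - mean;  // deviation from the old mean
        mean += delta / count;          // update the mean
        m2 += delta * (x - mean);       // uses old and new mean, never negative
    }

    // Sample standard deviation (divides by n - 1).
    // Assumption: returns 0.0 as a placeholder when fewer than two values exist.
    double sampleStdDev() const
    {
        return count < 2 ? 0.0 : std::sqrt(m2 / (count - 1));
    }
};

// Feed every number readable from the stream into a RunningStats object.
RunningStats summarize(std::istream& in)
{
    RunningStats stats;
    double next;
    while (in >> next)
        stats.add(next);
    return stats;
}
```

Because m2 only ever has non-negative terms added to it, the value passed to sqrt cannot drop below zero, so no abs() workaround is needed.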

Standard dev is the square root of Sum((datapt - mean)^2)/(counter - 1), so unfortunately you can't just add the squared data and then subtract the mean squared, since there is a cross term of -2*datapt*mean when you square it out. You need to hold all the data, then calculate the mean, then do the calculation.


Sorry, even a phrenologist is allowed to make a mistake. Unfortunately, your implication is incorrect.
Proof: ;)

Let the standard deviation s be defined by:

s^2 = sum (a - xi)^2 /( n - 1)

By multiplying out the sum we get:

(n-1)*s^2 = sum(a^2) - 2*sum (a * xi) + sum(xi^2)
(n-1)*s^2 = sum(a^2) - 2*a*sum(xi) + sum(xi^2)
(n-1)*s^2 = sum(a^2) - 2*a*n*a + sum(xi^2)
(n-1)*s^2 = sum(xi^2) - n*a^2

and finally:

s = sqrt((sum(xi^2) - n*a^2)/(n-1)); 

(where n is the number of measured values, xi is a single measured value, a is the average sum(xi)/n, s is the standard deviation of the measurement series, ^ stands for "power of", and sqrt() is the square root)

Try it with an example:

n = 5
i      1    2    3    4    5    sum
xi     1    2    3    4    5    15
xi^2   1    4    9   16   25    55

a = 15 / 5 = 3
s = sqrt((55 - 5 * 3^2)/(5-1)) = sqrt((55 - 45) / 4) = sqrt(10/4) = 1.58

Check it the inconvenient way:

s = sqrt(((3-1)^2 + (3-2)^2 + (3-3)^2 + (3-4)^2 + (3-5)^2)/(5-1)) = sqrt((4 + 1 + 0 + 1 + 4)/4) = sqrt(10/4) = 1.58

You can also check these facts on Wikipedia. Look at the last formula of the paragraph "Identities and mathematical properties", which lists my formula above for a finite population with equal probabilities on all points.

Great! This site shows the identity in a mathematically correct way including a full proof!

-- tesu
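
If you would rather let the compiler do the arithmetic, here is a small sketch that computes the standard deviation both ways so the two results can be compared. The function names stdDevTwoPass and stdDevSumOfSquares are made up for this example, and both functions assume at least two values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// The "inconvenient" way: s = sqrt( sum((a - xi)^2) / (n - 1) ).
double stdDevTwoPass(const std::vector<double>& x)
{
    const double n = static_cast<double>(x.size());
    double total = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        total += x[i];
    const double a = total / n;             // the average

    double sumSqDev = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        sumSqDev += (a - x[i]) * (a - x[i]);
    return std::sqrt(sumSqDev / (n - 1));
}

// The identity: s = sqrt( (sum(xi^2) - n*a^2) / (n - 1) ).
double stdDevSumOfSquares(const std::vector<double>& x)
{
    const double n = static_cast<double>(x.size());
    double total = 0;    // sum(xi)
    double qtotal = 0;   // sum(xi^2)
    for (std::size_t i = 0; i < x.size(); ++i) {
        total += x[i];
        qtotal += x[i] * x[i];
    }
    const double a = total / n;
    return std::sqrt((qtotal - n * a * a) / (n - 1));
}
```

For the values 1, 2, 3, 4, 5 both functions evaluate sqrt(10/4), matching the table above.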

commented: I concede +4
Member Avatar for Griff0527

Thanks Tesu, good catch on that. I hadn't gotten to debugging the actual logic of the mathematics yet. That one probably would have had me stumped for a little while.

Member Avatar for Griff0527

Thank you jonsca. I really appreciate the reference link as well. I was able to get it to stop failing on the opening of the file the second time, added some debugging cout's to the code and now I see my issue. When I reopen the file, for some reason it only reads through the first number and stops which throws the calculation of the standard deviation way off. However, I'm going to continue messing with it for a bit before I post my "new" code or ask for more help on resolving this issue as during my debugging I added lines to print out the data that is read in and on the first loop it reads all 40 numbers, but on the second loop (after close and reopen) the only value read from the file is the LAST value.

Member Avatar for Griff0527

Ok, I'm not sure whether people who help with code actually copy, paste, and run it or just look it over, so I am going to explain what I have done and what the error is now. (By the way, I am curious whether anyone does try to run the code.) I changed the tests on the input file and added several cout lines to see what my program is reading. On the first opening, the program reads (and displays) all the numbers from the text file. On the second opening, only the LAST number from the file is read into the program. I have added .clear() and .close(), followed by .open() and .seekg(0, ios::beg). I know that is redundant, but I was trying to see if any of them would make the program read from the beginning of the file again. Here is the code as it stands now with all the additional cout lines, etc. Can someone help me figure out why the second open does not go back to the beginning of the file? From everything I have found, either the .clear() or the .seekg(0, ios::beg) should reset the input stream to the beginning of the file, and it is not working. Thanks again.

#include <fstream>
#include <iostream>
#include <cstdlib>
#include <cmath>
using namespace std;

int main()
{
	cout << "This program reads a list of numbers from a text file\n"
		<< "and then displays the average and the standard deviation\n"
		<< "of the numbers in the file.  If any errors occur in the\n"
		<< "reading of the file, you will be notified.\n\n"
		<< "Starting data read now......\n\n"; // General text to describe program

	ifstream inStream("numbersIN.txt", ios::in); //set input stream name and opens it
	if (! inStream.is_open()) //test for file open failure
	{
		cout << "Opening input file failed.\n"; // output to screen if file does not open
		exit(1);
	}

	double next; // variable to hold next number
	int counter = 0; // counter to store number of numbers read
	double total = 0; // variable to hold total of all numbers

	while ((inStream >> next)) // while there is a next number to be read (not EOF)
	{
		total += next; // add numbers from file and store in total
		counter++; // add 1 to counter
		cout << counter << " The next number is: " << next << endl;
	}
	inStream.clear(); // makes program "forget" it hit eof
	inStream.close( ); // close input file
	double avg = (total/counter); // calculate the average



	cout << "The Average is " << avg << endl;

	inStream.open("numbersIN.txt", ios::in); // reopen file
	inStream.seekg(0, ios::beg); // set to beginning of file

	if (! inStream.is_open()) //test to ensure file opens correctly again
	{
		cout << "Opening input file second time failed.\n";
		exit(1);
	}
	double stdDev; // variable to hold the standard deviation
	double dev;  // placeholding variable to store deviation of number being read from file
	double devSqrd; // variable to hold squared deviations
	double totalDevSqrd = 0; // variable to hold sum of all squared deviations
	while (inStream >> next);
	{
		cout << "The next input number is: " << next << endl;
		dev = next - avg; // calculate deviation of number by subtracting the average from the number
		cout << "dev equals " << dev; // test for results
		devSqrd = dev * dev; // square the result of the deviation
		cout << "     devSqrd equals " << devSqrd;
		totalDevSqrd += devSqrd; // adds the result to totalDevSqrd
		cout << "     totalDevSqrd equals " << totalDevSqrd << endl;
	} // exit while loop
	inStream.close(); // close the input file
	stdDev = sqrt(totalDevSqrd / (counter - 1)); /* standard deviation equals the square root of 
										 the total squared deviations divided by the count minus 1 */
	cout.setf(ios::fixed | ios::showpoint);
	cout.precision(6);

	cout << "The average of the numbers in the file is " << avg << endl;
	cout << "The standard deviation of the numbers in the file is " << stdDev << endl;

	return (0);
}

If I replace line 52 with

while (! inStream.eof());

The program will hang until CTRL+BREAK is pressed and then exit. This proves to me that the file is not resetting back to the beginning of the file.
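
For reference, here is a minimal sketch of rewinding a stream that has already been read to the end, without closing and reopening it. The helper names countNumbers and countOnSecondPass are hypothetical and exist only for this illustration. The one detail that matters is the order: clear() has to come before seekg(), because seekg() does nothing on a stream whose fail state is still set.

```cpp
#include <fstream>
#include <string>

// Count how many numbers can be read from the stream's current position.
int countNumbers(std::ifstream& in)
{
    int n = 0;
    double next;
    while (in >> next)
        ++n;
    return n;
}

// Read the file twice through one stream object and return the number of
// values seen on the second pass, or -1 if the file cannot be opened.
int countOnSecondPass(const std::string& path)
{
    std::ifstream in(path.c_str());
    if (!in.is_open())
        return -1;

    countNumbers(in);            // first pass ends with eofbit and failbit set

    in.clear();                  // reset the state flags first...
    in.seekg(0, std::ios::beg);  // ...then move back to the start of the file

    return countNumbers(in);     // second pass sees every number again
}
```

If the second pass of a sketch like this reports the full count, the rewind itself is working and the cause of a wrong result has to be somewhere else in the code.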

Sorry, even a phrenologist is allowed to make a mistake.

Yes. It's an aperiodic event when it does happen :)

Unfortunately, your implication is incorrect.

Granted. Apologies to the OP.

Ok, I'm not sure whether people who help with code actually copy, paste, and run it or just look it over, so I am going to explain what I have done and what the error is now.

If it's a compilable piece I usually try to run it, or if it's a function that I can couple with a small main to test.

The whole idea that tesuji was trying to show you would eliminate the need for 2 separate file openings:

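while(inFile>>next)
{
   total+=next;          // running sum of the values
   totalsq+=next*next;   // running sum of the squared values
   counter++;            // number of values read
}

ave = total/counter;     // the mean uses the running sum, not the last value read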

Then use the formula on line 14 of tesuji's proof. ( sum(xi^2) is totalsq ).

Your problem with your current code is on line 52:

while (inStream >> next);
commented: Excellent at guiding in the right direction while making you think for yourself. +1
Member Avatar for Griff0527

In the lines stating:

double std = sqrt((qtotal - counter * avg)/(counter - 1));
cout << "The standard deviation is " << std;

I found an error and once corrected, this worked brilliantly. I had to sit down with paper and work out the proof for myself before I realized the missing variable. Once spotted I saw it quite easily in comparing this bit of code and the other post where you give the equation on line 14 of your post. This is what I came up with.

double std = sqrt((qtotal - counter * avg *avg)/(counter - 1));
cout << "The standard deviation is " << std;
commented: Thank you for this correcting +3

He had the a^2 there. Doesn't matter, glad you got it working.

Member Avatar for Griff0527

This is the final version of my code. I am going to keep working on the other version where I open the file twice, just because it is irritating me that I can't get the same file to open two times. I'm stubborn like that. I know it can be done, I'm just not sure how. Thank you all again for guiding me in the right direction on this bit of code.

#include <fstream>
#include <iostream>
#include <cstdlib>
#include <cmath>
using namespace std;

int main()
{
	cout << "This program reads a list of numbers from a text file\n"
		<< "and then displays the average and the standard deviation\n"
		<< "of the numbers in the file.  If any errors occur in the\n"
		<< "reading of the file, you will be notified.\n\n"
		<< "Starting data read now......\n\n"; // General text to describe program

	ifstream inStream("numbersIN.txt", ios::in); //set input stream name and opens it

	if (! inStream.is_open()) //test for file open failure
	{
		cout << "Opening input file failed.\n"; // output to screen if file does not open
		exit(1);
	}

	double next; // variable to hold next number
	int counter = 0; // counter to store number of numbers read
	double total = 0; // variable to hold total of all numbers
	double qtotal = 0; // variable to hold the total squares of all numbers

	while ((inStream >> next)) // while there is a next number to be read (not EOF)
	{
		total += next; // add numbers from file and store in total
		qtotal += next*next; // add the squares of the numbers
		counter++; // add 1 to counter
	}

	inStream.close( ); // close input file
	double avg = (total/counter); // calculate the average
	double std = sqrt((qtotal - counter * avg * avg)/(counter - 1)); // calculate the standard deviation
 
	cout.setf(ios::fixed | ios::showpoint);
	cout.precision(6);

	cout << "The Average is " << avg << endl;
	cout << "The standard deviation is " << std << endl <<endl;

	return (0);
}

If it's a compilable piece I usually try to run it, or if it's a function that I can couple with a small main to test.

Your problem with your current code is on line 52:

while (inStream >> next);

That was the problem with the other version: you had a ; after the while statement, so the loop ran only the inStream >> next part until the reads were exhausted, and the block inside the { } ran just once after that was complete.
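
To see the effect in isolation, here is a small sketch that counts how often the braced block runs with and without the stray semicolon. It reads from a std::istringstream rather than a file so that it needs no input file, and both function names are invented for the demonstration.

```cpp
#include <sstream>
#include <string>

// Intended loop: the braced body runs once for every number read.
int bodyRunsWithoutSemicolon(const std::string& data)
{
    std::istringstream in(data);
    double next = 0;
    int runs = 0;
    while (in >> next)
    {
        ++runs;
    }
    return runs;
}

// Stray ';': the while statement ends at the semicolon, so the loop itself
// has an empty body and consumes all the input. The braces that follow are
// an ordinary block that runs exactly once afterwards.
int bodyRunsWithSemicolon(const std::string& data)
{
    std::istringstream in(data);
    double next = 0;
    int runs = 0;
    while (in >> next);
    {
        ++runs;
    }
    return runs;
}
```

Given the input "1 2 3 4", the first function returns 4 and the second returns 1, which is the same pattern as a second loop that appears to process only one value. Some compilers can warn about an empty loop body like this, so it is worth building with warnings enabled.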

Member Avatar for Griff0527

Wow... the simple oversights. Thanks again. I can't believe I missed that.
