Interesting name for a title, wouldn't you say? Well, I couldn't find a better way to put it. Anywho, I have been working on a new program that is designed to read in a website every so many seconds, parse the data, and update the GUI.
Now I have finally found what I believe to be a good way to read in webpages, and I have started to work with threads, more specifically BackgroundWorkers. The program itself currently contains two BackgroundWorkers: one for a time-elapsed function (that updates a label), and one for the reading and parsing part.
Okay so I'm now going to post the code, explain it, and then bring up the problem after that.
private void button1_Click (object sender, EventArgs e) //Start Button
{
if (!string.IsNullOrWhiteSpace(Properties.Settings.Default.webURL)) //makes sure we have a URL at all (covers null, empty, and whitespace-only)
{
toolStripStatusLabel1.Text = "Start Time: " + DateTime.Now.ToString();
startTime = DateTime.UtcNow;
button1.Enabled = false;
button2.Enabled = true;
if (backgroundWorker1.IsBusy != true) //thread for the elapsed time
{
backgroundWorker1.RunWorkerAsync(); // Start the asynchronous operation.
}
if (backgroundWorker2.IsBusy != true) //thread for checking page update
{
backgroundWorker2.RunWorkerAsync(); // Start the asynchronous operation.
}
}
else //no URL so don't even try to start the program
{
MessageBox.Show("There is currently no URL address for where the scoreboard is located", "No URL", MessageBoxButtons.OK, MessageBoxIcon.Error);
}
}
This code is pretty simple: click the Start button, check that there's a URL to work with, and start the BackgroundWorkers.
private void backgroundWorker2_DoWork (object sender, DoWorkEventArgs e) //backgroundWorker2 is used to read in the webpage data
{
BackgroundWorker worker2 = sender as BackgroundWorker;
while (true)
{
if (worker2.CancellationPending == true)
{
e.Cancel = true;
break;
}
else
{
worker2.ReportProgress(0);
System.Threading.Thread.Sleep(refreshTimer * 1000);
}
}
worker2.Dispose(); // this calls after the thread is canceled (and after the thread sleep occurs)
}
Pretty simple: keep the loop going, checking the website every refreshTimer seconds (5 secs, to simplify it), until the user clicks the cancel button or closes the program.
private void backgroundWorker2_ProgressChanged (object sender, ProgressChangedEventArgs e) //used to read in the webpage data and handle it
{
readInData.WebBrowser();
tempHash = generateHashCode.createHash(readInData.isDownloadedData()); //generates a hash from the data read in
if (tempHash != readInData_Hash) //if the hashes don't equal (so something changed)
{
tempTeamData.Clear();
if ((readInData.isDownloadedData() != "Invalid URL") && (readInData.isDownloadedData() != "Timed Out"))
{
tempTeamData = stripStringData.breakUp(readInData.isDownloadedData(), numOfFlags);
readInData.clearString();
readInData_Hash = tempHash; //update the hash
}
else if (readInData.isDownloadedData() == "Invalid URL") //the URL provided was invalid
{
richTextBox1.AppendText("Invalid URL" + "\n");
}
else if (readInData.isDownloadedData() == "Timed Out") //reading the site timed out
{
richTextBox1.AppendText("Reading Timed Out" + "\n");
}
}
else //no change so clear out the read in data
{
readInData.clearString();
}
}
This is what's called every five seconds. The program reads in the webpage and hashes the string; if the hash is different (meaning the webpage data changed), the data is parsed ... assuming, of course, that one of the two strings you see in the else if statements wasn't passed back instead.
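The hashing class isn't shown in this post, so here is a minimal sketch of just the change check. ChangeDetector, CreateHash, and HasChanged are made-up names, and using SHA-256 is an assumption; the generateHashCode class in the program may work differently.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, NOT the generateHashCode class from the program.
static class ChangeDetector
{
    // Hash of the last content that was accepted as "new".
    static string lastHash = "";

    // Hex-encoded SHA-256 of the page text (assumed hash function).
    public static string CreateHash(string data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(data));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    // Returns true only when the content differs from the previous call,
    // and remembers the new hash in that case.
    public static bool HasChanged(string data)
    {
        string hash = CreateHash(data);
        if (hash == lastHash)
        {
            return false;
        }
        lastHash = hash;
        return true;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(ChangeDetector.HasChanged("<p>Team A: 1</p>")); // first read counts as a change
        Console.WriteLine(ChangeDetector.HasChanged("<p>Team A: 1</p>")); // same content, no change
        Console.WriteLine(ChangeDetector.HasChanged("<p>Team A: 2</p>")); // content differs, change
    }
}
```

Comparing hashes instead of the full strings only saves keeping the previous page text around; the parse step still needs the raw data when something did change.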
//===================================================================================================================
public class readInWebpage
{
string downloadedData;
string URL;
Uri myUri;
System.Timers.Timer aTimer;
bool timerTrigger;
//-------------------------------------------------------------------------------------------------------------------
public readInWebpage (string URL)
{
this.downloadedData = "";
this.URL = URL;
this.aTimer = new System.Timers.Timer(5000);
aTimer.Elapsed += new ElapsedEventHandler(OnTimedEvent);
this.timerTrigger = false;
}
//-------------------------------------------------------------------------------------------------------------------
public string isDownloadedData ()
{
return downloadedData;
}
//-------------------------------------------------------------------------------------------------------------------
public void WebBrowser ()
{
myUri = null;
timerTrigger = false; //a timer system to allow the process to only work 5 secs (preventing an infinite loop)
aTimer.Interval = 5000;
aTimer.Enabled = true;
aTimer.Start();
if (Uri.TryCreate(URL, UriKind.Absolute, out myUri)) //checks to make sure the URL is even valid
{
WebBrowser wb = new WebBrowser();
wb.Navigate(URL);
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
while ((wb.ReadyState != WebBrowserReadyState.Complete) && (timerTrigger == false))
{
Application.DoEvents();
}
if (timerTrigger == false)
{
aTimer.Stop();
downloadedData = wb.Document.Body.InnerHtml; //Added this line, because the final HTML takes a while to show up
}
else
{
downloadedData = "Timed Out";
}
wb.Dispose();
}
else //invalid URL
{
downloadedData = "Invalid URL";
}
//===================================================================================================================
So here's what it all comes down to. This program works rather nicely, especially for what I need. The problem is the Application.DoEvents(). This is a dangerous line of code, but without it the program locks up: it does its job, but the GUI doesn't get updated and the user can't interact with it (for some reason the BackgroundWorkers don't count for threading here ... even though they're being used, but whatever).
Now the code here does check that the URL is valid before going into all this work, but there's a problem. Sometimes a website can appear valid, but trying to access it is met with a super long load time that eventually leads to "page cannot be displayed", or the page never stops loading.
This is where Application.DoEvents() becomes dangerous. This little line (from what I have come to make out) pretty much goes "okay program, keep running, don't get held up on this webpage loading" (it's like multiprogramming, dang you OS class). Well, if a webpage never loads completely, I could have thread upon thread opening up.
So I came up with what I felt would be a fix: place a timer on that while statement, more specifically a boolean variable, as you see in the code. When the timer goes off it triggers an event, changes the boolean, and we're all good to go, problem avoided ... except it didn't work that way. Something to do with Application.DoEvents() is blocking it. When I commented out this line and placed a breakpoint on the timer's elapsed event, boom, it hit the breakpoint (of course the whole GUI locks up, so that's not going to work), and when I left the line in, the timer never went off.
Well, that's a lot to read, and I hope you guys are still here with me (details mean better help, in my opinion). My question is: is there a way I can implement a timer that will actually work while still leaving in the Application.DoEvents()? Or is there some other option for my build that can allow me the luxury of these features? Again, I am kind of new to threading and have had issues in the past with reading in websites.
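Stripped of the WinForms parts, the pattern I'm going for looks like this minimal sketch. It's a console program, so TimeoutDemo, WaitForCompletion, and the isDone delegate are all made-up names standing in for the real code, and Thread.Sleep is only a stand-in for Application.DoEvents(). It shows the intended behavior, not why the timer fails to fire inside the form.

```csharp
using System;
using System.Threading;
using System.Timers;

static class TimeoutDemo
{
    // volatile: written by the timer's thread-pool thread, read by the polling loop.
    static volatile bool timerTrigger;

    // Polls isDone until it returns true or timeoutMs milliseconds elapse.
    // Returns true if the work finished, false if the timer cut it off.
    public static bool WaitForCompletion(Func<bool> isDone, double timeoutMs)
    {
        timerTrigger = false;
        using (var timer = new System.Timers.Timer(timeoutMs))
        {
            timer.AutoReset = false;                        // fire once only
            timer.Elapsed += (s, e) => timerTrigger = true; // flip the flag on timeout
            timer.Start();

            bool done = isDone();
            while (!done && !timerTrigger)
            {
                Thread.Sleep(10);                           // stand-in for Application.DoEvents()
                done = isDone();
            }

            timer.Stop();
            return done;
        }
    }

    static void Main()
    {
        // Simulated page that becomes "ready" after about 200 ms.
        DateTime readyAt = DateTime.UtcNow.AddMilliseconds(200);
        bool loaded = WaitForCompletion(() => DateTime.UtcNow >= readyAt, 2000);
        Console.WriteLine("Fast page finished: " + loaded);

        // Simulated page that never finishes loading; the timer ends the wait.
        bool stuck = WaitForCompletion(() => false, 300);
        Console.WriteLine("Stuck page finished: " + stuck);
    }
}
```

In the console version the Elapsed handler runs on a thread-pool thread, so the flag flips even while the loop is busy polling. That is the behavior I expected from the timer in readInWebpage as well.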
Thanks in advance for any and all help.