Hi,
I am developing a WinForms app in 2010 Express. I want to strip superscript and subscript characters out of the strings that I cut and paste from web pages.
For example:
1 I, Nephi, having been aborn of bgoodly cparents, therefore I was dtaught somewhat in all the learning of my father; and having seen many eafflictions in the course of my days, nevertheless, having been highly favored of the Lord in all my days; yea, having had a great knowledge of the goodness and the mysteries of God, therefore I make a frecord of my proceedings in my days.
The 'a' in aborn and the 'b' in bgoodly and the 'c' in cparents are superscript characters.
I am using regexp to get rid of any excess white space:
    public void FormatText()
    {
        string rtbTemp; // Clipboard text before the whitespace cleanup.
        Regex r = new Regex(@"\s+"); // Matches any run of whitespace.
        IDataObject iData = Clipboard.GetDataObject();
        // try/catch here ... error trapping for Windows Clipboard errors
        try
        {
            // GetDataObject() returns null when the clipboard is empty.
            if (iData != null && iData.GetDataPresent(DataFormats.Text))
            {
                rtbTemp = (String)iData.GetData(DataFormats.Text);
                // Collapse each whitespace run into a single space.
                rtbVerse.Text = r.Replace(rtbTemp.Trim(), " ");
                ClipboardOk = true;
            }
            else
            {
                ClipboardOk = false;
            }
        }
        catch (Exception ex)
        {
            ClipboardOk = false; // Do not report success after a failure.
            MessageBox.Show(ex.Message);
        }
    }
but I'm not familiar enough with regular expressions to know whether they can detect superscript/subscript characters. Has anyone had any experience with stripping superscripts before?
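
One thing worth noting: once the clipboard contents are read as DataFormats.Text, the superscript formatting is already gone, so a regex has nothing left to match on. Below is a minimal sketch of one possible approach, assuming the web page marks the footnote letters with <sup> and <sub> elements and that the browser also puts an HTML version on the clipboard (DataFormats.Html). The class and method names (MarkupStripper, ExtractFragment, StripSupSub) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: the names here are illustrative only.
public static class MarkupStripper
{
    // Matches a whole <sup>...</sup> or <sub>...</sub> element, content included.
    // Assumption: these elements are not nested inside one another.
    private static readonly Regex SupSub = new Regex(
        @"<(sup|sub)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches any remaining tag or comment, e.g. <span class="x"> or <!--...-->.
    private static readonly Regex AnyTag = new Regex(@"<[^>]+>");

    // Matches any run of whitespace, as in the original FormatText().
    private static readonly Regex Whitespace = new Regex(@"\s+");

    // The clipboard HTML format wraps the copied markup in
    // <!--StartFragment--> ... <!--EndFragment--> after a text header.
    // Returns the part between the markers, or the input unchanged
    // when the markers are missing.
    public static string ExtractFragment(string clipboardHtml)
    {
        const string start = "<!--StartFragment-->";
        const string end = "<!--EndFragment-->";
        int s = clipboardHtml.IndexOf(start, StringComparison.OrdinalIgnoreCase);
        int e = clipboardHtml.LastIndexOf(end, StringComparison.OrdinalIgnoreCase);
        if (s < 0 || e < 0) return clipboardHtml;
        s += start.Length;
        if (e < s) return clipboardHtml;
        return clipboardHtml.Substring(s, e - s);
    }

    // Removes sup/sub elements with their content, strips the other tags,
    // decodes entities such as &amp;, then collapses whitespace.
    public static string StripSupSub(string html)
    {
        string text = SupSub.Replace(html, "");
        text = AnyTag.Replace(text, "");
        text = WebUtility.HtmlDecode(text);
        return Whitespace.Replace(text, " ").Trim();
    }
}
```

In FormatText() this could be tried by checking iData.GetDataPresent(DataFormats.Html) first, reading the string with iData.GetData(DataFormats.Html), and passing it through ExtractFragment and then StripSupSub, falling back to the existing DataFormats.Text branch when no HTML is available. A regex is not a real HTML parser, so this is only likely to hold up on simple, well-formed markup.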