Hi All,
I've got a VC++ DLL that uses Boost 1_34_1 regular expressions. I am a programmer, but I'm not familiar with VC++ or regex.
The program worked fine until the target HTML format (sPage) was changed, so I needed to update the regexes. I checked each new regex individually with RegexBuddy and they all work. I then replaced the old regexes with the new ones, but now, after the regex_split call, the output container (oMessageInfo) holds only the output of the first regular expression.
I looked at the Boost docs ( http://www.boost.org/doc/libs/1_31_0/libs/regex/doc/regex_split.html ) for regex_split, which say:
"Effects: Each version of the algorithm takes an output-iterator for output, and a string for input. If the expression contains no marked sub-expressions, then the algorithm writes one string onto the output-iterator for each section of input that does not match the expression. If the expression does contain marked sub-expressions, then each time a match is found, one string for each marked sub-expression will be written to the output-iterator."
Unfortunately I have no idea what a 'marked sub-expression' is.
Hopefully someone here can spot the schoolboy mistake.
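// Adjacent string literals are concatenated into one pattern.
// Regex backslashes are doubled so they survive C++ string-literal escaping.
const std::string sMessages1 =
    "<td class=\"msgnumh smalltype\">\\#([0-9]*)<\\/td>"
    "(?:[\\s\\S]*?)From:(?:<\\/em>)?(?:<\\/span>)?(?:[\\s]*)?(?:\")?([^&]*)(?:\")?(?:[\\s]*)?<([^&]*)>"
    "(?:[\\s\\S]*?)Date:(?:<\\/em>)?(?:<\\/span>)?[\\s]*([^<]*)(?:<br>)?";
// Case-insensitive match using the default (normal) syntax.
boost::regex oRegExMessages1(sMessages1, boost::regbase::normal | boost::regbase::icase);
// sPage is the HTML page; results are appended to the oMessageInfo container.
boost::regex_split(std::back_inserter(oMessageInfo), sPage, oRegExMessages1);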
Cheers,
Wilson.