I'm trying to rewrite a PHP function in C++ using Boost. I have NOT seen code for this anywhere on the net. I think it would be useful and I need it in C++, but I need some help.

Definition of the function in C++:

void preg(string pattern, string subject, string &matches, int flags, int offset)       //matches should really be a 2D array, a map, or a 2D vector.
{

}

The function I want to rewrite in C++:
http://php.net/manual/en/function.preg-match-all.php


My attempt (takes in a string, matches it against an expression, and returns the match):

string ID;           //Match to be returned.
string Source;              //Source of data to match against.
boost::regex expression("(.*)");     //A pattern to match.



void preg_match_all(string Source, boost::regex &expression, string &ID)
{
       try
       {
           std::string::const_iterator start, end;
           start = Source.begin();
           end = Source.end();
           boost::smatch what;
           boost::match_flag_type flags = boost::match_default;

           //Match against the data; if a match is found, store it in ID.

           while(boost::regex_search(start, end, what, expression, flags))
           {
                //Destination = boost::regex_replace(Source, expression, "");
                ID = what[0];
                start = what[0].second;
           }
       }
       catch(const exception &e)
       {
           cout<<"Exception caught in preg_match_all: "<<e.what()<<"\n\n";
       }

    return;
}

Problem: I want it to return an array of every match it finds. At the moment it returns only one match.

Example Data:

<td vsgsgs> whatever else here</td><td>fsgsgs</td><td aifngn;aga></td>

It will return that whole thing into the string ID. I want it to return:

array[0] = <td vsgsgs> whatever else here</td>
array[1] = <td>fsgsgs</td>
array[2] = <td aifngn;aga></td>

Can you guys help me write a regex to match this? It should find the 3 <td>...</td> elements in there.

The thing is, they aren't separated by spaces or anything, so I'm not sure how to get them.

I used <td(.*)</td> and it finds one match. Any ideas?

<td class="wsod_change" nowrap="nowrap"><span stream="arrow_599362" streamFeed="SunGard"><img src="http://i.cdn.turner.com/money/.element/img/3.0/data/arrowDown.gif" align="absmiddle" /></span>&nbsp;<span stream="change_599362" streamFeed="SunGard"><span class="negData">-74.70</span></span><span class="wsod_grey">&nbsp;/&nbsp;</span><span stream="changePct_599362" streamFeed="SunGard"><span class="negData">-0.61%</span></span><div class="wsod_quoteLabel wsod_quoteLabelChange">Today&rsquo;s Change</div></td><td class="wsod_52week" nowrap="nowrap"><div class="wsod_quoteRanger clearfix"><div title="10/04/2011" class="val lo">10,404</div><div class="wsod_quoteRangerMeter"><span class="wsod_qRangeInidicator" style="left:53px;">Today</span><span class="wsod_qRangeBarOuter"><span class="wsod_qRangeBarInner">|||</span></span><span class="wsod_quoteLabel52WkChg">52-Week Range</span></div><div title="05/02/2011" class="val hi">12,876</div></div></td><td class="wsod_ytd" nowrap="nowrap" ytdVal="4.331414958829656"><span class="posData">+4.33%</span><div class="wsod_quoteLabel wsod_quoteLabelYTD">Year-to-Date</div></td>

> But I used <td(.*)</td> .. and it finds one match.. Any Ideas?

With the given text, it should find just one match. The repeat operator in (.*) is greedy; it will consume as much input as possible, so the match runs from the first <td all the way to the last </td>.

Use the non-greedy repeat operator instead: <td.*?</td>
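
To see the difference between the two operators on the sample line from the question, here is a minimal sketch. It assumes std::regex from the standard library in place of boost::regex so that it builds without Boost; the regex_search and smatch calls have the same shape in both. The helper name first_match and the variable sample are made up for illustration.

```cpp
#include <regex>
#include <string>

// Sample input from the question: three cells with nothing between them.
const std::string sample =
    "<td vsgsgs> whatever else here</td><td>fsgsgs</td><td aifngn;aga></td>";

// Hypothetical helper: returns the first full match of `pattern` in `text`,
// or an empty string when nothing matches.
std::string first_match(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch what;
    return std::regex_search(text, what, re) ? what.str() : std::string();
}

// first_match(sample, "<td(.*)</td>")  greedy: the match is the whole line
// first_match(sample, "<td.*?</td>")   lazy:   the match stops at the first </td>
```

With the greedy pattern the first match already swallows all three cells, which is why the loop in the question only ever sees one result.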

To collect every match rather than just one, use boost::sregex_iterator, perhaps?

Like this:

#include <string>
#include <vector>
#include <boost/regex.hpp>

std::vector<std::string> preg_match_all( const std::string& str, const boost::regex& regex )
{
    std::vector<std::string> matches ;
    boost::sregex_iterator begin( str.begin(), str.end(), regex ), end ;
    for( ; begin != end ; ++begin ) matches.push_back( begin->str() ) ;
    return matches ;
}
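
The signature in the question asked for a 2D result, like the one PHP's preg_match_all fills in. A rough sketch of that idea follows, with one row per match and the capture groups in each row, similar in spirit to PHP's PREG_SET_ORDER. It is only an illustration: the name preg_match_all_sets is made up, the flags and offset parameters are left out, and std::regex stands in for boost::regex so that it compiles without Boost (the iterator interface is analogous).

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: one row per match. row[0] is the full match and
// row[1..] are the capture groups, in the order they appear in the pattern.
std::vector<std::vector<std::string>> preg_match_all_sets(const std::string& subject,
                                                          const std::regex& pattern)
{
    std::vector<std::vector<std::string>> rows;
    for (std::sregex_iterator it(subject.begin(), subject.end(), pattern), end;
         it != end; ++it)
    {
        std::vector<std::string> row;
        for (std::size_t i = 0; i < it->size(); ++i)
            row.push_back((*it)[i].str());   // unmatched groups yield ""
        rows.push_back(row);
    }
    return rows;
}
```

Called with the sample line and a pattern such as <td[^>]*>(.*?)</td>, this should give three rows, where row[0] holds the whole cell and row[1] holds the text between the tags. For real-world HTML with nested or multi-line cells, a regex like this is fragile and an HTML parser is probably the safer choice.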