Hi, I am trying to write a method that looks for the word 'not' before another word. These words, which are either positive or negative, are stored in my database. For example: "this is not good". The problem I'm having is getting the words from my DB to be identified.

$pos = mysql_query("SELECT word FROM positive");
$neg = mysql_query("SELECT word FROM negative");
$poswords = $pos['word'];
$negwords = $neg['word'];


$find = $review_text;
if (preg_match("/(?<=not) $negwords/i", $find)) 
   {
    echo $good++;
   }
  if (preg_match("/(?<=not) $poswords/i", $find)) 
   {
   echo $bad++;
   }

Is it always not?

Could it also be "isn't" or "aren't"?

For what I am trying to do I need to use 'not', but other words will be brought in later, e.g. 'very'.


The reason I was asking is that if you want a simple solution for not - fine. But if you want something a little more flexible, contributors need to know otherwise a lot of time could be wasted.

Don't think your code will work.
How many good / bad words do you have? This may determine which method of attack to employ.

Your preg works just fine if the variable holds a single word. I think we discussed this in another thread. Are you sure it's not caused by something else?
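
With around 200 words per list, one option is to join them into a single alternation instead of interpolating one word at a time. The following is only a sketch, assuming the words are already in a PHP array; `buildNegatedPattern()` is a hypothetical helper name, not something from this thread.

```php
<?php
// Hypothetical helper: builds one pattern that matches any listed word
// directly preceded by "not" and a single whitespace character.
function buildNegatedPattern(array $words): string
{
    // Escape regex metacharacters in each word, including the "/" delimiter.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);

    // The lookbehind has a fixed length: "not" plus one whitespace character.
    return '/(?<=\bnot\s)\b(?:' . implode('|', $quoted) . ')\b/i';
}

$pattern = buildNegatedPattern(array('good', 'great'));
$negated = preg_match_all($pattern, 'Not good, not great, but good enough.');
echo $negated; // 2: "Not good" and "not great"
```

The last "good" is not counted because it is preceded by "but", not by "not".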

I made a demo for you. It counts right in my test, 4 negative, 2 positive. Hope this helps:

$subject = 'This weird move is bad bad bad, not awful, funny but not good.';

$negWords = array ('bad', 'awful');
$posWords = array ('good', 'funny');

$negCount = 0;
$posCount = 0;

foreach ($negWords as $negWord) {
    $word = preg_quote($negWord, '/');

    # negative word preceded by "not" counts as positive
    # (the lookbehind must include the whitespace between "not" and the word)
    preg_match_all("/(?<=\bnot\s)\b$word\b/i", $subject, $matches);
    $posCount += count($matches[0]);

    # negative word on its own
    preg_match_all("/(?<!\bnot\s)\b$word\b/i", $subject, $matches);
    $negCount += count($matches[0]);
}

foreach ($posWords as $posWord) {
    $word = preg_quote($posWord, '/');

    # positive word preceded by "not" counts as negative
    preg_match_all("/(?<=\bnot\s)\b$word\b/i", $subject, $matches);
    $negCount += count($matches[0]);

    # positive word on its own
    preg_match_all("/(?<!\bnot\s)\b$word\b/i", $subject, $matches);
    $posCount += count($matches[0]);
}

echo "<br/>";
echo "pos: $posCount";
echo "<br/>";
echo "neg: $negCount";

I'm a bit confused as to how your method works. For
this is not good not bad not poor not boring not cheesy not great not good not good
I'd expect pos = 4 and neg = 4. Also, I have about 200 words each for pos and neg.

Add the good words (good, great) to the posWords array, and the bad words (bad, poor, boring, cheesy) to the negWords array. If you then change subject to the line you showed, you'll see it counts 4 and 4. If you put the words in the right array, it will check whether or not it has the word "not" in front of it. "Bad" counts as negative, whereas "Not bad" counts as positive.
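
To make the "not flips the meaning" rule concrete, here is a sketch that handles both lists in one pass. It swaps the lookbehind for an optional captured "not", which is a different technique from the demo above and also tolerates more than one space after "not". `countSentiment()` is a hypothetical name, and the word lists are the ones from this post.

```php
<?php
// Hypothetical helper: returns array('pos' => int, 'neg' => int).
// Assumes both word lists together contain at least one word.
function countSentiment(string $text, array $posWords, array $negWords): array
{
    // Map each lowercase word to its polarity: +1 positive, -1 negative.
    $polarity = array_fill_keys(array_map('strtolower', $posWords), 1)
              + array_fill_keys(array_map('strtolower', $negWords), -1);

    $alternation = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, array_keys($polarity)));

    // Group 1 captures an optional leading "not"; group 2 is the word itself.
    $pattern = '/\b(?:(not)\s+)?(' . $alternation . ')\b/i';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $counts = array('pos' => 0, 'neg' => 0);
    foreach ($matches as $m) {
        $score = $polarity[strtolower($m[2])];
        if ($m[1] !== '') {
            $score = -$score; // "not" flips the polarity
        }
        $counts[$score > 0 ? 'pos' : 'neg']++;
    }
    return $counts;
}

$text = 'this is not good not bad not poor not boring not cheesy not great not good not good';
$result = countSentiment($text, array('good', 'great'), array('bad', 'poor', 'boring', 'cheesy'));
echo "pos: {$result['pos']}, neg: {$result['neg']}"; // pos: 4, neg: 4
```

The four negated bad words (bad, poor, boring, cheesy) count as positive, and the four negated good words (good three times, great once) count as negative.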

So how would I put my words into an array? I have attempted this:
$poswords = array ($pos['word']);
$negwords = array ($neg['word']);

MySQL is very good at that. Is there a way you can get MySQL to do it?

e.g.

SELECT count(*) as `positive` FROM `reviews` LEFT JOIN `positive` ON 1 = 1 WHERE `review` LIKE CONCAT('%','not','%',`word`,'%');
SELECT count(*) as `negative` FROM `reviews` LEFT JOIN `negative` ON 1 = 1 WHERE `review` LIKE CONCAT('%','not','%',`word`,'%');

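It may get quite large, since it needs to build an intermediate table of (rows in reviews) x (rows in words). Note that this LIKE pattern also matches when other text sits between 'not' and the word; CONCAT('%not ', `word`, '%') would require them to be adjacent.

Or, if you only need to run it on a single row: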

SELECT count(*) as `positive` FROM `reviews` LEFT JOIN `positive` ON 1 = 1 WHERE `reviewid` = 1 AND `review` LIKE CONCAT('%','not','%',`word`,'%');
SELECT count(*) as `negative` FROM `reviews` LEFT JOIN `negative` ON 1 = 1 WHERE `reviewid` = 1 AND `review` LIKE CONCAT('%','not','%',`word`,'%');

You could even pull the word IDs from the positive table so you know exactly which words match.

$review_text will have to be written into the database for this to work, though; otherwise the join can't happen.

$posResult = mysql_query("SELECT word FROM positive");//returns all words from `positive`
$negResult = mysql_query("SELECT word FROM negative");//returns all words from `negative`

$posWords = array();
$negWords = array();
if($posResult !== FALSE){
    while($posrow = mysql_fetch_assoc($posResult)){
        $posWords[] = $posrow['word'];
    }
}
if($negResult !== FALSE){
    while($negrow = mysql_fetch_assoc($negResult)){
        $negWords[] = $negrow['word'];
    }
}
//$posWords is now an array of all positive words
//$negWords is now an array of all negative words

To fill the arrays use something like this:

    $pos = mysql_query("SELECT word FROM positive");
    $posWords = array ();
    if ($pos) {
        while ($row = mysql_fetch_assoc($pos)) {
            $posWords[] = $row['word'];
        }
    }

    $neg = mysql_query("SELECT word FROM negative");
    $negWords = array ();
    if ($neg) {
        while ($row = mysql_fetch_assoc($neg)) {
            $negWords[] = $row['word'];
        }
    }
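
The `mysql_*` functions used above were removed in PHP 7.0, so on a current PHP version the same loading step needs mysqli or PDO. Below is a rough PDO sketch. It assumes the `positive` and `negative` tables with a `word` column from this thread, and it uses an in-memory SQLite database (requires the pdo_sqlite extension) purely so the example runs without a server. `loadWords()` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns every value of the `word` column as a flat array.
function loadWords(PDO $pdo, string $table): array
{
    // Table names cannot be bound as parameters, so only allow known tables.
    if (!in_array($table, array('positive', 'negative'), true)) {
        throw new InvalidArgumentException("Unknown table: $table");
    }
    return $pdo->query("SELECT word FROM $table")->fetchAll(PDO::FETCH_COLUMN);
}

// Assumption: an in-memory SQLite database stands in for the real MySQL one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE positive (word TEXT)');
$pdo->exec('CREATE TABLE negative (word TEXT)');
$pdo->exec("INSERT INTO positive (word) VALUES ('good'), ('great')");
$pdo->exec("INSERT INTO negative (word) VALUES ('bad'), ('poor')");

$posWords = loadWords($pdo, 'positive'); // flat array of positive words
$negWords = loadWords($pdo, 'negative'); // flat array of negative words
```

For a real MySQL database only the DSN and credentials passed to `new PDO(...)` would change; the helper stays the same.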

Thanks a lot, your two methods seem to have done the trick. That was what I was missing :)

Just a quick one regarding this: how can I make it so that if 'not', or a positive or negative word, is the first word, it will be detected? At the moment it is not picking it up.

I'm not sure what you mean. Can you give an example of what's not working in code?

If I understood correctly, you could do that using substr() and strlen():

<?php
$subject = 'This weird move is bad bad bad, not awful, funny but not good.';
$negWords = array ('bad', 'awful');
$posWords = array ('good', 'funny');

// compare four characters ("not" plus a space) so "nothing" is not a hit
if(strtolower(substr($subject,0,4)) == 'not '){
        echo "{$subject} starts with not<br/>\r\n";
}
foreach($negWords as $v){
    $length = strlen($v);
    if(strtolower(substr($subject,0,$length)) == strtolower($v)){
        echo "{$subject} starts with {$v}<br/>\r\n";
    }
}
foreach($posWords as $v){
    $length = strlen($v);
    if(strtolower(substr($subject,0,$length)) == strtolower($v)){
        echo "{$subject} starts with {$v}<br/>\r\n";
    }
}
?>

Actually the code is working okay. I had a problem when a sentence started like this:
not good but it was ok
The 'not good' was not being picked up, but this was due to having a space in (?<=\b not\b).
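
For anyone hitting the same thing, the effect of that stray space can be reproduced with a small sketch. The two patterns below are assumed variants for illustration, not the exact ones from the poster's code.

```php
<?php
$subject = 'not good but it was ok';

// Space before "not" inside the lookbehind: it demands a character in front
// of "not", which the start of the string cannot supply, so nothing matches.
$withSpace = preg_match_all('/(?<=\b not\s)\bgood\b/i', $subject);

// No leading space: \b also matches at the very start of the string.
$withoutSpace = preg_match_all('/(?<=\bnot\s)\bgood\b/i', $subject);

echo "with space: $withSpace, without: $withoutSpace"; // with space: 0, without: 1
```

Mid-sentence, as in "is not good", both variants match, which is why the problem only showed up when "not" was the first word.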
