Pulling RSS with PHP Help!

Question

asif49 9 Posting Whiz in Training

13 Years Ago

This is the type of feed which I'm pulling data from...

<item> 
      <title>This is a title</title>  
      <description>Description.</description>  
      <link>Link</link>  
      <media:content width="60" height="50" url="http://www.someurl.com"/>  
      <media:content width="150" height="80" url="http://www.someurl.com"/> 
    </item>  
    <item> 
      <title>This is a title</title>  
      <description>Description.</description>  
      <link>Link</link>  
    </item>  
    <item> 
      <title>This is a title</title>  
      <description>Description.</description>  
      <link>Link</link>  
      <media:content width="60" height="50" url="http://www.someurl.com"/>  
    </item>

Here's how I do it...

<?php
$xml = $_GET['url']; // link to RSS feed
$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);

for ($i=0; $i<=10; $i++)
  {
  $x=$xmlDoc->getElementsByTagName('item');
  $item_title=$x->item($i)->getElementsByTagName('title')
  ->item(0)->childNodes->item(0)->nodeValue;
  $item_link=$x->item($i)->getElementsByTagName('link')
  ->item(0)->childNodes->item(0)->nodeValue;
  $item_desc=$x->item($i)->getElementsByTagName('description')
  ->item(0)->childNodes->item(0)->nodeValue;

  $x = $xmlDoc->getElementsByTagNameNS("http://search.yahoo.com/mrss/", "content");
  
  $image = $x->item($i)->getAttribute('url');
  echo "<img src='$image'/>";
  echo ("<p><a href='" . $item_link
  . "'>" . $item_title . "</a>");
  echo ("<br />");
  echo ($item_desc . "</p>");
  }

?>

As you can see, some child nodes of each <item> element occur twice, only once, or not at all, e.g. <media:content> in this example.

This means that the title, description and link are printed correctly, but when a particular <item> has no image, the script prints the next image in the feed, which belongs to a different title and description. Hope that makes sense.

What I want is to detect when an <item> has no <media:content> element and then print only the title/description/link, not the image, because that image belongs to a later <item>. And if there are two, print only one and move on. I really need some help with this as I've become frustrated trying to solve this issue!
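One way to get the behaviour described above is to search for <media:content> inside each <item> rather than across the whole document, so the index of an image can never drift away from its item. The following is only a sketch under assumptions: the inline feed, the example.com URLs, and the helper names firstText() and readItems() are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical inline feed standing in for the real RSS URL.
$feed = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>First</title>
      <description>Two images.</description>
      <link>http://example.com/1</link>
      <media:content width="60" height="50" url="http://example.com/a.jpg"/>
      <media:content width="150" height="80" url="http://example.com/b.jpg"/>
    </item>
    <item>
      <title>Second</title>
      <description>No image.</description>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML;

const MEDIA_NS = 'http://search.yahoo.com/mrss/';

// Text of the first <$tag> inside one item, or '' when it is missing.
function firstText(DOMElement $item, string $tag): string
{
    $nodes = $item->getElementsByTagName($tag);
    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : '';
}

// Returns one row per <item>; 'image' is null when the item has no media:content.
function readItems(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $rows = [];
    foreach ($doc->getElementsByTagName('item') as $item) {
        // Look for media:content inside THIS item only.
        $media = $item->getElementsByTagNameNS(MEDIA_NS, 'content');
        $rows[] = [
            'title' => firstText($item, 'title'),
            'link'  => firstText($item, 'link'),
            'desc'  => firstText($item, 'description'),
            // First image when there are several, null when there are none.
            'image' => $media->length > 0 ? $media->item(0)->getAttribute('url') : null,
        ];
    }
    return $rows;
}

foreach (readItems($feed) as $row) {
    if ($row['image'] !== null) {
        echo "<img src='" . htmlspecialchars($row['image'], ENT_QUOTES) . "'/>";
    }
    echo "<p><a href='" . htmlspecialchars($row['link'], ENT_QUOTES) . "'>"
        . htmlspecialchars($row['title']) . "</a><br />"
        . htmlspecialchars($row['desc']) . "</p>\n";
}
```

Because the lookup is scoped to the current item, an item with two images contributes only its first one, and an item without images prints just the title, link and description.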

php

3 Contributors
17 Replies
1K Views
3 Days Discussion Span
Latest Post 13 Years Ago Latest Post by asif49

All 17 Replies

veedeoo 474 Junior Poster

13 Years Ago

Hi,

Is there any more code above <item></item>? Can you give me the live URL so that I can take a look?

cereal 1,524 Nearly a Senior Poster

13 Years Ago

You have already asked this question here:

http://www.daniweb.com/web-development/php/threads/407319

veedeoo 474 Junior Poster

13 Years Ago

OK, since I don't feel lazy today, here is a simple RSS parser I just wrote for anyone who needs it. The source code is already heavily commented, so I don't see the need to explain anything further right now.

<?php
## A pretty basic RSS parser
### some announcement from me
## Written by PoorBoy from http://veedeoo.com/forum or Veedeoo from http://daniweb.com
## Date created: January 18, 2012
## end of my tiny announcement

## type in your XML file location or RSS feed
$xml = 'http://feeds.bbci.co.uk/news/england/london/rss.xml';
## remove @ to debug
$xml = @simplexml_load_file($xml);
## dissect the XML: each child of the root is a <channel>
foreach ($xml as $items) {
    ## we are only interested in the <item> elements and their children
    $item = $items->item;
    foreach ($item as $news) {
        ## parse title
        $title = $news->title;
        ## parse description
        $description = $news->description;
        ## parse news link
        $news_link = $news->link;
        ## parse publication date
        $pub_date = $news->pubDate;
        ## read the children in the Media RSS namespace (the media:* elements)
        $c_thumb = $news->children('http://search.yahoo.com/mrss/');
        ## try to grab the first thumbnail of the article
        $t_attrs = $c_thumb->thumbnail[0]->attributes();
        ## we only want the url for this purpose
        $thumb = $t_attrs['url'];
        ## echo everything
        echo $title."<br/>";
        echo "<img src='".$thumb."'/> <br/>";
        echo "Description: ".$description."<br/>";
        echo "Link: ".$news_link."<br/>";
        echo "Publication Date: ".$pub_date."<br/><br/>";
    }
}
?>

HINT: Do not double post your question in the future. That's bad practice.

veedeoo 474 Junior Poster

13 Years Ago

You are very much welcome. Good thing it all worked out for you..

asif49 commented: Offered great and thorough help, and even answered subsequent questions. Great Member! +3

veedeoo 474 Junior Poster

13 Years Ago

Yes, you are correct. You can always control both the start and the stop. To achieve this, we need to introduce another variable, just as in basic algebra where we introduce a third variable Z so that we can solve for X and Y: Z = X + Y, so Y = Z - X and X = Z - Y. Can we apply those simple transpositions here? Yes.

Here we go. Remember our code from the previous page? We can upgrade it to this:

<?php
## A pretty basic RSS parser
### some announcement from me
## Written by PoorBoy from http://veedeoo.com/forum or Veedeoo from http://daniweb.com
## Date created: January 18, 2012
## end of my tiny announcement

## type in your XML file location or RSS feed
$xml = 'http://feeds.bbci.co.uk/news/england/london/rss.xml';
## remove @ to debug
$xml = @simplexml_load_file($xml);
$somecount = 0;
$base = 10;
$limit = 5;
## dissect the XML: each child of the root is a <channel>
foreach ($xml as $items) {
    ## we are only interested in the <item> elements and their children
    $item = $items->item;
    foreach ($item as $news) {
        ## parse title
        $title = $news->title;
        ## parse description
        $description = $news->description;
        ## parse news link
        $news_link = $news->link;
        ## parse publication date
        $pub_date = $news->pubDate;
        ## read the children in the Media RSS namespace (the media:* elements)
        $c_thumb = $news->children('http://search.yahoo.com/mrss/');
        ## try to grab the first thumbnail of the article
        $t_attrs = $c_thumb->thumbnail[0]->attributes();
        ## we only want the url for this purpose
        $thumb = $t_attrs['url'];
        ## count this item, then echo everything once we have reached the base
        ++$somecount;
        if ($somecount >= $base) {
            ## comment out the somecount line below once you are done testing
            echo $somecount."<br/>";
            echo $title."<br/>";
            echo "<img src='".$thumb."'/> <br/>";
            echo "Description: ".$description."<br/>";
            echo "Link: ".$news_link."<br/>";
            echo "Publication Date: ".$pub_date."<br/><br/>";
        }
        ## stop once the counter reaches base + limit;
        ## with the values above this prints items 10 through 15
        if ($somecount >= $limit + $base) break;
    }
}
?>

If you run the script above, you will notice that it only displays items starting at your base count and stops once the count reaches base plus limit.
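The same start-and-stop idea can also be pulled out of the parsing loop into a small helper, which makes the window easier to reason about. This is a sketch only: sliceItems() is a hypothetical name, the feed is generated inline instead of being downloaded, and here $limit means "at most this many items after skipping $offset", which is slightly different from the counter arithmetic used above.

```php
<?php
// Hypothetical helper: skip the first $offset entries, then return at most $limit.
function sliceItems(iterable $items, int $offset, int $limit): array
{
    $picked = [];
    $seen = 0;
    foreach ($items as $entry) {
        if ($seen++ < $offset) {
            continue; // still before the window
        }
        if (count($picked) >= $limit) {
            break; // window is full, stop reading
        }
        $picked[] = $entry;
    }
    return $picked;
}

// Build a small stand-in feed with eight items (assumption: the real feed has the same shape).
$feedXml = '<rss version="2.0"><channel>';
for ($n = 1; $n <= 8; $n++) {
    $feedXml .= "<item><title>Story $n</title></item>";
}
$feedXml .= '</channel></rss>';

$rss = simplexml_load_string($feedXml);

// Skip three items, then show the next two.
foreach (sliceItems($rss->channel->item, 3, 2) as $news) {
    echo $news->title, "<br/>\n";
}
```

With the eight-item stand-in feed, the loop prints "Story 4" and "Story 5". Asking for more items than remain simply returns the ones that exist.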

asif49 9 Posting Whiz in Training · Answer 1 · 2012-01-20T01:00:13+00:00

Yes, sure. I'm intending to pull from a URL such as this:
http://feeds.bbci.co.uk/news/wales/north_east_wales/rss.xml
but less consistent than this in terms of how many <media:thumbnail> or <media:content> elements are contained inside each <item> ... </item>

asif49 9 Posting Whiz in Training · Answer 2 · 2012-01-20T03:38:32+00:00

Sort of, but since I gained no reply I thought I'd word the question better here. Thanks for pointing it out for absolutely no reason though. Cheers.

asif49 9 Posting Whiz in Training · Answer 3 · 2012-01-20T06:53:09+00:00

Solution works really nicely as far as I can tell! Forgive me for being a novice, but how can I limit it so it only gets a pre-determined number of <item> elements?

veedeoo 474 Junior Poster Featured Poster · Answer 4 · 2012-01-20T07:15:05+00:00

Hi, that's pretty easy. This is how you stop a foreach loop at any given point, assuming the loop contains more than 0 items.

In the code above, find

$xml = @simplexml_load_file($xml);

just below it add;

$somecount = 0;
$limit = 10;

scroll down the page and find;

echo "Publication Date: ".$pub_date."<br/><br/>";

Just below it, add;

if (++$somecount >=$limit) break;

Define your limit as an integer, like this: $limit = 10; A common mistake, even among experts, is to write $limit = "10"; which is a string. PHP will coerce a numeric string in this comparison, but an integer avoids surprises.

asif49 9 Posting Whiz in Training · Answer 5 · 2012-01-20T20:53:31+00:00

That is exactly what I needed! Thank you sir.

asif49 9 Posting Whiz in Training · Answer 6 · 2012-01-21T03:32:25+00:00

Actually, one final thing... this is the most vital thing in what I'm trying to achieve.

At the moment, as you said, we can set a start and limit as start = 0; and limit = 10;

Can we go a step further and say start = 10; and limit = 20; so we only access from the 10th <item> element onwards.

Much much much thanks in advance!

asif49 9 Posting Whiz in Training · Answer 7 · 2012-01-21T23:03:00+00:00

You're a lifesaver. It worked! Wished I had a teacher who was as good with explaining things as you. I don't want you to feel like I keep pressing you for more and more information but there is a certain error I'm getting here.. it says

"Call to a member function attributes() on a non-object in getRSS.php" which specifies that the error is at the following line:

$c_thumb = $news->children('http://search.yahoo.com/mrss/');
$t_attrs = $c_thumb->thumbnail[1]->attributes();
$thumb = $t_attrs['url'];

it looks like this is because some <item> elements contain only one thumbnail but we are trying to get the second <media:thumbnail> element's attribute above. I've tried to correct it using the following:

if ($c_thumb->thumbnail[1]->attributes() != null) {
$t_attrs = $c_thumb->thumbnail[1]->attributes();
} else {
$t_attrs = $c_thumb->thumbnail[0]->attributes();
}

but it doesn't seem to work. Any ideas here?

veedeoo 474 Junior Poster Featured Poster · Answer 8 · 2012-01-22T08:54:14+00:00

Hi this may work. I just don't have the time to test it locally. It should work..

$c_thumb = $news->children('http://search.yahoo.com/mrss/');
## try to grab the first thumbnail of the article
$t_one = $c_thumb->thumbnail[0]->attributes();
## try to grab the second thumbnail of the article
$t_two = $c_thumb->thumbnail[1]->attributes();
## url of the first thumb
$thumb_one = $t_one['url'];
## url of the second thumb
$thumb_two = $t_two['url'];
## set condition if
## we cannot use !=null, because simpleXML views the file as xml file
if($thumb_two != ""){
$thumb = $thumb_two;

else{
$thumb = $thumb_one;
}

or this may work also

$c_thumb = $news->children('http://search.yahoo.com/mrss/');
## try to grab the first thumbnail of the article
$t_one = $c_thumb->thumbnail[0]->attributes();
## try to grab the second thumbnail of the article
$t_two = $c_thumb->thumbnail[1]->attributes();
## url of the first thumb
$thumb_one = $t_one['url'];
## url of the second thumb
$thumb_two = $t_two['url'];
## set condition if
## we can use !=null, if simpleXML can return NULL as value for not found attributes
if($thumb_two != null){
$thumb = $thumb_two;

else{
$thumb = $thumb_one;
}

veedeoo 474 Junior Poster Featured Poster · Answer 9 · 2012-01-22T13:59:56+00:00

Here is the corrected code for the above

$c_thumb = $news->children('http://search.yahoo.com/mrss/');
## try to grab the first thumbnail of the article
$t_one = $c_thumb->thumbnail[0]->attributes();
## try to grab the second thumbnail of the article
$t_two = $c_thumb->thumbnail[1]->attributes();
## url of the first thumb
$thumb_one = $t_one['url'];
## url of the second thumb
$thumb_two = $t_two['url'];
## set condition if
## we cannot use !=null, because simpleXML views the file as xml file
if($thumb_two != ""){
$thumb = $thumb_two;
}
else{
$thumb = $thumb_one;
}

and for the second snippet

$c_thumb = $news->children('http://search.yahoo.com/mrss/');
## try to grab the first thumbnail of the article
$t_one = $c_thumb->thumbnail[0]->attributes();
## try to grab the second thumbnail of the article
$t_two = $c_thumb->thumbnail[1]->attributes();
## url of the first thumb
$thumb_one = $t_one['url'];
## url of the second thumb
$thumb_two = $t_two['url'];
## set condition if
## we can use !=null, if simpleXML can return NULL as value for not found attributes
if($thumb_two != null){
$thumb = $thumb_two;
}
else{
$thumb = $thumb_one;
}

Make sure to test which one of the codes above works for you.. Double check and make sure you are displaying the second thumb "image info"..

asif49 9 Posting Whiz in Training · Answer 10 · 2012-01-22T23:21:44+00:00

The worst expected thing happened, both of the methods failed :(

The error seems to be at this point:

$t_one = $c_thumb->thumbnail[0]->attributes(); 
$t_two = $c_thumb->thumbnail[1]->attributes();

It works until it encounters an object where there is only one <media:thumbnail> element. I think it's because in that case, the second line above which tries to access the second element simply isn't there so it sends an error and stops doing anything else

veedeoo 474 Junior Poster Featured Poster · Answer 11 · 2012-01-23T01:24:28+00:00

To test if the attributes ..

$t_one = $c_thumb->thumbnail[0]->attributes();
$t_two = $c_thumb->thumbnail[1]->attributes();

exist, then you will have to use bool like this

$t_one =(bool)( $c_thumb->thumbnail[0]->attributes());
$t_two = (bool)($c_thumb->thumbnail[1]->attributes());

If the attributes exist in the XML file, then the script should return 1. Based on this response, you can switch your thumbnails or whatever else you want to access.

Comment out the if statement, add the (bool) casts, and then echo out "$t_one". This should give you something like this on your page:

1
1
1
1
1
1
1
1

10
and so forth. For nonexistent attributes it should return false, which PHP echoes as an empty string rather than 0.

That's all you have to do.. It is true or false..

IMPORTANT! Once you use the "$t_one" for boolean, you will have to assign another variable for your thumb like

if($t_one == 1){
$t_one_is_true = $c_thumb->thumbnail[0]->attributes();
$thumb_one = $t_one_is_true['url'];

}

Once the script has been tested several times, I strongly suggest that you rewrite it in condensed form, or you can just leave it like that.

asif49 9 Posting Whiz in Training · Answer 12 · 2012-01-23T03:37:05+00:00

It still results a lot of 1's and then when there is a 0 it just displays that error so a lot of 1's are seen then there is just the error.

I've tried to follow your brief and don't think I made any mistakes but here it is anyway,

<?php

$xml ='http://feeds.bbci.co.uk/news/england/london/rss.xml';
$xml = @simplexml_load_file($xml); 

foreach($xml as $items)
{
$item = $items->item;
foreach($item as $news){
$c_thumb = $news->children('http://search.yahoo.com/mrss/');
$t_one =(bool)( $c_thumb->thumbnail[0]->attributes());
$t_two = (bool)($c_thumb->thumbnail[1]->attributes()); /*<<<<ERROR OCCURS HERE when there is only one element in feed*/

if ($t_two == 1) {
$t_two_is_true = $c_thumb->thumbnail[1]->attributes();
$thumb = $t_two_is_true['url'];
} else {
$t_one_is_true = $c_thumb->thumbnail[0]->attributes();
$thumb = $t_one_is_true['url'];
}

echo $thumb . "<br>";

}
}
?>
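The error reported above happens because $c_thumb->thumbnail[1] does not exist for items with a single thumbnail, so attributes() is called on a non-object before the (bool) cast is ever applied. One possible way around it, sketched here under assumptions, is to collect the thumbnail URLs with a foreach first and only then choose one. The inline feed, the example.com URLs, and the function names thumbnailUrls() and preferredThumbnail() are made up for illustration.

```php
<?php
const MRSS_NS = 'http://search.yahoo.com/mrss/';

// Hypothetical stand-in feed: two thumbnails, one thumbnail, and none.
$feed = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Two thumbnails</title>
      <media:thumbnail width="66" height="49" url="http://example.com/small1.jpg"/>
      <media:thumbnail width="144" height="81" url="http://example.com/large1.jpg"/>
    </item>
    <item>
      <title>One thumbnail</title>
      <media:thumbnail width="66" height="49" url="http://example.com/small2.jpg"/>
    </item>
    <item>
      <title>No thumbnail</title>
    </item>
  </channel>
</rss>
XML;

// Collect every media:thumbnail URL of one item; the loop body never runs
// when the item has none, so no method is called on a missing element.
function thumbnailUrls(SimpleXMLElement $news): array
{
    $urls = [];
    foreach ($news->children(MRSS_NS)->thumbnail as $thumb) {
        $attrs = $thumb->attributes();
        $urls[] = (string) $attrs['url'];
    }
    return $urls;
}

// Prefer the second thumbnail, fall back to the first, else null.
function preferredThumbnail(SimpleXMLElement $news): ?string
{
    $urls = thumbnailUrls($news);
    return $urls[1] ?? $urls[0] ?? null;
}

$rss = simplexml_load_string($feed);
foreach ($rss->channel->item as $news) {
    $thumb = preferredThumbnail($news);
    echo $news->title, ': ', $thumb ?? '(no thumbnail)', "<br>\n";
}
```

The null-coalescing chain does the job the if/else was meant to do: it never touches an index that is not there, so an item with one thumbnail falls back to it and an item with none yields null instead of a fatal error.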
