Greetings,
I have reached a point where I need some help. I have a TiVo at home, and I'm trying to write a script that will 1) pull the XML off the TiVo and save it to a file, 2) pull the http:// links out of that XML, 3) put all the links into an array, and 4) use cURL to download those files for further processing later.
I've managed to download the tivo.xml file using cURL, but now I'm stuck with a text file that has no spaces or newlines I can use to separate the XML tags:
<?xml version="1.0" encoding="utf-8"?><TiVoContainer xmlns="http://www.tivo.com/developer/calypso-protocol-1.6/"><Details><ContentType>x-tivo-container/tivo-videos</ContentType><SourceFormat>x-tivo-container/tivo-dvr</SourceFormat><Title>Now Playing</Title><LastChangeDate>0x49B7F8FF</LastChangeDate><TotalItems>73</TotalItems><UniqueId>/NowPlaying</UniqueId></Details><SortOrder>Type,CaptureDate</SortOrder><GlobalSort>Yes</GlobalSort><ItemStart>0</ItemStart><ItemCount>73</ItemCount><Item><Details><ContentType>video/x-tivo-raw-pes</ContentType><SourceFormat>video/x-tivo-raw-pes</SourceFormat><Title>The Daily Show With Jon Stewart</Title><SourceSize>783286272</SourceSize><Duration>3600000</Duration><CaptureDate>0x49B7535F</CaptureDate><SourceChannel>48</SourceChannel><SourceStation>COMEDYP</SourceStation><HighDefinition>No</HighDefinition><ProgramId>EP2930531384</ProgramId><SeriesId>SH293053</SeriesId><EpisodeNumber>14034</EpisodeNumber><ByteOffset>0</ByteOffset></Details><Links><Content><Url>http://192.168.1.20:80/download/The%20Daily%20Show%20With%20Jon%20Stewart.TiVo?Container=%2FNowPlaying&id=3207375</Url><ContentType>video/x-tivo-raw-pes</ContentType></Content><TiVoVideoDetails><Url>https://192.168.1.20:443/TiVoVideoDetails?id=3207375</Url><ContentType>text/xml</ContentType><AcceptsParams>No</AcceptsParams></TiVoVideoDetails></Links></Item><Item><Details><ContentType>video/x-tivo-raw-pes</ContentType><SourceFormat>video/x-tivo-raw-pes</SourceFormat><Title>Explorer</Title><SourceSize>780140544</SourceSize><Duration>3600000</Duration><CaptureDate>0x49B71B1F</CaptureDate><EpisodeTitle>T. Rex Walks Again</EpisodeTitle><Description>Dinosaur builder Hall Train and paleoartist Jason Brougham work to build the world's most accurate, fully skinned, mechanical replica of a T. rex. 
Copyright Tribune Media Services, Inc.</Description><SourceChannel>108</SourceChannel><SourceStation>NGC</SourceStation><HighDefinition>No</HighDefinition><ProgramId>EP7231310112</ProgramId><SeriesId>SH723131</SeriesId><ByteOffset>0</ByteOffset></Details><Links><Content><Url>http://192.168.1.20:80/download/Explorer.TiVo?Container=%2FNowPlaying&id=3212467</Url><ContentType>video/x-tivo-raw-pes</ContentType></Content><CustomIcon><Url>urn:tivo:image:expires-soon-recording</Url><ContentType>image/*</ContentType><AcceptsParams>No</AcceptsParams></CustomIcon><TiVoVideoDetails><Url>https://192.168.1.20:443/TiVoVideoDetails?id=3212467</Url><ContentType>text/xml</ContentType><AcceptsParams>No</AcceptsParams></TiVoVideoDetails></Links></Item><Item><Details><ContentType>video/x-tivo-raw-pes</ContentType><SourceFormat>video/x-tivo-raw-pes</SourceFormat><Title>The Daily Show With Jon Stewart</Title><SourceSize>795869184</SourceSize><Duration>3601000</Duration><CaptureDate>0x49B601DE</CaptureDate><SourceChannel>48</SourceChannel><SourceStation>COMEDYP</SourceStation><HighDefinition>No</HighDefinition><ProgramId>EP2930531382</ProgramId><SeriesId>SH293053</SeriesId><EpisodeNumber>14033</EpisodeNumber><ByteOffset>0</ByteOffset></Details><Links><Content><Url>http://192.168.1.20:80/download/The%20Daily%20Show%20With%20Jon%20Stewart.TiVo?Container=%2FNowPlaying&id=3204470</Url><ContentType>video/x-tivo-raw-pes</ContentType></Content><CustomIcon><Url>urn:tivo:image:expires-soon-recording</Url><ContentType>image/*</ContentType><AcceptsParams>No</AcceptsParams></CustomIcon><TiVoVideoDetails><Url>https://192.168.1.20:443/TiVoVideoDetails?id=3204470</Url><ContentType>text/xml</ContentType><AcceptsParams>No</AcceptsParams></TiVoVideoDetails></Links></Item><Item><Details><ContentType>video/x-tivo-raw-pes</ContentType><SourceFormat>video/x-tivo-raw-pes</SourceFormat><Title>The Wonder 
Pets!</Title><SourceSize>398458880</SourceSize><Duration>1800000</Duration><CaptureDate>0x49B5D7AE</CaptureDate><EpisodeTitle>Join the Circus!</EpisodeTitle><Description>After the pets rescue a young circus lion, the ringmaster offers each of the pets a job at the circus. Copyright Tribune Media Services, Inc.</Description><SourceChannel>47</SourceChannel><SourceStation>NIKP</SourceStation>
It may look as though the sample above contains newlines, but those come only from the copy/paste I did to show an example.
I was looking at putting newlines in, but so far I've been unable to find an example of how to do this. I figured that if I could find every instance of "><" and replace it with ">\n<", I might be able to put each tag on its own line. It wouldn't look pretty, but it would let me script a "grep" and "cut" type of command to get my links.
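For illustration, the "><" substitution described above might be sketched in plain sh like this. It is only a sketch: split_tags is a made-up helper name, the sample string is a shortened stand-in for tivo.xml, and it has not been run against real TiVo output. It uses a backslash followed by a literal newline in the sed replacement, since not every sed understands "\n" there.

```shell
#!/bin/sh
# Sketch: put each XML tag on its own line by turning every "><"
# into ">", a newline, and "<".
# split_tags is a hypothetical helper: XML on stdin, split text on stdout.
split_tags() {
    # Backslash + real newline is the portable way to write a newline
    # in a sed replacement.
    sed 's/></>\
</g'
}

# Shortened stand-in for the contents of tivo.xml (an assumption).
sample='<Links><Content><Url>http://192.168.1.20:80/download/Explorer.TiVo</Url></Content></Links>'

printf '%s\n' "$sample" | split_tags
```

With the real file, the call would presumably be something like "split_tags < tivo.xml > tivo-lines.xml", where tivo-lines.xml is just an example output name.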
I only need the links with "192.168.1.20:80" in them, but when I grep for them, I just get the entire file back. A little more digging (opening the file with "vi tivo.xml") showed that the file contains only one line, so grep matches that line and prints all of it. This is why I came up with the idea of a find/replace that adds newlines.
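As a rough sketch of the grep step once the tags are on separate lines: the hypothetical extract_links helper below splits the input, keeps only the lines containing the "192.168.1.20:80" download URLs, and strips the surrounding Url tags. The sample string is a shortened stand-in for tivo.xml, and the loop only echoes what it would fetch. Since plain sh has no arrays, the sketch keeps one URL per line and reads them in a loop instead.

```shell
#!/bin/sh
# Sketch: list only the download URLs served from 192.168.1.20:80.
# extract_links is a hypothetical helper: XML on stdin, one URL per line out.
extract_links() {
    # 1) one tag per line, 2) keep matching <Url> lines (-F: fixed string,
    # so the dots are not treated as regex wildcards), 3) strip the tags.
    sed 's/></>\
</g' |
    grep -F '<Url>http://192.168.1.20:80/' |
    sed -e 's/^<Url>//' -e 's/<\/Url>$//'
}

# Shortened stand-in for the contents of tivo.xml (an assumption).
sample='<Item><Links><Content><Url>http://192.168.1.20:80/download/Explorer.TiVo?Container=%2FNowPlaying&id=3212467</Url><ContentType>video/x-tivo-raw-pes</ContentType></Content><TiVoVideoDetails><Url>https://192.168.1.20:443/TiVoVideoDetails?id=3212467</Url><ContentType>text/xml</ContentType></TiVoVideoDetails></Links></Item>'

printf '%s\n' "$sample" | extract_links |
while IFS= read -r url; do
    # A real script would call curl here; echo keeps the sketch harmless.
    echo "would download: $url"
done
```

The https://192.168.1.20:443 details link is skipped because the fixed string being matched includes the http:// scheme and the :80 port.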
Now, some caveats (as if this needed to be any more difficult): I don't have xsltproc, which I tried to get from ports and packages. I did find sablotron, but I'm trying to learn shell scripting, and I believe it should be possible to do this without resorting to more programs. I use OpenBSD, so bash is out; I mostly use pdksh and sh. Perl is available in the base system, and if regex is needed, I would be okay with that. I've been wanting to learn that too.
If I happen upon a solution, I will post it, but I thought I'd come out of lurkerdom and ask, since I've been working on this on my own for a week.
Regards,
Bryan