Hi. I don't know anything about Python, but I need to use a script that I found on the internet, and I don't know where to start. I have installed Python on my Windows XP desktop, and I have also downloaded the file BeautifulSoup.py (http://www.crummy.com/software/BeautifulSoup/) and put it in the Lib folder, since I know the script needs it. The aim of the script is to generate an HTML file with a list of links based on the name of a Linux package. In fact, I need this script to download all the .deb files of a given package, plus the .deb files of all its dependencies, from http://packages.ubuntu.com/. I guess it's very easy to use this script, but I don't know how! Help would be really appreciated.
"""ubuntu deb digger"""
from BeautifulSoup import BeautifulSoup
import urllib
import urlparse
import re

_depgif = '../../Pics/dep.gif'
_deps = {}

def get_debs(url, arch="i386", packages=None):
    """grab the deb defined by url from packages.ubuntu.com and all its dependencies"""
    if packages is None:
        packages = {}
    source = urllib.urlopen(url).read()
    soup = BeautifulSoup(source)
    downloadheader = soup('div', {'id': 'pdownload'})[0].h2
    name = downloadheader.string.replace('Download ','')
    if name in packages:
        return {}
    print name
    # update the packages dictionary with the download link for this package
    archlinks = [link for link in soup('a') if link.string in (arch, 'all')]
    archlink = urlparse.urljoin(url,archlinks[0]['href'])
    mirrorpage = urllib.urlopen(archlink).read()
    mirrorsoup = BeautifulSoup(mirrorpage)
    downloadlink = mirrorsoup.firstText('archive.ubuntu.com/ubuntu').parent['href']
    packages.update({name: downloadlink})
    # get dependencies
    deplinks = [dt.a for dt in soup('dt') if dt.img['src'] == _depgif]
    for link in deplinks:
        get_debs(urlparse.urljoin(url,link['href']), packages=packages)
    return packages

if __name__ == '__main__':
    import sys
    packages = get_debs(sys.argv[0])
    html = "\n".join(["<a href='%s'>%s</a><br/>" % (value,key) for key,value in packages.iteritems()])
    print 'writing packages.html'
    open('packages.html','w').write(html)
If I try to run it, I get:
Traceback (most recent call last):
  File "C:\Documents and Settings\Andrea\Desktop\UbuntuPackageGrabber.py", line 40, in <module>
    packages = get_debs(sys.argv[0])
  File "C:\Documents and Settings\Andrea\Desktop\UbuntuPackageGrabber.py", line 15, in get_debs
    source = urllib.urlopen(url).read()
  File "C:\Python25\lib\urllib.py", line 82, in urlopen
    return opener.open(url)
  File "C:\Python25\lib\urllib.py", line 187, in open
    return self.open_unknown(fullurl, data)
  File "C:\Python25\lib\urllib.py", line 199, in open_unknown
    raise IOError, ('url error', 'unknown url type', type)
IOError: [Errno url error] unknown url type: 'c'
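
A note on what the traceback seems to be saying: the script passes sys.argv[0] to get_debs, and sys.argv[0] is the path of the script itself, which here starts with C:\. Most likely the drive letter is being read as the URL type, hence "unknown url type: 'c'". Arguments typed on the command line start at sys.argv[1]. Below is a minimal sketch of that distinction, assuming the script is meant to receive a packages.ubuntu.com page URL as its first argument. The helper name first_argument and the placeholder URL are made up for illustration and are not part of the original script.

```python
def first_argument(argv):
    """Return the first user-supplied argument, or None if there is none.

    argv[0] is the path of the script itself; arguments typed after the
    script name on the command line start at argv[1].
    """
    if len(argv) < 2:
        return None
    return argv[1]


# Simulated argument lists. The URL is a placeholder, not a real page.
script = r"C:\Documents and Settings\Andrea\Desktop\UbuntuPackageGrabber.py"
page_url = "http://packages.ubuntu.com/<release>/<package>"

no_args = first_argument([script])             # script run with no arguments
with_url = first_argument([script, page_url])  # script run with a URL
```

If that reading is right, the script would need sys.argv[1] in place of sys.argv[0], and it would be started from a command prompt with the package page URL after the script name, rather than by double-clicking it.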