Several weeks ago, I wrote a simple web crawler that retrieves robots.txt from a range of IP addresses. It was entirely CLI-based, and it worked perfectly. As my Python experience has grown, I decided to start learning GUI programming, and for my first GUI application I chose to 'improve' my existing web crawler. However, I broke something.
The itertools.product code works exactly as it did before, but for some reason the part of the try block that actually makes the HTTP request does not execute until the entire range has been iterated, and then it only attempts to connect to the last IP address in the range. I determined this with a strategically placed print statement. The strange thing is that I've changed none of the original CLI-based code, except for the portions that now receive input from the GUI instead of the command line. I'm guessing the problem has something to do with PyQt that I'm overlooking. Does anyone with more PyQt experience have any ideas? BTW, this is with Python 3.1.1 and PyQt4. The code is as follows:
#! /usr/bin/python3.1
import urllib
import urllib.request
import sys
import itertools
import time
from PyQt4 import QtCore, QtGui
from robo_ui import Ui_robominer


class Start(QtGui.QMainWindow):
    def __init__(self, parent=None):
        QtGui.QWidget.__init__(self, parent)
        self.ui = Ui_robominer()
        self.ui.setupUi(self)
        QtCore.QObject.connect(self.ui.mine_button, QtCore.SIGNAL("clicked()"), self.go_mining)
        QtCore.QObject.connect(self.ui.quit_button, QtCore.SIGNAL("clicked()"), self.quit_mining)

    def go_mining(self):
        ip_parts =[]
        for part in self.ui.robo_target.text().split("."):
            if "-" in part:
                min, max = part.split("-")
                ip_parts.append(range(int(min), int(max)+1))
            else:
                ip_parts.append([int(part)])
        ip_addresses = itertools.product(*ip_parts)
        for ip_addy in ip_addresses:
            self.ui.textBrowser.append("Trying to fetch from %s..." %ip_addy)
            try:
                time.sleep(2)
                roboFile = urllib.request.urlopen("http://%d.%d.%d.%d/robots.txt" %ip_addy, timeout=5)
                fetched = roboFile.read()
                decoded = fetched.decode("utf8")
                self.ui.textBrowser.append(decoded)
            except Exception as err:
                e = str(err)
                self.ui.textBrowser.append(e)

    def quit_mining(self):
        sys.exit()


if __name__ == "__main__":
    app = QtGui.QApplication(sys.argv)
    myapp = Start()
    myapp.show()
    sys.exit(app.exec_())
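
To show what the range expansion is supposed to produce, here is a minimal sketch of just that part, pulled out of the GUI so it can be run from the command line. The function names `expand_ip_range` and `robots_url` and the sample target are made up for this illustration and do not appear in the program above; the parsing logic itself mirrors what go_mining does.

```python
import itertools


def expand_ip_range(target):
    """Expand a dotted target such as '192.168.0-1.1-2' into a list of
    (a, b, c, d) integer tuples. Hypothetical helper, for illustration only."""
    ip_parts = []
    for part in target.split("."):
        if "-" in part:
            # "0-1" becomes the inclusive range 0..1
            low, high = part.split("-")
            ip_parts.append(range(int(low), int(high) + 1))
        else:
            # A plain octet becomes a one-element list
            ip_parts.append([int(part)])
    # The Cartesian product of the four octet lists is every address in the range
    return list(itertools.product(*ip_parts))


def robots_url(ip):
    """Build the robots.txt URL for a 4-tuple of octets (hypothetical helper)."""
    return "http://%d.%d.%d.%d/robots.txt" % ip


if __name__ == "__main__":
    for ip in expand_ip_range("192.168.0-1.1-2"):
        print(robots_url(ip))
```

Run on its own, this prints one URL per address in the range (four for the sample target), which is the behavior I expect from the loop in go_mining as well.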