Hello everyone,
I have a question for you.
I have a script that grabs URLs and some other data from a series of pages.
Here is an example of it (sorry for the very messy code, I'm just testing):
while (a < 10) :
    if a == 2 :
        f = urllib.urlopen("****" % params1).read()
        linkai = re.compile('</a> - <a href="(.*?)"')
        surasti = re.findall(linkai, f)
        for link in surasti:
            u = urllib.urlopen(link).read()
            urlas = re.compile('ne"><a href="(.*?)"')
            miau = re.compile('(?s)<pre.*?>http://(.+?)</pre>')
            surasti2 = re.findall(urlas , u)
            miau2 = re.findall(miau, u)
            time.sleep(3)
            for i in surasti2:
                print i
                l.write(i + '\n')
                for b in miau2:
                    print (b + '\n\n')
                    l2.write(b + '\n\n')
                    g +=1
                    dic[b]=i
        a = a + 1
    else:
        f = urllib.urlopen("*****" % params2).read()
        linkai = re.compile('</a> - <a href="(.*?)"')
        surasti = re.findall(linkai, f)
        for link in surasti:
            u = urllib.urlopen(link).read()
            urlas = re.compile('ne"><a href="(.*?)"')
            miau = re.compile('(?s)<pre.*?>http://(.+?)</pre>')
            surasti2 = re.findall(urlas , u)
            miau2 = re.findall(miau, u)
            time.sleep(3)
            for i in surasti2:
                print i
                l.write(i + '\n')
                for b in miau2:
                    print (b + '\n\n')
                    l2.write(b + '\n\n')
                    g +=1
                    dic[b]=i
        a = a + 1
Say I run this over 10 pages, and every page contains another 10 URLs. That should give 100 URLs in total, so the dictionary should end up with 100 entries.
Here is an example of what the dictionary looks like:
{ 'http://sdfsdfasdfs.com/sdfsdf' : 'http:sdfasdfasdf.com/asdfasdfas' , 'http://sdfsdfasdfs.com/sdfsdf' : 'http:sdfasdfasdf.com/asdfasdfas' and so on}
But the most entries I ever get is 20! I could set it to 100 pages (which should give 1000 entries (pairs) in the dictionary), but I get 20 entries no matter what!
I ran a test where I changed the dictionary key from the variable "b" to "g" (the counter incremented with g += 1), and it works just fine: it runs through 10 pages (100 entries) and I get 100 entries (pairs) in the dictionary.
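To illustrate the difference between those two runs, here is a minimal sketch that uses made-up data instead of the real pages. It assumes, and this is only a guess, that the same "b" value turns up on many pages, so that only 20 distinct values exist overall. The names fake_pages, collect_by_value and collect_by_counter are hypothetical and are not part of my script. A dictionary keeps one entry per key, so assigning to a key that already exists replaces the old value instead of adding a new pair.

```python
# Hypothetical stand-in data: 10 pages of 10 (b, i) pairs each, where only
# 20 distinct b values ever occur. This is an assumption, not the real site.
def fake_pages(num_pages=10, per_page=10, distinct_keys=20):
    pages = []
    n = 0
    for _ in range(num_pages):
        page = []
        for _ in range(per_page):
            b = "example.com/file%d" % (n % distinct_keys)
            i = "http://example.com/page%d" % n
            page.append((b, i))
            n += 1
        pages.append(page)
    return pages


def collect_by_value(pages):
    """Key the dictionary on b, as the script does with dic[b] = i."""
    dic = {}
    for page in pages:
        for b, i in page:
            dic[b] = i  # a repeated b overwrites the earlier entry
    return dic


def collect_by_counter(pages):
    """Key the dictionary on a running counter g instead of b."""
    dic = {}
    g = 0
    for page in pages:
        for b, i in page:
            g += 1
            dic[g] = i  # g is new every time, so every pair is kept
    return dic


pages = fake_pages()
print(len(collect_by_value(pages)))    # 20
print(len(collect_by_counter(pages)))  # 100
```

If that assumption holds for the real pages, it would match what I see: keying on "g" gives 100 entries, while keying on "b" stops at 20 however many pages are fetched.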
Please help me with this :)