Dictionary limit?

Question

Creatinas 0 Newbie Poster

13 Years Ago

Hello everyone,

I have a question for you.

I have a script that grabs URLs, among other things.

Here is an example of it:

Sorry for the very messy code... I'm just testing.

while (a < 10) :
    if a == 2 :
        f = urllib.urlopen("****" % params1).read()
        linkai = re.compile('</a> -     <a href="(.*?)"')
        surasti = re.findall(linkai, f)
        for link in surasti:
            u = urllib.urlopen(link).read()
            urlas = re.compile('ne"><a href="(.*?)"')
            miau = re.compile('(?s)<pre.*?>http://(.+?)</pre>')
            surasti2 = re.findall(urlas , u)
            miau2 = re.findall(miau, u)

            time.sleep(3)
            for i in surasti2:
                print i
                l.write(i + '\n')
            
            for b in miau2:
                print (b + '\n\n')
                l2.write(b + '\n\n')
                g +=1
            dic[b]=i
            
        a = a + 1        
     
    else:
        f = urllib.urlopen("*****" % params2).read()
        linkai = re.compile('</a> -     <a href="(.*?)"')
        surasti = re.findall(linkai, f)
        
        
        for link in surasti:
            u = urllib.urlopen(link).read()
            urlas = re.compile('ne"><a href="(.*?)"')
            miau = re.compile('(?s)<pre.*?>http://(.+?)</pre>')
            surasti2 = re.findall(urlas , u)
            miau2 = re.findall(miau, u)
            
            time.sleep(3)

            for i in surasti2:
                print i
                l.write(i + '\n')
                
            for b in miau2:
                print (b + '\n\n')
                l2.write(b + '\n\n')
                g +=1

            dic[b]=i
        
        a = a + 1

Every time I run this over, let's say, 10 pages (each page contains another 10 URLs), the total should be 100 URLs, so the dictionary should have 100 entries.

Here is an example of a few dictionary entries:

{ 'http://sdfsdfasdfs.com/sdfsdf' : 'http:sdfasdfasdf.com/asdfasdfas' , 'http://sdfsdfasdfs.com/sdfsdf' : 'http:sdfasdfasdf.com/asdfasdfas' and so on}

But the most entries I ever get is 20! I could set it to 100 pages (which should give 1000 entries (pairs) in the dictionary), but I get 20 entries no matter what!

I ran a test and changed the key variable "b" to "g" (the counter incremented by g += 1), and it works just fine: running through 10 pages (100 entries), I get 100 entries (pairs) in the dictionary.

Please help me with this :)

python


All 6 Replies

TrustyTony 888 ex-Moderator

13 Years Ago

Please do not handle HTML with regular expressions; use a proper tool like Beautiful Soup:
http://www.crummy.com/software/BeautifulSoup/
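As a rough illustration of parsing instead of pattern matching, here is a minimal sketch that collects link targets with a real HTML parser. It deliberately swaps in the standard library's html.parser (Python 3) rather than Beautiful Soup, so it runs without installing anything. The names LinkCollector, extract_links and sample_html are made up for this example and are not from the original script.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup, not taken from the pages in the question.
sample_html = (
    '<p><a href="http://example.com/a">A</a> - '
    '<a class="x" href="http://example.com/b">B</a></p>'
)
links = extract_links(sample_html)
print(links)
```

Unlike a pattern such as '</a> -     <a href="(.*?)"', this does not depend on the exact whitespace or attribute order in the page.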

A dictionary is limited only by the amount of available memory, I think. Here is proof:

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> big = dict((a,a+1) for a in range(100000))
>>> len(big)
100000
>>>


woooee 814 Nearly a Posting Maven

13 Years Ago

Dictionary keys have to be unique, so if you try to add a key that is already in the dictionary, it will not add a new key but overwrite the value stored under the existing key. Add some code to test whether the key already exists in the dictionary and print a message if it does. Finally, this code does not add to the dictionary on each pass through the for loop; the assignment sits after the loop, so it stores only the values left over from the final pass. I can't tell from the code whether that is what you want, so I can't give any more help.

for b in miau2:
    print (b + '\n\n')
    l2.write(b + '\n\n')
    g +=1
 
dic[b]=i
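Both points can be sketched as follows. This is only an illustration under assumptions: the URLs are invented, and it assumes the two result lists line up pairwise, which the thread does not confirm. The helper name add_pairs is hypothetical. The assignment is moved inside the loop and a duplicate-key check is added.

```python
def add_pairs(dic, keys, values):
    """Store each key/value pair, reporting keys that are already present."""
    duplicates = []
    for key, value in zip(keys, values):
        if key in dic:
            # A repeated key overwrites the old value instead of adding an entry.
            print("key already in dictionary: " + key)
            duplicates.append(key)
        dic[key] = value  # inside the loop, so it runs once per pair
    return duplicates


dic = {}
# Invented stand-ins for the scraped lists miau2 (keys) and surasti2 (values).
miau2 = ["a.example/1", "a.example/2", "a.example/1"]
surasti2 = ["http://x.example/1", "http://x.example/2", "http://x.example/3"]

duplicates = add_pairs(dic, miau2, surasti2)
print(len(dic))
```

With three pairs but only two distinct keys, the dictionary ends up with two entries, and the repeated key keeps the value from its last occurrence.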


Creatinas 0 Newbie Poster · Answer 1 · 2011-09-29T17:02:56+00:00

Thanks for your reply. Yes, I should use Beautiful Soup instead of regex :) I need to learn it first.

Yes, I know the limit is much more than 20 entries, but why do I hit a limit of 20 every time I collect url : url pairs? With int : url everything works just fine.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 2 · 2011-09-29T17:12:34+00:00

Maybe you have only 20 distinct keys and need to collect a list of values for each key with append?
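A minimal sketch of that idea, assuming the same key really does turn up with several different values. The pairs below are invented and the function name collect is hypothetical.

```python
def collect(pairs):
    """Group values by key so a repeated key keeps every value, not just the last."""
    grouped = {}
    for key, value in pairs:
        # setdefault returns the existing list, or stores and returns a new one
        grouped.setdefault(key, []).append(value)
    return grouped


# Invented sample data: the first key appears twice.
pairs = [
    ("a.example/1", "http://x.example/1"),
    ("a.example/2", "http://x.example/2"),
    ("a.example/1", "http://x.example/3"),
]
grouped = collect(pairs)
print(grouped["a.example/1"])
```

The dictionary still has one entry per distinct key, but no value is lost: the total number of stored values equals the number of pairs.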

Creatinas 0 Newbie Poster · Answer 3 · 2011-09-29T19:44:33+00:00

I tried appending one URL to one list and the other URL to another list. Printed out, they had 80 items each. I then tried writing every item from those lists into the dictionary, and got 20 pairs max... :) So strange :)
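One possible explanation for 80 list items turning into 20 pairs, not confirmed anywhere in the thread, is that the key list holds only 20 distinct values. A quick way to check is to compare the list length with the number of unique keys. The data below is fabricated purely to reproduce those numbers, and repeated_keys is a hypothetical helper.

```python
from collections import Counter


def repeated_keys(keys):
    """Return a dict of each key that occurs more than once and its count."""
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n > 1}


# Fabricated data: 80 keys that cycle through only 20 distinct values.
keys = ["a.example/%d" % (n % 20) for n in range(80)]
values = ["http://x.example/%d" % n for n in range(80)]

pairs = dict(zip(keys, values))
repeats = repeated_keys(keys)
print(len(keys), len(set(keys)), len(pairs))
```

If len(set(keys)) is smaller than len(keys), the dictionary can never hold more entries than the smaller number, however many pages are fetched.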

Creatinas 0 Newbie Poster · Answer 4 · 2011-09-29T21:21:38+00:00

Solved!! Thank you, guys! :)
