Reading from a website

Question

nuaris 0 Newbie Poster

16 Years Ago

How do I read a .txt/.csv file from an internet address? For example: http://www.internetaddress.com/file.txt. I don't think file() would work for this.

Thanks

python


All 8 Replies

Stefano Mtangoo 455 Senior Poster

16 Years Ago

Basic example with a for loop (Python 2):

# Read a URL line by line and copy it to a local text file (Python 2)
from urllib2 import urlopen

ur = urlopen("http://www.daniweb.com/forums/thread161312.html")  # open the URL
contents = ur.readlines()  # read all lines from the response
fo = open("test.txt", "w")  # open test.txt for writing
for line in contents:
    print "writing %s to a file" % (line,)
    fo.write(line)  # write each line from the URL to the text file
fo.close()  # close the text file
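
For anyone on Python 3, the same copy-to-file idea might look roughly like the sketch below. It assumes the body should be saved byte for byte, and save_url_to_file is a hypothetical helper name, not something from urllib.

```python
import shutil
from urllib.request import urlopen


def save_url_to_file(url, path):
    """Copy the body found at `url` into the local file `path`.

    Hypothetical helper for illustration only.
    """
    # The response object is file-like, so copyfileobj can stream it
    # in chunks instead of loading the whole body into memory.
    with urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path
```

Opening the output in binary mode avoids guessing the text encoding; the with block closes both the response and the file even if the download fails partway.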

Stefano Mtangoo 455 Senior Poster

16 Years Ago

http://docs.python.org/library/urllib2.html


Gribouillis 1,391 Programming Explorer Team Colleague · Answer 1 · 2008-12-09T00:55:26+00:00

from urllib2 import urlopen
data = urlopen("http://internetaddress.com/file.txt").read()
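
Since the question also mentions .csv files, here is a minimal Python 3 sketch of feeding the downloaded text to the csv module. The UTF-8 default is an assumption about the file, and parse_csv_text and read_csv_from_url are hypothetical names used only for this example.

```python
import csv
import io
from urllib.request import urlopen


def parse_csv_text(text):
    """Return the rows of CSV text as a list of lists of strings."""
    return list(csv.reader(io.StringIO(text)))


def read_csv_from_url(url, encoding="utf-8"):
    """Download `url` and parse it as CSV.

    The UTF-8 default is an assumption; pass the real encoding if known.
    """
    with urlopen(url) as response:
        text = response.read().decode(encoding)
    return parse_csv_text(text)
```

Usage would be something like rows = read_csv_from_url("http://internetaddress.com/file.txt"), after which each row is a list of column values.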

nuaris 0 Newbie Poster · Answer 2 · 2008-12-10T06:29:56+00:00

Thanks for the help. That solved my problem.

rmad17 0 Newbie Poster · Answer 3 · 2013-07-26T11:04:16+00:00

How to remove all the html tags?

Jalexmaines 0 Newbie Poster · Answer 4 · 2015-03-23T15:40:36+00:00

urlopen() does not seem to work for me, as in I cannot import it. I am using Python 3.4.3 though.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 5 · 2015-03-23T15:47:55+00:00

In Python 3, urlopen() is in the module urllib.request. You can go to https://docs.python.org/3/index.html and type the name of a function in the quick search box to find it in the documentation.
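
If a script has to run on both versions, one common pattern is to try the Python 3 location first and fall back to the Python 2 one. This is only a sketch of that pattern:

```python
try:
    # Python 3: urlopen lives in urllib.request
    from urllib.request import urlopen
except ImportError:
    # Python 2: fall back to the old urllib2 module
    from urllib2 import urlopen
```

After this, the rest of the script can call urlopen() without caring which interpreter it runs on.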

snippsat 661 Master Poster · Answer 6 · 2015-03-23T20:34:25+00:00

Here are the different ways,
and also what I would call the preferred way these days, with Requests.

Python 2:

from urllib2 import urlopen

page_source = urlopen("http://python.org").read()
print page_source

Python 3:

from urllib.request import urlopen

page_source = urlopen('http://python.org').read().decode('utf_8')
print(page_source)

In Python 3, read() returns bytes, so we decode as UTF-8 to get a str.
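
Not every page is UTF-8, so a slightly more careful sketch would ask the response headers for the declared charset and fall back to UTF-8 only when none is given. decode_body and read_text are hypothetical helper names, and the UTF-8 fallback is an assumption.

```python
from urllib.request import urlopen


def decode_body(raw, charset=None, fallback="utf-8"):
    """Decode bytes with the declared charset, else the assumed fallback."""
    # errors="replace" keeps going on bad bytes instead of raising.
    return raw.decode(charset or fallback, errors="replace")


def read_text(url):
    """Fetch `url` and decode it using the charset in its Content-Type header."""
    with urlopen(url) as response:
        # get_content_charset() returns None when no charset is declared.
        charset = response.headers.get_content_charset()
        return decode_body(response.read(), charset)
```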

Here it is with Requests, which works for Python 2 and 3:

import requests

page_source = requests.get('http://python.org')
print(page_source.text)

For basic web scraping, we read the page with Requests and parse it with BeautifulSoup.

import requests
from bs4 import BeautifulSoup    

page_source = requests.get('http://python.org')
soup = BeautifulSoup(page_source.text, 'html.parser')  # name the parser explicitly
print(soup.find('title').text) #--> Welcome to Python.org
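
On the earlier question about removing all the HTML tags: with BeautifulSoup, soup.get_text() returns the text without markup. If installing bs4 is not an option, a rough standard-library sketch in Python 3 could look like this. TextExtractor and strip_tags are hypothetical names, and skipping script and style content is a choice made for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Return the visible text of `html` with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())
```

For real-world pages with broken markup, BeautifulSoup is usually the more forgiving option.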
