What are these errors and how do I handle them?
10-07-2019
Question
I am using this simple code
for l in bios:
    OpenThisLink = url + l
    response = urllib2.urlopen(OpenThisLink)
to open about 200 URLs and search them with regex (and BeautifulSoup), but after a dozen or so I get these errors and IDLE quits. What do they mean? How can I handle them?
Thank you.
Traceback (most recent call last):
  File "\PROJECTS\JD\jd10.py", line 15, in <module>
    response = urllib2.urlopen(OpenThisLink)
  File "C:\Python26\lib\urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 421, in error
    result = self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 597, in http_error_302
    return self.parent.open(new)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 421, in error
    result = self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 597, in http_error_302
    return self.parent.open(new)
  File "C:\Python26\lib\urllib2.py", line 389, in open
    response = meth(req, response)
  File "C:\Python26\lib\urllib2.py", line 502, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python26\lib\urllib2.py", line 427, in error
    return self._call_chain(*args)
  File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
    result = func(*args)
  File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
Solution
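The error being raised is HTTPError: specifically, one of your URLs returned a 404 (the two http_error_302 frames in the traceback show the request was redirected before the final URL came back as not found). You could either ignore it: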
for l in bios:
    OpenThisLink = url + l
    try:
        response = urllib2.urlopen(OpenThisLink)
    except urllib2.HTTPError:
        continue  # skip this URL so a stale response is never reused
Or, you could re-raise the error with a (marginally) more meaningful message:
for l in bios:
    OpenThisLink = url + l
    try:
        response = urllib2.urlopen(OpenThisLink)
    except urllib2.HTTPError as e:
        raise Exception('Error opening %s: %s' % (e.geturl(), e))
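A middle ground between those two options is to keep going but remember which links failed, so you can look at them once the loop is done. The following is only a rough sketch: fetch_all and its opener parameter are names made up for illustration (the opener is there so the loop can be tried without a network), and the import fallback is an assumption added so the same code also runs on Python 3, where urllib2 was split into urllib.request and urllib.error.

```python
# Python 2 (urllib2) with a Python 3 fallback; both provide urlopen and HTTPError.
try:
    from urllib2 import urlopen, HTTPError
except ImportError:
    from urllib.request import urlopen
    from urllib.error import HTTPError


def fetch_all(url, bios, opener=urlopen):
    """Open url + l for every l in bios and return (pages, failures).

    fetch_all and opener are hypothetical names used only in this sketch.
    """
    pages = {}     # full URL -> raw page contents
    failures = []  # (full URL, HTTP status code) for each failed request
    for l in bios:
        OpenThisLink = url + l
        try:
            response = opener(OpenThisLink)
        except HTTPError as e:
            # Record the bad link and move on instead of letting the loop die.
            failures.append((OpenThisLink, e.code))
            continue
        pages[OpenThisLink] = response.read()
    return pages, failures
```

Calling pages, failures = fetch_all(url, bios) would then leave the bad links and their status codes in failures, while the regex/BeautifulSoup work runs over pages.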
OTHER TIPS
I don't know anything about the particular libraries you're using. However, this looks to me like one big stack trace that leads to this original error at the very end:
HTTPError: HTTP Error 404: Not Found
I think one of the links was bad and that triggered an exception which wasn't caught.
Edit: By "bad" I mean the server couldn't find the requested page, hence the 404 error.
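If a missing page is the only failure you want to tolerate, the exception's code attribute lets you tell a 404 apart from other HTTP errors. This is just a sketch under assumptions: open_unless_missing and its opener parameter are invented names, and the import fallback is there so it also runs on Python 3 rather than only on the Python 2.6 shown in the traceback.

```python
# Python 2 (urllib2) with a Python 3 fallback; both provide urlopen and HTTPError.
try:
    from urllib2 import urlopen, HTTPError
except ImportError:
    from urllib.request import urlopen
    from urllib.error import HTTPError


def open_unless_missing(link, opener=urlopen):
    """Return the response for link, or None if the server answers 404.

    Hypothetical helper: any other HTTP error (403, 500, ...) is re-raised.
    """
    try:
        return opener(link)
    except HTTPError as e:
        if e.code == 404:  # the server could not find the page
            return None
        raise  # anything else is unexpected, so let it propagate
```

Inside the loop, a None result would mean "skip this link", while anything other than a 404 still stops the script so it does not go unnoticed.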