Python Penetration Testing Cookbook

With Python 3

In Python 3, the functionality of both urllib and urllib2 is merged into the single urllib package, so its usage differs from Python 2. The urllib package contains the following modules:

  • urllib.request
  • urllib.error
  • urllib.parse
  • urllib.robotparser
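As a rough orientation, the following sketch touches the three modules not covered by the steps in this section. It runs offline; the robots.txt rules and the /books and /private/ paths are made up for illustration and are not taken from any real site.

```python
import urllib.error
import urllib.parse
import urllib.robotparser

# urllib.parse: split a URL into its components.
parts = urllib.parse.urlparse("https://www.packtpub.com/books?page=2")
print(parts.scheme, parts.netloc, parts.path, parts.query)

# urllib.robotparser: evaluate robots.txt rules.
# These rules are hypothetical and are parsed from a list, not downloaded.
robots = urllib.robotparser.RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])
blocked = robots.can_fetch("*", "https://www.packtpub.com/private/page")
allowed = robots.can_fetch("*", "https://www.packtpub.com/books")
print(blocked, allowed)

# urllib.error: exceptions raised by urllib.request.
# HTTPError is a subclass of URLError, so catching URLError covers both.
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))
```
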

The urllib.request module is used for opening and fetching URLs with Python 3:

  1. First, import the urllib.request module from the urllib package:
>>> import urllib.request
  2. Get the web page with the urlopen method:
>>> webpage = urllib.request.urlopen("https://www.packtpub.com/")
  3. Read the object with the read method:
>>> source = webpage.read()
  4. Close the object:
>>> webpage.close()
  5. Print the source:
>>> print(source)
  6. You can write the contents of source to a local file on your computer as follows. Because the read method returns a bytes object, make sure that the output file is opened in binary mode:
>>> f = open('packtpub-home.html', 'wb')
>>> f.write(source)
>>> f.close()
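The steps above can also be collected into one function. This is a minimal sketch, not the book's own listing: the name fetch_and_save is hypothetical, and with statements are used in place of the explicit close calls so that the response and the file are closed even if an error occurs.

```python
import urllib.request


def fetch_and_save(url, out_path):
    """Download url, write the raw bytes to out_path, return the byte count."""
    # The with statement closes the response even if read() raises.
    with urllib.request.urlopen(url) as webpage:
        source = webpage.read()  # a bytes object, not str

    # Binary mode ('wb') is required because source is bytes.
    with open(out_path, "wb") as f:
        f.write(source)

    return len(source)
```

Calling fetch_and_save("https://www.packtpub.com/", "packtpub-home.html") needs network access and produces the same file as the interactive steps.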

In Python 2, the urllib and urllib2 modules both handle URL requests, but they offer different functionality.
urllib provides the urlencode method, which is useful for building the query string of a GET request; urllib2 has no urlencode method. Conversely, urllib2 can accept a Request object and modify the headers of a URL request, whereas urllib accepts only a URL and cannot modify the request headers.
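In Python 3, both capabilities live in the urllib package: urlencode is in urllib.parse and the Request object is in urllib.request. The sketch below shows them together. The example.com endpoint, the query parameters, and the User-Agent value are placeholders chosen for illustration, and no request is actually sent.

```python
import urllib.parse
import urllib.request

# Hypothetical query parameters for a GET request.
params = {"q": "python urllib", "page": 2}
query = urllib.parse.urlencode(params)

# Placeholder endpoint; the encoded query string is appended to it.
url = "https://www.example.com/search?" + query

# A Request object carries custom headers along with the URL.
request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

print(request.full_url)
# Request normalizes header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))
```

Passing request to urllib.request.urlopen would send it with the custom header attached.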