add_header(): adds a header to the request


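Example:

    from urllib import request as sa

    url = 'https://blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/84302896'
    # Build a Request object so a header can be attached before sending.
    r = sa.Request(url)
    # Present a browser User-Agent instead of the default Python-urllib one.
    r.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.26 Safari/537.36 Core/1.63.6776.400 QQBrowser/10.3.2601.400')
    # Send the request and read the response body as bytes.
    d = sa.urlopen(r).read()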
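Headers can also be supplied when the Request is built, rather than through add_header() afterwards. The following is a minimal sketch that sends nothing over the network; the URL and the shortened User-Agent string are placeholders chosen for illustration, not values from the example above.

```python
from urllib import request

# Hypothetical URL and User-Agent string, used only for illustration.
URL = "http://example.com/"
UA = "Mozilla/5.0 (X11; Linux x86_64)"

# Pass headers at construction time instead of calling add_header() later.
req = request.Request(URL, headers={"User-Agent": UA})

# Request normalizes header names with str.capitalize(),
# so the stored key reads "User-agent".
print(req.header_items())
print(req.get_header("User-agent"))
```

Inspecting the Request this way is a quick check that the header is set before the request goes out.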

Sending POST data: urlencode() assembles the form fields into a query string, and encode() converts that string to bytes

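Example:

    from urllib import request as sa
    from urllib import parse as sp

    url = 'http://www.iqianyue.com/mypost/'
    # urlencode() builds 'name=111&pass=222'; encode() turns it into bytes.
    p = sp.urlencode({
        'name': 111,
        'pass': 222,
    }).encode('utf-8')
    # Supplying data makes urllib send the request as a POST.
    r = sa.Request(url, p)
    d = sa.urlopen(r).read()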
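To see what the two steps produce, the intermediate values can be printed before anything is sent. This is a small sketch using the same form fields and URL as the example above; the variable names are illustrative, and no request is made.

```python
from urllib import parse, request

# Same form fields as the example above; urlencode() converts values to strings.
form = {"name": 111, "pass": 222}
query = parse.urlencode(form)   # a str such as "name=111&pass=222"
body = query.encode("utf-8")    # bytes, as required for the data argument

# Building the Request does not contact the server.
req = request.Request("http://www.iqianyue.com/mypost/", data=body)

print(query)
print(body)
print(req.get_method())         # "POST", because data is not None
```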

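http://yum.iqianyue.com/proxy: proxy server addresses

Crawling a site through a proxy server

ProxyHandler(): sets the proxy server information

build_opener(): creates an opener

install_opener(): installs the opener as the global default, so that urlopen() uses it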

Example:

    from urllib import request as sa

    def up(p, url):
        # Route HTTP requests through the proxy at address p.
        pr = sa.ProxyHandler({'http': p})
        op = sa.build_opener(pr, sa.HTTPHandler)
        # Make this opener the default used by urlopen().
        sa.install_opener(op)
        da = sa.urlopen(url).read().decode('utf-8')
        return da

    p = '219.234.5.128:3128'
    url = 'http://www.baidu.com'
    da = up(p, url)
    print(da)
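install_opener() changes the behavior of every later urlopen() call in the process. When only some requests should go through the proxy, the opener can be used directly through its open() method instead. The sketch below assumes the proxy address from the example above, which may no longer be reachable; the helper names make_proxy_opener and fetch are hypothetical, and the actual request is left commented out.

```python
from urllib import request

def make_proxy_opener(proxy_addr):
    """Return an opener that routes HTTP requests through proxy_addr."""
    handler = request.ProxyHandler({"http": proxy_addr})
    return request.build_opener(handler)

def fetch(opener, url, timeout=10):
    """Fetch url through the given opener and decode the body as UTF-8."""
    with opener.open(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Proxy address taken from the example above; it may no longer be reachable.
opener = make_proxy_opener("219.234.5.128:3128")
# html = fetch(opener, "http://www.baidu.com")  # uncomment to send the request
```

The timeout guards against a dead proxy, which would otherwise leave the call waiting on the connection.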

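DebugLog settings

HTTPHandler(): pass debuglevel=1 to print debug output for HTTP requests

HTTPSHandler(): pass debuglevel=1 to print debug output for HTTPS requests

build_opener(): creates an opener that uses the settings of the HTTPHandler and HTTPSHandler

install_opener(): installs that opener as the global default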

Example:

    from urllib import request as sa

    ht = sa.HTTPHandler(debuglevel=1)
    hs = sa.HTTPSHandler(debuglevel=1)
    op = sa.build_opener(ht, hs)
    sa.install_opener(op)
    da = sa.urlopen("http://edu.51cto.com")
    # da is the response object, not the body, so this prints its repr.
    print(da)

Output:

send: b'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: edu.51cto.com\r\nUser-Agent: Python-urllib/3.7\r\nConnection: close\r\n\r\n'

reply: 'HTTP/1.1 200 OK\r\n'

header: Date: Tue, 27 Nov 2018 03:30:42 GMT

header: Content-Type: text/html; charset=UTF-8

header: Transfer-Encoding: chunked

header: Connection: close

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Server: nginx

header: Vary: Accept-Encoding

header: Vary: Accept-Encoding

header: X-Powered-By: PHP/7.1.9

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Set-Cookie: acw_tc=276aedef15432894423986507e64d81a2f2aba60d34ca9de13e960bac343d2;path=/;HttpOnly;Max-Age=2678401

header: Load-Balancing: web01

header: Load-Balancing: web01

<http.client.HTTPResponse object at 0x03915BB0>
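The debug lines are written with print(), so they go to standard output and can be captured like any other printed text. The sketch below tries this against a throwaway server on 127.0.0.1 so that it does not depend on an external site; the HelloHandler class, the "hello" body, and the variable names are all assumptions made for illustration.

```python
import contextlib
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the server's own access log

# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# An empty ProxyHandler keeps the request local even if proxy variables are set.
opener = request.build_opener(
    request.ProxyHandler({}),
    request.HTTPHandler(debuglevel=1),
)

# Capture the send/reply/header lines instead of letting them reach the console.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    with opener.open(url, timeout=5) as resp:
        body = resp.read()

server.shutdown()
server.server_close()

debug_log = captured.getvalue()
print(debug_log)
```

Unlike the example above, this reads the body with resp.read() rather than printing the response object itself.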
