Tag: Python garbled text
Article From:https://www.cnblogs.com/weirdo-xo/p/9064770.html

I had been weighing CSDN against cnblogs as a place to write, and found that cnblogs suits me better (mainly because it is more focused on blogging, so writing there feels more natural and fluent). The topic of my first post is fixing garbled Chinese text in Python.

I wanted to scrape the text of the novel Ordinary World and ran into garbled Chinese output. I tried several approaches without success, and finally, with help from other users online, found a better solution, which I am sharing here. For other issues, see this post: https://blog.csdn.net/Winterto1990/article/details/51217363.

This is my code:

import requests
from bs4 import BeautifulSoup


# Page to crawl
url = 'http://www.pingfandeshijie.net/di-yi-bu-01.html'
# Request headers; this part is optional.
user_agent = 'Mozilla/5.0(Windows;U;WindowsNT6.1;en-us)AppleWebKit/534.50(KHTML,likeGecko)Version/5.1Safari/534.50'

headers = {"User-Agent": user_agent}
r = requests.get(url=url, headers=headers)

# Set the encoding before reading r.text. GB2312 covers simplified Chinese;
# GBK is a superset of it that also includes traditional characters.
r.encoding = 'gb2312'
demo = r.text

# demo is already a decoded str, so from_encoding is ignored here
soup = BeautifulSoup(demo, 'html.parser', from_encoding='gbk')
print(soup.find_all('p'))

The code sets an encoding in two places. After checking, only the first one (r.encoding) has any effect, because BeautifulSoup is handed r.text, which is already a decoded string (sorry, I did not correct this when I first posted). The code is commented. If anything is unclear, leave a comment and we can work it out together.
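
To make the cause of the garbled output easier to see, here is a small offline sketch that needs no network access. It assumes the page is served as GBK/GB2312 bytes without a charset in its Content-Type header, in which case requests falls back to ISO-8859-1 for text responses; I have not verified that this is exactly what the target site does. The sample string and the `can_encode` helper are made up for illustration.

```python
# Sample text standing in for the page body (hypothetical, not fetched).
original = "平凡的世界"

# The bytes a GBK-encoded page would send over the wire.
raw = original.encode("gbk")

# Decoding with the wrong codec does not raise; it silently yields mojibake.
garbled = raw.decode("iso-8859-1")

# Decoding with the right codec restores the text. This is what setting
# r.encoding before reading r.text achieves.
fixed = raw.decode("gbk")


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# GB2312 holds simplified characters only; GBK extends it and also
# covers traditional forms.
simplified_in_gb2312 = can_encode("国", "gb2312")
traditional_in_gb2312 = can_encode("國", "gb2312")
traditional_in_gbk = can_encode("國", "gbk")

print(repr(garbled))
print(fixed)
print(simplified_in_gb2312, traditional_in_gb2312, traditional_in_gbk)
```

Since GBK is a superset of GB2312, setting r.encoding = 'gbk' is probably the safer choice when a page might mix in characters outside the simplified set.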
