Article from: https://www.cnblogs.com/zhangziyan/p/9123031.html

Crawl several images from the home page of Jikexueyuan (Geek College).

The file titita.txt contains the following excerpt from the Geek College home page source:

<div class="jk-uptodate">
    <h2>Latest courses</h2>
    <ul>
        
        <li class="uptodate">
            <a href="/zhiye/course/135.html?type=50" target="_blank">
                <img class="uptodate-img" src="https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20170928/8cc3edeb-0115-43ea-a46f-db6c6e9255ca.jpg" alt="">
                <p class="uptodate-title">Introduction to the Keras framework</p>
                <p class="uptodate-info">
                    Beginner<span>|</span>5 lessons</p>
            </a>
        </li>
        
        <li class="uptodate">
            <a href="/zhiye/course/143.html?type=38" target="_blank">
                <img class="uptodate-img" src="https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20171101/b12ae422-fd63-4b7d-a0d3-13c3ab4479c5.jpg" alt="">
                <p class="uptodate-title">[Hands-on] Python and Message-Oriented Middleware</p>
                <p class="uptodate-info">
                    Intermediate<span>|</span>4 lessons</p>
            </a>
        </li>
        
        <li class="uptodate">
            <a href="/zhiye/course/134.html?type=50" target="_blank">
                <img class="uptodate-img" src="https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20170928/85a3364e-47a3-41df-b5c8-daf48a57b7cd.jpg" alt="">
                <p class="uptodate-title">Natural Language Processing</p>
                <p class="uptodate-info">
                    Beginner<span>|</span>8 lessons</p>
            </a>
        </li>
        
        <li class="uptodate">
            <a href="/zhiye/course/145.html?type=18" target="_blank">
                <img class="uptodate-img" src="https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20171123/9625ede8-31e9-4edc-93e7-74bf5b752585.jpg" alt="">
                <p class="uptodate-title">Android Performance Optimization: UI</p>
                <p class="uptodate-info">
                    Intermediate<span>|</span>7 lessons</p>
            </a>
        </li>
        
    </ul>
</div>
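
In the excerpt above, each image URL sits in the src attribute of an img tag whose class is uptodate-img. The script in this post pulls these URLs out with a regular expression, which depends on the exact attribute order and spacing. As a hedged alternative, here is a minimal sketch that reads the same attribute with the standard library's html.parser. The names UptodateImgParser and extract_img_urls are made up for illustration and are not part of the original post.

```python
from html.parser import HTMLParser


class UptodateImgParser(HTMLParser):
    """Collect the src of every <img> whose class is 'uptodate-img'."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        attr_map = dict(attrs)
        if tag == "img" and attr_map.get("class") == "uptodate-img":
            src = attr_map.get("src")
            if src:
                self.urls.append(src)


def extract_img_urls(html_text):
    """Return the thumbnail URLs found in html_text, in document order."""
    parser = UptodateImgParser()
    parser.feed(html_text)
    parser.close()
    return parser.urls


# One list item copied from the excerpt above.
sample = '''
<li class="uptodate">
    <a href="/zhiye/course/135.html?type=50" target="_blank">
        <img class="uptodate-img" src="https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20170928/8cc3edeb-0115-43ea-a46f-db6c6e9255ca.jpg" alt="">
    </a>
</li>
'''
print(extract_img_urls(sample))
```

Because the parser looks attributes up by name, it keeps working if the page later reorders class, src and alt, which the fixed regular expression would not.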

 

The Python program that downloads the images is as follows:

import os
import re
import requests

# Read the saved excerpt of the home page source.
with open('titita.txt', 'r') as f:
    html = f.read()

# Capture the src attribute of every course thumbnail.
img_urls = re.findall(r'<img class="uptodate-img" src="(.*?)" alt="">', html, re.S)

# Make sure the output folder exists before writing into it.
os.makedirs('pic', exist_ok=True)

for i, url in enumerate(img_urls):
    print('now downloading: ' + url)
    pic = requests.get(url)
    # Save the raw image bytes as pic/0.jpg, pic/1.jpg, ...
    with open(os.path.join('pic', str(i) + '.jpg'), 'wb') as fp:
        fp.write(pic.content)

# Note: requests is not installed by default; I had to create a new project and install it first.

The output in PyCharm is:

now downloading: https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20170928/8cc3edeb-0115-43ea-a46f-db6c6e9255ca.jpg
now downloading: https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20171101/b12ae422-fd63-4b7d-a0d3-13c3ab4479c5.jpg
now downloading: https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20170928/85a3364e-47a3-41df-b5c8-daf48a57b7cd.jpg
now downloading: https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/20171123/9625ede8-31e9-4edc-93e7-74bf5b752585.jpg
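
The script saves the four files under a running counter (0.jpg, 1.jpg, and so on), so the link between a saved file and the URL it came from is lost. As one possible variation, and not what the original script does, the sketch below keeps the file name that already appears at the end of each URL. The helper name local_name and its folder parameter are assumptions made for this example.

```python
import os
import posixpath
from urllib.parse import urlsplit


def local_name(url, index, folder="pic"):
    """Build a local path for an image URL (hypothetical helper)."""
    # Take the last path segment of the URL, ignoring any query string.
    name = posixpath.basename(urlsplit(url).path)
    if not name:
        # Fall back to the counter-based name used by the script.
        name = str(index) + ".jpg"
    return os.path.join(folder, name)


url = ("https://jiuye-res.jikexueyuan.com/zhiye/showcase/attach-/"
       "20170928/8cc3edeb-0115-43ea-a46f-db6c6e9255ca.jpg")
print(local_name(url, 0))
```

In the download loop, the path returned by local_name(url, i) could replace the counter-based path passed to open.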

 

The downloaded images (0.jpg through 3.jpg) then appear in the pic folder in File Explorer.
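
The result can also be checked from Python instead of a file manager. The sketch below is an addition, not part of the original post: it assumes the images were written to a folder named pic and tests whether each .jpg file starts with the two-byte JPEG start-of-image marker (FF D8). A server that returns an error page instead of an image would fail this check. The function names looks_like_jpeg and check_folder are hypothetical.

```python
import os


def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG start-of-image marker."""
    with open(path, "rb") as fp:
        return fp.read(2) == b"\xff\xd8"


def check_folder(folder="pic"):
    """Map each .jpg file name in folder to whether it looks like a JPEG."""
    results = {}
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".jpg"):
            results[name] = looks_like_jpeg(os.path.join(folder, name))
    return results


# Only report if the download folder exists in the current directory.
if os.path.isdir("pic"):
    for name, ok in check_folder("pic").items():
        print(name, "OK" if ok else "not a JPEG")
```

This only inspects the first two bytes, so it is a quick sanity check rather than a full validation of the image data.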
