Article From:https://segmentfault.com/q/1010000012136179
Question:

When you grab data with Python, you extract a JSON string STR from the web page.

{\"count\":8,\"sub_images\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7084773fb2\"}],\"uri\":\"origin\\/470700000c7084773fb2\",\"height\":1590},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47050001b69355a0bf1b\"}],\"uri\":\"origin\\/47050001b69355a0bf1b\",\"height\":1557},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470300020761150d671a\"}],\"uri\":\"origin\\/470300020761150d671a\",\"height\":1552},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/47000002200f2a0a9020\"}],\"uri\":\"origin\\/47000002200f2a0a9020\",\"height\":1575},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470000022011d5569ccb\"}],\"uri\":\"origin\\/470000022011d5569ccb\",\"height\":1588},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/4700000220127db96444\"}],\"uri\":\"origin\\/4700000220127db96444\",\"height\":1561},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"}],\"uri\":\"origin\\/46ff000532e33a9fa35a\",\"height\":1563},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470700000c7b871a5fae\"}],\"uri\":\"origin\\/470700000c7b871a5fae\",\"height\":1575}],\"max_img_width\":1178,\"labels\":[],\"sub_abstracts\":[\" \",\" \",\" \",\" \",\" \",\" \",\" \",\" \"],\"sub_titles\":[\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\"]}

You can see that it has been transferred, so it can not be used.

data = json.loads(str)

Turning into the python object, now there are two ways to find out.

1.Search the demjson library from the Internet

data = json.loads(demjson.decode(str))

2. For STR regular replacement, but it seems that it is not so good to replace, it may replace the correct data.

Is there any other way to learn Python and ask for advice?

Answer 0:
import re
import json

data = your string
pattern = "\\\""
data = re.sub(pattern, r'"', data)
pattern = "\\\\/"
data = re.sub(pattern, r"/", data)
data_json = json.loads(data)
print data_json

Answer 1:

All the replacement is empty

Answer 2:

Pro test effectiveness

import re
import json

jsonStr = '{\"count\":8,\"sub_images\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7084773fb2\"}],\"uri\":\"origin\\/470700000c7084773fb2\",\"height\":1590},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47050001b69355a0bf1b\"}],\"uri\":\"origin\\/47050001b69355a0bf1b\",\"height\":1557},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470300020761150d671a\"}],\"uri\":\"origin\\/470300020761150d671a\",\"height\":1552},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/47000002200f2a0a9020\"}],\"uri\":\"origin\\/47000002200f2a0a9020\",\"height\":1575},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470000022011d5569ccb\"}],\"uri\":\"origin\\/470000022011d5569ccb\",\"height\":1588},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/4700000220127db96444\"}],\"uri\":\"origin\\/4700000220127db96444\",\"height\":1561},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"}],\"uri\":\"origin\\/46ff000532e33a9fa35a\",\"height\":1563},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470700000c7b871a5fae\"}],\"uri\":\"origin\\/470700000c7b871a5fae\",\"height\":1575}],\"max_img_width\":1178,\"labels\":[],\"sub_abstracts\":[\" \",\" \",\" \",\" \",\" \",\" \",\" \",\" \"],\"sub_titles\":[\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\"]}'

jsonSt2 = re.sub(r'\\{1,2}', '',jsonStr)

jsonObj = json.loads(jsonStr)
print(jsonObj)
print(jsonObj.get('count'))

Answer 3:

Quite simple:

>>> print '"Hello,\\nworld!"'.decode('string_escape')
"Hello,
world!"

>>> data = json.loads('{\"count\":8,\"sub_images\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470700000c7084773fb2\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7084773fb2\"}],\"uri\":\"origin\\/470700000c7084773fb2\",\"height\":1590},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/47050001b69355a0bf1b\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47050001b69355a0bf1b\"}],\"uri\":\"origin\\/47050001b69355a0bf1b\",\"height\":1557},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470300020761150d671a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470300020761150d671a\"}],\"uri\":\"origin\\/470300020761150d671a\",\"height\":1552},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/47000002200f2a0a9020\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/47000002200f2a0a9020\"}],\"uri\":\"origin\\/47000002200f2a0a9020\",\"height\":1575},{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p1.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470000022011d5569ccb\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/470000022011d5569ccb\"}],\"uri\":\"origin\\/470000022011d5569ccb\",\"height\":1588},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/4700000220127db96444\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/4700000220127db96444\"}],\"uri\":\"origin\\/4700000220127db96444\",\"height\":1561},{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p3.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb9.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/46ff000532e33a9fa35a\"}],\"uri\":\"origin\\/46ff000532e33a9fa35a\",\"height\":1563},{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\",\"width\":1178,\"url_list\":[{\"url\":\"http:\\/\\/p9.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb1.pstatp.com\\/origin\\/470700000c7b871a5fae\"},{\"url\":\"http:\\/\\/pb3.pstatp.com\\/origin\\/470700000c7b871a5fae\"}],\"uri\":\"origin\\/470700000c7b871a5fae\",\"height\":1575}],\"max_img_width\":1178,\"labels\":[],\"sub_abstracts\":[\" \",\" \",\" \",\" \",\" \",\" \",\" \",\" \"],\"sub_titles\":[\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\",\"\\u6e05\\u65b0\\u81ea\\u7136\\uff0c\\u7f8e\\u4e3d\\u65e0\\u53cc\"]}'.decode('string_escape'))
>>> 
>>> data["count"]
8
>>> 

Answer 4:

This is the street map of today’s headlines, which can be changed directly into empty (‘)’ and then json.loads.

Leave a Reply

Your email address will not be published. Required fields are marked *