Can someone take a look at a crawler program written by ChatGPT?

Author: Qiulei7755887
Please take a look and tell me how well it wrote this.
[Python]
import requests
from bs4 import BeautifulSoup

# Search keyword and number of result pages to fetch
keyword = "海洋生态保护修复"
page_count = 10

# Build the search-page URL and query parameters
base_url = "https://www.pkulaw.com/search/"
params = {
    "searchKeyword": keyword,
    "fullText": "true",
    "range": "text",
    "library": "pflaw",
    "pageIndex": 1,
    "pageSize": 20,
    "highlight": "true"
}

# Send the search requests and collect the results
results = []
for i in range(1, page_count + 1):
    params["pageIndex"] = i
    response = requests.get(base_url, params=params, timeout=10)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    items = soup.select(".search-result .result-list .item")
    for item in items:
        link = item.select_one(".title a")
        text = item.select_one(".content .text")
        # Skip entries that lack the expected elements
        if link is None or text is None:
            continue
        title = link.text.strip()
        url = link.get("href", "")
        content = text.text.strip()
        results.append((title, url, content))

# Export the search results to a text file
with open("result.txt", "w", encoding="utf-8") as f:
    for title, url, content in results:
        f.write(title + "\n")
        f.write(url + "\n")
        f.write(content + "\n\n")

crawler, search results

土鸡炖蘑菇   

urllib.error.HTTPError: HTTP Error 404: Not Found
It throws this error.
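
For anyone who wants to reproduce the 404, here is a minimal sketch. The traceback names urllib.error.HTTPError, so it uses urllib.request from the standard library rather than requests. The helper name check_url and its opener parameter are my own assumptions for illustration and are not part of the posted script.

```python
import urllib.error
import urllib.request


def check_url(url, opener=urllib.request.urlopen):
    """Return (status_code, reason) for url instead of raising on 4xx/5xx.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=10) as resp:
            return resp.status, resp.reason
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code and reason phrase of the response
        return err.code, err.reason
```

Calling check_url on the search URL would show whether the server itself answers 404. Connection-level failures such as DNS errors raise urllib.error.URLError, which this sketch deliberately does not catch.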
Qiulei7755887
OP
  


土鸡炖蘑菇 posted on 2023-3-8 16:53
urllib.error.HTTPError: HTTP Error 404: Not Found
It throws this error.

Right, I was hoping you could help me figure out what's wrong. I can add CB as a reward.
土鸡炖蘑菇   


Qiulei7755887 posted on 2023-3-8 16:57
Right, I was hoping you could help me figure out what's wrong. I can add CB as a reward.

The URL is probably wrong, isn't it?
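
One way to test that guess is to print the full URL the script would request for each page and open it in a browser. This is only a sketch: build_search_url and page_urls are hypothetical helper names, and the endpoint and parameter names are copied from the posted script without any check that the site actually accepts them.

```python
import urllib.parse

# Endpoint taken from the posted script; unverified, it may well be the cause of the 404
BASE_URL = "https://www.pkulaw.com/search/"


def build_search_url(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    return base_url + "?" + urllib.parse.urlencode(params)


def page_urls(base_url, params, page_count):
    """Yield one full URL per result page (pageIndex = 1..page_count)."""
    for page in range(1, page_count + 1):
        yield build_search_url(base_url, {**params, "pageIndex": page})


if __name__ == "__main__":
    params = {"searchKeyword": "海洋生态保护修复", "pageSize": 20}
    for url in page_urls(BASE_URL, params, 2):
        print(url)  # paste into a browser to see whether the page exists
```

If the printed URL also gives a 404 in the browser, the problem is the address or its parameters rather than the parsing code.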