Novel scraper for the paoshu8 (泡书吧) website, with fast multithreaded downloading

Author: pnnhnjh
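A novel scraper for the paoshu8 (泡书吧) website with fast multithreaded downloading. After starting the program, open the website, pick a novel you like, and open its table-of-contents page (the novel catalog page). Copy the URL (for example, http://www.paoshu8.info/224_224190/), paste it at the input prompt, and press Enter.
Note: pressing Enter without typing anything starts downloading the sample novel.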
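The input step described above can be sketched roughly as follows. This is only an illustration and not the author's code: the function name `resolve_catalog_url` and the constant `DEFAULT_URL` are hypothetical, and using the example address from this post as the fallback for empty input is an assumption.

```python
# Assumption: the example URL from the post doubles as the sample novel.
DEFAULT_URL = "http://www.paoshu8.info/224_224190/"


def resolve_catalog_url(raw: str, default: str = DEFAULT_URL) -> str:
    """Return the catalog URL to crawl, falling back to the default on empty input."""
    url = raw.strip()
    if not url:
        # The user just pressed Enter: download the sample novel.
        return default
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not a valid catalog URL: {url!r}")
    # Ensure a trailing slash so relative chapter links can be joined onto it.
    return url if url.endswith("/") else url + "/"
```

A caller would typically pass in whatever the prompt returned, for example `resolve_catalog_url(input("Catalog URL: "))`.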
[Python]
import os
import random
import time
import requests
import threading
from queue import Queue
from lxml import etree
import logging
import colorlog
from requests.adapters import HTTPAdapter
# Configure logging
handler = colorlog.StreamHandler()
handler.setFormatter(colorlog.ColoredFormatter(
    '%(log_color)s%(asctime)s - %(levelname)s - %(message)s',
    log_colors={
        'DEBUG': 'cyan',
        'INFO': 'green',
        'WARNING': 'yellow',
        'ERROR': 'red',
        'CRITICAL': 'bold_red',
    }
))
logger = logging.getLogger()
logger.addHandler(handler)
logger.setLevel(logging.INFO)
user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
    'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'
]
# Enlarge the connection pool
session = requests.Session()
adapter = HTTPAdapter(pool_connections=100, pool_maxsize=100)  # Set the connection pool size to 100
session.mount('http://', adapter)
session.mount('https://', adapter)
def get_chaptercontent(chapter_url, temp_file, queue, semaphore, session, max_retries=5):
    semaphore.acquire()  # Acquire the semaphore
    try:
        retry_count = 0
        chaptercontent = ''
        while retry_count
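The signature above pairs a semaphore with a `max_retries` limit. A minimal sketch of how that pattern commonly looks is given below; it is not the author's implementation. The names `fetch_with_retries`, `download_all`, `fetch`, and `base_delay` are hypothetical, and the page fetch is passed in as a plain callable so the retry logic can be read (and tried) without network access.

```python
import random
import threading
import time


def fetch_with_retries(fetch, url, semaphore, max_retries=5, base_delay=0.5):
    """Call fetch(url) while holding the semaphore, retrying on failure.

    Returns the fetched text, or '' if every attempt failed.
    """
    with semaphore:  # released automatically, even if an exception escapes
        for attempt in range(1, max_retries + 1):
            try:
                return fetch(url)
            except Exception:  # broad on purpose: any failure triggers a retry
                if attempt == max_retries:
                    break
                # Wait a little longer after each failure, plus random jitter.
                time.sleep(base_delay * attempt + random.uniform(0, base_delay))
    return ''


def download_all(fetch, urls, max_workers=20):
    """Fetch all URLs on separate threads; results keep the order of urls."""
    semaphore = threading.Semaphore(max_workers)  # caps concurrent fetches
    results = [''] * len(urls)

    def worker(index, url):
        results[index] = fetch_with_retries(fetch, url, semaphore)

    threads = [threading.Thread(target=worker, args=(i, u))
               for i, u in enumerate(urls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the `session` and `user_agents` objects defined earlier, `fetch` could be something like `lambda u: session.get(u, headers={'User-Agent': random.choice(user_agents)}, timeout=10).text`. Writing each result into its own slot keeps chapters in catalog order even though the threads finish in arbitrary order.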

Tags: chapters, threads

xiaohuaiwu

Learned a lot, thank you!
XiaoLuoSheng

Thank you!
zhaomingX

Thank you
lufei2002

Thank you
moka518

The code is written very clearly ❤️
159357ssy

Going to study this