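1. The crawl targets are Chinese online fiction sites such as Qidian and Jinjiang.
2. The data collected is mainly rankings, favorite counts, subscription counts, and similar statistics. It does not include the text of the novels themselves.
3. The crawl runs once per hour.
Purpose:
1. Primarily for data analysis.
2. Secondarily, I would like to build my own guide website/app.
robots.txt
Using Qidian as an example: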
User-agent: ClaudeBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: GPTbot
Disallow: /
User-Agent: *
Allow: /
Disallow: /*.css
Disallow: /*.js
Disallow: /so/*
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap2.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap3.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap4.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap5.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap6.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap7.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap8.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap9.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap10.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap11.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap12.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap13.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap14.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap15.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap16.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-sitemap17.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/page-sitemap.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/sr_playlist-sitemap.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/post-archive-sitemap.xml
Sitemap: https://www.qidian.com/zhuanti/qyn/category-sitemap.xml
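For reference, here is a minimal sketch of how a crawler could read these rules before fetching a page, using Python's standard urllib.robotparser. The user-agent name "MyCrawler" and the path /rank/ are made-up placeholders, not anything taken from Qidian. Note that the standard-library parser may not handle wildcard patterns such as /*.css or /so/* the way RFC 9309 describes, so the sketch only checks a plain path.

```python
from urllib.robotparser import RobotFileParser

# Rule groups copied from the robots.txt above (Sitemap lines omitted).
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: GPTbot
Disallow: /
User-Agent: *
Allow: /
Disallow: /*.css
Disallow: /*.js
Disallow: /so/*
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# Hypothetical URL, used only to illustrate the check.
url = "https://www.qidian.com/rank/"

print("ClaudeBot:", is_allowed(parser, "ClaudeBot", url))
print("MyCrawler:", is_allowed(parser, "MyCrawler", url))
```

The three named bots are blocked from the whole site, while any other user agent falls under the `*` group. In a real crawler the file would be fetched from the site on each run instead of being hard-coded, since the rules can change.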
Given the above, is there any legal risk?
----------------------------------------------------
Follow-up:
If I offer the data above as a paid service, is there any legal risk?