Batch thumbnail extraction tool for Office and PDF files

Author: cchzhz  Posted: 2024-12-5 18:05:27

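Looking for a tool that batch-extracts thumbnails from Office and PDF files. Ideally it would watch a folder automatically, and best of all it could take a few inner pages and stitch their thumbnails into one large image. Does anyone have something like this?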

qiaojiwen 2024-12-5 18:06:23

pdf2image: a Python library that renders the pages of a PDF as images. Combined with a Python script, you can watch a folder and process files in batch.
Create the Python script:
import os
import time
from pdf2image import convert_from_path
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
from PIL import Image

# Path of the folder to monitor
folder_to_watch = 'your_directory_path_here'

def extract_thumbnails(pdf_path):
    # Render the first 5 pages as PIL images
    images = convert_from_path(pdf_path, first_page=1, last_page=5)
    thumbnails = []
    for img in images:
        img.thumbnail((200, 200))  # shrink in place, keeping the aspect ratio
        thumbnails.append(img)
    return thumbnails

def save_combined_image(thumbnails, output_path):
    # Paste the thumbnails side by side on a single canvas
    widths, heights = zip(*(i.size for i in thumbnails))
    total_width = sum(widths)
    max_height = max(heights)
    combined_img = Image.new('RGB', (total_width, max_height))
    x_offset = 0
    for img in thumbnails:
        combined_img.paste(img, (x_offset, 0))
        x_offset += img.width
    combined_img.save(output_path)

class WatcherHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        if event.src_path.lower().endswith('.pdf'):
            print(f"Detected new file: {event.src_path}")
            thumbnails = extract_thumbnails(event.src_path)
            if not thumbnails:
                return  # no pages rendered, nothing to stitch
            output_path = os.path.join('output', f"{os.path.basename(event.src_path)}_combined.jpg")
            save_combined_image(thumbnails, output_path)
            print(f"Saved combined thumbnail for {event.src_path} as {output_path}")

def start_monitoring():
    os.makedirs('output', exist_ok=True)
    event_handler = WatcherHandler()
    observer = Observer()
    observer.schedule(event_handler, folder_to_watch, recursive=False)
    observer.start()
    print(f"Monitoring folder: {folder_to_watch} for new PDF files...")
    try:
        while True:
            time.sleep(1)  # idle without spinning the CPU
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

if __name__ == "__main__":
    start_monitoring()
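Notes:
The script watches the specified folder (folder_to_watch); when a new PDF file is created there, it automatically extracts thumbnails of the first 5 pages.
The thumbnails are stitched into one large image and saved in the output folder.
Make sure the pdf2image, Pillow, and watchdog libraries are installed. pdf2image also needs Poppler available on the system.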
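The script stitches the thumbnails into a single horizontal strip. If a grid reads better for documents with more pages, the paste positions could be computed roughly as follows. This is only a sketch: grid_layout and its columns parameter are hypothetical names, not part of the script or of Pillow, and it uses nothing beyond the standard library.

```python
def grid_layout(sizes, columns):
    """Return (canvas_size, positions) for pasting tiles row by row.

    sizes   -- list of (width, height) tuples, one per thumbnail
    columns -- maximum number of thumbnails per row (hypothetical parameter)
    """
    if columns < 1:
        raise ValueError("columns must be at least 1")
    positions = []
    canvas_width = 0
    y = 0
    for start in range(0, len(sizes), columns):
        row = sizes[start:start + columns]
        x = 0
        for width, _height in row:
            positions.append((x, y))  # top-left corner of this tile
            x += width
        canvas_width = max(canvas_width, x)
        # The next row starts below the tallest tile of this row
        y += max(height for _width, height in row)
    return (canvas_width, y), positions
```

With the thumbnails from extract_thumbnails, the sizes would be [img.size for img in thumbnails]; the canvas could then be created with Image.new('RGB', canvas_size) and each thumbnail pasted at its matching position.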
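To run:
Set folder_to_watch to the path of the folder you want to monitor.
Once started, the script keeps monitoring that folder and extracts and stitches thumbnails for each new PDF file that appears in it.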
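One thing to keep in mind: the watcher only reacts to files created after it starts, so PDFs already sitting in the folder are skipped. For the batch part of the request, a one-off scan before monitoring begins might look like the sketch below. pending_pdfs is a hypothetical helper, and it assumes the same "<file name>_combined.jpg" naming that the script uses for its output.

```python
from pathlib import Path

def pending_pdfs(watch_dir, output_dir):
    """List PDFs in watch_dir that have no combined thumbnail in output_dir yet."""
    watch_dir, output_dir = Path(watch_dir), Path(output_dir)
    pending = []
    for path in sorted(watch_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        # Same naming rule as the script: "<file name>_combined.jpg"
        target = output_dir / f"{path.name}_combined.jpg"
        if not target.exists():
            pending.append(path)
    return pending
```

Each returned path could be passed through extract_thumbnails and save_combined_image before start_monitoring() is called, so a restart does not redo files that were already processed.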
