Running large language models like OPT-175B/GPT-3 on a single GPU. Up to 100x faster than other offloading systems.
Hardware: an NVIDIA T4 (16GB) instance on GCP with 208GB of DRAM and 1.5TB of SSD.
I'm not convinced it really needs that much DRAM, though.
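A quick sanity check on the memory question, as a rough sketch rather than anything taken from the project itself. It assumes about 175 billion parameters (from the model name) stored as uncompressed FP16 at 2 bytes each, and it counts only the weights, ignoring the KV cache and activations. The helper `weights_gb` is a made-up name for illustration.

```python
# Back-of-envelope estimate. The parameter count and precision are assumptions,
# and only the weights are counted (no KV cache, no activations).
PARAMS = 175e9        # OPT-175B: roughly 175 billion parameters (assumed from the name)
BYTES_PER_PARAM = 2   # assumes uncompressed FP16 weights
GB = 1e9              # decimal gigabytes, matching the spec-sheet style numbers


def weights_gb(params: float, bytes_per_param: float) -> float:
    """Size of the model weights in decimal GB (hypothetical helper)."""
    return params * bytes_per_param / GB


# Hardware from the post: T4 GPU memory, DRAM, SSD (all in GB).
gpu_gb, dram_gb, ssd_gb = 16, 208, 1500

need = weights_gb(PARAMS, BYTES_PER_PARAM)
print(f"FP16 weights:     {need:.0f} GB")
print(f"GPU + DRAM:       {gpu_gb + dram_gb} GB")
print(f"Fits without SSD: {need <= gpu_gb + dram_gb}")
print(f"Fits with SSD:    {need <= gpu_gb + dram_gb + ssd_gb}")
```

Under these assumptions the weights alone come to about 350 GB, which is more than the 224 GB of GPU memory plus DRAM, so part of the model has to sit on the SSD. More DRAM would mean fewer SSD reads, which is probably why the instance has so much of it. Whether a smaller amount would still work is not something this estimate can answer.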
2.3k stars in just 12 hours. So this is what international hype looks like (tactical lean-back).