M2 Max + 96GB unified memory == 7B @ 10 token/s (https://github.com/facebookresearch/llama/issues/79#issuecomment-1460500315)
12700k + 128GB RAM + 8GB 3070Ti == 65B @ 0.01 token/s (https://github.com/facebookresearch/llama/issues/79#issuecomment-1460464011)
Ryzen 5800X + 32GB RAM + 16GB 2070 == 7B @ 1 token/s (https://github.com/facebookresearch/llama/issues/79#issuecomment-1457172578)
2x 8GB 3060 == 7B @ 3 token/s (https://github.com/facebookresearch/llama/issues/79#issuecomment-1457437793)
8x 24GB 3090 == 65B @ 500 token/s (https://github.com/facebookresearch/llama/issues/79#issuecomment-1455284428)
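
To get a feel for what these rates mean in practice, here is a rough sketch that converts each reported throughput into wall-clock time for a reply. The 256-token reply length and the function name `seconds_to_generate` are assumptions made for illustration. The rates are copied from the list above, and since the setups run different model sizes the results are not a like-for-like comparison.

```python
# Reported throughput in tokens per second, copied from the list above.
REPORTED_TOKENS_PER_SECOND = {
    "M2 Max + 96GB unified memory": 10.0,
    "12700k + 128GB RAM + 8GB 3070Ti": 0.01,
    "Ryzen 5800X + 32GB RAM + 16GB 2070": 1.0,
    "2x 8GB 3060": 3.0,
    "8x 24GB 3090": 500.0,
}


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Estimate wall-clock seconds to generate num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    reply_length = 256  # hypothetical reply length, not taken from the reports
    for setup, rate in REPORTED_TOKENS_PER_SECOND.items():
        seconds = seconds_to_generate(reply_length, rate)
        print(f"{setup}: about {seconds:.1f} s for {reply_length} tokens")
```

This assumes a constant generation rate and ignores model load time and prompt processing, so treat the output as an order-of-magnitude guide only.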