Andrej Karpathy @karpathy · Jul 18, 2022
                
Great post on the technical challenges of training a 176B Transformer Language Model. ~10 years ago you'd train neural nets on your CPU workstation with Matlab. Now you need a compute cluster and very careful orchestration of its GPU memory w.r.t. both limits and access patterns.
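
For a sense of scale, here is a back-of-the-envelope sketch of the memory a 176B-parameter model needs just for weights, gradients, and Adam optimizer state in mixed precision. The per-parameter byte counts and the 80 GB A100 comparison are standard assumptions for illustration, not figures taken from the linked post.

# Rough memory arithmetic for a 176B-parameter model trained with Adam
# in mixed precision. Byte counts per parameter are common assumptions
# (bf16 weights/grads, fp32 master weights and Adam moments), not figures
# from the linked post.
n_params = 176e9

bytes_per_param = {
    "bf16 weights": 2,
    "bf16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam momentum": 4,
    "fp32 Adam variance": 4,
}

state_bytes = n_params * sum(bytes_per_param.values())
print(f"weights + grads + optimizer state: {state_bytes / 1e12:.2f} TB")  # ~2.82 TB

a100_bytes = 80e9  # one 80 GB A100, for scale
print(f"A100s needed just to hold that state: {state_bytes / a100_bytes:.0f}")  # ~35
# Activations, communication buffers, and fragmentation come on top, so the
# state has to be sharded across the cluster (tensor/pipeline parallelism,
# ZeRO-style optimizer sharding) -- the careful orchestration mentioned above.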