Andrej Karpathy @karpathy · Jul 18, 2022
  • From Twitter

Great post on the technical challenges of training a 176B Transformer Language Model. ~10 years ago you'd train neural nets on your CPU workstation with Matlab. Now need a compute cluster and very careful orchestration of its GPU memory w.r.t. both limits and access patterns.
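A back-of-envelope calculation makes the point about GPU memory concrete. This is an illustrative sketch only, not figures from the article: it assumes a common mixed-precision Adam setup, where each parameter carries bf16 weights and gradients plus fp32 master weights and two fp32 optimizer moments.

```python
# Why a 176B-parameter model cannot fit on one GPU.
# Assumed per-parameter training state (illustrative, typical for
# mixed-precision Adam; not taken from the article):
#   2 bytes  bf16 weights
#   2 bytes  bf16 gradients
#   4 bytes  fp32 master weights
#   8 bytes  fp32 Adam moments (two per parameter)
BYTES_PER_PARAM = 2 + 2 + 4 + 8  # = 16
params = 176e9

total_gb = params * BYTES_PER_PARAM / 1e9
gpu_gb = 80  # capacity of one A100-80GB, for scale

print(f"~{total_gb:,.0f} GB of training state")
print(f"~{total_gb / gpu_gb:.0f}x the memory of one 80GB GPU")
```

Even before counting activations, the training state alone spans dozens of GPUs, which is why the sharding and orchestration strategies the article describes are necessary.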

Article Jul 14, 2022
The Technology Behind BLOOM Training
by Stas Bekman