Andrej Karpathy

101 Followers

community-curated profile

Director of AI at Tesla. Previously a research scientist at OpenAI and CS PhD student at Stanford. I like to train deep neural nets on large datasets 🧠🤖💥


Andrej Karpathy @karpathy · Mar 12, 2024

From Twitter

+1 to the best AI newsletter atm that I enjoy skimming, great/ambitious work by @swyx & friends: [link] "Skimming" because they are very long. Not sure how it is built, sounds like there is a lot of LLM aid going on indexing ~356 Twitters, ~21 Discords, etc.

AI News

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Jan 2, 2024

From Twitter

"After 34 Years, Someone Finally Beat Tetris" Wow, incredible video on what it took to beat Tetris, waaay beyond the game's original design. Also a great reference for reinforcement learning and what superintelligence might look like.

Video Dec 31, 2023

After 34 Years, Someone Finally Beat Tetris

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Oct 17, 2023

From Twitter

State of AI Report: very nice snapshot of the AI ecosystem across research, industry and (geo)politics (as usual each year :)).

Report 2023

State of AI Report 2023

by Nathan Benaich

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Jun 30, 2023

From Twitter

I think this is mostly right.
- LLMs created a whole new layer of abstraction and profession.
- I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone, there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad.
- ML people train algorithms/networks, usually from scratch, usually at lower capability.
- LLM training is becoming sufficiently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large scale training of transformers on supercomputers.
- In numbers, there's probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers.
- One can be quite successful in this role without ever training anything.
- I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (imo ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵‍💫

Article Jun 30, 2023

The Rise of the AI Engineer

by swyx.ai

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Nov 17, 2022

From Twitter

Good post. A lot of interest atm in wiring up LLMs to a wider compute infrastructure via text I/O (e.g. calculator, python interpreter, google search, scratchpads, databases, ...). The LLM becomes the "cognitive engine" orchestrating resources, its thought stack trace in raw text

Article Nov 16, 2022

The Near Future of AI is Action-Driven

by John McDonnell

Recommended by 1 person

1 mention
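The text-I/O wiring described in the Nov 17, 2022 post above could be sketched roughly as follows. This is a minimal illustration, not any particular framework's API: the `CALL`/`RESULT`/`FINAL` line protocol, the `TOOLS` registry, and `fake_llm` (a stub standing in for a real model) are all assumptions made for the example. The growing transcript plays the role of the "thought stack trace in raw text".

```python
import ast
import operator

# Hypothetical tool: a small arithmetic evaluator that avoids eval().
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculator(expression: str) -> str:
    """Evaluate +, -, *, / on numeric literals and return the result as text."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")
    return str(_eval(ast.parse(expression, mode="eval").body))

# Hypothetical registry: every tool maps text in to text out.
TOOLS = {"calculator": calculator}

def fake_llm(transcript: str) -> str:
    """Stub model (assumption): requests one tool call, then answers."""
    if "RESULT:" not in transcript:
        return "CALL calculator: 12 * 7"
    result = transcript.rsplit("RESULT:", 1)[1].strip()
    return f"FINAL: The answer is {result}."

def run(question: str, llm=fake_llm, max_steps: int = 5) -> tuple[str, str]:
    """Loop: ask the model, dispatch any tool call, append output to the transcript."""
    transcript = f"QUESTION: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip(), transcript
        if reply.startswith("CALL "):
            name, _, arg = reply[len("CALL "):].partition(":")
            tool = TOOLS.get(name.strip())
            output = tool(arg.strip()) if tool else f"unknown tool {name.strip()!r}"
            transcript += f"RESULT: {output}\n"
    return "no answer within step budget", transcript

if __name__ == "__main__":
    answer, trace = run("What is 12 * 7?")
    print(trace)
    print(answer)
```

Swapping `fake_llm` for a real model call would leave the loop unchanged, since everything the model sees and emits is plain text.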

Andrej Karpathy @karpathy · Nov 11, 2022

From Twitter

Excellent post about applying insights from ML (overfitting control) to a much broader class of systems that optimize against an objective: politics, science, orgs, daily life. Underfitting is underrated.

Article Nov 6, 2022

Too much efficiency makes everything worse: overfitting and the strong version of Goodhart’s law

by Jascha Sohl-Dickstein

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Sep 6, 2022

From Twitter

"AI And The Limits Of Language" good article on a big open question in my mind - how much can an AI learn from internet text alone? what if we added a lot of images/videos from the internet? do we have to reach all the way to embodied agents?

Article Aug 23, 2022

AI And The Limits Of Language

by Jake Browning and Yann LeCun

Recommended by 2 people

2 mentions

Andrej Karpathy @karpathy · Jul 23, 2022

From Twitter

(randomly triggered while reading Animal Eyes, which is quite excellent)

Book Jan 1, 2002

Animal Eyes

by Michael F. Land and Dan-Eric Nilsson

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Jul 18, 2022

From Twitter

Great post on the technical challenges of training a 176B Transformer Language Model. ~10 years ago you'd train neural nets on your CPU workstation with Matlab. Now need a compute cluster and very careful orchestration of its GPU memory w.r.t. both limits and access patterns.

Article Jul 14, 2022

The Technology Behind BLOOM Training

by Stas Bekman

Recommended by 1 person

1 mention
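To give a sense of why a model at the 176B scale mentioned in the Jul 18, 2022 post above needs a cluster, here is a back-of-the-envelope sketch. It assumes the common mixed-precision Adam accounting of about 16 bytes per parameter (fp16 weights and gradients, plus fp32 master weights and two fp32 optimizer moments) and a hypothetical 80 GB accelerator. It ignores activations, buffers, and fragmentation, and it is not a description of how BLOOM was actually configured.

```python
import math

def training_state_bytes(n_params: int, bytes_per_param: int = 16) -> int:
    """Bytes for weights, gradients, and optimizer state (assumed 16 B/param)."""
    return n_params * bytes_per_param

def min_gpus(n_params: int, gpu_mem_gb: int = 80, bytes_per_param: int = 16) -> int:
    """Lower bound on GPUs needed just to hold the training state."""
    total = training_state_bytes(n_params, bytes_per_param)
    return math.ceil(total / (gpu_mem_gb * 10**9))

if __name__ == "__main__":
    n = 176 * 10**9  # 176B parameters, as in the post
    print(f"training state: {training_state_bytes(n) / 1e12:.3f} TB")
    print(f"minimum 80 GB GPUs for state alone: {min_gpus(n)}")
```

Under these assumptions the training state alone is about 2.8 TB, so it cannot fit on a single device and has to be sharded, which is where the orchestration of memory limits and access patterns comes in.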

Andrej Karpathy @karpathy · Jul 17, 2022

From Twitter

Cool, wasn't aware, his backpack post is awesome more generally

Article Jun 20, 2022

My 40-liter backpack travel guide

by Vitalik Buterin

Recommended by 1 person

1 mention

Andrej Karpathy @karpathy · Apr 11, 2022

From Twitter

👍Arrival is a masterpiece, Ted Chiang in top form. The short story, not the movie.

Book Jul, 2002

Arrival

by Ted Chiang

Recommended by 3 people

3 mentions

Andrej Karpathy @karpathy · Apr 1, 2022

From Twitter

Just making sure everyone read “The Bitter Lesson”, as it is one of the best compact pieces of insight into the nature of progress in AI. Good habit to keep checking ideas on whether they pass the bitter lesson gut check.

Article

The Bitter Lesson

by Rich Sutton

Recommended by 1 person

1 mention
