Deep Learning (affiliate link) — Ian Goodfellow, Yoshua Bengio, and Aaron Courville
A foundational textbook that covers the principles of deep learning. It provides both theoretical depth and practical insights, making it essential for understanding the architectures that underpin large language models.
Artificial Intelligence: A Modern Approach (affiliate link) — Stuart Russell and Peter Norvig
The definitive text on AI fundamentals — from search and planning to probabilistic reasoning and learning. A must-have for anyone who wants a comprehensive grounding in AI concepts.
Machine Learning: A Probabilistic Perspective (affiliate link) — Kevin P. Murphy
A rigorous and mathematically rich guide to machine learning through the lens of probability theory, covering everything from Bayesian inference to graphical models.
Natural Language Processing with Transformers (affiliate link) — Lewis Tunstall, Leandro von Werra, and Thomas Wolf
A practical guide to modern NLP with transformer models, focusing on the Hugging Face ecosystem. Ideal for hands-on practitioners building real-world LLM applications.
Transformers for Natural Language Processing (affiliate link) — Denis Rothman
An in-depth exploration of transformer architectures from BERT to GPT-3, complete with implementation details and real-world NLP case studies.
GPT-3: Building Innovative NLP Products Using Large Language Models (affiliate link) — Sandra Kublik, Shubham Saboo, and Dhaval Pattani
A hands-on guide to building applications using GPT-3. Covers prompt design, API integration, and productization strategies for generative AI systems.
Hands-On Large Language Models: Language Understanding and Generation (affiliate link) — Jay Alammar and Maarten Grootendorst
A practical toolkit for working with LLMs across copywriting, summarization, and semantic search. Includes fine-tuning, transformer internals, and optimization methods for production use.
AI Engineering: Building Applications with Foundation Models (affiliate link) — Chip Huyen (2025)
A modern, practical playbook for designing, deploying, and maintaining AI systems powered by foundation models. Topics include prompt engineering, RAG, evaluation, fine-tuning, latency/cost trade-offs, and continuous learning loops.
Neural Networks and Deep Learning — Michael Nielsen
A classic free online textbook that offers a clear, intuitive introduction to deep learning fundamentals and how neural networks learn from data.
Introduction to Information Retrieval (affiliate link) — Christopher Manning, Prabhakar Raghavan, and Hinrich Schütze
A foundational reference in search, ranking, and retrieval — essential background for understanding modern RAG systems and embedding-based retrieval in LLMs.
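To make the "embedding-based retrieval" idea concrete, here is a minimal, library-free sketch of the dense-retrieval step used in RAG pipelines: the query and documents are mapped to vectors and ranked by cosine similarity. This is an illustrative example, not code from the book; the 4-dimensional "embeddings" are made up, and a real system would use a trained encoder.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec: np.ndarray, doc_vecs: list[np.ndarray], k: int = 2) -> list[int]:
    """Return the indices of the k documents most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

# Toy 4-dimensional "embeddings" (a trained encoder would produce these).
docs = [np.array([0.9, 0.1, 0.0, 0.2]),   # doc 0
        np.array([0.1, 0.8, 0.3, 0.0]),   # doc 1
        np.array([0.2, 0.1, 0.9, 0.4])]   # doc 2
query = np.array([0.85, 0.2, 0.1, 0.1])

print(retrieve(query, docs))  # indices of the two closest documents
```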
Designing Data-Intensive Applications (affiliate link) — Martin Kleppmann
Explains the data systems that power large-scale AI applications. Covers distributed systems, databases, and data pipelines — indispensable for anyone engineering LLM infrastructure.
Pattern Recognition and Machine Learning (affiliate link) — Christopher Bishop
A classic and rigorous mathematical treatment of probabilistic machine learning and inference.
Deep Learning for Coders with fastai and PyTorch (affiliate link) — Jeremy Howard and Sylvain Gugger
A hands-on, code-first guide that teaches deep learning intuitively, taking readers from beginner to advanced.
Bayesian Reasoning and Machine Learning (affiliate link) — David Barber
A highly readable book connecting Bayesian modeling to modern ML techniques.
Generative Deep Learning (2nd Edition) (affiliate link) — David Foster
Explores modern generative techniques — VAEs, GANs, diffusion models, and transformers — with code examples.
Building LLMs for Production: Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG (affiliate link) — Louis-François Bouchard and Louie Peters (2024)
A practical, end-to-end guide to designing and deploying reliable LLM applications.
Practical Deep Learning for Cloud, Mobile, and Edge (affiliate link) — Anirudh Koul, Siddha Ganju, and Meher Kasam
Great for understanding deployment challenges beyond GPUs — how to optimize and serve models efficiently.
Prompt Engineering for Generative AI (affiliate link) — James Phoenix and Mike Taylor (O’Reilly, 2024)
Practical strategies for designing and evaluating prompts for ChatGPT, Claude, and Gemini-like models.
Machine Learning Engineering (affiliate link) — Andriy Burkov
A concise, actionable guide to deploying, monitoring, and scaling ML systems in production.
Reliable Machine Learning: Applying SRE Principles to ML in Production (affiliate link) — Cathy Chen, Niall Richard Murphy, Kranti Parisa, D. Sculley, and Todd Underwood
A practical guide to applying Site Reliability Engineering (SRE) principles to machine learning. Covers monitoring, governance, and operational best practices for building reliable, accountable ML systems in production.
Mining of Massive Datasets (affiliate link) — Anand Rajaraman, Jeff Ullman, and Jure Leskovec
A comprehensive text on large-scale data mining, clustering, and graph algorithms foundational to search systems.
The Alignment Problem (affiliate link) — Brian Christian
A deeply researched look at fairness, ethics, and interpretability challenges in AI systems.
Tools and Weapons (affiliate link) — Brad Smith and Carol Ann Browne (Microsoft)
A thoughtful exploration of how AI reshapes society, privacy, and governance.
Atlas of AI (affiliate link) — Kate Crawford
Investigates the human, environmental, and political costs of AI’s global infrastructure.
Reinforcement Learning: An Introduction (2nd Edition) (affiliate link) — Richard Sutton and Andrew Barto
The gold standard for understanding reinforcement learning, policy gradients, and Q-learning.
Probabilistic Graphical Models: Principles and Techniques (affiliate link) — Daphne Koller and Nir Friedman
A comprehensive text bridging probabilistic reasoning and machine learning.
Speech and Language Processing (3rd Edition, Draft) (affiliate link) — Dan Jurafsky and James H. Martin
The authoritative NLP text; the online draft is regularly updated to cover transformers and deep learning.
Agentic AI: On Evaluations — common metrics for multi-turn chatbots, RAG, and agentic systems; reviews frameworks like DeepEval, RAGAS, and OpenAI’s Evals library.
Agentic AI: Implementing Long-Term Memory — how to equip stateless LLMs with long-term memory so they can recall facts, maintain conversation context, and interconnect knowledge.
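As a rough illustration of the pattern such articles describe (not code from the article): the agent writes facts to a memory store, recalls the most relevant ones before each turn, and prepends them to the prompt so a stateless model can "remember." The recall here uses simple keyword overlap to keep the sketch short; a production system would rank with embeddings, much like the cosine-similarity sketch earlier in this list, and persist the store in a database.

```python
class LongTermMemory:
    """Toy long-term memory: store facts, recall the most relevant ones per turn."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Rank stored facts by how many words they share with the query.
        q_words = set(query.lower().split())
        ranked = sorted(self.facts,
                        key=lambda f: len(q_words & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]

memory = LongTermMemory()
memory.remember("The user's favorite language is Rust.")
memory.remember("The user is building a Slack knowledge agent.")

user_msg = "Which language should my next project use?"
facts = "\n".join(memory.recall(user_msg, k=2))
prompt = f"Known facts about the user:\n{facts}\n\nUser: {user_msg}"
# `prompt` is what the stateless chat model would actually receive this turn.
print(prompt)
```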
Agentic RAG: Company Knowledge Slack Agents — building an AI knowledge agent that searches internal docs and answers via Slack (or Teams/Discord), using LlamaIndex and Modal.
Agentic AI: Comparing New Open-Source Frameworks — hands-on testing of open-source agentic frameworks, comparing ease of use and capabilities.
Managing context on the Claude Developer Platform: An introduction to capabilities that enable developers to build AI agents that can handle long-running tasks at higher performance and without hitting context limits or losing critical information.
A Postmortem of Three Recent Issues: A detailed technical post by Anthropic analyzing three overlapping infrastructure bugs (involving context-window routing, output corruption, and approximate top-k miscompilation) that intermittently degraded Claude’s response quality.
Defeating Nondeterminism in LLM Inference: A technical deep dive published by Thinking Machines Lab that explains why operations like atomicAdd introduce nondeterminism in GPU execution, disentangles common misconceptions around “GPU concurrency + floating point math,” and surveys practical strategies to ensure reproducible results in large-scale LLM inference.
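A quick way to see the floating-point fact this discussion starts from (an illustrative sketch, not code from the post): floating-point addition is not associative, so when a parallel reduction such as atomicAdd accumulates values in a run-dependent order, identical inputs can produce slightly different sums.

```python
import random

# Floating-point addition is not associative: summing the same numbers in a
# different order can give a (slightly) different result.
values = [1e16, 1.0, -1e16, 3.14, 2.71, 1e-8] * 1000

in_order = sum(values)

shuffled = values[:]
random.seed(0)
random.shuffle(shuffled)
out_of_order = sum(shuffled)

print(in_order, out_of_order, in_order == out_of_order)
# On a typical machine the two sums differ in the low-order bits, which is the
# kind of drift an unordered GPU reduction can introduce.
```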
AI 2027: Forecasting a Superhuman AI Future: A rigorously constructed scenario released on April 3, 2025 by the nonprofit AI Futures Project, outlining quantitative forecasts of superhuman AI emerging by 2027 and exploring both acceleration and slowdown paths.
LLM Powered Autonomous Agents: A deep dive into how large language models are powering the next generation of autonomous agents, enabling systems to perform complex tasks with minimal human input.
Google “We Have No Moat, And Neither Does OpenAI”: Leaked internal Google document discussing the competitive landscape of AI and arguing that neither Google nor OpenAI has a sustainable competitive advantage in the long term.
Prompt Engineering: An introduction to prompt engineering techniques, providing guidelines on how to effectively interact with large language models to obtain the best results.
How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources: This article investigates how GPT models acquire their emergent abilities, tracing them back to the training data and architectures used.
Why did all of the public reproduction of GPT-3 fail?: This post explores the difficulties and challenges researchers faced when attempting to reproduce the capabilities of GPT-3, offering insights into why these efforts largely fell short.
Alpaca: Synthetic Data for LLMs: Stanford’s approach to generating synthetic data for fine-tuning large language models using OpenAI’s API.
Evol-Instruct: Improving Dataset Quality: Techniques for enhancing instruction datasets with evolved synthetic data.
Orca: High-Quality Data Generation: Orca paper explaining how to generate better synthetic data through instruction following and feedback models.
Scaling Laws for LLMs: A study on scaling laws, which predict LLM performance based on model and dataset size.
Chinchilla’s Wild Implications: Insights into how the scaling laws affect LLMs’ computational efficiency.
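For orientation on these two entries, the scaling-law literature typically fits pretraining loss as a function of parameter count N and training tokens D. The form and exponents below follow the published Chinchilla fit and should be read as approximate; the constants E, A, and B are left symbolic.

```latex
% Chinchilla-style parametric fit of pretraining loss (approximate exponents):
\[
  L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}},
  \qquad \alpha \approx 0.34,\;\; \beta \approx 0.28 .
\]
% Minimizing L under a fixed training-compute budget (roughly C \approx 6 N D
% FLOPs) yields the compute-optimal rule of thumb discussed in
% "Chinchilla's Wild Implications":
\[
  \min_{N,\,D}\; L(N, D) \quad \text{s.t.} \quad C \approx 6\,N D
  \;\;\Longrightarrow\;\; D^{*} \approx 20\, N^{*}
  \quad \text{(about 20 training tokens per parameter).}
\]
```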
TinyLlama: A project focused on training a Llama model from scratch, providing insights into pre-training LLMs.
BigBench: LLM Benchmarking: A large-scale benchmark for evaluating LLM capabilities across various tasks.
Training a Causal Language Model from Scratch: Hugging Face tutorial on pre-training GPT-2 from scratch using the transformers library.
LLMDataHub: Curated Datasets for LLMs: Collection of datasets for pre-training, fine-tuning, and RLHF of large language models.
Perplexity in LLMs: Hugging Face guide on measuring model perplexity for text generation tasks.
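As a quick reference for what that guide measures (a minimal sketch, not the guide's own code): perplexity is the exponentiated average negative log-likelihood the model assigns to each observed token.

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_logprobs` holds the model's natural-log probability of each
    observed token given its preceding context.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy example: a model that assigns probability 0.25 to every token has
# perplexity 4 -- it is "as uncertain as" a uniform choice over 4 tokens.
print(perplexity([math.log(0.25)] * 10))  # 4.0
```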
Karpathy’s Zero to Hero: GPT: A 2-hour course by Andrej Karpathy on building a GPT model from scratch, focusing on tokenization and transformer fundamentals.
Karpathy’s Intro to Tokenization: A detailed introduction to tokenization for LLMs, explaining how text is processed into tokens for transformer models.
Karpathy’s LLM Intro Series: Multiple introductory tutorial videos on LLMs by Andrej Karpathy. A must-see.
Andrej Karpathy
1.05M subscribers · 17 videos
Deep-dive lectures on neural networks, transformers, and practical ML from a Tesla/OpenAI legend.
AI Coffee Break with Letitia
58.1K subscribers · 138 videos
Bite-sized, lighthearted explanations of ML concepts and research papers.
Umar Jamil
72.4K subscribers · 26 videos
Clear tutorials on deep learning and practical ML engineering.
Simon Oz
14.1K subscribers · 23 videos
Concise breakdowns of AI concepts, with a focus on accessibility.
3Blue1Brown
7.67M subscribers · 220 videos
Stunning visual explanations of math foundations for ML (linear algebra, calculus, neural nets).
GPU MODE
21.8K subscribers · 121 videos
Community-driven channel exploring GPU research, AI scaling, and experiments.
AI Jason
204K subscribers · 76 videos
Practical AI tutorials and product-focused ML builds for developers.
Yannic Kilcher
298K subscribers · 478 videos
Detailed research paper reviews and commentary on the latest ML/AI trends.
Artem Kirsanov
307K subscribers · 47 videos
Explains ML/AI concepts with academic rigor — often referencing neuroscience.
Aleksa Gordić – The AI Epiphany
61.4K subscribers · 240 videos
Step-by-step coding tutorials on deep learning, transformers, and modern AI systems.
The AI Alignment Podcast: Conversations with leading AI researchers and thinkers such as Stuart Russell and Yoshua Bengio, covering cutting-edge research in AI alignment and deep learning.
Lex Fridman Podcast: Features interviews with AI pioneers like Yann LeCun, Geoffrey Hinton, Demis Hassabis, and Andrej Karpathy, discussing AI, deep learning, and the future of technology.
Machine Learning Street Talk: In-depth discussions with AI researchers such as Yannic Kilcher and Connor Leahy, tackling topics in AI ethics, deep learning, and more.
The Gradient Podcast: Interviews with researchers and practitioners in AI, deep learning, and NLP, including guests like Fei-Fei Li and Sebastian Ruder.
TWIML AI Podcast: Host Sam Charrington interviews top minds in AI and machine learning, such as Andrew Ng and Ian Goodfellow, diving deep into industry trends and research breakthroughs.
Data Skeptic: A podcast covering data science, machine learning, and AI, featuring leading experts from academia and industry, like Charles Isbell and Dario Amodei.
Andrej Karpathy – Software 3.0: A visionary keynote at AI Startup School, where Karpathy outlines the shift to “Software 3.0,” a world where foundation models are the new computing platform, and natural language becomes the new programming interface. Drawing on experience at OpenAI, Tesla, and Stanford, he explores how this transformation reshapes software, developer tools, and the future of startups.
Sam Altman on AGI, GPT‑5, and What’s Next – OpenAI Podcast Ep. 1: In the debut episode of the OpenAI Podcast, host Andrew Mayne speaks with Sam Altman about the future of AI — covering GPT‑5, AGI and superintelligence, OpenAI’s internal tools like Operator and Deep Research, AI-powered parenting (with ChatGPT as his personal assistant), and how AI is reshaping scientific workflows and productivity.
Below is a collection of university and online courses that offer a deep dive into the concepts, tools, and applications of Large Language Models (LLMs). These courses range from theoretical foundations to practical applications in business and data science.
Stanford University - TECH 16: Large Language Models for Business with Python: This course covers the use of LLMs in business applications, with a focus on practical programming with Python. Students learn how to integrate LLMs into business processes to drive innovation and efficiency.
ETH Zürich - 263-5354-00L: Large Language Models: Focused on the theoretical underpinnings and current developments of LLMs, this course covers a broad range of topics from model training to application.
UNC Chapel Hill - COMP790-101: Large Language Models: This seminar-style course reviews the latest research on LLMs, covering both foundational knowledge and emerging trends in their development.
Coursera - Natural Language Processing with Transformers: This course introduces transformers, which are the foundation of modern LLMs. It focuses on using transformers for various NLP tasks such as text classification, summarization, and translation.
DataCamp - Transformer Models for NLP: Learn how to leverage transformer models to perform advanced natural language processing tasks with hands-on coding exercises in Python.
Udemy - GPT-3 and OpenAI API: A Guide for Building LLM-Powered Applications: This course provides practical insights into using GPT-3 and OpenAI’s API to build applications that utilize LLMs, with a focus on creating conversational agents and content generation.
DeepLearning.AI - Generative AI with Large Language Models: This course from DeepLearning.AI covers the key concepts of generative AI, with a particular focus on LLMs. It includes hands-on practice in fine-tuning LLMs, prompt engineering, and applying these models to real-world use cases.
Generative AI for Everyone: Non-technical introduction to generative AI and large language models, covering prompt engineering, business applications, and strategy. Taught by Andrew Ng.
AI Python for Beginners: Learn Python API calls, chatbots, debugging, and LLM integrations. Taught by Andrew Ng.
LangChain for LLM Application Development: Build intelligent LLM apps featuring chains, memory, and QA using LangChain. Co-taught by Andrew Ng and Harrison Chase.
ChatGPT Prompt Engineering for Developers: Techniques for crafting effective prompts and building bots with the OpenAI API. Co-taught by Andrew Ng and Isa Fulford.
Building Systems with the ChatGPT API: Develop end-to-end LLM workflows and integrations using the ChatGPT API. Co-taught by Andrew Ng and Isa Fulford.
Orchestrating Workflows for GenAI Applications: Learn to turn a GenAI or RAG prototype into a production-ready, automated pipeline using Apache Airflow. (by Astronomer)
DSPy: Build and Optimize Agentic Apps: Build, debug, and optimize AI agents using DSPy and MLflow. (by Databricks)
Reinforcement Fine-Tuning LLMs with GRPO: Improve LLM reasoning and performance with reinforcement learning using GRPO (Group Relative Policy Optimization). (by Predibase)
MCP: Build Rich-Context AI Apps with Anthropic: Build AI apps that access tools, data, and prompts using Anthropic’s Model Context Protocol. (by Anthropic)
Building AI Voice Agents for Production: Build responsive, human-like AI voice applications. (by LiveKit and RealAvatar)
LLMs as Operating Systems: Agent Memory: Build memory-augmented systems with MemGPT agents. (by Letta)
Building Code Agents with Hugging Face smolagents: Build agents that write and execute code using Hugging Face’s smolagents framework. (by Hugging Face)
Building AI Browser Agents: Build browser agents that navigate and interact with websites reliably. (by AGI Inc)
Getting Structured LLM Output: Generate structured output to power robust production LLM applications. (by DotTxt)
Vibe Coding 101 with Replit: Learn to build and deploy AI coding agents in a web-based IDE. (by Replit)
Long-Term Agentic Memory with LangGraph: Build long-memory agents using LangGraph and LangMem. (by LangChain)
Event-Driven Agentic Document Workflows: Process documents and fill forms using agent workflows with RAG. (by LlamaIndex)
Build Apps with Windsurf’s AI Coding Agents: Debug and deploy applications with Windsurf’s AI-powered IDE. (by Windsurf)
Evaluating AI Agents: Evaluate, improve, and iterate on AI agents using structured assessments. (by Arize AI)
Attention in Transformers: Concepts and Code in PyTorch: Implement the attention mechanism in PyTorch and understand its impact. (by StatQuest)
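For a concrete reference point before taking that course, here is a minimal single-head scaled dot-product attention in PyTorch, with no masking, dropout, or learned projections. This is an illustrative sketch under those simplifications, not the course's own code.

```python
import math
import torch

def scaled_dot_product_attention(q: torch.Tensor,
                                 k: torch.Tensor,
                                 v: torch.Tensor) -> torch.Tensor:
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    q, k, v have shape (seq_len, d_k); masking and dropout are omitted for brevity.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (seq_len, seq_len)
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                                 # (seq_len, d_k)

# Tiny example: 4 tokens, 8-dimensional queries/keys/values, self-attention.
torch.manual_seed(0)
x = torch.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([4, 8])
```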
How Transformer LLMs Work: A visual and code-based introduction to the architecture behind modern LLMs. (by Jay Alammar & Maarten Grootendorst)
Building Towards Computer Use with Anthropic: Learn how AI assistants can perform real tasks on computers. (by Anthropic)
Build Long-Context AI Apps with Jamba: Create apps that handle long documents using the Jamba model. (by AI21 Labs)
Reasoning with o1: Learn how to use and prompt OpenAI’s o1 model for reasoning tasks. (by OpenAI)
Collaborative Writing and Coding with OpenAI Canvas: Collaborate with AI to write and code using OpenAI Canvas. (by OpenAI)
LangChain: A framework for building LLM-powered applications with modular integrations, memory, and chaining prompts.
LlamaIndex: Connects LLMs with external data like documents and databases, ideal for knowledge-augmented applications.
Dyson: Enables dynamic instruction tuning and fine-tuning of LLMs with custom prompts and instructions.
LangGraph: A LangChain-ecosystem framework for building stateful, multi-step agent workflows as graphs, where nodes represent LLM calls or tools and edges define the control flow.
DeepSpeed: Optimizes large model training with techniques like ZeRO, quantization, and memory efficiency.
Hugging Face Transformers: Provides tools for using, fine-tuning, and deploying transformer models like GPT and BERT.
OpenRouter: A unified API for routing prompts to multiple LLM providers (such as GPT-4 and Claude) through a single endpoint.
Guidance: A library to guide and structure LLM outputs programmatically for complex tasks.
Haystack: A framework for building scalable LLM-powered search and retrieval systems, including RAG pipelines.
FastRAG: Efficient framework for low-latency, scalable Retrieval-Augmented Generation (RAG) pipelines.
DSPy: A library that allows you to optimize prompts and LLM outputs through programmatic evaluation.
Curated from Chip Huyen’s writing and references — spanning foundations, deployment, evaluation, optimization, and best practices.
Please feel free to submit the web form to suggest additional links for this page.
As an Amazon Associate, this site earns from qualifying purchases. This comes at no additional cost to you. (All Amazon links are marked as “affiliate link.”)