What I read.
AI Progress & Benchmarks
- Epoch AI ↗
Research institute investigating key trends and questions about the trajectory and governance of AI.
- Artificial Analysis ↗
Independent model evaluations to understand the AI landscape and choose the best model for your use case.
- ARC Prize ↗
Nonprofit dedicated to accelerating AGI development through human-calibrated benchmarks that measure the gap between human and AI capabilities.
- METR ↗
Model Evaluation and Threat Research — measuring AI ability to complete long, complex tasks.
- Vals.ai ↗
Independent benchmarks of LLMs on tasks that mimic real industry use cases in law, finance, math, and more.
- LifeArchitect.ai ↗
Comprehensive AI model rankings, timeline of AI and language models, and AI IQ testing research.
- AI Futures Project ↗
A small research group forecasting the future of AI.
- Vending-Bench ↗
Benchmark evaluating AI models on managing a simulated vending machine business over one year, testing long-term coherence and strategic negotiation.
AI Agents & Agentic Engineering
- AI Agents — The Illustrated Guidebook ↗
A comprehensive visual guide covering everything you need to know about AI agents.
- Cognition Devin ↗
The AI software engineer — an autonomous agent that can handle full engineering tasks end-to-end.
- FutureHouse ↗
Superintelligent AI agents for scientific discovery — specializing in literature search, synthesis, and chemistry experiment planning.
- Mechanize ↗
Startup focused on fully automating work through advanced AI agents.
- Manus ↗
A general-purpose AI agent capable of autonomous task completion across a wide range of domains.
Prompting & Learning
- Prompt Engineering Guide ↗
Comprehensive guide to prompt engineering — understanding capabilities and limitations of large language models.
- Anthropic Prompt Engineering Overview ↗
Anthropic's guide to prompt engineering for Claude, covering best practices and explaining why prompt engineering is faster and more flexible than fine-tuning.
- Wharton Prompt Library ↗
Discover, refine, build, and test prompts — with examples of effective prompting approaches from the Wharton Generative AI Labs.
- OpenAI Academy ↗
Platform for democratizing AI knowledge and skills through expert-led learning, community collaboration, and up-to-date AI content.
- One Useful Thing — Ethan Mollick ↗
Wharton professor Ethan Mollick examines the implications of the AI era for work and education.
Vibe Coding
- Claude Code ↗
Anthropic's agentic AI coding tool — the tool used to build this website and my other projects.
- v0 by Vercel ↗
AI-powered UI generation — natural language to production React/Next.js components.
- Vibe Coding Explained — Google Cloud ↗
Google's guide to vibe coding, covering both exploratory and professional applications as well as workflows and lifecycle stages.
- Deep Dive into LLMs — Andrej Karpathy ↗
Detailed, accessible 3.5-hour explanation of how large language models like ChatGPT are built, trained, and fine-tuned.