West Coast Machine Learning — Channel Summaries
AI-powered summaries of 15 videos from the West Coast Machine Learning channel.
Diffusion Policy Controlling Robots - Part 1
Diffusion policy is being positioned as a practical way to teach robots dexterous, vision-guided manipulation from relatively few...
Tree of Thought Prompting
Tree of Thought prompting reframes large language model problem-solving as an explicit search process: generate candidate intermediate “thoughts,”...
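
A minimal sketch of that search loop, assuming hypothetical `propose_thoughts` and `score_thought` helpers that stand in for LLM calls (neither is a real library API):

```python
# Tree of Thought as beam-style search over partial solutions.
# propose_thoughts / score_thought are hypothetical placeholders for LLM calls.

def propose_thoughts(state: str, k: int = 3) -> list[str]:
    """Ask the model for k candidate next 'thoughts' given a partial solution."""
    raise NotImplementedError("call your LLM here")

def score_thought(state: str) -> float:
    """Ask the model (or a heuristic) how promising a partial solution looks."""
    raise NotImplementedError("call your LLM here")

def tree_of_thought(problem: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [problem]                      # partial solutions kept alive
    for _ in range(depth):
        candidates = []
        for state in frontier:
            for thought in propose_thoughts(state):
                candidates.append(state + "\n" + thought)
        # Evaluate every expansion and keep only the best `beam`;
        # dropping a branch here is the backtracking step.
        candidates.sort(key=score_thought, reverse=True)
        frontier = candidates[:beam]
    return frontier[0]
```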
Mamba sequence model - part 1
Mamba’s core pitch is that sequence models can match Transformer-quality results on language and other modalities while scaling linearly with...
State of GPT
Large language models are built through a pipeline that starts with internet-scale next-token pre-training and then progressively adds human...
DeepSeek Multihead Latent Attention
DeepSeek V2’s standout inference optimization is Multi-Head Latent Attention (MLA), a transformer attention redesign that slashes the size of the KV...
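
A rough sketch of the caching idea: cache one small latent per token instead of full per-head keys and values, and expand on the fly. Shapes and weight names (`W_down`, `W_up_k`, `W_up_v`) are illustrative assumptions, not DeepSeek's actual configuration:

```python
import numpy as np

d_model, d_latent, n_heads, d_head = 512, 64, 8, 64

rng = np.random.default_rng(0)
W_down = rng.normal(size=(d_model, d_latent)) * 0.02           # compress hidden state
W_up_k = rng.normal(size=(d_latent, n_heads * d_head)) * 0.02  # expand to keys
W_up_v = rng.normal(size=(d_latent, n_heads * d_head)) * 0.02  # expand to values

h = rng.normal(size=(10, d_model))      # hidden states for 10 tokens

# Standard MHA caches (10, n_heads*d_head) for K and again for V.
# Here only the latent is cached: (10, d_latent), 16x smaller per token.
kv_cache = h @ W_down

K = (kv_cache @ W_up_k).reshape(10, n_heads, d_head)
V = (kv_cache @ W_up_v).reshape(10, n_heads, d_head)
print(kv_cache.shape, K.shape, V.shape)  # (10, 64) (10, 8, 64) (10, 8, 64)
```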
Mamba part 2 - Can it replace Transformers?
Mamba’s core pitch is simple: it aims to match—and in some settings surpass—Transformer-style language modeling while scaling linearly with sequence...
Consistency Models
Consistency models aim to cut diffusion sampling time by replacing many denoising steps with a learned, one-step (or few-step) mapping from a noisy...
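
A minimal sampling sketch, assuming a hypothetical trained `consistency_fn` and a simplified noise schedule:

```python
import numpy as np

def consistency_fn(x_t: np.ndarray, t: float) -> np.ndarray:
    """Trained network with f(x, t_min) ≈ x; placeholder here."""
    raise NotImplementedError("load your trained consistency model")

def sample(shape, t_max=80.0, extra_steps=()):
    rng = np.random.default_rng()
    x = rng.normal(size=shape) * t_max   # start from pure noise at t_max
    x = consistency_fn(x, t_max)         # one-step sample
    for t in extra_steps:                # optional few-step refinement:
        x = x + t * rng.normal(size=shape)   # re-noise back to time t ...
        x = consistency_fn(x, t)             # ... and map back toward data
    return x
```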
Diffusion Policy Controlling Robots - Part 2
Diffusion policy for robot control turns a noisy guess of future actions into a smooth, goal-reaching trajectory by repeatedly denoising an action...
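
A sketch of that denoising loop under a generic DDPM-style schedule; `noise_pred_net` is a hypothetical stand-in for the trained observation-conditioned network:

```python
import numpy as np

def noise_pred_net(actions, obs, t):
    """Trained network predicting the noise in `actions`; placeholder."""
    raise NotImplementedError

def denoise_actions(obs, horizon=16, action_dim=2, steps=50):
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    rng = np.random.default_rng()
    actions = rng.normal(size=(horizon, action_dim))  # pure-noise trajectory
    for t in reversed(range(steps)):
        eps = noise_pred_net(actions, obs, t)
        # Standard DDPM posterior-mean step toward the clean trajectory.
        actions = (actions - betas[t] / np.sqrt(1 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            actions += np.sqrt(betas[t]) * rng.normal(size=actions.shape)
    return actions  # execute the first few actions, then re-plan
```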
Mamba part 4 - System Details and Implementation
Mamba’s core implementation hinges on a state-space “mixer” that updates a hidden state sequentially while keeping most computations...
Mamba part 3 - Details of Mamba and Structured State Space
Mamba’s core pitch is that sequence modeling can be made both fast and selective without attention’s quadratic cost. The approach builds on state...
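
A toy version of the state-space recurrence these Mamba entries revolve around: one hidden-state update per token, so cost grows linearly in sequence length, with input-dependent ("selective") parameters. The discretization and shapes are simplified illustrative assumptions, not Mamba's exact parameterization:

```python
import numpy as np

def ssm_scan(x, A, B_proj, C_proj, dt_proj):
    """x: (seq_len, d_in). Sequential selective state-space scan."""
    seq_len, d_in = x.shape
    d_state = A.shape[0]
    y = np.zeros_like(x)
    h = np.zeros((d_in, d_state))
    for t in range(seq_len):
        # Selectivity: B, C, and the step size dt all depend on the input.
        B = x[t] @ B_proj                       # (d_state,)
        C = x[t] @ C_proj                       # (d_state,)
        dt = np.log1p(np.exp(x[t] @ dt_proj))   # softplus keeps dt positive
        A_bar = np.exp(dt[:, None] * A)         # per-channel discretized decay
        h = A_bar * h + dt[:, None] * B * x[t][:, None]
        y[t] = h @ C
    return y

rng = np.random.default_rng(0)
d_in, d_state, seq_len = 4, 8, 32
x = rng.normal(size=(seq_len, d_in))
A = -np.exp(rng.normal(size=(d_state,)))        # negative for stable decay
y = ssm_scan(x, A,
             rng.normal(size=(d_in, d_state)) * 0.1,
             rng.normal(size=(d_in, d_state)) * 0.1,
             rng.normal(size=(d_in, d_in)) * 0.1)
print(y.shape)  # (32, 4)
```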
Alpha Geometry
Alpha Geometry is a system that solves a difficult subset of geometry proofs—specifically “plane geometry” problems—without human demonstrations, by...
Biology of LLMs - Part 1
Mechanistic interpretability is moving from “what concepts are stored where” toward “how those concepts get used to produce the next token.” The...
Parameter Efficient Fine Tuning
Parameter-efficient fine-tuning is presented as a practical way to adapt large Transformer-based language models to new tasks without retraining the...
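
As one concrete example, a LoRA-style sketch: freeze the pretrained weight and train only a low-rank update. Dimensions are illustrative assumptions:

```python
import numpy as np

d_out, d_in, rank = 512, 512, 8

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in)) * 0.02  # pretrained weight, frozen
A = rng.normal(size=(rank, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, rank))                # trainable up-projection, zero init
alpha = 16.0                               # scaling hyperparameter

def lora_forward(x):
    # Base output plus low-rank correction; only A and B receive gradients.
    return x @ W.T + (alpha / rank) * (x @ A.T @ B.T)

print(f"trainable: {A.size + B.size} vs frozen: {W.size}")  # 8192 vs 262144
```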
Rotary Positional Embeddings (RoPE): Part 1
Rotary Positional Embeddings (RoPE) replace the usual “add a position vector” approach with a rotation-based scheme that bakes relative distance...
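
A minimal NumPy sketch of the rotation and of the relative-distance property it buys (base 10000 follows the common convention):

```python
import numpy as np

def rope(x: np.ndarray, positions: np.ndarray) -> np.ndarray:
    """x: (seq_len, d); rotates consecutive dimension pairs by position."""
    seq_len, d = x.shape
    freqs = 10000.0 ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    angles = positions[:, None] * freqs[None, :]   # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin             # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Shifting query and key by the same offset leaves their dot product
# unchanged: only the relative distance matters.
rng = np.random.default_rng(0)
q, k = rng.normal(size=(1, 8)), rng.normal(size=(1, 8))
a = rope(q, np.array([3])) @ rope(k, np.array([7])).T
b = rope(q, np.array([103])) @ rope(k, np.array([107])).T
print(np.allclose(a, b))  # True
```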
Transformer Circuits Part 1
Transformer circuits work centers on a simple but powerful claim: even in a stripped-down, one-layer attention-only Transformer, the model’s behavior...
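
A sketch of that decomposition for a single hypothetical head with random weights and illustrative sizes: with no MLPs, the logits split into a direct (bigram-like) path plus per-head OV and QK circuits, each a fixed matrix product you can inspect.

```python
import numpy as np

vocab, d_model, d_head = 100, 32, 8
rng = np.random.default_rng(0)
W_E = rng.normal(size=(vocab, d_model)) * 0.02   # embedding
W_U = rng.normal(size=(d_model, vocab)) * 0.02   # unembedding
W_V = rng.normal(size=(d_model, d_head)) * 0.02  # one head's value projection
W_O = rng.normal(size=(d_head, d_model)) * 0.02  # one head's output projection
W_Q = rng.normal(size=(d_model, d_head)) * 0.02
W_K = rng.normal(size=(d_model, d_head)) * 0.02

# Direct path: which output tokens each input token promotes on its own.
direct = W_E @ W_U                               # (vocab, vocab)

# OV circuit: what the head writes to the logits for an attended-to token.
ov = W_E @ W_V @ W_O @ W_U                       # (vocab, vocab)

# QK circuit: which (query token, key token) pairs the head attends between.
qk = (W_E @ W_Q) @ (W_E @ W_K).T                 # (vocab, vocab)

print(direct.shape, ov.shape, qk.shape)
```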