AI Speculative Decoding

A technique in which a smaller draft model generates candidate tokens that a larger model then verifies in parallel, speeding up inference.

Related Concepts: AI Inference, Large Language Models (LLMs), Knowledge Distillation, Model Quantization, AI KV Cache, Small Language Models (SLMs), Tokenization
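
The draft-then-verify loop in the definition can be illustrated with a minimal sketch of the greedy variant. Everything here is an assumption made for illustration: `target_next` and `draft_next` are hypothetical toy stand-ins for the large and small models (plain functions over integer tokens), and `k` is an assumed draft length. In a real system the verification step would be one batched forward pass of the large model rather than a Python loop.

```python
from typing import Callable, List

Token = int
NextTokenFn = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    """Hypothetical 'large' model: deterministic greedy next token."""
    return (7 * ctx[-1] + len(ctx)) % 50


def draft_next(ctx: List[Token]) -> Token:
    """Hypothetical 'small' model: agrees with the target most of the time."""
    guess = target_next(ctx)
    # Deliberately disagree on every 4th context length to force rejections.
    return (guess + 1) % 50 if len(ctx) % 4 == 0 else guess


def greedy_decode(model: NextTokenFn, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextTokenFn,
    draft: NextTokenFn,
    prompt: List[Token],
    n_new: int,
    k: int = 4,
) -> List[Token]:
    """Greedy speculative decoding: draft k tokens, keep the verified prefix."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1) The draft model proposes k candidate tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2) The target model gives its own next token after each prefix.
        #    (Sequential here; a single parallel pass in a real system.)
        #    k + 1 entries, so a bonus token exists if all k drafts pass.
        verified = [target(out + proposal[:i]) for i in range(k + 1)]

        # 3) Accept the longest prefix on which draft and target agree.
        n_ok = 0
        while n_ok < k and proposal[n_ok] == verified[n_ok]:
            n_ok += 1

        # 4) Keep the accepted drafts plus the target's own token at the
        #    first disagreement (or the bonus token if none disagreed).
        out.extend(proposal[:n_ok])
        out.append(verified[n_ok])
    return out[:goal]


if __name__ == "__main__":
    prompt = [3, 1, 4]
    print(speculative_decode(target_next, draft_next, prompt, n_new=12))
```

In this greedy form the output matches what the large model alone would produce, because every kept token is either confirmed or supplied by the target. The speedup comes from accepting several tokens per verification pass. Implementations that sample instead of decoding greedily typically replace the exact-match check in step 3 with a probabilistic acceptance rule.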