Training Compute-Optimal Large Language Models (Chinchilla)

Type: paper


Related Concepts

Neural Scaling Laws
