Temperature

A parameter that controls the randomness of a language model's output — higher values produce more creative responses, lower values produce more predictable ones.

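Temperature is a positive scalar that divides the model's logits before the softmax that turns them into sampling probabilities. At temperature 0 the model always picks the highest-probability token (greedy decoding). Division by zero is undefined, so implementations treat 0 as a special case; the output is deterministic in principle, although hosted models can still show small run-to-run differences. At temperature 1 the model samples from its unmodified probability distribution. Below 1 the distribution sharpens around the most likely tokens; above 1 it flattens, so low-probability tokens become more likely, increasing diversity and creativity at the cost of coherence.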
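The scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: the function names `softmax_with_temperature` and `sample_token` and the three example logits are made up for the demonstration, and temperature 0 is handled as an explicit greedy special case.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; handle 0 as greedy decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Return a token index: argmax at temperature 0, otherwise a random draw."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for a three-token vocabulary (illustration only).
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Running the loop shows the probability of the top token shrinking as the temperature rises, while the probability of the least likely token grows.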

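As a rule of thumb, creative tasks use temperature 0.7–1.0, while factual or code-generation tasks use 0.0–0.3. Temperature interacts with top-p (nucleus sampling): temperature reshapes the probability distribution, and top-p limits sampling to the smallest set of most likely tokens whose cumulative probability reaches p. Because their effects overlap, the two parameters are usually tuned with each other in mind to balance quality and variety.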
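To see how temperature and top-p interact, the sketch below applies temperature first and then keeps the nucleus of tokens that covers `top_p` of the probability mass. The ordering (temperature before top-p), the function names `nucleus` and `sample_top_p`, and the example logits are assumptions made for illustration; real inference stacks may order or implement these steps differently.

```python
import math
import random


def nucleus(logits, temperature, top_p):
    """Return (token indices, probabilities) kept after temperature scaling and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until the cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept, [probs[i] for i in kept]


def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample one token index from the nucleus; choices() renormalizes the weights."""
    kept, weights = nucleus(logits, temperature, top_p)
    return rng.choices(kept, weights=weights, k=1)[0]


# Hypothetical logits for a three-token vocabulary (illustration only).
logits = [2.0, 1.0, 0.1]
print(nucleus(logits, temperature=0.5, top_p=0.9)[0])
print(nucleus(logits, temperature=2.0, top_p=0.9)[0])
```

With the same `top_p`, a higher temperature flattens the distribution, so more tokens are needed to reach the cumulative threshold and the nucleus grows. This is one reason the two settings are hard to reason about independently.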

Figure: Temperature, controlling LLM randomness

Frequently Asked Questions

What is Temperature?

Temperature is a parameter that controls the randomness of a language model's output: higher values produce more creative responses, lower values produce more predictable ones. Technically, it is a positive scalar that divides the model's logits before they are converted to sampling probabilities. At temperature 0 the model always picks the highest-probability token (greedy decoding, deterministic in principle). At temperature 1 the model samples from its unmodified probability distribution. Above 1, low-probability tokens become more likely, increasing diversity and creativity at the cost of coherence.

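How is Temperature used in practice?

As a rule of thumb, creative tasks use temperature 0.7–1.0, while factual or code-generation tasks use 0.0–0.3. Temperature also interacts with top-p (nucleus sampling), so the two parameters are usually tuned with each other in mind to balance quality and variety.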

Why is Temperature important in AI?

Temperature is a foundational concept in prompting technique. It controls the randomness of a language model's output, so choosing it well lets the same prompt yield more creative responses at higher values or more predictable ones at lower values.

See Also