LLM Inference
Definition
The process of running a trained LLM to generate a response, as opposed to training the model. When your agent 'thinks,' it is performing inference. Inference speed, cost, and reliability are key metrics for agent builders choosing an LLM provider.
Examples in the Wild
- Example 1: Local inference via Ollama keeps data on your machine
- Example 2: Cloud inference via the Anthropic API is fast but is billed per token
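
The per-token billing mentioned above can be sketched as a simple cost estimator. The prices and the helper name below are placeholder assumptions for illustration, not actual provider rates or a real API:

```python
# Sketch: estimating per-request cloud inference cost from token counts.
# The prices below are hypothetical placeholders, not real provider rates.

INPUT_PRICE_PER_MTOK = 3.00    # USD per million input tokens (assumed)
OUTPUT_PRICE_PER_MTOK = 15.00  # USD per million output tokens (assumed)

def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one inference call."""
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000

# A 2,000-token prompt with a 500-token response:
cost = inference_cost(2_000, 500)
```

Tracking this number per call is one way agent builders compare providers, since a long-running agent may make hundreds of inference calls per task.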