22 Appendix B — Glossary

Terms are added as the chapters introduce them, and definitions are kept short.

Abstention — An agent’s decision to stop, ask, or escalate rather than answer when the evidence will not support a reliable conclusion.

Adaptive retrieval — Retrieving evidence only when a request needs it, and critiquing whether the retrieved passages actually support the answer, rather than always pulling a fixed number of passages [1].
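As a rough illustration of the two decisions in this definition (whether to retrieve at all, and whether a retrieved passage actually supports the answer), here is a minimal sketch. The corpus, the keyword heuristics, and every function name are assumptions made for the example; Self-RAG [1] learns these decisions inside the model rather than using hand-written rules like these.

```python
from dataclasses import dataclass

# Hypothetical toy corpus; a real system would query a search index or vector store.
CORPUS = {
    "refund": "Refunds are issued within 14 days of a return request.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}


@dataclass
class Answer:
    text: str
    evidence: list[str]


def needs_retrieval(request: str) -> bool:
    """Assumed heuristic: retrieve only when the request mentions a corpus topic."""
    return any(topic in request.lower() for topic in CORPUS)


def retrieve(request: str) -> list[str]:
    """Return every passage whose topic keyword appears in the request."""
    return [text for topic, text in CORPUS.items() if topic in request.lower()]


def supports(passage: str, request: str) -> bool:
    """Assumed critique step: keep a passage only if it shares a content word with the request."""
    words = {w.strip("?.,").lower() for w in request.split() if len(w) > 3}
    return any(w in passage.lower() for w in words)


def answer(request: str) -> Answer:
    # Decision 1: skip retrieval entirely when the request does not need it.
    if not needs_retrieval(request):
        return Answer(text="(answered from the model alone)", evidence=[])
    # Decision 2: critique each passage and keep only those that support the answer.
    kept = [p for p in retrieve(request) if supports(p, request)]
    return Answer(text="(answered from retrieved evidence)", evidence=kept)
```

Calling `answer("What is 2 + 2?")` skips retrieval, while `answer("How long does a refund take?")` returns the refund passage as evidence.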

Agent — A system that perceives an environment and acts on it to pursue a goal; in the modern sense, an LLM that directs its own tool use in a loop [2], [3].
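The "tool use in a loop" sense of the term can be sketched as follows. This is a toy under stated assumptions: `scripted_model` stands in for an LLM, and the `add` tool and the `finish` action are hypothetical names chosen for the example, not part of any cited framework.

```python
from typing import Callable

# Hypothetical tool registry; the single tool adds space-separated integers.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
}


def scripted_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for an LLM: returns (action, argument) given the history so far."""
    observations = [line for line in history if line.startswith("observation: ")]
    if not observations:
        return ("add", "2 3")  # first step: choose a tool and its argument
    # Later step: the model decides it is done and reports the last observation.
    return ("finish", observations[-1].removeprefix("observation: "))


def run_agent(goal: str, model=scripted_model, max_steps: int = 5) -> str:
    """Loop: the model picks an action, the loop executes it and feeds back the result."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        action, arg = model(history)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)
        history.append(f"observation: {observation}")
    raise RuntimeError("step budget exhausted")
```

The point of the sketch is that the model, not the surrounding code, chooses each next action; the loop only executes it and stops at a step limit.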

Agent–computer interface (ACI) — The design of tools and their documentation as seen by the model; the agentic analogue of a human–computer interface [3].

Augmented LLM — An LLM equipped with tools, retrieval, and memory; the basic building block of agentic systems [3].

Autonomy budget — The finite amount of uncertainty, variation, cost, and operational risk a system is willing to tolerate in exchange for letting the model choose its own path; a way of framing autonomy as a trade to be spent only where flexibility pays for itself.

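BDI — Belief–Desire–Intention, a classical agent model in which an agent acts on its beliefs about the world, the desires it wants to satisfy, and the intentions it has committed to [4].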

Chain-of-thought (CoT) — Prompting a model to produce intermediate reasoning steps [5].

Chain-of-thought faithfulness — Whether a model’s written reasoning is the actual cause of its answer; unfaithful traces can rationalize an answer after the fact [6], [7].

Control boundary — The view of an augmentation not just as a new capability but as a place to constrain the model: what it is allowed to do, what evidence it may use, what state it may change, and what must be validated outside the model.

Grounding — Replacing a model’s fallible memory with an external source of truth (a tool result or retrieved document); it supplies evidence, not correctness.

Guardrail — A safety check validating agent inputs or outputs [8].

Handoff — Transfer of control from one agent to another [8].

MCP (Model Context Protocol) — An open standard for connecting agents to tools and data [9].

Model–runtime boundary — The separation in which the model only proposes a tool call while the surrounding runtime validates arguments, checks permissions, gates risky actions, and executes it, keeping real-world effects outside the model.
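A minimal sketch of this separation, assuming a hypothetical `ToolCall` structure and two made-up tool names. The model's output is treated as a proposal only; the runtime function below applies the permission check, argument validation, and approval gate before anything would run. Execution is simulated with a string so the example has no side effects.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """What the model proposes: a tool name plus arguments. It executes nothing itself."""
    name: str
    args: dict


# Hypothetical policy tables owned by the runtime, not by the model.
ALLOWED = {"read_file", "delete_file"}
RISKY = {"delete_file"}


def execute(call: ToolCall, approved: bool = False) -> str:
    # 1. Permission check: unknown or disallowed tools never run.
    if call.name not in ALLOWED:
        return "rejected: tool not permitted"
    # 2. Argument validation: require a string path with no parent-directory hops.
    path = call.args.get("path")
    if not isinstance(path, str) or ".." in path:
        return "rejected: invalid arguments"
    # 3. Gate risky actions behind explicit approval.
    if call.name in RISKY and not approved:
        return "pending: human approval required"
    # 4. Only now would the runtime perform the real effect (simulated here).
    return f"executed: {call.name}({path})"
```

For example, `execute(ToolCall("delete_file", {"path": "notes.txt"}))` returns the pending message until a caller passes `approved=True`.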

Non-parametric memory — Knowledge held outside a model’s weights, in an external store that can be read, audited, and updated without retraining [10].

Parametric memory — Knowledge encoded in a model’s weights during training; fluent and always available, but frozen and hard to inspect or correct [10].

Process vs. outcome supervision — Checking each intermediate reasoning step versus checking only the final answer [14].

Prompt injection — An attack in which untrusted content (a tool result, retrieved document, or memory entry) is treated as an instruction the model obeys rather than as data to evaluate [11].

ReAct — Interleaving reasoning traces and actions [12].

Reasoning topologies — The shapes a reasoning process can take, from a single chain to a branching tree to a merging graph [13].

Reflection — Self-critique to iteratively improve outputs [15].

Reliability ladder — The stack of support techniques (direct answer, chain of thought, self-consistency, tool grounding, step verification, policy checks, human approval) matched to the cost of being wrong.

Retrieval-augmented generation (RAG) — Grounding generation in retrieved documents [10].
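The retrieve-then-generate shape can be sketched as below. The three documents, the word-overlap scoring, and the prompt wording are all assumptions for illustration; the system in [10] uses a learned dense retriever and a trained generator rather than keyword overlap, and the final model call is left out here.

```python
# Hypothetical document store for the example.
DOCS = [
    "The Eiffel Tower is in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "Python was created by Guido van Rossum.",
]


def tokenize(text: str) -> set[str]:
    """Lower-case words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the query; return the top k."""
    q = tokenize(query)
    ranked = sorted(DOCS, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, k: int = 1) -> str:
    """Place the retrieved passages in the prompt so generation is grounded in them."""
    context = "\n".join(f"- {d}" for d in retrieve(query, k))
    return f"Answer using only the context.\nContext:\n{context}\nQuestion: {query}"
```

`build_prompt("Who created Python?")` yields a prompt whose context section contains the Guido van Rossum sentence; a generator would then answer from that prompt.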

Workflow — LLMs and tools orchestrated through predefined code paths [3].
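By contrast with an agent loop, a workflow fixes the sequence of steps in ordinary code. The sketch below assumes a hypothetical three-step chain (summarize, optionally shorten, translate) and a stand-in `echo_llm` so it runs without a model; the step names and the 200-character gate are illustrative choices, not taken from [3].

```python
from typing import Callable


def run_workflow(text: str, llm: Callable[[str], str]) -> str:
    """Fixed path: summarize, gate on length, then translate.

    The code, not the model, decides which step runs next.
    """
    summary = llm(f"Summarize: {text}")
    if len(summary) > 200:  # programmatic gate between steps
        summary = llm(f"Shorten: {summary}")
    return llm(f"Translate to French: {summary}")


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns the text after the instruction prefix."""
    return prompt.split(": ", 1)[1]
```

Swapping `echo_llm` for a real model client changes the outputs but not the path: the same calls happen in the same order on every run.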

[1] A. Asai, Z. Wu, Y. Wang, A. Sil, and H. Hajishirzi, “Self-RAG: Learning to retrieve, generate, and critique through self-reflection,” in International conference on learning representations (ICLR), 2024. Available: https://arxiv.org/abs/2310.11511

[2] S. Russell and P. Norvig, Artificial intelligence: A modern approach, 4th ed. Pearson, 2021.

[3] Anthropic, E. Schluntz, and B. Zhang, “Building effective agents.” https://www.anthropic.com/engineering/building-effective-agents, Dec. 2024.

[4] A. S. Rao and M. P. Georgeff, “BDI agents: From theory to practice,” in Proceedings of the first international conference on multi-agent systems (ICMAS), 1995.

[5] J. Wei et al., “Chain-of-thought prompting elicits reasoning in large language models,” in Advances in neural information processing systems (NeurIPS), 2022.

[6] I. Arcuschin, J. Janiak, R. Krzyzanowski, S. Rajamanoharan, N. Nanda, and A. Conmy, “Chain-of-thought reasoning in the wild is not always faithful,” in International conference on machine learning (ICML), 2025.

[7] Y. Chen et al., “Reasoning models don’t always say what they think,” arXiv preprint arXiv:2505.05410, 2025.

[8] OpenAI, “New tools for building agents.” https://openai.com/index/new-tools-for-building-agents/, Mar. 2025.

[9] Anthropic, “Model context protocol.” https://modelcontextprotocol.io, 2024.

[10] P. Lewis et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” in Advances in neural information processing systems (NeurIPS), 2020.

[11] OWASP, “OWASP top 10 for large language model applications.” https://owasp.org/www-project-top-10-for-large-language-model-applications/, 2025.

[12] S. Yao et al., “ReAct: Synergizing reasoning and acting in language models,” in International conference on learning representations (ICLR), 2023.

[13] M. Besta et al., “Demystifying chains, trees, and graphs of thoughts,” arXiv preprint arXiv:2401.14295, 2024.

[14] H. Lightman et al., “Let’s verify step by step,” arXiv preprint arXiv:2305.20050, 2023.

[15] N. Shinn, F. Cassano, E. Berman, A. Gopinath, K. Narasimhan, and S. Yao, “Reflexion: Language agents with verbal reinforcement learning,” in Advances in neural information processing systems (NeurIPS), 2023.