AI often feels like a magic trick. You type something, and out comes an answer that seems thoughtful, even creative. But behind the curtain, it isn’t magic – it’s mathematics, patterns, and probabilities.
If prompt engineering is learning how to “talk” to AI, then understanding how AI thinks is learning what’s happening “under the hood” when it responds. Let’s unpack that process step by step – from how words are represented, to how predictions are made, and finally how parameters like temperature shape creativity.
AI Doesn’t Think in Words – It Thinks in Numbers
When you type “Explain the concept of opportunity cost in simple terms” into an AI, it doesn’t see economic vocabulary or complicated ideas about choice. It sees numbers.
- Every word or sub-word is converted into a token (think of it like a Lego block).
- Each token is mapped to a vector – a point in a high-dimensional space.
- Similar words land close together. For example, “cat” and “dog” might be neighbors, while “banana” lives far away.
This process is called embedding. It’s how AI understands relationships between words. Imagine a giant map where every word in the language has an address. Words with similar meaning live in the same neighborhood.
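To make the neighborhood idea concrete, here’s a minimal sketch using hand-picked 3-D vectors – real embeddings have hundreds or thousands of learned dimensions, and these toy numbers are invented purely for illustration:

```python
import math

# Toy 3-D "embeddings" (hand-picked for illustration; real models
# learn these vectors from data).
embeddings = {
    "cat":    [0.90, 0.80, 0.10],
    "dog":    [0.85, 0.75, 0.15],
    "banana": [0.10, 0.20, 0.90],
}

def cosine_similarity(a, b):
    """How closely two vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# "cat" and "dog" live in the same neighborhood; "banana" does not.
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))     # close to 1
print(cosine_similarity(embeddings["cat"], embeddings["banana"]))  # much lower
```

Cosine similarity is a common way to compare embeddings because it measures direction rather than magnitude.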
AI Is Predicting, Not Understanding
At its core, AI doesn’t “understand” in the human sense. It’s predicting the next most likely token given the context.
When you ask, “What is the capital of France?” the AI isn’t recalling facts like a person. Instead:
- It looks at the pattern: “capital of X → Y.”
- From training, it has seen “France → Paris” thousands of times.
- So it predicts “Paris” as the most likely next token.
It’s less like remembering a fact and more like solving a puzzle piece by piece, where each word is chosen because, statistically, it best fits the pattern around it.
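The pattern-matching idea can be sketched as simple counting. The counts below are made up, standing in for the billions of real training examples a model actually sees:

```python
from collections import Counter

# Hypothetical continuation counts for a context pattern; a stand-in
# for what a model absorbs during training.
observed_continuations = {
    ("capital", "of", "France"): Counter({"Paris": 9500, "Lyon": 12}),
}

def predict_next(context):
    """Return the statistically best-fitting next token and its probability."""
    counts = observed_continuations[context]
    total = sum(counts.values())
    # Real models score every token in the vocabulary, not just seen ones.
    token, count = counts.most_common(1)[0]
    return token, count / total

print(predict_next(("capital", "of", "France")))  # ('Paris', ~0.999)
```

A real model generalizes far beyond literal lookup, but the core move – “which token best fits this pattern?” – is the same.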
Deep Thinking – Layers on Layers on Layers
Models like GPT are called deep learning models because they have many layers (sometimes hundreds). Each layer transforms the input into a slightly more abstract representation.
- Early layers detect simple patterns (like word order).
- Middle layers detect relationships (like subject vs. object).
- Later layers detect meaning (like tone or style).
Think of it like a restaurant kitchen:
The early layers are like the junior cooks chopping vegetables and measuring ingredients. They’re focused on the basic, raw parts (the words) and making sure they’re in the right order.
The middle layers are like the line cooks assembling the dishes. They understand how different ingredients (ideas) go together to form a coherent plate (a sentence or phrase).
The later layers are like the head chef tasting the final dish. They’re not just checking if the parts are there. They are judging the overall flavor, balance, and presentation to ensure the meal tells a story and has the right feel (the tone and meaning of the entire text).
The Role of Probability – Why Answers Vary
AI doesn’t always give the same response, even if you ask the same question twice. That’s because every next token is drawn from a probability distribution rather than chosen deterministically.
Imagine the AI is writing a story, one word at a time, with a set of weighted dice. At each step, it calculates the most logical candidates for the next word.
Let’s say you give the AI the prompt: “I went to the store and bought a…”
For the next word, the AI doesn’t just pick the single most probable one; it calculates a probability distribution for hundreds of words, and then essentially rolls a weighted die to make a choice.
So for the prompt above, the next-token probabilities could look like this:
| Next Token | Probability |
|---|---|
| Apple | 30% |
| Book | 20% |
| Shirt | 15% |
| Toy | 10% |
| Car | 5% |
Depending on settings (like temperature), the AI may choose the safest bet or roll the dice for creativity. Common words are more likely, but rare words still have a chance to show up.
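Rolling the weighted die maps directly onto weighted random sampling. Here’s a sketch using the table above, with the remaining 20% lumped into a catch-all “other” bucket for this illustration:

```python
import random

# The probability table as a distribution (the leftover 20% is
# lumped into "other" so the weights sum to 1.0).
next_tokens = ["apple", "book", "shirt", "toy", "car", "other"]
weights     = [0.30,    0.20,   0.15,    0.10,  0.05,  0.20]

# Rolling the weighted die many times: common words win most often,
# but rare ones still show up occasionally.
random.seed(42)  # fixed seed so the sketch is repeatable
samples = random.choices(next_tokens, weights=weights, k=1000)
print(samples.count("apple"))  # roughly 300 out of 1000
print(samples.count("car"))    # far fewer, but not zero
```

Each single generation step is one roll of this die, which is why identical prompts can produce different answers.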
Temperature: The Creativity Dial
Beyond the examples you provide, there’s a crucial parameter that controls the AI’s “personality”: temperature. Think of it like a spice dial – turn it down for predictability, or crank it up for wild creativity.
Temperature Definition: A parameter (typically a float between 0 and 1, although some models allow up to 2) that controls the randomness and creativity of the AI’s output.
- Low Temperature (e.g., 0.1-0.4): The AI will choose the most probable words, leading to more predictable, focused, and “safe” outputs.
- High Temperature (e.g., 0.7-1.0+): The AI will consider a wider range of less probable words, leading to more diverse, creative, and sometimes quirky or unexpected outputs.
Examples:
- Temperature 0.0: “Give me a motivational quote.” → “Success comes from hard work and determination.” (Very standard, highly probable words).
- Temperature 0.9: “Give me a motivational quote.” → “Success is a stubborn flame that grows brighter every time failure tries to snuff it out.” (More descriptive, imaginative, and metaphorical).
When to use:
- Low Temperature (0-0.5): Ideal for tasks requiring high accuracy, consistency, and factual correctness, such as coding, mathematical problems, or extracting specific data.
- High Temperature (0.6-1.0+): Best for creative tasks like brainstorming ideas, writing stories, generating marketing copy, or exploring diverse perspectives.
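Here’s a sketch of how temperature reshapes a distribution. Real models divide raw logits by the temperature before the softmax; this version works backward from probabilities (the table from earlier) to the same effect:

```python
import math

def apply_temperature(probs, temperature):
    """Re-weight a token distribution by temperature (a sketch: real
    models scale raw logits before the softmax, which is equivalent)."""
    # Convert probabilities back to logits, divide by temperature...
    logits = [math.log(p) / temperature for p in probs.values()]
    # ...then softmax (subtracting the max for numerical stability).
    max_logit = max(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(probs, exps)}

probs = {"apple": 0.30, "book": 0.20, "shirt": 0.15, "toy": 0.10, "car": 0.05}
print(apply_temperature(probs, 0.2))  # sharpened: "apple" dominates
print(apply_temperature(probs, 1.5))  # flattened: rare words gain ground
```

Low temperature exaggerates the gap between likely and unlikely tokens; high temperature shrinks it, which is exactly why outputs get safer or quirkier.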
Memory (or Lack Thereof)
Most AIs don’t actually “remember” past conversations unless memory is explicitly enabled. They only see the text you provide in the prompt (the context window).
That’s why:
- Short prompts = shallow memory.
- Long prompts (few-shot, N-shot) = more context, but risk hitting token limits.
Because AI is probability-driven, your prompt acts like the steering wheel. The clearer and more structured your instructions, the better the model’s predictions align with your goal.
- Zero-shot = “just go.”
- Few-shot = “here’s how I’d like you to drive.”
- N-shot = “here’s the entire map.”
- Temperature = “drive straight” vs. “take scenic detours.”
Zero-Shot Prompting: Asking Cold
Imagine walking into a restaurant and simply telling the chef, “Make dinner.” You haven’t provided a menu, cuisine preference, or any examples of what you like. That’s essentially zero-shot prompting – asking the AI to complete a task without any prior examples or demonstrations.
Zero-Shot Definition: Asking the AI to complete a task based solely on its pre-trained knowledge, without any specific examples provided in the prompt itself.
Example:
Prompt: “Explain the concept of opportunity cost in simple terms”
Output: “Opportunity cost is what you must give up to acquire something else; it’s the cost of the next best alternative you forego. For instance, if you spend an hour studying, the opportunity cost is the hour of TV you could have watched. Economists use it to measure the real cost of a choice, as resources (like time and money) are scarce.”
When to use: Zero-shot is perfect for general tasks where the AI’s vast existing knowledge is sufficient, and clarity is more important than specific style or format replication.
Few-Shot Prompting: Teaching by Example
Now, imagine you want your friend to write a text message for you. Instead of just saying “write a text,” you show them a few examples of your typical texting style – short, witty, and with a specific emoji usage. This is few-shot prompting, guiding the AI with 1 to 5 examples of the desired input-output format.
Definition: Providing the AI with a small number (typically 1-5) of input-output examples within the prompt to teach it the desired task, pattern, or style.
Example:
Prompt:
Q: What is the capital of France?
A: Paris
Q: What is the capital of Spain?
A: Madrid
Q: What is the capital of Italy?
A:
Output: Rome
When to use: Few-shot prompting excels in tasks requiring classification, pattern-matching, or when you need the AI to replicate a specific style or format. It significantly improves accuracy and consistency compared to zero-shot for these types of tasks.
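In practice, assembling a few-shot prompt is mostly string formatting. A sketch using the Q:/A: layout from the example above – the trailing “A:” invites the model to complete the pattern:

```python
# Example pairs taken from the capital-city example above.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Spain?", "Madrid"),
]

def build_few_shot_prompt(examples, question):
    """Interleave Q/A example pairs, then pose the new question."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # the model completes the text from here
    return "\n".join(lines)

print(build_few_shot_prompt(examples, "What is the capital of Italy?"))
```

Any consistent labeling scheme works; what matters is that the examples and the final question share the same shape.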
Let’s compare zero-shot vs few-shot in action:
| | Zero-Shot Prompting | Few-Shot Prompting |
|---|---|---|
| Examples | None | 1-5 examples |
| Guidance | Relies solely on AI’s pre-trained knowledge | Guides AI with specific input-output patterns |
| Typical Use | General summaries, simple questions, basic information | Classification, style replication, structured data extraction |
| Output Style | General, often less specific to desired format | More aligned with provided examples, consistent format |
N-Shot Prompting: Scaling It Up
If few-shot is showing someone a couple of examples, N-shot prompting is like giving the AI a comprehensive study guide before the test. It’s providing a significant number of examples (N) within the prompt to extensively demonstrate the desired task, pattern, or style.
Example: Giving the AI 10 examples of short motivational quotes, then asking it to generate 5 more in the same style and tone.
Tradeoff: More context and examples generally lead to better, more accurate, and more consistent results. However, every word in your prompt consumes “tokens,” and models have a limited token window. Longer prompts with many examples can quickly eat into this limit, leaving less room for the AI’s response or making the prompt impractical.
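A quick way to reason about that tradeoff is a rough token-budget check. The 4-characters-per-token rule of thumb and the 4,096-token window below are illustrative assumptions – real tokenizers and window sizes vary by model:

```python
def estimate_tokens(text):
    """Very rough heuristic: ~4 characters per token (varies by model)."""
    return max(1, len(text) // 4)

def fits_in_window(prompt, context_window=4096, reserved_for_reply=512):
    """Does the prompt leave enough room in the window for a response?"""
    return estimate_tokens(prompt) <= context_window - reserved_for_reply

short_prompt = "Summarize this paragraph."
huge_prompt = "example\n" * 5000  # an N-shot prompt with far too many examples
print(fits_in_window(short_prompt))  # True
print(fits_in_window(huge_prompt))   # False
```

The key design point: every example you add spends budget that could have gone to the response, so N should be as large as helps and no larger.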
Putting It Together – A Mental Model for AI Thinking
Here’s a simplified flow of what happens when you ask AI a question:
- Tokenize: Break text into pieces.
- Embed: Map pieces into numerical space.
- Attend: Decide which tokens are most important.
- Layer: Process through multiple neural network layers.
- Predict: Assign probabilities to the next token.
- Sample: Pick the next token (based on temperature/randomness).
- Repeat: Continue until the output is complete.
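The whole loop can be sketched with a tiny hand-written bigram table standing in for the neural network – steps 5 through 7 (predict, sample, repeat) are all visible:

```python
import random

# Hypothetical next-token distributions; a real model computes these
# with its many layers instead of a lookup table.
bigram_probs = {
    "I":     {"went": 1.0},
    "went":  {"to": 1.0},
    "to":    {"the": 1.0},
    "the":   {"store": 0.7, "park": 0.3},
    "store": {"<end>": 1.0},
    "park":  {"<end>": 1.0},
}

def generate(start, max_tokens=10):
    tokens = [start]
    for _ in range(max_tokens):
        dist = bigram_probs.get(tokens[-1])
        if dist is None:
            break
        # Predict: a probability for each candidate. Sample: weighted choice.
        next_token = random.choices(list(dist), weights=list(dist.values()))[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)  # Repeat until the sequence ends.
    return " ".join(tokens)

print(generate("I"))  # "I went to the store" or "I went to the park"
```

Swap the lookup table for a trained network with billions of parameters and you have the essence of how an LLM writes.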
From Black Box to Collaborator
AI doesn’t think like us – it doesn’t have beliefs, intentions, or emotions. What it does have is a powerful ability to spot patterns, weigh context, and generate predictions with remarkable fluency.
The more you understand how AI “thinks,” the more you can guide it. Instead of being a mysterious black box, it becomes a tool you can shape – whether you want precise answers, creative sparks, or something in between.
This is only the surface. In the next layer, we’ll look at how system prompts, chain-of-thought, and multi-step reasoning push AI even closer to human-like problem solving.