Projects / Motive
What is Motive?
This is an AI-driven videogame in which the player must solve a crime scene. To do so, they need to uncover four elements: the killer’s profession, the place of the crime, the cause of death, and the murder weapon. The mechanic is simple but challenging: the player can ask any kind of question, but the AI can only answer “yes” or “no”.

How did the idea come about?
In 2022, when I first encountered ChatGPT and began experimenting with it by asking for poems and summaries, a question came to mind: could the chat be a good host for a guessing game in the style of Clue? More specifically: would an AI be capable of “thinking” about a crime scene and, without revealing it, answering my questions with “yes” or “no” until the mystery was solved?
Although today the answer may seem obvious, back then the technology was new and we were still exploring its limits. But yes: the experiment went terribly. For the AI, it was essential that the exact word be explicitly written in its context; when it wasn’t, it began to hallucinate, contradict itself, or reveal the supposed mystery after my very first question. On top of that, it quickly broke the rules and started responding with phrases that were not strictly “yes” or “no”.
After several prompt adjustments, I managed to keep the behavior consistent during the first few questions… but, over time, it always ended up drifting. A chat, with its cumulative conversational history, is simply not a suitable format for this type of game.
Some time later, when I launched whatMLmodel —an application with an integrated LLM— I realized that, for the first time, I had the necessary tools to check whether that guessing game was actually plausible. I already knew how to work with the API calls; now I wanted to see if, by keeping a well-defined context and constraining the AI’s answers, it was possible to play in a reasonably satisfying way at solving a crime scene.

How would it work?
The idea was for the player to face multiple cases and, in each one, have to uncover the same four clues: profession, place, cause, and weapon. At first, I considered generating the cases in real time using the AI, but I quickly realized that this core aspect of the game required human intervention. Even so, I relied on ChatGPT with a simple instruction: to create cases where the four items were well connected, included an unexpected twist, and had a reasonable level of difficulty.
After several iterations, the results were things like “a personal trainer who poisons their victim with a protein shake in a gym”. Very clever ideas, yes, but sometimes they lacked spark or broke the coherence between elements. For that reason, I refined around sixty cases that would be chosen randomly at the start of each game.
The mysteries would remain hidden from the player and would be used to build a prompt sent to the AI with the following information:
- The item to guess and its solution (for example, “Killer’s profession: Engineer”).
- The player’s question about that item (for example, “Does it require advanced studies?”).
- The other three items, also with their solutions, to enrich the context.
The model would then return true or false depending on whether the player’s question was on the right track or not. The main challenge at this stage was getting the AI to understand the concept of “guessing”. In early tests, false positives and negatives appeared far too often, and although extending the prompt seemed like a way to reduce ambiguity, it turned out to be a double-edged sword: because the application needed to support certain features, it had to perform additional queries —within the same call to optimize resources— and this compound request, if not short and concise, led to many hallucinations.
Ultimately, I needed the AI to translate each player question into a clear set of state keys:
- isGuessed: a boolean indicating whether the question matched the correct answer exactly (for example: “Is he an engineer?” → true).
- isAFact: a boolean indicating whether the question reveals an important fact about the answer, even if it doesn’t guess it exactly (“Does he work with blueprints or designs?” → true).
- factSentence: if isAFact is true, this contains the sentence derived from that fact (“This profession typically involves working with blueprints and designs on a regular basis.”).
- isPositive: a boolean that simply indicates whether the player’s question is positive for the answer, even if both isGuessed and isAFact are negative (“Does it require advanced studies?” → true).
- isContinuation: a boolean indicating whether the current question is linked to previous ones (I’ll explain the reason for this item right away).
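The response shape above can be sketched as a small TypeScript type, together with a validator. The field names come from the list above; the type itself and the validation logic are illustrative assumptions, not Motive's actual code (validating is useful in practice because LLM output can drift from the requested schema):

```typescript
// Hypothetical shape of the JSON object the model is asked to return.
interface GuessEvaluation {
  isGuessed: boolean;      // exact match with the hidden answer
  isAFact: boolean;        // reveals an important fact without guessing it
  factSentence?: string;   // only present when isAFact is true
  isPositive: boolean;     // the question is "on the right track"
  isContinuation: boolean; // the question chains onto the previous one
}

// Parse the raw model output and reject anything off-schema.
function parseEvaluation(raw: string): GuessEvaluation | null {
  try {
    const obj = JSON.parse(raw);
    const bools = ["isGuessed", "isAFact", "isPositive", "isContinuation"];
    if (!bools.every((k) => typeof obj[k] === "boolean")) return null;
    if (obj.isAFact && typeof obj.factSentence !== "string") return null;
    return obj as GuessEvaluation;
  } catch {
    return null; // model returned something that is not valid JSON
  }
}
```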
Later on I’ll show which prompt worked to achieve these responses, but what really optimized the results was making it dynamic: it wasn’t just about inserting information into a string, but about having that string vary depending on the item being guessed. This allowed me to define specific rules and examples for profession, place, cause, and weapon.
Another key point was keeping the AI focused and fast, which required the smallest possible context. The dynamic prompt would be part of that context, but was it worth resending the entire history of questions and answers on every call? My intuition said no; however, not doing so prevented the game from behaving like a real chat: questions such as “Does it require advanced studies?” followed by “Very advanced?” could not be chained.
That is the reason for the isContinuation key. Each request includes the current question and the immediately previous one; if the latter is marked with isContinuation equal to true, the one before it is also included, and so on until one is marked false. This made it possible to preserve conversational flow without overloading the context.
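This backward walk can be sketched as follows, assuming the history stores each previous question together with the isContinuation flag the model assigned to it (the function name and data shape are illustrative):

```typescript
interface AskedQuestion {
  text: string;
  isContinuation: boolean; // as classified by the model on that turn
}

// Always include the immediately previous question; keep extending the
// window backwards while questions are flagged as continuations.
function contextWindow(history: AskedQuestion[]): AskedQuestion[] {
  if (history.length === 0) return [];
  let start = history.length - 1;
  while (start > 0 && history[start].isContinuation) start--;
  return history.slice(start);
}
```

For "Does it require advanced studies?" followed by "Very advanced?", the second question is flagged as a continuation, so both travel with the next request while the rest of the history stays out of the context.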
The game engine
My first intention was to use webLLM, an engine that runs generative models directly in the browser —locally, without API calls—. However, several issues quickly surfaced: reliance on WebGPU (poorly supported on mobile devices), long download times, low performance even for simple questions, and unreliable models that produced false positives and negatives far too often. Because of this, I chose the same strategy I had used in whatMLmodel: calling the Gemini API, whose free tier was more than sufficient.
On the frontend, the logic is centralized in an orchestrator service that captures the player’s question, packages it together with the current game state (items and their answers), and sends everything to the server. The backend receives this request, selects the appropriate prompt based on the item being guessed, injects the data, and makes the call to Gemini. There, I implemented a model fallback strategy to ensure stability: if the lighter and faster model fails, the application automatically scales up to a more powerful —though slower— one without the player noticing. Additionally, I designed a key-rotation mechanism to further extend the free-tier margin: four API keys managed through a counter in Firebase.
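The fallback and key-rotation strategy could be sketched like this. The model names, the call signature, and the in-memory counter are all assumptions for illustration; as noted above, the real counter lives in Firebase:

```typescript
// Abstracted model call; the real implementation hits the Gemini API.
type ModelCall = (model: string, apiKey: string, prompt: string) => Promise<string>;

const API_KEYS = ["key-a", "key-b", "key-c", "key-d"]; // placeholders
let keyCounter = 0; // in the real app this counter is stored in Firebase

// Rotate through the available keys to stretch the free-tier quota.
function nextKey(): string {
  return API_KEYS[keyCounter++ % API_KEYS.length];
}

// Try the light, fast model first; on failure, retry once with the
// heavier one so the player never notices the switch.
async function askWithFallback(
  call: ModelCall,
  prompt: string,
  light = "fast-model",
  heavy = "powerful-model",
): Promise<string> {
  try {
    return await call(light, nextKey(), prompt);
  } catch {
    return await call(heavy, nextKey(), prompt);
  }
}
```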
But the true core of everything was the prompts: for each category, I crafted instructions that guide the model to compare the player’s question with the secret solution and respond exclusively in a predefined JSON format. This way, the AI can distinguish between an exact guess (isGuessed), a relevant fact (isAFact), a positive affirmation (isPositive), and a direct continuation of questions (isContinuation). Here is the fragment of the prompt corresponding to the killer’s profession:
The art
From the very beginning, I knew Motive had to be a pixel art game. It was a personal preference, but also an opportunity: since it didn’t require a large number of illustrations, I could take the time to experiment with this style for the first time.
I ran into several design challenges, but the real hurdle came during development: the most recurring element —in buttons, containers, and so on— was a seemingly simple pixelated box that couldn’t be reproduced using native CSS options. I therefore had to design a component capable of adapting in both height and width, rendering multiple divs internally to simulate a pixelated border.
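The core of that component can be sketched as a pure function: instead of relying on CSS borders, each edge is built from small square divs, with the corner squares stepped inward to fake pixelated rounding. The sizes, the one-unit step, and the function name are all invented for illustration:

```typescript
interface PixelSquare { x: number; y: number } // positions in pixel units

// Squares forming the top edge of a box `w` units wide; the two corner
// squares drop one unit down, producing the stepped "pixel art" corner.
// The same idea, mirrored, handles the other three edges.
function topEdge(w: number): PixelSquare[] {
  const squares: PixelSquare[] = [];
  for (let x = 0; x < w; x++) {
    const isCorner = x === 0 || x === w - 1;
    squares.push({ x, y: isCorner ? 1 : 0 });
  }
  return squares;
}
```

Because the edge is computed from the box's current width (and height, for the vertical edges), the component can stretch to any size while keeping the border crisp.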
Typography selection was more straightforward: I needed a minimalist font for body text and a more decorative one for the logo. For the text, I chose VP Pixel Simplified (manually extending it to support Spanish special characters), and for the logo, Alagard, which fit the game’s name perfectly.
Still, Motive needed a few illustrations to truly come to life. The idea was to always keep a detective avatar visible so the chat would feel like an actual conversation. After exploring designs generated with image models, I found one I really liked and manually recreated it in a 26×26 pixel layout. That’s how Motiveman was born: a character that reacts to the player’s actions through a small sprite system, thinking, nodding, or shaking his head depending on the game state. This design also defined the color palette. With sound effects and mystery music added on top, the experience stopped feeling like a simple interface and became a game with its own identity.
The result
Menu navigation is based on an array that works as a breadcrumb: we have a sequence of keys —for example, home/settings/createAccount— and for each one a layout is rendered. Clicking an option adds an item to the breadcrumb and animates it with a slide to the left. Clicking “Back” does the opposite: first a translation to the right, then the item is removed. The result is a versatile system that allowed me to chain as many screens as needed.
When starting a game, an introduction appears and, if it’s the first time playing, a tutorial guides the player through the mechanics. I justified the “yes” and “no” mechanic narratively by having the player interrogate a witness in a state of shock, who can only nod or shake their head.
At the top, the player selects the item to investigate, and below it the chat is displayed. Besides “yes” and “no”, there is an “hm…” response for ambiguous questions. To avoid frustration, a “You need to be more specific” message appears if this response shows up too frequently.
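The anti-frustration check could look something like this; the window size and threshold are assumptions, since the article only says the nudge appears when ambiguous answers show up "too frequently":

```typescript
// Returns true when too many of the recent answers were the ambiguous
// "hm…", signaling that the player should rephrase more specifically.
function needsSpecificityNudge(recentAnswers: string[], threshold = 3): boolean {
  const recent = recentAnswers.slice(-5); // look at the last five answers
  return recent.filter((a) => a === "hm…").length >= threshold;
}
```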
When a new fact is discovered, it is added to a Facts Panel. This mechanic emerged late in development but ended up being key to guiding the player. If an explicit hint is still needed, one can be requested via a green icon that adds a sentence —also generated by AI— to the same list.
Tracking false positives and negatives was essential to improving the AI’s reasoning later on. For this, I designed a reporting system that allows players to describe issues and flag conflicting answers. The goal is to build a solid dataset that can later be used for LLM output improvement.
When the player believes they have identified the answer, they can validate it by naming it explicitly. If correct, the mystery is revealed and highlighted in green. For this moment, I implemented a dedicated avatar animation, giving a thumbs-up in approval.
Toward the end of development, an additional idea emerged: a fifth mystery that gave meaning to the game’s name. The objective then becomes identifying the motive for the murder. To avoid extending the experience too much, I limited this stage to a reduced number of questions. The detective justifies it by explaining that the witness needs to rest, but that their lawyer agrees to grant five more questions. At that point, the music changes and the game takes on a more tense and exciting tone.
At the end of the game, the score is calculated based on factors such as playtime and the number of hints requested. Some factors have a positive impact, while others affect the score negatively. If the user is not registered, the game invites them to create an account to join the player leaderboard.
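As a rough sketch of such a score function: playtime and hints are the negative factors named above, while the base value, the weights, and the facts-found bonus are invented for illustration:

```typescript
interface GameStats {
  secondsPlayed: number;
  hintsRequested: number;
  factsFound: number; // hypothetical positive factor
}

// Start from a base score, reward discovered facts, and penalize long
// playtime and requested hints; never drop below zero.
function finalScore(s: GameStats): number {
  const base = 1000;
  const score =
    base + s.factsFound * 25 - s.secondsPlayed * 0.5 - s.hintsRequested * 50;
  return Math.max(0, Math.round(score));
}
```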
