Advanced Prompting Strategies for Digital Assistant Development

Kelly Roussel
Hoomano
Feb 12, 2024

Mojodex is an open-source digital assistant platform designed to help users accomplish specific tasks in their job. Each task is a discrete piece of work, inspired by the JRC-Eurofound Tasks Framework and O*NET OnLine, representing various skills and specializations required to accomplish it.

https://joint-research-centre.ec.europa.eu/scientific-activities-z/employment/job-tasks-and-work-organisation_en

Task concept implementation in Mojodex

🎯 We built this “task-centric” assistant to provide the best help to the user, tailored to their needs. Technically, a task is described as a JSON configuration file containing all the information the assistant needs. This way, any expert can create a highly specific task, and the assistant will be able to guide the user through the process of accomplishing it in a conversational, seamless way.
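To make this concrete, here is a minimal sketch of what such a task configuration could look like. The field names mirror the template variables used in the prompt later in this article (`name_for_system`, `infos_to_extract`, etc.); the values themselves are purely illustrative, not an actual Mojodex task.

```python
import json

# Hypothetical task configuration. Field names follow the template
# variables shown later in this article; values are illustrative.
task_config = {
    "name_for_system": "prepare_linkedin_post",
    "definition_for_system": "Help the user draft a LinkedIn post on a given topic.",
    "infos_to_extract": [
        {"info_name": "topic", "description": "Subject of the post"},
        {"info_name": "tone", "description": "Desired tone of voice"},
    ],
    "output_format_instruction_title": "Short title of the post",
    "output_format_instruction_draft": "Full text of the LinkedIn post",
    "final_instruction": "Write an engaging LinkedIn post using the collected information.",
}

# Because the configuration is plain JSON, it can be validated,
# versioned, and shared independently of the assistant's code.
print(json.dumps(task_config, indent=2))
```

Any expert can write such a file without touching the assistant's code: the prompt template does the rest.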

How it works

✨ How is it possible that anyone can create any specific task and the assistant remains consistent in the task resolution process?

Here comes the core prompt for this feature of the agent.

Basically, this prompt is a function of the context that, along with the conversation history, produces several pieces of information used by the agent. Among these, there is always an assistant message that is displayed to the user in the chat interface.

Operation of general prompt tuned for a specific task

Here’s a link to the prompt:

We will break down how it works in this article.

High level picture

This prompt is a 50-line Jinja2 template, rendered at task runtime with the selected task’s configuration. It also contains some user data, global context, and instructions that help make the task successful.

The prompt is sent in the LLM call along with the task’s conversation history, if any.

The prompt is invoked each time the user sends a message or requests an action through the UI, regardless of the task’s progress. This necessitates a highly flexible prompt, which can significantly streamline the code.

Here is an example of the “Prepare LinkedIn Post” task runtime on the Mojodex mobile app. The user’s interactions with the assistant are accumulated in the messages list, yet the prompt’s template stays consistent throughout the entire interaction.

Running “Prepare LinkedIn Post” on Mojodex mobile application

👉 Let’s break this prompt down.

Note: As mentioned above, the prompt is a Jinja2 template: any value between double curly braces will be replaced by the corresponding value at run time. The curly braces with the percentage sign are used to include control structures like if, for, etc.
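The note above can be illustrated with a few lines of Python using the `jinja2` library directly. The variable names match those in the prompt; the rendered values are made up for the example. Note how the `{%if%}` block simply disappears when no company knowledge was provided.

```python
from jinja2 import Template

# A tiny excerpt of the prompt template: {{ ... }} is substituted at
# render time, {% ... %} drives control flow.
template = Template(
    "USER NAME\n{{username}}\n"
    "{%if user_company_knowledge%}USER'S COMPANY KNOWLEDGE\n"
    "{{user_company_knowledge}}{%endif%}"
)

# The conditional block only appears when the value was provided.
with_knowledge = template.render(
    username="Kelly",
    user_company_knowledge="Hoomano builds digital assistants.",
)
without_knowledge = template.render(username="Kelly", user_company_knowledge=None)

print(with_knowledge)
print("---")
print(without_knowledge)
```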

Context

YOUR CONTEXT
{{mojo_knowledge}}

GLOBAL CONTEXT
{{global_context}}

USER NAME
{{username}}

{%if user_company_knowledge%}USER'S COMPANY KNOWLEDGE
{{user_company_knowledge}}{%endif%}

The first lines of the prompt give the assistant the context of the conversation:

- YOUR CONTEXT is the assistant’s own context. Giving it its name helps make it personal in the conversation. It is now well-known that providing an LLM call with a role is good practice: it gives the model intrinsic goals and makes it more consistent.

- GLOBAL CONTEXT is made of the date, including the day of the week, and the time in the user’s location, so that the assistant won’t hallucinate them.

- USER NAME and USER’S COMPANY KNOWLEDGE (if provided during the onboarding process) personalize the assistance: the assistant can address the user by name and tailor the content of task results (e.g., email signatures).
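A minimal sketch of how the GLOBAL CONTEXT string could be built, assuming the user's timezone is known. The function name and wording are illustrative, not Mojodex's actual implementation; the point is that date, weekday, and local time are computed by code, never left for the model to guess.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def build_global_context(user_timezone: str) -> str:
    """Hypothetical helper: format the current date, weekday and local
    time for the prompt, so the model never has to hallucinate them."""
    now = datetime.now(ZoneInfo(user_timezone))
    return now.strftime("Today is %A, %B %d, %Y. Local time: %H:%M.")


print(build_global_context("Europe/Paris"))
```

The resulting string is injected as `{{global_context}}` at render time.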

Task

TASK TO ACCOMPLISH
{{task.name_for_system}}: {{task.definition_for_system}}

{%if user_task_inputs%}USER INPUTS
{{ user_task_inputs | tojson(indent=4) }}{%endif%}

PRIMARY INFO NEEDED TO COMPLETE THE TASK
{{ task.infos_to_extract | tojson(indent=4) }}

TASK OUTPUT FORMAT
{{ title_start_tag }}{{task.output_format_instruction_title}}{{ title_end_tag }}
{{ draft_start_tag }}{{task.output_format_instruction_draft}}{{ draft_end_tag }}

TASK INSTRUCTION
{{task.final_instruction}}

{%if task_tool_associations%}TOOLS TO USE
{%for task_tool_association in task_tool_associations %}{{task_tool_association.tool_name}}: {{task_tool_association.usage_description}}
{%endfor%}{%endif%}

The following lines get straight to the heart of the matter: the task to run.

Here comes the configuration of the task as defined by the expert. It contains:

- Context of the task so that the assistant knows exactly what the user wants to accomplish: name, definition, formats…

- User’s inputs (if the user used the webapp form) so that the assistant knows what information the user has already provided and what is still missing.

- Instructions for the task: the final instruction, plus any tools the assistant may need to use to accomplish it.
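The `tojson` filter seen in the template above is a Jinja2 built-in that serializes structured data straight into the prompt. A small sketch, with an invented user input for illustration:

```python
from jinja2 import Environment

env = Environment()
# Same pattern as the USER INPUTS section of the prompt: structured data
# is embedded in the prompt via Jinja2's built-in `tojson` filter.
template = env.from_string(
    "USER INPUTS\n{{ user_task_inputs | tojson(indent=4) }}"
)

rendered = template.render(
    user_task_inputs=[{"input_name": "topic", "value": "open-source AI"}]
)
print(rendered)
```

This keeps the prompt readable for the model while preserving the structure of the data the user already provided.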

UI-specific instructions

{%if audio_message%}The user's messages are transcriptions of audio messages.{%endif%}

Using the mobile app, the user mainly interacts with Mojodex by voice.

So we specify whether the user’s messages are transcriptions of audio messages, giving the assistant the context and flexibility to handle possible misspellings or transcription errors.

Output format instruction

{%if not produced_text_done%}Answer in following format:
{%if language is none%}<user_language><2 letters language code></user_language>{%endif%}
{%for info in task.infos_to_extract%}<missing_{{info.info_name}}><yes/no></missing_{{info.info_name}}>
{%endfor%}<missing_primary_info><yes/no></missing_primary_info>
{if <missing_primary_info> == yes}
<ask_user_primary_info><question to ask the user to collect missing info you need to complete the task></ask_user_primary_info>{endif}
{if <missing_primary_info> == no}{%if task_tool_associations%}<need_tool><yes/no></need_tool>
{if <need_tool> == yes}
<tool_to_use><tool name></tool_to_use>
<tool_usage><Brief explanation of how you intend to use the tool for the user and ask if they agree.></tool_usage>{else}{%endif%}
<execution><Task delivery in TASK OUTPUT FORMAT within those tags.
{%if tag_proper_nouns%}Tag EVERY delivery's proper noun with '*': *PROPER NOUN*. Also tag EVERY delivery's word that may have been misunderstood by the transcription system and could be a proper noun with the same '*' tag.{%endif%}No talk, only delivery.></execution>{%if task_tool_associations%}{%endif%}{endif}{endif}
{%else%}Always use <execution> and </execution> tags when you write a task delivery or edition of a task delivery.{%endif%}

This part is arguably the most important of the prompt, as it defines the output format in which the LLM should respond. This is crucial because all response processing depends on correct formatting.

Curly braces with the percentage sign are Jinja2 control structures: they adapt the prompt at runtime depending on the task’s progress. For example, if a first result (produced_text) has already been drafted, the task is in its edition state, and the assistant no longer needs to check that it has all the information required to complete the task.

This part of the prompt also contains curly braces without the percentage sign. This is pseudo-code provided to the assistant: a way to embed a rules engine and ensure task resolution always follows the same process. Here is that process:

1. The task configuration contains a list of information the assistant absolutely has to collect before starting to draft the task result. For each piece of information, the LLM has to state whether or not the user has provided it at this point. Requiring the LLM to write this down for each one ensures it won’t skip any.
2. Having gone through each piece of information one by one, the assistant can easily conclude whether it is still missing information needed to start the task.
3. First rule: if some information is missing, “write a message to ask the user about it”. Otherwise, if any tools have been provided in the task configuration, “do you need to use any tool?”
4. If a tool is needed, the assistant provides specifications for using it and an informative message for the user.
5. Finally, when all data has been collected and no tool is needed, a first draft can be generated.
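The five steps above imply that the backend must read the tags the model emits and decide what happens next. Here is a hypothetical post-processing sketch of that decision logic (function and tag handling are illustrative; only the tag names come from the prompt):

```python
import re


def next_step(llm_answer: str, has_tools: bool) -> str:
    """Hypothetical sketch of the five-step process: inspect the tags
    emitted by the model and route to the next action."""

    def tag(name: str):
        # Extract the content of <name>...</name> if present.
        m = re.search(rf"<{name}>(.*?)</{name}>", llm_answer, re.S)
        return m.group(1).strip() if m else None

    if tag("missing_primary_info") == "yes":
        return "ask_user"       # step 3: question the user
    if has_tools and tag("need_tool") == "yes":
        return "run_tool"       # step 4: use the requested tool
    return "deliver_draft"      # step 5: produce the first draft


print(next_step("<missing_primary_info>yes</missing_primary_info>", False))
# -> ask_user
```

Because the model follows the pseudo-code rules itself, this routing logic stays tiny compared to a full code-side rules engine.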

Before GPT-4 was released, we managed those rules by splitting the prompt into different LLM calls, processing the results with a code-side rules engine, and routing to the next call depending on each result. This led to poor response times for the user and a lot of complexity in the code.

🚀 GPT-4 was a game changer: we found it was able to follow in-prompt pseudo-code rules with enough consistency to manage the process by itself.

Last but not least, this part of the prompt contains “tags” that look like HTML tags. These serve two purposes:

  • Referencing previously generated tokens within the same call from the pseudo-code.
  • Splitting the response into different parts to be processed by different code components, while streaming the parts intended for the user and removing on the fly the tags that should not be seen (unlike JSON, which can’t be validated until the final token is generated).
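This streaming advantage of tags over JSON can be sketched as follows: a generator that consumes raw token chunks and yields only the text between `<execution>` and `</execution>`, stripping the tags on the fly even when a tag is split across chunks. The implementation is illustrative, not Mojodex's actual parser; only the tag name comes from the prompt.

```python
def stream_user_visible(chunks, start_tag="<execution>", end_tag="</execution>"):
    """Hypothetical streaming sketch: yield only the text between the
    start and end tags, removing the tags themselves on the fly."""
    buffer, inside = "", False
    for chunk in chunks:
        buffer += chunk
        if not inside:
            idx = buffer.find(start_tag)
            if idx == -1:
                # Keep a small tail in case the tag is split across chunks.
                buffer = buffer[-len(start_tag):]
                continue
            buffer, inside = buffer[idx + len(start_tag):], True
        if inside:
            idx = buffer.find(end_tag)
            if idx != -1:
                yield buffer[:idx]
                return
            # Only emit text that cannot be the start of the end tag.
            safe = len(buffer) - len(end_tag)
            if safe > 0:
                yield buffer[:safe]
                buffer = buffer[safe:]


tokens = ["<exec", "ution>Hello ", "world</exec", "ution>"]
print("".join(stream_user_visible(tokens)))
# -> Hello world
```

With JSON output, nothing could be shown to the user before the closing brace of the whole object; with tags, text flows to the chat interface as soon as it is generated.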

📌 Here is a flowchart representation of this part of the main prompt run to help you understand the output format. Purple rules are Jinja’s and green are embedded as pseudo-code.

Output format of the prompt as defined by Jinja2 rules (purple), pseudo-code embedded in the prompt (green) and tags (yellow)

Final instructions

{%if language is none%}Speak same language as the user.{%else%}Use language {{language}}.{%endif%}
Ensure you include all tags required by answer format.
No talk, just follow answer format. Remember to use required tags.

This part contains the last instructions given to the assistant: speak the user’s language, respect the answer format, and the now good old “No talk” to avoid any unwanted chitchat and get straight to the point.

Conclusion

In conclusion, Mojodex offers a fascinating glimpse into a new paradigm of code interaction, blending complexity with elegance. Its advanced prompt generation, powered by Jinja2 templates and pseudo-code integration, represents a significant leap forward in code-driven assistance.

Dive into Mojodex’s open-source project on GitHub to explore this innovative approach and join the community in reshaping the landscape of code interaction. We love ️🌟 on GitHub ❤️.
