Disclaimer #1: The insights shared in this article are based on experiments conducted with Semantic Kernel (version < 1.0). While LangChain exhibits similar challenges, I have not examined its workings in depth.
Disclaimer #2: The crux of this discussion revolves around devising a plan, rather than its execution.
Disclaimer #3: Concepts such as CoT (chain of thought), ToT (tree of thoughts), and GoT (graph of thoughts) are outside the scope of this discussion.
The Sequential Planner Flow
An overview of the current implementation of the sequential planner.
Inputs:
- A goal expressed as a string.
- A list of available functions that the planner can interpret (in SK, these are plugins).
Steps:
- Extract a short list of functions that may be relevant to the goal.
- Formulate a prompt containing:
  - Descriptions of the selected functions.
  - Instructions for crafting a plan.
  - The specified goal.
- Send the prompt to the LLM to obtain the plan.
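The flow above can be sketched in a few lines of Python. This is a minimal illustration, not Semantic Kernel's actual code: `FunctionSpec`, `select_relevant`, `build_prompt`, and `create_plan` are hypothetical names, the keyword-overlap filter is only a stand-in for whatever relevance filter the planner really uses, and `llm` is assumed to be any callable that maps a prompt string to a completion string.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class FunctionSpec:
    """Hypothetical description of one available function (not an SK type)."""
    name: str
    description: str


def _words(text: str) -> set:
    # Lowercase alphabetic tokens; punctuation and digits are ignored.
    return set(re.findall(r"[a-z]+", text.lower()))


def select_relevant(goal: str, functions: List[FunctionSpec], limit: int = 5) -> List[FunctionSpec]:
    """Stand-in relevance filter: keep functions sharing words with the goal."""
    goal_words = _words(goal)

    def overlap(spec: FunctionSpec) -> int:
        return len(goal_words & _words(spec.name + " " + spec.description))

    ranked = sorted(functions, key=overlap, reverse=True)
    return [spec for spec in ranked if overlap(spec) > 0][:limit]


def build_prompt(goal: str, functions: List[FunctionSpec]) -> str:
    """Combine function descriptions, planning instructions, and the goal."""
    lines = ["Available functions:"]
    lines += [f"- {spec.name}: {spec.description}" for spec in functions]
    lines += [
        "",
        "Create a step-by-step plan that uses only the functions above.",
        "",
        f"Goal: {goal}",
    ]
    return "\n".join(lines)


def create_plan(goal: str, functions: List[FunctionSpec], llm: Callable[[str], str]) -> str:
    """Shortlist functions, build the prompt, and ask the LLM for a plan."""
    shortlist = select_relevant(goal, functions)
    return llm(build_prompt(goal, shortlist))
```

Note that nothing in this flow inspects the text the LLM returns: whatever comes back is taken to be the plan.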
Given the objective: “Summarize an input, translate it to French, and e-mail it to John Doe”, the following plan was devised:
Steps:
- SummarizePlugin.Summarize input='$INPUT' => SUMMARY
- WriterPlugin.Translate input='$SUMMARY' => TRANSLATED_SUMMARY
- email.GetEmailAddress input='John Doe' => EMAIL_ADDRESS
- email.SendEmail input='$TRANSLATED_SUMMARY' email_address='$EMAIL_ADDRESS'
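A plan in this textual form can be turned into structured steps before anything is done with it. The sketch below is an assumption about how such a parser might look, inferred only from the example plan: the `Step` class and `parse_plan` function are hypothetical, and the grammar (function name, `key='value'` arguments, optional `=> OUTPUT`) covers this example rather than any official plan format.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# One step: Function.Name key='value' ... [=> OUTPUT_VARIABLE]
STEP_RE = re.compile(
    r"^(?P<function>[\w.]+)"
    r"(?P<args>(?:\s+\w+='[^']*')*)"
    r"(?:\s*=>\s*(?P<output>\w+))?$"
)
ARG_RE = re.compile(r"(\w+)='([^']*)'")


@dataclass
class Step:
    """Hypothetical structured form of one plan step."""
    function: str
    args: Dict[str, str]
    output: Optional[str]


def parse_plan(text: str) -> List[Step]:
    """Parse a 'Steps: - ... - ...' plan into Step objects.

    Limitation of this sketch: an argument value containing ' - '
    would be split incorrectly.
    """
    body = text.strip()
    if body.startswith("Steps:"):
        body = body[len("Steps:"):]
    steps = []
    # Steps are introduced by "- ", either inline or at the start of a line.
    for chunk in re.split(r"(?:^|\s)-\s+", body):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = STEP_RE.match(chunk)
        if match is None:
            raise ValueError(f"cannot parse step: {chunk!r}")
        args = dict(ARG_RE.findall(match["args"]))
        steps.append(Step(match["function"], args, match["output"]))
    return steps
```

Raising on an unparseable step, instead of skipping it, makes a malformed plan visible to the caller.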
Limitations of the Current Approach
While this methodology suffices for rudimentary tasks with concise plans, it falters when addressing more intricate challenges. Some of the pitfalls include:
- Irrespective of the prompt instructions, the generated plan may inadvertently employ collections as variables.
- The structure of the final result is left unspecified. Adding more instructions to the prompt does not help, because nothing verifies the output.
- If a function needed to achieve the goal is missing, the calling system is never told.
- Likewise, if additional information is needed to achieve the goal, the calling system is never told.
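To make the missing verification step concrete, here is a minimal sketch of a check that could run on a generated plan before it is used. It is not part of the sequential planner: `Step` and `validate_plan` are hypothetical, and the sketch assumes a plan has already been parsed into steps with a function name, string arguments, and an optional output variable.

```python
import re
from dataclasses import dataclass, field
from typing import AbstractSet, Dict, List, Optional


@dataclass
class Step:
    """Hypothetical structured form of one plan step."""
    function: str
    args: Dict[str, str] = field(default_factory=dict)
    output: Optional[str] = None


def validate_plan(
    steps: List[Step],
    available: AbstractSet[str],
    initial_vars: AbstractSet[str] = frozenset({"INPUT"}),
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    defined = set(initial_vars)
    for index, step in enumerate(steps, start=1):
        # The plan may call a function that was never registered.
        if step.function not in available:
            problems.append(f"step {index}: unknown function '{step.function}'")
        # Every $VARIABLE must have been produced by an earlier step.
        for value in step.args.values():
            for name in re.findall(r"\$(\w+)", value):
                if name not in defined:
                    problems.append(
                        f"step {index}: variable '${name}' is used before it is set"
                    )
        if step.output:
            defined.add(step.output)
    return problems
```

A non-empty result gives the calling system something to act on, for example regenerating the plan or reporting the unknown function.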
A Glimmer of Hope
LLMs of GPT-4's caliber can devise algorithms to tackle almost any challenge (e.g., the 12 tasks delineated in the ARC report). This includes the ability to decompose a task into more manageable sub-tasks. While the resulting algorithm may not always be optimal, sampling multiple generations (with a non-zero temperature) could yield satisfactory outcomes.
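The idea of using multiple generations can be sketched as a simple best-of-n loop. The names here are hypothetical: `generate` is assumed to wrap an LLM call made at a non-zero temperature so that repeated calls differ, and `score` is assumed to be some caller-supplied quality measure, such as the number of validation checks a candidate plan passes.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def best_of_n(
    generate: Callable[[], T],
    score: Callable[[T], float],
    n: int = 5,
    threshold: float = 0.0,
) -> Optional[T]:
    """Sample n candidates and return the highest-scoring one.

    Returns None when no candidate reaches the threshold, so the
    caller can tell "no acceptable plan" apart from "a plan".
    """
    best: Optional[T] = None
    best_score = float("-inf")
    for _ in range(n):
        candidate = generate()
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best if best_score >= threshold else None
```

The loop is only as good as the scoring function, which is one more reason a verification mechanism matters.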
The aspiration is to develop a planner capable of:
- Ensuring that the final result adheres to the defined structure and meets acceptance criteria.
- Informing the invoking system in case certain information or functionality is lacking.
- Employing a top-down strategy, further dissecting tasks as needed.
- Seamlessly integrating with collections/lists and other data structures.
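Two of these aspirations, the top-down strategy and reporting what is lacking, can be illustrated together. The sketch below is one possible shape, not an existing planner: `PlanNode`, `plan_top_down`, and `collect_missing` are hypothetical names, `is_executable` is assumed to check whether an available function covers a task, and `decompose` is assumed to ask an LLM to split a task into sub-tasks (returning an empty list when it cannot).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanNode:
    """One task in the plan tree, with anything that blocks it."""
    task: str
    children: List["PlanNode"] = field(default_factory=list)
    missing: List[str] = field(default_factory=list)


def plan_top_down(
    task: str,
    is_executable: Callable[[str], bool],
    decompose: Callable[[str], List[str]],
    max_depth: int = 3,
) -> PlanNode:
    """Split a task recursively until every leaf is executable or blocked."""
    node = PlanNode(task)
    if is_executable(task):
        return node
    subtasks = decompose(task) if max_depth > 0 else []
    if not subtasks:
        # Neither executable nor decomposable: record it for the caller.
        node.missing.append(task)
        return node
    node.children = [
        plan_top_down(sub, is_executable, decompose, max_depth - 1)
        for sub in subtasks
    ]
    return node


def collect_missing(node: PlanNode) -> List[str]:
    """Gather every blocked task in the tree, in depth-first order."""
    found = list(node.missing)
    for child in node.children:
        found.extend(collect_missing(child))
    return found
```

With this shape, the calling system can inspect `collect_missing` before acting on a plan, instead of finding the gap at execution time.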