Implementing ReAct-Style Reasoning with LangChain Agents

Published: 2023-10-27

1. Introduction
2. About ReAct
2.1 What is ReAct?
2.2 Prompts for implementing ReAct
3. About Agents
4. Summary

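1. Introduction

This article explains LangChain Agents and how they work at a basic level. LangChain's Agent implements ReAct-style reasoning, so I start with an explanation of ReAct and then walk through the actual code.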

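2. About ReAct

2.1 What is ReAct?

Before explaining ReAct, let's take a moment to consider how humans think.

When people think, they weave verbal reasoning together with action. We have an innate ability to adjust ourselves, form strategies, and remember.

For example, while cooking in the kitchen you might think, "I've cut all the vegetables, so next I'll boil the water," or change your plan: "I'm out of salt, so I'll use soy sauce and pepper instead." You might also ask yourself, "What can I do right now?" and act to find the answer. By combining "acting" and "thinking" appropriately in this way, humans can learn new things quickly and make decisions and inferences in unfamiliar situations.

So what about LLMs (large language models)?

With recent technical advances, LLMs can now carry out human-like multi-step reasoning and make fairly complex decisions. One approach drawing particular attention is ReAct. ReAct is a method that combines reasoning and acting, with the aim of letting an LLM think and act the way a human does. The ReAct framework lets the model think, plan, and adjust while incorporating information from its environment, which makes it more flexible and adaptable when deciding what to do next. Inspired by the way humans integrate thought and action, ReAct is expected to help LLMs solve difficult tasks.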

2.2 Prompts for implementing ReAct

Now let's look at how the prompt template that produces ReAct-style reasoning is written.

PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""

FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
...(this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""

SUFFIX = """Begin!

Question: {input}
Thought:{agent_scratchpad}"""

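The model works through Question, Thought, Action, Action Input, Observation, and Thought, and finally outputs a Final Answer. The Thought/Action/Action Input/Observation cycle may repeat several times. This is the basic structure of LangChain's Agent prompt template, and you can build your own customized Agent on top of this format.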
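To make the template structure concrete, here is a minimal sketch of how the three pieces could be joined into one prompt string. It uses plain Python only. The ToolSpec class and the build_react_prompt function are hypothetical names introduced for illustration, and the joining order (prefix, tool list, format instructions, suffix) is an assumption about how such a prompt is typically assembled, not a copy of LangChain's code.

```python
from dataclasses import dataclass
from typing import List

PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""

FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
...(this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""

SUFFIX = """Begin!

Question: {input}
Thought:{agent_scratchpad}"""


@dataclass
class ToolSpec:
    """Hypothetical stand-in for a tool: only a name and a description."""
    name: str
    description: str


def build_react_prompt(tools: List[ToolSpec], question: str, scratchpad: str = "") -> str:
    """Assemble the full prompt text (illustrative helper, not a LangChain API)."""
    # One "name: description" line per tool, listed right after the prefix.
    tool_lines = "\n".join(f"{t.name}: {t.description}" for t in tools)
    # The allowed tool names are substituted into the Action line.
    tool_names = ", ".join(t.name for t in tools)
    format_instructions = FORMAT_INSTRUCTIONS.format(tool_names=tool_names)
    template = "\n\n".join([PREFIX, tool_lines, format_instructions, SUFFIX])
    # Fill in the question and whatever reasoning has accumulated so far.
    return template.format(input=question, agent_scratchpad=scratchpad)
```

Calling build_react_prompt with two tools and a question yields a prompt that ends in "Thought:", which is the cue for the model to begin its first reasoning step. Note that this sketch assumes tool descriptions contain no curly braces, since the assembled text is passed through str.format.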

3. About Agents

3.1 How to use an Agent

The simplest way to use a LangChain Agent is to call initialize_agent. It returns an AgentExecutor, so you can get an answer without constructing an AgentExecutor yourself.

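from langchain.agents import initialize_agent

# `tools` (a list of tools) and `LLM` (a language model) are assumed to be defined beforehand.
chat_agent = initialize_agent(
    tools,
    llm=LLM,
    agent="zero-shot-react-description",
    verbose=True,
    # Kept from the original. This agent type may not use system_message;
    # prompt text is typically customized via agent_kwargs (e.g. "prefix").
    system_message="You are a kind assistant. Please answer in Chinese!",
)

question = 'Please tell me about Vegeta'
result = chat_agent.run(question)
print(result)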

Here, the agent argument specifies the agent type. LangChain provides several standard agent types, for example ZERO_SHOT_REACT_DESCRIPTION, CONVERSATIONAL_REACT_DESCRIPTION, OPENAI_MULTI_FUNCTIONS, and others.

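3.2 Customizing an Agent

In practice, applications such as chatbots always come with detailed requirements, so some customization is usually needed. The parts you typically customize are the PromptTemplate, OutputParser, Agent, and AgentExecutor.

The following sample code shows how the custom modules are wired together.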

# This excerpt runs inside a class method, hence the references to self.
# CustomPromptTemplate
self.prompt_agent = CustomPromptTemplate(
    template=TEMPLATE_AGENT,
    tools=tools,
    input_variables=["input", "intermediate_steps", "history"]
)

# Custom Output Parser
output_parser = CustomOutputParser()

# Custom Agent
llm_chain = LLMChain(llm=self.LLM, prompt=self.prompt_agent)
tool_names = [tool.name for tool in tools]
agent = CustomAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)

# CustomAgent Executor
self.agent_executor = CustomAgentExecutor.from_agent_and_tools(
    agent=agent,
    tools=tools,
    verbose=True,
    memory=self.memory,
    handle_parsing_errors=False
)
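The body of CustomPromptTemplate is not shown in this article. As a rough illustration of what such a template usually has to do with the "input", "intermediate_steps", and "history" variables, the sketch below turns the steps taken so far into scratchpad text and fills a template with it. Everything here is an assumption made for illustration: the Action tuple is a stand-in for AgentAction, and TEMPLATE and format_agent_prompt are hypothetical names, not the author's code.

```python
from typing import List, NamedTuple, Tuple


class Action(NamedTuple):
    """Hypothetical stand-in for AgentAction."""
    tool: str
    tool_input: str
    log: str  # the raw LLM text that led to this action


# Hypothetical template; the real TEMPLATE_AGENT is not shown in the article.
TEMPLATE = """Previous conversation:
{history}

Question: {input}
{agent_scratchpad}"""


def format_agent_prompt(
    template: str,
    user_input: str,
    history: str,
    intermediate_steps: List[Tuple[Action, str]],
) -> str:
    """Render the prompt for the next LLM call from the steps taken so far."""
    scratchpad = ""
    for action, observation in intermediate_steps:
        # Replay what the model wrote, then append what the tool returned,
        # and cue the model to continue with its next Thought.
        scratchpad += action.log
        scratchpad += f"\nObservation: {observation}\nThought: "
    return template.format(
        input=user_input, history=history, agent_scratchpad=scratchpad
    )
```

With an empty list of steps the scratchpad is empty, so the first LLM call sees only the history and the question. Each later call sees the growing trace of Thought, Action, Action Input, and Observation.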

3.3 LLMSingleActionAgent

Next, let's look at LLMSingleActionAgent to understand what a custom Agent does.

class LLMSingleActionAgent(BaseSingleActionAgent):
    """Base class for single action agents."""

    llm_chain: LLMChain
    """LLMChain to use for agent."""
    output_parser: AgentOutputParser
    """Output parser to use for agent."""
    stop: List[str]
    """List of strings to stop on."""

    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any,
    ) -> Union[AgentAction, AgentFinish]:
        """Given input, decided what to do.

        Args:
            intermediate_steps: Steps the LLM has taken to date,
                along with the observations.
            callbacks: Callbacks to run.
            **kwargs: User inputs.

        Returns:
            Action specifying what tool to use.
        """
        output = self.llm_chain.run(
            intermediate_steps=intermediate_steps,
            stop=self.stop,
            callbacks=callbacks,
            **kwargs,
        )
        return self.output_parser.parse(output)

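The part to focus on is the plan method, which returns either an AgentAction or an AgentFinish. Internally, LLMSingleActionAgent calls the LLMChain, which reasons in the ReAct style and decides on the concrete action to take. The output_parser then converts the LLMChain output into an AgentAction if there is an action to perform, or an AgentFinish once the reasoning is complete.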

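This is the Agent's main role, and you are free to customize it. For example, to support multiple actions you could change plan to return several actions, or you could have the LLMChain run only under certain conditions.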
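The parsing step in plan can be illustrated with a small stand-alone parser. This is only a sketch under assumptions: the Action and Finish dataclasses are stand-ins for AgentAction and AgentFinish, parse_llm_output is a hypothetical function name, and the regular expression is one plausible way to read the Action / Action Input format from section 2.2. The author's CustomOutputParser may work differently.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class Action:
    """Hypothetical stand-in for AgentAction."""
    tool: str
    tool_input: str
    log: str


@dataclass
class Finish:
    """Hypothetical stand-in for AgentFinish."""
    output: str
    log: str


# Matches "Action: <tool>" followed on the next line by "Action Input: <input>".
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action\s+Input\s*:\s*(.*)", re.DOTALL
)


def parse_llm_output(text: str) -> Union[Action, Finish]:
    """Turn raw LLM text into either a tool call or a final answer."""
    # A final answer ends the loop, so check for it first.
    if "Final Answer:" in text:
        answer = text.split("Final Answer:")[-1].strip()
        return Finish(output=answer, log=text)
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    tool = match.group(1).strip()
    # Models often wrap the input in quotes; strip them for the tool call.
    tool_input = match.group(2).strip().strip('"')
    return Action(tool=tool, tool_input=tool_input, log=text)
```

Because generation is stopped at "\nObservation:" (the stop list passed to the agent in section 3.2), the text handed to the parser normally ends right after the Action Input line, which is why a simple pattern like this can be enough.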

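3.4 About AgentExecutor

The AgentExecutor is what actually executes the actions and obtains the answer. As described in the template example above, it repeats Thought, Action, Action Input, and Observation as many times as needed.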

class AgentExecutor(Chain):
    """Agent that is using tools."""

    def _take_next_step(
        self,
        name_to_tool_map: Dict[str, BaseTool],
        color_mapping: Dict[str, str],
        inputs: Dict[str, str],
        intermediate_steps: List[Tuple[AgentAction, str]],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Union[AgentFinish, List[Tuple[AgentAction, str]]]:
        """Take a single step in the thought-action-observation loop.

        Override this to take control of how the agent makes and acts on choices.
        """
        # (Excerpt: the code that obtains `actions`, initializes `result`, and
        # looks up `tool`, `color`, and `tool_run_kwargs` is omitted here.)
        for agent_action in actions:
            # We then call the tool on the tool input to get an observation
            observation = tool.run(
                agent_action.tool_input,
                verbose=self.verbose,
                color=color,
                callbacks=run_manager.get_child() if run_manager else None,
                **tool_run_kwargs,
            )
            result.append((agent_action, observation))
        return result

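The for agent_action in actions: line loops over the actions (in other words, AgentExecutor itself supports multiple actions), and observation = tool.run(...) is where the selected tool is actually executed as an action. You can see that this is how the Observation (the tool's result) is obtained.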
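The overall loop can be sketched without LangChain at all. The code below is a simplified, single-action illustration and not the library's implementation: Action, Finish, run_loop, and scripted_plan are hypothetical names, the iteration limit and its message are assumptions, and scripted_plan stands in for an LLM-backed agent so that the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Action:
    """Hypothetical stand-in for AgentAction."""
    tool: str
    tool_input: str


@dataclass
class Finish:
    """Hypothetical stand-in for AgentFinish."""
    output: str


Steps = List[Tuple[Action, str]]
PlanFn = Callable[[str, Steps], Union[Action, Finish]]


def run_loop(
    plan: PlanFn,
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_iterations: int = 5,
) -> str:
    """Repeat plan -> act -> observe until the agent finishes or the limit is hit."""
    steps: Steps = []
    for _ in range(max_iterations):
        decision = plan(question, steps)
        if isinstance(decision, Finish):
            return decision.output
        tool = tools.get(decision.tool)
        if tool is None:
            # Feed the mistake back as an observation so the agent can recover.
            observation = f"{decision.tool} is not a valid tool, try another one."
        else:
            observation = tool(decision.tool_input)
        steps.append((decision, observation))
    return "Agent stopped due to iteration limit."


def scripted_plan(question: str, steps: Steps) -> Union[Action, Finish]:
    """Hypothetical agent: look the question up once, then answer from the result."""
    if not steps:
        return Action(tool="lookup", tool_input=question)
    return Finish(output=f"Based on the lookup: {steps[-1][1]}")
```

For example, run_loop(scripted_plan, {"lookup": some_function}, "Who is Vegeta?") calls the lookup tool once and then returns an answer built from its observation. Replacing scripted_plan with a function that calls an LLM and parses its output gives the same division of labor described above: the agent decides, the executor acts.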

4. Summary

This article introduced the concept of ReAct and explained how the ReAct framework is implemented in LangChain Agents.