Anthropic Will Use Chain of Thought Reasoning to Improve Prompts

Anthropic has released yet another feature on Anthropic Console. The latest addition lets you improve your prompts for higher-quality outputs.

“The prompt improver allows developers to take existing prompts and leverage Claude to automatically refine them using advanced prompt engineering techniques,” said Anthropic in the announcement.

More importantly, this marks Anthropic’s foray into the world of reasoning. Anthropic has mentioned that the prompt improver uses chain-of-thought reasoning to detect problems and refine the prompt. The prompt improver will include steps to break it down systematically, and ‘think’ before responding.

In addition, the tool will check for grammatical errors and prefill with any necessary information to improve the accuracy of the output.

Anthropic also revealed a considerable improvement in the output’s accuracy. They said, “Our testing shows that the prompt improver increased accuracy by 30% for a multilabel classification test and brought word count adherence up to 100% for a summarization task.”

The test involved mapping a randomly picked sentence to a parent article, within 500 samples picked from Wikipedia.

Another test involved assessing how accurately Claude could adhere to the word limit when summarising ten articles on Wikipedia. In the latter test, Claude scored a full 100% accuracy.

Anthropic is also allowing developers to add input-output examples, which are then transformed into a ‘standardised’ XML format to help the model process it with the best clarity.
In case a developer can’t craft examples, Claude will also generate synthetic ones to emulate them. “Claude can automatically create synthetic example inputs and draft outputs for you to streamline this process,” said Anthropic.

Furthermore, Anthropic has also introduced a ‘prompt evaluator’ that allows developers to benchmark and grade prompts on a five-point scale. Anthropic is also enabling developers to provide feedback, and further improve the results.

Interestingly, Anthropic has already tested this feature with one of their customers, Kapa.ai. “Anthropic’s prompt improver streamlined our migration to Claude 3.5 Sonnet and enabled us to get to production faster,” said Finn Bauer, Co-Founder at Kapa.ai in the announcement from Anthropic.

A few days ago, Dario Amodei, CEO at Anthropic revealed that Claude 3.5 Opus is on the cards. We’re curious if today’s announcement is a hint towards integrating reasoning capabilities in the flagship Claude model.

The post Anthropic Will Use Chain of Thought Reasoning to Improve Prompts appeared first on Analytics India Magazine.