Big Tech’s AI Models Are Lost in Logic

Some say large language models (LLMs) are a step towards AGI; the rest think of them as merely a cool new tool. Across every nook and cranny of the content industry, from newsrooms to scriptwriters, the fear of being taken over by AI language models looms large. These tools can credibly write everything from Shakespearean verse to code in several programming languages. But while the models can spit out nicely stitched sentences, they lack a fundamental human ability: logical reasoning.

Yoshua Bengio mentioned during an interview with AIM that the magnitude of data these systems possess is almost equal to a person reading every day, every waking hour, all their life, and then living 1,000 such lives. Yet they fail at reasoning. “LLMs are encyclopaedic thieves,” he stated, pointing to the models’ inability to reason with that knowledge as consistently as humans do.

While researchers have long studied the subject, there is no sign yet that adding more layers, parameters, and attention heads to transformers will bridge the logical reasoning gap.

Looking Beyond Words

The extent to which well-known text-based LLMs can reason remains uncertain. Models trained solely on text inherently face limitations when it comes to common sense and real-world knowledge. While expanding the training dataset helps to a certain degree, these models may still have unexpected knowledge gaps. Multi-modal models, such as LLMs that understand both text and images, can address some of these challenges.

In a paper published in IEEE, Meta’s AI chief Yann LeCun echoes Bengio’s sentiment, presenting a pessimistic assessment of LLMs’ capacity to understand the world solely through reading. Multi-modal models have demonstrated improved reasoning abilities compared to their single-sense counterparts. It is worth noting, however, that symbolic logic, the approach that dominated AI research for decades, yielded minimal progress on reasoning over that entire period.

While multimodal large language models (MLLMs) have kindled hopes of AI models capable of reasoning, their development remains at a rudimentary stage.

Big Tech Tryna Reason

While many have already declared that language models cannot think, big tech has been digging further for ways to make these AI tools good at logical reasoning.

A group of researchers from Virginia Tech and Microsoft introduced a methodology known as the “Algorithm of Thoughts” (AoT). This approach propels LLMs along paths of algorithmic reasoning, opening a novel avenue for in-context learning. It also suggests that, prompted this way, LLMs could integrate their own intuition into searches that are optimised for better outcomes.

The research notes that LLMs have traditionally been prompted with methods such as “Chain-of-Thought”, “Self-consistency”, and “Least-to-Most Prompting”, each of which comes with limitations that restrict its overall effectiveness. AoT addresses the shortcomings of in-context learning techniques like the Chain-of-Thought (CoT) approach: where CoT occasionally produces incorrect intermediate steps, AoT steers the model with algorithmic examples, resulting in more dependable results.
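To make the contrast concrete, here is a minimal, illustrative sketch of how a CoT prompt and an AoT-style prompt might differ on the classic “Game of 24” puzzle. The `call_llm` function is a hypothetical placeholder for whatever completion API is being used, and the exemplar text is paraphrased for illustration rather than quoted from the paper.

```python
# Illustrative sketch: Chain-of-Thought vs. an AoT-style algorithmic prompt.
# `call_llm` is a hypothetical stand-in for whichever completion API is used.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your preferred LLM client here")

QUESTION = "Using the numbers 4, 6, 8, 2 once each, reach 24 with + - * /."

# Chain-of-Thought: one worked example with free-form intermediate steps.
cot_prompt = f"""Q: Using 3, 3, 8, 8 once each, reach 24.
A: Let's think step by step. 3 - 8/3 = 1/3, and 8 / (1/3) = 24. Answer: 8/(3-8/3).

Q: {QUESTION}
A: Let's think step by step."""

# AoT-style: the exemplar demonstrates an explicit search, including
# candidates that are tried and abandoned, so the model imitates an
# algorithmic exploration instead of a single linear chain of steps.
aot_prompt = f"""Q: Using 3, 3, 8, 8 once each, reach 24.
A: Search over sub-expressions and partial results.
   Try 3+3=6, leaving (6, 8, 8): 6*8-8=40, too big; 8/8*6=6, dead end. Backtrack.
   Try 8/3, leaving (3, 8): 3 - 8/3 = 1/3, then 8 / (1/3) = 24. Found it.
Answer: 8/(3-8/3).

Q: {QUESTION}
A: Search over sub-expressions and partial results."""

# print(call_llm(aot_prompt))  # uncomment once call_llm is wired to a model
```

The design point is that the AoT exemplar models the whole search, failed branches included, rather than only a clean path to the answer.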

A week ago, researchers at Google released a study titled ‘Teaching language models to reason algorithmically’, aimed at making models like ChatGPT better at algorithmic reasoning. The method builds on the in-context learning approach, prompting the model with explicit, step-by-step demonstrations of an algorithm. The findings suggest that exploring longer contexts and prompting with more informative explanations could provide valuable research directions.
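A minimal sketch of what such a prompt could look like follows, assuming “algorithmic prompting” amounts to spelling out each rule of the algorithm in the exemplar; the example text below is written for illustration and is not taken from the Google study.

```python
# Illustrative "algorithmic prompting" sketch: the exemplar spells out every
# rule of multi-digit addition (digit sums and carries), so the model is
# shown the algorithm itself rather than a loosely worked example.

algorithmic_prompt = """Problem: 128 + 367.
Explanation: add digit by digit from right to left, tracking the carry.
  ones:     8 + 7 = 15 -> write 5, carry 1
  tens:     2 + 6 + 1 (carry) = 9 -> write 9, carry 0
  hundreds: 1 + 3 + 0 (carry) = 4 -> write 4
Answer: 495.

Problem: 254 + 189.
Explanation:"""

# A capable model is expected to continue the explanation step by step
# and finish with "Answer: 443."
```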

Earlier this year, researchers from Amazon won an outstanding-paper award for showing that knowledge distillation using contrastive decoding in the teacher model and counterfactual reasoning in the student model improves the consistency of “chain of thought” reasoning.

Teaching LLMs to reason so that their outputs are rational is a highly active area of research today. The conventional approach is the so-called chain-of-thought paradigm, but researchers are gradually attaining better results with other methodologies as well. While big tech is chasing the subject one step at a time, the companies are yet to strike gold.
