Apple’s ReALM Challenges OpenAI’s GPT-4

Amid the buzz surrounding Apple’s unveiling of the MM1 model last month, the tech giant has now introduced another contender, ReALM (Reference Resolution As Language Modeling), an AI model it claims can beat OpenAI’s GPT-4.

The new model comprehends various kinds of context and delivers accurate information. Users can pose queries about content that is visible on the screen or running in the background, and receive precise answers seamlessly.

Apple says its latest AI model ReALM is even “better than OpenAI’s GPT4”.
It likely is as GPT4 has regressed because of “alignment”.
The ReALM war begins at WWDC 2024.
Paper: https://t.co/3emVSjgRvK pic.twitter.com/tOPMVaVI9V

— Brian Roemmele (@BrianRoemmele) April 1, 2024

Apple believes its latest AI model surpasses OpenAI’s GPT-4.

“We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it,” said the researchers in the paper titled ReALM: Reference Resolution As Language Modeling.

The researchers include Joel Ruben Antony Moniz, Soundarya Krishnan, Melis Ozyildirim, Prathamesh Saraf, Halim Cagri Ates, Yuan Zhang, Hong Yu, and Nidhi Rajshree.

GPT-4 vs ReALM

The Apple researchers said that a key difference between GPT-3.5 and GPT-4 lies in how they process information: GPT-3.5 only understands text, so it was prompted with text alone, whereas GPT-4 can also understand images. This combination of text and image input helps GPT-4 perform much better.

ReALM, on the other hand, takes the information a screenshot would contain, the entities on the screen and their layout, and represents it as text, so the language model can use on-screen context to respond to prompts more effectively.
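To make that idea concrete, below is a minimal Python sketch of converting UI elements and their coordinates into reading-order text that a text-only model can consume. The ScreenElement class, the row-grouping heuristic, and the tab/newline formatting are illustrative assumptions, not the exact encoding described in the paper.

```python
# Minimal sketch (assumed, not the paper's exact algorithm): turn on-screen
# UI elements into plain text in reading order, so a text-only model can
# "see" the screen.
from dataclasses import dataclass


@dataclass
class ScreenElement:
    text: str
    x: float  # left edge, normalised to 0..1
    y: float  # top edge, normalised to 0..1


def screen_to_text(elements: list[ScreenElement], row_tolerance: float = 0.02) -> str:
    """Group elements into rows by vertical position, then read each row
    left to right; rows are joined with newlines, cells with tabs."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    rows: list[list[ScreenElement]] = []
    for el in ordered:
        if rows and abs(el.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(el)
        else:
            rows.append([el])
    return "\n".join(
        "\t".join(e.text for e in sorted(row, key=lambda e: e.x)) for row in rows
    )


if __name__ == "__main__":
    screen = [
        ScreenElement("CVS Pharmacy", 0.05, 0.20),
        ScreenElement("(555) 010-2233", 0.60, 0.20),
        ScreenElement("Walgreens", 0.05, 0.30),
        ScreenElement("(555) 010-8899", 0.60, 0.30),
    ]
    print(screen_to_text(screen))
```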

The researchers noted, however, that there are further ways to improve results, such as adding semantically similar phrases to the prompt up to a certain length. “This more complex approach deserves further, dedicated exploration, and we leave this to future work,” they added.

Further, they said that ReALM is evaluated across three distinct entity types associated with different tasks: on-screen entities, conversational entities, and background entities.
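As a rough illustration of how candidates of all three types could be handed to a text-only model, the sketch below flattens them into a numbered prompt and asks the model to answer with an entity id. The data classes and prompt wording are assumptions made for illustration, not the paper’s actual schema.

```python
# Illustrative sketch only: candidate entities of the three types flattened
# into a single text prompt for a language model to resolve a reference.
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    ON_SCREEN = "onscreen"              # visible in the current UI
    CONVERSATIONAL = "conversational"   # mentioned earlier in the dialogue
    BACKGROUND = "background"           # e.g. a running timer or playing music


@dataclass
class Entity:
    entity_id: int
    entity_type: EntityType
    text: str  # textual description, e.g. a phone number or song title


def build_prompt(query: str, entities: list[Entity]) -> str:
    """Flatten candidate entities into a numbered, text-only prompt so a
    language model can pick the one(s) the query refers to."""
    lines = [f"{e.entity_id}. [{e.entity_type.value}] {e.text}" for e in entities]
    return (
        "Candidate entities:\n"
        + "\n".join(lines)
        + f"\n\nUser request: {query}\n"
        + "Answer with the id(s) of the entity the request refers to."
    )


if __name__ == "__main__":
    candidates = [
        Entity(1, EntityType.ON_SCREEN, "CVS Pharmacy - (555) 010-2233"),
        Entity(2, EntityType.ON_SCREEN, "Walgreens - (555) 010-8899"),
        Entity(3, EntityType.BACKGROUND, "Timer: 12 minutes remaining"),
    ]
    print(build_prompt("Call the bottom one", candidates))
```

The pharmacy example later in the article would map naturally onto this kind of prompt, with the last on-screen entry being the one the model should pick.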

Decoding Reference Resolution

The Apple researchers further said that understanding references like ‘they’ or ‘that’ in human speech is intuitive for people, who pick up contextual cues effortlessly. Deciphering such references, however, poses a challenge for an LLM-based chatbot, which struggles to identify the intended context.

This challenge is known as reference resolution, where the aim is to comprehend the specific entity or concept to which an expression refers.

The researchers believe that the low-power nature and latency constraints of such systems require the use of a ‘single LLM’ with extensive prompts to achieve seamless experiences.

For instance, a user asks Siri about nearby pharmacies and a list is presented on screen. The user then asks to call the number at the bottom of that list. Today, Siri would not perform this particular task, but with ReALM the language model can comprehend the context by analysing on-device, on-screen data and fulfil the query.

This also hints that at WWDC 2024, scheduled for June 10-14, Siri will most likely get a generative AI upgrade, setting the stage for ReALM’s arrival. “It’s going to be Absolutely Incredible!” said Apple SVP of marketing Greg Joswiak in a recent post, hinting at the AI innovations to be unveiled at the developers’ conference.

