In November 2022, before OpenAI’s ChatGPT entered the market, Meta and Papers with Code released Galactica, an open-source, 120-billion-parameter large language model for scientific research.
However, just three days after its launch, Meta took it down because the model hallucinated and produced incorrect information.
Although it was trained on an extensive dataset of 48 million scientific documents, including articles, textbooks, and lecture notes, users dismissed it as a “random bullshit generator”. It could generate counterfeit scientific papers and falsely attribute them to researchers who never wrote them, and it could fabricate fictional wiki articles, such as one detailing the “history of bears in space.”
Galactica was designed to address information overload in scientific search by organising knowledge from diverse sources, yet every output it generated carried a warning about potential unreliability, as language models are prone to hallucinating text.
The model’s false results raised serious concerns about the potential dangers of misinformation in scientific research, emphasising the risk of misleading information infiltrating scientific submissions.
Meta has since released better and more accurate foundation models such as Llama and Llama 2, yet the research community still awaits a fusion of GPT-4 and a resurrected Galactica.
“I hope one day this product can be re-launched after researchers figure out a way to harness the ‘monster of hallucination’ because the future of AI rests not just in its creativity, but also in its trustworthiness,” said Daliana Liu, a senior data scientist at Predibase.
Should You Trust LLM Feedback for Research?
While Galactica on its own might not have been sufficient, pairing it with GPT-4 could be a good solution: a recent research paper from Stanford University shows that LLMs, particularly GPT-4, have the potential to be valuable contributors to the scientific feedback process for research manuscripts.
The study demonstrates that the feedback generated by GPT-4 shows a noteworthy overlap with human peer reviewer feedback, especially in the context of weaker papers that are typically rejected.
Additionally, a user study involving researchers in the field of AI and computational biology indicates that a significant portion of users find GPT-4-generated feedback helpful and, in many cases, more beneficial than feedback from some human reviewers.
Even Jensen Huang, the chief of NVIDIA, earlier told AIM that he finds ChatGPT useful for preliminary research into environmental causes such as dissolving plastic.
The research underscores that LLMs such as GPT-4 can help in scientific review, providing valuable feedback that augments human expert insights. The inherent challenge, however, lies in eradicating hallucination.
Hallucinations are Part and Parcel of LLMs
Given LLMs’ stochastic predictions and their reliance on a vast decision tree for token selection, eliminating hallucinations entirely is impossible. LLMs lack the experiential capacity to discern true statements from false ones; they operate solely on linguistic analysis, and this is an inherent feature of the architecture.
Their reliance on human-provided information to validate synthetic statements further complicates performance. Even with extensive training data, their inability to ground outputs in real-world experience means error rates in generated text can never be fully eliminated.
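The stochastic nature of token selection can be illustrated with a toy sketch. The snippet below is a minimal, hypothetical example (the vocabulary and probabilities are invented for illustration, not taken from any real model): even when a model assigns most probability mass to the correct continuation, sampling-based decoding will occasionally emit a plausible but wrong token, which is one mechanical route to a hallucination.

```python
import random

# Toy next-token distribution for a prompt like "The capital of Australia is".
# These probabilities are hypothetical, for illustration only: the model puts
# most mass on the correct token but non-zero mass on plausible wrong ones.
next_token_probs = {
    "Canberra": 0.70,   # correct
    "Sydney": 0.25,     # plausible but wrong
    "Melbourne": 0.05,  # plausible but wrong
}

def sample_token(probs, rng):
    """Sample one token from the distribution, as stochastic decoding does."""
    r = rng.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # numerical fallback for rounding at the boundary

rng = random.Random(0)
samples = [sample_token(next_token_probs, rng) for _ in range(1000)]
wrong = sum(tok != "Canberra" for tok in samples)
print(f"wrong continuations in 1000 samples: {wrong}")
```

With these made-up numbers, roughly 30% of sampled continuations are wrong, even though the “model” prefers the correct answer. Greedy decoding would avoid this particular error, but at the cost of the diversity that sampling provides, which is why hallucination cannot simply be switched off.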
However, many, such as Kevin Scott, CTO of Microsoft, consider hallucinations to be part of the learning process of LLMs, while noting that pushing a model down a hallucinatory path detaches it from grounded reality.
Vu Ha of the Allen Institute for AI acknowledges that any deployed LLM-based system will still exhibit hallucinations; his main concern is whether the benefits of the model outweigh the negatives caused by occasional hallucinations.
Sebastian Berns, a doctoral researcher at Queen Mary University of London, proposes that models prone to hallucinations might not be entirely detrimental; instead, they could function as valuable “co-creative partners”, offering imaginative narratives that, though not entirely accurate, may contain useful threads of ideas worth exploring. For researchers, however, this is a serious problem, since their data needs to be factually correct.
“Hallucinations in LLM are due to the Auto-Regressive prediction. I think what I call ‘Objective Driven AI’ will solve the problem: systems that plan their answer by optimizing a number of objective functions at inference time,” tweeted Yann LeCun (@ylecun) on June 9, 2023 (https://t.co/JcR5hItwzJ).
Meanwhile, Mustafa Suleyman, CEO and co-founder of Inflection AI, believes that LLM hallucinations will be significantly reduced by 2025, emphasising their profound impact beyond current model errors. Meta AI chief scientist Yann LeCun, however, thinks that hallucinations occur due to auto-regressive prediction and that the solution lies in “Objective Driven AI,” where systems plan their answers by optimising multiple objective functions during inference.
So, the aim is to strike a balance, acknowledging that LLMs hold the promise of enhancing scientific research, provided the challenge of hallucination is effectively addressed. Whether through Meta or another major tech player, the development of a more sophisticated Galactica holds significant potential for advancing the research ecosystem.
The post Bring Back Galactica appeared first on Analytics India Magazine.