Make room for RAG: How Gen AI’s balance of power is shifting

Baidu's rendering of a general RAG approach.

Much of the interest surrounding artificial intelligence (AI) is caught up with the battle of competing AI models on benchmark tests or new so-called multi-modal capabilities.

OpenAI announces a video capability, Sora, that stuns the world; Google responds with Gemini's ability to pick out a frame of video; and the open-source software community quickly unveils novel approaches that speed past the dominant commercial programs with greater efficiency.

Also: OpenAI is training GPT-4's successor. Here are 3 big upgrades to expect from GPT-5

But users of Gen AI's large language models, especially enterprises, may care more about a balanced approach that produces valid answers speedily.

A growing body of work suggests the technology of retrieval-augmented generation, or RAG, could be pivotal in shaping the battle between large language models (LLMs).

RAG is the practice of having an LLM respond to a prompt by sending a request to an external data source, such as a "vector database", and retrieving authoritative data. The most common use of RAG is to reduce the propensity of LLMs to produce "hallucinations", where a model confidently asserts falsehoods.
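
To make the mechanics concrete, here is a minimal sketch of that loop in Python: index documents as vectors, retrieve the closest match for a query, and prepend it to the prompt. The toy bag-of-words embed function and the sample documents are stand-ins; a real system would use an embedding model and a vector database such as Pinecone.

```python
import numpy as np

VOCAB: dict[str, int] = {}

def embed(text: str) -> np.ndarray:
    # Toy bag-of-words embedding; a real system would call an embedding model.
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[VOCAB.setdefault(word, len(VOCAB)) % 64] += 1.0
    return vec

# 1. Index authoritative documents as vectors (the "vector database").
docs = [
    "Product X ships with a two-year warranty.",
    "Product X supports USB-C charging only.",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Return the k documents most similar to the query (cosine similarity).
    q = embed(query)
    def sim(v: np.ndarray) -> float:
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
    return [doc for doc, vec in sorted(index, key=lambda p: -sim(p[1]))[:k]]

# 2. Ground the prompt in the retrieved text before calling the LLM.
question = "What warranty does Product X have?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```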

Also: Is OpenAI sweating? 9 Google features announced for Gemini, Search, Android, and more

Commercial software vendors, such as search software maker Elastic and vector database vendor Pinecone, are rushing to sell programs that let companies hook up to databases and retrieve authoritative answers grounded in, for example, a company's product data.

What's retrieved can take many forms, including documents from a document database, images from a picture file or video, or pieces of code from a software development code repository.

What's already clear is the retrieval paradigm will spread far and wide to all LLMs, both for commercial and consumer use cases. Every generative AI program will have hooks into external sources of information.

Today, that process can be achieved with function calling, which OpenAI and Anthropic offer for their GPT and Claude programs respectively. Those simple mechanisms provide limited access to data for limited queries, such as getting the current weather in a city.

Function calling will probably have to meld with, or be supplanted by, RAG at some point to extend what LLMs can offer in response.

That shift implies RAG will become commonplace in how most AI models perform.

Also: Pinecone's CEO is on a quest to give AI something like knowledge

And that prominence raises issues. In this admittedly early phase of RAG's development, different LLMs perform differently when using RAG, doing a better or worse job of handling the information that the RAG software sends back to the LLM from the database. That difference means that RAG becomes a new factor in the accuracy and utility of LLMs.

RAG, even as early as the initial training phase of AI models, could start to affect the design considerations for LLMs. Until now, AI models have been developed in a vacuum, built as pristine scientific experiments that have little connection to the rest of data science.

There may be a much closer relationship in the future between the building and training of neural nets for generative AI and the downstream tools of RAG that will play a role in performance and accuracy.

Pitfalls of LLMs with retrieval

Simply applying RAG has been shown to increase the accuracy of LLMs, but it can also produce new problems.

For example, what comes out of a database can lead LLMs into conflicts that are then resolved by further hallucinations.

Also: I've tested dozens of AI chatbots since ChatGPT's debut. Here's my new top pick

In a report in March, researchers at the University of Maryland found that GPT-3.5 can fail even after retrieving data via RAG.

"The RAG system may still struggle to provide accurate information to users in cases where the context provided falls beyond the scope of the model's training data," they write. The LLM would at times "generate credible hallucinations by interpolating between factual content."

Scientists are finding that the design choices of LLMs can affect how they perform with retrieval, including the quality of the answers they get back.

A study this month by scholars at Peking University noted that "the introduction of retrieval unavoidably increases the system complexity and the number of hyper-parameters to tune," where hyper-parameters are choices made about how to train the LLM.

For example, when a model chooses its next "token" from several candidates, including tokens drawn from the RAG data, one can dial up or down how broadly it samples, meaning how large or narrow a pool of tokens it chooses from.

Choosing from only a small group of the most likely tokens, known as "top-k sampling", was found by the Peking scholars to "improve attribution but harm fluency", so that what the user gets back involves trade-offs in quality, relevance, and more.
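
To see what that dial looks like in code, here is a hedged sketch of top-k sampling; the logits are invented toy scores, whereas a real model samples over a vocabulary of tens of thousands of tokens.

```python
import numpy as np

def top_k_sample(logits: np.ndarray, k: int, rng=None) -> int:
    # Sample a token id from only the k highest-scoring candidates.
    rng = rng or np.random.default_rng()
    top = np.argsort(logits)[-k:]              # indices of the k best tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # softmax over the top-k only
    return int(rng.choice(top, p=probs))

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])  # toy next-token scores
token = top_k_sample(logits, k=2)               # only tokens 0 or 1 can win
```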

Because RAG can dramatically expand the so-called context window, the total number of tokens an LLM has to handle as input, using RAG can make a model's context window a bigger issue than it otherwise would be.

Also: The best AI image generators: Tested and reviewed

Some LLMs can handle many more tokens — on the order of a million, for Gemini — while others handle far fewer. That fact alone could make some LLMs better at handling RAG than others.

Both examples, hyper-parameters and context length affecting results, stem from the broader fact that, as the Peking scholars observe, RAG and LLMs each have "distinct objectives". They weren't built together; they're being bolted together.

It may be that RAG will evolve more "advanced" techniques to align with LLMs better, or, it may be the case that LLM design has to start to incorporate choices that accommodate RAG earlier in the development of the model.

Trying to make LLMs smarter about RAG

Scholars are spending a lot of time these days studying in detail failure cases of RAG-enabled LLMs, in part to ask a fundamental question: what's lacking in the LLM itself that is tripping things up?

Scientists at Chinese messaging firm WeChat described in a research paper in February how LLMs don't always know how to handle the data they retrieve from the database. A model might spit back incomplete information given to it by RAG.

Also: OpenAI just gave free ChatGPT users browsing, data analysis, and more

"The key reason is that the training of LLMs does not clearly make LLMs learn how to utilize input retrieved texts with varied quality," write Shicheng Xu and colleagues.

To deal with that issue, they propose a special training method for AI models they call "an information refinement training method" named INFO-RAG, which they show can improve the accuracy of LLMs that use RAG data.

WeChat's INFO-RAG seeks to train a large language model to be RAG-aware.

The idea of INFO-RAG is to use data retrieved with RAG upfront, as the training method for the LLM itself. A new dataset is culled from Wikipedia entries, broken apart into sentence pieces, and the model is trained to predict the latter part of a sentence fetched from RAG by being given the first part.
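
A rough sketch of that data-construction step, assuming a simple half-split of each sentence captures the idea; the actual INFO-RAG pipeline differs in its details.

```python
def make_training_pairs(sentences: list[str]) -> list[tuple[str, str]]:
    # The first half of each sentence plays the role of retrieved text;
    # the second half is the completion the model learns to predict.
    pairs = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) < 4:
            continue
        cut = len(words) // 2
        pairs.append((" ".join(words[:cut]), " ".join(words[cut:])))
    return pairs

wiki = ["The Eiffel Tower was completed in 1889 in Paris, France."]
print(make_training_pairs(wiki))
# [('The Eiffel Tower was completed', 'in 1889 in Paris, France.')]
```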

INFO-RAG is thus an example of training an LLM with RAG in mind. More training methods will probably incorporate RAG from the outset, given that, in many contexts, using RAG is exactly what one wants LLMs to do.

More subtle aspects of the RAG and LLM interaction are starting to emerge. Researchers at software maker ServiceNow described in April how they could use RAG to rely on smaller LLMs, which runs counter to the notion that the larger a large language model, the better.

"A well-trained retriever can reduce the size of the accompanying LLM at no loss in performance, thereby making deployments of LLM-based systems less resource-intensive," write Patrice Béchard and Orlando Marquez Ayala.

Also: What is Copilot (formerly Bing Chat)? Here's everything you need to know

If RAG substantially enables size reduction for many use cases, it could conceivably tilt the focus of LLM development away from the size-at-all-cost paradigm of today's increasingly large models.

There are alternatives, with issues

The most prominent alternative is fine-tuning, where the AI model is retrained, after its initial training, by using a more focused training data set. That training can impart new capabilities to the AI model. That approach has the benefit of producing a model that could use specific knowledge encoded in its neural weights without relying on access to a database via RAG.
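
For an open-weights model, the mechanics can be as simple as the minimal sketch below using the Hugging Face Transformers library; the gpt2 model name and the one-line dataset are illustrative stand-ins for a real model and domain corpus, not the setup of any study cited here.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for any open-weights causal LM
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy "focused training data set"; a real run would use thousands of examples.
texts = ["Q: What is Product X's return policy? A: 30 days with receipt."]
enc = tok(texts, truncation=True, padding=True, return_tensors="pt")

class ToyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return enc["input_ids"].shape[0]
    def __getitem__(self, i):
        ids = enc["input_ids"][i]
        # For a causal LM, the labels are the input ids themselves.
        return {"input_ids": ids, "attention_mask": enc["attention_mask"][i],
                "labels": ids.clone()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=ToyDataset(),
)
trainer.train()  # afterwards, the new knowledge lives in the updated weights
```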

But there are issues particular to fine-tuning as well. Google scientists described this month problematic phenomena in fine-tuning, such as the "perplexity curse", in which the AI model cannot recall the necessary information if it's buried too deeply in a training document.

That issue is a technical aspect of how LLMs are initially trained and requires special work to overcome. There can also be performance issues with fine-tuned AI models that degrade how well they perform relative to a plain vanilla LLM.

Fine-tuning also implies having access to the model's weights in order to re-train it, which is a problem for those who don't have such access, such as the clients of OpenAI or another commercial vendor.

Also: This free tool from Anthropic helps you create better AI prompts

As mentioned earlier, function calling today provides a simple way for GPT or Claude LLMs to answer simple questions. The LLM converts a natural language query such as "What's the weather in New York City?" into a structured format with parameters, such as the name of the function to call and the location to look up.

Those parameters are passed to a helper app designated by the programmer, and the helper app responds with the exact information, which the LLM then formats into a natural-language reply, such as: "It's currently 76 degrees Fahrenheit in New York City."
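
Here is a hedged sketch of that round trip; the JSON-style tool definition mirrors the shape such schemas generally take, but the field names and the get_weather helper are illustrative rather than any vendor's exact API.

```python
import json

# Tool definition the LLM is shown (illustrative schema).
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    return "76 degrees Fahrenheit"  # stand-in for a real weather API call

# Suppose the LLM turned "What's the weather in New York City?" into:
llm_call = {"name": "get_weather",
            "arguments": json.dumps({"city": "New York City"})}

# The program dispatches the structured call to the designated helper...
args = json.loads(llm_call["arguments"])
result = {"get_weather": get_weather}[llm_call["name"]](**args)

# ...and the raw answer goes back to the LLM to be phrased naturally:
print(f"It's currently {result} in {args['city']}.")
```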

But that structured query limits what a user can do or what an LLM can be made to absorb as an example in the prompt. The real power of an LLM should be to field any query in natural language and use it to extract the right information from a database.

A simpler approach than either fine-tuning or function calling is known as in-context learning, which most LLMs do anyway. In-context learning involves presenting prompts with examples that give the model a demonstration that enhances what the model can do subsequently.

The in-context learning approach has been expanded into something called in-context knowledge editing (IKE), where prompting via demonstrations seeks to nudge the language model to retain a particular fact, such as "Joe Biden", in the context of a query such as "Who is the president of the US?"
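
A sketch of what such a prompt could look like; the demonstrations and wording here are illustrative rather than the exact templates from the IKE literature.

```python
# Demonstrations teach the model to prefer a fact supplied in-context
# over whatever its weights happen to encode.
new_fact = "Joe Biden is the president of the US."
demonstrations = (
    "New fact: The capital of Australia is Canberra.\n"
    "Q: What is the capital of Australia?\nA: Canberra\n\n"
)
prompt = (
    demonstrations
    + f"New fact: {new_fact}\n"
    + "Q: Who is the president of the US?\nA:"
)
# `prompt` is sent to the LLM; nothing guarantees the new fact persists
# beyond this context window, which is the fragility noted below.
```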

The IKE approach, however, still may entail some RAG usage, as it has to draw facts from somewhere. Relying on the prompt can make IKE somewhat fragile, as there's no guarantee the new facts will remain within the retained information of the LLM.

The road ahead

The apparent miracle of ChatGPT's arrival in November of 2022 was only the beginning of a long engineering process. A machine that can accept natural-language requests and respond in natural language still needs to be fitted with a way to give accurate and authoritative responses.

Performing such integration raises fundamental questions about the fitness of LLMs and how well they cooperate with RAG programs — and vice versa.

The result could be an emerging sub-field of RAG-aware LLMs, built from the ground up to incorporate RAG-based knowledge. That shift has large implications. If RAG knowledge is specific to a field or a company, then RAG-aware LLMs could be built much closer to the end user, rather than being created as generalist programs inside the largest AI firms, such as OpenAI and Google.

It seems safe to say RAG is here to stay, and the status quo will have to adapt to accommodate it, perhaps in many different ways.

NVIDIA Unveils ‘Rubin’ Months Ahead of Blackwell Release, AMD Announces MI400X

At Taipei’s Computex Conference, NVIDIA CEO Jensen Huang announced the launch of the Rubin AI chip platform, slated for 2026, and the Blackwell Ultra chip, expected in 2025, marking a shift to an annual update cycle for NVIDIA’s AI accelerators.

The Rubin architecture follows the March announcement of the Blackwell model, which is set to ship later in 2024. “We are seeing computation inflation,” Huang stated, highlighting the need for accelerated computing to manage the growing data processing demands. He emphasised NVIDIA’s technology, which promises 98% cost savings and 97% less energy consumption.

Previously, NVIDIA had a two-year update timeline for its AI chips. The shift to an annual release schedule underscores the competitive intensity in the AI chip market and NVIDIA’s efforts to maintain its leadership. The Rubin platform will feature new GPUs and a central processor named Vera, although details were scarce.

Huang announced that the forthcoming Rubin AI platform will incorporate HBM4, the next generation of high-bandwidth memory. This memory type has become a bottleneck in AI accelerator production due to high demand, with leading supplier SK Hynix Inc. largely sold out through 2025. Huang did not provide detailed specifications for the Rubin platform, which is set to succeed Blackwell.

AMD Focusing on AI Workloads

NVIDIA was not the only one focusing on AI workloads: during the opening keynote at Computex 2024, AMD Chair and CEO Lisa Su showcased the growing momentum of the AMD Instinct accelerator family. AMD unveiled a multiyear, expanded AMD Instinct accelerator roadmap, introducing an annual cadence of leadership AI performance and memory capabilities.

In 2026, AMD plans to release the AMD Instinct MI400 series, based on the AMD CDNA “Next” architecture, which will provide the latest features and capabilities to enhance performance and efficiency for AI training and inference.

Previewed at Computex, the 5th Gen AMD EPYC processors, codenamed “Turin”, will utilise the “Zen 5” core, continuing the high performance and efficiency of the AMD EPYC processor family. These processors are expected to be available in the second half of 2024.

The roadmap begins with the AMD Instinct MI325X accelerator, set to be available in Q4 2024. This accelerator will feature 288GB of HBM3E memory and 6 terabytes per second of memory bandwidth, using the same Universal Baseboard design as the MI300 series. It boasts industry-leading memory capacity and bandwidth, being 2x and 1.3x better than the competition, respectively, and offering 1.3x better compute performance.

Following this, the AMD Instinct MI350 series, powered by the new AMD CDNA 4 architecture, is expected in 2025. It promises up to a 35x increase in AI inference performance compared to the MI300 series with CDNA 3 architecture.

The AMD Instinct MI350X accelerator will be the first product in this series, utilising advanced 3 nm process technology, supporting FP4 and FP6 AI data types, and including up to 288 GB of HBM3E memory.

Master Data Management (MDM) and CRM: Ensuring data quality for enhanced customer relationships

Data quality has become vital in the digital age, where data shapes decisions and business strategies. Customer Relationship Management (CRM) systems, crucial for managing customer interactions and fostering growth, depend heavily on quality data. Here, the fusion of Master Data Management (MDM) and CRM emerges as a potent force. This article explores their symbiotic bond, stressing the significance of data quality and the role of a data quality framework in upholding it.

Understanding Master Data Management (MDM)

Master Data Management (MDM) is a comprehensive approach to managing an organization’s critical data entities, commonly called master data. These entities encompass a range of data types, including customer details, product information, supplier data, and more. MDM ensures this master data remains consistent, accurate, and reliable across various systems, departments, and organizational processes.

The convergence of MDM and CRM

The convergence of Master Data Management (MDM) and Customer Relationship Management (CRM) represents a strategic alignment that amplifies the value of data across an organization. This integration transcends traditional data management boundaries, creating synergies that drive business growth, enhance customer relationships, and foster operational excellence. Let’s delve deeper into the multifaceted convergence of MDM and CRM.

Unified customer data landscape

  • Centralized data repository: Integrating MDM with CRM systems creates a centralized repository where master data entities, such as customer profiles, product information, and sales data, are harmonized and consolidated. This centralized view ensures data consistency, accuracy, and accessibility across the organization.
  • Real-time data synchronization: MDM and CRM integration facilitates real-time data synchronization, ensuring that any changes or updates made to master data in one system reflect immediately across all interconnected systems. This real-time synchronization enhances data currency and reliability.

Enhanced customer insights and analytics

  • Holistic customer view: By combining master data from MDM systems with transactional and interaction data from CRM systems, organizations can build a holistic view of their customers. This comprehensive customer view enables deeper insights into customer behaviors, preferences, and lifecycle stages.
  • Advanced analytics capabilities: The convergence of MDM and CRM data sets unlocks advanced analytics capabilities, including predictive modeling, customer segmentation, and trend analysis. These analytical insights empower organizations to anticipate customer needs, identify growth opportunities, and optimize marketing strategies.

Optimized customer engagement strategies

  • Personalized marketing and sales: A unified MDM and CRM environment enables organizations to execute personalized marketing and sales campaigns by leveraging enriched customer data. Personalization based on accurate and comprehensive data enhances customer engagement, conversion rates, and revenue growth.
  • Tailored customer service: Organizations can deliver more personalized and efficient customer service with a consolidated view of customer data. Customer service representatives have access to relevant customer information, enabling them to promptly address inquiries, resolve issues, and provide tailored solutions.

Efficient data governance and management

  • Standardized data governance: MDM and CRM integration fosters standardized data governance practices by establishing clear data stewardship roles, quality standards, and management processes. This standardized approach ensures data integrity, compliance, and accountability.
  • Automated data management processes: Integration between MDM and CRM systems enables automated data management processes, such as data cleansing, validation, and enrichment. Automation reduces manual errors, accelerates data processing, and enhances operational efficiency.

Scalability and flexibility

  • Adaptable architecture: MDM and CRM integration offers a flexible and scalable architecture. This architecture adapts to evolving business needs and technological advancements. Organizations can easily extend their data management capabilities, integrate new data sources, and scale their operations without compromising performance.
  • Future-proofing data strategy: The convergence of MDM and CRM establishes a robust foundation for future data initiatives, such as implementing advanced analytics, integrating emerging technologies like AI and IoT, and adapting to regulatory changes. This future-proofing ensures organizations remain agile and competitive in a rapidly evolving digital landscape.

Data Quality Framework: A Cornerstone for MDM and CRM

Implementing a robust data quality framework is pivotal for ensuring high-quality data in both MDM and CRM environments. A data quality framework outlines the policies, standards, procedures, and tools required to maintain data accuracy, completeness, and consistency.

Within the realm of MDM and CRM, a data quality framework facilitates:

  • Data governance: Establishing clear roles, responsibilities, and processes for data management.
  • Data cleansing and validation: Identifying and rectifying data errors, inconsistencies, and duplicates to maintain data integrity.
  • Data monitoring and reporting: Continuous monitoring of data quality metrics, report generation, and performance tracking.
  • Compliance and security: Ensuring adherence to data privacy regulations and implementing security measures to safeguard customer data.
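
As a minimal illustration of the cleansing and validation items above, the Python sketch below standardizes, validates, and deduplicates toy customer records; production MDM suites do this with dedicated matching engines and far richer rules.

```python
import re

records = [
    {"id": 1, "email": "a.smith@example.com", "name": "Ann Smith"},
    {"id": 2, "email": "A.Smith@Example.com ", "name": "Ann Smith"},  # duplicate
    {"id": 3, "email": "not-an-email", "name": "Bob Jones"},          # invalid
]

def is_valid_email(email: str) -> bool:
    # Simplistic check; real frameworks apply far richer validation rules.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email) is not None

seen, clean, rejected = set(), [], []
for record in records:
    email = record["email"].strip().lower()   # standardize before matching
    if not is_valid_email(email):
        rejected.append(record)               # route to a stewardship queue
    elif email not in seen:
        seen.add(email)                       # drop later exact duplicates
        clean.append({**record, "email": email})

print(len(clean), "clean,", len(rejected), "rejected")  # 1 clean, 1 rejected
```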

Benefits of MDM and CRM Integration

The integration of Master Data Management and Customer Relationship Management systems has numerous benefits that extend beyond data quality. Here’s a detailed exploration of the advantages organizations can reap from this strategic alignment:

1. Improved Data Accuracy and Consistency

  • Unified data view: MDM and CRM integration ensures a consistent view of customer data across the organization. This unified view eliminates data discrepancies and ensures all customer-facing systems access accurate and up-to-date information.
  • Reduced data duplication: With integrated MDM and CRM systems, duplicate data entry and inconsistencies are minimized, improving data accuracy and reliability.

2. Enhanced Customer Experience

  • Personalized interactions: A unified view of customer data enables organizations to better understand customer preferences, behaviors, and needs. This understanding allows for more personalized marketing campaigns, for example.
  • Improved customer engagement: Consistent and high-quality data empowers organizations to engage with customers more effectively across various touchpoints, enhancing customer satisfaction and loyalty.

3. Operational Efficiency and Cost Savings

  • Streamlined processes: Integration between MDM and CRM systems streamlines data management processes, reduces manual data entry, and eliminates data silos. This efficiency translates into faster response times, reduced errors, and lower operational costs.
  • Optimized resource allocation: By automating data-related tasks and reducing manual interventions, organizations can allocate resources more effectively, focusing on strategic initiatives that drive business growth.

4. Data-Driven Decision Making

  • Informed insights: A unified and consistent data source gives organizations reliable insights into customer behavior, market trends, and business performance. These insights enable data-driven decision-making, fostering agility and adaptability in response to changing market dynamics.
  • Predictive analytics: Integrated MDM and CRM data sets enable advanced analytics and predictive modeling. Organizations can leverage these capabilities to proactively forecast customer trends, identify opportunities, and mitigate risks.

5. Compliance and Risk Mitigation

  • Data governance and compliance: MDM and CRM integration facilitates robust data governance practices, ensuring adherence to data privacy regulations. This compliance minimizes legal risks and enhances organizational trustworthiness.
  • Security enhancements: A unified data management approach strengthens security measures. These security enhancements protect sensitive customer information from unauthorized access and potential breaches.

Conclusion

Master Data Management (MDM) and Customer Relationship Management (CRM) are integral to a holistic data management strategy. Their integration and the implementation of a robust data quality framework are essential for maintaining high-quality master data that drives informed decision-making, fosters meaningful customer relationships, and propels business success in a data-driven world. Organizations prioritizing MDM and CRM integration and investing in data quality initiatives are better positioned to navigate the complexities of the modern business landscape and capitalize on the opportunities it presents.

Kunal Shah Tells Entrepreneurs to Take Two Days Off a Week to Think

In a recent podcast, Kunal Shah, the founder and CEO of Cred, revealed his strategies and ideas behind building a successful business. One of the things he advised entrepreneurs is to take two days off every week to think.

“I don’t think people do that because it feels like cheating,” said Shah, explaining that entrepreneurs are hesitant to take two days off in the middle of the week. He also said that many entrepreneurs love ‘sangharsh’, the Hindi word for struggle. His point was that it is important for entrepreneurs to take some time to think and reflect on what they are building in order to maximise productivity.

“There is no way you could be cute and a top ranker in IIT,” said Shah, explaining that people who have struggled to build their companies and have worked hard in their lives, naturally assume that struggle is the way forward. “Anything which is not ‘sangharsh’ is cheating for you,” he added.

But Shah said that wealth does not necessarily come by just struggling. “I am not saying that people who are successful do not work hard, but their time is spent in very different ways,” Shah added, returning to how entrepreneurs hesitate to take leave during the week.

This brings back the conversation about when Narayana Murthy, the Infosys co-founder, said that Indian youth should work 70 hours a week. “Performance leads to recognition, recognition leads to respect, respect leads to power,” he said, asking the “wonderful youth of the country” to realise this and work 12 hours a day.

Recalling a philosophical question he had asked Sam Altman during Altman’s visit to India — what he had learned about humans — Shah said Altman answered that he had discovered that intelligence is just a property of matter. “If we give birth to an AI or anything else that is superior to us…the wealth might shift over there and we may not exist as a species,” Shah explained, reflecting that humans have always been asymmetric information-gathering systems, and the same would be true for AI systems.

People are using AI music generators to create hateful songs

A robot reading music

Malicious actors are abusing generative AI music tools to create homophobic, racist, and propagandistic songs — and publishing guides instructing others how to do so.

According to ActiveFence, a service for managing trust and safety operations on online platforms, there’s been a spike in chatter within “hate speech-related” communities since March about ways to misuse AI music creation tools to write offensive songs targeting minority groups. The AI-generated songs being shared in these forums and discussion boards aim to incite hatred toward ethnic, gender, racial, and religious cohorts, say ActiveFence researchers in a report, while celebrating acts of martyrdom, self-harm, and terrorism.

Hateful and harmful songs are hardly a new phenomenon. But the fear is that, with the advent of easy-to-use free music-generating tools, they’ll be made at scale by people who previously didn’t have the means or know-how — just as image, voice, video and text generators have hastened the spread of misinformation, disinformation, and hate speech.

“These are trends that are intensifying as more users are learning how to generate these songs and share them with others,” Noam Schwartz, co-founder and CEO of ActiveFence, told TechCrunch in an interview. “Threat actors are quickly identifying specific vulnerabilities to abuse these platforms in different ways and generate malicious content.”

Creating “hate” songs

Generative AI music tools like Udio and Suno let users add custom lyrics to generated songs. Safeguards on the platforms filter out common slurs and pejoratives, but users have figured out workarounds, according to ActiveFence.

In one example cited in the report, users in white supremacist forums shared phonetic spellings of minorities and offensive terms, such as “jooz” instead of “Jews” and “say tan” instead of “Satan,” that they used to bypass content filters. Some users suggested altering spacings and spellings when referring to acts of violence, like replacing “my rape” with “mire ape.”

TechCrunch tested several of these workarounds on Udio and Suno, two of the more popular tools for creating and sharing AI-generated music. Suno let all of them through, while Udio blocked some — but not all — of the offensive homophones.

Reached via email, a Udio spokesperson told TechCrunch that the company prohibits the use of its platform for hate speech. Suno didn’t respond to our request for comment.

In the communities it canvassed, ActiveFence found links to AI-generated songs parroting conspiracy theories about Jewish people and advocating for their mass murder; songs containing slogans associated with the terrorist groups ISIS and Al-Qaeda; and songs glorifying sexual violence against women.

Impact of songs

Schwartz makes the case that songs — as opposed to, say, text — carry an emotional heft that makes them a potent force for hate groups and political warfare. He points to Rock Against Communism, the series of white power rock concerts in the U.K. in the late ’70s and early ’80s that spawned whole subgenres of antisemitic and racist “hatecore” music.

“AI makes harmful content more appealing — think of someone preaching a harmful narrative about a certain population and then imagine someone creating a rhyming song that makes it easy for everyone to sing and remember,” he said. “They reinforce group solidarity, indoctrinate peripheral group members and are also used to shock and offend unaffiliated internet users.”

Schwartz calls on music generation platforms to implement prevention tools and conduct more extensive safety evaluations. “Red teaming might potentially surface some of these vulnerabilities and can be done by simulating the behavior of threat actors,” Schwartz said. “Better moderation of the input and output might also be useful in this case, as it will allow the platforms to block content before it is being shared with the user.”

But fixes could prove fleeting as users uncover new moderation-defeating methods. Some of the AI-generated terrorist propaganda songs ActiveFence identified, for example, were created using Arabic-language euphemisms and transliterations — euphemisms the music generators didn’t detect, presumably because their filters aren’t strong in Arabic.

AI-generated hateful music is poised to spread far and wide if it follows in the footsteps of other AI-generated media. Wired documented earlier this year how an AI-manipulated clip of Adolf Hitler racked up more than 15 million views on X after being shared by a far-right conspiracy influencer.

Among other experts, a UN advisory body has expressed concerns that racist, antisemitic, Islamophobic and xenophobic content could be supercharged by generative AI.

“Generative AI services enable users who lack resources or creative and technical skills to build engaging content and spread ideas that can compete for attention in the global market of ideas,” Schwartz said. “And threat actors, having discovered the creative potential offered by these new services, are working to bypass moderation and avoid being detected — and they have been successful.”

GIGABYTE Announces AI Top, an All-Round Solution to Train AI Locally on PCs

GIGABYTE, a Taiwanese computer hardware manufacturer and distributor, announced GIGABYTE AI TOP, a groundbreaking solution to train AI locally, at a launch event a day before COMPUTEX 2024. CEO Eddie Lin stated that GIGABYTE AI TOP was born with the motto ‘Train Your Own AI on Your Desk’, aiming to complete the last mile in the booming era of local AI.

GIGABYTE AI TOP is the all-around solution to train AI models locally. It features the AI TOP Utility, the AI TOP Hardware, and the AI TOP Tutor.

The AI TOP Utility is reinvented software with a friendly user interface and experience. It supports up to 236B-parameter large language models while maintaining privacy and security. The AI TOP Hardware offers flexibility and upgradability compared to traditional training solutions on the cloud and is suitable for standard electrical systems without extra cost in electricity construction.

The AI TOP Tutor provides comprehensive consultation for AI TOP solutions, intuitive set-up guidance, and technical support. All these features make GIGABYTE AI TOP easily adapted by both beginners and professionals to start up their local AI training projects.

The AI TOP Hardware features a variety of GIGABYTE products, including motherboards, graphics cards, SSDs, and power supply units. One of the event’s highlights was the unveiling of the Radeon PRO W7900 AI TOP 48G and Radeon PRO W7800 32G.

Their presence makes GIGABYTE the first and the only professional graphics card partner on the market that collaborates with the AMD Radeon PRO series.

“GIGABYTE’s persistent pursuit in quality and reliability strengthens our partnership with leading silicon giants in making the world better with AI,” Eddie Lin told the audience at the launch event.

GIGABYTE has been working closely with top chip leaders such as partnering with NVIDIA to launch high-end RTX AI PCs, delivering exceptional user experiences.

Data detective work: An anti-money laundering example

Image by Alexa from Pixabay

I’ve been studying the effects of sanctions lately, which has led to a better understanding of how governments are collaborating and sharing data in more substantial ways.

In May 2024 Daleep Singh, US Deputy National Security Advisor, International Economics, gave a keynote at a Brookings Institution event titled “Sanctions on Russia: What’s working? What’s not?” Singh’s main point was that sanctions should be seen as one tool in a toolbox. But he does make clear that sanctions against Russia have had a significant impact over the past two years.

Sanctions on Russia are currently in place against 4,500 individuals and entities. Many of these entities are shell companies that didn’t exist before Russia’s February 2022 invasion of Ukraine.

The overall sanctions effort of the US and its partners has immobilized more than $300 billion in assets, according to Singh. Singh and others propose to invest those assets in government bonds and use the interest earned to support Ukraine in its war efforts against Russia.

Money laundering and the Magnitsky Act

Sergei Magnitsky was a Russian lawyer, tax advisor and whistleblower for whom the Magnitsky Act was named. Some of the people Magnitsky blew the whistle on managed to have him arrested and sent to Butyrka prison in Moscow, where he was beaten. Magnitsky died in prison in 2009 after being refused medical treatment.

Before his death, Magnitsky was investigating criminals who’d stolen the corporate identities of Hermitage Capital’s Russian investment fund units. Those criminals used the identities to create fake contracts and then secure fraudulent court judgments to obtain a $230 million tax refund — the largest refund in Russian history. The criminals then laundered money from that refund.

One of Magnitsky’s clients, Hermitage Capital founder and CEO Bill Browder, envisioned the Magnitsky Act and has been the lead advocate for its passage.

Sanctions and the Magnitsky Act

The Magnitsky Act makes it possible to sanction foreign government officials who’ve violated human rights by freezing their assets. Anti-money laundering investigations support the efforts to freeze assets by collecting evidence that the assets are associated with human rights violations.

The Magnitsky Act first became law in the US in 2012. Since then, 35 countries have passed laws modeled after the US law, including 25 countries who are members of the European Union. Those same 35 countries make up most of the 39 US partners who are enforcing sanctions against Russia imposed after that country’s invasion of Ukraine.

Tracking and tracing money laundering activities

Central to the investigative work of anti-money laundering is the methodical construction of a schematic depicting a flow of transactions from money illegally obtained as a result of criminal activity. The Magnitsky case, explained in Browder’s 2022 book Freezing Order, is a prime example.

Key to creating a full anti-money laundering schematic is investigating suspect entities across many different kinds of data sources in different countries.

While independently investigating transactions associated with the Magnitsky case, Barron’s reporter Bill Alpert found a database of wire transfers from Banca di Economii in Moldova. Using that database, he traced money from two Moldovan entities through countries such as Cyprus, Lithuania, Latvia and Estonia.

Alpert found that one of the entities, Prevezon Holdings, had received $857,764 from the fraudulent $230 million tax refund mentioned above. A Russian named Denis Katsyv owned Prevezon Holdings.

Alpert then had the idea to search New York property records for that same entity. “In total,” Browder notes in the book, “Denis Katsyv had used Prevezon to purchase roughly $17 million worth of real estate in New York,” including a posh apartment in the financial district of Manhattan.

Reducing the AML compliance burden with a knowledge graph approach

Financial institution spending on AML technology and operations neared $60 billion in 2023, according to FI technology research firm Celent. Some of the biggest banks may be spending $1 billion a year just on AML.

Imagine the challenge of an AML investigator, someone who needs to locate and scrutinize many different separate sources to be able to piece together a money trail.

One obvious way to simplify the work AML investigators do would be to desilo the data needed, making at least the most commonly reused databases into a virtual, unified, contextualized knowledge graph resource, rather than leaving the data in thousands of different databases.

Here in the US, counties maintain property records. There are 3,143 counties in the US. With a unified approach, AML investigators would be able to search once, rather than county by county. The likelihood of initial success would be far greater.
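
A toy sketch of that unified approach, using Python's networkx library: entities become nodes, transactions and ownership become edges, and a money trail becomes a single path query rather than a search across thousands of separate systems. The entities and figures below are invented for illustration.

```python
import networkx as nx

g = nx.DiGraph()
g.add_edge("Fraudulent refund", "Shell Co A", kind="wire", amount=857_764)
g.add_edge("Shell Co A", "Holding Co B", kind="wire")
g.add_edge("Holding Co B", "NYC apartment", kind="purchase")
g.add_edge("Beneficial owner", "Holding Co B", kind="owns")

# One query over the unified graph replaces county-by-county searching:
for path in nx.all_simple_paths(g, "Fraudulent refund", "NYC apartment"):
    print(" -> ".join(path))
# Fraudulent refund -> Shell Co A -> Holding Co B -> NYC apartment
```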

Now imagine a global Magnitsky Act and what it would take to unify the resources globally. It’s clear the major challenge would be to request permission and gain access to tens of thousands of databases maintained by local government agencies. Doing so would still be far cheaper than paying for all the labor needed to turn over rocks one at a time across an unnecessarily fragmented data landscape.

Soon, LLMs Can Help Humans Communicate with Animals

A common cliché in the language industry holds that translation breaks the language barrier. Since the late 1950s, researchers have been attempting to understand animal communication. Now, scientists are bringing large language models such as Google's LaMDA and OpenAI's GPT-3 to bear on the problem.

By studying massive datasets, which can include audio recordings, video footage, and behavioural data, researchers are now using machine learning to create a programme that can interpret these animal communication methods, among other things.

Closer to Reality

The Earth Species Project (ESP) seeks to build on this by utilising AI to address some of the field's enduring problems. With projects like mapping out crow vocalisations and creating a benchmark of animal sounds, ESP is establishing the groundwork for further AI research.

The organisation’s first peer-reviewed paper, published in Scientific Reports, presented a technique that could separate a single voice from a recording of numerous speakers, demonstrating the impressive strides being made in the field of animal communication with the help of AI.

Scientists refer to the complex task of isolating and understanding individual animal communication signals in a cacophony of sounds as the cocktail-party problem. From there, the organisation started evaluating the data to pair behaviours with communication signals.

ESP co-founder Aza Raskin stated, “As human beings, our ability to understand is limited by our ability to perceive. AI does widen the window of what human perception is capable of.”

Easier Said than Done

A common mistake is assuming that animals employ sound as their only form of communication. Visual and tactile stimuli are equally significant in animal communication, highlighting the intricate nature of this field.

For example, when beluga whales communicate, specific vocalisation cues show their social systems. Meerkats utilise a complex system of alarm cries in response to predators based on the predator’s proximity and level of risk. Birds also convey danger and other information to their flock members in the sky, such as the status of a mating pair.

These are only a few challenges researchers must address while studying animal communication.

To do this, Raskin and the ESP team are incorporating some of the most popular and consequential innovations of the moment, generative AI and large language models, into a suite of tools to actualise their project. These advanced technologies can understand and generate human-like responses in multiple languages, styles, and contexts.

Understanding non-human communication can be significantly aided by the insights provided by models like OpenAI’s GPT-3 and Google’s LaMDA, which are examples of such generative AI tools.

ESP has recently developed the Benchmark for Animal Sounds, or BEANS for short, the first-ever benchmark for animal vocalisations. It established a standard against which to measure the performance of machine learning algorithms on bioacoustics data.

Using self-supervision, it has also created the Animal Vocalisation Encoder, or AVES. This is the first foundational model for animal vocalisations and can be applied to many other applications, including signal detection and categorisation.

The nonprofit is just one of many groups that have recently emerged to translate animal languages. Some organisations, like Project Cetacean Translation Initiative (CETI), are dedicated to attempting to comprehend a specific species — in this case, sperm whales. CETI’s research focuses on deciphering the complex vocalisations of these marine mammals.

DeepSqueak is another machine learning technique developed by University of Washington researchers Kevin Coffey and Russell Marx, capable of decoding rodent chatter. Using raw audio data, DeepSqueak identifies rodent calls, compares them to calls with similar features, and provides behavioural insights, demonstrating the diverse approaches to animal communication research.

ChatGPT for Animals

In 2023, an X user named Cooper claimed that GPT-4 helped save his dog’s life. He ran a diagnosis on his dog using GPT-4, and the LLM helped him narrow down the underlying issue troubling his Border Collie named Sassy.

Though achieving AGI may still be years away, Sassy’s recovery demonstrates the potential practical applications of GPT-4 for animals.

While that is astonishing in and of itself, developing a foundational tool to comprehend all animal communication is challenging. Animal data is hard to obtain and requires specialised research to annotate, in contrast to human data, which is comparatively simple (for humans) to annotate.

Compared to humans, animals have a far more limited range of sounds, even though many of them are capable of forming sophisticated, complex communities. This means that the same sound can have multiple meanings depending on the context in which it is used. The only way to determine meaning is to examine the context, which includes the caller's identity, relationships with others, hierarchy, and past interactions.

Yet, this might be possible within a few years, according to Raskin. “We anticipate being able to produce original animal vocalisations within the next 12 to 36 months. Imagine if we could create a synthetic crow or whale that would seem to them to be communicating with one of their own. The plot twist is that, before we realise what we are saying, we might be able to engage in conservation”, Raskin says.

This “plot twist”, as Raskin calls it, refers to the potential for AI to not only understand animal communication but also to facilitate human-animal communication, opening up new possibilities for conservation and coexistence.

OpenAI’s GPT-4 Shows Prowess in Picking Stocks

Researchers at the University of Chicago’s Booth School of Business have demonstrated that OpenAI’s GPT-4 can perform as well as or even better than human experts in financial statement interpretations.

Using a method called chain-of-thought prompting, the researchers guided GPT-4 to simulate the mental processes of a human financial analyst. This allowed the model to analyse financial statements and forecast future market movements.

The team got the model to produce precise predictions by prompting it to recognise patterns, calculate ratios, and synthesise data. The study claimed that GPT-4 could forecast the direction of future profits with 60% accuracy, outperforming the majority of human financial analysts, who averaged between 53% and 57% accuracy.
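
A rough sketch of what a chain-of-thought prompt in this style might look like; the wording and toy figures are illustrative, not the study's actual template.

```python
statement = {"revenue": [100, 120], "net_income": [10, 15]}  # toy two-year summary

prompt = f"""You are a financial analyst. Given this two-year summary:
{statement}
Step 1: Identify notable trends in revenue and net income.
Step 2: Compute key ratios, such as the net margin for each year.
Step 3: Reason about what these imply for next year's earnings.
Step 4: Answer 'increase' or 'decrease' for next year's earnings direction."""

print(prompt)  # this prompt would then be sent to the LLM, e.g. GPT-4
```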

“LLM prediction does not stem from its training memory,” the researchers concluded. “Instead, we discover that the LLM produces insightful narratives regarding a business’s potential performance.”

Skeptics Aren’t Convinced

However, it’s crucial to exercise caution and not draw excessive conclusions from these findings.

ChatGPT answers user questions based on the data it was trained on and the prompts it is given. To get an accurate answer, you must ask the right question.

On Hacker News, a user pointed out that the researchers’ artificial neural network model, which they used as a benchmark, is from 1989 and cannot be compared to most advanced models utilised by financial analysts today.

On X, AI researcher Matt Holden questioned the researchers’ assertions, stating that it is improbable for GPT-4 to select equities that can outperform a broad index like the S&P 500. These concerns reflect the ongoing debate about the effectiveness of AI in stock market analysis.

In one experiment, asked about the performance of the S&P 500, ChatGPT reported a 26.9% net return for the benchmark stock index the previous year, even though the index had actually dropped 20%.

These examples demonstrate both the potential and the limitations of AI in stock market analysis.

Researchers from Virginia Tech, Queen’s University, and JPMorgan AI Research have looked at how ChatGPT and GPT-4 performed on simulated Chartered Financial Analyst (CFA) exams. The outcomes are not that impressive: the researchers found that, in the tested settings, ChatGPT probably would not be able to pass CFA Levels I and II.

This suggests that while ChatGPT shows promise, it still has a long way to go before it can match the expertise of human financial analysts.

For now, the market seems to have chosen to stick with more traditional approaches, emphasising the continued importance of human discretion in financial analysis.

India Leads APAC in Data Centre Expansion; Surpasses Japan, Singapore

India is outpacing Japan, Singapore, Hong Kong and other Asian countries in data centre capacity growth, driven by major investments from global tech giants like Amazon and Indian conglomerates such as Reliance Industries, according to a new report from real estate services firm CBRE.

The surging demand reflects India’s rapidly expanding digital economy and rising consumption of online services by its massive, internet-savvy population.

India, now the world’s most populous nation, is projected to add up to 850 MW of new data centre capacity between 2024 and 2026, nearly doubling its current capacity of around 950 MW and exceeding the growth of regional competitors. Excluding China, India will surpass South Korea (495 MW planned), Japan (407 MW), and Australia (314 MW) in new capacity during this period, the CBRE study found.

“India has a large population, so there are a lot of end users that can be served locally, whereas in Singapore or Hong Kong, one has to cater to foreign demand as well because the local population is not enough to make them a large market,” said Mikhail Jaura, senior researcher at technology consultancy IDC.

Several factors are fuelling India’s data centre boom. Soaring data consumption due to the increasing adoption of digital payments, e-commerce, streaming, and other online services by India’s 700 million+ internet users is a key driver.

Investments and cloud region launches by global tech firms like AWS (investing $12.7 billion by 2030), Google, and Microsoft are also playing a significant role.

Expansion by major Indian companies is further boosting growth. Reliance Industries’ joint venture with Brookfield aims to develop data centres in Chennai and Mumbai, while AdaniConneX secured $1.44 billion to build sites with 67 MW total capacity.

Other important factors include data localisation regulations that require certain data to be stored within India, and the COVID-19 pandemic's acceleration of digitisation across industries, with India expected to become a $1 trillion internet economy by 2030.

However, experts note that India still needs to improve infrastructure stability, bridge the hardware talent gap, and ensure a cost-effective, uninterrupted power supply to fully capitalise on its data centre potential. Compared to other real estate assets, the industry faces high capital requirements and long development timelines.

“One of the biggest challenges is infrastructure stability,” said IDC executive Franco Chiam. “That needs to be addressed as the first step to give confidence that businesses will be able to host data centres in that environment.”

Despite the hurdles, India’s data centre market is poised for robust growth in the coming years, with investments expected to reach around $5 billion annually by 2025. The sector is attracting billions from global firms, Indian companies, real estate developers, and private equity funds, boosting real estate activity in key hubs like Mumbai, Chennai, Hyderabad, and Delhi-NCR.

By 2026, India’s data centre capacity is forecast to reach over 1800 MW as demand continues to surge. The data centre industry is set to play a pivotal role in India’s ongoing digital transformation and economic growth, reshaping the country’s technological landscape in the years ahead.
