Wayve raises $1B to take its Tesla-like technology for self-driving to many carmakers

Mike Butcher @mikebutcher

Wayve, a U.K.-born startup developing a self-learning rather than rule-based system for autonomous driving, has closed $1.05 billion in Series C funding led by SoftBank Group. This is the U.K.’s largest AI fundraise ever and sits among the top 20 AI fundraises globally to date.

Also participating in the raise were Nvidia and existing investor Microsoft. Wayve’s early-stage investors included Meta’s head of AI, Yann LeCun.

Wayve, which was founded in Cambridge in 2017, raised $200 million in a Series B round in January 2022 led by Eclipse Ventures.

The company plans to use the fresh capital injection to develop its product for “eyes on” assisted driving and “eyes off” fully automated driving, along with other AI-assisted automotive applications. It plans to expand operations globally.

San Francisco has become known as the epicenter for autonomous driving roll-outs, with Alphabet-owned Waymo and GM-owned Cruise both operating services in the city. By contrast, Wayve’s “end-to-end” self-driving system began its life around the tiny streets of Cambridge on an electric Renault Twizy.

Since then, it has been training its model on delivery vehicles for companies such as U.K. grocery delivery company Ocado, which invested $13.6 million in the startup.

Wayve’s approach to autonomous driving is similar to Tesla’s, but Wayve plans to sell its autonomous driving model to a variety of auto OEMs. The implication, of course, is that Wayve could garner a great deal more training data on which to improve its model, whereas Tesla can collect data only from drivers of its own vehicles. The company has not announced any such automotive partners yet, however.

Wayve calls its hardware-agnostic mapless product an “Embodied AI,” and it plans to distribute its platform not just to car makers but also to robotics companies serving manufacturers of all descriptions, allowing the platform to learn from human behavior in a wide variety of real-world environments. The company’s research on multimodal and generative models, known as LINGO and GAIA, will offer “language-responsive interfaces, personalized driving styles, and co-piloting,” the firm promises.

Wayve co-founder and CEO Alex Kendall told TechCrunch: “Seven years ago, we started the company to go build an embodied AI. We have been heads down building technology … What happened last year was everything really started to work.”

He said the key moment has been the automotive industry’s “step change” into having cameras surrounding new cars, from which Wayve can draw data for its autonomous platform: “Now their production vehicles are coming out with GPUs, surrounding cameras, radar, and of course the appetite to now bring AI onto, and enable, an accelerated journey from assisted to automated driving. So this fundraise is a validation of our technological approach, and gives us the capital to go and turn this technology into product and bring this product to market.”

He added that Wayve has big plans for robotics as well.

“Very soon you’ll be able to buy a new car, and it’ll have Wayve’s AI on it … Then this goes into enabling all kinds of embodied AI, not just cars, but other forms of robotics. I think the ultimate thing that we want to achieve here is to go way beyond where AI is today with language models and chatbots. But to really enable a future where we can trust intelligent machines that we can delegate tasks to, and of course they can enhance our lives and self-driving will be the first example of that.”

In a move that signified the importance of this fundraise more broadly to the U.K., Prime Minister Rishi Sunak issued a supporting statement, saying: “From the first electric light bulb or the World Wide Web, to AI and self-driving cars — the U.K. has a proud record of being at the forefront of some of the biggest technological advancements in history.”

“I’m incredibly proud that the U.K. is the home for pioneers like Wayve who are breaking ground as they develop the next generation of AI models for self-driving cars. The fact that a homegrown, British business has secured the biggest investment yet in a U.K. AI company is a testament to our leadership in this industry, and that our plan for the economy is working,” he said.

“We are leaving no stone unturned to create the economic conditions for businesses to grow and thrive in the U.K. We already have the third highest number of AI companies and private investment in AI in the world, and this announcement anchors the U.K.’s position as an AI superpower,” he added.

Also in a statement, Kentaro Matsui, managing partner at SoftBank Investment Advisers and a Wayve board member, said: “AI is revolutionizing mobility … The potential of this type of technology is transformative; it could eliminate 99% of traffic accidents. SoftBank Group is delighted to be at the forefront of this effort with Wayve, as advanced intelligence redefines mobility and connectivity, contributing to a more convenient and safer society.”

RSA: Google Enhances its Enterprise SecOps Offerings With Gemini AI

The RSA Conference, held in San Francisco from May 6-9, brings together cybersecurity professionals from across the world. This year’s conference is buzzing with conversation about generative AI: how to use generative AI to protect against attacks and how to secure generative AI itself.

We’re rounding up the enterprise business tech news from RSA that is most relevant for IT and tech decision-makers. This article will be updated throughout RSA with more tech news highlights.

Google updates Google Security Operations and more with Gemini AI

Google is combining the security capabilities of information security company Mandiant and malware scanner VirusTotal with Gemini AI and Google’s own user and device footprint in a new offering called Google Threat Intelligence. Available May 6 wherever Google Cloud Security is distributed, Google Threat Intelligence uses Gemini AI to get a top-down look at security data, competing with Microsoft’s Copilot for Security.

In addition, Google announced:

  • New curated detections for Google Security Operations that are designed to reduce manual processes and suggest outcomes relevant to the wider Google Cloud and updated to include recently-detected threats.
  • AI consulting services from Mandiant, which can red team both an organization’s AI defenses and how an organization’s security could be compromised by AI.
  • New services taking advantage of Gemini in Security.
Google Gemini analyzes malicious code in Google Threat Intelligence. Image: Google

IBM and AWS research: Generative AI’s unpredictable risks worry the C-suite

IBM and AWS published a report during RSA on how executives are thinking about securing generative AI. The report found that fewer than a quarter (24%) of respondents said they are including security as part of their generative AI projects — possibly a sign that hyperscalers have a niche to step into as the business of securing AI projects becomes more mainstream.

Most respondents were concerned about generative AI’s effect on security, with 51% saying they were worried about unpredictable risks and new security vulnerabilities arising, and 47% watching out for new attacks targeting AI. IBM pitched its Framework for Securing Generative AI, which was released in January 2024, as a solution.

Risk and governance frameworks will be key to help secure generative AI, IBM and AWS found in the report. In addition, IBM is extending its X-Force Red testing services to AI, including generative AI applications, MLSecOps pipelines and AI models.

SEE: It’s open season on Adobe’s Firefly and Content Credentials for select bug bounty hunters. (TechRepublic)

Proofpoint adds AI screening to email security products

At RSA, Proofpoint announced two novel email security services:

  • Pre-delivery semantic analysis, large language model-based detection of social engineering emails, to stop email fraud or malicious links before they reach Microsoft 365 and Google Workspace inboxes.
  • Adaptive Email Security, an Integrated Cloud Email Security solution with automatic quarantining and explanation of behavioral anomalies for high-value targets.

Both of these email security services are available May 6. Adaptive Email Security is available only on a rolling basis for select customers who already have standard email security packages and have identified high-risk employees.

Cisco and Splunk expand Cisco Hypershield

On May 6 at RSA, Cisco showed one of the first results of its March acquisition of Splunk. Cisco added two capabilities to its Cisco Hypershield data center and cloud security product, which can now:

  • Detect and block attacks from unknown vulnerabilities within runtime workload environments.
  • Isolate suspected workloads.

Cisco also announced that Cisco Identity Intelligence AI analytics are now available in the Cisco Duo security platform, adding specific tools to catch identity-based attacks.

Splunk announced on May 6 a new solution, Splunk Asset and Risk Intelligence, which is now in early access.

TechRepublic is covering RSA remotely.

DocuSign acquires AI-powered contract management firm Lexion

Kyle Wiggers

As DocuSign reportedly explores a sale to private equity, it’s acquiring a company itself.

On Monday, DocuSign announced that it’s buying Lexion, a contract workflow automation startup, for $165 million. The purchase comes as DocuSign makes increasing investments in the contract management space, most recently launching DocuSign IAM, a service aimed at connecting different components of the corporate agreement creation and negotiation process.

Lexion was incubated at the Allen Institute for Artificial Intelligence (AI2), the AI-focused research arm of the nonprofit Allen Institute. Lexion CEO Gaurav Oberoi founded the company together with former Microsoft research software development engineering lead Emad Elwany and engineering veteran James Baird; Oberoi previously co-founded survey platform Precision Polling, which SurveyMonkey acquired shortly after it launched.

Lexion began as a “smart” repository for contracts, letting legal teams ask natural language questions about documents. But it slowly expanded with tools to address various use cases and challenges in document creation for teams across not only legal departments, but sales, IT, HR and finance.

Lexion had raised $35.2 million in venture capital prior to the acquisition from investors including Khosla Ventures, Madrona, and Point72 Ventures.

According to DocuSign CEO Allan Thygesen, Lexion’s technology will enable DocuSign customers to gain a “more granular” understanding of their contract structures and data, as well as better identify insights and potential risks. DocuSign will tap Lexion’s AI models for contract creation and negotiations, while Lexion will build integrations with DocuSign’s products and solutions.

The purchase comes at a pivotal moment for DocuSign, valued at about $12.5 billion, which is said to be in the process of selling itself to a private equity firm. Perhaps in a bid to make its books more attractive to suitors, DocuSign in February announced plans to lay off ~6% of its workforce — some 400 jobs.

Reuters reported in January that Bain and Hellman & Friedman are among the final bidders in an auction for DocuSign, which could be one of the biggest leveraged buyouts in 2024.

DocuSign’s other acquisitions include SpringCM (in July 2018 for $220 million), a cloud platform for sales contract management, and Seal Software (in February 2020 for $188 million), a company specializing in AI-driven contract analytics.

HCLTech and AWS Partner to Help Enterprises Explore Gen AI Use Cases, PoC, and Solutions

HCLTech announced a global strategic collaboration agreement with Amazon Web Services (AWS) to help enterprises explore and develop GenAI-led use cases, proofs of concept (PoC), tools and solutions.

They will develop a structural framework with target-based milestones aligned to business strategy that enables the co-creation of customized Gen AI-led solutions and offers clients flexible consumption models.

The companies will work together to implement AWS GenAI services such as Amazon Bedrock, Amazon CodeWhisperer, Amazon SageMaker and Amazon Titan for enterprises across multiple industries, adding momentum to their digital transformation journeys.

Leveraging HCLTech’s full technology stack, core engineering capabilities and AI experience, this alliance will allow clients to see the impact of their GenAI investment and gain early access to AWS’s advanced GenAI services.

“This strategic collaboration agreement seeks to help enterprises unlock the value of GenAI by empowering them to reshape business models, elevate customer experiences and foster growth. A premier partner with a diverse range of AWS competencies, we are committed to accelerating the widespread adoption of AI to our global client base,” Prabhakar Appana, senior vice president and AWS Global Head, HCLTech said.

HCLTech recently earned the AWS Generative AI Competency Partner status for complementing AWS’s advanced GenAI portfolio with its own innovative GenAI solutions spanning various industries and enterprise functions.

HCLTech offers a unique set of end-to-end AI capabilities, from chip development to business process optimization. Leveraging strategic partnerships with AWS and many others, HCLTech is paving the way for the adoption of generative AI across industries.

The post HCLTech and AWS Partner to Help Enterprises Explore Gen AI Use Cases, PoC, and Solutions appeared first on Analytics India Magazine.

GitHub Copilot Workspace Makes Devin Sweat 

GitHub recently introduced Copilot Workspace, an AI-powered tool designed to ‘assist’ the developer. However, it’s not the first AI coding tool to make waves in the industry. Earlier this year, Cognition Labs announced Devin, which quickly grew popular.

The key difference between the two, as explained by Jonathan Carter, head of GitHub Next: “We don’t view GitHub Copilot Workspace as an ‘AI engineer’; we view it as an AI assistant to help developers be more productive and happier.”

While both Copilot Workspace and Devin aim to assist developers, they differ in their approaches and capabilities. The developer-tools space is also gaining competition as well-funded newcomers enter it. Recently, we saw Augment AI, another company with impressive talent, arrive with the intention of easing developers’ stress.

While Augment doesn’t have a product so far, the notable difference between Devin and GitHub Copilot Workspace is that Devin works more as an agent. Carter explained, “Devin includes a build/test/run agent that attempts to self-repair errors,” while “GitHub decided to focus on ‘optimising the core user experience’ for the Copilot Workspace technical preview, rather than including a similar feature.”

Are assistants better so far?

GitHub Copilot Workspace boasts several advantages that set it apart from Devin and other AI coding assistants.

Dan Shipper, a software engineer who tested Copilot Workspace, noted, “Copilot Workspace seamlessly integrates with GitHub features like issues, pull requests, and code reviews, making collaboration easier.”

This tight integration enables developers to collaborate more effectively and streamline their workflow, ultimately enhancing productivity.

Another advantage of Copilot Workspace is its mobile compatibility. As Shipper points out, “Developers can use Copilot Workspace on their mobile devices, allowing for greater flexibility and on-the-go collaboration.”

This feature proves particularly useful for quick code reviews or issue triaging, enabling developers to work efficiently even when away from their primary workstations.

Copilot Workspace also stands out for its adaptive learning capabilities. As developers interact with the tool, it learns from their codebase and adapts to their coding style over time. Shipper observed, “This leads to more accurate and relevant suggestions over time.”

This adaptive learning feature ensures the tool continuously improves its code suggestions, making it an increasingly valuable asset for developers.

On the other hand, Devin, the AI engineer, is a more autonomous entity capable of independently tackling complex development tasks. Devin’s ability to ‘self-repair errors’ suggests a higher level of autonomy and decision-making capacity compared to an AI assistant.

However, the actual performance of these tools may not always align with their marketed capabilities. In a video review, Santiago L. Valdarrama, a developer who got early access to Devin, tested the tool on various projects and found mixed results. While Devin successfully completed some tasks, such as building a Tic Tac Toe game and a digit classification application, it struggled with more complex projects like a lunar lander simulation.

Valdarrama noted that Devin often generated excessive or irrelevant code, requiring manual intervention and cleanup. He also observed that Devin’s ability to understand and implement complex requirements was limited, often leading to suboptimal or incorrect solutions.

While Devin generated significant buzz with its bold claims of being the world’s first AI software engineer, Copilot Workspace appears to be a more refined and developer-centric tool at this juncture. As Carter points out, “We definitely intend to explore a VS Code extension in the not-too-distant future.”

AI agents like Devin continue to advance in their reasoning and planning capabilities. While Devin’s current performance may not be impressive (it completes tasks accurately only about 14% of the time), it’s crucial to consider the rapid pace of AI development that led us here. When GPT was first introduced, it couldn’t reason strategically.

We’ve witnessed significant advancements in AI models, and if this trend continues, we may soon find ourselves in a world where autonomous agents can effectively handle complex problems and engage in long-term planning.

The post GitHub Copilot Workspace Makes Devin Sweat appeared first on Analytics India Magazine.

Large Action Models (LAMs): The Next Frontier in AI-Powered Interaction

Almost a year ago, Mustafa Suleyman, co-founder of DeepMind, predicted that the era of generative AI would soon give way to something more interactive: systems capable of performing tasks by interacting with software applications and human resources. Today, we're beginning to see this vision take shape with the development of Rabbit AI’s new AI-powered operating system, R1. This system has demonstrated an impressive ability to monitor and mimic human interactions with applications.

At the heart of R1 lies the Large Action Model (LAM), an advanced AI assistant adept at comprehending user intentions and executing tasks on their behalf. While previously known by other terms such as Interactive AI and Large Agentic Model, the concept of LAMs is gaining momentum as a pivotal innovation in AI-powered interactions.

This article explores the details of LAMs, how they differ from traditional large language models (LLMs), introduces Rabbit AI's R1 system, and looks at how Apple is moving towards a LAM-like approach. It also discusses the potential uses of LAMs and the challenges they face.

Understanding Large Action or Agentic Models (LAMs)

A LAM is an advanced AI agent engineered to grasp human intentions and execute specific objectives. These models excel at understanding human needs, planning complex tasks, and interacting with various models, applications, or people to carry out their plans. LAMs go beyond simple AI tasks like generating responses or images; they are full-fledged systems designed to handle complex activities such as planning travel, scheduling appointments, and managing emails. For example, in travel planning, a LAM would coordinate with a weather app for forecasts, interact with flight booking services to find appropriate flights, and engage with hotel booking systems to secure accommodations.

Unlike many traditional AI models that depend solely on neural networks, LAMs utilize a hybrid approach combining neuro-symbolic programming. This integration of symbolic programming aids in logical reasoning and planning, while neural networks contribute to recognizing complex sensory patterns. This blend allows LAMs to address a broad spectrum of tasks, marking them as a nuanced development in AI-powered interactions.
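
The travel-planning flow described above can be sketched as a symbolic plan executed through tool calls. Everything below is a hypothetical illustration: the function names and services are stand-ins, not real APIs.

```python
# Hypothetical sketch of a LAM-style planner: a symbolic plan (an explicit,
# ordered list of tool calls) whose steps are executed against external
# services. All function names here are illustrative stand-ins.

def check_weather(city):
    # Stand-in for a weather-service call.
    return {"city": city, "forecast": "clear"}

def book_flight(origin, dest):
    # Stand-in for a flight-booking service.
    return {"flight": f"{origin}->{dest}", "status": "booked"}

def book_hotel(city, nights):
    # Stand-in for a hotel-booking service.
    return {"hotel": city, "nights": nights, "status": "booked"}

def plan_trip(origin, dest, nights):
    """Symbolic plan: the reasoning layer decides the steps; each step is
    carried out by a tool that could be an arbitrarily complex model."""
    plan = [
        (check_weather, {"city": dest}),
        (book_flight, {"origin": origin, "dest": dest}),
        (book_hotel, {"city": dest, "nights": nights}),
    ]
    # A real system would branch on intermediate results, e.g. pick other
    # dates if the forecast is bad; this sketch just runs the steps in order.
    return [tool(**args) for tool, args in plan]

results = plan_trip("LHR", "SFO", nights=3)
print(results[1]["status"])  # prints "booked"
```

The split mirrors the neuro-symbolic idea: the plan itself stays explicit and inspectable, while the individual tools can hide whatever complexity they need.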

Comparing LAMs with LLMs

In contrast to LAMs, LLMs are AI agents that excel at interpreting user prompts and generating text-based responses, assisting primarily with tasks that involve language processing. However, their scope is generally limited to text-related activities. On the other hand, LAMs expand the capabilities of AI beyond language, enabling them to perform complex actions to achieve specific goals. For example, while an LLM might effectively draft an email based on user instructions, a LAM goes further by not only drafting but also understanding the context, deciding on the appropriate response, and managing the delivery of the email.

Additionally, LLMs are typically designed to predict the next token in a sequence of text and to execute written instructions. In contrast, LAMs are equipped not just with language understanding but also with the ability to interact with various applications and real-world systems such as IoT devices. They can perform physical actions, control devices, and manage tasks that require interacting with the external environment, such as booking appointments or making reservations. This integration of language skills with practical execution allows LAMs to operate across more diverse scenarios than LLMs.

LAMs in Action: The Rabbit R1

The Rabbit R1 stands as a prime example of LAMs in practical use. This AI-powered device can manage multiple applications through a single, user-friendly interface. Equipped with a 2.88-inch touchscreen, a rotating camera, and a scroll wheel, the R1 is housed in a sleek, rounded chassis crafted in collaboration with Teenage Engineering. It operates on a 2.3GHz MediaTek processor, bolstered by 4GB of memory and 128GB of storage.

At the heart of the R1 lies its LAM, which intelligently oversees app functionalities, and simplifies complex tasks like controlling music, booking transportation, ordering groceries, and sending messages, all from a single point of interaction. This way R1 eliminates the hassle of switching between multiple apps or multiple logins to perform these tasks.

The LAM within the R1 was initially trained by observing human interactions with popular apps such as Spotify and Uber. This training has enabled the LAM to navigate user interfaces, recognize icons, and process transactions, allowing the R1 to adapt fluidly to virtually any application. Additionally, a special training mode allows users to introduce and automate new tasks, continuously broadening the R1’s range of capabilities and making it a dynamic tool in the realm of AI-powered interactions.

Apple's Advances Towards LAM-Inspired Capabilities in Siri

Apple's AI research team has recently shared insights into their efforts to advance Siri's capabilities through a new initiative, resembling those of LAMs. The initiative, outlined in a research paper on Reference Resolution As Language Modeling (ReALM), aims to improve Siri's ability to understand conversational context, process visual content on the screen, and detect ambient activities. The approach adopted by ReALM in handling user interface (UI) inputs draws parallels to the functionalities observed in Rabbit AI's R1, showcasing Apple's intent to enhance Siri's understanding of user interactions.

This development indicates that Apple is considering the adoption of LAM technologies to refine how users interact with their devices. Although there are no explicit announcements regarding the deployment of ReALM, the potential for significantly enhancing Siri's interaction with apps suggests promising advancements in making the assistant more intuitive and responsive.

Potential Applications of LAMs

LAMs have the potential to extend their impact far beyond enhancing interactions between users and devices; they could provide significant benefits across multiple industries.

  • Customer Services: LAMs can enhance customer service by independently handling inquiries and complaints across different channels. These models can process queries using natural language, automate resolutions, and manage scheduling, providing personalized service based on customer history to improve satisfaction.
  • Healthcare: In healthcare, LAMs can help manage patient care by organizing appointments, managing prescriptions, and facilitating communication across services. They are also useful for remote monitoring, interpreting medical data, and alerting staff in emergencies, particularly beneficial for chronic and elderly care management.
  • Finance: LAMs can offer personalized financial advice and manage tasks like portfolio balancing and investment suggestions. They can also monitor transactions to detect and prevent fraud, integrating seamlessly with banking systems to quickly address suspicious activities.

Challenges of LAMs

Despite their significant potential, LAMs encounter several challenges that need addressing.

  • Data Privacy and Security: Given the broad access to personal and sensitive information LAMs need to function, ensuring data privacy and security is a major challenge. LAMs interact with personal data across multiple applications and platforms, raising concerns about the secure handling, storage, and processing of this information.
  • Ethical and Regulatory Concerns: As LAMs take on more autonomous roles in decision-making and interacting with human environments, ethical considerations become increasingly important. Questions about accountability, transparency, and the extent of decision-making delegated to machines are critical. Additionally, there may be regulatory challenges in deploying such advanced AI systems across various industries.
  • Complexity of Integration: LAMs require integration with a variety of software and hardware systems to perform tasks effectively. This integration is complex and can be challenging to manage, especially when coordinating actions across different platforms and services, such as booking flights, accommodations, and other logistical details in real-time.
  • Scalability and Adaptability: While LAMs are designed to adapt to a wide range of scenarios and applications, scaling these solutions to handle diverse, real-world environments consistently and efficiently remains a challenge. Ensuring LAMs can adapt to changing conditions and maintain performance across different tasks and user needs is crucial for their long-term success.

The Bottom Line

Large Action Models (LAMs) are emerging as a significant innovation in AI, influencing not just device interactions but also broader industry applications. Demonstrated by Rabbit AI's R1 and explored in Apple's advancements with Siri, LAMs are setting the stage for more interactive and intuitive AI systems. These models are poised to enhance efficiency and personalization across sectors such as customer service, healthcare, and finance.

However, the deployment of LAMs comes with challenges, including data privacy concerns, ethical issues, integration complexities, and scalability. Addressing these issues is essential as we advance towards broader adoption of LAM technologies, aiming to leverage their capabilities responsibly and effectively. As LAMs continue to develop, their potential to transform digital interactions remains substantial, underscoring their importance in the future landscape of AI.

The Relevance of RAG in the Era of Long-Context LLMs


The year 2024 seems to be one of long contexts, quite literally. There’s Anthropic’s Claude 3 with a context window of 200K tokens (up to 1 million for specific use cases) and Google’s Gemini 1.5 with a 1M-token context window.

Meta’s Llama 3 has become the latest muse for the online developer community, with users coming up with wild use cases every day, such as Gradient extending Llama 3 8B’s context length from 8K to over 1M tokens.

However, this is just the start because now it’s the race from long context length to infinite context length, with all big companies like Microsoft, Google, and Meta taking strides in this direction.

All this has ignited the ‘long context vs RAG’ debate yet again.

Respectfully disagree,
Long context is all you need. https://t.co/JQ87DXRegj

— Yam Peleg (@Yampeleg) May 1, 2024

Does RAG Really Suck?

RAG, or retrieval augmented generation, was introduced as a solution to address the challenge of LLM hallucinations by extending the model’s capabilities to external sources, vastly widening the scope of accessible information.
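
The retrieval step that makes this work can be sketched in a few lines. This is a deliberately naive illustration: scoring is plain word overlap, where a real pipeline would use embeddings and a vector index.

```python
# Minimal RAG sketch: pick the most relevant documents for a query and
# prepend only those to the prompt, rather than the whole corpus.
# Scoring is naive word overlap, purely for illustration.

def score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def retrieve(query, corpus, k=2):
    # Return the k highest-scoring documents.
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query, corpus, k=2):
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "The Eiffel Tower is in Paris.",
    "Llamas are domesticated camelids.",
    "Paris is the capital of France.",
]
prompt = build_prompt("Where is the Eiffel Tower?", corpus)
print(prompt)
```

Only the retrieved documents ever reach the model, which is the whole cost argument later in this piece.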

This seemed to be a good idea, for until just about a year ago context windows were somewhere in the range of 4K to 8K tokens. But now, with long context, if we can stuff a million tokens into an LLM, which is thousands of pages of text or hundreds of documents, why would we still need an index to store those documents?

This has left many wondering if ‘it’s time-up for RAG?’

RAG is bad.
RAG is a makeshift solution.
RAG will die soon.

— Pratik Desai (@chheplo) April 29, 2024

After all, RAG comes with its own set of limitations. It is most useful in ‘knowledge-intensive’ situations, where a model needs to satisfy a specific ‘information need’, but it is less effective in ‘reasoning-intensive’ tasks.

Models can get distracted by irrelevant content, especially in lengthy documents where the answer isn’t clear. At times, they just ignore the information in retrieved documents, opting instead to rely on their built-in memory.

Using RAG can be costly due to the hardware requirements for running it at scale, as retrieved documents must be temporarily stored in memory for the model to access. Another expenditure is the compute for the increased context the model has to process before generating its response.

Are Long-Context LLMs Foolproof?

If long context LLMs can really replace RAG, then they should also be able to retrieve specific facts from the context you give them, reason about them, and return an answer based on them. But guess what? Even long-context LLMs still struggle with hallucinations.

The paper ‘Lost in the Middle: How Language Models Use Long Contexts’ explains that LLMs exhibit high information retrieval accuracy at the document’s start and end. However, this accuracy declines in the middle, especially with increased input processing.

Another analysis, called ‘Needle In A Haystack’, which tests reasoning and retrieval in long-context LLMs, highlights that as the number of needles (correct facts spread across the context to be retrieved) goes up, the performance of an LLM goes down.

While short-context LLMs (about a thousand tokens) can effectively retrieve facts from the given context, an LLM’s ability to recall details diminishes as the context lengthens, particularly for facts from earlier parts of the document.
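
A needle-in-a-haystack style probe can be sketched as follows. The `ask_model` function here is a stand-in with perfect recall; a real harness would replace it with an LLM API call and sweep needle depth and context length.

```python
# Sketch of a needle-in-a-haystack probe: plant a fact (the "needle") at a
# chosen depth in filler text, then ask the model to recall it. A real run
# would record accuracy per (depth, context length) cell.

FILLER = "The sky is blue. " * 50
NEEDLE = "The secret code is 4217."

def build_context(depth):
    """depth in [0, 1]: 0 places the needle at the start, 1 at the end."""
    cut = int(len(FILLER) * depth)
    return FILLER[:cut] + NEEDLE + " " + FILLER[cut:]

def ask_model(context, question):
    # Stand-in model. Replace with a real API call to measure how recall
    # degrades in the middle of long contexts.
    return "4217" if NEEDLE in context else "unknown"

for depth in (0.0, 0.5, 1.0):
    answer = ask_model(build_context(depth), "What is the secret code?")
    print(f"depth={depth}: {answer}")
```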

This issue stems from what is known as recency bias in LLMs: owing to its underlying training, the model puts greater emphasis on more recent or nearby tokens when predicting the next token. As a result, the LLM learns to attend to recent tokens more than earlier ones when generating answers, which hurts retrieval tasks.


Then there are high cost, high token usage, and high latency issues to consider when choosing long context LLMs over RAG.

If you ditch RAG and instead stuff all your documents into the LLM’s context, the model must handle up to a million tokens for every query. With Gemini 1.5 Pro, priced at roughly $7 per million tokens, you are essentially paying that amount every time the full million tokens are utilised in a query.

The price difference is stark: the cost per call with RAG is a fraction of the $7 a single full-context query to Gemini 1.5 Pro requires, especially for applications with frequent queries.
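The arithmetic is easy to sketch. The price follows the figure quoted above; the 4,000-token per-query RAG budget is an assumption chosen for illustration:

```python
PRICE_PER_M_TOKENS = 7.00   # illustrative: ~$7 per million tokens (Gemini 1.5 Pro)

def query_cost(tokens_per_query, queries):
    # Cost = (tokens per query / 1M) * price per 1M tokens * number of queries
    return tokens_per_query / 1_000_000 * PRICE_PER_M_TOKENS * queries

full_context = query_cost(1_000_000, queries=1_000)  # stuff everything in, every time
rag = query_cost(4_000, queries=1_000)               # assume ~4k retrieved tokens/query

print(f"full context: ${full_context:,.2f}")   # $7,000.00
print(f"RAG:          ${rag:,.2f}")            # $28.00
```

Over a thousand queries, the assumed retrieval budget is two orders of magnitude cheaper than resending the full million tokens each time.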

Additionally, there’s latency to consider: each time a user submits a query, the complete context must be sent to Gemini to obtain results, which can introduce significant overhead.

Another thing: even if you CAN load in your entire database, even if it were enterprise scale… would you seriously be wasting so much money on compute when a simple RAG can load in the right data instead of millions upon millions of tokens for no good reason? Plus the approach…

— zee (@zeeb0t) April 30, 2024

RAG is versatile across various domains like customer service, educational tools, and content creation. In contrast, when using long context LLMs you might need to ensure that all necessary information is correctly fed into the system with each query.

It requires continuously updating the context for different applications, such as switching from customer service to education, which can be inconvenient and repetitive.

What about for proprietary info within companies? Especially info that updates often. Even a super smart model won't have access to that info.

— Colonel Tasty (@JoshhuaSays) April 29, 2024

Once the information is stored in a database for RAG, it remains accessible and only needs updating when data changes, like adding new products. This is simpler compared to using long context LLMs, where adjusting context repeatedly is necessary, potentially leading to high costs and inefficiency in obtaining optimal outputs.
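That update workflow can be sketched with a minimal in-memory index. The `DocumentIndex` class and the product entries are hypothetical; the point is that only the changed entry is touched while everything else stays queryable:

```python
class DocumentIndex:
    """Minimal in-memory index: documents remain available until their
    content changes, so updates touch only the affected entries."""

    def __init__(self):
        self.docs = {}

    def upsert(self, doc_id, text):
        # Add a new document, or replace a stale version in place.
        self.docs[doc_id] = text

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)

    def search(self, keyword):
        # Naive keyword match standing in for vector search.
        return [i for i, t in self.docs.items() if keyword.lower() in t.lower()]

index = DocumentIndex()
index.upsert("p1", "Blue running shoes, sizes 6-12.")
index.upsert("p2", "Red rain jacket, waterproof.")
index.upsert("p1", "Blue running shoes, sizes 6-13, now in stock.")  # product updated

print(index.search("shoes"))   # ['p1']
```

With a long-context-only setup there is no equivalent of `upsert`: every change means rebuilding and resending the entire context on the next query.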

RAG Survives The Day

Long context doesn’t kill RAG it only enhances it (more documents, longer references). The thing it does do is actually make fine tuning more of a niche (for very specific tasks). RAG is search, literally not going anywhere even if you have 1 billion context you’ll still use…

— anton (@abacaj) April 30, 2024

It’s not long context versus RAG: combining RAG with long-context LLMs creates a powerful system that can retrieve and analyse data effectively and efficiently at scale.

RAG is no longer limited to vector-database matching; many advanced RAG techniques are being introduced that significantly improve retrieval.

One example is the integration of Knowledge Graphs (KGs) into RAG: by leveraging the structured, interlinked data in KGs, the reasoning capabilities of current RAG systems can be greatly enhanced.
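As a rough illustration of the idea (the triples and the `neighbours` helper are hypothetical, with facts drawn from elsewhere in this digest), a KG-backed retrieval step walks explicit relations instead of matching vectors:

```python
# Toy knowledge graph as (subject, relation, object) triples; a real
# system would query a graph database instead of a Python list.
TRIPLES = [
    ("Wayve", "founded_in", "Cambridge"),
    ("Wayve", "develops", "end-to-end self-driving"),
    ("SoftBank", "led_round_for", "Wayve"),
]

def neighbours(entity):
    # Every fact that mentions the entity: structured, interlinked
    # context a RAG system can hand to the model for multi-hop reasoning.
    return [(s, r, o) for s, r, o in TRIPLES if entity in (s, o)]

print(neighbours("Wayve"))
```

Because the relations are explicit, the retrieved facts can be chained (entity to entity) in a way that flat text chunks pulled by vector similarity cannot.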

Also, there are many ongoing efforts to train models to make better use of RAG-retrieved documents.

Some approaches involve models that can autonomously decide when to access documents, or even opt not to retrieve any if deemed unnecessary. Additional efforts are concentrated on developing more efficient ways to index massive datasets and enhance document search capabilities beyond mere keyword matching.

There’s the concept of representation indexing, which uses an LLM to summarise documents, creating a compact, embedded summary for retrieval. Another technique, ‘Raptor’, addresses questions that span multiple documents by building higher levels of abstraction, making it particularly useful for queries that draw on concepts from several documents.

Methods like Raptor go really well with long context LLMs because you can just embed full documents without any chunking.
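A minimal sketch of representation indexing: a stub first-sentence summariser stands in for the LLM, and word overlap stands in for embedding similarity (all names and documents here are hypothetical):

```python
def summarise(doc):
    # Stub summariser: take the first sentence. In a real pipeline an
    # LLM would produce the compact summary that gets embedded.
    return doc.split(".")[0] + "."

def build_summary_index(docs):
    # Map each short summary to its full document.
    return {summarise(d): d for d in docs}

def retrieve_full(query, index):
    # Match the query against the summaries, then return the full
    # document (word overlap stands in for embedding similarity).
    def overlap(a, b):
        return len(set(a.lower().split()) & set(b.lower().split()))
    best = max(index, key=lambda s: overlap(query, s))
    return index[best]

docs = [
    "Raptor builds a tree of summaries. Each level abstracts the one below it.",
    "Vector search matches embeddings. It scales with good indexing.",
]
index = build_summary_index(docs)
print(retrieve_full("How does Raptor build summaries?", index))
```

The pairing with long context falls out naturally: the compact summaries do the matching, and the long window lets you return the full, unchunked document to the model.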

So, time to finally settle the debate. No, RAG isn’t dead, but it is likely to change and improve. The developer ecosystem is already experimenting — for instance, building RAG apps with Llama-3 running locally — and enterprises are shipping new developments such as Rovo, an AI-powered knowledge discovery tool unveiled by Atlassian.

The post The Relevance of RAG in the Era of Long-Context LLMs appeared first on Analytics India Magazine.

Tessolve’s New SMARC Module Powers AI in Robotics and Transportation


Tessolve, a global silicon and systems solutions provider, has collaborated with Renesas to introduce a new SMARC module based on Renesas’s high-performance RZ/V2H MPU.

The technology collaboration aims to deliver a ready-to-integrate system on module and computer vision system solution for OEMs, streamlining the product realisation cycle.

SMARC SOM Features

The SMARC SOM, compliant with version 2.1 of the SMARC standard, incorporates four Arm Cortex-A55 CPU cores, two Cortex-R8 cores, one Cortex-M33, and a dedicated AI accelerator, DRP-AI3, capable of 8 TOPS (dense) and 80 TOPS (sparse).

The module runs Yocto Linux and supports up to 16GB of low-power DDR4 RAM and up to 64GB of flash storage. It also features high-speed interfaces, including PCIe, USB 3.2, SDIO, MIPI CSI, MIPI DSI, and Gigabit Ethernet, and is available in industrial temperature grade.

Efficient Management of Vision AI and Real-Time Control Tasks

The combination of these cores enables efficient management of vision AI and real-time control tasks. With low power consumption that eliminates the need for fans and other cooling components, the RZ/V2H is an ideal fit for autonomous robots, smart cameras, and machine vision in factory automation.

“Tessolve’s SMARC SOM, based on Renesas RZ/V2H MPU, aims to deliver an AI-powered computer vision system with a 360° surround view solution to the Industrial, robotics, and transportation markets, accelerating OEMs’ time to market,” said Kiran Kumar Nagendra, AVP-Embedded Systems, Tessolve.

“The computer vision system solution supports up to 4 camera inputs, enabling 360° surround view capabilities, making it ideal for demanding applications with AI requirements of the future,” he added.

Tessolve’s system solution can be adopted by OEMs either ‘as is’ or can be customised for their needs. The company also accelerates OEMs’ time to market with exceptional ODM abilities, offering white labelling.

The post Tessolve’s New SMARC Module Powers AI in Robotics and Transportation appeared first on Analytics India Magazine.

5 Free Stanford AI Courses


We are in an era where Artificial Intelligence is applied across many businesses and AI tools deliver real value. With so many benefits on offer, plenty of companies are looking for AI experts who can deliver that value.

Do you want to be one of these AI experts? Well, you can do it now. For Free!

Many courses from top universities, such as Stanford University, are open for free access.

So, what are these free AI courses from Stanford University? Let’s get into it.

StanfordOnline: Statistical Learning with Python

To start with Artificial Intelligence in the modern era, you should understand statistical methodology and Python programming.

Statistics is important because its concepts underpin many foundational AI tools and are key to understanding how AI works; solid statistical knowledge will help you build better AI.

Python, in turn, is the programming language used to build modern AI tools, so it’s important to learn it as well.

The Statistical Learning with Python course from Stanford University teaches basic statistical methods and supervised learning, using Python throughout. You will become familiar with how statistical modeling works, which will also help you with the next course.

Stanford also provides free book material to accompany this course, which you can access on their website.

CS229: Machine Learning

Machine Learning may be a subset of Artificial Intelligence, but modern AI tools require a strong Machine Learning foundation. To develop great AI, we need machine learning knowledge, as it lets us steer where the AI goes and improve its output.

You would learn many Machine Learning foundations in the CS229 Machine Learning course at Stanford University. These include:

  • Supervised Learning
  • Unsupervised Learning
  • Learning Theory
  • Reinforcement Learning
  • Adaptive Theory
  • Machine Learning Applications

This course would strengthen your Machine Learning foundation and, in turn, Artificial Intelligence.

Intro to Artificial Intelligence

Let’s learn the basics of Artificial Intelligence with the Intro to Artificial Intelligence course from Stanford University. This course can kick-start your AI learning, packing all the foundations you need into a single course.

Across 22 lessons and 9 practice exams, you will learn several concepts, including:

  • Probability
  • Machine Learning
  • Game Theory
  • Computer Vision
  • Robotics
  • Natural Language Processing

These lessons would give you the confidence to explore artificial intelligence in depth.

CS221: Artificial Intelligence: Principles and Techniques

The next course offers more in-depth learning about AI, exploring well beyond the foundations in CS221: Artificial Intelligence: Principles and Techniques by Stanford University.

The course covers many Artificial Intelligence techniques for implementing the various AI systems you will need to know, including:

  • Search
  • Markov decision processes
  • Game playing
  • Constraint satisfaction
  • Graphical models
  • Logic

It’s quite an advanced course, so you may want to revisit the foundational courses Stanford recommends before taking this one.

The AI Awakening: Implications for the Economy and Society

Lastly, to become a great AI expert, we also need to understand AI’s impact. We might grasp the technical aspects, but knowing how to make a difference with AI is what separates us from other talent.

The AI Awakening: Implications for the Economy and Society course by Stanford University offers a different perspective for technical people: instead of teaching you how to build AI systems, it focuses on their impact and risks.

It’s a short course you can finish in a day, but it covers material every AI expert should know.

Conclusion

This collection from Stanford University will help you learn about AI and prepare you to become that expert.

If you are more interested in data science, read the 5 Free Stanford University Courses to Learn Data Science.

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.


Stack Overflow signs deal with OpenAI to supply data to its models

Kyle Wiggers / 8 hours

OpenAI is collaborating with Stack Overflow, the Q&A forum for software developers, to improve its generative AI models’ performance on programming-related tasks.

As a result of the partnership, announced Monday, OpenAI’s models, including models served through its ChatGPT chatbot platform, should get better over time at answering programming-related questions, the two companies say. At the same time, Stack Overflow will benefit from OpenAI’s expertise in developing new generative AI integrations on the Stack Overflow platform.

The first set of features will go live by the end of June.

The tie-up with OpenAI is a remarkable reversal for Stack Overflow, which initially banned ChatGPT-generated answers on its platform over fears of spammy responses.

Stack Overflow began experimenting with generative AI features last April, promising to craft models that “reward” devs who contribute knowledge to the platform. In July, the company launched a conversational search tool that lets users pose queries and receive answers based on Stack Overflow’s database of over 58 million questions and answers, along with tools for businesses to fine-tune searches on their own documentation and knowledge bases.

Some members of Stack Overflow’s developer community rebelled against the changes, pointing out concerns related to the validity of information generated by AI, information overload, and data privacy for individual contributors on the platform.

There was at least some basis for those concerns. An analysis of more than 150 million lines of code committed to project repos over the past several years by GitClear found that generative AI dev tools are resulting in more mistaken code being pushed to codebases. Elsewhere, security researchers have warned that such tools can amplify existing bugs and security issues in software projects.

But despite the apparent flaws, developers are embracing generative AI tools for at least some coding tasks. In a Stack Overflow poll from June 2023, 44% of developers said they use AI tools in their development process now, while 26% plan to soon.

This has precipitated something of an existential crisis for Stack Overflow. Traffic to the platform has reportedly dipped significantly since the release of capable new generative AI models last year — models that in many cases were trained on data from Stack Overflow.

So now, as it cuts costs, Stack Overflow is pursuing licensing agreements with AI providers.

The company’s deal with OpenAI — the financial terms of which weren’t disclosed — comes after Stack Overflow partnered with Google to enrich Google’s Gemini models with Stack Overflow data and work with Google to bring more AI-powered features to its platform. Stack Overflow stressed at the time that the agreement wasn’t exclusive — and indeed, that turned out to be the case.

Prashanth Chandrasekar, CEO of Stack Overflow, previously said that 10% of the platform’s nearly 600 staff was focused on its AI strategy, and has described potential additional revenue from the strategy as key to ensuring Stack Overflow can keep attracting users and maintaining high-quality information.

“Stack Overflow is the world’s largest developer community,” Prashanth Chandrasekar said in a press release this morning. “Through [our] industry-leading partnership with OpenAI, we strive to redefine the developer experience, fostering efficiency and collaboration through the power of community, best-in-class data, and AI experiences. Our goal with OverflowAPI, and our work to advance the era of socially responsible AI, is to set new standards with vetted, trusted, and accurate data that will be the foundation on which technology solutions are built and delivered to our user.”