Pixel 8 Pro becomes the first smartphone powered by Google’s new AI model, Gemini

Sarah Perez (@sarahintampa)

The Pixel 8 Pro will now become the first Android smartphone to be powered by Google’s next-generation AI model, Gemini, starting today, the company announced. Gemini Nano, a version of the model designed for running on-device, as on smartphones, will now leverage Google’s Tensor G3 to deliver two Pixel 8 Pro features, Summarize in Recorder and Smart Reply in Gboard. Because the AI runs on-device, it will help keep sensitive data from leaving the phone as well as allow for the use of the features even if you’re without a network connection.

Gemini Nano is the most efficient version of Google’s new AI model, designed to run on devices, like smartphones. With its launch, it will now be the brains behind an AI summarizer feature in the Recorder app on the Pixel 8 Pro. The Recorder app, which lets users push a button to record and transcribe audio, will now include a Gemini-powered summary of your recorded conversations, interviews, presentations or other audio. With the shift to Gemini Nano, users will be able to get these summaries even if they don’t have a signal or Wi-Fi connection available.

This feature will be available on Pixel 8 Pro devices starting today. In addition, the Recorder app now transcribes in 28 new languages.

Meanwhile, Gemini Nano is coming to Gboard, Google’s keyboard app, as a developer preview. Here, it will power a feature called Smart Reply which helps to suggest the next thing you’ll want to say when having a conversation in a messaging app. The feature will initially work with WhatsApp, but will come to more apps in 2024, Google says. With Gemini Nano, the responses suggested will be of higher quality, the company claims, as the AI model will have more conversational awareness.

In addition to running Gemini Nano on the device, Android users will be able to tap into the new Gemini model via Bard. Early next year, this will also include powering Assistant with Bard on Pixel devices, essentially upgrading Google’s answer to Siri with a model that’s as smart as, and in some areas even more capable than, something like ChatGPT.

Longer-term, the impacts of improved AI on devices could prompt people to make the switch to Android. Combined with the launch of Beeper, an app bringing iMessage to Android, some are already considering the possibility of moving to their preferred Android device. Plus, Google’s advances in AI are making its Pixel phone a more AI-enabled smartphone, with a bevy of AI features, like Magic Editor and Best Take in Google Photos, Assistant with Bard, AI-powered call screening and more.

Alongside the Gemini-powered features, the Pixel 8 Pro received other new features, Google announced today via a blog post. This includes a Video Boost feature that leverages the Tensor G3 to upload video to the cloud where computational models adjust the video’s color, lighting, stabilization, and graininess. This also enables Night Sight video on the Pixel 8 Pro, which uses AI to apply noise reduction to videos recorded at nighttime or in other low-light conditions, including timelapse videos.

Another new AI model in Google Photos will also improve Portrait light, while a Photo Unblur feature is now able to sharpen images of dogs and cats who wiggle during photos.

Other Pixel devices are getting upgrades, too, like a “Dual Screen Preview” mode on Pixel Fold, for previewing photos.

Plus, Pixel Fold and Pixel 6 and newer phones will be able to connect to a computer via USB to use the phone as the webcam on video calls. Those devices will also now suggest “contextual replies” during Call Screen, which has arrived on the Pixel Watch, too. Pixel’s Direct My Call and Hold for Me features have expanded to business numbers without a toll-free prefix and are now available in the U.K.

A new “Clean” feature will remove smudges and stains from scanned documents. A few other features around security, authentication, new widgets and others focused on the Pixel Watch and Pixel Tablet are detailed on the Google blog.

Updated, 12/6/23, 4:50 PM ET with more details about the Pixel feature drop.

Google Reveals Gemini, Its Much-Anticipated Large Language Model

Google has revealed Gemini, its long-rumored large language model and rival to GPT-4. Global users of Google Bard and the Pixel 8 Pro will be able to run Gemini starting now; an enterprise product, Gemini Pro, is coming on Dec. 13. Developers can sign up now for an early preview in Android AICore.

Jump to:

  • What is Gemini?
  • Does Gemini have an enterprise product?
  • Gemini’s timing compared to other popular LLMs

What is Gemini?

Gemini is a large language model that runs generative artificial intelligence applications; it can summarize text, create images and answer questions. Gemini was trained on Google’s Tensor Processing Units v4 and v5e.

Google’s Bard is a generative AI based on the PaLM large language model. Starting today, Gemini will be used to give Bard “more advanced reasoning, planning, understanding and more,” according to a Google press release.

Gemini size options

Gemini comes in three model sizes: Ultra, Pro and Nano. Ultra is the most capable, Nano is the smallest and most efficient, and Pro sits in the middle for general tasks. The Nano version is what Google is using on the Pixel, while Bard gets Pro. Google says it plans to run “extensive trust and safety checks” before releasing Gemini Ultra to select groups.

Gemini for coding

Gemini can code in Python, Java, C++, Go and other popular programming languages. Google used Gemini to upgrade Google’s AI-powered code generation system, AlphaCode.

Gemini will be added to more Google products

Next, Google plans to bring Gemini to Ads, Chrome and Duet AI. In the future, Gemini will be used in Google Search as well.

Competitors to Gemini

Gemini and the products built with it, such as chatbots, will compete with OpenAI’s GPT-4, Microsoft’s Copilot (which is based on OpenAI’s GPT-4), Anthropic’s Claude AI, Meta’s Llama 2 and more. Google claims Gemini Ultra outperforms GPT-4 in several benchmarks, including the massive multitask language understanding general knowledge test and in Python code generation.

Does Gemini have an enterprise product?

Starting Dec. 13, enterprise customers and developers will be able to access Gemini Pro through the Gemini API in Google’s Vertex AI or Google AI Studio.

Google expects Gemini Nano to be generally available for developers and enterprise customers in early 2024. Android developers can use this LLM to build on-device Gemini apps through Android AICore.

Possible enterprise use cases for Gemini

Of particular interest to enterprise use cases might be Gemini’s ability to “understand and reason about users’ intent,” said Palash Nandy, engineering director at Google, in a demonstration video. Gemini generates a bespoke UI depending on whether the user is looking for images or text. In the same UI, Gemini will flag areas in which it doesn’t have enough information and ask for clarification. Through the bespoke UI, the user can explore other options with increasing detail.

Gemini was trained on multimodal content from the very beginning, instead of starting with text and expanding to audio, images and video later, which lets it parse written or visual information with equal acuity. One example Google provides of how this might be useful for business is the prompt “Could Gemini help make a demo based on this video?” in which the AI translates video content into an original animation.

Gemini’s timing compared to other popular LLMs

Gemini has been hotly rumored, as Google tries to compete with OpenAI. The New York Times reported Google executives were “shaken” by OpenAI’s tech in January 2023. More recently, Google supposedly struggled with releasing Gemini in languages other than English, leading to a delay of an in-person launch event.

However, releasing Google’s own large language model after ChatGPT has received gradual GPT-4-powered updates for nearly a year means Google has the advantage of leapfrogging the last year of AI development. For example, Gemini is multimodal (i.e., able to work with text, video, speech and code) and lives natively on the Google Pixel 8 Pro. Users can access Gemini on their Pixel 8 Pro without an internet connection, unlike ChatGPT, which started out in a browser.

Meta rolls out its AI-powered image generator as a dedicated website

Meta's AI image generator now works as a dedicated website

Using Meta's AI image generator to create images is now more convenient, thanks to its new dedicated website. Previously available only through individual and group chats in Meta's social network platforms, the Imagine tool is now freely accessible on the web for anyone in the US to try.

To take the tool for a spin, head to Meta's Imagine website. You'll need to log in before you can start generating images, which requires a free Meta account. After you sign in, the site works just like any other AI image generator. Type a description of the image you want. Try to be precise yet imaginative. Click the Generate button, and Imagine cooks up four different images that hopefully match what you imagined.

To grab any of the images, select it, click the ellipsis icon, and select Download. The image is saved as a JPG file, which you can view and modify in any image editor. You'll also notice that each image is stamped with a watermark in the corner, identifying it as one "imagined with AI."

Prior to its debut as a website, Meta's Imagine tool was designed to generate and share images in a chat using Facebook, Instagram, or WhatsApp. The tool works as part of the company's Meta AI, which lets you pose a question, submit a request, ask for information, or solicit recommendations. You can chat directly with the AI chatbot or include it in a group conversation. To use the image generator in a chat, just type /imagine at the prompt, followed by a description of the image you want.

The image generation is based on Meta's Emu model, built to fashion high-quality, photorealistic images within seconds. In my testing of the Imagine website, the image generator was able to produce realistic images within a few seconds, seemingly faster than I've experienced with other image generators.

"We've enjoyed hearing from people about how they're using imagine, Meta AI's text-to-image generation feature, to make fun and creative content in chats," Meta said in a Wednesday news post announcing the website and other new AI features. "Today, we're expanding access to imagine outside of chats, making it available in the US to start at imagine.meta.com. While our messaging experience is designed for more playful, back-and-forth interactions, you can now create free images on the web, too."

Google’s Gemini isn’t the generative AI model we expected

Kyle Wiggers

Google’s long-promised, next-gen generative AI model, Gemini, has arrived. Sort of.

The version of Gemini launching this week, Gemini Pro, is essentially a lightweight offshoot of a more powerful, capable Gemini model set to arrive… sometime next year. But I’m getting ahead of myself.

Yesterday in a virtual press briefing, members of the Google DeepMind team — the driving force behind Gemini, alongside Google Research — gave a high-level overview of Gemini (technically “Gemini 1.0”) and its capabilities.

Gemini, as it turns out, is actually a family of AI models — not just one. It comes in three flavors:

  • Gemini Ultra, the flagship Gemini model
  • Gemini Pro, a “lite” Gemini model
  • Gemini Nano, which is distilled to run on mobile devices like the Pixel 8 Pro*

*To make matters more confusing, Gemini Nano comes in two model sizes, Nano-1 (1.8 billion parameters) and Nano-2 (3.25 billion parameters) — targeting low- and high-memory devices, respectively.
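
To get a feel for why the two Nano sizes target different memory budgets, here is a minimal back-of-the-envelope sketch. The parameter counts are the ones quoted above; the bytes-per-parameter values are illustrative assumptions about weight precision, not figures from Google, and the estimate ignores activations and runtime overhead.

```python
# Rough weight-storage estimate for the two Gemini Nano sizes quoted above.
# Parameter counts come from the article; the precisions below are
# hypothetical choices made only for illustration.

NANO_PARAMS = {
    "Nano-1": 1.8e9,
    "Nano-2": 3.25e9,
}

# Assumed storage width per parameter, in bytes.
BYTES_PER_PARAM = {
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def estimate_all() -> dict:
    """Estimate weight storage for every (model, precision) combination."""
    return {
        (model, precision): weight_size_gb(params, width)
        for model, params in NANO_PARAMS.items()
        for precision, width in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for (model, precision), size in estimate_all().items():
        print(f"{model} @ {precision}: ~{size:.2f} GB")
```

Under these assumptions, the gap between the two sizes is a factor of roughly 1.8 at any given precision, which is consistent with Google pitching them at low- and high-memory devices.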

The easiest place to try Gemini Pro is Bard, Google’s ChatGPT competitor, which as of today is powered by a fine-tuned version of Gemini Pro — at least in English in the U.S. (and only for text, not images). Sissie Hsiao, GM of Google Assistant and Bard, said during the briefing that the fine-tuned Gemini Pro delivers improved reasoning, planning and understanding capabilities over the previous model driving Bard.

We can’t independently confirm any of those improvements, I’ll note. Google didn’t allow reporters to test the models prior to their unveiling and, indeed, didn’t give live demos during the briefing.

Gemini Pro will also launch December 13 for enterprise customers using Vertex AI, Google’s fully managed machine learning platform, and then head to Google’s Generative AI Studio developer suite. (Some eagle-eyed users have already spotted Gemini model versions appearing in Vertex AI’s model garden.) Elsewhere, Gemini will arrive in the coming months in Google products like Duet AI, Chrome and Ads, as well as Search as a part of Google’s Search Generative Experience.

Gemini Nano, meanwhile, will launch soon in preview via Google’s recently released AI Core app, exclusive to Android 14 on the Pixel 8 Pro for now; Android developers interested in incorporating the model into their apps can sign up today for a sneak peek. On the Pixel 8 Pro first and other Android devices in the future, Gemini Nano will power features that Google previewed during the Pixel 8 Pro’s unveiling in October, like summarization in the Recorder app and suggested replies for supported messaging apps (starting with WhatsApp).

Natively multimodal

Gemini Pro — or at least the fine-tuned version of Gemini Pro powering Bard — isn’t much to write home about.

Hsiao says that Gemini Pro is more capable at tasks such as summarizing content, brainstorming and writing, and outperforms OpenAI’s GPT-3.5, the predecessor to GPT-4, in six benchmarks, including one (GSM8K) that measures grade school math reasoning. But GPT-3.5 is over a year old — hardly a challenging milestone to surpass at this point.

So what about Gemini Ultra? Surely it must be more impressive?

Somewhat.

Like Gemini Pro, Gemini Ultra was trained to be “natively multimodal” — in other words, pre-trained and fine-tuned on a large set of codebases, text in different languages, audio, images and videos. Eli Collins, VP of product at DeepMind, claims that Gemini Ultra can comprehend “nuanced” information in text, images, audio and code and answer questions relating to “complicated” topics, particularly math and physics.

In this respect, Gemini Ultra does several things better than rival OpenAI’s own multimodal model, GPT-4 with Vision, which can only understand the context of two modalities: words and images. Gemini Ultra can transcribe speech and answer questions about audio and videos (e.g. “What’s happening in this clip?”) in addition to art and photos.

“The standard approach to creating multimodal models involves training separate components for different modalities,” Collins said during the briefing. “These models are pretty good at performing certain tasks like describing an image, but they really struggle with more complicated conceptual and complicated reasoning tasks. So we designed Gemini to be natively multimodal.”

I wish I could tell you more about Gemini’s training datasets — I’m curious myself. But Google repeatedly refused to answer questions from reporters about how it collected Gemini’s training data, where the training data came from and whether any of it was licensed from a third party.

Collins did reveal that at least a portion of the data was from public web sources and that Google “filtered” it for quality and “inappropriate” material. But he didn’t address the elephant in the room: whether creators who might’ve unknowingly contributed to Gemini’s training data can opt out or expect/request compensation.

Google’s not the first to keep its training data close to the chest. The data isn’t only a competitive advantage, but a potential source of lawsuits pertaining to fair use. Microsoft, GitHub, OpenAI and Stability AI are among the generative AI vendors being sued in motions that accuse them of violating IP law by training their AI systems on copyrighted content, including artwork and e-books, without providing the creators credit or pay.

OpenAI, joining several other generative AI vendors, recently said it would allow artists to opt out of the training datasets for its future art-generating models. Google offers no such option for art-generating models or otherwise — and it seems that policy won’t change with Gemini.

Google trained Gemini on its in-house AI chips, tensor processing units (TPUs) — specifically TPU v4 and v5e (and in the future the v5p) — and is running Gemini models on a combination of TPUs and GPUs. (According to a technical whitepaper released this morning, Gemini Pro took “a matter of weeks” to train, with Gemini Ultra presumably taking much longer.) While Collins claimed that Gemini is Google’s “most efficient” large generative AI model to date and “significantly cheaper” than its multimodal predecessors, he wouldn’t say how many chips were used to train it or how much it cost — or the environmental impact of the training.

One article estimates that training a model the size of GPT-4 emits upwards of 300 metric tons of CO2, significantly more than the annual emissions created by a single American (~5 tons of CO2). One would hope Google took steps to mitigate the impact, but since the company chose not to address the issue (at least not during the briefing this reporter attended), who can say?

A better model — marginally

In a prerecorded demo, Google showed how Gemini could be used to help with physics homework, solving problems step-by-step on a worksheet and pointing out possible mistakes in already filled-in answers.

In another demo — also prerecorded — Gemini was shown identifying scientific papers relevant to a particular problem set, extracting information from those papers and “updating” a chart from one by generating the formulas necessary to recreate the chart with more recent data.

“You can think of the work here as an extension of what [DeepMind] pioneered with ‘chain of thought prompting,’ which is that, with further instruction tuning, you can get the model to follow [more complex] instructions,” Collins said. “If you think of the physics homework example, you can give the model an image but also instructions to follow — for example, to identify the flaw in the math of the physics homework. So the model is able to handle more complicated prompts.”

Collins several times during the briefing touted Gemini Ultra’s benchmark superiority, claiming that the model exceeds current state-of-the-art results on “30 of the 32 widely used academic benchmarks used in large language model research and development.” But dive into the results, and it quickly becomes apparent that Gemini Ultra scores only marginally better than GPT-4 and GPT-4 with Vision across many of those benchmarks.

For example, on GSM8K, Gemini Ultra answers 94.4% of the math questions correctly compared to 92% in GPT-4’s case. On the DROP benchmark for reading comprehension, Gemini Ultra barely edges out GPT-4 82.4% to 80.9%. On VQAv2, a “neural” image understanding benchmark, Gemini does a measly 0.6 percentage points better than GPT-4 with Vision. And Gemini Ultra bests GPT-4 by just 0.5 percentage points on the Big-Bench Hard reasoning suite.

Collins notes that Gemini Ultra achieves a “state-of-the-art” score of 59.4% on a newer benchmark, MMMU, for multimodal reasoning — ahead of GPT-4 with Vision. But in a test set for commonsense reasoning, HellaSwag, Gemini Ultra is actually a fair bit behind GPT-4 with a score of 87.8%; GPT-4 scores 95.3%.
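
The margins are easier to compare side by side. The sketch below is an illustration only: it uses just the benchmark pairs for which both scores are quoted above (GSM8K, DROP and HellaSwag) and computes the gap in percentage points. The dictionary layout and function names are made up for this example.

```python
# Percentage-point gaps between Gemini Ultra and GPT-4, using only the
# score pairs quoted in the text above. Positive means Gemini Ultra leads.

SCORES = {
    # benchmark: (Gemini Ultra %, GPT-4 %)
    "GSM8K": (94.4, 92.0),
    "DROP": (82.4, 80.9),
    "HellaSwag": (87.8, 95.3),
}


def margin(gemini: float, gpt4: float) -> float:
    """Gap in percentage points, rounded to one decimal place."""
    return round(gemini - gpt4, 1)


def margins() -> dict:
    """Compute the gap for every benchmark in SCORES."""
    return {name: margin(g, o) for name, (g, o) in SCORES.items()}


if __name__ == "__main__":
    for name, gap in margins().items():
        leader = "Gemini Ultra" if gap > 0 else "GPT-4"
        print(f"{name}: {gap:+.1f} points ({leader} ahead)")
```

On these three benchmarks, the single HellaSwag deficit is larger than the two leads combined, which is the point the comparison above is making.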

Asked by a reporter if Gemini Ultra, like other generative AI models, falls victim to hallucinating — i.e. confidently inventing facts — Collins said that it “wasn’t a solved research problem.” Take that how you will.

Presumably, bias and toxicity are well within the realm of possibility for Gemini Ultra too given that even the best generative AI models today respond problematically and harmfully when prompted in certain ways. It’s almost certainly as Anglocentric as other generative AI models — Collins said that, while Gemini Ultra can translate between around 100 languages, no specific work has been done to localize the model to Global South countries.

In another key limitation, while the Gemini Ultra architecture supports image generation (as does Gemini Pro, in theory), that capability won’t make its way into the productized version of the model at launch. That’s perhaps because the mechanism is slightly more complex than how, say, ChatGPT generates images; rather than feed prompts to an image generator (like DALL-E 3, in ChatGPT’s case), Gemini outputs images “natively” without an intermediary step.

Collins didn’t provide a timeline as to when image generation might arrive — only an assurance that the work is “ongoing.”

Rushed out the gate

The impression one gets from this week’s Gemini “launch” is that it was a bit of a rush job.

At its annual I/O developer conference, Google promised that Gemini would deliver “impressive multimodal capabilities not seen in prior models” and “[efficiency] at tool and API integrations.” And in an interview with Wired in June, Demis Hassabis, the head and co-founder of DeepMind, described Gemini as introducing somewhat novel capabilities to the text-generating AI domain, such as planning and the ability to solve problems.

It may well be that Gemini Ultra is capable of all of this — and more. But the briefing yesterday wasn’t especially convincing, and — given Google’s previous, recent gen AI stumbles — I’d argue that it needed to be.

Google’s been playing catch-up in generative AI since early this year, racing after OpenAI and the company’s viral sensation ChatGPT. Bard was released in February to criticism for its inability to answer basic questions correctly; Google employees, including the company’s ethics team, expressed concerns over the accelerated launch timeline.

Reports later emerged that Google hired overworked, underpaid third-party contractors from Appen and Accenture to annotate Bard’s training data. The same may be true for Gemini; Google didn’t deny it yesterday, and the technical whitepaper says only that annotators were paid “at least a local living wage.”

Now, to be fair to Google, it’s making progress in the sense that Bard has improved substantially since launch and that Google has successfully injected dozens of its products, apps and services with new generative AI-powered features, powered by homegrown models like PaLM 2 and Imagen.

But reporting suggests that Gemini’s development has been troubled.

Gemini — which reportedly had direct participation from Google higher-ups, including Jeff Dean, the company’s most senior AI research executive — is said to be struggling with tasks like reliably handling non-English queries, which contributed to a delay in the launch of Gemini Ultra. (Gemini Ultra will only be available to select customers, developers, partners and “safety and responsibility experts” before rolling out to developers and enterprise customers followed by Bard “early next year,” Google says.) Google doesn’t even understand all of Gemini Ultra’s novel capabilities yet, Collins said — nor has it figured out a monetization strategy for Gemini. (Given the sky-high cost of AI model training and inferencing, I doubt it’ll be long before it does.)

So we’re left with Gemini Pro — and very possibly an underwhelming Gemini Ultra, especially if the model’s context window remains ~24,000 words as outlined in the technical whitepaper. (Context window refers to the text the model considers before generating any additional text.) GPT-4 handily beats that context window (~100,000 words), but context window admittedly isn’t everything; we’ll reserve judgement until we’re able to get our hands on the model.
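
As a rough illustration of what the context window difference means in practice, the sketch below checks how many pieces a document would have to be split into to fit each window. It uses the approximate word figures cited above and counts whitespace-separated words; real models measure context in tokens rather than words, so treat this as a simplification. The names are hypothetical.

```python
import math

# Approximate context windows in words, as cited in the text above.
CONTEXT_WORDS = {
    "Gemini": 24_000,
    "GPT-4": 100_000,
}


def word_count(text: str) -> int:
    """Count whitespace-separated words (a crude stand-in for tokens)."""
    return len(text.split())


def fits(text: str, limit_words: int) -> bool:
    """True if the whole text fits inside a window of limit_words."""
    return word_count(text) <= limit_words


def chunks_needed(num_words: int, limit_words: int) -> int:
    """Minimum number of pieces so that each piece fits the window."""
    return max(1, math.ceil(num_words / limit_words))


if __name__ == "__main__":
    report_words = 60_000  # hypothetical document length
    for model, limit in CONTEXT_WORDS.items():
        print(f"{model}: {chunks_needed(report_words, limit)} chunk(s)")
```

For a hypothetical 60,000-word report, the smaller window requires three passes where the larger one needs a single pass.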

Could it be that Google’s marketing, telegraphing that Gemini would be something truly remarkable rather than a slight move of the generative AI needle, is to blame for today’s dud of a product launch? Perhaps. Or perhaps building state-of-the-art generative AI models is really hard — even if you reorganize your entire AI division to juice up the process.

Meta launches a standalone AI-powered image generator

Kyle Wiggers

Not to be outdone by Google’s Gemini launch, Meta’s rolling out a new, standalone generative AI experience on the web, Imagine with Meta, that allows users to create images by describing them in natural language.

Similar to OpenAI’s DALL-E, Midjourney and Stable Diffusion, Imagine with Meta, which is powered by Meta’s existing Emu image generation model, creates high-resolution images from text prompts. It’s free to use (at least for now) for users in the U.S. and generates four images per prompt.

“We’ve enjoyed hearing from people about how they’re using imagine, Meta AI’s text-to-image generation feature, to make fun and creative content in chats. Today, we’re expanding access to imagine outside of chats,” Meta writes in a blog post published this morning. “While our messaging experience is designed for more playful, back-and-forth interactions, you can now create free images on the web, too.”

Now, Meta’s image generation tools have landed the company in hot water in the recent past (see: Meta’s racially biased AI sticker generator), which makes this writer wonder whether there are safeguards in Imagine with Meta to prevent history from repeating itself. We weren’t given the chance to test the tool prior to its launch, but rest assured we’ll be keeping a close eye as Imagine with Meta reaches more users.

Invisible watermarks won’t be live at the start, but Meta pledged to begin adding them to content generated by Imagine with Meta in the coming weeks for “increased transparency and traceability.” (There’s already a visible watermark.) The invisible watermarks will be generated with an AI model and detectable with a corresponding model, Meta says. No word on whether the detection model will be made public at some point.

“[The watermarks are] resilient to common image manipulations like cropping, resizing, color change (brightness, contrast, etc.), screen shots, image compression, noise, sticker overlays and more,” Meta stated in the post. “We aim to bring invisible watermarking to many of our products with AI-generated images in the future.”

Watermarking techniques for generative art aren’t new. French startup Imatag offers a watermarking tool that it claims isn’t affected by resizing, cropping, editing or compressing images. Another firm, Steg.AI, employs an AI model to apply watermarks that survive resizing and other edits. Microsoft and Google have adopted AI-based watermarking standards and technologies, while elsewhere, Shutterstock and Midjourney have agreed to guidelines to embed markers indicating their content was created by a generative AI tool.

But the pressure is ramping up on tech firms to make it clearer that works were generated by AI, particularly in light of the flood of deepfakes from the Gaza war and filter-bypassing AI-generated child abuse images.

Recently, China’s Cyberspace Administration issued regulations requiring that generative AI vendors, including makers of text and image generators, mark generated content without affecting usage. And in recent U.S. Senate committee hearings, Senator Kyrsten Sinema (I-AZ) emphasized the need for transparency in generative AI, including by using watermarks.

What is Gemini? Everything you should know about Google’s new AI model

A comparison chart from Google shows how Gemini Ultra and Pro compare to OpenAI's GPT-4 and Whisper, respectively.

Compared to GPT-4, a primarily text-based model, Gemini easily performs multimodal tasks natively. While GPT-4 excels in language-related tasks like content creation and complex text analysis natively, it resorts to OpenAI's plugins to perform image analysis and access the web, and it relies on DALL-E 3 and Whisper to generate images and process audio.

Google's Gemini also appears to be more product-focused than other models available now. It's either integrated into the company's ecosystem or with plans to be, as it's powering both Bard and Pixel 8 devices. Other models, like GPT-4 and Meta's Llama, are more service-oriented, and available for various third-party developers for applications, tools, and services.

Meta’s AI characters are now live across its US apps, with support for Bing Search and better memory

Sarah Perez (@sarahintampa)

Earlier this year, Meta introduced a set of AI characters, some based on real-life celebs such as Paris Hilton, MrBeast, Kendall Jenner, Tom Brady, Charli D’Amelio and Snoop Dogg, which users could chat with across Meta’s apps. Today, the company announced its 28 AI characters are fully rolled out across the U.S. for people to chat with across WhatsApp, Messenger and Instagram. In addition, the company said more of its AI characters will support search powered by Bing and it will begin experimenting with “long-term memory” in several, meaning the characters will learn from and remember your conversation when it’s over.

While the advantage of the latter is a character that feels more like a real person, who would also remember what you spoke about before, it also gives Meta the ability to retain user data between sessions to help it improve its AI products over time.

Meta says the long-term memory feature will be live in Billie (based on Kendall Jenner), Carter (a dating coach), Scarlett (a “hype woman bestie”), Zach (based on MrBeast), Victor (based on Dwyane Wade), Sally (based on Sam Kerr) and Leo (a career coach).

The company explains that when users chat with this subset of AI characters, they’ll be able to pick up where they left off.

“Our goal is to bring the potential for deeper connections and extended conversational capabilities to your chats with AI,” the company’s blog post states. It also notes that users will be able to clear their chat history with the AIs at any time, while Meta’s use of the chat data will be guided by its Generative AI Privacy guidelines.

In addition to long-term memory, more of the characters will support the ability to tap into Bing Search. Two of its sports-related AIs, Bru (based on Tom Brady) and Perry (based on Chris Paul), have supported Bing Search since their debut. Now, that same feature will roll out to AI characters Luiz (based on Izzy Adesanya), Coco (based on Charli D’Amelio), Lorena (based on Padma Lakshmi), Tamika (based on Naomi Osaka), Izzy (an aspiring singer-songwriter) and Jade (a “hip-hop obsessive”), as well.

To access the AIs across Meta’s apps, users first start a new message in Instagram, Messenger or WhatsApp and then select “Create an AI chat.”

The launch puts Meta in competition with other AI character apps, including the popular Character AI, which has been catching up to ChatGPT in the U.S. The startup, founded by Google AI researchers who helped build LaMDA, raised a massive $150 million in Series A funding, valuing its business at $1 billion.


AMD Releases MI300X Accelerator, Competing with NVIDIA H100


At its Advancing AI event, AMD released the Instinct MI300X accelerator, which it says offers industry-leading memory bandwidth for generative AI, along with the Instinct MI300A accelerated processing unit (APU), which combines the latest AMD CDNA 3 architecture with Zen 4 CPUs. Both are aimed at HPC and AI workloads.

Customers including Microsoft have already announced that they are using the AMD Instinct accelerator portfolio. Microsoft has deployed it in the recently released Azure ND MI300x v5 Virtual Machine (VM) series, which is specifically optimised for AI workloads. Meta is also adding MI300X accelerators to its data centres.

OpenAI is also adding support for AMD Instinct accelerators to Triton 3.0, which will allow developers to test their models on AMD hardware.

Moreover, Oracle Cloud Infrastructure (OCI) is planning to add MI300X-based bare metal instances into its high-performance accelerated computing systems. This will support OCI Supercluster with ultrafast RDMA networking.

Furthermore, Dell showcased at the event its Dell PowerEdge XE9680 server, which features eight AMD Instinct MI300 series accelerators along with AMD ROCm-powered AI frameworks.

Read: AMD Eyes Big Wins with MI300X for AI Workloads

Beyond Dell, HPE and Lenovo are also joining the AMD Instinct ecosystem. HPE announced the Cray Supercomputing EX255a, its first supercomputing accelerator blade, which is powered by Instinct MI300A APUs. It will be available by early 2024.

Lenovo has announced design support for MI300 series accelerators, with availability planned for the first half of 2024. Supermicro also announced the H13 generation of accelerated servers, which will be powered by 4th Gen AMD EPYC CPUs and MI300 accelerators.

“AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance. These accelerators will be utilised in large-scale cloud and enterprise deployments,” said Victor Peng, President of AMD. “By leveraging our leadership hardware, software, and open ecosystem approach, cloud providers, OEMs, and ODMs are bringing to market technologies that empower enterprises to adopt and deploy AI-powered solutions.”

Gigabyte, Inventec, QCT, Ingrasys, and Wistron also announced their plans to offer solutions powered by the new AI accelerators.

Other specialised cloud providers, including Aligned, Arkon Energy, Cirrascale, Crusoe, Denvr Dataworks, and Tensorwaves, plan to offer and expand access to MI300X GPUs for developers and AI startups.

Additionally, El Capitan, an exascale supercomputer at Lawrence Livermore National Laboratory, already hosts an undisclosed number of MI300 accelerators. It is expected to deliver two exaflops of double-precision performance when fully deployed.

Specifications of MI300X

The AMD Instinct MI300X accelerators are driven by the new AMD CDNA 3 architecture. In comparison to the previous generation AMD Instinct MI250X accelerators, the MI300X has nearly 40% more compute units, 1.5 times more memory capacity, and 1.7 times more peak theoretical memory bandwidth. Additionally, the MI300X supports new maths formats such as FP8, specifically designed for AI and HPC workloads.

The accelerators offer extensive memory and compute capabilities. Featuring a best-in-class 192 GB of HBM3 memory capacity with 5.3 TB/s peak bandwidth, these accelerators cater to the demanding nature of AI workloads.

The AMD Instinct Platform, constructed on an industry-standard OCP design, incorporates eight MI300X accelerators, providing a total of 1.5TB of HBM3 memory capacity. This platform enables OEM partners to seamlessly integrate MI300X accelerators into existing AI offerings.
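As a rough check on the platform figures, the short Python sketch below multiplies the per-accelerator numbers quoted in this article (192 GB of HBM3 and 5.3 TB/s peak bandwidth) by the eight accelerators in one platform. The constant and function names are illustrative only, and the aggregate bandwidth is a simple sum of per-accelerator peaks, not a figure quoted by AMD in this article.

```python
# Per-accelerator figures as quoted in the article.
HBM3_CAPACITY_GB = 192          # HBM3 capacity of one MI300X
PEAK_BANDWIDTH_TBPS = 5.3       # peak memory bandwidth of one MI300X
ACCELERATORS_PER_PLATFORM = 8   # MI300X accelerators in one AMD Instinct Platform


def platform_totals(count=ACCELERATORS_PER_PLATFORM,
                    capacity_gb=HBM3_CAPACITY_GB,
                    bandwidth_tbps=PEAK_BANDWIDTH_TBPS):
    """Return combined memory capacity and summed peak bandwidth."""
    total_gb = count * capacity_gb
    return {
        "capacity_gb": total_gb,
        # 1,536 GB is 1.5 TB when 1 TB is taken as 1,024 GB.
        "capacity_tb": total_gb / 1024,
        # Naive sum of per-device peaks; real workloads will see less.
        "aggregate_bandwidth_tbps": count * bandwidth_tbps,
    }


totals = platform_totals()
print(totals)
```

Eight accelerators at 192 GB each come to 1,536 GB, which matches the 1.5TB quoted for the platform when a terabyte is counted as 1,024 GB.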

In comparison to the NVIDIA H100 HGX, the AMD Instinct Platform claims to offer up to a 1.6x increase in performance. This is particularly notable when running inference on LLMs like BLOOM 176B and Llama2 with a single MI300X accelerator.
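To see why memory capacity matters for LLM inference, the sketch below estimates how much memory a model's weights alone would occupy and compares that with the 192 GB of one MI300X and the 1,536 GB of an eight-accelerator platform. It rests on assumptions that are not from the article: weights stored at 2 bytes per parameter (FP16), a 70-billion-parameter Llama 2 variant, decimal gigabytes, and no allowance for the KV cache or activations. The function names are hypothetical.

```python
SINGLE_MI300X_GB = 192                 # per-accelerator capacity quoted in the article
PLATFORM_GB = 8 * SINGLE_MI300X_GB     # eight accelerators per platform


def weight_footprint_gb(params_billion, bytes_per_param=2):
    """Approximate size of the weights only, in decimal GB.

    Assumes 2 bytes per parameter (FP16) unless told otherwise.
    """
    return params_billion * bytes_per_param


def fits(params_billion, capacity_gb, bytes_per_param=2):
    """True if the weights alone fit in the given memory capacity."""
    return weight_footprint_gb(params_billion, bytes_per_param) <= capacity_gb


# Model sizes in billions of parameters (the 70B Llama 2 size is an assumption).
for name, size in [("BLOOM 176B", 176), ("Llama 2 70B", 70)]:
    print(name,
          weight_footprint_gb(size), "GB of weights;",
          "fits one accelerator:", fits(size, SINGLE_MI300X_GB),
          "| fits one platform:", fits(size, PLATFORM_GB))
```

Under these assumptions, a 70B model's weights would fit on a single accelerator, while a 176B model at FP16 would need to be spread across several accelerators or stored at lower precision.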

Specifications of MI300A

The AMD Instinct MI300A is an Accelerated Processing Unit (APU) designed specifically for data centre applications, marking a significant advancement as the world’s first APU tailored for HPC and AI workloads.

At its core, the MI300A features high-performance AMD CDNA 3 GPU cores combined with the latest AMD “Zen 4” x86-based CPU cores, forming a potent synergy for computational tasks. With 128GB of next-generation HBM3, the MI300A boasts impressive capabilities, claiming a remarkable ~1.9x improvement in performance-per-watt on HPC and AI workloads compared to its predecessor.

The emphasis on energy efficiency is paramount, particularly given the demands of HPC and AI workloads, which are inherently data-intensive and computationally demanding. By integrating CPU and GPU cores into a single APU, the MI300A achieves a 30x improvement in energy efficiency on FP32, addressing the critical need for power optimisation.

The post AMD Releases MI300X Accelerator, Competing with NVIDIA H100 appeared first on Analytics India Magazine.

AMD Releases Ryzen AI PCs with NPUs


AMD, at the Advancing AI event, has revealed Ryzen 8040 series mobile processors, thereby extending its range of on-edge offerings. These processors boast enhanced performance that surpasses the competition, establishing AMD’s leadership in the mobile PC domain.

The integration of the Ryzen AI neural processing units (NPUs) in select models introduces state-of-the-art AI capabilities, surpassing the processing performance of prior AMD models.

To complement the hardware advancements, AMD introduces the Ryzen AI 1.0 software. This software facilitates the seamless deployment of models on Ryzen AI PCs, making it more convenient for users to harness the power of AI in various computing scenarios, be it for work, play, or AI-enabled experiences.

Leading OEMs such as Acer, Asus, Dell, HP, Lenovo, and Razer are set to feature the AMD Ryzen 8040 Series processors, broadening the availability and accessibility of these advanced AI processors.

“We continue to deliver high performance and power-efficient NPUs with Ryzen AI technology to reimagine the PC,” said Jack Huynh, SVP and GM of AMD computing and AMD graphics business. “The increased AI capabilities of the 8040 series will now handle larger models to enable the next phase of AI user experiences.”

The rollout is anticipated to commence in Q1 2024, ensuring that users can experience the cutting-edge capabilities of these processors across a range of devices.


Furthermore, AMD emphasises its commitment to delivering high-performance and power-efficient NPUs with Ryzen AI technology, as stated by Jack Huynh, Senior Vice President and General Manager of AMD’s computing and graphics business. The increased AI capabilities of the 8040 series are positioned to handle larger models, paving the way for the next phase of AI user experiences.

In addition to hardware developments, AMD underscores its compatibility with the Windows 11 ecosystem, offering users the full range of Windows 11 features, including robust security.

These processors offer up to 64% faster video editing and up to 37% faster 3D rendering compared to competitors, catering to the diverse needs of users across different domains. With LPDDR5 memory support, advanced power management features, and compatibility with the Windows 11 ecosystem, the AMD Ryzen 8040 Series processors provide users with a compelling choice for ultrathin PCs, gaming, content creation, and other demanding AI use cases.

“It’s been incredible to see AMD and Microsoft’s long partnership of technology, bringing AI innovation to our shared customers,” said Pavan Davuluri, CVP Windows + Devices, Microsoft. “We’re so excited to see Ryzen 8040 series processor powered devices come to life in the Windows ecosystem and can’t wait to see what developers and customers do with all of this innovation.”

As part of its broader ecosystem, AMD introduces the Ryzen AI Software, which is now widely available. This software empowers developers to build and deploy machine learning models trained in popular frameworks such as PyTorch or TensorFlow.

Developers are granted early access to AI features such as advanced gesture recognition, biometric authentication, and other accessibility features. This aligns with AMD’s vision to provide a comprehensive platform for developers to build AI applications with ease.

The AMD Ryzen 8040 Series processors stand out not only for their AI capabilities but also for their overall performance. Built on the AMD “Zen 4” architecture, these processors deliver up to eight cores and 16 threads, ensuring leading single-core and multi-core performance.

The processors also feature AMD RDNA 3 architecture-based Radeon graphics and are designed for creative professionals, gamers, and mainstream users seeking a powerful laptop with reliable performance.

The post AMD Releases Ryzen AI PCs with NPUs appeared first on Analytics India Magazine.

How Google’s Gemini AI makes your Pixel 8 Pro faster and smarter


If you're a Google Pixel 8 Pro user, your phone is about to get a serious AI upgrade from Google. And if you're not a Pixel user, but still an Android user, there's good news for you too.

Starting today, Google is bringing its new Gemini model of the Bard AI to the Pixel 8 Pro. First announced in May of this year, Gemini is Google's most capable and advanced language model AI to date (and a decent rival to ChatGPT).

Also: What is Gemini? Everything you should know about Google's new AI model

However, since Gemini was designed to be run in a large data center environment, what's appearing on the Pixel 8 Pro will be a slimmed-down version, Gemini Nano, that's specifically designed for smartphones. But even though it's not the "full" version, that doesn't mean it's any less valuable.

In a post announcing the rollout, Google noted that Gemini Nano was the "most efficient model built for on-device tasks," adding that since it "runs directly on mobile silicon," there's support for several use cases.

At first, only two features on the Pixel 8 Pro will be run by Gemini Nano: Smart Reply in the Gboard keyboard, which suggests the next thing to say in a messaging app, and the Recorder app's auto-summarize feature, which lets you get a recap of a full meeting with one click. Both of these, part of Google's December Pixel feature drop, will run directly on the phone, which ensures three things: sensitive data stays on your device, no online connection is needed, and the processes should be very quick.

In summary, right now all you'll see is two apps on your Pixel 8 Pro running a little faster and communicating a little more effectively. But it's only the beginning.

Also: Google's Smart Search now showing up on more Android phones

This is indeed only a small initial rollout, and your phone may not even feel that different. But if Gemini can work the way Google thinks it can, and if it can be brought to the Google Assistant on Pixel phones, this could over time become a huge addition to Google's product line. And when Gemini rolls out to Android overall — which is in the works — it will have an even broader impact.

Long term, Gemini, and AI tools in general, look to have a potentially massive impact on Android devices and could change almost everything we do with them. Combined with the introduction of Beeper, an app that lets the color-conscious finally get rid of their green Android bubble, will we start to see a migration from iOS to Android?
