How a Traditional HCM Giant is Powering Workplaces with GenAI

In today’s fast-paced business environment, leading human capital management (HCM) providers are harnessing generative AI to efficiently manage a large workforce. This technology helps automate routine tasks and deliver actionable insights for better decision-making.

ADP, an HCM and payroll giant, is at the forefront of this AI-driven revolution. The company operates in 140 countries, providing payroll, HCM, and networking solutions.

“In the US, we pay one in five private sector employees, processing payroll for approximately 32 million employees. So, it’s huge,” said Srinivas Konidena, CTO and VP of APAC products at ADP, in an exclusive interview with AIM.

The company is leveraging generative AI to automate repetitive HR tasks and payroll, predict attrition, and provide personalised employee experiences.

AI in HCM

“We have something called ADP Data Cloud. It is more like a concept of a lake where all the systems, whether it is down market, mid market, or up market, they all flow the data into this data cloud. And, on top of which, we would like to use AI to generate some interesting analytics, interesting things,” he explained.

Among its different products, ADP’s attrition prediction model not only predicts attrition rates but also suggests where to find replacement candidates and the expected salary range.

“When you have attrition in an organisation, the general expectation immediately after, is to predict attrition. So, we had some models to predict attrition based on the industry, based on how things are going in the company and things like that,” Konidena shared.
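
ADP has not disclosed the features, data, or architecture behind this model, so the following is a purely illustrative sketch of how a minimal attrition classifier might be trained, on synthetic data with invented features:

```python
# Minimal, hypothetical attrition model: all feature names and data are
# invented for illustration; this is not ADP's model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([
    rng.integers(0, 8, n),       # industry code
    rng.uniform(0, 25, n),       # tenure in years
    rng.uniform(0.5, 2.0, n),    # salary relative to market median
    rng.uniform(-0.2, 0.2, n),   # company headcount growth rate
])
# Synthetic label: attrition is likelier with short tenure and below-market pay.
p_leave = 1 / (1 + np.exp(X[:, 1] / 5 + 3 * (X[:, 2] - 1)))
y = (rng.random(n) < p_leave).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("holdout accuracy:", model.score(X_te, y_te))
print("attrition risk, first holdout employee:", model.predict_proba(X_te[:1])[0, 1])
```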

Wary of ChatGPT and Copilot

“We are very paranoid about data. Because of the business we are in, clients trust our data,” Konidena stated.

To ensure data security, ADP has banned the use of public AI tools such as ChatGPT within the company. Instead, they utilise LLMs that are captive inside ADP’s systems, with no external connections.

The use of copilots, such as GitHub Copilot and Amazon CodeWhisperer, requires approval from ADP’s central GAIN (Generative AI Now) office.

“One thing clear is, we do not allow the data to go out, even when we use copilots,” Konidena explained. “What we’re doing is we’re restricting the copilot to focus only on the IDE and not go back to the repo.”

Currently, ADP is not developing new AI models but rather reusing existing models from providers such as AWS, Azure, and Facebook.

Konidena also revealed that ADP is working with Microsoft to create a separate ChatGPT instance for the company, allowing them to use the technology more actively while maintaining data security.

Unlike Zoho and Others

Besides ADP, platforms like Rezolve AI and Zoho are also leveraging AI to transform HR workflows.

AI-powered service desks handle repetitive tasks, such as password resets and software installations, allowing human agents to focus on complex issues requiring empathy and critical thinking.

Zoho has also implemented AI models on its platforms, including AI chatbots for query resolution related to organisational policies and an AI-based document processing technology called ‘IDP’ for extracting information from documents.

However, ADP differentiates itself by primarily focusing on specialised HCM and payroll solutions, whereas companies like Zoho offer a single system for accounting and other business functions.

“Their focus is on entering into a single system for accounting and work and things like that. On the other hand, we are mostly on the specialised part of it, which is HCM payroll,” said Konidena.

Promising Indian Market

In India, ADP is experimenting with AI to detect payroll anomalies and streamline the year-end tax submission process.

“The ability to accept change and the rate of innovation in India, unbelievable. And I see the same in China. So accepting of change. Then what happens is our ability to rapidly develop the product is good,” Konidena remarked.

The company is also expanding by allocating dedicated funds for emerging technologies at the corporate level.

“We are co-investing using ADP ventures. So I think that will give us some leverage more than anything, I think it will give an insight into what’s going on,” said Konidena.

ADP sees significant potential in tier-two and tier-three cities in India, where the standard of living is improving, and companies are establishing operations.

“Those are the engines of growth. We’re also seeing all the manufacturing, right? Earlier used to be centred around these places, they’re shifting out. And when they shift out, they’re not getting regular manual labour. I mean, it’s all fully automated supervisors,” Konidena pointed out.

ADP’s growth has been driven by both organic expansion and acquisitions. In India, ADP acquired a company called Moffa, which had around 100 clients at the time. Over the past 12 years, ADP India has grown to serve close to 2,000 clients.

Apple WWDC 2024 recap: Every new feature in iOS 18, Siri, AI, and more

Apple's 2024 Worldwide Developer Conference (WWDC) shaped up to be one of the company's biggest events in decades. The opening keynote, which took place on Monday, focused almost entirely on the buzzword we can't stop talking about — artificial intelligence (AI).

After trailing behind major players like OpenAI, Google, and Microsoft, Apple unveiled a slew of AI features spread across the company's most popular operating systems. While AI was the event's main focus, Apple executives also announced this year's software upgrades for the iPhone, iPad, Apple Watch, Mac, and Vision Pro.

If you couldn't tune into the two-hour-long event, ZDNET has you covered. Here's a complete breakdown of all the announcements at WWDC.

VisionOS 2

  • Apple unveiled the first major upgrade to its recently released VisionOS — VisionOS 2.

  • In VisionOS 2, Photos gets an upgrade that allows users to transform any 2D photo into a Spatial Photo, with added depth from moments already in their camera rolls.

  • Spatial Personas in the Photos app lets users view photos together, creating a more shared experience.

  • VisionOS 2 also supports new hand motion gestures, allowing users to access some settings more easily. For example, users can open their hands and tap to reach the home screen or turn their wrists to see the battery level.

  • Users who mirror their Mac to their Vision Pro will soon get higher resolution and a bigger screen size, creating an ultrawide monitor view equal to two 4K monitors side by side.

  • Travel mode on the Vision Pro will also add support for trains, making it easier to work during a commute.

  • Vision Pro will now feature a Guest User option that allows additional users to save their eye and hand data for 30 days.

  • Users can now personalize their Home View, placing apps wherever they want.

  • Users will now be able to watch videos in an Environment when using Safari, even on sites such as YouTube and Netflix.

  • Apple TV brings multiview to Vision Pro, which is especially useful when watching sports games.

  • Users can cast content from their iPhone, iPad, or Mac to the Vision Pro using AirPlay.

  • Apple is also making the Vision Pro available in more countries starting June 28. You can see the full list here.

iOS 18

  • iPhone and iPad users will be able to customize their home screen further by placing apps wherever they'd like on the screen, as opposed to the usual fixed grid. App icon colors will also be customizable, allowing users to make apps any color they want or even match their home screen. Users can also change app icons to dark mode.

  • After five years of remaining untouched, the Control Center received several upgrades, including the ability to customize its toggles, such as flashlight, screen recording, calculator, auto-rotate, screen mirroring, and more, by tapping, holding, and rearranging. The Control Center toggle will also feature different pages with completely customizable user controls, and users can switch controls on the bottom of the Lock Screen.

  • Apple also added privacy options, including the ability to lock an app, which requires users to authenticate with FaceID or passcode before accessing the app. Users can also hide an app, which makes it disappear from the home screen to a hidden part of their app library.

  • Messages received several upgrades. Tapbacks, the feature allowing users to react to messages by holding them down, was upgraded to feature different colors and include emojis. Users can add text effects to specific phrases or words instead of the entire phrase. Texts can also be customized further with formatting options like bold, underline, italics, and strikethrough. Lastly, users will be able to schedule messages.

  • iPhone 14 and later models will have a new Messages via Satellite feature, which allows users to send messages via satellite when they don't have Wi-Fi or cellular service.

  • The Mail app will automatically categorize emails, a feature that will be available later this year.

  • The Wallet app now allows users to tap phones together to exchange Apple Cash without requiring them to share personal information like phone numbers.

  • The Journal app will now show more statistics and insights, including how many entries you've had this year, how many days you journaled, and more.

  • There is a new Game Mode for iPhone, meant to help gamers optimize their gaming experience. This includes minimizing background activity and using more responsive accessories, such as controllers.

  • The Photos app got what Apple dubbed its "biggest ever redesign," featuring a cleaner design, a new carousel with highlights that update each day, the ability to pin collections, and an improved search.

  • The Messages app now supports Rich Communication Services (RCS).

  • The Safari app was upgraded to include key information about a webpage. You can read more about the Safari upgrades under the MacOS section of this article.

  • The Calendar app can now pull from the Reminders app for a more seamless schedule overview.

AirPods

  • AirPods Pro are getting Voice Isolation to enhance call quality in noisier environments.

  • With the new Siri Interactions, users can now nod or shake their head "yes" or "no" when responding to Siri.

  • Apple is also releasing a Personalized Spatial Audio API for game developers to build around the AirPods' audio technology.

tvOS 18

  • When users watch an Apple TV+ show or movie, the new InSight feature on tvOS will include additional information such as actor names and music titles. Users can then easily add those music titles to their Apple Music playlist. When using an iPhone as a remote, the InSight information will also appear on the smartphone.

  • The Enhance Dialogue feature was upgraded to deliver greater vocal clarity over other elements of the movie or show, such as music or background noise, on Apple TV 4K.

  • Subtitles were optimized to appear automatically when the content's language does not match the device's, when users mute the audio, or when they skip back.

  • Apple added support for 21:9 formatting for viewing content on projectors.

  • There are new, fun screensavers, including Portraits, TV and movies, and Snoopy.

  • When FaceTiming on tvOS 18, users will now have the option of Live Captions for English in the US and Canada.

  • tvOS will also feature a redesigned Apple Fitness+ experience, including new For You, Explore, and Library spaces.

WatchOS 11

  • The new Training Load feature allows users to gain insights into how the intensity of their workouts impacts their performance over time.

  • The new Vitals app will give users a quick look at their most important health metrics, including heart rate, respiratory rate, wrist temperature, sleep duration, and blood oxygen. It will also provide context to help them make more informed decisions. If something seems out of the ordinary, users will receive pings alerting them of the anomaly.

  • When a user logs a pregnancy in the Health App on iPhone or iPad, the Cycle Tracking app on Apple Watch is upgraded to show gestational age, and allows users to log symptoms experienced during pregnancy. Pregnant users can also ask to be reminded to take a mental health assessment every month. Using the Walking Steadiness feature, users can also be alerted of increased fall risk.

  • Users will also experience more customizable Activity Rings, which allow them to pause their rings when they want to take a day off without impacting their award streaks.

  • Apple Fitness+ was upgraded to include personalized For You tabs, Explore and Library tabs, search features, and enhanced awards.

  • Smart Stack is also getting more intelligent: it's now able to suggest widgets when needed automatically, among other improvements.

  • With watchOS, users will get suggestions on the best photo options for their watch face from their photo library based on criteria such as aesthetics, composition, facial expressions, and more. The face can also be customized with different sizes, layouts, and fonts.

  • Check In is available on Apple Watch, even during workouts, to help users stay safe. Translate is also coming to Apple Watch so users can translate text right from their wrists.

  • Another notable change is that the double tap gesture can be used to scroll through apps.

iPadOS 18

  • The update will feature a redesigned tab bar that floats above app content. Users can customize it to showcase their favorite apps and access the most important sections of an app. You can also long-tap the bar to move it around. The tab bar also morphs into the sidebar for added insights.

  • SharePlay will allow users to remotely control someone else's iPad or iPhone and share drawings on their screens.

  • In a long-awaited release, iPads will now have a Calculator app for the first time, complete with the same interface as the one currently found on iPhones. Plus, users can pair it with the Apple Pencil through a new Math Notes experience, which lets them write expressions that the Calculator app solves for them, in their own handwriting, once they write the equals sign.

  • Handwriting in Notes also got an upgrade with Smart Script, which refines users' writing to make it more legible while keeping the authenticity of the user's handwriting style. The feature can also match copied and pasted text to the user's handwriting. Typed text was also enhanced, with five new highlight colors and the ability to toggle sections under headings or subheadings.

  • iPadOS 18 supports screen-sharing via SharePlay and the same Control Center customizations, Photos app upgrades, Safari updates, and emoji Tapbacks found in iOS 18.

MacOS 15/Sequoia

  • Apple unveiled MacOS Sequoia, which will include many of the new features that were added to iOS 18 and iPadOS 18, including the updated Safari, Photos, Messages, and the new Passwords management app.

  • The new iPhone mirroring capability on Mac allows users to experience their phone almost entirely from their Mac. For example, iPhone notifications will now be available on Mac, allowing users to interact with them and open corresponding apps, though the iPhone itself will appear locked.

  • Window tiling was made possible to help users stay more organized. When users drag a window to the edge of the screen, macOS automatically suggests a tile position.

  • Video meetings are also getting an upgrade, with new backgrounds and a preview experience that allows you to see what you are about to share before sharing it. This feature works with popular video conferencing applications such as FaceTime and Zoom.

  • The AI summarization tool will live in Safari to help users process content like web pages and articles more efficiently. Safari will also assist users in discovering more helpful information about a page they are browsing when relevant, such as directions.

  • Apple also launched a new Viewer experience, which does for video what Reader does for text.

Apple Intelligence

  • Apple unveiled what it calls its new "personal intelligence" system under the name Apple Intelligence. The release puts generative models at the heart of the ecosystem of Apple devices.

  • With Apple Intelligence, your iPhone can prioritize notifications to ensure you get notified only when it's crucial throughout your day.

  • The release includes writing tools that leverage AI, including rewriting, proofreading, and summarizing text features available across Mail, Keynote, third-party apps, and more.

  • Users can now create personalized images in the photo library, including sketches, illustrations, and animations. This feature is available in Messages, Apps, Freeform, Keynote, and Pages.

  • Apple Intelligence can tap into tools and carry out tasks on your behalf, such as "Show me all the photos," "Play the podcast," or "Pull the files that my coworker shared with me last week."

  • Because it's grounded in your personal information and context, and can retrieve data from across your apps and reference the content on your screen, Apple Intelligence is positioned to be your personal assistant.

  • Apple emphasized the safety and privacy precautions built into Apple Intelligence, particularly for on-device intelligence processing. The company touted the security of Apple's silicon, A17 Pro, and its M family of chips (M1, M2, M3, and M4).

  • For tasks that are too large for on-device processing and need to be completed in the cloud, Apple unveiled Private Cloud Compute, which protects users' privacy by running on servers specially created using Apple Silicon. When users make requests, Apple Intelligence first tests on-device capability, but calls on Private Cloud Compute if the task requires more power (a conceptual routing sketch follows this list). Apple reiterated that user data is never stored or sold to external parties.

  • Siri finally got the AI makeover it deserves, first with a new look: when tapped, light wraps around the edges of your screen. Siri can now better understand users, even if they stutter, due to more advanced natural language processing (NLP). It now has conversational context, remembering what you just said and using it to complete the next task. Users can also type requests to Siri. Because it has in-depth product knowledge, Siri can answer questions about functionality on iPad, iPhone, and Mac. Siri will also have Apple Intelligence's on-screen awareness, allowing it to act on what it sees. The voice assistant can also take actions across apps, including photo editing. With access to your personal context, Siri can understand and complete new commands, such as pulling your driver's license information from a photo and inputting it into a form. The Siri updates are coming to iPad and Mac, too.

  • Apple Intelligence also powers new features in Mail, including Rewrite, which offers users different versions of what they have already drafted. Suggestions are shown in-line, and Proofread edits for grammar, word choice, and sentence structure. You can also use Summarize to convert your text into bullet points. Smart Reply identifies relevant selections of an email and uses them to help craft a custom message. Summaries will now appear at the top of emails, making browsing an inbox easier. Apple Intelligence can even help prioritize your emails, placing what is most important at the top of your inbox.

  • There is an all-new focus option: reduce interruptions. When in this setting, your phone will only show you what is most important based on your personal activity and context.

  • Genmoji allows users to create AI-generated emojis based on what they type. You can also create a Genmoji based on a photo of a friend. Genmojis can be included in-line in Messages and even used for Tapbacks.

  • Image Playground allows users to leverage AI on-device to create images from text prompts, which can be easily shared in iMessage and elsewhere. The feature is also available in Keynote, Pages, and Freeform, and as a stand-alone Image Playground app.

  • Image Wand in the Notes app transforms a rough sketch into a polished image and is available directly in the tool palette. For example, you can circle a rough sketch in Notes and open Image Playground to transform your doodle into a fully-fledged image.

  • Apple Intelligence will also upgrade the Photos app with a new clean-up tool that removes unwanted objects. Search in videos allows users to easily find specific snippets of content, and users can create Memories on-demand, using text to edit and organize photos into movies.

  • In the Notes app, users can record and transcribe audio, and Apple Intelligence will generate a text summary of the recording. This experience is also available in the Phone app.

  • Apple Intelligence is free on iOS 18, iPadOS 18, and MacOS Sequoia, and will be available to try in English only this summer.
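
Apple did not publish an API for the on-device-first routing described above, but conceptually the flow reduces to the sketch below, where the capability test and both model calls are stand-ins, not Apple's actual implementation:

```python
# Conceptual sketch only; not Apple's API. Requests are served on-device
# when possible and escalated to Private Cloud Compute otherwise.
ON_DEVICE_LIMIT = 2_000  # hypothetical proxy for on-device capacity

def on_device_model(request: str) -> str:
    return f"[on-device] handled: {request[:30]}..."

def private_cloud_compute(request: str) -> str:
    # Per Apple, PCC runs on Apple-silicon servers and does not retain data.
    return f"[PCC] handled: {request[:30]}..."

def handle(request: str) -> str:
    if len(request) <= ON_DEVICE_LIMIT:  # stand-in for a real capability check
        return on_device_model(request)
    return private_cloud_compute(request)

print(handle("Summarize this short note."))
print(handle("x" * 5_000))  # exceeds the device budget, routed to PCC
```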

Partnership with OpenAI

  • Apple also confirmed its partnership with OpenAI by integrating ChatGPT with Siri. With a user's permission, Siri can send a request to ChatGPT for help. For example, if you ask Siri for assistance on a task it deems better for ChatGPT, Siri will suggest you use the chatbot instead and forward your request. ChatGPT's writing capabilities can also be leveraged within certain writing tasks.

  • Users can access ChatGPT via this integration for free, and OpenAI will not log their data. ChatGPT Plus users can connect their subscriptions to access more advanced features, in which case OpenAI's data usage policies apply.

  • The ChatGPT integration will be coming to iOS 18, iPadOS 18, and MacOS Sequoia later this year.

Why an Open Cloud Network Would Benefit Indian Tech Startups, Developers, IITs

The cloud computing market is dominated by three names – AWS, Microsoft Azure, and Google Cloud. However, accessing these Cloud Service Providers (CSPs) could prove to be expensive for micro and small businesses, individual developers, as well as research institutions in India.

So what is the solution? An open cloud computing (OCC) network.

People+AI is on a mission to develop an interoperable cloud computing network that could transform India’s growing tech ecosystem.

The idea is not just to democratise cloud computing in India but also to enable smaller entities within the Indian public cloud sector beyond the hyperscalers, such as NeevCloud, Vigyanlabs, and Von Neumann AI, to attract more customers.

“One of the challenges these smaller players are facing is discoverability. Moreover, besides helping them with marketing, OCC also aims to support them in figuring out the right tax subsidies, infrastructural policies and with ease of doing business,” Tanvi Lall, director of strategy at People+AI, told AIM.

People+AI, which branches out from Nandan Nilekani-backed non-profit EkStep Foundation, has already teamed up with 24 technology partners, including the likes of Oracle Cloud, Vigyan Labs, Protean Cloud, Dell, NeevCloud, and Tata Communications, among others.

Democratising Compute in India

Based on the concept of open networks such as the Open Network for Digital Commerce (ONDC), OCC aims to establish an open network where consumers can discover and access heterogeneous compute providers through standard interfaces.

“It’s an open network of providers powered by protocols, powered by trust, where diverse providers can log in and then on the other hand customers who need computing in different shapes and formats can go to this interface and find the provider on the other end,” Lall said.

According to Lall, an OCC will be significant for students and research institutions that need compute for their research work, hackathons, etc.

It will also benefit individual developers who are experimenting with newer things, for example, fine-tuning an open-source model.

“These are developers who are not developing any products but are carrying out different projects. This also includes pre-startup developers who might not be able to afford a CSP,” Lall added.

Moreover, the open cloud network will prove to be of significance for Micro, Small & Medium Enterprises (MSMEs). India is one of the largest MSME markets in the world, and today many of these enterprises are in their digital transformation process.

“These companies today want a website, they want some sort of CRM tool, some planning tool, business ops. It might not be the high calibre graphics processing units (GPU) compute, but they want some compute,” Lall said.

“However, there are also MSMEs who are starting to apply AI to their business process, for example, to improve sales, deploying AI chatbots for customer service, etc.”

These companies might even need GPUs for inference and other similar workloads. Lastly, Indian tech startups also stand to benefit a lot: these are startups developing B2B or B2C AI applications for others to use, which might need high-calibre GPUs, according to Lall.

“This is a very big category and they need significant reliable compute power because they are the ones selling these AI applications to others,” she said.

Furthermore, OCC aims to create common standards, APIs, and interoperability protocols to enable seamless integration and provisioning of compute resources from multiple providers within the network.

This implies that although your data may be stored with one CSP, with OCC, you can access services from another provider without the need to migrate your data. Generally, migrating data from one CSP proves to be challenging.
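
The OCC protocol itself has not been published. As a hypothetical illustration of ONDC-style discovery, where buyers query an open catalog of heterogeneous providers through one standard interface, a sketch might look like this (all schema and field names are invented):

```python
from dataclasses import dataclass

# Hypothetical offer schema: OCC's real protocol and field names are not
# public in this article; everything here is invented for illustration.
@dataclass
class ComputeOffer:
    provider: str
    region: str
    gpu: str
    gpus_available: int
    price_per_gpu_hour: float  # INR

CATALOG = [
    ComputeOffer("provider-a", "in-south", "A100-80GB", 16, 250.0),
    ComputeOffer("provider-b", "in-west", "L4", 64, 60.0),
    ComputeOffer("provider-c", "in-south", "A100-80GB", 4, 210.0),
]

def discover(gpu: str, min_gpus: int, max_price: float) -> list[ComputeOffer]:
    """ONDC-style discovery: match a buyer's requirement against every
    provider on the open network and rank the hits by price."""
    hits = [o for o in CATALOG
            if o.gpu == gpu and o.gpus_available >= min_gpus
            and o.price_per_gpu_hour <= max_price]
    return sorted(hits, key=lambda o: o.price_per_gpu_hour)

for offer in discover("A100-80GB", min_gpus=4, max_price=260.0):
    print(offer.provider, offer.region, offer.price_per_gpu_hour)
```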

Challenges Ahead

It’s an ambitious project, and implementing it at population scale would truly help the whole ecosystem reap its benefits. However, pulling it off at such a large scale remains a challenge.

According to Lall, building the technology is not the biggest issue. “We have talented people in this country who can build the tech very easily,” she said. The biggest challenge for Lall and her team will be developing the ecosystem and sustaining it.

Moreover, it will be critical for these cloud companies to find value in the network; the same goes for the end users.

What makes many enterprises choose a CSP is its ability to provide scalable and flexible infrastructure, robust security measures, seamless integration with existing systems, and reliable customer support.

The network will have to ensure the same things are easily available for the stakeholders in the ecosystem.

“We’re adopting a highly granular approach to this. It’s not merely compiling a GPU directory or a simple listing of available GPUs; while that’s undoubtedly beneficial, we’re emphasising the importance of considering user experience, deployment strategies, and other critical factors,” Lall said.

Moreover, according to Lall, what makes the CSPs like AWS worthwhile is the whole community they have developed. “They put out these videos and documentation which help you find quick fixes to your problems while using their services.

“At OCC as well, we need to establish an ecosystem where individuals and entities such as startup founders seeking compute resources can easily communicate. They should be able to engage in straightforward conversations about their requirements,” Lall said.

Presently, there isn’t a centralised platform for this purpose. While there may be some Discord channels available, according to Lall, there isn’t a single venue where users can explore offerings and interact with each other.

The Indian AI-as-an-Infrastructure Landscape

In the last year or so, we have seen various startups emerging in the AI cloud space. For instance, startups such as Yotta and NeevCloud have announced their entry into the AI cloud space.

Yotta, backed by the Hiranandani group, is setting up a GPU infrastructure of 32,768 GPUs by the end of 2025. Likewise, NeevCloud intends to acquire 40,000 GPUs by 2026.

Tata Communications, one of People+AI’s 24 technology partners, is ramping up efforts to establish AI cloud infrastructure by aligning with NVIDIA.

There are also other players like Jarvis Labs, for instance, which is based in Coimbatore. What OCC also sets out to enable is to bring computing closer to MSMEs or tech startups located in Tier 2 and Tier 3 cities in India.

Can Hyperscalers be Part of it?

According to Lall, hyperscalers like AWS, Azure and Google Cloud could even be part of the network. “Our entire approach is plus one. The idea is not to leave certain folks out. There is room for everybody and there is a reason why we approached Oracle Cloud.”

“If the hyperscalers feel they have something to offer in the Indian context, they are welcome. It would also be foolish for us to imagine that everyone will abandon the big CSPs as and when the network is up and running,” she concluded.

OpenAI Partners with Oracle Cloud Infrastructure To Run ChatGPT

Oracle, Microsoft, and OpenAI are partnering to extend the Microsoft Azure AI platform to Oracle Cloud Infrastructure (OCI), providing additional capacity for OpenAI.

OpenAI, the company behind ChatGPT, serves over 100 million users monthly with its generative AI services.

“We are delighted to be working with Microsoft and Oracle. OCI will extend Azure’s platform and enable OpenAI to continue to scale,” said Sam Altman, Chief Executive Officer, OpenAI.

“The race to build the world’s greatest large language model is on, and it is fueling unlimited demand for Oracle’s Gen2 AI infrastructure,” said Larry Ellison, Oracle Chairman and CTO. “Leaders like OpenAI are choosing OCI because it is the world’s fastest and most cost-effective AI infrastructure.”

OCI’s AI infrastructure supports a wide range of AI innovators, including Adept, Modal, MosaicML, NVIDIA, Reka, Suno, Together AI, Twelve Labs, and xAI, which use OCI Supercluster for training and inferencing next-generation AI models.

OCI’s AI capabilities allow startups and enterprises to build and train models faster and more reliably across Oracle’s distributed cloud. For training large language models (LLMs), OCI Supercluster can scale up to 64,000 NVIDIA Blackwell GPUs or GB200 Grace Blackwell Superchips, connected by ultra-low-latency RDMA cluster networking and various HPC storage options. OCI Compute virtual machines and OCI’s bare metal NVIDIA GPU instances power applications for generative AI, computer vision, natural language processing, and recommendation systems.

This partnership aims to meet the growing demand for AI infrastructure, enabling OpenAI and other AI innovators to scale their workloads efficiently on OCI.

Apple’s iOS 18 will let you record phone calls without a third-party app

Recording phone calls on your iPhone has long been doable, but it has always required a third-party recording app. Now, Apple has unveiled a built-in phone recording feature available via the AI skills introduced with iOS 18.

Based on a quick preview at Apple's WWDC 2024 on Monday, the new phone call recording option will appear on the screen during a live call. Tapping the Record button will kick off the recording and alert you and the other person that the call is being recorded. A soundwave indicates the audio level of the recording, while a timer shows you how long the recording has been running.

Also: Apple staged the AI comeback we've been hoping for — but here's where it still needs work

When you're done, the recording is sent to the Notes app where you can play it. (You can also start an audio recording from a note.) Using the AI-powered Apple Intelligence technology, the recording is automatically transcribed for you to read. Plus, you can ask the AI to summarize the call to zero in on the key topics and comments in your conversation.

The recording and transcription are saved in the note for future reference. You can then search for specific text in the transcription and even combine parts of the conversation with other notes.

For now, direct call recording on an iPhone is possible only through such apps as TapeACall, Call Recorder Pro, Rev Voice Recorder, Easy Voice Recorder, and Google Voice (incoming calls only). But many of the third-party apps are either limited or impose a fee for full recording and transcription. The built-in call recording promises to be simpler and more seamless.

Also: Your Apple Watch is getting an upgrade — here are the coolest features in WatchOS 11

The transcription capability will support English, Spanish, French, German, Japanese, Mandarin Chinese, Cantonese, and Portuguese. iOS 18 is currently available to anyone as a developer beta. However, you won't be able to try out the recording and transcription until the AI-based Apple Intelligence arrives in a future beta this fall.

Oracle and Google Cloud Partner to Simplify Multicloud Deployments

Oracle and Google Cloud today announced a partnership enabling customers to combine Oracle Cloud Infrastructure (OCI) and Google Cloud technologies, accelerating application migrations and modernisation.

Google Cloud’s Cross-Cloud Interconnect will initially be available for customer onboarding in 11 global regions, allowing the deployment of general-purpose workloads without cross-cloud data transfer charges.

Later this year, a new offering, Oracle Database@Google Cloud, will be available, featuring Oracle database and network performance on par with OCI.

The partnership aims to simplify cloud migration, multicloud deployment, and management. Both companies will jointly market Oracle Database@Google Cloud, targeting enterprises across financial services, healthcare, retail, manufacturing, and more.

“Customers want the flexibility to use multiple clouds,” said Larry Ellison, Oracle Chairman and CTO. “To meet this demand, Google and Oracle are connecting Google Cloud services with the latest Oracle Database technology.”

Sundar Pichai, CEO of Google and Alphabet, added, “This partnership will help customers use Oracle database and applications with Google Cloud’s platform and AI capabilities.”

Oracle Database@Google Cloud offers direct access to Oracle database services running on OCI and deployed in Google Cloud datacenters. This service is designed to help customers accelerate cloud migration and modernize IT environments while leveraging Google Cloud infrastructure, tools, and AI services, including Vertex AI and Gemini foundation models.

Benefits for customers include:

  • Flexible migration options for Oracle databases to Google Cloud, with compatibility with migration tools such as Oracle Zero-Downtime Migration.
  • Simplified purchasing via Google Cloud Marketplace, allowing use of existing Google Cloud commitments and Oracle license benefits.
  • Unified customer experience and support from Google Cloud and Oracle.
  • Deployment of Oracle database services, including Oracle Exadata Database Service, Oracle Autonomous Database Service, MySQL Heatwave, and more, within Google Cloud datacenters.
  • Integration of Oracle data with Google’s AI services to enhance applications in customer service, employee services, creative studios, and more.

Oracle will manage Oracle database services within Google Cloud datacenters globally, starting with regions in North America and Europe. Oracle Exadata Database Service, Oracle Autonomous Database Service, and Oracle Real Application Clusters (RAC) will launch later this year in US East (Ashburn), US West (Salt Lake City), UK South (London), and Germany Central (Frankfurt), with further expansion planned.

The partnership also offers customers the ability to deploy workloads across both OCI and Google Cloud regions without cross-cloud data transfer charges. Initial onboarding will be available in 11 regions, including Australia East (Sydney), Australia South East (Melbourne), Brazil East (São Paulo), Canada South East (Montreal), Germany Central (Frankfurt), India West (Mumbai), Japan East (Tokyo), Singapore, Spain Central (Madrid), UK South (London), and US East (Ashburn).

This collaboration allows customers to innovate using the best combination of Oracle and Google Cloud services, providing low-latency, high-throughput private connections between the two cloud providers.

Customers can run multiple Oracle applications on OCI with distributed data stores on both OCI and Google Cloud and build new cloud-native applications using Google Cloud’s AI technologies.

The new multicloud capabilities offer a fully integrated experience for deploying, managing, and using Oracle database instances within Google Cloud, enabling data movement and deployment of new cloud-native applications across both clouds. This integration allows organisations to leverage existing skills while utilising the best of Oracle and Google Cloud capabilities.

Velocity Launches Vani AI, India’s First AI-based Interactive Calling Solution for Financial Institutions

Velocity, a top Indian cash flow-based financing platform located in Bengaluru, has launched Vani AI, an in-house developed GenAI conversational tool designed to transform customer service in the financial sector by providing a human-like interactive experience.

Leveraging advanced voice synthesis and speech recognition technology, Vani AI facilitates natural, human-like conversations in multiple languages and dialects, surpassing the one-way communication of traditional robotic IVR systems. It can help financial institutions reduce operational costs by 20-30% while enhancing customer experience.

Vani AI is built with industry-specific intelligence and awareness, enabling it to answer contextual questions and provide subjective responses. The tool enhances its performance continuously by utilising customer data and organisational context, improving with each interaction. Additionally, it can also access and update structured databases.

Currently trained on Velocity’s proprietary data and publicly available datasets, Vani AI can be customised for any BFSI company to significantly reduce operational expenses.

“We started by understanding how we could leverage GenAI for our internal use cases, such as lead qualification, data collections and collections calling. We realized human calls were expensive and non-standard in content, context, and tonality. By using AI internally, we were able to automate a lot of these processes and achieve great results, which led us to develop Vani AI as a solution for the broader financial services sector,” said Saurav Swaroop, Co-Founder and CTO, Velocity.

“With Vani AI, financial institutions can customise AI agents to make automated, intelligent, and human-like calls across various financial service use cases,” he added.

In Velocity’s own use cases, Vani AI achieved over 85% accuracy as a key metric.

Designed for seamless integration, it can easily connect with downstream and upstream systems such as WhatsApp, calendars, and CRMs, among others, ensuring smooth operation with existing infrastructure and processes.

The support for multiple Indian languages and dialects enables Vani AI to serve a diverse customer base. It can also be used for voice customisation by replicating a person’s voice from a 30-second voice recording.

While Vani AI can handle common customer inquiries, it seamlessly transitions calls it cannot resolve to human agents, and it integrates with CRM systems, marketing automation software, and other third-party tools.

Its CRM integration includes features like sending document upload links during conversations and logging this action in the CRM. It can also schedule callback appointments and check the availability of the assigned human agent.
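
Vani AI's internals are proprietary, but conversational tools of this kind commonly follow a tool-calling pattern: each resolvable intent maps to an action (send a link, schedule a callback, log to the CRM), and anything else is handed to a human. A hypothetical sketch:

```python
# Illustrative only: not Vani AI's code. Each caller intent maps to a
# "tool" (CRM action); unresolved intents escalate to a human agent.
def send_upload_link(ctx):
    ctx["crm_log"].append("document upload link sent")
    return "I've sent you a link to upload the document."

def schedule_callback(ctx):
    ctx["crm_log"].append("callback scheduled with assigned agent")
    return "I've scheduled a callback with your relationship manager."

TOOLS = {
    "upload_document": send_upload_link,
    "request_callback": schedule_callback,
}

def handle_turn(intent: str, ctx: dict) -> str:
    tool = TOOLS.get(intent)
    if tool is None:  # unresolvable intent: hand off to a human
        ctx["crm_log"].append("escalated to human agent")
        return "Let me connect you to an agent who can help."
    return tool(ctx)

ctx = {"crm_log": []}
print(handle_turn("upload_document", ctx))
print(handle_turn("dispute_charge", ctx))  # falls through to a human
print(ctx["crm_log"])
```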

Vani AI also offers options for hyper-personalised and customised calls to meet specific business needs. It includes a collection of ready-to-use scenarios tailored for fintechs, banks, NBFCs, and financial services firms, with plans for expansion into other sectors in the future.

Apple’s AI extravaganza left out 2 key advances — maybe next time?

The unveiling of Apple's over-arching strategy for AI on Mac, iPhone, and iPad on Monday contained numerous intriguing features under the rubric "Apple Intelligence," a clever re-branding of the ubiquitous acronym.

ZDNET's Sabrina Ortiz has the details, which include many ways in which Apple software can enhance the device experiences, and also tap into OpenAI's ChatGPT. There was also a big play for security and privacy in Private Cloud Compute.

One glaring omission, however, is the lack of what's called "on-device" training.

Also: Everything Apple announced at WWDC 2024, including iOS 18, Siri, AI, and more

AI models — groupings of neural networks such as GPT-4o and Gemini — are developed during an initial phase in the lab known as training. The neural net is given numerous examples of success and its results are tweaked until they produce optimal answers. That training becomes the basis of the neural network's question-answering, known as inference.
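
A toy example makes the split concrete: gradient descent adjusts the weights during training, and inference then reuses the frozen weights on new inputs.

```python
# Toy training-vs-inference demo: fit y ≈ 3x + 0.5 from noisy examples.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, 200)  # the "examples of success"

w, b = 0.0, 0.0
for _ in range(500):          # training: tweak weights to reduce error
    err = (w * x + b) - y
    w -= 0.1 * (err * x).mean()
    b -= 0.1 * err.mean()

def infer(x_new: float) -> float:  # inference: weights are now fixed
    return w * x_new + b

print(infer(0.25))  # ≈ 3.0 * 0.25 + 0.5 = 1.25
```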

While Apple didn't disclose any technical details of what Gen AI it is using, the descriptions suggest the on-board capabilities — the capabilities on the iPhone, iPad, and Mac — do not include training of the neural networks, even though that's an area where Apple has offered original research.

Instead, what is offered appears to be a form of "retrieval-augmented generation," or RAG, a growing approach that performs inference — the making of predictions — by tapping into a database. Apple refers to its approach as the "Semantic Index," which knows about the user's personal data.

That's no small thing: augmenting Gen AI with on-board, personal data is itself quite an accomplishment for inference "at the edge" rather than in the cloud.

Also: Apple staged the AI comeback we've been hoping for — but here's where it still needs work

But it's not the same as on-board training. Apple's most interesting research work to date (at least, what's publicly disclosed) is to conduct some training on the client device itself.

What can you do if you train the neural net on a person's constantly updated device data?

A simple example is to boost image categorization by giving the neural net more context about what's in the image. This is not "a cat" in the photo you're looking at but your cat, similar to the many others you have taken, presented to you as an instant album of your cat, similar to what Apple does today when it recognizes faces in portraits.

Walking through an art gallery, if you snap a pic of a painting, your phone might recall connections between that artist and something you snapped in a museum last month.

Also: Apple Intelligence FAQ: Every new feature, what models support it, and privacy concerns

Apple may be doing some re-training of neural nets in the cloud via Private Cloud Compute, given that training neural nets takes a lot of computing power — more than most client devices possess.

While ZDNET noted earlier this year that 2024 could be the year AI learns "in the palm of your hand," Monday's event suggests it could take a couple more iPhone generations before training on the device is possible.

There was another glaring omission: Apple's announcements dealt mostly with data already on the device, not with leveraging the device's sensors, especially the cameras, to enhance the world around you.

Apple could, for example, apply Gen AI to how the camera is used as an AI companion, such as letting the assistant help the user pick the best frames when taking a multi-exposure "live" photo. Even better, "Tell me what's wrong with this composition" is the kind of photography-for-dummies advice some people might want in real-time — before they press the shutter button.

Also: 2024 may be the year AI learns in the palm of your hand

Apple instead showed off some modest AI enhancements for post-production, such as fixing an already snapped photo by later removing background objects. That's not the same as a live camera agent that helps you while you are using the camera.

It seems likely Apple will get to both on-device training and applying Gen AI to the sensors at some point. Both approaches play to Apple's integrated control of hardware and software.

Yandex Unveils YaFSDP for 26% Faster LLM Training 

Russian technology MNC Yandex has introduced YaFSDP, an open-source tool designed to improve the efficiency of training LLMs. This method improves GPU communication and reduces memory usage, offering a speedup of up to 26% over existing tools.

YaFSDP outperforms the traditional FSDP method, showing improvements in training speed, especially for large models. For example, YaFSDP achieved a 21% speedup on Llama 2 with 70 billion parameters and a 26% speedup on Llama 3 with the same number of parameters. These enhancements make YaFSDP a valuable tool for AI developers working with large, complex models.

By optimising GPU consumption, YaFSDP can save developers and companies significant amounts of money—potentially hundreds of thousands of dollars monthly.

“Currently, we’re actively experimenting with various model architectures and parameter sizes to expand YaFSDP’s versatility,” said Mikhail Khruschev, senior developer at Yandex and part of the team behind YaFSDP.

The open-source tool is available on GitHub.

Benefits and Implementation of YaFSDP

LLM training requires substantial computing power and resources, often resulting in high costs and extended training times. YaFSDP addresses these challenges, leading to faster training times and reduced resource consumption.

For example, in scenarios involving models with 70 billion parameters, YaFSDP can save the equivalent of about 150 GPUs. This translates to potential monthly savings ranging from $0.5 to $1.5 million, depending on the GPU provider. The tool is particularly effective during the most communication-intensive stages of LLM training, such as pre-training, alignment, and fine-tuning.
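
YaFSDP's own API is documented in its GitHub repository and not detailed in this article. For context, a minimal sketch of the baseline PyTorch FSDP pattern it reportedly optimises looks like this (launch with torchrun; the tiny model is a stand-in for a large transformer):

```python
# Baseline PyTorch FSDP training step (illustrative; see the YaFSDP repo
# for its actual drop-in usage). Run with: torchrun --nproc_per_node=8 train.py
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Sequential(      # stand-in for a large transformer
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).cuda()

# FSDP shards parameters, gradients, and optimizer state across ranks;
# YaFSDP's reported gains come from scheduling this communication and
# memory traffic more efficiently.
model = FSDP(model, device_id=local_rank)
optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 4096, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optim.step()

dist.destroy_process_group()
```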

Previously, the company has developed and shared several other open-source tools, including DataLens, CatBoost, YTsaurus, AQLM, and Petals.

Microsoft Rolls Out VALL-E 2, Attains Human-Level Speech Synthesis

Building on the success of VALL-E, Microsoft has introduced VALL-E 2, a neural codec language model designed to achieve human-level performance in zero-shot text-to-speech (TTS) synthesis.

This model rolls out two new features called Repetition Aware Sampling and Grouped Code Modeling to improve the stability and efficiency of the speech synthesis process.

Let’s take a look at the new methods.

  1. Repetition Aware Sampling: This method refines the traditional nucleus sampling by considering token repetition in the decoding history to improve stability and prevent the infinite-loop issues encountered in earlier models (a simplified sketch follows this list).
  2. Grouped Code Modeling: This technique organises codec codes into groups to reduce sequence length, thereby speeding up inference and addressing the challenges associated with long sequence modelling.
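
Based on the paper's description, Repetition Aware Sampling can be sketched as nucleus sampling with a fallback: if the sampled token already dominates the recent decoding window, resample from the full distribution. The window and threshold below are illustrative, not the paper's values.

```python
import numpy as np

def nucleus_sample(probs: np.ndarray, top_p: float, rng) -> int:
    """Standard nucleus (top-p) sampling over a token distribution."""
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    return int(rng.choice(keep, p=probs[keep] / probs[keep].sum()))

def repetition_aware_sample(probs, history, top_p=0.9, window=10,
                            threshold=0.3, rng=None):
    """Nucleus sampling, but if the chosen token already dominates the
    recent decoding history, fall back to random sampling from the full
    distribution to break the loop."""
    rng = rng or np.random.default_rng()
    token = nucleus_sample(probs, top_p, rng)
    recent = history[-window:]
    if recent and recent.count(token) / len(recent) > threshold:
        token = int(rng.choice(len(probs), p=probs))  # full-distribution fallback
    return token

probs = np.array([0.70, 0.20, 0.07, 0.03])
history = [0, 0, 0, 0, 0]  # token 0 is starting to loop
print(repetition_aware_sample(probs, history, rng=np.random.default_rng(0)))
```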

These innovations enable VALL-E 2 to synthesise speech with high accuracy and naturalness, even for complex sentences. The model requires only simple speech-transcription pair data for training, simplifying the data collection and processing.

The model has been evaluated on the LibriSpeech and VCTK datasets, demonstrating superior performance in speech robustness, naturalness, and speaker similarity compared to previous systems. It is the first model to achieve human parity on these benchmarks, producing high-quality speech for complex and repetitive sentences.

Read the full paper here.

What Makes VALL-E 2 Better

In January 2023, the company introduced VALL-E, which demonstrated in-context learning capabilities in zero-shot scenarios after being pre-trained on 60,000 hours of English speech data.

However, it faced issues with stability and efficiency. VALL-E relied on random sampling, which could lead to unstable outputs, and its autoregressive architecture resulted in slow inference speeds.

Follow-up works have tried to address these problems by leveraging text-speech alignment information and non-autoregressive methods, but these approaches introduced new complexities and limitations.

The capabilities of VALL-E 2 can be particularly beneficial for generating speech for individuals with speech impairments, such as those with aphasia or amyotrophic lateral sclerosis.

While the new model has significant potential, it also carries risks of misuse, such as voice spoofing or impersonation. The model assumes user consent for voice synthesis. In real-world applications, it should include protocols for speaker approval and detection of synthesised speech to prevent abuse.
