“Oh, Another Billion Dollars,” Chuck Robbins Says Cisco’s AI Strategy is Different

Cisco Live 2024 is proving to be a phenomenal event in Las Vegas. Apart from boasting Elton John as the star performer, the company has announced a billion-dollar AI investment fund, which CEO Chuck Robbins claims is not just another fund.

“Everybody yawns when you hear about a billion dollars in AI these days — ‘Oh, another billion dollars’,” said Robbins. He emphasised that Cisco’s strategy is quite different from that of other companies currently pouring billions of dollars into the industry.

Cisco Investments has unveiled a $1 billion AI investment fund to bolster the startup ecosystem and further advancements in generative AI (GenAI) and large language models (LLMs). Additionally, the company revealed its financial support for Cohere, Mistral AI, and Scale AI, all of which have achieved billion-dollar valuations through their continuous fundraising efforts.

These three startups are among over 20 acquisitions and investments Cisco has made in AI in recent years. The $1 billion Global AI Investment Fund aims to further expand these efforts. The fund has already allocated nearly $200 million.

The company is also fuelling competition against OpenAI by investing in startups such as Mistral AI and Cohere. Cisco’s support for Cohere is part of a $450 million funding round that includes contributions from NVIDIA, Salesforce Ventures, and PSP Investments. This infusion of capital brings Cohere’s valuation to $5 billion.

The Agnostic Play of AI

Much like other tech giants, Cisco has made numerous AI acquisitions and investments over the past several years while also integrating generative AI across its product portfolio. Notably, in March, Cisco completed a $28 billion acquisition of the observability platform Splunk.

Beyond these investments, Cisco is collaborating with NVIDIA to enable enterprise generative AI infrastructure and ensure optimal performance of generative AI models. Cisco’s new Nexus HyperFabric AI cluster solution, created in partnership with NVIDIA, offers a pathway for enterprises to operationalise generative AI using NVIDIA’s hardware.

The Cisco HyperFabric solution is an enterprise-ready, end-to-end infrastructure solution designed to scale generative AI workloads. This sounds eerily similar to the approach that Databricks and Snowflake are taking for providing infrastructure for LLM training and deployment.

According to Cisco corporate development SVP Derek Idemoto, the company considers several factors before making investments. The first question they ask is, “Does it address the real and evolving needs of our diverse customer base?”

“Put simply, we connect and protect the AI era. Only Cisco brings together extensive experience with AI-scale infrastructure and domain expertise in building AI across networking, security, and observability,” said Idemoto.

For Cisco, AI Has Always Been Big Because It Has Always Been a Data Company

Robbins said in the keynote that AI is moving at an unprecedented pace and generative AI is “coming on the scene very quickly.” Comparing it with the cloud boom, Robbins said, “I think AI is going to be that on steroids.”

Over the past few years, Cisco has actively integrated AI into its products. Webex initially spearheaded this effort by introducing features such as background noise removal, live transcripts, and meeting insights.

Since then, AI has significantly enhanced Cisco’s security offerings, highlighted by the launch of Hypershield around the RSA Conference 2024. Additionally, Cisco has rolled out AI assistants to help administrators better manage infrastructure.

Moreover, Cisco has announced a partnership with Tech Mahindra to deliver next-generation firewall (NGFW) modernisation for customers and its workforce. Just recently, Lenovo also teamed up with Cisco to jointly design and engineer integrated products focused on networking, purpose-built AI infrastructure, and generative AI solutions.

Despite having a comprehensive AI strategy that spans its entire portfolio, the integration remains incomplete. By combining its network, observability, and security telemetry with Splunk data, Cisco could leverage a broader dataset to power its AI projects. Data has long been central to Cisco’s business, and now, with NVIDIA and others, the company is betting big on generative AI with its unique data.

Earlier this year, Robbins appointed Mark Patterson as the company’s chief strategy officer, tasked with shaping Cisco’s overarching AI narrative. While it is unrealistic to expect Patterson to have all the answers immediately, we can anticipate valuable insights into the company’s future direction.

This strategic focus on AI is crucial for Cisco, as it will be integral to its customers’ AI journeys.

The post “Oh, Another Billion Dollars,” Chuck Robbins Says Cisco’s AI Strategy is Different appeared first on AIM.

Top 6 Parallel Computing Alternatives to CUDA

CUDA is a wonderful piece of tech that allows you to squeeze every bit of performance out of your NVIDIA GPU. However, it only works on NVIDIA hardware, and it’s not easy to port existing CUDA code to other platforms.

So, naturally, you look for an alternative to CUDA.

What are the alternatives to CUDA?

  1. OpenCL: An open standard for parallel programming across CPUs, GPUs, and other processors, with some performance overhead compared to CUDA.
  2. AMD ROCm: An open-source GPU computing platform developed by AMD that allows porting of CUDA code to AMD GPUs.
  3. SYCL: A higher-level programming model based on C++ for heterogeneous processors, enabling code portability across CUDA and OpenCL through implementations such as Intel’s DPC++ and hipSYCL.
  4. Vulkan Compute: The compute API of the Vulkan graphics framework, enabling GPU computing on a wide range of GPUs with lower-level control.
  5. Intel oneAPI: A cross-architecture programming model from Intel, including a DPC++ compiler for SYCL, offering an alternative to CUDA for Intel GPUs.
  6. OpenMP: A directive-based API for parallel programming on CPUs and GPUs; recent versions support GPU offloading as an alternative to CUDA.

Let’s look at each in more depth.

1. OpenCL

OpenCL (Open Computing Language) is an open industry standard maintained by the Khronos Group for parallel programming across various processor architectures.

OpenCL lets you write a program once and run it on processors from different vendors, such as AMD, Intel, and NVIDIA.

This can be useful if you want to use the hardware you already have or if you want to choose the best processor for a specific task, regardless of which company made it.

2. AMD ROCm

ROCm (Radeon Open Compute) is a platform designed by AMD to run code effectively on AMD GPUs. Best of all, ROCm is open source and freely available to everyone.

One of the most important parts of ROCm is the Heterogeneous-computing Interface for Portability, or HIP. HIP’s syntax closely mirrors CUDA’s, which means that if you already know how to program in CUDA, there’s no steep learning curve when switching over.

There’s even a tool called HIPIFY that can automatically convert CUDA code into code that works with HIP and AMD GPUs, with just a few minor changes required.

3. SYCL

SYCL (pronounced “sickle”) is a higher-level programming model based on standard C++ for heterogeneous processors. Because SYCL is built on top of the C++ programming language, it enables code portability across OpenCL devices.

The core idea of SYCL is to provide the performance of OpenCL with the flexibility of C++. Prominent SYCL implementations include Intel’s DPC++ (Data Parallel C++), based on Clang/LLVM, which can target both CUDA and OpenCL devices.

4. Vulkan Compute

Vulkan’s low-overhead design and close-to-metal nature can enable performance close to, and sometimes even exceeding, CUDA’s in many compute workloads. It provides compute shaders to enable GPU computing.

Since Vulkan Compute is a relatively new technology, its ecosystem is still maturing in terms of libraries, tools, and language bindings. It also has a steeper learning curve, especially when graphics interoperability is involved.

However, new Vulkan Compute-focused frameworks like Kompute are emerging to make Vulkan GPU computing more accessible.

While Vulkan Compute can also interoperate with APIs like OpenCL, CUDA, and DirectX 12, some specific features, such as CUDA’s dynamic parallelism, are not available with Vulkan.

5. Intel oneAPI

oneAPI is an open, unified programming model developed by Intel that aims to simplify development across diverse computing architectures (CPUs, GPUs, FPGAs, and other accelerators).

oneAPI consists of a core set of tools and libraries, including the DPC++ language and libraries for deep learning, machine learning, and more.

A key goal of oneAPI is to provide an alternative to proprietary models like NVIDIA’s CUDA. It aims to prevent vendor lock-in and allow code portability across Intel, NVIDIA, AMD and other hardware.

Furthermore, case studies have shown up to an 18x speedup for compute-intensive algorithms using oneAPI tools and Intel hardware.

6. OpenMP

Open Multi-Processing, or OpenMP, is an API that supports multi-platform shared-memory parallel programming in C, C++, and Fortran. It has long been a standard for parallel computing on CPUs.

Recent versions of OpenMP, starting from version 4.0, have introduced support for GPU offloading. This allows OpenMP to be used for GPU computing as an alternative to CUDA.

OpenMP provides a higher level of abstraction compared to CUDA. It handles many low-level details like data movement and kernel launches automatically, which can make it easier to use for some developers.

CUDA is a proprietary solution from NVIDIA, fine-tuned to get the most out of NVIDIA hardware. So finding an exact replacement may not be possible, as CUDA will always have an edge on NVIDIA GPUs over any open platform. But if you want to run parallel computation on other GPUs, the alternatives above will get the job done efficiently.

The post Top 6 Parallel Computing Alternatives to CUDA appeared first on AIM.

H2O.ai Launches Generative AI Driven Native Apps on Snowflake Marketplace

At Snowflake Data Cloud Summit ‘24, H2O.ai unveiled native H2O ML and generative AI Apps in Snowflake Marketplace, offering seamlessly integrated workflows within Snowflake accounts. These apps aim to democratise access to LLMs for enterprises, enabling them to derive insights from their data more efficiently.

Sri Ambati, CEO and Founder of H2O.ai, emphasised the significance of this collaboration: “By embedding our predictive and generative AI apps into Snowflake’s ecosystem, we’re helping organisations get powerful insights from large language models right within their Snowflake accounts.”

H2O.ai and Snowflake are dedicated to empowering joint customers to leverage their data effectively. Three H2O ML and GenAI Bundles are now available for Snowflake users: H2O Predictive Modeling Starter Pack, H2O GenAI LLM Starter Pack, and H2O Machine Learning Starter Pack.

Key features of H2O.ai’s native applications within Snowflake Marketplace include:

  • Seamless Integration: Users can utilise GenAI models directly within Snowflake Native Applications, eliminating the need for complex integrations.
  • Data Enrichment: GenAI applications automatically generate new columns based on advanced calculations, enriching insights and facilitating informed decision-making.
  • Predictive AI: GenAI capabilities enable sophisticated question answering and content generation based on existing data stores, enhancing traditional predictive models.

Chris Child, Senior Director of Product Management at Snowflake, highlighted the value of this collaboration: “Partners like H2O.ai can deliver valuable capabilities like enriching data and deriving insights quickly, securely, and powerfully to customers in the Data Cloud.”

H2O.ai’s integration with Snowflake Native Apps enables customers to derive insights directly from their Snowflake account where their data resides.

Jeffrey Vagg, Chief Data and Analytics Officer at North American Bancard (NAB), shared his experience: “Integrating Driverless AI and eScorer within Snowflake’s cutting-edge container services and native apps has revolutionised our approach to data analytics.”

By leveraging the combined capabilities of Snowflake Native App Framework and Snowpark Container Services, developers can build sophisticated applications that run on configurable hardware options, distribute them on Snowflake Marketplace, and deploy them within customers’ Snowflake accounts, without requiring data movement.

The company has also released the H2O-Danube2-1.8B LLM, which builds on the achievements of its forerunner, H2O-Danube 1.8B, incorporating significant enhancements and refinements that position it as a leader in the 2B-parameter small language model (SLM) category.

The post H2O.ai Launches Generative AI Driven Native Apps on Snowflake Marketplace appeared first on AIM.

Snowflake Enhances AI for Enterprise with Upgrades to Cortex AI with Meta’s Llama 3 and Mistral LLMs

Snowflake is expanding access to enterprise AI with significant updates to Snowflake Cortex AI and Snowflake ML, democratising AI customisation through a no-code interactive interface and providing access to leading LLMs.

These enhancements include serverless fine-tuning capabilities and an integrated ML experience, enabling developers to manage models across the ML lifecycle. This unified platform allows businesses to derive more value from their data while ensuring full security and governance.

“Snowflake is at the epicentre of enterprise AI, putting easy, efficient, and trusted AI in the hands of every user so they can solve their most complex business challenges, without compromising on security or governance,” said Baris Gultekin, Head of AI at Snowflake.

The company is introducing two new chat capabilities, Snowflake Cortex Analyst and Snowflake Cortex Search, both entering public preview soon. These tools enable users to develop chatbots that interact with structured and unstructured data, facilitating faster and more efficient decision-making processes.

Cortex Analyst, utilising Meta’s Llama 3 and Mistral Large models, allows secure application building on Snowflake’s analytical data. Cortex Search integrates Neeva’s retrieval and ranking technology for enhanced document and text-based dataset searches.

Awinash Sinha, Corporate CIO at Zoom, highlighted the importance of Snowflake’s AI solutions for their enterprise analytics: “By combining the power of Snowflake Cortex AI and Streamlit, we’ve been able to quickly build apps leveraging pre-trained large language models in just a few days.”

Snowflake is also unveiling Snowflake Cortex Guard, an LLM-based safeguard for filtering harmful content across organisational data, further ensuring the safety and usability of AI models. This feature, leveraging Meta’s Llama Guard, will be generally available soon.

In addition to these advancements, Snowflake is introducing Document AI and Snowflake Copilot, both generally available soon. Document AI allows users to extract content from documents using the multimodal LLM Snowflake Arctic-TILT. Snowflake Copilot enhances productivity for SQL users by combining Mistral Large with Snowflake’s proprietary SQL generation model.

“Although businesses typically use dashboards to consume information from their data for strategic decision-making, this approach has some drawbacks including information overload, limited flexibility, and time-consuming development,” said Mukesh Dubey, Product Owner Data Platform at Bayer.

Snowflake’s new AI & ML Studio, currently in private preview, offers a no-code interface for AI development, enabling users to test and evaluate models for cost-effectiveness. Cortex Fine-Tuning, now in public preview, provides serverless customisation for a subset of Meta and Mistral AI models.

Additionally, Snowflake ML enhances MLOps capabilities, facilitating the management of models and features across their lifecycle. This includes the Snowflake Model Registry, now generally available, and the Snowflake Feature Store, in public preview.

These comprehensive updates reinforce Snowflake’s commitment to making AI accessible and effective for enterprises while maintaining robust security and governance frameworks.

The post Snowflake Enhances AI for Enterprise with Upgrades to Cortex AI with Meta’s Llama 3 and Mistral LLMs appeared first on AIM.

Cisco Live 2024: Cisco Unveils AI Deployment Solution With NVIDIA

Cisco will invest $1 billion in AI and package a new networking solution with NVIDIA’s AI infrastructure, the organization announced at its annual customer event on June 4. Cisco Live is being held in Las Vegas from June 2–6. These and other announcements at Cisco Live respond to enterprise trends toward AI, plus increased visibility and security.

Cisco partners with NVIDIA on Nexus HyperFabric AI clusters

A new networking solution from Cisco and NVIDIA called Cisco Nexus HyperFabric AI clusters is the companies’ attempt to create a smooth on-ramp to AI for customers who may not have in-depth knowledge of AI deployment or IT skills. Cisco Nexus HyperFabric AI clusters can be used to deploy, manage and monitor generative AI infrastructure for a business.

This diagram shows the interplay between Cisco on-prem AI infrastructure and NVIDIA hardware. Image: Cisco

“While the promise of AI is clear, the path forward for many just starting out is not. Customers often face economic and operational challenges to get an AI stack up and running,” said Jonathan Davidson, executive vice president and general manager of Cisco Networking, in a press release. “Cisco is committed to making the deployment and operation of AI infrastructure simpler.”

SEE: The UALink Promoter Group seeks to create a standard for AI infrastructure – notably, without NVIDIA.

The idea is that organizations need guidance in order to deploy AI infrastructure. So, the Nexus HyperFabric AI cluster solution combines Cisco Ethernet switching and cloud-managed options with NVIDIA AI Enterprise software, NVIDIA NIM inference microservices, NVIDIA Tensor Core GPUs, the VAST data platform and more.

General availability is expected in the fourth quarter of 2024, with select customers gaining access earlier in Q4.

To further ease IT teams into running generative AI, Cisco is offering training about AI infrastructure. The CCDE AI Infrastructure certification can be gained from Cisco Learning and Certifications.

Just the start of AI investment: Cisco kicks off global fund

Cisco is putting $1 billion into AI via its global investment fund, which is a pledge to support startups and generative AI solutions. Cohere, Mistral AI and Scale AI will receive funding from Cisco as strategic partners. Cisco has committed $200 million to this fund so far.

Cisco announces security enhancements for cloud, web apps and more

Cisco announced enhancements to existing security products:

  • The Firewall 1200 Series — with which enterprise branch locations no longer need multiple appliances for their switches, routers and firewalls — will be available October 2024.
  • Security Cloud Control, a management architecture for Cisco Security Cloud, which provides an AI overview for Cisco Secure Firewall, Secure Firewall Threat Defense, Secure Firewall ASA, Multicloud Defense and Hypershield, coming in September.
  • Technical add-ons for Splunk, which Cisco acquired in March, including new sources of Cisco telemetry within Splunk for the security operations center. These will roll out in the next few months.
  • Version 7.6 of Firewall Threat Defense for all Cisco firewalls.
  • Google Chrome Enterprise threat and data protection for all web apps secured by Cisco Secure Access.

Support for Cisco Hypershield added to AMD Pensando DPUs and Intel infrastructure

Cisco Hypershield, a security architecture the company unveiled in April 2024 that spreads security enforcement throughout virtual machines or Kubernetes clusters, is coming to some new platforms. AMD Pensando DPUs will support Cisco Hypershield by the end of 2024, starting with Cisco Unified Computing System servers; support will then expand to other server vendors and to Intel infrastructure processing units by the end of the year.

“By leveraging our DPUs in customer servers or in future Cisco networking platforms, Hypershield users can enjoy high-capacity throughput and intelligent policy enforcement without compromising on workload performance,” wrote Soni Jiandani, general manager of the networking technology and solutions group at AMD, in a press release.

ThousandEyes visibility engine comes to Digital Experience Assurance

ThousandEyes is a network visibility platform that Cisco acquired in 2020. Now, Cisco has added AI alerts from ThousandEyes to Digital Experience Assurance for Cisco Networking Cloud. ThousandEyes pulls data from across the Cisco ecosystem to provide proactive suggestions, sending device and telemetry data to customers’ domain controllers and management systems.

Now, ThousandEyes can map AWS environments, show a unified view of external and internal network conditions on Cisco and non-Cisco networking platforms in Traffic Insights, and take in Meraki Wi-Fi and Local Area Network telemetry and device information. ThousandEyes partnered further with Meraki to enhance the Meraki Assurance Overview.

TechRepublic is covering Cisco Live remotely.

I was a Copilot diehard until ChatGPT added these 5 features

Before ChatGPT's popularity skyrocketed, I was already testing the chatbot and other models. As a result, in the past two years, I have developed a sense of what makes a model great, including speed, reliability, accessibility, cost, features, and more. Since Copilot launched in February 2023, it has been at the top of my list — until now.

Copilot's competitive advantage was offering users features that ChatGPT reserved for ChatGPT Plus subscribers for free. These features also solved all of ChatGPT's most significant pain points, including access to the internet, current event knowledge, and citations. Consequently, Copilot brought the best value to users and earned its spot as the best AI chatbot in my book.

Also: The best AI chatbots of 2024: ChatGPT, Copilot and worthy alternatives

However, in May, OpenAI held its Spring Update event, launching highly anticipated upgrades to ChatGPT, including Copilot's standout features, such as web browsing, multimodal prompts, image generation, GPT-4 intelligence, document uploads, and more.

So why isn't it a tie? With the update, ChatGPT also gained features that Copilot doesn't have, including access to OpenAI's latest flagship model, GPT-4o, and other cutting-edge upgrades. Below is a round-up of the features that helped ChatGPT reclaim its crown (and how you can use them).

1. GPT-4o

I had to start with the crown jewel of the updates — GPT-4o. The ChatGPT upgrade to GPT-4o from GPT-3.5 granted the chatbot GPT-4 intelligence and a series of bonus features that enable it to perform at a higher level, including higher speeds and improved understanding of vision and audio.

Also: What does GPT stand for? Understanding GPT-3.5, GPT-4, GPT-4o, and more

The "o" stands for omni, referring to its ability to understand text, audio, image, and video inputs and output text, audio, and images. This is a significant upgrade from GPT-3.5, which could only understand and output text.

With GPT-4o, free ChatGPT users also got access to the features previously available on Copilot and ChatGPT Plus, including getting responses from the model and the web, chatting about images and documents, using the Memory feature, accessing the GPT Store, and analyzing data.

Also: OpenAI just gave free ChatGPT users browsing, data analysis, and more

Copilot still runs on GPT-4 Turbo, which is a very intelligent model despite lacking the many upgrades the new model boasts, as discussed above.

2. Improved understanding of images

Before this update, you couldn’t upload an image in the free version of ChatGPT; you had to use Copilot or ChatGPT Plus instead. Now, users can upload images to the free version of ChatGPT, and the feature has been upgraded to offer better insights.

Also: How to subscribe to ChatGPT Plus (and 5 reasons why you should)

OpenAI shared an example of how users could leverage the improved vision by uploading a picture of their menu for the chatbot to translate and offer in-depth insights, such as its history, significance, and recommendations. Another example of its advanced capabilities is that users can upload an image and have the chatbot extract the text and turn it into code.

3. GPTs and GPT Store

Perhaps one of the most significant advantages of ChatGPT over Copilot is that you can create GPTs for free on ChatGPT, while on Copilot you have to pay for a Copilot Pro subscription at $20 per month. By creating your own GPT, you can customize a chatbot to meet your needs, which saves you lots of time in your workflow because you no longer have to prompt a chatbot to perform the same function every single time.

Also: You can now make your own custom Copilot GPT. Here's how

Another bonus is that free ChatGPT users can now access the GPT Store, which offers millions of customized chatbots created by other users and developers to help with specific tasks. The store includes GPTs from popular applications and sites, including AllTrails, Khan Academy Code Tutor, Canva, and more.

4. Data analysis

Many people rely on chatbots to assist with tedious data analysis tasks. Luckily, both Copilot and ChatGPT can do that. The major difference is that Copilot's data assistance is limited to Copilot Pro subscribers and works in Excel to generate formulas and analyze and summarize data.

Also: How to use ChatGPT to make charts and tables with Advanced Data Analysis

However, free ChatGPT now offers advanced data analysis tools, including creating interactive charts and tables to visualize your data, interpreting CSV files or spreadsheets, data summarization, and more. Since these operations don't have to occur in Excel, users can get assistance in whatever data management platform they prefer for free.

5. More languages

As mentioned above, GPT-4o gave ChatGPT some pretty incredible translation capabilities, made possible by its support for more languages. Since the update, ChatGPT has supported more than 50 languages. By default, the chatbot detects the browser’s language and updates ChatGPT to match, or users can switch it manually in settings.

Also: Forget Copilot: 5 major AI features Google just rolled out to Chromebooks

The increased language support means that people around the globe can access ChatGPT in their language of choice. Copilot does not have such expansive availability, supporting only 24 languages.

Bonus: New Voice Mode (coming soon)

If you tuned into the Spring Update event, you likely remember the new Voice Mode demo, which stole the show. Although Voice Mode already existed in the chatbot, the new and improved version completely elevates the experience using GPT-4o’s video and audio capabilities.

With the new Voice Mode, ChatGPT can use the context from your environment to provide voice answers; in OpenAI’s demo, the chatbot comments on the user’s emotions just by looking at his face.

Another plus is that the new Voice Mode has been optimized to produce more natural conversation, stopping when interrupted and varying its intonation, as shown in the demo. This experience will first roll out to ChatGPT Plus users in alpha and will become more broadly available to free users afterward. By contrast, Copilot’s voice input acts mostly like a standard, unnatural voice assistant.

Was ChatGPT down for you? OpenAI’s chatbot hit by major outage — here’s what happened

ChatGPT went down for the second time today due to a major OpenAI outage, but it's finally back up and running. The artificial intelligence (AI) chatbot has become a major productivity tool for many users. As the US began the workday this morning, many users could not access it from 10:30 a.m. through 1:17 p.m. ET.

Also: The best AI chatbots of 2024: ChatGPT, Copilot and worthy alternatives

At that time, OpenAI updated its status page to report the outage's resolution:

"Resolved — We experienced a major outage impacting all users on all plans of ChatGPT. The impact included all ChatGPT-related services. The impact did not include platform.openai.com or the API. This incident started June 4th at 2:15p GMT and was resolved June 4th at 5:01p GMT."

Before then, OpenAI had recommended ChatGPT users perform a hard refresh by clearing their browser cache before reopening the AI chatbot.

OpenAI reported its first major outage of the day this morning, which began around 2:30 a.m. ET. Five hours later, the company pushed out a fix, resolving the issue. Not everyone was affected, however; reports suggest the outage mostly hit logged-in users. Still, it was widespread, affecting the web version as well as the mobile and Mac apps.

Also: I was a Copilot diehard until ChatGPT added these 5 features

If you find yourself looking for AI chatbot alternatives to ChatGPT in the event of another outage, here are a few you can try:

  • Microsoft Copilot: Touted as the best ChatGPT alternative, Copilot is accessible online and can access internet sources. Copilot also uses GPT-4, OpenAI’s LLM, but wasn’t affected by the outage.
  • Gemini: If you haven’t tried Google’s chatbot yet, this is a good opportunity. This free AI chatbot has access to Google and replies quickly.
  • You.com: This AI bot lets you use the most advanced LLMs, including GPT-4, GPT-4o, Gemini 1.5 Pro, Llama 3, and Claude 3 Opus, and is available for free with web access.

Ola will Save INR 15 Crore Annually Using Krutrim AI Cloud

Ola’s decision to use Krutrim AI Cloud instead of Microsoft Azure and AWS cloud infrastructure is going to save them approximately INR 15 crore annually.

“Cloud costs are crazy. He [Bhavish Aggarwal] basically used to burn INR 5 lakh per day, which is quite a bit of money,” said Sasank Chilamkurthy, the founder of Qure AI and Von Neumann AI (the company building a personal AI server JOHNAIC), in an exclusive interview with AIM.

Chilamkurthy swiftly did some number-crunching and estimated that it would cost about INR 8.2-10 crore to buy servers capable of handling Ola’s current load on Azure. Had the company not migrated, the cloud expense over three years would have totalled INR 54.75 crore.

The difference, INR 54.75 crore minus INR 8.2 crore, amounts to about INR 46.5 crore over three years, translating to substantial savings of roughly INR 15 crore annually.

Assuming a company uses an AWS g4dn.4xlarge instance and reserves it for three years, it would cost them INR 10-25 lakh for a specific instance, according to Chilamkurthy.

According to estimates and disclosures by the Ola chief on X, this shift could lead to a daily revenue loss of INR 5-25 lakh for Microsoft Azure and INR 30-40 lakh for AWS. Annually, this could translate to approximately INR 18.25-91.25 crore for Azure and INR 109.5-146 crore for AWS.

Moreover, according to Chilamkurthy, Ola spent around INR 85 lakh on egress costs paid to Azure. Egress cost is the charge levied by cloud providers for moving or transferring data from the cloud storage where it was earlier uploaded.

Chilamkurthy told AIM that he recently bumped into Aggarwal at a cafe, where he pitched JOHNAIC for Krutrim AI Cloud. He said that Aggarwal showed some interest in JOHNAIC, partly because it also utilises Intel GPUs. He further revealed that Ola Krutrim will buy Intel Gaudi 2 to build Krutrim AI Cloud.

In a recent post, Aggarwal said that over 2,500 developers have signed up for Krutrim Cloud services, and they will work with everyone to onboard them in the coming weeks. Most recently, Aggarwal met Arm chief Rene Haas in Taiwan.

JOHNAIC Cloud in a Box

Von Neumann AI recently launched JOHNAIC, described as a “cloud in a box” solution. It provides the benefits of cloud computing, such as scalability and flexibility, but within a local, on-premises environment. This hybrid approach combines the best of both worlds—cloud capabilities with local data control.

It comes with the following hardware configurations: an Intel i5 12400 CPU, an Intel Arc 770 16 GB GPU, 64 GB of RAM, a 1 TB SSD, and an optional 1100 VA UPS.

“We have sachetised AI into small boxes so that you can personalise them and take them wherever you want,” said Chilamkurthy. It claims to slash AI costs by 85% and comes with in-built SaaS and AI tools to run SMEs and startups.

JOHNAIC comes with Ubuntu, Matrix, and ERPNext, supports oneAPI, and can run Meta Llama. “JOHNAIC is a one-time investment of around INR 2 lakh, resulting in 80-92% savings,” said Chilamkurthy, adding that users don’t need to buy an Apple MacBook, which comes at a similar price, to run AI applications.

Chilamkurthy is currently targeting the inference market with JOHNAIC. “It is very reasonable and affordable for most folks and startups trying to do something cool using OpenAI GPT’s APIs,” he said.

People+AI is the first customer of JOHNAIC and is using it for its own AI requirements while keeping its data private. Compared to public cloud services like AWS, JOHNAIC offers a highly competitive pricing model, making it an attractive option for businesses looking to reduce their AI infrastructure costs while maintaining robust capabilities.

“Even on AWS, you cannot really buy just one NVIDIA A100, you have to purchase it in a set of eight GPUs, which I think is really expensive. For most people, one A100 would be sufficient, and they would just run it for a week,” said Chilamkurthy.

Lately, Intel has been aggressively targeting India, aiming to seize the market share created by the unavailability of NVIDIA GPUs. “I don’t use NVIDIA because I believe they overcharge us. They don’t offer small packets like sachets; instead, they push for large server purchases,” said Chilamkurthy.

[Updated] June 4, 2024, 17:21 | The article has been updated to show that it costs about INR 8.2-10 Cr to buy servers that will handle the current load of Ola on Azure.

The post Ola will Save INR 15 Crore Annually Using Krutrim AI Cloud appeared first on AIM.

Build your own chatbot and talk to your own documents

Interview with Jans Aasman, CEO of Franz, Inc.

Image by Gerd Altmann from Pixabay

Jans Aasman’s AI background goes back to his training as a cognitive scientist at the University of Groningen in the Netherlands beginning in 1978.

Jans Aasman, CEO of Franz, Inc.

Since then, he’s seen over 40 years of AI’s evolution. Throughout that time, he’s been hands on with a range of modeling techniques, from LISP and Prolog, to description logic and statistical machine learning.

Jans is the CEO of Franz, Inc., which started as a LISP compiler company. Under his leadership, Franz designs enterprise-class graph database management systems (DBMSes) used for metadata-rich knowledge graph development. Now the company is at the leading edge of neurosymbolic or composite AI, the kind of AI that effectively blends three different branches of AI, as shown in this diagram:

Illustration courtesy of Franz, Inc.

One of the key advantages to using the semantic graph database approach that Jans describes is a continual learning loop. “Any analytics you ever do needs to go back into the database. You need to store the provenance, so you can go back and review a prediction,” Jans says. “If you don’t store all the metadata, within a few months, you don’t know what happened anymore.”

Toward the end of the interview, Jans elaborates on the beginnings of an agent-oriented knowledge graph, one you can talk to. This sort of composite AI, soon to be available in AllegroGraph 8.2, makes it possible to blend these elements together:

  • The precision of logic
  • LLM chatbot interface capabilities
  • The relevance of the wide range of heterogeneous data sources that a knowledge graph approach can make interoperable
  • The complex data handling and boundary-crossing graph query generation capabilities of graph database management systems
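As a rough illustration of the pattern Jans describes (and not AllegroGraph's actual API — every name here is a hypothetical stand-in), a chat-style question can be translated into a graph query, answered from stored triples, and written back with provenance:

```python
# Toy sketch of an agent-oriented knowledge graph: an "LLM" stand-in turns a
# question into a triple pattern, the pattern is matched against stored triples,
# and the answer is written back with provenance (the continual learning loop).

triples = [
    ("franz", "makes", "allegrograph"),
    ("allegrograph", "is_a", "graph_database"),
]

def translate(question):
    """Stand-in for an LLM that turns a question into a (subject, predicate, object) pattern."""
    if "who makes" in question.lower():
        product = question.lower().split()[-1].rstrip("?")
        return (None, "makes", product)   # None acts as a wildcard
    return None

def query(pattern):
    """Match a triple pattern against the store; None matches anything."""
    s, p, o = pattern
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def ask(question):
    """Answer a question and store the result back with provenance."""
    pattern = translate(question)
    answers = query(pattern) if pattern else []
    for t in answers:
        triples.append((question, "answered_by", t[0]))
    return [t[0] for t in answers]

print(ask("Who makes AllegroGraph?"))  # → ['franz']
```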

Hope you find this conversation as illuminating as I have.

Intel Lunar Lake NPU Brings 48 TOPS of AI Acceleration

On June 4 at Computex, held in Taiwan, Intel announced the next generation of its AI PC products: the Lunar Lake client processor architecture. Other reveals from Intel at the international computer expo were:

  • Pricing for the Intel Gaudi 2 and Gaudi 3 AI accelerator kits.
  • The release of Intel Xeon 6 processors with Efficient-cores (E-cores).

Intel products are available globally from the corporation and its distributors.

Lunar Lake processor runs at up to 48 TOPS

Intel revealed the details of the Lunar Lake processor, which will enable AI PC performance. Lunar Lake brings:

  • 40% lower system-on-chip power and more than three times the AI compute compared to the previous generation.
  • An NPU with up to 48 trillion operations per second.
  • A new GPU design with Xe2 GPU cores for improved graphics performance and Xe Matrix Extension arrays, a second AI accelerator delivering up to 67 TOPS.

Lunar Lake will appear in AI PCs from more than 20 brands, including Microsoft, throughout 2024.

The Lunar Lake processor can perform 48 TOPS of AI operations. Image: Intel Corporation

Competitors to Lunar Lake

Intel’s Lunar Lake competes with Qualcomm’s Snapdragon X Elite, AMD’s AI 300 series, Apple’s M4 and an increasingly crowded field of chips designed to make generative AI work on PCs.

Companies are competing for greater TOPS speed and reduced power draw to enable AI PC capabilities like Microsoft’s Copilot. AMD claims its AI 300 series reaches 50 TOPS. NVIDIA is talking about getting between 200 and 1,300 TOPS with its GeForce RTX GPUs for gaming and heavy-duty creative work on PCs and workstations.

Last week, it was announced that Intel is part of a promoter group for a communications standard for AI accelerator chips in data centers, too.

For business use cases, faster AI performance could power seamless generative AI assistants. Copilot, for instance, runs on OpenAI’s technology. OpenAI and Microsoft are working on making their generative AI sound more like a human and remember where you might have left a lost item on your Copilot+ PC.

Intel Gaudi 2 and 3 pricing revealed

A standard AI kit made up of eight Gaudi 2 AI accelerators and a universal baseboard will be available to system providers at $65,000. Gaudi 2 is shipping now.

A kit made up of eight Gaudi 3 AI accelerators and a universal baseboard will be available to system providers at $125,000. It will be available in Q2 2024.

In addition to revealing the price of the kits, Intel announced at Computex that six new providers will be working with Gaudi 3. Asus, Foxconn, Gigabyte, Inventec, Quanta and Wistron have been added to the initial deals with Dell, Hewlett Packard Enterprise, Lenovo and Supermicro.

The Intel Xeon 6 E-core processor is available now

Initially introduced at the Intel Vision conference in April, the first of the two new Xeon 6 processors was made available beginning June 3, Intel announced at Computex. That processor, the Intel Xeon 6 E-core (code-named Sierra Forest), is suitable for cloud-native applications and offers:

  • Higher core density.
  • Better performance per watt.
  • Lower energy costs.

One tier up from the Xeon 6 E-core is the Xeon 6 P-core (code-named Granite Rapids), which has a shared software stack with the E-core version and is suited to AI and other high-performance computing projects.

The Xeon 6 P-core is expected to ship in the third quarter of 2024.

A worker at the Intel Kulim Test Assembly Test facility in Malaysia examines Intel Xeon 6 processors with E-cores.
A worker at the Intel Kulim Test Assembly Test facility in Malaysia examines Intel Xeon 6 processors with E-cores. Image: Intel Corporation

“AI is driving one of the most consequential eras of innovation the industry has ever seen,” said Intel CEO Pat Gelsinger in a press release. “The magic of silicon is once again enabling exponential advancements in computing that will push the boundaries of human potential and power the global economy for years to come.”