A Deep Dive into GPT Models: Evolution & Performance Comparison

By Ankit, Bhaskar & Malhar

Over the past few years, there has been remarkable progress in the field of Natural Language Processing, thanks to the emergence of large language models. A language model learns the probability distribution of word sequences and underpins tasks such as machine translation, where text must be mapped from one language to another. Among the family of language models, the Generative Pre-trained Transformer (GPT) based models have garnered the most attention in recent times. Early language models were rule-based systems that relied heavily on human input to function; the evolution of deep learning techniques has since expanded the complexity, scale and accuracy of the tasks these models can handle.

In our previous blog, we provided a comprehensive explanation of the various aspects of the GPT-3 model, evaluated the features offered by OpenAI's GPT-3 API, and explored the model's usage and limitations. In this blog, we shift our focus to the GPT model family and its foundational components. We also trace its evolution from GPT-1 to the recently introduced GPT-4 and examine the key improvements in each generation that made the models more capable over time.

1. Understanding GPT Models

GPT (Generative Pre-trained Transformer) is a deep learning-based Large Language Model (LLM) that uses a decoder-only transformer architecture. Its purpose is to process text data and generate text output that resembles human language.

As the name suggests, the model rests on three pillars:

  1. Generative
  2. Pre-trained
  3. Transformers

Let's explore the model through these components:

Generative: This feature emphasizes the model's ability to generate text by comprehending and responding to a given text sample. Prior to GPT models, text output was generated by rearranging or extracting words from the input itself. The generative capability of GPT models gave them an edge over existing models, enabling the production of more coherent and human-like text.

This generative capability is derived from the modeling objective used during training.

GPT models are trained using autoregressive language modeling: given an input sequence of words, the model predicts the next word by producing a probability distribution over its vocabulary and selecting the most probable word or phrase.
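To make this concrete, here is a minimal sketch of autoregressive generation, assuming a hypothetical `next_token_probs` function that stands in for a trained GPT model: at each step the current context is mapped to a probability distribution over the vocabulary, and a next token is chosen.

```python
import numpy as np

VOCAB = ["the", "dog", "sat", "on", "mat", "."]

def next_token_probs(context):
    # Stand-in for a trained GPT (hypothetical): a real model would
    # run the context through a decoder-only transformer to produce
    # this distribution over the vocabulary.
    rng = np.random.default_rng(abs(hash(tuple(context))) % (2**32))
    logits = rng.normal(size=len(VOCAB))
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def generate(prompt, steps=5):
    tokens = prompt.split()
    for _ in range(steps):
        probs = next_token_probs(tokens)
        tokens.append(VOCAB[int(np.argmax(probs))])  # greedy decoding
    return " ".join(tokens)

print(generate("the dog"))
```

In practice, GPT models sample from this distribution rather than always taking the argmax, which is where the sampling techniques discussed later come in.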

Pre-Trained: "Pre-trained" refers to an ML model that has undergone training on a large dataset of examples before being deployed for a specific task. In the case of GPT, the model is trained on an extensive corpus of text data using an unsupervised learning approach. This allows the model to learn patterns and relationships within the data without explicit guidance.

In simpler terms, training the model with vast amounts of data in an unsupervised manner helps it understand the general features and structure of a language. Once learned, the model can leverage this understanding for specific tasks such as question answering and summarization.

Transformers: A type of neural network architecture that is designed to handle text sequences of varying lengths. The concept of transformers gained prominence after the groundbreaking paper titled "Attention Is All You Need" was published in 2017.

GPT uses a decoder-only architecture. The primary component of a transformer is its "self-attention mechanism," which enables the model to capture the relationship between each word and every other word within the same sentence.

Example:

  1. A dog is sitting on the bank of the River Ganga.
  2. I’ll withdraw some money from the bank.

Self-attention evaluates each word in relation to other words in the sentence. In the first example when “bank” is evaluated in the context of “River”, the model learns that it refers to a river bank. Similarly, in the second example, evaluating "bank" with respect to the word "money" suggests a financial bank.
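This intuition translates directly into a small computation. Below is a minimal sketch of scaled dot-product self-attention in NumPy, with random matrices standing in for learned projection weights; the causal mask that GPT's decoder applies is omitted for brevity. Each token's output vector becomes a relevance-weighted mix of all tokens' value vectors, which is how "bank" can absorb context from "River" or "money".

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings.
    Returns contextualized embeddings of the same shape.
    """
    d = X.shape[1]
    rng = np.random.default_rng(0)
    # Random projections stand in for learned weight matrices.
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)                   # token-vs-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # mix values by relevance

tokens = ["I", "withdraw", "money", "from", "the", "bank"]
X = np.random.default_rng(1).normal(size=(len(tokens), 16))
print(self_attention(X).shape)  # (6, 16): one context-aware vector per token
```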

2. Evolution of GPT Models

Now, let's take a closer look at the various versions of GPT Models, with a focus on the enhancements and additions introduced in each subsequent model.

Figure: Evolution of the GPT model series (slide 3).

GPT-1

It is the first model of the GPT series and was trained on the BooksCorpus dataset, roughly 5 GB of text. The model achieved state-of-the-art results on language modeling benchmarks such as LAMBADA and demonstrated competitive performance on tasks like GLUE and SQuAD. With a maximum context length of 512 tokens (around 380 words), the model could retain information only for relatively short sentences or documents per request. The model's impressive text generation capabilities and strong performance on standard tasks provided the impetus for the next model in the series.

GPT-2

Derived from the GPT-1 Model, the GPT-2 Model retains the same architectural features. However, it undergoes training on an even larger corpus of text data compared to GPT-1. Notably, GPT-2 can accommodate double the input size, enabling it to process more extensive text samples. With nearly 1.5 billion parameters, GPT-2 exhibits a significant boost in capacity and potential for language modeling.

Here are some major improvements in GPT-2 over GPT-1:

  1. Modified objective training is a technique utilized during the pre-training phase to enhance language models. Traditionally, models predict the next word based solely on the preceding words, which can lead to incoherent or irrelevant predictions. Modified objective training addresses this limitation by incorporating additional context, such as parts of speech (noun, verb, etc.) and subject-object identification. By leveraging this supplementary information, the model generates outputs that are more coherent and informative.
  2. Layer normalization is another technique employed to improve training and performance. It normalizes the activations of each layer within the neural network, rather than normalizing the network's inputs or outputs as a whole. This mitigates internal covariate shift, the change in the distribution of network activations caused by updates to network parameters.
  3. GPT-2 is also powered by superior sampling algorithms compared to GPT-1. Key improvements, illustrated in the sketch after this list, include:
    1. Top-p (nucleus) sampling: only the smallest set of tokens whose cumulative probability mass exceeds a threshold p is considered during sampling. This avoids sampling from low-probability tokens, resulting in text that is diverse yet coherent.
    2. Temperature scaling of the logits (the raw outputs of the neural network before the softmax) controls the level of randomness in the generated text. Lower temperatures yield more conservative and predictable text, while higher temperatures produce more creative and unexpected text.
    3. An unconditional (random) sampling option, which allows users to explore the model's generative capabilities and can produce ingenious results.
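To illustrate the two sampling tricks above, here is a small self-contained sketch (an illustration, not GPT-2's actual implementation): logits are first divided by a temperature, then nucleus (top-p) filtering keeps only the smallest set of tokens whose cumulative probability exceeds p before sampling.

```python
import numpy as np

def sample(logits, temperature=0.8, top_p=0.9, rng=None):
    rng = rng or np.random.default_rng()
    # Temperature scaling: <1 sharpens the distribution, >1 flattens it.
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Top-p (nucleus) filtering: keep the smallest set of tokens whose
    # cumulative probability exceeds top_p; zero out the rest.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cumulative, top_p)) + 1]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    filtered /= filtered.sum()
    return int(rng.choice(len(probs), p=filtered))

print(sample([2.0, 1.0, 0.5, -1.0, -3.0]))  # index of the sampled token
```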

GPT-3

Training data source: Common Crawl, BookCorpus, Wikipedia, books, articles, and more
Training data size: over 570 GB of text data

The GPT-3 Model is an evolution of the GPT-2 Model, surpassing it in several respects. It was trained on a significantly larger corpus of text data and features up to 175 billion parameters.

Along with its increased size, GPT-3 introduced several noteworthy improvements:

  • GShard (Giant-Sharded model parallelism) allows the model to be split across multiple accelerators, facilitating parallel training and inference, particularly for large language models with billions of parameters.
  • Zero-shot learning: GPT-3 can perform tasks it was never explicitly trained on, generating text in response to novel prompts by leveraging its general understanding of language and of the task described.
  • Few-shot learning: GPT-3 quickly adapts to new tasks and domains with minimal training, learning from just a handful of examples placed in the prompt (see the sketch after this list).
  • Multilingual support: GPT-3 can generate text in roughly 30 languages, including English, Chinese, French, German and Arabic, making it a highly versatile language model for diverse applications.
  • Improved sampling: GPT-3 uses an improved sampling algorithm that, like GPT-2's, can adjust the randomness of generated text. It also introduces "prompted" sampling, enabling text generation conditioned on user-specified prompts or context.
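Few-shot learning requires no gradient updates: the demonstrations are simply placed in the prompt. The hedged sketch below builds such a prompt and sends it through the legacy `openai.Completion.create` endpoint (openai-python < 1.0); the model name and API key are placeholders assumed for illustration.

```python
import openai  # legacy openai-python (< 1.0) interface

openai.api_key = "YOUR_API_KEY"  # placeholder

# A few-shot prompt: task description, demonstrations, then the query.
prompt = """Translate English to French.

English: cheese
French: fromage

English: good morning
French: bonjour

English: thank you
French:"""

response = openai.Completion.create(
    model="text-davinci-003",  # assumed GPT-3-era completion model
    prompt=prompt,
    max_tokens=5,
    temperature=0.0,  # deterministic output suits a translation lookup
)
print(response["choices"][0]["text"].strip())  # expected: "merci"
```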

GPT-3.5

Training data source: Common Crawl, BookCorpus, Wikipedia, books, articles, and more
Training data size: over 570 GB of text data

Similar to its predecessors, the GPT-3.5 series models were derived from the GPT-3 models. The distinguishing feature of GPT-3.5 models lies in their adherence to specific policies based on human values, incorporated using a technique called Reinforcement Learning from Human Feedback (RLHF). The primary objective was to align the models more closely with the user's intentions, mitigate toxicity, and prioritize truthfulness in their generated output. This evolution signifies a conscious effort to enhance the ethical and responsible use of language models and to provide a safer, more reliable user experience.

Improvements over GPT-3:

OpenAI used reinforcement learning from human feedback to fine-tune GPT-3 so that it can follow a broad set of instructions. The RLHF technique trains the model with reinforcement learning principles: the model receives rewards or penalties based on how well its generated outputs align with human evaluators' preferences. By integrating this feedback into the training process, the model learns from its errors and improves, ultimately producing text outputs that are more natural and engaging.
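One concrete piece of the RLHF pipeline is the reward model, trained on pairs of responses that human labelers have ranked. The hedged PyTorch sketch below shows the standard pairwise preference loss, which pushes the reward of the human-preferred response above the rejected one. The tiny `RewardModel` here is a toy stand-in (in practice it is a full language model with a scalar head), and the policy-update step that follows is omitted.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Toy stand-in: maps a response embedding to a scalar reward.
    A real reward model is a transformer with a scalar output head."""
    def __init__(self, dim=32):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):
        return self.score(x).squeeze(-1)

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch of (chosen, rejected) response embeddings that a human
# labeler ranked; a real pipeline would embed actual model outputs.
chosen, rejected = torch.randn(8, 32), torch.randn(8, 32)

# Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
loss = -torch.nn.functional.logsigmoid(model(chosen) - model(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
print(f"pairwise loss: {loss.item():.4f}")
```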

GPT-4

GPT-4 represents the latest model in the GPT series, introducing multimodal capabilities that allow it to process both text and image inputs while generating text outputs. It accommodates various image formats, including documents with text, photographs, diagrams, graphs, schematics, and screenshots.

While OpenAI has not disclosed technical details such as model size, architecture, training methodology, or model weights for GPT-4, some estimates suggest that it comprises nearly 1 trillion parameters. The base model of GPT-4 follows a training objective similar to previous GPT models, aiming to predict the next word given a sequence of words. The training process involved using a huge corpus of publicly available internet data and licensed data.

GPT-4 has showcased superior performance compared to GPT-3.5 in OpenAI's internal adversarial factuality evaluations and public benchmarks like TruthfulQA. The RLHF techniques utilized in GPT-3.5 were also incorporated into GPT-4. OpenAI actively seeks to enhance GPT-4 based on feedback received from ChatGPT and other sources.

Performance Comparison of GPT Models for Standard Modeling Tasks

Scores of GPT-1 through GPT-4 on the standard NLP benchmarks LAMBADA, GLUE and SQuAD:

Model | GLUE | LAMBADA | SQuAD F1 | SQuAD Exact Match
GPT-1 | 68.4 | 48.4 | 82.0 | 74.6
GPT-2 | 84.6 | 60.1 | 89.5 | 83.0
GPT-3 | 93.2 | 69.6 | 92.4 | 88.8
GPT-3.5 | 93.5 | 79.3 | 92.4 | 88.8
GPT-4 | 94.2 | 82.4 | 93.6 | 90.4

All numbers are percentages. Source: Bard

This table demonstrates the consistent improvement in results, which can be attributed to the aforementioned enhancements.

GPT-3.5 and GPT-4 have also been tested on newer benchmarks and standard examinations.

The newer GPT models (3.5 and 4) are tested on tasks that require reasoning and domain knowledge, including numerous examinations known to be challenging. One such examination, on which GPT-3 (ada, babbage, curie, davinci), GPT-3.5, ChatGPT and GPT-4 have been compared, is the MBE exam. The graph shows a continuous improvement in scores, with GPT-4 even beating the average student score.

Figure 1 compares the percentage of marks obtained in the MBE* by different GPT models:

*The Multistate Bar Exam (MBE) is a challenging battery of tests designed to evaluate an applicant’s legal knowledge and skills, and is a precondition to practice law in the US.

The graphs below also highlight the models' progress, again beating average student scores across different streams of legal subject areas.

Figure: GPT model scores across legal subject areas. Source: Data Science Association

Conclusion

The above results validate the power of these new models; comparing model performance against human scores is a strong indicator of that progress. In the roughly five years since the introduction of GPT-1, model size has grown around 8,500 times.

In the next blog, we will explore specialized versions of GPT models in greater detail, including their creation process, capabilities, and potential applications. A comparative analysis of models will be done to gain valuable insights into their strengths and limitations.

Index

Metric | GPT-1 | GPT-2 | GPT-3 (175B) | GPT-3.5 | GPT-4
GLUE | 65.4 | 77.7 | 92.8 | N/A | N/A
LAMBADA | 25.4% | 63.24% (ZS) | 76.2% (ZS) | N/A | N/A
SQuAD F1 | 60.3% | 72.1% | 93% | N/A | N/A
SQuAD Exact Match (EM) | 56.0% | 62.4% | 87.1% | N/A | N/A

Note: ZS = zero-shot. Source: ChatGPT, Bard

Conclusion

With the rise of Transformer-based Large Language Models (LLMs), the field of natural language processing is undergoing rapid evolution. Among the various language models built on this architecture, the GPT models have emerged as exceptional in terms of output and performance. OpenAI, the organization behind GPT, has consistently enhanced the model on multiple fronts since the release of the first version.

Over the course of five years, the size of the model has scaled significantly, expanding approximately 8,500 times from GPT-1 to GPT-4. This remarkable progress can be attributed to continuous enhancements in areas such as training data size, data quality, data sources, training techniques, and the number of parameters. These factors have played a pivotal role in enabling the models to deliver outstanding performance across a wide range of tasks.

  • Ankit Mehra is a Senior Data Scientist at Sigmoid. He specializes in analytics and ML-based data solutions.
  • Malhar Yadav is an Associate Data Scientist at Sigmoid and a coding and ML enthusiast.
  • Bhaskar Ammu is a Senior Lead Data Scientist at Sigmoid. He specializes in designing data science solutions for clients, building database architectures, and managing projects and teams.

More On This Topic

  • Deep Learning Recommendation Models (DLRM): A Deep Dive
  • Optimizing Python Code Performance: A Deep Dive into Python Profilers
  • High-Performance Deep Learning: How to train smaller, faster, and better…

Dell VP on the changing world of DevOps, CloudOps, AI and multicloud by design 


At Dell Technologies World in Las Vegas, I sat down with Caitlin Gordon, vice president, product management, software and solutions at Dell Technologies, to learn about her company’s push toward multicloud by design. We also discussed DevOps, AI workloads, the skills gap and much more. The following transcript of the interview has been edited for length and clarity.


What is multicloud by design?

Multicloud by design is how Dell refers to multicloud workloads, applications or processes that are managed in total, not in silos. The idea is that everything in the cloud stack can be managed together and, as the name suggests, is designed from the beginning to be managed together. At its conference this week, Dell announced several additions to its Apex service portfolio that aim to make multicloud management more flexible.

Multicloud data storage services allow organizations to store data independently of any one cloud vendor; instead, the services can spread data across multiple clouds. This may involve coordinating with a partner – such as Dell’s Apex, IBM Cloud Satellite, Google Cloud Service or small vendors – to manage that data.


Megan Crouse: In your own words, what enables multicloud by design to work? Why adopt it now?

Caitlin Gordon: What we’ve found from customer conversations over the last few years is that customers have evolved from cloud-first strategies years ago to cloud optimization strategies. They now have a little bit more perspective and experience on what the public cloud can provide and what they want to have on-prem. They are really thinking about those two estates as two different parts of one strategy versus conflicting strategies.


Ultimately, what we heard from customers a few years ago that drove this whole initiative is they felt like they got into multicloud by default, meaning it felt more like “multi-contract” to them. They knew who their partners were and who their primary and secondary was, maybe their public cloud. They had on-prem standardization to some extent. But there wasn’t any real way that they interoperated with each other and really simplified the world, or, from a CIO’s perspective, simplified everything across the board. And part of what we’ve seen is customers really taking a step back and thinking about: How do I make all of this work together? How do I pick not just the right partners in and of themselves, but the right partners that all are working with each other as well?

Customers ultimately can only solve so much by themselves. They have more skills gaps than they ever have before; they have developer productivity challenges; they have more security challenges than they’ve ever had; they have data sovereignty challenges. It’s about getting the cloud experience into any data center that you want and really giving you that control with that agility that you expect from the cloud.

Dell’s solution: Apex-as-a-service

Megan Crouse: Does Apex-as-a-service sit on top of the multicloud framework?

Caitlin Gordon: There are three or four dimensions to it. One is: How do you accelerate what you want to do in the public cloud? I think about what we refer to as our ground-to-cloud strategy. Being able to use best-in-class, enterprise-class storage in the public cloud so you can have more workload flexibility, that’s one side of it.

The other side is: How do I really optimize what I’m doing in my own data center by bringing those cloud operating models, the cloud operating systems and the cloud-to-ground side of things?

The third piece is the as-a-service portfolio. This is how you get a cloud consumption experience, not just across those multicloud initiatives but for anything that we offer in the Dell portfolio, whether it’s compute, storage, data protection or even PCs and peripherals. Those are the different dimensions: both a management and a consumption experience.

Megan Crouse: If a business doesn’t know where to start with multicloud by design, what should they consider first?

Caitlin Gordon: It comes down to: Every customer is different. What matters to that customer is what’s driving their own business. Are they a business driven heavily by data – something like life sciences where their business is data? Or a bank where they have heavy regulations they’re worrying about?

It depends on the different levels of security, velocity, culture and philosophy. You want to have balance between what’s going to be in your data centers, what’s going to be in which public clouds, how much risk are you willing to take on? How much control do you need? Who do you want to be partnering with? How important is simplicity? Within that, you can really tune that strategy. One of the backdrops to this concept of multicloud by design is choice, flexibility and not saying, “Well, I want cloud, so it means this.” It’s about saying, “I want cloud, but I want some flexibility in what that experience is going to be.”

Megan Crouse: Similarly, when making decisions about what cloud operating models to bring into the data center, what should organizations consider?

Caitlin Gordon: It comes back to workloads. Are you dealing with a current landscape of workloads that look like thousands of VMs you need to manage? How many of those are strategic? Where do they need to live? Are you relatively small and new and actually building most of your applications starting now, so really, truly cloud-native and application-centric? Is it a balance of the two? Where do you need to invest? Where do you need to maintain?

Then you get into the combination of whether it’s going to be more Red Hat leaning, or more VMware or Microsoft leaning. What role does AWS potentially play in that? And then you start figuring out who the ecosystem partners are. We believe our strategic value to our customers is whatever the answer is for them, we can support that and we’re working with all of those different partners.

What’s changing in the world of DevOps?

Megan Crouse: What is changing in cloud operations and DevOps today?

Caitlin Gordon: We see customers are on a broad continuum of DevOps maturity. Do they have more siloed traditional operations that are around the components of their infrastructure, or do they have the other end of the spectrum: platform engineering? And then there’s everything in between. When you get into things like CloudOps, DevOps, AIOps, SecOps and how they work together, that’s really getting into a more mature, truly infrastructure-as-code-driven IT approach. It’s probably the exception, not the norm, today. There are probably a lot of benefits to it, but customers have a lot of technical debt in what they own, but also just culturally and in terms of skills to be able to get to that model.


Ultimately, a lot of this comes down to the public cloud bringing a lot of benefits to our customers: agility, scale, global reach. But also they are used to a lot of things in the data center they don’t get in that public cloud. How do we give them both? Part of the expectation now is the public cloud gives fast agility and a really simple experience for easy developer productivity. People are saying they want that, but they want it in their data center. And that’s where you start to see this CloudOps type of model try to come on-prem.

Megan Crouse: What do you think CloudOps will look like in one to three years?

Caitlin Gordon: More people will move in that direction. We may come up with a new term for it, because we like to do that around here. But the concept of having a more agile way to approach IT, the concept of being able to be more automation-driven, is going to continue to grow. The way that applications went from really being VM centric to now getting more container centric, the operating model of IT needs to evolve to support that. But we also know nothing ever completely goes away. Being able to bring yourself from where you start to where you’re going, and keeping or moving what you had to the new model, that’s really where the work begins. And that’s going to take a long time.

New ways to use known solutions

Megan Crouse: The Apex umbrella is a culmination of things Dell has historically done well, from PCs to Software-as-a-Service. Do you see it this way, or do you see it as totally novel or a mix of both?

Caitlin Gordon: I think it’s a mix of both. Ultimately, our Apex strategy is about bringing, pretty simply, consumption models and our cloud experience to our customers, and we’re doing that with an open ecosystem of partners. There’s novelty in that, because a lot of what we’re doing is informed by the expectation our customers have because of what the public cloud has provided. At the same time, the lineage of this company is partnering closer with partners, including Microsoft, on delivering that unified, simplified experience. When you ship one of our PCs out of the factory, it was always built with Windows built in and is built to make that really easy for customers to get up and running. Now, what we’re doing with Microsoft with the Apex Cloud Platform for Azure is the same idea, but for a data center and a full software infrastructure stack. That idea is where we come from.

Megan Crouse: You mentioned the skills gap. There is a big conversation now about making sure the people who are in these operations teams can use multicloud to manage vast amounts of data, as well as companies struggling to find skilled workers in general. Can you speak to how we got here with the skills gap, and what happens next?

Caitlin Gordon: How did we get here? I think a lot of how we got here, you mentioned it earlier, thinking of public cloud and on-prem as separate strategies is part of how we got here. We were going down a highway and a lot of companies went off a cloud-first exit ramp, but people still were on the other highway. And you still have people who are managing, building and supporting workloads, but you need to find the new skills to work in the new environment. I talked to customers today who are treating them (multicloud and on-prem) as separate.

How will that evolve? I think now we’re starting to see “I can’t keep going this way.” People used to want to do it themselves, but they don’t anymore because either they can’t or they don’t feel it’s worth the investment. So they’re asking for help from us to do things they used to be able to do themselves. And also more and more it’s, “I have more partners than I had before, because in the world where I only had data centers, I had at least dual-vendor strategies, but you standardized there.” Then they introduced cloud partners; once you started putting those together, you had two different ecosystems of multiple partners. Most customers are not going to standardize on a single public cloud. That means in the data center they need to standardize, they need commonality, they need to trust a very small set of partners. That’s the key part for us. They need consistency, commonality and as few stacks as possible, because there aren’t enough skills to go around.

AI workloads beyond “the cool kid”

Megan Crouse: Do you see generative AI in this space, either behind the scenes on your team or in terms of customer demand?

Caitlin Gordon: I would broaden it to all AI. Generative AI is the cool kid on the block. [AI is] one of the categories of workloads driving everything we’re doing in multicloud, whether that means I’m trying to use the different machine learning models in the public cloud and need storage that can scale with that.

Maybe [multicloud could scale] in a way the native file storage, for example, doesn’t. Now we have the Apex File Storage for AWS, which may support what you need to do with AI in the cloud better and be able to move that on-prem more seamlessly. At the same time, maybe I want to create an AI model in my own data centers, and I want to be able to do that with the right GPUs, with the right partners. That’s really what we can support on the cloud platforms. We have a variety of different GPUs we support on those platforms; it gives the customer the ability to control that data, control that environment and still take advantage of those models.

More news from Dell Technologies World

  • Dell brings more cloud products under APEX umbrella
  • Dell’s Project Helix is a wide-reaching generative AI service
  • Dell’s Project Helix heralds a move toward specifically trained generative AI
  • Dell reveals new edge as-a-service portfolio, NativeEdge

Disclaimer: Dell paid for my airfare, accommodations and some meals for the Dell Technologies World event held May 22-25 in Las Vegas.


AI is coming to a business near you. But let’s sort these problems first

While most of you will be familiar with ChatGPT, which is a generative artificial intelligence (AI) tool built on a large language model (LLM) that provides relatively intelligent responses to questions, few of you will be using it at work. ChatGPT is usually not considered safe for serious business endeavors and is mainly used for tinkering at this point.

Now, efforts are underway to package language models into enterprise environments, focused on resident enterprise data. But at the same time, AI practitioners and experts are urging caution with the development of AIs and LLMs.


These are the findings from a survey of 300 AI practitioners and experts released by expert.ai. "Enterprise-specific language models are the future," the report's authors state. "Business and technical executives are being asked by their boards and increasingly by shareholders how they plan to leverage this new dawn of AI and the promise it provides to unlock language to solve problems."

The research suggests more than one-third (37%) of enterprises are already considering building enterprise-specific language models.

At the same time, AI practitioners recognize that building and maintaining a language model is a non-trivial task. A majority of enterprises (79%) realize that the effort required to train a usable and accurate enterprise-specific language model is "a major undertaking".


Nevertheless, efforts are underway — teams are already budgeting for LLM adoption and training projects, with 17% having budget this year, another 18% planning to allocate budget, and 40% discussing budgeting for next year.

"This makes sense, as most of the public domain data used to train LLMs like ChatGPT is not enterprise-grade or domain-specific data," the expert.ai authors state. "Even if a language model has been trained on different domains, it is not likely representative of what is used in most complex enterprise use cases, whether vertical domains like financial services, insurance, life sciences and healthcare, or highly specific use cases like contract review, medical claims, risk assessment, fraud detection and cyber policy review. Training effort will be required to have quality and consistent performance within highly specific domain use cases."

For enterprise AI advocates in the survey, the top concern with generative AI is security, cited by 73%. Lack of truthfulness is another issue, cited by 70%. More than half (59%) express concern about intellectual property and copyright protection — particularly with LLMs such as GPT, "trained on wide swaths of information, some of which is copyright protected, and because it comes from publicly available internet data," the report's authors maintain. "It has a fundamental garbage-in, garbage-out issue."


AI might reduce the need for human resources in specific tasks but, ironically, it is going to require even more people to build and sustain it. More than four in ten (41%) AI advocates express concern about a shortage of skilled professionals with expertise to develop and implement enterprise generative AI.

More than a third (38%) of survey respondents express concern about the amount of computational resources required to run LLMs. Infrastructure, such as powerful servers or cloud computing services, is needed to support the large-scale deployment of language models, the report's authors state.

Enterprise adoption of language models requires careful planning and consideration for a range of factors, including data privacy and security, infrastructure and resource requirements, integration with existing systems, ethical and legal considerations, and skill and knowledge gaps.


As with any emerging technology, successful adoption depends on use cases that demonstrate a significant leap over previous methods. There are some solid use cases for generative AI, as explored in the survey:

  • Human-computer interaction: Enterprise language models will serve to provide end users and customers "with quick and easy access to information and support, such as product details, troubleshooting guides and frequently asked questions." The most prevalent use cases at this stage are chatbots (54%), question and answering (53%), and customer care (23%).
  • Language generation: "Generative AI can write new content, create realistic images, generate marketing copy, compose music and even generate programming code." The two most popular examples at this time are content summarization (51%) and content generation (45%).
  • Information extraction: The top use cases here are knowledge mining (49%), content classification, and metadata creation (38%). Content categorization for routing (27%) and entity extraction (20%) are also mentioned.
  • Search: General search (39%), semantic search (31%) and recommendations (29%) are seen as "important tools for helping people find the information they need quickly and accurately, without having to look through lots of irrelevant results."

While many enterprises might be seeking to adopt enterprise LLMs, most AI advocates in the survey advise caution with proceeding with AI. Almost three-quarters (71%) agree that government regulations are required immediately to deal with legitimate commercial AI use and malicious use. AI and LLMs "can have significant ethical and legal implications, particularly around issues of bias, fairness and truthfulness," the report's authors warn.

What is AWS’ Generative AI Strategy?

Generative AI has revolutionised the technology landscape, opening up a plethora of possibilities for enterprises across various domains. Today, organisations are actively exploring and uncovering potential use cases to harness the power of generative AI. AWS, which is the largest cloud service provider in the world, is also investing heavily in generative AI.

“It has always been a significant area of focus for AWS. If you look at Amazon’s legacy for nearly 20 years, we have been building several experiences for our customers using machine learning or different AI techniques,” Anupam Mishra, Director, Solution Architecture, AWS India and South Asia, told AIM at the recently held AWS Summit in Mumbai.

He expects generative AI will be an area where AWS keeps investing more and more. The cloud giant’s approach to investing in new technologies is driven by customer feedback. “Over 90% of the features we have launched are a direct result of listening to our customers’ needs. We consistently prioritise understanding and fulfilling their requirements, continuously building and delivering solutions that align with their expectations.”

AWS Generative AI strategy

AWS has hundreds of thousands of customers in India and millions of active customers globally. Mishra revealed that most of AWS' customers are exploring the use of generative AI.

AWS' generative AI strategy is to empower customers with the capabilities and resources to build customised generative AI solutions that cater to their specific needs and requirements. Mishra, who leads AWS India's technology team of Solution Architects, said AWS' strategy is to bring AI to the hands of every developer. "We are focused on democratising AI as much as possible. Our generative AI strategy is divided into three parts."

Primarily, AWS focuses on providing cost-effective hardware solutions that enable high-performance generative AI tasks. "For example, in the hardware layer we have Inferentia, a chip designed for efficient inference, and Trainium, a chip optimised for training models.

"These hardware solutions aim to significantly improve both training time and cost efficiency. For instance, Inferentia demonstrates a 40% improvement in price performance compared to comparable EC2 instances. Similarly, we achieve a significant 50% improvement in performance per watt, a crucial factor for promoting sustainability," he said.

Secondly, AWS wants to make it effortless for enterprises to leverage large language models and tailor them to deliver the desired experiences through fine-tuning. “With Bedrock, we provide a seamless serverless experience that minimises the need for extensive training time and reduces costs. By leveraging existing foundation models (LLMs), enterprises can save valuable time and resources,” Mishra said.

"Lastly, the third layer after Bedrock is how we offer API-based experiences to our customers. One of the products we have created is called CodeWhisperer, which allows you as a developer to automatically generate code by writing comments."

A Generative AI ecosystem

While GPT-4 stands as one of the most advanced large language models to date, enterprises are actively exploring use cases for a variety of models developed by different AI labs. For instance, Anthropic introduced Claude, which has garnered attention and interest among organisations. Besides, in recent months, a multitude of open-source LLMs have emerged, expanding the options available to developers and researchers.

AWS is actively exploring ways to enable customers to leverage these models, allowing them to provide unique and differentiated experiences to their own customers, all while minimising the effort required.

“The aim is to establish an ecosystem where foundational models developed by AWS partners, such as Anthropic, Stability AI, and Hugging Face, can be leveraged by our customers to bring the value to the market,” Mishra said.

Furthermore, earlier this year, AWS launched a set of foundational models called Titan. “For example, the first model is a LLM specifically designed for tasks such as text generation and summarization, among other things. Whereas the second model improves searches and personalisations,” Mishra added.

AWS in India

For AWS, India is one of its key markets. Recently, the cloud giant announced an investment of USD 12.7 billion (INR 1,05,600 crore) in cloud infrastructure in India by 2030. "India is a very important market for us. In fact, it is one of the fastest growing areas for AWS. So far we have already invested USD 3.7 billion. Further, we also expect about 1,31,700 full-time jobs to be created because of the investment," Mishra said.

So what makes India a lucrative market for AWS? Besides being home to a growing startup ecosystem, India is one of the biggest consumer markets, with numerous companies experiencing rapid growth and expansion within the country. "However, we are seeing that a lot of companies are going global from India, where they build the product in India and then serve the rest of the world. Freshworks is a great example, built on AWS, and now they're serving the whole world from India."

The large number of SMBs in India also makes it an important market for AWS. India is the largest SMB market in the world with nearly 75 million players in the space. AWS sees huge potential in the SMB segment as well. “Our chairman Jeff Bezos was in India a couple of years back and he made a pledge of digitalising 10 million SMBs in India by 2025. We are committed to deliver on that pledge helping SMBs in different ways and let them ride this digital wave,” Mishra said.

Explaining with the example of Havmor, an ice cream manufacturer, Mishra said the company has leveraged AWS to migrate its SAP HANA workloads, including its fulfilment portal and in-store operations. "By relying on AWS, they can offload the heavy lifting involved in infrastructure management and tool development, allowing them to focus on their core competency of making ice cream. This principle extends to numerous other companies who choose to let AWS handle the infrastructure complexities while they concentrate on excelling in their respective industries," Mishra concluded.


The official ChatGPT app is now available in 11 more countries

By Romain Dillet (@romaindillet)

OpenAI has announced in a tweet that the official ChatGPT mobile app is now available in more countries. When OpenAI first unveiled its mobile app last week, the app was only available on iOS and in the U.S. Now, many people living in Europe, South Korea, New Zealand and more will be able to download the app from the App Store.

The ChatGPT app is a free app without any ads. People who are already familiar with ChatGPT will feel right at home as it’s basically just a way to interact with the chatbot — nothing more, nothing less.

Here's the full list of countries where the ChatGPT app is now available: Albania, Croatia, France, Germany, Ireland, Jamaica, New Zealand, Nicaragua, Nigeria, South Korea, the U.K. and the U.S. Once again, the app is only available on iOS for now. In its original announcement, OpenAI also promised that an Android app was "coming soon."

When you open the app, you can start typing text in a text box at the bottom of the screen. It works just like sending a message in any messenger app. While you can dictate text using Apple’s built-in speech recognition feature, you can also leverage OpenAI’s open source speech recognition system Whisper for voice input.
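The app uses OpenAI's hosted Whisper; as a hedged illustration of the same underlying model, the open-source `whisper` package can transcribe audio locally. The file name below is a placeholder.

```python
import whisper  # pip install openai-whisper

# Load a small pretrained checkpoint; larger ones are more accurate.
model = whisper.load_model("base")

# Transcribe a local recording (placeholder file name).
result = model.transcribe("voice_note.m4a")
print(result["text"])  # the recognized text, ready to send to ChatGPT
```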

After you hit the send button, OpenAI processes your request and returns an AI-generated answer. You can follow up with more information or ask for a different answer. The app supports code blocks and users can copy and paste answers.

By default, ChatGPT saves your chat history and uses it for model training. When this feature is enabled, you will also be able to find your conversations on desktop. It’s worth noting that there’s no way to disable data sharing without disabling chat history too.

If you are a ChatGPT Plus subscriber, you will be able to access GPT-4’s capabilities through the mobile app. Users should also notice faster response times. ChatGPT Plus costs $20 per month on desktop and is also available as an in-app purchase in your local currency (€22.99 per month in Europe, £19.99 in the U.K., etc.).

The timing of this expansion, which includes several European countries, is interesting as OpenAI’s CEO Sam Altman is meeting European heads of states this week, such as France’s Emmanuel Macron, Spain’s Pedro Sánchez and the U.K.’s Rishi Sunak. Altman has expressed criticisms toward rushed AI regulatory policy. And now, ChatGPT will be much more accessible in Europe as people will be able to say “just download the app.”


Is the GPT Breakthrough in Robotics Approaching Soon?

Be it Agility Robotics' bots lifting weights or Google DeepMind's robots playing football, we have seen tech companies heavily invest in training and improving robots' capabilities. In the race to improve task performance, companies are constantly innovating in the field.

However, compared to the growth of LLMs, advancements in robotics will take time, as robotics deals with the imperfections of control and sensors in the real world. Building and testing prototypes is often delayed. And while an LLM's prowess is measured by its training datasets and tokens, a robot's efficiency is gauged by the number of tasks it can accomplish, something that takes years to master.

Drawing parallels to how GPT was built and improved by an ecosystem of companies that worked together, data expert and AI strategy advisor Vin Vashishta recently spoke about how robotics is still on the path to having its own GPT moment.

He believes that innovation doesn't start with advanced robotics, but with applied robotics and an ecosystem of companies working together. With tech companies working on and investing in robotics companies, the next big breakthrough will probably come from companies being able to build on small-scale demonstrations and develop the fundamental data-collection components.

So, there are two pertinent questions — how far have tech companies gone in robotics development and will the turning point occur any time soon?

Waiting for the GPT-Moment

The sector, as a whole, is continuously innovating, and yet, the breakthrough that everyone is anticipating is yet to come. This can be attributed to the fact that robotics, as a sector, is extremely hard to get right. It’s hard to make the systems robust and repeatable at scale. Moreover, robotics is a tricky sector for investors as huge money is required to bring production to scale. This discourages investors who are looking to make quick money.

Recently, AIM got in touch with Gokul NA, founder of CynLr: Cybernetics Laboratory, a robotics company that works on robotic arms. He spoke about the time taken in robotics training. “It might be easy to teach a robotic arm to take something from one position to another, but it takes over two years to get a successful robotic arm application, if feasible at all.”

Gokul also spoke about the future of this field. “We have seen the data revolution, and it’s time to see the object revolution at this point. We see bright possibilities for this industry and it is left to us to figure out how we can leverage it and make it a success.”

According to a recent report, the global robotics market size was valued at $12.1 billion in 2020, and is expected to reach $150 billion by 2030. One of the biggest use cases for robotics is in warehouses, where the need for automation is huge, creating room for growth. Other industries such as construction, agriculture, and healthcare are also sectors where robotics work.

Big Tech Goes the Robotics Way

Companies are taking leaps to invest in robotics companies. Major tech players such as Google, Microsoft, OpenAI, and Tesla have invested in the field, and continue to do so. Despite scrapping its in-house robotics division in 2021, OpenAI recently invested in 1X, a Norway-based robotics startup that builds humanoid robots capable of human-like movements and behaviours.

"The first AI robot officially entered the workforce. 1X, backed by OpenAI, just outpaced Elon Musk's Tesla as the first humanoid robot in a professional environment. Their EVE robot has been integrated as a security guard in manufacturing sites." (Rowan Cheung, @rowancheung, May 23, 2023)

A few weeks ago, AMP Robotics, a Colorado-based startup that creates robotic systems, received close to $100 million from Microsoft’s Climate Innovation Fund.

Last week, Canadian AI and robotics company SanctuaryAI unveiled their advanced general-purpose robot called Phoenix. The company claims to have developed the world’s first humanoid general-purpose robot powered by Carbon, which is a unique AI control system.

Last year, Tesla revealed its humanoid robot, Optimus. It received a lukewarm response and even garnered criticism; Gary Marcus questioned why Tesla would build such a robot with no clear direction on its real-world applications. Snubbing the critiques, companies continue to invest in the sector, awaiting the 'GPT moment' for robotics.


Spearphishing report: 50% of companies were impacted in 2022


Spearphishing is a sliver of all email exploits, but the extent to which it succeeds is revealed in a new study from cybersecurity firm Barracuda Networks, which analyzed 50 billion emails across 3.5 million mailboxes in 2022 and unearthed around 30 million spearphishing emails. The findings appear in the company's new report on spear-phishing trends.

While that proportion represents less than a tenth of a percent of all emails, half of the organizations the firm examined in the study, which includes findings from a survey of more than 1,000 companies, were victimized by spearphishing last year. A quarter had at least one email account compromised through an account takeover (Figure A).

Figure A: Barracuda Networks identified 13 types of email exploits. (Image: Barracuda Networks)


Identity theft and brand impersonation lead spearphishing exploits

Barracuda Networks’ study isolated the five most prevalent spearphishing exploits.

  • Scamming: 47% of spearphishing attacks tricked victims into disclosing information in order to defraud them and/or steal their identity.
  • Brand impersonation: 42% of spearphishing attacks mimicked a brand familiar to the victim to harvest credentials.
  • Business email compromise: 8% of spearphishing exploits impersonated an employee, partner, vendor or another trusted person to compel victims to make wire transfers or provide information from finance departments.
  • Extortion: 3% of spearphishing emails used threats of the revelation of personal material.
  • Conversation hijacking: 0.3% of attacks involved the hijacking of existing conversations.

The company also found that Gmail users were more likely to be spearphishing victims than users of Microsoft 365 (57% versus 41%, respectively).

Damage to machines, data exfiltration top consequences

The report detailed the results of a Barracuda-commissioned survey conducted by the independent researcher Vanson Bourne, who polled 1,350 organizations with 100 to 2,500 employees across a range of industries in the U.S., EMEA and APAC countries.

The survey queried companies about damages they experienced as a result of email attacks. Over half said machines were infected with malware, and roughly half reported theft of confidential information (Figure B).

Figure B: Company-reported impact of spearphishing attacks over 12 months. (Image: Barracuda Networks)

The higher the mix of remote workers, the greater the vulnerability

Remote work is increasing risks: Users at companies with more than a 50% remote workforce report higher levels of suspicious emails — 12 per day on average, compared to 9 per day for those with less than a 50% remote workforce. Companies favoring remote work also reported that it took longer to detect and respond to email security incidents — 55 hours to detect and 63 hours to respond and mitigate, compared to an average of 36 hours and 51 hours, respectively, for organizations with fewer remote workers.

On average, 10 suspicious emails were reported to IT on a typical workday, with users in India having reported the highest average number of suspicious daily emails — 15 per day, which is 50% above the global average. By contrast, the U.S. average was nine suspicious daily emails (Figure C).

Figure C: Companies in India reported the highest number of suspicious emails. (Image: Barracuda Networks)

According to the report, the relatively high number of reported incidents in India may be evidence that organizations there are struggling to prevent email attacks or that organizations in India are placing higher focus on suspicious emails.

The average organization received approximately five emails per day that were identified as spearphishing exploits, and these attacks garnered an average 11% clickthrough rate, according to the report.

Companies slow to identify and respond to email attacks

From its survey of enterprises, Barracuda found that on average it takes nearly two days for organizations to detect an email security incident. On average, the enterprises polled by Barracuda took nearly 100 hours in total to identify, respond to and remediate an email exploit. They took 56 hours to respond and remediate after the attack was detected.

According to the report, from the respondents that experienced a spearphishing attack:

  • 55% reported their machines were infected with malware or viruses.
  • 49% reported having sensitive data stolen.
  • 48% reported having stolen login credentials.
  • 39% reported direct monetary loss.

Fleming Shi, the chief technology officer of Barracuda, said email is still very much the main attack vector used against enterprises, even small to medium-sized businesses, with threat actors who go after large companies often seeking prizes above and beyond what can be filched from a single hit.

"They might be going after a person, a brand, data exfiltration or anything going beyond just the first ransom attack, getting to the point where they can hold an enterprise ransom for multiple years or multiple payouts," Shi said. "At the end of the day, financially motivated attacks are still going to be numerous, but we also have to watch out for nation-state or politically-driven cyberattacks that try to influence or change opinion and maybe even impact the 2024 election. These are also possible because all they have to do is tweak the weapon to have a different impact."

Slow pace of response keeps door open to cybertheft

The survey found that for 20% of organizations, it takes longer than 24 hours to identify an email attack. According to the study, this long window gives users time to click on a malicious link or respond to an email. Thirty-eight percent of respondents reported taking more than 24 hours to respond to and remediate attacks. Obstacles cited include a lack of automation, predictability and knowledge among staff, all of which hamper the discovery process (Figure D).

Figure D: Company-reported obstacles to fast response to email exploits. (Image: Barracuda Networks)

“Even though spearphishing is low volume, with its targeted and social engineering tactics, the technique leads to a disproportionate number of successful breaches, and the impact of just one successful attack can be devastating,” said Shi. “To help stay ahead of these highly effective attacks, businesses must invest in account takeover protection solutions with artificial intelligence capabilities. Such tools will have far greater efficacy than rule-based detection mechanisms. Improved efficacy in detection will help stop spearphishing with reduced response needed during an attack.”

Organizations victimized by spearphishing were more likely to say the costs associated with an email security breach increased in the last year: $1.1 million versus about $760,880 for those who were victims of other kinds of email attacks, according to the report.

Automation and AI accelerate response times

According to Barracuda Networks, 36% of organizations in the U.S. use automated incident response tools, and 45% use computer-based security awareness training. Both groups report faster response times on average, which means they are using fewer IT resources, and those resources can focus on other tasks.

Larger organizations cite lack of automation as the most likely obstacle preventing a rapid response to an incident — 41% for organizations with more than 250 employees, compared to 28% for organizations with 100–249 staff. The smaller companies cite additional reasons almost equally, including:

  • Lack of predictability (29%)
  • Lack of knowledge among staff (32%)
  • Lack of proper security tools (32%)

The spearphishing trend to continue in 2023

Shi said it is likely that spearphishing, particularly related to conversation hijacking and business email compromise, will continue to prevail this year, with conversation hijacking building on past data breaches, basically where emails had been stolen.

“The example I will use is ProxyLogon, which was a vulnerability in Microsoft Exchange where attackers took not only credentials but past email conversations that allowed them to reiterate and basically recreate a weapon based on previous interactions,” he said. “So, it makes it much easier to bypass all the guardrails, especially the human-level awareness that we have.”

He also said that these attacks will be harder to block because not all of them are going to have links and attachments. “Sometimes it’s just an interaction to gain trust, and then it potentially leads to further access to the environment,” he said.

BECs drive spearphishing and vice versa

Shi sees the relationship between BECs and spearphishing as “intimate and symbiotic” because BECs can lead to additional phishing attacks, and phishing can lead to BECs.

“The main difference is that most BECs do not have links or attachments. It’s an interaction, a conversation that eventually leads to something bad happening. In order to get there, however, somebody has to compromise the environment. That weapon could be the initial spearphishing type of attack where credentials get stolen.”

Then, he added, with stolen credentials, actors can access the environment to identify communication patterns that continue the attack. “They somewhat camouflage themselves into the environment because once trust is built, an attacker can start activating new weapons that can be evasive to detection mechanisms.”

AI models can flag unusual email communication patterns

Barracuda Networks suggested machine learning is a useful tool for identifying anomalous emails by first establishing what normal communication patterns look like. The firm added that AI can be deployed to automatically recognize when accounts have been compromised.
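To make the idea concrete, here is a minimal sketch of anomaly detection over email metadata, written in Python with scikit-learn. This is our illustration, not Barracuda's implementation; the features, values and threshold are hypothetical.

```python
# Toy anomaly detector over email metadata -- illustrative only,
# not Barracuda's product. All features and values are hypothetical.
import numpy as np
from sklearn.ensemble import IsolationForest

# Per-email features: [hour sent, recipient count,
# links in body, days since sender first seen]
baseline = np.array([
    [9, 1, 0, 400], [10, 2, 1, 380], [14, 1, 0, 350],
    [11, 3, 2, 300], [16, 1, 1, 420], [13, 2, 0, 390],
])

# Fit on "normal" traffic so deviations from the usual
# communication pattern stand out.
model = IsolationForest(contamination=0.1, random_state=42).fit(baseline)

# A 3 a.m. email to 25 recipients, full of links, from a sender
# never seen before scores as an outlier.
suspect = np.array([[3, 25, 8, 0]])
print(model.predict(suspect))  # [-1] means flagged as anomalous
```

In practice, such a model would be trained per organization, or even per user, so that "normal" reflects each environment's own communication patterns.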

The firm also suggests:

  • Using technology to identify logins from unknown accounts.
  • Monitoring email accounts for malicious inbox rules.
  • Using multifactor authentication.
  • Implementing DMARC authentication and reporting.
  • Automating incident response.
  • Training staff to recognize and report attacks.


Get Started With Data Streaming

Streaming data is everywhere, and today’s developers need to learn how to build systems and applications that can ingest, process and act on data continuously generated in real time. Developers are tapping into endless data streams to solve operational challenges, build delightful customer experiences and build new products, which means the learning opportunities are endless too.

One of the most widely adopted data streaming technologies is Apache Kafka. In the 12 years since this event streaming platform was open-sourced, developers have used Kafka to build applications that transformed their categories.

Think Uber, Netflix, or Meesho. Developers working with data streaming in these kinds of organisations and the open-source community have created applications that track locations instantly, deliver personalised content and process payments in real time.

These real-time capabilities are so embedded into our daily lives that we take them for granted. Before bringing those capabilities to life, developers first had to understand this platform and how to best take advantage of it.

A deeper understanding of how Kafka works

Apache Kafka can record, store, share and transform continuous streams of data in real-time. Each time data is generated and sent to Kafka, this “event” or “message” is recorded in a sequential log through publish-subscribe messaging.

While that’s true of many traditional messaging brokers, Kafka is designed to deliver messages at network-limited throughput, scale to trillions of messages per day, store those data streams and provide the storage and compute elasticity required of today’s internet-scale applications.

Client applications that generate and publish messages to Kafka are called producers. The data they publish is automatically organised into defined “topics” and partitioned across multiple nodes, or brokers, when stored. Client applications acting as “consumers” can then subscribe to data streams from specific topics.

These functionalities make Kafka ideal for real-time ingestion and processing of large volumes of data in use cases such as logistics, retail inventory management, threat detection and customer experience. The architecture enables organisations to democratise their data, giving developers and other data-streaming practitioners access to shareable streams of data from across their organisations.
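As a concrete illustration, here is a minimal producer/consumer sketch using the confluent-kafka Python client. The broker address, topic name and payload are placeholder values, and a local Kafka broker is assumed to be running.

```python
# Minimal Kafka producer/consumer sketch (confluent-kafka client).
# Broker address, topic and payload are placeholder values.
from confluent_kafka import Consumer, Producer

BROKER = {"bootstrap.servers": "localhost:9092"}

# Producer: publish an event ("message") to a topic.
producer = Producer(BROKER)
producer.produce("orders", key="order-42", value='{"amount": 199}')
producer.flush()  # block until the broker confirms delivery

# Consumer: subscribe to the topic and read events back in order.
consumer = Consumer({**BROKER,
                     "group.id": "demo-group",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["orders"])

msg = consumer.poll(timeout=10.0)  # wait up to 10 s for a message
if msg is not None and msg.error() is None:
    print(msg.key(), msg.value())  # b'order-42' b'{"amount": 199}'
consumer.close()
```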

Skills a developer needs to effectively use data streaming

Kafka provides client libraries that simplify reading and writing streams of data in a variety of programming languages. Kafka Streams enables developers to easily process these data streams in real time. Kafka Connect enables developers to easily bring in data from their data sources and turn data-at-rest into data-in-motion.

Confluent completes Kafka. ksqlDB further simplifies stream processing by allowing developers to use SQL to process the data streams. With Schema Registry, Stream Catalog and Stream Governance, developers can confidently democratise data within their organisations.

Once developers have mastered creating data pipelines with Kafka, they’ll be ready to explore stream processing, which unlocks a host of operational and analytics use cases while creating reusable data products.
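In ksqlDB, a processing step like filtering can be a single SQL statement. To show what those higher-level tools automate, here is a hand-rolled consume-transform-produce loop in Python; the topic names and the routing rule are made up for illustration.

```python
# The essence of stream processing: consume, transform, produce.
# A hand-rolled sketch of what Kafka Streams and ksqlDB automate;
# topic names and the routing rule are hypothetical.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "payment-router",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["payments"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None or msg.error() is not None:
        continue
    event = json.loads(msg.value())
    # Route large payments to a separate topic for review.
    if event.get("amount", 0) > 10_000:
        producer.produce("large-payments", value=json.dumps(event))
        producer.flush()
```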

Learn to problem-solve with a data-streaming mindset

Instead of thinking of data as finite “sets of sets,” as it’s stored in relational databases, you’ll have to learn how to work with data stored as immutable, append-only logs.
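To see the shift, consider a bank balance. A relational table stores one mutable row holding the current balance; an event log stores every deposit and withdrawal, and the balance is derived by replaying the log. A toy sketch in Python (the account and amounts are invented):

```python
# "Sets of sets" vs. an append-only log: state is derived
# by replaying immutable events, never updated in place.
from functools import reduce

events = [  # only ever appended to
    {"account": "A", "delta": +100},
    {"account": "A", "delta": -30},
    {"account": "A", "delta": +15},
]

# Fold over the full history to get the current balance --
# the single value a relational row would have stored.
balance = reduce(lambda total, e: total + e["delta"], events, 0)
print(balance)  # 85

# The log also answers questions a mutable row cannot,
# such as "what was the balance after the second event?"
print(sum(e["delta"] for e in events[:2]))  # 70
```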

Kafka is one of the most active open-source projects, and businesses across sectors are doubling down on their investment in data streaming. With so many companies and industries standardising on Kafka as the de facto solution for data streaming, there’s a robust community for newcomers to join and learn alongside.

Developers who invest time in learning how to solve these kinds of impactful use cases will have a wealth of job opportunities, which means more interesting problems to solve and space to grow their skills and careers.

Learn on the ground and in real-time

We want to show you how Kafka can work for you to help your business work in real-time. That’s why we are bringing the Data in Motion (DIM) Tour to Bengaluru and Mumbai in June. This tour is for data-driven startups and enterprises that want to learn how data streaming can enable real-time analytics and decision-making with the power and flexibility of the cloud.

Whether you are new to Kafka or a seasoned pro, if you are ready to take your Kafka skills to the next level and learn how to leverage data streaming in the cloud, don’t miss the Data in Motion Tour. You will find something valuable and inspiring there.


Google Bard cheat sheet: What is Bard, and how can you access it?

A prompt that says, “Bard can…” with a list of actions that Google Bard can take. Image: Andy Wolber/TechRepublic

Bard is Google’s artificial intelligence chatbot that generates responses to user-provided natural language prompts. In response to a prompt, Bard can pull information from the internet and present an answer. The large language model behind Bard delivers the response in natural language — in contrast to a standard Google search, where a result consists of a snippet of information or a list of links.

SEE: Explore how ChatGPT and other generative AI tools can help you be more productive.

Google announced Bard in February 2023 after OpenAI and Microsoft both garnered attention for AI chatbot systems. And in May 2023, Bard and related AI advancements featured prominently in Google’s I/O event.

According to Sundar Pichai, CEO of Google and Alphabet, Bard is “an experimental conversational AI service.”

In fact, Google places the word “Experiment” next to the system’s name to show it is still a work in progress. Additionally, Google indicates that “Bard may display inaccurate or offensive information that doesn’t represent Google’s views” in a disclaimer placed below the prompt box.

Jump to:

  • What is Google Bard used for?
  • When was Google Bard released?
  • How can you get access to Google Bard?
  • What countries and languages is Google Bard available in?
  • Can I manage my Google Bard history?
  • Is Google Bard free to use?
  • Is Google Bard using PaLM 2?
  • What are alternatives to Google Bard?

What is Google Bard used for?

Bard’s prompt-response process can help you obtain answers faster than a standard Google search sequence.

A classic Google search requires you to enter keywords, follow links, review content, then compile the results or repeat the process with a refined keyword search string.

SEE: Check out these Google Bard search prompting tips.

With Bard, you enter a prompt, then review the response. If the response isn’t exactly what you want, you have four options:

  • View other drafts to display alternatively formatted responses.
  • Regenerate the response to have the system craft a new reply.
  • Follow up with another prompt.
  • Switch to a search with the Google it button.

Bard can handle all sorts of tasks, but many of the most common uses are covered by the categories of capabilities detailed below.

Google Bard can summarize

As a large language model, Bard can adeptly summarize text. For example, provide a link to a web page and ask Bard to summarize the contents, e.g.:

Please summarize https://blog.google/technology/ai/bard-google-ai-search-updates/.

You also can suggest a specific length if you want a particular degree of brevity, such as “Please summarize in 100 words.”

Google Bard can compare

Bard can compare two or more items. In many cases, when you ask Bard to compare things, the system will display some of the data in a table. For example, if you prompt Bard:

Compare a Pixel 7, Pixel 7a and Samsung Galaxy S23.

Similarly, you may ask Bard to compare web pages.

Google Bard can suggest

Bard may serve as a suggestion engine for products, services or activities. Enter the titles of books, music or movies you like, then ask Bard to suggest others. This can be useful when you’re researching unfamiliar topics. For example, you might try:

I am interested in learning the history of machine learning. Can you recommend 10 useful and highly respected books on the topic?

Google Bard can explain

When you want to learn about a topic or historical event, you can ask Bard to explain it to you. If you like, you may suggest a desired level of complexity to guide the system toward an explanation that is either easier to understand or more detailed. For a general overview of a core technology that helps make Bard work, you might ask:

Can you explain the basics of how neural networks operate? Explain it to me as if I am in my first year of college.

Google Bard can brainstorm

One of the best uses of a chatbot is to gather a long list of ideas. Ask Bard to “Brainstorm ideas for…” followed by whatever topic you wish, such as a new project, promotional effort or paper. Encourage Bard to provide creative, unusual or inventive ideas for additional variety in the responses.

Google Bard can code and debug

In April 2023, Bard added the ability to create and help debug code in more than 20 programming languages. When you ask for code, make sure to specify the programming language and describe in as much detail as possible the code you need. If the code generated doesn’t work, let Bard know what exactly went awry, and ask for a suggested fix or for help interpreting an error code.

SEE: Explore other Google Bard enhancements.

Bard can draft text

Bard can help you write, too. As with most prompts, provide as much detail about the topic, length, format (blog post, poem, essay, book report, etc.) and style as possible. If you have a rough outline of a blog post, you might include the desired points in your prompt. For this section of text, for example, you might prompt:

Using the following points as an outline, can you draft examples and explanatory text? "Bard can summarize. Bard can compare. Bard can suggest. Bard can explain. Bard can brainstorm. Bard can draft text. Bard can code (and debug). Bard can search."

The responses Bard generated were reasonable and might have required only a little editing and correction to be usable.

Google makes it easy to move Bard text elsewhere. Select the response export button to move content to either a new Google Doc or Gmail. Alternatively, select the More button (the three vertical dots), then choose Copy to place the response text on the system clipboard for pasting into any app of your choice.

Bard can search

Since Bard can access internet content, many conventional keyword searches will also work in Bard. Ask about current news topics, weather forecasts or pretty much any standard keyword search string. However, Bard will provide responses mostly in conventional text, sometimes supplemented with images, whereas Google search may show content in custom formats (e.g., weather forecasts often display a chart). When you seek a set of links, switch out of Bard back to a standard Google search.

Bard can be wrong

Bard can get things wrong. Never rely solely on content provided in Bard responses without verification. When Bard does provide an inaccurate, misleading or inappropriate response, select the thumbs down icon to convey to the system that it provided a bad response. Remember, Bard is an experiment.

When was Google Bard released?

At launch in March 2023, Google limited Bard access via a waitlist to people with personal Google accounts. In early May 2023, Google eliminated the waitlist and made Bard more widely available.

How can you get access to Google Bard?

To access Bard, go to https://bard.google.com in a web browser, and sign in with a Google account (Figure A).

Figure A

Go to bard.google.com in any modern browser, then sign in with a Google account.

If your account is managed by a Google Workspace administrator, such as an account for work or school, the administrator may adjust settings to either allow or prevent access to Bard. Check with your administrator, should you have any questions.

If you are a Google Workspace administrator and wish to review or adjust the settings that affect Bard availability for people in your organization, access the Admin console | Apps | Additional Google services | Early Access Apps, then modify the Service status and Core Data Access Permissions as desired.

What countries and languages is Google Bard available in?

On May 10, 2023, Google expanded Bard to support Japanese and Korean in addition to U.S. English. Simultaneously, Google made Bard available in more than 180 countries and territories. However, Bard was not made available on that date to people in European Union countries, such as Germany, France, Italy and Spain. By the end of 2023, Google intends to make Bard available in the 40 most spoken languages.

Can I manage my Bard activity history?

Yes, Google gives you control over your Bard activity history, much as it does your search and browsing history. To adjust the settings, select Bard Activity from the left menu. Then, you may choose whether Bard Activity history is on or off (Figure B).

Figure B

While access to previous prompts can be helpful, Google gives you full control over whether or not your Bard Activity history is stored.

If the setting is on, you may choose to auto-delete activity after three, 18 or 36 months, or not at all. Additionally, you may access your Bard activity history, which can be helpful if you wish to review or rerun a previous prompt.

Is Google Bard free to use?

Yes, Google Bard is available to use for free. As of May 2023, Google Bard remains free of advertising, as well.

Is Google Bard using PaLM 2?

In May 2023, Google announced that Bard had switched to using Pathways Language Model 2 (PaLM 2) rather than Language Model for Dialogue Applications (LaMDA). Google promotes PaLM 2 as a “state-of-the-art language model with improved multilingual, reasoning and coding capabilities.”

SEE: Learn how to successfully use ChatGPT.

Google plans to make PaLM 2 available in four distinct sizes: Gecko, Otter, Bison and Unicorn. The distinct sizes are intended to serve a wide range of computing environments. The smallest, Gecko, is intended to be functional even on a mobile device without an internet connection.

What are alternatives to Google Bard?

The ability to access current internet content is a key differentiator between Google Bard and many other chatbot AI systems. Many large language model chatbot systems were trained on older data and lack access to information about current events. This inability to browse the internet limits the usefulness of many of these systems.

Three alternatives to Bard that can access current internet content and are worth exploring are:

  • Perplexity.ai: Available free on the web with account sign in optional.
  • Bing: Available free on the web in Microsoft Edge with Microsoft account sign in.

  • ChatGPT Plus: Available for $20 per month in a web browser or in an iPhone app. In late May 2023, Microsoft announced that the free edition of ChatGPT will gain access to Bing, as well.
