7 Models Based on LLaMa 2

Llama 2 is Communist

At Microsoft’s Inspire event, Meta and Microsoft launched Llama 2, the latest version of their renowned open-source LLM, LLaMa. It comes with various improvements to enhance its performance and safety. Notably, it introduces the 7B, 13B, and 70B pre-trained and fine-tuned parameter models, offering a substantial increase in pre-trained data and leveraging GQA for better inference capabilities.

The release of Llama 2 is available for both research and commercial use, accessible on platforms like Microsoft Azure and Amazon SageMaker. It is also compatible with Windows platforms such as Subsystem for Linux (WSL), Windows terminal, Microsoft Visual Studio, and VS Code.

The model has undergone meticulous optimization for dialogue purposes, resulting in fine-tuned Llama 2-Chat models that set new benchmarks in the field of language processing and understanding. The collaboration between Meta and various other companies, including Amazon, HuggingFace, NVIDIA, Qualcomm, IBM, Zoom, Dropbox, and academic leaders, emphasizes the importance of open-source software.

Here are some models which are already based on LLaMa-2 and can be used to access the latest Meta offering:

Perplexity

Perplexity.ai is a unique chatbot platform that takes on a search engine-like approach. It scours the internet to find answers to user queries and provides the sources for the responses it generates. The platform has its own LLaMA chatbot available at llama.perplexity.ai, where users can toggle between the 13-billion-parameter model and the 7-billion-parameter model to compare the results.

Impressively, Perplexity swiftly released a new chatbot utilizing Meta’s latest Llama 2 AI model within 24 hours of its introduction as an open-source large language model.

Here’s the link to LLaMa by Perplexity: https://labs.perplexity.ai/

The LLaMa Chat, which is built on Llama 2, is currently in an experimental phase and is exclusively accessible via http://labs.pplx.ai. However, it is not available on their mobile apps at the moment.

One of the remarkable features of Perplexity is its generous offering to users. They provide the Llama 2 models with 70 billion, 13 billion, and 7 billion parameters for free, allowing users to experiment and leverage the power of these large language models.

Furthermore, the chatbot has a maximum token length of 4096, which enables it to handle more extensive and complex inputs from users. This ensures that the chatbot is capable of providing detailed and informative responses.

Overall, Perplexity.ai presents a novel approach to chatbots by incorporating search engine capabilities, and its fast adoption of Meta’s Llama 2 AI model showcases its commitment to providing cutting-edge technology and free access to users for experimentation.

Baby llama

Andrej Karpathy took on the ambitious task of implementing the Llama 2 architecture in the C programming language, moving away from the commonly used GPT-2. The primary objective was to demonstrate the possibility of running complex language models on resource-constrained devices through a minimalistic C implementation. Surprisingly, the model achieved impressive inference rates, even on devices with limited computational resources.

Here’s the Github: https://github.com/karpathy/llama2.c

To accomplish this, Karpathy utilised the nanoGPT model as a starting point and developed the Llama 2 model with approximately 15 million parameters. Remarkably, the C implementation of this model achieved an inference speed of around 100 tokens per second on an M1 MacBook Air, showcasing the feasibility of running sophisticated models on devices without requiring powerful GPUs.

The Baby Llama approach involved training the Llama 2 LLM architecture from scratch using PyTorch. Subsequently, Karpathy wrote a concise C code, titled “run.c,” specifically for performing inferences. The emphasis was on maintaining a low-memory footprint and avoiding the need for external libraries. This efficient approach allowed the model to be executed efficiently on a single M1 laptop without relying on GPU acceleration. Karpathy also explored the use of various compilation flags to further optimize the C code for better performance.

This highlights the tremendous potential of leveraging C code to run sophisticated language models on resource-constrained devices, a domain not traditionally associated with machine learning applications.

Poe

Poe, a chatbot platform, has recently added support for several Llama 2 models, including Llama-2-70b, Llama-2-13b, and Llama-2-7b. Among these, Poe recommends Llama-2-70b as it delivers the highest quality responses.

The platform boasts unique features that make it stand out from others. Poe is possibly the only consumer product allowing users to employ Llama on native iOS or Android apps, upload and share files, and continue conversations seamlessly.

Unlike other chatbot platforms like ChatGPT or Google Bard, Poe does not create its own language models. Instead, it provides users with access to various pre-existing models. Some of Poe’s official bots include Llama 2, Google PaLM 2, GPT-4, GPT-3.5 Turbo, Claude 1.3, and Claude 2.

Additionally, Poe offers an Assistant bot as the default one, which is based on GPT-3.5 Turbo. Users can also create their own third-party bots with built-in prompts to fulfill specific tasks.

Wizard LM

WizardLM models are trained on Llama-2 using brand-new Evol+ methods. The WizardLM-13B-V1.2 achieves impressive results, with a score of 7.06 on MT-Bench, 89.17% on Alpaca Eval, and 101.4% on WizardLM Eval. These models support a 4k context window and are licensed under the same terms as Llama-2.

The core contributors are currently working on the 65B version and plan to empower WizardLM with the capability to perform instruction evolution autonomously, making it cost-effective for adapting to specific data.

Additionally, they have released WizardCoder-15B-V1.0, which outperforms other models on HumanEval Benchmarks. The WizardLM-13B-V1.0 model has also achieved the top rank among opensource models on the AlpacaEval Leaderboard.

The performance comparison reveals that WizardLMs consistently excel over LLaMa models of the same size, particularly evident in NLP foundation tasks and code generation tasks. The WizardLM-30B model shows better results than Guanaco-65B.

Overall, WizardLM represents a significant advancement in large language models, particularly in following complex instructions and achieving impressive performance across various tasks.

Stable Beluga 2

Stable Beluga 2 is an open-access LLM (Large Language Model) based on the LLaMA 2 70B foundation model. It showcases remarkable reasoning capabilities across various benchmarks. The model is fine-tuned using a synthetically generated dataset in standard Alpaca format, employing the Supervised Fine-Tune (SFT) approach. Its performance even compares favorably with GPT-3.5 on certain tasks. Researchers attribute the high performance to the rigorous synthetic data training approach, making Stable Beluga 2 a significant milestone in the field of open-access LLMs.

Stable Beluga 2 is based on Llama2 70B and fine-tuned on an Orca-style dataset. The model’s usage involves starting conversations using provided code snippets. The training dataset for Stable Beluga 2 is an internal Orca-style dataset.

Stable Beluga 2 is trained through supervised fine-tuning datasets using mixed-precision (BF16) and optimized with AdamW. Detailed hyperparameters are outlined for the training procedure.

LunaAI

“Luna AI Llama2 Uncensored” is an advanced chat model based on Llama2, which underwent fine-tuning using more than 40,000 lengthy chat discussions. Tap, the creator of Luna AI, led the fine-tuning process, resulting in an improved Llama2 7b model that competes with ChatGPT in various tasks effectively.

Here’s the link to LunAI: https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGML

What sets this model apart is its extended responses, meaning it can generate detailed and comprehensive answers, its low hallucination rate, indicating it produces fewer fictitious or incorrect information, and its absence of censorship mechanisms, guaranteeing open and unrestricted communication.

For the model training, an exceptionally powerful 8x a100 80GB machine was employed to conduct the fine-tuning process. The model was predominantly trained on synthetic outputs, which means the training data was generated rather than solely collected from existing human conversations. This custom dataset was thoughtfully curated from diverse sources and comprised multiple rounds of conversations between humans and AI.

Redmond-Puffin-13B:

Redmond-Puffin-13B is a pioneering language model based on Llama-2 and fine-tuned by Nous Research. The fine-tuning process involved a meticulously crafted dataset containing 3,000 high-quality examples. Many of these examples were designed to fully utilize the 4096 context length capability of Llama-2. LDJ took the lead in both training the model and curating the dataset, while J-Supha made significant contributions to dataset formation.

Here’s the link to the model: https://huggingface.co/TheBloke/Redmond-Puffin-13B-GGML

The computational resources for this project were generously sponsored by Redmond AI, and Emozilla provided valuable assistance during the training experiments, helping to address various issues encountered during the process. Moreover, Caseus and Teknium were recognized for their contributions to resolving specific training issues.

The model, named Redmond-Puffin-13B-V1.3, was trained for multiple epochs on the carefully curated dataset of 3,000 GPT-4 examples. These examples mainly comprised extensive conversations between real humans and GPT-4, allowing the model to grasp complex contexts effectively. Additionally, the training data was enriched with relevant sections extracted from datasets such as CamelAI’s Physics, Chemistry, Biology, and Math.

The post 7 Models Based on LLaMa 2 appeared first on Analytics India Magazine.

Top 7 Generative AI Courses by AWS

Generative AI holds the power to transform our customers’ operations, enhancing their efficiency, productivity, and innovation capabilities. As work methods continue to evolve, the demand for cloud expertise is increasing. Moreover, according to the World Economic Forum, AI skills represent the third-highest priority for companies’ training strategies, right alongside analytical and creative thinking. It is pretty much true that ‘AI might not replace you, but a person who uses AI could.’

To upskill individuals, AWS has introduced a diverse range of 7 generative AI courses, thoughtfully tailored to cater to both technical and non-technical audiences. Among these, 5 courses are specifically designed for developers and technical professionals, while the remaining 2 are geared towards those with a non-technical background.

While anyone can take these courses, however these are specifically designed for developers who want to utilize Amazon CodeWhisperer, engineers and data scientists who aim to employ generative AI by training and deploying foundation models (FMs), executives who want to gain insights into how generative AI can help address their business challenges and AWS partners who wish to assist their customers in better understanding generative AI services and customer use cases.

Here is a list of seven courses you can explore today to get started.

For Technical Audience

Amazon CodeWhisperer – Getting Started : This self-paced digital course offers learners an introduction to Amazon CodeWhisperer, an AI coding companion that facilitates developers in accomplishing tasks more efficiently and quickly.

It’s a free course designed to provide a comprehensive understanding of CodeWhisperer’s functionalities and how it can enhance the coding experience.In this course, you will learn how to install and start using CodeWhisperer in your supported integrated development environment (IDE) or code editor.

AWS Jam Journey – Build Using Amazon CodeWhisperer : It is a hands-on and interactive training tailored for DevOps professionals. Dive into practical challenges, building and exploring Amazon CodeWhisperer in a secure, sandboxed AWS environment. This unique learning opportunity comes bundled with an AWS Skill Builder subscription.

Generative AI Foundations on AWS :​ This free 8 hour, on-demand technical deep-dive course is specifically created for AI modeling experts. It offers conceptual fundamentals, practical insights, and hands-on instructions to pre-train, fine-tune, and deploy cutting-edge FMs on AWS and other platforms

Generative AI with Large Language Models : This course is co-created by AWS, DeepLearning.AI, and machine learning pioneer Andrew Ng. This comprehensive three-week program equips data scientists and engineers with the skills to excel in selecting, training, fine-tuning, and deploying large language models (LLMs) for practical, real-world applications.

AWS PartnerCast – Building Generative AI on AWS: AWS PartnerCast offers an in-depth exploration of our generative AI services and capabilities on AWS, such as Amazon Bedrock, Amazon CodeWhisperer, and Amazon SageMaker. It demonstrates how organizations can effectively utilize these tools to assist their customers.

For Business and Nontechnical Audiences

AWS Partner: Generative AI on AWS Essentials (Business): This course tailored for AWS Partner customer-facing professionals. It covers the fundamentals of generative AI, explores essential customer use cases and personas, and illustrates how generative AI on AWS empowers customers to transform and revitalize their businesses.

Generative AI for Executives: It is a 13 minute fundamental generative AI course to help C-suite executives understand how generative AI can help address their business challenges and drive business growth.

The post Top 7 Generative AI Courses by AWS appeared first on Analytics India Magazine.

Introduction to “AI & Data Literacy: Empowering Citizens of Data Science”

Book-5-Intro-graphic

One of the reasons that I moved back to Iowa last year was that I saw an opportunity to work with local educational institutions to create an AI Institute for organizations in middle America that either get overlooked in the AI conversation or are unsure what AI means to them. I wanted to reduce the AI hype to a simple conversation in which everyone was empowered to participate. Plus, life is much more fulfilling when one has a mission, and this seemed like the perfect mission at the perfect time in my career.

AI is not just for the high priesthood of large corporations and 3-lettered government agencies. AI is a tool that everyone needs to understand where and how to use. We must ensure that AI is a tool that is approachable and usable by anyone: that it’s not some ominous, independent-acting entity that will take over the world.

However, AI is only a useful and practical tool if we take a comprehensive approach to ensure its proper design, development, deployment, and ongoing management. Achieving this goal necessitates the involvement of everyone. We must train everyone to become “Citizens of Data Science” – to be educated in AI and data literacy so everyone can actively participate in where and how AI is used to improve the human condition.

Thusly, the motivation and inspiration for my latest (and final?) book, “AI & Data Literacy: Empowering Citizens of Data Science.”

Introducing “AI & Data Literacy: Empowering Citizens of Data Science

Slide1-3

Figure 1: “AI & Data Literacy: Empowering Citizens of Data Science

First off, notice the cover of this book. Simple. Straightforward. No hyperbole about the extinction of humankind. No outrageous claims about massive human unemployment. Just a simple cover with a simple title to reflect the simple concept of Artificial Intelligence (AI). Because here is the simple truth about AI: AI will do exactly what you train it to do. The actions AI takes and the decisions AI makes will be guided 100% by the user-defined desired outcomes and the metrics and measures against which outcomes’ effectiveness will be measured. And all of these are 100% defined by you.

AI is only a tool, but unlike any other tool we have seen, this tool can continuously learn and adapt with minimal human intervention. And that’s what scares people. How do you control what AI learns? How do you control how AI adapts to make new decisions and take new actions?

So, let’s simplify the conversation. Let’s empower everyone with the knowledge to ensure that AI is working for the benefit of humankind. But first, an important point about your upcoming AI & Data Literacy journey…what is your role as a “citizen”?

What Does It Mean to be a “Citizen”

Citizens of Data Science Mandate

Ensuring that everyone of every age and every background have access to the education necessary to flourish in an age where economic growth and personal development opportunities are driven by AI and Data

The purpose of this book is to equip everyone with the necessary skills to thrive in a world driven by AI and data. However, do you really understand your obligation as a “Citizen” of Data Science? To quote my good friend John Morley on citizens and citizenship:

“Citizenship isn’t something that is bestowed upon us by an external, benevolent force. Citizenship requires action. Citizenship requires stepping up. Citizenship requires individual and collective accountability – accountability to continuous awareness, learning, and adaptation. Citizenship is about having a proactive and meaningful stake in building a better world.”

I believe that in the future, the best-paying and most rewarding careers will be those that require mastering where and how to leverage AI and data to make those professions more meaningful, relevant, and effective.

Book Content Overview

The book starts by discussing how your personal data is gathered, analyzed, and used to influence or manipulate your beliefs, perspectives, decisions, and actions. Then the book will transition into a conversation about data privacy efforts that are currently underway and their potential ramifications on you.

We will review the advanced analytics maturity index and supporting ecosystem to understand where and how to leverage these advanced analytic algorithms to drive better personal and professional outcomes and create new business, operational, environmental, and societal value sources. We will then deep dive into AI – how AI works, the importance of understanding and determining user intent, and the critical importance of building a responsible and ethical AI Utility Function.

We will then discuss how we can build decision models that leverage AI and data to make more informed, more accurate, and less risky decisions in an imperfect world. We will review how we can hone our problem solving skills to create models that leverage AI and data to improve the decisions that drive improved business, operational, and economic outcomes.

Sorry, but we’ll have a short primer on statistics, probabilities, and confidence levels. We will discuss how we can use statistics to help us improve the odds of making more effective and safer decisions in a world of constant economic, environmental, political, societal, and healthcare disruption.

Next, we will discuss how organizations of all sizes can leverage AI and data to engineer or create “value.” We will learn a framework for understanding how organizations define value and then identify the KPIs and metrics they will use to measure their value creation effectiveness. We will also discuss why the “economies of learning” are more powerful than the “economies of scale” in a digital-centric world.

Then we will talk about how we can approach the tricky topic of ethics. We will frame the ethics conversation from an economics perspective because we need to codify the variables and metrics around which we define ethical behaviors to create AI models that exhibit ethical behaviors.

Then, we’ll talk about the importance of empowerment; to ensure that everyone has a voice in deciding and defining how best to leverage AI and data from a personal perspective. We will discuss how we must become “more human” to thrive alongside AI. This is clearly my favorite chapter!

Finally, I’ll apply the concepts from the book to the current world of ChatGPT and Generative AI (GenAI). I will quickly review the key enabling GenAI technologies and then test myself to see how well the book has prepared me to understand where and how I can apply GenAI to deliver meaningful, relevant, responsible, and ethical responses.

This is a relatively easy conversation if we break the AI and data literacy conversation into its material components. To facilitate a more holistic awareness and education, we will break the AI and data literacy conversation into six interlocking components (Figure 2):

  • Data & Privacy Awareness
  • AI & Analytic Techniques
  • Making Informed Decisions
  • Predictions & Statistics
  • Value Engineering Competency
  • AI Ethics
Slide2-1

Figure 2: AI & Data Literacy Educational Framework

I hope you enjoy reading and learning from the book as much as I did in researching, testing, learning, relearning, and writing this book. I hope you enjoy your AI & Data Literacy journey and becoming a Citizen of Data Science!

BTW, if you’d like to get a personalized Certificate of Completion, complete the “AI and data literacy radar chart” in Chapter 9, post it on LinkedIn, and tag me. In return, I’ll post a personalized Certificate of Completion on your LinkedIn post!

Slide3-2

MachineCon USA 2023: Generative AI Captured Imagination of Tech Leaders

MachineCon USA 2023 marked the first venture into the vibrant and challenging market in the country, and it turned out to be a massive success! The best minds and top leaders in the field of analytics and artificial intelligence came together to discuss the past, present, and future of this transformative technology.

The one-day event held in New York hosted more than 150 members and celebrated the winners of the AI100 award. The award celebrates the top 100 most influential AI and Analytics leaders globally. Leaders from diverse industries were chosen from across the globe, along with innovative startup founders and individuals who made significant contributions in the field.

Speakers presented interesting perspectives on generative AI and took part in panel discussions on the impact of the technology, opening up the conversation with everyone present.

“I want to thank Analytics India Magazine for a fantastic conference here. It is amazing to see the brilliant minds we get to meet and talk about generative AI here,” said Vinod Krishna Ravi Iyer, the Vice President of Marketing at Genpact.

Explaining the importance of adopting AI in marketing, he said, “The notion of AI is here and now, and this is the moment to seize the opportunity to be an AI-first brand in the marketspace. The rapid pace at which technology is pivoting, we need to be on the train right now and get going on all aspects of innovation.”

Highlights from MachineCon USA

The overarching theme of this year was, as expected, Generative AI. With everyone talking about generative AI, the experts in the field offered informed perspectives on the topic.

The Keynote speaker, Katie Stein, the Chief Strategy Officer at Genpact, spoke about how businesses should rethink the existing structure and focus on hyperproductivity with Generative AI. She said, “I am very concerned about the narrative and the hyperfocus on productivity. Productivity is part of what this tool can deliver to us, but the hyperfocus on productivity is creating a narrative about job loss, where I think about the potential creation of [jobs].”

Krishna Gade, founder and CEO of Fiddler, and Peeyush Dubey, the Chief Marketing & Strategy Officer at TheMathCompany, followed with insights on the responsible adoption of AI and looking beyond the hype created on the subject. Their insightful inputs on the concerns of the industry and how established analytics organizations have made significant progress in implementing enterprise-level model management, which can set them apart as a key differentiator in the rapidly evolving GenAI landscape.

Ending the first half of the day, Amaresh Tripathy, Managing Partner at AuxoAI, emphasized the role of evolution in life and in technology in his talk about what the companies should do about Generative AI.

The panel discussions focused on the impact of generative AI on businesses. Parvathi Puttanna, Priya Serai, and Deepak Khetpal compared how each of them leveraged generative AI. Priya Serai said, “Definitely AI/ML is not new to banking and finance. We’ve been working on these models, but the difference is how to leverage what we’ve been building the past few years to the next level of generative AI. Data is the foundation for all of the new models, and companies that worked with the data foundations are able to expand on these capabilities better than companies that haven’t. Data is the cake and everything else on top; the generative AI is the only icing on top.”

The second panel revolved around the opportunities and threats in the adoption of generative AI where Agus Sudjianto, Joe Kleinhenz, Wayne Huang, and Harsh Kar brought their expertise in their specific domains. They discussed the ways in which their organisation is adopting AI and their thought process behind it.

AI100 Award Winners

On a rainy summer day in the middle of New York, some of the most resourceful innovators were brought together at MachineCon. Their contribution was recognized; Brian Erickson, the Chief Data and Artificial Intelligence Officer U.S. Coast Guard, won the Leader of the year. Krishna Cheriath, Chief Data and Analytics Officer of Zoetis, won the Young Business Leader of the Year award.

The group awards were shared by many in the categories of AI Innovator, Early Adopters, Exemplary Data Scientists.

The post MachineCon USA 2023: Generative AI Captured Imagination of Tech Leaders appeared first on Analytics India Magazine.

LGBMClassifier: A Getting Started Guide

LGBMClassifier: A Getting-Started Guide
Image by Editor

There are a vast number of machine learning algorithms that are apt to model specific phenomena. While some models utilize a set of attributes to outperform others, others include weak learners to utilize the remainder of attributes for providing additional information to the model, known as ensemble models.

The premise of the ensemble models is to improve the model performance by combining the predictions from different models by reducing their errors. There are two popular ensembling techniques: bagging and boosting.

Bagging, aka Bootstrapped Aggregation, trains multiple individual models on different random subsets of the training data and then averages their predictions to produce the final prediction. Boosting, on the other hand, involves training individual models sequentially, where each model attempts to correct the errors made by the previous models.

Now that we have context about the ensemble models, let us double-click on the boosting ensemble model, specifically the Light GBM (LGBM) algorithm developed by Microsoft.

What is LGBMClassifier?

LGBMClassifier stands for Light Gradient Boosting Machine Classifier. It uses decision tree algorithms for ranking, classification, and other machine-learning tasks. LGBMClassifier uses a novel technique of Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) to handle large-scale data with accuracy, effectively making it faster and reducing memory usage.

What is Gradient-based One-Side Sampling (GOSS)?

Traditional gradient boosting algorithms use all the data for training, which can be time-consuming when dealing with large datasets. LightGBM's GOSS, on the other hand, keeps all the instances with large gradients and performs random sampling on the instances with small gradients. The intuition behind this is that instances with large gradients are harder to fit and thus carry more information. GOSS introduces a constant multiplier for the data instances with small gradients to compensate for the information loss during sampling.

What is Exclusive Feature Bundling (EFB)?

In a sparse dataset, most of the features are zeros. EFB is a near-lossless algorithm that bundles/combines mutually exclusive features (features that are not non-zero simultaneously) to reduce the number of dimensions, thereby accelerating the training process. Since these features are "exclusive", the original feature space is retained without significant information loss.

Installation

The LightGBM package can be installed directly using pip – python's package manager. Type the command shared below either on the terminal or command prompt to download and install the LightGBM library onto your machine:

pip install lightgbm

Anaconda users can install it using the “conda install” command as listed below.

conda install -c conda-forge lightgbm

Based on your OS, you can choose the installation method using this guide.

Hands-On!

Now, let's import LightGBM and other necessary libraries:

import numpy as np  import pandas as pd  import seaborn as sns  import lightgbm as lgb  from sklearn.metrics import classification_report  from sklearn.model_selection import train_test_split

Preparing the Dataset

We are using the popular Titanic dataset, which contains information about the passengers on the Titanic, with the target variable signifying whether they survived or not. You can download the dataset from Kaggle or use the following code to load it directly from Seaborn, as shown below:

titanic = sns.load_dataset('titanic')

Drop unnecessary columns such as “deck”, “embark_town”, and “alive” because they are redundant or do not contribute to the survival of any person on the ship. Next, we observed that the features “age”, “fare”, and “embarked” have missing values – note that different attributes are imputed with appropriate statistical measures.

# Drop unnecessary columns  titanic = titanic.drop(['deck', 'embark_town', 'alive'], axis=1)    # Replace missing values with the median or mode  titanic['age'] = titanic['age'].fillna(titanic['age'].median())  titanic['fare'] = titanic['fare'].fillna(titanic['fare'].mode()[0])  titanic['embarked'] = titanic['embarked'].fillna(titanic['embarked'].mode()[0])

Lastly, we convert the categorical variables to numerical variables using pandas' categorical codes. Now, the data is prepared to start the model training process.

# Convert categorical variables to numerical variables  titanic['sex'] = pd.Categorical(titanic['sex']).codes  titanic['embarked'] = pd.Categorical(titanic['embarked']).codes    # Split the dataset into input features and the target variable  X = titanic.drop('survived', axis=1)  y = titanic['survived']

Training the LGBMClassifier Model

To begin training the LGBMClassifier model, we need to split the dataset into input features and target variables, as well as training and testing sets using the train_test_split function from scikit-learn.

# Split the dataset into training and testing sets  X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Let’s label encode categorical (“who”) and ordinal data (“class”) to ensure that the model is supplied with numerical data, as LGBM doesn’t consume non-numerical data.

class_dict = {  "Third": 3,  "First": 1,  "Second": 2  }  who_dict = {  "child": 0,  "woman": 1,  "man": 2  }  X_train['class'] = X_train['class'].apply(lambda x: class_dict[x])  X_train['who'] = X_train['who'].apply(lambda x: who_dict[x])  X_test['class'] = X_test['class'].apply(lambda x: class_dict[x])  X_test['who'] = X_test['who'].apply(lambda x: who_dict[x])

Next, we specify the model hyperparameters as arguments to the constructor, or we can pass them as a dictionary to the set_params method.

The last step to initiate the model training is to load the dataset by creating an instance of the LGBMClassifier class and fitting it to the training data.

params = {  'objective': 'binary',  'boosting_type': 'gbdt',  'num_leaves': 31,  'learning_rate': 0.05,  'feature_fraction': 0.9  }  clf = lgb.LGBMClassifier(**params)  clf.fit(X_train, y_train)

Next, let us evaluate the trained classifier’s performance on the unseen or test dataset.

predictions = clf.predict(X_test)  print(classification_report(y_test, predictions))
             precision    recall  f1-score   support               0       0.84      0.89      0.86       105             1       0.82      0.76      0.79        74        accuracy                           0.83       179     macro avg       0.83      0.82      0.82       179  weighted avg       0.83      0.83      0.83       179

Hyperparameter Tuning

The LGBMClassifier allows for much flexibility via hyperparameters which you can tune for optimal performance. Here, we will briefly discuss some of the key hyperparameters:

  • num_leaves: This is the main parameter to control the complexity of the tree model. Ideally, the value of num_leaves should be less than or equal to 2^(max_depth).
  • min_data_in_leaf: This is an important parameter to prevent overfitting in a leaf-wise tree. Its optimal value depends on the number of training samples and num_leaves.
  • max_depth: You can use this to limit the tree depth explicitly. It's best to tune this parameter in case of overfitting.

Let's tune these hyperparameters and train a new model:

model = lgb.LGBMClassifier(num_leaves=31, min_data_in_leaf=20, max_depth=5)  model.fit(X_train, y_train)
predictions = model.predict(X_test)  print(classification_report(y_test, predictions))
             precision    recall  f1-score   support               0       0.85      0.89      0.87       105             1       0.83      0.77      0.80        74        accuracy                           0.84       179     macro avg       0.84      0.83      0.83       179  weighted avg       0.84      0.84      0.84       179

Note that the actual tuning of hyperparameters is a process that involves trial and error and may also be guided by experience and a deeper understanding of the boosting algorithm and subject matter expertise (domain knowledge) of the business problem you're working on.

In this post, you learned about the LightGBM algorithm and its Python implementation. It is a flexible technique that is useful for various types of classification problems and should be a part of your machine-learning toolkit.
Vidhi Chugh is an AI strategist and a digital transformation leader working at the intersection of product, sciences, and engineering to build scalable machine learning systems. She is an award-winning innovation leader, an author, and an international speaker. She is on a mission to democratize machine learning and break the jargon for everyone to be a part of this transformation.

More On This Topic

  • Getting Started with SQL Cheatsheet
  • Getting Started with ReactPy
  • Getting Started with spaCy for NLP
  • Getting Started with PyCaret
  • Getting Started with GitHub CLI
  • Getting Started with Reinforcement Learning

Meet PhotoGuard: The Ultimate Shield Against Unauthorised Image Manipulation

Is AI Copyright Really Necessary?

From romantic poems to Salvador Dali-inspired images, generative AI can now do it all. And it can do it so well, that it is often impossible to differentiate between AI and human-generated artworks. Since the Turing Test, which set the standard for successful AI performance as being able to mimic humans so well that it becomes indistinguishable, the discussion about technology imitating humans has been a major topic of public debate. The community has always tried to distinguish between text written by humans and text generated by AI, amidst the risk of possible misuse of technology.

MIT’s Computer Science & Artificial Intelligence Laboratory has come up with a solution for this.

MIT Has A Solution: PhotoGuard

Scientists from MIT CSAIL have created a new AI tool called “PhotoGuard” that aims to stop unauthorised changes to images made by models like DALL-E and Midjourney. This tool is specifically designed to protect against image manipulation without proper authorization.

PhotoGuard leverages “adversarial perturbations,” which are minuscule alterations in pixel values that are not visible to the human eye but can be detected by computer models. These perturbations disrupt the AI model’s ability to manipulate images effectively. There are two attack methods used by PhotoGuard to generate these perturbations.

The “encoder” attack targets the AI model’s latent representation of the image, causing the model to perceive the image as random. The goal of this attack is to disrupt the LDM’s process of encoding the input image into a latent vector representation, which is then used to generate a new image. They achieve this by solving an optimization problem using projected gradient descent (PGD). The resulting small, imperceptible perturbations added to the original image cause the LDM to generate an irrelevant or unrealistic image.

On the other hand, the “diffusion” attack defines a target image and optimizes the perturbations to make the final image resemble the target closely. This attack is more complex and aims to disturb the diffusion process itself, targeting not only the encoder but also the full diffusion process that includes text prompt conditioning. The goal is to generate a specific target image (e.g., random noise or a gray image) by solving another optimization problem using PGD. This attack nullifies not only the effect of the immunized image but also that of the text prompT.

Hadi Salman, lead author of the paper and a PhD student at MIT told AIM, “In essence, PhotoGuard’s mechanism of adversarial perturbations adds a layer of protection to images, making them immune to manipulation by diffusion models.” By repurposing these imperceptible modifications of pixels, PhotoGuard safeguards images from being tampered with by such models.

For example, consider an image with multiple faces. You could mask any faces you don’t want to modify, and then prompt with “two men attending a wedding.” Upon submission, the system will adjust the image accordingly, creating a plausible depiction of two men participating in a wedding ceremony. Now, consider safeguarding the image from being edited; adding perturbations to the image before upload can immunise it against modifications. In this case, the final output will lack realism compared to the original, non-immunized image.

“I would be skeptical of AI’s ability to supplant human creativity. I expect that in the long-run AI will become just another (powerful) tool in the hands of designers to boost productivity of individuals to articulate their thoughts better without technical barriers” concluded Salman.

Decoding the Problem

The recent Senate discussion around AI regulation has turned the spotlight on the most pressing issues of copyright and artist incentivisation. Senior executives from OpenAI, HuggingFace, Meta, among others have testified before the US Congress about the potential dangers of AI and suggested the creation of a new government agency to licence large AI models, revoke permits for non-compliance and set safety protocols.

The major impetus behind this plea for regulation stems from concerns regarding copyright infringement. It first started when the artist community filed a lawsuit against the companies behind image generators like Stability AI, Midjourney, and DeviantArt seeking compensation for damages caused by these companies using their art without credit.

AI-generated content is facing opposition from stock image companies like Shutterstock, Getty, as well as artists who see it as a threat to their intellectual property. But eventually most of them have gotten on board with partnerships. Adobe’s Firefly, a generative image maker designed for “safe commercial use.” Adobe offers IP indemnification to safeguard users from legal issues related to its use. It is built on NVIDIA’s Picasso which is trained on licensed images from Getty Images, Shutterstock. Shutterstock also partnered with DALL-E creator OpenAI to provide training data. It also now provides full indemnification to its enterprise customers who use generative AI images on their platform, ensuring protection against any potential legal claims related to the images’ usage. Google, Microsoft, OpenAI have also started watermarking in the aim of mitigating copyright issues.

The post Meet PhotoGuard: The Ultimate Shield Against Unauthorised Image Manipulation appeared first on Analytics India Magazine.

HackerOne: How Artificial Intelligence Is Changing Cyber Threats and Ethical Hacking

Person looking at a visualization of an interconnected big data structure.
Image: NicoElNino/Adobe Stock

HackerOne, a security platform and hacker community forum, hosted a roundtable on Thursday, July 27, about the way generative artificial intelligence will change the practice of cybersecurity. Hackers and industry experts discussed the role of generative AI in various aspects of cybersecurity, including novel attack surfaces and what organizations should keep in mind when it comes to large language models.

Jump to:

  • Generative AI can introduce risks if organizations adopt it too quickly
  • How threat actors take advantage of generative AI
  • Deepfakes, custom cryptors and other threats
  • “Nothing that comes out of a GPT model is new”
  • How businesses can secure generative AI

Generative AI can introduce risks if organizations adopt it too quickly

Organizations using generative AI like ChatGPT to write code should be careful they don’t end up creating vulnerabilities in their haste, said Joseph “rez0” Thacker, a professional hacker and senior offensive security engineer at software-as-a-service security company AppOmni.

For example, ChatGPT doesn’t have the context to understand how vulnerabilities might arise in the code it produces. Organizations have to hope that ChatGPT will know how to produce SQL queries that aren’t vulnerable to SQL injection, Thacker said. Attackers being able to access user accounts or data stored across different parts of the organization often cause vulnerabilities that penetration testers frequently look for, and ChatGPT might not be able to take them into account in its code.

The two main risks for companies that may rush to use generative AI products are:

  • Allowing the LLM to be exposed in any way to external users that have access to internal data.
  • Connecting different tools and plugins with an AI feature that may access untrusted data, even if it’s internal.

How threat actors take advantage of generative AI

“We have to remember that systems like GPT models don’t create new things — what they do is reorient stuff that already exists … stuff it’s already been trained on,” said Klondike. “I think what we’re going to see is people who aren’t very technically skilled will be able to have access to their own GPT models that can teach them about the code or help them build ransomware that already exists.”

Prompt injection

Anything that browses the internet — as an LLM can do — could create this kind of problem.

One possible avenue of cyberattack on LLM-based chatbots is prompt injection; it takes advantage of the prompt functions programmed to call the LLM to perform certain actions.

For example, Thacker said, if an attacker uses prompt injection to take control of the context for the LLM function call, they can exfiltrate data by calling the web browser feature and moving the data that’s exfiltrated to the attacker’s side. Or, an attacker could email a prompt injection payload to an LLM tasked with reading and replying to emails.

SEE: How Generative AI is a Game Changer for Cloud Security (TechRepublic)

Roni “Lupin” Carta, an ethical hacker, pointed out that developers using ChatGPT to help install prompt packages on their computers can run into trouble when they ask the generative AI to find libraries. ChatGPT hallucinates library names, which threat actors can then take advantage of by reverse-engineering the fake libraries.

Attackers could insert malicious text into images, too. Then, when an image-interpreting AI like Bard scans the image, the text will deploy as a prompt and instruct the AI to perform certain functions. Essentially, attackers can perform prompt injection through the image.

Deepfakes, custom cryptors and other threats

Carta pointed out that the barrier has been lowered for attackers who want to use social engineering or deepfake audio and video, technology which can also be used for defense.

“This is amazing for cybercriminals but also for red teams that use social engineering to do their job,” Carta said.

From a technical challenge standpoint, Klondike pointed out the way LLMs are built makes it difficult to scrub personally identifying information out of their databases. He said that internal LLMs can still show employees or threat actors data or execute functions that are supposed to be private. This doesn’t require complex prompt injection; it might just involve asking the right questions.

“We’re going to see entirely new products, but I also think the threat landscape is going to have the same vulnerabilities we’ve always seen but with greater quantity,” Thacker said.

Cybersecurity teams are likely to see a higher volume of low-level attacks as amateur threat actors use systems like GPT models to launch attacks, said Gavin Klondike, a senior cybersecurity consultant at hacker and data scientist community AI Village. Senior-level cybercriminals will be able to make custom cryptors — software that obscures malware — and malware with generative AI, he said.

“Nothing that comes out of a GPT model is new”

There was some debate on the panel about whether generative AI raised the same questions as any other tool or presented new ones.

“I think we need to remember that ChatGPT is trained on things like Stack Overflow,” said Katie Paxton-Fear, a lecturer in cybersecurity at Manchester Metropolitan University and security researcher. “Nothing that comes out of a GPT model is new. You can find all of this information already with Google.

“I think we have to be really careful when we have these discussions about good AI and bad AI not to criminalize genuine education.”

Carta compared generative AI to a knife; like a knife, generative AI can be a weapon or a tool to cut a steak.

“It all comes down to not what the AI can do but what the human can do,” Carta said.

SEE: As a cybersecurity blade, ChatGPT can cut both ways (TechRepublic)

Thacker pushed back against the metaphor, saying that generative AI cannot be compared to a knife because it’s the first tool humanity has ever had that can “… create novel, completely unique ideas due to its wide domain experience.”

Or, AI could end up being a mix of a smart tool and creative consultant. Klondike predicted that, while low-level threat actors will benefit the most from AI making it easier to write malicious code, the people who benefit the most on the cybersecurity professional side will be at the senior level. They already know how to build code and write their own workflows, and they’ll ask the AI to help with other tasks.

How businesses can secure generative AI

The threat model Klondike and his team created at AI Village recommends software vendors to think of LLMs as a user and create guardrails around what data it has access to.

Treat AI like an end user

Threat modeling is critical when it comes to working with LLMs, he said. Catching remote code execution, such as a recent problem in which an attacker targeting the LLM-powered developer tool LangChain, could feed code directly into a Python code interpreter, is important as well.

“What we need to do is enforce authorization between the end user and the back-end resource they’re trying to access,” Klondike said.

Don’t forget the basics

Some advice for companies who want to use LLMs securely will sound like any other advice, the panelists said. Michiel Prins, HackerOne cofounder and head of professional services, pointed out that, when it comes to LLMs, organizations seem to have forgotten the standard security lesson to “treat user input as dangerous.”

“We’ve almost forgotten the last 30 years of cybersecurity lessons in developing some of this software,” Klondike said.

Paxton-Fear sees the fact that generative AI is relatively new as a chance to build in security from the start.

“This is a great opportunity to take a step back and bake some security in as this is developing and not bolting on security 10 years later.”

Person using a laptop computer.

Subscribe to the Daily Tech Insider Newsletter

Stay up to date on the latest in technology with Daily Tech Insider. We bring you news on industry-leading companies, products, and people, as well as highlighted articles, downloads, and top resources. You’ll receive primers on hot tech topics that will help you stay ahead of the game.

Delivered Weekdays Sign up today

Can Edge AI Drive India’s USD 5 Trillion Economy Ambitions?

India’s manufacturing sector accounts for 15% of India’s Gross Domestic Product (GDP) and employs nearly 12% of the country’s workforce. In India’s ambitions of becoming a USD 5 trillion economy, the manufacturing sector will play a pivotal role as Edge AI is poised to revolutionise the sector. In the Union Budget 2023-24, the Indian government announced strategic investments in training the youth on emerging technology such as AI.

Dr. Mukesh Gandhi, Founder and CEO, Creative Synergies Group said this is a promising step towards addressing the skill gap in the country, which could, in turn, contribute to India’s global expertise and USD 5 trillion economy goal. “By using Cloud 2.0 and Edge AI, Indian manufacturing companies can optimise their operations and reduce downtime through predictive maintenance. It can allow real-time data processing to monitor equipment performance, predict maintenance requirements, provide the processing power and storage,” he told AIM.

A report released earlier this year by Rockwell Automation titled ‘State of Smart Manufacturing Report’ revealed that India has the largest number of manufacturing firms investing in technology. The survey was conducted across 1,350 manufacturers in 13 of the major manufacturing countries including India, China, US, Germany, Japan and the UK.

In fact, it is not just the manufacturing sector, Dr. Gandhi feels even in sectors such as healthcare, edge AI can enable real-time monitoring of patients’ vital signs and enable early detection of diseases and storage capabilities unlocked with cloud 2.0 can aid with diagnosis and treatment. “Since the government is making efforts towards technological upskilling, these emerging technologies can be used to provide educational resources and training to students in remote areas. It could also provide personalised learning experiences through real-time analysis of student performance data.”

However, challenges remain

Even though Indian manufacturers tops the charts when it comes to spending on technology, yet, India is still lagging behind in digital maturity. Another report by Lenovo released during the same period found that 48% of Indian businesses are still clung to the first stage of digital maturity. Nonetheless, things are expected to improve as time progresses.

Furthermore, when it comes to Edge AI, a lot of the potential it holds depends on the advancement in terms of hardware. Dr. Gandhi claims he is already seeing a lot of transformation in this regard. “A lot of the hardware which is out there today for edge and so on is being used in certain cases. But we know of several companies which are either developing proprietary chips, or they are working with companies that are developing proprietary chips and figuring out a configuration which would work very specifically for them,” he said.

Given the Edge AI market is expected to see unprecedented growth in the coming years, today, there are startups such as Sima.ai working on chips to bring an AI revolution to the industrial world. The Edge AI market is currently valued at around USD 5 billion, and is expected to grow at a CAGR of 20% during 2023-32. “At this rate, we can expect to see accelerated adoption amongst businesses in the coming years, which will further establish edge AI as an essential tool across sectors,” he said.

As edge devices become more powerful and efficient, we will see more sophisticated AI models and algorithms being deployed on them. This will enable the devices to process more data and perform more complex tasks, without needing to rely on cloud resources. “Edge AI can also facilitate federated learning. This way, multiple edge devices can collaboratively train a ML model without sharing data with a central server. This can facilitate the development of sophisticated AI models in the future.”

A cautious approach to generative AI

Even though generative AI is catching everyone’s attention, according to Gandhi, his company is taking a rather cautious approach. “Even though we see a lot of promise for generative AI, we are currently confining it to certain areas where we can leverage it in a safe environment,” he said.

“Our customers come in two different shades. On one side we have the conservative ones, and on the other hand, the progressive ones. It would be fair to say that most of our progressive customers want to leverage this technology. However, at the same time, they understand that applying it thoughtlessly, considering the current technology’s untested nature, could lead to undesired consequences. In mission-critical industries like automotive, an algorithm’s failure could prove costly in various aspects.”

Helping enterprises with digital transformation

Founded in 2011, Creative Synergies Group has been helping numerous enterprises, which includes Fortune 500 companies, with their digital transformation. With presence in the US, Japan, the Netherlands, Germany, UK, and India, “Our digital DNA helps us analyze long-term macro trends that could significantly impact digital transformation journeys of our customers. We co-create a digital roadmap, designing innovative products and solutions for businesses across verticals. Creative’s solutions lie in mobility, digital products such as consumer wearables, smart manufacturing and processing (Industry 4.0), green technologies, cloud engineering, AI, machine learning, the internet of things, AR/VR, 5G, SaaS, PaaS and TaaS.”

For example, for one of their tier-1 clients in the aerospace industry, the company has helped develop a smart airport passenger management system. “We customized engagement and interactivity with passengers by serving AI/ML driven content and microservices. Our solutions also unlocked the capability to determine content consumption patterns based on gender, age and ethnicity to enhance monetization of new business opportunities.”

Also, for a leading Japanese company, the company has executed multidisciplinary engineering for semiconductor ultra-pure water treatment. “This helped the brand manufacture LCD/AMOLED displays. We also developed mobility solutions for an IoT-driven smart plant for a major chemical conglomerate. Our cloud-driven mobility solutions enabled real-time data collection from plant equipment, instrumentation and control systems combined with edge computing,” he concluded.

The post Can Edge AI Drive India’s USD 5 Trillion Economy Ambitions? appeared first on Analytics India Magazine.

Google Bard vs Claude.ai (2023): What are the Key Differences?

The development of conversational AI chatbot systems from competing companies in the summer of 2023 is intense, yet there’s no single system that excels at all tasks. For instance, Anthropic launched Claude.ai on July 11 to provide public access to Claude 2, which supports long documents and comes with enhanced content generation and coding skills. Two days later, Google expanded Bard’s availability to more countries and in more languages, along with the ability to modify responses and pin and share chats, among other improvements.

This article compares the core capabilities of Google Bard and Claude.ai, both of which are free and considered either experimental or in beta as of July 2023. To use Bard, you must sign in with a Google account. To use Claude, you must sign in either with a Google account or an email address. In either case, if you use a Google Workspace account, you may want to check with your administrator to make sure access is allowed.

Jump to:

  • What is Google Bard?
  • What is Claude.ai?
  • Bard vs Claude: Comparison table
  • Bard vs Claude pricing
  • Feature comparison: Google Bard vs Claude.ai
  • Claude.ai: Pros and cons
  • Bard: Pros and cons
  • Should you use Bard or Claude?
  • Methodology

What is Google Bard?

Bard is an experimental conversational AI system designed by Google. As a large language model, Bard can generate responses of text, code and other content in response to an entered prompt or image. Google launched Bard in the first quarter of 2023.

What is Claude.ai?

Claude.ai is a public-facing beta of Anthropic’s conversational AI system. It was launched at the same time as when the company announced Claude 2 in mid-2023. Anthropic has emphasized its efforts to design Claude with an emphasis on AI safety. The company offers access to various editions of Claude (e.g., Claude 1, Claude Instant and Claude 2); Claude 2 drives Claude.ai interactions.

Bard vs Claude: Comparison table

The chart below highlights the major differences between Google’s Bard and Anthropic’s Claude.

Feature Bard Claude.ai
Made by Google Anthropic
Available countries 230+ U.S. and U.K.
Language support 40+ English. Some support for other languages, most notably Portuguese, French and German.
Coding support 20+ languages Yes
Internet access Yes No
Accepts image upload Yes No
Accepts text file upload No Yes
Context window* Not specified. "Purposefully limited," according to a Bard FAQ. Approximately 75,000 words.
Export options Gmail, Google Docs, Google Sheets, public link, copy content Copy content
Visit Bard Visit Claude.ai

* The context window determines the length of content the system can retain and respond to at once. Put simply, if you exceed the limit, the system may provide a response that indicates a lack of “retention” or “understanding” of content beyond the window. A long window makes the analysis of long documents possible.

Bard vs Claude pricing

Bard and Claude.ai are free to use. Neither Google nor Anthropic offers a paid edition of Bard or Claude.ai. The most notable competition to these two systems, ChatGPT from OpenAI, offers both a free edition and a more capable edition for an upgrade of $20 per month.

However, Google and Anthropic seek to serve enterprise and developer AI needs. Google offers a wide range of AI and machine learning systems, while Anthropic offers access to distinct editions of Claude, each optimized for different situations and price points.

Feature comparison: Google Bard vs Claude.ai

Conversational chat

Both Bard and Claude operate as conventional AI chatbots: You type a prompt, and the system replies with a response. As conversational chat systems, both Bard and Claude let you reference prior prompts in a chat. For example, if your first prompt is a request to summarize a book:

Can you summarize the key recommendations in the book, “Power and Progress: Our Thousand-Year Struggle Over Technology & Prosperity”?

A later query in the chat conversation with the intent to learn the authors might be:

Who wrote it?

The system should accurately understand that “it” refers to the book and “who” refers to the authors. Contrast that with a legacy search, where a summary of key points is simply not feasible and where you would need to combine everything into one keyword search query just to learn the authors.

The one significant difference between the two systems is that Claude supports a much longer context window than Bard. The result is that Claude can work with longer texts and also allows a more extended conversation about a topic than Bard.

Write text

Bard and Claude work well when you want the system to generate text. For example, you can ask either system to write text you need for pretty much any purpose, such as an email, marketing copy, a cover letter or a blog post. Specify in as much detail as possible the contents, length and type of text you want, and both systems will tend to produce usable text. Not happy with the result? Describe exactly how you need the tone, content, length or structure of the text revised, and ask Bard or Claude to try again.

Generate suggestions

Bard and Claude excel when you seek suggestions. Ask either chatbot for lists of things you want to learn about, explore or read, and you’ll likely receive at least a few relevant items. If you have examples, include them to provide context and nudge the system toward a more relevant response.

Remember, both of these systems may provide information that is inaccurate, misleading or simply wrong, so you may need to verify the content for accuracy.

SEE: Hiring kit: Prompt engineer (TechRepublic Premium)

Claude.ai: Pros and cons

Claude’s distinctive strength is its ability to work with long text files, thanks to the system’s significantly long context window. You may upload as many as five files of up to 10MB each along with your message.

For example, you might upload one or more sample technology plans, guides or strategy documents. Once shared, you could then explore various details within those documents with Claude (Figure A). Alternatively, you might provide Claude with relevant details about a different organization or client and ask the system to write a new plan, drawing from the prior example plans you had uploaded.

Figure A

Upload as many as five files, 10MB each, to Claude.ai and then prompt the system to produce responses drawn from the file contents.
Upload as many as five files, 10MB each, to Claude.ai and then prompt the system to produce responses drawn from the file contents.

Pros of Claude

  • Excellent long document capabilities.
  • May upload up to five files of up to 10MB each as part of a prompt.

Cons of Claude

  • Available only in the U.S. and U.K.
  • No support for image uploads.
  • No access to internet search.

Visit Claude.ai

Google Bard: Pros and cons

Bard’s ability to access the internet and accept image input sets the system apart from Claude. Bard can leverage search to incorporate recent news, weather and other information into responses, unlike some large language models that lack access to information after a defined date. In many cases, this makes Bard able to provide up-to-date data in responses (Figure B), with a sample prompt about an event that occurred within the prior month. Additionally, since Bard incorporates many image capabilities previously launched in Google Lens, you may upload a JPEG, PNG or WebP file and then chat about the image content. (Bard’s image capabilities are available initially only when used in English.)

Figure B

Bard can search the internet for recent information. Plus, Bard can leverage Google Lens capabilities to analyze uploaded images (initially, image capabilities are available only when used in English).
Bard can search the internet for recent information. Plus, Bard can leverage Google Lens capabilities to analyze uploaded images (initially, image capabilities are available only when used in English).

Pros of Bard

  • Accessible in many countries and languages.
  • Internet search.
  • Built-in export to Gmail, Google Docs and public sharing.

Cons of Bard

  • Lacks support for file uploads.
  • Limited context window compared to Claude.ai.

Visit Bard

Should you use Bard or Claude?

Bard and Claude represent state-of-the-art conversational AI systems with rapidly evolving capabilities. Either system is an excellent option when you want to experiment with a generative AI tool.

However, Bard is clearly the best when your chat requires internet access or image support, while Claude is clearly the better option when your chat might benefit from uploaded documents or long conversational context.

Experiment with both AI systems to become familiar with their capabilities, and then select the chatbot that is most able to help you accomplish a specific task.

Methodology

This comparison relied on detailed reviews of announcements from each vendor, experimentation with each system and manual testing of specific features publicly available as of mid-July 2023. Since both companies are adding capabilities fairly often, you may want to check the latest Google Bard updates or Anthropic announcements to learn of more recently added features.

Person using a laptop computer.

Subscribe to the TechRepublic News and Special Offers (Intl) Newsletter

Keep informed about the latest site features, downloads, special offers, and products from TechRepublic.

Delivered Wednesdays and Fridays Sign up today

Foodpanda wants to deliver more than good eats with these restaurant management tools

Person about to scan a digital menu

Foodpanda is looking to make inroads into restaurants, where it believes diners are likely to spend more if ordering systems are fully digitalized and powered by artificial intelligence (AI).

The Asian food delivery operator wants to help its network of restaurants do so alongside its sister company, TabSquare. Both businesses are subsidiaries of Berlin-based Delivery Hero, which acquired TabSquare 1.5 years ago and Foodpanda in 2016.

Also: What is Worldcoin? Eye-scanning crypto project launched by OpenAI CEO

Founded in 2012, TabSquare offers software tools that help restaurants manage orders and process payments, letting diners browse digital menus and place their orders by scanning a QR code with their mobile phone. The software company has operations in 10 markets, including Singapore, Indonesia, Australia, Thailand, and Sweden.

Foodpanda in recent years has expanded its services to include grocery delivery and dine-in voucher redemption. It operates in 11 markets across Asia, including Taiwan, Bangladesh, Hong Kong, and the Philippines. More than 8,000 restaurants in eight markets offer dine-in deals via the mobile app.

Foodpanda and TabSquare now will collaborate to help restaurants digitize their operations, spanning ordering, payment, and customer engagement. Doing so can improve these businesses' bottom line, where ordering via digital menus can push up table bills by 10% and cut staff costs by half, said Foodpanda CEO Jakob Angele, pointing to stats from TabSquare's clientele.

He noted that food deliveries account for about 15% of orders processed by restaurants, with dine-in orders accounting for the remaining 85%.

Also: Is Temu legit? What to know about this shopping app before you place an order

Together, Foodpanda and TabSquare are targeting to handle 120 of their customers' food transactions each month, Angele said. Restaurants on their network also can leverage data and predictive tools to improve their customer engagement and service.

These AI-powered applications track and analyze customers' purchase history, so their subsequent visits and experience can be personalized according to food preference. Restaurants also can identify trends related to price sensitivity and volume.

They then can use the data insights to offer tailored promotions to retain or attract diners, according to TabSquare's co-founder Anshul Gupta.

Citing Delivery Hero's German headquarters, Angele noted that the company's data use adheres to the European Union's General Data Protection Regulation (GDPR).

Also: People are more pessimistic about AI now than before the boom

With growing interest in generative AI, Gupta added that there was potential for the technology to be used, for instance, to help restaurants create images for their marketing campaigns. It also can be tapped to gather and manage customer feedback, he said, adding that TabSquare currently is exploring how it can leverage and integrate generative AI with its current services.

Angele pointed to opportunities in using the technology to enhance customer services but noted that some caution would be necessary to address risks such as hallucinations.

For now, the two subsidiaries will be focusing their joint efforts on helping restaurants with digitizing their operations, starting with their networks in Singapore, Malaysia, Taiwan, and the Philippines.

Artificial Intelligence