Ever since Google integrated Google Lens into Bard, letting users ask the chatbot about an image or generate descriptive captions, image recognition has had a massive makeover. Bard can not only identify the objects in a picture but also extract text and make sense of the image as a whole. GPT-4 is multimodal too, but its image input is still in the research phase. Several users have taken to Twitter to express their excitement about the update.
Let’s take a look at some interesting use cases.
Read and Understand Images
Let’s start with the basics. Bard is now able to understand and explain a picture. I uploaded a picture of Salvador Dali’s masterpiece ‘The Persistence of Memory’ and asked Bard to explain its meaning. The chatbot quickly gave a detailed description of the painting and the story behind it, and also told me that it can be interpreted in different ways.
Another user shared that she gave Bard an image of a pug wearing a graduation cap and asked what was happening. It offered three possible scenarios, including that the puppy could be graduating from obedience school or a therapy program.
Multimodal bard: Image is a heartwarming depiction of a dog's journey to graduation
pic.twitter.com/TPVUyO7BeW
— Keerthana Gopalakrishnan (@keerthanpg) July 15, 2023
Create a Website from Sketch
Add an image of your sketch with the ‘+’ option and provide your prompt, for example, “Compose a concise HTML/JS script to transform this mock-up into a vibrant website, wherein the jokes are substituted with two genuine jokes.” Since the first result may not meet your expectations, Bard offers alternatives under “View other drafts.” You can also regenerate the output. If you want specific modifications, ask for them in a follow-up prompt. To run the script, copy the HTML code into a text editor, save it as a file with an .html extension, and open that file in a browser.
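The save-and-open step can also be scripted. The following is a minimal Python sketch, assuming the code Bard returned has already been pasted into a string. `GENERATED_HTML`, `save_html`, and `preview` are hypothetical names used only for illustration, and the markup is a placeholder rather than actual Bard output.

```python
import webbrowser
from pathlib import Path

# Placeholder markup standing in for whatever Bard returns (not real Bard output).
GENERATED_HTML = """<!DOCTYPE html>
<html>
  <head><meta charset="utf-8"><title>Joke page</title></head>
  <body>
    <h1>Joke of the day</h1>
    <p id="joke"></p>
    <button onclick="document.getElementById('joke').textContent = 'Placeholder joke';">
      Tell me a joke
    </button>
  </body>
</html>
"""


def save_html(markup: str, path: Path) -> Path:
    """Write the markup to disk, forcing a .html suffix so browsers render it."""
    target = path.with_suffix(".html")
    target.write_text(markup, encoding="utf-8")
    return target


def preview(path: Path) -> bool:
    """Ask the default browser to open the saved page."""
    return webbrowser.open(path.resolve().as_uri())
```

Calling `preview(save_html(GENERATED_HTML, Path("mockup.html")))` would write the page to the current directory and ask the default browser to open it.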
Google Bard’s update is INSANE!
It created me a website from a napkin sketch, with a single prompt
Here is how you can create yours:
[THREAD] pic.twitter.com/aFkCfhNFOp— Alvaro Cintas (@dr_cintas) July 15, 2023
Understand Complex Graphs
Another user used Google Bard’s image input function to read the text and graphs on the GPT-4 demo slides and perform calculations with them, all in Japanese. They explain that they used to rely only on ChatGPT or Perplexity, but that Bard has now become a powerful tool for them, depending on the specific application.
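With Google Bard's image input feature, it can easily read the text and graphs on the slides from the GPT-4 demo and do calculations with them…!
Until recently Bard's position felt halfway, so I only used ChatGPT or Perplexity, but depending on the use case Bard has jumped up to become a pretty powerful tool…! pic.twitter.com/QHUG9A86DK— KAJI | 梶谷健人 (@kajikent) July 14, 2023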
Cooking Gets Easier
Saw a picture of pasta on your feed and now you have a craving for it? You can now upload an image of a meal and ask for a full recipe, and Bard will give it to you. That is what AI influencer Rowan Cheung did.
Google Bard's new upgrades are INSANE
I gave it an image of a meal I had recently and asked for a full recipe, and it gave me an exact step-by-step.
This means Bard is officially multimodal. ChatGPT has some serious competition.
Here's all of Bard's recent upgrades:
-You… pic.twitter.com/ORXqI3GZx5— Rowan Cheung (@rowancheung) July 14, 2023
Create iPhone App from a Screenshot
Ammaar Reshi, the design manager at Brex, used Bard to recreate a basic timer application for the iPhone in under 4 minutes from a screenshot, without giving any hints about the app’s functionality. Bard generated all of the necessary code. It made a few mistakes, but it was able to fix them.
Whoa, I just used Google's Bard AI to recreate a basic timer app for iPhone in under 4 minutes… just from a screenshot!
Did not give it hints as to what the app did, and it provided all of the code—it made some mistakes but nothing it couldn't fix!
Here's the full process pic.twitter.com/RAulZcROg2— Ammaar Reshi (@ammaar) July 14, 2023
Brain CT Diagnoses
Another user uploaded an image of a brain CT scan and asked Bard, in Japanese, to interpret it. Bard was able to list the potential causes, including the correct one, despite not being trained specifically for the field.
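Google's Bard is amazing & scary!!!
A brain CT diagnosis by Bard!
"It can be said that the image shows a white area in the brain... Various causes are possible, such as a brain tumor, stroke, or hemorrhage."
The answer is a brain hemorrhage. Just listing the differential diagnoses is impressive, since it has not been trained specifically for this. pic.twitter.com/aEdF5xtlqt— 河野 健一 生成AI ✕ 医療に注目! 手術支援AI CEO 脳外科医 (@CeoImed) July 14, 2023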
Social Media Caption
Bard is also a handy companion when you need captions for your social media posts. Upload the image, describe the tone you want, and it generates captions to match.
Personal Expense Manager
If you need to compile your expenditure for an expense report but have too many bills to handle, you can now photograph the receipts and feed them to Bard. Using image recognition, Bard organises the receipts into a table with details like date, time, category, description, and amount. This table can be exported to Google Sheets, eliminating the need for a separate expense report app.
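To make the shape of that table concrete, here is a minimal Python sketch, assuming each receipt is reduced to the five columns named above. The `Receipt` class and the `receipts_to_csv` and `total` helpers are hypothetical names for illustration and are not part of Bard. The output is plain CSV, a format Google Sheets can import.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields
from decimal import Decimal


@dataclass
class Receipt:
    """One row of the expense table; the columns follow the article's list."""
    date: str
    time: str
    category: str
    description: str
    amount: Decimal


def receipts_to_csv(receipts: list[Receipt]) -> str:
    """Render the receipts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([f.name for f in fields(Receipt)])
    for receipt in receipts:
        writer.writerow(astuple(receipt))
    return buffer.getvalue()


def total(receipts: list[Receipt]) -> Decimal:
    """Sum the amounts using Decimal to avoid float rounding on currency."""
    return sum((r.amount for r in receipts), Decimal("0"))
```

Amounts are stored as `Decimal` rather than `float` so that totals on currency values do not pick up binary rounding errors.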
How I used Google Bard as my Expense Management assistant
Google Bard recently got a massive upgrade that ChatGPT currently does not have, Image Recognition.
This new image recognition feature on Google Bard is pretty amazing. It's not just an ability to recognize objects in… pic.twitter.com/wuAdLSKPvE— Min Choi (@minchoi) July 15, 2023
The post Top 8 Use Cases of Bard’s New Image Recognition appeared first on Analytics India Magazine.
