Nearly 50% of people want an AI clone to do this for them


Almost half of consumers worldwide want an artificial replica of themselves by 2035 to run personal errands, including shopping.

This figure rises to 62% among consumers in Asia-Pacific, compared to the global average of 49% who hope for an AI clone to handle their administrative, communication, and shopping needs, according to a report released by marketing agency Dentsu. The study polled 30,000 respondents across 26 markets, including India, Japan, China, France, and the US, of whom 11,000 were from 10 Asia-Pacific markets.

Also: Why Apple's best new AI features from WWDC will be boring (and I'm glad)

By 2035, three in four global consumers expect their appliances and vehicles to be able to re-order parts and schedule maintenance appointments autonomously.

Another three in four also want an AI assistant to sift through the ads and promotions sent to them, reflecting consumers' growing comfort with delegating more personal interactions to "AI gatekeepers," the report found.

Furthermore, 60% would like an AI assistant to take part in focus groups on their behalf and highlight their brand preferences.

By 2035, 77% of global consumers expect brands to send them customized offers and promotions that reflect real-time events, such as weather and traffic, and their personal context, including tone and geolocation.

As AI is increasingly tapped for customer service, 71% expect brands to exhibit distinct personalities in their engagements with consumers, Dentsu said.

Also: 3 ways Gemini Advanced beats other AI assistants, according to Google

Consumers in Asia-Pacific seem more comfortable having AI handle aspects of their personal lives, with 88% willing to push work and personal scheduling tasks to AI assistants. This figure is the highest globally.

Another 85% across the region would like an AI assistant to manage their recurring purchases, vet ads, and participate in focus groups on their behalf. In comparison, the global average is 77%.

In addition, 70% in Asia-Pacific believe that, by 2035, relationships with AI companions can be as fulfilling as human-to-human ties. This was highest in India, at 81%, and 78% in China. The global average here is 56%.

As consumers turn to AI clones to manage their personal tasks, businesses increasingly will have to sell to AI gatekeepers rather than their human counterparts. Machines will have to "plot with each other" to facilitate transactions that will occur without direct human intervention, Dentsu said.

Also: eBay AI doubles as your personal shopper. Here's how to get started

"The marketing approach will need to evolve from a consumer-centric model to a life-centric model of engagement, requiring brands to develop a deep contextual understanding of individual consumers to be permitted into their domain, starting first with their AI gatekeeper," the agency said. "Brands will need to relearn and reconfigure the best means of getting exposure to potential customers — from cooperating with entities outside their ecosystem to establishing new processes."

"As consumers' AI assistants take control, brands will need to heed the individualized paths and preferences set by consumers. Forcing consumers into pre-set partnerships and brand ecosystems will lead to being worked out of their set algorithms."

Artificial Intelligence

Meet the Indian Team Behind the World’s First Autonomous AI University Professor

Malar, the world’s first autonomous AI university professor, could be a massive leap for AI in education. A product of Chennai-based startup HaiVE, Malar has been trained entirely on the engineering syllabus from Anna University and launched this year.

Please say hi to Malar Teacher – World’s First Autonomous AI University Professor. She is pre-trained with all of Anna University’s Engineering syllabus in all disciplines. She has assimilated all of their recommended reading materials and can teach any concepts in there even in… pic.twitter.com/sn9KQ84mHv

— Arjun Reddy (@junafinity) April 18, 2024

The AI university professor quickly went viral. HaiVE co-founder Arjun Reddy told AIM, in an exclusive interaction, that on the morning of her launch, he had woken up to their servers crashing, thanks to an influx of first-time visitors.

HaiVE is a B2B startup that specialises in building on-premise AI solutions for its clients, allowing them to rely on their own servers, ensuring data privacy. With initial success, they decided to undertake their first B2C venture this year with Malar.

“Right now, we have around 192,000 users, and we have over 30,000 daily active ones. When we launched her, we had close to 16,000 hits per second,” he said.

To ensure that Malar was the perfect professor, the team made use of several open-sourced models. “Malar is not just any single LLM. She’s a Franken-merged entity of multiple open-source LLMs that are all MIT v2 licensed. We Franken-merged because we liked some parts of it that fit better as a teacher in different LLMs,” he told AIM.

Personalised Tutor

Reddy pointed out that the structure of education itself is changing as attention spans shrink.

“Something like Udemy is already outdated now. People are not able to do the 60 hours of study material. They want a really good, broken-down, spoon-feedable version of a concept to be given to them when they want it and have their questions answered when they do that,” he explained.

Thanks to this, he, like many, believes that having personalised AI teachers would help ensure that each student can have a full-fledged education without having to worry about the quality of the teacher or teachings.

“Malar can teach them the same thing in multiple formats until they hit the sweet spot and truly understand. It’s not possible for a human to do this. Teachers who are ambitious about what they do tend to migrate to the cities. So, the students in the village are left behind.

“But Malar will never leave a student behind. She will give those students the same rigour, technique and patience and help them catch up with their IIT Delhi counterparts,” Reddy beamed.

Interestingly, there’s a reason behind adding “autonomous” to her title. According to Reddy, Malar, who is currently available for free on WhatsApp, also needs to work for a living. She has been fitted with a monetary threshold: after about 20 questions, she will gently nudge students towards opting for a paid version.

Accelerating Education with AI

The team has begun partnering with several Indian university professors to create a Malar-type professor in their own image. Interestingly, while they had not initially partnered with Anna University when training Malar on their syllabus, this has now changed. As of last week, the startup has struck an official partnership with the university.

Shortly after Malar’s launch, the Mauritius government reached out to the startup to help its vast student population. Interestingly, a whopping 82.7% of the Mauritius population aged between 16 and 24 is employed.

“Almost everybody you meet as a student is also a part-time worker in some way. And there has never been a good solution to this problem, where they can’t attend regular classes. They can go back home and catch up on their own. Even though there are recordings of the classes, it’s still asynchronous. You can’t ask questions back,” he said.

To remedy this, the Mauritius government reached out to HaiVE, tasking them with finding a solution to the problem using Malar. “We are now in the process of analysing their requirements and syllabus. Since it’s an island nation, the workforce will be stripped off of a major population if no student ever shows up to work,” Reddy said.

What Next?

With this, HaiVE has secured several major clients, including many publicly-listed companies in Australia and India’s IppoPay, where they offer in-house AI solutions hosted on the client’s servers.

Likewise, with the success of their first B2C project, HaiVE has begun pursuing other similar ventures. The company recently launched HaiVE AI Studio, which Reddy described as having your very own J.A.R.V.I.S hosted from your laptop.

The post Meet the Indian Team Behind the World’s First Autonomous AI University Professor appeared first on AIM.

10 GitHub Repositories to Master SQL

Image by Author | ChatGPT & Canva

Mastering SQL is an essential skill for anyone pursuing a career in IT, regardless of whether you aspire to be a developer, data scientist, IT manager, or machine learning engineer. Understanding how to effectively use SQL to access and manage databases is a fundamental requirement in today's data-driven world.

In this blog post, we will explore the top 10 GitHub repositories that can help you get started with SQL and database management and even take your skills to the next level. This list is for beginners and professionals who are looking to improve their data-handling skills.

1. SQL 101 by s-shemmee

The SQL 101 repository offers step-by-step tutorials, practical examples, and exercises. This guide is your gateway to mastering the basics and unlocking the power of data with SQL.

You will learn about querying data, modifying data, data types and constraints, joins and relationships, aggregation and grouping, subqueries and views, indexing and performance optimization, transactions and concurrency control, and advanced topics.
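To give a flavour of the "joins and relationships" and "aggregation and grouping" topics listed above, here is a small, self-contained sketch using Python's built-in sqlite3 module; the `customers` and `orders` tables are hypothetical and not taken from the SQL 101 repository itself.

```python
import sqlite3

# In-memory database with two hypothetical tables, purely to
# illustrate the join + aggregation pattern the tutorials teach.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 1, 30.0), (2, 1, 20.0), (3, 2, 45.0);
""")

# Join the two tables, then aggregate per customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

The same query shape (JOIN, GROUP BY, an aggregate, ORDER BY) recurs throughout the exercises in the repositories below, so it is worth internalising early.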

2. Learn SQL by WebDevSimplified

The Learn SQL repository provides a collection of practice exercises with solutions tailored for beginners. The 12 practice exercises will help reinforce learning and build confidence in handling SQL queries effectively.

3. SQL Masterclass by datawithdanny

The SQL Masterclass is a comprehensive online course designed to take learners from beginner to advanced level in SQL skills. The repository provides a structured learning path with hands-on exercises, real-world examples, and quizzes to help students master the art of SQL querying and data analysis.

4. SQL Map by sqlmapproject

sqlmap is an automatic SQL injection and database takeover tool that provides insight into the vulnerabilities of database systems. By learning this tool, you can streamline the process of testing database servers, understand how injection attacks work in practice, and secure your servers against malicious attacks.

5. SQL Server Samples by Microsoft

The SQL Server Samples repository contains code samples for SQL Server, Azure SQL database, and other Microsoft database technologies, offering a wealth of learning resources and practical examples.

6. SQL Music Store Analysis Project by rishabhnmishra

The SQL Music Store Analysis is a beginner project that teaches you how to analyze a music store's PostgreSQL database. It includes a YouTube tutorial on setting up the project and performing various data analyses using SQL queries.

7. Data Engineering Zoomcamp by DataTalksClub

The Data Engineering Zoomcamp offers a hands-on learning experience in data engineering, designed to equip students with practical skills through a mix of video tutorials, quizzes, projects, and peer assessments.

The repository covers essential topics such as containerization and infrastructure as code, workflow orchestration, data ingestion, data warehousing, analytics engineering, and batch and streaming processing.

8. SQL Server Kit by ktaranov

The SQL Server Kit repository is packed with useful links, blogs, videos, podcasts, courses, scripts, tools, and best practices for Microsoft SQL Server. It is a treasure trove for developers and engineers looking to optimize SQL Server and learn new SQL concepts.

9. Awesome DB Tools by mgramin

Awesome DB Tools is a collection of both practical and cutting-edge tools that simplify working with databases for DBAs, DevOps engineers, developers, and everyday users.

The list includes IDEs, GUIs, CLIs, schemas, APIs, application platforms, backup, cloning, monitoring, testing, HA/failover/sharding, Kubernetes, configuration tuning, DevOps, reporting, distributions, security, SQL, and data management tools.

10. SQL for Wary Data Scientists by gvwilson

The SQL for Wary Data Scientists book offers an interactive introduction to SQL tailored for data scientists. It covers topics such as administration commands, aggregation and aggregate functions, cross joins, inclusive and exclusive or, filters, full, left, and right outer joins, grouping, in-memory databases, joins and join conditions, null, queries, ternary logic, and tombstones.

Conclusion

These 10 GitHub repositories offer a wide range of materials, from beginner tutorials to advanced practice exercises and comprehensive courses. Learning SQL has become easy and free. All you need to do is work hard and stay persistent, and in no time, you will become a data professional. The resources mentioned in this blog will help you learn about new tools, build databases, access data, manage database systems, and perform data analysis. The content is not limited to text; you can also learn from interactive websites, books, video tutorials, and exercises.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.

More On This Topic

  • 10 GitHub Repositories to Master Machine Learning
  • 10 GitHub Repositories to Master Computer Science
  • 10 GitHub Repositories to Master Data Engineering
  • 10 GitHub Repositories to Master MLOps
  • 10 GitHub Repositories to Master Python
  • 25 Github Repositories Every Python Developer Should Know

Orange County Department of Education has the AI juice

Interview with Kunal Dalal and Wes Kreisel


After nearly 2 months of waiting, I finally got to interview Wes Kreisel and Kunal Dalal from the Orange County Department of Education. These two visionaries are at the forefront of integrating artificial intelligence into the educational landscape, driving innovative changes across nearly 30 districts in Orange County, California! Both of these gentlemen have been a great source of inspiration for me personally, as well as many others.

Pioneering Educational Reforms

We began with Wes Kreisel outlining the primary goals of the Orange County Department of Education. “Our main objective,” Kreisel explained, “is to ensure that our students are not just passive recipients of knowledge but active participants in their learning journeys.” This philosophy underpins the department’s various initiatives to incorporate AI into classrooms, creating a more engaging and personalized learning experience.

Embracing AI in Classrooms

Kunal Dalal provided a comprehensive overview of how AI tools are enhancing educational outcomes. “AI has the potential to revolutionize how we approach education,” Dalal stated. “From adaptive learning platforms that cater to individual student needs to AI-driven analytics that help educators identify areas where students are struggling, the possibilities are endless.” He emphasized that these technologies are not meant to replace teachers but to empower them with better tools and insights.

Case Study: AI Implementation Success

In an NBC video clip from April 4, 2024, Hetty Chang shared a case study highlighting the successful implementation of AI in an Orange County school. “Students in this program showed a 20% improvement in their test scores within just one semester,” Chang reported. “This kind of progress is a testament to the effectiveness of AI when integrated thoughtfully and strategically into the curriculum.”

Image Courtesy of NBC Los Angeles
Overcoming Challenges

Despite the promising prospects, the integration of AI in education is not without its challenges. Kreisel and Dalal discussed some of the hurdles they faced, including resistance to change and concerns about data privacy. “One of the biggest challenges,” Kreisel noted, “is getting buy-in from all stakeholders, including parents, teachers, and students. There’s a natural hesitation when it comes to adopting new technologies.”

Addressing Privacy Concerns

Dalal also addressed the critical issue of data privacy. “We understand the concerns surrounding data privacy and are committed to ensuring that all student data is protected and used responsibly. It’s crucial to build trust among all stakeholders to facilitate the smooth integration of AI in education.”

Image Courtesy of NBC Los Angeles
A Revolutionary Summit

The conversation then shifted to a groundbreaking summit organized by Kreisel and Dalal, which marked a significant milestone in student leadership and engagement with AI. “This was an incredible revolution in student leadership,” I remarked, reflecting on the photos and posts I had seen from the event.

The Genesis of the Summit

Wes Kreisel recounted how the idea for the summit began. “It started with our ‘100 Conversations about AI in 100 Days’ initiative,” he said. “About 20 conversations in, we realized that while adults were talking about AI, the students had valuable insights and perspectives that were missing from the conversation.” This realization led to the planning of a student-led summit in just six weeks, which ultimately brought together 600-700 participants at the JW Marriott.

Student-Led Innovation

Kunal Dalal described the summit as a transformative experience. “Students did the keynote, led 14 breakout sessions, and engaged in discussions about AI in medicine, social justice, and even startups. The energy was incredible,” he said. “One student, who initially thought his presentation went terribly, couldn’t wait to do it again. It was about empowering students to share their perspectives and ideas on how AI can shape their future.”

Kunal has also written a wonderful book called “The AI Parent: How Artificial Intelligence Is Helping Me Become a Better Father,” which is probably a handy thing to have if you are a parent with children in grade school!

Empowering Through Trust

A recurring theme in our discussion was the importance of trust in fostering innovation and student engagement. “We trust our students and believe in their capabilities,” Kreisel emphasized. “It’s about putting them in a problem space where they can thrive and grow.”

Building a Collaborative Community

Kreisel and Dalal also highlighted the collaborative use of AI to build community and enhance learning experiences. “We often use AI in a collaborative context,” Kreisel said. “For example, we use voice chat mode and pass the phone around in a group, allowing everyone to contribute to the creation of a story or solution. This approach fosters a sense of community and collective creativity.”

Looking Ahead: Global Teamwork

The episode concluded with a powerful message from Qasir Rafiq, a humanitarian worker with a mission to bridge the gap in AI knowledge and application in education across Asia, Africa, and the Middle East. Qasir’s ambitious “Mission 1 Million” aims to educate one million students in 1,000 days with advanced AI skills and responsible AI practices.

Qasir’s Vision

Qasir shared his inspiring journey and vision. “I started AI Future Lab to empower individuals and organizations with AI knowledge and capabilities. My goal is to ensure that local organizations and civil societies can effectively and responsibly utilize AI to manage their challenges,” he said. “Collaboration and partnership are crucial to achieving this mission.”

The Importance of Global Collaboration

Qasir emphasized the importance of global teamwork in advancing educational technologies. “If we do not act now, the gap between AI knowledge and application in different parts of the world will become insurmountable. We need to invest in teachers and students, and create an AI cluster that provides resources, advice, and recommendations to institutions worldwide,” he urged.

In Closing

Our 10th episode was a ton of fun and underscored the transformative power of AI in education and the importance of collaboration, trust, and global teamwork. From the innovative initiatives in Orange County to the ambitious goals of Qasir Rafiq, it’s clear that AI has the potential to revolutionize education and bridge gaps across the globe.

As we continue to explore and implement these technologies, it’s crucial to remain committed to responsible and inclusive practices that benefit all students and educators. Thank you for taking the time to read or watch. We’re really working to make a difference!

Subscribe to the AI Think Tank Podcast on YouTube. Would you like to join the show as a live attendee and interact with guests? Contact Us

Andrej Karpathy Reproduces GPT-2 in Latest Tutorial

In a marathon video on his YouTube channel, Andrej Karpathy reproduced GPT-2 in just over four hours. The OpenAI co-founder, who left earlier this year, has spent numerous hours creating tutorial videos for his viewers, including how-to videos on decoding models.

In his latest lecture, Karpathy recreated the smallest version of GPT-2, which has 124 million parameters. In the four-hour-long video, he broke down the entire process, starting completely from scratch.

“First, we build the GPT-2 network, then we optimise its training to be really fast, then we set up the training run following the GPT-2 and GPT-3 paper and their hyperparameters, then we hit run, and come back the next morning to see our results, and enjoy some amusing model generations,” he wrote.
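As a rough sanity check (not part of Karpathy's own code), the 124M figure follows from GPT-2's published architecture: 12 layers, 768-dimensional embeddings, a 50,257-token vocabulary, and a 1,024-token context, with the language-model head sharing the token embedding table.

```python
# Back-of-the-envelope parameter count for GPT-2 (124M),
# derived from the architecture numbers in the GPT-2 paper.
n_layer, n_embd, n_vocab, n_ctx = 12, 768, 50257, 1024

embeddings = n_vocab * n_embd + n_ctx * n_embd          # token + position tables
per_block = (
    4 * n_embd                                          # two LayerNorms (scale + bias)
    + n_embd * 3 * n_embd + 3 * n_embd                  # fused QKV projection
    + n_embd * n_embd + n_embd                          # attention output projection
    + n_embd * 4 * n_embd + 4 * n_embd                  # MLP up-projection (4x width)
    + 4 * n_embd * n_embd + n_embd                      # MLP down-projection
)
final_ln = 2 * n_embd
total = embeddings + n_layer * per_block + final_ln     # lm_head is tied to the token table

print(f"{total:,}")  # 124,439,808
```

That total, roughly 124.4 million parameters, is the "124M" in the video's title.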

📽 New 4 hour (lol) video lecture on YouTube:
"Let’s reproduce GPT-2 (124M)"https://t.co/NMIVD1V6zr
The video ended up so long because it is… comprehensive: we start with empty file and end up with a GPT-2 (124M) model:
– first we build the GPT-2 network
– then we optimize… pic.twitter.com/NDqTrLbrO4

— Andrej Karpathy (@karpathy) June 9, 2024

With the GPT-2 recreation, Karpathy believes the run came very close to the GPT-3 (124M) model. “Our ‘overnight’ run even gets very close to the GPT-3 (124M) model. This video builds on the Zero To Hero series and at times references previous videos. You could also see this video as building my nanoGPT repo, which by the end is about 90% similar,” he said.

Karpathy has long been a proponent of democratising knowledge about AI, specifically the science behind LLMs.

Karpathy, who had been responsible for deep learning and computer vision at OpenAI, left the company for a second time in February this year. However, he has been actively involved in the AI community, creating tutorials and breakdown videos on different models, even more so after his departure.

Most recently, he released a project titled llm.c, which lets users train LLMs in plain C without having to rely on PyTorch and CPython.

Even prior, he had released an hour-long lecture explaining the intricacies of LLMs and how they function. Shortly after his departure from OpenAI, he had also released a tutorial on understanding tokenisation, also taking the opportunity to analyse Google’s Gemma tokeniser following its launch.

The post Andrej Karpathy Reproduces GPT-2 in Latest Tutorial appeared first on AIM.

Want free and anonymous access to AI chatbots? DuckDuckGo’s new tool is for you

DuckDuckGo's new AI Chat tool

Those of you who'd like anonymous access to several generative AI chatbots all in one place may want to check out DuckDuckGo's new AI chat tool. Announced last Thursday, the service lets you try four different AI models through a dedicated AI website or the DuckDuckGo browser.

Included in the mix are GPT-3.5 Turbo, Claude 3 Haiku, Meta Llama 3, and Mistral's Mixtral 8x7B. All four are freely accessible through DuckDuckGo, though you may bump into an unspecified daily limit on the number of queries you can submit.

Also: The best secure browsers for privacy in 2024

Being able to access several AI chatbots in the same place is certainly convenient. But the real benefit here is the anonymity. When you use services such as ChatGPT and Meta AI directly, your chats aren't necessarily private. Moderators may read your conversations to make sure you're not abusing the system. Plus, your chats can be used to help train the AI.

To protect your privacy, the chats you conduct through DuckDuckGo's AI chat tool are anonymous and aren't saved or stored by the company or the AI services, at least not permanently.

To anonymize your conversations, DuckDuckGo says that it removes your IP address and replaces it with one of its own. This makes it seem as if the requests are coming from the company and not from you.

Also: How to change your IP address, why you'd want to — and when you shouldn't

Your chats may be stored by the AI model providers temporarily, but DuckDuckGo promises that there's no way to tie the conversations back to you since all the metadata is removed. The company added that agreements with the AI services require that all saved chats are deleted within 30 days and that none of their content can be used to train the models.

Looking ahead, DuckDuckGo plans to keep the current access free but is considering a paid option with higher limits and more advanced AI models. Also on the horizon are custom system prompts and general improvements to the chat experience.

To take DuckDuckGo's AI chat for a spin, head to either Duck.ai or DuckDuckGo Chat in the browser of your choice. Alternatively, download and install the DuckDuckGo browser if you don't already have it. In the browser, click the three-lined hamburger icon and select AI Chat.

Whether you use the URL or DuckDuckGo's browser, click the Get Started button the first time you try this. You're asked to choose which of the four AI models you want to use. Select one, click Next, and then agree to the usage terms. You can now type and submit your request at the prompt the same way you would if you used one of the chatbots directly.

As you try the different chatbots through DuckDuckGo, always keep in mind that today's generative AI chatbots are far from perfect. They can hallucinate, which means they may give you wrong information.

Also: The best AI image generators of 2024: Tested and reviewed

I selected GPT-3.5 Turbo and posed the question "What was Fiddler on the Roof based on?" (I saw Fiddler on the Roof yesterday at a local theater, so the play was on my mind.)

GPT-3.5 Turbo correctly told me that it was based on a series of stories by Sholem Aleichem. But it claimed that one of the stories was written in 1964, an amazing feat since Sholem Aleichem died in 1916. I corrected the chatbot, and it apologized for the mistake.

But this type of error circles back to the convenience of having several AI chatbots in one place. You can easily switch from one model to another and ask the same question. You can then see how each bot answers you to better gauge which ones are correct and which ones may be hallucinating.

Featured

What’s Wrong with Selling AI Artwork at Church Street?


“AI art is real art, and there’s no shying away from this statement,” declared a 19-year-old artist who faced criticism for selling AI-generated artwork on Church Street, Bengaluru.

Speaking to AIM, Ashok Reddy, a graphic designer at GrowthSchool, said, “It wasn’t a task I completed in one day; it was a collection of efforts over many months.” He emphasised that his images were original, generated from scratch, and not copied from any other creator or existing works.

Is AI Art Real Art?

The University of Plymouth argues that AI art cannot be termed original because AI generators use and merge pre-existing images to satisfy user commands.


However, Reddy clarified, “These images are perfectly legal to generate and sell. If you visit any art store, you’ll see prints of famous works like the Mona Lisa everywhere. Similarly, platforms like Midjourney allow you to create original work that can be sold publicly.”

He further explained that the quality of the image depends on how you prompt the AI. According to Reddy, better prompts yield better images, and each generated image turns out unique. “Even if you use the same prompts, the AI will not produce an exact replica, ensuring each piece is distinct.”

AI Art Takes the Stage

AI art has become mainstream, as demonstrated by artists like Refik Anadol, a Turkish-American new media artist who merges art and AI to create captivating installations.

Anadol’s work includes transforming the Las Vegas Sphere into the world’s largest AI artwork and having his installation ‘Unsupervised’ displayed at MoMA in New York. His success stems from collaborations with hardware and software providers, using tools like NVIDIA Omniverse to handle vast data sets.

Other artists, like Sougwen Chung and Trevor Paglen, also explore AI in diverse ways, highlighting AI art’s potential beyond mere prompt engineering. This growing acceptance and exploration showcase the legitimacy and creativity inherent in AI-generated art.

Location Meets AI

On Church Street, Reddy managed to sell about 100 art pieces in just two days. He highlighted that even when visitors don't make a purchase, they often appreciate the work and share positive feedback.

“Although there are negative comments too, they often come from people who don’t know how to use AI and generate images properly. Those with a creative background understand the process and know how to use these tools effectively,” Reddy explained.


During the G20 Summit in 2023, Mark Rutte, the Prime Minister of the Netherlands, visited India and made a pit stop at Church Street. He walked around interacting with locals and shopkeepers.

However, as the city’s creative community continues to flourish, securing a spot on Church Street to showcase their talent becomes a challenge for many aspiring artists.

Regarding his experience, Reddy said, “I went there four times but could only secure a spot on two occasions. Even if you’re the first to arrive, those who have been there for a couple of years will take your place by some means.”

“I believe apart from Church Street, another good place for creative people [in Bengaluru] would be Indiranagar or Cubbon Park,” he added.

AI Art: A Self-Learning Skill

At India’s premier AI conference, Cypher 2023, Biren Ghose, renowned for his work as the country head of animation leader Technicolor, spoke about the intersection of AI, simulation, and design. Ghose provided insights into a near future where designers can swiftly transform their ideas into reality, reducing the time frame from days, months, or even years to mere minutes.

The journey of AI image generation began in 2015 with automated image captioning designed to assist individuals with vision impairments. This paved the way for more models capable of generating images from textual descriptions.

OpenAI’s DALL·E 3 played a pivotal role in advancing text-to-image generation, despite initial concerns regarding potential misuse.

The landscape became increasingly competitive with the emergence of open-source alternatives like DALL·E Mini. This resulted in a variety of AI image generators, mostly free, enhancing creative possibilities.

Reddy mentioned that he gathered all this knowledge solely through online resources, without attending any college courses. When it comes to tools, he recommends Midjourney, further highlighting the inspiration and learning opportunities provided by platforms like Twitter and other online communities.

What’s Next?

Bengaluru has now become a focal point for AI discussions. While debates over AI-generated work being “stolen” and “lazy” continue unabated, residents are embracing the technology like never before.

In March 2023, the city hosted its first art festival exploring the fusion of AI and art to address concerns around climate change. FutureFantastic, the three-day festival, aimed to bring together diverse groups of artists, technologists, activists, and tech enthusiasts to foster collaborative art initiatives driving social change.

The post What’s Wrong with Selling AI Artwork at Church Street? appeared first on AIM.

How to Convert JSON Data into a DataFrame with Pandas

Image by Author | DALLE-3 & Canva

If you've ever had the chance to work with data, you've probably come across the need to load JSON files (short for JavaScript Object Notation) into a Pandas DataFrame for further analysis. JSON files store data in a format that is clear for people to read and also simple for computers to understand. However, JSON files can sometimes be complicated to navigate through. Therefore, we load them into a more structured format like a DataFrame, which is set up like a spreadsheet with rows and columns.

I will show you two different ways to convert JSON data into a Pandas DataFrame. Before we discuss these methods, let's consider the following dummy nested JSON file, which I'll use as an example throughout this article.

{
  "books": [
    {
      "title": "One Hundred Years of Solitude",
      "author": "Gabriel Garcia Marquez",
      "reviews": [
        {
          "reviewer": {
            "name": "Kanwal Mehreen",
            "location": "Islamabad, Pakistan"
          },
          "rating": 4.5,
          "comments": "Magical and completely breathtaking!"
        },
        {
          "reviewer": {
            "name": "Isabella Martinez",
            "location": "Bogotá, Colombia"
          },
          "rating": 4.7,
          "comments": "A marvelous journey through a world of magic."
        }
      ]
    },
    {
      "title": "Things Fall Apart",
      "author": "Chinua Achebe",
      "reviews": [
        {
          "reviewer": {
            "name": "Zara Khan",
            "location": "Lagos, Nigeria"
          },
          "rating": 4.9,
          "comments": "Things Fall Apart is the best of contemporary African literature."
        }
      ]
    }
  ]
}

The above-mentioned JSON data represents a list of books, where each book has a title, author, and a list of reviews. Each review, in turn, has a reviewer (with a name and location) and a rating and comments.

Method 1: Using the json.load() and pd.DataFrame() functions

The easiest and most straightforward approach is to use the built-in json.load() function to parse our JSON data. This will convert it into a Python dictionary, and we can then create the DataFrame directly from the resulting Python data structure. However, it has a limitation: it does not flatten nested structures automatically. So, for the above case, if you only use these steps with this code:

import json
import pandas as pd

# Load the JSON data
with open('books.json', 'r') as f:
    data = json.load(f)

# Create a DataFrame from the JSON data
df = pd.DataFrame(data['books'])

df

Your output might look like this:

Output:
json.load() output

In the reviews column, you can see the entire nested structure for each book. Therefore, if you want the output to appear correctly, you have to handle the nested structure manually. This can be done as follows:

# Create a DataFrame from the nested JSON data
df = pd.DataFrame([
    {
        'title': book['title'],
        'author': book['author'],
        'reviewer_name': review['reviewer']['name'],
        'reviewer_location': review['reviewer']['location'],
        'rating': review['rating'],
        'comments': review['comments']
    }
    for book in data['books']
    for review in book['reviews']
])

Updated Output:
json.load() output

Here, we are using a list comprehension to create a flat list of dictionaries, where each dictionary contains the book information and the corresponding review. We then create the Pandas DataFrame from this list.

However, the issue with this approach is that it demands more manual effort to manage the nested structure of the JSON data. So, what now? Do we have any other option?

Totally! I mean, come on. Given that we're in the 21st century, facing such a problem without a solution seems unrealistic. Let's see the other approach.

Method 2 (Recommended): Using the json_normalize() function

The json_normalize() function from the Pandas library is a better way to manage nested JSON data. It automatically flattens the nested structure of the JSON data, creating a DataFrame from the resulting data. Let's take a look at the code:

import pandas as pd
import json

# Load the JSON data
with open('books.json', 'r') as f:
    data = json.load(f)

# Create the DataFrame using json_normalize()
df = pd.json_normalize(
    data=data['books'],
    meta=['title', 'author'],
    record_path='reviews',
    errors='raise'
)

df

Output:
json_normalize() output

The json_normalize() function takes the following parameters:

  • data: The input data, which can be a list of dictionaries or a single dictionary. In this case, it's the data dictionary loaded from the JSON file.
  • record_path: The path in the JSON data to the records you want to normalize. In this case, it's the 'reviews' key.
  • meta: Additional fields to include in the normalized output from the JSON document. In this case, we're using the 'title' and 'author' fields. Note that the metadata columns appear at the end of the output; that is simply how the function works. This doesn't matter for analysis, but if you want these columns to appear first, you'll have to reorder them manually.
  • errors: The error handling strategy, which can be 'ignore', 'raise', or 'warn'. We have set it to 'raise', so if there are any errors during the normalization process, it will raise an exception.
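To make the parameter behavior concrete, here is a minimal, self-contained sketch using a trimmed-down version of the books data (not the full books.json). It shows how json_normalize() names the flattened columns from nested record fields, and how the optional sep parameter controls the separator used for those nested keys:

```python
import pandas as pd

# A trimmed-down version of the books data for illustration
data = {
    "books": [
        {
            "title": "One Hundred Years of Solitude",
            "author": "Gabriel Garcia Marquez",
            "reviews": [
                {"reviewer": {"name": "Kanwal Mehreen"}, "rating": 4.5}
            ],
        }
    ]
}

df = pd.json_normalize(
    data=data["books"],
    record_path="reviews",
    meta=["title", "author"],
    sep="_",  # nested record keys join with "_" instead of the default "."
)

# The nested reviewer.name becomes a "reviewer_name" column,
# while the meta fields "title" and "author" are appended as columns.
print(df.columns.tolist())
```

With the default sep=".", the same column would instead be named "reviewer.name", which is worth knowing if you plan to reference columns with dot notation elsewhere.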

Wrapping Up

Both of these methods have their own advantages and use cases, and the choice of method depends on the structure and complexity of the JSON data. If the JSON data has a deeply nested structure, the json_normalize() function might be the most suitable option, as it can handle the nested data automatically. If the JSON data is relatively simple and flat, the json.load() and pd.DataFrame() approach might be the easiest and most straightforward.

When dealing with large JSON files, it's crucial to think about memory usage and performance since loading the whole file into memory might not work. So, you might have to look into other options like streaming the data, lazy loading, or using a more memory-efficient format like Parquet.
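As a sketch of one such option: if the data is stored as JSON Lines (one JSON object per line), pd.read_json() with lines=True and a chunksize returns an iterator, so the file is processed in pieces instead of being loaded all at once. The in-memory buffer below stands in for a real file handle:

```python
import io
import pandas as pd

# An in-memory buffer standing in for a large JSON Lines file,
# e.g. open("reviews.jsonl") in real code
buffer = io.StringIO(
    '{"title": "Book A", "rating": 4.5}\n'
    '{"title": "Book B", "rating": 4.9}\n'
    '{"title": "Book C", "rating": 4.7}\n'
)

# chunksize requires lines=True; each chunk is a small DataFrame
total, count = 0.0, 0
for chunk in pd.read_json(buffer, lines=True, chunksize=2):
    total += chunk["rating"].sum()
    count += len(chunk)

# Aggregate computed without ever holding all rows in memory at once
average = round(total / count, 2)
```

This streaming pattern works for any aggregation that can be computed incrementally; for operations that need the full dataset at once, a columnar format like Parquet is usually the better fit.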

Kanwal Mehreen Kanwal is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook "Maximizing Productivity with ChatGPT". As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She's also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.

More On This Topic

  • Convert Python Dict to JSON: A Tutorial for Beginners
  • How to Use ChatGPT to Convert Text into a PowerPoint Presentation
  • Dask DataFrame is not Pandas
  • Pandas vs. Polars: A Comparative Analysis of Python's Dataframe Libraries
  • How to Convert an RGB Image to Grayscale
  • Convert Text Documents to a TF-IDF Matrix with tfidfvectorizer

Comprehensive RAG Benchmark Aims to Advance Retrieval-Augmented Question Answering

Researchers at Meta AI have created a new benchmark called CRAG (Comprehensive Retrieval-Augmented Generation Benchmark) to spur advancements in retrieval-augmented question answering systems that combine large language models with external knowledge sources.

The goal is to develop more reliable and trustworthy question answering capabilities that overcome hallucinations and knowledge gaps in today’s language models.

The CRAG benchmark consists of 4,409 question-answer pairs spanning finance, sports, music, movies, and general topics.

It includes diverse question types like comparisons, aggregations, multi-hop queries, and false premises. The dataset incorporates facts with varying dynamism from real-time to static, as well as varying entity popularity from head to long-tail.

Crucially, CRAG provides mock web search results and APIs to simulate retrieving information from the internet and knowledge graphs. This allows benchmarking the full pipeline of retrieval, synthesis, and generation required for knowledge-grounded question answering.

Evaluations highlighted major gaps in current systems. The most advanced language models achieved only 34% accuracy on CRAG, while straightforward retrieval-augmentation improved this to just 44%.

Even industry-leading retrieval-augmented systems answered only 63% of questions without hallucinations, struggling especially with dynamic, long-tail, and complex queries.

“CRAG reveals the challenges in building fully trustworthy question answering systems that can reliably incorporate information from the real world,” said Xiao Yang, a research scientist at Meta AI and co-lead of the project. “We hope this benchmark spurs innovation and allows tracking progress toward this critical goal.”

The CRAG dataset formed the basis for the KDD Cup 2024 challenge hosted by Meta AI, attracting thousands of participants working to advance retrieval-augmented generation capabilities. The researchers plan to continue expanding and improving CRAG to push forward research in this area.

The post Comprehensive RAG Benchmark Aims to Advance Retrieval-Augmented Question Answering appeared first on AIM.

NVIDIA is the World, OpenAI is the Word

NVIDIA recently hit a $3 trillion market cap, surpassing Apple and temporarily becoming the second-most valuable public company behind Microsoft. NVIDIA chief Jensen Huang may be attracting all the limelight at the moment, giving away autographs; but OpenAI is right around the corner.

Apple is placing huge bets on OpenAI. The company is reportedly going to reveal a new AI system called ‘Apple Intelligence’ at the Worldwide Developers Conference (WWDC) 2024 in partnership with OpenAI, and integrating ChatGPT into iOS 18 to enhance Siri’s capabilities.

Prediction : Apple Intelligence is wrapper around OpenAI GPT. pic.twitter.com/CrExb67xwr

— AshutoshShrivastava (@ai_for_success) June 7, 2024

It wouldn’t be wrong to say that today OpenAI is one of the most sought-after AI startups in the world. Whether it is education, media, or healthcare, OpenAI is building products for every industry.

OpenAI is Omnipresent

Recently, Hollywood actor Ashton Kutcher praised OpenAI’s Sora, saying that creators will be able to render a whole movie using it. “I have a beta version of it, and it’s pretty amazing,” Kutcher said of the platform in a recent conversation with former Google CEO Eric Schmidt at the Berggruen Salon in Los Angeles.

GPT-4o's new voice capabilities have most likely already impacted the jobs of voice actors. Most recently, the company released a new demo featuring the GPT-4o model's ability to generate voices for a range of Disney-like characters, including animals such as a snake, an owl, and a fox.

OpenAI Assists Teachers

OpenAI recently announced ChatGPT Edu, a version of ChatGPT built for universities to responsibly deploy AI for students, faculty, researchers, and campus operations. Powered by GPT-4o, it can reason across text and vision and use advanced tools such as data analysis.

Moreover, OpenAI partnered with Arizona State University (ASU) to integrate ChatGPT into its classrooms. The aim is to enhance student success, foster innovative research, and streamline organisational processes.

Similarly, the response has been positive from Indian schools as well. “ChatGPT 4.0 has become an indispensable tool in our classrooms,” said Meena Bagga, a teacher at a prominent school in New Delhi.

“It has revolutionised the way we approach learning, making it more interactive, personalised, and engaging for our students,” she added.

OpenAI Saves Lives

In healthcare, OpenAI has partnered with Moderna, providing employees access to ChatGPT Enterprise developed on OpenAI’s GPT-4. Moderna plans to utilise ChatGPT Enterprise for mRNA medicine development, aiming to launch up to 15 new products in five years. These include a vaccine for RSV and personalised cancer treatments.

Meanwhile, Indian fitness and lifestyle startup Healthify (formerly HealthifyMe) is using OpenAI’s GPTs for real-time nutritional analysis, healthy suggestions, and more. On the other hand, OpenAI has partnered with Summer Health, a 24×7 text-based pediatric care service, to assist its doctors using GPT-4.

Summer Health has implemented a new feature that automatically generates visit notes from a doctor’s detailed written observations using GPT-4.

OpenAI as AgriTech

OpenAI is helping farmers in India increase crop yields as well. Using GPT-4, agricultural NGO Digital Green introduced Farmer.Chat, a chatbot covering a wide range of agricultural topics including crop advice, disease identification, weather forecasts, and market information.

Make money with GPT-4

Researchers at the University of Chicago’s Booth School of Business have demonstrated that OpenAI’s GPT-4 can perform as well as or even better than human experts in financial statement interpretations.

The study claimed that GPT-4 could forecast future profit direction with 60% accuracy, outperforming a majority of human financial analysts who averaged an accuracy of 53-57%.

The New Google Search

OpenAI has partnered with several prominent media houses to enhance its AI models and provide high-quality content to users. These partnerships include Le Monde and Prisa Media, which bring French and Spanish news content to ChatGPT, as well as Vox Media, The Atlantic, and News Corp, which provide a wealth of journalistic content for training and user engagement.

Media companies continue burning the furniture to stay warm
Full analysis of @OpenAI and @Google's content licensing deals and what this means for big tech and media companies here ⤵https://t.co/643U90X00v pic.twitter.com/9SUEvafcaS

— Anand Sanwal (@asanwal) June 4, 2024

Through these partnerships, OpenAI can build a model that could act like a search engine and have enough data to provide real-time information to its users. Additionally, OpenAI has collaborated with Axel Springer, Financial Times, and Associated Press, among others, to support the dissemination of accurate and balanced news stories.

Everyone can Design

Using DALL-E 3, users can easily brainstorm design ideas by simply giving the prompts in natural language. It speeds up the design process, removes the need for manual drawing, and lets designers experiment quickly.

Moreover, Canva’s Magic Studio also uses GPT-4 to generate content and designs. Canva’s design generation tool combines OpenAI’s API with Canva’s own AI design engine and library of over 100 million assets and templates.

While everyone awaits the OpenAI-Apple partnership, the message is clear that OpenAI is here to stay. As of February 2024, OpenAI's valuation stands at $80 billion.

Interestingly, Chinese investor and serial entrepreneur Kai-Fu Lee is bullish about OpenAI becoming a trillion-dollar company in the next two to three years. “OpenAI will likely be a trillion-dollar company in the not-too-distant future,” said Lee, adding that GPT-4 and GPT-4 Turbo are unbelievably good and a great balance of performance and cost.

The post NVIDIA is the World, OpenAI is the Word appeared first on AIM.