How to use ChatGPT’s Advanced Data Analysis to create quality charts and tables


Know what floats my boat? Charts and graphs. Give me a cool chart to dig into and I'm unreasonably happy. Is that weird? I don't think so.

As it turns out, ChatGPT does a great job making charts and tables. And given that this ubiquitous generative AI chatbot can synthesize a ton of information into something chart-worthy, what ChatGPT gives up in pretty presentation it more than makes up for in informational value.

How to use ChatGPT to make charts and tables

It should come as no surprise that AI chatbots' feature sets change constantly. As of this update (November 2024), OpenAI has just launched an early version of its Windows application (for paying customers only) and has introduced its o1-preview and o1-mini LLMs, also just for paying customers. The GPT-4o version is now available to free users. The Advanced Data Analysis feature we'll be talking about here is available to both free and paying customers.

Historically, OpenAI has introduced major new features into its Plus version ($20/month) and then, after a time, rolled them out to free users. As such, it's often challenging — week by week — to tell you which features exist in the free version vs. the Plus version.

Here's a general guideline, especially as it pertains to the rest of this article. The free version is generally more limited than the Plus version. That means fewer queries per session, less data to analyze, possibly a slightly older LLM version available, longer wait times for responses, and so on. Basically, you're in the cheap seats if you use the free version and you get a more premium experience if you pay for the Plus version.

I now pay for the Plus version because I found I often got cut off from asking questions before I was done with whatever I was working on. That (mostly) doesn't happen anymore now that I pay for the Plus version.


For much of this article, we'll be using the Advanced Data Analysis feature that's now built into both the free and Plus versions. This tool imports data tables in a wide range of file formats. It doesn't state a size limit for imported data, and it can handle fairly large files, but it will break if a file exceeds some undefined level of complexity.

For now, my advice is to try these things on the free version, and if you need a more responsive experience, upgrade to the Plus version.

List the top five cities in the world by population. Include country.

I asked this question to ChatGPT and here's what I got back:

Turning that data into a table is simple. Just tell ChatGPT you want a table:

Make a table of the top five cities in the world by population. Include country.

Notice that it also gave me population data, even though I didn't explicitly ask for a population column.

You can specify certain details for the table, like field order and units. Here, I'm moving the country first and compressing the population numbers.

Make a table of the top five cities in the world by population. Include country and a population field. Display the fields in the order of rank, country, city, population. Display population in millions (with one decimal point), so 37,833,000 would display as 37.8M.

Note that I gave the AI an example of how I wanted the numbers to display.
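If you ever need to reproduce that display format outside ChatGPT, the rule in the prompt is simple enough to write down. Here's a minimal Python sketch; the function name is my own invention, and the only figure used is the example from the prompt.

```python
def format_millions(population: int) -> str:
    """Format a raw head count as millions with one decimal place, e.g. 37.8M."""
    return f"{population / 1_000_000:.1f}M"


# The example figure from the prompt above.
print(format_millions(37_833_000))  # prints 37.8M
```

Giving ChatGPT a worked example like 37.8M does the same job as this function: it pins down the rounding and the suffix so there's nothing left to guess.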

In this example, we're going to make a simple bar chart.

Make a bar chart of the top five cities in the world by population

The dataset I chose for this article is readily available from a government site, so you can replicate this experiment on your own. There are a ton of great datasets available on Data.gov, but I found that many are far too large for ChatGPT to use.


Once I downloaded this one, I realized it also included information on ethnicity, so we could run several different charts from the same dataset.

Click the little upload button and then tell it the data file you want to import.

I asked it to show me the first five lines of the file so I'd know more about the file's format.
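Peeking at the first few rows is a good habit with any data file, and you can do it locally before uploading. The sketch below assumes a comma-separated file with a header row. The column names and rows are placeholders I made up, not the contents of the real dataset.

```python
import csv
import io
from itertools import islice


def first_rows(text_stream, n=5):
    """Return the header row and the first n data rows of a CSV stream."""
    reader = csv.reader(text_stream)
    header = next(reader, [])
    return header, list(islice(reader, n))


# Hypothetical stand-in for the uploaded file; every value here is invented.
SAMPLE = io.StringIO(
    "year,gender,ethnicity,name,count\n"
    "2011,FEMALE,GROUP_A,NAME_A,10\n"
    "2011,FEMALE,GROUP_A,NAME_B,12\n"
    "2011,MALE,GROUP_B,NAME_C,15\n"
    "2012,MALE,GROUP_B,NAME_D,11\n"
    "2012,FEMALE,GROUP_C,NAME_E,13\n"
    "2012,MALE,GROUP_C,NAME_F,14\n"
)

header, rows = first_rows(SAMPLE, n=5)
print(header)
for row in rows:
    print(row)
```

With a real file, you'd pass `open("your_file.csv", newline="")` instead of the in-memory sample.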

Create a pie chart showing gender as a percentage of the overall dataset

And here's the result. Note the color choices for each pie wedge. That was ChatGPT's choice.


You can instruct Advanced Data Analysis to use different colors. I was careful to choose colors that did not reinforce gender stereotypes or redefine common gender-related colors.

Create a pie chart showing gender as a percentage of the overall dataset. Use light green for male and medium yellow for female.

Look at ChatGPT's response carefully. Here's where we see inaccuracies in its response. I asked for the male wedge to be green and the female wedge to be yellow. In the chart, the AI reversed that, but in the descriptive text, it got it right. Don't be afraid to correct the AI.

The colors of the chart don't match the text. Please do it again.
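If you want to rule out a color mix-up like this, one option is to draw the chart yourself and tie each label to its color explicitly. Here's a rough sketch, not the code ChatGPT ran. It assumes the gender column uses the labels MALE and FEMALE, that matplotlib is installed, and that the named colors "lightgreen" and "gold" are acceptable stand-ins for light green and medium yellow.

```python
from collections import Counter


def percentage_shares(values):
    """Return {label: percent of total} for an iterable of category labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: 100 * count / total for label, count in counts.items()}


def draw_pie(shares, color_by_label, path):
    """Save a pie chart in which each label gets its explicitly assigned color."""
    import matplotlib  # third-party; imported here so percentage_shares works without it

    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    labels = list(shares)
    fig, ax = plt.subplots()
    ax.pie(
        [shares[label] for label in labels],
        labels=labels,
        # Look colors up by label so wedge order can never swap them.
        colors=[color_by_label[label] for label in labels],
        autopct="%1.1f%%",
    )
    fig.savefig(path)
    plt.close(fig)


# Assumed label spellings and assumed color names (see the note above).
COLOR_BY_LABEL = {"MALE": "lightgreen", "FEMALE": "gold"}
```

To use it, pass the gender column to `percentage_shares`, then call something like `draw_pie(shares, COLOR_BY_LABEL, "gender.png")`. Because the colors are looked up by label rather than by position, the green wedge is always the male one.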

Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.

And here's the result. Notice anything?


Apparently, New York didn't properly normalize its data. It used "WHITE NON HISPANIC" and "WHITE NON HISP" together, "BLACK NON HISPANIC" and "BLACK NON HISP" together, and "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI" together. This resulted in inaccurate representations of the data.
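If you were cleaning the file yourself, the fix is a small lookup table that folds each truncated label into its longer form. This is a minimal sketch using only the three label pairs named above. The function and variable names are my own, and the example list at the end is made up to show the effect.

```python
from collections import Counter

# Truncated label -> full label, taken from the three pairs described above.
CANONICAL_ETHNICITY = {
    "WHITE NON HISP": "WHITE NON HISPANIC",
    "BLACK NON HISP": "BLACK NON HISPANIC",
    "ASIAN AND PACI": "ASIAN AND PACIFIC ISLANDER",
}


def normalize_ethnicity(label: str) -> str:
    """Map a truncated ethnicity label to its longer form; leave others unchanged."""
    cleaned = label.strip().upper()
    return CANONICAL_ETHNICITY.get(cleaned, cleaned)


# Made-up example column: four raw labels that collapse into two groups.
raw = ["WHITE NON HISP", "WHITE NON HISPANIC", "ASIAN AND PACI", "ASIAN AND PACIFIC ISLANDER"]
grouped = Counter(normalize_ethnicity(label) for label in raw)
print(grouped)
```

Without the lookup, a chart of this column would show four wedges instead of two, which is exactly the distortion in the pie chart above.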

One benefit of ChatGPT is it remembers instructions throughout a session. So I was able to give it this instruction:

For all the following requests, group "WHITE NON HISPANIC" and "WHITE NON HISP" together. Group "BLACK NON HISPANIC" and "BLACK NON HISP" together. Group "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI". Use the longer of the two ethnicity names when displaying ethnicity.

And it replied:

Let's try the chart again, using the same prompt.

Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.

That's better:


You need to be diligent when looking at results. For example, in a request for top baby names, the AI treated "Madison" and "MADISON" as two different names, so I added this instruction:

For all the following requests, baby names should be case insensitive.
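In code terms, a case-insensitive instruction like this boils down to folding every name to one spelling before counting. Here's a small sketch of that idea. The function name is hypothetical and the counts in the example are invented for illustration.

```python
from collections import Counter


def top_names(rows, n=5):
    """Sum counts per name, ignoring case, and return the n most common."""
    totals = Counter()
    for name, count in rows:
        # casefold() makes "MADISON" and "Madison" identical; title() tidies the display form.
        totals[name.strip().casefold().title()] += count
    return totals.most_common(n)


# Invented counts, just to show the two spellings merging.
example = [("Madison", 3), ("MADISON", 2), ("Sofia", 4)]
print(top_names(example))  # prints [('Madison', 5), ('Sofia', 4)]
```

Note that this only merges differences in capitalization. Names that are spelled differently are still counted separately.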

For each ethnicity, present two pie charts side-by-side, one for each gender. Each pie chart should list the top five baby names for that gender and that ethnicity. Use only light colors. Do not title each chart. Remove the phrase "Matplotlib Chart" from each chart.

The AI gave me four charts like the following, one for each ethnicity it was tracking. Note the phrase "Matplotlib Chart" at the top of the chart. As you can see, I tried very hard to get ChatGPT to remove it and other wacky titles it chose to use from the charts — with no success. Sometimes, you need to give up and just use something like Photoshop to edit out the stupid from an AI response.

Also notice that Sofia and Sophia are very popular, but are shown as two different names. But that's what makes charts so fascinating.

FAQ

Is the data uploaded to ChatGPT for charting kept private or is there a risk of data exposure?

Assume that there's always a privacy risk.

I asked this question to ChatGPT and this is what it told me:

Data privacy is a priority for ChatGPT. Uploaded data is used solely for the purpose of the user's current session and is not stored long-term or used for any other purposes. However, for highly sensitive data, users should always exercise caution and consider using the Enterprise version of ChatGPT, which offers enhanced data confidentiality.


My recommendation: Don't trust ChatGPT or any generative AI tool. The Enterprise version is supposed to have more privacy controls, but I would recommend you only upload data that you won't mind finding its way to public visibility.

Can ChatGPT's Advanced Data Analysis handle real-time data or is it more suited for static datasets?

Real-time analysis is possible, but there are some practical limitations. First, the Plus account will throttle the number of requests you can make in a given period of time. Second, you have to upload each file individually. You might be able to use a licensed ChatGPT API to do real-time analytics. But with the chatbot itself, you're looking at parsing data at rest.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter on Substack, and follow me on Twitter at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.
