Know what floats my boat? Charts and graphs.
Give me a cool chart to dig into and I'm unreasonably happy. I love watching the news on election nights, not for the vote count, but for all the great charts. I switch between channels all evening to see every possible way that each network finds to present numerical data.
Is that weird? I don't think so.
As it turns out, ChatGPT does a great job making charts and tables. And given that this ubiquitous generative AI chatbot can synthesize a ton of information into something chart-worthy, what ChatGPT gives up in pretty presentation it more than makes up for in informational value.
It should come as no surprise that AI chatbots' feature sets change constantly. As of this update (end of May 2024), OpenAI has just come out with a Mac application and has released its GPT-4o LLM, which is available to both free and paying customers. The GPT-4o version that comes with the added-price Plus plan is supposed to have interactive chart features and to let you interact with the engine longer per session.
But, not so much. GPT-4o hasn't rolled out to all free accounts yet, and mine doesn't offer it at all. And while the paid ChatGPT Plus plan does provide the interactive charts feature in Chrome and Safari, it doesn't in the Mac app.
This article was last updated when the Advanced Data Analysis features (which included charts) were only available to Plus customers. Some of those features are now supposed to be available to free customers, but since my free account doesn't have them yet, I'm going to present the rest of this article as if the charting features are only available to Plus customers. If you're a free customer and you have GPT-4o, feel free to try some of the prompts. Those features may work for you now, and undoubtedly will in time.
Advanced Data Analysis produces relatively ugly charts. But it rocks. First, let's discuss where ChatGPT gets its data, then we'll make some tables.
How to use ChatGPT to make charts and tables
List the top five cities in the world by population. Include country.
I asked this question to ChatGPT's free version and here's what I got back:
Turning that data into a table is simple. Just tell ChatGPT you want a table:
Make a table of the top five cities in the world by population. Include country.
Make a table of the top five cities in the world by population. Include country and a population field
You can also specify certain details for the table, like field order and units. Here, I'm moving the country first and compressing the population numbers.
Make a table of the top five cities in the world by population. Include country and a population field. Display the fields in the order of rank, country, city, population. Display population in millions (with one decimal point), so 37,833,000 would display as 37.8M.
Note that I gave the AI an example of how I wanted the numbers to display.
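If you want to check the formatting rule in that prompt yourself, here is a minimal Python sketch. The function name format_millions is made up for this illustration, and nothing here reflects how ChatGPT formats the numbers internally.

```python
def format_millions(population: int) -> str:
    """Return a population in millions with one decimal place, e.g. '37.8M'."""
    return f"{population / 1_000_000:.1f}M"


# The example figure from the prompt above.
print(format_millions(37_833_000))
```

Giving the AI a worked example in the prompt does the same job as the format string here: it removes any ambiguity about rounding and the unit suffix.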
That's about as far as the free version will take us. From now on, we're switching to the $20/month ChatGPT Plus version.
In this example, we're just going to make a simple bar chart.
Make a bar chart of the top five cities in the world by population
Chatty little tool, isn't it?
The eagle-eyed among you may have noticed the discrepancy in populations between the previous table shown and the results here. Notice that the table has a green icon and this graph has a purple icon. We've jumped from GPT-3.5 (the free version of ChatGPT) to GPT-4 (in ChatGPT Plus). It's interesting that the differing LLMs have slightly different data. This difference is all part of why it pays to be careful when using AIs, so double-check your work. In our case, we're just demonstrating charts, but this is a tangible example of where confidently presented data can be wrong or inconsistent.
The dataset I chose for this article is readily available from a government site, so you can replicate this experiment on your own. There are a ton of great datasets available on Data.gov, but I found that many are far too large for ChatGPT to use.
Once I downloaded this one, I realized it also included information on ethnicity, so we can run a number of different charts from the same dataset.
Click the little upload button and then tell it the data file you want to import.
I asked it to show me the first five lines of the file so I'd know more about the file's format.
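You can take the same peek at a CSV file locally before uploading it. This is a rough sketch using only Python's standard library. The column names and sample rows below are assumptions invented for the illustration, not the dataset's actual layout or values.

```python
import csv
import io
from itertools import islice

# Hypothetical stand-in for the downloaded file. Column names and rows
# are made up for this sketch; a real file will differ.
SAMPLE_CSV = """year,gender,ethnicity,name,count
2011,FEMALE,HISPANIC,Sofia,1
2011,FEMALE,WHITE NON HISPANIC,Sophia,1
2012,FEMALE,WHITE NON HISP,MADISON,1
2011,FEMALE,BLACK NON HISPANIC,Madison,1
2012,FEMALE,ASIAN AND PACI,SOPHIA,1
2011,FEMALE,ASIAN AND PACIFIC ISLANDER,Sofia,1
"""


def first_rows(csv_file, n=5):
    """Return the header row and the first n data rows of an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader)
    return header, list(islice(reader, n))


header, rows = first_rows(io.StringIO(SAMPLE_CSV))
print(header)
for row in rows:
    print(row)
```

With a real file, you would pass open("your_file.csv", newline="") instead of the io.StringIO sample.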
Create a pie chart showing gender as a percentage of the overall dataset
And here's the result:
Unfortunately, the dark shade of green makes the numbers difficult to read. Fortunately, you can instruct Advanced Data Analysis to use different colors. I was careful to choose colors that did not reinforce gender stereotypes.
Create a pie chart showing gender as a percentage of the overall dataset. Use light green for male and medium yellow for female.
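The numbers behind a pie chart like this are just each category's share of the total. Here is a minimal sketch of that arithmetic. It assumes each row counts once (the article doesn't say whether ChatGPT counted rows or summed a count column), and the field name "gender" and the sample rows are made up.

```python
from collections import Counter


def percentage_by(rows, field):
    """Return {value: percent of rows} for one field, rounded to one decimal."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Made-up rows for illustration only.
rows = [
    {"gender": "FEMALE"},
    {"gender": "MALE"},
    {"gender": "FEMALE"},
    {"gender": "FEMALE"},
]
print(percentage_by(rows, "gender"))
```

A plotting library would then draw one slice per key, sized by these percentages.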
Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.
And here's the result. Notice anything?
Apparently, New York didn't properly normalize its data. It used "WHITE NON HISPANIC" and "WHITE NON HISP" together, "BLACK NON HISPANIC" and "BLACK NON HISP" together, and "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI" together. This resulted in inaccurate representations of the data.
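The variant labels can also be merged in code before charting. This is a minimal sketch, assuming the labels appear exactly as quoted above. The names ETHNICITY_ALIASES and normalize_ethnicity are invented for the illustration.

```python
from collections import Counter

# Short labels quoted in the article, mapped to their longer forms.
ETHNICITY_ALIASES = {
    "WHITE NON HISP": "WHITE NON HISPANIC",
    "BLACK NON HISP": "BLACK NON HISPANIC",
    "ASIAN AND PACI": "ASIAN AND PACIFIC ISLANDER",
}


def normalize_ethnicity(label: str) -> str:
    """Trim and uppercase a label, then expand it if it is a known short form."""
    cleaned = label.strip().upper()
    return ETHNICITY_ALIASES.get(cleaned, cleaned)


# Made-up list of labels, one per row, for illustration only.
labels = ["WHITE NON HISP", "WHITE NON HISPANIC", "ASIAN AND PACI", "HISPANIC"]
print(Counter(normalize_ethnicity(label) for label in labels))
```

Mapping to the longer name keeps the chart legend readable and ensures each group is counted once.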
One benefit of ChatGPT is that it remembers instructions throughout a session. So I was able to give it this instruction:
For all the following requests, group "WHITE NON HISPANIC" and "WHITE NON HISP" together. Group "BLACK NON HISPANIC" and "BLACK NON HISP" together. Group "ASIAN AND PACIFIC ISLANDER" and "ASIAN AND PACI". Use the longer of the two ethnicity names when displaying ethnicity.
And it replied:
Let's try the chart again, using the same prompt.
Show the distribution of ethnicity in the dataset using a pie chart. Use only light colors.
That's better:
You need to be diligent when looking at results. For example, in a request for top baby names, the AI separated out "Madison" and "MADISON" as two different names:
For all the following requests, baby names should be case insensitive.
For each ethnicity, present two pie charts, one for each gender. Each pie chart should list the top five baby names for that gender and that ethnicity. Use only light colors.
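For a sense of what the last two prompts ask for, here is a rough sketch of the grouping logic: names are compared case-insensitively, then ranked within each ethnicity and gender. The field names ("ethnicity", "gender", "name", "count") and the sample rows are assumptions for the illustration. The article doesn't show the code ChatGPT actually ran.

```python
from collections import Counter, defaultdict


def top_names(rows, n=5):
    """Return {(ethnicity, gender): [(name, total), ...]} with the n most common names.

    Names are case-folded first, so 'Madison' and 'MADISON' count as one name.
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = (row["ethnicity"], row["gender"])
        groups[key][row["name"].strip().casefold()] += row["count"]
    return {key: names.most_common(n) for key, names in groups.items()}


# Made-up rows for illustration only.
rows = [
    {"ethnicity": "HISPANIC", "gender": "FEMALE", "name": "Madison", "count": 2},
    {"ethnicity": "HISPANIC", "gender": "FEMALE", "name": "MADISON", "count": 3},
    {"ethnicity": "HISPANIC", "gender": "FEMALE", "name": "Sofia", "count": 4},
]
for group, names in top_names(rows).items():
    print(group, names)
```

Case-folding only merges names that differ by capitalization. Different spellings, such as Sofia and Sophia, still count as separate names.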
As it turns out, the text in the generated chart was too small to read. So, to get a more useful chart, we can export it. I'm going to specify both the file format and the image width:
Export this chart as a 3000 pixel wide JPG file.
And here's the result:
Notice that Sofia and Sophia are very popular, but are shown as two different names. But that's what makes charts so fascinating.
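If you reproduce a chart like this yourself, the 3000-pixel export requested earlier comes down to simple arithmetic: output width in pixels equals figure width in inches times DPI. The sketch below assumes a matplotlib Figure object. Both helper names are hypothetical, and this is not necessarily how ChatGPT performs the export.

```python
def dpi_for_width(target_px: int, width_in: float) -> float:
    """Return the DPI at which a figure width_in inches wide is target_px pixels wide."""
    if width_in <= 0:
        raise ValueError("width_in must be positive")
    return target_px / width_in


def save_wide(fig, path: str, target_px: int = 3000) -> None:
    """Save a figure so the output image is about target_px pixels wide.

    Assumes fig is a matplotlib.figure.Figure. Writing a .jpg file with
    matplotlib also requires the Pillow package to be installed.
    """
    width_in = fig.get_size_inches()[0]
    fig.savefig(path, dpi=dpi_for_width(target_px, width_in))


# A figure 10 inches wide needs 300 DPI to come out 3000 pixels wide.
print(dpi_for_width(3000, 10))
```

Raising the DPI rather than the figure size keeps the layout the same while making small text legible when you zoom in.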
FAQ
How much does it cost to use Advanced Data Analysis?
Advanced Data Analysis comes with ChatGPT Plus, which costs $20/month. Some of its features are available in GPT-4o for the free version of ChatGPT. Advanced Data Analysis is also included with the Enterprise edition, but pricing for that hasn't been released yet.
Is the data uploaded to ChatGPT for charting kept private or is there a risk of data exposure?
Assume that there's always a privacy risk.
I asked this question to ChatGPT and this is what it told me:
Data privacy is a priority for ChatGPT. Uploaded data is used solely for the purpose of the user's current session and is not stored long-term or used for any other purposes. However, for highly sensitive data, users should always exercise caution and consider using the Enterprise version of ChatGPT, which offers enhanced data confidentiality.
My recommendation: Don't trust ChatGPT or any generative AI tool with private data. The Enterprise version is supposed to have more privacy controls, but I would recommend you only upload data that you wouldn't mind becoming public.
Can ChatGPT's Advanced Data Analysis handle real-time data or is it more suited for static datasets?
It's better suited to static datasets. Real-time analysis is possible, but there are some practical limitations. First, the Plus account throttles the number of requests you can make in a given period of time. Second, you have to upload each file individually. You could possibly use a licensed ChatGPT API to do real-time analytics. But for the chatbot itself, you're looking at parsing data at rest.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter on Substack, and follow me on Twitter at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.