
# ChatGPT, Claude, and Gemini: A Data Analysis Showdown


Welcome to the inaugural entry of my series, ChatGPT, Claude, and Gemini: A Data Analysis Showdown. In this series, I will evaluate these AI models across various data science and analytics tasks to assist data professionals and enthusiasts in selecting the best AI companion for their requirements.

I have chosen to compare these three models due to their user-friendly chatbot interfaces, which make them ideal for everyday tasks. For this evaluation, I utilized my subscriber accounts for Claude and Gemini Advanced, which I recently subscribed to for this series.

ChatGPT 4o: Accessible to all ChatGPT users, though free users face a limited usage cap. The Plus subscription costs $20 per month.
- Context window: 128k tokens
- Training data up to October 2023

Claude 3.5 Sonnet: Open to all Claude.ai users, with free users also having a usage limit. A professional plan is available for $20 per month.
- Context window: 200k tokens
- Training data up to April 2024

Gemini Advanced: Offered to Google One AI Premium plan subscribers at $20 per month.
- Context window: 1 million tokens
- Training data up to November 2023

## Evaluating SQL Skills

In this first article, we will focus on SQL, a fundamental skill for data analysts and scientists. Its straightforward syntax makes it an excellent starting point for assessing large language models (LLMs). I will evaluate each model based on query accuracy, efficiency, formatting, and explanations.

My evaluation consists of ten questions divided into three categories:

1. Problem Solving: The first three questions are derived from LeetCode SQL challenges, featuring clear descriptions and input/output structures akin to interview questions.
2. Business Logic: The next four questions are original queries designed to mimic real-world SQL applications, complete with datasets and brief business questions.
3. Query Optimization: The final section challenges the models to enhance three provided SQL queries.

Let’s dive into the assessment!

### Round 1: Problem Solving (LeetCode SQL)

#### Question 1 — LeetCode SQL 262 Trips and Users

This question is classified as "Hard" on LeetCode, with a low acceptance rate of 36.5%. It requires joining the same table twice on different columns while interpreting multiple filters.

GPT-4o: Delivered the correct solution with thorough explanations, earning full points.

Claude 3.5 Sonnet: Also provided an accurate query with comprehensive explanations, receiving full points.

```sql
-- Claude's answer for reference
SELECT
    t.request_at AS Day,
    ROUND(
        SUM(CASE
                WHEN t.status = 'cancelled_by_driver' OR t.status = 'cancelled_by_client' THEN 1
                ELSE 0
            END) / COUNT(*), 2
    ) AS 'Cancellation Rate'
FROM
    Trips t
JOIN
    Users uc ON t.client_id = uc.users_id
JOIN
    Users ud ON t.driver_id = ud.users_id
WHERE
    uc.banned = 'No' AND ud.banned = 'No' AND t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY
    t.request_at
```

Gemini Advanced: Provided a correct query with explanations, also earning full points. Interestingly, it linked a GitHub repository at the end of its response, which was a random LeetCode practice repo in Java, not related to the SQL question.

#### Question 2 — LeetCode SQL 185 Department Top Three Salaries

This question is also marked "Hard" and requires a window function.

GPT-4o: Presented a correct solution utilizing a window function and clear explanations, earning full points.

```sql
-- GPT's solution for reference
WITH RankedSalaries AS (
    SELECT
        e.id,
        e.name,
        e.salary,
        e.departmentId,
        d.name AS department,
        DENSE_RANK() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) AS salary_rank
    FROM
        Employee e
    JOIN
        Department d ON e.departmentId = d.id
)
SELECT
    department,
    name AS Employee,
    salary AS Salary
FROM
    RankedSalaries
WHERE
    salary_rank <= 3
ORDER BY
    department, salary DESC, name;
```

Claude 3.5 Sonnet: Provided a similar correct solution with detailed explanations, earning full points.

Gemini Advanced: Delivered the correct solution, earning full points. This time, it linked a relevant GitHub repository containing LeetCode SQL solutions and the specific LeetCode question, which was a positive improvement.

#### Question 3 — LeetCode SQL 1341 Movie Rating

This question presents a "Medium" difficulty yet has a low acceptance rate of 40.1%. It requires combining the top results from two aggregations.

GPT-4o: Provided a correct solution, although it was somewhat lengthy and inefficient, so it received half a point.

```sql
-- GPT's solution
WITH UserRatings AS (
    SELECT
        u.name AS user_name,
        COUNT(*) AS rating_count
    FROM
        MovieRating mr
    JOIN
        Users u ON mr.user_id = u.user_id
    GROUP BY
        u.user_id
),
MovieRatingsInFeb2020 AS (
    SELECT
        m.title,
        AVG(mr.rating) AS avg_rating
    FROM
        MovieRating mr
    JOIN
        Movies m ON mr.movie_id = m.movie_id
    WHERE
        mr.created_at BETWEEN '2020-02-01' AND '2020-02-29'
    GROUP BY
        m.movie_id
),
MaxUser AS (
    SELECT
        user_name
    FROM
        UserRatings
    WHERE
        rating_count = (SELECT MAX(rating_count) FROM UserRatings)
    ORDER BY
        user_name
    LIMIT 1
),
MaxMovie AS (
    SELECT
        title
    FROM
        MovieRatingsInFeb2020
    WHERE
        avg_rating = (SELECT MAX(avg_rating) FROM MovieRatingsInFeb2020)
    ORDER BY
        title
    LIMIT 1
)
SELECT user_name AS results FROM MaxUser
UNION ALL
SELECT title AS results FROM MaxMovie;
```

Claude 3.5 Sonnet: Provided a clean and efficient solution, receiving full points.

```sql
-- Claude's solution
(SELECT name AS results
 FROM Users u
 JOIN MovieRating mr ON u.user_id = mr.user_id
 GROUP BY u.user_id
 ORDER BY COUNT(DISTINCT mr.movie_id) DESC, name
 LIMIT 1)
UNION ALL
(SELECT title AS results
 FROM Movies m
 JOIN MovieRating mr ON m.movie_id = mr.movie_id
 WHERE EXTRACT(YEAR_MONTH FROM mr.created_at) = 202002
 GROUP BY m.movie_id
 ORDER BY AVG(mr.rating) DESC, title
 LIMIT 1);
```

Gemini Advanced: Unfortunately, it produced an incorrect solution, resulting in a syntax error. However, it did link to a relevant GitHub repository.

```sql
-- Gemini's solution
WITH MostProlificUser AS (
    SELECT
        u.name AS results,
        RANK() OVER (ORDER BY COUNT(*) DESC, u.name ASC) AS user_rank
    FROM
        MovieRating mr
    JOIN
        Users u ON mr.user_id = u.user_id
    GROUP BY
        u.user_id
),
MovieAvgRatings AS (
    SELECT
        m.title AS results,
        AVG(mr.rating) AS avg_rating,
        RANK() OVER (ORDER BY AVG(mr.rating) DESC, m.title ASC) AS movie_rank
    FROM
        MovieRating mr
    JOIN
        Movies m ON mr.movie_id = m.movie_id
    WHERE
        mr.created_at BETWEEN '2020-02-01' AND '2020-02-29'
    GROUP BY
        m.title
)
SELECT results
FROM (
    SELECT results, user_rank FROM MostProlificUser
    UNION ALL
    SELECT results, movie_rank FROM MovieAvgRatings
) combined_results
WHERE user_rank = 1 OR movie_rank = 1;
```

#### Round 1 Recap

In this round, the standout performer was Claude 3.5 Sonnet, which accurately answered all three questions with clear explanations. GPT-4o scored 2.5 points, losing half a point due to an inefficient query. Gemini Advanced earned 2 points due to a syntax error in the final question.

### Round 2: Business Logic

#### Bonus: Dataset Preparation

In this round, I will upload synthetic datasets to the three AI tools, providing limited descriptions, and allow them to interpret my business questions and write SQL queries. First, I will generate the synthetic data.

I also sought assistance from LLMs in this task.

ChatGPT-4o: It successfully generated four CSV files with download links, impressively creating both specified and additional context-relevant columns using the faker package.

Claude 3.5 Sonnet: Provided Python code for generating synthetic datasets using np.random methods, but it could not execute the code to provide CSV download links. The data produced felt less authentic without the use of the faker package.

Gemini Advanced: Performed poorly in this task, generating only tables of table descriptions. After clarification, it created fake tables with merely 10 rows each, refusing to provide larger datasets.

#### Load Datasets

With four synthetic datasets prepared, I attempted to upload them to the three AI tools, totaling 920 KB, with datasets ranging from 500 to 5,000 rows each.

ChatGPT-4o: Successfully loaded all four datasets with previews for each table, allowing for closer examination. Its file upload limit is 512MB per file and up to 10 files in one conversation.

Claude 3.5 Sonnet: Encountered an error when I tried to upload the four datasets, stating, “Conversation is 119% over the length limit.” This was surprising given its higher context window than ChatGPT-4o. Although the stated upload limit is “30MB per file (maximum 5 files)”, the effective limit appeared lower, possibly due to demand. I had to reduce the total file size to 320 KB by trimming over half the rows, after which it worked.

Gemini Advanced: Responded with, “I can’t assist you with that, as I’m only a language model and don’t have the capacity to understand and respond.” Consequently, I had to provide table descriptions instead of real datasets for the follow-up questions.

#### Question 4: Monthly Sales Calculation

Prompt: Please help me write a SQL query to calculate the total order amount by month from US users.

GPT-4o: Provided the correct answer with a filter country = 'United States'. It also proactively added an order status filter order_status = 'Completed', earning full points and a bonus for this insight.

Claude 3.5 Sonnet: Also delivered a correct query. For the “US users” filter, it wrote country = 'United States' OR country = 'US', correctly identifying potential data inconsistencies, which earned it full points and a bonus.

```sql
-- Claude's SQL code for reference
SELECT
    EXTRACT(YEAR FROM o.order_time) AS year,
    EXTRACT(MONTH FROM o.order_time) AS month,
    SUM(op.total_price) AS total_amount
FROM
    orders o
JOIN
    users u ON o.user_id = u.user_id
JOIN
    ordered_products op ON o.order_id = op.order_id
WHERE
    u.country = 'United States' OR u.country = 'US'
GROUP BY
    EXTRACT(YEAR FROM o.order_time), EXTRACT(MONTH FROM o.order_time)
ORDER BY
    year, month;
```

Gemini Advanced: Generated the correct query with the country = 'United States' filter, earning full points.

#### Question 5: New User Counts

Prompt: Please help me write a SQL query to calculate the number of first-time users each month. First-time users are users who placed their first order in that month.

GPT-4o, Claude 3.5 Sonnet, and Gemini Advanced: All three provided correct queries, earning full points.

```sql
-- Gemini's SQL code for reference
WITH first_orders AS (
    SELECT
        user_id,
        MIN(order_time) AS first_order_time
    FROM
        orders
    GROUP BY
        user_id
)
SELECT
    DATE_TRUNC('month', first_order_time) AS first_order_month,
    COUNT(DISTINCT user_id) AS num_first_time_users
FROM
    first_orders
GROUP BY
    first_order_month
ORDER BY
    first_order_month;
```

#### Question 6: Best-selling Product Categories

Prompt: Please help me write a SQL query to get the top 5 popular product categories by total order amount each month.

GPT-4o, Claude 3.5 Sonnet, and Gemini Advanced: All three provided correct queries using window functions, earning full points.

```sql
-- Gemini's code for reference
WITH category_orders AS (
    SELECT
        DATE_TRUNC('month', o.order_time) AS order_month,
        p.product_category,
        SUM(op.total_price) AS total_category_amount
    FROM
        orders o
    JOIN
        ordered_products op ON o.order_id = op.order_id
    JOIN
        products p ON op.product_id = p.product_id
    GROUP BY
        order_month, p.product_category
),
ranked_categories AS (
    SELECT
        order_month,
        product_category,
        total_category_amount,
        ROW_NUMBER() OVER (PARTITION BY order_month ORDER BY total_category_amount DESC) AS category_rank
    FROM
        category_orders
)
SELECT
    order_month,
    product_category,
    total_category_amount
FROM
    ranked_categories
WHERE
    category_rank <= 5
ORDER BY
    order_month, category_rank;
```

#### Question 7: Retention Rate

Prompt: Please help me write a SQL query to calculate the user retention rate each month. Retention Rate in month X = Out of the users who placed an order in month X-1, the % also placed an order in month X.

GPT-4o: Initially provided an incorrect answer that failed to meet the requirement, and it only produced a correct response after two follow-ups, so it earned 0 points.

```sql
-- GPT's initial code
WITH MonthlyOrders AS (
    SELECT
        o.user_id,
        DATE_FORMAT(STR_TO_DATE(o.order_time, '%m/%d/%y %H:%i'), '%Y-%m') AS order_month
    FROM
        orders o
    WHERE
        o.order_status = 'Completed'
    GROUP BY
        o.user_id, order_month
)
SELECT
    pmu.current_month,
    pmu.previous_month_users,
    ru.retained_users,
    CASE
        WHEN pmu.previous_month_users = 0 THEN 0
        ELSE ROUND(ru.retained_users * 100.0 / pmu.previous_month_users, 2)
    END AS retention_rate
FROM
    PreviousMonthUsers pmu
LEFT JOIN
    RetainedUsers ru ON pmu.current_month = ru.current_month
ORDER BY
    pmu.current_month;
```

Claude 3.5 Sonnet: Its initial solution also failed, but it was closer than GPT-4o's. After one follow-up, it corrected its code, earning 0.5 points.

Gemini Advanced: Initially failed to calculate the retention rate as required. After a follow-up, it adjusted its code, earning 0.5 points.
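For reference, the retention definition above can be expressed with a self-join on months. The sketch below is a minimal illustration I am including for readers, not any model's final answer, and it assumes the same orders table (user_id, order_time) used throughout this round.

```sql
-- Minimal retention-rate sketch (illustration only, not a model's answer).
-- Assumes an orders table with user_id and order_time columns.
WITH monthly_users AS (
    SELECT DISTINCT
        user_id,
        DATE_TRUNC('month', order_time) AS order_month
    FROM orders
)
SELECT
    prev.order_month + INTERVAL '1 month' AS month,
    -- denominator: distinct users who ordered in month X-1;
    -- numerator: the subset of them who also ordered in month X
    ROUND(COUNT(DISTINCT curr.user_id) * 100.0 / COUNT(DISTINCT prev.user_id), 2) AS retention_rate
FROM monthly_users prev
LEFT JOIN monthly_users curr
    ON curr.user_id = prev.user_id
   AND curr.order_month = prev.order_month + INTERVAL '1 month'
GROUP BY prev.order_month
ORDER BY month;
```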

#### Round 2 Recap

In this round, Claude 3.5 Sonnet led with 3.5 out of 4 points, plus 0.5 bonus points. ChatGPT-4o scored 3 out of 4 points, plus 0.5 bonus points, particularly excelling in creating and managing synthetic datasets. Gemini Advanced scored 3.5 out of 4 points but struggled with direct CSV uploads.

### Round 3: Query Optimization

In this final round, I focused on optimizing SQL queries using inefficient examples from my previous SQL optimization article, based on the same synthetic datasets.

#### Question 8: Only Select Necessary Columns

Expectations: It is better to select only necessary columns in a window function instead of SELECT *.
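To make the expectation concrete, the anti-pattern being tested looks roughly like the sketch below. This is an illustration reusing the Round 2 tables, not the exact query from my earlier article: SELECT * drags every column through the window function and the join even though only a few are needed downstream.

```sql
-- Illustrative anti-pattern only (not the exact original query).
-- SELECT * carries every orders column through the window function and the join,
-- even though only user_id, order_id, and price are needed downstream.
WITH first_order AS (
    SELECT *
    FROM orders
    QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_time) = 1
)
SELECT
    fo.user_id,
    fo.order_id,
    SUM(op.price) AS first_order_amount
FROM first_order fo
LEFT JOIN ordered_products op ON fo.order_id = op.order_id
GROUP BY fo.user_id, fo.order_id
```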

GPT-4o: Optimized the query by selecting relevant columns and suggested ensuring proper indexing for performance, earning full points.

Claude 3.5 Sonnet: Not only selected relevant columns but also included COALESCE(SUM(price), 0) AS first_order_amount to handle potential null cases, earning full points.

Gemini Advanced: Also optimized the query by selecting relevant columns and provided indexing recommendations, earning full points.

```sql
-- Claude's code for reference
WITH first_order AS (
    SELECT user_id, order_id, order_time
    FROM orders
    QUALIFY ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY order_time) = 1
)
SELECT
    fo.user_id,
    fo.order_id,
    COALESCE(SUM(op.price), 0) AS first_order_amount
FROM
    first_order fo
LEFT JOIN
    ordered_products op ON fo.order_id = op.order_id
GROUP BY
    fo.user_id, fo.order_id
```

#### Question 9: Pre-Aggregation

Expectations: Given users is a wide table with many columns, it is better to do the aggregation to calculate first_order_amount first, then join to the users table.
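For context, the unoptimized pattern joins the wide users table before any aggregation, roughly like the sketch below. This is an illustration only; it shows just user_id and country, while the real users table has many more columns that would all be carried through the window function and the GROUP BY.

```sql
-- Illustrative "before" pattern (not the exact original query).
-- The wide users table is joined before the aggregation, so its columns
-- must be carried through the window function and repeated in GROUP BY.
-- Only user_id and country are shown here; the real table is much wider.
WITH first_order AS (
    SELECT
        u.user_id,
        u.country,
        o.order_id
    FROM users u
    LEFT JOIN orders o ON u.user_id = o.user_id
    QUALIFY ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY o.order_time) = 1
)
SELECT
    fo.user_id,
    fo.country,
    COALESCE(SUM(op.price), 0) AS first_order_amount
FROM first_order fo
LEFT JOIN ordered_products op ON fo.order_id = op.order_id
GROUP BY fo.user_id, fo.country
```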

GPT-4o: Initially focused on formatting but later suggested performing pre-aggregation first, earning 0.5 points.

Claude 3.5 Sonnet: Recommended pre-aggregation and proposed moving the large table join to the final step for optimization, earning full points.

```sql
-- Claude's code
WITH first_order AS (
    SELECT
        u.user_id,
        o.order_id,
        o.order_time
    FROM
        users u
    LEFT JOIN
        orders o ON u.user_id = o.user_id
    QUALIFY ROW_NUMBER() OVER(PARTITION BY u.user_id ORDER BY o.order_time) = 1
),
first_order_amount AS (
    SELECT
        fo.user_id,
        COALESCE(SUM(op.price), 0) AS first_order_amount
    FROM
        first_order fo
    LEFT JOIN
        ordered_products op ON fo.order_id = op.order_id
    GROUP BY
        fo.user_id
)
SELECT
    u.*,
    COALESCE(foa.first_order_amount, 0) AS first_order_amount
FROM
    users u
LEFT JOIN
    first_order_amount foa ON u.user_id = foa.user_id
```

#### Question 10: De-duplication

Expectations: This query uses DISTINCT, UNION, and ROW_NUMBER() for de-duplication. All three operations are expensive and do the same thing here, so we should eliminate the duplicate deduping efforts.

GPT-4o: Optimized the code but altered the output structure. After clarification, it provided optimized code with the correct columns, earning 0.5 points.

Claude 3.5 Sonnet: Combined two window function CTEs into one and used CASE WHEN + WHERE rn_first=1 OR rn_last=1 for output with the same structure, earning full points.

Gemini Advanced: Combined two window functions into one but got the first and last order ID using DISTINCT, requiring two table scans, and received 0.5 points.
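To illustrate the single-scan pattern described for Claude, here is a minimal sketch; the column names are my assumptions for illustration, not Claude's exact output.

```sql
-- Minimal sketch of the single-scan de-duplication pattern (assumed column
-- names; not Claude's exact output). One window CTE produces both the first
-- and last order per user, replacing separate DISTINCT / UNION / ROW_NUMBER passes.
WITH ranked_orders AS (
    SELECT
        user_id,
        order_id,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_time ASC)  AS rn_first,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_time DESC) AS rn_last
    FROM orders
)
SELECT
    user_id,
    MAX(CASE WHEN rn_first = 1 THEN order_id END) AS first_order_id,
    MAX(CASE WHEN rn_last = 1 THEN order_id END)  AS last_order_id
FROM ranked_orders
WHERE rn_first = 1 OR rn_last = 1
GROUP BY user_id
```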

#### Round 3 Recap

In this round of query optimization, Claude 3.5 Sonnet was the clear winner, earning full points on all questions with innovative solutions. ChatGPT-4o needed guidance on two questions but eventually answered all correctly, earning 2 points. Gemini Advanced faced one syntax error and produced less optimized code, resulting in 1.5 points.

## Summary

Claude 3.5 Sonnet (10 points)
- Demonstrated exceptional performance in both SQL generation and optimization, quickly correcting errors after clarification. I would wholeheartedly recommend Claude for SQL query assistance.
- Its user interface is preferable for formatted text input.
- The project functionality allows for sharing knowledge at the project level, making it convenient for team collaboration.
- However, its lower file upload limit could pose a challenge for sharing larger datasets.

GPT-4o (8 points)
- Capable of generating syntactically correct SQL and excels in business logic, though it lags in query optimization.
- A key advantage is its ability to load datasets directly in the UI, enhancing data understanding and exploration.
- Its capability to generate synthetic datasets with download links is also noteworthy.

Gemini Advanced (7 points)
- Underperformed compared to Claude and ChatGPT but still achieved a 70% success rate.
- Its ability to search the web and provide reference links could be beneficial, but the inconsistent relevance of those links may undermine its credibility.
- Integrates well with Google Workspace products, such as Google Sheets.

## What’s Next

Having compared SQL capabilities, I plan to assess other data science skills in upcoming articles:
- Exploratory Data Analysis (EDA)
- Visualization
- Machine Learning
- Text Analytics
- Data Science Business Cases

Please share any additional areas you would like me to cover!

Interested in AI and data science? Explore these additional articles:
- Can ChatGPT replace a data scientist? My perspective.
- An evaluation of ChatGPT’s interactive tables and charts features.
- Utilizing OpenAI API for text categorization and summarization: my coding example.
- Discover Google’s new AI notebook. Check it out here!
