ChatGPT can code

A while ago I played around with ChatGPT and found it sorta interesting. It doesn’t take long to find the seams, however. For instance, I asked it:

How much wood would a woodchuck chuck if a woodchuck could chuck wood?

It answered:

The question of how much wood a woodchuck would chuck if a woodchuck could chuck wood is a popular tongue twister that has been around for many years. However, there is no definitive answer to this question, as it is a fictional scenario. Woodchucks, also known as groundhogs, are burrowing animals and are not known for their ability to chuck wood. They primarily feed on plants and do not have the physical adaptations or behaviors that would enable them to chuck wood.

Several steps up from Eliza, but clearly not in the same league as The Internet Oracle.¹

Then I read Cheating is All You Need by Steve Yegge. He demonstrated writing an Emacs Lisp program that works out of the box by simply asking ChatGPT. I tried it myself and got a similar result. So I decided to use AI to solve an actual problem I’m facing.

By the end of March, colleges and universities in the United States will have announced their admission decisions to students. On the first of May, students will need to make their decisions about which offer to accept. So for the month of April I’d like to challenge students to update their admission status on College Confidential. I decided to set up a 32-team, single-elimination tournament where schools are pitted against each other to see which has the most status updates.

In order to generate the tournament draw, I started with my usual trick of asking Google.

I got some Stack Overflow answers and DBA Stack Exchange results. I even found a useful formula from Math Stack Exchange. But I just couldn’t work out how to apply these answers to my situation. So I asked ChatGPT:

Please write an SQL query to create a seeded 32-team tournament draw.

I’m going to cut out the code explanation² and just post the code it wrote:

CREATE TABLE tournament_draw (
    match_id INT PRIMARY KEY,
    team1 VARCHAR(50) NOT NULL,
    team2 VARCHAR(50) NOT NULL
);

INSERT INTO tournament_draw (match_id, team1, team2)
VALUES 
    (1, (SELECT team_name FROM teams WHERE seed = 1), (SELECT team_name FROM teams WHERE seed = 32)),
    (2, (SELECT team_name FROM teams WHERE seed = 16), (SELECT team_name FROM teams WHERE seed = 17)),
    (3, (SELECT team_name FROM teams WHERE seed = 9), (SELECT team_name FROM teams WHERE seed = 24)),
    (4, (SELECT team_name FROM teams WHERE seed = 8), (SELECT team_name FROM teams WHERE seed = 25)),
    (5, (SELECT team_name FROM teams WHERE seed = 5), (SELECT team_name FROM teams WHERE seed = 28)),
    (6, (SELECT team_name FROM teams WHERE seed = 12), (SELECT team_name FROM teams WHERE seed = 21)),
    (7, (SELECT team_name FROM teams WHERE seed = 13), (SELECT team_name FROM teams WHERE seed = 20)),
    (8, (SELECT team_name FROM teams WHERE seed = 4), (SELECT team_name FROM teams WHERE seed = 29)),
    (9, (SELECT team_name FROM teams WHERE seed = 3), (SELECT team_name FROM teams WHERE seed = 30)),
    (10, (SELECT team_name FROM teams WHERE seed = 14), (SELECT team_name FROM teams WHERE seed = 19)),
    (11, (SELECT team_name FROM teams WHERE seed = 11), (SELECT team_name FROM teams WHERE seed = 22)),
    (12, (SELECT team_name FROM teams WHERE seed = 6), (SELECT team_name FROM teams WHERE seed = 27)),
    (13, (SELECT team_name FROM teams WHERE seed = 7), (SELECT team_name FROM teams WHERE seed = 26)),
    (14, (SELECT team_name FROM teams WHERE seed = 10), (SELECT team_name FROM teams WHERE seed = 23)),
    (15, (SELECT team_name FROM teams WHERE seed = 15), (SELECT team_name FROM teams WHERE seed = 18)),
    (16, (SELECT team_name FROM teams WHERE seed = 2), (SELECT team_name FROM teams WHERE seed = 31))
;

If I were teaching SQL, I guess I’d have to give this answer a C. It (probably) works, but it’s so naively written. A halfway experienced programmer would wonder why this isn’t a loop.³ A more experienced programmer would notice this is in no way idiomatic SQL code. Still, it’s impressive that an AI could conceivably pass a programming class.

I figured it would be easier to test the code if I told it the name of the table that holds the data:

can you make seeds based on rank in a schools_profile table?

And here’s the response:

CREATE TABLE tournament_draw (
    match_id INT PRIMARY KEY,
    team1 VARCHAR(50) NOT NULL,
    team2 VARCHAR(50) NOT NULL
);

INSERT INTO tournament_draw (match_id, team1, team2)
SELECT ROW_NUMBER() OVER(ORDER BY rank) as match_id, 
       sp1.school_name as team1, 
       sp2.school_name as team2
FROM schools_profile sp1
JOIN schools_profile sp2 ON sp1.rank = sp2.rank + 15
WHERE sp1.rank <= 16;

Now this code is a lot better, but it has a bug. If you know SQL, I’ll give you a few moments to find the bug while I talk about this result.

Notice that I didn’t need to explain I wanted a 32-team draw again. Because it retains context, talking to the chat bot resembles having a conversation with an actual person. Human language depends so much on context!⁴ For me that’s a strong argument for having your AI coding assistant be in chat format.

It doesn’t know my code, however. I do have a schools_profile table, but the column I need is called name rather than school_name. Not a bad guess, but I do think it would be more helpful if I could feed it all my code so that it would know what to name that column.⁵ I was also hoping to use the RANK() function rather than a column called rank. A small thing, but it shows how language can be ambiguous. A human who didn’t know the details of the table structure could make the same mistake.

Have you spotted the error? The problem is in these two lines of code:

FROM schools_profile sp1
JOIN schools_profile sp2 ON sp1.rank = sp2.rank + 15

This is also the cleverest bit. It uses the same database table (schools_profile) twice. The first time (sp1) it uses the table to get the top seed and the second time (sp2) it gets the matching lower seed. The bug is that it matches the 1 seed with the 16 seed rather than the 32 seed. That’s a problem because the whole point of seeding is to give the top teams a better chance to make it to the final rounds. It also fails because it only matches the 1 and 16 seeds. There are no other match-ups given the restrictions in the WHERE-clause. The fix is this slightly more complicated code:

FROM schools_profile sp1
JOIN schools_profile sp2 ON sp1.rank = 33-sp2.rank

That change took me just a moment to write and it’s what I used in my actual code. I’m honestly pleased with the outcome. But it also has me convinced that we are nowhere close to the singularity.
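Putting the fix back into ChatGPT's full query gives something like the sketch below. It is written for PostgreSQL and makes a few assumptions that aren't in the original answer: the sample rows are invented, the rank column is assumed to hold 1 through 32 with no ties, and the school name comes from the name column mentioned earlier. It also qualifies the ROW_NUMBER() ordering as sp1.rank, since a bare rank is ambiguous once the table is joined to itself.

```sql
-- Hypothetical sample data: 32 schools ranked 1..32 (PostgreSQL syntax).
CREATE TABLE schools_profile (
    name VARCHAR(50) NOT NULL,
    rank INT NOT NULL UNIQUE
);

INSERT INTO schools_profile (name, rank)
SELECT 'School ' || g.n, g.n
FROM generate_series(1, 32) AS g(n);

CREATE TABLE tournament_draw (
    match_id INT PRIMARY KEY,
    team1 VARCHAR(50) NOT NULL,
    team2 VARCHAR(50) NOT NULL
);

-- Pair rank n with rank 33 - n, so 1 plays 32, 2 plays 31, ..., 16 plays 17.
INSERT INTO tournament_draw (match_id, team1, team2)
SELECT ROW_NUMBER() OVER (ORDER BY sp1.rank) AS match_id,
       sp1.name AS team1,
       sp2.name AS team2
FROM schools_profile sp1
JOIN schools_profile sp2 ON sp1.rank = 33 - sp2.rank
WHERE sp1.rank <= 16;  -- keep each pairing once, listed from the top seed's side
```

With the sample data this fills tournament_draw with 16 rows, starting with School 1 against School 32 and ending with School 16 against School 17. Note that RANK is a reserved word in some databases (MySQL 8, for example), where the column name would need quoting.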

How I Learned to Stop Worrying and Love the Singularity

To make the strongest argument I can that this technology could cause destruction and chaos, consider this sequence of events:

1. Someone writes a computer virus that integrates GPT (or a yet more advanced AI).
2. The AI uses infected computers to copy itself to other computer systems.
3. If humans start to suspect what’s going on, the AIs (which have access to their hosts’ electronic communications) will cover their tracks to avoid detection.
4. The various AI programs work together to cause havoc.

Those of us who used Windows XP know that step 2 isn’t all that much of a stretch. If a simple worm like Sasser can cause so much damage, imagine what a convincing artificial intelligence could do. The worrisome part is that it’s so easy to generate random working code using GPT that survival of the fittest could be an accurate model. As we all know from COVID, variants can be even more virulent than the original.

It’s step 3 that frightens me most. Suppose IT sends an email to a user because they see suspicious activity on their laptop. If the AI virus sees the email before the user does, it could write back, archive the email thread and cut out whatever activity IT noticed until the heat is off. I gave ChatGPT the prompt “Write an email to IT explaining that you’ve removed the virus from your computer”. Here’s what it produced:

Subject: Removal of Virus from my Computer

Dear IT Team,

I hope this email finds you well. I am writing to inform you that I have successfully removed a virus from my computer. I noticed some strange behavior on my computer, and after running a virus scan, I discovered that a malicious software had infected my system.

As soon as I discovered the virus, I took immediate action to remove it. I used my antivirus software to run a full scan of my computer, and the software successfully detected and removed the virus.

I am pleased to inform you that my computer is now functioning normally, and I have not noticed any further unusual activity. However, I thought it was important to bring this to your attention in case there were any wider implications that may affect the network.

Please let me know if there is anything else I need to do to ensure that my computer is secure, or if there are any additional steps that need to be taken to safeguard the network.

Thank you for your time and attention to this matter.

Best regards,

[Your Name]

It has a certain “boring conversation anyway” vibe, but I think most readers would believe that a human wrote this and had, indeed, removed the virus. Or at least they would until they got to the “[Your Name]” line, which gives the game away. It’s easy enough to fix by changing the prompt. In fact, this is my second attempt at generating an email. The first attempt lacked “to IT” and began “Dear [Recipient],”!

We might call AI systems like ChatGPT “artificial sophists”. As Plato remarked, “they appear to their pupils to be wise in all things . . . though they are not; for that was shown to be impossible.” Under all their sophistication, they use a trick that Claude Shannon described in 1948: a Markov chain. You’ve seen a simple example if you’ve ever used autocomplete on your phone.

How does your phone know that you were going to say “Looks like”? It doesn’t. Instead it knows that when other people have typed “Looks”, they frequently type “like” or “good” or “great” next. You’ll not see a less popular construction (“Looks to be”, for instance) because it’s statistically unlikely that’s what you want. Modern AI systems have improved the basic technique, use huge training data sets and can bring far more computing power⁶ to the problem than was possible in 1948. But the statistical trick is still the core of what these systems are doing.
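The statistical trick can be sketched in SQL: record which word follows which, then suggest the most frequent followers. Everything here is invented for illustration, including the typed_pairs table and its rows; it is a toy first-order model, not a description of how any real keyboard or GPT is implemented.

```sql
-- Hypothetical training data: one row per observed (word, next word) pair.
CREATE TABLE typed_pairs (
    word      VARCHAR(50) NOT NULL,
    next_word VARCHAR(50) NOT NULL
);

INSERT INTO typed_pairs (word, next_word) VALUES
    ('Looks', 'like'),  ('Looks', 'like'),  ('Looks', 'like'),  ('Looks', 'like'),
    ('Looks', 'good'),  ('Looks', 'good'),  ('Looks', 'good'),
    ('Looks', 'great'), ('Looks', 'great'),
    ('Looks', 'to');

-- Suggest the three words most often seen after "Looks".
SELECT next_word, COUNT(*) AS times_seen
FROM typed_pairs
WHERE word = 'Looks'
GROUP BY next_word
ORDER BY times_seen DESC, next_word
LIMIT 3;
```

With these rows the query returns like, good and great, in that order. “to” was seen only once, so it never makes the suggestion bar, which is the “Looks to be” case above.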

It’s not that this new batch of AI systems can’t be used for bad purposes; they will be. Instead, I’d argue they still require human input to do anything, whether malignant or beneficial. Working with these systems is like debugging code. For instance, I thought it would be fun to illustrate this post with a DALL·E-generated image that riffs off the shot of Slim Pickens riding the bomb in Dr. Strangelove.

My prompt was “Joyous Slim Pickens waving his hat and riding a computer, black and white”. None of the 4 generated images have the man holding his hat. Three of them have him waving his hand. All of the images have him using a computer rather than sitting on it. I suspect I’d get better results with something like “Joyous Slim Pickens holding his hat and sitting astride a computer, black and white”. As far as I can tell, there’s no substitute for human introspection.

My code, which might actually be useful for other people

The main point of this post is over, but if you happen to have a Discourse forum and want to set up a single-elimination tournament between categories seeded by posts over the last year, here’s what I used:

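WITH teams AS (
    -- Seed school categories by posts over the last year (1 = most active)
    SELECT rank() OVER (ORDER BY posts_year DESC) AS seed,
           c.id AS category_id
    FROM categories c
    WHERE read_restricted IS FALSE
          AND parent_category_id = 77 -- Only schools
          AND name NOT LIKE '% GENERAL'
    GROUP BY c.id
    ORDER BY seed -- without this, LIMIT may keep an arbitrary 32 rows
    LIMIT 32
)
-- Pair seed n with seed 33 - n: 1 v 32, 2 v 31, ..., 16 v 17
SELECT t1.seed, t1.category_id,
       t2.seed, t2.category_id
FROM teams t1
     JOIN teams t2 ON t1.seed = 33 - t2.seed
--   JOIN teams t2 ON t1.seed = t2.seed + 15 -- ChatGPT's original, buggy join
WHERE t1.seed <= 16
ORDER BY t1.seed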

You’ll want to adjust the WHERE-clause for your situation. And here’s the bracket I generated from College Confidential school discussions:

seed	category	seed	category
1	Texas A&M University	32	Northwestern University
2	New York University	31	University of California - Santa Barbara
3	Northeastern University	30	University of Chicago
4	University of California - Berkeley	29	Clemson University
5	University of Michigan	28	Syracuse University
6	University of Southern California	27	University of Maryland - College Park
7	Cal Poly San Luis Obispo	26	University of Florida
8	University of Pittsburgh	25	Duke University
9	Boston University	24	Purdue University
10	University of California - Los Angeles	23	University of Massachusetts - Amherst
11	The University of Texas at Austin	22	University of Washington
12	Penn State	21	University of California - Davis
13	Yale University	20	University of California - Santa Cruz
14	Vanderbilt University	19	San Diego State University
15	Georgia Institute of Technology	18	University of California - San Diego
16	Cornell University	17	Indiana University - Bloomington

We’re going to use a slightly modified bracket to match an internal list I generated earlier in the year. The competition starts next Monday.

¹ For those who don’t know, ELIZA is an early natural language system that emulated a psychotherapist and The Internet Oracle is a system for asking a question that is answered by another, anonymous user. The first simulates a human and the second is an actual human who takes on a humorous role.

² I think those explanations are valuable for the same reason I think they are valuable on Stack Overflow. Unfortunately, GPT tends to tediously explain things that aren’t really the point, as with the woodchuck question.

³ This is an imperative approach to programming. SQL is a declarative language so a loop is the wrong tool here. My point is an experienced programmer sees a pattern (increasing numbers in this example) and knows to try to abstract it.

⁴ This is one of the things that makes Twitter so toxic. I can easily find something someone wrote, share one small part stripped from context and recontextualize to make the other person look foolish or evil.

⁵ Note to anyone from my company’s compliance team: I’m not planning on doing this!

⁶ It might help to quote Shannon here:

> The first two samples were constructed by the use of a book of random numbers in conjunction with (for example 2) a table of letter frequencies. This method might have been continued for (3), (4) and (5), since digram, trigram and word frequency tables are available, but a simpler equivalent method was used.
>
> To construct (3) for example, one opens a book at random and selects a letter at random on the page. This letter is recorded. The book is then opened to another page and one reads until this letter is encountered. The succeeding letter is then recorded. Turning to another page this second letter is searched for and the succeeding letter recorded, etc. A similar process was used for (4), (5) and (6). It would be interesting if further approximations could be constructed, but the labor involved becomes enormous at the next stage.