I have begun to experiment with AI in my personal projects. It is useful as an advanced form of autocomplete, often recognizing patterns and suggesting what I intended to type anyway. That being said, I try to use it only for small bits of code at a time (usually no more than one line), and I try to maintain a healthy skepticism of what it produces.
As with code reviews and snapshot testing, AI’s benefits can be far outweighed by its costs if the human coder doesn’t bother to double-check its output. It is easy to skim past that output, however, because the brain takes the opportunity to relax.
To keep myself from ceding too much of my responsibility for correct code to the whims of an opaque mathematical box, I’ve decided to keep a list of the worst bugs I’ve seen an AI produce, so that I can revisit it if I ever catch myself slipping.
I was writing my own logger (for good reason, which I may eventually write about) which stores log messages as an array of arrays of strings: [ ["a", "b"], ["c", "d"] ]. When this logger writes these messages to the console, it clears all the strings so as not to write them again, but it needs to keep the basic tree structure of the array: [ [], [] ].
To do this, I decided to generate an entirely new tree of arrays, rather than clear the old ones. With some modifications for clarity, GitHub Copilot offered me this:
this.logs = new Array(oldLength).fill([]);
Do you see the bug? It could easily slip past some readers, although I expect most professional developers to spot it once told that a bug exists (especially given the title of this section).
The code .fill([]) instantiates only one array, and a reference to that same array is placed in every position of the root array. This means that, though the tree of arrays appears correct at first, the next log pushed into any subarray shows up at every position:
this.logs = new Array(4).fill([]); // looks like [ [], [], [], [] ]
this.logs[0].push("a");
// logs is now [ ["a"], ["a"], ["a"], ["a"] ];
This bug was interesting to me specifically because it is so obscure. It’s somewhat easy to spot in isolation, but if this line of code were to land in the middle of a large chunk of code, and if I weren’t rigorously vetting that code, I can’t say for certain I would spot it.
On top of that, it would be unlikely that I would write any unit tests that would find this bug. The setup for that unit test would have to introduce layers (to add more subarrays), write the logs (to clear the arrays), add a new log, and then check that the logs were as expected. That’s a lot of steps for a unit test in a home project.
When this bug manifests, I might not even notice that it’s happening. The logs would simply be filled with repeated entries. If I were scanning down a wall of text, I doubt I would bat an eye at some duplicate messages. This bug could easily trip me up when trying to solve an issue, but I wouldn’t immediately know that I was looking at a bug. I would probably spend a long time trying to find out why the code being logged was running in a weird way before I ever suspected the logger itself was wrong.
Basically, this bug is exactly why I maintain a healthy skepticism of AI in software development. It’s why I try not to generate more than one line of code at a time, and it’s why I try to always thoroughly vet any code it generates. I still find AI autocompletion to be a step up in terms of productivity, but probably by less than many other coders using AI, because I only allow it to save me some time in typing code out, not in designing the code or reasoning about the written code.