After writing about AI for years, I still hadn't had what you might call a serious encounter with it in its personalized form. Anyone who uses Google has probably been offered their "AI summary" before the conventional search results. I've found these summaries helpful sometimes and not so helpful other times, but I haven't sought them out.
What made me turn the corner was something a friend sent me by a software engineer named Matt Shumer, whose essay "Something Big is Happening" appeared on the Fortune website on Feb. 11. Shumer's point was that the latest iterations of AI systems are so much more capable than what has gone before that whole swathes of what George Gilder calls "symbolic manipulators"—lawyers, engineers, judges, doctors, architects, you name it—now face a radical choice. Either embrace AI and by doing so outperform your peers by orders of magnitude, or turn away from it and watch your career flame out. That's a little exaggerated, but not much.
This reinforced something another friend has told me about his own personal use of AI: that it has benefited his writing and research greatly, acting as a mostly trustworthy assistant that summarizes large bodies of literature and helps him clarify his thoughts. The biggest problem this friend has had with it is that it tends to be sycophantic and flatter him excessively. But he sat down with it one day and told it to refer to him as "the researcher" and to itself as "the AI system," instead of "you" and "me," and things got better.
So I opted for a paid version of Anthropic's AI product, which Shumer said was significantly better than the free version, and gave it a major task that I would ordinarily give to a grad student, if I had one (funding is very hard to find in my research area).
The job involved reading about a thousand rows of data in a big spreadsheet and answering some yes/no questions about each row. The data took the form of comments submitted by various individuals in response to questions. I gave the AI system examples to follow and what I thought were pretty detailed instructions.
All this was in the form of the usual chat format, with me and the AI system taking turns typing into chat boxes.
After a misunderstanding in which I thought the system was working on the problem and it thought I hadn't told it to start yet, it got to work and spat out various things like "Ran 6 commands . . . Examine the spreadsheet structure" and so on.
It was done in about ten minutes. Then I spent an hour or so going over its work.
I wish I could say I couldn't have done better myself. But I could have, by a long shot.
I didn't exhaustively examine all 700 rows of entries that the system produced—that would have taken many hours, about as long as it would take me to just do the job myself. So I sampled every tenth row of the first hundred rows, ten rows in all, to see how the thing did.
In looking at ten rows, I found nine mistakes. This is not a good average.
In the system's defense, this is absolutely the first time I've ever tried anything like this. I could go back and get a lot more explicit about the rules for answering the yes/no questions about each row, and let it try again. But in comments online about this particular version of AI (Sonnet 4.6, I think), some people said that you get results faster, but you have to fix problems more often. That is consistent with my experience.
Good things about this exercise: the system basically grasped what I wanted and turned the whole job around in about ten minutes. But speed isn't everything.
The not-so-good aspects include the errors, and a kind of weird fawning or flattery I also noticed. I'd call it gushing: at one point it spontaneously responded, "This is a genuinely exciting dataset — ball lightning is one of the most mysterious atmospheric phenomena ever reported!"
I suppose that sort of thing has been cultivated by the AI's keepers, probably to keep the user engaged, or encouraged, or something. I found myself wishing that instead, they had adopted the mien of Joe Friday in the old Dragnet series. Friday was famed for his flat "Just the facts, ma'am" delivery, and that seems more in keeping with a system that supposedly can tackle highly sophisticated and challenging jobs of major import.
But like everybody else who doesn't work for Anthropic or the other four or five leading AI companies, I will simply have to take what I can get and deal with the negative aspects as well as I can.
Will I try again? Probably, but maybe with a different task. As part of my signup process, Anthropic has been emailing me little suggestions of other things to try: writing recipes, managing emails, creating content, solving problems, visualizing data, or helping me decide whether to go to Portugal or Spain on vacation (no-brainer for me: Spain, but I don't have time right now).
I am not especially tempted to try any of these suggestions just yet. But I do admit that if I can get the thing to turn out useful work, it could be worth what I spent on it. I paid for a year's subscription in advance, perhaps not the wisest thing to do, but I'm the type of person who is motivated to get his money's worth, and spending the money in advance may keep me engaged when nothing else would.
I see that Anthropic just had a dustup with the Pentagon, which banned its use within the armed forces as punishment for non-cooperation, or something. Now that we are apparently in a war with Iran, the leaders of Anthropic may feel glad that their product isn't part of the war. But not all battles are fought with bombs and bullets, and I have a feeling that the greatest battles involving AI are yet to come.
Sources: The essay by Matt Shumer, who runs an AI applications company, appeared on Feb. 11, 2026 at https://fortune.com/2026/02/11/something-big-is-happening-ai-february-2020-moment-matt-shumer/.