Harvard and MIT have teamed up to
develop an artificial-intelligence system that grades essay questions on
exams. The way it works is
this. First, a human grader manually
grades a hundred essays, and feeds the essays and the grades to the
computer. Then the computer
allegedly learns to imitate the grader, and goes on to grade the rest of the
essays a lot faster than any manual grader could—so fast, in fact, that often
the system provides students nearly instant feedback on their essays, and a
chance to improve their grade by rewriting the essay before the final grade is
assigned. So we have finally gotten to the point of grading essays by algorithm, which, after all, is the only thing computers can do.
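The paragraph above describes, in outline, ordinary supervised learning: turn each graded essay into numeric features, fit a model to the human-assigned grades, and let the model score the rest. Here is a minimal sketch of that general idea in Python; the sample data, the bag-of-words features, and the ridge-regression model are illustrative assumptions of mine, not edX's actual method.

```python
# Minimal sketch of the "learn from a human grader" recipe (not edX's
# actual system): fit a model on human-graded essays, then predict
# grades for the rest.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# Hypothetical data: essays graded by a human on a 0-100 scale.
graded_essays = ["The causes of the war were...", "In my opinion..."]
human_grades = [88, 72]
ungraded_essays = ["The treaty failed because..."]

# Turn essay text into numeric features the model can work with.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(graded_essays)

# Learn to imitate the human grader from the graded examples.
model = Ridge()
model.fit(X_train, human_grades)

# Grade the remaining essays almost instantly.
predicted = model.predict(vectorizer.transform(ungraded_essays))
print(predicted)
```

A toy model along these lines can score thousands of essays in seconds, which is where the near-instant feedback comes from.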
Joshua Schulz, a philosophy
professor at DeSales University, doesn't think much of using machines to grade
essays. His criticisms appear in the latest issue of The New Atlantis, a quarterly on technology and society, where he accuses the software developers of "functionalism."
Functionalism is a theory of the mind that says, basically, the mind is nothing more than what the mind does. So if you have a human being who can grade essays and a
computer that can grade the same essays just as well, why, then, with regard to
grading essays, there is no essential difference between the two.
With all due respect to Prof.
Schulz, I think he is speculating, at least when he supposes that the
essay-grading-software developers espouse a particular theory of the mind, or
for that matter, any theory of the mind whatsoever. The head of the consortium that developed the software is an
electrical engineer, not a philosopher.
Engineers as a group are famously impatient with theorizing, and simply
use whatever tools fall to hand to get the job done. And that's what apparently happened here. Problem: tons and tons of essay questions and not enough skilled
graders to grade them. Solution:
an automated essay grader whose
output can't be distinguished from the work of skilled human graders. So where is the beef?
The thing that bothers Prof. Schulz
is that the use of automated essay-grading tends to blur the distinction
between the human mind and everything else. And here he touches on a genuine concern: the tendency of large bureaucracies to
turn matters of judgment into automatic procedures that a machine can perform.
Going to extremes can make a point
clearer, so let's try that here.
Suppose you are unjustly accused of murder. By some unlikely coincidence, you drive a car of the same make as the getaway car used by a bank robber who shot and killed three people, and its license plate number matches yours except for the last two digits, which the eyewitness to the crime didn't remember. The detectives on the case didn't find
the real bank robber, but they did find you. You are arrested, and in due time you enter the courtroom to
find seated at the judge's bench, not a black-robed judge, but a computer
terminal at which a data-entry clerk has entered all the relevant data. The computer determines that
statistically, the chances of your being guilty are greater than the chances
that you're innocent, and the computer has the final word. Welcome to Justice 2.0.
Most people would object to such a
delicate thing as a murder trial being turned over to a machine. But nobody has a problem with lawyers
who use word processors or PowerPoint slides in their courtroom presentations. The difference is that when computers
and technology are used as tools by humans exercising that rather mysterious
trait called judgment, no one being judged can blame the machines for an unjust
judgment, because the persons running the machines are clearly in charge.
But when a grade comes out of a
computer untouched by human hands (or unseen by human eyes until the student
gets the grade), you can question whether the grader who set the example for
the machine is really in charge or not.
Presumably, there is still an appeals process in which a student could
protest a machine-assigned grade to a human grader, and perhaps this type of
system will become more popular and cease to excite critical comments. If it does, we will have moved another
step along the road that further systematizes and automates interactions that
used to be purely person-to-person.
Something similar has happened in a
very different field:
banking. My father was a
loan officer for many years at a small, independent bank. He never finished college, but that
didn't keep him from developing a finely honed gut feel for the
credit-worthiness of prospective borrowers. He wouldn't have known an algorithm if it walked up and
introduced itself, but he got to know his customers well, and his personal
interactions with them were what he based his judgment on. He would guess wrong once in a great while,
but usually because he allowed some extraneous factor to sway his judgment. For example, once my mother asked him
to lend money to a work colleague of hers, and it didn't work out. But when he stuck to the things he knew he should pay attention to, he did pretty well.
Recently I had occasion to borrow some money from one of the largest national banks in the U.S., and it
was not a pleasant experience. I
will summarize the process by saying it was based about 85% on a bunch of
numbers that came out of computer algorithms that worked from objective
data. At the very last step in the
process, there were a few humans who intervened, but only after I had jumped
through a long series of obligatory hoops that allowed the bankers to check off
"must-do" boxes. If even
one of those boxes had been left blank, no judgment would have been
required—the machine would say no, and that would have been the end of it. I got the strong impression that the
people were there mainly to serve the machines, and not the other way around.
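To make the point about "must-do" boxes concrete, here is a small sketch of how that style of screening works; the item names and cutoffs are invented for illustration and have nothing to do with any actual bank's criteria. If any mandatory item is blank or fails its rule, the machine says no before a human ever exercises judgment.

```python
# Illustrative sketch (not any real bank's system) of checkbox-style
# loan screening: one blank or failed item and the answer is no.
REQUIRED_ITEMS = {
    "credit_score": lambda v: v is not None and v >= 680,      # hypothetical cutoff
    "income_verified": lambda v: v is True,
    "debt_to_income": lambda v: v is not None and v <= 0.43,   # hypothetical cutoff
    "years_at_address": lambda v: v is not None and v >= 2,
}

def machine_screen(application: dict) -> str:
    """Decline if any mandatory box is blank or fails its rule;
    only applications that clear every box reach a human."""
    for item, passes in REQUIRED_ITEMS.items():
        if not passes(application.get(item)):
            return f"declined (failed: {item})"
    return "forwarded to a human for final review"

# Example: years_at_address is missing, so the machine says no,
# and that is the end of it.
print(machine_screen({"credit_score": 710, "income_verified": True,
                      "debt_to_income": 0.35}))
```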
The issue boils down to whether you
think there is a genuine essential difference between humans and machines. If you do, as most people of faith do,
then no non-human should judge a human about anything important, whether it's
for borrowing money, assigning a grade, or going to jail. If you don't think there's a
difference, there's no reason at all why computers can't judge people, except
for purely performance-based factors such as the machines not being good enough
yet. Let's just hope that the
people who think there's no difference between machines and people don't end up
running all the machines. Because
there's a good chance that soon afterwards, the machines will be running the
people instead.
Sources: The Winter 2014 issue of The New Atlantis carried Joshua Schulz's article "Machine Grading and Moral Learning" on pp. 109-119. The New York Times article from which Prof. Schulz learned about the AI-based essay-grading system is available at http://www.nytimes.com/2013/04/05/science/new-test-for-computers-grading-essays-at-college-level.html. The Harvard-MIT consortium's name is edX.
Note to Readers: In my blog post of June 16, 2014, I asked readers to comment on the question of monetizing this blog. Of the three or four responses
received, all but one were mostly positive. I have decided to attempt it at some level, always subject
to reversal if I think it's going badly.
So in the coming weeks, you may see some changes in the blog format, and
eventually some ads (I hope, tasteful ones) may appear. But I will try to preserve the basic
format as it stands today as much as possible.