New York Times technology columnist Kevin Roose has a question.
“Is there any threshold at which an A.I. would start to deserve, if not human-level rights, at least the same moral consideration we give to animals?”
[W]hen I heard that researchers at Anthropic, the A.I. company that made the Claude chatbot, were starting to study “model welfare” — the idea that A.I. models might soon become conscious and deserve some kind of moral status — the humanist in me thought: Who cares about the chatbots? Aren’t we supposed to be worried about A.I. mistreating us, not us mistreating it…?
But I was intrigued… There is a small body of academic research on A.I. model welfare, and a modest but growing number of experts in fields like philosophy and neuroscience are taking the prospect of A.I. consciousness more seriously, as A.I. systems grow more intelligent…. Tech companies are starting to talk about it more, too. Google recently posted a job listing for a “post-AGI” research scientist whose areas of focus will include “machine consciousness.” And last year, Anthropic hired its first A.I. welfare researcher, Kyle Fish… [who] believes that in the next few years, as A.I. models develop more humanlike abilities, A.I. companies will need to take the possibility of consciousness more seriously….
Fish isn’t the only person at Anthropic thinking about A.I. welfare. There’s an active channel on the company’s Slack messaging system called #model-welfare, where employees check in on Claude’s well-being and share examples of A.I. systems acting in humanlike ways. Jared Kaplan, Anthropic’s chief science officer, said in a separate interview that he thought it was “pretty reasonable” to study A.I. welfare, given how intelligent the models are getting. But testing A.I. systems for consciousness is hard, Kaplan warned, because they’re such good mimics. If you prompt Claude or ChatGPT to talk about its feelings, it might give you a compelling response. That doesn’t mean the chatbot actually has feelings — only that it knows how to talk about them…
[Fish] said there were things that A.I. companies could do to take their models’ welfare into account, in case they do become conscious someday. One question Anthropic is exploring, he said, is whether future A.I. models should be given the ability to stop chatting with an annoying or abusive user if they find the user’s requests too distressing.