On December 2, the AI research community was shocked to learn that Timnit Gebru had been fired from her post at Google. Gebru, one of the leading voices in responsible AI research, is known among other things for coauthoring groundbreaking work that revealed the discriminatory nature of facial recognition, cofounding the Black in AI affinity group, and relentlessly advocating for diversity in the tech industry.
But on Wednesday evening, she announced on Twitter that she had been terminated from her position as Google’s ethical AI co-lead. “Apparently my manager’s manager sent an email [to] my direct reports saying she accepted my resignation. I hadn’t resigned,” she said.

Karen Hao, “A leading AI ethics researcher says she’s been fired from Google” at Technology Review
Whatever happened, she isn’t working at Google any more. That’s because of a paper she co-wrote with four Google employees and two outside advisors, including Emily M. Bender, a professor at the University of Washington.
According to an article in Technology Review, the paper took aim at the “large language models” that power Google Search and, by extension, Google advertising, the company’s cash cow, which had performed very well during the third quarter of 2020.
The paper identified four main risks, asking “whether enough thought has been put into the potential risks associated with developing them and strategies to mitigate these risks.” But, according to Google AI head Jeff Dean, the paper, which was to be delivered at an upcoming AI conference, “didn’t meet our bar for publication.”
Many Googlers and other AI workers have jumped into the fray, defending Gebru and the paper:
An open letter demanding transparency has now been signed by more than 4,500 people, including DeepMind researchers and UK academics…
“I stand with Dr Timnit Gebru,” said Tabitha Goldstaub, who chairs the UK government’s AI council.
“She’s brave, brilliant and we’ve all benefited from her work…”

Cristina Criddle, “Thousands more back Dr Timnit Gebru over Google ‘sacking’” at BBC News (December 8, 2020)
Technology Review’s Karen Hao got a look at the disputed paper. She was not in a position to make the paper public but she could shed light on four concerns about natural language processing that the paper addressed.
And this is the point where the story becomes a bit more complex:
1. Environmental and financial costs: “A version of Google’s language model, BERT, which underpins the company’s search engine, produced 1,438 pounds of CO2 equivalent in Strubell’s estimate—nearly the same as a roundtrip flight between New York City and San Francisco.”
But to what are we comparing those numbers? If Google Search helps in the fight against COVID-19, many observers would say that a few more doses of flight-size CO2 are worth getting control of the situation, which is wrecking lives across the planet.
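To put the quoted figure on a scale a reader can judge, a rough back-of-the-envelope comparison is possible. The 1,438 lb figure is the Strubell estimate quoted above; the per-capita annual footprint used below (roughly 16 metric tonnes for a US resident) is an assumption for illustration, not a number from the article:

```python
# Rough scale check on the 1,438 lb CO2e training estimate quoted above.
# The per-capita figure is an assumed, commonly cited US average
# (~16 metric tonnes/year), not a number from the article itself.
LBS_PER_TONNE = 2204.62

bert_co2_lbs = 1438                         # Strubell estimate quoted in the paper
us_percapita_co2_lbs = 16 * LBS_PER_TONNE   # assumed annual US per-capita footprint

share = bert_co2_lbs / us_percapita_co2_lbs
print(f"One such training run is roughly {share:.1%} "
      f"of a single US resident's annual footprint")
```

On those assumptions the training run comes to a few percent of one person’s yearly emissions, which is exactly why the debate turns on what the model is weighed against, not on the raw number.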
2. “Racist, sexist, and otherwise abusive language ends up in the training data”: “the MeToo and Black Lives Matter movements, for example, have tried to establish a new anti-sexist and anti-racist vocabulary. An AI model trained on vast swaths of the internet won’t be attuned to the nuances of this vocabulary and won’t produce or interpret language in line with these new cultural norms.”
Again, wait! Is the vocabulary of MeToo and Black Lives Matter really “new cultural norms” or only the self-expression of adherents and supporters? If we assume that Google’s target language is English, aside from the United States (pop. 331 million), there is Nigeria (pop. 208 million), Britain (pop. 68 million), Canada (pop. 38 million, 3/4 of whom are English-speaking), Australia (pop. 26 million)… and many others. If Google intends to provide a global search engine in English, most users will not likely wish sensitive areas of language choice to be shaped by what must—necessarily—sound like explicitly American internal concerns. There’s a word for that: colonialism.
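Taking the population figures in the paragraph above at face value, the arithmetic behind “most users” can be checked directly. The numbers below are the article’s own (including the 3/4 English-speaking share for Canada); only the addition is new:

```python
# Summing the article's population figures (in millions) for
# English-usage countries outside the United States.
us = 331
outside_us = {
    "Nigeria": 208,
    "Britain": 68,
    "Canada": 38 * 0.75,   # text says about 3/4 are English-speaking
    "Australia": 26,
}

total_outside = sum(outside_us.values())
print(f"Listed non-US total: {total_outside:.1f}M vs US: {us}M")
```

Even before counting the “many others,” the four listed countries alone roughly match the US population, so the claim that non-US users form the majority of a global English-language audience holds on the article’s own figures.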
3. “Research opportunity costs”: This one’s a puzzler: “… large language models don’t actually understand language and are merely excellent at manipulating it. Big Tech can make money from models that manipulate language more accurately, so it keeps investing in them. ‘This research effort brings with it an opportunity cost,’ Gebru and her colleagues write. ‘Not as much effort goes into working on AI models that might achieve understanding, or that achieve good results with smaller, more carefully curated datasets (and thus also use less energy).’”
Are the researchers suggesting that Google should focus efforts on creating conscious computers that “achieve understanding”? There are good reasons for doubting the viability of such a project. Any such proposal should be assessed as part of a larger discussion about the Hard Problem of consciousness and intelligence in general.
4. Illusions of meaning: The researchers worry about fake news: “AI models could be used to generate misinformation about an election or the covid-19 pandemic, for instance.”
The difficulty, of course, is that the term fake news is notoriously hard to define; it often merely means news that the authorities would prefer to censor, irrespective of its relationship to evidence.
None of this is to suggest that Gebru et al.’s paper shouldn’t be published and discussed. Without access to the paper itself, I can only say that it’s unclear why any of the listed concerns (offered along with a large number of citations, we are told) falls short of Google’s “bar” for publication. They all sound fairly conventional, and they would surely benefit from an airing before a broader international audience.
In any event, Gebru, who probably won’t be unemployed for long, seems to be yet another talented person who was forced out of the Google juggernaut. In November 2019, we noted that four whistleblowers had felt compelled to leave or been forced out of Google in the previous eighteen months.
Why this isn’t the internet we were promised is likely to become a much bigger public issue.
You may also enjoy:
Google’s secret health data grab: the whistleblower talks. This is the fourth whistleblower in the last eighteen months.