Google says AI systems should be able to mine publishers’ work unless companies opt out

Publishers should be able to opt out of having their works mined by generative artificial intelligence systems, according to Google, but the company has not said how such a system would work.

In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.

The company has called for Australian policymakers to promote “copyright systems that enable appropriate and fair use of copyrighted content to enable the training of AI models in Australia on a broad and diverse range of data, while supporting workable opt-outs for entities that prefer their data not to be trained in using AI systems”.

The call for a fair use exception for AI systems is a view the company has expressed to the Australian government in the past, but the notion of an opt-out option for publishers is a new argument from Google.

When asked how such a system would work, a spokesperson pointed to a recent blog post by Google where the company said it wanted a discussion around creating a community-developed web standard similar to the robots.txt system that allows publishers to opt out of parts of their sites being crawled by search engines.
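For readers unfamiliar with robots.txt, the existing mechanism can be illustrated with a minimal sketch using Python's standard-library parser. The crawler names `ExampleSearchBot` and `ExampleAITrainingBot`, the site `example.com` and the rules themselves are hypothetical, invented for illustration. Google has not specified what an AI opt-out standard would look like, so this shows only how today's crawler opt-out works, not the proposed system.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher. The crawler names
# are assumptions for illustration, not real user agents.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow: /drafts/

User-agent: ExampleAITrainingBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The search crawler may fetch news pages but not the /drafts/ section.
search_news = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/news/story.html"
)
search_drafts = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/drafts/story.html"
)

# The hypothetical AI training crawler is shut out of the whole site.
ai_news = parser.can_fetch(
    "ExampleAITrainingBot", "https://example.com/news/story.html"
)

print(search_news, search_drafts, ai_news)
```

As with robots.txt today, a scheme like this relies on crawlers choosing to honour the file; it does not technically prevent scraping.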

Google’s comments come as news companies such as News Corp have already reportedly been initiating conversations with AI companies about payment for scraping news articles.

Dr Kayleen Manwaring, a senior lecturer at UNSW Law and Justice, told Guardian Australia that copyright would be one of the big problems facing generative AI systems in the coming years.

“The general rule is that you need millions of data points to be able to produce useful outcomes … which means that there’s going to be copying, which is prima facie a breach of a whole lot of people’s copyright.”

Manwaring said laws on what AI systems are allowed to ingest varied between countries, but said the notion of an opt-out system would turn copyright on its head.

“If you want to reproduce something that’s held by a copyright owner, you have to get their consent, not an opt out type of arrangement … what they’re suggesting is a wholesale revamp of the way that exceptions work.”

Toby Murray, associate professor at the University of Melbourne’s computing and information systems school, said Google’s proposal would put the onus on content creators to specify whether AI systems could absorb their content or not, but he indicated existing licensing schemes such as Creative Commons already allowed creators to mark how their works can be used.

“They may well be hoping to try to create norms early on that say other companies do not have to pay for this content,” he said.

Manwaring said copyright could break down if the problem wasn’t resolved, which would probably be to the detriment of smaller content creators.

“I think it’s going to be a big issue that continues, particularly as powerful entities have their copyright ripped off. But at the moment non-powerful entities are very likely getting their copyright infringed left, right and centre, if a lot of people’s suspicions are correct and AI training sets are using a lot of material from the internet.”

At Senate estimates in May, the Liberal senator Sarah Henderson asked the communications department whether the government was considering a scheme similar to the news media bargaining code to force AI companies to pay for scraping sites.

In response in July, the department pointed to the government’s AI regulation consultation and said the government was examining future policy settings for news media as part of its news media assistance program and was considering a Treasury review of the news media bargaining code.

Submissions to the AI consultation closed last week. It is understood hundreds of submissions have been received, but so far none have been published online.
