Reddit and Google’s new licensing deal, which lets Google train its AI products on the platform’s user-generated content, has raised questions about what social media sites can do with the material their users post.

The models are trained on users’ posts. And while terms-of-service agreements generally allow users to retain ownership of their posts, the terms of use of most major social media platforms let the sites license that material to third parties. Intellectual Property partner Edward Cavazos said the reality is users have “already given those [license] rights away.” He added that a user who wanted to argue that AI training was not part of the deal when they first joined the platform would have very little legal ground to stand on.

Cavazos expects that similar licensing agreements will continue to surface until every large language model has a deal in place with the major social media platforms.

Models “need to know how 15-year-olds on social media speak, and they’re not going to get that from POLITICO,” he said. “Ultimately, they just want data. The more data, the smarter their models are.”

With AI tools continuing to expand, Cavazos said he believes future licensing deals could also include platforms that feature audio and video user-generated content such as SoundCloud and Instagram. “I would not be surprised if every kind of user-generated, content-driven site eventually provides their data to large language model operators,” he said.

Some social media platforms may allow users to opt out so that their posts and content are not included in the data licensed to AI companies for training. However, Cavazos said the timing is tricky: users have to opt out before training begins. If a user opts out after a model has already trained on their content, it is unclear whether the AI tools will be able to unlearn it, he said.

“It’s like asking someone to un-remember something. It’s not that easy,” he concluded.
