The legal parameters around online data scraping are once again set to be tested, with Meta losing a court battle to sue a data scraping company for taking Facebook and Instagram user data without permission.

As reported by Ars Technica, last January, Meta launched legal action against a company called Bright Data over its scraping of user information from its big two social apps.

Meta alleged that Bright Data had breached its terms of use by ingesting user data, but Bright Data countered that it had only accessed publicly accessible information, and as such, it had not breached the terms of any agreement.

The process Bright Data claims to have used involved gathering info while logged out of each app, so any data that it could access was freely available, and thus not locked within Meta’s walled garden. Users have the option to limit what they display publicly, and as such, any information they have chosen to share publicly is theoretically not bound by Meta’s rules.

The judge agreed, ruling that Bright Data had not violated any rules, leaving it free to continue scraping Facebook and IG user data, and on-selling that via its own products and services.

That seems like it shouldn’t be possible, but by the letter of the law, publicly available info can be used, within certain contexts, without direct permission.

Data scraping has been a highly contentious legal issue, especially with regard to the variance in protections between data only available to logged-in users, and that which anybody can access on the web.

LinkedIn undertook a five-year court battle along similar lines, with professional services company hiQ Labs arguing that it should be allowed to scrape publicly accessible LinkedIn user data, despite not being given explicit permission to use that information by LinkedIn users.

Despite hiQ Labs winning several rulings, LinkedIn continued to push its case, and eventually won a key ruling, enabling it to block hiQ Labs from continuing to scrape user data.

But the varying readings of the legal specifics underline the challenges that platforms face in policing data scraping, because current laws weren’t written to cover this specific use, or misuse, which can make it difficult to prosecute.

The impact, then, is that the platforms are subsequently forced to hide more of their information behind log-in walls, essentially locking it away to protect it from misuse. Which, in some ways, could be a better approach, but it also means that posts then can’t be indexed by Google, limiting discovery and referral traffic. Such measures also make it more difficult to lure new users, as they limit the access that would enable newcomers to get a feel for the app before signing up.

Even so, with these concerns, along with generative AI training, most social apps are looking to further limit logged-out access, with X recently updating its system to significantly limit what non-users can see of its content.

Generative AI scraping may actually be a bigger impetus to enact such changes either way, but there does need to be more legal clarification around data scraping, and what qualifies as misuse in a social media specific context.

And this isn’t the only data scraping case that Meta’s pursuing, with the company also seeking legal recourse against two other companies that scraped Facebook data for use in web browser extensions.

As such, Meta would have hoped to establish a clear precedent with this case, but now, like LinkedIn, it’ll be forced back to the courts to appeal this ruling.

Hopefully, it won’t take another five years to reach clearer legal consensus.