
Is Reddit’s data actually worth $60 million per year?

That’s reportedly how much an as-yet-unnamed AI development company has paid to gain exclusive access to Reddit’s full data set. Under the deal, the AI company will incorporate Reddit user responses into its large language model (LLM), with a view to the system providing more human-like answers and insight, and becoming a bigger challenger in online search.

As reported by Bloomberg, after working to restrict access to its data over the last year, in order to stop AI companies from profiting off its content, Reddit has now signed an exclusive contract with “an unnamed large AI company”, which will see that company integrate Reddit insights into its models.

That’s a high price tag, considering that the top tier of X’s API access (200 million posts per month) costs around $2.5 million per year, or roughly one twenty-fourth of what Reddit is reportedly getting.

So could Reddit’s data be worth significantly more than that? And if it is, does it then make sense for Reddit to provide it on an exclusive basis?

The value of Reddit data is that it provides actual, human usage insight, which can often be of more value than online reviews that can be gamed and skewed by paid responses. That’s getting even worse in the age of generative AI, with some companies now employing AI tools to create human-sounding reviews online, in order to boost their product ratings.

As a result, more and more people have been turning to Reddit to get honest product reviews and performance insight. They’re still using Google, but more people are using the “site:reddit.com” qualifier to glean more specific insights from Reddit communities.

For example, if you’re looking for a new hair dryer, you can look up “best hair dryer” on Google to get this:

Google example

Or you can search “best hair dryer site:reddit.com” for this:

Google example

The Reddit forum links connect through to actual people’s experiences, and include solid, functional insight from those who’ve used each device. The Reddit responses are also upvoted and downvoted, making it easier to find the best response to guide your search process.

This more specific, personal insight can add significant value to the answers provided, and many people have found that this is now a better, more valuable discovery process than trusting Google results on their own.

And now, one AI company will get all of this insight exclusively to itself.

That could be a big boost to its business ambitions, with a view to making AI chatbots more of a rival for traditional search behavior. Already, more people are turning to conversational chatbots for online discovery, and with this, whichever LLM can access Reddit data will have an exclusive trove of valuable consumer insights, which it can repackage within its responses.

For example, using the same hair dryer prompt in ChatGPT, the system currently gives me a listing of technical considerations and recommendations based on top sellers. But with added Reddit commentary, it could also provide a more personalized addendum:

“According to users, the best hair dryer for curly hair is the Ella Bella Ionic hair dryer, while those with straight hair tend to prefer the Dyson Supersonic.”

The system could then provide more specific answers based on your requirements, by sourcing that info from subreddit communities.

It’s a significant value-add, which will make whichever company gets this info a far more viable option as a search consideration, though the $60 million per year ongoing price tag is high, and is also at least somewhat reliant on Reddit continuing to grow, in order to maximize its value and utility.

And Reddit is growing. Reddit’s added 20 million more users over the past three years, and it continues to see strong engagement in over 100,000 active communities. The company’s been working to highlight its business value, ahead of a planned IPO, which could come next month, and this deal will now be factored into the valuation of the platform moving forward.

In some ways, it’s possible that Reddit could be limiting its opportunities by signing an exclusive data contract. But that’s why the price tag is so high, and it’ll be interesting to see which chatbot comes out with “Reddit exclusive insights” as a value-add sometime soon.

I mean, it seems likely that it’ll be OpenAI, with the backing of Microsoft, as it looks to take on Google’s Search dominance. With the rise of conversational searches, that does seem like a logical investment, and with another data source taken out of the mix, that could also lead to more differentiation in the market.

It could also point to similar exclusivity deals in future, as each company tries to differentiate and dominate with its chatbot tools. Current AI chatbots have been able to scrape vast amounts of data from across the web, which means that their initial models will all be relatively similar. But in future, as information evolves, and new data is required to match search intent, fresh sources will also be required to maintain relevance and audience interest.

Meta claims to have an advantage in this respect, because it has all of the insights published to Facebook and Instagram to work with, while Elon Musk will view xAI as holding a lead, due to his platform being the leading real-time news discussion app.

But maybe, considering broader trends, Reddit insight is actually the real leader in terms of refining search queries.

And maybe, that will prove to be more important than most think.