While the major U.S. tech players continue to experiment with new uses for generative AI, China’s ByteDance is also making significant advances, which could end up giving TikTok an edge in AI usage, and driving new behaviors as a result of AI tools.
The challenge, as most social apps have found, is that there isn’t really a valuable, immediate use case for generative AI in social apps.
Sure, you can add in a chatbot, but people mostly want to connect with other people (hence the “social” moniker), while image generation tools have seemingly limited novelty value, and generative AI post tools can also detract from human engagement.
Which is where I feel like TikTok is taking a more logical and valuable approach, by implementing features that will actually drive engagement.
Though its latest experiment is a little more questionable on this front.
As Business Insider reports, ByteDance has developed a new AI model that can replicate any person’s voice, with believable enough accuracy, based on minimal input.
ByteDance’s “StreamVoice” system can use just a few utterances to replicate a person’s voice in real time, enabling you to replicate virtually any person’s speech (you can hear examples of StreamVoice outputs here).
Analysts have highlighted the potential for misuse, with such systems set to make fraud and other deceptive behavior easier for scammers. Though it’s also worth noting that Meta is developing the same, with its “Audiobox” software now available for live testing on the web.
So why create tools that can replicate people’s voices?
Meta’s Audiobox paper says that the tool will “lower the barrier of accessibility for audio creation”, providing more opportunity for more people to create audio content.
“Creators could use models like Audiobox to generate soundscapes for videos or podcasts, custom sound effects for games, or any of a number of other use cases.”
Not sure that’s significantly different to recording the audio directly, but conceptually, it could enable more variations of spoken text in your projects, which may enable broader creative expression.
ByteDance is obviously eyeing the same, and given the popularity of TikTok’s robot voice elements already, it could enable more ways to enhance the audio of your clips.
It’s another step in TikTok’s evolving AI toolset, which also includes generative AI profile images, improved contextual search, and AI music generation in-stream, while it’s also testing text-to-video creation tools and AI chatbots of varying capacity.
And that’s not all. According to a new report from Bloomberg, TikTok’s also now expanding its test of an automated feature which could make all posts in the app shoppable, by identifying objects in every video, then prompting viewers to “find similar items on TikTok Shop”.
That’s been in testing for some time, with Insider posting the above image as part of its story on the project last November, while it’s also been in testing on Douyin, the Chinese version of the app, since 2019. So ByteDance has had a long time to revise and improve this element ahead of a broader TikTok release.
The incorporation of smarter AI will help to make this a more valuable shopping tool, by showcasing more products to more users, and matching the results based on preferences.
And given that TikTok is already a key discovery tool for many young users, it makes sense for the app to build on this element, and encourage new usage behaviors, aligned with its broader shopping push.
Essentially, TikTok is making the smartest moves towards incorporating AI in a complementary way, which will enhance its core use case, as opposed to being tacked on, in order to latch onto the latest tech trend.
That, as noted, is something that other platforms are still struggling with, and it’ll be interesting to see how TikTok looks to integrate more AI functions over time, and whether they do indeed guide new usage behaviors in the app.