Meta Preparing Multi-Billion Dollar Investment in Leading AI Data Startup Scale AI
Three months after China’s DeepSeek disrupted the artificial intelligence sector with a model rivaling top U.S. efforts, 28-year-old Alexandr Wang appeared before Congress to outline key steps to maintain America’s AI leadership. At an April hearing, Wang urged lawmakers to create a centralized “national AI data reserve,” ensure sufficient energy for data centers, and avoid fragmented state-level regulation. Florida Representative Neal Dunn welcomed his testimony, saying, “Good to see you back in Washington—you’re becoming a familiar face.”
Though not as publicly prominent as OpenAI’s Sam Altman, Wang, CEO and co-founder of Scale AI, has gained substantial clout in both policy and technology circles.
Scale AI provides data labeling services essential for training the AI systems used by firms like Meta and OpenAI. It also helps companies develop customized AI tools. The firm has increasingly turned to highly qualified contributors, such as PhDs and licensed professionals, to help develop more refined models, according to a person close to the company.
In the broader AI ecosystem, three key components—hardware, skilled personnel, and data—define competitiveness. Scale AI has established itself as a major force in the data domain.
Now, the company’s prominence is expected to grow further. Meta is reportedly in discussions to invest several billion dollars in Scale AI. The deal could exceed $10 billion, potentially making it one of the largest private funding rounds on record. In 2024, the startup was valued at approximately $14 billion in a round that included Meta’s backing.
Scale AI’s evolution parallels that of OpenAI. Both startups were founded nearly ten years ago, and both anticipated a tipping point in AI development. Their CEOs, Wang and Altman, who are also friends and former roommates, have both become key figures representing AI before U.S. lawmakers. And like OpenAI, which received a massive investment from a tech giant, Scale AI may soon receive a similarly significant financial boost.
Initially, Scale AI focused on labeling visual datasets—images of traffic lights, roadways, and vehicles—for use in autonomous driving systems. Over time, the company expanded to processing and curating the vast text datasets that fuel large language models such as ChatGPT. These models rely on labeled data to learn patterns and improve responses.
That said, Scale AI has faced criticism for its labor practices, particularly its use of low-paid contractors in countries like Kenya and the Philippines. Some workers have reported psychological distress from reviewing disturbing content. In 2019, Wang stated that workers received “good” pay, within the 60th to 70th percentile for their respective regions. Company spokesperson Joe Osborne said the U.S. Department of Labor recently closed its review of Scale AI’s labor compliance.
As the AI industry evolves, synthetic data generated by AI is becoming more prevalent in model training. However, high-quality real-world data remains critical, particularly for developing models capable of complex reasoning tasks.
To meet this need, Scale AI has ramped up efforts to recruit specialists with graduate-level education. These contributors engage in reinforcement learning, a method that trains AI systems by rewarding correct answers and penalizing incorrect ones.
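The reward-and-penalty idea described above can be illustrated with a toy sketch. This is not Scale AI's pipeline or any production training method; it is a minimal, assumed example in which a "model" chooses among a handful of candidate answers and is nudged toward the correct one. The function names, the number of answers, and the learning rate are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn raw preference scores into probabilities that sum to 1."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(correct_index, num_answers=4, steps=500, lr=0.1, seed=0):
    """Toy policy-gradient loop: +1 reward for the correct answer, -1 otherwise.

    All parameters here are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    prefs = [0.0] * num_answers  # start with no preference among answers
    for _ in range(steps):
        probs = softmax(prefs)
        # The "model" samples an answer according to its current probabilities.
        choice = rng.choices(range(num_answers), weights=probs)[0]
        # Reward a correct answer, penalize an incorrect one.
        reward = 1.0 if choice == correct_index else -1.0
        # Shift preferences toward rewarded choices and away from penalized ones.
        for i in range(num_answers):
            indicator = 1.0 if i == choice else 0.0
            prefs[i] += lr * reward * (indicator - probs[i])
    return softmax(prefs)


if __name__ == "__main__":
    final_probs = train(correct_index=2)
    print([round(p, 3) for p in final_probs])
```

In this sketch the probability assigned to the correct answer rises over repeated trials. Real systems apply the same principle at far larger scale, with expert-written problems standing in for the single fixed question used here.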
According to a source familiar with the company’s operations, these advanced contributors design complex problems to challenge AI models. As of early 2025, 12% of participants in this process held a doctorate, many in fields like molecular biology, while over 40% held a master’s degree, MBA, or law degree.
This initiative is particularly important for enterprises seeking AI tools for professional use in sectors such as law and medicine. One key area of focus is enhancing models’ capabilities in domains like tax law, where legal interpretations vary significantly across jurisdictions.
This shift has fueled robust growth for Scale AI. The company generated roughly $870 million in revenue in 2024 and expects to reach $2 billion this year. According to the same source, demand for highly trained contributors has surged since DeepSeek’s emergence, as companies pursue models that mimic human logic for more nuanced applications.
Additionally, Scale AI has expanded its ties with the U.S. government, especially through defense-related work. Wang, known for his strong stance on China, has built relationships with lawmakers concerned about China’s rising AI power. Former Scale AI executive Michael Kratsios now plays a key role in President Donald Trump’s technology team, shaping national AI policy.
A deeper collaboration with Meta could help the tech giant close the gap with AI competitors like OpenAI and Google, while also strengthening its position in defense-related tech development. For Scale AI, the deal would bring a wealthy strategic partner and a symbolic link back to Wang’s entrepreneurial origins.
Reflecting on his startup journey, Wang once shared that when a venture capitalist asked when he first felt inspired to launch a company, he jokingly replied that it was after watching The Social Network—the movie dramatizing Facebook’s early days.