Michael Nuñez 2025-07-02 09:00:00 venturebeat.com
Bright Data, the Israeli web scraping company that defeated both Meta and Elon Musk’s X in federal court, unveiled a comprehensive AI infrastructure suite Wednesday designed to give artificial intelligence systems unfettered access to real-time web data — a capability the company argues Big Tech platforms are trying to monopolize.
The announcement of Deep Lookup, Browser.ai, and enhanced data collection protocols represents a dramatic expansion for the decade-old company, which has transformed from a specialized web scraping service into what CEO Or Lenchner calls “a unique infrastructure layer for AI companies.” The move comes as artificial intelligence companies increasingly struggle to access current web information needed to power chatbots, autonomous agents, and other AI applications.
“The intelligence of today’s LLMs is no longer its limiting factor; access is,” Lenchner said in an exclusive interview with VentureBeat. “We’ve spent the last decade fighting for open access to public web data, and these new offerings bring us to the next chapter in our journey, one characterized by truly accessible data and the subsequent rise of contextually-aware agents.”
The launch follows Bright Data’s high-profile legal victories in 2024, when federal judges dismissed lawsuits from both Meta and X alleging the company illegally scraped their platforms. Those rulings established crucial legal precedent defining what constitutes “public data” on the internet — information that can be viewed without logging in and therefore can be legally collected and used.
The court cases revealed that both Meta and X had been Bright Data customers even while suing the company, highlighting the contradictory stance many tech giants have taken toward web scraping. The rulings have broader implications for the AI industry, which relies heavily on web data to train and operate language models.
“It was revealed in court that both of them were a Bright Data customer, because everyone needs data, everyone, especially those who are building models,” Lenchner explained. “We are the only company that has the financial resources, and I would even say the courage to do that.”
Judge William Alsup, who presided over the X case, wrote that giving social media companies “free rein to decide, on any basis, who can collect and use data” risks creating “information monopolies that would disserve the public interest.” The ruling established that data viewable without login credentials constitutes public information that can be legally scraped.
Bright Data has now filed a countersuit against X, alleging the platform violated antitrust laws by trying to create a data monopoly to benefit Musk’s AI company, xAI. “The only reason that X are trying to stop Bright Data from allowing its customers to scrape X is that they will be the only entity that can enjoy the relevant quality data that X produces,” Lenchner said.
Deep Lookup and Browser.ai target AI companies struggling with data access
The company’s new products address what Lenchner identifies as the three core requirements for AI systems: algorithms, compute power, and data access. While Bright Data doesn’t develop AI algorithms or provide computing resources, it aims to become the definitive solution for the third requirement.
Deep Lookup functions as a natural language research engine designed to answer complex, multi-layered business questions in real time. Unlike general-purpose search engines or AI chatbots that provide summaries, Deep Lookup specializes in comprehensive results for queries beginning with “find all.” For example, users can ask for “all shipping companies that went through the Panama and Suez canals in 2023 whose Q3 revenues declined by over 2 percent.”
The system draws from Bright Data’s massive web archive, which currently contains over 200 billion HTML pages and adds 15 billion monthly. By next year, the archive is expected to exceed 500 billion pages. “It’s not just random web pages, it’s actually what the world cares about, because our 20,000 customers represent billions of internet users,” Lenchner noted.
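The archive figures above lend themselves to a quick back-of-the-envelope check. The sketch below assumes growth stays flat at the quoted 15 billion pages per month, which is an assumption for illustration and not something the company has stated; the function name is likewise hypothetical.

```python
def months_to_reach(current_pages: int, pages_per_month: int, target_pages: int) -> int:
    """Whole months of constant monthly growth needed to reach the target size."""
    if target_pages <= current_pages:
        return 0
    remaining = target_pages - current_pages
    # Integer ceiling division: round up to the next whole month.
    return -(-remaining // pages_per_month)


# Figures quoted in the article.
ARCHIVE_PAGES = 200_000_000_000   # "over 200 billion HTML pages"
PAGES_PER_MONTH = 15_000_000_000  # "adds 15 billion monthly"
TARGET_PAGES = 500_000_000_000    # "expected to exceed 500 billion pages"

# Assumption: the monthly intake stays constant (not stated by the company).
months_needed = months_to_reach(ARCHIVE_PAGES, PAGES_PER_MONTH, TARGET_PAGES)
size_after_12_months = ARCHIVE_PAGES + 12 * PAGES_PER_MONTH

print(f"Months to reach target at a flat rate: {months_needed}")
print(f"Archive size after 12 months: {size_after_12_months / 1e9:.0f} billion pages")
```

At a flat rate the archive would reach roughly 380 billion pages after 12 months and would need about 20 months to hit 500 billion, so getting there sooner would require the monthly intake to keep rising.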
Browser.ai represents what the company calls “the industry’s first unblockable, AI-native browser.” Designed specifically for autonomous AI agents, the cloud-based service mimics human behavior to access websites without triggering bot detection systems. It supports natural language commands and can perform complex web interactions like booking flights or making restaurant reservations.
The browser infrastructure already processes over 150 million web actions daily, according to the company. “Almost all of them are customers,” Lenchner said of AI agent companies that have raised significant funding. “Because what we figured out, and they figured out, is that we solve that problem of entering a website without being blocked and executing web actions on the website.”
MCP Servers, which implement the Model Context Protocol (MCP), provide a low-latency control layer that lets AI agents search, crawl, and extract live web data. The protocol allows developers to build AI systems that act on current information rather than relying solely on training data.
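For readers unfamiliar with the Model Context Protocol, the following is a minimal sketch of what an agent's tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The tool name `search_web` and its `query` argument are hypothetical placeholders, not Bright Data's actual tool names, and the transport to the server is left out.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids for this session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool and argument names, used only for illustration.
request = build_tool_call(
    "search_web",
    {"query": "shipping companies that used the Panama and Suez canals in 2023"},
)
print(request)
```

The agent would send this message to an MCP server over whatever transport the server supports and receive a JSON-RPC response carrying the tool's result, which is how live data reaches the model at inference time.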
Patent portfolio and proxy network create competitive moat against blocking
Bright Data’s competitive advantage stems from what Lenchner describes as an “obsession” with overcoming website blocking mechanisms. The company holds over 5,500 patent claims on its technology and operates the world’s largest proxy network with more than 150 million IP addresses across 195 countries.
“We have such a good look into the internet,” Lenchner explained. “For a long time now, we have been mapping the internet, and for a long time now, we’re also archiving big chunks of the internet.”
The company’s approach involves sophisticated techniques to mimic human behavior, using real devices, IP addresses, and browser fingerprints rather than simple automated scripts. This makes detection and blocking extremely difficult for websites.
“The only way to block us, practically, is to put the data behind the login, then we won’t even try,” Lenchner said. “Sometimes there is a new blocking logic that we won’t solve immediately. It will take our research team 12 hours, three days that’s like the most it was, and we will unlock it.”
Revenue surpasses $100 million as AI demand explodes post-ChatGPT
While Bright Data remains privately held, owned by a private equity firm, Lenchner confirmed to VentureBeat that the company’s annual recurring revenue significantly exceeds $100 million. The business has experienced explosive growth since the launch of ChatGPT in late 2022, as AI companies scrambled to access training data and real-time information.
“Starting March 2023, which is pretty much when GPT-3 changed the world, the AI, or what we call the data for AI, use case just absolutely exploded for us as a company,” Lenchner said. “Everything else is also growing, because everyone needs more data, period. But this use case is just like nothing we’ve seen before.”
The company serves over 20,000 businesses, including Fortune 500 companies and major AI laboratories. Traditional customers include e-commerce platforms tracking competitor pricing, financial services firms seeking market intelligence, and enterprises conducting business research.
GDPR compliance and ethical practices differentiate from competitors
Bright Data has invested heavily in compliance infrastructure to address privacy concerns around data collection. The company follows European GDPR and California CCPA regulations, automatically notifying individuals when their personal information is collected from public sources and providing deletion options.
“The regulation and the legislation are clear since the European GDPR and at least California and CCPA regulations came to play,” Lenchner explained. “If we collected your email address, for example, we will automatically send you an email saying, ‘Hey, this is who we are. We collected your personal information from the public domain. Here’s a huge button you can click if you want to review it, and you can obviously ask to delete it.’”
The company maintains a large compliance team and extensive documentation of its practices, which proved valuable during court proceedings. “[E]nterprises especially love us because we have our ethical stand that was scrutinized in US courts twice,” Lenchner said.
Web access wars intensify as tech giants seek data monopolies
The battle over web data access reflects broader tensions in the AI industry about information control and competitive advantage. As AI systems become more sophisticated, access to current, comprehensive web data becomes increasingly valuable — and contentious.
Lenchner predicts the web will become “more closed” over time, similar to how Google maintains exclusive access to its web crawling capabilities while others must use alternative services. “A few tech giants are gonna get free access to every website with their agents,” he said. “The rest will need to use our infrastructure or someone else’s infrastructure.”
The company is also observing new trends, including businesses scraping AI chatbots for marketing purposes and the emergence of new protocols like MCP that enable AI agents to interact with web services more effectively.
“All of these guys that are consuming massive amounts of data, and all of us are using them, it’s all going towards building the brains of the robots,” Lenchner said. “It’s okay that you have a chatbot that is talking to a human, because that’s eventually what a robot will do.”
Robot brains and agent economy drive next phase of growth
Bright Data’s transformation from web scraping service to AI infrastructure provider reflects the rapidly evolving needs of the artificial intelligence industry. As companies rush to deploy AI agents and autonomous systems, access to real-time web data becomes as crucial as computing power and algorithmic sophistication.
The legal precedents established through Bright Data’s court victories may prove as significant as its technical innovations, potentially shaping how the entire AI industry accesses and uses web information. With major tech platforms increasingly restricting data access while simultaneously developing their own AI systems, independent infrastructure providers like Bright Data may become essential for maintaining competitive balance in the AI ecosystem.
“We’re an infrastructure company,” Lenchner emphasized. “We’re very talented engineers that hardly go anywhere, just sit with our computers and write code. We’re doing it well. We have no intentions to do anything else.”
The Deep Lookup beta launches Tuesday for business customers, with general public access available through a waitlist. Browser.ai and MCP Servers are already available to enterprise clients through Bright Data’s existing platform.