The Snowflake vs Databricks rivalry is intensifying. Databricks recently raised $10 billion in one of the largest venture funding rounds ever. The data and AI company is expected to go public next year, a move that may create unease for the AI Data Cloud company Snowflake.
Databricks CEO Ali Ghodsi, however, believes the company is far ahead of Snowflake and doesn’t even consider them a competitor. “We had a program called Snow Melt to go after Snowflake, but that’s behind us now,” he said in a recent interview.
In another interview, Ghodsi admitted that Snowflake no longer keeps him up at night. “There was a time they would, but not anymore.”
Earlier this year, Databricks acquired Tabular, the data management startup founded by the original creators of the Apache Iceberg table format. It then open-sourced Unity Catalog, which it describes as the industry’s only unified and open governance solution for data and AI. The move directly challenged Snowflake’s earlier decision to open-source Polaris Catalog, a data catalogue for Iceberg tables.
Interestingly, Snowflake also had its eye on Tabular. “Multiple vendors were very interested. I mean, you’re good at guessing who was really interested in that,” Ghodsi said, hinting at the involvement of Databricks’ biggest rival.
In terms of numbers, Databricks expects to surpass a $3 billion annual revenue run rate by the end of its fourth quarter, which ends on January 31, 2025. The company reported over 60% revenue growth in the third quarter of 2024. Snowflake, on the other hand, expects product revenue of $3.43 billion in fiscal 2025.
Snowflake, with higher revenue, has been relatively reserved about the rivalry. However, it has highlighted successes in attracting customers from Databricks and introduced competing products in data engineering and machine learning.
“I have no idea why he is so obsessed with Snowflake because I am not obsessed with Databricks,” Michael Scarpelli, CFO at Snowflake, said of Ghodsi in an earlier interview.
Betting Big on AI
What started as a rivalry in data warehousing is now expanding into AI. Both companies are working to add generative AI services to their offerings.
For instance, Snowflake recently entered a multi-year deal with AI safety and research company Anthropic to use its Claude models. This partnership will make Anthropic’s Claude models available to customers through Snowflake Cortex AI and help businesses worldwide get more value from their data.
More businesses are turning to Snowflake’s AI Data Cloud to organise their data using AI. Just like Salesforce and Microsoft, Snowflake is working on AI agents with its Snowflake Intelligence platform.
Snowflake chief Sridhar Ramaswamy believes it will simplify how enterprises derive value from data. “Imagine asking a data agent, ‘Give me a summary of this Google Doc’ or ‘Tell me how many deals we had in North America last quarter’, and instantly following up with the next steps using that same agent. That’s exactly what Snowflake Intelligence will enable – a seamless way to access and act on your data in one place,” he said.
Ramaswamy was appointed as the CEO of Snowflake earlier this year with a mission to pivot the company toward AI and machine learning. In 2019, he co-founded Neeva, an ad-free and privacy-focused search engine, which was acquired by Snowflake in 2023. Today, Neeva’s services are integrated into Snowflake, which brings generative AI into its search functionalities to improve data discovery and analysis.
Recently, Snowflake acquired Datavolo to improve data pipelines and TruEra to strengthen LLM and ML observability in its AI Data Cloud.
Snowflake has also developed an in-house LLM called Arctic, alongside partnerships with Reka, Mistral, Meta, AI21, and Anthropic. The company recently released Arctic Embed L 2.0 and Arctic Embed M 2.0, updated embedding models that support multilingual search.
Databricks is similarly acquiring startups to strengthen and expand its offerings. In 2024 alone, the company has made four acquisitions, including the $1 billion acquisition of Tabular in June. It acquired MosaicML in July last year and later used its technology to launch the Databricks Data Intelligence Platform.
The platform includes AI solutions to support the entire ML lifecycle. Its core offering, Mosaic AI, simplifies building, deploying, and managing AI and ML models. The platform supports generative AI with large language models (LLMs) using techniques like prompt engineering, retrieval-augmented generation (RAG), fine-tuning, and pretraining.
Ghodsi believes enterprises must establish a data strategy before developing an AI strategy. “First of all, you’ve got to get your data strategy right. A lot of companies now want to skip that step and jump straight to AI,” he said, adding that if data isn’t well-organized, one won’t find success with AI.
Elaborating on Databricks’ role with AI, he further said, “We want to be the infrastructure for AI applications, helping them with the data flywheel and making their models smarter.”
In March 2024, Databricks launched DBRX, a transformer-based Mixture of Experts (MoE) model with 132 billion total parameters, of which 36 billion are active per token during inference. According to Databricks, the model outperformed established open-source models at launch and rivals closed-source models like GPT-3.5 and Gemini 1.0 Pro.
Old Rivalry
While Databricks and Snowflake share similarities in design, architecture, and analytics support, they serve distinct purposes. Snowflake functions as a modern replacement for legacy data warehouses with ELT (extract, load and transform) capabilities, whereas Databricks provides a Spark-powered data processing engine that complements data warehouses.
Databricks operates as a platform as a service (PaaS), whereas Snowflake functions as a software as a service (SaaS). When it comes to data structures, Databricks supports all data types, including raw and unstructured data, while Snowflake primarily focuses on semi-structured and structured data.
Recently, Ramaswamy shared his thoughts on the total cost of ownership (TCO) comparison between Snowflake and what he referred to as ‘Spark-based SaaS’ solutions – a jibe at Databricks.
“Snowflake consistently outperforms Spark-based SaaS with a 30% price-performance improvement…helping teams focus on innovation, not complexity.” Ramaswamy’s remarks fuelled a heated debate, with Databricks supporters arguing that the additional management controls in Spark are essential for customisation.
To counter this, Databricks has recently strengthened its SQL and business intelligence capabilities, stepping into Snowflake’s traditional territory. Meanwhile, Snowflake has introduced products to compete with Databricks in data engineering and machine learning, including an initiative called ‘SparkAttack’ to capture machine learning workflows from Databricks.
The rivalry between Snowflake and Databricks is a driving force in cloud data platform innovation. Both companies are constantly challenging each other to improve, especially in AI and cost efficiency. As we move into 2025, their competition will shape the future of data technology for businesses worldwide.
The post ‘Databricks Doesn’t See Snowflake as Competition Anymore’ appeared first on Analytics India Magazine.