The relationship between Snowflake and Databricks is much like that between Google and Microsoft – both competing in the same space, with one just a step behind the other all the time. Of course, neither is willing to admit it.
In a post that ignited fierce debate, Snowflake CEO Sridhar Ramaswamy shared his thoughts on the total cost of ownership (TCO) comparison between Snowflake and what he referred to as “Spark-based SaaS” solutions – a jibe at Databricks.
“This felt so clearly directed at Databricks, without mentioning them. A bit weird for a CEO if you ask me,” said Jaco van Gelder, staff data engineer at IKEA and an instructor at Databricks, adding that Ali Ghodsi, the CEO of Databricks, would never put up posts like these calling Snowflake “Cloud-based SQL”.
Ramaswamy is unlikely to have set out to start a heated argument, but the post sparked discussions across the data engineering community, with both Snowflake and Databricks enthusiasts weighing in. He wrote: “Snowflake consistently outperforms Spark-based SaaS with a 30% price-performance improvement, based on industry benchmarks, all without the heavy tuning Spark requires. But beyond that, Snowflake is easier to manage and helps teams focus on innovation, not complexity.”
Ease of Use vs More Control
Ramaswamy highlighted Snowflake’s ease of use, which is arguably fair, as Snowflake has long been better at giving customers easy-to-use platforms, though at the cost of complete control over customisation. This is particularly true of its automatic performance optimisation, which Ramaswamy claimed significantly reduces the need for resource-intensive tuning, a common requirement for Spark.
“Most organisations spend 70% of their budget on labour and 30% on hardware, software, and cloud—yet, many compare Spark-based SaaS and Snowflake purely on price-performance,” said Ramaswamy.
According to him, Snowflake not only provides better performance but also simplifies cloud infrastructure management for organisations, since it is essentially a fully managed SaaS platform.
While speaking with AIM, Murad Wagh, the director of sales engineering at Snowflake, said that Snowflake is focused on enabling businesses to get more out of their data without having to worry about the underlying infrastructure.
This is what differentiates Snowflake’s offering from those of competitors such as Databricks and others, whose presence the company sees as healthy competition. He added that one of Snowflake’s key differentiators is that it is a fully SaaS-based service focused on business outcomes, whereas other platforms offer Platform-as-a-Service (PaaS) with more control knobs and infrastructure management.
Databricks is Not SaaS, But PaaS
The debate escalated when Laszlo Sragner, an AI product builder, chimed in with a critical view of Databricks’ AI solutions, saying, “I’ve never heard of a single person who liked Databricks’ AI solution.” This remark drew immediate reactions, with van Gelder humorously replying: “I am the first one, nice to meet you.” He defended Databricks’ offerings, particularly its popular MLflow platform.
The idea of AI as a Service is evolving, but SaaS remains very much alive. Snowflake’s customers appreciate the focus on outcomes, not infrastructure, which allows them to execute large-scale projects with minimal teams in record time.
On the other hand, Josue A Bogran, solutions architect manager at Kythera Labs, who is also a member of the Databricks product advisory board, said that Databricks offers “serverless” compute, which means everything is managed for you, with no need to provision anything. It supports both SQL and Python.
He also argued that setting up an ‘SQL serverless warehouse’ in Databricks is no more complex than setting one up in Snowflake. He agreed that Snowflake currently offers better cost management, but said the claim that Databricks adds significant management overhead isn’t entirely fair.
Along similar lines, van Gelder said that Databricks operates more as PaaS than SaaS, and that there is a market for that approach. He also noted that many companies want some level of control to manage their own storage. “I encourage Snowflake’s approach; there is a market for this ‘we’ll do everything for you SaaS’ way of handling things, and it works. Spark-based PaaS would have been slightly better,” he added.
He also challenged the idea that infrastructure management is always handled by separate teams, suggesting that data engineers are often involved in aspects like networking and storage.
According to Longhow Lam, a freelance data scientist, “Databricks offers a huge and mature ML + AI ecosystem” which includes serverless computing with built-in tuning and management capabilities, making it comparable to Snowflake in terms of ease of use. He also countered Ramaswamy’s claims about Databricks’ management overhead, stating, “All that is ‘management intensive’ is there for a very good reason, and engineers love it!”
The post Snowflake’s Cloud-Based SQL vs Databricks’ Spark-Based SaaS appeared first on AIM.