Starlark, a hermetic subset of Python, has gained prominence as the configuration language of build systems such as Bazel, the open-source build tool that grew out of Google's internal build system. Its position as ‘Python-like yet not Python’ has long been a topic of debate among developers.
On one hand, Starlark offers the familiarity of Python syntax, ensuring ease of adoption. On the other, its limitations, both by design and due to minimalism, make it a polarising choice.
At its core, Starlark is a lightweight programming language designed to be embedded within applications, where it provides configuration or scripting functionality. That narrow focus is what makes it well suited to scripting build systems.
The Goodness of Python?
According to the creators of Starlark, due to its Python-like nature, it is dynamically typed, supports high-level data structures, and includes first-class functions with lexical scoping and garbage collection. With its compact syntax and readability, Starlark is ideal for defining structured data, creating reusable functions, or integrating scripting functionality into applications.
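As a rough illustration of those features, here is a minimal sketch written as Python but kept to syntax that Starlark is expected to accept as well. The helpers `library` and `with_prefix`, and the field names they use, are hypothetical and are not part of Bazel or any real rule set.

```python
# Sketch only: `library` and `with_prefix` are made-up helpers,
# not real Bazel rules or Starlark built-ins.

def library(name, srcs, deps = ()):
    """Describe one build target as plain structured data."""
    return {"name": name, "srcs": list(srcs), "deps": list(deps)}

def with_prefix(prefix):
    """Return a function that creates targets under a common prefix."""
    def build(name, srcs):
        # `prefix` is captured from the enclosing scope (lexical scoping).
        return library(prefix + "_" + name, srcs)
    return build

# Functions are ordinary values, so the result can be stored and reused.
core = with_prefix("core")
TARGETS = [core(n, [n + ".c"]) for n in ["io", "net"]]
```

There are no type declarations anywhere in the sketch, the targets are ordinary dictionaries and lists, and `core` is a function stored in a variable like any other value.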
But the very thing Starlark is praised for, its closeness to Python, is also why some people dislike it. Although the resemblance lowers the learning curve for Python programmers, it does not eliminate the many safety issues associated with Python.
Haoyi Li, staff software engineer at Databricks, reflecting on his seven years of experience with Starlark in Bazel, discussed the pros and cons of using Starlark on Hacker News.
“Having a ‘hermetic’ subset of Python is nice. You can be sure your Bazel Starlark codebase isn’t making network calls, reading files, or shelling out subprocesses,” he said. This hermetic nature enforces reproducibility and determinism, and enables optimisations such as parallelisation and caching. Developers familiar with Python syntax also find it simpler than alternatives such as templated Bash or YAML scripts.
Starlark also shines in internal tools development. Ajay Kidave, who is building Clace, a platform using Go and Starlark, said that Starlark’s easily extensible plugin APIs have been great for avoiding Python’s dependency management challenges, but that the language does not support the usual error handling features.
“There are no exceptions and no multi-value return for error values. All errors result in an abort,” Kidave added. “This works for a configuration language where a fail-fast behaviour is good. But for a general purpose script, you need more fine-grained error handling.”
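The contrast Kidave describes can be sketched in plain Python. The function names and the default port below are hypothetical, chosen only for illustration. The sketch shows the kind of recovery Python allows and that, by his account, a Starlark script cannot express.

```python
def parse_port(value):
    # int() raises ValueError in Python when `value` is not a number.
    return int(value)

def port_or_default(value, default = 8080):
    # Python lets the caller recover from the failure and carry on.
    # Starlark has no try/except, so an equivalent failure there would
    # abort the whole evaluation instead of reaching a fallback.
    try:
        return parse_port(value)
    except ValueError:
        return default
```

Calling `port_or_default("not-a-port")` falls back to the default in Python. In a fail-fast configuration language the bad value would simply stop the run, which is the behaviour Kidave considers acceptable for configuration but too coarse for general-purpose scripts.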
Meanwhile, several commenters in a Reddit thread argued that the similarity to Python is not really an issue, since Starlark is an embedded language. “The code just won’t work, that’s all. It’s not even statically typed, and you’re not supposed to write edifices of engineering in it,” one of them wrote.
After Starlark was launched, many companies adopted it across dozens of products, which suggests that there was a gap within the Python ecosystem that it could fill.
Limited, Yet Complex
While Starlark’s minimalism offers safety, it can also be a curse. Haoyi Li also pointed out the pitfalls of the language. “A large Starlark codebase is a large Python codebase. Large Python codebases are imperative, untyped, and messy even without Starlark’s constraints.”
The absence of modern Python features like PEP 484 type annotations makes managing complexity difficult. Without type support, IDE assistance is minimal, leading to what he calls “spaghetti code” in large Starlark projects. Starlark’s design philosophy reflects a broader tension in build systems.
Should scripting languages prioritise simplicity and safety, or embrace complexity to accommodate genuine needs? The question has been debated for a long time.
“I think the lesson of Starlark is again confirmation that purely declarative languages are too restrictive. You need some ‘smart’ code by users. Even if it’s only in 5% of the code, that 5% is very useful and load-bearing,” said a developer, explaining that Starlark is essentially a nice mix of both paradigms.
Rochus Keller, the developer of BUSY, a lean and statically typed cross-platform build system for GCC, Clang, and MSVC, criticised the lack of modularisation and static typing in many build systems. “The achievements of software engineering, like modularisation and type checking, seem to have had little influence on build systems,” Keller said.
Meanwhile, Gradle, a JVM build system, opts for complexity by using Kotlin, a strongly typed, general-purpose language. As Mike Hearn, the creator of Hydraulic Conveyor, pointed out, Gradle leans into complexity and treats build systems as special programs. This makes the resulting mess somewhat optimisable and tractable.
Python’s flexibility comes with serious trade-offs, making programs difficult to reason about. The same goes for Starlark, which remains a subject of constant debate among programmers across language communities.
The post Starlark is Basically Python, But Not Really Python, and That’s Fine appeared first on Analytics India Magazine.