Inference is 3x Faster in Linux than in Windows

Recently, a developer shared how he switched from Windows to Linux after 30 years and saw remarkable results for AI-specific tasks. The developer, who goes by the name Inevitable-Start-653, said he was running six 24GB graphics cards, pushing the limits of what is typically used in consumer-grade setups.

As more GPUs were added to the system, the performance hit on Windows became increasingly noticeable. Despite using top-notch inferencing software like Oobabooga’s Textgen, the Windows operating system’s overhead proved to be a significant bottleneck.

Attempts to mitigate this issue using Windows Subsystem for Linux (WSL) with DeepSpeed and even upgrading to PyTorch 2.2 failed to yield noticeable improvements in inferencing speeds. Once he transitioned to a dual-boot setup with Ubuntu Linux, the user reported a dramatic improvement in performance.

Inferencing speeds increased by approximately 3x, more VRAM became available for context, and the overall system responsiveness improved significantly.

[Figure: Win vs Linux for AI inferencing. Source: Win 11 vs Ubuntu]

This performance boost not only enhanced the speed of AI tasks but also allowed for working with larger models and datasets, potentially leading to more advanced AI applications and research.

When the performance of Stable Diffusion was compared on Ubuntu and Windows, Ubuntu led by 9.5%, suggesting that Linux's advantage is not limited to text inference but extends to text-to-image models as well.

What Went Wrong with WSL?

For inference and other AI-centric tasks, the picture with WSL is mixed, and there is no clear answer. WSL 2 is essentially a lightweight virtual machine that runs Linux on top of Windows with some optimisations. This means that even if you work only inside WSL, the Windows installation underneath it still consumes a hefty amount of resources.

When a Reddit user compared the AI performance of the same hardware under Windows, WSL, and Ubuntu, Windows and WSL produced similar results, while Ubuntu was ~20-30% faster in text-generation inference workloads and ~50-60% faster in image-generation workloads (Stable Diffusion).
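Comparisons like this come down to measuring generation throughput on each operating system and taking the ratio. The following is a minimal sketch of how such a measurement could be made, not the method the Reddit user actually used. The `generate` callable is hypothetical: it stands in for whatever inference backend is being tested and is assumed to return the number of tokens it produced.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[str], int], prompt: str, runs: int = 3) -> float:
    """Average throughput of `generate` over several runs.

    `generate` is a hypothetical stand-in for an inference backend: it runs
    one generation for `prompt` and returns the number of tokens produced.
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate(prompt)
        total_seconds += time.perf_counter() - start
    if total_seconds <= 0:
        raise ValueError("elapsed time too small to measure")
    return total_tokens / total_seconds


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """How many times faster the candidate is (3.0 means '3x')."""
    return candidate_tps / baseline_tps


def percent_faster(baseline_tps: float, candidate_tps: float) -> float:
    """Relative gain in percent (25.0 means '25% faster')."""
    return (candidate_tps / baseline_tps - 1.0) * 100.0
```

Running the same prompt, model, and settings on both systems and passing the two throughput figures to `speedup` or `percent_faster` gives numbers in the same form as those quoted in this article. As a purely illustrative example with made-up inputs, `speedup(10.0, 30.0)` is `3.0` and `percent_faster(10.0, 12.5)` is `25.0`.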

“I ended up uninstalling all my WSL instals. For some reason, it took more space than an actual Ubuntu install and didn’t offer any perf gains either,” he added.

Installing Linux on bare metal is generally considered the better choice. Even if you optimise the WSL instance, you cannot expect the same degree of hardware utilisation and efficiency. Libraries like DeepSpeed might sound like a good idea, but the performance gain is negligible (one user gained only 0.05 tokens/second).

Apart from inference speeds, there are two critical problems that AI developers face. The first is that WSL is extremely poor at I/O operations between Windows and the Linux environment. AI datasets are usually large, so if you transfer them between Windows and WSL, you might end up spending more time on transfers than on the actual operation.
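One way to check whether cross-filesystem I/O is a bottleneck on a given machine is to time a sequential read of the same file from both locations. The sketch below assumes nothing beyond the Python standard library; the function name and the example paths are illustrative, not taken from any of the users quoted here.

```python
import time
from typing import Tuple


def read_throughput(path: str, chunk_size: int = 1024 * 1024) -> Tuple[int, float]:
    """Read `path` sequentially; return (bytes_read, megabytes_per_second)."""
    bytes_read = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            bytes_read += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    mb_per_s = (bytes_read / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return bytes_read, mb_per_s
```

Inside WSL, Windows drives are mounted under `/mnt/`, so calling this on a copy of a dataset file under `/mnt/c/...` and on another copy in the Linux home directory gives a rough side-by-side figure. Repeated reads of the same file can be served from the operating system's cache, so the first read of each copy is the more meaningful one.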

The second is an unstable GUI (graphical user interface). While you can run GUI applications under WSL, multiple users have reported that they are unreliable, and the common advice is to install Linux on bare metal on a secondary computer instead.

The Solution?

WSL is not entirely bad. In fact, some developers prefer it because they don't want the headache of installing Linux on bare metal and are already comfortable with Windows. However, if the workload is heavy and complex, a bare-metal installation of Linux is the better choice.

Admittedly, Linux does not always work out of the box. You might face driver issues, and if you invest in the latest hardware, it can take a while before it receives decent support from the community. That is why people are often advised to buy slightly older hardware when building a computer for Linux use.

If you decide to work with Linux on bare metal, most users recommend Ubuntu’s LTS (Long Term Support) version for stability or Arch Linux to get the most recent packages before any other distribution.

The post Inference is 3x Faster in Linux than in Windows appeared first on AIM.
