PyTorch Releases Version 2.3 with Focus on Large Language Models and Sparse Inference

PyTorch has announced the release of version 2.3, introducing several new features and improvements aimed at the performance and usability of large language models and sparse inference.

The release, which consists of 3,393 commits from 426 contributors, brings support for user-defined Triton kernels in torch.compile. This feature allows users to migrate their existing Triton kernels without experiencing performance regressions or graph breaks.
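As a rough illustration, here is a minimal sketch of calling a hand-written Triton kernel from a compiled function; the kernel and all names are illustrative examples, not taken from the release notes, and a CUDA GPU plus the `triton` package are assumed.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Standard element-wise add: each program instance handles one block.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

@torch.compile(fullgraph=True)  # fullgraph=True asserts there are no graph breaks
def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.randn(4096, device="cuda")
assert torch.allclose(add(x, x), x + x)
```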

The feature also allows TorchInductor to precompile user-defined Triton kernels and organise the surrounding code more efficiently.

Another feature is Tensor Parallelism, which enables efficient training of large language models. It facilitates tensor computations sharded across GPUs and hosts, and integrates with FSDP (Fully Sharded Data Parallel) for efficient 2D parallelism.

The PyTorch team has validated Tensor Parallelism on training runs for models with over 100 billion parameters, demonstrating its effectiveness in handling large-scale language models.
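As a hedged sketch of what the Tensor Parallel API looks like on a toy model (the two-GPU mesh, module names, and sharding plan below are assumptions for illustration, not from the release notes; launch via `torchrun` so a process group exists):

```python
# Run with: torchrun --nproc_per_node=2 tp_sketch.py
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel, RowwiseParallel, parallelize_module,
)

class MLP(nn.Module):
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.up = nn.Linear(dim, 4 * dim)
        self.down = nn.Linear(4 * dim, dim)

    def forward(self, x):
        return self.down(torch.relu(self.up(x)))

mesh = init_device_mesh("cuda", (2,))  # one tensor-parallel group of 2 GPUs
model = MLP().cuda()
# Shard the first linear column-wise and the second row-wise, the classic
# Megatron-style pairing that avoids an intermediate all-gather.
model = parallelize_module(
    model, mesh, {"up": ColwiseParallel(), "down": RowwiseParallel()}
)
out = model(torch.randn(8, 1024, device="cuda"))
```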

PyTorch 2.3 also introduces support for semi-structured sparsity, specifically the 2:4 pattern, implemented as a tensor subclass. The feature delivers up to 1.6x faster processing than dense matrix multiplication, supports mixing different data types during quantization, uses improved cuSPARSELt and CUTLASS kernels, and is compatible with torch.compile for more efficient computation.
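A minimal sketch of the tensor-subclass workflow, assuming a GPU with sparse Tensor Core support (Ampere or newer) and the prototype `torch.sparse` API; the hand-built 2:4 mask and sizes are purely illustrative:

```python
import torch
from torch.sparse import SparseSemiStructuredTensor, to_sparse_semi_structured

SparseSemiStructuredTensor._FORCE_CUTLASS = True  # prototype flag: pick the CUTLASS backend

# Build a half-precision weight that already satisfies the 2:4 pattern:
# two of every four contiguous values along a row are zero.
mask = torch.tensor([0, 0, 1, 1], dtype=torch.half, device="cuda").tile(256, 64)
dense_weight = torch.rand(256, 256, dtype=torch.half, device="cuda") * mask
sparse_weight = to_sparse_semi_structured(dense_weight)

x = torch.rand(256, 256, dtype=torch.half, device="cuda")
# The subclass dispatches the matmul to the sparse kernels transparently.
assert torch.allclose(dense_weight @ x, sparse_weight @ x, atol=1e-2)
```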

Compared to the previous version, PyTorch 2.2, which brought advancements like the integration of FlashAttention-v2 and the introduction of AOTInductor, PyTorch 2.3 builds upon these improvements and introduces new features specifically targeted at large language models and sparse inference.

With significant contributions from a large and active community, this version brings features like user-defined Triton kernels and Tensor Parallelism to collectively improve performance, scalability, and flexibility.


Synology Launches Advanced Data Management & Security Solutions Against Ransomware in India

Synology unveiled its new solutions and products on Wednesday in a press briefing in New Delhi, aimed at addressing the growing challenge posed by ransomware attacks.

The company unveiled its flagship high-density storage server, the HD6500, capable of providing up to 4.8 petabytes of storage capacity. The server can act not only as a central repository for files but also as a backup pool for large enterprises.

Synology believes its hardware, along with its software capabilities, is well designed to protect enterprise data in India.

With the enactment of the Digital Personal Data Protection Act (DPDPA) continuing to shape data management practices for businesses in India, Synology shed light on the challenges faced by small and medium-sized enterprises (SMEs) in adhering to stricter data governance requirements.

These include data breach notification protocols, data subject rights management, and the appointment of Data Protection Officers (DPOs) in certain cases.

In addition to addressing compliance challenges related to stricter DPDPA requirements, the event also focused on pressing cybersecurity concerns.

These included recent data breaches such as the Polycab and AIIMS incidents (AIIMS took more than two weeks to fully restore its data and operations), ransomware attacks targeting Indian businesses, and cloud security risks around data residency and provider trustworthiness.

The company emphasised the importance of a comprehensive backup strategy in combating ransomware attacks.

Russell Chen, country manager for the Synology SAARC region, said the company recorded 20% market growth last year, driven mainly by increasing demand from SMBs and enterprises.

“The surge in demand for Synology’s backup solutions, exceeding 20%, can be attributed to India’s emergence as a prominent data centre hub in the region. This growth has been further fueled by government initiatives encouraging businesses to allocate a greater portion of their IT budget towards security,” Chen said at the launch event.

“The landscape of data management and security is evolving rapidly, compelling businesses to stay ahead of the trend to ensure compliance and safeguard sensitive information,” Chen added.


Why NVIDIA is Acquiring Run:ai


NVIDIA, the GPU giant, is slowly turning into an acquisition machine. Most recently, it announced its intention to acquire Run:ai, a Kubernetes-based workload management and orchestration software provider.

NVIDIA has entered into a definitive agreement to acquire Run:ai to help its customers make more efficient use of their AI computing resources. The Tel Aviv-based company simplifies the management and optimisation of AI hardware infrastructure for developers and operations teams.

While the terms of the deal have not been publicly disclosed, sources close to the matter revealed that the acquisition was valued at $700 million.

“Run:ai has been collaborating closely with NVIDIA since 2020 and we share a passion for helping our customers make the most of their infrastructure,” said Omri Geller, Run:ai co-founder and CEO. “We’re thrilled to join NVIDIA and look forward to continuing our journey together.”

Run:ai was founded in 2018, and it launched its first product in 2020. The company’s founders, Omri Geller and Ronen Dar, met at Tel Aviv University while working on their master’s and PhD degrees, respectively, under the supervision of Professor Meir Feder.

Through their work, they identified a clear trend in the industry – a constant and growing demand for sufficient compute power to accelerate machine learning and deep learning, often surpassing available infrastructure. Recognising this challenge, they joined forces to find a solution, leading to the establishment of Run:ai.

Run:ai’s clientele includes some of the world’s largest enterprises across multiple industries, such as Adobe, Sony, Zebra, and the University of Oxford, which use the platform to manage data-centre-scale GPU clusters. NVIDIA has stated that it will continue to offer Run:ai’s products “under the same business model” and will invest in Run:ai’s product roadmap as part of NVIDIA’s DGX Cloud AI platform.

DGX for the win

NVIDIA’s Alexis Bjorlin said in a blog post that as customers’ NVIDIA AI deployments grow in complexity, with workloads spread across cloud, edge, and on-premises data centre infrastructure, the need for effective management and orchestration becomes increasingly important.

The Run:ai platform provides AI developers and their teams with a centralised interface to manage shared compute infrastructure, ensuring easier and faster access for complex AI workloads. It offers functionality to add users, organise them into teams, grant access to cluster resources, control quotas, priorities, and pools, as well as monitor and generate reports on resource usage.

Additionally, the platform enables the pooling of GPUs and sharing of computing power, from fractions of GPUs to multiple GPUs or multiple nodes of GPUs across different clusters, for separate tasks. This efficient utilisation of GPU cluster resources allows customers to maximise the return on their compute investments.

Run:ai specialises in enabling enterprise customers to efficiently manage and optimise their compute infrastructure, whether it is located on premises, in the cloud, or in hybrid environments.

The company has developed an open platform on Kubernetes, which serves as the orchestration layer for modern AI and cloud infrastructure. This platform is compatible with all popular Kubernetes variants and seamlessly integrates with third-party AI tools and frameworks.

Another AI investment for NVIDIA

This is one of NVIDIA’s biggest acquisitions since Mellanox, which it bought for $6.9 billion in March 2019. NVIDIA also acquired OmniML in March 2023, a company that helped it run ML models on edge devices.

Now, NVIDIA DGX server, workstation, and DGX Cloud customers will also have access to Run:ai’s capabilities for their AI workloads, particularly for generative AI deployments across multiple data centre locations.

NVIDIA’s DGX Cloud is currently hosted on Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure. NVIDIA also plans to self-host DGX Cloud on Blackwell systems this year.

Although Run:ai faces limited direct competition, other companies are also exploring dynamic hardware allocation for AI workloads. One such example is Grid.ai, which provides software enabling data scientists to train AI models across GPUs, processors, and other hardware components simultaneously.

NVIDIA is vertically integrating its platform, making it a one-stop shop for AI infrastructure needs. In 2023, NVIDIA made a total of 40 investments, and by this month in 2024, it had already reached its 12th. The idea is simple: fund and acquire companies that use NVIDIA’s GPUs – which is virtually all companies at this point.


Can Artificial Intelligence Make Insurance More Affordable?

AI is rapidly transforming industries by optimizing processes, enhancing data analytics and creating smarter, more efficient systems. Traditionally, the insurance sector has determined pricing by manually analyzing various factors, including coverage type, to calculate risk and set premiums.

Imagine harnessing AI’s power to sift through massive datasets more accurately and efficiently. It promises faster service and potentially fairer pricing for policyholders. This shift could revolutionize how insurers calculate premiums to make the process more transparent and tailored to individual risk profiles.

Basics of Insurance Pricing

Insurance companies traditionally determine premiums by analyzing age, location and the type of coverage clients seek. For instance, premiums might increase as policyholders age, primarily because being older typically corresponds with more health complications or a shorter life span. These aspects increase the risk to insurers.

Companies also consider where customers live because different areas carry varying risk levels due to crime rates or environmental hazards. Insurers face the challenge of balancing accurate risk assessment with competitive pricing: they must offer attractive rates while still covering potential costs. This balance is crucial for their business viability and for policyholders’ financial protection.
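To make the factor-based approach concrete, here is a deliberately toy calculation; every number and name below is invented for illustration, and real actuarial models are far more involved:

```python
def annual_premium(base_rate: float, age: int, area_risk: float) -> float:
    """Toy premium: a base rate scaled by age and location risk factors."""
    age_factor = 1.0 + max(0, age - 30) * 0.02   # +2% per year over 30
    return round(base_rate * age_factor * area_risk, 2)

# area_risk = 1.0 means an average area; 1.2 means 20% riskier than average.
print(annual_premium(base_rate=500.0, age=45, area_risk=1.2))  # 780.0
```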

AI in Insurance

Currently, 80% of insurance companies utilize AI and machine learning to manage and analyze their data. This widespread adoption underscores its critical role in modernizing and streamlining the industry.

Integrating AI technology allows insurers to handle large volumes of information with unprecedented precision and speed. This capability lets them assess risk, set premiums and detect fraud more effectively than before. It means quicker service and more accurate pricing that reflects actual risk rather than a one-size-fits-all estimate.

The potential of AI to enhance decision-making processes in the insurance sector is immense. Advanced algorithms enable companies to predict outcomes, personalize policies and optimize claims management. This approach can also reduce human error and increase efficiency.

These improvements bolster insurers’ bottom lines and enhance the policyholder experience, with clients benefiting from more tailored coverage options and more responsive service. As AI evolves, it can significantly impact the industry, offering smarter, more adaptable insurance solutions.

AI-Driven Changes in Insurance Pricing Models

AI and machine learning significantly enhance the accuracy of risk assessment by integrating and analyzing vast datasets. These technologies study complex patterns that human analysts might overlook and enable a deeper understanding of risk factors specific to each policyholder. It means insurers can tailor their offerings more precisely, reflecting actual risk rather than a generalized model.

AI’s ability to process large volumes of data accelerates claims processing and ensures clients receive compensation more quickly when needed. Additionally, these tools are adept at detecting fraudulent activities, which protects both the insurer and policyholders from potential financial losses.

AI technologies manifest in various innovative forms, such as telematics, wearables and IoT devices. These contribute to more accurate risk assessments and premium calculations.

Telematics devices in vehicles track driving behaviors, providing insurers with data on how safely clients drive, which can lead to personalized premium rates or discounts. Wearables, like fitness trackers, offer insights into their health and lifestyle, potentially lowering health insurance costs by demonstrating active and healthy habits.

Similarly, IoT devices in houses can monitor risks — like fire or theft — to improve safety and potentially reduce home insurance premiums. These technologies collectively enhance the interaction with insurers and offer benefits for maintaining safer practices and a healthier lifestyle.

Benefits of AI-Enhanced Pricing for Insurers

The increased accuracy in premium calculation through AI mitigates risk, leading to potential cost reductions for insurance companies and policyholders.

This is significant because insurers can streamline operations and pass these savings onto clients through lower premiums. Moreover, the precision of AI analyses dramatically diminishes the likelihood of over- or underpricing risk. It ensures policyholders pay a fair rate corresponding to their actual risk level.

AI also enhances customer segmentation, creating personalized insurance products tailored to individual needs. This personalization happens through analyzing detailed data points, which allows insurers to understand various client segments more profoundly and offer products that more accurately fit different lifestyles and risk profiles.

Additionally, it automates routine tasks and analyses — like data entry and claim processing — which speeds up these operations and reduces the chance of human error. It results in faster service and more reliable insurance coverage because AI helps companies manage policies and claims precisely and efficiently.

Implications for Policyholders

The advent of AI in insurance has led to a significant shift toward fairer, usage-based premiums, which could be a game-changer for policyholders. In 2023, the average annual health insurance premiums were $8,435 for single coverage and $23,968 for family coverage, a considerable expense for many.

However, by incorporating AI, insurers can tailor premiums more closely to actual usage and risk level, lowering costs. This personalized approach makes insurance more accessible and rewards policyholders for healthy lifestyles or safe driving practices with reduced rates. It aligns their costs more directly with their personal risk factors.

Conversely, integrating AI into insurance raises valid privacy and data security concerns. As insurers collect and analyze more personal data to fine-tune policy offerings and streamline claims, the risk of breaches or misuse increases.

They must invest heavily in securing data in addition to using AI to process claims faster and settle disputes more accurately. This means implementing robust cybersecurity measures and transparent data usage policies to protect clients’ sensitive information. Likewise, policyholders must stay informed about how organizations handle their information and understand their rights to navigate these changes confidently.

Challenges and Ethical Considerations

As AI becomes integral to the insurance industry, it brings ethical issues concerning data use, algorithmic bias and transparency. Clients’ personal information is crucial for tailoring policies, but there is a fine line between use and misuse, which underscores the need for precise data-handling and consent policies.

Bias in AI algorithms can lead to unfair policy rates or claim denials if developers don’t monitor and correct them. On top of these concerns, the regulatory landscape struggles to keep pace with AI’s rapid development, necessitating new frameworks to ensure its positive and well-regulated impact.

Additionally, generative AI is reshaping the workforce and is the second leading cause of job losses after industrial and humanoid robots. This shift prompts a need for reskilling and transitioning strategies within the sector to mitigate employment impacts. It makes it essential for insurers to stay informed and adaptable as the industry evolves.

The Future of AI in Insurance Pricing

AI will continue to transform the insurance landscape. Industry experts estimate that generative AI could contribute approximately $7 trillion to the global GDP over the next decade. This significant economic impact underscores the potential for groundbreaking innovations and emerging technologies within the insurance experience.

Insurers can also use sophisticated AI applications to further personalize premium calculations, risk assessments and claims processing. Innovations — like real-time risk modeling, blockchain for transparent and secure policy management, and AI-driven virtual assistants for customer service — are likely to become standard features. These advancements will refine how people interact with insurance providers and ensure greater accuracy and efficiency in managing needs.

Navigating the AI Revolution in Insurance Responsibly

Policyholders and industry leaders must engage with AI responsibly as it reshapes the insurance landscape. Embrace AI’s potential to enhance the insurance experience while advocating for transparency, fairness and security in its deployment to ensure it benefits everyone involved.

GitHub Copilot Rival, Augment Secures $252 Mn at $1 Bn Valuation to Boost AI for Developers

Augment, a GitHub Copilot alternative, recently announced that it raised $227 million in a Series B funding round at a $977 million post-money valuation.

Augment plans to use the latest funding to accelerate product development and expand its engineering and go-to-market functions as the company gears up for rapid growth.

Investors in the round included Sutter Hill Ventures, Index Ventures, Innovation Endeavors, Lightspeed Venture Partners, and Meritech Capital, among others.

Augment was founded in 2022 by Igor Ostrovsky, former chief architect at Pure Storage and software engineer at Microsoft, and Guy Gur-Ari, an AI researcher from Google.

The company is led by Scott Dietzen, who has previous leadership experience at Pure Storage, Yahoo, and WebLogic/BEA Systems, and Dion Almaer, an alumnus of Google, Shopify, Mozilla, and Palm.

So far, Augment has raised $252 million, following its $25 million Series A led by Sutter Hill Ventures.

“Augment has built a truly brilliant team, among the best in enterprise AI, and as good as any team we have ever helped put together,” said Mike Speiser, managing director at Sutter Hill Ventures.

Over $1 trillion is spent on software engineering annually, yet most companies remain dissatisfied with the programs they produce and consume. AI is seen as a remedy, with Gartner predicting that “by 2027, 50% of enterprise software engineers will use ML-powered coding tools.”

“Software remains far too expensive and painful to develop. AI is poised to transform coding, and after surveying the landscape, we came away convinced that Augment has both the best team and recipe for empowering programmers and their organisations to deliver more and better software,” explained Eric Schmidt, founding partner at Innovation Endeavors and former CEO of Google.


Augment-ing Developers

Ty Schenk, CEO of Keeta, said that Augment is solving real-world engineering challenges with its contextual awareness of Keeta’s codebase. “We are seeing a >40% increase in developer productivity across the board,” he said.

In a blog post, Dietzen shared his vision for Augment: “The next decade will see the biggest leap forward in software quality and team productivity since the advent of high-level languages.”

He believes that Augment’s AI will be capable of ever-deeper reasoning, restoring the joy of crafting software.

Almaer, VP of Product at Augment, highlighted the platform’s key features, including an expert understanding of large codebases, the ability to produce running code, and fast inference that operates at 3x the speed of competitors, powered by state-of-the-art techniques like custom GPU kernels.

Importantly, the platform was designed from the first line of code for tenant isolation, with an architecture built to protect companies’ precious source code and intellectual property.


Why IBM is Acquiring HashiCorp


IBM has announced its plans to acquire HashiCorp in a deal valued at $6.4 billion, aiming to expand its cloud-based software products and capitalise on the growing demand for AI-powered solutions. The deal is expected to close by the end of 2024.

The acquisition of HashiCorp by IBM establishes a comprehensive end-to-end hybrid cloud platform designed to address AI-driven complexity. Combining the two companies’ portfolios and talent will give clients extensive application, infrastructure, and security lifecycle management capabilities.

HashiCorp will continue working as a division within IBM.

According to Stephen Elliot, a vice president at market research firm International Data Corp, “This is a smart deal for IBM. They’re buying a leader and it complements their existing portfolio.” Upon completion, HashiCorp is expected to drive significant synergies for IBM, particularly in strategic growth areas such as Red Hat, Watsonx, data security, IT automation, and consulting.

IBM will make HashiCorp open again

HashiCorp boasts a diverse clientele of more than 4,400 clients, including prominent names such as Bloomberg, Comcast, Deutsche Bank, GitHub, JPMorgan Chase, Starbucks, and Vodafone.

Arvind Krishna, IBM chairman and chief executive officer, said, “HashiCorp has a proven track record of enabling clients to manage the complexity of today’s infrastructure and application sprawl. Combining IBM’s portfolio and expertise with HashiCorp’s capabilities and talent will create a comprehensive hybrid cloud platform designed for the AI era.”

HashiCorp’s offerings enjoy widespread adoption within the developer community, with 85% of the Fortune 500 utilising its products. In HashiCorp’s FY2024, its community products across infrastructure and security were downloaded more than 500 million times. The only sticking point is its change of licensing in the past few years.

Some experts predict that IBM’s open-source strategy might reverse HashiCorp’s abrupt switch to the Business Source License (BSL) last year and make its products open source again. Others say HashiCorp might become just another cog in the IBM wheel.

One of HashiCorp’s products, Terraform, provides organisations with a unified workflow for provisioning cloud, private datacenter, and SaaS infrastructure, and enables continuous management throughout the infrastructure lifecycle. However, it has been losing users since the licence change, with many shifting to OpenTofu, its open-source fork.

Another product, Vault, offers identity-based security, allowing organisations to automatically authenticate and authorise access to secrets and other sensitive data. In December, it was forked into an open-source project called OpenBao – interestingly, by an IBM engineer.

“If I were the leader at IBM responsible for the HashiCorp deal, moving all HashiCorp products to Apache-2.0 would be the first decision I would make, and ensure that decision was highlighted in the initial press release,” said Kelsey Hightower.


In addition to these core products, HashiCorp offers:

  • Boundary: A solution for secure remote access
  • Consul: Facilitates service-based networking
  • Nomad: Provides workload orchestration
  • Packer: Used for building and managing images as code
  • Waypoint: An internal developer platform

Additionally, the two companies anticipate accelerating HashiCorp’s growth initiatives by leveraging IBM’s go-to-market strength, scale, and global reach across more than 175 countries.

An ideal deal for HashiCorp as well

Founded in 2012, HashiCorp believes that it is still in the early stages of cloud adoption. “With IBM, we have the opportunity to help more customers get there faster to accelerate our product innovation, and to continue to grow our practitioner community,” said Armon Dadgar, HashiCorp co-founder and CTO.

By combining HashiCorp’s offerings with those of IBM and Red Hat, clients will have a platform to automate the deployment and orchestration of workloads across evolving infrastructure, including hyperscale cloud service providers, private clouds, and on-premises environments.

“Our strategy at its core is about enabling companies to innovate in the cloud, while providing a consistent approach to managing cloud at scale. The need for effective management and automation is critical with the rise of multi-cloud and hybrid cloud, which is being accelerated by today’s AI revolution,” said Dadgar.

“I’m incredibly excited by the news and to be joining IBM to accelerate HashiCorp’s mission and expand access to our products to an even broader set of developers and enterprises,” he added.


7 End-to-End MLOps Platforms You Must Try in 2024


Do you ever feel like there are too many tools for MLOps? There's a tool for experiment tracking, data and model versioning, workflow orchestration, feature store, model testing, deployment and serving, monitoring, runtime engines, LLM frameworks, and more. Each category of tool has multiple options, making it confusing for managers and engineers who want a simple solution, a unified tool that can easily perform almost all the MLOps tasks. This is where end-to-end MLOps platforms come in.

In this blog post, we will review the best end-to-end MLOps platforms for personal and enterprise projects. These platforms will enable you to create an automated machine learning workflow that can train, track, deploy, and monitor models in production. Additionally, they offer integrations with various tools and services you may already be using, making it easier to transition to these platforms.

1. AWS SageMaker

Amazon SageMaker is a popular cloud solution for the end-to-end machine learning life cycle. You can track, train, evaluate, and then deploy models into production. Furthermore, you can monitor and retrain models to maintain quality, optimize compute resources to save cost, and use CI/CD pipelines to fully automate your MLOps workflow.

If you are already on the AWS (Amazon Web Services) cloud, you will have no problem using it for your machine learning projects. You can also integrate the ML pipeline with other services and tools that come with the Amazon cloud.

Similar to AWS SageMaker, you can try Vertex AI and Azure ML. They provide similar functions and tools for building an end-to-end MLOps pipeline, integrated with their respective cloud services.

2. Hugging Face

I am a big fan of the Hugging Face platform and the team behind it, which builds open-source tools for machine learning and large language models. The platform is now end-to-end, offering enterprise solutions for multi-GPU model inference. I highly recommend it for people who are new to cloud computing.

Hugging Face comes with tools and services that can help you build, train, fine-tune, evaluate, and deploy machine learning models using a unified system. It also allows you to save and version models and datasets for free. You can keep them private or share them with the public and contribute to open-source development.

Hugging Face also provides solutions for building and deploying web applications and machine learning demos. This is the best way to showcase to others how terrific your models are.
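As a quick, hedged sketch of the Hub workflow described above (the repository name is a placeholder, and you need to be logged in via `huggingface-cli login` first):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Each push creates a commit in the model repo, so weights are versioned like code.
model.push_to_hub("your-username/demo-model")      # placeholder repo name
tokenizer.push_to_hub("your-username/demo-model")
```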

3. Iguazio MLOps Platform

Iguazio MLOps Platform is the all-in-one solution for your MLOps life cycle. You can build a fully automated machine-learning pipeline for data collection, training, tracking, deploying, and monitoring. It is inherently simple, so you can focus on building and training amazing models instead of worrying about deployments and operations.

Iguazio allows you to ingest data from all kinds of data sources, comes with an integrated feature store, and has a dashboard for managing and monitoring models in real-time production. Furthermore, it supports automated tracking, data versioning, CI/CD, continuous model performance monitoring, and model drift mitigation.

4. DagsHub

DagsHub is my favorite platform. I use it to build and showcase my portfolio projects. It is similar to GitHub but for data scientists and machine learning engineers.

DagsHub provides tools for code and data versioning, experiment tracking, a model registry, continuous integration and deployment (CI/CD) for model training and deployment, model serving, and more. It is an open platform, meaning anyone can build, contribute, and learn from the projects.

The best features of the DagsHub are:

  • Automatic data annotation.
  • Model serving.
  • ML pipeline visualization.
  • Diffing and commenting on Jupyter notebooks, code, datasets, and images.

The only thing it lacks is a dedicated compute instance for model inference.

5. Weights & Biases

Weights & Biases started as an experiment tracking platform but has evolved into an end-to-end machine learning platform. It now provides experiment visualization, hyperparameter optimization, model registry, workflow automation and management, monitoring, and no-code ML app development. Moreover, it also comes with LLMOps solutions, such as exploring and debugging LLM applications and evaluating GenAI applications.

Weights & Biases offers both cloud and private hosting: you can host the server locally or use the managed service. It is free for personal use, but you have to pay for team and enterprise solutions. You can also use the open-source core library to run it on your local machine and enjoy privacy and control.

6. Modelbit

Modelbit is a new but fully featured MLOps platform. It provides an easy way to train, deploy, monitor, and manage models. You can deploy a trained model using Python code or the `git push` command.

Modelbit is made for both Jupyter Notebook lovers and software engineers. Apart from training and deploying, Modelbit lets you run models on auto-scaling compute using your preferred cloud service or its dedicated infrastructure. It is a true MLOps platform that lets you log, monitor, and set alerts on models in production. Moreover, it comes with a model registry, auto retraining, model testing, CI/CD, and workflow versioning.
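Here is a hedged sketch of Modelbit's notebook-to-endpoint flow, based on its documented login-and-deploy pattern; the scikit-learn model and function name are illustrative assumptions:

```python
import modelbit
from sklearn.linear_model import LinearRegression

mb = modelbit.login()  # opens a browser window to authenticate

model = LinearRegression().fit([[1], [2], [3]], [2, 4, 6])

def predict_double(x: float) -> float:
    # Modelbit packages this function and its dependencies into a REST endpoint.
    return float(model.predict([[x]])[0])

mb.deploy(predict_double)
```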

7. TrueFoundry

TrueFoundry is a fast and cost-effective way of building and deploying machine learning applications. It can be installed on any cloud and used locally. TrueFoundry also comes with multi-cloud management, autoscaling, model monitoring, version control, and CI/CD.

Train the model in the Jupyter Notebook environment, track the experiments, save the model and metadata using the model registry, and deploy it with one click.

TrueFoundry also provides support for LLMs: you can easily fine-tune open-source LLMs and deploy them on optimized infrastructure. Moreover, it integrates with open-source model training tools, model serving and storage platforms, version control, Docker registries, and more.

Final Thoughts

All the platforms I mentioned earlier are enterprise solutions. Some offer a limited free option, and some have an open-source component attached to them. However, eventually, you will have to move to a managed service to enjoy a fully featured platform.

If this blog post becomes popular, I will introduce you to free, open-source MLOps tools that provide greater control over your data and resources.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.


AIOS: Operating System for LLM Agents


Over the past six decades, operating systems have evolved progressively, advancing from basic systems to the complex and interactive operating systems that power today's devices. Initially, operating systems served as a bridge between the binary functionality of computer hardware, such as gate manipulation, and user-level tasks. Over the years, however, they have developed from simple batch job processing systems to more sophisticated process management techniques, including multitasking and time-sharing. These advancements have enabled modern operating systems to manage a wide array of complex tasks. The introduction of graphical user interfaces (GUIs) like Windows and MacOS has made modern operating systems more user-friendly and interactive, while also expanding the OS ecosystem with runtime libraries and a comprehensive suite of developer tools.

Recent innovations include the integration and deployment of Large Language Models (LLMs), which have revolutionized various industries by unlocking new possibilities. More recently, LLM-based intelligent agents have shown remarkable capabilities, achieving human-like performance on a broad range of tasks. However, these agents are still in the early stages of development, and current techniques face several challenges that affect their efficiency and effectiveness. Common issues include the sub-optimal scheduling of agent requests over the large language model, complexities in integrating agents with different specializations, and maintaining context during interactions between the LLM and the agent. The rapid development and increasing complexity of LLM-based agents often lead to bottlenecks and sub-optimal resource use.

To address these challenges, this article will discuss AIOS, an LLM agent operating system designed to integrate large language models as the ‘brain’ of the operating system, effectively giving it a ‘soul’. Specifically, the AIOS framework aims to facilitate context switching across agents, optimize resource allocation, provide tool services for agents, maintain access control, and enable concurrent execution of agents. We will delve into the AIOS framework, exploring its mechanisms, methodology, and architecture, and compare it with state-of-the-art frameworks. Let's dive in.

After the remarkable success of large language models, the next focus of the AI and ML industry is developing autonomous AI agents that can operate independently, make decisions on their own, and perform tasks with minimal or no human intervention. These AI-based intelligent agents are designed to understand human instructions, process information, make decisions, and take appropriate actions, with the advent of large language models bringing new possibilities to their development. Current models, including GPT and DALL-E, have shown remarkable abilities to understand human instructions, reason, solve problems, and interact with human users and external environments. Built on top of these powerful and capable large language models, LLM-based agents have strong task fulfillment abilities in diverse environments, ranging from virtual assistants to more sophisticated systems involving problem solving, reasoning, planning, and execution.

Consider an example of how an LLM-based autonomous agent solves a real-world task: the user asks the system for trip information, and the travel agent breaks the task down into executable steps. The agent then carries out the steps sequentially, booking flights, reserving hotels, processing payments, and more. What sets these agents apart from traditional software applications is their ability to make decisions and incorporate reasoning while executing the steps. As the quality of these autonomous agents grows exponentially, so does the strain on the underlying large language models and operating systems; for example, prioritizing and scheduling agent requests over limited LLM capacity poses a significant challenge. Furthermore, because generation becomes time-consuming with lengthy contexts, the scheduler may suspend an in-progress generation, which raises the problem of devising a mechanism to snapshot the model's current generation state. This enables pause/resume behavior even when the large language model has not finalized the response for the current request.

To address the challenges mentioned above, AIOS, a large language model operating system, aggregates LLM and OS functionalities while keeping their modules isolated. The AIOS framework proposes an LLM-specific kernel design to avoid potential conflicts between tasks that are and are not associated with the large language model. The proposed kernel segregates operating-system-like duties, especially those that oversee the LLM agents, development toolkits, and their corresponding resources. Through this segregation, the LLM kernel aims to enhance the coordination and management of LLM-related activities.

AIOS : Methodology and Architecture

Six major mechanisms are involved in the working of the AIOS framework; a toy scheduler sketch follows the list.

  • Agent Scheduler: Schedules and prioritizes agent requests to optimize utilization of the large language model.
  • Context Manager: Snapshots and restores the intermediate generation state of the large language model, and manages its context window.
  • Memory Manager: Provides short-term memory for each agent’s interaction log.
  • Storage Manager: Persists agents’ interaction logs to long-term storage for future retrieval.
  • Tool Manager: Manages agents’ calls to external API tools.
  • Access Manager: Enforces privacy and access control policies between agents.
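To ground the Agent Scheduler idea, here is a toy FIFO sketch in Python; every class and method name is an illustrative invention, not the AIOS API:

```python
import queue
from dataclasses import dataclass, field

@dataclass(order=True)
class AgentRequest:
    arrival: int                          # FIFO key: earlier arrivals run first
    agent_name: str = field(compare=False)
    prompt: str = field(compare=False)

class FIFOScheduler:
    """Serve agent requests to a single shared LLM in arrival order."""
    def __init__(self, llm_call):
        self._q = queue.PriorityQueue()
        self._llm_call = llm_call         # callable that actually queries the LLM
        self._counter = 0

    def submit(self, agent_name: str, prompt: str) -> None:
        self._q.put(AgentRequest(self._counter, agent_name, prompt))
        self._counter += 1

    def run(self) -> None:
        while not self._q.empty():
            req = self._q.get()
            print(f"[{req.agent_name}] {self._llm_call(req.prompt)}")

sched = FIFOScheduler(llm_call=lambda p: f"(response to {p!r})")
sched.submit("math_agent", "integrate x^2")
sched.submit("travel_agent", "book a flight to Delhi")
sched.run()
```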

In addition to the above mentioned mechanisms, the AIOS framework features a layered architecture, and is split into three distinct layers: the application layer, the kernel layer, and the hardware layer. The layered architecture implemented by the AIOS framework ensures the responsibilities are distributed evenly across the system, and the higher layers abstract the complexities of the layers below them, allowing for interactions using specific modules or interfaces, enhancing the modularity, and simplifying system interactions between the layers.

Starting with the application layer: this layer is used for developing and deploying application agents, such as math or travel agents. Here, the AIOS framework provides the AIOS software development kit (AIOS SDK), a higher abstraction over system calls that simplifies development for agent developers. The SDK offers a rich toolkit for building agent applications, abstracting away the complexities of lower-level system functions so that developers can focus on the essential logic and functionality of their agents, resulting in a more efficient development process.

The kernel layer is divided into two components: the LLM kernel and the OS kernel, each serving the unique requirements of LLM-specific and non-LLM operations respectively. The distinction allows the LLM kernel to focus on tasks such as agent scheduling and context management, which are essential for handling LLM-related activities. The AIOS framework concentrates primarily on enhancing the LLM kernel without significantly altering the structure of the existing OS kernel. The LLM kernel comes equipped with several key modules, including the agent scheduler, memory manager, context manager, storage manager, access manager, tool manager, and the LLM system call interface. The components within the kernel layer are designed to address the diverse execution needs of agent applications, ensuring effective execution and management within the AIOS framework.

Finally, the hardware layer comprises the physical components of the system, including the GPU, CPU, peripheral devices, disk, and memory. The LLM kernel's system calls cannot interact with the hardware directly; instead, they interface with the OS's system calls, which in turn manage the hardware resources. This indirect interaction creates a layer of security and abstraction, allowing the LLM kernel to leverage hardware capabilities without managing the hardware directly, which helps maintain the integrity and efficiency of the system.

Implementation

As mentioned above, six major mechanisms are involved in the working of the AIOS framework. The agent scheduler is designed to manage agent requests efficiently by interleaving execution steps from different agents. This contrasts with a traditional sequential execution paradigm, in which all steps from one agent are processed before moving on to the next, increasing waiting times for tasks that appear later in the execution sequence. The agent scheduler employs strategies like Round Robin, First In First Out (FIFO), and other scheduling algorithms to optimize the process.

The context manager is responsible for managing the context provided to the large language model and the generation process given that context. It involves two crucial components: context snapshot and restoration, and context window management. The snapshot-and-restoration mechanism mitigates situations where the scheduler suspends agent requests mid-generation.
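A toy illustration of the snapshot-and-restore idea: the real context manager snapshots the LLM's decoding state, whereas this sketch simply keeps the tokens generated so far, and all names are invented:

```python
class ToyContextManager:
    def __init__(self):
        self._snapshots = {}

    def snapshot(self, request_id: str, generated_tokens: list) -> None:
        # Persist the partial generation when the scheduler suspends the request.
        self._snapshots[request_id] = list(generated_tokens)

    def restore(self, request_id: str) -> list:
        # Hand the partial generation back so decoding resumes where it stopped.
        return self._snapshots.pop(request_id, [])

ctx = ToyContextManager()
ctx.snapshot("req-42", ["The", "flight", "is"])
tokens = ctx.restore("req-42")     # resume with ["The", "flight", "is"]
tokens += ["booked", "."]
```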

The memory manager is responsible for short-term memory within an agent's lifecycle, ensuring that data is stored and accessible only while the agent is active, either during runtime or while the agent is waiting for execution.

The storage manager, on the other hand, preserves data in the long run, overseeing the storage of information that must be retained indefinitely, beyond the activity lifespan of any individual agent. The AIOS framework achieves permanent storage using a variety of durable mediums, including cloud-based solutions, databases, and local files, ensuring data availability and integrity. Furthermore, the tool manager manages a varied array of API tools that extend the functionality of the large language model, integrating commonly used tools from various sources and classifying them into categories.

The access manager organizes access control operations between distinct agents by administering a dedicated privilege group for each agent, denying access to an agent's resources to anyone excluded from that privilege group. The access manager is also responsible for compiling and maintaining auditing logs, which further enhance the transparency of the system.

AIOS : Experiments and Results

The evaluation of the AIOS framework is guided by two research questions: first, how well does AIOS scheduling balance waiting time and turnaround time, and second, are the LLM's responses to agent requests consistent after agent suspension?

To answer the consistency question, the developers ran each of the three agents individually and then executed them in parallel, capturing their outputs at each stage. The BERT and BLEU scores both reach 1.0, indicating perfect alignment between the outputs generated in the single-agent and multi-agent configurations.

To answer the efficiency question, the developers conducted a comparative analysis between the AIOS framework employing FIFO (First In First Out) scheduling and a non-scheduled approach in which the agents run in a predefined sequential order: Math agent, Narrating agent, and Rec agent. Temporal efficiency is assessed with two metrics, waiting time and turnaround time; since each agent sends multiple requests to the large language model, an agent's waiting time and turnaround time are calculated as averages over all of its requests. The non-scheduled approach performs satisfactorily for agents early in the sequence but suffers from extended waiting and turnaround times for agents later in the sequence, whereas the scheduling approach implemented by the AIOS framework regulates both waiting and turnaround times effectively.

Final Thoughts

In this article, we have talked about AIOS, an LLM agent operating system designed to embed large language models into the OS as its brain, enabling an operating system with a soul. More specifically, the AIOS framework is designed to facilitate context switching across agents, optimize resource allocation, provide tool services for agents, maintain access control for agents, and enable concurrent execution of agents. The AIOS architecture demonstrates the potential to facilitate the development and deployment of LLM-based autonomous agents, resulting in a more effective, cohesive, and efficient AIOS-Agent ecosystem.

Reid Hoffman Creates a DeepFake of Himself, Reid AI

Reid Hoffman, the co-founder of LinkedIn, recently discussed various AI topics in an interview with a virtual AI twin of himself, which he shared in a post on X.

He said he deepfaked himself to see if conversing with an AI-generated version of himself could lead to self-reflection, new insights into his thought patterns, and deep truths.

Why did I deepfake myself? To see if conversing with an AI-generated version of myself can lead to self-reflection, new insights into my thought patterns, and deep truths. pic.twitter.com/DWODoZ9lXL

— Reid Hoffman (@reidhoffman) April 24, 2024

In the interview with his virtual twin, he posed several questions to assess it, including how to summarise his 336-page book on blitzscaling in one sentence. He also asked which of them would make a better video host, to which the response highlighted the virtual twin’s ability to excel at hosting content that involves extensive data, frequent updates, or multiple languages.

Rohit Bhargava recently shared an intriguing segment from Hoffman’s video discussing the humanising of Hoffman’s LinkedIn page, calling it a pinnacle of AI’s ability to personalise and humanise digital platforms and an experiment well worth delving into.

Towards the conclusion of the video, Hoffman himself was questioned about the strategic shift at Inflection AI (a company he co-founded) and the ramifications of co-founder and CEO Mustafa Suleyman’s recent transition to Microsoft as the CEO of the newly established Microsoft AI division.

Hoffman characterised Inflection AI as a “remarkable entity” adept in both emotional intelligence and cognitive prowess.

Hoffman explained to his AI interviewer that Mustafa’s passion lies in building consumer products at scale, but the business’s development would take years. He emphasised that the true startup opportunity lies in the developer/API business. With the shift to Microsoft, Suleyman can now concentrate on consumer opportunities without immediate pressure to prove the business model.

Hoffman further added, “I found it somewhat intriguing, as if it opened a path towards enhancing my humanity, a means to express myself more authentically. It’s akin to the insights gained from watching a video of oneself, discovering nuances that refine our communication skills.”

