7 Steps to Mastering MLOps


Many companies today want to incorporate AI into their workflow, specifically by fine-tuning large language models and deploying them to production. Due to this demand, MLOps engineering has become increasingly important. Rather than hiring just data scientists or machine learning engineers, companies are looking for individuals who can automate and streamline the process of training, evaluating, versioning, deploying, and monitoring models in the cloud.

In this beginner's guide, we will focus on the seven essential steps to mastering MLOps engineering, including setting up the environment, experiment tracking and versioning, orchestration, continuous integration/continuous delivery (CI/CD), model serving and deployment, and model monitoring. In the final step, we will build a fully automated end-to-end machine-learning pipeline using various MLOps tools.

1. Local and Cloud Environment Setup

In order to train and evaluate machine learning models, you will first need to set up both a local and cloud environment. This involves containerizing machine learning pipelines, models, and frameworks using Docker. After that, you will learn to use Kubernetes to automate the deployment, scaling, and management of these containerized applications.
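As a taste of what containerizing a training job looks like, here is a minimal, illustrative Dockerfile (the file names and base image version are placeholders for your own project):

```dockerfile
# Illustrative Dockerfile for a containerized training job.
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the project and run the training script.
COPY . .
CMD ["python", "train.py"]
```

Once an image like this is built, Kubernetes can schedule, scale, and restart it as a containerized workload.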

By the end of the first step, you will be familiar with a cloud platform of your choice (such as AWS, Google Cloud, or Azure) and know how to use Terraform for infrastructure as code to automate the setup of your cloud infrastructure.

Note: It is essential that you have a basic understanding of Docker and Git, as well as familiarity with command-line tools. If you have a background in software engineering, however, you may be able to skip this part.

2. Experiment Tracking, Versioning, and Model Management

You will learn to use MLflow for tracking machine learning experiments, DVC for model and data versioning, and Git for code versioning. MLflow can be used for logging parameters, output files, model management, and model serving.

These practices are essential for maintaining a well-documented, auditable, and scalable ML workflow, ultimately contributing to the success and efficiency of ML projects.
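To make the idea concrete, here is what an experiment tracker does under the hood, sketched in plain Python rather than MLflow's actual API (the `Run` class and JSON file layout below are illustrative only):

```python
import json
import time
import uuid
from pathlib import Path

class Run:
    """Illustrative stand-in for an MLflow-style tracked run."""

    def __init__(self, experiment_dir="mlruns"):
        self.run_id = uuid.uuid4().hex[:8]
        self.record = {"run_id": self.run_id, "start_time": time.time(),
                       "params": {}, "metrics": {}}
        self.path = Path(experiment_dir) / f"{self.run_id}.json"
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def log_param(self, key, value):
        self.record["params"][key] = value

    def log_metric(self, key, value):
        # Metrics are appended so you can see how they evolve per epoch.
        self.record["metrics"].setdefault(key, []).append(value)

    def end(self):
        # Persist the run so it can be compared and audited later.
        self.path.write_text(json.dumps(self.record, indent=2))

run = Run()
run.log_param("learning_rate", 0.01)
run.log_param("epochs", 3)
for epoch in range(3):
    run.log_metric("accuracy", 0.80 + 0.05 * epoch)
run.end()
```

MLflow adds a UI, a model registry, and artifact storage on top of this basic log-and-persist pattern.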

Check out the 7 Best Tools for Machine Learning Experiment Tracking and pick one that works best for your workflow.

3. Orchestration and Machine Learning Pipelines

In the third step, you will learn to use orchestration tools such as Apache Airflow or Prefect to automate and schedule the ML workflows. The workflow includes data preprocessing, model training, evaluation, and more, ensuring a seamless and efficient pipeline from data to deployment.

These tools make each step of the ML workflow modular and reusable across different projects, saving time and reducing errors.
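As a rough sketch of this pattern, the pipeline below chains preprocessing, training, and evaluation in plain Python; real orchestrators like Airflow or Prefect add scheduling, retries, and observability on top (the step functions here are toy stand-ins, not a real model):

```python
# Each pipeline step is a plain function; the runner wires the output of
# one step into the next, the way an Airflow DAG or Prefect flow would.

def preprocess(raw):
    # Toy feature scaling.
    return [x / 10 for x in raw]

def train(features):
    # Stand-in "model": just the mean of the features.
    return sum(features) / len(features)

def evaluate(model):
    # Toy acceptance check on the trained "model".
    return {"model": model, "passed": model > 0.0}

def run_pipeline(raw):
    steps = [preprocess, train, evaluate]
    result = raw
    for step in steps:
        result = step(result)  # each output becomes the next input
    return result

report = run_pipeline([1, 2, 3, 4])
print(report)
```

In an orchestrator, each of these functions would become a task or node in a DAG, so a failed step can be retried without rerunning the whole pipeline.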

Learn about 5 Airflow Alternatives for Data Orchestration that are user friendly and come with modern features. Also, check out the Prefect for Machine Learning Workflows tutorial to build and execute your first ML pipeline.

4. CI/CD for Machine Learning

Integrate Continuous Integration and Continuous Deployment (CI/CD) practices into your ML workflows. Tools like Jenkins, GitLab CI, and GitHub Actions can automate the testing and deployment of ML models, ensuring that changes are rolled out efficiently and safely. You will learn to incorporate automated testing of your data, model, and code to catch issues early and maintain high-quality standards.
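As an illustrative sketch (the file name, branch, and script names are placeholders for your own project), a minimal GitHub Actions workflow that tests and retrains on every push might look like this:

```yaml
# .github/workflows/train.yml (illustrative)
name: ml-ci
on:
  push:
    branches: [main]
jobs:
  test-and-train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest tests/          # data, code, and model tests
      - run: python train.py        # retrain and log the new model
```

A real workflow would typically also version the resulting model artifact and gate deployment on the test results.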

Learn how to automate model training, evaluation, versioning, and deployment using GitHub Actions by following A Beginner's Guide to CI/CD for Machine Learning.

5. Model Serving and Deployment

Model serving is a critical aspect of utilizing machine learning models effectively in production environments. By employing model serving frameworks such as BentoML, Kubeflow, Ray Serve, or TFServing, you can efficiently deploy your models as microservices, making them accessible and scalable across multiple applications and services. These frameworks provide a seamless way to test model inference locally and offer features for you to securely and efficiently deploy models in production.

Learn about the Top 7 Model Deployment and Serving Tools that are being used by top companies to simplify and automate the model deployment process.

6. Model Monitoring

In the sixth step, you will learn how to implement monitoring to keep track of your model's performance and detect any changes in your data over time. You can use tools like Evidently, Fiddler, or even write custom code for real-time monitoring and alerting. By using a monitoring framework, you can build a fully automated machine learning pipeline where any significant decrease in model performance will trigger the CI/CD pipeline. This will result in re-training the model on the latest dataset and eventually deploying the latest model to production.
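As a minimal sketch of drift-based alerting (the statistic and threshold below are illustrative, not what Evidently or Fiddler compute by default):

```python
from statistics import fmean, pstdev

def drift_score(reference, live):
    """Absolute shift of the live mean, in reference standard deviations.

    A deliberately crude drift statistic for illustration only.
    """
    sd = pstdev(reference) or 1.0
    return abs(fmean(live) - fmean(reference)) / sd

def should_retrain(reference, live, threshold=2.0):
    # In a full pipeline this flag would trigger the CI/CD retraining job.
    return drift_score(reference, live) > threshold

reference = [0.48, 0.50, 0.52, 0.49, 0.51]
live_ok = [0.50, 0.47, 0.53]
live_shifted = [0.80, 0.85, 0.90]

print(should_retrain(reference, live_ok))       # little drift
print(should_retrain(reference, live_shifted))  # large drift
```

Production monitoring tools compute richer statistics (population stability, per-feature tests, prediction drift) and feed alerts into dashboards, but the retrain-on-threshold loop is the same.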

If you want to learn about the important tools used to build, maintain, and execute the end-to-end ML workflow, you should check out the list of the top 25 MLOps tools you need to know in 2024.

7. Project

In the final step of this guide, you will have the opportunity to build an end-to-end machine learning project using everything you have learned so far. This project will involve the following steps:

  1. Select a dataset that interests you.
  2. Train a model on the chosen dataset and track your experiments.
  3. Create a model training pipeline and automate it using GitHub Actions.
  4. Deploy the model as a batch job, a web service, or a streaming service.
  5. Monitor the performance of your model and follow best practices.

Bookmark the page: 10 GitHub Repositories to master MLOps. Use it to learn about the latest tools, guides, tutorials, projects and free courses to learn everything about MLOps.

Conclusion

You can enroll in an MLOps Engineering course that covers all seven steps in detail and helps you gain the necessary experience to train, track, deploy, and monitor machine learning models in production.

In this guide, we have learned about the seven necessary steps for you to become an expert MLOps engineer. We have learned about the tools, concepts, and processes required for engineers to automate and streamline the process of training, evaluating, versioning, deploying, and monitoring models in the cloud.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.

More On This Topic

  • Last call: Stefan Krawczyk’s 'Mastering MLOps' Live Cohort
  • 7 Steps to Mastering Machine Learning with Python in 2022
  • KDnuggets™ News 22:n05, Feb 2: 7 Steps to Mastering Machine…
  • 7 Steps to Mastering SQL for Data Science
  • 7 Steps to Mastering Python for Data Science
  • 7 Steps to Mastering Data Science Project Management with Agile

The Rise of AI Software Engineers: SWE-Agent, Devin AI and the Future of Coding

The field of artificial intelligence (AI) continues to push the boundaries of what was once thought impossible. From self-driving cars to language models that can engage in human-like conversations, AI is rapidly transforming various industries, and software development is no exception. The emergence of AI-powered software engineers, such as SWE-Agent, developed by Princeton University's NLP group, and Devin AI, represents a groundbreaking shift in how software is designed, developed, and maintained.

SWE-Agent, a cutting-edge AI system, promises to revolutionize the software engineering process by autonomously identifying and resolving GitHub issues with unprecedented speed and accuracy. This remarkable tool leverages state-of-the-art language models like GPT-4, streamlining the development cycle and enhancing developer productivity.

The Advent of AI Software Engineers

Traditionally, software development has been a labor-intensive process, requiring teams of skilled programmers to write, review, and test code meticulously. However, the advent of AI-powered software engineers like SWE-Agent has the potential to disrupt this age-old paradigm. By harnessing the power of large language models and machine learning algorithms, these AI systems can not only generate code but also identify and fix bugs, streamlining the entire development lifecycle.

One of the key advantages of SWE-Agent is its ability to autonomously resolve GitHub issues with remarkable efficiency. On average, it can analyze and fix problems within 93 seconds, boasting an impressive 12.29% success rate on the comprehensive SWE-bench test set. This level of speed and accuracy is unprecedented in the software engineering realm, promising to significantly accelerate development timelines and reduce the overall cost of software projects.

At the core of SWE-Agent's success lies the innovative Agent-Computer Interface (ACI), a design paradigm that optimizes interactions between AI programmers and code repositories. By simplifying commands and feedback formats, ACI facilitates seamless communication, empowering SWE-Agent to perform tasks ranging from syntax checks to test execution with remarkable efficiency. This user-friendly interface not only enhances performance but also accelerates adoption among developers, making AI-assisted software development more accessible and approachable.

LLM Agents: Orchestrating Task Automation

LLM agents are sophisticated software entities designed to automate the execution of complex tasks. These agents are equipped with access to a comprehensive toolkit or set of resources, enabling them to intelligently determine the best tool or method to employ based on the specific input they receive.

The operation of an LLM agent can be visualized as a dynamic sequence of steps, meticulously orchestrated to fulfill the given task. Significantly, these agents possess the capability to use the output from one tool as input for another, creating a cascading effect of interlinked operations.

BabyAGI: Task Management Powerhouse

One of the most notable LLM agents is BabyAGI, an advanced task management system powered by OpenAI's cutting-edge artificial intelligence capabilities. In tandem with vector databases like Chroma or Weaviate, BabyAGI excels in managing, prioritizing, and executing tasks with remarkable efficiency. Leveraging OpenAI's state-of-the-art natural language processing, BabyAGI can formulate new tasks aligned with specific objectives and boasts integrated database access, enabling it to store, recall, and utilize pertinent information.

At its core, BabyAGI represents a streamlined version of the Task-Driven Autonomous Agent, incorporating notable features from platforms like GPT-4, Pinecone vector search, and the LangChain framework to independently craft and execute tasks. Its operational flow comprises four key steps: extracting the foremost task from the pending task list, relaying the task to a dedicated execution agent for processing, refining and storing the derived result, and formulating new tasks while dynamically adjusting the priority of the task list based on the overarching objective and outcomes of previously executed tasks.
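The four-step loop described above can be sketched in a few lines of plain Python, where `execute` and `propose_followups` are toy stand-ins for the LLM-backed agents:

```python
from collections import deque

def execute(task):
    # Stand-in for handing the task to an LLM-backed execution agent.
    return f"result of {task!r}"

def propose_followups(task, result):
    # Stand-in for asking the model to derive new tasks from the outcome.
    return [f"review {task}"] if not task.startswith("review") else []

def run_agent(objective, seed_tasks, max_steps=10):
    tasks = deque(seed_tasks)
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()           # 1. take the foremost pending task
        result = execute(task)           # 2. relay it to the execution agent
        results.append((task, result))   # 3. refine and store the result
        for new_task in propose_followups(task, result):
            tasks.append(new_task)       # 4. formulate and queue new tasks
    return results

history = run_agent("summarise repo issues", ["list open issues"])
for task, result in history:
    print(task, "->", result)
```

BabyAGI replaces these stand-ins with LLM calls and a vector store, and re-prioritises the queue against the objective on every iteration, but the control flow is this same loop.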

AgentGPT: Autonomous AI Agent Creation and Deployment

AgentGPT is a robust platform tailored for the creation and deployment of autonomous AI agents. Once a particular objective is defined for these agents, they embark on a relentless loop of task generation and execution, striving tirelessly to meet the stipulated goal. At the heart of its operation lies a chain of interconnected language models (or agents) that collaboratively brainstorm the optimal tasks to meet an objective, execute them, critically assess their performance, and iteratively devise subsequent tasks. This recursive approach ensures that AgentGPT remains adaptive, learning and refining its strategies with each loop to inch closer to the objective.

A comparative depiction of the software development SOP between MetaGPT and a real-world human team (source: https://arxiv.org/pdf/2308.00352.pdf)

Code Assistants: Enhancing Developer Productivity

Code assistants are advanced tools designed to assist developers in the code-writing process, often implemented as Integrated Development Environment (IDE) plugins, extensions, or add-ons. These assistants are capable of suggesting code completions, identifying and rectifying bugs, providing optimization recommendations, and simplifying recurring coding tasks. By incorporating generative AI models, they analyze coding patterns and furnish insights that streamline the development workflow, accelerating code generation and elevating the quality of output.

GitHub Copilot: AI-Powered Programming Companion

GitHub Copilot, developed through a collaboration between GitHub and OpenAI, harnesses the capabilities of the Codex generative model, aiding developers in writing code more efficiently. Described as an AI-powered programming companion, it presents auto-complete suggestions during code development. GitHub Copilot keenly discerns the context of the active file and its related documents, proposing suggestions directly within the text editor. It boasts proficiency across all languages represented in public repositories.

Copilot X, an enhanced version of Copilot, builds upon this foundation, offering an enriched experience with chat and terminal interfaces, enhanced support for pull requests, and leveraging OpenAI's GPT-4 model. Both Copilot and Copilot X are compatible with Visual Studio, Visual Studio Code, Neovim, and the entire JetBrains software suite.

AWS CodeWhisperer: Real-Time Coding Recommendations

Amazon CodeWhisperer is a machine learning-driven code generator that offers real-time coding recommendations. As developers script, it proactively presents suggestions influenced by the ongoing code. These propositions range from concise comments to elaborately structured functions. Currently, CodeWhisperer is attuned to a multitude of programming languages, including Java, Python, JavaScript, TypeScript, and many more. The tool seamlessly integrates with platforms such as Amazon SageMaker Studio, JupyterLab, Visual Studio Code, JetBrains, AWS Cloud9, and AWS Lambda.

Bard to Code: Conversational AI for Code Generation

Bard, often categorized as conversational AI or a chatbot, demonstrates an adeptness in producing human-like textual responses to a diverse spectrum of prompts, owing to its extensive training on a myriad of textual data. Moreover, it possesses the dexterity to produce code across various programming languages, including but not limited to Python, Java, C++, and JavaScript.

SWE-Agent vs. Competitors: Democratizing Access to Advanced Programming Capabilities

In a landscape dominated by proprietary solutions like Devin AI and Devika, SWE-Agent shines as an open-source alternative, democratizing access to cutting-edge AI programming capabilities. Both SWE-Agent and Devin boast impressive performance on the SWE-bench benchmark, with SWE-Agent achieving a competitive 12.29% issue resolution rate. However, SWE-Agent's open-source nature sets it apart, aligning with the collaborative ethos of the software development community.

By making its codebase available to developers worldwide, SWE-Agent invites contributions and fosters an ecosystem of innovation and knowledge-sharing. Developers can freely integrate SWE-Agent into their workflows, harnessing its power to streamline software development processes while simultaneously contributing to its evolution. This collaborative approach empowers developers of all backgrounds and skill levels to optimize their workflows, enhance code quality, and navigate the complexities of modern software development with confidence.

Beyond its technical prowess, SWE-Agent holds the potential to catalyze a paradigm shift in software engineering education and community collaboration. As an open-source tool, SWE-Agent can be integrated into educational curricula, providing students with hands-on experience in AI-assisted software development. This exposure can help shape the next generation of software engineers, equipping them with the skills and mindset necessary to thrive in an increasingly automated and AI-driven industry.

Moreover, SWE-Agent's collaborative nature encourages developers to share their experiences, best practices, and insights, fostering a vibrant community of knowledge exchange. Through open-source contributions, bug reports, and feature requests, developers can actively participate in shaping the future of AI-powered software engineering. This collaborative approach not only accelerates the pace of innovation but also ensures that SWE-Agent remains relevant and adaptable to the ever-evolving needs of the software development ecosystem.

The Future of Software Development

While the emergence of AI-powered software engineers like SWE-Agent presents exciting opportunities, it also raises important questions and challenges that must be addressed. One critical consideration is the potential impact on the software development workforce. As AI systems become more capable of automating various aspects of the development process, there may be concerns about job displacement and the need for reskilling and upskilling initiatives.

However, it's important to recognize that AI is not a replacement for human developers but rather a powerful tool to augment and enhance their capabilities. By offloading repetitive and time-consuming tasks to AI systems like SWE-Agent, human developers can focus on higher-level tasks that require critical thinking, creativity, and problem-solving skills. This shift in focus could lead to more fulfilling and rewarding roles for software engineers, allowing them to tackle more complex challenges and drive innovation.

Another challenge lies in the ongoing development and refinement of AI systems like SWE-Agent. As software complexity continues to increase and new programming paradigms emerge, these AI systems must be continuously trained and updated to stay relevant and effective. This requires a concerted effort from the research community, as well as close collaboration between academia and industry, to ensure that AI-powered software engineers remain at the forefront of technological advancements.

Moreover, as AI systems become more integrated into the software development process, concerns around security, privacy, and ethical considerations must be addressed. Robust measures must be put in place to ensure the integrity and trustworthiness of the generated code, as well as to mitigate potential biases or unintended consequences. Ongoing research and dialogue within the software engineering community will be crucial in navigating these challenges and establishing best practices for the responsible development and deployment of AI-powered software engineers.

Conclusion

The rise of AI-powered software engineers like SWE-Agent represents a pivotal moment in the evolution of software development. By leveraging the power of large language models and machine learning algorithms, these AI systems have the potential to revolutionize the way software is designed, developed, and maintained. With their remarkable speed, accuracy, and ability to streamline the development lifecycle, AI software engineers promise to enhance developer productivity and accelerate the pace of innovation.

However, the true impact of AI software engineers extends beyond mere technical capabilities. As open-source solutions like SWE-Agent gain traction, they have the power to democratize access to advanced programming capabilities, fostering a collaborative ecosystem of knowledge-sharing and empowering developers of all backgrounds and skill levels.

As we embrace the era of AI-assisted software development, it is crucial to recognize the challenges and opportunities that lie ahead. While job displacement concerns and the need for reskilling exist, AI systems like SWE-Agent also present an opportunity to redefine the role of software engineers, allowing them to focus on higher-level tasks that require critical thinking and creativity.

Ultimately, the successful integration of AI-powered software engineers into the software development ecosystem will require a collective effort from researchers, developers, and industry leaders.

Dell Unveils High-performance APEX File Storage for Microsoft Azure Customers

Dell Technologies announced the launch of APEX File Storage for Microsoft Azure. This groundbreaking offering brings the proven power of Dell’s PowerScale OneFS file storage system to the Azure cloud.

By integrating high-performance file storage with Azure’s native AI tools, APEX File Storage for Azure enables customers to consolidate and manage data more effectively, reduce storage costs, and enhance data protection and security.

"Introducing #DellAPEX File Storage for #MicrosoftAzure. Key benefits include: simplified with 'multicloud by design', extreme flexibility, and data control. Learn more by reading our latest blog: https://t.co/MUhDCmb6fe"

— Dell Technologies Partners (@DellTechPartner) April 12, 2024

Designed for hybrid cloud and cloud burst use cases, APEX File Storage for Azure delivers unparalleled performance, scalability, and data protection features compared to native Azure storage options.

With the ability to support up to 18 nodes and 5.6 PiB in a single namespace, the solution offers 6x greater cluster performance, up to 11x larger namespace, and up to 23x more snapshots per volume than Azure NetApp Files.

What truly sets APEX File Storage for Azure apart is its unwavering commitment to customer needs. With proactive support and an impressive 97% customer satisfaction rate, Dell Support Services provides highly trained experts available around the clock and around the globe to address your OneFS requirements. This ensures minimal disruptions and helps you maintain a high level of productivity and optimal outcomes.

Dell’s strategic collaboration with Microsoft represents a significant milestone in this journey, offering customers the freedom to store and process data in the most optimal location for their AI use cases.

With APEX File Storage for Azure, Dell continues to uphold its commitment to empowering businesses to innovate faster and achieve transformative outcomes with AI. This partnership allows organizations to leverage the combined strengths of Dell’s expertise in data management and Microsoft’s Azure cloud platform, enabling them to unlock the full potential of their AI initiatives and drive meaningful business impact.

The post Dell Unveils High-performance APEX File Storage for Microsoft Azure Customers appeared first on Analytics India Magazine.

Infosys Feels Good About Its Work with Generative AI

After showing little to no confidence in generative AI in the last quarter, Infosys revealed in its Q4FY24 quarterly reports that the IT giant is seeing excellent traction with its clients for generative AI work. However, this quarter, as well, it did not disclose the revenue from generative AI.

Meanwhile, during its recent earnings, TCS revealed that it has an AI and generative AI pipeline worth $900 million, as CEO and MD K Krithivasan revealed during the Q4 2024 earnings call. This figure nearly matches the revenue achieved by rival Accenture, which totalled $1.1 billion in the first two quarters.

“We’re working on projects across software engineering, process optimisation, customer support, advisory services, and sales and marketing. We’re working with market-leading open access and closed large language models,” said Salil Parekh, Infosys CEO, adding that Infosys feels good about its work with generative AI.

Parekh also announced Infosys’ strategic acquisition of an engineering services company, possibly a hint towards InSemi. The deal was inked in January for INR 280 crore.

“We are not, at this stage, publicly sharing what the Generative AI revenue is, but we have tremendous capability, a good leading market position, and we feel, in generative AI, we are very strong in the capabilities,” he said.

Giving an example, Parekh said that in software development, Infosys has generated over 3 million lines of code using publicly available generative AI large language models.

Infosys recorded $4.4 billion worth of large deals, its highest-ever large-deal value in a financial year.

“In several situations, we’ve chained the large language models with client-specific data within our projects. We’ve put generative AI in our services and developed playbooks for each of our offerings,” said Parekh, adding that this is all part of the Infosys Topaz generative AI offerings.

He also said that Infosys is committed to the ethical and responsible use of AI. Infosys became the first IT services company globally to achieve ISO 42001:2023 certification for its AI management system.

Parekh added that since data has become the critical enabler for making generative AI successful in enterprise AI deployments, the IT giant is making sure that its clients’ access to data remains streamlined.

Infosys reported a 30% increase in consolidated net profit, reaching INR 7,969 crore for the quarter, up from the INR 6,128 crore profit reported during the same quarter last year.

Parekh said that Infosys has trained eight out of ten of its employees in generative AI, but that this has not had any impact on its hiring strategy. In September last year, NVIDIA announced a partnership with Infosys to revolutionise the world of enterprise AI. It also unveiled plans to establish an NVIDIA Center of Excellence dedicated to training and certifying 50,000 Infosys employees in NVIDIA AI technology.

The IT giant reported a decrease in its workforce for the fiscal year 2024, marking a decline of 25,994 employees, a first since at least 2001. The company’s total headcount for FY24 stood at 317,240, a 7.5% reduction compared to the previous year.

In terms of quarterly changes, Infosys saw a reduction of 5,423 employees, marking the fifth consecutive quarter of decline. Additionally, the attrition rate for the fourth quarter on a last twelve-month basis fell to 12.6% from 12.9%.

TCS also announced that it has trained over 350,000 employees in AI/ML, including generative AI, and is a launch partner for the newly announced AWS Generative AI Competency.

Parekh also highlighted Infosys’ expansion in Europe with its latest proximity centre in Sofia, Bulgaria. The IT giant is committed to fostering local talent and nurturing innovation, aiming to hire, attract, and upskill 500 new employees over the next four years.

Recently, Infosys also announced a strategic partnership with Intel, integrating Intel technologies such as 4th and 5th Gen Intel Xeon processors, Intel Gaudi 2 AI accelerators, and Intel Core Ultra into Infosys Topaz. This collaboration aims to offer AI-first services, solutions, and platforms to accelerate business value through generative AI technologies.

The post Infosys Feels Good About Its Work with Generative AI appeared first on Analytics India Magazine.

Get University Level Certified for Next to Nothing


Not everybody has the luxury to go and study at university.

For one, it is expensive — being the biggest factor when people decide not to go to university. The second is that a lot of people do not know what they want in life, and it can be difficult to make that decision at such a young age.

If you are someone who has been in this situation or is currently in this situation but you still want to level up and up-skill to get the job you want without having to pay crazy expensive tuition fees — this article is for you.

Harvard University — CS50s

Harvard University is one of the most well-known and prestigious universities in the world. We’ve heard of it in movies, from our teachers, and beyond. The amazing thing is that you can now start your learning journey with them at a fraction of the price.

The courses are free when you enroll in the free audit track. However, if you would like to go through the assignments and gain a certificate, you will have to pay a fee. But this fee is nowhere near what one would pay in tuition — it ranges from USD 50 to USD 300.

So what are the courses?!

CS50's Introduction to Computer Science

Link: Introduction to Computer Science

If you are someone who wants to start their data profession but is hesitant to take a Computer Science bachelor's degree due to the expense — this is for you.

In this Introduction to Computer Science course, you will learn about the art of programming and computer science. You will open up your mind and learn how to think algorithmically to solve programming problems. You will go over concepts such as abstraction, algorithms, data structures, software engineering, web development and more.

It doesn’t stop there!

You will also become familiar with programming languages such as C, Python, SQL, JavaScript, and HTML.

You will engage with a community of like-minded learners who come from all different backgrounds and by the end of it be able to develop and present your final project to your peers.

CS50's AP Computer Science Principles Program

Link: AP Computer Science Principles Program

Is the introductory course not enough for you?

Don’t worry — check out the AP Computer Science Principles Program which also includes the ‘Introduction to Computer Science’ and another in-depth ‘Understanding Technology’ course. In the ‘Understanding Technology’ course, you will learn about the internet, multimedia, security, web development, and programming.

This program is aimed at high school students. There is a high demand for tech professionals, but many young people simply cannot afford to enter the industry due to expensive tuition fees.

This is where the program comes into play: you can get state-of-the-art university resources and knowledge for a discounted price of £369 (at the time of writing).

If you are not a high school student but would still like to take these courses, you will have to register for them separately:

  • Link: Introduction to Computer Science
  • Link: CS50s Understanding Technology

CS50's Computer Science for Business Professionals

Link: Computer Science for Business Professionals

Let’s say you don’t want to become a software engineer or data scientist and you’re currently enjoying your role. You could be a manager, a product manager, or a founder, but with the way things are moving in the tech industry — it would be good to understand the world of computer science.

In this Computer Science for Business Professionals course, you will go through a top-down approach and learn the high-level concepts and design decisions related to the tech industry regarding computer science.

The main areas you will be focusing on are computational thinking, programming languages, internet technologies, web development, technology stacks, and cloud computing.

CS50's Introduction to Programming with Python

Link: Introduction to Programming with Python

Or maybe you want to get right into it. Start learning how to program from day one.

The CS50's Introduction to Computer Science course has a more general focus on computer science and different languages. In the Introduction to Programming with Python, you will learn the most popular language for general-purpose programming, data science, and web programming.

You will learn how to read and write code, find and fix bugs, extract data, and write unit tests. Learn about functions, arguments, variables, types, conditionals, Boolean expressions, and more. You do not need to have any prior programming experience to take this course.

The exercises included in this course are real-world programming problems, so you can get a realistic sense of life as a Python programmer.
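To give you a taste of the material, here is a small, hypothetical example (the function and grading thresholds are my own illustration, not taken from the course) that touches several of the topics listed above: functions, arguments, conditionals, Boolean expressions, and a simple unit test.

```python
def classify_grade(score: int) -> str:
    """Return a letter grade for a numeric score using simple conditionals."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    return "F"


def test_classify_grade():
    # Plain assert statements like these are how the course introduces unit testing
    assert classify_grade(95) == "A"
    assert classify_grade(85) == "B"
    assert classify_grade(70) == "C"
    assert classify_grade(50) == "F"


test_classify_grade()
print(classify_grade(72))  # prints "C"
```

If all the assertions pass, the test function simply returns; finding and fixing the bug when one fails is exactly the debugging workflow the course trains you in.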

Wrapping it Up

And there you have it: four courses that are very similar but aimed at different groups and goals. Once you have a good understanding of the path you want to go down, you will know exactly which course you need to upskill and land the job you want!

Nisha Arya is a data scientist, freelance technical writer, and an editor and community manager for KDnuggets. She is particularly interested in providing data science career advice or tutorials and theory-based knowledge around data science. Nisha covers a wide range of topics and wishes to explore the different ways artificial intelligence can benefit the longevity of human life. A keen learner, Nisha seeks to broaden her tech knowledge and writing skills, while helping guide others.


Automation Anywhere Wants to Augment Humans with AI, Not Replace Them

AI powered process automation leader Automation Anywhere recently launched a suite of generative AI-powered services, improving its Automation Success Platform with user-friendly tools.

Nearly a year ago, AIM spoke to cofounder Ankur Kothari, who shared that the company was “experimenting with various generative AI algorithms, exploring hundreds of different use cases for clients and partners”. Today, Automation Copilot, the company’s flagship generative AI product, is actively used by over 70 customers and has strong market acceptance.

In a recent interview with AIM, Adi Kuruganti, chief product officer, Automation Anywhere, provided valuable insights into the company’s journey and strategies for implementing generative AI across various industries.

“When we started 18 months ago, our goal was to expand into handling more unstructured documents, such as bills, landing pages, mortgage applications, and healthcare documents,” said Kuruganti, adding that partnerships have played a crucial role in this evolution.

Since then, the Los Angeles-based company has teamed up with major cloud service providers like Google and Microsoft Azure to leverage their generative AI and cloud capabilities. Early in December last year, Automation Anywhere entered into a multi-year Strategic Collaboration Agreement (SCA) with AWS, which elevates Automation Anywhere into the top 1% of partners globally.

Keeping Humans in the Centre

Job displacement due to AI has been a pressing concern over the past year, especially as layoffs increased. However, the narrative is shifting as companies now view AI more as a tool to improve productivity rather than a threat to employment.

Still, as a company that automates jobs, Automation Anywhere is well aware of these concerns. Kuruganti compared the moment to the birth of the internet, when people had similar fears, yet the internet ultimately helped create more jobs. That is Automation Anywhere’s goal too.

“Especially as an automation company, there is often a fear that automation might replace jobs. Historically, that’s where we began, and it still remains a concern for many of our customers today, especially in sectors that rely heavily on optimisation,” said Kuruganti.

So the company’s key focus areas are reskilling and upskilling, recognising that job roles are changing due to AI. This effort, headed by co-founder and social impact officer Neeti Mehta Shukla, spans multiple areas and includes partnerships with large companies to help navigate these changes.

“But as we’ve seen, generative AI is more relevant for front office roles like customer service. Our goal is to augment human capabilities, not replace them,” added Kuruganti.

Acing the GenAI Wave

The new product updates include AI-driven solutions that go beyond data extraction and summarisation, now supporting structured document types with more contextual understanding through optical character recognition (OCR). “Our compound AI system is pivotal in developing these solutions, utilising vast amounts of data to refine and tailor AI models to specific tasks,” the CPO shared.

Its suite of automation tools includes Automation Copilot for Business Users, offering a streamlined interface for automating routine tasks without extensive technical expertise. Document Automation streamlines the processing of all types of documents, enhancing data accuracy and reducing errors.

Autopilot facilitates the continuous execution of multi-bot operations for tasks requiring high reliability, such as financial reconciliations. For technical users, the Automation Copilot for Automators enables the creation of complex automation scripts, integrating with existing IT infrastructure and offering error handling and customisation capabilities.

Collectively, these tools offer scalable, secure solutions that boost operational efficiency and compliance.

Kuruganti shared that six out of ten prominent banks in India use the company’s services. Genesys, Cargill, Salesforce, ZS, Honeywell, and Juniper Networks are some of the company’s customers, and all of them have achieved major milestones with its AI tools.

For example, Petrobras, the largest oil and gas company in Brazil, used Automation Copilot for Business Users in a POC to optimise tax processes and realised $120 million in savings.

What do Customers Want Now?

Customer acceptance has encouraged the company, as many early adopters have witnessed tangible benefits. The focus has shifted from just providing tools to creating outcomes that are meaningful and measurable. For instance, integrating AI with existing systems like Salesforce has allowed businesses to streamline operations and achieve higher efficiency levels.

However, when discussing customer adoption, Kuruganti highlighted that one of the biggest hurdles is keeping pace with technological change because “what was relevant even six months ago may be obsolete today”. So, defining and aligning clear outcomes with business goals has become crucial for sustaining innovation.

“Our initial product introductions focused on enabling functionalities. Now, we’re seeing active use in production environments, where the real value of these AI applications comes to light,” he noted.

Companies are increasingly focused on specific metrics, such as reducing average handling times or improving customer satisfaction scores, to quantify the impact of their investments in AI, especially in sensitive areas like anti-money laundering.

What Next?

Automation Anywhere is the preferred automation vendor for major corporations like IDFC, Nestle, and Adani, as well as the Indian government.

Kuruganti said, “About 50 to 60% of our global workforce, including product engineering, UX, and customer support, are based in India,” making it a key market for the company.

The company has reported robust performance globally, with fourth-quarter results surpassing expectations. There was a 50% growth from the previous quarter and a 14% year-over-year increase in large deals.

“For FY24, our goal is to achieve profitability like last quarter, but more importantly, we want to expand our customer base and enhance our service offerings over the next six to 12 months,” concluded Kuruganti.

The post Automation Anywhere Wants to Augment Humans with AI, Not Replace Them appeared first on Analytics India Magazine.

Why Data Pipeline is Important for High ROI AI Products

Salesforce is said to be in the final stages of negotiations to acquire data-management software provider Informatica for $11 billion. To many, the deal is reminiscent of Snowflake’s acquisition of Neeva in May last year.

“Salesforce potentially acquiring Informatica seems like a push to compete more with Snowflake,” wrote Astasia Myers, partner at Felicis.

These acquisitions are in step with big-tech companies like Google acquiring Looker, the data analytics startup, and Microsoft acquiring ADRM Software in 2020 and investing in Rubrik in 2021. Rubrik, a data management company, is now targeting an IPO at a $5.4 billion valuation.

Databricks’ acquisitions of Arcion, MosaicML, and Okera are along similar lines, aimed at managing its data pipeline and increasing its generative AI capabilities.

Similarly, Salesforce’s possible acquisition of Informatica is targeted at greatly enhancing its data capabilities, especially in fields like data integration, quality assurance, and customer insights. This also points towards the importance of building a data strategy and ensuring a smooth pipeline when it comes to building high-ROI AI products.

Contextualised Data is King

James Wu, partner at M12, highlighted in a recent post the importance of building a strong data pipeline and data-centric AI. That is why the venture fund also invested in Unstructured.io, with another data curation company in the pipeline. “Big data will continue to be the foundation, but contextualised data is king,” he said.

“We’re interested in the ‘AI-data feedback loop’ – we think better AI can analyse data to identify errors and inconsistencies, improving data quality for future models,” he explained, adding that cleaner data in turn helps train superior AI models, forming a virtuous cycle.

Naveen Rao, VP of generative AI at Databricks, also shared similar thoughts. “We at Databricks are very much about the lifecycle of data and GenAI working synergistically together. We demonstrated the power of our training platform by building DBRX with it and we used all the tools in Databricks. We believe in the power of all the components around the model that comprise the full system,” he said.

This points to the need to build a good data strategy if companies expect high ROI on AI products. Matthew Blasa, AI strategist and lead data scientist consultant, emphasises that since data is AI’s lifeblood, AI products need a continuous pipeline of clean data.

Source: Matthew Blasa

“It’s important to ensure that your data is reliable, relevant to the larger needs, and collected from multiple sources,” Blasa explained. “Without a clear data strategy, creating a model with enduring value is challenging. Relying solely on retraining and monitoring won’t close the gap. It may even make it harder.”
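To make this concrete, here is a deliberately simplified sketch of one step in such a pipeline: a validation pass that filters unreliable records before they reach a model. The field names and checks are illustrative assumptions of mine, not taken from Blasa's framework.

```python
def validate_records(records):
    """Keep only records that pass basic data-quality checks."""
    clean = []
    for rec in records:
        # Reliability check: required fields must be present and non-null
        has_required = all(rec.get(k) is not None for k in ("id", "text"))
        # Relevance check: drop empty or whitespace-only text
        if has_required and str(rec["text"]).strip():
            clean.append(rec)
    return clean


raw = [
    {"id": 1, "text": "order shipped"},  # valid, kept
    {"id": 2, "text": "   "},            # blank text, dropped
    {"text": "no id"},                   # missing field, dropped
]
print(validate_records(raw))  # prints [{'id': 1, 'text': 'order shipped'}]
```

In a real pipeline this gate would run continuously on incoming data, which is what keeps retraining from amplifying upstream quality problems.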

Crawling, Walking, and Running with AI

AI advisor Vin Vashishta shares the perfect plan for companies building AI products. “One thing that I’ve learned after a decade of building data and AI products is that businesses must crawl-walk-run with AI,” he wrote in a post.

Crawling is about collecting data, walking involves using the data to create descriptive models, and running uses more advanced models such as predictive, prescriptive, and diagnostic ones. He explains how starting with crawling and walking makes running less expensive and faster in the long run.

Each phase offers immediate benefits and builds on the previous phase, creating a solid foundation. “Walk and run handle about 90% of use cases, reducing time to value,” Vashishta explained.

In another post, Vashishta explained how high-quality data can bring quick results, and descriptive models trained on it yield quarterly gains. These efforts lay the foundation for AI products and potentially larger returns. “Trash data trains trash models, but the business needs tangible returns in months, not years. Fixing the data doesn’t deliver them unless data teams and leaders take a product-first approach,” he added.

The Data Pipeline Strategy

It is clear that data availability is important to build the best generative AI products. This is why companies like Salesforce, Snowflake, Databricks, and all other data and AI providers are expanding their hold on data companies. This would, in the end, provide them with high-quality streamlined data to improve their AI products.

AI products are data products. “Without a solid data strategy, it’s tough to trust the decisions made by our AI-driven products and keep them profitable,” said Blasa.

The post Why Data Pipeline is Important for High ROI AI Products appeared first on Analytics India Magazine.

Cisco Unveils Hypershield to Secure AI-Powered Data Centers

Cisco today announced Cisco Hypershield, a revolutionary new security architecture designed to protect data centres and clouds in the era of AI.

Hypershield enables security enforcement to be placed everywhere it’s needed, from software to servers to network switches, bringing advanced security capabilities to data centres, factory floors, and hospital imaging rooms.

Hypershield is built on three key pillars: AI-native design for autonomous and predictive security, cloud-native technologies like eBPF, and a hyper-distributed approach that embeds security controls into servers and the network fabric. This allows Hypershield to block application exploits in minutes, stop lateral movement, and enable zero-downtime software upgrades.

“Cisco Hypershield is one of the most significant security innovations in our history,” said Chuck Robbins, Cisco Chair and CEO. “With our data advantage and strength in security, infrastructure and observability platforms, Cisco is uniquely positioned to help our customers harness the power of AI.”

Cisco is collaborating with NVIDIA to build AI-native security solutions that protect and scale the data centres of tomorrow. This includes leveraging NVIDIA’s Morpheus cybersecurity AI framework and NIM microservices to augment Hypershield with robust security from cloud to edge.

As AI workloads drive explosive demand, data centres are hitting power and space constraints. Cisco Hypershield helps data centre operators evolve their infrastructure to provide the hyper-scale digital backbone needed for the AI revolution while maintaining security.

“Cisco Hypershield aims to address the complex security challenges of modern, AI-scale data centers,” said Zeus Kerravala, founder and principal analyst of ZK Research. “Cisco’s vision of a self-managing fabric that seamlessly integrates from the network to the endpoint will help redefine what’s possible for security at scale.”

Cisco Hypershield is expected to be generally available in August 2024 as part of Cisco’s unified Security Cloud platform. It comes at a critical time, as cybercriminals exploit vulnerabilities faster than ever, often within 24 hours of disclosure.

The post Cisco Unveils Hypershield to Secure AI-Powered Data Centers appeared first on Analytics India Magazine.

Millions of AI-Powered Chinese EVs Could be Running on Indian Streets Soon 

Tesla CEO Elon Musk is likely to announce the entry of his autonomous EV-making company into India after he meets Indian Prime Minister Narendra Modi later this month.

Reportedly, the company chasing Level 5 autonomous driving is also scouting locations in New Delhi and Mumbai to open its first few physical stores in the country.

We could soon have luxury Tesla EVs stuck in Mumbai or Bengaluru’s infamous traffic jams. However, Tesla is not the only foreign EV company making a foray into the Indian market.

Earlier this year, Chinese electric vehicle manufacturer BYD entered the Indian market, boldly stating its intention to seize 90% of the nation’s EV market.

Surprisingly, BYD, a company few outside China have heard of, surpassed Tesla in sales volume during the final quarter of 2023. For context, BYD’s sales surged 73% in 2023 to 1.6 million units, trailing Tesla’s 1.8 million units.

Although Tesla maintains its lead in annual sales figures for 2023, there’s a strong possibility that BYD could soon overtake Tesla in terms of overall sales.

Chinese Cars on Indian Streets

Last month, BYD, which has achieved Level 3 autonomy, introduced the Seal electric vehicle in India, priced between INR 40 and 55 lakh. Within a few hours of the launch, the vehicle garnered over 200 bookings.

While it’s still unclear how Tesla will price its models, their most affordable car is expected to fall in the same price bracket. Prices can escalate to INR 2 crore for higher-end models like the Tesla Model X.

We’ll have to wait and see if Tesla adopts a competitive pricing strategy in India to compete with BYD and other EV manufacturers, including market leader Tata Motors.

However, BYD will undoubtedly introduce affordable cars in the Indian market. For instance, the BYD Seagull EV Honor Edition starts at USD 9,700, or approximately INR 8-9 lakh, excluding taxes. In the coming years, BYD will bring its cheaper cars to India, targeting middle-class Indians.

Notably, the mid-range price segment is the most lucrative for automobile makers, accounting for over 40% of the market, and BYD has already made its intentions clear about targeting this segment.

This could mean the average Indian car may be a Chinese car in the coming years.

Moreover, BYD’s entry into India could open the doors to hundreds of Chinese EV companies. According to Wired, nearly 300 EV companies operated in China last year.

This could mean that China-made EVs might flood the Indian market in the coming years, similar to Chinese smartphones, which seem to be ubiquitous in the hands of every other Indian.

Interestingly, phone maker Xiaomi has also begun selling its EV in the Chinese market this year.

Meanwhile, the government is not planning to put any restrictions on the import of EVs to India, even from China. While there is a limit to the number of vehicles imported from a foreign country, there are workarounds.

BYD, for instance, is seeking a homologation certification from the Automotive Research Association of India (ARAI) for its vehicles, allowing it to bypass the import restrictions.

Selling AI Cars

While autonomous vehicles on Indian streets still remain a distant dream, automobile makers are increasingly introducing AI-powered features such as smart parking, advanced driver-assistance systems (ADAS) and battery management systems.

Earlier this year, BYD introduced an AI-powered smart car system powered by Drive Thor, NVIDIA’s next-generation in-vehicle computing chip. The second-largest EV maker in the world unveiled the XUANJI AI large model, marking the first integration of AI across all vehicular domains.

Other Chinese EV makers like Xpeng have also announced plans to invest millions of dollars in AI. Many industry experts AIM spoke to said that AI features will play an important role when customers shop for cars.

Chinese EV companies can sell AI-powered cars at reasonable prices and gain significant market share not just in India but globally.

China’s Dominance in EVs is Worrying

China’s growing dominance in the global EV landscape has become a cause for concern for many in the Western world. Former US President Donald Trump claimed that, if re-elected, he would impose a 100% tariff on Chinese cars.

US Commerce Secretary Gina Raimondo claimed that modern cars are essentially iPhones on wheels. She expressed concerns that if Chinese-made cars operate on American roads, they could be susceptible to hacking, posing a potential threat to national security.

Investment bank UBS said in a report that BYD and other leading Chinese EV companies are set to conquer the world market with high-tech, low-cost EVs for the masses.

This could also hold true for India, posing a significant challenge for homegrown Indian companies such as Tata, Maruti and Mahindra, which are developing their own AI-powered EV models.

While India does not intend to restrict EV imports, its stance might change if Chinese EVs flood the Indian market. Earlier, the government banned many Chinese apps, including TikTok, citing national security threats.

Banning Chinese EVs under the guise of national security remains a possibility.

The post Millions of AI-Powered Chinese EVs Could be Running on Indian Streets Soon appeared first on Analytics India Magazine.

Microsoft Unveils VASA-1, Creating DeepFake Videos with a Single Image

Just around the time of elections, Microsoft researchers have developed a powerful new AI system called VASA-1 that can generate stunningly realistic videos of talking faces from just a single image and an audio clip.

The VASA-1 system goes far beyond simple lip-syncing by capturing a wide range of facial expressions, emotions, head movements and even allowing control over aspects like gaze direction and perceived distance.

“Microsoft just dropped VASA-1. This AI can make single image sing and talk from audio reference expressively. Similar to EMO from Alibaba. 10 wild examples: 1. Mona Lisa rapping Paparazzi,” posted Min Choi (@minchoi) on April 18, 2024.

The video not only captures exquisitely synchronised lip movements with the audio but also encompasses a wide spectrum of facial nuances and natural head motions, contributing to the perception of authenticity and liveliness.

VASA-1 achieves its realism by using AI to disentangle different facial components like expressions, 3D head position, and lip movements. This allows independent control and editing of each aspect.

The blog explains how the approach not only ensures superior video quality featuring authentic facial and head movements, but also enables the seamless online creation of 512×512 videos at speeds of up to 40 FPS, all with minimal initial latency. This breakthrough sets the stage for interactive experiences with lifelike avatars, mimicking natural human conversational nuances in real-time engagements.

Min Choi says VASA-1 has the remarkable ability to animate a single image with expressive speech, akin to Alibaba’s EMO technology. EMO is an AI framework capable of crafting remarkably lifelike ‘talking head’ videos using just one image and an audio input.

On the other hand, people are concerned about the possible misuse of this deepfake technology, as it was launched just around the time of elections.

“Microsoft just introduced VASA-1. It's a new AI model that can turn 1 photo and 1 piece of audio into a fully lifelike human deepfake. Wild to drop this right before the election 😬,” posted Rowan Cheung (@rowancheung) on April 18, 2024.

Acknowledging the potential for misuse, the researchers underscore the positive applications of VASA-1, which encompass enhancing educational experiences, aiding individuals with communication difficulties, and offering companionship or therapeutic support.

The post Microsoft Unveils VASA-1, Creating DeepFake Videos with a Single Image appeared first on Analytics India Magazine.