Software Engineering Jobs are Dying

For over a decade, a popular belief has been that a computer science degree is all you need to tread the path to wealth, especially in a country like India.

If you looked around back then, everyone and their neighbour was pursuing an engineering degree to land a software engineering job. Cut to 2024, and things are slowly changing.

“Winter is coming for software engineering,” said Deedy Das of Menlo Ventures, sharing a graph (see below) suggesting that by 2024, the hiring boom for software engineers would be a distant memory.

The graph clearly shows a boom in tech hiring back in 2021, which has since dropped to roughly 40% of that peak in 2024. Added to this are the layoffs in big tech, such as Google’s recent decision to lay off its Python programmers in favour of cheaper, outsourced alternatives.

Even startups, one of the biggest providers of jobs to developers, are now preferring to hire tenured people with expertise. “Startups often dislike hiring new grads because the cost to train them and get them up to speed is quite high,” explained Das in the post, adding that freshers with knowledge about new technologies such as AI are preferred.

It seems AI is indeed eating into jobs. Many CS majors are no longer able to find work, even as more people than ever want to be programmers: with AI tools on the market, the barrier to entry has dropped significantly, making everyone want to do something with software.

What about more jobs?

Das said that people would migrate away from software for a while and then come back during the next boom cycle, much as has happened with AI. Isaac Hasson suggested that developers stop studying pure computer science and instead get skilled in other disciplines, such as biology and chemistry, which he said are all going to be transformed soon as well.

This echoes what Yann LeCun, the chief AI scientist at Meta, posted on X about a year ago: a new technology takes at least 15-20 years to have an effect on productivity, and the delay is determined by how fast people learn and adapt to it. “So no, AI is not going to cause instant mass unemployment,” LeCun concluded. “It is only going to displace jobs over time and make people more productive,” just like any other technological revolution.

On the other hand, several predictions suggest that there will be many more jobs in the future. According to Francois Chollet, the creator of Keras, “There will be more software engineers (the kind that write code, e.g., Python, C or JavaScript code) in five years than there are today.” He adds that the estimated number of professional software engineers today is 26 million, which would jump to 30-35 million in five years.

For context, he said that many are currently claiming that people shouldn’t get into computer science because, in the future, most of the software engineering will be done by AI, the likes of Devin, Devika, and the recent GitHub Workspace update. In the latest podcast with Lex Fridman, when asked how much programming people would do in the next 5-10 years, OpenAI CEO Sam Altman said, “A lot, but I think it’ll be in a very different shape.”

Similarly, an X user, Bjoern, said, “I see AI as the fracking tech of software engineering that allows us to extract enormous amounts of previously inaccessible oil,” referring to the long tail of software use cases that were not viable before. “We will need more software engineers to decide what to build, how to scale things, and how to maintain that long tail of complexity.”

Computer science is not all you need

The number of CS majors in universities has increased exponentially over the last decade, and the trend shows no sign of slowing. The problem is that many people are applying for software engineering roles while relying entirely on AI and Copilot-style tools, which is not enough to land a job in this market.

An entire generation is studying for jobs that won’t exist. Mark Cuban last year predicted that the highest-paying college major in the world, computer science, will hold very little value for employers in the future. “Twenty years from now, if you are a coder, you might be out of a job,” he said in a podcast.

One of the most important things Altman and LeCun agree upon is that humans should be trained alongside AI and use it as a copilot. After all, the most-repeated phrase of 2023 was, “AI won’t take your job, it’s somebody using AI that will take your job.” And, as Logan Kilpatrick said on X, “you will be 10x more valuable in the coming years”.

“The best practitioners of the craft will use multiple tools, and they’ll do some work in natural language,” he added. Altman explained that people would be able to focus on higher levels of abstraction and on the puzzle-solving skill set of programming, which Fridman agreed was the harder part.

Furthermore, implying that a lot of software engineering jobs would be redundant is dangerous as many students would not know if they should attend college at all.

The post Software Engineering Jobs are Dying appeared first on Analytics India Magazine.

The Threat of Offensive AI and How to Protect From It

Artificial Intelligence (AI) is swiftly transforming our digital space, and with that transformation comes the potential for misuse by threat actors. Offensive or adversarial AI, a subfield of AI, seeks to exploit vulnerabilities in AI systems and to turn AI against its defenders. Imagine a cyberattack so smart that it can bypass defences faster than we can stop it: offensive AI can autonomously execute cyberattacks, penetrate defences, and manipulate data.

MIT Technology Review has shared that 96% of IT and security leaders are now factoring in AI-powered cyber-attacks in their threat matrix. As AI technology keeps advancing, the dangers posed by malicious individuals are also becoming more dynamic.

This article aims to help you understand the potential risks associated with offensive AI and the necessary strategies to effectively counter these threats.

Understanding Offensive AI

Offensive AI, meaning systems tailored to assist or execute harmful activities, is a growing concern for global stability. A study by Darktrace reveals a concerning trend: nearly 74% of cybersecurity experts believe that AI threats are now a significant issue. These attacks aren't just faster and stealthier; they are capable of strategies beyond human reach, transforming the cybersecurity battlefield. Offensive AI can be used to spread disinformation, disrupt political processes, and manipulate public opinion. Additionally, the growing appetite for AI-powered autonomous weapons is worrying because it could result in human rights violations. Establishing guidelines for their responsible use is essential for maintaining global stability and upholding humanitarian values.

Examples of AI-powered Cyberattacks

AI can be used in various cyberattacks to enhance effectiveness and exploit vulnerabilities. Let's explore offensive AI with some real examples. This will show how AI is used in cyberattacks.

  • Deep Fake Voice Scams: In a recent scam, cybercriminals used AI to mimic a CEO’s voice and successfully requested urgent wire transfers from unsuspecting employees.
  • AI-Enhanced Phishing Emails: Attackers use AI to target businesses and individuals by creating personalized phishing emails that appear genuine and legitimate. This enables them to manipulate unsuspecting individuals into revealing confidential information. This has raised concerns about the speed and variations of social engineering attacks with increased chances of success.
  • Financial Crime: Generative AI, with its democratized access, has become a go-to tool for fraudsters to carry out phishing attacks, credential stuffing, and AI-powered BEC (Business Email Compromise) and ATO (Account Takeover) attacks. This has increased behavioral-driven attacks in the US financial sector by 43%, resulting in $3.8 million in losses in 2023.

These examples reveal the complexity of AI-driven threats that need robust mitigation measures.

Impact and Implications

Offensive AI poses significant challenges to current security measures, which struggle to keep up with the swift and intelligent nature of AI threats. Companies are at a higher risk of data breaches, operational interruptions, and serious reputation damage. It's critical now more than ever to develop advanced defensive strategies to effectively counter these risks. Let's take a closer and more detailed look at how offensive AI can affect organizations.

  • Challenges for Human-Controlled Detection Systems: Offensive AI creates difficulties for human-controlled detection systems. It can quickly generate and adapt attack strategies, overwhelming traditional security measures that rely on human analysts. This leaves organizations exposed and increases the likelihood of successful attacks.
  • Limitations of Traditional Detection Tools: Offensive AI can evade traditional rule or signature-based detection tools. These tools rely on predefined patterns or rules to identify malicious activities. However, offensive AI can dynamically generate attack patterns that don't match known signatures, making them difficult to detect. Security professionals can adopt techniques like anomaly detection to detect abnormal activities to effectively counter offensive AI threats.
  • Social Engineering Attacks: Offensive AI can enhance social engineering attacks, manipulating individuals into revealing sensitive information or compromising security. AI-powered chatbots and voice synthesis can mimic human behavior, making distinguishing between real and fake interactions harder.

This exposes organizations to higher risks of data breaches, unauthorized access, and financial losses.
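To make the anomaly-detection idea mentioned above concrete, here is a minimal, illustrative sketch (not a production defence) that flags unusually large values in a traffic metric via z-scores. The function name, data, and threshold are made up for the example:

```python
import numpy as np

def zscore_anomalies(values, threshold=3.0):
    """Flag indices whose z-score exceeds the threshold -- a toy stand-in
    for the anomaly detection mentioned above (e.g. on request rates)."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.where(np.abs(z) > threshold)[0]

# Hourly request counts with one burst typical of automated attack traffic:
counts = [120, 115, 130, 118, 125, 122, 119, 5000]
print(zscore_anomalies(counts, threshold=2.0))  # -> [7]
```

A real defensive system would learn a much richer baseline from many features, but the principle is the same: flag deviations from normal behaviour rather than match known signatures.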

Implications of Offensive AI

While offensive AI poses a severe threat to organizations, its implications extend beyond technical hurdles. Here are some critical areas where offensive AI demands our immediate attention:

  • Urgent Need for Regulations: The rise of offensive AI calls for stringent regulations and legal frameworks to govern its use. Clear rules for responsible AI development will deter bad actors, prevent misuse, and protect individuals and organizations from potential dangers, allowing everyone to safely benefit from the advancements AI offers.
  • Ethical Considerations: Offensive AI raises a multitude of ethical and privacy concerns, threatening the spread of surveillance and data breaches. Moreover, it can contribute to global instability with the malicious development and deployment of autonomous weapons systems. Organizations can limit these risks by prioritizing ethical considerations like transparency, accountability, and fairness throughout the design and use of AI.
  • Paradigm Shift in Security Strategies: Adversarial AI disrupts traditional security paradigms. Conventional defense mechanisms are struggling to keep pace with the speed and sophistication of AI-driven attacks. With AI threats constantly evolving, organizations must step up their defenses by investing in more robust security tools. Organizations must leverage AI and machine learning to build robust systems that can automatically detect and stop attacks as they happen. But it's not just about the tools. Organizations also need to invest in training their security professionals to work effectively with these new systems.

Defensive AI

Defensive AI is a powerful tool in the fight against cybercrime. By using AI-powered advanced data analytics to spot system vulnerabilities and raise alerts, organizations can neutralize threats and build a robust security cover. Although still in development, defensive AI offers a promising way to build responsible and ethical mitigation technology.

Strategic Approaches to Mitigating Offensive AI Risks

In the battle against offensive AI, a dynamic defense strategy is required. Here’s how organizations can effectively counter the rising tide of offensive AI:

  • Rapid Response Capabilities: To counter AI-driven attacks, companies must enhance their ability to detect and respond to threats quickly. Businesses should upgrade security protocols with incident response plans and threat-intelligence sharing, and should utilize cutting-edge real-time analysis tools such as threat detection systems and AI-driven solutions.
  • Leveraging Defensive AI: Integrate an updated cybersecurity system that automatically detects anomalies and identifies potential threats before they materialize. By continuously adapting to new tactics without human intervention, defensive AI systems can stay one step ahead of offensive AI.
  • Human Oversight: AI is a powerful tool in cybersecurity, but it is not a silver bullet. A human-in-the-loop (HITL) approach ensures AI is used in an explainable, responsible, and ethical way, and pairing human judgment with AI makes a defence plan markedly more effective.
  • Continuous Evolution: The battle against offensive AI isn't static; it's a continuous arms race. Regular updates to defensive systems are essential for tackling new threats. Staying informed, flexible, and adaptable is the best defence against rapidly advancing offensive AI.

Defensive AI is a significant step forward in ensuring resilient security coverage against evolving cyber threats. Because offensive AI constantly changes, organizations must adopt a perpetually vigilant posture by staying informed on emerging trends.

Visit Unite.AI to learn more about the latest developments in AI security.

New Relic and Atlassian Deliver the First Observability Integration for Incidents Tab in Jira

New Relic, the all-in-one observability platform for every engineer, announced a new integration with Atlassian, a leading provider of team collaboration and productivity software.

New Relic is the first and only observability platform to be integrated into Atlassian’s new capability to track incidents in Jira Software.

The integration empowers engineering teams to bolster their incident management and resolution practices with insights gained from software issues detected in New Relic. This enables organizations to more effectively manage current incidents and prevent future ones, freeing more time for product innovation.

Incidents are inevitable for every software development team. When development teams lack visibility into production issues and incident ownership is unclear, incidents and bugs go unresolved and can impact user experience, revenue, and reputation.

This issue is exacerbated when incident tracking is scattered across systems, making collaboration difficult between development and IT operations teams.

New Relic and Atlassian are solving these challenges with the Incidents tab in Jira, which sends incidents from New Relic and other tools to Jira, allowing teams to quickly learn about the incident and focus on identifying the affected services, entities, and issues.

By surfacing post-incident reviews (PIRs) in Jira, teams can assign and manage preventative work to reduce the frequency and volume of costly incidents. New Relic is the first and only observability provider to offer an integration available in early access to Jira customers, which helped shape Jira’s new capability.

“At New Relic, we believe in meeting engineers where they are, enabling them to do their best work with their preferred tools. We are delighted to be the first observability platform to integrate with the incidents tab in Jira,” said New Relic Chief Design and Strategy Officer Peter Pezaris.

“With our natively-connected solution, we hope to expand access to observability by bringing the right insights into engineers’ existing workflows, helping them manage their work more effectively with the best-of-breed tools they use every day.”

Key benefits of the integration include:

  • Enhance incident detection and visibility: Enable development teams to monitor their code’s performance in production and identify associated issues in Jira Software.
  • Avoid screen swivel, speed up resolution: Link and create Jira issues with pre-populated incident details from New Relic to help resolve critical issues faster and reduce downtime.
  • Prioritise and streamline workflows: Review incidents by priority, affected service, and status to quickly identify incidents that require a team’s attention.
  • Establish proactive practices: Create PIRs that include real-time New Relic data to help teams understand root causes, remediation options, and how to prevent recurring incidents.

The post New Relic and Atlassian Deliver the First Observability Integration for Incidents Tab in Jira appeared first on Analytics India Magazine.

Why Big Tech Layoffs are Good News for India

Last month, a sweeping wave of layoffs hit major tech giants like Tesla, Google, and Apple, affecting over 20,000 employees. This comes amid a broader trend of job cuts that has impacted more than 70,000 people in the industry this year, and there seems to be no stopping it.

Big tech companies appear to be on a layoff spree. Recently, Google fired its entire US-based Python team, which consisted of fewer than ten people. A few days earlier, it also terminated 200 employees from its ‘core team’ and transferred certain positions abroad, including to India and Mexico.

Meanwhile, Microsoft laid off 1,900 employees at Activision Blizzard and Xbox in January this year, AWS recently cut several hundred employees, and Meta has also let go of staff in a move expected to affect research support roles.

In the hardware space, AMD laid off 450 employees from their Shanghai R&D centre last year.

In the fourth quarter of 2023, Qualcomm fired more than 1,258 employees from its San Diego and Santa Clara offices.

Good news for India

Most reports suggest that big tech companies are relocating resources to countries like India and Mexico to scale their business operations and tap cheaper talent pools.

For instance, the Python team’s layoff follows Google’s setting up of a new office in Munich, Germany, where it will train new employees from Bengaluru, Dublin, Atlanta, and Chicago at a lower cost.

Apart from Google, tech giants like Meta, Amazon, AMD, and Qualcomm also have big plans for India, which should eventually produce more jobs.

Despite cutting off several hundred employees, AWS plans to invest $12.7 billion (Rs 1,05,600 crores) in cloud infrastructure in India by 2030, which will generate an estimated average of 131,700 full-time equivalent (FTE) jobs.

Even though Meta’s company-funded oversight body is planning to trim its workforce, the company’s take on India is positive: it is said to be offering a $250,000 grant to the top five Indian startups in a mixed reality programme.

Also, Microsoft unveiled the ADVANTA(I)GE INDIA initiative to equip 2 million people in India with AI skills by 2025 as part of its Skills for Jobs programme.

Furthermore, Microsoft and iCreate (International Centre for Entrepreneurship and Technology) have signed an MoU with the Ministry of Electronics & IT to boost AI startups in India. From an initial pool of 1,100 AI innovators, the programme will select 100 startups and provide them with access to Microsoft’s Azure OpenAI platform and resources.

The top 25 startups from this group will receive additional support from Microsoft’s global network to help them develop advanced and globally competitive AI-based products.

AMD is also planning something big for India. Over the next five years, it will invest $400 million to open its largest design centre. The company is actively hiring for many machine learning and generative AI roles, aiming for about 3,000 new engineering roles by the end of 2028.

Qualcomm also recently inaugurated a design centre in Chennai, investing ₹177 crore and creating jobs for 1,600 tech professionals.

Furthermore, in 2022, Qualcomm committed to establishing its second-largest office outside the US at a cost of $500 million over five years, expected to generate jobs for around 8,700 software professionals.

Cheap labour, really?

India has a vast pool of highly skilled and hardworking software professionals and has emerged as one of the key talent markets for tech skills worldwide. This means big tech companies can easily and at lower costs find skilled talent to support new-gen tech such as artificial intelligence (AI), machine learning, and data science.

Furthermore, India has a geographical advantage in the Asia-Pacific region. With such a strong presence, tech companies can more easily access and cater to the massive consumer markets in neighbouring countries, including China, Japan, Malaysia, and South Korea.

Big tech companies pay their Indian employees well compared to Indian IT companies or startups. For instance, the average base salary for an AI engineer at Google is about INR 10.7 crore annually; the starting salary is about INR 12 LPA and can go up to INR 21.2 crore. Other big tech players like Microsoft and Amazon offer a similar range.

On the other hand, Indian IT pays the lowest of them all despite claims of training employees in generative AI. The entry-level salary for freshers at TCS ranges from INR 3-4 LPA, which remains lower than the industry standard.

(Figure: average generative AI salaries in India)

Compared to the US?

Compared to US employees, Indians in the country definitely get paid less. “The Average annual salary of developers in India is almost 2.5 times lesser than the average salary of all developers all over the world, almost five times less than the average salary in the US,” said Nitesh Agarwal, CEO at Curated Connections.

(Figure: salary comparison of software engineers in India vs the USA. Source: Tomaž Weiss)

All of this comes down to the economics of running a scalable business, and resource relocation is just part of the corporate strategy.

‘Price’s Law,’ named after British physicist Derek J. de Solla Price and recently discussed by Marc Andreessen and Ben Horowitz on their podcast, ‘The Ben and Marc Show,’ captures this dynamic.

Andreessen said that the square root of the number of contributors generates roughly 50% of the output, reflecting how a small group often drives significant productivity within organisations.

He said that this disproportionate impact of a select few on productivity applies across various domains, such as engineering, product management, and sales, and underscores the strategic considerations behind organisational design and resource allocation.
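As a quick illustration of the arithmetic behind Price’s Law (the function name is our own), roughly the square root of N contributors produce about half the output:

```python
import math

def prices_law_core(n_contributors: int) -> int:
    """Number of people who, per Price's Law, produce ~50% of the output."""
    return round(math.sqrt(n_contributors))

# In a 100-person engineering org, roughly 10 people would account
# for about half the total output; in a 10,000-person org, about 100.
print(prices_law_core(100))    # -> 10
print(prices_law_core(10000))  # -> 100
```

This is a heuristic about skewed contribution, not a precise model, but it explains why companies obsess over retaining a small core of high-impact engineers while relocating the rest.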

The post Why Big Tech Layoffs are Good News for India appeared first on Analytics India Magazine.

5 Simple Steps to Automate Data Cleaning with Python

(Image by Author: boxplot from the 5-step data cleaning pipeline)

It is widely acknowledged among data scientists that data cleaning makes up a big proportion of our working time. However, it is also one of the least exciting parts of the job. This leads to a very natural question:

Is there a way to automate this process?

Automating any process is always easier said than done since the steps to perform depend mostly on the specific project and goal. But there are always ways to automate, at least, some of the parts.

This article aims to generate a pipeline with some steps to make sure our data is clean and ready to be used.

Data Cleaning Process

Before proceeding to generate the pipeline, we need to understand what parts of the processes can be automated.

Since we want to build a process that can be used for almost any data science project, we need to first determine what steps are performed over and over again.

So when working with a new data set, we usually ask the following questions:

  • What format does the data come in?
  • Does the data contain duplicates?
  • Does the data contain missing values?
  • What data types does the data contain?
  • Does the data contain outliers?

These 5 questions can easily be converted into 5 blocks of code to deal with each of the questions:

1. Data Format

Data can come in different formats, such as JSON, CSV, or even XML. Every format requires its own parser. For instance, pandas provides read_csv for CSV files and read_json for JSON files.

By identifying the format, you can choose the right tool to begin the cleaning process.

We can easily identify the format of the file we are dealing with using the os.path.splitext function from the os module. We can therefore create a function that first determines the file extension and then dispatches to the corresponding parser.
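A minimal sketch of such a dispatcher might look like this (the function and mapping names are our own; extend the mapping as needed):

```python
import os
import pandas as pd

# Map each file extension to the matching pandas parser.
PARSERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xml": pd.read_xml,
    ".xlsx": pd.read_excel,
}

def load_data(path: str) -> pd.DataFrame:
    """Detect the file extension and dispatch to the corresponding parser."""
    _, ext = os.path.splitext(path)
    parser = PARSERS.get(ext.lower())
    if parser is None:
        raise ValueError(f"Unsupported file format: {ext}")
    return parser(path)
```

Unknown extensions fail loudly with a ValueError rather than guessing, which is usually the safer default in an automated pipeline.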

2. Duplicates

It happens quite often that some rows of the data contain the same exact values as other rows, what we know as duplicates. Duplicated data can skew results and lead to inaccurate analyses, which is not good at all.

This is why we always need to make sure there are no duplicates.

Pandas has us covered with the drop_duplicates() method, which erases all duplicated rows of a DataFrame.

We can create a straightforward function that uses this method to remove all duplicates. If necessary, we can add a columns input variable so the function drops duplicates based only on a specific list of column names.
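A sketch of that helper (names are illustrative):

```python
from typing import Optional

import pandas as pd

def remove_duplicates(df: pd.DataFrame, columns: Optional[list] = None) -> pd.DataFrame:
    """Drop duplicated rows; if `columns` is given, deduplicate on those columns only."""
    return df.drop_duplicates(subset=columns).reset_index(drop=True)

df = pd.DataFrame({"id": [1, 1, 2], "city": ["BCN", "BCN", "MAD"]})
print(len(remove_duplicates(df)))  # -> 2
```

Passing a subset such as `columns=["id"]` treats rows with the same id as duplicates even when other columns differ.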

3. Missing Values

Missing data is a common issue when working with data as well. Depending on the nature of your data, we can simply delete the observations containing missing values, or we can fill these gaps using methods like forward fill, backward fill, or substituting with the mean or median of the column.

Pandas offers us the .fillna() and .dropna() methods to handle these missing values effectively.

The choice of how we handle missing values depends on:

  • The type of values that are missing
  • The proportion of missing values relative to the number of total records we have.

Dealing with missing values is quite a complex task to perform, and usually one of the most important ones; you can learn more about it in the following article.

For our pipeline, we will first check the total number of rows that present null values. If only 5% of them or less are affected, we will erase these records. In case more rows present missing values, we will check column by column and will proceed with either:

  • Imputing the median of the column.
  • Generating a warning for further investigation.

In this case, we assess the missing values with a hybrid process that keeps a human in the loop. As you already know, assessing missing values is a crucial task that cannot be overlooked.
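One way to sketch that logic (the 5% threshold and the names are the choices described above; the warning branch is the prompt for human review):

```python
import warnings

import pandas as pd

def handle_missing(df: pd.DataFrame, threshold: float = 0.05) -> pd.DataFrame:
    """If at most `threshold` of the rows contain nulls, drop them; otherwise
    impute the median for numeric columns and warn about the rest."""
    frac_null_rows = df.isnull().any(axis=1).mean()
    if frac_null_rows <= threshold:
        return df.dropna().reset_index(drop=True)
    for col in df.columns[df.isnull().any()]:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        else:
            warnings.warn(f"Column '{col}' still has missing values; review manually.")
    return df
```

Median imputation is a deliberately conservative default; for time series or categorical columns you would swap in forward/backward fill or mode imputation instead.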

4. Dealing with Data Types

Ensuring correct data types is crucial for data analysis. Sometimes, data types are incorrectly assigned, such as treating a numeric column as a string.

The main problem in dealing with data types is that there are too many scenarios to consider. This is why the most straightforward way to automate this part is to define the data types we expect our table to have and raise a warning whenever there is a mismatch.

When working with regular data types, we can transform the columns directly with the pandas .astype() function, so you could modify the code to perform routine conversions as well.

Otherwise, it is usually too risky to assume that a transformation will be performed smoothly when working with new data.
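A sketch of such a schema check (the expected schema here is a hypothetical example):

```python
import warnings

import pandas as pd

# Hypothetical expected schema: column name -> pandas dtype string.
EXPECTED_DTYPES = {"age": "int64", "salary": "float64", "name": "object"}

def check_dtypes(df: pd.DataFrame, expected: dict) -> None:
    """Warn about every column whose dtype does not match the expected schema."""
    for col, dtype in expected.items():
        if col in df.columns and str(df[col].dtype) != dtype:
            warnings.warn(f"Column '{col}' is {df[col].dtype}, expected {dtype}.")
```

Because it only warns rather than converts, the function stays safe on unfamiliar data while still surfacing every mismatch for a human to resolve.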

5. Dealing with Outliers

Outliers can significantly affect the results of your data analysis. Techniques to handle outliers include setting thresholds, capping values, or using statistical methods like Z-score.

In order to determine if we have outliers in our dataset, we use a common rule and consider any record outside of the range [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR] to be an outlier,

where IQR stands for the interquartile range and Q1 and Q3 are the 1st and 3rd quartiles. Below you can observe all the previous concepts displayed in a boxplot.

(Image by Author: boxplot showing the quartiles, the IQR, and the outlier range)

To detect the presence of outliers, we can easily define a function that checks what columns present values that are out of the previous range and generate a warning.
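That check can be sketched as follows (the function name is ours):

```python
import warnings

import pandas as pd

def flag_outliers(df: pd.DataFrame) -> None:
    """Warn for each numeric column that has values outside
    [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    for col in df.select_dtypes("number").columns:
        q1, q3 = df[col].quantile(0.25), df[col].quantile(0.75)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        n_out = int(((df[col] < low) | (df[col] > high)).sum())
        if n_out:
            warnings.warn(f"Column '{col}' has {n_out} potential outlier(s).")
```

As with the dtype check, the function warns instead of dropping rows: whether an extreme value is an error or a legitimate observation is a judgment call best left to a human.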

Final Thoughts

Data cleaning is a crucial part of any data project; however, it is usually also the most tedious and time-consuming phase. This article distills a comprehensive approach into a practical 5-step pipeline for automating data cleaning with Python.

The pipeline is not just about implementing code. It integrates thoughtful decision-making criteria that guide the user through handling different data scenarios.

This blend of automation with human oversight ensures both efficiency and accuracy, making it a robust solution for data scientists aiming to optimize their workflow.

You can check my complete code in the following GitHub repo.

Josep Ferrer is an analytics engineer from Barcelona. He graduated in physics engineering and is currently working in the data science field applied to human mobility. He is a part-time content creator focused on data science and technology. Josep writes on all things AI, covering the application of the ongoing explosion in the field.


AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Over the years, the creation of realistic and expressive portrait animations from static images and audio has found a range of applications, including gaming, digital media, and virtual reality. Despite this potential, it is still difficult for developers to create frameworks capable of generating high-quality animations that maintain temporal consistency and are visually captivating. A major cause of the complexity is the need for intricate coordination of lip movements, head positions, and facial expressions to craft a visually compelling effect.

In this article, we will be talking about AniPortrait, a novel framework designed to generate high-quality animations driven by a reference portrait image and an audio sample. The working of the AniPortrait framework is divided into two stages. First, the AniPortrait framework extracts the intermediate 3D representations from the audio samples, and projects them into a sequence of 2D facial landmarks. Following this, the framework employs a robust diffusion model coupled with a motion module to convert the landmark sequences into temporally consistent and photorealistic animations. The experimental results demonstrate the superiority and ability of the AniPortrait framework to generate high quality animations with exceptional visual quality, pose diversity, and facial naturalness, therefore offering an enhanced and enriched perceptual experience. Furthermore, the AniPortrait framework holds remarkable potential in terms of controllability and flexibility, and can be applied effectively in areas including facial reenactment, facial motion editing, and more. This article aims to cover the AniPortrait framework in depth, and we explore the mechanism, the methodology, the architecture of the framework along with its comparison with state of the art frameworks. So let’s get started.

AniPortrait: Photorealistic Portrait Animation

Creating realistic and expressive portrait animations has been the focus of researchers for a while now, owing to its incredible potential and applications spanning from digital media and virtual reality to gaming and more. Despite years of research and development, producing high-quality animations that maintain temporal consistency and are visually captivating still presents a significant challenge. A major hurdle for developers is the need for intricate coordination between head positions, visual expressions, and lip movements to craft a visually compelling effect. Existing methods have failed to tackle these challenges, primarily because a majority of them rely on limited-capacity generators like NeRF, motion-based decoders, and GANs for visual content creation. These networks exhibit limited generalization capabilities and are unstable in generating high-quality content. However, the recent emergence of diffusion models has facilitated the generation of high-quality images, and some frameworks built on top of diffusion models along with temporal modules have facilitated the creation of compelling videos, allowing diffusion models to excel.

Building upon the advancements of diffusion models, the AniPortrait framework aims to generate high-quality animated portraits using a reference image and an audio sample. The working of the AniPortrait framework is split into two stages. In the first stage, the framework employs transformer-based models to extract a sequence of 3D facial meshes and head poses from the audio input, and subsequently projects them into a sequence of 2D facial landmarks. This first stage enables the framework to capture lip movements and subtle expressions from the audio, in addition to head movements that synchronize with the rhythm of the audio sample. In the second stage, the framework employs a robust diffusion model integrated with a motion module to transform the facial landmark sequence into a photorealistic and temporally consistent animated portrait. To be more specific, the AniPortrait framework draws upon the network architecture of the existing AnimateAnyone model, which employs Stable Diffusion 1.5, a potent diffusion model, to generate lifelike and fluid video based on a reference image and a body motion sequence. Notably, the AniPortrait framework does not use the pose guider module as it is implemented in the AnimateAnyone framework, but redesigns it, allowing the framework not only to maintain a lightweight design but also to exhibit enhanced precision in generating lip movements.

Experimental results demonstrate the superiority of the AniPortrait framework in creating animations with impressive facial naturalness, excellent visual quality, and varied poses. By employing 3D facial representations as intermediate features, the framework gains the flexibility to modify these representations as required. This adaptability significantly enhances the applicability of the AniPortrait framework across domains including facial reenactment and facial motion editing.

AniPortrait: Working and Methodology

The proposed AniPortrait framework comprises two modules, namely Audio2Lmk and Lmk2Video. The Audio2Lmk module extracts a sequence of landmarks that captures intricate lip movements and facial expressions from the audio input, while the Lmk2Video module uses this landmark sequence to generate high-quality portrait videos with temporal stability. The following figure presents an overview of the working of the AniPortrait framework. As can be observed, the framework first extracts the 3D facial mesh and head pose from the audio, and subsequently projects these two elements into 2D key points. In the second stage, the framework employs a diffusion model to transform the 2D key points into a portrait video, with the two stages being trained concurrently within the network.

Audio2Lmk

For a given sequence of speech snippets, the primary goal of the AniPortrait framework is to predict the corresponding 3D facial mesh sequence, along with vector representations of translation and rotation. The framework employs the pre-trained wav2vec method to extract audio features; this model exhibits a high degree of generalization and accurately recognizes intonation and pronunciation from the audio, which plays a crucial role in generating realistic facial animations. By leveraging these robust speech features, the framework can employ a simple architecture consisting of two fully connected (fc) layers to convert the features into 3D facial meshes. This straightforward design not only enhances the efficiency of the inference process, but also ensures accuracy.

When converting audio to pose, the framework employs the same wav2vec network as the backbone, although it does not share weights with the audio-to-mesh module. This is mainly because pose is associated more with the tone and rhythm present in the audio, which is a different emphasis from the audio-to-mesh task. To account for the impact of previous states, the framework employs a transformer decoder to decode the pose sequence, integrating the audio features into the decoder through cross-attention mechanisms. Both modules are trained using the L1 loss. Once the model obtains the mesh and pose sequences, it employs perspective projection to transform them into a 2D sequence of facial landmarks, which is then used as the input signal for the subsequent stage.
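The Audio2Lmk design described above can be condensed into a few lines. The following is a minimal sketch, not the paper's implementation: all dimensions and random weights are invented stand-ins for the pre-trained wav2vec encoder and the trained transformer decoder, and the cross-attention is reduced to a single head.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Assumed dimensions: T audio frames, wav2vec feature width, model width,
# and a MediaPipe-style face mesh with 468 vertices (x, y, z each).
T, D_AUDIO, D_MODEL, N_VERTS = 50, 768, 256, 468
audio_feats = rng.standard_normal((T, D_AUDIO))  # stand-in for wav2vec output

# Audio-to-mesh: the two fully connected layers the text describes.
w1 = rng.standard_normal((D_AUDIO, D_MODEL)) * 0.02
w2 = rng.standard_normal((D_MODEL, N_VERTS * 3)) * 0.02
mesh_seq = (np.maximum(audio_feats @ w1, 0) @ w2).reshape(T, N_VERTS, 3)

# Audio-to-pose: pose queries cross-attend to the audio features
# (single-head sketch of the transformer decoder's cross-attention).
wq = rng.standard_normal((D_MODEL, D_MODEL)) * 0.02
wk = rng.standard_normal((D_AUDIO, D_MODEL)) * 0.02
wv = rng.standard_normal((D_AUDIO, D_MODEL)) * 0.02
w_out = rng.standard_normal((D_MODEL, 6)) * 0.02      # 3D rotation + 3D translation

queries = rng.standard_normal((T, D_MODEL))           # learned pose queries
attn = softmax((queries @ wq) @ (audio_feats @ wk).T / np.sqrt(D_MODEL))
pose_seq = (attn @ (audio_feats @ wv)) @ w_out        # one 6D head pose per frame

# Both modules are trained against ground truth with an L1 loss.
l1_loss = np.abs(mesh_seq - np.zeros_like(mesh_seq)).mean()

print(mesh_seq.shape, pose_seq.shape)
```

The mesh and pose sequences would then be projected into 2D landmarks via perspective projection, as the text notes.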

Lmk2Video

For a given reference portrait image and a sequence of facial landmarks, the proposed Lmk2Video module creates a temporally consistent portrait animation, represented as a sequence of portrait frames, that aligns its motion with the landmark sequence while keeping its appearance consistent with the reference image. The design of Lmk2Video's network structure draws inspiration from the existing AnimateAnyone framework. The AniPortrait framework employs Stable Diffusion 1.5, an extremely potent diffusion model, as its backbone, and incorporates a temporal motion module that effectively converts multi-frame noise inputs into a sequence of video frames. At the same time, a ReferenceNet component, which mirrors the structure of Stable Diffusion 1.5, extracts the appearance information from the reference image and integrates it into the backbone. This strategic design ensures that the facial identity remains consistent throughout the output video.

Differentiating itself from the AnimateAnyone framework, the AniPortrait framework enhances the complexity of the PoseGuider's design. The original version in AnimateAnyone comprises only a few convolution layers, after which the landmark features merge with the latents at the input layer of the backbone. The AniPortrait framework found that this design falls short in capturing the intricate movements of the lips; to tackle this issue, it adopts the multi-scale strategy of the ConvNet architecture and incorporates landmark features of corresponding scales into different blocks of the backbone. Furthermore, the framework introduces an additional improvement by including the landmarks of the reference image as an extra input. The cross-attention module of the PoseGuider component facilitates interaction between the target landmarks of each frame and the reference landmarks. This process provides the network with additional cues for comprehending the correlation between appearance and facial landmarks, assisting in the generation of portrait animations with more precise motion.
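The multi-scale injection idea can be sketched as follows. This is a minimal illustration under invented assumptions: shapes are made up, and simple average pooling stands in for the PoseGuider's convolution layers.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a (C, H, W) feature map by an integer factor."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

rng = np.random.default_rng(0)

# Hypothetical feature map derived from the rendered 2D landmark image.
landmark_feats = rng.standard_normal((64, 64, 64))  # (C, H, W)

# Backbone blocks operate at progressively coarser resolutions; the guider
# injects a landmark feature map matched to each block's scale.
backbone_feats = {
    64: rng.standard_normal((64, 64, 64)),
    32: rng.standard_normal((64, 32, 32)),
    16: rng.standard_normal((64, 16, 16)),
}
fused = {
    res: feats + downsample(landmark_feats, 64 // res)
    for res, feats in backbone_feats.items()
}
print({res: f.shape for res, f in fused.items()})
```

The point of the design is that lip detail survives at the finer scales while the coarser scales still receive a pose signal, instead of the landmarks entering only once at the input layer.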

AniPortrait: Implementation and Result

For the Audio2Lmk stage, the AniPortrait framework adopts the wav2vec2.0 component as its backbone and leverages the MediaPipe architecture to extract 3D meshes and 6D poses for annotations. The model sources the training data for the Audio2Mesh component from an internal dataset comprising nearly 60 minutes of high-quality speech data from a single speaker. To ensure the 3D mesh extracted by the MediaPipe component is stable, the voice actor was instructed to face the camera and maintain a steady head position throughout the recording process.

For the Lmk2Video module, the AniPortrait framework implements a two-stage training approach. In the first stage, the framework focuses on training ReferenceNet and PoseGuider, the 2D components of the backbone, and leaves out the motion module. In the second stage, the framework freezes all the other components and concentrates on training the motion module. For this stage, the framework makes use of two large-scale, high-quality facial video datasets, and processes all the data using the MediaPipe component to extract 2D facial landmarks. Furthermore, to enhance the sensitivity of the network towards lip movements, the model renders the upper and lower lips in distinct colors when drawing the pose image from the 2D landmarks.

As shown in the following image, the AniPortrait framework generates a series of animations of superior quality and realism.

The framework then utilizes an intermediate 3D representation that can be edited to manipulate the output as per the requirements. For instance, users can extract landmarks from a certain source and alter its ID, therefore allowing the AniPortrait framework to create a facial reenactment effect.

Final Thoughts

In this article, we talked about AniPortrait, a novel framework designed to generate high-quality animations driven by a reference portrait image and an audio sample. By simply inputting a reference image and an audio clip, the framework produces a portrait video featuring natural head movement and smooth lip motion. Its pipeline runs in two stages: intermediate 3D representations are first extracted from the audio and projected into a sequence of 2D facial landmarks, and a diffusion model coupled with a motion module then converts the landmark sequences into temporally consistent, photorealistic animation. By leveraging the robust generalization capabilities of the diffusion model, AniPortrait delivers impressive image quality and lifelike motion, along with the pose diversity and facial naturalness demonstrated in the experiments. Its controllability and flexibility also make it applicable to areas including facial reenactment, facial motion editing, and more.

Microsoft Eats into Amazon’s Cloud Market Share 

Microsoft Amazon Google Cloud

The $76-billion global cloud infrastructure services market has once again been captured by the big three, with a 67% combined market share. Amazon continues to dominate the cloud market with a 31% share, down one percentage point from the previous year. Microsoft, on the other hand, has been surging forward.

Microsoft Azure is the King

Microsoft Azure has shown steady growth in the cloud sector, capturing an increasing share of the market. The most recent quarter's cloud revenue was $35.1 billion, up 23% year-on-year (YoY). The company closely trails Amazon with a global cloud market share of 25%.

Microsoft’s large spread of AI offerings across its enterprise suite is proving to be its golden egg (the goose being OpenAI).

“Our AI innovation continues to build on our strategic partnership with OpenAI. More than 65% of the Fortune 500 now use Azure OpenAI Service,” said Microsoft chief Satya Nadella, in a recent earnings call.

Nadella also confirmed that the quantity of Azure deals valued at over $100 million rose by over 80% compared to the previous year, while the number of deals exceeding $10 million more than doubled.

Guided by Nadella's strategic brilliance, Microsoft's cloud share has been advancing by one percentage point each quarter, mirroring the deliberate steps of the king on a chessboard.

Copilot Mode ON

Microsoft’s Copilot is proving to be the backbone for AI-powered products for its customers. “30,000 customers across every industry have used Copilot Studio to customise Copilot for Microsoft 365 or build their own, up 175% quarter-over-quarter,” said Nadella.

In the earnings announcements, Nadella spoke at length about Copilot's applications across domains. He claimed that almost 60% of Fortune 500 companies use Copilot, and that adoption has accelerated across industries, with companies such as Amgen, BP, Cognizant, Koch Industries, Moody's, Novo Nordisk, NVIDIA, and Tech Mahindra purchasing over 10,000 seats.

“We’re not stopping there. We’re accelerating our innovation, adding over 150 Copilot capabilities since the start of the year,” said Nadella.

While Microsoft skyrockets, Google has maintained its 11% share of the cloud market.


Google Cloud Remains Resilient

Google witnessed staggering growth in the recent quarter with 15% revenue growth YoY and a net income of $23.7 billion, which is a jump of 57% from the previous year. The company attributes a considerable chunk of growth to Google Cloud.

“Today, over 60% of funded GenAI startups and nearly 90% of GenAI unicorns are Google Cloud customers,” said Google chief Sundar Pichai. The company posted an operating income of $900 million on cloud services. The company even acknowledged that the growth across the cloud is underpinned by the benefits of AI.

In cloud, Google has announced over 1000 new products and features in the past eight months.

AI Integration Continues for AWS

Though Amazon's share dipped by one percentage point in the recent results, the company is not backing down in any way. AWS's segment sales increased 17% YoY to hit $25 billion, and the company has been investing extensively in bringing AI to its platform.

Recently, AWS announced the general availability of Amazon Q, the company's most advanced AI-powered assistant. Amazon Q comes in three forms: one for developers, one for enterprises, and Q Apps, which enables companies to build generative AI apps using their own company data.

“The combination of companies renewing their infrastructure modernisation efforts and the appeal of AWS’ AI capabilities is reaccelerating AWS’ growth rate,” said Andy Jassy, Amazon’s president and CEO. The company is at a $100 billion annual revenue rate.

Amazon Bedrock, AWS’s generative AI service that allows users to leverage the latest LLMs for building AI applications, also witnessed remarkable numbers in the recent quarter. Amazon confirmed that thousands of organisations worldwide are using Amazon Bedrock.

The post Microsoft Eats into Amazon’s Cloud Market Share appeared first on Analytics India Magazine.

Oracle Launches Database 23ai, Brings AI Power to Enterprise Data

Oracle has unveiled Oracle Database 23ai, a new database technology integrating AI capabilities. The release, now available as a suite of cloud services, focuses on streamlining AI utilisation, enhancing application development, and supporting critical workloads.

One of its key features, Oracle AI Vector Search, makes data search simple by enabling users to search documents, images, and relational data based on conceptual content rather than specific keywords or data values.

AI Vector Search facilitates natural language queries on private business data within Oracle databases, eliminating the need to move or duplicate data for AI processing. This real-time AI integration within databases enhances efficiency, security, and operational effectiveness.
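Conceptually, vector search replaces keyword matching with nearest-neighbour lookup over embeddings. The sketch below illustrates the idea in plain Python with made-up document embeddings; it is not Oracle's API, where the vectors would instead live in database columns and be queried through SQL.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical document embeddings, e.g. produced by an embedding model.
docs = {
    "invoice_2023.pdf": rng.standard_normal(8),
    "travel_policy.docx": rng.standard_normal(8),
    "q3_forecast.xlsx": rng.standard_normal(8),
}

def cosine_distance(a, b):
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def vector_search(query_vec, docs, k=2):
    """Return the k documents whose embeddings are nearest to the query."""
    ranked = sorted(docs, key=lambda name: cosine_distance(query_vec, docs[name]))
    return ranked[:k]

query = rng.standard_normal(8)  # embedding of a natural-language question
print(vector_search(query, docs))
```

Because the lookup ranks by semantic distance rather than string overlap, a question phrased in natural language can retrieve documents that share no keywords with it.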

Oracle Database 23ai is available in Oracle Cloud Infrastructure (OCI) on Oracle Exadata Database Service, Oracle Exadata Cloud@Customer and Oracle Base Database Service, as well as on Oracle Database@Azure.

Juan Loaiza, Oracle’s Executive Vice President of Mission-Critical Database Technologies, highlighted the significance of Oracle Database 23ai, calling it a game-changer for global enterprises.

“AI Vector Search combined with new unified development paradigms and mission-critical capabilities makes it simple for developers and data professionals to build intelligent apps, increase developer productivity, and run mission-critical workloads,” he said.

Key enhancements in Oracle Database 23ai include AI Vector Search for semantic search, Oracle Exadata System Software 24ai for accelerated AI processing, and OCI GoldenGate 23ai for real-time data replication across heterogeneous stores. These innovations empower developers to build intelligent apps, leverage JSON and graph data models, and ensure mission-critical data availability and security.

With Oracle’s continuous advancements in AI-integrated databases, customers can expect improved operational efficiency, enhanced data security, and accelerated innovation in enterprise applications. Oracle Database 23ai represents a significant step forward in AI-driven database solutions, promising a robust foundation for businesses embracing AI technologies.

The post Oracle Launches Database 23ai, Brings AI Power to Enterprise Data appeared first on Analytics India Magazine.

Meta’s New Data Analyst Professional Certification Has Dropped!

Meta Certification
Image by Author

A new professional certification has arrived on the Coursera platform — Meta Data Analyst Professional Certificate.

If you have been thinking about entering the data analytics market, now is a great opportunity. The current median US salary for a data analyst is $82,000+, and there are over 90,000 US job openings in the field.

Let’s get right into the course…

Meta Data Analyst Certification

Link: Meta Data Analyst Professional Certificate

This certification is aimed at beginners looking to enter the tech industry through a data analyst role. You can take the courses in your own time and at your own pace. The full programme takes about 5 months to complete if you commit 10 hours a week, but if you can commit more, you can get it done faster!

The certification is made up of 5 courses:

  • Introduction to Data Analytics
  • Data Analysis with Spreadsheets and SQL
  • Python Data Analytics
  • Statistics for Marketing
  • Introduction to Data Management

What Will I Learn?

In these 5 courses, you will learn:

  • Collect, clean, sort, evaluate, and visualise data
  • Apply the OSEMN framework to guide the data analysis process
  • Ensure that the approach you have taken delivers actionable insights
  • Use statistical analysis, such as hypothesis testing and regression analysis, to make data-driven decisions
  • Understand the foundational principles of effective data management
  • Understand the usability of data assets within an organisation

What Will I Get Out Of This Certification?

If you plan to take this certification, here are the features and a few perks that come with it:

  • Receive professional-level training from Meta
  • Demonstrate your proficiency in portfolio-ready projects
  • Qualify for in-demand job titles: Data Analyst, Associate Data Analyst, and Business Analyst
  • Resume and LinkedIn review with personalized feedback
  • Practice your skills with interactive tools and mock interviews
  • Career support with Coursera’s job search guide

Wrapping it Up

In 5 months you could be ready to start your new career, with the support and guidance you need to get there. Demand for data analysts continues to grow as data becomes ever more valuable.

Happy learning!

Nisha Arya is a data scientist, freelance technical writer, and an editor and community manager for KDnuggets. She is particularly interested in providing data science career advice or tutorials and theory-based knowledge around data science. Nisha covers a wide range of topics and wishes to explore the different ways artificial intelligence can benefit the longevity of human life. A keen learner, Nisha seeks to broaden her tech knowledge and writing skills, while helping guide others.

Why Google Gemma is Better than Meta’s Llama 3 for Indic LLMs

Earlier this year, Google released Gemma, a family of lightweight open-source language models developed by Google DeepMind and other teams across Google. Soon after its launch, many Indian developers experimented with it and built Indic LLMs like Tamil Gemma, Telugu Gemma, Hindi Gemma and many more.

“Gemma probably does a better job in Indic tokenisation than GPT-4 and Llama 3,” said Vivek Raghavan, co-founder of Sarvam AI, in an exclusive interview with AIM.

However, he added that Llama 3 has its own advantages. “I think Llama 3 looks quite good. There are many open models and we have a strategy where we leverage all of them,” he said.

The thought was echoed by Adithya S Kolavi, founder of Cognitive Lab, who recently built a leaderboard for Indic LLMs. According to his leaderboard, Meta’s latest release, Llama 3, performs significantly better than Llama 2 on most benchmarks. However, compared to Gemma, it falls a little short. Gemma’s tokenization for Devanagari is efficient when compared to Llama 2.

Average eval scores for Hindi:
gemma-7b -> 0.550
llama3-8b -> 0.498
llama2-7b -> 0.309

— Adithya S K (@adithya_s_k), April 20, 2024

“Models using Llama 2 extended its tokenizer by 20 to 30k tokens, reaching a vocabulary size of 50-60k. Continuous pre-training is crucial for understanding these new tokens. In contrast, Gemma’s tokenizer initially handles Indic languages well, requiring minimal fine-tuning for specific tasks,” said Kolavi.

Recently, Telugu LLM Labs also experimented with Gemma and released Telugu Gemma. “On a higher level, the Gemma tokenizer includes tokens for most Indian languages, providing strong representations for these tokens. In contrast, the Llama3 tokenizer supports only a few languages, and its quality of support is not as robust,” said Ravi Theja, founder of Telugu LLM Labs.

“Gemma features an exhaustive 256K tokenizer. A quick test of its tokenizing capabilities revealed that the models are exceptionally proficient in handling the Telugu language,” he added.

Similarly, OdiaGenAI released Hindi-Gemma-2B-instruct, a 2-billion-parameter model supervised fine-tuned (SFT) on a large set of 187k Hindi instructions. The team said Gemma-2B was chosen as the base model because its 2B version suits CPU and on-device applications, and because its tokeniser handles Indic languages more efficiently than other LLMs.

“In comparative tests conducted by the OdiaGenAI team, the Gemma 7B model demonstrates superior performance over the Gemma 2B LLM model for Indic languages such as Odia,” shared Shantipriya Parida, the creator of Odia Llama.

Gemma Holds Advantage Over Llama 3

Llama 3 is pre-trained on over 15 trillion tokens collected from publicly available sources. Only 5% of the Llama 3 pre-training dataset consists of high-quality non-English data, covering over 30 languages, which amounts to 750 billion tokens.

“750 billion tokens are spread across 30 languages, and considering an equal distribution over all 30 languages, it comes out to be 25 billion tokens per non-English language. A language like Hindi is very rich, so I feel it’s grossly underrepresented in Llama 3,” said Adarsh Shirawalmath, founder of Tensoic and creator of Kannada Llama.

Llama 3 is a bit difficult when it comes to Indic LLMs. “It’s going to be hard to adapt Llama 3 for Indic languages, in my opinion,” said Kolavi. Even though initial tests show better performance with Devanagari compared to Llama 2, it struggles with other languages like Kannada, Malayalam, and Tamil. More testing is needed to fully assess Llama 3’s performance with these languages.

He explained in his blog that Llama 3 uses a TikToken-based tokenizer, which struggles to tokenize Indic languages efficiently, even with a vocabulary size of 128k. Moreover, when it comes to vocabulary expansion, unlike models using sentence-piece tokenization as Gemma does, Llama 3 may face difficulties in expanding its vocabulary to better handle the wide variety of Indic languages.
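The tokenisation gap is easy to see with a toy greedy tokenizer. The vocabularies below are invented stand-ins: one contains Devanagari word pieces (as a sentence-piece vocabulary with Indic coverage would), while the other has no Indic merges and falls back to UTF-8 bytes, which is the inefficiency described above.

```python
def tokenize(text, vocab, max_piece=8):
    """Greedy longest-match tokenizer; unknown characters fall back to UTF-8 bytes."""
    i, tokens = 0, []
    while i < len(text):
        for j in range(min(len(text), i + max_piece), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens

hindi = "नमस्ते दुनिया"                    # "hello world"
indic_vocab = {"नमस्ते", "दुनिया", " "}    # vocabulary with Devanagari pieces
bytes_only = {" "}                          # no Indic merges at all

print(len(tokenize(hindi, indic_vocab)))  # 3 tokens
print(len(tokenize(hindi, bytes_only)))   # 37 tokens: 3 bytes per Devanagari char
```

This fertility gap (tokens per word) is what the Llama 2 extensions addressed by adding 20-30k Indic tokens to the vocabulary, after which continuous pre-training is needed so the model learns what those new tokens mean.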

Not all is Lost for Llama 3

“The environment around Llama 3 is really buzzing and a lot of experiments are being done whereas the same has plateaued for Gemma. While Gemma is better for Indic languages since it has a lot of Indic tokens in its 256k vocab size, it does not mean that it’s easier to work with. In fact, Gemma is really hard and unstable to work with,” said Shirawalmath.

He said that the size of the embedding layer, a consequence of Gemma's huge vocabulary, makes it really hard to train or fine-tune. Llama 3, on the other hand, hits the sweet spot of a 128k vocabulary using the Tiktoken tokenizer, but really lacks Indic tokens.

“There are some challenges, but they are all solvable depending upon what you do,” said Raghavan about Llama 3, where he and his team are currently experimenting to build an Indic voice LLM, which is expected to be launched in the coming months.

The post Why Google Gemma is Better than Meta’s Llama 3 for Indic LLMs appeared first on Analytics India Magazine.