Microsoft’s Classic WordPad On The Way Out, Notepad Gets Updates

Microsoft has decided to bid farewell to the iconic WordPad. Three decades ago, the company introduced WordPad, a no-cost word processing tool that later became a staple with Windows 95. However, Microsoft has now made the decision to retire this text editing tool, which has long been preinstalled on every Windows computer.

The announcement has been a long time coming, considering Microsoft recently introduced updates for Notepad. These new features, including autosave and automatic tab restoration, mark the first upgrade to Notepad since 2018.

In a support note, Microsoft stated, “WordPad is no longer receiving updates and will be removed in a future release of Windows.” The note further recommends Microsoft Word for handling rich text documents like .doc and .rtf, while suggesting Windows Notepad for plain text documents such as .txt files.

The most recent WordPad software documentation clarifies that as part of planned development cycles, certain features are introduced while others are phased out to enhance the overall user experience. Consequently, WordPad will no longer see further development or updates, and it will eventually be removed from Windows 11 in an upcoming software update.

Consistent with Microsoft’s tradition of quietly shutting down products over decades, WordPad’s discontinuation got a brief mention on Microsoft’s Deprecated features list, devoid of any fanfare.

WordPad was introduced as a replacement for Microsoft Write, the earlier free word processor that dates back to 1985. Even though it was initially preinstalled on all machines, over time, WordPad became an optional application with Windows 10. In 2020, reports revealed that Microsoft was experimenting with ads within the free WordPad application. However, this addition was never widely released to the public.

As per the details, WordPad will remain fully functional and accessible until users install the Windows update that removes it. Microsoft has yet to specify a date for that update.

The post Microsoft’s Classic WordPad On The Way Out, Notepad Gets Updates appeared first on Analytics India Magazine.

Bootstrapped Zoho Crosses 100 Million Users

SaaS company Zoho, on Tuesday, announced that it has crossed 100 million users across its expansive suite of 55+ business applications. Remarkably, Zoho achieved this milestone without external funding, making it the first self-funded SaaS company to do so. This achievement follows Zoho’s impressive accomplishment of reaching $1 billion in annual revenue last year.

“I want to thank all our customers for trusting us with their business and helping us reach 100 million users worldwide,” said Zoho’s co-founder and CEO, Sridhar Vembu, in a statement. “This is an impressive milestone for any organization, but is particularly sweet for us — a bootstrapped company that has never raised external capital.”

In recent years, Zoho Corp has experienced remarkable growth in both sales and user numbers. To put this into perspective, it achieved the milestone of one million users in 2008. Impressively, it added the last 50 million users in just the past five years.

Additionally, the company has made significant progress in the enterprise or “upmarket” segment in India, achieving a three-year Compound Annual Growth Rate (CAGR) of 65 percent.
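
To put the 65 percent figure in perspective, a compound annual growth rate sustained over three years implies roughly a 4.5-fold increase. The calculation below is only the standard CAGR formula applied to the stated rate; the symbols V_0 and V_3 (start and end values) are introduced here for illustration and are not figures Zoho reported.

```latex
\[
\frac{V_3}{V_0} = (1 + \text{CAGR})^{3} = (1.65)^{3} \approx 4.49
\]
```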

Unleashing Generative AI

Earlier this year, Zoho announced that it is doubling down on investment efforts in R&D, and expects to release more generative AI solutions in phases through 2023 and 2024.

Back in June, Zoho made an announcement regarding the development of its very own LLM. This project is being spearheaded by CEO Vembu and is being actively worked on by the company’s dedicated R&D team.

Amid the adoption of generative AI, Zoho was one of the few SaaS companies that did not lay off employees. Zoho’s co-founder and CEO, Sridhar Vembu, emphasized on July 21 that AI will only replace specific job functions, not entire employees. “At Zoho, we hold the belief that AI can take over certain roles, but the significance of people remains unchanged. This perspective reflects the core philosophy of our organization,” Vembu stated.

Challenging Times Ahead

OpenAI recently launched ChatGPT Enterprise. In a blog post, the company said that it comes with a new admin console that lets businesses manage team members easily and offers domain verification, SSO, and usage insights, allowing for large-scale deployment in the enterprise. This development overlaps with the services of many current SaaS companies with B2B offerings, including Zoho.

Only time will tell whether ChatGPT Enterprise will jeopardize Zoho’s survival, now that OpenAI has stepped in to offer business solutions centered on ChatGPT.

The post Bootstrapped Zoho Crosses 100 Million Users appeared first on Analytics India Magazine.

Data Cleaning with Pandas

Introduction

If you are into data science, then data cleaning might sound like a familiar term to you. If not, let me explain. Our data often comes from multiple sources and is not clean. It may contain missing values, duplicates, wrong or undesired formats, etc. Running your experiments on this messy data leads to incorrect results. Therefore, it is necessary to prepare your data before it is fed to your model. This preparation of the data by identifying and resolving potential errors, inaccuracies, and inconsistencies is termed data cleaning.

In this tutorial, I will walk you through the process of cleaning the data using Pandas.

Dataset

I will be working with the famous Iris dataset. The Iris dataset contains measurements of four features of three species of Iris flowers: sepal length, sepal width, petal length, and petal width. We will be using the following libraries:

Pandas: Powerful library for data manipulation and analysis

Scikit-learn: Provides tools for data preprocessing and machine learning

Steps for Data Cleaning

1. Loading the Dataset

Load the Iris dataset using Pandas' read_csv() function:

import pandas as pd

# Column names to assign in place of the header row already in the file
column_names = ['id', 'sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
iris_data = pd.read_csv('data/Iris.csv', names=column_names, header=0)
iris_data.head()

Output:

id	sepal_length	sepal_width	petal_length	petal_width	species
1	5.1	3.5	1.4	0.2	Iris-setosa
2	4.9	3.0	1.4	0.2	Iris-setosa
3	4.7	3.2	1.3	0.2	Iris-setosa
4	4.6	3.1	1.5	0.2	Iris-setosa
5	5.0	3.6	1.4	0.2	Iris-setosa

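The header=0 parameter indicates that the first row of the CSV file contains the column names (header). Because names is also passed, that header row is replaced by the names in column_names rather than being read as data.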
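
The interaction between names and header=0 can be checked on a tiny in-memory file. This is a minimal sketch: the CSV text and its original header names (Id, SepalLengthCm, Species) are made up for illustration and are not taken from the dataset file used in this tutorial.

```python
import io

import pandas as pd

# Hypothetical CSV text whose header uses names we want to replace.
csv_text = "Id,SepalLengthCm,Species\n1,5.1,Iris-setosa\n2,4.9,Iris-setosa\n"
new_names = ["id", "sepal_length", "species"]

# header=0 marks the first line as a header row; names= then overrides it.
replaced = pd.read_csv(io.StringIO(csv_text), names=new_names, header=0)

# Without header=0, pandas treats the original header line as a data row.
kept = pd.read_csv(io.StringIO(csv_text), names=new_names)

print(replaced.shape)
print(kept.shape)
print(kept.head())
```

If the row count is one higher than expected after loading, a header line that was read as data is a likely cause.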

2. Explore the dataset

To get insights about our dataset, we will print some basic information using the built-in functions in pandas:

print(iris_data.info())
print(iris_data.describe())

Output:

RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   id            150 non-null    int64
 1   sepal_length  150 non-null    float64
 2   sepal_width   150 non-null    float64
 3   petal_length  150 non-null    float64
 4   petal_width   150 non-null    float64
 5   species       150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
None

Output for iris_data.describe()

The info() function is useful for understanding the overall structure of the data frame, the number of non-null values in each column, and the memory usage. The summary statistics from describe(), in turn, provide an overview of the numerical features in your dataset.

3. Checking Class Distribution

This step shows how the classes are distributed in categorical columns, which matters for classification tasks. You can perform it using the value_counts() function in pandas.

print(iris_data['species'].value_counts())

Output:

Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: species, dtype: int64

Our results show that the dataset is balanced with an equal number of representations of each species. This sets the base for a fair evaluation and comparison across all 3 classes.
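
When the classes are not balanced, proportions are often easier to read than raw counts. The following is a small sketch on made-up labels (the 6/3/1 split is hypothetical, chosen only to show an imbalance); it uses the normalize option of value_counts() and a simple largest-to-smallest ratio as a rough indicator.

```python
import pandas as pd

# Hypothetical imbalanced labels, used only for illustration.
labels = pd.Series(
    ["setosa"] * 6 + ["versicolor"] * 3 + ["virginica"] * 1,
    name="species",
)

# normalize=True returns each class's share of the rows instead of raw counts.
shares = labels.value_counts(normalize=True)
print(shares)

# A rough imbalance indicator: ratio of the largest class to the smallest.
imbalance_ratio = shares.max() / shares.min()
print("Imbalance ratio:", imbalance_ratio)
```

For the Iris data used here, every share would be one third and the ratio would be 1.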

4. Removing Missing Values

Since the info() output shows that none of our 6 columns has missing values, we will skip this step. But if you encounter any missing values, use the following command to drop the rows that contain them:

iris_data.dropna(inplace=True)
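
Dropping rows is not the only option. The sketch below, on a small made-up frame, first counts the gaps per column and then compares dropping rows with filling a numeric column using its median. The frame and the choice of the median are assumptions for illustration; the right strategy depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps, used only for illustration.
df = pd.DataFrame({
    "sepal_length": [5.1, np.nan, 4.7, 4.6],
    "species": ["Iris-setosa", "Iris-setosa", None, "Iris-setosa"],
})

# Count missing values per column before deciding how to handle them.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: keep the rows and fill the numeric gap with the column median.
filled = df.copy()
filled["sepal_length"] = filled["sepal_length"].fillna(filled["sepal_length"].median())

print(len(dropped), "rows left after dropna")
print(filled)
```

Note that filling only addresses the numeric column here; the missing species label in the filled frame would still need a decision.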

5. Removing Duplicates

Duplicates can distort our analysis, so we remove them from our dataset. We will first check for them using the following command:

duplicate_rows = iris_data.duplicated()
print("Number of duplicate rows:", duplicate_rows.sum())

Output:

Number of duplicate rows: 0

We do not have any duplicates for this dataset. Nonetheless, the duplicates can be removed via the drop_duplicates() function.

iris_data.drop_duplicates(inplace=True)
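
One caveat worth checking: when a dataset has a unique id column, no two full rows can be identical, so duplicated() reports zero even if the measurements repeat. The sketch below uses a small made-up frame to show the subset and keep parameters of drop_duplicates(); which columns define a duplicate is a judgment call, and feature_cols is a name introduced here.

```python
import pandas as pd

# Hypothetical frame: ids are unique, but some measurements repeat.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "sepal_length": [5.1, 4.9, 5.1, 4.9],
    "species": ["Iris-setosa", "Iris-setosa", "Iris-setosa", "Iris-versicolor"],
})

# The unique id column means no full row repeats.
full_row_duplicates = df.duplicated().sum()
print("Full-row duplicates:", full_row_duplicates)

# Compare only the chosen columns; keep="first" retains the earliest match.
feature_cols = ["sepal_length", "species"]
deduplicated = df.drop_duplicates(subset=feature_cols, keep="first")
print(deduplicated)
```

Whether repeated measurements are true duplicates or legitimate repeated observations depends on how the data was collected.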

6. One-Hot Encoding

For categorical analysis, we will perform one-hot encoding on the species column. This step is performed due to the tendency of Machine Learning algorithms to work better with numerical data. The one-hot encoding process transforms categorical variables into a binary (0 or 1) format.

encoded_species = pd.get_dummies(iris_data['species'], prefix='species', drop_first=False).astype('int')
iris_data = pd.concat([iris_data, encoded_species], axis=1)
iris_data.drop(columns=['species'], inplace=True)

Image by Author
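
To see what the encoding produces without loading the full dataset, here is a minimal sketch on a four-row made-up column. It also shows the effect of drop_first=True, which removes the first category so that the remaining columns are not redundant; whether to drop it is a modelling choice, not something this tutorial prescribes.

```python
import pandas as pd

# Hypothetical species column, used only for illustration.
df = pd.DataFrame({
    "species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"],
})

# One 0/1 column per category; every row has exactly one 1.
encoded = pd.get_dummies(df["species"], prefix="species").astype("int")
print(encoded)

# drop_first=True omits the first category, which is then implied
# by all remaining columns being 0.
reduced = pd.get_dummies(df["species"], prefix="species", drop_first=True).astype("int")
print(reduced)
```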

7. Normalization of Float Value Columns

Normalization here refers to standardization (z-score scaling): rescaling numerical features so that each has a mean of 0 and a standard deviation of 1. This process is done to ensure that the features contribute equally to the analysis. We will normalize the float value columns for consistent scaling.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
cols_to_normalize = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
# fit_transform learns each column's mean and standard deviation, then rescales
iris_data[cols_to_normalize] = scaler.fit_transform(iris_data[cols_to_normalize])

Output for iris_data.describe() after normalization
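
The scaling can be reproduced by hand, which also explains a detail in the describe() output. StandardScaler divides by the population standard deviation (ddof=0), while pandas' describe() and std() default to the sample standard deviation (ddof=1), so the reported std after scaling is slightly above 1. The sketch below uses five made-up values and plain pandas; it illustrates the formula and is not the scikit-learn implementation itself.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements, used only for illustration.
values = pd.Series([4.6, 4.7, 4.9, 5.0, 5.1], name="sepal_length")

# z-score: subtract the mean, divide by the population standard deviation.
z = (values - values.mean()) / values.std(ddof=0)

print("mean:", z.mean())
print("population std (ddof=0):", z.std(ddof=0))
print("sample std (ddof=1, pandas default):", z.std())

# The sample std of z-scores equals sqrt(n / (n - 1)) for n observations.
n = len(values)
expected_sample_std = np.sqrt(n / (n - 1))
```

For the 150-row Iris data, the same factor is sqrt(150 / 149), which is why describe() would show a std of about 1.003 rather than exactly 1.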

8. Save the cleaned dataset

Save the cleaned dataset to a new CSV file.

iris_data.to_csv('cleaned_iris.csv', index=False)
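
A quick round trip confirms that the file reads back as expected and shows why index=False is used. This is a sketch on a two-row made-up frame written to a temporary directory; the file names are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned frame, used only for illustration.
df = pd.DataFrame({"sepal_length": [0.5, -1.2], "species_Iris-setosa": [1, 0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # index=False writes only the data columns.
    path = os.path.join(tmp_dir, "cleaned_example.csv")
    df.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

    # With the default index=True, the row index is written as an extra,
    # unnamed column that shows up again when the file is read back.
    path_with_index = os.path.join(tmp_dir, "with_index.csv")
    df.to_csv(path_with_index)
    reloaded_with_index = pd.read_csv(path_with_index)

# Raises an AssertionError if the reloaded frame differs from the original.
pd.testing.assert_frame_equal(reloaded, df)
print(list(reloaded.columns))
print(list(reloaded_with_index.columns))
```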

Wrapping Up

Congratulations! You have successfully cleaned your first dataset using pandas. You may encounter additional challenges while dealing with complex datasets. However, the fundamental techniques mentioned here will help you get started and prepare your data for analysis.

Kanwal Mehreen is an aspiring software developer with a keen interest in data science and applications of AI in medicine. Kanwal was selected as the Google Generation Scholar 2022 for the APAC region. Kanwal loves to share technical knowledge by writing articles on trending topics, and is passionate about improving the representation of women in the tech industry.

7 Toolkits To Build Your AI Ethically

Developing and deploying AI responsibly requires tools to help the entire development team understand what they’re doing and show them how their choices affect end users. Researchers, analysts, and policymakers today urgently need to minimise, if not avoid, the harm AI models can cause.

Toolkits play a fundamental role in making systems fairer, more robust and more transparent. Here are 7 toolkits that can assist in implementing AI ethically:

NASSCOM Responsible AI Resource Kit

In 2022, the National Association of Software and Services Companies (NASSCOM) collaborated with industry leaders including Microsoft, Tata Consultancy Services and IBM Research to introduce the Responsible AI Hub and resource kit. As the name suggests, the objective behind this initiative is to ensure responsible integration of AI technology.

NASSCOM will maintain this evolving reference, continually incorporating the latest research and industry best practices, drawing from several credible sources.

This kit equips businesses with the tools and guidance for AI development and deployment while complying with ethical standards. Moreover, the kit offers insights for identifying and mitigating ethical risks that may arise while implementing AI-powered solutions.

AI and data protection risk toolkit

The UK’s Information Commissioner’s Office launched an AI and data protection toolkit last year as part of the effort to spread best practices in the use of AI. The toolkit is available on its website as an Excel file that organisations can download and edit to guide them through several stages.

During the release, senior policy officer Alister Pearson said it has been developed to help organisations comply with data protection regulations and win public trust in the use of AI.

Ethics in Tech Toolkit

The Markkula Center for Applied Ethics at Santa Clara University developed the project to provide free resources for everyone to integrate ethics more fully into their products and designs.

The project includes a comprehensive ethical toolkit for engineering and design practice that consists of 7 tools that can be combined with engineering and design workflows. The toolkit is an attempt to ensure that practitioners develop technologies responsibly and that ethics does not become ‘vaporware’.

Playing with AI Fairness: What-if Tool

Abbreviated as WIT, the tool developed by Google makes it easier to examine, evaluate, and debug ML systems accurately. The open-source interactive visual tool helps users understand a classification or regression model by letting them examine, evaluate, and compare models.

Thanks to its simple user-friendly interface and reduced dependency on coding, this tool caters to a wide range of users. Whether you’re a seasoned developer, researcher, or student, this resource can integrate into your workflow through TensorBoard or as an extension within a Jupyter or Colab notebook.

Ethics and Algorithms Toolkit

The toolkit developed by Joy Bonaguro and colleagues asks a series of questions that grade the different types of risk in a data-driven initiative. Depending upon the level of risk, the toolkit contains suggestions for mitigations.

Aequitas

Developed in 2018 by the University of Chicago Center for Data Science and Public Policy, Aequitas is an open-source bias audit toolkit. The tool is built to be used by a broad set of people, from developers to policymakers.

The purpose of the tool is to audit machine learning models for discrimination and bias, and to help users make informed and equitable decisions around predictive risk-assessment tools.

Human-AI eXperience (HAX) Toolkit

The tool by Microsoft is best used early in the development process to design human-centric AI systems. The toolkit is made of four parts: the HAX Workbook, HAX design patterns, the HAX Playbook, and the HAX Design Library.

HAX draws its foundation from a 2019 research paper, effectively turning its theoretical principles into practical tools. In many ways, these guidelines advocate for clarity and precision in UI copy, and for easy-to-implement plans in case of system failures.

The post 7 Toolkits To Build Your AI Ethically appeared first on Analytics India Magazine.

Gartner’s Hype Cycle is a Waste of Time

Recently, Gartner released another version of its Hype Cycle — and it’s nothing new. Each year, the research and advisory firm creates over 100 of these cycles across various fields. However, it looks like the hype cycle is going through its own Trough of Disillusionment.

Each emerging technology is different and takes its own separate path to get adopted among the masses. Take ChatGPT for instance. It clearly broke the AI Hype Cycle within three months of the launch. According to Gartner, it was supposed to take a minimum of two years to reach the ‘plateau’.

This new report seems to be no different.

There is no doubt that over the years, the report has helped business leaders and its clients get a fair idea of how much time a new emergent technology will take to eventually reach the ‘Plateau of Productivity’. In simpler terms, this is when the technology sees widespread adoption and achieves its full potential.

Missing the gold

To start with, one cannot simply ignore that almost every truly transformative technology was never tracked by Gartner, including WiFi, GPS chips, the App Store, Web 2.0 and APIs. It’s quite remarkable how many significant technologies from the past two decades weren’t recognised early or never made it to a Hype Cycle. In the world of technology, things that may appear minor or short-lived often end up forming the basis for the next generation of business and consumer tools.

A closer look at Gartner’s different graph versions over the years shows that certain technologies vanish in later editions, seemingly replaced by new ones. For example, “smartphone” made an appearance on the ‘Slope of Enlightenment’ in 2006, without a clear journey through the ‘Peak of Inflated Expectations’.

The story doesn’t stop there. The Gartner Hype Cycle missed out on notable technologies like NoSQL, which started gaining traction in the mid-2000s, giving us innovations like MongoDB, Cassandra, Redis, and Couchbase. Considering this, it would not be wrong to say that the level of hype Gartner and its associates try to generate for a technology represents the extent to which their consultants are pushing clients towards certain tech, while ignoring the real advancements.

Times When the Hype Cycle Got it Wrong

In 2022, Gartner predicted that zero-knowledge proofs (ZKP) would take 5 to 10 years to reach mainstream adoption. However, Worldcoin, which was recently launched based on ZKP, saw major adoption around the world. According to recent reports, Worldcoin secured more than 9,500 Argentine users within a day in August.

As of now, more than 2 million people around the world from Chile to Indonesia to Kenya have sat in front of Worldcoin orbs to have their irises scanned. The number is clearly higher than the cycle expected.

Similarly, for code generation, a period of five to ten years was predicted. But today, we have a plethora of options to choose from, such as Code Llama, CodeWhisperer, Codex and Codey. Clearly, Gartner got it wrong.

Furthermore, in 2022, Gartner predicted that the Metaverse would need more than 10 years to realise its full potential. Yet, earlier this year, there was a growing sentiment that the Metaverse is dead. It would be interesting to observe whether the Metaverse is currently in its ‘Trough of Disillusionment’ phase or is really dead.

Hype Cycle is not Science

Many technologies simply fade away with time or die. According to Michael Mullany, an additional 20% of all technologies that were tracked for multiple years on the Hype Cycle became obsolete before reaching any kind of mainstream success. The Gartner Hype Cycle is not science, but Gartner presents it as an established natural law. Expressing similar sentiments, a user on Hacker News wrote, “Why do people think the Gartner Hype Cycle is a law of Physics?” when in fact, the Hype Cycle lacks empirical backing and fails to consider technologies that deviate from its prescribed path.

Well, it may be useful to investors in some cases as it gives them an estimate of the time period in which a tech will mature, but blindly following it without taking into consideration other factors and sources of information can lead to wrong decisions, because Gartner can conveniently change the hype next year to fix their mistakes.

The post Gartner’s Hype Cycle is a Waste of Time appeared first on Analytics India Magazine.

XGBoost is All You Need

Tabular data, commonly found in spreadsheets and databases, constitutes the backbone of decision-making in various industries, and most importantly, in machine learning. For these tasks, the primary requirement is a model that can handle tabular data efficiently, accurately, and interpretably. Arguably, XGBoost (Extreme Gradient Boosting) excels on all fronts, amid all the hype around other deep learning techniques, even LLMs.

XGBoost pic.twitter.com/V7BdVyE4Aw

— Bojan Tunguz (@tunguz) October 2, 2022

Bojan Tunguz, the quadruple Kaggle grandmaster who works at NVIDIA, states that XGBoost is all you need. But is it really true that XGBoost can be continually touted as the best low-code ML solution available today, even beating LLMs in classification on tabular data?

Transformer is NOT all you need, only a little bit

Traditionally, there have been two distinct groups in the ML ecosystem: the tabular-data-focused data scientists who use XGBoost, LightGBM, and similar tools, and the LLM group. The two groups have used separate techniques and models. However, recent experiments have shown that LLMs can be applied effectively for classification on tabular data without extensive data cleaning or feature engineering, though the process is still time-consuming.

To apply LLMs to tabular data, prompt engineering can be one of the helpful solutions, but it is still in its infancy. LLMs produce textual output, but the focus here is on using the internal embeddings (latent structure embeddings) generated by LLMs, which can be passed to traditional tabular models like XGBoost. While Transformers have undoubtedly revolutionised generative AI, their strengths lie in unstructured data, sequential data, and tasks that involve complex patterns.

For example, in Kaggle competitions, where tabular data dominates, LLMs, when provided with appropriate prompts, demonstrated predictive power, though not at the level of top-performing traditional models like XGBoost. This suggests that the potential of LLMs as tools for tabular data analysis is still developing, leaving XGBoost to reign supreme.

But XGBoost's advantage is limited to smaller datasets and fades as the size increases. Building LLMs requires a large corpus of data, whereas Kaggle competitions typically involve a few megabytes to a few gigabytes of data, a range in which XGBoost performs well. As the data size grows, Transformers prove to be the better option.

Krishna Rastogi, CTO of MachineHack said, “Transformers are like the H-bombs of machine learning, and XGBoost is the reliable sniper rifle. When it comes to tabular data, XGBoost proves to be the sharpshooter of choice.”

He further explains that most MachineHackers also use XGBoost or CatBoost, but that is because they work well in general for competitions. “But I believe the real world data is more messy and requires a whole level of data cleaning, checking duplicate, good and bad labelling, this is where Transformers outperform,” he added.

Why and when to XGBoost

One of the key reasons for XGBoost’s prominence in tabular data tasks is its inherent interpretability. In many real-world applications, understanding why a model makes a particular prediction is as important as the prediction itself. This is especially crucial in fields like healthcare, finance, and regulatory compliance. Unlike deep learning models like Transformers, which are often considered “black boxes,” XGBoost provides clear and intuitive insights into feature importance.

Everyone is focussed on developing LLMs.

— Marc (@marccodess) August 29, 2023

When dealing with tabular datasets, efficiency is paramount. XGBoost’s optimised algorithms and the ability to parallelise training make it exceptionally fast. In contrast, deep learning models like Transformers often require extensive computational resources, including GPUs, to achieve similar performance on structured data. For many organisations, this efficiency can translate into cost savings and faster time-to-insight, as they do not have huge amounts of data.

XGBoost’s versatility extends beyond classification to regression and ranking tasks. Whether you need to predict a continuous target variable, rank items by relevance, or classify data into multiple categories, XGBoost can handle it with ease.

Another advantage of XGBoost is its robustness in handling noisy or incomplete datasets. In real-world scenarios, data can be messy, with missing values, outliers, and inconsistencies, and some argue that XGBoost can fall into the trap of overfitting on such data. XGBoost mitigates this risk through its regularisation techniques, including L1 and L2 regularisation.
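
For reference, the regularised objective can be written out as follows. This is a sketch based on the formulation in the XGBoost documentation and paper, not something stated in this article; the symbols are introduced here: l is the training loss, f_k is the k-th tree, T is its number of leaves and w_j are its leaf weights. The coefficients correspond to the library's gamma, reg_lambda (L2) and reg_alpha (L1) parameters.

```latex
\[
\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
\]
```

Larger values of these coefficients penalise trees with many leaves or large leaf weights, which is how the regularisation discourages fitting noise.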

Moreover, outliers, while often regarded as data artefacts, can carry valuable information or indicate anomalies in the dataset. XGBoost’s tree-based approach is naturally robust to outliers. Decision trees can capture the underlying patterns in the presence of extreme values, making XGBoost an ideal choice for tasks where outliers are significant.

In conclusion, when it comes to comparatively smaller amounts of structured data, XGBoost proves that sometimes the simplest solution is also the best one. Why not figure out whether it can take another step, be used for AI models, and replace Transformers someday?

The post XGBoost is All You Need appeared first on Analytics India Magazine.

Nearly 83% of Indian Firms Were Hit by Cyber Breaches in 2022

Today, Cloudflare, Inc. unveiled the findings of its latest study on cybersecurity in the Asia Pacific region. The report titled “Securing the Future: Asia Pacific Cybersecurity Readiness Survey,” highlighted that a staggering 83% of Indian organisations encountered at least one cybersecurity incident in the past year.

The survey included 4,009 cybersecurity decision-makers from organisations of varying sizes. Respondents represented diverse industries, and the survey spanned 14 markets across the Asia Pacific region.

Furthermore, 48% of the surveyed organisations reported experiencing 10 or more incidents. These incidents were primarily attributed to web attacks, phishing, and supply chain attacks, the primary objective being financial gain. The study further explains how organisations are dealing with an increasing number of cybersecurity incidents, along with their readiness levels and the resulting consequences.

Despite the high frequency of cybersecurity incidents in India, only 52% of respondents believed their organisations were adequately prepared. This lack of preparedness came at a significant cost, with 47% of organisations reporting financial losses exceeding US $1 million over the past year. Additionally, 27% experienced financial setbacks of no less than US $2 million.

A shortage of cybersecurity talent was identified as the most significant challenge by 57% of Indian business leaders, while 44% cited a lack of funding as a hindrance to protecting their businesses. The repercussions of these cybersecurity incidents extended beyond financial losses. Approximately 46% of respondents revealed that their organisations had to reduce or restrict hybrid work arrangements, lay off employees, and postpone expansion plans.

To address this issue, last month, the Union Cabinet gave its nod to an extension of the Digital India program, allocating INR 14,903 crore to bolster digital initiatives in various domains such as skilling, cyber security, high-performance computing, and technology simplification for the general population.

The post Nearly 83% of Indian Firms Were Hit by Cyber Breaches in 2022 appeared first on Analytics India Magazine.

Big Tech’s AI Models Are Lost in Logic

Some say large language models (LLMs) are a step towards AGI; others think of them as merely a cool new tool. Every nook and cranny of the content generation industry, from newsrooms to script writers, is haunted by the fear of being taken over by AI language models. These tools have a credible ability to write everything from a Shakespearean poem to code in several languages. These models can spit out nicely stitched sentences but lack a fundamental human aspect: logical reasoning.

During an interview with AIM, Yoshua Bengio mentioned that the magnitude of data these systems possess is almost equal to a person reading every day, every waking hour, all their life, and then living 1000 lives. However, they fail at reasoning. “LLMs are encyclopaedic thieves,” he stated, pointing to the models’ inability to reason with that knowledge as consistently as humans.

While researchers have long studied the subject, there is no sign yet that by adding layers, parameters, and attention heads to transformers, the logical reasoning gap will be bridged.

Looking Beyond Words

The extent to which famous text-based LLMs can reason remains uncertain. Models trained solely on text data inherently face limitations when it comes to common sense and real-world knowledge. While expanding the training dataset helps to a certain degree, these models may still have unexpected knowledge gaps. Multi-modal models, such as LLMs that understand both text and images can address some of these challenges.

In a paper published in IEEE, Meta’s AI chief, Yann LeCun, echoes Bengio’s sentiment, presenting a pessimistic assessment of LLMs’ capacity to understand solely through reading. Multi-modal models have demonstrated improved reasoning abilities compared to their single-sense counterparts. It is worth noting, however, that symbolic logic, an approach that dominated for decades, yielded minimal progress during that period.

While multimodal large language models (MLLMs) have kindled hopes of making AI models capable of reasoning, their development remains in a rudimentary stage.

Big Tech Tryna Reason

While many have already declared that language models cannot think, Big Tech has been digging further to find ways to make these AI tools good at logical reasoning.

A group of researchers from Virginia Tech and Microsoft introduced a methodology known as the “Algorithm of Thoughts (AoT).” This approach propels LLMs along the paths of algorithmic reasoning, creating a novel path for in-context learning. Additionally, it suggests that with this method, LLMs could possess the capability to integrate their intuition into searches that are optimised for better outcomes.

The research notes that LLMs have traditionally been prompted with methods such as “Chain-of-Thought,” “Self-consistency,” and “Least-to-Most Prompting.” However, these methods presented certain limitations that restricted their overall effectiveness. AoT addresses the limitations of current in-context learning techniques like the “Chain-of-Thought” (CoT) approach. While CoT occasionally provides incorrect intermediary steps, AoT steers the model by using algorithmic examples, resulting in more dependable results.

A week ago, researchers at Google released a study titled, ‘Teaching language models to reason algorithmically’ to teach models like ChatGPT to reason better algorithmically. The method takes the in-context learning approach and introduces an algorithm better at reasoning. These discoveries suggest that exploring longer contexts, and prompting more informative explanations could provide valuable research.

Earlier this year, researchers from Amazon won an outstanding-paper award for showing that knowledge distillation using contrastive decoding in the teacher model and counterfactual reasoning in the student model improves the consistency of “chain of thought” reasoning.

Teaching LLMs to reason is a highly active topic of research today. The chain-of-thought paradigm is the conventional approach, but researchers are gradually attaining better results with other methodologies. While Big Tech is chasing the subject one step at a time, the companies are yet to strike gold.

The post Big Tech’s AI Models Are Lost in Logic appeared first on Analytics India Magazine.

Friend or Fraud?

When Faridabad resident Karan received a call from a friend who said he had just met with an accident and needed Rs 30,000 transferred for treatment, he had little reason to doubt it. The man calling Karan sounded exactly like his friend and said he was using someone else’s phone because his own had been damaged in the accident.

Karan frantically transferred the money. Later, when he contacted his friend, he realised that he had been the victim of a fraudulent AI voice call. He filed a complaint with the NIT Cyber police station, which found that a fraudster had used an AI voice impersonator to fake his friend’s voice and dupe him of his money. Such cases are happening all across the country.

Besides voice, criminals have exploited deepfake technology to deceive individuals through calls and videos. For instance, a man in Kerala fell victim to a deepfake call that appeared to come from a friend claiming a medical emergency, resulting in a loss of INR 40,000.

The growing accessibility and advancement of AI have once again transformed the nature of cyber crime, and this new industry is mushrooming fast: from WormGPT to FraudGPT and now impersonation scams.

A global survey found that 25% of adults have experienced an AI voice scam. India leads the list at an astounding 47%, followed by the United States at 14% and the U.K. at 8%.

These scams are carried out by extracting voice samples from social media sites such as Instagram, Facebook, and Twitter. As little as three seconds of audio is enough to clone a voice.

BigTechs’ GenAI Poses New Challenges

Microsoft recently introduced a groundbreaking text-to-speech AI model called VALL-E. In a paper published this month, the company unveiled that VALL-E can replicate a person’s voice using just a brief 3-second recording. Impressively, preliminary findings indicate that VALL-E can even capture and reproduce the emotional nuances of the speaker.

VALL-E is trained on a dataset comprising 60,000 hours of English speech data. This dataset is said to be “hundreds of times larger than existing systems,” and VALL-E is claimed to perform significantly better than existing models in AI-driven voice synthesis.

So, a three-second recording of your voice, paired with something like Eleven Labs’ Multilingual v2, a foundational AI model that can be used for nearly 30 languages, is definitely cause for concern. Additionally, Meta’s SeamlessM4T is capable of translation into 100 languages.

While some are excited about the doors that these AI tools could open in marketing, customer service, e-learning, and entertainment, others are wary of what they could entail: an industry of AI-enabled criminals using them for all kinds of crimes. A new coming of Jamtara?

Cybercriminals are also using cloning tools like HeyGen, Murf, Resemble AI, Lyrebird, and ReadSpeaker to create indistinguishable voice clones. To add to it, these tools are inexpensive, costing as little as $0.6

These easily accessible and cheap AI voice generators are supported by numerous tutorials available online. The ease of access to generative AI models has allowed individuals with limited technical knowledge to carry out tasks that were previously beyond their capabilities.

The tutorials make it easy for inexperienced individuals with little technical know-how and ill intent to carry out scams at scale.

Diamond Cut Diamond

While these scammers are using AI-enabled voice generators, law enforcement is also wielding similar weaponry against them.

To fight these scams, the police have been using AI tools to monitor SIM cards engaged in them, and recently shut down upwards of 14,000 SIMs in Haryana’s Mewat.

The Indian Department of Telecommunications is also employing an AI-based facial recognition tool called ASTR to combat fraudulent SIM card use.

ASTR encodes human faces in subscriber images using convolutional neural networks to account for various factors like face angle and image quality. It conducts face comparisons, grouping similar faces, and identifies identical faces with at least 97.5% accuracy.
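The comparison step can be pictured roughly as follows. This is only a sketch under stated assumptions: ASTR’s internals are not described here, so the three-number embeddings, the cosine-similarity measure, the 0.9 threshold, and the function names are all hypothetical stand-ins for what a CNN-based face encoder and matcher might do.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_similar_faces(embeddings, threshold=0.9):
    """Greedy grouping: a face joins the first group whose representative is similar enough."""
    groups = []           # each group is a list of subscriber ids
    representatives = []  # first embedding seen for each group
    for subscriber_id, vector in embeddings.items():
        for index, representative in enumerate(representatives):
            if cosine_similarity(vector, representative) >= threshold:
                groups[index].append(subscriber_id)
                break
        else:
            # No existing group matched, so start a new one.
            representatives.append(vector)
            groups.append([subscriber_id])
    return groups


# Made-up embeddings: sim_A and sim_B stand for photos of the same person.
embeddings = {
    "sim_A": [0.9, 0.1, 0.4],
    "sim_B": [0.88, 0.12, 0.41],
    "sim_C": [0.1, 0.95, 0.2],
}
groups = group_similar_faces(embeddings)
print(groups)
```

Any group with more than one subscriber id is a candidate for the same face being registered under several connections.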

ASTR is capable of detecting all SIMs associated with a suspected face in less than 10 seconds from a database of one crore images. Additionally, ASTR employs “fuzzy logic” to find approximate matches for subscriber names, accommodating typographical errors. The tool helps identify individuals with multiple connections or SIMs obtained under different names using the same photograph.
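How ASTR implements its “fuzzy logic” is not spelled out, so the snippet below is only an illustration of approximate name matching using Python’s standard `difflib` module. The subscriber names, the 0.85 threshold, and the helper functions are invented for the example.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity ratio between two names, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_approximate_matches(query, names, threshold=0.85):
    """Return the names whose similarity to the query meets the threshold."""
    return [name for name in names if name_similarity(query, name) >= threshold]


# Hypothetical subscriber names, two of them with typing errors.
names = ["Rajesh Kumar", "Rajesh Kumaar", "Rajesh Kumr", "Sunita Devi"]
matches = find_approximate_matches("Rajesh Kumar", names)
print(matches)
```

A lower threshold catches more typos but also flags more unrelated names, so in practice such matches would likely feed a review step rather than trigger disconnection on their own.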

The list is also shared with banks, payment wallets, and social media platforms to disconnect these numbers. WhatsApp collaborated with the government to disable fraudulent accounts, with ongoing efforts across other social media platforms.

It’s also crucial to stay alert and adopt proactive measures on your end. Users can verify the caller’s identity, agree on codewords, or pose a question that only the real person could answer correctly.

The post Friend or Fraud? appeared first on Analytics India Magazine.

High Precision Semantic Image Editing with EditGAN

Generative Adversarial Networks, or GANs, have been finding new applications in the image editing industry. Over the past few months, EditGAN has been gaining popularity in the AI/ML industry because it is a novel method for high-precision, high-quality semantic image editing.

We will discuss the EditGAN model in detail and explain why it might prove to be a milestone in semantic image editing.

So let’s start. But before we look at what EditGAN is, it is important to understand why it matters and why it is a significant step forward.

Why EditGAN?

Although traditional GAN architectures have helped the AI-based image editing industry advance significantly, there are some major challenges with building a GAN architecture from scratch.

During the training phase, a GAN architecture requires a large amount of labeled data with semantic segmentation annotations.
They provide only high-level control.
They often just interpolate back and forth between images.

Although traditional GAN architectures get the work done, they are not effective for wide-scale deployment. This sub-par efficiency is the reason why NVIDIA introduced EditGAN in 2021.

EditGAN is proposed as an effective method for high-precision, high-quality semantic image editing that lets users edit images by altering their highly detailed segmentation masks. One of the reasons EditGAN scales well to image editing tasks is its architecture.

The EditGAN model is built on a GAN framework that models images and their semantic segmentations jointly, and requires only a handful of labeled or annotated training examples. The developers of EditGAN embed an image into the GAN’s latent space and modify it by performing conditional latent code optimization in accordance with the segmentation edit. Furthermore, to amortize the optimization, the model finds “editing vectors” in latent space that realize the edits.

The architecture of the EditGAN framework allows the model to learn an arbitrary number of editing vectors that can then be applied directly to other images quickly and efficiently. Furthermore, experimental results indicate that EditGAN can edit images with a never-before-seen level of detail while preserving image quality as far as possible.

To sum up why we need EditGAN: it is the first GAN-based image editing framework that

Offers very high-precision editing.
Works with a handful of labeled data.
Can be deployed effectively in real-time scenarios.
Allows compositionality for multiple edits simultaneously.
Works on GAN-generated, real embedded, and even out-of-domain images.

High-Precision Semantic Image Editing with EditGAN

StyleGAN2, a state-of-the-art GAN framework for image synthesis, is the primary image generation component of EditGAN. The StyleGAN2 framework takes latent codes drawn from a multivariate normal distribution and maps them to realistic images.

StyleGAN2 is a deep generative model that has been trained to synthesize images of the highest quality possible along with acquiring a semantic understanding of the images modeled.

Segmentation Training and Inference

To perform segmentation on a new image and to train the segmentation branch, the EditGAN model embeds images into the GAN’s latent space using an encoder together with optimization. The framework builds on previous work and trains an encoder to embed images in the latent space. The encoder’s training objective consists of standard pixel-wise L2 and LPIPS reconstruction losses, computed on samples from the GAN and on real training data. Furthermore, the model explicitly regularizes the encoder using the known latent codes when working with GAN samples.

As a result, the model embeds the annotated images from the dataset labeled with semantic segmentation into the latent space, and uses a cross-entropy loss to train the segmentation branch of the generator.
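As a rough illustration of the cross-entropy loss mentioned above, the following pure-Python sketch computes a pixel-wise cross-entropy between predicted class scores and an annotated mask. The tiny 1x2 “image”, the three classes, and the function names are assumptions for illustration; EditGAN trains its segmentation branch on generator features, which is not reproduced here.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_cross_entropy(logit_map, label_map):
    """Mean cross-entropy over all pixels.

    logit_map[i][j] is a list of class scores for pixel (i, j);
    label_map[i][j] is the annotated class index for that pixel.
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logit_map, label_map):
        for logits, label in zip(logit_row, label_row):
            total += -math.log(softmax(logits)[label])
            count += 1
    return total / count


# Made-up 1x2 image with 3 classes: the first pixel is predicted confidently
# and correctly, the second pixel has uniform (uninformative) scores.
logit_map = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
label_map = [[0, 1]]
loss = pixelwise_cross_entropy(logit_map, label_map)
print(loss)
```

The loss shrinks as the predicted probability of the annotated class grows at each pixel, which is what pushes the segmentation branch toward the handful of labeled masks.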

Using Segmentation Editing to Find Semantics in Latent Space

The primary purpose of EditGAN is to leverage the joint distribution of semantic segmentations and images for high-precision image editing. Let’s say we have an image x that needs to be edited. The model embeds the image into EditGAN’s latent space, or uses a sample image from the model itself. The segmentation branch then generates the corresponding segmentation y, which is possible because RGB images and segmentations share the same latent codes w. Developers can then use any labeling or digital painting tool to modify the segmentation manually as required.

Different Ways of Editing during Inference

The latent space editing vectors obtained through optimization are semantically meaningful and are often disentangled from other attributes. Therefore, to edit a new image, the model can embed the image into the latent space and directly apply the editing operations it learned previously, without performing the optimization all over again from scratch. In other words, the editing vectors the model learns amortize the optimization that was needed to edit the image initially.
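The amortization idea can be sketched in a few lines. The sketch assumes latent codes are short lists of numbers (real ones are far higher-dimensional) and treats an editing vector as the difference between the optimized, edited code and the original one, which is then added to another image’s code at a chosen scale. The function names and values are hypothetical.

```python
def editing_vector(w_original, w_edited):
    """Difference between the edited and the original latent code."""
    return [edited - original for original, edited in zip(w_original, w_edited)]


def apply_edit(w_new, delta, scale=1.0):
    """Shift another image's latent code along the editing vector."""
    return [w + scale * d for w, d in zip(w_new, delta)]


# Made-up 3-dimensional latent codes.
w_original = [0.25, -1.0, 0.5]   # code of the image that was edited via optimization
w_edited = [0.75, -1.0, 0.0]     # code found by the optimization
delta = editing_vector(w_original, w_edited)

w_other = [1.0, 2.0, 3.0]        # code of a different image
w_other_edited = apply_edit(w_other, delta, scale=2.0)
print(delta, w_other_edited)
```

Because applying the vector is a single addition, the expensive optimization is paid once and reused, and the `scale` argument lets the strength of the edit be dialed up or down.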

It is worth noting that disentanglement is still not perfect, and editing vectors do not always return the best results when applied to other images. However, the issue can be overcome by removing editing artifacts from other parts of the image with a few additional optimization steps at test time.

Based on what we have covered so far, the EditGAN framework can be used to edit images in three different modes.

Real-Time Editing with Editing Vectors

For edits that are localized and well disentangled, the model applies previously learned editing vectors at different scales and manipulates images at interactive rates.

Using Self-Supervised Refinement for Vector-based Editing

For localized edits that are not perfectly disentangled from other parts of the image, the model initializes the edit using previously learned editing vectors and removes editing artifacts by performing a few additional optimization steps at test time.

Optimization-based Editing

To perform large-scale and image-specific edits, the model runs the optimization from scratch, because editing vectors cannot transfer these kinds of edits to other images.

Implementation

The EditGAN framework is evaluated on images spread across four different categories: Cars, Birds, Cats, and Faces. The segmentation branch of the model is trained using 16, 30, 30, and 16 image-mask pairs as labeled training data for Cars, Birds, Cats, and Faces respectively. When an image is edited purely using optimization, or when the model is learning editing vectors, it performs 100 optimization steps using the Adam optimizer.

For the Cat, Car, and Faces datasets, editing is demonstrated on real images from DatasetGAN’s test set that were not used to train the GAN. These images are first embedded into EditGAN’s latent space using optimization and encoding. For the Birds category, the editing is shown on GAN-generated images.
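For readers who want to see what 100 Adam steps look like, here is a toy sketch. It is not EditGAN’s objective: the quadratic loss pulling a three-number latent code toward a fixed `target` is a stand-in, and the learning rate and other hyperparameters are assumed values rather than ones taken from the paper.

```python
import math


def adam_minimize(grad_fn, w_start, steps=100, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run a fixed number of Adam updates on a list of parameters."""
    w = list(w_start)
    m = [0.0] * len(w)  # running mean of gradients
    v = [0.0] * len(w)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(w)
        for i in range(len(w)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Stand-in objective: squared distance between the latent code and a target code.
target = [1.0, -2.0, 0.5]


def loss(w):
    return sum((a - b) ** 2 for a, b in zip(w, target))


def grad(w):
    return [2 * (a - b) for a, b in zip(w, target)]


w_start = [0.0, 0.0, 0.0]
w_final = adam_minimize(grad, w_start, steps=100)
print(loss(w_start), loss(w_final))
```

In a real setup the gradient would come from backpropagating an editing loss through the generator; the update rule itself stays the same.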

Results

Qualitative Results

In-Domain Results

The above image demonstrates the performance of the EditGAN framework when it applies previously learned editing vectors to novel images and refines the images using 30 optimization steps. These editing operations are disentangled for all classes, and they preserve the overall quality of the images. Comparing the results of EditGAN with other frameworks, it can be observed that EditGAN outperforms other methods in performing high-precision, complex edits while preserving the subject identity and image quality at the same time.

What’s astonishing is that the EditGAN framework can perform extremely high-precision edits like dilating the pupils, or editing the wheel spokes of a car. Furthermore, EditGAN can be used to edit semantic parts of objects that span only a few pixels, or to perform large-scale modifications to an image. It's worth noting that several of EditGAN’s editing operations can generate manipulated images unlike any that appear in the GAN training data.

Out of Domain Results

To evaluate EditGAN’s out of domain performance, the framework has been tested on the MetFaces dataset. The EditGAN model uses in-domain real faces to create editing vectors. The model then embeds MetFaces portraits that are out of domain using a 100-step optimization process, and applies the editing vectors via a 30-step self-supervised refinement process. The results can be seen in the following image.

Quantitative Results

To measure EditGAN’s image editing capabilities quantitatively, the evaluation uses a smile edit benchmark that was first introduced by MaskGAN. Faces with a neutral expression are edited into smiling faces, and the performance is measured across three metrics.

Semantic Correctness

The model uses a pre-trained smile attribute classifier to measure whether the faces in the images show smiling expressions after editing.

Distribution-level Image Quality

Kernel Inception Distance (KID) and Fréchet Inception Distance (FID) are calculated between the CelebA test dataset and 400 edited test images.

Identity Preservation

The model’s ability to preserve the identity of subjects when editing the image is measured using a pre-trained ArcFace feature extraction network.
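The three measurements can be sketched as below. Everything here is a simplified stand-in: the classifier outputs and embeddings are made-up numbers rather than outputs of a real smile classifier or ArcFace network, cosine similarity is an assumed choice for comparing identity features, and the Fréchet distance is shown only for the one-dimensional Gaussian case, whereas FID proper compares means and covariance matrices of Inception features.

```python
import math


def smile_rate(is_smiling):
    """Semantic correctness: fraction of edited faces classified as smiling."""
    return sum(1 for flag in is_smiling if flag) / len(is_smiling)


def frechet_distance_1d(mu_a, sigma_a, mu_b, sigma_b):
    """Frechet distance between two one-dimensional Gaussians (simplified FID analogue)."""
    return (mu_a - mu_b) ** 2 + (sigma_a - sigma_b) ** 2


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identity_score(before, after):
    """Mean cosine similarity between each face's embedding before and after editing."""
    pairs = list(zip(before, after))
    return sum(cosine_similarity(b, a) for b, a in pairs) / len(pairs)


# Made-up values for illustration only.
classifier_outputs = [True, True, False, True]
embeddings_before = [[1.0, 0.0], [0.0, 1.0]]
embeddings_after = [[1.0, 0.0], [0.0, 2.0]]

print(smile_rate(classifier_outputs))
print(frechet_distance_1d(1.0, 2.0, 0.0, 1.0))
print(identity_score(embeddings_before, embeddings_after))
```

A good edit scores high on the smile rate and the identity score while keeping the distribution distance low, so the three numbers are best read together rather than individually.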

The above table compares the performance of the EditGAN framework with other baseline models on the smile edit benchmark. EditGAN is compared against four baselines:

MaskGAN

MaskGAN takes non-smiling images along with their segmentation masks, and a target smiling segmentation mask as the input. It's worth noting that when compared to EditGAN, the MaskGAN framework requires a large amount of annotated data.

Local Editing

EditGAN is also compared with Local Editing, a method that clusters GAN features to implement local edits and depends on reference images.

InterFaceGAN

Just like EditGAN, InterFaceGAN also attempts to find editing vectors in the latent space of the model. However, unlike EditGAN, the InterFaceGAN model relies on a large amount of annotated data and auxiliary attribute classifiers, and it lacks EditGAN’s fine editing precision.

StyleGAN2Distillation

This method takes an alternative approach that does not necessarily require real image embeddings; instead, it uses an editing-vector model to create a training dataset.

Limitations

Because EditGAN is based on the GAN framework, it shares the limitation of any other GAN model: it can work only with images that can be modeled by the GAN. This restriction is the major reason why it is difficult to implement EditGAN across different scenarios. However, it is worth noting that EditGAN’s high-precision edits can be transferred readily to other images by making use of editing vectors.

Conclusion

One of the major reasons why GANs are not an industry standard in the image editing field is their limited practicality. GAN frameworks usually require a large amount of annotated training data, and they often fail to deliver high efficiency and accuracy.

EditGAN aims to tackle the issues presented by conventional GAN frameworks and to be an effective method for high-quality, high-precision semantic image editing. The results so far indicate that EditGAN delivers what it claims, and it is already performing better than some of the current industry-standard practices and models.
