Andrew Ng Collaborates with OpenAI for A New Course on ChatGPT Prompt Engineering

Andrew Ng’s DeepLearning.AI has collaborated with OpenAI to launch a new course: ‘ChatGPT Prompt Engineering for Developers’. The free-of-cost course will let you learn how to use a large language model (LLM) effectively to build new and powerful applications.

Check out the website here.

Earlier, Andrew Ng had introduced a new course, “Mathematics for Machine Learning and Data Science Specialization”.

What to Expect from the Course?

Isabella Fulford, a member of the technical staff at OpenAI, and Ng will teach a new course about how LLMs work. The course aims to offer valuable tips for prompt engineering, as well as demonstrate the various ways that LLM APIs can be used in applications for summarisation, inference, text transformation, and expansion. Furthermore, the course will cover two essential principles for crafting successful prompts, provide instruction on how to systematically develop effective prompts, and teach you to create a personalized chatbot.

Why Should You Join?

The short, 1.5-hour course is beginner-friendly and designed to be accessible to novices with a basic grasp of Python. However, it is also helpful for experienced machine learning specialists who want to explore the forefront of prompt engineering and use LLMs.

The post Andrew Ng Collaborates with OpenAI for A New Course on ChatGPT Prompt Engineering appeared first on Analytics India Magazine.

6 Data Science Job Openings at Leading Indian Companies 

Industries spanning finance, healthcare, e-commerce, and marketing are harnessing the power of data science to boost growth, efficiency, and innovation, making it a lucrative field to be in. As the demand for data scientists continues to soar, this may be the right time for those seeking a profitable career shift. So if you are a budding or a seasoned data scientist looking for a job change, we have got you covered.

JP Morgan & Chase

Role: Associate Sr – Data Science

JP Morgan & Chase is seeking individuals for the position of Associate Sr – Data Science who are passionate about converting data into valuable insights and empowering breakthroughs in business. They will analyse customer behaviours and predict financial needs to optimise sales performance, develop self-serve tools for real-time information, and identify potential attrition events for effective counteraction.

Minimum Qualifications:

  • Bachelor’s/Master’s Degree in quantitative analytics fields with over seven years of data science experience
  • Proficient in SAS/Python programming, data wrangling, and complex SQL scripting
  • Skilled in solving business problems with fact-based and scientific analytics

Preferred Qualifications:

  • Familiarity with financial services, consulting, or marketing agency insights functions
  • Experience in workforce analytics, sales performance analytics, or sales science organisation
  • Expertise in big data disciplines, Agile methodologies, and new technologies

Click here to apply.

Read more: Data Science Hiring Process at Pepperfry

Bosch

Role: Data Scientist

As a data scientist in the Bosch global team, you’ll provide AI and ML solutions and collaborate with other departments, enhance existing cloud-based solutions, and explore new use cases.

Minimum Qualifications:

  • The ideal candidate must have a bachelor’s or master’s degree in software engineering, computer science, mathematics, or a similar field.
  • At least two years of experience in professional data science.
  • The candidate must possess extensive knowledge of Python (with knowledge of R being a plus), as well as object-oriented programming languages with a strong emphasis on clean code. They must have a proven track record in areas such as ML, neural networks, pattern analysis, time series forecasting, data analysis, and data-pipeline technologies (e.g., Kubernetes, Docker, NoSQL databases, Workflow-Engine).
  • A strong understanding of statistics is necessary, and knowledge of cloud technologies (ideally Microsoft Azure) and SQL would also be needed.

Click here to learn more about it.

PayPal

Role: Manager – Data Science

As a Manager – Decision Science, you will lead and develop a team of skilled decision scientists, providing them with daily guidance to ensure exceptional results. You will also oversee key metrics and make changes when necessary to improve performance.

Minimum Qualifications:

  • Experience as a manager or in a Lead Decision Scientist, Lead Data Scientist, or analytics role
  • The ideal candidates must have excellent analytical skills, including expertise in SQL and data visualisation.
  • They should also have experience in leading cross-functional collaborations and managing relationships with multiple stakeholders.
  • The ability to effectively lead a team and promote teamwork is also essential.
  • The company recognises that some candidates may lack confidence due to imposter syndrome and encourages them to apply regardless.

Apply here.

LSEG (London Stock Exchange Group)

Role: Data Scientist

The role of a data scientist involves data handling, programming, and developing automated solutions for sourcing information. It includes analysing content, building models, and using machine learning to improve core processes. Collaboration with various teams is required to tackle large-scale analytics issues and create visualisations and pipeline tools.

Minimum Qualifications:

  • Higher education in statistics, mathematics or engineering in computer science with data science certification.
  • Proficiency in developing and delivering automation using Python and R. Additionally, they should be skilled in utilising one or more technologies such as VBA, SQL, JAVA, and RPA to drive improvements in business processes.
  • Strong understanding of data science principles and experience managing financial content.
  • Adaptability to new technologies and the ability to provide guidance to team members.

Preferred Qualifications:

  • The desired skills for this role include data mining, data sourcing, NLP, unsupervised and deep learning, and predictive modelling.
  • They should also be experienced in using AWS and other cloud-based tools that facilitate the onboarding of Python code.

Check out their careers page now.

PwC

Role: Senior Manager – Data Science

The role involves collaborating with US-based consultants and clients, and working closely with the business analytics teams in India. The main responsibilities will include leading high-level analytics consulting projects and providing sound business advice to project teams.

Minimum Qualifications:

  • The candidate should be experienced in managing and deploying ML models on cloud environments, with a strong understanding of supervised and unsupervised ML algorithms, statistics, and data analysis.
  • The candidate should also have extensive experience working with various ML frameworks and tools such as scikit-learn, mlr, caret, H2O, TensorFlow, PyTorch, and MLlib.
  • They must have advanced-level programming skills in SQL and Python/PySpark, enabling them to guide teams. In addition, the candidate should be proficient in using visualisation tools such as Tableau, PowerBI, and AWS QuickSight to convey information to stakeholders.

Apply here.

Walmart Global Tech

Role: Senior Data Scientist

As a senior data scientist, you’ll solve complex problems by analysing terabytes of data using data science tools and techniques. You’ll also be involved in developing PoCs and presenting them to the product team, then work with ML and software engineering teams to deploy solutions as APIs or pipelines. You’ll stay updated with the latest tech and mentor junior associates in providing robust data science solutions.

Minimum Qualifications:

The candidate needs to meet one of the following:

  • A bachelor’s degree in a related field such as statistics, economics, analytics, mathematics, computer science, or information technology, along with at least three years of experience in an analytics-related field
  • A master’s degree in one of the mentioned fields with at least one year of experience in an analytics-related field
  • A minimum of five years of experience in an analytics or related field

Click here to apply.

Read more: Decoding the Stephen Wolfram Enigma

The post 6 Data Science Job Openings at Leading Indian Companies appeared first on Analytics India Magazine.

Unlocking the Power of Search Ranking in E-commerce Websites using Data Science

Search ranking is the backbone that any e-commerce website stands on. It is directly tied to the visibility and accessibility of products and, consequently, to sales as well. But as we learn, there is more finesse to it.

Analytics India Magazine caught up with Kavitha Krishnan, Senior Manager, Data Science at Tredence, to understand how new AI algorithms can build a better search experience and the effectiveness of a good search ranking.

AIM: What are some of the challenges that e-commerce websites face when it comes to search and how can they overcome such challenges?

Kavitha: There are 2 main challenges in a search cycle.

(a) To generate the most relevant search results with respect to a query

(b) To decide the ranking/sorting for these results.

While a lot of attention is typically paid to the first challenge, the importance of addressing the second challenge should not be overlooked.

Effective search ranking algorithms are crucial for maximizing the visibility and accessibility of products to potential customers, which in turn can increase the likelihood of a purchase and enhance user satisfaction. When it comes to e-commerce websites, the key question is how to rank products on a search results page in a way that satisfies multiple objectives, including maximizing relevance and purchase likelihood. Achieving a balance between these objectives can be a complex task that requires leveraging AI/ML techniques.

For example, when a customer searches for snacks on an e-commerce website, the website may display hundreds of snack products. However, if the most appropriate products are not displayed prominently, the website may miss out on potential sales. Balancing multiple competing objectives to ensure an optimal customer experience and higher sales/margin for the business is critical. If a website consistently displays only higher-priced products, for instance, it may deter customers and negatively impact sales. Finding the right balance is key.

AIM: What are some of the different approaches to search ranking and how do they work?

Kavitha: The idea is to develop a system that scientifically sorts products on the search results page to optimize “Relevance”, “Add to cart rates”, “Gross Margin Dollars” etc. Balancing multiple objectives is the main goal of the ranking exercise.

The most appropriate algorithms for this use case are ‘Learning to Rank’ and ‘Multi-Objective ranking optimization’ systems.

Learning to Rank systems use machine learning to understand behavioural patterns and generate ranks based on a combination of features, such as product descriptions, reviews, ratings, and inventory. The weights of these features can be adjusted based on business goals and priorities or left to the model to tune and learn. This algorithm considers the multidimensional nature of relevance and business constraints, making it suitable for building relevance ranking models in production. Ranking models typically work by predicting a relevance score for an input feature set containing the query term and the list of products and their features. And we sort these products according to the generated relevance score to determine the ranking.
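The score-then-sort step described above can be sketched in a few lines of Python. This is only an illustration, not the system discussed in the interview: the feature names, the weights, and the sample products are all hypothetical, and a real Learning to Rank model would learn its scoring function from behavioural data instead of using fixed weights.

```python
# Hypothetical feature weights; a real Learning to Rank model would learn these.
WEIGHTS = {"text_match": 0.6, "avg_rating": 0.3, "in_stock": 0.1}


def relevance_score(features):
    """Weighted sum of a product's features (each assumed scaled to 0..1)."""
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def rank_products(products):
    """Return the products sorted by descending predicted relevance score."""
    return sorted(products, key=lambda p: relevance_score(p["features"]), reverse=True)


# Made-up candidate products for the query "snacks".
products = [
    {"name": "rice crackers", "features": {"text_match": 0.7, "avg_rating": 0.9, "in_stock": 1.0}},
    {"name": "trail mix", "features": {"text_match": 0.9, "avg_rating": 0.6, "in_stock": 0.0}},
    {"name": "salted chips", "features": {"text_match": 0.9, "avg_rating": 0.8, "in_stock": 1.0}},
]

ranked = rank_products(products)
for position, product in enumerate(ranked, start=1):
    print(position, product["name"], round(relevance_score(product["features"]), 2))
```

Swapping the fixed weights for a trained model changes only `relevance_score`; the sorting step stays the same.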

Multi-Objective Ranking Optimization is the task of learning a ranking model from training examples while optimizing multiple objectives simultaneously. A ranked list conforming to all business requirements can be generated by this system, also known as MORO. Both Learning to Rank and Multi-Objective systems can be easily incorporated into search engines for any retailer to improve overall user experience and generate maximum profit.
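One simple way to picture the trade-off between objectives is a weighted sum of per-product objective scores. The sketch below uses that scalarization purely as an illustration; it is not the MORO method itself, and the objective values and weights are invented for the example.

```python
def combined_score(product, objective_weights):
    """Collapse several objectives into one number using a weighted sum."""
    return sum(weight * product[objective] for objective, weight in objective_weights.items())


def rank(products, objective_weights):
    """Sort products by descending combined score."""
    return sorted(products, key=lambda p: combined_score(p, objective_weights), reverse=True)


# Made-up objective scores, each assumed to be normalised to 0..1.
catalog = [
    {"name": "A", "relevance": 0.9, "add_to_cart_rate": 0.5, "margin": 0.2},
    {"name": "B", "relevance": 0.7, "add_to_cart_rate": 0.6, "margin": 0.8},
    {"name": "C", "relevance": 0.8, "add_to_cart_rate": 0.3, "margin": 0.5},
]

# Ranking on relevance alone versus a hypothetical business-balanced weighting.
relevance_only = rank(catalog, {"relevance": 1.0})
balanced = rank(catalog, {"relevance": 0.5, "add_to_cart_rate": 0.3, "margin": 0.2})

print([p["name"] for p in relevance_only])
print([p["name"] for p in balanced])
```

With these numbers the most relevant product leads the relevance-only list, while the balanced weighting promotes the product with the stronger margin and add-to-cart rate, which is the kind of shift the weights are meant to control.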

AIM: How do you think these search ranking algorithms take into account factors like the relevance of the product or its popularity or its profitability?

Kavitha: When websites rank their products, they need to factor in various aspects such as relevance, popularity, and profitability.

Out of these, relevance holds utmost importance as it ensures that the displayed product matches the user’s search query. For example, if a user searches for “organic carrots,” the website should show relevant results such as organic carrots before showing normal carrots or baby carrots. The algorithm looks at several aspects such as keyword density, title tags, meta descriptions, and content quality to generate this relevance aspect. Using word embeddings of the search term and the product terms helps in generating the right relevance.
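As a rough illustration of embedding-based relevance, the sketch below compares a query vector with product vectors using cosine similarity. The three-dimensional vectors are hand-made stand-ins, not the output of a real embedding model, which would produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy "embeddings" with made-up dimensions: (organic, carrot, baby).
query_vector = [1.0, 1.0, 0.0]  # stands in for the query "organic carrots"
product_vectors = {
    "baby carrots": [0.0, 1.0, 1.0],
    "carrots": [0.0, 1.0, 0.0],
    "organic carrots": [1.0, 1.0, 0.0],
}

# Order product names from most to least similar to the query.
by_relevance = sorted(
    product_vectors,
    key=lambda name: cosine_similarity(query_vector, product_vectors[name]),
    reverse=True,
)
print(by_relevance)
```

In this toy setup the exact match ranks first, followed by plain carrots and then baby carrots, mirroring the ordering described above.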

Popularity is determined by the number of views or purchases made earlier, along with customer reviews and ratings. For instance, when a user searches for hotels on a booking site, the website displays a list of popular hotels based on customer reviews and ratings. Similarly, the products with higher engagement and customer reviews should get boosted.

The profitability factor should be taken into account as another consideration. Focusing only on the relevance and popularity of products may yield a better customer experience, but may result in inadequate profits and failure to meet business goals. It’s essential to factor in pricing and profit margins when determining the ranking of products.

Websites need to incorporate all these factors into the ranking algorithm that generates the search results list, especially emphasizing the top few rows of products, to achieve an optimized search grid.

AIM: What are the indicators for e-commerce companies to assess the performance or effectiveness of search ranking algorithms?

Kavitha: The most important way to assess the performance and effectiveness of search rankings is to use A/B testing. A/B testing compares two versions of a webpage to determine which one performs better. Users are randomly split into two groups and shown different versions. Deliver the new search ranking to a subset of customers, while others still see the existing search grid. Then identify the uplift in business and clickstream metrics like Add to Cart rate, Engagement rate, Abandonment rate, and Gross Margin between these two groups. With this A/B testing, we will be able to assess how good our search ranking process is.
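The uplift calculation can be sketched as follows. The session and add-to-cart counts are invented, and the two-proportion z-test is an assumption added here as one common way to check whether a difference is larger than chance; the interview does not name a specific statistical test.

```python
import math


def add_to_cart_rate(group):
    """Share of sessions in which at least one product was added to the cart."""
    return group["add_to_carts"] / group["sessions"]


def two_proportion_z(control, variant):
    """z statistic for the difference between two add-to-cart rates."""
    pooled = (control["add_to_carts"] + variant["add_to_carts"]) / (
        control["sessions"] + variant["sessions"]
    )
    std_error = math.sqrt(
        pooled * (1 - pooled) * (1 / control["sessions"] + 1 / variant["sessions"])
    )
    return (add_to_cart_rate(variant) - add_to_cart_rate(control)) / std_error


# Made-up results: control sees the existing grid, variant sees the new ranking.
control = {"sessions": 10_000, "add_to_carts": 1_000}
variant = {"sessions": 10_000, "add_to_carts": 1_100}

uplift = (add_to_cart_rate(variant) - add_to_cart_rate(control)) / add_to_cart_rate(control)
z = two_proportion_z(control, variant)
print(f"relative uplift: {uplift:.1%}, z statistic: {z:.2f}")
```

A z statistic above roughly 1.96 is conventionally read as significant at the 5% level for a two-sided test; the same pattern applies to the other metrics mentioned, such as engagement or abandonment rate.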

In addition to A/B testing, there are other ways to assess the performance and effectiveness of search rankings. One key metric is the average time spent on the search results page. A well-optimized search grid should reduce the time users spend searching for the right product. By analyzing these metrics, retailers can identify areas for improvement and make data-driven decisions to optimize their search ranking algorithms. Additionally, it’s important to regularly monitor and evaluate search metrics to ensure that the search grid continues to perform effectively and meet business goals.

AIM: How do you think AI/ML can help with search rankings? How is this tech developing?

Kavitha: Our focus is on addressing the challenge of “Converting searches into sales”, and I firmly believe that leveraging AI/ML is the optimal approach for accomplishing this goal. By identifying the barriers that prevent online shoppers from discovering the products they seek and making purchases, AI/ML algorithms help retailers to enhance the search experience and effectively meet business goals.

The key characteristic of an exceptional search experience is the ability to promptly provide every user with the most pertinent results. However, the challenge lies in efficiently and gracefully executing this task behind the scenes. So choosing the right algorithm and right implementation strategy is the key. Making even minor enhancements to the relevance ranking system can have a dual effect of improving the shopping experience for millions of customers and significantly boosting revenue.

Although the application of learning to rank (LTR) in web search has been extensively researched, its utilization in E-Com search remains unexplored to a very large extent. Using the right feature representation, obtaining dependable relevance features, and utilizing user feedback as features are still areas that need a lot of business and AI acumen.

Our goal is to provide the correct perspective for solving the problem in the E-commerce use case, despite the availability of various algorithms that can tackle it. Unfortunately, only a small number of retailers have taken full advantage of these systems so far. Google and Amazon have even introduced new approaches to address the ranking problem in recent times, indicating how significant it is for retailers to act. Therefore, retailers must seize this abundant opportunity at hand. The most effective way to advance is to leverage the potential of AI/ML, employing the latest and most effective algorithms and integrating them into your E-commerce search engine.

The post Unlocking the Power of Search Ranking in E-commerce Websites using Data Science appeared first on Analytics India Magazine.

IBM launches QRadar Security Suite for accelerated threat detection and response

At the RSA conference, IBM launched a platform-centric expansion to its QRadar security product, designed as a one-stop shop to accelerate response and offer a unified framework for security operations centers. Called QRadar Suite, the cloud-native service expands capabilities across threat detection, investigation and response technologies, according to the company.

The service has an integrated dashboard user experience and artificial intelligence automation for parsing threats and responses. It’s designed to address the ongoing bad arithmetic around security operations centers: a threat landscape that is only expanding; more sophisticated attackers; plus an endemic shortage of human sentries to guard enterprise perimeters and kill chains.

“Today’s Security Operation Center teams are protecting a fast-expanding digital footprint that extends across hybrid cloud environments – creating complexity and making it hard to keep pace with accelerating attack speeds,” according to IBM, which also said the products are specifically meant to help buttress security operations center teams facing labor-intensive alert investigations and response processes, manual analysis and the proliferation of tools, data, points of engagement, APIs and other potential vulnerabilities.

XDR, SIEM and SOAR

Keeping pace with one of the pied pipers of RSA 2023 (unified platforms over multi-vendor security), IBM said QRadar Suite includes extended detection and response, or XDR, as well as security information and event management, or SIEM, and security orchestration, automation and response, or SOAR. It also includes a new cloud-native log management capability, all built around a common user interface, shared insights and connected workflows.

Emily Mossburg, Deloitte’s global cyber leader, said SOAR is about automating the workflow, while SIEM is the collection of security logs and events, and rules and policies to define analysis on top of that. “I would consider SOAR to be security workflow management. The vendors are sort of pushing it to help simplify the whole security operation and drive down the level of effort associated with working through incident and researching,” she said.

She said it comes down to dealing with a perennial shortage of security analysts. “There’s an element of balancing out the talent gap and I think the reality is that there’s a cost element to this. Organizations can’t spend more on protecting themselves than the revenue they bring in. If you had human eyes on glass on everything all the time you couldn’t afford security.”

IBM said its QRadar SIEM has a new unified analyst interface that provides shared insights and workflows with broader security operations toolsets. IBM said it plans to make QRadar SIEM available as a service on Amazon Web Services by the end of Q2 2023.

AI, the sine qua non of security?

During RSA, many companies talked about the virtues of AI in security, given the increase in alerts into SOCs and the paucity of human agents, particularly in mid-sized businesses that are perhaps more vulnerable to phishing attacks.

IBM Managed Security Services said it is using AI to automate more than 70% of alert closures and reduce its alert triage timelines by 55% on average within the first year of implementation.

IBM said QRadar uses AI to:

  • Triage: The company said that to prioritize and respond to alerts, QRadar includes AI trained on prior analyst response patterns, along with external threat intelligence from IBM X-Force and broader contextual insights from across detection toolsets.
  • Investigation: AI models identify high-priority incidents and automatically begin investigating and generate a timeline and attack graph of the incident based on the MITRE ATT&CK framework, and recommend actions to speed response.
  • Hunting: QRadar uses open-source threat hunting language and federated search capabilities to ID attacks and indicators of compromise across environments, without moving data from its original source.

The design elements of the system include AI capabilities and a common UX across products, meant to increase analyst speed and efficiency across the kill chain. It is cloud-based, delivered on AWS, and includes a cloud-native log management capability.

“In the face of a growing attack surface and shrinking attack timelines, speed and efficiency are fundamental to the success of resource-constrained security teams,” said Mary O’Brien, general manager, IBM Security, in a statement. “IBM has engineered the new QRadar Suite around a singular, modernized user experience, embedded with sophisticated AI and automation to maximize security analysts’ productivity and accelerate their response across each step of the attack chain,” she added.

Matt Olney, director, threat intelligence and interdiction at Cisco’s Talos threat intelligence unit, said it’s indeed an exciting time in AI and a system that supports human analysts is ideal. But he worries that, while AI will be faster, it may not be better, and suggests AI in the service of security poses a paradoxical conundrum. “We are training AI on internet, so we are creating things that can solve all these solved problems, but if we haven’t bothered to solve the problems we won’t be able to use the AI to do it,” he said.

Cisco showcased an early conceptual version of its AMES AI model for security, which will move toward a natural language interface. Olney voiced concerns that security AI systems could eventually eliminate lower level or Tier 1 security jobs, potentially hobbling enterprises’ ability to fill higher level SOC analyst positions where problems get solved creatively, generating data that would improve AI. “So when we start training AI, what are we going to train it on that’s new, if we’ve ended up eliminating these people?”

Platforms versus single vendors: a false dichotomy?

Mossburg said the platforming trend follows an inflection point in the industry on full display at RSA. “For a long time, we have focused on best-of-breed, the best mousetrap and it has gotten complex and hard to manage. Does it make sense to have 100 of the best mouse traps if you don’t have time to set them? We need to move to some level of simplicity so we can actually manage this thing that we have. We will see more of this for the next five years. We will see significant consolidation,” she predicted.

Olney said there are advantages to having a unified environment. “There are a lot of things to think about when making decisions about what to invest in, so really you want to look for what gives you the most visibility and what integrates well with the current level of sophistication your security staff has. Ultimately the tools are super important and useful and necessary, but ultimately it’s the people that are going to define the success of your security program,” he said.

He enumerated the advantages of having a unified environment. “You have a better relationship with vendors, a lot of sway when you are negotiating, and it’s easier to train people. Also, your support contracts are usually unified and that helps with financing,” Olney said.

A drawback: how likely is it for one company to excel at all toolsets? “If I’m advising a customer, I’ll say you have to have a really solid understanding of what your security needs are before you go looking for a security product,” said Olney, adding that enterprises should find a solution that gives them maximum visibility and the most secure controls they can apply to secure their network when they are actively engaging with their adversary.

The bottom line is security is hard, he said.

“You can’t just buy something from a vendor, plug it in and say I’m secure now. That’s not how this game works. It has to be complementary between right people with right skills sets combined with right tools and capabilities and put those together,” he added.

The best Roborock robot vacuums and mops of 2023

Chores are not fun, especially vacuuming and mopping the floors, but what if there was a shortcut? Created in 2014, Roborock burst onto the scene with its automated broom. Since then, it has expanded to automated mopping, as well. However, with several models to choose from, it can feel both confusing and intimidating when trying to determine which one is right for you.

That is where we come in. We studied the market and analyzed each model to find the best Roborock for your home. This is what we found.

Write long-form content in seconds with half off this leading AI tool

Create 5,000+ words in a matter of seconds and completely overhaul your content marketing. Right now, a lifetime subscription to Wordplay AI Content Generator is half off.

Creating content is a necessary hassle for small businesses. You need to carve out your niche on the web, and SEO-optimized content is one of the most cost-effective ways to do it. But if you don’t have time to research and write content yourself, you need a tool like Wordplay AI Content Generator.

While ChatGPT may be more well-known, Wordplay goes beyond, allowing you to create 5,000+ word articles with minimal input and minimal editing required. As such, Wordplay has earned 4.7/5 stars on AppSumo, 4.9/5 stars on Product Hunt and Capterra, and was named a Spring 2023 G2 High Performer with a 4.8/5-star rating.

Wordplay article drafts are 95% complete in about 15 seconds, and you can create blog articles, web pages, marketing content, blog ideas and more, quickly. You can craft quality content in more than 20 languages to target specific audiences, all without worrying about SEO. Wordplay’s content is always created with Google’s algorithm in mind. It’s specifically built to make your content accurate, on-topic, and useful, so that it ranks highly on search engine result pages.

With Wordplay, you can work in a few ways. The guided mode allows you to select your title, introduction, and subsections to provide context for Wordplay to work with. In title mode, you can just provide a descriptive title and say how long you want your article to be. In outline mode, you can provide an outline and watch Wordplay get to work. No matter how you work, you can save hundreds of hours by creating articles using Wordplay’s clever AI.

Scale your content marketing in record time. Right now, you can get a lifetime subscription to Wordplay AI Content Generator for 49% off the regular price of $199, at just $99.99.

Prices and availability are subject to change.

Exploring Data Cleaning Techniques With Python

In real-world data science projects, the data used for analysis may contain several imperfections such as missing data, redundant data, entries with an incorrect format, outliers, etc. Data cleaning refers to the process of preprocessing and transforming raw data to render it in a form that is suitable for further analysis, such as descriptive analysis (data visualization) or predictive analysis (model building). Clean, accurate, and reliable data must be used for subsequent analysis because “bad data leads to bad predictive models”.

Several libraries in Python, including pandas and NumPy, can be used for data cleaning and transformation. These libraries offer a wide range of methods and functions to carry out tasks including dealing with missing values, eliminating outliers, and translating data into a model-friendly format. Additionally, eliminating redundant features or combining groups of highly correlated features into a single feature can reduce dimensionality. Training a model on a dataset with fewer features generally improves computational efficiency. Furthermore, a model built on fewer features is easier to interpret and, when the removed features were redundant or noisy, may also predict better.

In this article, we will explore various tools and techniques that are available in Python for cleaning, processing, and transforming data. We will demonstrate data cleaning techniques using the data.csv dataset shown below:

data.csv showing various imperfections such as duplicated data, NaN, etc. Data created by Author.

Libraries For Data Cleaning in Python

In Python, a range of libraries and tools, including pandas and NumPy, may be used to clean up data. For instance, the dropna(), fillna(), and drop_duplicates() methods in pandas may be used to remove missing data, fill in missing data, and remove duplicate rows, respectively. The scikit-learn toolkit offers tools for dealing with missing values (such as the SimpleImputer class) and for transforming data into a format that can be utilized by a model, such as the StandardScaler class for standardizing numerical data and the MinMaxScaler class for normalizing data to a fixed range.

In this article, we will explore various data cleaning techniques that can be used in Python to prepare and preprocess data for use in a machine learning model.

Processing Missing Data

Missing data is one of the most common imperfections in a dataset. Several methods for dealing with missing data are provided by the pandas package in Python, including dropna() and fillna(). The dropna() method is used to eliminate any rows or columns that have missing values. For instance, the code below will eliminate all rows with at least one missing value:

    import pandas as pd

    data = pd.read_csv('data.csv')
    # Drop every row that contains at least one missing value
    data = data.dropna()

The fillna() function can be used to fill in missing values with a specific value or method. For example, the following code will fill in missing values in the 'age' column with the mean age of the data:

import pandas as pd

data = pd.read_csv('data.csv')
# assign the result back rather than calling fillna(inplace=True) on a column selection
data['age'] = data['age'].fillna(data['age'].mean())

Handling Outliers

Handling outliers is a typical data cleaning activity. Values that diverge greatly from the rest of the data are considered outliers. These values should be managed carefully since they can have a significant influence on a model's performance. The RobustScaler class from the scikit-learn toolkit can be used to reduce the influence of outliers. It scales the data by subtracting the median and dividing by the interquartile range, both of which are less sensitive to extreme values than the mean and standard deviation.

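import pandas as pd
from sklearn.preprocessing import RobustScaler

data = pd.read_csv('data.csv')
# scale only the numeric columns, since RobustScaler cannot process text columns
numeric_cols = data.select_dtypes(include='number').columns
scaler = RobustScaler()
data[numeric_cols] = scaler.fit_transform(data[numeric_cols])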
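
Scaling does not remove outliers, so it can also be useful to flag them explicitly. The sketch below uses the common 1.5 × IQR rule, which is a swapped-in technique and not something the RobustScaler class does; the salary values are made up for this example.

```python
import pandas as pd

# hypothetical salaries (in thousands) with one extreme value
salaries = pd.Series([40, 42, 45, 47, 50, 400], name="Salary")

# interquartile range and the usual 1.5 * IQR fences
q1, q3 = salaries.quantile(0.25), salaries.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# flag values outside the fences and keep the rest
is_outlier = (salaries < lower) | (salaries > upper)
cleaned = salaries[~is_outlier]

print(salaries[is_outlier].tolist())
print(cleaned.tolist())
```

Whether flagged values should be dropped, capped, or kept depends on whether they are errors or genuine observations.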

Encoding Categorical Variables

Another common data cleaning task is converting data into a format that can be used by a model. For instance, before categorical data can be used in a model, it must be transformed into numerical data. The get_dummies() function in the pandas package transforms categorical data into numerical indicator columns. In the example below, the categorical feature ‘Department’ is transformed into numerical data:

import pandas as pd

data = pd.read_csv('data.csv')
data = pd.get_dummies(data, columns=['Department'])

Removing Duplicate Data

Duplicate data must also be eliminated during the data cleaning process. To delete duplicate rows from a pandas DataFrame, the drop_duplicates() method can be used. For instance, the code below removes any duplicate rows from the data:

import pandas as pd

data = pd.read_csv('data.csv')
data = data.drop_duplicates()

Feature Engineering

Feature selection and feature engineering are essential components of data cleaning. The process of choosing only the relevant features in a dataset is referred to as feature selection, whereas the process of building new features from already existing ones is known as feature engineering. The code below is an illustration of feature engineering:

import pandas as pd
from sklearn.preprocessing import StandardScaler

# read the data into a pandas dataframe
df = pd.read_csv("data.csv")

# create a feature matrix and target vector
X = df.drop(["Employee ID", "Date of Joining"], axis=1)
y = df["Salary"]

# scale the numerical features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X[["Age", "Experience"]])

# concatenate the scaled features with the categorical features
gender_dummies = pd.get_dummies(X["Gender"], prefix="Gender")
X_processed = pd.concat(
    [
        gender_dummies,
        # reuse X's index so the rows line up during concatenation
        pd.DataFrame(X_scaled, columns=["Age", "Experience"], index=X.index),
    ],
    axis=1,
)

print(X_processed)

In the above code, we first create a feature matrix (X) by dropping the 'Employee ID' and 'Date of Joining' columns, and create a target vector (y) consisting of the 'Salary' column. We then scale the numerical features 'Age' and 'Experience' using the StandardScaler class from scikit-learn.

Next, we create dummy variables for the categorical 'Gender' column and concatenate them with the scaled numerical features to create the final processed feature matrix (X_processed).
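
Feature selection, the other process described in this section, can be sketched in a similar way. The example below is only one possible approach: it uses scikit-learn's SelectKBest with the f_regression score on synthetic data. The column names, the random seed, and the choice of k=1 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

# synthetic data: salary depends on experience, while "Noise" is unrelated
rng = np.random.default_rng(0)
n = 50
X = pd.DataFrame(
    {
        "Experience": rng.uniform(0, 20, n),
        "Noise": rng.normal(0, 1, n),
    }
)
y = 30_000 + 2_000 * X["Experience"] + rng.normal(0, 500, n)

# keep the single feature with the strongest linear relationship to the target
selector = SelectKBest(score_func=f_regression, k=1)
selector.fit(X, y)
selected = list(X.columns[selector.get_support()])

print(selected)
```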

Note that the specific feature engineering techniques used will depend on the data and the requirements of the analysis. Also, it's important to split the data into training and testing sets before fitting any scalers or machine learning models, so that the model is evaluated on data it has not seen and overfitting can be detected.
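
A minimal sketch of such a split is shown below, using scikit-learn's train_test_split on a small made-up array. The 80/20 split ratio and the random_state value are assumptions for this example. The scaler is fitted on the training portion only and then applied to both portions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# hypothetical feature matrix (10 samples, 2 features) and target vector
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

# hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```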

Conclusion

To conclude, data cleaning is an essential stage in the machine learning process since it helps ensure that the data used for analysis (descriptive or prescriptive) is of high quality. Important methods for preparing and preprocessing data include converting data formats, removing duplicate data, dealing with missing data, handling outliers, feature engineering, and feature selection. Pandas, NumPy, and scikit-learn are just a few of the many libraries and tools for feature engineering and data cleaning.
Benjamin O. Tayo is a Physicist, Data Science Educator, and Writer, as well as the Owner of DataScienceHub. Previously, Benjamin was teaching Engineering and Physics at U. of Central Oklahoma, Grand Canyon U., and Pittsburgh State U.

5 Ways AI Is Impacting STEM Education in 2023

STEM education is a priority for students and educators across the country. Science, technology, engineering and math are essential fields that need skilled workers now and in the future. STEM is even more important for student success as tech advances and the world gets more complicated.

Artificial intelligence (AI) is a major innovation shaping nearly every aspect of the world, and education is no exception. People face many potential challenges as AI becomes more ubiquitous, but the educational benefits are undeniable.

AI has almost endless possibilities. STEM educators can use it in several ways to improve science and technology learning for students of all ages, from daily instruction to experiences that reach beyond the classroom.

Learn more about the many ways AI is impacting STEM education in 2023 and beyond.

1. Personalized Assistance

Everyone learns at a different pace, and students have various learning styles. For example, visual learners won’t respond to the same methods as reading/writing learners. That means not everyone is on the same page, and some might get left behind.

AI enables educators to embrace adaptive learning — a learning experience customized for every student. Artificial intelligence programs can gather and analyze data from each student to create a personalized curriculum. Kids can move at their own pace, getting extra long division practice or physics problems to solve if needed.

Personalized tutoring impacts student success rates — tutored children outperform their untutored peers in most subjects, including STEM fields. AI tutoring takes things to the next level — no more one-size-fits-all learning that holds some kids back and leaves others behind.

2. Greater Creativity

Another benefit of adaptive learning with AI is the potential for more creative thinking and problem-solving. Traditional learning programs are rigid and leave little room for students to explore and experiment with new ideas.

AI software is quick on its feet, able to adapt to changes and suggest new paths. STEM students who enter the workforce will face bigger challenges that require smart solutions. Getting experience with creative problem-solving early on will improve their success in future endeavors.

3. Inclusion and Access

Artificial intelligence is at the forefront of accessible technology. For example, virtual reality (VR) makes STEM education and other types of learning more available for nontraditional students. Children who may be unable to attend classes in person or have a learning disability can take part in learning programs thanks to AI.

4. More Accurate Assessments

Human error is inevitable. Educators preparing or grading exams and standardized assessments may make mistakes that can impact a student’s education. However, AI programs have high rates of accuracy in practical applications.

That means STEM students at all educational levels will receive timely and correct feedback. Immediate, meaningful comments from AI and human instructors help students get back on the right track. Greater accuracy also allows teachers to pinpoint students or academic areas needing extra attention.

5. Preparation for the Future

AI is the future of technology. However, there is currently an AI education gap — the demand for highly intelligent software is high, but the number of capable workers is struggling to keep up. That means it’s more important than ever for students to prepare for an AI-oriented career in STEM.

The earlier students get experience, the better. AI programs in education allow students of all ages to familiarize themselves with this technology, grow alongside it and become next-gen leaders in the field.

AI’s Role in STEM Education

AI and other cutting-edge technologies play a huge role in the future of education, especially for STEM students who will develop future programs. Embracing AI early on in their learning careers is essential to ensure academic and career success.
April Miller is managing editor of consumer technology at ReHack Magazine. She has a track record of creating quality content that drives traffic to the publications she works with.

Boston Dynamics robot dog can answer your questions now, thanks to ChatGPT

Boston Dynamics is an engineering and robotics company that has gained popularity for its viral videos showcasing its futuristic robots performing impressive feats such as parkour.

A new video shared by AI expert Santiago Valdarrama showcases Boston Dynamics' robot dog, Spot, performing a few new tricks that involve artificial intelligence.

In the almost two-minute video, Spot is able to answer natural language questions such as "Are you standing?" and "What is your battery level?".

To do so, Spot first uses OpenAI's ChatGPT to query information and then uses Google's text-to-speech AI to vocalize the answer.

"At the end of each mission, the robots capture a ton of data. There's no simple way to query all of it on demand. That's where ChatGPT comes in," said Valdarrama.

"We show it the configuration files and the mission results. We then ask questions using that context. Put that together with a voice-enabled interface, and we have an awesome way to query our data!"

The video, which takes place at Levatas' lab, also shows Spot answering mission-specific questions such as "Spot, how many inspections in your next mission?" or "Describe your last mission."

Spot has the ability to carry up to 14kg of equipment, can perform repeatable missions, gather data, navigate different terrains, conduct thermal inspections, detect radiation and more.

The robot's advanced abilities, paired with AI, prompted some people to raise the usual apocalyptic concerns about AI and robots. Valdarrama took to Twitter to put those worries to rest.

"90% of the comments were people talking about the end of civilization. Meanwhile, 99% of data scientists are still figuring out how to split their tabular datasets," said Valdarrama.

Alibaba Cloud seeks partners to help build custom generative AI models

Alibaba Cloud booth is seen during the Apsara Conference 2022 on November 3, 2022 in Hangzhou, Zhejiang Province of China.

Alibaba Cloud wants partners to help build generative artificial intelligence (AI) models that are customized for companies across various verticals, including finance and petrochemicals.

The Chinese cloud vendor has introduced a partnership program that it hopes will accelerate the development of such applications, powered by its large language model, Tongyi Qianwen. Launched earlier this month, the generative AI model is expected to be integrated with all of Alibaba's own business applications, including e-commerce, search, navigation, entertainment, enterprise communication, and intelligent voice assistance.

With the Tongyi Qianwen partnership program, Alibaba aims to facilitate the creation of large language models tailored for different verticals.

Initial industrial models in the partnership program will encompass sectors such as transportation, hospitality, finance, and telecommunications. The initial seven partners that have enrolled in the program include petrochemical company Kunlun Digital Technology, transportation company China Transinfo Technology, and electricity company LongShine Technology.

Under the initiative, Alibaba Cloud will offer partners tech support, cloud computing, as well as AI and machine learning tools. These partners can tweak and retrain the vendor's large language AI model with their proprietary technology and industrial expertise, within a "secure and designated" cloud environment, Alibaba said.

The jointly developed models then will be made available on websites and through APIs for enterprise customers and developers, who can use the AI frameworks to create applications, such as shopping guides and domain-specific virtual assistants.

The program will enable Alibaba to offer "more tangible value" to enterprise customers with AI models that suit their industry's specific business needs, said Alibaba Group's chairman and CEO Daniel Zhang, who also leads the cloud business.

The Chinese vendor currently is integrating Tongyi Qianwen with its operating system for cars, AliOS, for internal tests, with IM Motors slated to be the first automotive brand to implement the AI model. An electric vehicle manufacturer, IM Motors is a joint venture backed by Alibaba and SAIC Motor.

Alibaba said it had received more than 200,000 beta testing requests for Tongyi Qianwen since its launch on April 11. These came from businesses across various verticals, including transport, fashion, and fintech.

Tongyi Qianwen already powers more than 10 functions on Alibaba's online collaboration workplace platform DingTalk, where its chatbot can create to-do lists, generate chat summaries, and draft marketing posts. The chatbot also will continue to learn as users feed more data, according to Alibaba.

Tongyi Qianwen has both Chinese and English language capabilities.

Significant price cuts to cloud services

The partnership program was launched at Alibaba Cloud's annual partners summit in Nanjing this week, where the vendor also unveiled price cuts to its core products and services. Ranging from 15% to 50%, the fee adjustments mark the biggest price reduction to date; however, only its customers in China will benefit from the cuts.

The move is part of Alibaba's efforts to gain a wider footprint in its domestic public cloud market.
