Stack Overflow’s Moderators are Its Last Line of Defense Against AI Junk

To say that Stack Overflow has been having a bad year is an understatement. From considerable community backlash over its proposed LLM product to uproar over its API access changes, the community question-and-answer platform has come under fire since ChatGPT exploded in popularity. However, this isn’t the only reason the site has declined.

New statistics show that Stack Overflow has lost around 50% of its traffic over the past year and a half. Its lifeblood of questions and answers has also fallen by roughly 50%. This comes at a time when many users of the site feel increasingly strangled by moderation.

Even as the site continues to crack down on the quality of its content, a case can be made for the increase in moderation. As the Internet fills with AI junk, Stack Overflow’s heavily moderated database of rich, user-driven content might be the last bastion of human-generated, domain-specific data.

Stack Overflow’s unsteady mutiny

Even before the launch of ChatGPT in November last year, Stack Overflow was seeing a steady decline in users. This was mainly caused by the company’s new-found attitude towards moderation, which started to veer into the extreme. Hacker News forum member JohnMakin stated,

“Moderation on SO has gotten progressively more horrible. Can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever….Oftentimes the best answer is buried in comments and has very negative feedback despite answering the exact question.”

This can largely be traced back to a moderation strike that curators, contributors, and moderators of the site began on June 5, 2023. Its main objective was to protest Stack Overflow’s flip-flopping AI policy: the site’s initial ban on AI-generated content led to thousands of posts being removed and hundreds of users being suspended, but the ban was revoked in May of this year, allowing AI content to be published on the platform, much to moderators’ chagrin.

This led to moderators raising the alarm over AI-generated content, which they believe will “over time, drive the value of the sites to zero”. They also argued that the company has ignored the needs of its community, focusing instead on business pivots. Through the strike, they aimed to draw attention to the issues moderators on the site face.

While the moderators are currently engaged in a protracted battle with the site’s owners, they seem to be slowly winning. They have secured an interim solution on the generative AI front, under which suspected AI-generated content will be checked against a set of ‘strong’ and ‘weak’ heuristics that determine whether a post should be removed. The moderators were also successful in getting Stack Overflow to continue providing data dumps and API access. This battle underscores the importance of sticking to human-generated content in the age of AI, especially when the company is trying to make a living selling training data.
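Stack Overflow has not published the heuristics themselves, so the sketch below is purely illustrative: the rule names, weights, and threshold are all invented to show how a policy of "a strong signal is decisive, weak signals must accumulate" might be wired up.

```python
# Hypothetical sketch of a 'strong'/'weak' heuristic check for suspected
# AI-generated posts. Rule names, weights, and the threshold are invented
# for illustration; they are not Stack Overflow's actual criteria.

STRONG_HEURISTICS = {
    "author_admitted_ai_use": 1.0,
    "burst_of_long_answers_in_minutes": 1.0,
}
WEAK_HEURISTICS = {
    "generic_boilerplate_phrasing": 0.3,
    "no_code_despite_code_question": 0.2,
}

def should_remove(signals, threshold=0.5):
    """A single strong signal is decisive; weak signals must accumulate."""
    if signals & STRONG_HEURISTICS.keys():
        return True
    score = sum(WEAK_HEURISTICS.get(s, 0.0) for s in signals)
    return score >= threshold

print(should_remove({"author_admitted_ai_use"}))        # strong signal alone
print(should_remove({"generic_boilerplate_phrasing"}))  # one weak signal
print(should_remove({"generic_boilerplate_phrasing",
                     "no_code_despite_code_question"}))  # weak signals combined
```

The asymmetry mirrors the stated intent of such an interim policy: one strong indicator justifies removal on its own, while weak indicators only matter in combination.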

Saving the golden goose

Currently, many developers turn to chatbots to solve their programming issues. As models like ChatGPT improve, so does their ability to logically deconstruct code. Kartik D, a Senior Backend Developer at MachineHack, said of using Stack Overflow, “Finding the right Stack Overflow answer for an issue is difficult, but it’s easier in ChatGPT. Combining GPT-3.5 and Bard you get a good result, but the suggested results in Bard usually redirect to Stack Overflow.”

This shows the influence Stack Overflow has on the training datasets of large language models like GPT-4. Question-and-answer sites are well known to be among the richest sources of data for large language models: not only is the quality of the data high, it is also structured in a format well suited to training.
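To see why Q&A structure lends itself to training, consider how an accepted answer maps almost directly onto a prompt/completion record. The schema, field names, and score threshold below are illustrative conventions, not any vendor's actual pipeline:

```python
import json

# Illustrative only: one common way to flatten voted Q&A pairs into
# instruction-tuning records. The schema here is a made-up convention.

qa_pairs = [
    {
        "question": "How do I reverse a list in Python?",
        "accepted_answer": "Use list.reverse() in place, reversed(lst) "
                           "for an iterator, or lst[::-1] for a new list.",
        "score": 412,
    },
]

def to_training_record(pair, min_score=10):
    """Keep only well-voted answers; votes act as a free quality filter."""
    if pair["score"] < min_score:
        return None
    return {"prompt": pair["question"], "completion": pair["accepted_answer"]}

records = [r for p in qa_pairs if (r := to_training_record(p))]
print(json.dumps(records[0]))
```

The vote threshold is the key point: community moderation has already done the quality filtering that is expensive to reproduce on unstructured web text.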

User maxlin on the Hacker News forum summarised this perfectly, stating, “Even though StackOverflow in the common use case has been taken over by ChatGPT, I sincerely hope it keeps operating, stays strict (even if it causes collateral) and keeps ban on LLM-generated content…Obviously ChatGPT was trained partly with data only gainable from a healthy StackOverflow-kind of site with users actively asking unique questions and enough people answering those unique questions with well-thought-out answers.”

This also echoes the statements of Reddit CEO Steve Huffman, who has said that Reddit’s ‘corpus of data is really valuable’, as it contains things that people would ‘only ever say in therapy, or A.A., or never at all’. In the same way, Stack Overflow contains answers to some of the most specific technical queries on the Internet, keeping the quality high and up to date.

If AI content is allowed on the site, the quality of overall content would deteriorate and move away from the carefully worded and constructed answers of today. Moreover, stronger moderation will only increase the quality of the data, which is something Stack Overflow will soon desperately count on as self-debugging LLMs become more prominent.


The post Stack Overflow’s Moderators are Its Last Line of Defense Against AI Junk appeared first on Analytics India Magazine.

AI Will Deliver $115 Billion More to the Economy, But Are Tech Pros Ready?

A circuit board with the word AI hovering above.
Image: Kaikoro/Adobe Stock

The adoption of AI has occurred at an almost unprecedented rate. Now, a report from the Technology Council of Australia and Microsoft suggests that generative AI alone could be worth $115 billion to the Australian economy by 2030.

That is assuming that the rate of adoption continues on its current trajectory. If things slow down a little, it could still boost the economy by $45 billion.

Microsoft has a vested interest in making this sound both impressive and transformative. After all, Microsoft has invested billions of dollars into OpenAI — a company that conveniently produces generative AI solutions.


Nonetheless, AI being a revenue-generating resource isn’t news, and it’s already having an impact on businesses and the economy, with some finding ways for it to boost productivity and efficiency and others finding themselves out of a job.

For now, IT professionals are relatively protected from AI. The sectors most impacted are finance and banking, media and marketing, and legal services, while the least impacted are manufacturing and factory work, agriculture and healthcare. However, AI’s impact is almost certain to be universal, with Goldman Sachs predicting that the technology will replace 300 million full-time jobs by 2030.

The question is whether IT workers are genuinely ready to ride the wave of displacement that is about to wash over the country.


Understand what AI can and cannot do

The simple and blunt answer to the question of whether AI can replace a job is whether or not the job is process-heavy. AI is a glorified form of automation, and while it’s built on astronomically large sets of data, and is therefore potentially good at disseminating information, anything that relies on “thinking outside the box” is beyond the capabilities of AI.

That applies both now and into the future. AI will get better and more reliable. It’s going to support innovators and free them to do more creative thinking. It’s not going to lead them.

IT professionals should therefore focus on developing skill sets that extend beyond the process side of their work. Coding, software engineering and basic data analytics will be increasingly replaced by AI.


However, experienced IT professionals will be a step ahead of AI with their higher-value skills. They’ll use the work AI does to enhance their own value within the organization. Project managers, specialists and those who can translate IT into business outcomes will continue to be valued by businesses.

And then there’s AI itself. Australia is on track to need another 161,000 AI specialists by 2030, so adding AI capabilities to your portfolio now is an effective way to prepare for the future.

Be prepared to change how you work

Another key shift that IT professionals will want to prepare themselves for is the shift away from full-time employment to a more contract-based form of work — call that “freelancing,” “consulting” or the “gig economy,” as you will. One of the benefits of having full-time employees is that they can be set tasks of low value that would become excessively expensive to give to a freelancer at contracting rates.

Those are also the tasks that will increasingly be taken over by AI, leaving full-time employees with a surplus of time they may or may not be able to fill with higher-value tasks. Organizations might use that as an opportunity to streamline their workforces.

However, the valued skills those employees brought to their organizations will still be needed, so we’ll likely see more organizations relying heavily on freelance capabilities on a project-by-project basis. Indeed, as far back as 2018, McKinsey reported that 61% of organizations anticipated hiring more temporary employees. So, this shift may come as no surprise to some Australian IT professionals.

Working as a freelancer or consultant requires additional skills around business development, networking, time management and project management. It’s difficult for some, but IT professionals would do well to prepare themselves for the potential that they will spend at least some time working on a freelance basis in their future careers. Those that know they’ll struggle with the “business” side of freelance life should look into some basic business development and management courses to prepare themselves for this potential outcome.

Five quick tips for IT pros to prepare for the AI future

AI isn’t something for IT professionals to fear. There will always be work for IT pros; it’s just a matter of being adaptable and flexible enough to move to where the opportunities lie. It’s important to treat the next few years as critical to career development so that, as your role transforms, you are able to move with it. Those who can’t will become the “AI casualty” statistics routinely written up in the media.

1. Continuous learning and upskilling

AI and related technologies are constantly evolving, so IT professionals must embrace lifelong learning and continuously update their skills. Investing time in learning AI programming languages, machine learning algorithms, and data science techniques will empower IT professionals to contribute effectively to AI-related projects and remain relevant in their roles.

2. Identify AI-augmented roles

Rather than fearing AI as a threat, IT professionals should recognize its potential to augment their existing roles. For instance, AI can assist in data analysis, predictive maintenance, cybersecurity and decision-making. By identifying these AI-augmented roles, IT professionals can position themselves as valuable assets in their organizations.

3. Develop a data-driven mindset

Data is the backbone of AI, and IT professionals should develop a data-driven mindset. They must understand how to collect, clean and analyze data to derive meaningful insights that can drive business decisions and improve AI models.
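A data-driven mindset can start small. The sketch below, using only the standard library and made-up records, shows a minimal collect, clean, and analyze pass of the kind described above:

```python
from statistics import mean

# Minimal collect -> clean -> analyze pass. The records and the
# "latency_ms" field are invented purely for illustration.

raw = [
    {"latency_ms": "120"},
    {"latency_ms": ""},     # missing value
    {"latency_ms": "95"},
    {"latency_ms": "n/a"},  # malformed value
    {"latency_ms": "143"},
]

def clean(rows):
    """Drop rows whose latency can't be parsed as a number."""
    out = []
    for row in rows:
        try:
            out.append(float(row["latency_ms"]))
        except ValueError:
            pass
    return out

latencies = clean(raw)
print(len(latencies), "valid rows, mean latency:", mean(latencies))
```

The habit being illustrated is the important part: never feed raw records straight into an analysis or a model without an explicit cleaning step.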

4. Cultivate creativity and problem-solving skills

While AI can handle many repetitive tasks, creativity and problem-solving abilities remain essential skills for IT professionals. AI is only as effective as the problems it solves. IT professionals who can creatively identify opportunities for AI implementation and effectively address complex challenges will be in high demand.

5. Ethical considerations in AI

With the increasing use of AI, ethical considerations have become paramount. IT professionals should familiarize themselves with the ethical implications of AI, including bias in algorithms, data privacy and the potential social impact. This will make them incredibly valuable to organizations that need that contextual awareness that AI lacks.


Genpact Collaborates with Microsoft to Equip Workforce with AI Tools

Genpact has recently announced a collaboration with Microsoft to equip over 115,000 employees with AI tools. The collaboration will allow Genpact’s global talent to access Microsoft’s Azure OpenAI Service to enable them to unlock new opportunities to implement generative AI capabilities for clients.

Genpact has identified various opportunities to leverage large language models (LLMs) to drive enterprise efficiencies in areas such as transition management, global service desk management, infrastructure management, and other areas.

This relationship with Microsoft builds on Genpact’s deep expertise in AI innovation – from decades-long investments in advanced data analytics to strategic AI acquisitions, to leading-edge technology solutions and extensive experience with LLMs across numerous industries, including consumer goods, retail, life sciences, healthcare, hi-tech, and financial services.

Genpact will also support its employees’ use of generative AI tools through comprehensive training programmes and resources.

The company recently established a generative AI practice with a team of data scientists, data engineers, and domain experts focused on the rapid development of generative AI capabilities on the Genpact Cora platform which is already integrated with more than 250 enterprise ecosystems handling more than 20 million transactions a month.

“Our continuous learning environment helps clients optimise AI’s rapidly evolving landscape, and we’re excited to leverage Microsoft’s AI tools, which will revolutionise the way we approach problem-solving,” Vidya Rao, Chief Information Officer at Genpact, said.


Everything You Need to Know About the LLM University by Cohere


You’ve probably been hearing a lot about Large Language Models (LLMs). Some of you are interested in what the future holds. Some are wondering, “How do I involve myself in this?!” Regardless of your thoughts about LLMs, the end goal is the same: to learn more about them. If you want to learn about LLMs to transition to a different career in the tech industry, the LLM University by Cohere can help you with exactly that!

We are seeing more and more developers interested in taking their careers with LLMs to the next level. Natural language processing (NLP) is an area many developers thought they’d never dive into. But with the growth of LLMs and organizations such as Cohere providing educational content, the transition is becoming a lot easier.

What is LLM University?

Cohere aims to build the future of language AI by empowering developers and enterprises to make products that capture essential business value with language AI. To live up to this, they have created the LLM University for developers who want to learn more about NLP and LLMs.

They offer a comprehensive curriculum that aims to provide students and developers with a good foundational knowledge of NLP and build on this to develop their own applications.

Don’t feel nervous when you hear that it’s for developers — because they are here to cater to all types of people from all types of backgrounds. You will learn the basics of NLP and LLMs and build on your knowledge to a more advanced level, such as building and using text representation and text generation models.

The theoretical aspect has clear explanations and analogies with examples to back it up and the practical aspect has code examples to solidify your knowledge. Once you have a good understanding of the sector, you will put your skills to the test with hands-on exercises which will then allow you to build and deploy your very own models.

Learning Route

So how does that work? Beginners and intermediates together? No. There are two ways to learn:

  1. Sequential

If you are a new machine learning engineer, you may feel more comfortable starting from the beginning with NLP and LLMs. With the sequential route, you will go through the basics of NLP and LLMs, and their architecture.

Although this route requires very little background knowledge, you can still brush up on your knowledge of Machine Learning and NLP using the following material: Appendix 1.

  2. Non-Sequential

If you feel a bit more confident about the basics of NLP and LLMs, you may not want to start from the basics. You can skip these basic modules and you can move on to particular modules that fit your requirements or will help you with a particular project in mind. You can have a look at what this entails by checking out the following material: Appendix 2.

LLM University Curriculum

Want to know what you will be learning about? Let's dive in…

In the following main modules, you will learn about LLMs, how they work, and work on practical hands-on labs to build your own language applications. The first module is completely theory-focused, and then in modules 2, 3, and 4, you will have a combination of theory and hands-on practice with code labs.

These are the modules:

  1. Module 1: What are Large Language Models?

In this module, you will learn the basics of LLMs, as well as learn more about embeddings, attention, transformer model architecture, semantic search, as well as practical examples and hands-on exercises.

  2. Module 2: Text Representation with Cohere Endpoints

In the second module, you will go through theory as well as practical labs where you will learn how to use Cohere's endpoints for Classification, Embeddings, and Semantic Search. By the end of this module, you will learn how to write code to call the Cohere API for several different endpoints.

  3. Module 3: Text Generation with Cohere Endpoints

In the third module, you will learn about using generative models to generate text. You will start with a codelab that teaches you how to use the Generate endpoint and then master prompt engineering.

  4. Module 4: Deployment

Last but not least, deployment! When you build your applications, you will then learn how to deploy them using platforms and frameworks, such as AWS SageMaker, Streamlit, and FastAPI.
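As a taste of the embeddings and semantic-search material covered in Module 1, the toy sketch below ranks documents by cosine similarity. The three-dimensional "embeddings" are hand-made stand-ins for real model output, not vectors from Cohere's Embed endpoint:

```python
import math

# Toy semantic search: embed texts as vectors, rank by cosine similarity.
# The 3-d vectors below are invented stand-ins for real embeddings.

corpus = {
    "how to bake bread":      [0.9, 0.1, 0.0],
    "train a neural network": [0.0, 0.2, 0.9],
    "sourdough starter tips": [0.8, 0.2, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vec, docs):
    """Return corpus texts ranked most-similar-first to the query vector."""
    return sorted(docs, key=lambda t: cosine(query_vec, docs[t]), reverse=True)

# A query vector "near" the baking documents ranks them first:
print(search([0.85, 0.15, 0.05], corpus))
```

With real embeddings the pipeline is identical; only the vectors come from a model instead of being written by hand.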

Once you have completed these modules, you will have mastered NLP and unlocked a world of new opportunities in the growing field of language technology.

Wrapping it up

For you to get the help you need, Cohere is taking in the first batch of learners and guiding them through the course materials together. They also have reading groups and will be hosting exclusive events. You can join Cohere’s Discord community, where you will be able to connect with other learners, help each other through the process, share ideas, and build together.
Nisha Arya is a Data Scientist, Freelance Technical Writer and Community Manager at KDnuggets. She is particularly interested in providing Data Science career advice or tutorials and theory based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence is/can benefit the longevity of human life. A keen learner, seeking to broaden her tech knowledge and writing skills, whilst helping guide others.


HDFC ERGO & Google Cloud Partner to Create Center of Excellence for Generative AI 

India’s leading private-sector general insurance company HDFC ERGO General Insurance is launching a Centre of Excellence (CoE) for generative AI in collaboration with Google Cloud. Google Cloud will support HDFC ERGO in identifying and developing use cases and in upskilling its teams on generative AI.

As a ‘digital first’ organization, HDFC ERGO has been a frontrunner in harnessing technology to democratize insurance and deliver efficient and hyper-personalized customer-centric services. Google Cloud will support HDFC ERGO in establishing best practices and guidelines to ensure the responsible and ethical use of generative AI for their developers.

By leveraging the capabilities of its CoE, HDFC ERGO aims to develop new products and services, enhance the efficiency of existing processes, and optimize costs. Besides, aligned with HDFC ERGO’s commitment to fostering a future-ready workforce, the CoE will also offer training and upskilling opportunities to employees on generative AI technologies, empowering young jobseekers to stay at the forefront of this rapidly evolving field.

Recently, Birlasoft Limited, part of the C.K. Birla Group, established a Generative AI Centre of Excellence in collaboration with Microsoft. The centre will serve as a hub for Birlasoft and Microsoft experts to facilitate research, training, and collaboration.

Under the Hood

Being a digital-first company, HDFC ERGO has embraced AI technology and is known for its exceptional customer service. It has developed innovative insurance products and services using AI, ML, NLP and robotics. The company’s technological platform enables approximately 93% of retail policies to be issued digitally and caters to around 69% of customer service requests on a 24×7 basis, with 19% of these services handled by AI-based tools.

HDFC ERGO General Insurance Company Limited is a leading non-life insurance company in India’s private sector. It was established through a collaboration between Housing Development Finance Corporation Ltd. (HDFC) and ERGO International AG, a part of the Munich Re Group. After HDFC merged with HDFC Bank Limited, the insurance company became a subsidiary of the bank.

HDFC ERGO offers a comprehensive range of general insurance products, including health, motor, home, agriculture, travel, credit, cyber, and personal accident insurance in the retail segment. Additionally, they provide property, marine, engineering, marine cargo, group health, and liability insurance for corporate clients.

In terms of investments, HDFC ERGO has made two significant ones, with the most recent being a $10 million investment in MedGenome on March 5, 2018. They have also acquired L&T General Insurance Company for $82 million on June 6, 2016.


IBM Report: Data Breaches Cost India INR 17.9 Crores On Avg, AI Proved Cost-Effective

IBM Security has released its annual Cost of a Data Breach Report, showing that the average cost of a data breach in India reached an alarming INR 17.9 crores in 2023 – a 28% increase since 2020. Remarkably, companies relying internally on AI and automation saw significantly lower costs.

The news comes a fortnight after the Union Cabinet approved the draft Digital Personal Data Protection (DPDP) Bill, 2023 for the ongoing monsoon session of the Indian Parliament. The bill proposes a penalty of up to Rs 250 crore (down from the initially proposed Rs 500 crore) on organisations for every instance of failure to prevent a data breach.

Commenting on the report, Viswanath Ramaswamy, Vice President, Technology, IBM India & South Asia, said “With cyberattacks growing in pace and cost in India, businesses must invest in modern security strategies and solutions to stay resilient. The report shows that security AI and automation had the biggest impact on keeping breach costs down and cutting time off the investigation – and yet a majority of organisations in India still haven’t deployed these technologies.”

Notably, the report, compiled by the Ponemon Institute, also states that AI and automation had the biggest impact in cutting the time taken to identify and contain a breach for 4 in every 10 organisations surveyed. In the Indian landscape, firms with extensive use of AI and automation experienced a data breach lifecycle that was 153 days shorter (225 days versus 378 days) than the rest. It is important to note that 80% of the studied organisations have limited (37%) or no (43%) use of AI.

Last week, the Telecom Regulatory Authority of India (TRAI) recommended a structure for regulating AI through the lens of a risk-based framework in a 141-page document; going by the report’s figures, this would concern only the 20% of companies making extensive use of AI. As part of the framework, the authority also plans an independent statutory body, the Artificial Intelligence and Data Authority of India (AIDAI).


Textbooks Are All You Need: A Revolutionary Approach to AI Training

Image created by Author with Midjourney

Introduction

Researchers are always looking for new and better ways to train artificial intelligence models. A recent paper from Microsoft proposed an interesting approach — using a synthetic textbook to teach the model instead of the massive datasets typically used.

The paper introduces a model called Phi-1 that was trained entirely on a custom-made textbook. The researchers found this was just as effective as much larger models trained on huge piles of data for certain tasks.

The title "Textbooks Are All You Need" is a clever reference to the landmark AI paper "Attention Is All You Need." But here they flip the idea — rather than focusing on the model architecture itself, they show the value of high-quality, curated training data like you'd find in a textbook.

The key insight is that a thoughtful, well-designed dataset can be just as useful as enormous, unfocused piles of data for teaching an AI model. So the researchers put together a synthetic textbook to carefully feed the model the knowledge it needed.

This textbook-based approach is an intriguing new direction for efficiently training AI models to excel at specific tasks. It highlights the importance of training data curation and quality over just brute force data size.

Key Points

  • The Phi-1 model, despite being significantly smaller than models like GPT-3, performs impressively well in Python coding tasks. This demonstrates that size isn't everything when it comes to AI models.
  • The researchers used a synthetic textbook for training, emphasizing the importance of high-quality, well-curated data. This approach could revolutionize how we think about training AI models.
  • The Phi-1 model's performance improved significantly when fine-tuned with synthetic exercises and solutions, indicating that targeted fine-tuning can enhance a model's capabilities beyond the tasks it was specifically trained for.

Discussion

The Phi-1 model, with 1.3 billion parameters, is relatively small compared to models like GPT-3, which has 175 billion parameters. Despite this size difference, Phi-1 demonstrates impressive performance in Python coding tasks. This achievement underscores the idea that the quality of training data can be as important, if not more so, than the size of the model.

The researchers used a synthetic textbook to train the Phi-1 model. This textbook was generated using GPT-3.5 and was composed of Python text and exercises. The use of a synthetic textbook emphasizes the importance of high-quality, well-curated data in training AI models. This approach could potentially shift the focus in AI training from creating larger models to curating better training data.
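The generation step can be pictured as a simple pipeline: cross a set of topics with a set of target audiences, render a prompt asking the model for a short textbook-style passage with worked examples, and collect the completions into a training corpus. A minimal sketch in Python — the topic list, audience list, and prompt wording here are illustrative assumptions, not the paper's actual prompts:

```python
import itertools

# Illustrative seed lists -- the paper's real prompt engineering
# is not public at this level of detail.
TOPICS = ["list comprehensions", "recursion", "file I/O"]
AUDIENCES = ["a beginner", "an intermediate programmer"]

PROMPT_TEMPLATE = (
    "Write a short textbook section on {topic} in Python "
    "for {audience}. Include a code example and an exercise."
)

def build_prompts(topics, audiences):
    """Cross topics with audiences to diversify the synthetic corpus."""
    return [
        PROMPT_TEMPLATE.format(topic=t, audience=a)
        for t, a in itertools.product(topics, audiences)
    ]

prompts = build_prompts(TOPICS, AUDIENCES)
# Each prompt would then be sent to an LLM (the paper used GPT-3.5)
# and the completions concatenated into the training "textbook".
print(len(prompts))  # 3 topics x 2 audiences = 6 prompts
```

Varying the seed attributes this way is one common remedy for the repetitiveness that plagues naive synthetic-data generation.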

Interestingly, the Phi-1 model's performance improved significantly when it was fine-tuned with synthetic exercises and solutions. This improvement was not limited to the tasks it was specifically trained for. For example, the model's ability to use external libraries like pygame improved, even though these libraries were not included in the training data. This suggests that fine-tuning can enhance a model's capabilities beyond the tasks it was specifically trained for.
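For the fine-tuning stage described above, each exercise/solution pair has to be serialized into a single supervised example. A sketch, assuming a function-stub-plus-docstring layout similar to common code-exercise benchmarks (the record shape and field names are illustrative, not the paper's exact format):

```python
def to_training_record(signature, docstring, solution_body):
    """Join an exercise (stub + docstring) with its solution into
    one prompt/completion pair for supervised fine-tuning."""
    prompt = f'{signature}\n    """{docstring}"""\n'
    return {"prompt": prompt, "completion": solution_body}

record = to_training_record(
    "def add(a, b):",
    "Return the sum of a and b.",
    "    return a + b",
)
# The model is shown record["prompt"] and trained to emit
# record["completion"] as the function body.
```

During fine-tuning the loss is typically computed only on the completion tokens, so the model learns to produce solutions conditioned on the exercise rather than to reproduce the exercise itself.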

Research Q&A

Q: How does the Phi-1 model compare to larger models in terms of versatility?

A: The Phi-1 model is specialized in Python coding, which restricts its versatility compared to multi-language models. It also lacks the domain-specific knowledge of larger models, such as programming with specific APIs or using less common packages.

Q: How does the Phi-1 model handle stylistic variations or errors in the prompt?

A: Due to the structured nature of the datasets and the lack of diversity in terms of language and style, the Phi-1 model is less robust to stylistic variations or errors in the prompt. If there's a grammatical mistake in the prompt, the model's performance decreases.

Q: Could the Phi-1 model's performance improve with the use of GPT-4 for generating synthetic data?

A: Yes, the researchers believe that significant gains could be achieved by using GPT-4 to generate synthetic data instead of GPT-3.5. However, GPT-4 is currently slower and more expensive to use.

Q: How does the Phi-1 model's approach to training differ from traditional methods?

A: Traditional methods often focus on increasing the size of the model and the amount of data. In contrast, the Phi-1 model emphasizes the quality of the data and uses a synthetic textbook for training. In doing so, it shifts the emphasis from scaling up models to curating better training data.

Research Takeaways

Microsoft Research's "Textbooks Are All You Need" presents a rather novel idea for training AI models. Instead of just throwing massive piles of data at the model like usual, they created a synthetic textbook to teach the model.

They trained this smaller model called Phi-1 only using this custom textbook, and it worked shockingly well compared to huge models like GPT-3. It shows that you can train a really effective AI with a thoughtfully designed, high-quality dataset, even if it's way smaller.

The key is taking the time to curate great training data, like you'd find in a textbook, instead of just feeding the model terabytes of random, messy data. It's all about the quality, not quantity.

This could change how people think about training AI going forward. Rather than chasing ever-bigger models that need giant datasets, maybe we should focus more on creating the best possible training textbooks, even if they're smaller. It's an intriguing idea that the key is in the textbook, not just in scaling up the model.

Matthew Mayo (@mattmayo13) is a Data Scientist and the Editor-in-Chief of KDnuggets, the seminal online Data Science and Machine Learning resource. His interests lie in natural language processing, algorithm design and optimization, unsupervised learning, neural networks, and automated approaches to machine learning. Matthew holds a Master's degree in computer science and a graduate diploma in data mining. He can be reached at editor1 at kdnuggets[dot]com.


Top 10 Free Specialised Courses by Andrew Ng

Andrew Ng, founder and chief of DeepLearning.AI, is leading the AI education space with meticulously created courses. In addition to offering introductory courses, he provides a range of specialised courses, all of which are available for free.


These specialised courses and programs by DeepLearning.AI, curated by a team of AI experts and led by Andrew Ng, provide valuable opportunities for learners to advance their knowledge and expertise in the rapidly evolving field of AI and deep learning.

Let’s take a look at them.

AI For Everyone

Unlike the other courses by Ng, “AI for Everyone” is a non-technical, introductory course designed to help both business professionals and technical individuals understand AI technologies and their applications. It covers the fundamentals of AI, machine learning, and data science, providing insights into what AI can and cannot do. Participants will learn about the workflow of AI and data science projects, how to choose AI projects and the impact of AI on society. The course aims to equip learners with the knowledge to build a sustainable AI strategy and navigate the challenges brought about by technological change. The course consists of four weeks of content, with a total duration of six hours.

Instructor: Andrew Ng

AI For Good

The AI for Good course, a specialization created in collaboration with Microsoft’s AI for Good Lab, is designed for individuals interested in using AI to address real-world challenges in humanitarian and environmental projects. Participants will learn how to contribute to AI-powered initiatives that create positive change, such as mitigating climate change, supporting disaster response, and improving public health. The course provides a step-by-step framework for utilizing AI in real-world projects and includes hands-on case studies and labs using Python and Jupyter Notebooks. It is suitable for learners from all backgrounds, does not require prior experience in AI or coding, and offers insights from experts working in the field. Upon completion, participants receive a certificate and gain valuable knowledge to contribute to AI for Good initiatives worldwide.

Instructor: Robert Monarch, ML Leader at Apple

Machine Learning Specialization

This is a newly rebuilt and expanded program created by Andrew Ng for beginners seeking to break into the field of AI and machine learning. The specialization consists of three courses, and it provides foundational AI concepts through an intuitive visual approach, followed by hands-on coding and an introduction to the underlying math. The course is designed to be approachable for complete beginners, requiring no prior math knowledge or rigorous coding background. It covers topics such as linear regression, logistic regression, neural networks, decision trees, recommender systems, and more. The updated curriculum uses Python instead of Octave, and the section on applying machine learning has been enhanced with best practices from the last decade.

Instructor: Andrew Ng

Mathematics for Machine Learning and Data Science Specialization

The “Mathematics for Machine Learning and Data Science Specialisation” is a beginner-friendly course that equips learners with a solid understanding of essential mathematical concepts used in machine learning. The course covers calculus, linear algebra, statistics, and probability, providing students with the tools to comprehend algorithms and optimize them for custom implementation. By enrolling in this specialization, participants will gain statistical techniques to enhance data analysis and acquire skills highly sought after by employers, helping them excel in machine learning interviews. The course features a team of instructors with expertise in the field, and it is designed for individuals with a high-school level of mathematics knowledge.

Instructor: Luis Serrano, Founder, Serrano Academy

TensorFlow: Data and Deployment Specialization

The “TensorFlow: Data and Deployment Specialization” is a four-course intermediate-level program that spans four months, with a recommended commitment of three hours per week. The specialization aims to teach participants how to deploy machine learning models for various devices and platforms using TensorFlow. It covers topics such as running models in web browsers using TensorFlow.js, deploying models on mobile devices using TensorFlow Lite, data pipelines with TensorFlow Data Services, and advanced deployment scenarios with TensorFlow Serving. The courses include practical exercises and projects, and participants will learn to handle data, work with APIs, and use pre-trained models effectively.

Instructor: Laurence Moroney, Lead AI Advocate at Google.

Generative Adversarial Networks (GANs) Specialization

The GANs Specialization is a three-course intermediate-level program that focuses on image generation using GANs. Students will learn to create basic GANs using PyTorch, advanced DCGANs with convolutional layers, and conditional GANs. The courses cover comparing generative models, using the FID method to assess GAN fidelity and diversity, detecting bias in GANs, and implementing StyleGAN techniques. Additionally, participants will explore GANs applications for data augmentation and privacy preservation, as well as building Pix2Pix and CycleGAN for image translation. The program also addresses social implications of GANs, such as bias in machine learning and methods to detect it. Throughout the courses, learners will develop skills in areas like generator design, image-to-image translation, and understanding computer graphics terminology, among others. The specialization aims to provide a comprehensive understanding of GANs and offers practical hands-on experience.

Instructor: Sharon Zhou, CEO, Co-founder, Lamini

AI for Medicine

The course focuses on practical applications of ML in the field of medicine. Participants will learn to estimate treatment effects using data from randomized control trials, interpret diagnostic and prognostic models, and extract information from unstructured medical data using natural language processing. The skills acquired include model interpretation, image segmentation, natural language extraction, machine learning, time-to-event modeling, deep learning, model evaluation, multi-class classification, random forest, model tuning, and treatment effect estimation. The course will also explore various AI-driven medical applications, such as diagnosing diseases from x-rays and 3D MRI brain images, predicting patient survival rates with tree-based models, and automating the labeling of medical datasets through natural language processing.

Instructor: Pranav Rajpurkar, Assistant Professor and Director, Harvard-Stanford Medical AI Bootcamp

Generative AI with Large Language Models (LLMs)

This course, developed in collaboration with AWS, focuses on teaching the foundational principles and practical applications of generative AI in real-world scenarios. It covers the entire lifecycle of LLM-based generative AI, starting from data gathering and model selection to deployment and performance evaluation. Participants will gain a functional understanding of LLMs and the transformer architecture that powers them, along with the ability to fine-tune models for specific use cases. The course also explores cutting-edge research in generative AI and offers hands-on training, tuning, and deployment methods to optimize model performance.

Instructors: Antje Barth and Mike Chambers, Developer Advocates, Gen AI, AWS; Chris Fregly and Shelbee Eigenbrode, Principal Solutions Architects, Gen AI, AWS.

MLOps Specialization

The MLOps Specialization is an advanced 4-month course that equips students with production-ready ML skills. It covers tools, techniques, and experiences to build and maintain integrated systems operating continuously in production, handling evolving data efficiently. The four courses include topics like ML production system design, concept drift, data pipelines, feature engineering with TensorFlow Extended, and model resource management.

Instructors: Andrew Ng; Robert Crowe, TensorFlow Developer Engineer, Google; Laurence Moroney, Lead AI Advocate, Google.

Practical Data Science on the AWS Cloud (PDS) Specialization

This is another advanced course, one that equips data-focused developers, scientists, and analysts with the skills needed to deploy scalable ML pipelines using Amazon SageMaker in the AWS cloud. The specialization covers various topics, including data preparation, feature engineering, automated machine learning (AutoML), model training and evaluation, ML pipelines, artifact and lineage tracking, and human-in-the-loop pipelines. Participants will gain hands-on experience with algorithms like BERT and FastText for natural language processing (NLP) using Amazon SageMaker. By the end of the program, learners will be able to build and deploy end-to-end ML pipelines, optimize model performance, and reduce costs while improving data products.

Instructors: Antje Barth, Developer Advocate, Gen AI, AWS; Chris Fregly and Shelbee Eigenbrode, Principal Solutions Architects, Gen AI, AWS; Sireesha Muppala, Principal Solutions Architect, AI and ML, AWS.

Read more: Top 7 Generative AI Courses by Andrew Ng

The post Top 10 Free Specialised Courses by Andrew Ng appeared first on Analytics India Magazine.

Independent Ada Lovelace Institute Asks UK Government to Firm up AI Regulation Proposals


Attempts to create standards and regulations for the way generative AI intersects with many aspects of society are underway across the world. For instance, in March, the U.K. government released a white paper promoting the country as a place to “turbocharge growth” in AI. According to the white paper, 500,000 people in the U.K. are employed in the AI industry, and AI contributed £3.7 billion ($4.75 billion) to the national economy in 2022.

In response, on July 18, the independent research body Ada Lovelace Institute, in a lengthy report, called for a more “robust domestic policy” in order to regulate AI through legislation that clarifies and organizes the U.K.’s effort to promote AI as an industry.

Lovelace Institute cautions government

“The UK’s diffuse legal and regulatory network for AI currently has significant gaps. Clearer rights and new institutions are needed to ensure that safeguards extend across the economy,” Matt Davies and Michael Birtwistle of the Ada Lovelace Institute wrote.

Both the government and the institute are essentially calling for more clarity around AI regulation, but the U.K. government is focusing on being “pro-innovation,” while the Ada Lovelace Institute promotes an emphasis on oversight. The U.K. government is also working on gradually shifting away from the GDPR as part of post-Brexit reshuffling.

What are the Lovelace Institute’s recommendations?

The Ada Lovelace Institute’s recommendations include:

  • Taking another look at the U.K.’s adoption of GDPR and the proposed Data Protection and Digital Information Bill, which could replace GDPR in the country.
  • Publishing a statement of citizens’ rights and protections as related to AI.
  • Clarifying laws and creating new government positions around AI.
  • Supporting the development of standards.
  • Establishing funds and government support for consumer groups, trade unions and advisory organizations that might want to hold AI makers accountable.

Meanwhile, the U.K. prefers to let existing governmental bodies decide how to handle AI on a case-by-case basis. Specifically, the white paper recommends the Health and Safety Executive, Equality and Human Rights Commission and Competition and Markets Authority work on their own “context-specific approaches” to generative AI.

The art of balancing regulation and innovation

Gerald Kierce Iturrioz, co-founder and chief executive officer at AI governance management platform Trustible, said his organization agrees with many of the Ada Lovelace Institute’s recommendations.

Governments that want to be pro-innovation should “clarify the legal gray areas such as use of data for training, how bias and fairness should be evaluated, and what the burden of proof standards should be,” he said in an email to TechRepublic.

“The U.K. must swiftly establish guardrails to ensure that AI systems are developed and used responsibly within the public sector,” Iturrioz said.

If the government doesn’t establish guardrails, more risks could arise. For example, Iturrioz pointed out the use of automated facial recognition by the U.K. police, which a human rights study from the University of Cambridge last year found to be ethically and legally dubious.

U.K. stands in contrast to EU security concerns

The U.K.’s relatively laissez-faire approach stands in contrast to the European Union’s focus on regulation. The EU is working on an AI draft law for a risk-based approach that focuses on reducing bias, coercion or biometric identification such as automated facial recognition. In June, the European Parliament approved draft legislation for the AI Act, which establishes guidelines for the use of AI and forbids some uses, including real-time facial recognition in public places.

Representatives from countries across the world and from many of the leading AI makers presented similar concerns at the first United Nations Security Council meeting on the topic.

“The U.K. seems to be waiting to see how implementation and reception of the EU’s AI Act should influence their approach towards AI regulations,” said Iturrioz. “While this makes sense on the surface, there are risks to sitting back while others move ahead on AI regulation.”


Salesforce Takes a Giant Leap with Generative AI

When it comes to companies heavily investing in the capabilities and advantages of generative AI, Salesforce undoubtedly takes the lead. Marc Benioff, co-founder and CEO of Salesforce, even said that generative AI could be the most important technology of any lifetime. In March, the San Francisco-based software company announced the world’s first generative AI for CRM, called Einstein GPT.

Subsequently, Salesforce has unveiled an array of generative AI capabilities, such as Slack GPT and Tableau GPT. Through its venture wing, with a USD 500 million budget, Salesforce is also investing in a host of AI startups. Furthermore, the CRM leader has introduced its own large language model (LLM), known as XGen-7B.

“Earlier this year, we began infusing generative AI across our entire platform with Einstein GPT, which brings generative AI across every part of Salesforce. In June, we announced AI Cloud, a suite of capabilities optimised for delivering real-time generative experiences across all applications and workflows,” Deepak Pargaonkar, vice president, solution engineering, Salesforce India, told AIM.

“We have a robust roadmap for lighting up generative AI capabilities across every Salesforce product, as well as delivering low-code generative app builders that make it easy to extend what AI Cloud provides out of the box with custom AI – built by yourself or, by bringing your own model from a partnering ML platform – to further enhance the productivity of your teams,” he added.

AI cloud

To bring generative AI capabilities to its customers, Salesforce announced AI Cloud, a suite of products built for enterprise CRM through which customers can boost their productivity across all of Salesforce’s AI applications. “AI Cloud is enterprise-ready. We’re using it to deliver AI-generated content across every line of business. Think of any contact centre agent – they can now use AI Cloud to generate relevant replies to chat conversations, grounded on sensitive data, integrated into an end-to-end workflow, and do all of this without fear of that case data leaving Salesforce’s trust boundary.”

AI Cloud empowers sales reps to auto-generate personalised emails for customers, while service teams produce customised agent chat replies and case summaries. Marketers utilise auto-generated personalised content to engage customers through various channels. Commerce teams access auto-generated insights and recommendations for tailored commerce experiences. Additionally, developers leverage AI to auto-generate code, predict potential bugs, and suggest fixes.

“Moreover, LLMs can assist developers in writing code more efficiently and allowing them to focus on creativity and other business aspects, ultimately enhancing the organisation’s time to deploy. This improvement in technology’s time to deployment leads to enhanced customer experience and business value, addressing key areas of focus,” Pargaonkar said.

Bring your own models

Another important feature that Salesforce provides is letting its customers leverage any generative AI models they prefer. Salesforce lets customers leverage LLMs developed by Salesforce AI Research, and LLMs from AI firms such as Anthropic, Cohere AI through AWS.

Moreover, customers can also bring in their own domain-specific models, according to Pargaonkar. These models, whether deployed on Amazon SageMaker or Google’s Vertex AI, will directly connect to AI Cloud through the Einstein GPT Trust Layer.

However, according to recent Salesforce research, 73% of employees are concerned about the new security risks introduced by generative AI. Moreover, almost 60% of those who intend to use generative AI are unsure about how to ensure data security.

Pargaonkar says the ‘Einstein GPT Trust Layer’ helps tackle this apprehension from customers, as it prevents LLMs from retaining sensitive customer data. “We’ve built a new Einstein GPT Trust Layer that will become the industry standard for trusted AI in the enterprise, enabling teams to use LLMs with their customer data, but without compromising on data privacy and security.

“Moreover, Salesforce is used by numerous customers globally, including several in India, to effectively manage their customer data. We prioritise maintaining the utmost sincerity in keeping this data secure and not sharing it externally. Since we operate with data from your CRM, it is crucial to respect customer preferences and regulatory requirements regarding Personally Identifiable Information (PII). If customers or policies prohibit sharing or leveraging such data, we strictly adhere to those guidelines.”

Salesforce in India

Salesforce has been in India for many years, catering to the CRM needs of many enterprises in the country. Now, however, Salesforce has its eyes set on Micro, Small and Medium Enterprises (MSMEs), which have been growing significantly in the last few years and are adopting technologies at a much more rapid pace.

“The MSME market in India is extremely important to Salesforce because these are high-growth organisations with global ambitions and they seek technological transformation, as it is through technology that they can effectively scale and navigate the challenges of large-scale operations,” Pargaonkar said.

To tap into this segment, Salesforce recently introduced ‘Salesforce Starter’, a user-friendly CRM solution that combines sales, service, and email outreach tools in a single suite. This offering empowers companies to kickstart their operations, enhance customer experiences, cut costs, and boost revenue.

The post Salesforce Takes a Giant Leap with Generative AI appeared first on Analytics India Magazine.