Copilot Workspace is GitHub’s take on AI-powered software engineering

Kyle Wiggers

Is the future of software development an AI-powered IDE? GitHub’s floating the idea.

At its annual GitHub Universe conference in San Francisco on Monday, GitHub announced Copilot Workspace, a dev environment that taps what GitHub describes as “Copilot-powered agents” to help developers brainstorm, plan, build, test and run code in natural language.

Jonathan Carter, head of GitHub Next, GitHub’s software R&D team, pitches Workspace as somewhat of an evolution of GitHub’s AI-powered coding assistant Copilot into a more general tool, building on recently introduced capabilities like Copilot Chat, which lets developers ask questions about code in natural language.

“Through research, we found that, for many tasks, the biggest point of friction for developers was in getting started, and in particular knowing how to approach a [coding] problem, knowing which files to edit and knowing how to consider multiple solutions and their trade-offs,” Carter said. “So we wanted to build an AI assistant that could meet developers at the inception of an idea or task, reduce the activation energy needed to begin and then collaborate with them on making the necessary edits across the entire codebase.”

At last count, Copilot had more than 1.8 million paying individual customers and 50,000 enterprise customers. But Carter envisions a far larger base, drawn in by feature expansions with broad appeal, like Workspace.

“Since developers spend a lot of their time working on [coding issues], we believe we can help empower developers every day through a ‘thought partnership’ with AI,” Carter said. “You can think of Copilot Workspace as a companion experience and dev environment that complements existing tools and workflows and enables simplifying a class of developer tasks … We believe there’s a lot of value that can be delivered in an AI-native developer environment that isn’t constrained by existing workflows.”

There’s certainly internal pressure to make Copilot profitable.

Copilot loses an average of $20 a month per user, according to a Wall Street Journal report, with some customers costing GitHub as much as $80 a month. And the number of rival services continues to grow. There’s Amazon’s CodeWhisperer, which the company made free to individual developers late last year. There are also startups, like Magic, Tabnine, Codegen and Laredo.

Given a GitHub repo or a specific bug within a repo, Workspace — underpinned by OpenAI’s GPT-4 Turbo model — can build a plan to (attempt to) squash the bug or implement a new feature, drawing on an understanding of the repo’s comments, issue replies and larger codebase. Developers get suggested code for the bug fix or new feature, along with a list of the things they need to validate and test that code, plus controls to edit, save, refactor or undo it.

GitHub Workspace

Image Credits: GitHub

The suggested code can be run directly in Workspace and shared among team members via an external link. Those team members, once in Workspace, can refine and tinker with the code as they see fit.

Perhaps the most obvious way to launch Workspace is from the new “Open in Workspace” button to the left of issues and pull requests in GitHub repos. Clicking on it opens a field to describe the software engineering task to be completed in natural language, like, “Add documentation for the changes in this pull request,” which, once submitted, gets added to a list of “sessions” within the new dedicated Workspace view.

GitHub Workspace

Image Credits: GitHub

Workspace executes requests systematically step by step, creating a specification, generating a plan and then implementing that plan. Developers can dive into any of these steps to get a granular view of the suggested code and changes and delete, re-run or re-order the steps as necessary.

“If you ask any developer where they tend to get stuck with a new project, you’ll often hear them say it’s knowing where to start,” Carter said. “Copilot Workspace lifts that burden and gives developers a plan to start iterating from.”

GitHub Workspace

Image Credits: GitHub

Workspace enters technical preview on Monday, optimized for a range of devices, including mobile.

Importantly, because it’s in preview, Workspace isn’t covered by GitHub’s IP indemnification policy, which promises to assist with the legal fees of customers facing third-party claims alleging that the AI-generated code they’re using infringes on IP. (Generative AI models notoriously regurgitate their training datasets, and GPT-4 Turbo was trained partly on copyrighted code.)

GitHub says that it hasn’t determined how it’s going to productize Workspace, but that it’ll use the preview to “learn more about the value it delivers and how developers use it.”

I think the more important question is: Will Workspace fix the existential issues surrounding Copilot and other AI-powered coding tools?

An analysis by GitClear, the developer of the code analysis tool of the same name, of over 150 million lines of code committed to project repos over the past several years found that Copilot was resulting in more mistaken code being pushed to codebases and more code being re-added rather than reused and streamlined, creating headaches for code maintainers.

Elsewhere, security researchers have warned that Copilot and similar tools can amplify existing bugs and security issues in software projects. And Stanford researchers have found that developers who accept suggestions from AI-powered coding assistants tend to produce less secure code. (GitHub stressed to me that it uses an AI-based vulnerability prevention system to try to block insecure code in addition to an optional code duplication filter to detect regurgitations of public code.)

Yet devs aren’t shying away from AI.

In a StackOverflow poll from June 2023, 44% of developers said that they use AI tools in their development process now, and 26% plan to soon. Gartner predicts that 75% of enterprise software engineers will employ AI code assistants by 2028.

By emphasizing human review, perhaps Workspace can indeed help clean up some of the mess introduced by AI-generated code. We’ll find out soon enough as Workspace makes its way into developers’ hands.

“Our primary goal with Copilot Workspace is to leverage AI to reduce complexity so developers can express their creativity and explore more freely,” Carter said. “We truly believe the combination of human plus AI is always going to be superior to one or the other alone, and that’s what we’re betting on with Copilot Workspace.”

As GitHub Begins Technical Preview of Copilot Workspace, an Engineer Answers How it Differs from Devin


At GitHub Universe 2023, CEO Thomas Dohmke introduced the world to GitHub Copilot Workspace, which he believes can reimagine the very nature of the developer experience itself.

Within Copilot Workspace, developers can ideate, plan, develop, test, and execute code using natural language. Introduced in 2022, GitHub Copilot has emerged as the world’s most popular AI developer tool.

The Microsoft-owned company now anticipates Copilot Workspace to be the next evolutionary step.

“There are various ways in which Copilot Workspace can help a developer throughout the software development journey. One of the most important benefits is its ability to help developers get started on a task,” Jonathan Carter, head of GitHub Next, told AIM in an exclusive interview.

Research conducted by GitHub indicates that initiating a project is frequently one of the most daunting aspects of software development.

“Particularly deciding how to approach a task, which files to look through, and how to consider the pros and cons of various solutions. Copilot Workspace reduces that cognitive burden by meeting developers where a new task often begins–a GitHub issue–and synthesising all of the information in that issue to inform a sequenced plan for developers to iterate through,” Carter said.

GitHub Workspace vs Devin

Earlier this year Cognition Labs announced Devin, dubbed as the world’s first AI software engineer.

Upon its announcement, Devin had the developer community talking, as it effectively cleared multiple engineering interviews at top AI companies and also fulfilled actual tasks on the freelance platform Upwork.

However, Carter believes there are fundamental differences between Copilot Workspace and Devin, even though they both are designed to solve similar problems. “At a high-level, Devin and Copilot Workspace are working towards similar goals–reimagining the developer environment as an AI-native workflow.

“That said, we don’t view GitHub Copilot Workplace as an ‘AI engineer’; we view it as an AI assistant to help developers be more productive and happier,” Carter said.

The biggest differentiator between the two AI tools is that Devin includes a build/test/run agent that attempts to self-repair errors.

“We initially built a similar capability, which you can see in the demo we gave at GitHub Universe in November, but ultimately decided to scope it out for the technical preview to focus on optimising the core user experience,” Carter pointed out.

“Our research has shown that developers value sequenced functions for AI-assistance, and we want to ensure Copilot Workspace meets developers’ needs to build confidence and trust in the tool before we invest in new features, including productising our build/run/test agent,” he said.

Opening GitHub Workspace for technical preview

As with Devin, developers will now get early access to test the newest AI tool for software development: starting today, GitHub begins the technical preview of GitHub Copilot Workspace.

“We’re looking forward to getting Copilot Workspace into the hands of a diverse group of developers so we can understand where they’re getting the most value from it today and where we can make adjustments to make it even more valuable in the future,” Carter said.

Developers who had access to Devin spoke highly of the AI tool. Yet Devin landed in troubled waters after a software developer claimed, in a YouTube video, that the demo video Cognition Labs released earlier this year was staged.

Although the startup provided some clarification, assessing the pros and cons of an AI tool is challenging until it undergoes extensive testing. With the technical preview, GitHub aims to accomplish precisely that.

“It’s hard to say what limitations developers will find with Copilot Workspace until they use it at scale, and that’s exactly why we do technical previews.”

Mobile compatibility, an advantage?

Copilot Workspace encourages exploration by allowing developers to edit, regenerate, or undo every part of their plan as they iterate to find the exact solution they need.

It also increases their confidence by providing developers with integrated tools to test and validate that the AI-generated code performs as expected.

Copilot Workspace also “boosts collaboration with automatic saved versions and context of previous changes so developers can immediately pick up where their teammate left off,” Carter noted.

The tool is also mobile-compatible, which GitHub believes is a huge advantage for developers.

“Personally, I love taking walks in between meetings and I often find myself thinking through a new idea while I’m on-the-go. With Copilot Workspace on mobile, I can easily create a plan for bringing that idea to life, and even test and implement it, all from my phone.

“We’re also excited about Copilot Workspace on mobile because it allows developers to collaborate from wherever they may be. If a colleague sends me a link to their Workspace, I can explore and review it from my phone just as easily as I could from my computer,” Carter added.

Making developers efficient

Carter expects Copilot Workspace to provide immediate improvements for developers in terms of efficiency when “you consider how long it typically takes to read through an issue, explore related files, and put together an implementation plan without Copilot Workspace. What has historically taken hours can now be done in seconds.”

However, he considers the productivity improvements to be an incidental outcome of the broader advantages of Copilot Workspace in terms of enhancing clarity of thought, fostering exploration, and boosting confidence.

“For example, on my team at GitHub Next, I have front-end developers doing back-end work with the help of Copilot Workspace and vice-versa.

“Being able to tackle projects outside of your specific area of expertise confidently is a huge benefit, and when you think of doing this at scale you can imagine how much more productive developer teams can be,” Carter said.

The post As GitHub Begins Technical Preview of Copilot Workspace, an Engineer Answers How it Differs from Devin appeared first on Analytics India Magazine.

Advance Your Tech Career with These 3 Popular Certificates

Image by Author

As the tech sector continues to advance and grow at a rapid rate, there is no better time than now to advance your career.

The unfortunate thing about the tech industry is that many layoffs are currently happening, whether due to economic conditions or the growing use of AI in day-to-day work. The best thing you can do for yourself in times like this is to make yourself more appealing to the market.

How do you do this?

Advance your career by learning new skills and getting more experience under your belt.

You want to be a professional whose skill set is in high demand but short supply.

So let’s dive into certifications you can take to advance your career.

Business Intelligence

Link: Google Business Intelligence Professional Certificate

The job market for business intelligence analysts is projected to grow by 23% from 2021 to 2031. As the tech industry becomes more competitive, companies big and small are looking for new ways to spend more efficiently and increase their return on investment. Incoming business intelligence analysts bring data analysis skills that focus on user experiences and the end customer.

In this certification offered by Google, you will learn about the roles and responsibilities of business intelligence professionals, then practise data modelling using processes such as extract, transform, and load (ETL) to help you meet an organisation’s needs.

Take your findings and create data visualisations to help answer business questions as well as a dashboard that will help you communicate your data insights to stakeholders.
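The ETL process mentioned above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (the table, field names, and values are all invented, not part of the certificate's material):

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a CSV source (an in-memory string here;
# in practice this would be a file, an API, or a database export).
raw = io.StringIO("order_id,amount\n1,19.99\n2,\n3,5.50\n")
rows = list(csv.DictReader(raw))

# Transform: drop rows with missing amounts and convert types.
clean = [
    {"order_id": int(r["order_id"]), "amount": float(r["amount"])}
    for r in rows
    if r["amount"]
]

# Load: write the cleaned rows into a target table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (:order_id, :amount)", clean)

total = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(round(total, 2))  # 25.49
```

Real pipelines swap each stage for heavier tooling (warehouse connectors, scheduling, validation), but the extract-transform-load shape stays the same.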

AWS Cloud Solution Architect

Link: AWS Cloud Solutions Architect Professional Certificate

As more businesses move to the cloud, the demand for AWS solution architects continues to grow. AWS currently commands 33% of the IaaS market, and solution architects earn an average salary of $100,000 a year to meet the demand.

In this certification offered by AWS itself, you will learn how to make informed decisions about when and how to apply key AWS Services for computing, storage, database, networking, monitoring, and security.

You will then dive a little deeper into the design of architectural solutions, operational excellence, and addressing common business challenges. It doesn’t stop there: you will also learn how to create and operate a data lake in a secure and scalable way, as well as how to optimize performance and costs.

Azure Developer

Link: Microsoft Azure Developer Associate (AZ-204) Professional Certificate

With more and more organisations heavily reliant on machine learning and artificial intelligence, Azure professionals are in high demand to meet organisations’ cloud requirements.

In this certification offered by Microsoft, you will go through all phases of cloud development, from requirements definition and design; to development, deployment, and maintenance; to performance tuning and monitoring. The course gives developers a good understanding of how to create end-to-end solutions in Microsoft Azure.

You will learn how to implement solutions, manage web apps, and build authentication and authorisation. Consisting of 8 courses, this certification will help you prepare for Exam AZ-204: Developing Solutions for Microsoft Azure.

Wrapping it Up

To remain competitive in a market that grows more crowded by the day, look at what organisations currently need and how you can fulfill those needs. That is why I presented only these 3 certifications: they reflect what the current market is looking for.

If you have any other certifications you would recommend to the community, drop them in the comments below!

Nisha Arya is a data scientist, freelance technical writer, and an editor and community manager for KDnuggets. She is particularly interested in providing data science career advice or tutorials and theory-based knowledge around data science. Nisha covers a wide range of topics and wishes to explore the different ways artificial intelligence can benefit the longevity of human life. A keen learner, Nisha seeks to broaden her tech knowledge and writing skills, while helping guide others.

More On This Topic

  • Advance your data science career to the next level
  • 5 Data Science Communities to Advance Your Career
  • Advance your Career with the 3rd Best Online Master's in Data…
  • 5 AI Courses From Google to Advance Your Career
  • Popular Google Certification for All Areas in the Tech Industry
  • 7 Free Harvard University Courses to Advance Your Skills

Why is AI different? It Can Guide Our Societal Aspirations


Traditional analytics optimize based on existing data, reflecting past realities, limitations, and biases. In contrast, AI focuses on future aspirations, identifying the learning needed to achieve aspirational outcomes and guiding your evolution toward these outcomes.

When I talk to my students, the question I keep getting is, “Is AI really different from traditional analytics?” My answer is yes. It’s different because it can learn and adapt to operating in incredibly complex, dynamic situations with minimal human intervention.

Traditional analytics are reactive and focus on analyzing past or current data to optimize decisions within pre-existing patterns. On the other hand, AI analytics are predictive and prescriptive, with a learning and adaptive component. They can not only optimize based on past data but can also learn from new data to achieve more aspirational outcomes in the future.

To build AI models that help us realize our full potential, we need a strategic vision that develops analytics capabilities to optimize current operations, drive transformation and innovation, and expand the metrics upon which we measure contributions to society and humankind. This requires us to envision a more aspirational future considering our desired societal outcomes and the metrics we will use to measure society’s progress toward those outcomes (see Figure 1).


Figure 1: Broaden How Society Will Measure AI Success

For example, integrating considerations like ESG (Environmental, Social, and Governance) into AI analytics reflects a broader trend where businesses align their operations with societal values and long-term sustainability. AI can enhance this by identifying trends, risks, and opportunities in ESG criteria, thus guiding corporations towards sustainable practices and reporting.

This requires a shift in mindset: moving from productivity-focused metrics (doing things faster or cheaper) to value creation is critical in this transition. Our goal should be to generate economic value, leverage data as an asset, and focus on outcomes that benefit customers, shareholders, and society. AI analytics can drive this shift by enabling deeper insights and facilitating innovation that traditional methods may overlook.

Yes, AI can help us craft the future we want. But to accomplish that, we must build AI models that reflect what we want to become, not just what we are today.

Importance of AI Utility Function

The AI Utility Function is a distinct feature of AI analytics that sets it apart from traditional analytics. Its feedback mechanism is the key to continuous learning and adaptation: it steers the AI model’s learning and decision-making processes by defining optimal outcomes and their associated economic value and weights. It enhances adaptive learning and decision automation and facilitates the ongoing improvement of model performance (see Figure 2).


Figure 2: AI Utility Function and Feedback Mechanism

The AI Utility Function feedback mechanism enables AI models to learn and adapt without human intervention, distinguishing AI analytics from traditional analytics (see Table 1).

Adaptive Learning Capabilities
Traditional Analytics: Typically lacks a mechanism to autonomously adapt or optimize based on changing data inputs without human intervention.
AI Analytics (AI Utility Function): The AI utility function in AI models (especially in reinforcement learning) guides the system to learn from new data and experiences to maximize its defined utility, inherently supporting ongoing adaptation.

Complexity of Models
Traditional Analytics: Traditional models are often straightforward.
AI Analytics (AI Utility Function): The AI utility function can engage in complex decision-making that involves balancing multiple objectives, sometimes in conflicting scenarios. This allows AI models to handle complex, real-world problems with multiple variables and potential outcomes.

Predictive vs. Reactive Analytics
Traditional Analytics: Generally reactive, focusing on insights from historical data without a mechanism to project future states beyond simple forecasts.
AI Analytics (AI Utility Function): Uses AI utility functions not only to predict outcomes but also to proactively suggest changes or actions that will optimize future results based on those predictions.

Decision Automation
Traditional Analytics: Provides data-driven insights that require human interpretation for decision implementation.
AI Analytics (AI Utility Function): Defines outcomes and weights the system can act on directly, supporting automated decision-making.

Scalability and Efficiency
Traditional Analytics: Scalability can be limited by the computational inefficiencies of older models and the need for human-centric, manual tuning.
AI Analytics (AI Utility Function): Provides a mechanism to autonomously adapt or optimize based on changing data inputs without human intervention.

Scope of Data Utilization
Traditional Analytics: Techniques are usually limited to structured data in rigid databases and relational structures, and are often constrained by the scope of the data they are designed to analyze.
AI Analytics (AI Utility Function): An AI utility function enables AI systems to derive actionable insights from varied data types and sources by defining what ‘useful’ outcomes look like across these diverse inputs, enhancing the system’s ability to work with unstructured and semi-structured data.
Table 1: Difference Between Traditional and AI Analytics

In the broader strategic context, the AI Utility Function enables systems to align closely with organizational goals. It essentially translates strategic objectives into computational objectives that AI systems can understand and act upon. This ensures that the operational activities driven by AI analytics are in harmony with the organization’s long-term strategic goals, a critical advantage when deploying AI in dynamic markets or environments where strategic agility is essential.
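To make the utility-function idea concrete, here is a minimal sketch in Python. The objectives, weights, and candidate actions are invented for illustration; a real system would learn the predicted outcomes and update them from observed feedback rather than hard-code them:

```python
# A toy "utility function": weighted objectives the system tries to maximize.
# Objective names and weights are hypothetical, chosen only for illustration.
WEIGHTS = {"revenue": 0.5, "customer_satisfaction": 0.3, "esg_impact": 0.2}

def utility(outcome: dict) -> float:
    """Score an outcome (each metric normalized to 0..1) as a weighted sum."""
    return sum(WEIGHTS[k] * outcome.get(k, 0.0) for k in WEIGHTS)

# Candidate actions with their (predicted) outcomes.
candidates = {
    "discount_campaign": {
        "revenue": 0.9, "customer_satisfaction": 0.6, "esg_impact": 0.2,
    },
    "sustainable_packaging": {
        "revenue": 0.5, "customer_satisfaction": 0.7, "esg_impact": 0.9,
    },
}

# Decision step: choose the action that maximizes utility. In a full
# feedback loop, observed results would then refine the predictions.
best = max(candidates, key=lambda a: utility(candidates[a]))
print(best)  # discount_campaign
```

Note how the weights encode strategy: raising the `esg_impact` weight flips the choice toward `sustainable_packaging`, which is exactly the sense in which the utility function translates strategic objectives into computational ones.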

Summary: Importance of Societal Data & AI Literacy

Widespread education on data literacy and ethical AI usage is vital for increasing societal engagement with AI and data. There is a need for a comprehensive pathway to develop basic and advanced AI and data skills, emphasizing the importance of fostering a data-driven culture that extends beyond individual organizations to society. The democratization of data knowledge empowers citizens to utilize AI insights responsibly in their decision-making processes, promoting a more informed and proactive public.

I do have a suggestion… (Figure 3)


Figure 3: “AI & Data Literacy: Empowering Citizens of Data Science”

“AI & Data Literacy: Empowering Citizens of Data Science” proposes the following foundational framework:

  • Societal Impact and Strategic Implementation: Integrate AI and data literacy into the core functions of societal structures, ensuring that these educational efforts are aligned with and actively support civic and social objectives.
  • Ethical Frameworks: Guide how societal stakeholders can integrate ethical considerations into AI deployments to establish trust and integrity as AI technologies increasingly incorporate into daily life.
  • Practical Applications for Wider Adoption: Illustrate how AI and data literacy can be effectively applied across various sectors of society through a series of case studies and real-life scenarios.
  • Promoting Continuous Education: Emphasize the importance of ongoing education and skill development in AI and data-related fields, advocating for continuous learning initiatives that keep pace with technological advancements.
  • Leadership in Societal Transformation: Directly engage leaders and policymakers, providing insights on leading societal change initiatives around data literacy to ensure widespread adoption and success.
  • Fostering Community and Collaborative Learning: Promote the creation of collaborative communities around AI and data science, encouraging shared learning and collective problem-solving that extends across societal boundaries.

Through these elements, the book “AI & Data Literacy: Empowering Citizens of Data Science” educates and inspires societal change, positioning AI and data literacy as pivotal to informed citizenship and proactive participation in a data-driven future. The book acts as a catalyst and a guide for integrating AI into societal frameworks, driving towards a more informed, ethical, and collaborative future.


10 Free Online AI Courses to Learn from the Best 

With AI enjoying unprecedented prominence across the globe, the need for material on how it exactly works has shot up remarkably. The good news is, the access to such courses has never been more open.

Several universities have shot to the top of the leaderboard in terms of offering courses in AI and data science. Nearly 75 universities figured in the 2024 QS World University Rankings for data science and artificial intelligence, compared to barely 20 in 2023.

This spells an increased interest not only in learning about AI but also in teaching it. But what does this spell for you?

Even if you’re already taking a course and are interested in further widening your horizons, there is an endless supply of online courses that you could take to upskill yourself. This would prove helpful especially with most jobs moving towards using AI in their daily functioning.

However, finding the perfect course that is both informative and affordable may be difficult. So, here is a rundown of some of the best courses on AI being offered for free right now.

Massachusetts Institute of Technology

MIT has made available Patrick Winston’s 6.034 Artificial Intelligence course on its website. The course runs through the basics of knowledge representation, problem-solving and learning methods for AI.

It includes lectures from Prof Winston, as well as access to all assignments, examinations, readings, tutorials and demonstrations needed to complete the course. The course itself is self-paced and completely free.

Learn more about it here.

University of California, Davis

UCD is currently offering a course on ‘Big Data, Artificial Intelligence, and Ethics’ through Coursera. The course goes through opportunities available in big data, and how exactly AI works. It also advertises opportunities to interact with IBM Watson and a focus on understanding natural language processing.

Learn more about the course here.

Harvard University

Harvard offers several free courses on artificial intelligence, ranging from the basics of AI to its implications for business and policy. There are a total of seven courses available, with courses on data science, machine learning, Python and even the fundamentals of TinyML.

Learn more about the courses here.

Stanford University

Stanford University Online offers a course titled ‘Machine Learning Specialisation’ from the Stanford School of Engineering. The self-paced course is being offered through Coursera, where interested applicants can learn about all things ML from Andrew Ng.

The course includes modules on multiple linear regression, logistic regression, neural networks, and clustering, among others.

Learn more about the course here.

University of Washington

The University of Washington is offering a course on ‘Machine Learning Foundations: A Case Study Approach’ through Coursera. The self-paced 18-hour course covers machine learning and deep learning concepts, as well as a rundown on Python programming.

Learn more about the course here.

Georgia Institute of Technology

Georgia Tech offers a short free course on artificial intelligence. Taught by Thad Starner, famous for his work on wearable computing, the two-hour course goes through the fundamentals of classical search, machine learning, pattern learning and probability. The course is currently being offered through Udacity.

Learn more about the course here.

If you’d prefer learning from the big leaguers themselves, several big-tech companies also offer free courses in the fundamentals of AI and machine learning.

Google

Google maintains a short ‘Google AI for Anyone’ course. The two-hour-long self-paced course covers the fundamentals of AI, machine learning and deep learning, and how they relate to one another.

It also takes the student through the understanding of neural networks, AI ethics, applications and implications of poor data.

Learn more about the course here.

Intel

Intel offers an eight-week long course on ‘Introduction to AI’. The course is thorough and goes through the history of AI to its usage in current times. However, the course requires a prior understanding of Python programming.

The course is aimed specifically at students, industry professionals from other science fields and developers.

Learn more about the course here.

IBM

IBM offers an AI foundations course in partnership with Coursera. The course runs through the fundamentals of AI, with a special focus on generative AI and the usage of chatbots.

Interestingly, it also offers a module on building AI-backed chatbots without programming. Like the UCD course, it also provides access to IBM Watson and is easily accessible to those with next to no knowledge of AI.

Learn more about the course here.

Amazon Web Services

AWS offers a free machine learning and AI course, complete with a learning plan. The ten-hour self-paced course is aimed at beginners. It covers the fundamentals of machine learning, its terminology, and its use in businesses.

The course also includes an introduction to Amazon SageMaker, their own machine-learning platform.

Learn more about the course here.

The post 10 Free Online AI Courses to Learn from the Best appeared first on Analytics India Magazine.

Free Python Resources That Can Help You Become a Pro

Image by Author

Python is the most popular programming language out there, and learning it will give you an advantage in your career. You can use it to build web applications, automate tasks, perform data analysis, and build machine learning models; in short, Python can do anything for you.

How is that possible? Because of open-source community support that has created and maintained Python packages for all kinds of tasks and for every field of study. You can even access popular packages from Java, C++, and other languages, as there are Python wrappers for all of them available.

Python is a necessary skill to gain, and it will help you transition into a more specialized field. However, it is still tricky for non-technical people or beginners. You have to learn syntax, functions, and libraries. Then, you have to learn to use all these skills to build projects, which will require you to take courses and learn from various resources.

In this blog, we will review free Python courses, books, GitHub repositories, projects, cheat sheets, and online compilers that will help you get started and become an expert at the language quickly.

Python Course

I have been guiding students on where to start learning data science, and I always recommend starting with Python and SQL. Most of them are not sure about paying a huge amount, so I recommend they take a top free course and learn the basics, and if they want to get better at it, they can pay for the course.

The free course in this section covers the basics of Python language syntax and libraries. You will also learn to use Python for data analysis and build simple machine-learning models. All of the courses in this section are popular and have been highly rated by people who have taken them.

  • Python for Beginners by Programming with Mosh
  • Python for Everybody by University of Michigan
  • Principles of Computation with Python by Carnegie Mellon University
  • Data Analysis with Python by freecodecamp
  • Python for Data Science by freecodecamp

Python Books

Some people prefer books to courses because they want to take it slow and learn everything about the topic before trying anything. The books mentioned in the list below are popular and written by top personalities in the industry. They include examples, projects, and additional resources for becoming an experienced Python developer.

  • Python for Everybody by Dr. Charles Severance
  • Automate the Boring Stuff with Python by Al Sweigart
  • Python Data Science Handbook by Jake VanderPlas
  • Python 3 Patterns, Recipes and Idioms by Bruce Eckel & Friends
  • The Hitchhiker’s Guide to Python by Kenneth Reitz & Tanya Schlusser

Python GitHub Repositories

I always recommend using GitHub as a learning platform. On GitHub, you can find various community-supported repositories essential for Python beginners. These repositories provide a "learning by doing" approach and consist of projects, exercises, and problems for you to solve to learn the language. They also come with a list of tools, frameworks, free resources, and everything you need to build things using the Python language.

  • practical-tutorials/project-based-learning
  • zhiwehu/Python-programming-exercises
  • geekcomputers/Python
  • vinta/awesome-python
  • TheAlgorithms/Python

Python Projects

After learning the basics and getting used to Python syntax, it is time to put your skills to the test by building projects. Working on Python projects will also help you build a strong portfolio that will eventually help you land a high-paying job. The list below contains projects for all levels, from beginners to experts.

  • 7 Python Projects for Beginners by Abid Ali Awan
  • 5 Python Projects for Data Science Portfolio by Abid Ali Awan
  • 12 Beginner Python Projects by freecodecamp
  • 25 Python Projects You Can Build by Jessica Wilkins
  • Python Projects for Beginner to Advanced by GeeksforGeeks

Python CheatSheets

Cheat sheets are useful for both experts and students who want to review concepts before an interview or exam. They contain bite-sized information about Python syntax, libraries, and functions for easy revision. I use them to prepare for job interviews or when writing technical content.

  • Web Scraping with Python by The PyCoach
  • Pandas: Data Wrangling by pydata
  • Python Machine Learning by DataCamp
  • Neural Networks in Python by DataCamp
  • Ultimate Python Cheatsheet by wilfredinni

Online Python Compiler

Not everyone has access to a personal computer, and even those with a laptop may want to avoid installing Python and an IDE, or even running Python files locally. In this section, I have listed top free Python developer environments you can access through your browser, ready to use in a few seconds. These platforms are popular and user-friendly, so instead of setting up a local environment just to test code or learn to program, I suggest students use an online Python interpreter.

  • Online Python — IDE
  • Deepnote
  • Replit
  • Google Colab
  • Gitpod

Final Thoughts

If you are new to Python, I wish you good luck. The language is easy to learn, and the resources I have provided in this blog will help you learn it fast. The only thing I need from you is dedication: put in the effort and time to learn and gain experience building projects.

This blog contains a list of free Python resources, such as courses, books, repositories, projects, cheat sheets, and online compilers. If you are still unsure where to start, you can send me a message on LinkedIn, and I will try my best to help you.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.


Financial Times Enters into a Content Licensing Agreement with OpenAI

The Financial Times has entered into an agreement with OpenAI to license its content so that the AI startup can build new AI tools. According to a press release from FT, users of ChatGPT will see summaries, quotes, and direct links to FT articles. Any query yielding information from the FT will be clearly credited to the publication.

The FT, which is already a user of OpenAI’s products, specifically the ChatGPT Enterprise, recently introduced a beta version of a generative AI search tool called “Ask FT.” This feature, powered by Anthropic’s Claude LLM, enables subscribers to search for information across the publication’s articles.

“Apart from the benefits to the FT, there are broader implications for the industry. It’s right, of course, that AI platforms pay publishers for the use of their material,” said FT chief executive John Ridding.

“At the same time, it’s clearly in the interests of users that these products contain reliable sources,” he added.

This marks OpenAI’s fifth agreement within the past year, adding to a series of similar deals with prominent news organizations such as the US-based Associated Press, Germany’s Axel Springer, France’s Le Monde, and Spain’s Prisa Media.

In December, The New York Times became the first major US media organization to file a lawsuit against OpenAI and Microsoft, alleging that these tech giants utilized millions of articles without proper licensing to develop the underlying models of ChatGPT.

The post Financial Times Enters into a Content Licensing Agreement with OpenAI appeared first on Analytics India Magazine.

Azim Premji Investment Firm to Invest Over $10 Billion in AI, Says CIO

The Azim Premji-owned private equity fund, Premji Invest, will be investing upwards of $10 billion in AI companies, according to reports.

The company has decided to increase investments within the AI sector, according to a report from Bloomberg. Premji Invest’s CIO and managing partner T K Kurien stated that the investments will focus on refining their already existing proprietary investment tools that rely on AI.

This announcement comes only a few weeks after Premji Invest had co-led healthcare-focused LLM Hippocratic AI’s $53 million Series A round. Additionally, it was also reported earlier this month that the company was looking to invest anywhere between $50 to $70 million in Canva.

According to Moneycontrol, the company had also wanted to get ahead on the GenAI boom, which is why it had shifted focus towards more technology-based investments.

Apart from Hippocratic AI, the company has invested in a few AI-sector firms in the past, including Cohesity Inc., Pixis, and Ikigai. Additionally, according to Kurien, the company hopes to give open-source developers access to its AI tools.

In terms of AI tools developed by the company, Premji Invest had initially begun development in the field around three years ago, with a total of 14 AI engineers hired so far.

Currently, the company makes use of AI to parse through companies based on several hundred parameters to “identify investment opportunities”.

The latest announcement serves another purpose, with Kurien saying, “The firm expects the entire exercise to also give it a bird’s eye view of emerging technologies and trends that could help it stay ahead of peers.”

The Premji Invest CIO also stated that they were looking into using the tools for potential streamlining of legal processes and improving government services. He stated that this would “help India’s overburdened courts resolve cases faster and to also aid governments’ efforts to offer service more effectively.”

The post Azim Premji Investment Firm to Invest Over $10 Billion in AI, Says CIO appeared first on Analytics India Magazine.

A Starter Guide to Data Structures for AI and Machine Learning


Introduction

Data structures are, in a sense, the building blocks of algorithms, and are critical for the effective functioning of any AI or ML algorithm. These structures, while often thought of as simple containers for data, are more than that: they are incredibly rich tools in their own right, and can have a greater effect on the performance, efficiency, and overall computational complexity of algorithms than they are often given credit for. Choosing a data structure is therefore a task that requires careful thought, one that can determine the speed with which data can be processed, the scale at which an ML model can operate, or even the feasibility of a given computational problem.

This article introduces some data structures of importance in the fields of AI and ML, and is aimed at practitioners, students, and AI and ML enthusiasts alike. Our hope in writing it is to supply some knowledge of important data structures in the AI and ML realms, as well as some guidelines on when and how these structures can be used to their best advantage.

As we go through each of a series of data structures, examples will be given of AI and ML scenarios in which they might be employed, each structure possessing its own set of strengths and weaknesses. Any implementations will be given in Python, a language of enormous popularity in the data science field, and are suitable for a variety of tasks in AI and ML. Mastering these core building blocks is essential for a variety of tasks that data scientists might face: sorting large data sets, creating high-performing algorithms that are both fast and light on memory, and maintaining data structures in a logical and efficient way to name but a few.

After starting with the basics of simple arrays and dynamic arrays, we will move on to more advanced structures, such as linked lists and binary search trees, before wrapping up with hash tables, a structure that is both very useful and can provide an excellent return on the investment of learning. We cover both the mechanical production of these structures, as well as their real-world use in AI and ML applications, a combination of theory and practice that provides the reader with the understanding needed to decide which is best for a particular problem, and to implement those structures in a robust AI system.

In this section, we dive deep into the various data structures pivotal for AI and machine learning, starting with arrays and dynamic arrays. By understanding the characteristics, advantages, and limitations of each data structure, practitioners can make informed choices that enhance the efficiency and scalability of their AI systems.

1. Arrays and Dynamically-Sizing Arrays

Perhaps the most basic of computer science data structures, an array is a collection of elements of the same type stored in adjacent memory locations, allowing direct random access to each element. Dynamic arrays, like Python's lists, build on simple arrays by adding automatic resizing: additional memory is allocated as elements are added or removed. This self-resizing ability is at the heart of dynamic arrays. Arrays are a good fit for problems involving linear traversal of data, or where the number of elements does not change, such as fixed-size datasets that machine learning algorithms might ingest.
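The automatic resizing just described can be observed directly in CPython: a list over-allocates memory so that most appends are cheap, and the allocated size only jumps occasionally. A minimal sketch (the exact sizes reported are CPython-specific and may vary across versions):

```python
import sys

nums = []
sizes = []
for i in range(20):
    nums.append(i)
    sizes.append(sys.getsizeof(nums))

# The allocated size stays flat for several appends, then jumps when the
# list over-allocates a larger block -- the "automatic resizing" at work.
print(sizes)
```

Because the capacity grows in chunks, many consecutive appends report the same size, which is why appends run in amortized constant time.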

Let’s first discuss the upsides:

  • Easy access to elements by index: Quick retrieval operations, which is crucial in many AI and ML scenarios where time efficiency is key
  • Good for known or fixed-size problems: Ideal for when the number of elements is predetermined or changes infrequently

And the downsides:

  • Fixed size (for static arrays): Requires knowing the maximum number of elements in advance, which can be limiting
  • Costly insertions and deletions (for static arrays): Each insertion or deletion potentially requires shifting elements, which is computationally expensive

Arrays, thanks to their simplicity and utility, can be found nearly everywhere in computer science education; they are a natural classroom subject. Their O(1), or constant-time, access to any element by index endears them to systems where runtime efficiency reigns supreme.

In the world of ML, arrays and dynamic arrays are crucial for handling datasets and, usually, for arranging feature vectors and matrices. High-performance numerical libraries like NumPy use arrays in concert with routines that efficiently perform operations across entire datasets, allowing for the rapid processing and transformation of numerical data required for training models and making predictions.
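The contiguous, fixed-type layout that makes this fast can be illustrated with the standard library's array module alone; libraries like NumPy run the same kind of element-wise loop over the same contiguous layout, but in optimized C. A small sketch (the feature values here are arbitrary illustration data):

```python
from array import array

# A fixed-type array: doubles stored contiguously, like a C array.
features = array('d', [1.0, 2.0, 3.0, 4.0])

# O(1) random access by index
assert features[2] == 3.0

# A simple mean-centering pass over the feature vector, the kind of
# per-element transformation NumPy vectorizes over whole datasets.
mean = sum(features) / len(features)
centered = array('d', (x - mean for x in features))
print(list(centered))  # [-1.5, -0.5, 0.5, 1.5]
```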

A few fundamental operations performed with Python’s pre-built dynamic array data structure, the list, include:

# Initialization
my_list = [1, 2, 3]

# Indexing
print(my_list[0])        # output: 1

# Appending
my_list.append(4)        # my_list becomes [1, 2, 3, 4]

# Resizing
my_list.extend([5, 6])   # my_list becomes [1, 2, 3, 4, 5, 6]

2. Linked Lists

Linked lists are another basic data structure, consisting of a sequence of nodes. Each node contains some data along with a pointer to the next node in the list. In a singly linked list, each node references only the next node, allowing forward traversal only; in a doubly linked list, each node references both the next and previous nodes, allowing traversal in both directions. This makes linked lists a flexible option for some tasks where arrays may not be the best choice.

The good:

  • Dynamic sizing: linked lists expand or contract with no additional overhead of reallocating and moving the entire structure
  • Fast insertions and deletions of nodes, without the element shifting an array might necessitate

The bad:

  • Elements are scattered across memory, which creates poor cache locality, especially in contrast to arrays
  • Locating an element by index requires a linear traversal from the head, which is less efficient than an array's direct indexing

They are especially useful where the number of elements is unclear and frequent insertions or deletions are required, making them a good fit for dynamic data that changes often. Indeed, the dynamic sizing capability of linked lists is one of their strong points: they fit well where the number of elements cannot be predicted in advance and where over-allocation would be wasteful. Being able to adjust a linked list without the overhead of a wholesale copy or rewrite is an obvious benefit, particularly where routine structural changes are likely.

Though they have less utility than arrays in the realm of AI and ML, linked lists do find specific applications wherein highly mutable data structures with rapid modifications are needed, such as for managing data pools in genetic algorithms or other situations where operations on individual elements are performed regularly.

Shall we have a simple Python implementation of linked list actions? Sure, why not. Note that the following basic linked list implementation includes a Node class to represent each list element, and a LinkedList class to handle the operations on the list, including appending and deleting nodes.

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
            return
        last = self.head
        while last.next:
            last = last.next
        last.next = new_node

    def delete_node(self, key):
        temp = self.head
        if temp and temp.data == key:
            self.head = temp.next
            temp = None
            return
        prev = None
        while temp and temp.data != key:
            prev = temp
            temp = temp.next
        if temp is None:
            return
        prev.next = temp.next
        temp = None

    def print_list(self):
        current = self.head
        while current:
            print(current.data, end=' ')
            current = current.next
        print()

Here is an explanation of the above code:

  • The LinkedList class manages the linked list, including creation, appending data, deleting nodes, and displaying the list; when initialized, it sets its head pointer, head, to None, marking an empty list
  • The append method appends data to the end of a linked list, creating a new node either at the head of the list when it's empty, or traversing to the end of a non-empty list to add the new node
  • The delete_node method removes a node with a given key (data) by considering these three cases: target key is in the head node; target key is in another node in the list; no node holds the key
  • By setting pointers correctly, it is able to take out a node without sacrificing the order of remaining nodes
  • The print_list method walks the list starting at the head, printing the contents of each node, in sequence, allowing for a simple means of understanding the list

Here is an example of the above LinkedList code being used:

# Create a new LinkedList
my_list = LinkedList()

# Append nodes with data
my_list.append(10)
my_list.append(20)
my_list.append(30)
my_list.append(40)
my_list.append(50)

# Print the current list
print("List after appending elements:")
my_list.print_list()       # outputs: 10 20 30 40 50

# Delete a node with data '30'
my_list.delete_node(30)

# Print the list after deletion
print("List after deleting the node with value 30:")
my_list.print_list()       # outputs: 10 20 40 50

# Append another node
my_list.append(60)

# Print the final state of the list
print("Final list after appending 60:")
my_list.print_list()       # outputs: 10 20 40 50 60

3. Trees, particularly Binary Search Trees (BST)

Trees are an example of a non-linear data structure (compare with arrays) in which parent-child relationships exist between nodes. Each tree has a root node, and nodes may contain zero or more child nodes, forming a hierarchy. A Binary Search Tree (BST) is a kind of tree in which each node has up to two children, generally referred to as the left child and right child. In such a tree, every key in a node's left subtree must be less than or equal to the node's own key, and every key in its right subtree greater than or equal to it. These properties allow efficient search, insert, and remove operations, provided that the tree remains balanced.

BST pros:

  • Compared with linear structures such as arrays or linked lists, balanced BSTs offer quicker access, insertion, and deletion, with average O(log n) time complexity

And BST cons:

  • As noted, BSTs perform poorly when they become unbalanced or skewed
  • In the worst case, operation time complexity degrades to O(n)

BSTs are particularly effective when many search, insert, or delete operations are required with respect to the dataset they are handling. They are certainly more appropriate when the data is accessed frequently in a dataset that undergoes frequent changes.

Moreover, trees are an ideal structure for representing hierarchical data, such as a file system or an organizational chart. This makes them particularly useful in applications where such hierarchical structuring is of interest.

BSTs are able to assure search operations are quick due to their average O(log n) time complexity for access, insert, and delete operations. This makes them of particular interest for applications where swift data access and updates are necessary.

Decision trees, a type of tree data structure widely used for classification and regression tasks in machine learning, enable models that predict a target variable from rules derived from the features. Trees also see wide use elsewhere in AI, such as game programming; in games of strategy such as chess, trees are used to simulate scenarios and search for optimal moves.

Here is an overview of how you can implement a basic BST, including insert, search and delete methods, using Python:

class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key

def insert(root, key):
    if root is None:
        return TreeNode(key)
    else:
        if root.val < key:
            root.right = insert(root.right, key)
        else:
            root.left = insert(root.left, key)
    return root

def search(root, key):
    if root is None or root.val == key:
        return root
    if root.val < key:
        return search(root.right, key)
    return search(root.left, key)

def deleteNode(root, key):
    if root is None:
        return root
    if key < root.val:
        root.left = deleteNode(root.left, key)
    elif key > root.val:
        root.right = deleteNode(root.right, key)
    else:
        if root.left is None:
            temp = root.right
            root = None
            return temp
        elif root.right is None:
            temp = root.left
            root = None
            return temp
        temp = minValueNode(root.right)
        root.val = temp.val
        root.right = deleteNode(root.right, temp.val)
    return root

def minValueNode(node):
    current = node
    while current.left is not None:
        current = current.left
    return current

Explanation of the above code:

  • The foundation of a Binary Search Tree is the TreeNode class, which houses the node's value (val) and its left and right child node pointers (left and right)
  • The insert function is an implementation of the recursive strategy of inserting a value into the BST: in the base case in which no root exists it creates a new TreeNode, and otherwise it puts keys larger than itself to its right subtree, and smaller nodes to the left, preserving the BST's structure
  • The search function handles the base cases (an empty subtree, or the current node holding the key) and otherwise recurses into the left or right subtree depending on how the key compares to the current node's value
  • The deleteNode function handles three cases: a node with no children (simply removed); a node with one child (replaced by that child); and a node with two children (replaced by its 'inorder successor', the smallest value in its right subtree), deleting nodes recursively while maintaining the BST structure
  • The minValueNode helper finds the minimum-value (i.e. leftmost) node of a subtree, and is used during the deletion of a node with two children

Here is an example of the above BST code implementation being used.

# Create the root node with an initial value
root = TreeNode(50)

# Insert elements into the BST
insert(root, 30)
insert(root, 20)
insert(root, 40)
insert(root, 70)
insert(root, 60)
insert(root, 80)

# Search for a value
searched_node = search(root, 70)
if searched_node:
    print(f"Found node with value: {searched_node.val}")
else:
    print("Value not found in the BST.")

# output -> Found node with value: 70

# Delete a node with no children
root = deleteNode(root, 20)

# Attempt to search for the deleted node
searched_node = search(root, 20)
if searched_node:
    print(f"Found node with value: {searched_node.val}")
else:
    print("Value not found in the BST - it was deleted.")

# output -> Value not found in the BST - it was deleted.
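A quick way to see the BST ordering property in action is an inorder traversal (left subtree, node, right subtree), which visits keys in sorted order. A self-contained sketch, redefining a minimal TreeNode and insert so it runs on its own:

```python
class TreeNode:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key

def insert(root, key):
    # Standard recursive BST insert: larger keys go right, others left.
    if root is None:
        return TreeNode(key)
    if root.val < key:
        root.right = insert(root.right, key)
    else:
        root.left = insert(root.left, key)
    return root

def inorder(node):
    # Left, node, right: yields the keys in ascending order.
    if node is None:
        return []
    return inorder(node.left) + [node.val] + inorder(node.right)

root = None
for key in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, key)

print(inorder(root))  # [20, 30, 40, 50, 60, 70, 80]
```

Whatever order the keys were inserted in, the traversal emits them sorted, which is exactly what the left/right key constraints guarantee.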

4. Hash Tables

Hash tables are a data structure well-suited to rapid data access. They harness a hash function to compute an index into a series of slots or buckets, out of which the desired value is returned. Hash tables can deliver almost instant data access thanks to these hash functions, and can be used to scale to large datasets with no decrease in access speed. The efficiency of hash tables relies heavily on a hash function, which evenly distributes entries across an array of buckets. This distribution helps to avoid key collisions, which is when different keys resolve to the same slot; proper key collision resolution is a core concern of hash table implementations.

Pros of hash tables:

  • Rapid data retrieval: Provides average-case constant time complexity (O(1)) for lookups, insertions, and deletions
  • Consistent average-case efficiency: operations are swift in the typical case, which makes hash tables well-suited to real-time data handling

Cons of hash tables:

  • Worst-case time complexity not great: Can degrade to O(n) if there are many items hashing to the same bucket
  • Reliant on a good hash function: performance depends directly on how evenly the hash function distributes the data amongst the buckets

Hash tables are most often used when rapid lookups, insertions, and deletions are required, without any need for ordered data. They are particularly useful when quick access to items via their keys is necessary to make operations more rapid. The constant time complexity property of hash tables for their basic operations makes them extremely useful when high performance operation is a requirement, especially in situations where time is of the essence.

They are great for dealing with massive data, since they provide a high-speed way to look data up, with no performance degradation as the size of the data grows. AI often needs to handle huge amounts of data, where hash tables for retrieval and lookup make a lot of sense.

Within machine learning, hash tables help index features in large data collections, enabling quick access and manipulation of data during preprocessing and model training. They can also make certain algorithms more efficient: in k-nearest neighbors calculations, for example, already-computed distances can be stored in a hash table and recalled, speeding up calculations on large datasets.
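The distance-caching idea just mentioned can be sketched with a plain dict as the hash table. The point names and coordinates below are invented for illustration; the pattern is to use an order-independent key so each pairwise distance is computed once:

```python
import math

# Illustrative 2-D points (hypothetical data, not from any dataset)
points = {
    "a": (0.0, 0.0),
    "b": (3.0, 4.0),
    "c": (6.0, 8.0),
}

# Hash table (dict) caching pairwise distances so each is computed once
distance_cache = {}

def distance(p, q):
    key = tuple(sorted((p, q)))          # order-independent cache key
    if key not in distance_cache:
        (x1, y1), (x2, y2) = points[p], points[q]
        distance_cache[key] = math.hypot(x2 - x1, y2 - y1)
    return distance_cache[key]

print(distance("a", "b"))   # 5.0, computed
print(distance("b", "a"))   # 5.0, served from the cache
print(len(distance_cache))  # 1
```

In a k-nearest-neighbors loop over a large dataset, this trades a little memory for skipping every repeated distance computation, with O(1) average-case cache lookups.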

In Python, the dictionary type is an implementation of a hash table (Python resolves hash collisions internally). The example below shows basic dictionary usage, along with a chaining pattern for storing multiple values under a single key:

# Creating a hash table using a dictionary
hash_table = {}

# Inserting items
hash_table['key1'] = 'value1'
hash_table['key2'] = 'value2'

# Chaining multiple values under one key in a list
if 'key1' in hash_table:
    if isinstance(hash_table['key1'], list):
        hash_table['key1'].append('new_value1')
    else:
        hash_table['key1'] = [hash_table['key1'], 'new_value1']
else:
    hash_table['key1'] = 'new_value1'

# Retrieving items
print(hash_table['key1'])    # output: ['value1', 'new_value1']

# Deleting items
del hash_table['key2']

Conclusion

An investigation of a few of the data structures underpinning AI and machine learning models shows what some of these rather simple building blocks of the underlying technology are capable of. The inherent linearity of arrays, the adaptability of linked lists, the hierarchical organization of trees, and the O(1) average lookup time of hash tables each offer different benefits. This understanding can inform how an engineer best leverages these structures, not only in the machine learning models and training sets they put together, but in the reasoning behind their choices and implementations.

Becoming proficient in the elementary data structures relevant to machine learning and AI is a skill that pays dividends. There are many places to learn it, from university courses to workshops to online courses, and even open source code can be an invaluable asset for getting familiar with the tools and best practices of the discipline. The practical ability to work with data structures is not one to be overlooked. So to the data scientists and AI engineers of today, tomorrow, and thereafter: practice, experiment, and learn from the data structure materials available to you.

Matthew Mayo (@mattmayo13) holds a Master's degree in computer science and a graduate diploma in data mining. As Managing Editor, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.


Pocket-Sized Powerhouse: Unveiling Microsoft’s Phi-3, the Language Model That Fits in Your Phone

In the rapidly evolving field of artificial intelligence, while the trend has often leaned towards larger and more complex models, Microsoft is adopting a different approach with its Phi-3 Mini. This small language model (SLM), now in its third generation, packs the robust capabilities of larger models into a framework that fits within the stringent resource constraints of smartphones. With 3.8 billion parameters, the Phi-3 Mini matches the performance of large language models (LLMs) across various tasks including language processing, reasoning, coding, and math, and is tailored for efficient operation on mobile devices through quantization.

Challenges of Large Language Models

The development of Microsoft’s Phi SLMs responds to the significant challenges posed by LLMs, which demand more computational power than is typically available on consumer devices. This high demand complicates their use on standard computers and mobile devices, raises environmental concerns over the energy consumed in training and operation, and risks perpetuating biases present in their large, complex training datasets. These factors can also impair the models' responsiveness in real-time applications and make updates more challenging.

Phi-3 Mini: Streamlining AI on Personal Devices for Enhanced Privacy and Efficiency

The Phi-3 Mini is strategically designed to offer a cost-effective and efficient alternative for integrating advanced AI directly onto personal devices such as phones and laptops. This design facilitates faster, more immediate responses, enhancing user interaction with technology in everyday scenarios.

Phi-3 Mini enables sophisticated AI functionality to be processed directly on mobile devices, reducing reliance on cloud services and enhancing real-time data handling. This capability is pivotal for applications that require immediate processing, such as mobile healthcare, real-time language translation, and personalized education. The model's cost-efficiency not only reduces operational costs but also expands the potential for AI integration across industries, including emerging markets like wearable technology and home automation. Because Phi-3 Mini processes data directly on the local device, it also boosts user privacy, which can be vital for managing sensitive information in fields such as personal health and financial services. Moreover, the model's low energy requirements contribute to environmentally sustainable AI operations, aligning with global sustainability efforts.

Design Philosophy and Evolution of Phi

Phi's design philosophy is based on curriculum learning, which draws inspiration from the educational approach in which children learn through progressively more challenging examples. The main idea is to start training with easier examples and gradually increase the complexity of the training data as learning progresses. Microsoft implemented this strategy by building a dataset from textbook-quality material, as detailed in its study “Textbooks Are All You Need.”

The Phi series launched in June 2023 with Phi-1, a compact model of 1.3 billion parameters that quickly demonstrated its efficacy, outperforming larger, more complex models on Python coding tasks. Building on this success, Microsoft later developed Phi-1.5, which kept the same number of parameters but broadened its capabilities in areas like common sense reasoning and language understanding. The series reached a new level with the release of Phi-2 in December 2023: with 2.7 billion parameters, Phi-2 showed impressive skills in reasoning and language comprehension, positioning it as a strong competitor to significantly larger models.
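
As a rough illustration of the curriculum idea described above, the sketch below orders a toy corpus by a stand-in difficulty score and yields progressively larger, harder training pools. The scoring function and three-stage schedule are illustrative assumptions, not Microsoft's actual training pipeline.

```python
# Toy sketch of curriculum learning: present training examples in order of
# increasing difficulty, growing the pool as training progresses.

def curriculum_batches(examples, difficulty, num_stages=3):
    """Yield progressively harder subsets of the training data.

    examples   -- list of training samples
    difficulty -- function mapping a sample to a difficulty score
    """
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, num_stages + 1):
        # At each stage, train on the easiest stage/num_stages fraction.
        cutoff = len(ordered) * stage // num_stages
        yield ordered[:cutoff]

# Example: "difficulty" here is simply sentence length.
corpus = ["a cat", "the cat sat", "the cat sat on the mat quietly"]
for stage, pool in enumerate(curriculum_batches(corpus, len), start=1):
    print(f"stage {stage}: {len(pool)} example(s)")
```

In a real training run, each stage's pool would feed an optimizer for some number of steps before the next, harder pool is introduced.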

Phi-3 vs. Other Small Language Models

Phi-3 Mini extends the advances of Phi-2 by surpassing other SLMs, such as Google's Gemma, Mistral's Mistral, Meta's Llama3-Instruct, and GPT-3.5, across a variety of industrial applications: language understanding and inference, general knowledge, common sense reasoning, grade school math word problems, and medical question answering. Phi-3 Mini has also been tested offline on an iPhone 14 for tasks such as content creation and providing activity suggestions tailored to specific locations. For this purpose, it was condensed to 1.8GB using quantization, which adapts the model to limited-resource devices by converting its numerical data from 32-bit floating-point numbers to more compact formats such as 4-bit integers. This not only reduces the model's memory footprint but also improves processing speed and power efficiency, which is vital on mobile devices. Developers typically rely on frameworks such as TensorFlow Lite or PyTorch Mobile, whose built-in quantization tools automate and refine this process.
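
The quantization step described above can be sketched in a few lines. The example below implements simple symmetric per-tensor 4-bit quantization with NumPy; it is a minimal illustration of the general technique, not the actual pipeline used to compress Phi-3 Mini.

```python
import numpy as np

def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map float32 weights to integers in [-8, 7]."""
    scale = np.abs(weights).max() / 7.0          # one scale per tensor
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for use at inference time."""
    return q.astype(np.float32) * scale

w = np.array([0.31, -0.07, 0.52, -0.44], dtype=np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
# The 4-bit representation uses 1/8 the memory of float32, at the cost
# of a small rounding error in each weight.
print(np.abs(w - w_hat).max())
```

Production toolchains add refinements such as per-channel scales and calibration data, but the memory-for-precision trade-off is the same as in this sketch.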

Feature Comparison: Phi-3 Mini vs. Phi-2 Mini

Below, we compare some of the features of Phi-3 with its predecessor Phi-2.

  • Model Architecture: Phi-2 operates on a transformer-based architecture designed to predict the next word. Phi-3 Mini also employs a transformer decoder architecture, but aligns more closely with the Llama-2 model structure, using the same tokenizer with a vocabulary size of 32,064. This compatibility means that tools developed for Llama-2 can be easily adapted for use with Phi-3 Mini.
  • Context Length: Phi-3 Mini supports a context length of 4,000 tokens by default, with a long-context variant extending to 128,000 tokens; both are considerably larger than Phi-2’s 2,048 tokens. This increase allows Phi-3 Mini to manage more detailed interactions and process longer stretches of text.
  • Running Locally on Mobile Devices: Phi-3 Mini can be compressed to 4-bits, occupying about 1.8GB of memory, similar to Phi-2. It was tested running offline on an iPhone 14 with an A16 Bionic chip, where it achieved a processing speed of more than 12 tokens per second, matching the performance of Phi-2 under similar conditions.
  • Model Size: With 3.8 billion parameters, Phi-3 Mini has a larger scale than Phi-2, which has 2.7 billion parameters. This reflects its increased capabilities.
  • Training Data: Unlike Phi-2, which was trained on 1.4 trillion tokens, Phi-3 Mini has been trained on a much larger set of 3.3 trillion tokens, allowing it to achieve a better grasp of complex language patterns.
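
The ~1.8GB figure quoted for the 4-bit model is consistent with simple back-of-envelope arithmetic over the parameter count:

```python
# Back-of-envelope check of the quoted ~1.8GB footprint for 4-bit Phi-3 Mini.
params = 3.8e9          # parameters
bits_per_param = 4      # 4-bit quantization
bytes_total = params * bits_per_param / 8
gib = bytes_total / 2**30
print(f"{gib:.2f} GiB")  # prints 1.77 GiB -- weights alone, ignoring overheads
```

The small gap between 1.77 GiB and the reported 1.8GB is plausibly tokenizer tables, embeddings kept at higher precision, and runtime metadata.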

Addressing Phi-3 Mini's Limitations

While Phi-3 Mini demonstrates significant advances among small language models, it is not without limitations. Its primary constraint, given its smaller size compared with massive language models, is a limited capacity to store extensive factual knowledge, which can impair its ability to independently handle queries that require deep, specific factual data or detailed expert knowledge. This can, however, be mitigated by integrating Phi-3 Mini with a search engine, giving the model real-time access to a broader range of information and effectively compensating for its inherent knowledge limits. The integration lets Phi-3 Mini function like a highly capable conversationalist who, despite a comprehensive grasp of language and context, occasionally needs to “look up” information to provide accurate, up-to-date responses.
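
The search-engine integration described above follows the familiar retrieval-augmented pattern: fetch relevant text, then condition the model's answer on it. The sketch below uses a toy in-memory keyword search and a stand-in `generate` function; both are hypothetical placeholders for a real search API and the actual model.

```python
# Minimal retrieval-augmented pattern: look up external text, then condition
# the model's answer on it. `search` and `generate` are hypothetical stand-ins.

def search(query, index):
    """Toy keyword search over an in-memory document list."""
    terms = set(query.lower().split())
    return [doc for doc in index if terms & set(doc.lower().split())]

def generate(prompt):
    """Stand-in for a call to the language model."""
    return f"[model answer conditioned on: {prompt[:60]}...]"

def answer_with_retrieval(question, index):
    # Prepend retrieved documents so the model can draw on fresh facts
    # it may not store in its own weights.
    context = "\n".join(search(question, index))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)

docs = ["Phi-3 Mini has 3.8 billion parameters.",
        "Paris is the capital of France."]
print(answer_with_retrieval("How many parameters does Phi-3 Mini have?", docs))
```

A production system would swap the keyword match for a web or vector search and the stub for an actual Phi-3 Mini inference call, but the prompt-assembly step is the heart of the technique.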

Availability

Phi-3 is now available on several platforms, including Microsoft Azure AI Studio, Hugging Face, and Ollama. On Azure AI, the model incorporates a deploy-evaluate-finetune workflow, and on Ollama, it can be run locally on laptops. The model has been tailored for ONNX Runtime and supports Windows DirectML, ensuring it works well across various hardware types such as GPUs, CPUs, and mobile devices. Additionally, Phi-3 is offered as a microservice via NVIDIA NIM, equipped with a standard API for easy deployment across different environments and optimized specifically for NVIDIA GPUs. Microsoft plans to further expand the Phi-3 series in the near future by adding the Phi-3-small (7B) and Phi-3-medium (14B) models, providing users with additional choices to balance quality and cost.

The Bottom Line

Microsoft's Phi-3 Mini is making significant strides in artificial intelligence by adapting the power of large language models for mobile use. It improves user interaction with devices through faster, real-time processing and enhanced privacy, minimizes the need for cloud-based services, reduces operational costs, and widens the scope for AI applications in areas such as healthcare and home automation. With curated, curriculum-style training data that helps limit bias while maintaining competitive performance, Phi-3 Mini is evolving into a key tool for efficient and sustainable mobile AI, subtly transforming how we interact with technology daily.