ChatGPT can now work with different apps on macOS and Windows desktops, OpenAI announced on X on 15 November. This marks the company’s first direct attempt at computer vision and agent control.
“ChatGPT for macOS can now work with apps on your desktop. In this early beta for Plus and Team users, you can let ChatGPT look at coding apps to provide better answers.” OpenAI Developers (@OpenAIDevs), November 14, 2024
According to OpenAI, this early beta update lets ChatGPT examine coding apps to provide better answers for Plus and Team users. Besides working with coding apps such as VS Code, Xcode, Terminal, and iTerm2, ChatGPT on desktop also talks to its users (through its voice assist feature), lets them take screenshots and upload files, and searches the web (through SearchGPT).
As reported earlier, Anthropic also made Claude Artifacts available to all users on iOS and Android, allowing anyone to create apps easily without writing a single line of code.
One ChatGPT feature that becomes especially useful on the desktop is the ability to ask about anything in a document. Users can select any section of any document and open ChatGPT to ask for meanings, explanations, and feedback. This brings ChatGPT’s most familiar function to the desktop.
This development follows the discussions from a day ago about OpenAI’s agent, ‘Operator,’ which is to be released in January 2025. Rowan Cheung, founder of ‘The Rundown AI,’ speculates that the next step beyond this would be to allow ChatGPT to control and see desktops as an agent.
OpenAI Follows Suit
In October this year, Microsoft released its ‘Copilot Vision’ to transform autonomous workflows with Copilot. According to Microsoft, these autonomous agents would be the new ‘apps’ for an AI-driven world, executing tasks and managing business functions on behalf of individuals, teams, and departments.
The company also introduced ten new autonomous agents in Dynamics 365 to automate processes like lead generation, customer service, and supplier communication for organisations.
Following that, Anthropic made a big announcement, releasing its new Claude 3.5 Sonnet, which can control computers through the beta feature ‘Computer Use’. The company reported that the model made significant progress in agentic coding tasks, in which AI autonomously generates and manipulates code.
Anthropic’s approach to computer use stood out because it didn’t rely on multiple agents to perform different tasks; instead, a single agent managed multiple tasks.
As AIM noted in an earlier comparison, Microsoft integrated Copilot into MS Excel, while Claude operated Excel directly. This called into question the need for Copilot.
OpenAI wasn’t far behind, even though this move by Anthropic and others (like Google Jarvis, speculated to release this month) had given them a strong position in the AI industry. OpenAI’s focus has also shifted from expanding its features to the interface.
OpenAI entered this race by introducing the Swarm framework, an approach for creating and deploying multi-agent AI systems. It was the missing piece that simplified the process of creating and managing multiple AI agents, helping them work together to accomplish complex tasks.
Following that, the launch of ChatGPT on desktops is a major step by an AI pioneer towards transforming the way this chatbot is used, one that ‘Operator’ is expected to build on in January.
Now, the chatbot will be able to provide answers, be a companion, and assist with daily tasks.
The post OpenAI Launches ChatGPT Desktop Version, Mirroring Microsoft’s Copilot appeared first on Analytics India Magazine.