Sonata Software Launches Generative AI Platform Harmoni.AI

IT services firm Sonata Software has announced the launch of Harmoni.AI, its generative AI platform offering a bouquet of industry solutions, service delivery platforms, and accelerators built on generative AI.

“We are seeing strong interest from our clients in enhancing customer experience, launching new business models, growing revenue, and enhancing productivity. Our responsibility is to help enterprises leverage the most relevant use cases for their specific business needs within a governed framework.

“The key to success, therefore, as with any AI, is the guardrails that humans build around them to guarantee secure and trusted outcomes,” said Samir Dhir, managing director and chief executive officer at Sonata Software.

Sonata’s Harmoni.AI is a holistic ‘Responsible by Design’ platform for generative AI. It is backed by a Data Governance and Acceleration engine, a choice of industry-leading LLMs, and a consulting framework to enable effective adoption and faster time to market.

The company is running pilots with multiple customers, including Fortune 500 clients, particularly in Healthcare and Life Sciences, Consumer Products & Retail, Telecom, Media & Technology, and Banking and Financial Services.

Moreover, Sonata has established the Harmoni.AI Academy to train engineers in its ‘Responsible-First by Design’ approach. Around 20% of its engineers are involved in AI initiatives to help clients leverage the potential of generative AI within a trusted, secure, and governed framework.

The post Sonata Software Launches Generative AI Platform Harmoni.AI appeared first on Analytics India Magazine.

IT Services Firm Hexaware Launches Generative AI Consulting & Practice Unit

Hexaware Technologies has announced the launch of its transformational Generative AI Consulting & Practice Unit to help businesses navigate the rapidly transforming digital landscape marked by the surge of generative AI.

“Enterprises today are facing significant challenges navigating the rapid advancements in technology, specifically in the realm of generative AI,” Arun Ramchandran, president & global head of the hi-tech & professional services vertical & Hexaware Consulting, said.

The GenAI Consulting & Practice Unit is designed to fill a critical market gap: the need for robust, advanced generative AI solutions that cover proliferating industry use cases across enterprise functions and technologies while offering real, quantifiable business advantages.

Hexaware plans to roll out an array of platforms, tools, blueprints, workshops, and tailored strategies that are poised to expedite and enrich clients’ AI journeys.

“As part of this new unit, we will offer end-to-end consulting to help clients demystify and discover generative AI opportunities, using DecodeAI, and also engage and execute generative AI projects through EncodeAI, both proprietary frameworks and methodologies,” Ramchandran said.

Hexaware is also launching Tenjin, a comprehensive platform crafted to speed up generative AI adoption through a portfolio of solutions across its service lines, while safeguarding data security and privacy.

The post IT Services Firm Hexaware Launches Generative AI Consulting & Practice Unit appeared first on Analytics India Magazine.

Jasper Vs. Scalenut: Which Writing Tool is Best? (July 2023)

Are you tired of spending hours crafting the perfect blog post? Do you wish there was an easier way to create engaging content that captivates your audience? Look no further than this Jasper AI vs Scalenut comparison.

AI writing tools like Jasper AI and Scalenut are popular in content creation. They use Artificial Intelligence to generate written content, saving time and effort for writers.

Lucky for you, I have used Jasper AI and Scalenut and tested these tools to see which reigns superior. This comprehensive comparison will dive deep into both tools' features, pros, and cons to help you make an informed decision.

Whether you're a content creator, marketer, or business owner looking to streamline your writing process, AI writing tools for long-form content marketing can be a game-changer. We'll explore how these tools work and why they are becoming increasingly popular in the industry.

So if you're ready to take your content creation to the next level and discover which AI writing tool is best for you, keep reading. By the end, you'll have all the information you need to make an educated choice between Jasper AI and Scalenut!

Who Needs AI Writing Tools?

An AI writer is beneficial for a variety of individuals and businesses:

  • Content creators, marketers, and bloggers can improve their efficiency and make more money writing.
  • Businesses can use AI tools for product descriptions, social media posts, and email marketing.
  • Non-native English speakers can enhance grammar and language usage.

But between Jasper AI and Scalenut, which tool is best for which user? Let's take a look.

What is Jasper AI?

Jasper AI homepage.

Jasper AI is an AI writing tool that offers a user-friendly interface and powerful features for content creation.

Using its 50+ templates, you can easily generate high-quality content in various formats, from long-form blog articles to short-form content like social media posts. It also has a built-in plagiarism checker and uses natural language processing (NLP) to optimize content for SEO and provide relevant keyword suggestions, which search engines like Google will love.

Jasper AI also offers a range of entertaining tutorials (the Jasper YouTube channel is my favorite) to assist you in your content writing journey. The tool also provides excellent customer support and offers a free 7-day trial to test its functionality.

Jasper AI Pros

  • User-friendly interface.
  • Built-in plagiarism checker (optional add-on).
  • Generates high-quality content using natural language processing and machine learning algorithms.
  • Suitable for various purposes like blog posts, articles, product descriptions, and social media.
  • Choose a unique tone of voice or create your own custom brand voice.
  • 50+ templates.
  • Write in 25+ languages.

Jasper AI Cons

  • The output and formatting of the generated content can be limiting.
  • It may lack the creativity and personal touch a human writer can provide.
  • You must fact-check its content to ensure accuracy.
  • Can be expensive, especially when using the Surfer SEO integration, which has subscriptions of its own.
  • With Jasper's subscriptions, you may pay for features you don't use (such as Jasper Art).

What is Scalenut?

Scalenut homepage.

Scalenut is an all-in-one AI writing tool with a user-friendly interface and many useful features to speed up content creation. Like Jasper, it utilizes natural language processing (NLP) and artificial intelligence (AI) to generate high-quality content, including blog posts, articles, social media posts, and product descriptions.

Scalenut provides 40+ AI templates, such as Product Descriptions and Ad Copy, to customize the content according to your needs. It also acts as your personal SEO assistant with built-in SEO optimization features so your content can soar to the top of the SERPs.

Ultimately, Scalenut aims to simplify the content creation process and help users produce engaging and relevant content.

Scalenut Pros

  • User-friendly interface.
  • AI-generated content for short and long-form content.
  • Built-in keyword research tools, including keyword clusters.
  • Uses AI to optimize content, streamlining your workflow.
  • Add your own custom tone of voice for consistency throughout your content.
  • One-click WordPress publishing (plus other integrations like Semrush for keyword research and Copyscape for plagiarism checks).
  • More affordable than Jasper.

Scalenut Cons

  • Only generates content in English.
  • Must manually create your own writing styles.
  • Must edit the content to ensure it is true and check for grammatical errors.
  • It may lack the creativity and personal touch a human writer can provide.
  • Content may be repetitive.

Jasper AI vs Scalenut: Comparison

Next, we will compare Jasper AI and Scalenut side by side across their features, content quality, ease of use, language support, AI templates, integrations, customer support, and pricing.

Features: Jasper AI

Jasper AI features.

Here are the main features that come with Jasper AI:

  • Surfer SEO Integration.
  • AI chat functionality.
  • 50+ templates.
  • Enhance text prompt button.
  • Built-in AI image generator (Jasper Art).
  • Have Jasper write in a unique tone of voice.
  • Create a custom Brand Voice.
  • Use AI to create campaigns.
  • Built-in plagiarism checker.

Using natural language processing and machine learning algorithms, Jasper creates content in various writing styles and tones. It also includes built-in features like plagiarism detection, grammar checking, and Surfer SEO integration to ensure originality and error-free writing.

Jasper is a marketer's best friend that can create campaigns instantly using AI. If you're looking for an AI writing tool geared more towards marketing and keeping a consistent brand tone of voice, Jasper's features will align with your needs.

Features: Scalenut

Scalenut features.

Here are the main features that come with Scalenut:

  • Cruise Mode to generate 1,500+ word articles in minutes with a customizable outline.
  • Add credible statistics directly into your content.
  • Built-in keyword planner that creates keyword clusters for topical authority.
  • WordPress, SemRush, and CopyScape integrations.
  • Built-in SEO tools to optimize content.
  • 40+ AI templates.

While Jasper has AI chat functionality, more AI templates, a built-in AI image generator, and the ability to create campaigns with AI, Scalenut offers standout features of its own: generating a 1,500+ word article from a customizable outline using AI, a keyword planner with useful metrics, multiple integrations with popular applications, and SEO tools to optimize content. This all-in-one tool is incredibly useful for bloggers and content creators who want to streamline their content.

With Scalenut's user-friendly interface, writers of all skill levels can easily create blog posts, social media captions, and product descriptions. With its focus on content creation and optimization, Scalenut is a valuable tool for writers looking to streamline their workflow and produce exceptional content.

Winner: Both

Jasper and Scalenut provide plenty of valuable features, with the ideal choice contingent on the user's needs. Scalenut emerges as the more suitable option for bloggers, boasting functionalities tailored toward long-form content writing and SEO optimization. On the other hand, Jasper proves to be better suited for marketers and businesses seeking to maintain a consistent brand voice.

Content Quality: Jasper

Powered by GPT-3.5, Jasper AI generates high-quality content that sounds human. Its underlying model has undergone extensive training on various data sources, including books, articles, and websites, ensuring its writing is informative and engaging.

Where Jasper stands out is in its tone of voice feature. Feel free to put whatever you want as the tone of voice (e.g., witty, cheeky, persuasive, etc.), and Jasper will generate content in that tone.

Determining the tone of voice on Jasper.

You can also create your own custom Brand Voice based on a URL or by providing text that you can use for whatever content you are writing with Jasper.

Jasper brand voice.

Jasper also has an “Enhance prompt” button for better outputs.

The Jasper enhance prompt button.

With Jasper, you can prioritize speed or quality before generating content.

Choosing whether to prioritize speed or quality with Jasper.

I tested it myself, and (as expected) the content generated faster when Speed was selected:

An example of what Jasper generates when the speed option is selected.

When I selected “Quality,” the content quality was better, but it took longer to generate:

An example of the content Jasper generates when Quality is selected.

Content Quality: Scalenut

Scalenut uses advanced AI algorithms to generate high-quality content. It analyzes and understands the writing context through natural language processing and machine learning, resulting in human-like accuracy.

Scalenut also offers a tone of voice feature, but you must add tones manually. Once created, you can use them for whatever content you create using Scalenut.

Adding a tone of voice using Scalenut.

Scalenut generates content with an excellent article structure and lets you regenerate, rephrase, expand, shorten, or customize it instantly with the Refresh button.

Selecting the Refresh button to regenerate, rephrase, expand, shorten, or customize a paragraph in Scalenut.

Winner: Jasper

Scalenut and Jasper both generate high-quality content using machine learning algorithms and natural language processing, and both let you create custom tones. However, Jasper goes out of its way to ensure your content is engaging and has plenty of charisma.

Ease of Use: Jasper AI

The Jasper AI dashboard.

Jasper AI has a clean and user-friendly interface. Clicking the “Create new content” button offers a straightforward writing experience, making content generation easy for users.

Selecting what type of content to create in Jasper.

With various templates and pre-designed content structures, Jasper AI assists you in whatever you are writing.

Ease of Use: Scalenut

The Scalenut dashboard.

Like Jasper, Scalenut stands out for its user-friendly interface and intuitive design, making it effortless for writers of all levels to navigate and utilize.

The platform offers a simple, straightforward writing experience, easily accessible features, and clear instructions. Using Cruise Mode, users can generate a 1,500+ word article within minutes by following a few easy steps.

Winner: Both

Both Jasper and Scalenut have clean and simple user interfaces with plenty of resources, making it easy for anyone to get started with AI writing.

Language Support: Jasper AI

Language support available in Jasper AI.

Jasper AI offers robust language support, catering to 25+ languages such as English, Spanish, French, German, Italian, Portuguese, Dutch, and more. This extensive language support allows users to create content in their preferred language, enabling them to reach and engage with specific target audiences effectively.

With advanced natural language processing algorithms, Jasper AI ensures that the generated content is high quality, regardless of the chosen language.

The platform also offers translation capabilities, making it convenient for users to translate their content into different languages.

The languages Jasper will translate.

Jasper AI's comprehensive language support makes it an excellent choice for global users and businesses in diverse markets.

Language Support: Scalenut

The Scalenut AI writing tool offers language support exclusively in English. This can be extremely limiting for those who don't speak English or are not targeting English speakers with their content.

Winner: Jasper

The clear winner here is Jasper, supporting 25+ languages with translation.

AI Templates: Jasper AI

AI templates available in Jasper.

Jasper AI offers 50+ AI templates for long and short-form content, such as blog posts, social media content, and product descriptions. These templates enable users to generate high-quality content quickly and efficiently.

AI Templates: Scalenut

AI templates available in Scalenut.

Scalenut also offers 40+ AI templates for blog posts, social media content, and product descriptions. These templates are pre-designed and ready to use, saving users valuable time and effort in creating content from scratch.

The templates Jasper and Scalenut offer are largely the same, with some unique to Scalenut, including:

  • First-Person to Third-Person Converter
  • Active to Passive Converter
  • Passive to Active Converter

Winner: Jasper

Jasper and Scalenut give users many templates to guide their content creation process. However, Jasper wins with 50+ AI templates, while Scalenut offers 40+.

Integrations: Jasper AI

Jasper AI integrations.

Jasper AI has several useful integrations:

  • Surfer SEO: You can connect your account to optimize your content using Surfer SEO.
  • Plagiarism Checker: Make sure your content is original (must buy credits separately).
  • DeepL Translation: Translate your content into 30+ languages with a click.
  • Grammarly: Connect your Grammarly account to check for spelling and grammar mistakes.

This eliminates the need to switch between tools, saving you time.

Integrations: Scalenut

Scalenut integrations.

Scalenut integrates with popular platforms:

  • WordPress: Directly publish onto your WordPress with a few clicks.
  • Semrush: Use one of the most popular keyword research tools to find trending keywords and create keyword clusters to dominate your niche.
  • Copyscape: Make sure your content is one-of-a-kind and free of plagiarism.

Scalenut includes SEO optimization tools on par with Surfer SEO, eliminating the need for a separate account and the additional expenses associated with Surfer SEO.

Winner: Both

Scalenut and Jasper provide outstanding integration options, with the choice ultimately depending on the user's budget and preferences.

Customer Support: Jasper AI

Customer support options available in Jasper.

Jasper provides excellent customer support and resources, like:

  • Help center/knowledge base: Articles containing instant answers to commonly asked questions. You can also send a message to their team (they usually respond within a few hours).
  • Business plan demo: Fill out a form to request a demonstration of Jasper Business.
  • Jasper Academy: Complete Jasper courses to learn how to use Jasper on a basic or professional level and get Jasper certified (you're rewarded with a badge at the end).
  • Jasper Jumpstart: Become a Jasper pro through this collection of courses.
  • Webinars: Access Jasper webinars live and on demand.
  • YouTube channel: Jasper's YouTube channel consists of a variety of Jasper tutorials told through engaging videos.
  • Community: Join the free and private Facebook community to get help, share ideas, and connect with other Jasper users.

Customer Support: Scalenut

Customer support options available in Scalenut.

Scalenut provides comprehensive customer support, including:

  • Help Center: Articles providing advice and answers.
  • Bootcamp: Video tutorials.
  • Community: Join the free and private Facebook community to get help, share ideas, and connect with other Scalenut users.
  • 1:1 Demo: Schedule a free 1:1 demonstration of Scalenut via Calendly.
  • Chat with us: Instantly start a chat with the Scalenut team to get help when needed.
  • Email us: Directly email Scalenut.
  • Onboarding Support: Book a 30-minute Calendly meeting to get started with Scalenut.

Winner: Scalenut

While both Jasper and Scalenut provide plenty of resources to familiarize users with their product, get answers to their questions, connect with the community, and contact support via email or chat, Scalenut offers 1:1 Calendly meetings for those requiring extra help and guidance.

Pricing: Jasper AI

Jasper AI pricing plans.

  • Creator: $49 per month (best for individual content creators who want to improve their content speed and quality).
  • Teams: $125 per month (best for creating content for multiple brands and collaborating with others on campaigns).
  • Business: Custom (best for personalizing your AI features how you like and getting team training tailored to your company).

Jasper offers a 7-day free trial with each plan, allowing you to test the software and determine if it meets your needs. It's important to note that every Jasper plan includes Jasper Art (an integrated AI art generator) and unlimited AI word generation – a significant improvement from their previous credit-based system.

For the monthly plans, pricing starts at $49 per month for a single user. This plan includes one brand voice, 50 knowledge assets, 50+ templates, access to SEO mode, and the Jasper browser extension.

The price increases up to $125 per month for three users. With this plan, you get everything in the previous plan plus three brand voices, 150 knowledge assets, the ability to customize your template, instant campaigns, collaboration, and user management.

If you choose the Business plan, pricing is customized to meet your specific requirements. This plan offers your company the most control, customization, security, and training.

If you'd like to save 20%, consider one of their annual plans.

Pricing: Scalenut

Scalenut pricing plans.

  • Essential: $39 per month (best for creators and consultants)
  • Growth: $79 per month (best for startups and growing businesses)
  • Pro: $149 per month (best for large teams, businesses, and agencies)

Scalenut provides a complimentary 7-day trial to assess the software's suitability for your requirements.

The monthly packages start at $39 per month and include 100,000 AI words, five SEO articles monthly, over 40 AI templates, Cruise Mode, SERP Analysis, NLP Key Terms, SEO Editor, document sharing, a Chrome extension, email support, and live chat support.

For $79 per month, you can access all the features mentioned in the previous plan as well as unlimited AI words, 30 SEO articles, 30 keyword clusters, 30 pages for auditing and optimization, unlimited tone of voice options, fix-it auto optimizer, one-click WordPress publishing capability, and integrations.

The highest-priced plan is $149 per month and includes all the features of the previous plan in addition to unlimited AI words, 75 SEO articles per month, 75 keyword clusters per month, and 75 pages for auditing and optimization purposes. Furthermore, it offers a dedicated manager and the option to add more users for $49 each.

If you wish to save 50%, consider opting for one of their annual plans.

Winner: Scalenut

The Scalenut pricing plan is much more affordable than Jasper, especially with its annual plans.

Jasper AI vs Scalenut: Which Should You Choose?

Jasper and Scalenut have their strengths and weaknesses, so the true winner depends on the user and their needs. Who is Scalenut and Jasper AI best suited for?

Who Should Use Jasper AI?

  • Marketers
  • Businesses
  • Non-native English speakers

Jasper AI is better suited for marketers and businesses who want top-notch written content that reflects their brand.

While writers and individuals aiming to streamline their writing process can also benefit from using Jasper, its focus is increasingly on brand tone of voice and marketing, as seen by its latest feature enabling the instant creation of campaigns with AI.

Jasper is a great option for those who speak various languages, as it supports over 25 languages.

Who Should Use Scalenut?

  • Bloggers
  • Content creators
  • English speakers

Scalenut is the best choice for bloggers, content creators, and English speakers looking to streamline content production.

Scalenut offers the convenience of Cruise Mode, which allows you to generate a comprehensive SEO-optimized outline and article within minutes. While Jasper has a feature called the “One-Shot Blog Post” that serves a similar purpose, it cannot compete with the versatility and features provided by Cruise Mode.

Jasper and Scalenut both offer SEO optimization tools, but Scalenut goes above and beyond by integrating Semrush for keyword planning and clusters. In addition, Scalenut offers a more budget-friendly pricing structure than Jasper, making it the better option for individuals, while businesses with more disposable income may prefer Jasper.

These additional tools make Scalenut particularly well-suited for bloggers and content creators who often work on longer pieces of content.

Final Thoughts

After a detailed comparison between Jasper AI and Scalenut, the true winner depends on the user's overall needs.

Jasper is ideal for a marketer or business owner looking for an AI writing tool that delivers exceptional content while keeping their brand voice consistent. However, if you are a blogger or content creator producing longer content and want to streamline your workflow, I'd highly recommend Scalenut.

To go deeper into the comparison and make an informed decision based on your unique needs, read my Jasper AI Review and Scalenut Review or visit Jasper AI or Scalenut.

Frequently Asked Questions

What is the best alternative to Jasper AI?

Scalenut is a highly recommended alternative to Jasper AI, offering similar AI-powered writing capabilities and features. Each tool has its own strengths and weaknesses, so the best alternative depends on individual needs and preferences.

Is Jasper AI better than copy AI?

Jasper AI and Copy AI are powerful AI writing tools with unique features. Jasper focuses more on keeping the brand tone of voice consistent and content generation for teams, while Copy AI focuses more on copywriting and marketing-oriented content.

What is Jasper AI good for?

Jasper AI is an excellent writing assistant that generates high-quality content quickly, perfect for creating blog posts, articles, and captivating social media captions. Jasper is ideal for businesses and individuals seeking to produce large volumes of written content.

Is Jasper AI the same as Jarvis AI?

Yes, Jasper AI and Jarvis AI are the same company. Jarvis AI underwent a rebrand to Jasper AI at the beginning of 2022.

OpenAI, Google and More Agree to White House List of Eight AI Safety Assurances

The White House.
Image: Bill Chizek/Adobe Stock

Some of the largest generative AI companies operating in the U.S. plan to watermark their content, a fact sheet from the White House revealed on Friday, July 21. Amazon, Anthropic, Google, Inflection, Meta, Microsoft and OpenAI agreed to eight voluntary commitments around the use and oversight of generative AI, including watermarking.

This follows a March statement about the White House’s concerns about the misuse of AI. Also, the agreement comes at a time when regulators are nailing down procedures for managing the effect generative artificial intelligence has had on technology and the ways people interact with it since ChatGPT put AI content in the public eye in November 2022.

Jump to:

  • What are the eight AI safety commitments?
  • Government regulation of AI may discourage malicious actors

What are the eight AI safety commitments?

The eight AI safety commitments include:

  • Internal and external security testing of AI systems before their release.
  • Sharing information across the industry and with governments, civil society and academia on managing AI risks.
  • Investing in cybersecurity and insider threat safeguards, specifically to protect model weights, which impact bias and the concepts the AI model associates together.
  • Encouraging third-party discovery and reporting of vulnerabilities in their AI systems.
  • Publicly reporting all AI systems’ capabilities, limitations and areas of appropriate and inappropriate use.
  • Prioritizing research on bias and privacy.
  • Helping to use AI for beneficial purposes such as cancer research.
  • Developing robust technical mechanisms for watermarking.

The watermark commitment involves generative AI companies developing a way to mark text, audio or visual content as machine-generated; it will apply to any publicly available generative AI content created after the watermarking system is locked in. Since the watermarking system hasn’t been created yet, it will be some time before a standard way to tell whether content is AI generated becomes publicly available.


Government regulation of AI may discourage malicious actors

Former Microsoft Azure global vice president and current Cognite chief procurement officer Moe Tanabian supports government regulation of generative AI. He compared the current era of generative AI with the rise of social media, including possible downsides like the Cambridge Analytica data privacy scandal and other misinformation during the 2016 election, in a conversation with TechRepublic.

“There are a lot of opportunities for malicious actors to take advantage of [generative AI], and use it and misuse it, and they are doing it. So, I think, governments have to have some watermarking, some root of trust element that they need to instantiate and they need to define,” Tanabian said.

“For example, phones should be able to detect if malicious actors are using AI-generated voices to leave fraudulent voice messages,” he said.

“Technologically, we’re not disadvantaged. We know how to [detect AI-generated content],” Tanabian said. “Requiring the industry and putting in place those regulations so that there is a root of trust that we can authenticate this AI generated content is the key.”


YOLOv7: The Most Advanced Object Detection Algorithm?

July 6, 2022 will be marked down as a landmark in AI history, because it was the day YOLOv7 was released. Ever since its launch, YOLOv7 has been the hottest topic in the computer vision developer community, and for good reason. YOLOv7 is already being regarded as a milestone in the object detection industry.

Shortly after the YOLOv7 paper was published, it emerged as the fastest and most accurate real-time object detection model. But how does YOLOv7 outcompete its predecessors? What makes YOLOv7 so efficient at computer vision tasks?

In this article, we will analyze the YOLOv7 model and try to answer why YOLOv7 is becoming an industry standard. But before we can answer that, we will have to take a brief look at the history of object detection.

What is Object Detection?

Object detection is a branch of computer vision that identifies and locates objects in an image or a video file. Object detection is the building block of numerous applications, including self-driving cars, video surveillance, and even robotics.

Object detection models can be classified into two different categories: single-shot (single-stage) detectors and two-stage detectors.

Real Time Object Detection

To truly understand how YOLOv7 works, it's essential to understand YOLOv7's main objective: real-time object detection, a key component of modern computer vision. Real-time object detection models identify and locate objects of interest as each frame arrives, making it efficient for developers to track objects in a moving frame like a video or a live surveillance feed.

Real-time object detection models are essentially a step ahead of conventional image detection models. While the former track objects across video frames, the latter locate and identify objects within a stationary frame like an image.

As a result, real-time object detection models are really efficient for video analytics, autonomous vehicles, object counting, multi-object tracking, and much more.
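As a concrete illustration of how detection quality is measured, the standard metric for comparing a predicted box against a ground-truth box is intersection over union (IoU). Below is a minimal Python sketch; the `(x1, y1, x2, y2)` corner format is an assumption for illustration, and real frameworks may use other box encodings:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 patch: intersection 1, union 7 -> ~0.143
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

An IoU of 1.0 means a perfect match, 0.0 means no overlap; benchmarks typically count a detection as correct above some IoU threshold such as 0.5.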

What is YOLO?

YOLO, or “You Only Look Once”, is a family of real-time object detection models. The YOLO concept was first introduced in 2016 by Joseph Redmon, and it was the talk of the town almost instantly because it was much quicker and much more accurate than existing object detection algorithms. It wasn’t long before the YOLO algorithm became a standard in the computer vision industry.

The fundamental concept the YOLO algorithm proposes is to use an end-to-end neural network that predicts bounding boxes and class probabilities in real time. YOLO differed from previous object detection models in that, instead of repurposing classifiers to perform detection, it frames object detection as a regression problem.

The change in approach worked: YOLO soon became the industry standard, as the performance gap between it and other real time object detection algorithms was significant. But why was YOLO so efficient?

Before YOLO, object detection algorithms typically used region proposal networks to detect possible regions of interest, and the recognition process was then performed on each region separately. As a result, these models often performed multiple iterations over the same image, hence the lower accuracy and higher execution time. The YOLO algorithm, on the other hand, performs all of its predictions at once in a single pass through the network.
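To make the single-pass idea concrete, here is a minimal sketch of a YOLOv1-style output grid. The values S=7, B=2, C=20 come from the original YOLO paper; the random tensor is purely a stand-in for a real network's output, and `decode_cell` is an illustrative helper, not part of any YOLO codebase:

```python
import numpy as np

# One forward pass yields an S x S grid; each cell predicts B boxes
# (x, y, w, h, confidence) plus C class probabilities -- no per-region
# second pass is needed.
S, B, C = 7, 2, 20
rng = np.random.default_rng(0)
pred = rng.random((S, S, B * 5 + C))       # stand-in for a network's output

def decode_cell(pred, row, col, img_size=448):
    """Convert the first box of one grid cell to absolute pixel coordinates."""
    x, y, w, h, conf = pred[row, col, :5]
    cell = img_size / S
    cx = (col + x) * cell                  # x, y are offsets within the cell
    cy = (row + y) * cell
    return cx, cy, w * img_size, h * img_size, conf

print(pred.shape)                          # (7, 7, 30)
```

Because every box comes out of the same tensor, the whole image is processed exactly once, which is the source of YOLO's speed advantage over per-region pipelines.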

How Does YOLO Work?

There are three steps that explain how a YOLO algorithm works.

Reframing Object Detection as a Single Regression Problem

The YOLO algorithm reframes object detection as a single regression problem, straight from image pixels to class probabilities and bounding box coordinates. Hence, the algorithm has to look at the image only once to predict and locate the target objects.

Reasons About the Image Globally

Furthermore, when the YOLO algorithm makes predictions, it reasons about the image globally. Unlike region proposal-based and sliding-window techniques, the YOLO algorithm sees the complete image during training and testing, and is therefore able to encode contextual information about the classes and how they appear.

Before YOLO, Fast R-CNN was one of the most popular object detection algorithms, but because it couldn’t see the larger context in an image, it would mistake background patches for objects. YOLO makes less than half the number of background errors that Fast R-CNN does.

Generalizes Representation of Objects

Finally, the YOLO algorithm also aims at generalizing the representations of objects in an image. As a result, when YOLO was trained on natural images and its results were tested, it outperformed existing R-CNN models by a wide margin. Because YOLO is highly generalizable, the chances of it breaking down on unexpected inputs or new domains are slim.

YOLOv7: What’s New?

Now that we have a basic understanding of what real time object detection models are and what the YOLO algorithm is, it’s time to discuss the YOLOv7 algorithm.

Optimizing the Training Process

The YOLOv7 algorithm not only tries to optimize the model architecture, but also aims at optimizing the training process. It uses optimization modules and methods to improve the accuracy of object detection, strengthening training at a higher training cost while leaving the inference cost unchanged. These optimization modules are referred to as a trainable bag of freebies.

Coarse to Fine Lead Guided Label Assignment

The YOLOv7 algorithm uses a new coarse-to-fine lead guided label assignment instead of conventional dynamic label assignment. This is because, with dynamic label assignment, training a model with multiple output layers causes some issues, the most common being how to assign dynamic targets for the different branches and their outputs.

Model Re-Parameterization

Model re-parameterization is an important concept in object detection, but its use generally comes with some issues during training. The YOLOv7 algorithm uses the concept of the gradient propagation path to analyze which model re-parameterization policies are applicable to which layers in the network.

Extend and Compound Scaling

The YOLOv7 algorithm also introduces extended and compound scaling methods to make effective use of parameters and computation for real time object detection.

YOLOv7: Related Work

Real Time Object Detection

YOLO is currently the industry standard, and most real time object detectors deploy YOLO algorithms or FCOS (Fully Convolutional One-Stage Object Detection). A state-of-the-art real time object detector usually has the following characteristics:

  • Stronger & faster network architecture.
  • An effective feature integration method.
  • An accurate object detection method.
  • A robust loss function.
  • An efficient label assignment method.
  • An efficient training method.

The YOLOv7 algorithm does not use self-supervised learning or distillation methods, which often require large amounts of data. Instead, the YOLOv7 algorithm uses a trainable bag-of-freebies method.

Model Re-Parameterization

Model re-parameterization is regarded as an ensemble technique that merges multiple computational modules into one at the inference stage. The technique can be further divided into two categories: model-level ensemble and module-level ensemble.

To obtain the final inference model, model-level re-parameterization uses one of two practices. The first trains numerous identical models on different training data and then averages the weights of the trained models. The other averages the weights of a single model across different training iterations.
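As a toy illustration of the second practice, here is a sketch of averaging a model's weights across checkpoints (akin to stochastic weight averaging). The two-tensor "checkpoints" are purely illustrative stand-ins for real model state, which would hold many tensors:

```python
import numpy as np

# Two snapshots of the same (tiny) model at different training iterations.
checkpoints = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.5])},   # e.g. iteration 10
    {"w": np.array([3.0, 4.0]), "b": np.array([1.5])},   # e.g. iteration 20
]

def average_checkpoints(ckpts):
    """Average every weight tensor element-wise across checkpoints."""
    return {k: np.mean([c[k] for c in ckpts], axis=0) for k in ckpts[0]}

merged = average_checkpoints(checkpoints)
print(merged["w"], merged["b"])   # [2. 3.] [1.]
```

The merged weights form a single inference model, so the ensemble costs nothing extra at inference time.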

Module-level re-parameterization has gained immense popularity recently because it splits a module into different module branches, or different identical branches, during the training phase and then integrates these branches into an equivalent module at inference time.

However, re-parameterization techniques cannot be applied to all kinds of architecture. This is why the YOLOv7 algorithm uses new model re-parameterization techniques to design strategies suited to different architectures.

Model Scaling

Model scaling is the process of scaling an existing model up or down so it fits across different computing devices. Model scaling generally adjusts factors such as the number of layers (depth), the size of input images (resolution), the number of feature pyramids (stage), and the number of channels (width). These factors play a crucial role in ensuring a balanced trade-off between network parameters, inference speed, computation, and accuracy.

One of the most commonly used scaling methods is NAS, or Network Architecture Search, which automatically searches for suitable scaling factors from the search space without any complicated rules. The major downside of NAS is that it is an expensive approach for searching for suitable scaling factors.

Almost every model scaling method analyzes individual scaling factors independently, and even optimizes these factors independently, because the NAS architecture works with non-correlated scaling factors.

It’s worth noting that concatenation-based models like VoVNet or DenseNet change the input width of a few layers when the depth of the models is scaled. YOLOv7 works on a proposed concatenation-based architecture, and hence uses a compound scaling method.

The figure mentioned above compares the extended efficient layer aggregation networks (E-ELAN) of different models. The proposed E-ELAN method maintains the gradient transmission path of the original architecture but aims at increasing the cardinality of the added features using group convolution. The process can enhance the features learned by different feature maps and make the use of calculations and parameters more efficient.

YOLOv7 Architecture

The YOLOv7 model uses the YOLOv4, YOLO-R, and Scaled YOLOv4 models as its base. YOLOv7 is the result of experiments carried out on these models to improve results and make the model more accurate.

Extended Efficient Layer Aggregation Network or E-ELAN

E-ELAN is the fundamental building block of the YOLOv7 model, and it is derived from existing work on network efficiency, mainly ELAN.

The main considerations when designing an efficient architecture are the number of parameters, computational density, and the amount of computation. Other models also consider factors like influence of input/output channel ratio, branches in the architecture network, network interference speed, number of elements in the tensors of convolutional network, and more.

The CSPVoVNet model not only considers the above-mentioned parameters but also analyzes the gradient path so that the weights of different layers can learn more diverse features, making inference faster and more accurate. The ELAN architecture, in turn, aims at designing an efficient network by controlling the shortest and the longest gradient paths so that the network learns and converges more effectively.

ELAN has already reached a stable state regardless of the number of stacked computational blocks and the gradient path length. That stable state might be destroyed if computational blocks are stacked without limit, and the parameter utilization rate will diminish. The proposed E-ELAN architecture solves this issue, as it uses expansion, shuffling, and merging cardinality to continuously enhance the network’s learning ability while retaining the original gradient path.

Furthermore, when comparing the architecture of E-ELAN with ELAN, the only difference is in the computational block, while the transition layer’s architecture is unchanged.

E-ELAN proposes to expand the cardinality of the computational blocks and expand the channel by using group convolution. The feature map is then shuffled into groups according to the group parameter and concatenated together. The number of channels in each group remains the same as in the original architecture. Lastly, the groups of feature maps are added together to merge cardinality.
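The expand-shuffle-merge idea can be sketched roughly as follows. Note that the group reversal used as the "shuffle" and the group count are illustrative simplifications, not the exact E-ELAN operation:

```python
import numpy as np

def e_elan_merge(feat, groups=2):
    """Sketch of E-ELAN's shuffle-then-merge on a (C, H, W) feature map:
    split channels into groups, shuffle the groups, then add them so the
    merged map keeps the per-group channel count."""
    c, h, w = feat.shape
    assert c % groups == 0
    grouped = feat.reshape(groups, c // groups, h, w)
    shuffled = grouped[::-1]           # stand-in for a channel shuffle
    return shuffled.sum(axis=0)        # merge cardinality by addition

x = np.ones((8, 4, 4))                 # 8 channels expanded across 2 groups
out = e_elan_merge(x)
print(out.shape)                       # (4, 4, 4)
```

The key property the sketch preserves is that merging happens by addition across groups, so cardinality grows during training without changing the per-group channel width.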

Model Scaling for Concatenation Based Models

Model scaling helps in adjusting attributes of a model to generate models of different scales, meeting the requirements of different inference speeds.

The figure discusses model scaling for different concatenation-based models. As you can see in figures (a) and (b), the output width of the computational block increases with an increase in the depth scaling of the model. As a result, the input width of the transmission layers increases. When these methods are implemented on a concatenation-based architecture, the scaling process performed in depth is depicted in figure (c).

It can thus be concluded that it’s not possible to analyze the scaling factors independently for concatenation-based models, and rather they must be considered or analyzed together. Therefore, for a concatenation based model, it's suitable to use the corresponding compound model scaling method. Additionally, when the depth factor is scaled, the output channel of the block must be scaled as well.
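A minimal sketch of such compound scaling follows. The 1.5× depth and 1.25× width factors are the ones this article reports for YOLOv7's scaling strategy; the base depth and width values are made up for illustration:

```python
def compound_scale(depth, width, depth_factor=1.5, width_factor=1.25):
    """Scale a concatenation-based block's depth and width together.
    Scaling depth alone would change the block's output width anyway,
    so the transition layer's input width must be scaled in step."""
    new_depth = round(depth * depth_factor)
    new_width = round(width * width_factor)
    return new_depth, new_width

# e.g. a block with 4 computational layers and 256 output channels:
print(compound_scale(4, 256))   # (6, 320)
```

The point of the coupling is that neither factor can be tuned in isolation for a concatenation-based model: the two scaled values must remain consistent with each other.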

Trainable Bag of Freebies

A bag of freebies is a term that developers use to describe a set of methods or techniques that can alter the training strategy or cost in an attempt to boost model accuracy. So what are these trainable bags of freebies in YOLOv7? Let’s have a look.

Planned Re-Parameterized Convolution

The YOLOv7 algorithm uses gradient flow propagation paths to determine how best to combine a network with re-parameterized convolution. This approach is an attempt to counter the RepConv algorithm, which, although it performed admirably on the VGG model, performs poorly when applied directly to the DenseNet and ResNet models.

The RepConv algorithm combines a 3×3 convolution, a 1×1 convolution, and an identity connection in one convolutional layer. Analyzing the algorithm, its performance, and the architecture reveals that RepConv’s identity connection destroys the concatenation in DenseNet and the residual in ResNet.

The image above depicts a planned re-parameterized model. The YOLOv7 algorithm found that a layer in the network with concatenation or residual connections should not have an identity connection in the RepConv algorithm. In those positions, RepConv is replaced with RepConvN, which has no identity connections.
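A rough sketch of how such a re-parameterization folds the extra branches into a single 3×3 kernel at inference time. Bias terms and batch normalization are omitted for brevity, `fuse_repconv` is an illustrative helper, and the arrays are toy values:

```python
import numpy as np

def fuse_repconv(k3, k1, identity=True):
    """Fold a RepConv-style block into one 3x3 kernel.
    k3: (C_out, C_in, 3, 3) branch; k1: (C_out, C_in, 1, 1) branch.
    The 1x1 kernel is zero-padded into the 3x3 centre, and the identity
    branch becomes a centred 1 on each channel's diagonal."""
    fused = k3.copy()
    fused[:, :, 1:2, 1:2] += k1            # 1x1 lands in the kernel centre
    if identity:
        for i in range(k3.shape[0]):       # identity only exists per channel
            fused[i, i, 1, 1] += 1.0
    return fused

k3 = np.zeros((2, 2, 3, 3))
k1 = np.ones((2, 2, 1, 1))
print(fuse_repconv(k3, k1)[0, 0, 1, 1])    # 2.0 (1x1 branch + identity)
```

Setting `identity=False` corresponds to the RepConvN variant used where a layer already has a residual or concatenation connection.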

Coarse for Auxiliary and Fine for Lead Loss

Deep supervision is a technique often used in the training of deep networks. Its fundamental principle is to add an auxiliary head in the middle layers of the network and to guide the shallow network weights with an assistant loss. The YOLOv7 algorithm refers to the head responsible for the final output as the lead head, and the auxiliary head is the head that assists in training.

Moving along, YOLOv7 uses a different method for label assignment. Conventionally, labels have been generated by referring directly to the ground truth on the basis of a given set of rules. In recent years, however, the distribution and quality of the prediction output have come to play an important role in generating reliable labels. YOLOv7 generates a soft label of the object by using the bounding box predictions together with the ground truth.

Furthermore, the new label assignment method of the YOLOv7 algorithm uses the lead head’s predictions to guide both the lead and the auxiliary head. The label assignment method has two proposed strategies.

Lead Head Guided Label Assigner

This strategy makes calculations on the basis of the lead head’s prediction results and the ground truth, and then uses optimization to generate soft labels. These soft labels are then used as training targets for both the lead head and the auxiliary head.

The strategy rests on the assumption that because the lead head has a greater learning capability, the labels it generates should be more representative of the correlation between the source and the target.

Coarse-to-Fine Lead Head Guided Label Assigner

This strategy also makes calculations on the basis of the lead head’s prediction results and the ground truth, and then uses optimization to generate soft labels. However, there’s a key difference: this strategy produces two sets of soft labels, a coarse label and a fine label.

The coarse label is generated by relaxing the constraints of the positive sample assignment process so that more grids are treated as positive targets. This is done to avoid the risk of losing information because of the auxiliary head’s weaker learning strength.

The figure above explains the use of a trainable bag of freebies in the YOLOv7 algorithm, depicting coarse labels for the auxiliary head and fine labels for the lead head. Comparing the model with an auxiliary head (b) with the normal model (a), the schema in (b) has an auxiliary head, while (a) does not.

Figure (c) depicts the common independent label assigner, while figures (d) and (e) respectively represent the Lead Guided Assigner and the Coarse-to-Fine Lead Guided Assigner used by YOLOv7.
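Both guided assigners rely on scoring the lead head's predicted boxes against the ground truth. Here is a minimal sketch of such a box-quality score, using plain IoU as an illustrative stand-in for the optimization the method actually performs:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

# The lead head's predicted box is scored against the ground truth; in a
# lead-guided scheme, that score becomes the soft training target for BOTH
# the lead head and the auxiliary head.
lead_pred = (10, 10, 50, 50)
ground_truth = (10, 10, 50, 30)
print(iou(lead_pred, ground_truth))   # 0.5
```

A coarse assigner would then accept lower-scoring grid cells as positives for the auxiliary head, while the fine assigner keeps only the higher-scoring ones for the lead head.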

Other Trainable Bag of Freebies

In addition to the ones mentioned above, the YOLOv7 algorithm uses additional bags of freebies, although they were not originally proposed in YOLOv7. They are:

  • Batch Normalization in Conv-BN-Activation topology: This strategy connects a convolutional layer directly to the batch normalization layer, so that the BN statistics can later be folded into the convolution for inference.
  • Implicit Knowledge in YOLOR: YOLOv7 combines the implicit knowledge of YOLOR with the convolutional feature map.
  • EMA Model: The EMA model is used purely as the final inference model in YOLOv7, although it is primarily used in the mean teacher method.
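The Conv-BN freebie amounts to folding batch normalization statistics into the convolution's weights and bias at inference time. A small NumPy sketch of that folding follows; `fold_bn` is an illustrative helper, and the shapes and values are toy examples:

```python
import numpy as np

def fold_bn(kernel, bias, gamma, beta, mean, var, eps=1e-5):
    """Absorb a BatchNorm layer (gamma, beta, running mean/var) into the
    preceding conv. kernel: (C_out, C_in, kH, kW); the rest are
    per-output-channel vectors. Returns the fused kernel and bias."""
    scale = gamma / np.sqrt(var + eps)
    fused_kernel = kernel * scale[:, None, None, None]
    fused_bias = beta + (bias - mean) * scale
    return fused_kernel, fused_bias

# Toy 1-channel 3x3 conv followed by a BN layer with gamma=2, beta=1:
k = np.ones((1, 1, 3, 3)); b = np.zeros(1)
fk, fb = fold_bn(k, b, gamma=np.array([2.0]), beta=np.array([1.0]),
                 mean=np.array([0.0]), var=np.array([1.0]))
print(round(float(fk[0, 0, 0, 0]), 3), float(fb[0]))   # 2.0 1.0
```

After folding, inference runs a single convolution with the fused weights, which is exactly why wiring Conv directly to BN during training pays off at deployment time.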

YOLOv7: Experiments

Experimental Setup

The YOLOv7 algorithm uses the Microsoft COCO dataset for training and validating its object detection model, and the experiments do not use pre-trained models. The developers used the 2017 train split for training and the 2017 validation split for selecting hyperparameters. Finally, the performance of the YOLOv7 object detectors is compared with state-of-the-art object detection algorithms.

The developers designed basic models for the edge GPU (YOLOv7-tiny), normal GPU (YOLOv7), and cloud GPU (YOLOv7-W6). Furthermore, the YOLOv7 algorithm applies model scaling to these basic models to obtain different models for different service requirements. For the YOLOv7 algorithm, stack scaling is done on the neck, and the proposed compound scaling is used to scale up the depth and width of the model.

Baselines

The YOLOv7 algorithm uses previous YOLO models, and the YOLOR object detection algorithm as its baseline.

The above figure compares the baseline of the YOLOv7 model with other object detection models, and the results are quite evident. When compared with the YOLOv4 algorithm, YOLOv7 not only uses 75% fewer parameters, but it also uses 15% less computation and has 0.4% higher accuracy.

Comparison with State of the Art Object Detector Models

The above figure shows the results when YOLOv7 is compared against state of the art object detection models for mobile & general GPUs. It can be observed that the method proposed by the YOLOv7 algorithm has the best speed-accuracy trade-off score.

Ablation Study: Proposed Compound Scaling Method

The figure shown above compares the results of using different strategies for scaling up the model. The scaling strategy in the YOLOv7 model scales up the depth of the computational block by 1.5 times, and scales the width by 1.25 times.

When compared with a model that only scales up the width, the YOLOv7 model performs 0.5% better while using fewer parameters and less computation. On the other hand, when compared with a model that only scales up the depth, YOLOv7’s accuracy improves by 0.2%, but the number of parameters needs to increase by 2.9% and computation by 1.2%.

Proposed Planned Re-Parameterized Model

To verify the generality of its proposed planned re-parameterized model, the YOLOv7 algorithm applies it to residual-based and concatenation-based models for verification. For the verification process, the YOLOv7 algorithm uses 3-stacked ELAN for the concatenation-based model and CSPDarknet for the residual-based model.

For the concatenation-based model, the algorithm replaces the 3×3 convolutional layers in the 3-stacked ELAN with RepConv. The figure below shows the detailed configuration of Planned RepConv, and 3-stacked ELAN.

Furthermore, when dealing with the residual-based model, the YOLOv7 algorithm uses a reversed dark block because the original dark block does not have a 3×3 convolution block. The below figure shows the architecture of the Reversed CSPDarknet that reverses the positions of the 3×3 and the 1×1 convolutional layer.

Proposed Assistant Loss for Auxiliary Head

For the assistant loss for auxiliary head, the YOLOv7 model compares the independent label assignment for the auxiliary head & lead head methods.

The figure above contains the results of the study on the proposed auxiliary head. It can be seen that the overall performance of the model increases with an increase in the assistant loss. Furthermore, the lead guided label assignment proposed by the YOLOv7 model performs better than independent lead assignment strategies.

YOLOv7 Results

Based on the above experiments, here are the results of YOLOv7’s performance when compared to other object detection algorithms.

The above figure compares the YOLOv7 model with other object detection algorithms, and it can be clearly observed that YOLOv7 surpasses other object detection models in terms of Average Precision (AP) versus batch inference.

Furthermore, the figure below compares the performance of YOLOv7 against other real time object detection algorithms. Once again, YOLOv7 surpasses other models in terms of overall performance, accuracy, and efficiency.

Here are some additional observations from the YOLOv7 results & performances.

  1. The YOLOv7-Tiny is the smallest model in the YOLO family, with over 6 million parameters. The YOLOv7-Tiny has an Average Precision of 35.2%, and it outperforms the YOLOv4-Tiny models with comparable parameters.
  2. The YOLOv7 model has over 37 million parameters, and it outperforms models with more parameters, like YOLOv4.
  3. The YOLOv7 model has the highest mAP and FPS rate in the range of 5 to 160 FPS.

Conclusion

YOLO or You Only Look Once is the state of the art object detection model in modern computer vision. The YOLO algorithm is known for its high accuracy, and efficiency, and as a result, it finds extensive application in the real time object detection industry. Ever since the first YOLO algorithm was introduced back in 2016, experiments have allowed developers to improve the model continuously.

The YOLOv7 model is the latest addition to the YOLO family, and it’s the most powerful YOLO algorithm to date. In this article, we have covered the fundamentals of YOLOv7 and tried to explain what makes YOLOv7 so efficient.

Microsoft’s Bing Chat comes to Chrome and Safari in tests for ‘select users’

Sarah Perez @sarahintampa

Microsoft’s AI chatbot, Bing Chat, is coming to non-Microsoft browsers, the company confirmed today following various reports of the AI chatbot being spotted in other browsers like Google Chrome and Apple’s Safari. The expansion will make Microsoft’s ChatGPT-like AI chatbot available to a broader set of users, as it was previously available to consumers only within Microsoft products, like the Bing mobile app and Microsoft Edge browser.

The company confirmed to TechCrunch that Bing Chat is expanding to other browsers, which hadn’t yet been officially announced.

“We are flighting access to Bing Chat in Safari and Chrome to select users as part of our testing on other browsers,” said Microsoft director of communications, Caitlin Roulston, in an emailed statement. “We are excited to expand access to even more users once our standard testing procedures are complete.”

According to those who gained access to the Bing AI chatbot on Windows, they received a pop-up in the Windows 10 or 11 taskbar, offering the opportunity to try the Bing AI in Chrome. Otherwise, users can head to Bing.com from their preferred browser, then click on the “Chat” icon to try out the experience. In our own tests, however, we could access Bing Chat in Chrome, but not Safari at this time. That could be because we’re not among the “select users” who were gaining access during the tests.


Bing Chat’s ChatGPT-like experience is powered by OpenAI’s GPT-4 model, but some have reported that testing the AI chatbot in other browsers had more limitations than with the original version. For example, the blog WindowsLatest.com, which was the first to spot the expansion, noted that Bing Chat in Chrome supports only five messages per conversation, instead of the 30 available in Microsoft Edge. It was also limiting the character count to 2,000, instead of the 3,000 supported by Edge, the site said.

Microsoft declined to confirm these details or share any further information about the differences between the various versions of Bing Chat when we asked for more information. The company also wouldn’t confirm when the expansion to other browsers first began, which platforms were supported, or whether the tests would include users in global markets. That’s for us to discover in the days ahead, apparently.

In addition to adding support for Chrome and Safari, Bing Chat appears to be testing a native dark theme, too, but this is also not yet broadly available.

Bing Chat has been working its way into other Microsoft products following its launch earlier this year. In a matter of weeks, the new Bing arrived in the Bing mobile app and Edge browser for iOS, Android, and the desktop, in addition to being integrated with Skype. This month, Microsoft announced Bing Chat would also head into the enterprise with a version of Bing Chat that included business-focused data privacy and governance controls. Alongside that announcement, Microsoft also noted Visual Search, which lets the chatbot respond to questions about uploaded images, was rolling out, too.

What is Worldcoin? Eye-scanning crypto project launched by OpenAI CEO

TOKYO, JAPAN - JUNE 12: Students raise their hands for a question as OpenAI Chief Executive Officer Sam Altman looks on during an event at Keio University on June 12, 2023 in Tokyo, Japan. Altman discussed with students at the event hosted by one of Japan's leading private universities as he expressed his intentions to open an office and broaden services in the country. (Photo by Tomohiro Ohsumi/Getty Images)


Want to prove you're really human? Let Sam Altman scan your eyeballs.

Worldcoin (WLD), a cryptocurrency project founded by OpenAI CEO Sam Altman, launched today with the aim of redefining the digital identification process by offering users a World ID, which verifies that the ID's owner is a real human.

With OpenAI working towards building 'responsible' AGI, its CEO appears to recognize the need for human privacy online, especially considering how popular AI chatbots like ChatGPT have become. As artificial intelligence technology advances, with generative AI at the forefront, fake virtual identities and bots are easier to create and operate, and distinguishing humans from AI on the internet becomes increasingly difficult.


World ID works like a digital passport, verifying that the person using it is a real human while keeping their real-world identity anonymous. This would maintain privacy and protect users from malicious cyber attacks on their identity.

A look at Worldcoin's Orb and biometric scanner.

Consumers that want a World ID need to do an in-person biometric screening, which entails a face and iris scan at one of Worldcoin's 'Orbs.' These Orbs are silver spheres, about the size of a soccer ball, strategically positioned in various locations worldwide and typically accessible by appointment.


When users download the World App, they'll be prompted to verify their identity by creating their World ID at an Orb and receive the WLD token afterward — just for being human.

WLD trading took off with the news of the official launch of the crypto project, soaring as high as $3.58 from $1.70. Binance and other exchanges listed the cryptocurrency, and $145 million was traded this morning before the token’s price tanked by 29.4%.


Currently, the US has 11 Orb locations distributed among Los Angeles, New York, Miami, and San Francisco. Worldcoin is reportedly scaling up Orb sites globally, expanding to 35 cities in 20 countries, though it already has over 2 million scanned users from its beta period worldwide.

For the Worldcoin founders, this is a step towards economic equality, democratic distribution of funds, and, ultimately, Universal Basic Income (UBI). "Worldcoin is an attempt at global scale alignment, the journey will be challenging, and the outcome is uncertain," the founders explained in a letter introducing the project.

"If successful, we believe Worldcoin could drastically increase economic opportunity, scale a reliable solution for distinguishing humans from AI online while preserving privacy, enable global democratic processes, and eventually show a potential path to AI-funded UBI," the letter explained.


Unraveling the Fiction and Reality of AI’s Evolution: An Interview with Paul Muzio

July 24, 2023, by Steve Conway, Sr. Analyst, Intersect360 Research


Editors Note: In the wake of rising concerns about AI’s potential impact after the introduction of ChatGPT and other generative AI applications, HPCwire asked Steve Conway, senior analyst at Intersect360 Research, to interview Paul Muzio, former vice president for HPC at Network Computing Systems, Inc., and current chair of the HPC User Forum, an organization Conway helped to create. At a recent User Forum meeting, Muzio gave a well-received talk chronicling the history of human concerns about artificial intelligence and questioning whether intelligence is limited to humans. A link to Muzio’s presentation appears at the end of the interview.

HPCwire: Paul, people have been concerned for a long time about machines with super-human intelligence taking control of us and maybe even deciding to eliminate humanity. Your talk provided some examples from popular culture. Can you mention some of those?

Muzio: As I mention in my presentation, in my opinion the most profound prognostication of machines with super-human intelligence was presented in the 1956 movie Forbidden Planet. The movie foretells a global or “planetary” version of Google, the metaverse, machine-to-brain and brain-to-machine communication and what might go wrong. I also mention R.U.R., a play written in 1921 by Karel and Josef Capek. The Capeks are the inventors of the word “robot”. There is one line in that play that grabbed me, “from a technical point of view, the whole of childhood is a sheer absurdity. So much time lost.” This concept is also addressed in Forbidden Planet. There are many other writings in science fiction that I did not mention, such as I, Robot by Earl and Otto Binder, the movies Ex Machina, 2001: A Space Odyssey, and many others.

HPCwire: The impressive capabilities of generative AI have amplified concerns about where AI might be headed. In your opinion, how concerned should we be? You pointed out several times in your talk that unlike humans, computers retain what they’ve learned forever, without the need to educate the next generation.

Muzio: It is easy to make mistakes; it is hard to guarantee correctness. But even correctness does not preclude unintentional or adverse consequences. In The Complete Robot, Asimov discusses the situation where there is an iterative development of algorithms and that after a number of iterations, no human can understand the nth algorithm. This is illustrated to a degree by the development of AlphaGo by DeepMind. AlphaGo was played against AlphaGo and in the end developed not only superhuman capability but also evolved to an algorithmic complexity beyond what humans could have developed. Recent experiments with developmental versions of GPT-4 have also resulted in some unexpected results. In fact, OpenAI has had to “dumb down” GPT-4 prior to its general availability.

GPT, as a released product, does not in and of itself have memory, i.e., it does not have operational access to a global planetary library which contains all knowledge. But we are building, at the present, huge decentralized libraries: libraries of human history and thought, libraries of biology, libraries of evolutionary trends, libraries of the universe. Of course, even data collections down to who we communicate with, what our preferences and dislikes are, and our everyday interactions. We strive to protect, perpetuate, and share those libraries. We are, with current computing technology, acquiring and preserving exabytes and exabytes of data. And, there is more sharing of that information than we are aware of. Right now, generative AI (G-AI) tools have access to some data for training purposes. What happens in the future when and if future G-AI tools gain access to all these decentralized libraries?

By the way, there are those who say that you have to “show” AI millions of pictures for it to be able to recognize a cat, whereas a child can quickly learn to recognize a cat. I argue that argument fails to acknowledge that the child has also seen millions of pictures of diverse things including the cat. I think that when G-AI has access to all those libraries we are building, it too will quickly learn.

HPCwire: Generative AI is still in early development. It's generally still within the realm of so-called path problems, where a human provides the machine with a desired outcome and the machine obeys the human command by following a step-by-step path to pursue that outcome. At some future point machines should be able to handle insight problems, where they pursue and sometimes achieve innovations without prescribed outcomes. That has great potential benefits for humanity, but is that also a cause for concern?

Muzio: I recently watched a presentation by Sebastien Bubeck, a very brilliant researcher at Microsoft. I think he clearly shows that an experimental version of GPT-4 has gone beyond the “path problem.” Yes, he concludes that GPT-4 is not capable of planning, but has many attributes of intelligence. His is a really great presentation and analysis of where we are today. Watch it.

As I point out in my presentation, it took 5,000 years to go from the invention of the wheel to the building of an automobile. The world of computers and AI is only a few decades old. Where will we be a few decades from now? Forbidden Planet and other science fiction books/movies tend to present a bleaker future and maybe science fiction may actually foretell the future. I would add the following: it is human hubris to assume that we are the pinnacle of evolution.

HPCwire: On a practical level this whole topic might revolve around the human-machine interface, or HMI, and the possibility that at some point computers or other machines might sever that connection as something no longer needed by them or even annoying. Do you see that as a possibility?

Muzio: Certainly this is so postulated in R.U.R. and the movie Ex Machina. I would expect it to be more evolutionary. We become more dependent on intelligent systems, and we become less capable of surviving in the world. I currently live out in Montauk, New York, which was long a quiet fishing community (the nearest traffic light to my house is 17 miles away). It is now inundated in the summer by Gen-Zers. Unfortunately, no one has taught Gen-Zers that when you walk on a country street with no sidewalks that you should walk facing traffic. I have a hunch that GPT-4 would know. In my presentation, I cite two books that address biological evolution with a crossover to AI. I highly recommend them.

HPCwire: AI is already being used to help design computer chips. You mentioned in your talk that this process could get out of human hands if the process becomes self-sustaining and the chips design their even-smarter successors. Should chipmakers be taking preventive measures?

Muzio: In my presentation, I mention that the chipmakers will not like what I say, but I believe the only preventative measure is to limit the further development of advanced chips. I guess I am not alone in this as the U.S. Government is restricting the export to the PRC of the technology to build advanced chips.

HPCwire: So far, we’ve been talking about two forms of intelligence, human and machine, but in your talk you referred to scientific evidence that humans aren’t the only natural creatures with intelligence. Can you say something about that?

Muzio: If you grew up with a pet or with animals, you recognized that they could think, plan, and had feelings, i.e., they had intelligence. Two millennia ago, the ancient Romans recognized that octopodes were uniquely intelligent. Some birds are able to count. Researchers have found that plants can recognize insect threats and communicate. In my presentation, I mentioned two books, both published in 2022, An Immense World by Ed Yong and Ways of Being: Animals, Plants, Machines-The Search for Planetary Intelligence by James Bridle. Both books have extensive citations to refereed research publications. Both books give you a different perspective on intelligence.

HPCwire: With AI, as with most transformational technologies, there can be a big difference between what can be done and what should be done. In 2016, Germany became the first country to pass a national law governing automated vehicles. Ethicists and even religious leaders were part of the group that developed this legislation. Is it time to require that training in ethics be added to AI-related university curricula?

Muzio: Ethics is important. Unfortunately, most ethics courses are poorly taught and not remembered. But yes, ethics should be taught in AI-related university curricula, and I would recommend that required reading include R.U.R., Asimov’s The Complete Robot, the two books I cited above, and a screening of Forbidden Planet and maybe my presentation if teachers think it’s worthwhile enough.

HPCwire: A final question. The definitions of life I’ve seen are pretty broad. Do you think AI machines at some point may qualify as living things? Does that matter?

Muzio: The short answer to the first final question is, yes. The answer to the second final question is more difficult. In Forbidden Planet, the goal was to build an eternal machine into which the Krell could intellectually live forever. If that could be achieved, a lot of people would be very happy. If the goal was to dispense with people altogether, that would also matter. And, if in x-billion years, the universe fades into nothing, it doesn’t matter at all.

Presentation link (short 20-minute video)

This article first appeared on HPCwire.


Who Will Win the AGI Race? 

Tom Cruise’s Mission: Impossible – Dead Reckoning shows the world how AI can be the perfect villain. While The Entity, the faceless antagonist in the movie, manipulates the course of humanity, the big tech companies inching towards AGI in real life are trying hard to build a safer entity. But here’s the rider: everyone is still figuring out what they believe will lead them there.

To each their own

“OpenAI’s mission is to ensure that artificial general intelligence (AGI) — by which we mean highly autonomous systems that outperform humans at most economically valuable work — benefits all of humanity,” — OpenAI Charter 2018.

OpenAI has been clear about its goals from the beginning, outlining AGI as its mission. CEO Sam Altman says LLMs could pave the way to building an AGI. He also believes that this entity will not have a body. “We are deep into the unknown here,” said Altman on the Lex Fridman podcast. “For me, a system that cannot significantly add to the sum total of scientific knowledge we have access to, kind of discover, invent, whatever you wanna call it, new fundamental science, is not super intelligence,” he added.

He further said that there is a possibility that GPT-10 could evolve into true AGI with just a few innovative ideas. However, he believes that the true excitement lies in AI serving as a tool that participates in a human feedback loop, acting as an extension of human will and amplifying their capabilities.


The Iron Model of Transformers

OpenAI’s dedication to building an exhaustive list of transformer models trained on large datasets is probably its key to unlocking AGI; even Altman believes that LLMs could be ‘part of the way to build an AGI’. He also feels that expanding the GPT paradigm in important ways will help, but he doesn’t know what those ways are.

The transformer is the key neural network architecture behind OpenAI’s GPT models. From the first GPT model in 2018, which had 117 million parameters, to the latest GPT-4 model launched in March this year (whose parameter count is not confirmed), OpenAI has been focusing on LLMs. The company’s list of transformer models extends even to the text-to-image models DALL-E and DALL-E 2, the speech-to-text model Whisper, and the text-to-music model Jukebox.

The All and Mighty Reinforcement Learning

Google DeepMind’s CEO Demis Hassabis believes that, at the current pace of progress, AGI is just a few years away, perhaps about ten. However, he foresees uncertainties, as careful exploration is required in the field. Swearing by reinforcement learning, a method that learns through trial and error, Google DeepMind holds the crown here. With models such as AlphaFold, AlphaZero, and others, DeepMind also believes that maximisation of total reward might be sufficient to understand intelligence and its associated abilities, and that reward alone is enough to reach artificial general intelligence.
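The reward-maximisation idea behind reinforcement learning can be seen in miniature with tabular Q-learning. The sketch below is purely illustrative (a toy corridor world, not anything resembling DeepMind's actual training setups): an agent that only ever sees a scalar reward nonetheless learns, by trial and error, a policy that heads straight for the goal.

```python
import random

# Minimal tabular Q-learning on a 1-D corridor: states 0..4, goal at state 4.
# Illustrative toy only -- it shows the trial-and-error reward loop, nothing more.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]              # move left or move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0     # reward only at the goal
    return nxt, reward, nxt == GOAL

for _ in range(500):                          # episodes of trial and error
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the current estimate, sometimes explore
        a = random.randrange(2) if random.random() < EPS else Q[s].index(max(Q[s]))
        s2, r, done = step(s, a)
        # Q-learning update: nudge the estimate toward reward + discounted future value
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy action in every non-terminal state is "right" (index 1).
policy = [q.index(max(q)) for q in Q[:-1]]
print(policy)
```

The only signal the agent ever receives is the reward at the goal, yet the learned greedy policy moves right from every state, which is the gist of the "reward is enough" argument.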

While DeepMind has had its share of AGI conversations, Sundar Pichai believes the race is not the priority. “While some have tried to reduce this moment to just a competitive AI race, we see it as so much more than that.” He also mentioned that the emphasis is on the race to build AI responsibly and to make sure ‘as a society we get it right’.

The Black Widow of Self-Supervised Learning

“I don’t think I have any particular insight on when a singular AI system that is a general intelligence will get created,” said Mark Zuckerberg when asked about AGI timelines on Lex Fridman’s podcast.

Meta’s Yann LeCun said that supervised learning and reinforcement learning will not lead to AGI as these approaches are inadequate in developing systems capable of reasoning with commonsense knowledge about the world. He believes that self-supervised learning is the way towards AGI.

This method does not rely on data labelled by humans; instead, it trains on entirely unlabelled data, deriving the supervision signal from the data itself. There have been promising results with self-supervised language understanding models, libraries, and frameworks that have surpassed traditional, fully supervised models. Since 2013, the company has expanded its research efforts in self-supervised learning.
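The core trick of self-supervised learning is that the training labels are manufactured from the raw data itself, with no human annotation. A minimal sketch of how a masked-language-model training pair is created (illustrative only; real systems like BERT-style models do this over huge corpora with subword tokenizers):

```python
import random

# Self-supervision in miniature: hide part of the unlabelled input and use
# the hidden part as the prediction target. No human labelling involved.
random.seed(1)

def make_masked_example(sentence, mask_token="[MASK]"):
    tokens = sentence.split()
    i = random.randrange(len(tokens))
    target = tokens[i]              # the "label" comes from the data itself
    tokens[i] = mask_token
    return " ".join(tokens), target

inp, label = make_masked_example("the cat sat on the mat")
print(inp, "->", label)
```

A model trained on millions of such pairs learns to predict the masked word from context, which is the supervision signal LeCun is referring to.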

The Way of Ultron

With the launch of xAI, Musk seeks to build ‘good AGI’ with the purpose of ‘understanding the universe’. Musk explains that an AI that cares about understanding the universe is ‘unlikely to annihilate humans because we are an interesting part of the universe’. He also predicts that AGI will be achieved by 2029.

While others are moving towards AGI in a bodyless form, Musk’s investment in everything robotics is probably a reflection of how a physical form may be the answer. A working prototype of the Optimus robot, powered by the same self-driving computer that runs Tesla cars, was unveiled at Tesla AI Day last year. Musk believed that these advancements would one day contribute towards AGI.

Shape Shifters of Multimodality

Google and, to a lesser extent, OpenAI have incorporated multimodal functions in their models. Google’s PaLM-E and MedPaLM-2 have multimodal capabilities. OpenAI’s transformer-based architecture CLIP, released in January 2021, processes textual descriptions associated with images and performs zero-shot image classification. GPT-4 supports image uploads, and the ChatGPT app supports voice commands through Whisper integration.
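Mechanically, CLIP-style zero-shot classification reduces to embedding the image and a set of candidate text labels in a shared space, then picking the label whose embedding is most similar to the image's. The tiny hand-made vectors below stand in for real CLIP encoders (which produce 512-dimensional or larger embeddings); only the selection step is faithful.

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two vectors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-D "embeddings" standing in for real CLIP encoder outputs.
image_embedding = np.array([0.9, 0.1, 0.2])
label_embeddings = {
    "a photo of a cat": np.array([0.8, 0.2, 0.1]),
    "a photo of a dog": np.array([0.1, 0.9, 0.3]),
    "a photo of a car": np.array([0.2, 0.1, 0.9]),
}

# Zero-shot classification: the prediction is the label whose text
# embedding is closest to the image embedding. No task-specific training.
scores = {label: cosine(image_embedding, v) for label, v in label_embeddings.items()}
best = max(scores, key=scores.get)
print(best)
```

Because the candidate labels are just text, new classes can be added at inference time simply by embedding new phrases, which is what makes the approach "zero-shot".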

There are also a number of other approaches under research that may help companies achieve AGI; causality is one of them. Believed to bring a huge transformation, causality refers to the relationship between cause (what we do) and effect (what happens as a result), where machines try to learn the way we do.
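The gap between pattern matching and causal learning can be sketched with a toy structural causal model, where a hidden common cause makes two variables correlate even though neither causes the other. This is an illustrative simulation only, not a description of any company's research.

```python
import numpy as np

# Toy structural causal model: Z -> X and Z -> Y, but X does NOT cause Y.
# Observationally, X and Y correlate through the confounder Z; intervening
# on X (setting it ourselves, breaking the Z -> X link) removes the association.
rng = np.random.default_rng(0)
n = 100_000

z = rng.normal(size=n)                   # hidden common cause
x_obs = z + 0.1 * rng.normal(size=n)     # X driven by Z
y = z + 0.1 * rng.normal(size=n)         # Y driven by Z, not by X

obs_corr = np.corrcoef(x_obs, y)[0, 1]   # strong correlation (confounded)

x_do = rng.normal(size=n)                # do(X): X set independently of Z
do_corr = np.corrcoef(x_do, y)[0, 1]     # near zero: no causal link

print(round(obs_corr, 2), round(do_corr, 2))
```

A purely correlational learner would conclude X predicts Y; a causal learner, reasoning about interventions, would not, which is the kind of distinction causality research aims to give machines.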

A Soothsayer to Navigate The Labyrinth

Altman’s tweet on testing the latest custom instructions feature was rather specific. Resorting to ChatGPT to unroll a path towards superintelligence may be playful, but looking at tech leaders’ interpretations of AGI and superintelligence, their ambiguity on the matter is crystal clear. Their approaches to getting there, whether intentional or not, differ and are aligned with what each deems fits its long-term company goals.

It is equally obscure who might finish first in the AGI race. Each company discussed here has employed different models, and their reliance on each varies. However, given OpenAI’s heavy dependence on transformer language models, and the fact that AGI has been OpenAI’s primary goal from the start, the company might be at the forefront of the so-called AGI race.

The post Who Will Win the AGI Race? appeared first on Analytics India Magazine.

Singapore looks for generative AI use cases with sandbox options


Two sandboxes now are available for government agencies and businesses in Singapore to develop and test generative artificial intelligence (AI) applications.

Both sandboxes will run on Google Cloud's generative AI toolsets, including its Vertex AI platform, low-code developer tools, and graphics processing units (GPUs). Pre-trained generative AI models also will be made available, encompassing Google's own large language model PaLM, its partners' AI models, as well as open source options.


The initiative is part of a partnership agreement inked earlier this year between the US cloud vendor and the Singapore government to establish an AI Government Cloud Cluster. The cloud platform runs within a dedicated environment on Google Cloud and aims to drive AI adoption in the public sector.

One of the sandboxes will be used exclusively by government agencies, while the other is available to local organizations. Collectively, the two controlled cloud-based environments will be provided at no cost for three months, for up to 100 use cases or organizations.

These will be selected through a series of workshops to be held over 100 days, during which participants will receive training by Google Cloud engineers to identify use cases that can be supported with generative AI and built using toolsets from the sandbox.

The Smart Nation and Digital Government Office (SNDGO) will administer the government sandbox, while Digital Industry Singapore (DISG) will manage the sandbox set aside for local businesses.


Singapore, which introduced its national AI strategy in 2019, has more than 4,000 researchers publishing on AI today, according to Josephine Teo, minister for communications and information. While the research output was "more than respectable", she noted that Singapore still had "some distance to go" in translating this research into meaningful use cases across different verticals.

"Such efforts are important because they push us to iron out kinks," Teo said during the launch of the sandboxes. "These kinks could be data issues, but they could also be issues to do with security. Responsible implementation of AI needs us to ensure you do not leave these questions unanswered, but to develop satisfactory answers. This will allow us to unlock the fullest potential of AI for Singapore."

Google Cloud's Asia-Pacific vice president Karan Bajwa noted that rolling out generative AI within an organization requires a different approach from general-purpose, public-facing generative AI applications. The former should be deployed with governance and robustness, requiring enterprise-grade generative AI applications to deliver responses based on curated and company-approved data sources.


Data security also is critical for generative AI models used to train enterprise applications, to ensure sensitive information will not be leaked. AI models, too, will need to be calibrated and finetuned for the industry within which an organization operates, such as healthcare or financial services, Bajwa said.

To optimize their cost, companies also will want to choose the right model for the business challenge they are looking to resolve, he said, adding that the larger the model, the more costly its usage.

The sandboxes provide a "clear pathway" for companies to "quickly, easily, and responsibly" build their own generative AI applications with the necessary data security and governance, he said.

Organizations already signed up to participate in the sandbox initiatives include the Ministry of Manpower, Government Technology Agency (GovTech), American Express, PropertyGuru Group, and Tokopedia.


The public sector's CIO office, GovTech, taps AI and natural language processing engines to power its virtual intelligence chat assistant (Vica) platform. It is looking to leverage generative AI and plans to move all 88 chatbots, currently used within the sector, to generative AI models by year-end.

GovTech already is employing generative AI to operate seven chatbots that are used for two intranets, the Housing Development Board, and Singapore Polytechnic, among others.

According to GovTech, generative AI has cut the number of hours needed to train the model by 10-fold and generated more natural responses. To reduce the risk of hallucinations, responses are generated from data drawn directly from the relevant government agency. Queries about the weather, for instance, are answered using data extracted from the National Environment Agency's API.
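The grounding pattern described above (answering only from data retrieved from the relevant agency) can be sketched as follows. All function and field names here are hypothetical, invented for illustration; this is not GovTech's actual implementation.

```python
# Sketch of retrieval-grounded prompting: fetch authoritative data first,
# then constrain the model's answer to that data to reduce hallucination.

def fetch_weather_forecast():
    # Stand-in for a call to the weather agency's API; a real system would
    # make an HTTP request here and parse the response.
    return {"area": "Singapore", "forecast": "Thundery showers", "high_c": 32}

def build_grounded_prompt(question, data):
    # The retrieved record is injected into the prompt, and the model is
    # instructed to answer only from it.
    return (
        "Answer using ONLY the data below. If the data does not cover the "
        "question, say you don't know.\n"
        f"Data: {data}\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt("What's the weather today?", fetch_weather_forecast())
print(prompt)
```

The prompt, not the model's parametric memory, then carries the facts the chatbot is allowed to state, which is why answers drawn this way track the agency's data.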

Generative AI models then are used mainly to train the chatbots to better understand questions. Chatbots trained by generative AI have demonstrated an ability to answer 85% to 90% of questions, compared to about 75% for non-generative AI chatbots.


Organizations should make the effort to train AI models to address risks such as hallucinations, said Jimmy Ng, DBS Bank's CIO and head of group technology and operations.

Large language models learn from any publicly available data, which may not necessarily be quality data. While taking a plug-and-play approach may work for some, businesses need AI applications that have the necessary guardrails and security, Ng said during a panel discussion at the launch.

Organizations then should sharpen large language models with their own knowledge database to improve the quality of the training data, he added.
