A Striking Relevance of Data Sketching in LLMs

At the Data Engineering Summit (DES) 2023, presenting a talk titled ‘Unleashing the Power of Probabilistic Data Structures: Optimizing Storage and Performance for Big Data’, Sudarshan Pakrashi, Director of Data Engineering at Zeotap, spoke about statistical algorithms designed to optimise memory use when storing and querying large datasets. During the Q&A, he was asked whether data sketching can be used in current generative AI models such as LLMs.

To this, Pakrashi responded that it is indeed possible and is in fact a “great analogy”. He explained that every language model must maintain word associations across a huge dataset of words. “Imagine the permutations and combinations that you want to have and sketches are in fact used to maintain those because then your model is actually going to query the frequencies for those combinations,” he explained.

Data sketching is a method of summarising large datasets using compact data structures that can provide approximate answers to queries about the data. In the context of LLMs, data sketching can be used to summarise the text corpus used to train the model, which can help reduce the memory requirements of the model and improve its training efficiency.

Why Do We Need Data Sketching in LLMs

“Do you ever feel overwhelmed by a constant stream of information?” reads the first line of Graham Cormode’s 2017 paper, ‘What is Data Sketching, and Why Should I Care’. Filtering that stream down to only what you need is essentially what data sketching does, and it can greatly benefit the training of LLMs.

Data sketching can help improve the efficiency and scalability of generative models in the following ways:

  • Data compression: By summarising large datasets with sketching techniques, LLMs can be trained on representations that are much smaller than the original data, reducing the memory requirements and computational resources needed for training and deployment. This is especially helpful when dealing with limited resources or large-scale datasets.
  • Faster training: Sketching speeds up training by reducing the amount of data the model needs to process, which can lead to faster convergence and shorter training times without significantly compromising the quality of the generated samples.
  • Real-time data: Sketching can enable LLMs to process and learn from data streams efficiently, updating their internal representations on the fly and generating new samples based on the most recent data.
  • Anomaly detection: Sketching and sampling techniques can identify outliers or anomalies in the training data. By identifying and potentially removing anomalous data points, LLMs can focus on learning the underlying structure and patterns in the data, leading to better generated samples.
  • Data exploration: Sketching can provide insight into the structure and characteristics of large datasets, which can guide the design and configuration of LLMs, such as selecting appropriate architectures, hyperparameters or loss functions.

Data Sketching Techniques in LLMs

Although data sketching has been used in natural language processing (NLP) tasks before, the recent rise of compute-hungry LLMs like GPT-3 has made it newly relevant: sketching can, and in fact does, make the training and deployment of AI models more efficient.

One commonly used data sketching technique in LLMs is the Bloom filter, which is a probabilistic data structure that can efficiently test whether an item is in a set. Bloom filters can be used to represent the vocabulary of the text corpus used to train the model, allowing the model to store the vocabulary in a much smaller memory footprint.
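The membership-testing idea is easy to see in a toy implementation. The sketch below is illustrative only (the sizing, hashing scheme and class names are ours, not how any particular LLM stores its vocabulary):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash probes over a bit array of size m."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item):
        # Derive k independent positions by salting one cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False positives are possible; false negatives are not.
        return all(self.bits[pos] for pos in self._positions(item))

vocab = BloomFilter()
for word in ["model", "token", "sketch"]:
    vocab.add(word)

print(vocab.might_contain("token"))   # True
print(vocab.might_contain("zebra"))   # almost certainly False
```

Note the trade-off: the filter uses a fixed 1,024 bits regardless of vocabulary size, at the cost of a small, tunable false-positive rate.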

Another technique used in LLMs for data sketching is Count-min sketches. This is another type of probabilistic data structure that can efficiently estimate the frequency of items in a set. Count-min sketches can be used to estimate the frequency of words in the text corpus, which can be used to optimise the training of the model.
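A toy Count-min sketch shows the frequency-estimation mechanism: each item increments one counter per row, and the estimate is the minimum across rows, so counts can be over-estimated (by hash collisions) but never under-estimated. Parameters and hashing here are illustrative, not tied to any production system:

```python
import hashlib

class CountMinSketch:
    """d rows of w counters; estimate = min of an item's counters across rows."""
    def __init__(self, w=256, d=4):
        self.w, self.d = w, d
        self.table = [[0] * w for _ in range(d)]

    def _index(self, row, item):
        h = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
        return int(h, 16) % self.w

    def add(self, item, count=1):
        for row in range(self.d):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.d))

cms = CountMinSketch()
for word in ["the", "the", "the", "cat"]:
    cms.add(word)
print(cms.estimate("the"))  # 3, barring collisions in every row
```

The sketch uses d × w counters total, independent of how many distinct words flow through it, which is exactly the memory bound that makes it attractive for large corpora.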

Similar techniques include HyperLogLog, a probabilistic algorithm used to estimate the number of distinct elements in a large dataset.

Quantile sketches are another technique, providing approximate answers to queries about percentiles, medians or other order statistics of a dataset. A related idea is sampling, which means selecting a subset of data to represent the entire dataset.
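For sampling from a stream whose length is not known in advance, the classic single-pass method is reservoir sampling (Algorithm R). The sketch below is a generic illustration, not part of any specific LLM pipeline:

```python
import random

def reservoir_sample(stream, k):
    """Return a uniform random sample of k items from a stream of
    unknown length, using one pass and O(k) memory (Algorithm R)."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)          # fill the reservoir first
        else:
            j = random.randint(0, i)        # item i kept with prob k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

random.seed(0)
sample = reservoir_sample(range(1_000_000), 5)
print(sample)  # 5 items drawn uniformly from the million-element stream
```

Every element of the stream ends up in the sample with equal probability k/n, regardless of n, which is what makes the method suitable for unbounded data streams.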

The post A Striking Relevance of Data Sketching in LLMs appeared first on Analytics India Magazine.

Copilot Who? Open-Source Autocoders Take Over

Since the open-source LLM boom began with the LLaMA leak, many other parties have wanted to follow the trend. Last week alone, two coding LLMs were released, suggesting the beginning of a new trend. Now that LLMs with capabilities close to GitHub Copilot are in the hands of the open-source community, developers have mixed feelings.

Even as the release of LLaMA spurred the creation of a bevy of open-source LLMs, it seems that these new coding LLMs will do the same for auto-coders. Both BigCode’s StarCoder and Replit’s Code V1 offer an open-source alternative to Copilot’s proprietary LLM based on GPT-4, opening them up to tinkering and product integration.

Is open source the future of AI?

Recently, a researcher from Google stated in a leaked document that neither OpenAI nor Google has a moat when it comes to LLMs, giving the win to the open-source community. He stated:

“We should not expect to be able to catch up [to open source]. The modern internet runs on open source for a reason. Open source has some significant advantages that we cannot replicate.”

This is largely true, especially when considering the impact that LLaMA had on the open-source ecosystem. The list of innovations brought to the model is long, but the key takeaway is that all of these improvements were made by volunteer AI enthusiasts wishing to build a better product.

Even though Meta stood to gain the least from the LLaMA leak, the improvements made by the community effectively netted them a planet’s worth of free labor. This is, by far, the biggest value proposition of open source, as it allows volunteers to contribute to even the biggest projects.

Until now, the biggest barrier to entry for the open-source community when it comes to LLMs was the training cost, which routinely ran into the millions for especially big models. However, developers were able to cut the training cost for LLaMA down to $300 and even optimize it to the point where it could run on a Raspberry Pi.

Regarding auto-coders, programmers and developers were in the pre-LLaMA state of affairs before the launch of last week’s LLMs. They either had to use a proprietary, closed-source solution, like OpenAI’s GPT-4, GitHub Copilot, or Tabnine, or fine-tune an existing open-source LLM at a large personal cost.

Both of these approaches were not conducive to harnessing the potential of open source, hampered as they were by restrictive licenses and usage rights. However, with the launch of StarCoder and CodeV1, there is an opportunity for auto-coding-powered projects to be released onto the market.

Innovation waiting to happen

Even though both these models were released under open-source licenses (CC BY-SA for Code V1 and OpenRAIL-M for StarCoder), they seem to serve different purposes in the open-source world. Of the two, StarCoder is arguably built from the ground up for the open-source community, as both the model and a 6.4TB dataset of source code were open-sourced at the same time.

The model was also found to be better in terms of quality than Replit’s Code V1, which seems to have focused on being cheap to train and run. On the HumanEval benchmark, StarCoder reached a score of 40.8%, while Code V1 managed only 30.5%. Moreover, StarCoder can do more than just predict code: it can also help programmers review code and solve issues using code metadata.

One drawback of StarCoder is its hardware requirements, which demand at least 32GB of GPU memory in 16-bit mode. However, if LLaMA is any indication, the open-source community only needs a few more weeks to optimize this model to run even on phones and laptops.

Replit’s CodeV1, on the other hand, seems to be a valuable addition to their pre-existing software ecosystem. CodeV1 appears to be another step in their strategy to democratize access to an AI-powered software stack. This way, instead of locking in developers to a single ecosystem like GitHub Copilot is trying to do, it can target a much larger and diverse market of developers who want freedom in the way they use AI.

User runnerup on the Hacker News forum had this to say about CodeV1: “Things like GitHub Copilot don’t allow me to use their AI against my codebase the way that I want to. So if Replit can find more innovative, boundary-pushing ways of integrating LLMs, they won’t necessarily need the highest quality LLMs to produce a superior user experience.”

Most importantly, it does not matter that these models are either difficult to run or inaccurate, as the open-source community can easily remedy these issues. The volunteers working on LLaMA and its associated projects have shown that the community can make models both easy to run and more accurate through efforts such as crowd-sourced RLHF. However, one thing is certain: the innovation is just beginning for these open-source models.

The post Copilot Who? Open-Source Autocoders Take Over appeared first on Analytics India Magazine.

You Can Do Better than Having Just a Vector Database

In the recent past, the database market has seen a proliferation of speciality vector databases. Companies that buy these database management systems and plug them into their data architectures may be initially hopeful about their ability to query for vector similarity. But the short-lived excitement eventually turns into regret about bringing yet another component into their application environment.

Vectors and vector search are just a data type and a query-processing approach, not a foundation for a new way of processing data. Using a speciality vector database (SVDB) leads to the usual problems we see, and keep solving, whenever customers run multiple speciality systems: redundant data, excessive data movement, lack of agreement on data values among distributed components, extra labour expense for specialized skills, extra licensing costs, limited query-language power, programmability and extensibility, limited tool integration, and poor data integrity and availability compared with a true database management system (DBMS).

Instead of using an SVDB, we believe that application developers using vector similarity search will be better served by building their applications on a general, modern data platform that meets all their database requirements, not just one.

Case Study

The Story of a Generative AI Startup: Getting Accurate Results from a GPT-powered Chatbot

This story is inspired by real events.

Once upon a time, there was a startup building finely tuned bots to aid developers with highly technical content. It required a system that could perform the following tasks:

  • Rapidly process and convert semi-structured data into vectors
  • Employ similarity matching to find locally indexed documents that match user inquiries
  • Enhance the matching results with additional context and re-sort them
  • Transmit the context to GPT-4, receive the generated response and present it to the user

The basic operational flows for this application are listed in the diagram below. However, the key factor that distinguishes vector databases is not merely the similarity matching capability, but also the ability to enrich the matching results with supplementary information while ultimately re-sorting the outcomes and obtaining the most accurate answer from GPT.

Challenges with a Specialty Vector Database

Initially, the startup was using an SVDB but soon realized its limitations. The SVDB could only return similar results for a specific text or question and supported only a small number of tags per embedding, whereas the startup’s approach required iterating at scale and re-ranking frequently. For instance, being able to rank based on a user’s specific context (like asking a question about a particular version of the software) was a crucial feature for providing personalized support to developers.

As their data architecture became more complex, they had to supplement the SVDB with an Elasticsearch database. User feedback and events were stored in PostgreSQL and fed into Elasticsearch to refine the ranking. Essentially, the SVDB became an (expensive) feature of a database.

The issue with relying solely on the SVDB’s vector similarity capability was that it could only return a record ID and a score, which was insufficient for delivering accurate results. The records had to be combined with Elasticsearch to provide a more precise version with context, which could then be sent to GPT.

About SingleStoreDB

SingleStoreDB is a high-performance, scalable, modern SQL DBMS and cloud service that supports multiple data models including structured data, semi-structured data based on JSON, time-series, full text, spatial, key-value and vector data. Our vector database subsystem, first made available in 2017 and subsequently enhanced, allows the extremely fast nearest-neighbour search to find objects that are semantically similar, easily using SQL. Moreover, the so-called “metadata filtering” function (which is billed as a virtue by SVDB providers) available in SingleStoreDB is far more powerful and general than its alternatives — simply by using SQL filters, joins and other capabilities.

The beauty of SingleStoreDB for vector database management is that it excels at vector-based operations and it is truly a modern database management system. It has all the benefits one expects from a DBMS including ANSI SQL, ACID transactions, high availability, disaster recovery, point-in-time recovery, programmability, extensibility and more. Plus, it is fast and scalable, supporting both high-performance transaction processing and analytics in a single distributed system.

SingleStoreDB Support for Vectors

SingleStoreDB supports vectors and vector similarity search using dot product (for cosine similarity) and euclidean_distance functions. These functions are used by our customers for applications including face recognition, visual product photo search and text-based semantic search [Aur23]. With the explosion of generative AI technology, these capabilities form a firm foundation for text-based AI chatbots — like our very own SQrL.

The SingleStore vector database engine uses Intel SIMD instructions to implement vector similarity matching extremely efficiently.
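The relationship between the dot product and cosine similarity mentioned above can be illustrated in plain Python. This is a sketch of the underlying math only, with function names of our choosing; it is not SingleStoreDB’s SQL interface:

```python
import math

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine similarity is the dot product of the vectors divided by the
    # product of their lengths; for unit-normalized vectors it reduces to
    # the plain dot product, which is why a fast dot-product function
    # suffices for cosine ranking.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

query = [0.1, 0.9, 0.2]
docs = {"doc1": [0.1, 0.8, 0.3], "doc2": [0.9, 0.1, 0.1]}
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]),
                reverse=True)
print(ranked)  # ['doc1', 'doc2']
```

A SIMD-backed engine performs the same arithmetic, but over many vector components per instruction rather than one at a time.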

Why SingleStoreDB Is the Ultimate Vector Database Solution

To deliver more accurate results at a lower cost per question answered, the startup needed a streamlined architecture: semantic search with similarity matching, plus the analytics required for re-ranking and refinement.

SingleStoreDB offered the optimal solution as it provided superior performance for processing and analyzing semi-structured data such as JSON. SingleStoreDB can also index text, store and match vectors, re-rank and refine matching results based on additional context.

By the way, SingleStoreDB can already do exact nearest-neighbour search incredibly fast via efficient, indexed metadata filtering, distributed parallel scans and Single Instruction, Multiple Data (SIMD) execution. You can also, with a little extra work, do approximate nearest-neighbour (ANN) search that does not require scanning all vectors: create clusters, then examine only the vectors in clusters near a query vector. Also, most partner integrations can easily be built by the customer on top of SingleStoreDB, because they are client application-side integrations that use partner services and then simply interact with the DBMS via SQL.
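The cluster-based ANN approach described above can be sketched generically. This is a toy illustration of the technique (sometimes called an inverted-file or IVF approach), with all names ours; it is not SingleStore’s implementation:

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_clusters(vectors, centroids):
    """Group each vector under its nearest centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for v in vectors:
        nearest = min(range(len(centroids)),
                      key=lambda i: dist(v, centroids[i]))
        clusters[nearest].append(v)
    return clusters

def ann_search(query, centroids, clusters, n_probe=1):
    """Scan only the n_probe clusters whose centroids lie closest to the
    query, instead of every vector in the collection."""
    order = sorted(range(len(centroids)),
                   key=lambda i: dist(query, centroids[i]))
    candidates = [v for i in order[:n_probe] for v in clusters[i]]
    return min(candidates, key=lambda v: dist(query, v))

centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(0.1, 0.2), (0.3, 0.1), (9.8, 10.1), (10.1, 9.9)]
clusters = assign_clusters(vectors, centroids)
best = ann_search((10.0, 10.0), centroids, clusters)
print(best)  # (10.1, 9.9)
```

The search is approximate because the true nearest neighbour could sit in a cluster that was not probed; raising n_probe trades speed for recall.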

What can SingleStoreDB do to enable your vector database applications? Try it for free in the cloud or self-hosted today, and find out.

AUTHOR DETAILS:

Eric Hanson: Eric Hanson is a Director of Product Management at SingleStore, responsible for query processing, storage and extensibility feature areas. He joined the SingleStore product management team in 2016.

Arnaud Comet: Arnaud Comet is a Director of Product Management at SingleStore. He joined the SingleStore product management in 2022, and has 10+ years of experience driving cloud services growth.

The post You Can Do Better than Having Just a Vector Database appeared first on Analytics India Magazine.

Transform Your Video Creation Process with These 11 Innovative AI Tools

According to application and network intelligence company Sandvine, video usage grew 24% in 2022, accounting for 65% of all internet traffic, with Netflix generating the most traffic, followed by TikTok, Disney+ and Hulu. Other reports suggest that by 2023, video content will constitute 80% of all consumer internet traffic, benefiting from advancements such as 5G networks, faster streaming services and increased digital marketing budgets.

Generation of video content is becoming more important as it engages internet audiences more effectively than other media forms. The latest text-to-video tools have made video creation much easier and more accessible, allowing anyone to produce polished videos with ease.

These tools help save time and improve the quality of video content, while providing insights into how audiences engage with videos. Most are user-friendly and do not require special skills or training, although some may call for basic knowledge of video editing or design principles. Some even offer free or trial versions, with a few features requiring a paid subscription. These tools are suitable for anyone interested in creating high-quality video content, from professional marketers to hobbyists.

Here’s a list of Generative AI tools for video:

Lumen5

Lumen5 is a video creation platform that uses AI to turn written content into engaging videos. The platform is designed for businesses and brands to create video content easily for their social media posts, stories, and ads. Users simply enter the text they want to use, and Lumen5’s AI technology analyzes the content to create a video that matches the tone and style of the text. Users can also customize their videos by uploading their own images and video clips.

Lumen5 aims to make producing high-quality videos simple and fast for anyone, without training or experience. The platform’s built-in media library provides access to millions of stock footage clips, photos, and music tracks, eliminating the need to purchase or record digital assets externally. With Lumen5, marketing teams can focus on the narrative and story, while the system handles the heavy lifting of creating videos. Overall, Lumen5 offers a convenient and accessible way for businesses and individuals to create engaging video content for their social media channels.

Magisto

Magisto is an online video editing platform that uses AI-powered technology to automatically create professional-quality videos from user-uploaded photos and videos. The platform is designed to make video editing easy and accessible for users without any prior video editing experience.

Magisto’s AI-powered video editor analyses the user’s uploaded media and automatically creates a custom video with matching music, transitions, and effects based on the user’s selected style and preferences. Users can further customise their videos by adding text, captions, and additional media, and adjusting the timing and order of their content.

Magisto is commonly used by businesses and individuals for a variety of purposes, such as creating promotional videos, social media content, event recaps, and more. The platform offers both a free and paid version, with additional features and options available to paid users.

Wibbitz

Wibbitz is a free to use cloud-based video creation platform that allows users to quickly and easily create short-form video content from text-based articles or other written content. It uses artificial intelligence to automatically generate video content, which can be customised by the user with various editing tools.

The platform is popular among media companies and publishers who need to quickly produce video content to share on social media or their websites. Wibbitz offers a range of templates and styles to choose from, allowing users to create videos in various formats, including square and vertical videos optimised for mobile viewing.

Wibbitz was founded in 2011 and is headquartered in New York City. The platform has been used by companies such as Reuters, Forbes, and The Huffington Post to create video content quickly and efficiently.

Animoto

Animoto is a cloud-based video creation service that allows users to create and share professional-quality videos using their own photos, video clips, and music. The platform uses an AI-powered video maker that can analyse the user’s media and automatically create a custom video slideshow that matches the selected style and music. Animoto’s video editor also allows users to customise their videos by adding text, captions, and effects, as well as adjusting the timing and order of the media.

Animoto is commonly used for a variety of purposes, such as creating promotional videos for businesses, showcasing products or services, creating educational videos, sharing personal stories or events, and more. Animoto offers both a free and paid version of their platform, with additional features and options available to paid users.

MotionDen

MotionDen is an online video maker platform that allows users to create professional-quality videos without the need for extensive technical knowledge or expertise. The platform offers a range of customisable templates for different types of videos such as promotional videos, explainer videos, and social media videos. Users can add their own text, images, and music to the templates, and customise them according to their needs.

MotionDen also provides users with a range of advanced features such as animations, special effects, and transitions to enhance the visual appeal of their videos. The platform also offers a user-friendly interface that makes it easy for beginners to create high-quality videos in a matter of minutes.

MotionDen is primarily used by businesses, marketers, and content creators who want to create engaging videos for their audiences. The platform offers a range of pricing plans, including a free plan that allows users to create videos with a watermark. Paid plans offer additional features and options, such as high-quality video downloads and more customisation options.

InVideo

InVideo is a video editing and creation platform that allows users to create professional-quality videos quickly and easily. It offers a wide range of customisable templates, tools, and features, making it easy for anyone to create engaging video content.

With InVideo, users can choose from a library of pre-made templates, or start from scratch with their own design. They can add text, images, music, and other media to their videos, and edit them using a range of tools, including transitions, filters, and effects.

InVideo also offers features such as voiceover recording, collaboration tools, and social media integrations, making it easy for users to share their videos across different platforms.

In addition to its video editing features, InVideo also provides access to a large library of stock footage, music, and images, which users can use to enhance their videos.

Overall, InVideo is a versatile video creation tool that is suitable for anyone looking to create professional-quality videos quickly and easily, whether for personal or business purposes.

Clipchamp

Clipchamp is a web-based video editing and creation platform that allows users to create, edit, and share videos online. It offers a wide range of features, including video trimming, cropping, merging, adding text and audio, and applying effects and filters.

One of the standout features of Clipchamp is its user-friendly interface, which makes it easy for users of all skill levels to create professional-looking videos without any prior editing experience. Clipchamp also offers a variety of templates and pre-built video assets, which can be used to create videos quickly and easily.

In addition to its editing features, Clipchamp also includes a range of sharing and publishing options. Users can easily share their videos on social media platforms like Facebook and Instagram, or export their videos to popular video hosting sites like YouTube and Vimeo.

Overall, Clipchamp is a powerful and versatile video editing platform that is well-suited to a wide range of users, from amateur video creators to professional marketers and content producers.

Steve AI

Steve.AI is an AI-based tool that allows for easy creation of videos and animations, intended to reduce the time required to create video campaigns. Users input a script and Steve.AI automatically generates a high-quality video. The tool enables customisation of visuals, background music and other settings to ensure the desired outcome. The goal of the product is to make the video creation process faster and easier for video makers, marketers, and salespeople.

Steve AI is capable of making videos specifically curated for social media sites like YouTube, Twitter, LinkedIn and Facebook. It can also produce explainers, product videos, animated videos, and educational, marketing and corporate videos, among others.

Synthesia

Synthesia is an efficient video creation tool that leverages the power of AI to generate visually stunning content using a variety of inputs such as text, audio files, photos, and videos. Synthesia lets you input your written content in more than 120 different languages and generate a polished video in a mere 15-minute timeframe. Additionally, Synthesia offers a vast library of high-quality media assets that can be used to create even more impressive videos in record time.

The company was established in 2017 by a team of AI experts and entrepreneurs from renowned institutions such as UCL, Stanford, TUM, and Cambridge. Its co-founders include Victor Riparbelli (CEO), Prof. Matthias Niessner, and Prof. Lourdes Agapito.

Elai

Elai offers an all-in-one solution for video creation that utilises AI technology. Their automated workflow system streamlines the entire process, saving time on tasks such as shot selection and audio level adjustments. With Elai, professional-looking videos can be created quickly and efficiently, without the need for actors, crews, or expensive equipment. Their AI video generator can produce customised videos featuring a digital presenter from just text, eliminating the need for a camera, studio, or green screen. Videos can be created in a matter of minutes with Elai.

While Alex Uspenskyi is the founder, Vitalii Romanchenko is the CEO and co-founder of Elai.io.

Pictory

Pictory is a tool that caters to the needs of photographers and videographers, enabling them to create stunning montages from their photo libraries without prior video production experience. With its user-friendly drag-and-drop interface and advanced AI capabilities, users can easily bring their ideas to life in a short amount of time. This platform also automatically generates branded videos from lengthy content, making it quick, simple, and affordable. Additionally, users do not need any technical skills or software downloads to use Pictory, and a free trial is available.

The post Transform Your Video Creation Process with These 11 Innovative AI Tools appeared first on Analytics India Magazine.

Generative AI defined: How it works, benefits and dangers

A circuit board lit up with pink light representing generative AI.
Image: Smart Future/Adobe Stock

The likes of ChatGPT and DALL-E, both from OpenAI, are rapidly gaining traction in the world of business and content creation. But what is generative AI, how does it work and what is all the buzz about? Read on to find out.

Jump to:

  • What is generative AI?
  • How does generative AI work?
  • Examples of generative AI
  • Types of generative AI models
  • Benefits of generative AI
  • Use cases of generative AI
  • Dangers and limitations of generative AI
  • Generative AI vs. general AI
  • Is generative AI the future?

What is generative AI?

In simple terms, generative AI is a subfield of artificial intelligence in which computer algorithms are used to generate outputs that resemble human-created content, be it text, images, graphics, music, computer code or otherwise.

In generative AI, algorithms are designed to learn from training data that includes examples of the desired output. By analyzing the patterns and structures within the training data, generative AI models can produce new content that shares characteristics with the original input data. In doing so, generative AI has the capacity to generate content that appears authentic and human-like.

How does generative AI work?

Generative AI is based on machine learning processes inspired by the inner workings of the human brain, known as neural networks. Training the model involves feeding algorithms large amounts of data, which serves as the foundation for the AI model to learn from. This can consist of text, code, graphics or any other type of content relevant to the task at hand.

Once the training data has been collected, the AI model analyzes the patterns and relationships within the data to understand the underlying rules governing the content. The AI model continuously fine-tunes its parameters as it learns, improving its ability to simulate human-generated content. The more content the AI model generates, the more sophisticated and convincing its outputs become.

SEE: Gartner: ChatGPT interest boosts generative AI investments (TechRepublic)

Examples of generative AI

Generative AI has made significant advancements in recent years, with a number of tools capturing public attention and creating a stir amongst content creators in particular. Big tech companies have also jumped on the bandwagon, with Google, Microsoft, Amazon and others all lining up their own generative AI tools.

Depending on the application, generative AI tools may rely on an input prompt that guides them towards producing a desired outcome — think ChatGPT and DALL-E 2.

Some of the most notable examples of generative AI tools include:

  • ChatGPT: Developed by OpenAI, ChatGPT is an AI language model that can generate human-like text based on given prompts.
  • DALL-E 2: Another generative AI model from OpenAI, DALL-E is designed to create images and artwork based on text-based prompts.
  • Midjourney: Developed by San Francisco-based research lab Midjourney Inc., Midjourney interprets text prompts and context to produce visual content, similar to DALL-E 2.
  • GitHub Copilot: An AI-powered coding tool created by GitHub and OpenAI, GitHub Copilot suggests code completions for users of development environments like Visual Studio and JetBrains.

SEE: Here’s how Cisco is bringing a Chat-GPT experience to WebEx

Types of generative AI models

There are several types of generative AI models, each designed to address specific challenges and applications. These generative AI models can be broadly categorized into the following types.

Transformer-based models

These models, such as OpenAI’s GPT-3.5, which powers ChatGPT, are neural networks designed for natural language processing. They’re trained on large amounts of data to learn the relationships between sequential data — like words and sentences — making them useful for text-generation tasks.

Generative adversarial networks

GANs are made up of two neural networks, a generator and a discriminator, that work in a competitive or adversarial capacity. The generator creates data, while the discriminator evaluates the quality and authenticity of said data. Over time, both networks get better at their roles, leading to more realistic outputs.
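The adversarial loop can be sketched in miniature. The following is a minimal, illustrative sketch, not a real GAN: the "generator" is a one-parameter linear function trying to produce numbers that look like data clustered around 5.0, and the "discriminator" is a logistic regression learning to tell real samples from generated ones. Real GANs use deep neural networks for both roles; every constant here is invented for the example.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_toy_gan(steps=3000, lr=0.05, real_mean=5.0, seed=0):
    """Toy 1-D GAN: generator g(z) = a*z + b mimics data near real_mean;
    discriminator D(x) = sigmoid(w*x + c) tells real from fake."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.0, 0.0   # discriminator parameters

    def generator_mean(n=500):
        # Average generator output over a fixed batch of noise inputs.
        noise = random.Random(123)
        return sum(a * noise.uniform(-1, 1) + b for _ in range(n)) / n

    before = generator_mean()
    for _ in range(steps):
        z = rng.uniform(-1, 1)                      # noise input
        x_real = real_mean + rng.uniform(-0.5, 0.5)  # a "real" sample
        x_fake = a * z + b                           # a generated sample

        # Discriminator step: raise D(x_real), lower D(x_fake).
        g_real = sigmoid(w * x_real + c) - 1.0   # grad of -log D(real)
        g_fake = sigmoid(w * x_fake + c)         # grad of -log(1 - D(fake))
        w -= lr * (g_real * x_real + g_fake * x_fake)
        c -= lr * (g_real + g_fake)

        # Generator step: adjust a, b so D mistakes fakes for real.
        g = sigmoid(w * x_fake + c) - 1.0        # grad of -log D(fake)
        a -= lr * g * w * z
        b -= lr * g * w
    return before, generator_mean()
```

Running `train_toy_gan()` shows the competitive dynamic described above: the generator's average output drifts from roughly 0 toward the real data's mean as the discriminator simultaneously sharpens its decision boundary.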

Variational autoencoders

VAEs use an encoder and a decoder to generate content. The encoder takes the input data, such as images or text, and simplifies it into a more compact form. The decoder takes this encoded data and restructures it into something new that resembles the original input.

Multimodal models

Multimodal models can process multiple types of input data, including text, audio and images; they combine different modalities to create more sophisticated outputs. Examples include DALL-E 2 and OpenAI’s GPT-4, which is also capable of accepting image and text inputs.

Benefits of generative AI

The most compelling advantage generative AI offers is efficiency, in that it can enable businesses to automate specific tasks and focus their time, energy and resources on more important strategic objectives. This often results in lower labor costs and an increase in operational efficiency.

Generative AI can offer additional advantages to businesses and entrepreneurs, including:

  • Easily customizing or personalizing marketing content.
  • Generating new ideas, designs or content.
  • Writing, checking and optimizing computer code.
  • Drafting templates for essays or articles.
  • Enhancing customer support with chatbots and virtual assistants.
  • Facilitating data augmentation for machine learning models.
  • Analyzing data to improve decision-making.
  • Streamlining research and development processes.

SEE: Why recruiters are excited about generative AI (TechRepublic)

Use cases of generative AI

Despite generative AI still being in its relative infancy, the technology has already found a firm foothold in various applications and industries.

In content creation, for instance, generative AI can produce text, images and even music, assisting marketers, journalists and artists with their creative processes. In customer support, AI-driven chatbots and virtual assistants can provide more personalized assistance and reduce response times while reducing the burden on customer service agents.

SEE: How Grammarly is drawing on generative AI to improve hybrid work (TechRepublic)

Other uses of generative AI include:

  • Healthcare: Generative AI is used in medicine to accelerate the discovery of novel drugs, saving time and money in research.
  • Marketing: Advertisers use generative AI to craft personalized campaigns and adapt content to consumers’ preferences.
  • Education: Some educators use generative AI models to develop customized learning materials and assessments that cater to students’ individual learning styles.
  • Finance: Financial analysts use generative AI to examine market patterns and predict stock market trends.
  • Environment: Climate scientists employ generative AI models to predict weather patterns and simulate the effects of climate change.

Dangers and limitations of generative AI

It’s important to note that generative AI presents numerous issues requiring attention. One major concern is its potential for spreading misinformation or malicious or sensitive content, which could cause profound damage to people and businesses — and potentially pose a threat to national security.

These risks have not escaped policymakers. In April 2023, the European Union proposed new copyright rules for generative AI that would require companies to disclose any copyrighted material used to develop these tools. Hopes are that such rules will encourage transparency and ethics in AI development, while minimizing any misuse or infringement of intellectual property. This should also offer some protection to content creators whose work may be unwittingly mimicked or plagiarized by generative AI tools.

The automation of tasks by generative AI could also affect the workforce and contribute to job displacement, requiring impacted employees to reskill or upskill. Additionally, generative AI models can unintentionally learn and amplify biases present in training data, leading to problematic outputs that perpetuate stereotypes and harmful ideologies.

ChatGPT, Bing AI and Google Bard have all drawn controversy for producing incorrect or harmful outputs since their launch, and these concerns must be addressed as generative AI evolves, particularly given the difficulty of scrutinizing the sources used to train AI models.

SEE: Why business leaders believe the rewards of generative AI outweigh the risks (TechRepublic)

Generative AI vs. general AI

Generative AI and general AI represent different aspects of artificial intelligence. Generative AI focuses on creating new content or ideas based on existing data. It has specific applications and is a subset of AI that excels at solving particular tasks.

General AI, also known as artificial general intelligence, broadly refers to the concept of AI systems that possess human-like intelligence. General AI is still the stuff of science fiction; it represents an imagined future stage of AI development in which computers are able to think, reason and act autonomously.

Is generative AI the future?

It depends on who you ask, but many experts believe that generative AI has a significant role to play in the future of various industries. The capabilities of generative AI have already proven valuable in areas like content creation, software development and healthcare, and as the technology continues to evolve, so too will its applications and use cases.

That said, the future of generative AI is inextricably tied to addressing the potential risks it presents. Ensuring AI is used ethically by minimizing biases, enhancing transparency and accountability, and upholding data governance will be critical as the technology progresses. At the same time, striking a balance between automation and human involvement will be crucial for maximizing the benefits of generative AI while mitigating any potential negative consequences for the workforce.

Innovation Insider Newsletter

Catch up on the latest tech innovations that are changing the world, including IoT, 5G, the latest about phones, security, smart cities, AI, robotics, and more.

Delivered Tuesdays and Fridays Sign up today

AI will change the role of developers forever. Here’s why that’s good news

There's concern that widespread use of AI will result in a job cull, including for IT professionals. But Rajeswari Koppala, senior manager of DevOps at United Airlines, says automation presents new opportunities for everyone, including the staff in her department.

"I am an evangelist of automation," she says. "I think if you use it properly, you can do wonders. There's a lot of scope where we can use AI tools and machine learning to optimize what we're doing."

Also: 6 ways to ace a job interview, according to these business leaders

In the case of United, Koppala is already introducing automation through the Harness software development platform, which uses AI to simplify DevOps processes and support continuous integration and continuous delivery (CI/CD).

The technology has helped to accelerate software deployment cycles by 75% and reduced the build process from 22 minutes to just five, allowing IT professionals to focus on higher-value tasks, such as creating new services that meet business requirements.

Rather than spending hours provisioning infrastructure and dealing with repetitive operations requests, United IT staff can get on with what they do best — developing and deploying applications.

Also: AI could automate 25% of all jobs. Here's which are most (and least) at risk

Other companies are taking a similar approach, with research from Stonebranch suggesting increased use of AI and automation across the IT profession is a common trend. More than four-fifths (81%) of organizations plan to grow their automation program in 2023 and 86% plan to replace or add a new automation platform.

That's certainly the case at foreign exchange specialist Travelex, where assistant vice president Mayank Goswami is overseeing the use of a CI/CD platform from technology specialist CircleCI to automate software deployment processes across multiple environments.

The platform allows Travelex to roll out standardized development templates quickly, rather than having to set up new infrastructure in every location around the globe.

Goswami says the implementation of the CircleCI platform is part of a broader shift towards Agile and DevOps in the business, and IT professionals shouldn't be concerned by the ever-increasing use of automation as part of the development process.

Also: 5 ways to be a better manager: Best practices every leader should know

"Change is inevitable," he says. "Technology changes at least every two or three years and maybe quicker. You can't stick to what you know. You have to learn. If you consider change as an opportunity, that's how you will be able to survive in the IT industry."

The end result of increased automation, says Goswami, is greater efficiency and better working practices for everyone.

"When people work together and they're focusing on the larger business objective, and doing everything to achieve that incrementally through automation and using DevOps practices and tools, I think that's where the real benefits come through," he says.

Koppala also believes that IT professionals shouldn't be overly concerned about the rise of automation. New technologies bring fresh opportunities for operational efficiencies. She gives the example of automating deployment pipelines.

Also: Generative AI means more productivity, and a likely retrenchment for software developers

"If you have learned something from the work that you have done — and create models that can use the knowledge that is already in the system — that can bring big benefits."

However, it's important to recognize that, while automation can boost efficiency and reduce the number of repetitive tasks in an IT department, there are limits to what can be achieved.

Koppala says building automation into software development and deployment processes is a great first step, yet it's just one stage in a much longer journey.

"Over the years, automation has been a continuous struggle in the organization because any DevOps or platform engineering team tends to create automation for the use cases they know at that point of time," she says.

Also: These are the most in-demand tech roles in 2023

Going beyond that level — and adding intelligence into automation, so that manual intervention can be reduced when use cases change — is where United wants to go next.

Koppala says increasing the amount of intelligence in the software development process is one of her team's major objectives for the next two years. And she expects AI to play a big role.

"When the use case changes, automation doesn't work — and the team needs to step in and do the manual work. So, how do you build intelligent automation that takes care of the use cases that you don't know about yet? That's the space where you can make use of AI and ML models and I am actually very optimistic about their role in the future."

Also: How generative AI is changing your technology career path

Like Koppala, Goswami also expects to start seeing increasing amounts of automation in the DevOps environment.

He says it's early days for Travelex when it comes to forays into AI, particularly for generative tools, such as ChatGPT.

However, Goswami and his colleagues are wise enough to keep a watchful eye on fast-moving developments in AI.

"All these emerging technologies are on our radar to look into whether there's something that delivers business value from the point of view of our customers."

Back at United, Koppala says her team is already investigating nascent AI developments, including a feature in the Harness platform called Continuous Verification, which uses real-time, semi-supervised machine learning (ML) to model and predict service behavior.

Also: Want to learn more about prompt engineering? OpenAI's free course can help

She says the aim is to integrate the deployment pipeline with monitoring capability. Then, if problems occur when a new service is rolled out, the ML-led technology can intervene automatically, which means business-critical applications keep running.

"For example, say I'm doing a deployment today, it goes live, and it all looks good," she says. "But what happens if, after two days of the deployment, the performance of the service starts to degrade and no one notices straightaway?"

Koppala says that's where Continuous Verification fills a gap — the technology continuously monitors service performance and automatically takes proactive action.

"As soon as service performance is degraded, this deployment pipeline gets triggered to roll back to the previous version, which was working fine," she says. "So, that's kind of self-healing — that's an intelligence-led tool that provides benefits for everyone."

Also: How I used ChatGPT and AI art tools to launch my Etsy business fast

Those kinds of plus points mean Koppala and her senior management colleagues at United are keen to look at how AI might help boost a wider range of software development and deployment processes.

She recognizes that the introduction of other AI tools is "a bigger journey altogether." But, once again, some significant progress is being made, including the evaluation of an AI-based tool that shows the impact of infrastructure changes before they're pushed live.

"We're not there yet, we're still working on that," says Koppala, who reiterates that emerging technology will continue to play an ever-increasing role in the working lives of United's IT and development professionals.

"That's our objective for the next two years," she says. "We want to close that space and take advantage of the right tools."

How Microsoft might turn Bing Chat into your AI personal assistant

If there’s one thing to know about Microsoft, it’s this: Microsoft is a platform company. It exists to provide tools and services that anyone can build on, from its operating systems and developer tools, to its productivity suites and services, and on to its global cloud. So, we shouldn’t be surprised when an announcement from Redmond talks about “moving from a product to a platform.”

The latest such announcement was for the new Bing GPT-based chat service. Infusing search with artificial intelligence has allowed Bing to deliver a conversational search environment that builds on its Bing index and OpenAI’s GPT-4 text generation and summarization technologies.

Instead of working through a list of pages and content, your queries are answered with a brief text summary with relevant links, and you can use Bing’s chat tools to refine your answers. It’s an approach that has turned Bing back to one of its initial marketing points: helping you make decisions as much as search for content.

SEE: Establish an artificial intelligence ethics policy in your business using this template from TechRepublic Premium.

ChatGPT has recently added plug-ins that extend it into more focused services; as part of Microsoft’s evolutionary approach to adding AI to Bing, it will soon be doing the same. But, one question remains: How will it work? Luckily, there’s a big clue in the shape of one of Microsoft’s many open-source projects.

Jump to:

  • Semantic Kernel: How Microsoft extends GPT
  • Giving GPT semantic memory
  • Giving Bing Chat skills

Semantic Kernel: How Microsoft extends GPT

Microsoft has been developing a set of tools for working with its Azure OpenAI GPT services called Semantic Kernel. It’s designed to deliver custom GPT-based applications that go beyond the initial training set by adding your own embeddings to the model. At the same time, you can wrap these new semantic functions with traditional code to build AI skills, such as refining inputs, managing prompts, and filtering and formatting outputs.

While details of Bing’s AI plug-in model won’t be released until Microsoft’s BUILD developer conference at the end of May, it’s likely to be based on the Semantic Kernel AI skill model.

Designed to work with and around OpenAI’s application programming interface, it gives developers the tooling necessary to manage context between prompts, to add their own data sources to provide customization, and to link inputs and outputs to code that can help refine and format outputs, as well as linking them to other services.

Building a consumer AI product with Bing made a lot of sense. When you drill down into the underlying technologies, both GPT’s AI services and Bing’s search engine take advantage of a relatively little-understood technology: vector databases. These give GPT transformers what’s known as “semantic memory,” helping it find links between prompts and its generative AI.

A vector database stores content in a space that can have as many dimensions as the complexity of your data. Instead of storing your data in a table, a process known as “embedding” maps it to vectors that have a length and a direction in your database space. That makes it easy to find similar content, whether it’s text or an image; all your code needs to do is find stored vectors that point in nearly the same direction as the vector for your query. It’s fast and adds a certain serendipity to a search.
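To make the "direction" idea concrete, here is a minimal sketch, not a real vector database. `embed_toy` is a hypothetical stand-in for an embedding model: it simply hashes letters into eight frequency buckets, where real systems use learned vectors with hundreds or thousands of dimensions. Similarity is the cosine between unit-length vectors.

```python
import math

def embed_toy(text):
    """Hypothetical embedding: letter frequencies hashed into 8 buckets,
    normalized to unit length. A real embedding model would be learned."""
    buckets = [0.0] * 8
    for ch in text.lower():
        if ch.isalpha():
            buckets[(ord(ch) - ord("a")) % 8] += 1.0
    norm = math.sqrt(sum(v * v for v in buckets)) or 1.0
    return [v / norm for v in buckets]

def cosine(u, v):
    # For unit vectors the dot product is the cosine of the angle:
    # 1.0 means "same direction", values near 0 mean unrelated.
    return sum(a * b for a, b in zip(u, v))

def nearest(query, corpus):
    """Return the document whose vector points most nearly in the same
    direction as the query's vector."""
    qv = embed_toy(query)
    return max(corpus, key=lambda doc: cosine(qv, embed_toy(doc)))
```

Lookup happens by angle alone, which is why this kind of search stays fast at scale; production systems add approximate nearest-neighbor indexes on top of the same idea.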

Giving GPT semantic memory

GPT uses vectors to extend your prompt, generating text that’s similar to your input. Bing uses them to group information to speed up finding the information you’re looking for by finding web pages that are similar to each other. When you add an embedded data source to a GPT chat service, you’re giving it information it can use to respond to your prompts, which can then be delivered in text.

One advantage of using embeddings alongside Bing’s data is that you can use them to add your own long text to the service, for example when working with documents inside your own organization. By delivering a vector embedding of key documents as part of a query, you can use search and chat to create commonly used documents that combine data from a search and even from other Bing plug-ins you may have added to your environment.

Giving Bing Chat skills

You can see signs of something much like the public Semantic Kernel at work in the latest Bing release, as it adds features that take GPT-generated and processed data and turn them into graphs and tables, helping visualize results. By giving GPT prompts that return a list of values, post-processing code can quickly turn its text output into graphics.
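As a rough sketch of that post-processing idea, the code below parses a hypothetical model reply made of "label: number" lines and renders it as a crude text bar chart. The reply format and both function names are invented for this illustration; a real pipeline would prompt the model for structured output and validate it far more defensively.

```python
def parse_values(reply):
    """Pull (label, value) pairs out of a model reply formatted as
    'label: number' lines, skipping anything that doesn't parse."""
    rows = []
    for line in reply.strip().splitlines():
        label, sep, num = line.partition(":")
        if not sep:
            continue          # ignore chatter around the list
        try:
            rows.append((label.strip(), float(num.strip())))
        except ValueError:
            continue          # ignore non-numeric lines
    return rows

def ascii_bar_chart(rows, width=20):
    """Render parsed rows as a text bar chart scaled to the largest value."""
    peak = max(value for _, value in rows) or 1.0
    return "\n".join(
        f"{label:>4} | " + "#" * int(round(width * value / peak))
        for label, value in rows
    )
```

For example, `ascii_bar_chart(parse_values("Jan: 12\nFeb: 18"))` draws a 13-character bar for January and a full-width 20-character bar for February.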

As Bing is a general-purpose search engine, adding new skills that link to more specialized data sources will allow you to make more specialized searches (e.g., working with a repository of medical papers). And as skills will allow you to connect Bing results to external services, you could easily imagine a set of chat interactions that first help you find a restaurant for a special occasion and then book your chosen venue — all without leaving a search.

By providing a framework for both private and public interactions with GPT-4 and by adding support for persistence between sessions, the result should be a framework that’s much more natural than traditional search applications.

With plug-ins to extend that model to other data sources and to other services, there’s scope to deliver the natural language-driven computing environment that Microsoft has been promising for more than a decade. And by making it a platform, Microsoft is ensuring it remains an open environment where you can build the tools you need and don’t have to depend on the tools Microsoft gives you.

Microsoft is using its Copilot branding for all of its AI-based assistants, from GitHub’s GPT-based tooling to new features in both Microsoft 365 and in Power Platform. Hopefully, it’ll continue to extend GPT the same way in all of its many platforms, so we can bring our plug-ins to more than only Bing, using the same programming models to cross the divide between traditional code and generative AI prompts and semantic memory.


US Takes a Page from Supercomputing Past to Boost AI Research

May 5, 2023 by Agam Shah

A pattern is emerging in how the U.S. government wants to boost its AI research. The approach is similar to how the early supercomputing infrastructures were built: engage academia and national labs, then apply the research to solve critical domestic issues.

The National Science Foundation (NSF) on Tuesday announced it was awarding $140 million to universities to promote fundamental research in artificial intelligence.  The funding is targeted at specific institutions to address public-sector issues like cybersecurity, climate change, agriculture, public health and education. The NSF is also funding fundamental research for the building blocks of AI so models in the future are ethical, trustworthy and accessible.

The need for coordinated initiatives in AI

"From the perspective of advancing ethical research and figuring out responsible use cases, the first step is foundational research. This is best placed at academic institutions and universities," said Hadan Omaar, a senior analyst focusing on AI policy at the Information Technology and Innovation Foundation, a think tank based in Washington DC.

There has been a pushback against AI research being done by the private sector over the last few years, "so there's an effort needed by the government to balance this out," Omaar said.

The AI research being done by the universities is not market-oriented, instead focusing more on solving public sector problems that are high on the U.S. agenda. Similarly, supercomputers at national laboratories are being prioritized for tasks like economic modeling and weapons development.

The U.S. government is increasingly concerned about the responsible and safe use of AI. The fast growth of AI tools like ChatGPT has concerned U.S. cybersecurity officials. Many U.S. law-enforcement agencies, including the Federal Trade Commission (FTC) and the Department of Justice (DOJ), late last month expressed concerns about the harmful use of artificial intelligence and automated systems to break laws. The agencies said they would use laws and enforcement actions to promote the responsible use of AI.

“Technological advances can deliver critical innovation – but claims of innovation must not be cover for lawbreaking. There is no AI exemption to the laws on the books, and the FTC will vigorously enforce the law to combat unfair or deceptive practices or unfair methods of competition,” said FTC chair Lina Khan in a statement.

The AI arms race

NSF is funding these AI projects as the U.S., the European Union and China engage in an AI arms race. The EU is considering AI legislation called the Artificial Intelligence Act, which would set boundaries for AI development to ensure safe and desirable outcomes.

A string of private sector AI breakthroughs from OpenAI, Microsoft and Google has put the U.S. ahead of China in AI. Previously, China was seen as a global leader because of its use of AI in efforts like the social credit system. The Chinese system rates individuals using facial recognition software that draws on data from over 200 million surveillance cameras in the country, HR firm Horizons said on its website.

The U.S. government is not competing with the EU and China on issues like climate change or agriculture, and the private sector won’t provide tools to solve those problems, Omaar said. The White House is trying to get a handle on these domestic problems with AI, and for decades has relied on academia to find answers, Omaar continued.

The programs funded by the NSF will add to a stronger knowledge base, and will eventually factor into the AI race against China and the EU.  “Ultimately, the U.S. will have to economically stay competitive in AI,” Omaar said.

Geographic diversity

Omaar noted that the NSF funding for AI programs was sprinkled across different geographies, which was one way to keep all regions engaged.  She took note of NSF funding a group led by the University of Minnesota, Twin Cities to add foundational AI knowledge related to agriculture and forestry. It is close to the U.S. heartland, very relevant to the region, and will boost the university’s profile to better compete with other research institutions.

“NSF is trying to spread access across the country. If it is one in forestry, it is in a university that struggles with the issue,” Omaar said.

The NSF has opened national institutes at academic and research institutions to further its AI agenda, the agency said in a statement. "Today’s investment means the NSF and funding partners have now invested close to half a billion dollars in the AI Institutes research network, which reaches almost every U.S. state," the NSF said in a statement. 

Areas of funding

The funding includes access to computing resources – possibly at supercomputing centers – such as “central processing unit (CPU) and graphics processing unit (GPU) options with multiple accelerators per node, high-speed networking, and sufficient memory capacity (i.e., at least one terabyte per node),” according to a January document from the National Artificial Intelligence Research Resource Task Force, which was created by the NSF and the White House Office of Science and Technology Policy (OSTP) to guide the AI research roadmap.

Among the other NSF-funded AI endeavors: a team led by the University of California at Santa Barbara will look at AI cyber-agents to catch and prevent hacks (cybersecurity specialists believe that tools like ChatGPT could be used to create code that could hack systems, or draft letters for social engineering); a group led by the University of Maryland will focus on a program to develop a framework for unbiased AI systems that can be trusted across racial and gender divides; a team led by Columbia University will try to establish connections between AI and how the human brain processes information; and the University at Buffalo and other institutions will focus on AI systems to assess if kids need interventions. Two other programs funded by the NSF will look at AI in decision-making and education.

A larger U.S. effort

The NSF funding initiative was part of a larger White House announcement around the safe and responsible use of AI. The White House managed to snag commitments from top AI companies including Google, Microsoft, OpenAI and Nvidia to evaluate AI systems based on a platform developed by Scale AI at DEF CON 31, which will be held in Las Vegas in August.

“This independent exercise will provide critical information to researchers and the public about the impacts of these models, and will enable AI companies and developers to take steps to fix issues found in those models,” the White House said in its statement.

It is not clear if large language models such as OpenAI’s GPT-4, the brain behind Microsoft’s BingGPT, will be open for evaluation. The model is closed, and OpenAI has disclosed far fewer technical details about it than it did for its predecessor GPT-3.

Looming ubiquity

The release of ChatGPT late last year set off a storm of activity around generative AI, which has boosted public and private investment in the technology. A survey released by the World Economic Forum this week stated that "generative AI has received particular attention recently, with claims that 19% of the workforce could have over 50% of their tasks automated by AI." About 75% of the companies surveyed by WEF plan to put artificial intelligence into daily operations in the coming years.

That said, present applications have had varying levels of success. Microsoft released BingGPT, which is based on GPT-4, for testing a few months ago, and the early results were poor. The AI tool often hallucinated and provided inaccurate or moody answers. Microsoft also had little to lose by making its AI technology public, given that its 2.79% search engine market share in April is dwarfed by Google's 92.61% market share, according to numbers from StatCounter.

The UK’s antitrust agency is going after Big Tech’s AI models now

Just a week after thwarting Microsoft's attempt to shell out $68.7 billion for video game maker Activision Blizzard, the UK's antitrust agency Competition and Markets Authority (CMA) announced that it will launch a review of the artificial intelligence landscape in Britain.

These would include wildly popular generative AI tools such as OpenAI's ChatGPT and DALL-E, as well as proprietary large language models (LLMs) like Google's Bard.

Also: How I used ChatGPT and AI art tools to launch my Etsy business fast

"This initial review will focus on the questions the CMA is best placed to address — what are the likely implications of the development of AI foundation models for competition and consumer protection?" said Sarah Cardell, Chief Executive of CMA, in a statement.

The CMA has asked stakeholders to weigh in by sending submissions by June 2, 2023. After collecting and analyzing the relevant submissions, the CMA will publish a report detailing its findings in September 2023.

Also: How to use the new Bing (and how it's different from ChatGPT)

"It is crucial that the potential benefits of this transformative technology are readily accessible to UK businesses and consumers while people remain protected from issues like false or misleading information," added CMA chief Cardell.

AI, a creator of chaos and miracles

The UK — and the rest of the world — have been consumed by seemingly unending developments showcasing the abilities of AI models like ChatGPT, developed by US firm OpenAI amongst others.

Also: Why open source is essential to allaying AI fears, according to Stability.ai founder

ChatGPT has become an overnight star amongst global users for its ability to deliver a dizzying variety of information in uncanny, human-like conversations. These are possible because OpenAI has trained its algorithm on 300 billion words contained in text databases on the internet.

There are enormous gains that can be made with AI — in cancer detection, autonomous vehicles, and gene therapy, to name a few — in virtually every field imaginable. Yet, ChatGPT has also been excoriated for inherent racial, gender and age biases and for its tendency to make up things when it is asked certain questions to which it does not know the answer.

AI has also put thousands of jobs in peril. Early this week, Arvind Krishna, the CEO of IBM, said that he isn't going to be hiring for roles he thinks will be replaced by AI in the coming years, speculating that this could affect as many as 7,800 roles.

Also: Generative AI is changing your technology career path. What to know

Educational technology company Chegg shed an astounding 50% of its market cap this week because ChatGPT seems to be displacing it; the thinking among users appears to be, why pay $15 per month for course solutions when you can get them for free?

Investment bank Goldman Sachs thinks that as many as 300 million jobs could be lost to AI automation in the years to come, so it comes as no surprise that the UK is beginning to act.

Artists and musicians have also been put on notice. AI versions of major artists covering other major artists' songs have flooded the internet and the legal ramifications of copyright violation are yet to be untangled.

The CMA, though, has stated that matters of copyright and intellectual property are not in its purview, and neither are online safety, data protection, and security.

This is because the UK government had earlier announced that it was going to divvy up responsibilities for AI among its existing regulators for human rights, health and safety, and competition, rather than construct a new one solely for technology.

It published a White Paper in March that asked regulators, including the CMA, to think about how the innovative development and deployment of AI can be measured against five broad principles: "safety, security and robustness; appropriate transparency and explainability; fairness; accountability and governance; and contestability and redress."

Uncle Sam gets into action

Not surprisingly, these are the very same issues that have galvanized into action the US government, which has arguably presided over the eye of the AI storm.

Vice President Kamala Harris and other senior officials are meeting this week with the CEOs of major AI development companies, including OpenAI (creator of ChatGPT), Microsoft (an OpenAI investor), Alphabet (parent of Google and its AI offspring Bard), and Anthropic AI, amongst others, to discuss a future roadmap for the development of responsible AI.

Also announced were plans for public assessments of the major generative AI systems now proliferating. These would be carried out at the AI Village — a community of thousands of hackers, data scientists, independent community partners and AI experts.

These efforts come on the heels of President Biden's release of a Blueprint for an AI Bill of Rights late last year, designed to protect people from the negative effects of artificial intelligence.

In a related development, the National Science Foundation received $140 million to launch seven new National AI Research Institutes to forge breakthroughs in climate, energy, agriculture, public health, and other areas.

The conviction to corral AI before it runs rampant through society has also spread to Europe. As ZDNET has reported, the European Parliament is putting into place an 'AI Act' that will classify AI models according to risk levels.

Most significantly, model developers will now have to disclose any copyrighted material fed to their models during the training phase.

There seems to be a global conviction to foster innovation in a transformative field while ensuring that it doesn't cause widespread chaos, or even, according to some, the end of humanity.

Some thanks for spurring this action should go to none other than Geoffrey Hinton, considered the godfather of AI and machine learning, who thinks that his machine-child may actually destroy humankind if not checked.

Hinton, who just days ago quit his job at Google, says that if he hadn't come up with the technology, somebody else would have, and the most important thing is to act now. "Right now, they're not more intelligent than us, as far as I can tell. But I think they soon may be," he told the BBC.

That's a future that governments around the world will have to negotiate, and the clock is ticking.

Stability AI Releases Text-to-Image Model DeepFloyd IF

Stability AI and its multimodal AI research lab, DeepFloyd, have announced the research release of DeepFloyd IF, a cutting-edge text-to-image cascaded pixel diffusion model. The model is initially released under a non-commercial, research-permissible license, but an open-source release is planned for the future.

DeepFloyd IF boasts several remarkable features, including:

  1. Deep text prompt understanding: The model uses T5-XXL-1.1 as a text encoder, with numerous text-image cross-attention layers, ensuring better alignment between prompts and images.
  2. Coherent and clear text alongside generated images: DeepFloyd IF can generate images containing objects with varying properties and spatial relations.
  3. High degree of photorealism: The model has achieved an impressive zero-shot FID score of 6.66 on the COCO dataset.
  4. Aspect ratio shift: The model can generate images with non-standard aspect ratios, including vertical, horizontal, and the standard square aspect.
  5. Zero-shot image-to-image translations: The model can modify an image's style, patterns, and details while preserving its basic form.

DeepFloyd IF's modular, cascaded, pixel diffusion design consists of several neural modules interacting synergistically. The model works in pixel space, processing high-resolution data in a cascading manner using individually trained models at different resolutions. This involves a base model that generates low-resolution samples and successive super-resolution models that produce high-resolution images.
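The cascaded design can be sketched in miniature. The toy Python example below is purely illustrative: the functions `base_model` and `super_resolution` are hypothetical stand-ins (the real stages are trained diffusion models), but the control flow shows the shape of the pipeline, with a base stage producing a low-resolution sample and successive super-resolution stages upscaling it.

```python
import numpy as np

def base_model(prompt_embedding, rng, size=64):
    # Stand-in for the base diffusion model: returns a low-resolution
    # "sample" conditioned (in the real model) on the text embedding.
    return rng.standard_normal((size, size, 3))

def super_resolution(image, factor, rng):
    # Stand-in for a super-resolution diffusion stage: upscale by
    # nearest-neighbour repetition, then apply a small "refinement"
    # perturbation where the real model would denoise.
    upsampled = image.repeat(factor, axis=0).repeat(factor, axis=1)
    return upsampled + 0.01 * rng.standard_normal(upsampled.shape)

def cascaded_generate(prompt_embedding, seed=0):
    rng = np.random.default_rng(seed)
    x = base_model(prompt_embedding, rng)       # 64x64 base sample
    x = super_resolution(x, factor=4, rng=rng)  # -> 256x256
    x = super_resolution(x, factor=4, rng=rng)  # -> 1024x1024
    return x

image = cascaded_generate(prompt_embedding=None)
print(image.shape)  # (1024, 1024, 3)
```

The key design point the sketch captures is that each stage operates directly in pixel space at its own resolution, so the expensive high-resolution work is only done on samples the cheaper base stage has already laid out.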

The model was trained on a custom high-quality LAION-A dataset containing 1 billion (image, text) pairs, a subset of the English part of the LAION-5B dataset. DeepFloyd's custom filters were used to remove watermarked, NSFW, and other inappropriate content.

Initially, DeepFloyd IF is released under a research license. The researchers aim to encourage the development of novel applications across domains such as art, design, storytelling, virtual reality, and accessibility. To inspire potential research, they have proposed several technical, academic, and ethical research questions.

Technical research questions include:

  • Optimizing the IF model to enhance performance, scalability, and efficiency.
  • Improving output quality by refining sampling, guiding, or fine-tuning the model.
  • Applying techniques used to modify Stable Diffusion output to DeepFloyd IF.

Academic research questions include:

  • Exploring the role of pre-training for transfer learning.
  • Enhancing the model's control over image generation.
  • Expanding the model's capabilities beyond text-to-image synthesis by integrating multiple modalities.
  • Assessing the model's interpretability to improve understanding of generated images' visual features.

Ethical research questions include:

  • Identifying and mitigating biases in DeepFloyd IF.
  • Assessing the model's impact on social media and content generation.
  • Developing an effective fake image detector that utilizes the model.

To access the model's weights, users must accept the license on DeepFloyd's Hugging Face space. For more information, you can visit the model's website, GitHub repository, Gradio demo, or join public discussions through DeepFloyd's Linktree.