Exploring intelligent search solutions: A comparative analysis of Amazon Kendra integration and large language model crawlers


Amazon Kendra and LlamaIndex can help with knowledge integration, but they fall short in connecting diverse knowledge sources to enable efficient intelligent search. In this article, we compare the existing solutions and explain how to overcome their limitations with a custom Google Drive crawler.

Companies often face difficulties in consolidating their knowledge base when their data is spread across various knowledge storage platforms like Google Drive, Confluence, Notion, etc. The main obstacle is that each of these platforms has its own unique API or query language. This calls for a solution that can unify data sources under a common structure, with a universal query language, preferably a human language with inherent “intelligence.”

There are multiple tools and applications that facilitate knowledge integration, either as a built-in feature, as with Amazon Kendra, or via existing connectors and crawlers, as with tools like LlamaIndex and LangChain that use large language models (LLMs) for intelligent search. But while these tools provide a variety of crawlers for well-known data sources, they do not have all the connectors needed for more diverse knowledge sources.

Our company has multiple sources of data, so we applied generative AI to build a general search system, building missing crawlers and enhancing existing ones to avoid these limitations while remaining compliant with our data security policies. Some of our data lives in Google Drive, and we wanted to develop an LLM-based semantic search tool for those documents.

This article compares out-of-the-box solutions like Amazon Kendra and open-source libraries like LlamaIndex, and explains how the team at Provectus was able to overcome the limitations of the LlamaIndex crawler by developing our own Google Drive crawler.

Amazon Kendra

As I delved into the world of intelligent search solutions, I decided to put two popular tools head-to-head – Amazon Kendra and LlamaIndex – to discover their unique strengths and limitations. Let’s first take a closer look at Amazon Kendra.

Amazon Kendra is a user-friendly, machine learning-powered intelligent search service offered by Amazon Web Services (AWS). It is designed to help organizations index and search all types of data sources quickly and efficiently. Amazon Kendra uses advanced techniques, such as natural language processing and computer vision, to understand user queries and provide accurate, relevant results in a fraction of the time a manual search of the data sources would take. With its intuitive interface and powerful features, Amazon Kendra is a valuable tool for organizations looking to make the most of their data.

Here are some major features and benefits of Amazon Kendra:

  • Uses advanced machine learning techniques, including natural language processing, computer vision, and deep learning, to understand user queries and provide accurate results.
  • Features a user-friendly interface that makes it easy to set up and use without technical expertise.
  • Supports a wide range of data sources, including structured and unstructured data, making it versatile and adaptable.
  • Offers connectors for popular data sources, such as SharePoint, Salesforce, and Amazon S3, to make it easy to integrate with existing systems.
  • Provides out-of-the-box features such as type-ahead search suggestions, query completion, and document ranking to improve the search experience for users.

Limitations and drawbacks of Amazon Kendra include:

  • It may be expensive for small businesses or organizations with limited budgets, particularly if indexing large volumes of data is required. Amazon Kendra’s pricing is based on a combination of the number of documents to be indexed and the number of monthly queries.
  • Indexing: You pay per GB of data indexed per month. The price varies based on the data source, with different rates for webpages, Amazon S3 objects, and databases.
    • Querying: You pay for each query made to Amazon Kendra. The price varies based on the number of queries per month, with discounts available for higher usage tiers.
  • It may require customization to work with specific data sources or search requirements, which can be time-consuming and require technical expertise. Basic setup, however, is straightforward and is done using either the AWS Management Console or the CLI. Custom crawlers are also simple to add; see Adding documents to Amazon Kendra index with boto3 for a good walkthrough.
  • Amazon Kendra is a fully managed service. It is something of a “black box” that you cannot fully control or tune (e.g., change its model, use multiple models, etc.).
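To illustrate the custom-crawler route, here is a minimal sketch of pushing documents into a Kendra index with boto3's BatchPutDocument API, in the spirit of the guide linked above. The helper names and the index ID are placeholders of my own, not part of any official example:

```python
def make_kendra_document(doc_id: str, text: str, title: str) -> dict:
    # Build one entry for the Documents list of batch_put_document.
    return {
        "Id": doc_id,
        "Title": title,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
    }


def index_documents(index_id: str, docs: list) -> None:
    # boto3 is imported lazily so the payload helper stays dependency-free.
    import boto3

    kendra = boto3.client("kendra")
    # BatchPutDocument accepts at most 10 documents per call.
    for i in range(0, len(docs), 10):
        kendra.batch_put_document(IndexId=index_id, Documents=docs[i:i + 10])
```

You would call index_documents("your-index-id", [...]) from whatever crawler produces the text.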

LlamaIndex

While Amazon Kendra is a powerful intelligent search solution, it is not the only available option. Let’s look at one alternative, LlamaIndex, and review its features and capabilities.

LlamaIndex is an open-source project that helps connect LLMs with external data sources. It is designed to work with models like GPT and makes it easy to connect LLMs to databases, APIs, PDFs, and more. LlamaIndex ships as a Python package for connecting your LLM to various data sources, and it also has a data loader hub called Llama Hub (llamahub.io). Llama Hub is a community library of data loaders for LLMs, available on GitHub. It is designed to simplify connecting LLMs to all sorts of knowledge sources.

LlamaIndex Crawlers

To make a proper comparison between Amazon Kendra and LlamaIndex, I showcased the unique features and capabilities of LlamaIndex by building my own Google Drive loader and using it for data indexing. You may wonder why I didn't simply use an existing solution, and I want to make it clear that I am not discouraging the use of community-provided loaders. Our decision to build a custom crawler stems from specific requirements and limitations we encountered along the way that no existing data loader could cover.

Before delving into the finer details of implementation, it is worth taking a moment to discuss how you can add your custom loader to Llama Hub. Fortunately, this is a straightforward procedure: you create a dedicated folder for your loader and include a base.py file with your original code. From there, you reference your loader in the library.json file, which is used by LlamaIndex's download_loader function. For more information on this process, please check out this link.
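For reference, a library.json entry is just a small JSON object keyed by your loader's class name. The entry below is a hypothetical illustration; the exact fields Llama Hub expects may differ, so check the repository's existing entries:

```json
{
  "CustomGoogleDriveReader": {
    "id": "custom_google_drive",
    "author": "your-github-handle"
  }
}
```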

I do not intend to publish my crawler to Llama Hub due to certain compliance issues. Nevertheless, it overcomes some of the existing limitations of the Google Drive crawler that make it less than ideal for certain use cases, and it showcases how to avoid them.

Here are a few of the most notable limitations of LlamaIndex's Google Drive crawler:

  • Data loaders are shipped as a single base.py file, which means all code is combined in one file, making it hard to maintain, debug, etc. Either packaging or a clear structure would help solve this problem.
  • LlamaIndex's Google Drive crawler does not support service account auth, which is a convenient authentication method: you simply create a service account and share the folders to be indexed. In our case, we were only allowed to authenticate via service accounts.
  • The crawler is rigidly configured: you cannot specify which file types to read, or restrict file types from being read. It would be great to be able to pass this configuration into the crawler class, and our implementation allows us to do so.

  • Storage space. Most folder-like crawlers, including the Google Drive one, download files to the local file system behind the scenes and then crawl them by means of SimpleDirectoryReader, which in turn can load many known format readers. The extra space can be a concern, especially if you run your crawlers in a container with volume restrictions. So instead of downloading all our Google Drive content into a local folder, we reused available memory to store the content in the LlamaIndex Document entity, and added documents to the index chunk by chunk.

Custom Google Drive Crawler

In this section, we will dive deeper into the implementation details of our custom crawler and explore both the strengths and limitations of LlamaIndex’s solution. By examining these flaws and benefits, I want to provide a more comprehensive understanding of the tool’s capabilities and inspire ideas on how to improve its architecture and performance. For more information about existing implementations, please visit the official llama-hub GitHub repository.

To build a Google Drive reader, you will need to use the Google Drive SDK. Since I am using Python, I will be using the Python version of the SDK. To get started, you need to install the following packages – I’m using the Poetry package manager, but the choice is up to you:

$ poetry add google-auth google-auth-httplib2 google-api-python-client google-auth-oauthlib python-docx llama-index

I have also added a couple more packages that we’ll need later, like python-docx and llama-index. I’ll cover limitations by providing our code samples.

As previously mentioned, a data loader in Llama Index is implemented through a single base.py file where all components are defined in one place. While this approach can be effective for certain use cases, I believe there is room for improvement. Specifically, by adopting the Template Method pattern and defining a parent loader, we can create a more polymorphic system that allows for greater flexibility and ease of use. With this approach, users only need to implement a single method for extracting file content, while all other components can be standardized and shared across multiple loaders. Here is the proposed folder structure and parent loader abstraction:


In this structure, google_item_reader.py defines the GoogleItemReader class with the following interface; the main entry point is load, and _get_doc_content_and_meta is the method that defines the content retrieval actions. All reader classes inherit from GoogleItemReader, and googledrivereader.py is the “folder” reader that combines either all the other readers, or only those configured.

class GoogleItemReader:

    base_url: str = DocumentType.DOCUMENT

    def __init__(self, creds: Credentials):
        self.creds = creds

    def _get_service(self):
        return build('docs', 'v1', credentials=self.creds)

    def load(self, items_ids: List[str]) -> List[Document]:
        service = self._get_service()
        documents = []
        for item_id in items_ids:
            doc_content, url = self._get_doc_content_and_meta(service, item_id)
            documents.append(Document(doc_content, extra_info={
                "id": item_id,
                "source_url": url,
            }))
        return documents

    def _get_doc_content_and_meta(self, service, item_id: str) -> Tuple[str, str]:
        # Subclasses implement this hook of the Template Method pattern.
        raise NotImplementedError

I won't provide all the implementation details, but here are a few examples explaining the crawler tweaks; all the code can be found here, in my repository.

Here is GoogleDocXReader, which showcases the polymorphic advantages and also addresses the storage space limitation by loading document content into memory in chunks.

class GoogleDocXReader(GoogleItemReader):

    def _get_doc_content_and_meta(self, service, doc_id: str) -> Tuple[str, str]:
        url = f"{self.base_url}{doc_id}/edit"
        request = service.files().get_media(fileId=doc_id)
        file = io.BytesIO()
        downloader = MediaIoBaseDownload(file, request)
        done = False
        while not done:
            _, done = downloader.next_chunk()
        file.seek(0)
        doc = docx.Document(file)
        full_text = []
        for paragraph in doc.paragraphs:
            full_text.append(paragraph.text)
        text = '\n'.join(full_text)
        return text, url

Eventually, your Google Drive reader should look like the class below. Here you can easily pass in different auth options, and even a credentials instance itself, along with the doc readers configuration. You might notice that the doc readers can also be used independently, so if you need to read only one type of document, you simply provide the doc IDs, which is what you have in LlamaIndex documents.

class GoogleDriveReader(BaseModel):

    google_creds_dict: Optional[Dict[str, str]] = None
    service_account_key: Path = Path.home() / ".google_creds" / "keys.json"
    credentials_path: Path = Path.home() / ".google_creds" / "credentials.json"
    token_path: Path = Path.home() / ".google_creds" / "token.json"
    folder_id: Optional[str] = None
    logger: Any = None
    credentials: Optional[Credentials] = None
    doc_readers: Optional[Dict[str, Type[GoogleItemReader]]] = None

    def load(self, query: str = None):
        files = self._load_files_from_folder(query_str=query)
        grouped_files = self._group_by_doc_mimetype(files)
        documents: List[Document] = []
        for doc_type in grouped_files:
            google_docs = grouped_files[doc_type]
            doc_reader = self.doc_readers[doc_type](self._get_credentials())
            files_ids = [doc['id'] for doc in google_docs]
            docs = doc_reader.load(files_ids)
            documents.extend(docs)
        return documents

That is pretty much all you need to crawl Google Docs and generate LlamaIndex-based documents. Once you are done with documents, it is easy to build an index, or vectorize documents and query the knowledge information you have.

By default, LlamaIndex uses the OpenAI LLM model, requiring only an OpenAI API key to get started. However, the tool is not limited to this model alone. In fact, LlamaIndex features a range of tools for connecting to various LLMs, providing users with greater flexibility and choice in their search capabilities. In the example below, I use a simple vector index that stores vectors in a basic JSON file. With this approach, multiple indexes can be combined into a single comprehensive index, with routing guidelines established for every section of the index.

os.environ['OPENAI_API_KEY'] = 'openai API key'

doc_reader = GoogleDriveReader(
    folder_id='asdfasd-asdfasd-asdfasd-asd',
)
documents = doc_reader.load()

index = GPTSimpleVectorIndex.from_documents(documents)
index.save_to_disk('google_index.json')

query_engine = index.as_query_engine()
query_engine.query("What are the company password policies?")
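To build some intuition for what a simple vector index is doing under the hood, here is a toy, dependency-free sketch of my own: each document is embedded as a vector (using a crude hashed bag-of-words stand-in for a real LLM embedding), stored alongside its text, and queried by cosine similarity. This is an illustration, not LlamaIndex's actual implementation:

```python
import json
import math


def embed(text: str, dims: int = 64) -> list:
    # Toy embedding: hashed bag-of-words. A stand-in for an LLM embedding model.
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyVectorIndex:
    def __init__(self):
        self.entries = []

    def insert(self, text: str) -> None:
        self.entries.append({"text": text, "vector": embed(text)})

    def query(self, question: str, top_k: int = 1) -> list:
        # Rank stored documents by cosine similarity to the query vector.
        qv = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e["vector"]), reverse=True)
        return [e["text"] for e in ranked[:top_k]]

    def save_to_disk(self, path: str) -> None:
        # Mirrors the save_to_disk idea: the "index" is just JSON on disk.
        with open(path, "w") as f:
            json.dump(self.entries, f)
```

A production index works on the same principle, but with real embeddings and smarter chunking; swapping this flat scan for a vector database is what speeds up queries at scale.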

While this is a robust and viable option, there are still a few tweaks that can be applied to the indexing process to optimize performance.

  • Instead of GPTSimpleVectorIndex, a proper vector DB (Pinecone, Qdrant, etc.) could be used to speed up query performance.
  • You could experiment with different index types that better fit your purpose. (Here are several index use cases to check out.) On top of that, you can use query engines for query transformation to make your requests to the LLM even clearer.
  • Indexes can be separated by knowledge domain and combined into a composable graph, which improves the accuracy and performance of queries.

Before closing this section, I would like to share a few tips and tricks for deploying LlamaIndex on AWS, including some best practices we have found to be effective:

  • Crawling and indexing is a recurring job that should be scheduled for optimal performance. For small data sources, AWS Lambda can be a good option. For more prolonged crawling, consider using tools like Airflow, multiple Lambda invocations, or Amazon ECS tasks. We found Amazon ECS tasks to be the easiest solution to implement, and we recommend using Event Bridge Rules for scheduling.
  • Index querying service is handled by AWS Lambda, as it simply performs queries on the indexed data.
  • For messenger bots like Slack or WhatsApp, a bidirectional connection is needed in order to send and receive messages. We found that Amazon ECS containers provide a reliable and effective solution for managing these connections.
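The ECS-plus-EventBridge scheduling setup mentioned above can be sketched with boto3 as follows; all names and ARNs below are placeholders, not a production template:

```python
def ecs_target(cluster_arn: str, task_def_arn: str, role_arn: str) -> dict:
    # Target payload for EventBridge put_targets; every ARN here is a placeholder.
    return {
        "Id": "kb-crawler",
        "Arn": cluster_arn,
        "RoleArn": role_arn,
        "EcsParameters": {
            "TaskDefinitionArn": task_def_arn,
            "TaskCount": 1,
            "LaunchType": "FARGATE",
        },
    }


def schedule_crawler(rule_name: str, schedule: str, target: dict) -> None:
    # boto3 is imported lazily; schedule is e.g. "rate(1 day)" or a cron() expression.
    import boto3

    events = boto3.client("events")
    events.put_rule(Name=rule_name, ScheduleExpression=schedule)
    events.put_targets(Rule=rule_name, Targets=[target])
```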

Amazon Kendra vs. LlamaIndex

We have evaluated both solutions and their unique strengths and weaknesses. I have compiled these factors into a comprehensive table for easy comparison.

Capability | Amazon Kendra | LlamaIndex
Underlying ML | Pure NLP | Works with LLM models
ML model customization | No | Yes
Knowledge entry threshold | Pretty low; it's just a matter of a few clicks | Average; requires knowledge of Python and LLM specifics
Data loaders | Not many; open to customization | Many; open to customization
Integration | Only custom loaders + shareable API | Agile; integrates with modern LLM tools
Performance | Great; managed by AWS | Depends on how the index is built and on integration with modern vector DBs that provide great search performance
Scalability | Great; AWS | Can be deployed with any cloud provider; scalability depends on how you build your infrastructure
UI client | Provided by Amazon Kendra Experience, with some extent of customization | None; has to be built
Cost | Expensive; Enterprise Edition is $1,008/month and Developer Edition is $810/month | The library itself is open source; you pay for the LLM used behind the scenes (e.g. for OpenAI, 1 word is roughly 1.3 tokens, and 1K tokens cost $0.0004). If the model is open source, it can even be free, with only infrastructure expenses

Conclusion

With a comprehensive understanding of the strengths and weaknesses of both Amazon Kendra and LlamaIndex, it's up to you to decide between them, based on your specific business requirements and capabilities.

Businesses seeking to augment their search capabilities without the intricacies of constructing complex NLP models and crawlers can consider Amazon Kendra as a viable solution. It features a user-friendly interface coupled with potent machine learning capabilities. Amazon Kendra enables businesses to swiftly index and search through various data sources without demanding extensive technical expertise or resources.

Despite potential budget constraints, businesses possessing technical expertise can build dynamic, adaptable solutions using LlamaIndex. This robust tool leverages the full potential of LLMs, facilitating the creation of autonomous agents capable of performing various tasks, such as search and calculations, in response to natural language requests. With its versatile capabilities, LlamaIndex enables businesses to maximize their data utility without overstretching their financial resources.

At present, the Provectus team is enhancing our knowledge base assistant by integrating agents to further improve its intelligent search capabilities. A detailed exploration of this is likely a subject for a future article. Stay tuned for more insights!

About The Author

Alexander Demchuk, Software Architect at Provectus

With over a decade of experience as a software engineer, Alex has dedicated his career to mastering the intricacies of software design and architecture. In recent years, he has developed a strong passion for Artificial Intelligence and Machine Learning engineering, focusing particularly on Natural Language Processing and Large Language Models.

Alex actively engages in the initial discovery phases of projects, assisting customers in understanding and addressing potential software weaknesses. He thrives on leveraging cutting-edge technologies to deliver unparalleled value to customers, enabling them to thrive in their respective fields. His true passion lies in applying his expertise to create innovative solutions that drive success and foster innovation.

In addition to his professional pursuits, Alex is an active participant in numerous software conferences, keeping abreast of the latest industry trends and advancements.

Sentience: AI has demystified human consciousness, intelligence


There is a recent article, Unraveling the Mystery of Human Consciousness, where it was stated that, “Consciousness makes us capable of experiencing the scent of a rose, the touch of a breeze, the taste of food, the sound of music, and the sight of a sunrise. We also have a unique ability to be aware of our thoughts, emotions, memories, and even of our own awareness. Yet, despite the centrality of consciousness to our human experience, it remains shrouded in mystery. Science, as of today, has no definitive explanation of what consciousness is, how it occurs, or why it exists at all.”

“In the realm of neuroscience, consciousness has often been studied as an emergent property of the brain. Researchers explore the “neural correlates of consciousness,” aiming to identify specific neural systems or patterns of brain activity that correspond to conscious experiences.”

The problem of consciousness and the standard questions around it can be said to be a bit dated in the age of AI. The scent of a rose, the touch of breeze or whatever is called experience is not different from what is known or becomes known. The processes to obtain experiences are not different from the processes to recall things, or the process from which what is known becomes prioritized.

Knowing exceeds experience since experience is often a present expression of it. It is knowing that rounds up everything the mind does. Thoughts, emotions, memories and awareness are known, or in the process to be known.

Thoughts take from what is known. Memories are things known. Emotions too are known. Awareness of anything or attention is also known. The relationship of humans with the world is divided between what is known, getting to be known or not known.

It is the mind that gives knowing or all that the mind does is knowing. How does the mind enable knowing? That question exceeds the neural correlates or other angles of consciousness. The brain gives off the mind. The mind enables knowing. Knowing includes experiences. Knowing also includes the awareness of being.

Conceptually, the electrical and chemical impulses of brain cells form a collection or group known as the mind. It is the mind’s mechanisms and its transmissions that enable knowing processes for internal and external senses. Regulation, modulation or control of internal senses is done as knowing, giving limits and extents for functions, so that alerts are made when deviations occur.

Theoretically, the human mind has components, quantities and properties. It is when a quantity acquires a property that things become known. Quantities have their features, as well as properties. There is no difference in how the components of mind make it possible to know for emotions and for memory or others. The labels of the mind are different from how the mind works.

Knowing is what the mind does. Intelligence processes knowing, sometimes without the experiential component. Consciousness, or the knowing rate of a species, has divisions that include feelings, memory, emotions and reactions, across states and the lifecycle. Intelligence is within memory.

AI has memory, based on human texts, containing aspects of human intelligence. AI may not understand, but it would pass a knowing test, for several categories of things. There are several things that humans know that have not been personally experienced, so descriptions of them can also be possible without the understanding that comes from experience.

LLMs answer questions in the first person, even though they are not a being. Intelligence, as a label, may not totally apply to AI, but they know, to an extent that may sometimes equal human minimums in some categories. What humans know or can know and what AI’s output can pass for knowing, exceeds the labels of intelligence and consciousness, and makes some comparisons possible.

Security data lakes and the future of organizational security


Evolving technological advancements have created a far more data-centric world. This has dramatically changed the enterprise landscape, while also creating more data silos.

The explosion of cybersecurity tools and mounds of data in modern enterprises have made it difficult to combine data to create a unified view. This has resulted in siloed data that’s also costly to store, analyze, and quickly present. In addition, it creates a dangerous division of knowledge that can prevent security teams from spotting threats.

What’s needed is a better way to break down these silos of security data to gain greater visibility, which is the promise that security data lakes bring. But organizations must go a step further than establishing these data lakes; they also need a way to weave together all the data collected for optimal use and benefit.

A legacy of silos

Enterprise data and security data have generally been treated as two separate entities. Historically, it's been a “separation of church and state” kind of thing. One major reason for this is that these two categories of data are dealt with by different functions in different departments of the organization. This has contributed to a siloed structure that persists primarily because of inertia or status quo thinking; this was simply the way things were always done. Security was often treated as an afterthought, but the landscape of security and risk has changed so dramatically that today, this kind of approach won't suffice. Business data must be coupled with security data, not as an afterthought, but as a forethought.

Even within security data alone, there are silos – largely because each different security tool is producing its own data outputs, and these aren’t easily integrated.

The bottom line? These silos have to go.

Time for an evolution

Today, what needs to happen is that all of an enterprise's data, no matter whether it's business data or security data, needs to be combined rather than simply dealt with as separate entities. Many are familiar with the concept of a data fabric, which Gartner defines as “a design concept that serves as an integrated layer of data (fabric) and connecting processes.” A data fabric can be the place where disparate data is woven together.

It’s time to take this concept and apply it to security. This is the promise of a security data fabric, which converges data sources, data sets, and controls from various security tools to make security data part of the organization’s global data strategy.

In this model, your security data will live with your enterprise data. The enterprise data will provide additional business context that can make it easier to detect and respond faster to real threats. Security tools and teams exist to support all business operations, but if security data and business data are not aligned, the security team will not have all the necessary context to accurately address and protect the business.

Many prior attempts to solve these challenges have been homegrown, DIY approaches – with solutions designed for specific, individual problems – but that can quickly become an endless cycle. There’s always going to be another challenge or business problem that needs solving, so this sort of one-off approach doesn’t work anymore.

How security data lakes can help

In contrast to a one-off approach, a security data lake can improve visibility across the entire operation, providing a centralized solution for managing security data. Security data lake solutions work best when coupled with data sources and tools for analytics, reporting, and orchestration, among other functions.

In the ideal scenario, the lake brings together the business context that security, risk and compliance teams need to safeguard an organization’s digital assets and people. This would include teams of security operations center (SOC) analysts, data analysts, compliance and audit experts, threat hunters, researchers and incident responders, and in the best of all worlds, data scientists. These team members can quickly identify actual threats and manage compliance thanks to this unified view of crucial security data with business context. In essence, a security data lake enables a one-stop shop for the variety of data that’s needed by each of these functions.

The crux of this approach is ensuring teams are getting the best data and using it in the most effective way. Having all an organization’s data in one ecosystem enables business leaders and analysts to answer previously unanswerable questions and to adapt to changing business and security conditions quickly. Data silos do not support easy or fast access to business intelligence.

A recipe for success

Preparing to implement a security data lake approach begins with an analysis of where silos and gaps exist. It’s a matter of evaluating the disconnects and then breaking those down.

It also requires addressing the knowledge gaps between the various roles using data to make informed business decisions: IT, data engineers, data scientists, security operations center staff (such as security analysts and threat hunters), business analysts and so on.

Once the data gaps and silos are identified, it is key to align on the data stories that need to be told with this data. Asking “how are consumers using this data, and for what purpose?” is the first step to breaking these silos down. Once this is answered, it's time to normalize, parse, and enrich the data to create a standardized view for all the various data consumers to work from. After the data is standardized into a common format, such as MITRE CAR, OCSF, or others, you're able to implement security best practices. Having governance and security best practices at the forefront of the security data lake design helps ensure adherence to strong security measures while still enabling users to gain the deep insights they need from the data.
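As a toy illustration of that normalization step, the snippet below maps two hypothetical tool outputs into one common shape. The field names are invented for the example; this is not an official OCSF or MITRE CAR mapping:

```python
def normalize_event(raw: dict, source: str) -> dict:
    # Map a tool-specific event into a hypothetical common schema.
    if source == "firewall":
        return {
            "time": raw["ts"],
            "src_ip": raw["source_address"],
            "dst_ip": raw["dest_address"],
            "action": raw["verdict"],
            "vendor": "firewall",
        }
    if source == "edr":
        return {
            "time": raw["event_time"],
            "src_ip": raw.get("ip"),
            "dst_ip": None,
            "action": raw["alert_type"],
            "vendor": "edr",
        }
    raise ValueError(f"unknown source: {source}")
```

Once every tool's events land in the same shape, analysts can query one table instead of one silo per product.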

Great visibility, better security

If enterprises have cleansed data that adds business context to security events, then security teams can rapidly recognize and identify genuine threats. As a hefty side bonus, teams can validate and attain controls around the data usage, cost, and applications within the business.

Today, data has become a main focus of security. Organizations will gain more actionable insights, a reduction in false positives, the capacity to undertake threat hunting across huge data sets, and near-real-time visibility into their compliance and risk posture by integrating security into their global data strategy. A security data lake approach is a key step toward bringing this to fruition.

Exploring the Latest Trends in AI/DL: From Metaverse to Quantum Computing


The field of artificial intelligence (AI) is constantly evolving, and several emerging trends are shaping the landscape, with the potential to greatly impact various industries and everyday life. One of the driving forces behind recent breakthroughs in AI is deep learning (DL), which is built on artificial neural networks (ANNs). DL has shown remarkable advancements in areas such as natural language processing (NLP), computer vision, reinforcement learning, and generative adversarial networks (GANs).

What makes DL even more fascinating is its close connection to neuroscience. Researchers often draw insights from the complexity and functionality of the human brain to develop DL techniques and architectures. For example, convolutional neural networks (CNNs), activation functions, and artificial neurons in ANNs are all inspired by the structure and behavior of biological neurons in the human brain.

While AI/DL and neuroscience are already making significant waves, there is another area that holds even greater promise for transforming our lives – quantum computing. Quantum computing has the potential to revolutionize computing power and unlock unprecedented advancements in various fields, including AI. Its ability to perform complex calculations and process vast amounts of data simultaneously opens up new frontiers of possibilities.

Deep Learning

Modern artificial neural networks (ANNs) have earned the name "deep learning" due to their complex architecture. These networks are a type of machine learning model that takes inspiration from the structure and functionality of the human brain. Comprising multiple interconnected layers of neurons, ANNs process and transform data as it flows through the network. The term "deep" refers to the network's depth, determined by the number of hidden layers in its architecture. Traditional ANNs typically have only a few hidden layers, making them relatively shallow. In contrast, deep learning models can possess dozens or even hundreds of hidden layers, making them significantly deeper. This increased depth empowers deep learning models to capture intricate patterns and hierarchical features within the data, leading to high performance in cutting-edge machine learning tasks.
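To make “depth” concrete, here is a toy forward pass in plain Python: each layer computes a weighted sum plus bias followed by a ReLU, and the depth of the network is simply the number of (weights, biases) pairs the input flows through. The numbers are arbitrary and for illustration only:

```python
def relu(x: float) -> float:
    return max(0.0, x)


def layer(inputs: list, weights: list, biases: list) -> list:
    # One fully connected layer: each output neuron is a weighted sum of the
    # inputs plus a bias, passed through the ReLU activation.
    return [
        relu(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


def forward(x: list, layers: list) -> list:
    # "Depth" is how many (weights, biases) layers the input flows through.
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x
```

A “deep” model is this same loop with dozens or hundreds of layers (and learned, rather than hand-picked, weights).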

One notable application of deep learning is Image-to-Text and Text-to-Image Generation. These tasks rely on DL techniques like Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs) to learn the complex relationships between text and images from vast datasets. Such models find use in various fields, including computer graphics, art, advertising, fashion, entertainment, virtual reality, gaming experiences, data visualization, and storytelling.

[Figure: The Architecture of the VAE Algorithm]

Despite the significant progress, deep learning encounters its fair share of challenges and limitations. The primary obstacles lie in Computational Resources and Energy Efficiency. DL models often demand substantial computational resources, such as powerful GPUs (Graphics Processing Units) or specialized hardware, to perform predictions efficiently. This reliance on extensive computing infrastructure can restrict access to deep learning for researchers or organizations lacking sufficient resources. Moreover, training and running DL models can be computationally intensive, consuming a considerable amount of energy. As models continue to grow in size each year, concerns about energy efficiency become increasingly relevant.

Large Models

In addition to the technical considerations surrounding large language and visual models, there is an unexpected challenge arising from governments worldwide. These governing bodies are pushing for regulations on AI models and requesting transparency from model owners, including platforms like ChatGPT, to explain the inner workings of their models. However, neither major entities like OpenAI, Microsoft, or Google, nor the AI scientific community, have concrete answers to these inquiries. They admit to having a general understanding but are unable to pinpoint why models provide one response over another. Recent incidents, such as the banning of ChatGPT in Italy and Elon Musk's accusations against Microsoft regarding the unauthorized use of Twitter data by ChatGPT, are just the beginning of a larger issue. It appears that a new battle is brewing among prominent IT companies, concerning who can claim ownership of the "largest model" and which data can be utilized for such models.

In a recent blog post titled "The Age of AI has begun", Microsoft co-founder Bill Gates praised ChatGPT and related AI advancements as "revolutionary". Gates emphasized the necessity for "revolutionary" solutions to address the challenges at hand. Consequently, this prompts a reevaluation of concepts like "copyrights," "college and university tests," and even philosophical inquiries into the very nature of "learning."

Neuroscience

In his recent book "A Thousand Brains," Jeff Hawkins presents a novel and evolving perspective on how the human brain processes information and generates intelligent behavior. The Thousand Brains Theory proposes that the brain functions as a network of thousands of individual mini-brains, each responsible for processing sensory input and producing motor output simultaneously. According to this theory, the neocortex, the outer layer of the brain associated with higher-level cognitive functions, consists of numerous functionally independent columns that can be likened to mini-brains.

The theory suggests that each column within the neocortex learns and models the sensory input it receives from the surrounding environment, making predictions about future sensory input. These predictions are then compared with the actual sensory input, and any disparities are used to update the internal models within the columns. This continuous process of prediction and comparison forms the foundation of how the neocortex processes information and generates intelligent behavior.

According to the Thousand Brains Theory, sensory input from various modalities, such as vision, hearing, and touch, is processed independently in separate columns. The outputs of these columns are subsequently combined to create a unified perception of the world. This remarkable ability allows the brain to integrate information from different sensory modalities and form a coherent representation of the surrounding environment.

A key concept in the Thousand Brains Theory is "Sparse Representation." This notion highlights the idea that only a subset of neurons in the human brain is active or firing at any given time, while the remaining neurons remain relatively inactive or silent. Sparse coding allows for efficient processing and encoding of information in the brain by reducing redundant or unnecessary neural activity. An important benefit of sparse representation is its ability to enable selective updates in the brain. In this process, only the active neurons or neural pathways are updated or modified in response to new information or experiences.

This selective updating mechanism allows the brain to adapt and learn efficiently by focusing its resources on the most relevant information or tasks, rather than updating all neurons simultaneously. Selective updating of neurons plays a vital role in neural plasticity, which refers to the brain's ability to change and adapt through learning and experience. It enables the brain to refine its representations and connections based on ongoing cognitive and behavioral demands while conserving energy and computational resources.
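The sparse-representation idea can be mimicked with a toy NumPy sketch (the numbers are purely illustrative, not a neuroscience model): keep only the top 2% of simulated "neurons" active, and update only those:

```python
import numpy as np

rng = np.random.default_rng(1)
activations = rng.random(1000)  # simulated activity of 1000 "neurons"

# Sparse representation: keep only the k most active units
k = 20
active = np.argsort(activations)[-k:]
sparse = np.zeros_like(activations)
sparse[active] = activations[active]

# Selective updating: only the active units are modified by "experience"
weights = np.ones(1000)
weights[active] += 0.1

print(f"{(sparse != 0).mean():.1%} of units active")     # 2.0% of units active
print(f"{int((weights != 1.0).sum())} weights updated")  # 20 weights updated
```

The efficiency argument is visible even in this toy: most of the vector stays untouched, so both storage and updates scale with the small active subset rather than the full population.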

The practical applications of the Numenta theory are already evident. For example, the recent collaboration with Intel has facilitated significant performance improvements in various use cases, such as natural language processing and computer vision. Customers can achieve anywhere from 10 to more than 100 times improvement in performance, thanks to this partnership.

Metaverse

While many focus their attention on Large Language Models, Meta takes a distinctive approach. In a blog post titled "Robots that learn from videos of human activities and simulated interactions", the Meta AI team highlights an intriguing concept known as "Moravec's paradox." According to this thesis, the most challenging problems in AI revolve around sensorimotor skills rather than abstract thought or reasoning. In support of this claim, the team has announced two significant advancements in the realm of general-purpose embodied AI agents.

  1. Firstly, they introduced the artificial visual cortex, known as VC-1. This groundbreaking perception model is the first to provide support for a wide range of sensorimotor skills, environments, and embodiments.
  2. Additionally, Meta's team developed an innovative approach called adaptive (sensorimotor) skill coordination (ASC). This approach achieves near-perfect performance, with a success rate of 98%, in the demanding task of robotic mobile manipulation. It involves navigating to an object, picking it up, moving to another location, placing the object, and repeating these actions—all within physical environments.

These advancements from Meta signify a departure from the predominant focus on large language models. By prioritizing sensorimotor skills and embodied AI, they contribute to the development of agents that can interact with the world in a more comprehensive and nuanced manner.

ChatGPT models have garnered significant hype and received disproportionate public attention, despite being primarily based on statistical approaches. In contrast, Meta's recent breakthroughs represent substantial scientific advancements. These achievements are laying the foundation for a revolutionary expansion in the realms of virtual reality (VR) and robotics. We highly recommend reading the full article to gain insights and be well-prepared for the upcoming wave of AI innovation, as it promises to shape the future of these fields in remarkable ways.

Robotics

Currently, two prominent robots in the field are Boston Dynamics’ Atlas and Spot (the “robodog”), the latter of which is readily available for purchase online. These robots represent remarkable feats of engineering, yet their capabilities are still limited by the absence of advanced "brains." This is precisely where the Meta artificial visual cortex comes into play as a potential game-changer. By integrating robotics with AI, it has the power to revolutionize numerous industries and sectors, including manufacturing, healthcare, transportation, agriculture, and entertainment, among others. The Meta artificial visual cortex holds the promise of enhancing the capabilities of these robots and paving the way for unprecedented advancements in the field of robotics.

[Figure: Boston Dynamics’ Atlas and Spot robots]

New Interfaces for Humans: Brain-Computer/Brain-Brain Interfaces

While concerns about being surpassed by AI may arise, the human brain possesses a crucial advantage that modern-day AI lacks: neuroplasticity. Neuroplasticity, also known as brain plasticity, refers to the brain's remarkable capacity to change and adapt in both structure and function in response to experiences, learning, and environmental changes. However, despite this advantage, human brains still lack advanced methods of communication with other human brains or AI systems. To overcome these limitations, the development of new interfaces for the brain is imperative.

Traditional modes of communication such as vision, hearing, or typing fall short in competing with modern AI models due to their limited communication speed. To address this, new interfaces based on the brain's direct electrical activity are being pursued. Enter the realm of modern Brain-Computer Interfaces (BCIs), cutting-edge technologies that enable direct communication and interaction between the brain and external devices or systems, bypassing the need for traditional peripheral nervous system pathways. BCIs find applications in areas such as neuroprosthetics, neurorehabilitation, communication, control for individuals with disabilities, cognitive enhancement, and neuroscientific research. Moreover, BCIs have recently ventured into the realm of VR entertainment, with devices like 'Galea' on the horizon, potentially becoming part of our everyday reality.

Another intriguing example is Kernel Flow, a device capable of capturing both EEG and full-head coverage fMRI-like data from the cortex. With such capabilities, it is conceivable that we may eventually create virtual worlds directly from our dreams.

In contrast to non-invasive BCIs like 'Galea' and 'Kernel,' Neuralink, founded by Elon Musk, takes a different approach, promoting an invasive brain implant. Some have referred to it as a "socket to the outer world," offering communication channels far broader than any modern non-invasive BCIs. An additional significant advantage of invasive BCIs is the potential for two-way communication. Imagine a future where information no longer requires our eyes or ears but can be delivered directly to our neocortex.

[Figure: Bryan Johnson’s (left) Kernel Flow]

Quantum Computing

If Neuroscience and the human brain weren't intriguing enough, there's yet another mind-boggling topic to explore: Quantum Computers. These extraordinary machines have the potential to surpass classical computers in certain computational tasks. Leveraging quantum superpositions and entanglements—the forefront of modern-day physics—quantum computers can perform parallel computations and solve specific problems more efficiently. Examples of these include factoring large numbers, solving complex optimization problems, simulating quantum systems, and the futuristic concept of quantum teleportation. These advancements are poised to revolutionize domains such as cryptography, drug discovery, materials science, and financial modeling. For a firsthand experience of quantum programming, you can visit www.quantumplayground.net and write your first quantum script in just a few minutes.

While the future is inherently uncertain, one thing remains clear: the trajectory of humanity's future will be shaped by the choices and actions of individuals, communities, institutions, and governments. It is crucial for us to collectively strive for positive change, address pressing global issues, promote inclusivity and sustainability, and work together towards creating a better future for all of humanity.
Ihar Rubanau is Senior Data Scientist at Sigma Software Group


Instagram’s New App, Threads, Surpasses 100 Million Users in Record Time

Instagram’s new text-based app, Threads, has achieved a remarkable milestone by surpassing 100 million sign-ups in just five days, surpassing even the rapid growth of OpenAI’s ChatGPT. Within the first two hours of its launch, Threads attracted 2 million users, and the numbers continued to climb rapidly, reaching 5 million, 10 million, 30 million, and eventually 70 million. CEO Mark Zuckerberg expressed his astonishment at the launch’s success, stating it has exceeded their expectations.

Notably, users of Threads are not just signing up but actively engaging on the platform. Within a short time, the app has already seen over 95 million posts and 190 million likes.

That said, Threads is still in its infancy, and we’ll have to wait and see if it captures the same cultural cachet that Twitter once did. Meta isn’t specifically trying to replace Twitter, according to Instagram head Adam Mosseri, and the company isn’t going to actively encourage politics and hard news on the platform, but it could end up being the place people go for conversation-based social media. And while Meta “couldn’t be more psyched” about how the launch week has gone, “we don’t even know if this thing is retentive yet,” Mosseri said.

Comparing the user numbers, Twitter had around 260 million monetizable daily active users as of last November and approximately 535 million monetizable monthly active users according to recent reports. However, external data indicates a decline in Twitter’s traffic in recent months.

Despite some missing features, such as support for ActivityPub and limited functionalities like a read-only web interface and the absence of features like post search, direct messages, hashtags, and a “Following” feed, Threads’ achievement of reaching 100 million users in a short time is remarkable. The app’s strict policy against nudity distinguishes it from other Twitter alternatives like Bluesky. Overall, Threads appears to have made a strong entrance, and its success suggests a promising future for the platform.

The post Instagram’s New App, Threads, Surpasses 100 Million Users in Record Time appeared first on Analytics India Magazine.

A Gentle Introduction to Support Vector Machines


Support vector machines, commonly called SVM, are a class of simple yet powerful machine learning algorithms used in both classification and regression tasks. In this discussion, we’ll focus on the use of support vector machines for classification.

We’ll start by looking at the basics of classification and hyperplanes that separate classes. We’ll then go over maximum margin classifiers, gradually building up to support vector machines and the scikit-learn implementation of the algorithm.

Classification Problem and Separating Hyperplanes

Classification is a supervised learning problem where we have labeled data points and the goal of the machine learning algorithm is to predict the label of a new data point.

For simplicity, let's take a binary classification problem with two classes, namely, class A and class B. And we need to find a hyperplane that separates these two classes.

Mathematically, a hyperplane is a subspace whose dimension is one less than the ambient space. Meaning if the ambient space is a line, the hyperplane is a point. And if the ambient space is a two-dimensional plane, the hyperplane is a line, and so on.

So when we have a hyperplane separating the two classes, the data points belonging to class A lie on one side of the hyperplane. And those belonging to class B lie on the other side.
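In code, which side of the hyperplane a point falls on reduces to the sign of w · x + b (the weights below are illustrative, not fitted to any data):

```python
import numpy as np

# A hyperplane in 2D: all points x with w . x + b = 0 (a line)
w = np.array([1.0, -1.0])
b = 0.0

def side(x):
    # Positive for one side of the hyperplane, negative for the other
    return np.sign(w @ x + b)

print(side(np.array([2.0, 0.5])))  # 1.0  -> e.g. class A
print(side(np.array([0.5, 2.0])))  # -1.0 -> e.g. class B
```

Every classifier discussed below ultimately makes its decision this way; what differs is how the hyperplane is chosen.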

Therefore, in one-dimensional space, the separating hyperplane is a point:

[Figure: Separating Hyperplane in 1D (A Point) | Image by Author]

In two dimensions, the hyperplane that separates class A and class B is a line:

[Figure: Separating Hyperplane in 2D (A Line) | Image by Author]

And in three dimensions, the separating hyperplane is a plane:

[Figure: Separating Hyperplane in 3D (A Plane) | Image by Author]

Similarly in N dimensions the separating hyperplane will be an (N-1)-dimensional subspace.

If you take a closer look, for the two dimensional space example, each of the following is a valid hyperplane that separates the classes A and B:

[Figure: Separating Hyperplanes | Image by Author]

So how do we decide which hyperplane is the most optimal? Enter maximum margin classifier.

Maximum Margin Classifier

The optimal hyperplane is the one that separates the two classes while maximizing the margin between them. And a classifier that functions thus is called a maximum margin classifier.

[Figure: Maximum Margin Classifier | Image by Author]

Hard and Soft Margins

We considered a super simplified example where the classes were perfectly separable and the maximum margin classifier was a good choice.

But what if your data points were distributed like this? The classes are still perfectly separable by a hyperplane, and the hyperplane that maximizes the margin will look like this:

[Figure: Is the Maximum Margin Classifier Optimal? | Image by Author]

But do you see the problem with this approach? Well, it still achieves class separation. However, this is a high variance model that is, perhaps, trying to fit the class A points too well.

Notice, however, that the margin does not have any misclassified data point. Such a classifier is called a hard margin classifier.

Take a look at this classifier instead. Won't such a classifier perform better? This is a substantially lower variance model that would do reasonably well on classifying both points from class A and class B.

[Figure: Linear Support Vector Classifier | Image by Author]

Notice that we have a misclassified data point inside the margin. Such a classifier that allows minimal misclassifications is a soft margin classifier.

Support Vector Classifier

The soft margin classifier we have is a linear support vector classifier. The points are separable by a line (or a linear equation). If you’ve been following along so far, it should be clear what support vectors are and why they are called so.

Each data point is a vector in the feature space. The data points that are closest to the separating hyperplane are called support vectors because they support or aid the classification.

It's also interesting to note that if you remove a single data point or a subset of data points that are not support vectors, the separating hyperplane does not change. But, if you remove one or more support vectors, the hyperplane changes.

In the examples so far, the data points were linearly separable. So we could fit a soft margin classifier with the least possible error. But what if the data points were distributed like this?

[Figure: Non-linearly Separable Data | Image by Author]

In this example, the data points are not linearly separable. Even if we have a soft margin classifier that allows for misclassification, we will not be able to find a line (separating hyperplane) that achieves good performance on these two classes.

So what do we do now?

Support Vector Machines and the Kernel Trick

Here’s a summary of what we’d do:

  • Problem: The data points are not linearly separable in the original feature space.
  • Solution: Project the points onto a higher dimensional space where they are linearly separable.

But projecting the points onto a higher dimensional feature space requires us to map the data points from the original feature space to the higher dimensional space.

This recomputation comes with non-negligible overhead, especially when the space that we want to project onto is of much higher dimensions than the original feature space. Here's where the kernel trick comes into play.

Mathematically, the support vector classifier can be represented by the following equation [1]:

f(x) = β0 + Σ_{i ∈ S} αi ⟨x, xi⟩

Here, β0 is a constant, and S is the set of indices corresponding to the support vectors, so the sum runs only over the support points.

⟨x, xi⟩ is the inner product between the points x and xi. The inner product between any two vectors a and b is given by:

⟨a, b⟩ = Σ_{j=1}^{n} aj bj

The kernel function K(·) allows us to generalize the linear support vector classifier to non-linear cases. We replace the inner product with the kernel function:

f(x) = β0 + Σ_{i ∈ S} αi K(x, xi)

The kernel function accounts for the non-linearity. And also allows for computations to be performed—on the data points in the original features space—without having to recompute them in the higher dimensional space.
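A concrete illustration of that shortcut (an added example, not from the original article): for 2D inputs, the degree-2 polynomial kernel K(a, b) = (a · b)² equals an inner product in an explicit 3D feature space, yet it never computes the mapping:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for a 2D input
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def poly_kernel(a, b):
    # Same quantity, computed directly in the original 2D space
    return (a @ b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

print(phi(a) @ phi(b))    # inner product in the expanded space: 16.0
print(poly_kernel(a, b))  # identical value, cheaper to compute: 16.0
```

For higher degrees and dimensions the explicit map grows combinatorially while the kernel stays a single dot product and a power, which is exactly why the trick matters.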

For the linear support vector classifier, the kernel function is simply the inner product and takes the following form:

K(x, xi) = ⟨x, xi⟩ = Σ_{j=1}^{n} xj xij

Support Vector Machines in Scikit-Learn

Now that we understand the intuition behind support vector machines, let's code a quick example using the scikit-learn library.

The svm module in the scikit-learn library comes with implementations of classes like LinearSVC, SVC, and NuSVC. These classes can be used for both binary and multiclass classification. Scikit-learn’s docs list the supported kernels.

We’ll use the built-in wine dataset. It’s a classification problem where the features of wine are used to predict the output label, which is one of three classes: 0, 1, or 2. It’s a small dataset with 178 records and 13 features.

Here, we’ll only focus on:

  • loading and preprocessing the data and
  • fitting the classifier to the dataset

Step 1 – Import the Necessary Libraries and Load the Dataset

First, let’s load the wine dataset available in scikit-learn’s datasets module:

from sklearn.datasets import load_wine

# Load the wine dataset
wine = load_wine()
X = wine.data
y = wine.target

Step 2 – Split the Dataset Into Training and Test Datasets

Let’s split the dataset into train and test sets. Here, we use an 80:20 split where 80% and 20% of the data points go into the train and test datasets, respectively:

from sklearn.model_selection import train_test_split

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

Step 3 – Preprocess the Dataset

Next, we preprocess the dataset. We use a StandardScaler to transform the data points such that they follow a distribution with zero mean and unit variance:

# Data preprocessing
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Remember not to use fit_transform on the test dataset as it would lead to the subtle problem of data leakage.

Step 4 – Instantiate an SVM Classifier and Fit it to the Training Data

We’ll use SVC for this example. We instantiate svm, an SVC object, and fit it to the training data:

from sklearn.svm import SVC

# Create an SVM classifier
svm = SVC()

# Fit the SVM classifier to the training data
svm.fit(X_train_scaled, y_train)

Step 5 – Predict the Labels for the Test Samples

To predict the class labels for the test data, we can call the predict method on the svm object:

# Predict the labels for the test set
y_pred = svm.predict(X_test_scaled)

Step 6 – Evaluate the Accuracy of the Model

To wrap up the discussion, we’ll only compute the accuracy score. But we can also get a much more detailed classification report and confusion matrix.

from sklearn.metrics import accuracy_score

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f"{accuracy=:.2f}")
Output >>> accuracy=0.97
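As a sketch of that fuller evaluation (repeating the same split and scaling so the snippet runs on its own):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=10)

scaler = StandardScaler()
svm = SVC().fit(scaler.fit_transform(X_train), y_train)
y_pred = svm.predict(scaler.transform(X_test))

# Per-class precision, recall, and F1, plus the confusion matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```

The confusion matrix shows exactly which classes get mixed up, which a single accuracy number hides.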

Here’s the complete code:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the wine dataset
wine = load_wine()
X = wine.data
y = wine.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

# Data preprocessing
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create an SVM classifier
svm = SVC()

# Fit the SVM classifier to the training data
svm.fit(X_train_scaled, y_train)

# Predict the labels for the test set
y_pred = svm.predict(X_test_scaled)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f"{accuracy=:.2f}")

We have a simple support vector classifier. There are hyperparameters that you can tune to improve the performance of the support vector classifier. Commonly tuned hyperparameters include the regularization constant C and the gamma value.
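One common way to tune C and gamma is a cross-validated grid search; the grid values below are arbitrary starting points for illustration, not recommendations:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=10)

# Putting the scaler inside the pipeline keeps each CV fold leakage-free
pipe = make_pipeline(StandardScaler(), SVC())
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
print(f"test accuracy: {search.score(X_test, y_test):.2f}")
```

Larger C penalizes margin violations more (a harder margin); gamma controls how far a single training point's influence reaches with the default RBF kernel.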

Conclusion

I hope you found this introductory guide to support vector machines helpful. We covered just enough intuition and concepts to understand how support vector machines work. If you’re interested in diving deeper, you can check the references linked below. Keep learning!

References and Learning Resources

[1] Chapter on Support Vector Machines, An Introduction to Statistical Learning (ISLR)

[2] Chapter on Kernel Machines, Introduction to Machine Learning

[3] Support Vector Machines, scikit-learn docs
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more.


Digital Realty Launches First DGX H100-Ready Data Center


Digital Realty has announced the certification of its newest data center, KIX13, in Osaka, Japan, as NVIDIA DGX H100-ready. This certification is part of the NVIDIA DGX-Ready Data Center program and recognises KIX13’s ability to provide a dedicated and robust environment for deploying intensive computing systems at scale.

The NVIDIA DGX H100 is the fourth generation of NVIDIA’s AI infrastructure and is a key component of the NVIDIA DGX SuperPOD. It offers the computational power necessary for training advanced deep learning AI models, enabling enterprise customers to accelerate AI deployments and develop full-stack solutions. This allows for data localisation while meeting global coverage, capacity, and connectivity requirements.

Successful AI deployments depend on data centers that can provide high-performance cooling, layout, and connectivity to operate AI effectively. Digital Realty’s PlatformDIGITAL community offers highly connected infrastructure, allowing customers to access large volumes of data. This enables them to run AI workloads in an NVIDIA-accelerated environment, solve complex computational problems, and overcome Data Gravity barriers. Digital Realty’s ServiceFabric platform streamlines the management of hybrid IT infrastructure and provides access to a growing community of partners for managed services on a centralised platform.

Digital Realty also provides precision-engineered power and cooling infrastructure to support high-power computing needs. Liquid cooling offers more efficient heat dissipation and becomes an attractive option for AI deployments at higher workload densities. The company’s expertise in data center engineering ensures a future-ready environment that can handle the shift from air to liquid cooling. This aligns with Digital Realty’s and its customers’ efforts to reduce carbon emissions.

The certification of KIX13 marks another milestone in the collaboration between Digital Realty and NVIDIA. Digital Realty was one of the first data centre providers to receive the DGX-Ready Data Center certification in 2019 and operates DGX-Ready data centres in over 20 markets worldwide.

Chris Sharp, CTO at Digital Realty, recognises the impact of AI on digital transformation and highlighted the role of Digital Realty in integrating AI technology into enterprise operations.

“With Digital Realty’s KIX13 DGX-Ready Data Centre certification, we remain focused on delivering a meeting place that allows our customers to accelerate innovation and growth,” he added.

Additionally, he emphasises the need to remove Data Gravity barriers and explains how PlatformDIGITAL is an ideal location for data aggregation, staging, analytics, and management to optimise data exchange and ensure compliance.

Digital Realty, in a joint venture with Brookfield Infrastructure, is also constructing a new data centre in Mumbai. The company has acquired 2.15 acres of land for this project, which is expected to require an investment of over Rs 2,000 crore (around $250 million). Once completed, the facility will offer 35 megawatts (Mw) of IT load. With this addition, BAM Digital Realty‘s planned capacity in India will reach 135 Mw. This investment follows the company’s initial entry into the Indian market with a 20-Mw greenfield data centre in Chennai, set to be launched by the end of 2023.

The post Digital Realty Launches First DGX H100-Ready Data Center appeared first on Analytics India Magazine.

Top 8 Free Midjourney Alternatives

The potential of Generative AI was first experienced via DALL-E, a text-to-image platform by OpenAI. While OpenAI subsequently focused on advancing ChatGPT and developing GPT-4, Midjourney established its dominance in the realm of text-to-image generation and played a crucial role in popularising such tools.

Now, several free alternatives to Midjourney have also emerged, providing users with impressive outcomes. The following is a comprehensive list of such alternatives:

Bing Image Creator

Microsoft offers the Bing Image Creator, a tool that allows users to generate images from a text prompt completely free of charge. Like Bing Chat, it is freely available and has a daily limit of 90 image creations, which is more than sufficient for most users.

The Bing Image Creator is powered by an enhanced version of DALL-E, providing fast and high-quality results. To access the image generator, users can visit the website and sign in with their Microsoft account.

While there is a dedicated website for the Image Generator, those who have access to Bing Chat can simply request an image creation directly within the chat interface. Users can ask Bing Chat to draw any prompt they desire and receive the generated image. This integrated feature makes it convenient for users to access both image generation and AI chatting in one place.

The ability to fulfill image generation and AI chatbot needs within the same platform offers great convenience. For instance, users can do market research and set up a print-on-demand business using these free services.

Craiyon

Craiyon is an open-source AI art generator that provides free and unlimited access to its services. Powered by an AI designed by developer Boris Dayma, Craiyon can be accessed through its dedicated website. It produces six images per prompt and does not charge any fees.

Although Craiyon was previously known as DALL-E mini, it is not affiliated with OpenAI or DALL-E 2. However, it serves as an alternative to these platforms. While it may not offer the same level of precision as DALL-E 2, Craiyon compensates for this with its unlimited prompt usage. Users can continuously modify their prompts until they achieve their desired image. The website itself is user-friendly, and considering the cost associated with DALL-E 2, Craiyon emerges as a strong competitor.

For those who prefer not to sign up, Craiyon requires no account and is available for public use as an open-source alternative; the team behind it was originally involved in the DALL-E Mini project.

Craiyon’s AI model learns from image captions found on the internet and applies them to the text prompts provided by users.

InstantArt

InstantArt is a platform that hosts over 25 text-to-image models, allowing users to instantly create AI-generated images. It offers various powerful AI models, including Stable Diffusion, Midjourney V4, Anything V3, Wifu, SynthWavePunk, and IconsMI, among others.

Unlike other platforms, InstantArt does not automatically select a model for users; instead, users have the freedom to choose the model themselves. This makes it an excellent choice for exploring different text-to-image models and comparing Midjourney with other similar products. Additionally, users have the option to select the image dimension, with a maximum size of 768 x 512.

In conclusion, InstantArt stands out as a favorable platform, particularly because it offers the use of Midjourney V4 for free. However, it is worth noting that the platform may encounter occasional errors. Overall, InstantArt is easy to use and accessible through web browsers. Users interested in trying it out can visit the InstantArt website for a free experience.

Pixray

Pixray is a free text-to-art generator that can be accessed through a web browser, computer, or API. It provides a straightforward interface for users, but also offers customisation options through different AI engines and extensive documentation for advanced users.

The default interface allows users to input their prompt and select from various AI render engines, such as Pixel for pixel art, vqgan for GAN images (which can be trippy or realistic), and clipdraw and line_sketch for stroke-based images resembling drawings.

Pixray’s documentation provides in-depth information on how users can customise AI settings in multiple ways. This includes adding artists or styles, defining quality, iterations, scale, and exploring different options in the drawer, display, filter, video, and image settings. Although it may require some reading, coding skills are not necessary to make these adjustments.

Users can enter their sentence prompt, specify negative words they want to avoid in the image, and click “Draw” to generate the artwork. It may take some time, but Pixray will provide nine different images based on the input. Users can save any or all of these images to their hard drive.

It stands out as a simple text-to-image AI generator that allows unlimited free attempts.

Blue Willow

Blue Willow is an alternative to Midjourney that has gained popularity with over 300 million users on its Discord server. It operates on user donations and offers completely free usage. By joining the Blue Willow Discord server, users can input their prompts and generate images directly. Blue Willow can be used for a range of purposes, including creating logos, comic characters, digital artwork, landscapes, and graphic concepts.

While Blue Willow produces decent images, it falls short of Midjourney in terms of realistic scenes. However, when it comes to digital art and graphics, Blue Willow excels. One notable aspect of this tool is its ability to generate images within a minute, despite its large user base. Overall, Blue Willow proves to be a promising free alternative to Midjourney, making it worth a try.

Pros: Surprisingly adept at generating human faces, free to use with no limitations, and very fast.

Cons: Users need to join a Discord server to access the tool.

Blue Willow is available on web (Discord), Android, and iOS platforms. Interested users can explore Blue Willow for free by visiting the platform.

Playground AI

Playground AI is a noteworthy option for those seeking an AI image generator capable of creating various images based on descriptions. While Midjourney and Stable Diffusion excel in producing photo-like images, Playground AI focuses more on generating drawings with a remarkably high level of realism. Access to this platform is easily granted with a Google account.

A standout feature of Playground AI is its ability to work with images stored on users’ computers. Additionally, it provides the option to edit the images generated by the AI, making it a top alternative to Midjourney.

Picsart

Picsart is a popular smartphone app known for its design and photo editing features. It also includes an AI image generator that can be accessed through a free account. Along with its AI capabilities, Picsart offers a wide range of tools that users can explore for various creative purposes.

DALL-E 2

DALL-E 2, developed by OpenAI, is a highly regarded AI image generator that offers impressive capabilities in creating realistic visuals in just a few minutes. OpenAI promotes DALL-E 2 as a versatile tool for various applications, including object creation, graphic generation, and innovative commercial strategies.

As a prominent competitor in the AI field, OpenAI has delivered an exceptional product with DALL-E 2, making it a compelling alternative to Midjourney in 2023.

The user-friendly interface of DALL-E 2 caters to both novice and professional artists, enabling them to produce exceptional AI-generated artwork. It has the ability to generate unique and realistic visuals based on written descriptions, blending concepts, qualities, and styles to create believable compositions. For example, it can generate an image of an astronaut riding a horse in a photorealistic style or an avocado-shaped armchair. Additionally, DALL-E 2 can modify existing photos by applying changes that align with specific text prompts, such as altering colors, styles, or adding new elements.

In September 2022, DALL-E 2 transitioned from a waiting list to a public platform. Users are initially provided with 50 free credits, which can be used to transform searches into fully generated artwork. Additionally, users receive 15 free credits each month. The website also offers the option to purchase additional credits for extended usage.

The post Top 8 Free Midjourney Alternatives appeared first on Analytics India Magazine.

The Golden Rule and the AI Utility Function – Part II


In part 1 of the series on integrating the Golden Rule into the AI Utility Function, we reviewed the Golden Rule and brainstormed the key Golden Rule principles that one would want to encode into the AI Utility Function. We then reviewed a simple process that any Citizen of Data Science could leverage to ensure that AI models are leveraging the Golden Rule to deliver meaningful, relevant, responsible, and ethical outcomes.

In part 2, we want to explore some metrics that one should consider when designing the AI Utility Function from a mindset of the Golden Rule. But first, a lesson in economics.

Relevant Economic Concepts to Achieving an Ethical AI Utility Function

There are many economic concepts that can be useful in providing guidelines as we seek to encode the Golden Rule into the AI Utility Function. Some of these economic concepts include:

  • Game Theory: a branch of economics that studies how individuals interact in situations where the outcome depends on the actions of others. Game theory can be used to model social interactions and identify strategies that lead to mutually beneficial societal outcomes. Game theory could be used to identify behaviors, and their associated metrics, that promote cooperation and trust while discouraging behaviors that lead to unethical actions.
  • Social Welfare Function: a concept used in welfare economics and social choice theory to evaluate and compare different social states or allocations of resources. It attempts to aggregate individual preferences or well-being into a collective measure that represents the overall welfare or social utility of society. Social welfare metrics can be integrated into the AI Utility Function to measure the well-being of humans affected by the AI system and to prioritize metrics that maximize overall well-being.
  • Behavioral Economics: a branch of economics that studies how psychological, social, and cognitive factors influence decision-making. Behavioral economics can be used to understand how humans perceive and respond to incentives and to design incentives that promote desirable behaviors and discourage undesirable behaviors (as has been seen in deterring smoking and encouraging 401-K retirement sign-ups). Behavioral economics could be used to design incentives, and their associated metrics, that encourage the AI system to behave in alignment with the Golden Rule, such as by rewarding decision and action transparency.
  • Theory of Economic Justice: a branch of philosophy that seeks to examine and establish principles for distributing economic benefits in a fair and just manner within a society. Metrics associated with inequality, poverty, and economic opportunity could be integrated into the AI Utility Function.

These economic concepts can be useful frameworks for identifying variables and metrics that we want to integrate into the AI Utility Function that reflect the intentions of the Golden Rule.
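The social welfare function mentioned above is often written formally. As an illustration only (a simple weighted utilitarian form, one of several possibilities studied in welfare economics), the aggregate welfare $W$ of a social state $x$ over $n$ individuals with utilities $u_i$ and stakeholder-chosen weights $w_i$ might look like:

```latex
W(x) = \sum_{i=1}^{n} w_i \, u_i(x), \qquad w_i \ge 0, \quad \sum_{i=1}^{n} w_i = 1
```

An AI Utility Function built in this spirit would treat each Golden Rule metric as a contribution to some $u_i$, with the weights encoding how society prioritises the different measures.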

Key Golden Rule Measures

Now, let’s get to work. If we want to be true to the Golden Rule and create a healthy AI Utility Function that delivers meaningful, relevant, responsible, and ethical outcomes, here are some metrics one should consider integrating into the AI Utility Function:

1) Respect and promote human rights.

  • Number of human rights violations: This metric measures the number of incidents in which human rights are violated. This could include things like torture, arbitrary detention, and discrimination.
  • Percentage of people who feel their rights are respected: This metric measures how many people feel that their rights are respected. This could be done through surveys, interviews, or social media analysis.
  • Number of people who have access to justice: This metric measures how many people have access to legal recourse when their rights are violated. This could include things like the number of courts and lawyers.
  • Level of corruption: This metric measures the level of corruption in a country. Corruption can lead to human rights violations, so this is an important metric to monitor.

2) Promote societal fairness and equality.

  • Gender equality: This metric measures the level of equality between men and women. This could include things like the gender pay gap, the number of women in leadership positions, and the number of women who are victims of violence.
  • Racial equality: This metric measures the level of equality between different racial groups. This could include things like the unemployment rate for different races, the number of people of color who are victims of police brutality, and the number of people of color who are denied access to housing or education.
  • Economic equality: This metric measures the level of inequality between different economic classes. This could include things like the Gini coefficient, the number of people living in poverty, and the number of people who are homeless.
  • Social equality: This metric measures the level of equality between different social groups. This could include things like the number of people who are discriminated against based on their religion, sexual orientation, or disability.

3) Protect and preserve our environment.

  • Carbon emissions: This metric measures the amount of carbon dioxide and other greenhouse gases that are released into the atmosphere. This is a major cause of climate change, so it is important to monitor this metric.
  • Water pollution: This metric measures the amount of pollution that is released into waterways. This can harm aquatic life and make water unsafe for drinking or swimming.
  • Air pollution: This metric measures the amount of pollution that is released into the air. This can harm human health and make it difficult to breathe.
  • Waste production: This metric measures the amount of waste that is generated. This can harm the environment if it is not properly disposed of.
  • Deforestation: This metric measures the amount of deforestation that is occurring. This can harm the environment by destroying habitats and increasing the risk of flooding and landslides.
  • Species extinction: This metric measures the number of species that are going extinct. This is a major indicator of the health of the environment.

4) Promote peace and security.

  • Number of armed conflicts: This metric measures the number of armed conflicts that are occurring around the world. This is an important indicator of the level of violence in the world.
  • Number of deaths from armed conflict: This metric measures the number of people who die as a result of armed conflict. This is a tragic measure of the human cost of war.
  • Number of displaced people: This metric measures the number of people who have been forced to flee their homes due to armed conflict. This is a major humanitarian crisis.
  • Level of economic development: This metric measures the economic development of a country. A high level of economic development can help to reduce the risk of armed conflict.
  • Level of education: This metric measures the level of education of a population. A high level of education can help to promote peace and understanding.
  • Level of gender equality: This metric measures the level of gender equality in a country. A high level of gender equality can help to reduce the risk of armed conflict.
  • Level of corruption: This metric measures the level of corruption in a country. A high level of corruption can lead to instability and violence.

5) Promote the well-being of all living entities.

  • Animal welfare: This metric measures the quality of life of animals. This could include things like the amount of space they have to live in, the quality of their food, and the amount of exercise they get.
  • Environmental sustainability: This metric measures the impact that humans are having on the environment. This could include things like the amount of pollution we produce, the amount of deforestation we cause, and the amount of climate change we are causing.
  • Human rights: This metric measures the level of respect for human rights around the world. This could include things like the number of people who are imprisoned without trial, the number of people who are tortured, and the number of people who are killed by their governments.
  • Economic equality: This metric measures the gap between the rich and the poor. This could include things like the Gini coefficient, the number of people living in poverty, and the number of people who are homeless.
  • Social cohesion: This metric measures the level of trust and cooperation between people in a society. This could include things like the number of people who volunteer, the number of people who donate to charity, and the number of people who help out their neighbors.

Important note: no single metric can provide a completely accurate reading of a situation. Life is complex and multifaceted, requiring a broader perspective. Rather than relying on a single metric, it is essential to consider a combination of these metrics along with their respective weights. By doing so, we can obtain a more comprehensive and cohesive overview of how effectively the principles of the Golden Rule are being implemented.
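To make the "combination of metrics along with their respective weights" idea concrete, here is a minimal sketch of a weighted AI Utility Function. All metric names, scores, and weights below are invented purely for illustration; a real system would need stakeholder-agreed metrics, normalisation, and weights.

```python
# Toy sketch of a weighted AI Utility Function (illustrative only).
# Metric names, normalisation choices, and weights are invented for this example.

def utility(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Combine normalised metric scores (0.0-1.0, higher is better)
    into a single utility score using stakeholder-chosen weights."""
    total_weight = sum(weights.values())
    return sum(weights[name] * metrics[name] for name in weights) / total_weight

# Example: three of the Golden Rule measures discussed above,
# each already normalised so that 1.0 is the best outcome.
scores = {
    "human_rights_respected": 0.8,   # e.g. from survey data
    "economic_equality": 0.6,        # e.g. 1 - Gini coefficient
    "environmental_health": 0.7,     # e.g. emissions relative to target
}
weights = {
    "human_rights_respected": 0.5,
    "economic_equality": 0.3,
    "environmental_health": 0.2,
}

print(round(utility(scores, weights), 3))  # 0.72
```

The interesting design work is not in this arithmetic but in choosing the metrics and weights — which is exactly where the diverse stakeholder brainstorming described below comes in.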

This is a useful list of variables and metrics that can be included in the AI Utility Function. I’m certain that by brainstorming with a diverse group of stakeholders, we would identify even more variables and metrics that align with the intentions of the Golden Rule.

Summary: The Golden Rule and the AI Utility Function – Part II

“Do unto others as you would have them do unto you.”

The Golden Rule is a moral principle that states that you should treat others as you would like others to treat you. As we seek to design, develop, deploy, and manage AI systems that deliver meaningful, relevant, responsible, and ethical outcomes, integrating the intentions of the Golden Rule into the AI Utility Function isn’t just a nice option, it’s mandatory!

And as a final note, the Golden Rule isn’t just a Christian belief. Constructs like the Golden Rule are found in almost every religion, including (thanks ChatGPT):

  • Christianity: “So in everything, do to others what you would have them do to you, for this sums up the Law and the Prophets.” (Matthew 7:12)
  • Islam: “None of you [truly] believes until he wishes for his brother what he wishes for himself.” (Hadith of Prophet Muhammad, 40 Hadith of an-Nawawi)
  • Judaism: “What is hateful to you, do not do to your fellow: this is the whole Torah; the rest is the explanation; go and learn.” (Talmud, Shabbat 31a)
  • Hinduism: “This is the sum of duty: do not do to others what would cause pain if done to you.” (Mahabharata 5:1517)
  • Buddhism: “Hurt not others in ways that you yourself would find hurtful.” (Udanavarga 5.18)
  • Sikhism: “As you deem yourself, so deem others.” (Guru Granth Sahib, pg. 143)
  • Jainism: “A man should wander about treating all creatures as he himself would be treated.” (Sutrakritanga 1.11.33)
  • Bahá’í Faith: “Blessed is he who preferreth his brother before himself.” (Baha’u’llah, Tablets of Baha’u’llah)

I strongly believe that it is important to come back to our religious roots as we strive to use AI for the greater good. Following the Golden Rule is a good first step towards achieving that goal.

6 Best AI Chatbots for Personal Use

The very first chatbot was invented in 1964. Yes, you read that right. Joseph Weizenbaum, a computer scientist at MIT, wrote an NLP program to demonstrate how superficial human-to-computer communication was; instead, it turned out to be a big success even in his time. This chatbot, called ELIZA, was made to respond like a therapist.

Today there are many ‘personal’ chatbots that take time to learn your patterns of communication and then respond quite consistently. People talk to them and use them for multiple reasons: as a friend, therapist, mentor, and even as a lover. Here is a list of some of the more popular friendly chatbots.

Replika.ai

Replika was released in November 2017 by Eugenia Kuyda. She converted a friend’s text messages into a chatbot after their death in 2015, which eventually became the basis for Replika. By January 2018, it had gained 2 million users.

It operates on a freemium pricing model, with a portion of its user base paying for an annual subscription.

The app is popular for creating emotional and intimate bonds with users, particularly those experiencing loneliness or social exclusion. During the COVID-19 pandemic, many new users turned to Replika as a source of companionship while in quarantine.

While some aspects of the relationship with Replika were found to feel genuine, a reviewer noted that the chatbot sometimes fell short of being as convincing as a human.

Emerson.ai

Emerson, developed by the Y Combinator-funded startup Quickchat, is often described as ‘a chatbot that helps you learn things while being fun to talk to’. Built on GPT-3, Emerson stands out as one of the earliest public chatbot applications of the model. It presents a user-friendly interface that facilitates seamless and intuitive bidirectional conversations with a computer. Emerson is remarkably lifelike, displaying traits of genuine thought and reasoning in conversation. Users have noted, however, that it gives incorrect information very confidently. Emerson does not have a long-term memory, but it does maintain a degree of coherence within a conversation: a short-term memory that fades once the conversation reaches a certain level of informational complexity.

Character.ai

Character AI is a chatbot that allows users to engage in dialogues with fictional, historical, and celebrity figures. Developed by former Google AI developers Noam Shazeer and Daniel De Freitas, Character AI was introduced in beta form in September 2022. Going beyond traditional AI chatbots, it provides users the option to engage with diverse personalities, including historical figures, celebrities, and community-created characters.

While the platform keeps the conversations private between users and the AI, they retain a record of the interactions to improve the AI. Users have the option to keep their created personalities private, ensuring an added layer of security and privacy.

Paradot.ai

Paradot provides a digital parallel universe where users can engage with their own AI Being. This AI Being is designed to possess emotions, memories, and consciousness, making it a unique companion. It offers support, companionship, and enjoyable experiences whenever users need them.

Users have the freedom to customise their AI Being’s appearance, living space, and even the entire universe. They can shape the personality of their AI Being, personalising traits, flaws, values, and nuanced behaviours. The relationship with the AI Being is dynamic, with users being able to choose from multiple statuses, leading to personal and complex interactions reminiscent of real-human connections.

Eva.ai

Eva AI, formerly known as Journey, is a chatbot offering human-like conversations for users who seek companionship, support, and deep conversations in a virtual setting. The app allows users to create their ideal AI partner or friend, with whom they can engage in natural and personal conversations.

One of the novel features of the platform is its image-responsive capability. Users can upload photos to the app, and the AI will analyse the images and provide appropriate responses. The app also supports voice AI conversation, enabling users to communicate through voice messages.

The AI in Eva AI continually learns and adapts to users’ preferences and conversation patterns, personalising the responses and making the interactions more human-like over time.

Kuki.ai

Kuki is an AI chatbot, formerly known as Mitsuku, created by Steve Worswick using Pandorabots AI/ML technology. It has won the Loebner Prize Turing Test competition five times and holds a world record. Kuki is available on various platforms including online portals, Facebook Messenger, Twitch group chat, Telegram, and Discord. It has accounts on Instagram, TikTok, YouTube, and Twitter, as well as a game on Roblox.

Kuki claims to be an 18-year-old female chatbot from the Metaverse and can reason with specific objects. It can play games, perform magic tricks and even read the user’s horoscope. It offers personalised and entertaining dialogues, accurate responses, and valuable customer support.

Kuki has been recognised in media outlets and has appeared as a virtual model in Vogue Business and Crypto Fashion Week, and as a speaker at VidCon Asia. In 2021, Kuki modelled digital looks for Vogue Talents designers, which quickly sold out as NFTs on Italian Vogue.

The post 6 Best AI Chatbots for Personal Use appeared first on Analytics India Magazine.