10 Free AI Courses by NVIDIA

NVIDIA is one of the most influential hardware giants in the world. Apart from its much sought-after GPUs, the company also provides free courses to help you learn more about generative AI, GPUs, robotics, chips, and more.

Most importantly, all of these are available free of cost and can be completed in less than a day. Let’s take a look at them.

Generative AI Explained

This self-paced, free online course introduces generative AI fundamentals, which involve creating new content based on different inputs. Through this course, participants will grasp the concepts, applications, challenges, and prospects of generative AI.

Learning objectives include defining generative AI and its functioning, outlining diverse applications, and discussing the associated challenges and opportunities. All you need to participate is a basic understanding of machine learning and deep learning principles.

Digital Fingerprinting with Morpheus

This one-hour course introduces participants to developing and deploying the NVIDIA digital fingerprinting AI workflow, providing complete data visibility and significantly reducing threat detection time.

Participants will gain hands-on experience with the NVIDIA Morpheus AI Framework, designed to accelerate GPU-based AI applications for filtering, processing, and classifying large volumes of streaming cybersecurity data.

Additionally, they will learn about the NVIDIA Triton Inference Server, an open-source tool that facilitates standardised deployment and execution of AI models across various workloads. No prerequisites are needed for this tutorial, although familiarity with defensive cybersecurity concepts and the Linux command line is beneficial.

Building A Brain in 10 Minutes

This course delves into the foundations of neural networks, drawing from biological and psychological insights. Its objectives are to show how neural networks learn from data and to convey the mathematical principles underlying a neuron’s functioning.

While anyone can execute the code provided to observe its operations, a solid grasp of fundamental Python 3 programming concepts—including functions, loops, dictionaries, and arrays—is advised. Additionally, familiarity with computing regression lines is also recommended.
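
The course’s central computation, a neuron taking a weighted sum of its inputs plus a bias and passing it through an activation function, can be sketched in a few lines of Python; the weights and inputs below are arbitrary illustration values, not course material:

```python
import numpy as np

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid squashes z into (0, 1)

x = np.array([0.5, -1.0, 2.0])   # three input features
w = np.array([0.4, 0.6, -0.2])   # one weight per input
b = 0.1

print(neuron(x, w, b))  # here z = -0.7, so the output is sigmoid(-0.7) ≈ 0.33
```

Stacking many such neurons into layers, and adjusting the weights from data, is precisely what the course builds up to.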

Building RAG Agents with LLMs

Retrieval-based LLM systems have gained traction due to their ability to engage in informed conversations, leverage tools, analyse documents, and plan strategically. This course focuses on deploying agent systems effectively and scaling them to meet user and customer demands.

The key learning objectives include exploring scalable deployment methods for LLMs and vector databases, understanding microservices and their interplay, and experimenting with contemporary LangChain paradigms for dialogue management and document retrieval. You will also gain practical experience with state-of-the-art models, along with insights into productionisation and framework exploration.

If you are familiar with LLMs and related composition frameworks like LangChain and have intermediate Python proficiency, this is ideal for you.

Augment your LLM Using RAG

Retrieval Augmented Generation (RAG), devised by Facebook AI Research in 2020, offers a method to enhance an LLM’s output by incorporating real-time, domain-specific data, eliminating the need for model retraining. RAG integrates an information retrieval module with a response generator, forming an end-to-end architecture.

Drawing from NVIDIA’s internal practices, this introduction aims to provide a foundational understanding of RAG, including its retrieval mechanism and the essential components within NVIDIA’s AI Foundations framework. By grasping these fundamentals, you can initiate your exploration into LLM and RAG applications.
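
The retrieve-then-generate loop at the heart of RAG can be sketched without any particular framework; the word-overlap retriever and the `generate` stub below are toy stand-ins for a real vector search and a real LLM call, not NVIDIA’s implementation:

```python
# Minimal RAG sketch: retrieve relevant context, then augment the prompt
# sent to a generator. Both components here are deliberately simplistic.

def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(prompt):
    # Stand-in for an LLM call; a real system would query a model here.
    return f"[LLM response conditioned on prompt: {prompt[:60]}...]"

docs = [
    "NVIDIA Triton Inference Server deploys AI models across workloads.",
    "RAG augments LLM output with retrieved, domain-specific data.",
]
query = "How does RAG improve LLM output?"
context = "\n".join(retrieve(query, docs, k=1))
answer = generate(f"Context:\n{context}\n\nQuestion: {query}")
print(answer)
```

In a production pipeline the retriever would be backed by an embedding model and a vector database, but the data flow stays the same: retrieved text is injected into the prompt, so the model answers from fresh, domain-specific information without retraining.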

Building Video AI Applications at the Edge on Jetson Nano

This self-paced online course aims to equip learners with skills in AI-based video understanding using the NVIDIA Jetson Nano Developer Kit. Through practical exercises and Python application samples in JupyterLab notebooks, participants will explore intelligent video analytics (IVA) applications leveraging the NVIDIA DeepStream SDK.

The course covers setting up the Jetson Nano, constructing end-to-end DeepStream pipelines for video analysis, integrating various input and output sources, configuring multiple video streams, and employing alternate inference engines like YOLO.

Prerequisites include basic Linux command line familiarity and understanding Python 3 programming concepts. The course leverages tools like DeepStream, TensorRT, and requires specific hardware components like the Jetson Nano Developer Kit. Assessment is conducted through multiple-choice questions, and a certificate is provided upon completion.

For this course, you will require hardware including the NVIDIA Jetson Nano Developer Kit or the 2GB version, along with a compatible power supply, microSD card, USB data cable, and a USB webcam.

How to Build Custom 3D Scene Manipulator Tools on NVIDIA Omniverse

This course offers practical guidance on extending and enhancing 3D tools using the adaptable Omniverse platform. Taught by the Omniverse developer ecosystem team, participants will gain skills to develop advanced tools for creating physically accurate virtual worlds.

Through self-paced exercises, learners will delve into Python coding to craft custom scene manipulator tools within Omniverse. Key learning objectives include launching Omniverse Code, installing/enabling extensions, navigating the USD stage hierarchy, and creating widget manipulators for scale control.

The course also covers fixing broken manipulators and building specialised scale manipulators. Required tools include Omniverse Code, Visual Studio Code, and the Python Extension. Minimum hardware requirements comprise a desktop or laptop computer equipped with an Intel i7 Gen 5 or AMD Ryzen processor, along with an NVIDIA RTX Enabled GPU with 16GB of memory.

Assemble a Simple Robot in Isaac Sim

This course offers a practical tutorial on assembling a basic two-wheel mobile robot using the ‘Assemble a Simple Robot’ guide within the Isaac Sim GPU platform. The tutorial spans around 30 minutes and covers key steps such as connecting a local streaming client to an Omniverse Isaac Sim server, loading a USD mock robot into the simulation environment, and configuring joint drives and properties for the robot’s movement.

Additionally, participants will learn to add articulations to the robot. By the end of the course, attendees will gain familiarity with the Isaac Sim interface and documentation necessary to initiate their own robot simulation projects.

The prerequisites for this course include a Windows or Linux computer capable of installing Omniverse Launcher and applications, along with adequate internet bandwidth for client/server streaming. The course is free of charge and focuses on Omniverse technology.

Disaster Risk Monitoring Using Satellite Imagery

Created in collaboration with the United Nations Satellite Centre, the course focuses on disaster risk monitoring using satellite imagery, teaching participants to create and implement deep learning models for automated flood detection. The skills gained aim to reduce costs, enhance efficiency, and improve the effectiveness of disaster management efforts.

Participants will learn to execute a machine learning workflow, process large satellite imagery data using hardware-accelerated tools, and apply transfer-learning for building cost-effective deep learning models.

The course also covers deploying models for near real-time analysis and utilising deep learning-based inference for flood event detection and response. Prerequisites include proficiency in Python 3, a basic understanding of machine learning and deep learning concepts, and an interest in satellite imagery manipulation.

Introduction to AI in the Data Center

In this course, you will learn about AI use cases, machine learning, and deep learning workflows, as well as the architecture and history of GPUs. With a beginner-friendly approach, the course also covers deployment considerations for AI workloads in data centres, including infrastructure planning and multi-system clusters.

The course is tailored for IT professionals, system and network administrators, DevOps, and data centre professionals.

The post 10 Free AI Courses by NVIDIA appeared first on Analytics India Magazine.

Google Unveils Gecko, a Versatile Text Embedding Model Distilled from Large Language Models

Google has announced Gecko, a compact and versatile text embedding model powered by the vast world knowledge of large language models (LLMs).

Text embedding models represent natural language as dense vectors, positioning semantically similar text near each other within the embedding space. Or in simple terms, text embedding models are like translators for computers. They take text and convert it into numbers in a way the computer can understand.

The numerical representations, also known as embeddings, capture semantic information about the words or sentences in the text. By allowing computers to process natural language, these embeddings are used to carry out a wide range of downstream tasks including document retrieval, sentence similarity, classification, and clustering.
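
As a toy illustration of this idea, the hand-made 4-dimensional vectors below stand in for real embeddings (models like Gecko produce hundreds of dimensions); cosine similarity places the semantically related pair much closer together than the unrelated one:

```python
import numpy as np

# Made-up 4-dim "embeddings" for demonstration only; a real embedding
# model would produce these vectors from the text itself.
emb = {
    "a cat sat on the mat":     np.array([0.9, 0.1, 0.2, 0.0]),
    "a kitten rested on a rug": np.array([0.8, 0.2, 0.3, 0.1]),
    "stock prices fell today":  np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_related = cosine(emb["a cat sat on the mat"], emb["a kitten rested on a rug"])
sim_unrelated = cosine(emb["a cat sat on the mat"], emb["stock prices fell today"])
print(sim_related, sim_unrelated)  # the related pair scores higher
```

Document retrieval, clustering, and classification all reduce to comparisons like this one in the embedding space.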

Instead of building separate embedding models for each downstream task, there has been a push for creating a single model that can support many tasks. However, such general-purpose text embedding models require large amounts of training data to comprehensively cover desired domains and skills. This is where LLMs can be leveraged, as done by Google in this research.

“LLMs contain vast knowledge across various domains and are known to be exceptional few-shot learners,” the researchers note. Google’s approach leverages insights from knowledge distillation to create Gecko, a two-step LLM-powered embedding model.

“Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each query, and relabeling the positive and hard negative passages using the same LLM.”

So basically, starting with a large corpus of unlabeled passages, the team used a few-shot prompted LLM to generate a relevant task and query for each passage. They then embedded the concatenated task and query using a pretrained embedding model to obtain nearest neighbor passages, used an LLM to rerank the passages, and obtained positive and negative passages based on the LLM scores. This approach helped Gecko achieve strong retrieval performance.
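
The relabeling step can be sketched as follows; the word-overlap scorer below is a toy stand-in for the LLM reranker the research describes, and the passages are invented for illustration:

```python
# Sketch of LLM-based relabeling: score each candidate passage for a query,
# take the highest-scoring passage as the positive, the rest as hard negatives.

def toy_llm_score(query, passage):
    """Stand-in relevance score via word overlap (a real pipeline asks an LLM)."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def relabel(query, candidates):
    ranked = sorted(candidates, key=lambda p: toy_llm_score(query, p), reverse=True)
    positive, hard_negatives = ranked[0], ranked[1:]
    return positive, hard_negatives

query = "what do text embedding models produce"
candidates = [
    "text embedding models produce dense vectors",
    "gpus accelerate model training",
    "embedding models map text to numbers",
]
pos, negs = relabel(query, candidates)
print(pos)
```

The key point is that the positive is chosen by the LLM’s judgment rather than assumed to be the passage the query was generated from, which is what lets the synthetic dataset carry genuinely hard negatives.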

The research showed that training Gecko on an LLM-generated synthetic dataset, FRet, containing LLM-ranked positives and negatives alone can lead to significant improvements, setting a strong baseline as a zero-shot embedding model on the Massive Text Embedding Benchmark (MTEB).

“By combining this LLM-generated and LLM-ranked data with human-annotated data, our model, Gecko-1B with 768-dimensional embeddings, achieves the best performance on the popular MTEB benchmark among the models with compatible embedding dimensions and model sizes. It achieves an average score of 66.31, competing with 7x larger models and 5x higher dimensional embeddings”, the research mentioned.

The post Google Unveils Gecko, a Versatile Text Embedding Model Distilled from Large Language Models appeared first on Analytics India Magazine.

Beyoncé’s new album ‘Cowboy Carter’ is a statement against AI music

By Amanda Silberling

Beyoncé’s “Cowboy Carter” has been out for only a few days, yet it’s already obvious that we’ll be talking about it for years to come — it’s breaking records across streaming platforms, and the artist herself calls it “the best music [she’s] ever made.” But in the middle of the press release for “Cowboy Carter,” Beyoncé made an unexpected statement against the growing presence of AI in music.

“The joy of creating music is that there are no rules,” said Beyoncé. “The more I see the world evolving the more I felt a deeper connection to purity. With artificial intelligence and digital filters and programming, I wanted to go back to real instruments.”

Beyoncé rarely does interviews, giving each of her comments about the new album more significance — these remarks are among few jumping-off points fans get to help them puzzle through each element of the album, and how they all fit together. So her stance on AI isn’t just a throwaway comment made in conversation with a reporter. It’s deliberate.

The central backlash against AI-generated art comes from the way this technology works. AI-powered music generators can create new tracks in minutes and emulate artists’ vocals to a scarily convincing degree. In some cases, that’s because the AI is being trained on the work of the artists whose jobs it could end up replacing.

Large language models and diffusion models both require sprawling databases of text, images and sounds to be able to create AI-generated works. Some of the best-known AI companies, like OpenAI and Stability AI, use datasets that include copyrighted artworks without consent. Even though Stability AI’s music model was trained on licensed stock music, that’s not the case for the company’s image generator, Stable Diffusion. Stability AI’s VP of Audio Ed Newton-Rex quit his job over this, because he “[doesn’t] agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use.'”

It’s no wonder artists like Beyoncé have strong feelings about this technology — too many AI models have been trained on artists’ work without their consent, and especially for rising musicians who don’t have the clout to buoy them, it will be even harder to break into an already ruthless industry. Beyoncé’s stance makes even more sense in the context of “Cowboy Carter” itself.

Though it does not explicitly discuss AI, “Cowboy Carter” already addresses the theft and appropriation of artworks without consent. On the album itself, Beyoncé is giving listeners a history lesson about how Black musicians formed the foundation of country music, which is too often assumed to represent Southern white culture.

Even the title, “Cowboy Carter,” is a nod to the appropriation of Black music for white people’s gain. Though “Carter” could reference Beyoncé’s married name, it’s also a nod to the Carters, the “first family” of country music — and those Carters took the work of Black musicians to develop the style we now know as country, which continues to exclude Black artists (just recently, an Oklahoma country radio station refused a listener’s request to play Beyoncé’s “Texas Hold ‘Em,” since Beyoncé didn’t fit their definition of a country artist). Beyoncé’s seemingly random stance against AI unearths a similar truth: Once again, artists’ work is being stolen without their consent and contorted into something else, leaving them without payment or credit for their cultural contributions.

There are a few moments on the album when ninety-year-old country icon Willie Nelson appears on a radio show called “Smoke Hour,” and its first appearance precedes “Texas Hold ‘Em.” The placement of the track takes on an extra layer of meaning in light of the Oklahoma radio incident, and Nelson makes a slight jab: “Now for this next tune, I want y’all to sit back, inhale, and go to the good place your mind likes to wander off to. And if you don’t wanna go, go find yourself a jukebox.”

This is Beyoncé’s world: The jukebox and the radio are back in style, Black musicians can make whatever kind of music they want, and no one’s art gets stolen.

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Recent advancements in Large Vision Language Models (LVLMs) have shown that scaling these frameworks significantly boosts performance across a variety of downstream tasks. LVLMs, including MiniGPT, LLaMA, and others, have achieved remarkable capabilities by incorporating visual projection layers and an image encoder into their architecture. By implementing these components, LVLMs enhance the visual perception capabilities of Large Language Models (LLMs). Performance can be further improved by increasing the model's size and number of parameters, as well as expanding the dataset scale.

Models like InternVL have expanded their image encoder to over 6 billion parameters, while others have extended the backend of LVLMs to 13 billion parameters, achieving superior performance on a wide array of tasks. IDEFICS has trained an LVLM with over 80 billion parameters. These scaling methods have matched or exceeded the performance of LLMs pretrained with 34, 70, or even 100 billion parameters. However, scaling has a downside: it significantly increases training and inference costs, because every parameter must be active for every token during computation.

This article discusses MoE-LLaVA, a Mixture of Experts (MoE)-based sparse LVLM architecture that employs an effective training strategy, MoE-Tuning, for LVLMs. MoE-Tuning innovatively addresses performance degradation in multi-modal sparsity learning, resulting in a model with a large number of parameters but consistent training and inference costs. The MoE-LLaVA architecture is designed to activate only the top-k experts during deployment, keeping the rest inactive.

We aim to thoroughly explore the MoE-LLaVA framework, examining its mechanism, methodology, architecture, and how it compares with leading image and video generation frameworks. Let's delve into the details.

MoE-LLaVA: Scaling Large Vision Language Models Affordably

In addition to leveraging visual projection layers and image encoders, Large Vision Language Models also scale up the model size by increasing the number of parameters to enhance performance. Notable examples that have followed this approach include MiniGPT-4, InternGPT, and InternVL. In real-world applications, scaling an LLM or an LVLM with high-quality training data often becomes a necessity to improve performance. Although scaling up the model size does improve performance, it also increases the computational costs of training and deployment, and complicates deploying the model efficiently across parallel devices. A major reason for the increased training and inference costs and computational requirements is that in a dense model, each token demands computation with every single parameter.

On the other hand, sparse MoE, or Mixture of Experts, models have demonstrated effective scaling of frameworks by processing data with a fixed set of activated parameters, an approach that has been widely adopted in the Natural Language Processing field. However, using MoE to train sparse Large Vision Language Models directly is challenging, since converting LLMs to LVLMs and sparsifying the model simultaneously results in significant performance degradation. To apply MoE to scaling LLMs and LVLMs, it is essential to first initialize the LVLM for sparsification. To achieve this, the MoE-LLaVA framework introduces MoE-Tuning, a simple yet effective three-phase training strategy.

As shown in the above figure, the MoE-Tuning process first trains an MLP, or Multilayer Perceptron, that adapts the visual tokens to a Large Language Model in the first stage. The framework then trains the entire parameters of the LLM to pre-empower the Large Vision Language Model with general multi-modal understanding capabilities. Finally, in the third stage, the framework replicates the FFN, or Feed Forward Network, as the initialization weights for the experts, and trains only the Mixture of Experts layers. Overall, the training process enables a gradual transition from a dense LVLM initialization to a sparse Mixture of Experts model.

With the training process covered, let us shine some light on MoE-LLaVA, a baseline for Large Vision Language Models built on Mixture of Experts models with learnable routers. At its core, the MoE-LLaVA model consists of multiple sparse paths, and the framework uses these paths to dispatch each token to different experts through the learnable router. The tokens are then processed collectively by the activated experts while the inactive paths stay silent. The framework then stacks the Mixture of Experts encoder layers iteratively to provide a sparse path towards a larger and more powerful LVLM.

Thanks to this approach, the MoE-LLaVA framework is able to outperform models with a similar number of activated parameters, surpassing them by a large margin on the POPE object hallucination benchmark despite having only 2.2 billion parameters. Furthermore, with those 2.2 billion parameters, the MoE-LLaVA framework achieves performance comparable to the InternVL-Chat-19B framework, which has nearly 8 times the number of activated parameters.

Furthermore, powerful Large Language Models with strong generalization and instruction-following capabilities have been incorporated into Large Vision Language Models. Early works like BLIP encoded visual signals into a sequence of visual tokens, allowing them to adapt vision to LLMs successfully using multiple projection layers. At the same time, recent works focus on improving model performance through methods like expanding the instruction-tuning dataset, increasing image resolution, optimizing training strategies, aligning the input, and enhancing the image encoders. These approaches have helped empower LVLMs with powerful visual understanding capabilities by expanding the visual instruction fine-tuning dataset and model scales. Furthermore, some LVLMs also possess fine-grained image understanding capabilities such as region and multi-region understanding along with pixel-wise grounding capabilities. However, the computational cost of scaling up dense visual data and models is often significantly high, which makes such scaling hard to bear. The MoE-LLaVA framework, by contrast, aims to make LVLM research more affordable by leveraging the capabilities of MoE models.

MoE-LLaVA : Method and Architecture

At its core, the MoE-LLaVA framework consists of a visual projection layer (Multilayer Perceptron), a vision encoder, MoE blocks, multiple stacked LLM blocks, and a word embedding layer.

Architecture

The following table summarizes the detailed configurations of the MoE-LLaVA framework.

For a given RGB image, the vision encoder processes the image to obtain a sequence of visual tokens, and the visual projection layer maps this sequence into the input space of the LLM. The text inputs are processed by the word embedding layer, which projects them into a sequence of tokens. The MoE-LLaVA framework then concatenates the text and visual tokens together and feeds them to the LLM, which consists of stacked blocks of Multi-Head Self-Attention layers and FFNs, or Feedforward Neural Networks. Finally, the framework applies residual connections and layer normalization to each block.

Moving along, the MoE-LLaVA framework replicates the FFNs from the second stage to form an ensemble of experts as its initialization step. The router, a linear layer, predicts the probability of each token being assigned to each expert. Each token is processed by the top-k experts with the highest probabilities, and the output is the weighted sum of their results, with weights given by the softmax of those probabilities.
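
This routing step (a linear router scores each token per expert, the top-k experts process it, and their outputs are combined by a softmax-weighted sum) can be sketched with toy dimensions; nothing below reflects MoE-LLaVA’s actual sizes or trained weights:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 8, 4, 2

token = rng.normal(size=d)                    # one token's hidden state
W_router = rng.normal(size=(d, n_experts))    # router: a linear layer
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # toy FFN experts

logits = token @ W_router                     # router score per expert
topk = np.argsort(logits)[-k:]                # indices of the top-k experts
weights = np.exp(logits[topk]) / np.exp(logits[topk]).sum()  # softmax over top-k

# Only the selected experts run; the inactive experts are skipped entirely,
# which is where the sparse model saves computation.
output = sum(w * (token @ experts[i]) for w, i in zip(weights, topk))
print(output.shape)
```

Because only k of the n experts execute per token, the parameter count can grow with the number of experts while the per-token compute stays roughly constant.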

MoE-Tuning

MoE-Tuning is a simple yet effective three-phase training strategy. It first trains an MLP, or Multilayer Perceptron, that adapts the visual tokens to a Large Language Model. The framework then trains the entire parameters of the LLM to pre-empower the Large Vision Language Model with general multi-modal understanding capabilities. Finally, in the third stage, the framework replicates the FFN, or Feed Forward Network, as the initialization weights for the experts, and trains only the Mixture of Experts layers.

Stage 1

In the first stage, the primary objective is to adapt the image tokens to the large language model that allows the LLM to comprehend the instances in the image. The MoE-LLaVA framework employs a multilayer perceptron to project the image tokens into the input domain of the large language model, and treats image patches as pseudo-text tokens. In this stage, the MoE-LLaVA framework trains the LLM to describe the images, and does not apply the MoE layers to the LLM during this stage.

Stage 2

In the second stage, the MoE-LLaVA framework attempts to enhance the capabilities and controllability of the model by tuning it with multi-modal instruction data. The framework achieves this by adjusting the LLM to become an LVLM with multi-modal understanding capabilities. It employs more complex instructions, including text recognition and logical image reasoning tasks, that require the model to possess stronger multi-modal capabilities. Traditionally, the training process for dense models is considered complete by this step. However, the MoE-LLaVA framework encountered challenges in transforming the LLM into an LVLM while simultaneously sparsifying the LVLM. To counter this challenge, the framework uses the weights from this stage as initialization for the next stage, in an attempt to alleviate the learning difficulty of the sparse model.

Stage 3

In the third stage, the model replicates the feedforward neural network several times to initialize the experts. The framework then feeds the text and image tokens into the Mixture of Experts layers, after which the router calculates the matching weights between the experts and each token. Each token is then processed by the top-k experts, with the aggregated output calculated by weighted summation based on the router’s weights. Once the top-k experts are activated, the model shuts off the remaining experts, an approach that equips the MoE-LLaVA framework with a vast number of possible sparse paths and, thus, a wide range of capabilities.

MoE-LLaVA : Results and Experiments

The MoE-LLaVA framework adopts CLIP-Large as the vision encoder with the Multilayer Perceptron consisting of two layers with a GELU activation layer separating the two. By default, the framework employs an alternating replacement of the feedforward neural networks with the mixture of expert layers, meaning the mixture of expert layers comprise 50% of the total number of layers. The following table contains the different datasets along with their sample size used to train and evaluate the MoE-LLaVA framework.

Zero-Shot Image Question Answering

MoE-LLaVA is a sparse LVLM with a soft router. The framework is evaluated on five image question answering benchmarks, and as the following figure shows, it demonstrates remarkable image understanding capabilities, delivering performance comparable to the state-of-the-art LLaVA 1.5 framework across all five benchmarks.

Object Hallucination Evaluation

To evaluate object hallucination, the MoE-LLaVA framework adopts the POPE evaluation pipeline, a polling-based query method, and the results are demonstrated in the following table. As it can be observed, out of all the frameworks, the MoE-LLaVA delivers the strongest results, indicating the ability of the framework to generate objects consistent with the input image. Additionally, it is worth noting that the MoE-LLaVA framework balances the yes ratio well, indicating the capability of the sparse model to provide accurate feedback for the given question.

The following image contains the distribution of expert loadings, where the dashed lines represent a well-balanced distribution of tokens among the modalities or experts. The first figure illustrates the workload within the experts, while the remaining images demonstrate the performance of experts on different modalities.

Furthermore, the following figure demonstrates the distribution of modalities across different experts.

Final Thoughts

In this article, we have talked about MoE-LLaVA, a baseline for Large Vision Language Models built on Mixture of Experts models with learnable routers. At its core, the MoE-LLaVA model consists of multiple sparse paths, and the framework uses these paths to dispatch each token to different experts through the learnable router. The tokens are then processed collectively by the activated experts while the inactive paths stay silent. The framework then stacks the Mixture of Experts encoder layers iteratively to provide a sparse path towards a larger and more powerful LVLM. The MoE-Tuning strategy innovatively addresses the common issue of performance degradation in multi-modal sparsity learning, constructing a model with a significantly large number of parameters but consistent training and inference costs. The MoE-LLaVA architecture is designed to activate only the top-k experts during deployment while keeping the remaining experts inactive.

5 AI Courses From Google to Advance Your Career


Artificial intelligence or AI is an exciting field to dabble with—whether or not you want to make a career as an AI developer or an AI researcher.

To help you explore, we’ve put together this list of courses and learning paths from Google. From the building blocks of AI to building and deploying AI models, these courses will help you learn it all.

So let’s go over them. Some familiarity with how machine learning works will be helpful to get the most out of these courses.

1. Introduction to AI and Machine Learning on Google Cloud

The course Introduction to AI and Machine Learning on Google Cloud will help you understand how to go from data to AI solutions. So you’ll learn how to build machine learning models, design machine learning pipelines, and build a fully functional AI system while exploring Google Cloud offerings.

This course takes you through the following:

  • AI foundations
  • AI development options
  • AI development workflow

Link: Introduction to AI and Machine Learning on Google Cloud

2. Launching into Machine Learning

The Launching into Machine Learning course will help you build the foundations in machine learning that you’ll need going forward.

The course starts out by exploring fundamentals such as exploratory data analysis and using it effectively to improve data quality. It then covers building and deploying applications using Vertex AI AutoML. Further, you’ll also learn how to use BigQuery ML.

This course has the following modules:

  • Get to Know Your Data: Improve Data through Exploratory Data Analysis
  • Machine Learning in Practice
  • Train AutoML Models Using Vertex AI
  • BigQuery Machine Learning: Develop ML Models Where Your Data Lives
  • Optimization
  • Generalization and Sampling

Link: Launching into Machine Learning

3. TensorFlow on Google Cloud

TensorFlow is one of the most popular development frameworks for developing deep learning and AI applications. The course TensorFlow on Google Cloud dives deep into building applications with TensorFlow.

From building input data processing pipelines to training neural networks at scale, this course teaches you all about using TensorFlow. The following are the modules in this course:

  • Introduction to the TensorFlow Ecosystem
  • Design and Build an Input Data Pipeline
  • Building Neural Networks with TensorFlow and the Keras API
  • Training at Scale with Vertex AI

Link: TensorFlow on Google Cloud

4. Perform Foundational Data, ML, and AI Tasks in Google Cloud

Perform Foundational Data, ML, and AI Tasks in Google Cloud is a learning path to learn how to perform the different tasks in a typical machine learning project.

This means you’ll get to experiment with data preparation and data processing tasks. In addition, you’ll learn to work with the Speech-to-Text and Video Intelligence APIs to build powerful applications.

Link: Perform Foundational Data, ML, and AI Tasks in Google Cloud

5. Introduction to Generative AI Learning Path

Generative AI has become hugely popular recently, and thanks to continued advances, it’s still the talk of the town, with novel generative AI applications regularly making it to market.

So whether you want to use generative AI models effectively or want to dive into how they work, the Generative AI Learning Path has you covered. This learning path has the following courses:

  • Introduction to Generative AI
  • Introduction to Large Language Models
  • Introduction to Responsible AI
  • Generative AI Fundamentals
  • Responsible AI: Applying AI Principles with Google Cloud

Link: Introduction to Generative AI Learning Path

If you’re ready to explore further, you can also go through the Generative AI for Developers Learning Path.

Wrapping Up

I hope you found this compilation of courses helpful. These courses should help you become proficient with the TensorFlow framework as well as build scalable AI solutions with Vertex AI.

Further, these courses are great in that you get to practice and apply what you have learned as you progress through the course. So happy learning and building!

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.

More On This Topic

  • Advance your data science career to the next level
  • 5 Data Science Communities to Advance Your Career
  • Advance your Career with the 3rd Best Online Master's in Data…
  • 7 Free Harvard University Courses to Advance Your Skills
  • 6 Best Free Online Courses to Learn Python and Boost Your Career
  • 5 Advance Projects for Data Science Portfolio

Women in AI: Urvashi Aneja is researching the social impact of AI in India

By Kyle Wiggers

To give AI-focused women academics and others their well-deserved — and overdue — time in the spotlight, TechCrunch is launching a series of interviews focusing on remarkable women who’ve contributed to the AI revolution. We’ll publish several pieces throughout the year as the AI boom continues, highlighting key work that often goes unrecognized. Read more profiles here.

Urvashi Aneja is the founding director of Digital Futures Lab, an interdisciplinary research effort that seeks to examine the interaction between technology and society in the Global South. She’s also an associate fellow at the Asia Pacific program at Chatham House, an independent policy institute based in London.

Aneja’s current research focuses on the societal impact of algorithmic decision-making systems in India, where she’s based, and platform governance. Aneja recently authored a study on the current uses of AI in India, reviewing use cases across sectors including policing and agriculture.

Q&A

Briefly, how did you get your start in AI? What attracted you to the field?

I started my career in research and policy engagement in the humanitarian sector. For several years, I studied the use of digital technologies in protracted crises in low-resource contexts. I quickly learned that there’s a fine line between innovation and experimentation, particularly when dealing with vulnerable populations. The learnings from this experience made me deeply concerned about the techno-solutionist narratives around the potential of digital technologies, particularly AI. At the same time, India had launched its Digital India mission and National Strategy for Artificial Intelligence. I was troubled by the dominant narratives that saw AI as a silver bullet for India’s complex socio-economic problems, and the complete lack of critical discourse around the issue.

What work are you most proud of (in the AI field)?

I’m proud that we’ve been able to draw attention to the political economy of AI production as well as broader implications for social justice, labor relations and environmental sustainability. Very often narratives on AI focus on the gains of specific applications, and at best, the benefits and risks of that application. But this misses the forest for the trees — a product-oriented lens obscures the broader structural impacts such as the contribution of AI to epistemic injustice, deskilling of labor and the perpetuation of unaccountable power in the majority world. I’m also proud that we’ve been able to translate these concerns into concrete policy and regulation — whether designing procurement guidelines for AI use in the public sector or delivering evidence in legal proceedings against Big Tech companies in the Global South.

How do you navigate the challenges of the male-dominated tech industry, and, by extension, the male-dominated AI industry?

By letting my work do the talking. And by constantly asking: why?

What advice would you give to women seeking to enter the AI field?

Develop your knowledge and expertise. Make sure your technical understanding of issues is sound, but don’t focus narrowly only on AI. Instead, study widely so that you can draw connections across fields and disciplines. Not enough people understand AI as a socio-technical system that’s a product of history and culture.

What are some of the most pressing issues facing AI as it evolves?

I think the most pressing issue is the concentration of power within a handful of technology companies. While not new, this problem is exacerbated by new developments in large language models and generative AI. Many of these companies are now fanning fears around the existential risks of AI. Not only is this a distraction from the existing harms, but it also positions these companies as necessary for addressing AI-related harms. In many ways, we’re losing some of the momentum of the “tech-lash” that arose following the Cambridge Analytica episode. In places like India, I also worry that AI is being positioned as necessary for socioeconomic development, presenting an opportunity to leapfrog persistent challenges. Not only does this exaggerate AI’s potential, but it also disregards the point that it isn’t possible to leapfrog the institutional development needed to develop safeguards. Another issue that we’re not considering seriously enough is the environmental impacts of AI — the current trajectory is likely to be unsustainable. In the current ecosystem, those most vulnerable to the impacts of climate change are unlikely to be the beneficiaries of AI innovation.

What are some issues AI users should be aware of?

Users need to be made aware that AI isn’t magic, nor anything close to human intelligence. It’s a form of computational statistics that has many beneficial uses, but is ultimately only a probabilistic guess based on historical or previous patterns. I’m sure there are several other issues users also need to be aware of, but I want to caution that we should be wary of attempts to shift responsibility downstream, onto users. I see this most recently with the use of generative AI tools in low-resource contexts in the majority world — rather than be cautious about these experimental and unreliable technologies, the focus often shifts to how end-users, such as farmers or front-line health workers, need to up-skill.

What is the best way to responsibly build AI?

This must start with assessing the need for AI in the first place. Is there a problem that AI can uniquely solve or are other means possible? And if we’re to build AI, is a complex, black-box model necessary, or might a simpler logic-based model do just as well? We also need to re-center domain knowledge into the building of AI. In the obsession with big data, we’ve sacrificed theory — we need to build a theory of change based on domain knowledge and this should be the basis of the models we’re building, not just big data alone. This is of course in addition to key issues such as participation, inclusive teams, labor rights and so on.

How can investors better push for responsible AI?

Investors need to consider the entire life cycle of AI production — not just the outputs or outcomes of AI applications. This would require looking at a range of issues such as whether labor is fairly valued, the environmental impacts, the business model of the company (i.e. is it based on commercial surveillance?) and internal accountability measures within the company. Investors also need to ask for better and more rigorous evidence about the supposed benefits of AI.

Transformer Was Once Called CargoNet

It took seven years for all the authors of ‘Attention is All You Need’, minus Niki Parmar, to come together in the same room. This moment finally arrived during the NVIDIA GTC 2024 session ‘Transforming AI’, hosted by GPU king Jensen Huang.

It was so great to see almost everyone (we missed you @nikiparmar09!!) from the Transformer paper again. We still haven't all been in the same room at the same time, but we'll make it happen one day.@lukaszkaiser @kyosu @ashVaswani @ilblackdragon @YesThisIsLion pic.twitter.com/L9UH8w6nuO

— Aidan Gomez (@aidangomez) March 21, 2024

Noam Shazeer, founder of Character AI, revealed that the Transformer architecture was once called ‘CargoNet’, but nobody really paid any attention to it whatsoever.

“There were a lot of names; there was something called CargoNet [short for Convolution, Attention, Recognition, and Google],” said Shazeer excitedly. However, the name failed to impress, with everyone unanimously downvoting it as “horrible”. “Wise people,” quipped Huang, pulling Shazeer’s leg.

Jakob Uszkoreit eventually came up with the name Transformer. “The reason it became such a generic name is that, on paper, we weren’t focused solely on translation. We were definitely aware that we were trying to create something very general, something that could truly transform anything into anything else,” said Llion Jones, founder of Sakana AI.

Speaking on Transformer’s multimodality aspect, Aidan Gomez, founder of Cohere, said, “When we were building the Tensor library, we were really focused on scaling up autoregressive training. It wasn’t just for language; there were components in there for images, audio, and text, both in input and output.”

What are the Creators of Transformer Up To Now?

Illia Polosukhin was the first to leave Google, in 2017. He went on to build NEAR Protocol, a blockchain platform designed to be faster, cheaper, and more user-friendly than existing options like Ethereum.

Ashish Vaswani left Google in 2021. “One of the big reasons why I left was that the only way to make these models smarter was not just by working in a vacuum of a lab, you actually had to go out and put them into people’s hands,” he said.

In late 2022, he, along with Niki Parmar, founded a company called Essential AI. “We’re really excited about building models that can ultimately learn to solve new tasks at the same level of efficiency as humans as they watch what we do,” said Vaswani, adding that their ultimate goal is to change the way we interact with computers and how we work.

Meanwhile, Shazeer founded Character AI in 2021. “The biggest frustration at that time was, here’s this incredible technology, and it’s not getting out to everyone, it has so many uses,” said Shazeer with such energy that Huang quipped, “This is what Zen looks like”.

Gomez founded Cohere in 2019. He said the idea behind Cohere was similar to Noam’s: he felt that this technology would change the world as computers started speaking back to humans.

“I think the way that I’ve gone about it is a bit different from Noam’s in the sense that Cohere builds for enterprises. We create a platform for every enterprise to adopt and integrate it (genAI) into their product, as opposed to directly going to consumers,” said Gomez.

In 2023, Jones co-founded Sakana AI, a nature-inspired artificial intelligence startup based in Japan. ‘Sakana’ means fish in Japanese. The company is currently working on a technique called Evolutionary Model Merge, which combines models drawn from the vast ocean of open-source models with diverse capabilities.

“We’re making the algorithms by hand. To do it we took all the models available on Hugging Face and then used very large amounts of computation to use evolutionary computation to search through the space of how to merge and stack the layers,” said Jones.

“I want to remind you that the massive amount of computation that NVIDIA has given us, there are other things we can do apart from gradient descent,” he added.

Lukasz Kaiser joined OpenAI in 2021. “That’s where the best Transformers were built. It’s a lot of fun at the company. We know you can take a ton of data and a ton of compute and make something nice,” said Kaiser.

Uszkoreit founded Inceptive AI in 2021 to use AI to design novel biological molecules for vaccines, therapeutics, and other treatments, essentially creating a new kind of ‘biological software’. “My first child was born during the pandemic, which certainly, but also otherwise, gave me a newfound appreciation for the fragility of life,” said Uszkoreit.

What Comes After the Transformer?

Huang asked the panel about the most significant improvements to the base Transformer design. Gomez replied that extensive work has been done on the inference side to speed up these models. However, Gomez said that he is quite unhappy with the fact that all the developments happening today are built on top of Transformers.

“I still think it kind of disturbs me how similar to the original form we are. I think the world needs something better than the Transformer,” he said, adding that he hopes it will be succeeded by a ‘new plateau of performance’. “I think it is too similar to the thing that was there six or seven years ago.”

Jones said that companies like OpenAI are currently using a lot of computation. “I think they’re doing a lot of wasted computation,” he said when Huang asked about their interest in a larger context window and faster token generation. Huang quickly chipped in: “We are trying to make it efficient”.

Uszkoreit thinks that the solution to the computation problem is the right allocation. “It’s really about spending the right amount of effort and ultimately energy.” As for SSMs (state space models), he is of the opinion that they are ‘too complicated’ and ‘not elegant enough’.

Meanwhile, Ashish Vaswani, who heads Essential AI, believes that the right interface is essential to building better models. “If we ultimately want to build models that can mimic and learn how to solve tasks by watching us, the interface is going to be absolutely huge,” he said.

Jones believes that many young researchers have forgotten the pre-Transformer age. He said that all the problems they were facing back then while trying to get things working are likely still present in these models. “People seem to have forgotten the pre-Transformer age, so they have to rediscover all those problems,” he added.

Polosukhin noted that the Transformer has recurrent steps. “The fun fact is that I find that nobody is actually playing with the fact that you can run a Transformer for a variable number of steps, and train that differently.”

Meanwhile, Lukasz Kaiser is skeptical that recurrence can be trained effectively. “I have this personal belief that we have never truly learned how to train recurrent levels with gradient descent. Maybe it’s just impossible,” he said.

The post Transformer Was Once Called CargoNet appeared first on Analytics India Magazine.

The Psychology of Data Visualization: How to Present Data that Persuades

Image by pikisuperstar on Freepik

We’ve all become keenly aware of the power that data, and consequently, analytics can bring to the table for businesses and organizations of all stripes. However, presentation matters—without a simple-to-understand, compelling way of getting information across, all our analytic capabilities don’t amount to much.

Enter data visualization, stage left. By leveraging the most immediate and dominant of all the senses, complex and intricate datapoints can be condensed into an actionable format that everyone involved in the process can intuitively understand.

Consequently, the ability to effectively communicate insights and ideas through visual representations has become increasingly crucial. Effective visualizations can cut through the noise, highlight patterns and relationships, and guide the viewer's attention to the most crucial insights.

This article discusses the psychological principles and techniques that underpin the creation of persuasive and effective data visualizations.

The Psychology of Visual Processing

The psychology of visual processing in data visualization is a deeply fascinating area that intersects psychology, neuroscience, and design principles. It focuses on how humans interpret, understand, and respond to visual data presentations. It highlights the incredible efficiency of our visual system compared to other forms of data processing.

The human brain processes visuals incredibly fast. In fact, research shows that it can understand things like shape, color, and orientation in as little as 13 milliseconds. This speed makes visual data much more immediate and impactful than text-based information.

Therefore, figures alone often won’t do justice to your efforts. For example, even a simple chart supplemented with percentage changes and numbers does a much better job of showing how your cloud cost optimization efforts are paying off for the organization than a bland statement of fact would.

Additionally, the Dual Processing Theory explains that we have two types of thinking: fast, instinctual (System 1) and slow, analytical (System 2). Visualizations tap into System 1, letting us grasp complex info quickly without needing to engage in deeper, slower analysis.

The way we use color, shape, and placement can affect how well people remember and make decisions based on the visualization. Understanding how visual elements influence perception and memory helps in creating more effective visuals.

Creating effective data visualizations means leveraging humans’ innate visual processing skills to present complex information in an instantly understandable and memorable way.

The Principles of Persuasive Data Visualization

Creating persuasive data visualizations involves a blend of storytelling, strategic data selection, and effective design principles. Here's a breakdown of essential components to consider when aiming to craft visualizations that not only inform but also convince your audience:

Strategic Storytelling

Every effective data visualization begins with a clear, specific objective. This means knowing precisely what action or understanding you want to provoke in your audience. From there, crafting a narrative around your data helps to engage the audience, making the information more relatable and compelling.

This narrative should have a clear introduction, body, and conclusion, each part building on the last to guide your audience toward the desired understanding or action.

Appropriate Data Selection

Choosing the right data for your audience is crucial. The data should be directly relevant to your audience's interests or needs. This tailored approach ensures that the visualization speaks directly to the concerns or curiosity of the audience, making the message more persuasive.

One of the issues with data selection is that it can be time-consuming. However, you don’t have to do everything manually — using automated document generation tools cuts down on the time required, allowing you to devote more of your efforts to analyzing the most engaging datapoints to use in your presentation. On top of that, you can also use a variety of visualization tools — there’s no need to make handcrafted presentations for absolutely everything.

Design Principles

This principle is made up of several components including:

  • Alignment. Proper alignment of data displays, both vertically and horizontally, ensures that the information can be accurately compared and understood without causing confusion or misinterpretation through optical illusions.
  • Color choice. Colors should be used deliberately to highlight key data points and draw the audience's attention to the most important parts of the visualization. It's also important to choose color combinations that are accessible to everyone, including those with color vision deficiencies.
  • Title and label clarity. Titles and labels play a significant role in guiding the audience through data visualization. They should be clear, informative, and concise, providing context and emphasizing the key takeaways of the visualization.
  • Interactivity. When appropriate, incorporating interactivity into your visualizations can enhance the user experience, making the data exploration more engaging and insightful. However, it's crucial that this feature enhances rather than complicates the understanding of the data.
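
The accessibility point under color choice can be made concrete. As a rough sketch, the WCAG guidelines define a contrast ratio from the relative luminance of two colors, with 4.5:1 as the usual minimum for body text; the function names below are our own, but the constants follow the WCAG 2 formulas:

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color (0-255 channels), per the WCAG 2 definition."""
    def channel(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colors; ranges from 1:1 up to 21:1."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# Black text on a white background gives the maximum possible contrast.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A checker like this can be run over a palette before a chart ships, flagging any text/background pair below the 4.5:1 threshold.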

Best Practices for Persuasive Data Visualization

To create persuasive data visualizations, it's essential to follow best practices that make your data clear, engaging, and easy to understand. Here are some tips:

  • Choose the right visualization tool. Selecting an appropriate tool is crucial. Options like ChartExpo, Power BI, and Looker Studio are praised for their ease of use and effectiveness in creating clear charts even for non-technical audiences.
  • Use colors strategically. Colors can significantly impact how information is processed and remembered. Use contrasting colors to highlight key insights and ensure your color choices are accessible to color-blind individuals. Avoid using too many colors in a single chart to maintain clarity.
  • Highlight key insights. Make sure to draw attention to the most important parts of your data. This could involve using visual cues like reference lines, or simply highlighting significant bars in a bar chart.
  • Look for business insights. Aligning data visualizations with high-level business objectives helps ensure that your visualizations are not just informative but also actionable. Use data to forecast trends and make informed decisions.
  • Pick the right type of chart. The choice of chart is critical for clarity and effectiveness. Bar charts, line graphs, scatter plots, pie charts, and more have specific uses that make them more suitable for certain types of data. For example, bar charts are great for comparing categorical data, while line graphs are ideal for showing trends over time.
  • Ensure context and comprehension. Data visualization should not only present numbers attractively but should also convey a clear message that's easy to understand. This includes using a compelling title, ensuring proportional scaling, and being clear with labels and legends.
  • Utilize different chart types for specific purposes. Different charts serve different analytical needs, such as bubble charts for adding an extra dimension to scatter plots, or waterfall charts for visualizing sequential changes. Composition charts like pie charts, stacked charts, and Sankey diagrams can elucidate the structure of data and its changes over time.
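
The chart-selection advice above amounts to matching the question being asked to a chart type. As an illustrative sketch (the mapping and names here are our own simplification, not an exhaustive taxonomy), it could be encoded as a lookup:

```python
# Map the kind of analytical question to a commonly recommended chart type,
# following the guidance above.
CHART_GUIDE = {
    "compare categories": "bar chart",
    "show trend over time": "line graph",
    "show relationship between two variables": "scatter plot",
    "show parts of a whole": "pie chart",
    "show sequential changes": "waterfall chart",
}

def recommend_chart(question: str) -> str:
    """Return a suggested chart type; bar charts are a safe general-purpose default."""
    return CHART_GUIDE.get(question, "bar chart")

print(recommend_chart("show trend over time"))  # line graph
```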

The Benefits of Data Visualization

One of the primary advantages is the simplification of complex data. Data visualization converts vast amounts of information into a format that's easier to process and understand, enabling users to grasp complicated concepts quickly and efficiently. This is particularly crucial in environments where quick decision-making is essential.

Data visualization also enhances storytelling. It allows businesses to present their narratives compellingly, making it easier to communicate with stakeholders, train teams, or attract customers. This approach is effective for presenting ideas and strategies, such as proposing the integration of new technologies or processes to improve departmental efficiency.

Moreover, it increases productivity by providing immediate insights, which helps in quick action and reduces delays due to data misinterpretation. By making data easier to digest, teams can focus on actionable items and improvements rather than spending time trying to understand complex datasets.

Risk management is another area where data visualization proves invaluable. It helps organizations to better understand and navigate scenarios involving uncertainties and risks, by visually simplifying data to highlight potential areas of concern.

Sometimes, management and the C-suite require a bit of nudging. I’m not playing the blame game here — people in positions of authority simply don’t have the time to micromanage and be hyper-aware of all the minute details and complexities on the ground level of operations. To use an example, if you’re aware that SAP consulting would benefit your department and lead to greater integration and efficiency, presenting that information all at once, in a compelling narrative laden with charts and graphs has a much better chance of working than simply mentioning the fact and letting the initial impression peter out while other priorities are discussed.

Conclusion

Data visualizations provide a powerful way to cut through the noise and deliver insights that truly resonate.

However, the true mastery of persuasive data visualization lies in striking the perfect balance between aesthetics and functionality. It requires a deep appreciation for the interplay between visual design principles, cognitive processes, and human behavior.

Only by striking that balance can we create visualizations that are not merely beautiful but also clear, insightful, and profoundly impactful.

Nahla Davies is a software developer and tech writer. Before devoting her work full time to technical writing, she managed—among other intriguing things—to serve as a lead programmer at an Inc. 5,000 experiential branding organization whose clients include Samsung, Time Warner, Netflix, and Sony.

More On This Topic

  • Top 38 Python Libraries for Data Science, Data Visualization &…
  • KDnuggets News 22:n16, Apr 20: Top YouTube Channels for Learning…
  • Plotting and Data Visualization for Data Science
  • SQL for Data Visualization: How to Prepare Data for Charts and Graphs
  • KDnuggets News, November 9: 7 Tips To Produce Readable Data Science…
  • How Visualization is Transforming Exploratory Data Analysis

How to Transform Your ML Models with Generative AI


A “mix-in” is a component or feature added to an existing system or product to enhance its functionality, performance, or complexity without altering its core structure, akin to adding toppings to a dessert to enrich its flavor and appeal.

Recently, a customer mentioned their plans to implement Generative AI (GenAI) for predictive maintenance. Maybe I’m too old school, but I don’t understand how GenAI would predict the maintenance needs of critical devices and components. Traditionally, predictive maintenance relies on a mix of supervised and unsupervised machine learning (ML) algorithms. These algorithms work in tandem to analyze usage patterns and predict the likelihood of device or component failure, often captured as a predictive score. ML models identify the performance behaviors indicative of a component failure by learning from historical data, including sensor outputs, operational logs, and previous failure instances.

Yet, GenAI introduces a novel dimension to these established ML models. It can generate new content and scenarios from existing data distributions, enhancing model capabilities. While GenAI may not directly address predictive maintenance in the traditional sense, it can significantly boost the precision and efficacy of predictive maintenance ML models. This is achieved through:

  • Synthetic Data Generation: One of the challenges in predictive maintenance is the lack of “failure examples” in training robust ML models, especially for critical equipment that rarely fails, such as doors falling off airplanes or nuclear power plant meltdowns. GenAI can generate synthetic data that mimics the characteristics of rare failure modes that augment the training dataset, allowing supervised models to learn from a wider variety of failure scenarios without actually having to wait for them to occur.
  • Feature Engineering: GenAI can discover complex patterns in data that traditional analysis techniques may miss, which could help predict component failures.
  • Anomaly Detection: GenAI can expand your ML models’ training sets by identifying deviations from normal operations that might not have been previously labeled as failure indicators.
  • Operating Condition Simulation: GenAI can simulate a device’s normal operating conditions under various scenarios and compare the simulations against real-time operational data to detect early signs of performance degradation that might indicate a component failure.
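
To make the synthetic-data idea concrete, here is a minimal sketch. It is not a real generative model: it simply fits a Gaussian to the few observed failure readings and samples new ones, which illustrates the same augmentation idea in miniature (the sensor values are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sensor readings (temperature, vibration) from the few failures observed.
failures = np.array([
    [98.1, 0.71],
    [96.5, 0.68],
    [99.2, 0.75],
    [97.8, 0.70],
])

# Fit a simple Gaussian to the rare failure class...
mean = failures.mean(axis=0)
cov = np.cov(failures, rowvar=False)

# ...and sample synthetic failure examples to augment the training set,
# so a classifier sees far more failure-like points than reality provided.
synthetic = rng.multivariate_normal(mean, cov, size=200)

augmented = np.vstack([failures, synthetic])
print(augmented.shape)  # (204, 2)
```

A production GenAI approach would use a far richer model (a GAN, VAE, or diffusion model conditioned on operating context), but the workflow is the same: learn the failure distribution, sample from it, and train on the augmented set.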

GenAI can significantly boost the performance of your ML model by generating new data scenarios and simulating rare one-off scenarios, yielding the following benefits:

  • Improve ML Model Accuracy: GenAI generates additional, diverse training data, helping the original AI model learn from a broader range of examples and use cases, improving its accuracy and ability to generalize from unseen data.
  • Increase ML Model Relevance: GenAI can create content or scenarios tailored explicitly to the AI model’s needs, ensuring the learning material is highly relevant and directly applicable to the tasks.
  • Encourage Data Scientist Exploration and Imagination: Just as mix-ins can turn a plain ice cream into an exciting dessert, GenAI encourages exploration and imagination by allowing data scientists to experiment with novel approaches and tackle problems from fresh perspectives.
  • Drive ML Model Innovation: GenAI can generate synthetic data to match your target outcomes better. This enriches traditional models with new insights and capabilities, improves prediction accuracy, and fosters innovative solutions to complex problems.
  • Increase ML Model Robustness and Reuse: GenAI can address imbalanced datasets and rare event prediction by enhancing data diversity and feature complexity, making models more adaptable and reusable to other use cases.

GenAI: The Human Exploration and Ideation Aide

Just like using GenAI can enhance the capabilities of traditional ML models and the data scientists who are building those ML models, humans can leverage GenAI to augment their personal and professional development and growth in the following ways:

  • Augmenting Creativity: GenAI can help individuals push the boundaries of their creativity by suggesting ideas, visualizations, and solutions that might not have occurred to them. It can be a source of inspiration or a partner in creative endeavors, whether in art, design, writing, modeling, or any other creative field.
  • Enhancing Decision-making: GenAI can augment human decision-making processes, especially in complex scenarios, by providing data-driven insights and simulations. It allows individuals to consider broader outcomes and factors than they might naturally account for, leading to more informed and nuanced decisions.
  • Expanding Knowledge and Learning: GenAI can tailor unique, out-of-the-box learning experiences, generate educational content, and simulate scenarios for experiential learning. This personalized approach can accelerate learning and help individuals acquire new skills or knowledge more effectively.
  • Improving Problem-solving: For complex or novel problems, GenAI can offer alternative solutions, uncover hidden patterns, and simulate the impact of different approaches. This can enhance human problem-solving capabilities by providing a more comprehensive array of strategies and perspectives.
  • Facilitating Collaboration: GenAI can also act as a catalyst for collaboration, bridging gaps between diverse fields of expertise by translating concepts, generating common ground, and fostering a shared understanding among collaborators.

How GenAI Enhances the “Thinking Like a Data Scientist” Methodology

The “Thinking Like a Data Scientist” methodology guides users through framing questions, analyzing data, and applying models to uncover insights that support more informed decisions and deliver more relevant, meaningful, responsible, and ethical outcomes. The methodology fosters curiosity and systematic exploration, emphasizing ideation, exploration, and iterative learning to translate data into actionable intelligence (Figure 1).


Figure 1: The Art of Thinking Like a Data Scientist

The integration of GenAI with the “Thinking Like a Data Scientist” methodology, which is a crucial part of the next book that I am writing, represents a compelling fusion of exploration, creativity, and rigorous analytical discipline. GenAI is a transformative agent that can significantly expand the horizons of what’s possible in data science, turning the methodology into an even more powerful engine for innovation and value creation in the following areas:

  • Hypothesis Testing and Exploration: Empower data scientists to test hypotheses in previously unimaginable ways. By generating synthetic data that can simulate various scenarios, conditions, and outcomes, practitioners can explore a vast landscape of possibilities without the constraints of existing datasets.
  • Embracing Diverse Perspectives: Amplify understanding and integrate diverse stakeholder perspectives by providing insights and generating data that reflect many scenarios, including those that human analysts may not yet have considered.
  • Challenging Conventional Thinking: Challenge conventional thinking by uncovering patterns, trends, and relationships that may not be evident through traditional analysis. It encourages users to question and rethink assumptions, offering a dynamic playground for experimentation where conventional thinking is continuously tested against AI-generated data.
  • Driving Innovation Through Economies of Learning: Accelerate and enrich the iterative learning cycle with a broader array of data points and scenarios for consideration. The result is a more agile, responsive, and innovative analytical process that can adapt and evolve quickly as new insights are uncovered.
  • Unleashing the Natural Human Creativity: Generate novel combinations of data, scenarios, and predictions that encourage data scientists to think creatively about problem-solving, model design, and applying insights. This creative boost is essential for driving innovation, as it allows data scientists to venture beyond traditional boundaries and explore new frontiers in data analysis and application.
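
The hypothesis-testing idea above can be made tangible with a small Monte Carlo sketch. The scenario and all numbers here are hypothetical: simulate delivery times under a proposed process change and estimate how often an SLA would be met, without needing historical data for a process that does not exist yet.

```python
import random

def simulate_delivery_times(n, mean, stdev, seed=7):
    """Synthesize delivery-time scenarios under an assumed process."""
    rng = random.Random(seed)
    return [max(0.0, rng.gauss(mean, stdev)) for _ in range(n)]

def sla_hit_rate(times, sla_hours):
    """Fraction of simulated deliveries that meet the SLA."""
    return sum(t <= sla_hours for t in times) / len(times)

# Hypothesis: a proposed change cuts mean delivery time from 26h to 22h.
current = simulate_delivery_times(10_000, mean=26.0, stdev=4.0)
proposed = simulate_delivery_times(10_000, mean=22.0, stdev=4.0)

print(round(sla_hit_rate(current, sla_hours=24.0), 3))
print(round(sla_hit_rate(proposed, sla_hours=24.0), 3))
```

With these assumed parameters, the simulation shows the proposed process meeting a 24-hour SLA roughly twice as often as the current one, letting the hypothesis be explored and debated before any real-world change is made.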

Summary: GenAI Creativity Mix-in

GenAI is not a replacement for your traditional ML models. Instead, it is a transformative mix-in that significantly boosts the ML models’ performance, accuracy, relevance, and reusability. Through the generation of synthetic data and intricate simulations, GenAI facilitates the exploration of a broader spectrum of scenarios, including those that are less common or entirely novel. This capability allows ML models to learn from a more comprehensive array of experiences, predicting outcomes with unmatched precision and broadening the horizons of what AI can achieve.

Similarly, GenAI acts as a catalyst for personal productivity, creativity, and innovation. It effectively amplifies our innate human qualities of curiosity, exploration, and imagination, bridging the realms of what is known and the endless possibilities of what could be. By leveraging GenAI, individuals can unlock new levels of creative thinking and innovative problem-solving, pushing the boundaries of traditional methodologies and fostering a culture of continuous discovery and growth. In essence, GenAI not only revolutionizes the landscape of ML but also redefines the parameters of human creativity and innovation (Figure 2).


Figure 2: The Path to Cultural Empowerment and Innovation

Why Developers Hate Jira

People in tech are sure to have come across Atlassian’s Jira, the popular project management tool. Though helpful in bug tracking and project management, it draws a remarkable amount of grumbling. A simple web search for “hate jira” throws up a flood of links and posts on X and Reddit, and even a website dedicated solely to hating the tool.

On the contrary, if you look at the Stack Overflow Developer Survey, Jira has consistently ranked among the most uncomplicated and sought-after asynchronous tools for the past two years. Though critics are aplenty, there are many fans too. The problem is this: developers hate it, managers love it.

Jira’s bad, but it’s best at what it does

What many regard as a feature is exactly what frustrates developers: Jira offers a plethora of options. As a Reddit user said, “it’s too flexible”, making it unintuitive and a mess.

While this flexibility may seem advantageous in theory, opponents argue that Jira deployments often transform into cumbersome, confusing configurations filled with numerous fields, dropdown menus, and toggles, making the user experience less intuitive. Particularly when compared with newer, more efficient alternatives such as Trello, ClickUp, and Notion, Jira seems sluggish.

“Jira is just clunky as hell and enables focusing on things that aren’t the core product. It’s a high maintenance tool,” said a user on X. Similar sentiments have been expressed about Agile in the same thread.

“Whose bright idea was it to measure productivity by how many complexity or story points you complete? That’s like measuring a truck driver by how much diesel they use, instead of how much deliveries they make,” expressed a user on X. “You end up spending more time managing the process instead of getting the work done,” said another.

Jira was developed mostly for larger enterprises, and that comes at the cost of suitability for smaller organisations, which in the end makes it a poor fit for developers. It is more a tool for a company’s decision makers and management teams.

“That means they [Jira creators] focus on satisfying the managers, and management consultants, and some class of Jira-specialised consultants, who benefit from their trade of coming into businesses to help fix their Jira setups being protected by it, not being particularly easy for organisations to use. They can get away with piss-poor experience for the devs,” explained a user on X.

To further illustrate the user experience, another developer said that Jira’s UI is not that friendly. “I feel like it’s gaslighting me,” the user added. It is often described as bloated, slow software built by scrum enthusiasts.

Is Jira just a scapegoat?

Even with its problems, Jira is admired by many for organising their workflows, though that comes at the cost of performance. On the other hand, several people claim that blaming Jira is really just blaming the management.


A user on Hacker News said that he has never heard anyone say ‘Jira is brilliant’, but at the same time it is a convenient culprit when blame needs assigning in the corporate theatre. Another user points out that the developer mentality of sitting in silos and just coding is not viable for many organisations.

“I’m a developer, and while I don’t love it, I certainly don’t hate it, and I don’t see it as worse than any other task-tracking tool,” said a user. The developers who hate Jira are the ones who have to work with bloated setups and poorly managed frameworks, which makes it more an organisational fault than the tool’s.

Atlassian has been accepting this feedback, positive and negative, and making changes to Jira. In an interview, Megan Cook, head of product for Jira Software, Agile Solutions, said that the company is constantly working on improving Jira.

Cook also proposes that Atlassian’s ecosystem will provide a competitive advantage in the age of generative AI. She emphasises that the abundance of data circulating within and around Jira can and will be leveraged to enhance the software’s functionality and utility.

The post Why Developers Hate Jira appeared first on Analytics India Magazine.