NVIDIA’s Earnings Report Reveals Dominance in the AI Revolution

In recent years, AI has been the fulcrum of technological advancements, driving innovation and reshaping industries. NVIDIA, known for its prowess in graphics processing, stands tall as a testament to the vast financial and technological gains associated with the rise of AI. Their recent financial reports illuminate just how integral the company has become in the world of artificial intelligence.

A Financial Powerhouse: Surpassing Expectations

NVIDIA's latest financial results are nothing short of groundbreaking. With a staggering revenue of $13.51 billion in the second quarter, NVIDIA more than doubled the $6.7 billion it reported in the same quarter a year earlier and significantly exceeded market forecasts. Furthermore, the company's GAAP net income stood at an impressive $6.18 billion, more than nine times the $656 million reported in the year-ago quarter.

Gaming and Beyond: NVIDIA's Diverse Revenue Streams

While NVIDIA's AI and data center segments were the prime catalysts for its success, its gaming division also witnessed commendable growth. It brought in $2.49 billion in Q2 revenue, a 22% surge from the previous year. Key contributors to this uptick included the newly launched GeForce RTX 4060 GPU, the unveiling of the Avatar Cloud Engine (ACE) tailored for gaming, and the integration of 35 DLSS games like Diablo IV.

The AI Goldmine: Data Centers and Beyond

The true champions in NVIDIA's success story were its AI and data center divisions, which saw a record revenue of $10.32 billion, marking an astronomical growth of 141% from Q1 2023 and a 171% leap from the previous year.

Jensen Huang, NVIDIA's CEO, once reflected on the company's visionary shift in 2018, when NVIDIA adopted AI for DLSS. This strategic move wasn't just about enhancing graphics with AI; it was about revolutionizing the GPU for AI applications. He postulated that large language models (LLMs) would be pivotal in everything from VFX to heavy industries.

Hardware and Software: NVIDIA's Winning Combination

The company's foresight is evident with their flagship H100 Tensor Core GPU and more intricate systems like the HGX box. Such innovations have catered to the needs of leading clients, like Microsoft's Azure segment, solidifying NVIDIA's position as a crucial player in the AI sphere.

However, NVIDIA's strength doesn't solely lie in its hardware. Its unique blend of custom software and applications creates a deep-rooted bond with its customers, making migration to competitors like AMD less attractive. NVIDIA finance chief, Colette Kress, emphasized the significance of their data center products, particularly their intricate software components, in fortifying gross margins.

The Road Ahead: An AI-Driven Future

NVIDIA's success isn't a mere transient phase. The company's CEO acknowledged the monumental shift of companies worldwide from general-purpose to accelerated computing and generative AI. With major cloud service providers now investing heavily in NVIDIA's H100 AI infrastructures and a slew of partnerships in the pipeline, NVIDIA foresees a bright future. As the world transitions deeper into AI, NVIDIA projects an optimistic revenue forecast of around $16 billion for Q3.

NVIDIA stands as a beacon of innovation, illuminating the immense potential of AI. Their success story is a testament to the transformative power of artificial intelligence when combined with visionary leadership and cutting-edge technology.

Gupshup Launches Domain-Specific ACE Large Language Models

Conversational AI company Gupshup has announced the launch of ACE LLM, a family of domain-specific Large Language Models (LLMs) specialised for functions such as marketing, commerce, support, HR & IT, and industries like banking, retail, utilities and more.

Built upon foundation models such as Meta’s Llama 2, OpenAI GPT-3.5 Turbo, Mosaic MPT, Flan-T5, and others, ACE LLM has been meticulously adapted for specific industries and functions and is also equipped with enterprise-grade safety controls and guardrails.

The models empower enterprises to quickly and effectively transform conversational experiences across various stages of the customer lifecycle. From product discovery, lead generation, and commerce to troubleshooting and customer support, these models enable more precise, human-like interactions delivered with speed, scale, and compliance along with data governance, all while maintaining a lower total cost of ownership.

Available in sizes from 7 billion to 70 billion parameters, ACE LLM generates text in 100+ languages, including Spanish, Portuguese, French, German, Bahasa, Arabic, Mandarin, Hindi, and English.

The built-in guardrails in ACE LLM eliminate irrelevant, out-of-context responses. When combined with a company’s knowledge base, the lift in accuracy levels and transparency makes the output ready to handle dynamic user conversations. In addition, this unique LLM also supports enterprise-level controls for accuracy, source data boundaries, tone, auditing, teach mode for non-generative responses, automated testing, and analytics.

ACE LLM is available in the Gupshup public cloud and offers deployment options with support for geo-specific data residency or enterprise private cloud (VPC) with high scalability.

“To harness the full power of foundation LLMs, enterprises need to fine-tune them for their domain requirements and add additional guardrails around security, compliance, and relevance, while also ensuring data residency and cost efficiency. Gupshup is excited to launch the ACE LLM family of domain models, which are custom-built to fill this gap, thereby enabling enterprises to transform their customer experiences,” Beerud Sheth, CEO and co-founder of Gupshup, said.

The post Gupshup Launches Domain-Specific ACE Large Language Models appeared first on Analytics India Magazine.

GPT-4: 8 Models in One; The Secret is Out

GPT-4 has been THE groundbreaking model so far, available to the general public either for free or through a commercial portal (for public beta use). It has worked wonders in igniting new project ideas and use cases for many entrepreneurs, but the secrecy about the model and its number of parameters was killing enthusiasts, whose bets ranged from the first 1-trillion-parameter model to claims of 100 trillion parameters!

The cat is out of the bag

Well, the cat is out of the bag (sort of). On June 20th, George Hotz, founder of self-driving startup Comma.ai, leaked that GPT-4 isn’t a single monolithic dense model (like GPT-3 and GPT-3.5) but a mixture of 8 x 220-billion-parameter models.

Later that day, Soumith Chintala, co-founder of PyTorch at Meta, reaffirmed the leak.

Just the day before, Mikhail Parakhin, Microsoft Bing AI lead, had also hinted at this.

GPT-4: Not a Monolith

What do all the tweets mean? GPT-4 is not a single large model but a union/ensemble of 8 smaller models sharing the expertise. Each of these models is rumored to have 220 billion parameters.

The methodology is called the mixture of experts (MoE) paradigm (linked below). It is a well-known methodology, sometimes described as a hydra of models. Since it reminds me of Indian mythology, I will go with Ravana.

Please take this with a grain of salt: it is not official news, but several high-ranking members of the AI community have spoken about it or hinted at it. Microsoft is yet to confirm any of this.

What is a Mixture of Experts paradigm?

Now that we have mentioned the mixture of experts, let's dive a little into what it is. The Mixture of Experts is an ensemble learning technique developed specifically for neural networks. It differs a bit from the general ensemble techniques of conventional machine learning modeling (which are the generalized form). So you can consider the Mixture of Experts in LLMs a special case of ensemble methods.

In short, in this method, a task is divided into subtasks, and an expert for each subtask is used to solve it. It is a divide-and-conquer approach, similar to the one used when creating decision trees. One could also consider it as meta-learning on top of the expert models for each separate task.

A smaller and better model can be trained for each sub-task or problem type. A meta-model learns which model is better at predicting a particular task. The meta-learner/model acts as a traffic cop. The sub-tasks may or may not overlap, which means that the outputs can be merged together to come up with the final output.

For the concept descriptions from MoE to Pooling, all credit goes to the great blog by Jason Brownlee (https://machinelearningmastery.com/mixture-of-experts/). If you like what you read below, please subscribe to Jason’s blog and buy a book or two to support his amazing work!

Mixture of experts, MoE or ME for short, is an ensemble learning technique that implements the idea of training experts on subtasks of a predictive modeling problem.

In the neural network community, several researchers have examined the decomposition methodology. […] Mixture–of–Experts (ME) methodology that decomposes the input space, such that each expert examines a different part of the space. […] A gating network is responsible for combining the various experts.

— Page 73, Pattern Classification Using Ensemble Methods, 2010.

There are four elements to the approach:

  • Division of a task into subtasks.
  • Develop an expert for each subtask.
  • Use a gating model to decide which expert to use.
  • Pool predictions and gating model output to make a prediction.

The figure below, taken from Page 94 of the 2012 book “Ensemble Methods,” provides a helpful overview of the architectural elements of the method.

Example of a Mixture of Experts Model with Expert Members and a Gating Network
Taken from: Ensemble Methods

Subtasks

The first step is to divide the predictive modeling problem into subtasks. This often involves using domain knowledge. For example, an image could be divided into separate elements such as background, foreground, objects, colors, lines, and so on.

… ME works in a divide-and-conquer strategy where a complex task is broken up into several simpler and smaller subtasks, and individual learners (called experts) are trained for different subtasks.

— Page 94, Ensemble Methods, 2012.

For those problems where the division of the task into subtasks is not obvious, a simpler and more generic approach could be used. For example, one could imagine an approach that divides the input feature space by groups of columns or separates examples in the feature space based on distance measures, inliers, and outliers for a standard distribution, and much more.

… in ME, a key problem is how to find the natural division of the task and then derive the overall solution from sub-solutions.

— Page 94, Ensemble Methods, 2012.

Expert Models

Next, an expert is designed for each subtask.

The mixture of experts approach was initially developed and explored within the field of artificial neural networks, so traditionally, experts themselves are neural network models used to predict a numerical value in the case of regression or a class label in the case of classification.

It should be clear that we can “plug in” any model for the expert. For example, we can use neural networks to represent both the gating functions and the experts. The result is known as a mixture density network.

— Page 344, Machine Learning: A Probabilistic Perspective, 2012.

Experts each receive the same input pattern (row) and make a prediction.

Gating Model

A model is used to interpret the predictions made by each expert and to aid in deciding which expert to trust for a given input. This is called the gating model, or the gating network, given that it is traditionally a neural network model.

The gating network takes as input the input pattern that was provided to the expert models and outputs the contribution that each expert should have in making a prediction for the input.

… the weights determined by the gating network are dynamically assigned based on the given input, as the MoE effectively learns which portion of the feature space is learned by each ensemble member

— Page 16, Ensemble Machine Learning, 2012.

The gating network is key to the approach: effectively, the model learns to choose the type of subtask for a given input and, in turn, the expert to trust to make a strong prediction.

Mixture-of-experts can also be seen as a classifier selection algorithm, where individual classifiers are trained to become experts in some portion of the feature space.

— Page 16, Ensemble Machine Learning, 2012.

When neural network models are used, the gating network and the experts are trained together such that the gating network learns when to trust each expert to make a prediction. This training procedure was traditionally implemented using expectation maximization (EM). The gating network might have a softmax output that gives a probability-like confidence score for each expert.

In general, the training procedure tries to achieve two goals: for given experts, to find the optimal gating function; for a given gating function, to train the experts on the distribution specified by the gating function.

— Page 95, Ensemble Methods, 2012.
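The gating step described above can be illustrated with a minimal sketch. It assumes a purely linear gating network followed by a softmax; the names `gate` and `gate_weights` and all numbers are hypothetical and are not taken from any of the cited books or from GPT-4.

```python
import math

def softmax(scores):
    """Turn raw gating scores into probability-like weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gate(x, gate_weights):
    """Linear gating network: one score per expert, followed by softmax.

    x            -- the input pattern (list of floats), same one the experts see
    gate_weights -- one weight vector per expert (hypothetical parameters)
    """
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    return softmax(scores)

# Hypothetical gate for three experts over a two-feature input.
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
confidences = gate([2.0, 0.5], gate_weights)
print(confidences)
```

In a trained model the gate weights would be learned jointly with the experts; here they are fixed by hand only to show how an input is turned into one confidence score per expert.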

Pooling Method

Finally, the mixture of expert models must make a prediction, and this is achieved using a pooling or aggregation mechanism. This might be as simple as selecting the expert with the largest output or confidence provided by the gating network.

Alternatively, a weighted sum prediction could be made that explicitly combines the predictions made by each expert and the confidence estimated by the gating network. You might imagine other approaches to making effective use of the predictions and gating network output.

The pooling/combining system may then choose a single classifier with the highest weight, or calculate a weighted sum of the classifier outputs for each class, and pick the class that receives the highest weighted sum.

— Page 16, Ensemble Machine Learning, 2012.
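The two pooling options above (pick the most trusted expert, or take a confidence-weighted sum) can be sketched as follows. The expert outputs and gating confidences are made-up values chosen for illustration, and the function names are hypothetical.

```python
def pool_top1(expert_outputs, confidences):
    """Return the output of the single expert the gate trusts most."""
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    return expert_outputs[best]

def pool_weighted(expert_outputs, confidences):
    """Return the confidence-weighted sum of expert outputs, per class."""
    n_classes = len(expert_outputs[0])
    return [
        sum(c * out[k] for c, out in zip(confidences, expert_outputs))
        for k in range(n_classes)
    ]

# Hypothetical class scores from three experts for a two-class problem.
expert_outputs = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
# Hypothetical gating confidences for the same input (they sum to 1).
confidences = [0.5, 0.3, 0.2]

top1 = pool_top1(expert_outputs, confidences)
weighted = pool_weighted(expert_outputs, confidences)
# Pick the class with the highest weighted sum.
predicted_class = max(range(len(weighted)), key=lambda k: weighted[k])
print(top1, weighted, predicted_class)
```

Top-1 selection only needs the chosen expert's output, while the weighted sum uses every expert's output for the input.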

Switch Routing

We should also briefly discuss how the switch routing approach differs from the MoE paper. I am bringing it up because it seems that Microsoft has used switch routing rather than a mixture of experts to save some computational complexity, but I am happy to be proven wrong. When there is more than one expert model, the routing function (which model to use when) may have a non-trivial gradient. This decision boundary is controlled by the switch layer.

The benefits of the switch layer are threefold.

  1. Routing computation is reduced if the token is being routed only to a single expert model
  2. The batch size (expert capacity) can be at least halved since a single token goes to a single model
  3. The routing implementation is simplified and communications are reduced.

The capacity factor scales how many tokens each expert can accept beyond an even split of the batch. The following is a conceptual depiction of how routing with different expert capacity factors works.

Illustration of token routing dynamics. Each expert processes a fixed batch-size of tokens modulated by the capacity factor. Each token is routed to the expert with the highest router probability, but each expert has a fixed batch size of (total tokens/num experts) × capacity factor. If the tokens are unevenly dispatched, then certain experts will overflow (denoted by dotted red lines), resulting in these tokens not being processed by this layer. A larger capacity factor alleviates this overflow issue but also increases computation and communication costs (depicted by padded white/empty slots). (source https://arxiv.org/pdf/2101.03961.pdf)
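The capacity rule in the caption can be made concrete with a small sketch. It assumes top-1 routing, tokens handled in arrival order, and capacity rounded down to an integer; the function names and router probabilities are hypothetical and this is not the Switch Transformer implementation.

```python
def expert_capacity(total_tokens, num_experts, capacity_factor):
    """(total tokens / num experts) x capacity factor, rounded down (assumption)."""
    return int(total_tokens / num_experts * capacity_factor)

def route_top1(router_probs, num_experts, capacity_factor):
    """Send each token to its highest-probability expert until that expert is full.

    Returns (assigned, dropped): token indices per expert, and the tokens that
    overflowed and are therefore not processed by this layer.
    """
    capacity = expert_capacity(len(router_probs), num_experts, capacity_factor)
    assigned = {e: [] for e in range(num_experts)}
    dropped = []
    for token, probs in enumerate(router_probs):
        best = max(range(num_experts), key=lambda e: probs[e])
        if len(assigned[best]) < capacity:
            assigned[best].append(token)
        else:
            dropped.append(token)  # expert overflow
    return assigned, dropped

# Hypothetical router output: 6 tokens, 3 experts, unevenly dispatched
# (tokens 0-3 all prefer expert 0).
router_probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
]

assigned_tight, dropped_tight = route_top1(router_probs, 3, capacity_factor=1.0)
assigned_loose, dropped_loose = route_top1(router_probs, 3, capacity_factor=2.0)
print(dropped_tight, dropped_loose)
```

With a capacity factor of 1.0 each expert holds two tokens, so two of the four tokens bound for expert 0 overflow; at 2.0 nothing is dropped, but experts 1 and 2 carry mostly empty slots.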

Findings from the MoE and Switch papers suggest that, compared with MoE:

  1. Switch transformers outperform carefully tuned dense models and MoE transformers on a speed-quality basis.
  2. Switch transformers have a smaller compute footprint than MoE.
  3. Switch transformers perform better at lower capacity factors (1–1.25).

Concluding thoughts

Two caveats: first, this is all coming from hearsay, and second, my understanding of these concepts is fairly feeble, so I urge readers to take it with a boulder of salt.

But what did Microsoft achieve by keeping this architecture hidden? Well, they created buzz and suspense around it, which might have helped them craft their narrative better. They kept the innovation to themselves and prevented others from catching up sooner. The whole idea was likely the usual Microsoft game plan of thwarting competition while investing $10B into a company.

GPT-4's performance is great, but it was not an innovative or breakthrough design. It was an amazingly clever implementation of methods developed by engineers and researchers, topped off by an enterprise/capitalist deployment. OpenAI has neither denied nor confirmed these claims (https://thealgorithmicbridge.substack.com/p/gpt-4s-secret-has-been-revealed), which makes me think that this architecture for GPT-4 is more than likely the reality (which is great!). The secrecy is just not cool, though! We all want to know and learn.

A huge credit goes to Alberto Romero for bringing this news to the surface and investigating it further by reaching out to OpenAI (who did not respond as per the last update). I saw his article on LinkedIn, but the same has been published on Medium too.

Dr. Mandar Karhade, MD, PhD, Sr. Director of Advanced Analytics and Data Strategy @Avalere Health. Mandar is an experienced Physician Scientist who has worked on cutting-edge implementations of AI in the Life Sciences and Health Care industry for 10+ years. Mandar is also part of AFDO/RAPS, helping to regulate implementations of AI in healthcare.

Original. Reposted with permission.

GenAI to Scale the Peak in 2 Years

With signs that there’s plenty of cash to go around for generative AI startups, Gartner placed genAI on the top of the ‘Peak of Inflated Expectations’ for the first time. If you think this technology is overhyped right now, wait for another 2-5 years, as per the report.

Startlingly, ChatGPT, the most well known example of genAI, already crossed the threshold of a product life cycle within the first three months of its release as per previous reports. The rapid adoption rate can be attributed to the media spotlight that further amplified the service expectations.

Apart from being flooded with investments, the technology is also being looked upon by tech giants as a driving force to revamp the industrial landscape. Leading software corporation IBM is one of them, as it paused hiring for 7,800 jobs due to AI. The company’s CEO Arvind Krishna believes AI’s ability to increase worker productivity is the solution to the talent crunch problem faced by many. Like others, the company also jumped headfirst into genAI with its in-house Watson assistant.

Another factor that will help the technology reach its peak is the sudden outburst of open source models. From Meta’s Llama 2 to Databricks’ Dolly, iterations of open and free language models have been making headlines daily in the recent past. Alongside, Hugging Face, the messiah of open source, is hosting an extensive number of diffusion as well as language models. With quality resources from every corner, contributions from the FOSS community are laying the foundation for the upcoming peak of genAI.

ChatGPT is not everything

Generative AI is not a singular technology, stated Gartner analyst Arun Chandrasekaran in an interview. It includes everything from foundation and diffusion models to prompt engineering tools. “They all enable this trend of generative AI,” he elaborated.

Released two days ago, a report highlighted that every piece of visual art humanity has created over the last century and a half has been outnumbered by AI generated art just within a brief span of 1.5 years. The major chunk was produced by open-source diffusion models — developed last year by Stability AI.

For a transformer-based technology like GPT-n, the ‘n’ number of identified use cases and the billions of dollars channelled into it are the reason for its fame. While previous generations of the software could technically do these things, the quality of the outputs was much lower than that produced by an average human. A rarely mentioned advantage of AI is that it is remarkably good at things that would take humans decades to do, like processing an entire literary canon or learning an artistic movement.

According to Chandrasekaran, the main reason AI has hit the peak of the hype cycle is the sheer number of products claiming to have generative AI baked into them. Startups started flocking to the generative AI boat after OpenAI introduced GPT-4’s capability via ChatGPT and the API. Even Sam Altman, the CEO of OpenAI, has said in the past that the technology is becoming ‘wildly overhyped’.

Technology’s ever-evolving landscape is a tale as old as time. In the early 1990s, there was the advent of the Internet and the opportunities it presented for enterprising startups to revolutionise industries ranging from e-commerce to software downloading.

As the landscape shifts once again with generative AI, startups are scrambling to find ways to leverage these cutting-edge technologies to gain a competitive edge. Even veterans like Google and Microsoft have redoubled their efforts with red teams to make the most of the technology.

Embryonic stage

While some have declared the technology an inflating bubble, Gartner holds that it is still in a nascent stage. In line with the Gartner study, analysts predict that returns on investment in the technology will arrive after 2 years. The hard part for a technology that produces quality content is turning the value of time and productivity into an ROI measurement.

According to experts, further proof-of-concept metrics include scalability, ease of use, quality of response, accuracy of response, explainability and total cost of ownership.

McKinsey estimates that the technology could add between $2.6 trillion and $4.4 trillion of economic value annually across various industries. The study titled ‘The economic potential of generative AI: The next productivity frontier’ draws results from 63 new use cases analysed across 16 business functions that could deliver those returns. These estimated returns are comparable to the UK’s 2021 GDP of $3.1 trillion.

While the ROI egg remains unhatched, companies should be focusing on tailoring and putting the technology to use as per their respective products. Trying to jump on the bandwagon for the sake of it can result in an actual bubble resulting in a market crash if not thought through.

The post GenAI to Scale the Peak in 2 Years appeared first on Analytics India Magazine.

Hugging Face raises $235M from investors including Salesforce and Nvidia

Kyle Wiggers

As first reported by The Information, then seemingly verified by Salesforce CEO Marc Benioff on X (formerly known as Twitter), AI startup Hugging Face has raised $235 million in a Series D funding round.

The tranche, which had participation from Google, Amazon, Nvidia, Intel, AMD, Qualcomm, IBM, Salesforce and Sound Ventures, values Hugging Face at $4.5 billion. That’s double the startup’s valuation from May 2022 and reportedly more than 100 times Hugging Face’s annualized revenue, reflecting the enormous appetite for AI and platforms to support its development.

Hugging Face has raised a total of $395.2 million to date, placing it among the better-funded AI startups in the space. Those ahead of it are OpenAI ($11.3 billion), Anthropic ($1.6 billion), Inflection AI ($1.5 billion), Cohere ($435 million) and Adept ($415 million).

“AI is the new way of building all software. It’s the most important paradigm shift of the decade and, compared to the software shift, it’s going to be bigger because of new capabilities and faster because software paved the way,” co-founder and CEO Clément Delangue told TechCrunch via email. “Hugging Face intends to be the open platform that empowers this paradigm shift.”

Delangue, a French entrepreneur, launched Brooklyn-based Hugging Face in 2016 alongside Julien Chaumond and Thomas Wolf. The trio originally built a chatbot app targeted at teenagers. But after open sourcing the algorithm behind the app, Hugging Face pivoted to focus on creating a platform for creating, testing and deploying machine learning.

Today, Hugging Face offers a number of data science hosting and development tools, including a GitHub-like hub for AI code repositories, models and data sets as well as web apps to demo AI-powered applications. Hugging Face also provides libraries for tasks like data set processing and evaluating models in addition to an enterprise version of the hub that supports software-as-a-service and on-premises deployments.

Hugging Face’s paid functionality includes AutoTrain, which helps to automate the task of training AI models; Inference API, which allows developers to host models without managing the underlying infrastructure; and Infinity, which is designed to increase the speed with which an in-production model processes data.

Hugging Face has 10,000 customers today, it claims, and more than 50,000 organizations on the platform. And its model hub hosts over 1 million repositories.

Contributing to the growth is the strong, sustained interest in AI from the enterprise. According to a HubSpot poll, 43% of business leaders say that they plan to increase their investment in AI and automation tools over the course of 2023, while 31% say AI and automation tools are very important to their overall business strategy.

Much of what Hugging Face delivers falls into MLOps, a category of tools for streamlining the process of taking AI models to production and then maintaining and monitoring them. The MLOps market is substantial in its own right, with one report estimating that it’ll reach $16.61 billion by 2030.

But Hugging Face dabbles in other areas, too.

In 2021, Hugging Face launched BigScience, a volunteer-led project to produce an open source language model as powerful as OpenAI’s GPT-3, but free and open for anyone to use. It culminated in Bloom, a multilingual model that for more than a year has been available to tinker with on Hugging Face’s model hub.

Bloom is but one of several open source models to which Hugging Face has contributed development resources.

Hugging Face collaborated with ServiceNow, the enterprise software company, to release a free code-generating AI model called StarCoder (a follow-up model, SafeCoder, debuted this week). And the startup made available its own free version of ChatGPT, OpenAI’s viral AI-powered chatbot, in partnership with the German nonprofit LAION.

Hugging Face’s team-ups extend to major cloud providers, some of which are strategic investors.

Hugging Face recently worked with Nvidia to expand access to cloud compute via Nvidia’s DGX computing platform. It has a partnership with Amazon to extend its products to AWS customers and leverage Amazon’s custom Trainium chips to train the next generation of Bloom. And Hugging Face collaborated with Microsoft on Hugging Face Endpoints on Azure, a way to turn Hugging Face-developed AI models into scalable production solutions hosted through Azure.

With this latest investment, Delangue says that Hugging Face plans to “double down” on its supportive efforts in many domains, including research, enterprise and startups. It has 170 employees, but plans on recruiting new talent over the coming months.

Shiv Nadar Institution of Eminence, Delhi-NCR launches Analytics Olympiad 3.0 for Data Science Professionals

The Academy of Continuing Education at Shiv Nadar Institution of Eminence, in collaboration with MachineHack, is thrilled to announce the third edition of its hugely successful Analytics Olympiad. Lasting a whole month from August 24 to September 24, 2023, this annual event provides a much-desired platform for data scientists and machine learning professionals to showcase their predictive analytics prowess and win up to INR 1 lakh.

Analytics Olympiad 3.0 is based on the theme — Predicting Customer Loan Default — and will provide valuable insights for the banking, finance, insurance (BFSI) and fintech industry.

Problem Statement

In today’s financial landscape, an accurate prediction of customer loan default can reduce the risk and maintain a healthy lending portfolio. The challenge requires the participants to develop machine learning models to determine the likelihood of a customer defaulting on a loan.

Contestants will be provided with credit history, payment behaviour, and account details to build the model. This challenge mirrors a huge real-world problem faced by financial institutions, thus making it extremely relevant.

Start Date: 24th August 2023

End Date: 24th September 2023

Details about the competition

The Analytics Olympiad 2023 will be held in three distinct phases, each designed to bring out the best in participants.

Phase #1: Register and compete: Online

Contestants will kick-start their journey by solving the problem statement on MachineHack, engaging in intense problem-solving on the platform.

Phase #2: Achieve Top 50

Upon concluding Phase #1, the spotlight turns to the top 50 leaderboard contestants. These frontrunners will be invited to share presentations outlining their problem statement analysis, approach, and the impact of their solutions.

Phase #3: Grand Finale: In Person on Campus

The competition reaches its climax with the top 10 shortlisted contestants. The Academy of Continuing Education at Shiv Nadar IoE will host the top 10 finalists for the jury round and bear their complete airfare and travel expenses. Here, the finalists will undergo evaluation by a distinguished panel of experts during the jury rounds to be held in person at the beautiful and biodiverse campus of Shiv Nadar IoE Delhi NCR.

Simultaneously, leaders’ panel discussions will stimulate insightful dialogues about the intersection of analytics and business. The awaited moment will arrive on November 17 and 18, with the announcement of the top three winners, followed by prize distribution at Shiv Nadar University.

Evaluation Process

  • All submissions will be evaluated using the accuracy metric. One can use sklearn.metrics.accuracy_score to get a valid score
  • This competition supports private and public leaderboards
  • The public leaderboard is evaluated on 30% of Test data
  • The private leaderboard will be made available at the end of Phase 1 of the competition, and will be evaluated on 100% of Test data
  • The Final Score represents the score achieved based on the Best Score on the public leaderboard
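As a rough illustration of the accuracy metric named above, the sketch below computes the fraction of predictions that exactly match the true labels, which is what scikit-learn's `accuracy_score` returns with its default settings. The labels are invented (1 = default, 0 = no default) and the helper name `accuracy` is hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical loan-default labels and model predictions for ten customers.
y_true = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [0, 1, 0, 1, 1, 0, 0, 0, 0, 0]

score = accuracy(y_true, y_pred)
print(score)
```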

Finalists will be judged as per the following criteria, in the final jury round:

30% Business Outcome/Impact

20% Innovation + Creativity

20% Algorithm and ML approach

15% Statistical analysis

15% Presentation + Communication
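
As a rough illustration of how these weights combine, the sketch below computes a weighted total. The 0 to 100 marking scale, the dictionary keys, and the jury_score helper are hypothetical; the article only gives the percentages.

```python
# Weights taken from the criteria listed above; key names are illustrative.
JURY_WEIGHTS = {
    "business_impact": 0.30,
    "innovation_creativity": 0.20,
    "algorithm_ml_approach": 0.20,
    "statistical_analysis": 0.15,
    "presentation_communication": 0.15,
}


def jury_score(marks):
    """Weighted total of per-criterion marks (assumed to be on a 0-100 scale)."""
    missing = set(JURY_WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    return sum(weight * marks[name] for name, weight in JURY_WEIGHTS.items())
```

For example, hypothetical marks of 80, 70, 90, 60 and 100 (in the order listed) give 24 + 14 + 18 + 9 + 15 = 80.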

Start Date: 24th August 2023

End Date: 24th September 2023

Click here to register for the Analytics Olympiad

Recognition and Prizes that await the Champions

The winners of Analytics Olympiad 2023 will be recognized and rewarded with prizes as follows:

  • Winner: INR 1 Lakh
  • 1st Runner-Up: INR 30,000
  • 2nd Runner-Up: INR 20,000

Note: TDS will be deducted as per the government norms.

Disclaimers

  • MachineHack, Analytics India Magazine and Shiv Nadar Institution of Eminence reserve the right to disqualify a participant if any of the details entered by them are found to be incorrect.
  • Any external dataset usage is strictly prohibited. The participants will be disqualified if found using any external dataset.
  • MachineHack, Analytics India Magazine and Shiv Nadar Institution of Eminence reserve the right to modify, edit and change the rules of the Analytics Olympiad at any stage.

Eligibility Criteria

  • The participant must be a citizen of India.
  • The participant should be above 18 years of age.
  • The participant must hold a valid government ID, such as a PAN card or Aadhaar Card.
  • The participant needs to have a valid bank account based in India.
  • If shortlisted for the final jury round, the participant should be able to travel to Shiv Nadar Institution of Eminence, Delhi NCR.
  • Previous years’ finalists of jury rounds are not permitted to participate.

About Shiv Nadar Institution of Eminence, Delhi-NCR

Shiv Nadar Institution of Eminence is a student-centric, multidisciplinary and research-focused university offering a wide range of academic programs at the Undergraduate, Master and Doctoral levels. The University was set up in 2011 by the Shiv Nadar Foundation, a philanthropic foundation established by Mr. Shiv Nadar, founder of HCL. In the NIRF (Government’s National Institutional Ranking Framework), the University has been the youngest institution in the ‘top 100’ overall list.

The university’s Academy of Continuing Education aims to facilitate best-in-class knowledge, practices and skill development offerings to the growing ecosystem of lifetime learners and leaders, both within and outside the university. With distinguished academics as the university’s faculty members and programme instructors, the Academy of Continuing Education offers uniquely crafted programmes that are delivered innovatively, bringing together the best of the university’s rich intellectual resources.

The university aims to help students prepare for today as well as their future through its unique certification programme in data sciences and business analytics. The collaboration between the Academy of Continuing Education at Shiv Nadar Institution of Eminence and MachineHack hopes to strengthen the data science community in India and pave the way for innovation in business analytics.

The post Shiv Nadar Institution of Eminence, Delhi-NCR launches Analytics Olympiad 3.0 for Data Science Professionals appeared first on Analytics India Magazine.

How Cloud Computing Enhances Data Science Workflows

If data is the world’s most valuable resource, then data science is its most impactful process. As more organizations realize they need data science to retain a competitive edge, the practice becomes increasingly crucial across industries. That rapid growth is largely beneficial, but it can introduce some challenges.

Data volumes and processing demands are skyrocketing faster than conventional workflows can keep up. Data science teams need better ways to manage these rising needs, and cloud computing offers an ideal solution. Here are five reasons why.

1. Reducing Costs

Cloud computing’s cost efficiency is among its greatest strengths. Implementing and maintaining on-premise servers can be highly expensive upfront and requires significant ongoing labor and IT costs. Storing and processing data on the cloud eliminates many of those expenses.

In the cloud model, you don’t have to buy or maintain your own equipment. Considering how much processing power modern data science can require, that can represent massive savings. You also pay only for the resources you use, so any rising costs you incur as you grow reflect actual data volume growth with no surplus.

2. Streamlining Workflows

The cloud can also streamline data science workflows. Software-as-a-service (SaaS) solutions give you access to computing speeds and capacity you may be unable to afford otherwise. Consequently, you can run more complex calculations with fewer processing delays.

Cloud systems also consolidate once-separate databases and workloads. That consolidation eliminates wasted time from switching between apps and reduces the risk of data entry and transfer errors. Bad data can significantly hinder operational efficiency, so this reliability further improves productivity.

3. Boosting Security

Despite common misgivings about cloud cybersecurity, cloud computing has several security advantages. The vast majority of cloud breaches come from human error, not technical shortcomings in the cloud itself. Moreover, the SaaS model can make high security more accessible.

Cloud providers often have advanced security features data scientists may be unable to afford or implement in-house. That could include autonomous monitoring, automated compliance, and extensive encrypted backups. Segmenting networks is also easier in the cloud, making zero-trust and similar security architectures more accessible.

4. Expanding Data Capacity

Using the cloud also lets you store and process more data than you may be able to with an on-prem solution. Data science applications are often most effective when you have more information, but managing large data volumes on in-house systems can quickly become expensive and inefficient.

Global data volumes are on track to exceed 180 zettabytes by 2025. That can make data science more reliable than ever, but only if you have the storage capacity and computing strength to support it. The cloud makes that level of storage and analysis possible when it’d be prohibitively expensive in-house.

5. Improving Scalability

Similarly, the cloud is far more scalable than conventional data science workflows. Expanding your capacity the traditional way means buying and setting up additional servers, which is expensive and can disrupt current workflows. With the cloud, all you have to do is pay a higher rate for more capacity, and you’ll gain it immediately.

That fast scalability is critical given digital data’s current growth rate. However, if you need to downsize your operations, downscaling in the cloud is still more cost-effective than conventional means. As your capacity falls, so will your rates, ensuring downsizing doesn’t leave you with wasted hardware you’re not using.

Modern Data Science Needs Cloud Computing

Data science workflows today must be fast, reliable, secure, and able to manage considerable workloads. As those demands rise, conventional, on-premise setups quickly become insufficient.

Cloud computing offers the affordability, efficiency, security, capacity, and scalability data science teams need. Capitalizing on this opportunity will help you maximize returns from your data science applications.

April Miller is managing editor of consumer technology at ReHack Magazine. She has a track record of creating quality content that drives traffic to the publications she works with.

Generative AI is Just Fluff Talk for Indian IT

After launching the Generative AI Studio under its amplifAI0->∞ suite of AI offerings and solutions, Tech Mahindra chief executive and managing director CP Gurnani recently took to X to share that the company is the first major Indian IT firm to work on its own proprietary large language model, called Project Indus.

The open-source large language model aims to speak over 40 Indic languages in the first phase, including Kinnauri, Kangri, Chambeli, Garhwali, Kumaoni, Jaunsari and more.

The “civilisational” initiative will be carried out by the Makers Lab of Tech Mahindra to develop India’s foundational model for various Indian languages, starting with Hindi. The project collects Hindi dialect speech data to train a language model using NLP algorithms. Contributors can anonymously submit short to extended speech samples with the option to delete recorded data.

According to the official website, mobile numbers are optionally collected for reference and gamification purposes, with encryption and a retention period of up to seven years. No personal information will be shared with third parties.

However, it is not yet clear whether Makers Lab will build the model from scratch or base it on an existing open-source LLM such as Llama 2, the way Stanford’s Alpaca and Vicuna-13B were built on Meta’s LLaMA.

Unpacking the Gen AI Vows of Indian IT

When it comes to generative AI, the other Indian IT giants have also invested substantial funds. However, their enthusiasm for channelling the full potential of generative AI has yet to produce any real use cases.

The Indian IT giants are forming partnerships to advance generative AI adoption. In an earlier interaction with AIM, Wipro CTO Subha Tatavarti said that the company has been working around generative AI for the past two years. Wipro teamed up with IIT Delhi to establish a Center of Excellence (CoE) as part of their USD 1 billion AI-driven innovation initiative in the Wipro ai360 ecosystem. They aim to combine Google Cloud’s generative AI with Wipro’s AI IP, business accelerators, and industry solutions.

Meanwhile, HCL partnered with Microsoft and AWS to enhance their generative AI efforts, while TCS is collaborating with Google Cloud to utilise foundational models like Vertex AI and generative AI Application Builder. Infosys follows an “AI First” strategy, focusing on specialised AI models from open-source LLMs, using them to accelerate clients’ AI initiatives through their Topaz framework, which encompasses generative AI-based services, solutions, and platforms.

No Moat, Only Fluff Talk

This is not the first time that we have seen IT companies jumping onto the bandwagon of something new, something that is trending in the tech ecosystem. When Meta introduced the metaverse with much fanfare, we saw a similar reaction.

TCS started the trend, followed by Tech Mahindra and Infosys, all announcing their metaverse-related products and services in February 2022. As part of its metaverse programme, TCS’ “themaTiCS” suite targets improving remote work experiences where only 25% of employees are in the office at any time. Following the lead, Infosys introduced the Metaverse Foundry which, like Topaz, offers ready-to-use templates and is said to have found 100 use cases for enterprises to embrace metaverse offerings spanning XR consulting, blockchain consulting, digital twins and more. Tech Mahindra also introduced TechMVerse, leveraging its 5G capabilities to deliver immersive experiences.

However, when Meta pulled the plug on its metaverse dream, Indian IT companies seemed to lose interest too.

This time, Indic languages are riding the generative AI wave. Building indigenous LLMs in India is a huge growth opportunity, considering that the country is home to 122 major languages and 1,599 other languages, along with 22 official languages, as per Census 2001.

At present, 58.8% of the content is in English, followed by Russian, Spanish and French. Forget about native languages like Garhwali or Kumaoni; even Hindi does not make it to the top ten, highlighting a significant shortage of local language content.

Project Bhashini was introduced in collaboration with Microsoft. Finance Minister Nirmala Sitharaman also introduced the National Language Translation Mission (NLTM) in the 2021-22 budget.

Project Indus is an important initiative in this direction. Gurnani urged people from different vocabularies to contribute to making this project successful, as an LLM is only as good as the data it is trained on. Tech Mahindra is also the only company acting on its promises.

Choosing AI Upskilling Over Model Building

Rather than building its own LLMs, Indian IT is taking a different approach, focusing on upskilling its employees. This can be attributed to the fact that, historically, technology adoption has increased work volume, requiring more expertise and hands.

The heads of these companies, including K Krithivasan (TCS), Salil Parekh (Infosys) and C Vijayakumar (HCL), have highlighted generative AI as the focal point of the first quarter this year, with clients exploring its potential for enhancing productivity, content creation, and customer service.

However, while generative AI is seeing a strong interest, clients cutting back on IT spending is a concern for HCL and Infosys, impacting revenue growth forecasts, as per a report by The Register. Despite the short-term hype, executives believe AI will bring meaningful long-term benefits, although measuring its effectiveness remains challenging.

Tech Mahindra, Infosys, TCS, HCL and Wipro have expanded their partnerships with major tech players like AWS, Google and Microsoft to upskill employees in generative AI. Traditionally, India’s huge talent pool has been seen as an advantage for adopting technology quickly rather than building it in-house. And now, with the upskilling, the process will only get faster and smoother.

The post Generative AI is Just Fluff Talk for Indian IT appeared first on Analytics India Magazine.

Meta’s SeamlessM4T Takes on OpenAI Whisper and Google AudioPaLM

Meta might have just upped its multimodal and multilingual offering with the latest release of SeamlessM4T – Massively Multilingual & Multimodal Machine Translation model.

Introducing SeamlessM4T, the first all-in-one, multilingual multimodal translation model.
This single model can perform tasks across speech-to-text, speech-to-speech, text-to-text translation & speech recognition for up to 100 languages depending on the task.

— Meta AI (@MetaAI) August 22, 2023

SeamlessM4T is a foundational speech/text translation and transcription model, and an all-in-one system that performs multiple tasks such as speech-to-speech, speech-to-text, text-to-text translation, and speech recognition. The model facilitates input and output in 100 languages, and speech output in 35 languages (including English). However, what does it offer that sets it apart from existing translator models?

Meta’s SeamlessM4T vs OpenAI Whisper vs Google AudioPaLM

With speech-to-text translation models from tech companies already prevalent in the market, Meta seems to be pushing to carve out a niche for itself. OpenAI and Google have developed their own speech-to-text models, namely Whisper and AudioPaLM, respectively. Whisper, an open-sourced multilingual speech recognition model, can translate and transcribe speech from over 97 diverse languages, and has been trained on over 680,000 hours of multilingual data. Google’s AudioPaLM, by contrast, is a multimodal language model built on the capabilities of PaLM-2 and its generative audio model AudioLM.

When evaluated against other Speech-to-Text (S2T) and Speech-to-Speech translation (S2ST) models via ASR BLEU (Automatic Speech Recognition Bilingual Evaluation Understudy), a metric for evaluating the quality of machine-generated translations, SeamlessM4T scores better than the others.

Source: Meta AI Blog
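
To make the metric less abstract, here is a minimal sketch of sentence-level BLEU: clipped n-gram precision for n = 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It is a simplification, not the evaluation code behind the comparison above: it uses a single reference, whitespace tokenisation and no smoothing, and the function names are illustrative. For ASR BLEU, the generated speech would first be transcribed by a speech recognition system and the transcript scored this way.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU with uniform n-gram weights."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty lowers the score of candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 1.0, and one sharing no words scores 0.0. Published results are normally computed at corpus level with standard tooling, so scores from this sketch are not comparable to reported numbers.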

Companies are now actively seeking to incorporate multi-language translation as the next major development. Addressing the diverse vernacular markets worldwide, language translation and transcription are emerging as essential product offerings.

Indian IT giant Tech Mahindra is working on an indigenous Large Language Model (LLM) that would have the ability to speak in a number of Indic languages with different dialects, most notably Hindi. With Project Indus the model will have the ability to speak in 40 different Indic languages, to begin with. More languages that have originated in the country will also be added subsequently.

Recently, Eleven Labs, a research lab that explores frontiers of Voice AI, introduced Eleven Multilingual v2, an AI speech model that supports 28 languages with enhanced conversational capability and higher output quality.

Not For Everyone

Meta’s SeamlessM4T is publicly available under the CC BY-NC 4.0 licence, a non-commercial licence which means that people can remix, adapt and build on it but cannot use it for commercial purposes. Users have criticised the choice, arguing that Meta is limiting adoption and deviating from the conventional practice of releasing under an Apache licence. A user on Hacker News argued that restricting others from engaging with models, providing enhancements, offering input and developing an ecosystem, while only benefiting oneself, doesn’t align with good community conduct.

A few have also described holding such models back from open sourcing as a move against competitors; however, given Meta’s release of Llama 2, that concern seems unwarranted. A user even mentioned that there’s ‘nothing particularly special about the weights or training data.’

Chasing ‘Multimodal’

‘Multimodality’ is a coveted feature that big tech companies are chasing; however, not all deliver on their promises. OpenAI’s GPT-4, which was released earlier this year as a multimodal platform, was said to allow inputs via images, voice and text, but it has not been able to deliver on all of them. Image integration is still not available to users, and voice inputs can be given only via the ChatGPT app.

Meta has released other multimodal models in the past. Last month, the company released CM3leon, which handles both text-to-image and image-to-text generation; however, the model code was not released to the public. Categorised as a multimodal offering, SeamlessM4T handles both text and speech, thereby living up to the tag. However, given the non-commercial licence that comes with the product, it remains to be seen how much adoption it achieves.

The post Meta’s SeamlessM4T Takes on OpenAI Whisper and Google AudioPaLM appeared first on Analytics India Magazine.

Unlock the Full Potential of AI With 7 ChatGPT Courses for $29.99

Heard about ChatGPT? Of course you have. This AI platform has taken the world by storm since it was launched publicly last fall. If you want to understand the technology and deploy it successfully, the 2023 ChatGPT for Business Mastery Bundle is a great starting point.

Featuring seven in-depth courses, this learning library can save hundreds of hours of trial and error. The included content is worth $133, but you can pick up the bundle today for only $29.99 at TechRepublic Academy.

Businesses that properly harness AI technology are already gaining a head start on the competition.

This is in part because relatively few people know how to use it properly. While ChatGPT is fairly user-friendly, you still need a good understanding of the app to automate tasks effectively and extract useful outputs.

You could spend weeks putting in prompts, hoping for a good response. Or you could simply grab the 2023 ChatGPT for Business Mastery Bundle and become a genuine expert in around 19 hours.

The bundle covers several important topics. Through hands-on video lessons, you discover how to build an entire sales process using ChatGPT — from welcome emails to pixel-perfect presentations.

Other courses look at content creation for marketing, IT service automation and self-publishing on Amazon. You can even learn how to use ChatGPT to generate passive income.

All of the included courses come from top-rated instructors. For instance, LinCademy has helped over 100,000 students over the past three years, earning a rating of 4.5 out of 5 stars on Udemy.

Order today for $29.99 to get lifetime on-demand access to all seven courses, normally priced at $133.

Prices and availability are subject to change.
