Amazon Music follows Spotify with an AI playlist generator of its own, Maestro

Sarah Perez @sarahintampa / 8 hours

Spotify isn’t the only one to dabble with AI playlists — on Tuesday, Amazon announced it would do the same. Amazon Music is now testing Maestro, an AI playlist generator, allowing U.S. customers on both iOS and Android to create playlists using spoken or written prompts, which can even contain emojis.

Amazon suggests that in addition to emojis, customers can write prompts that include activities, sounds, or emotions. They can also choose from prompt suggestions at the bottom of the screen if they don’t know what to write. Seconds later, an AI-generated playlist will appear with songs that — in theory — will match your input.

The product is launching in beta, so Amazon warns that the technology behind Maestro “won’t always get it right the first time.” Like Spotify, it’s also added some guardrails to the experience to proactively block offensive language and other inappropriate prompts, it says. (We’re guessing people will try to break through those barriers in time!)


Maestro is not yet broadly available. While Spotify’s AI generator is starting its tests in the U.K. and Australia, Amazon’s product is launching to a “subset” of free Amazon Music users, as well as Prime customers and Amazon Music Unlimited subscribers, on iOS and Android in the U.S. for the time being.

Unlimited subscribers will gain access to more functionality, however. They’ll be able to listen to playlists instantly and save them for later, while Prime members and ad-supported users will only be able to listen to 30-second previews of the songs before saving them. This could push more users to upgrade to the paid subscription if they like the AI functionality, and the move follows the general trend of making premium AI experiences a paid offering.


To access Maestro, users will need the latest version of the Amazon Music mobile app and will tap on the option for Maestro on their home screen. They may also see the option when they tap on the plus sign to create a new playlist. From there, users can either talk or write out their playlist prompt idea, then tap “Let’s go!” to start streaming it. The playlist can also be saved and shared with friends.

Amazon suggests playlists like “😭 and eating 🍝,” “Make my 👶 a genius,” “Myspace era hip-hop,” “🏜️🌵🤠,” “Music my grandparents made out to,” “🎤🚿🧼,” and “I tracked my friends and they’re all hanging out without me,” to give you an idea of how silly the prompts can be for this new experience.

The company didn’t say when the beta would roll out more broadly, only that it would expand to more customers over time.

Inside DBRX: Databricks Unleashes Powerful Open Source LLM

DBRX: A New State-of-the-Art Open LLM

In the rapidly advancing field of large language models (LLMs), a new powerful model has emerged – DBRX, an open source model created by Databricks. This LLM is making waves with its state-of-the-art performance across a wide range of benchmarks, even rivaling the capabilities of industry giants like OpenAI's GPT-4.

DBRX represents a significant milestone in the democratization of artificial intelligence, providing researchers, developers, and enterprises with open access to a top-tier language model. But what exactly is DBRX, and what makes it so special? In this technical deep dive, we'll explore the innovative architecture, training process, and key capabilities that have propelled DBRX to the forefront of the open LLM landscape.

The Birth of DBRX

The creation of DBRX was driven by Databricks' mission to make data intelligence accessible to all enterprises. As a leader in data analytics platforms, Databricks recognized the immense potential of LLMs and set out to develop a model that could match or even surpass the performance of proprietary offerings.

After months of intensive research, development, and a multi-million dollar investment, the Databricks team achieved a breakthrough with DBRX. The model's impressive performance on a wide range of benchmarks, including language understanding, programming, and mathematics, firmly established it as a new state-of-the-art in open LLMs.

Innovative Architecture

The Power of Mixture-of-Experts

At the core of DBRX's exceptional performance lies its innovative mixture-of-experts (MoE) architecture. This cutting-edge design represents a departure from traditional dense models, adopting a sparse approach that enhances both pretraining efficiency and inference speed.

In the MoE framework, only a select group of components, called “experts,” are activated for each input. This specialization allows the model to tackle a broader array of tasks with greater adeptness, while also optimizing computational resources.

DBRX takes this concept even further with its fine-grained MoE architecture. Unlike some other MoE models that use a smaller number of larger experts, DBRX employs 16 experts, with four experts active for any given input. This design provides a staggering 65 times more possible expert combinations, directly contributing to DBRX's superior performance.
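
As a quick sanity check on that combinatorics claim, the short Python snippet below (purely illustrative, not part of the DBRX release) counts the possible expert subsets for a 4-of-16 configuration versus the coarser 2-of-8 configuration the comparison implies.

```python
# Quick check of the "65 times more expert combinations" figure: DBRX routes
# each token to 4 of its 16 experts, versus a coarser MoE picking 2 of 8.
from math import comb

fine_grained = comb(16, 4)   # 1,820 possible expert subsets per token
coarse = comb(8, 2)          # 28 possible expert subsets per token
print(fine_grained, coarse, fine_grained // coarse)  # 1820 28 65
```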

DBRX differentiates itself with several innovative features:

  • Rotary Position Encodings (RoPE): Enhances understanding of token positions, crucial for generating contextually accurate text.
  • Gated Linear Units (GLU): Introduces a gating mechanism that enhances the model's ability to learn complex patterns more efficiently.
  • Grouped Query Attention (GQA): Improves the model's efficiency by optimizing the attention mechanism.
  • Advanced Tokenization: Utilizes GPT-4's tokenizer to process inputs more effectively.

The MoE architecture is particularly well-suited for large-scale language models, as it allows for more efficient scaling and better utilization of computational resources. By distributing the learning process across multiple specialized subnetworks, DBRX can effectively allocate data and computational power for each task, ensuring both high-quality output and optimal efficiency.
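
To make the routing idea concrete, here is a minimal PyTorch sketch of sparse top-k expert routing. It illustrates the mechanism described above; it is not DBRX’s implementation, and the layer sizes, expert count, and routing details are simplified stand-ins.

```python
# Minimal sketch of sparse mixture-of-experts routing (illustrative only):
# a router scores all experts per token, but only the top-k experts run.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=16, top_k=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize gate weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):            # only k experts run per token
            for e in top_idx[:, slot].unique():
                mask = top_idx[:, slot] == e
                out[mask] += top_w[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```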

Extensive Training Data and Efficient Optimization

While DBRX's architecture is undoubtedly impressive, its true power lies in the meticulous training process and the vast amount of data it was exposed to. DBRX was pretrained on an astounding 12 trillion tokens of text and code data, carefully curated to ensure high quality and diversity.

The training data was processed using Databricks' suite of tools, including Apache Spark for data processing, Unity Catalog for data management and governance, and MLflow for experiment tracking. This comprehensive toolset allowed the Databricks team to effectively manage, explore, and refine the massive dataset, laying the foundation for DBRX's exceptional performance.

To further enhance the model's capabilities, Databricks employed a dynamic pretraining curriculum, varying the data mix during training. Combined with the fine-grained MoE design, in which only 36 billion of the model's 132 billion total parameters are active for any given token, this strategy resulted in a more well-rounded and adaptable model.

Moreover, DBRX's training process was optimized for efficiency, leveraging Databricks' suite of training tools and libraries, including Composer, LLM Foundry, MegaBlocks, and Streaming. By employing techniques like curriculum learning and improved optimization strategies, the team achieved nearly a four-fold improvement in compute efficiency compared to their previous models.

Training and Architecture

DBRX was trained using a next-token prediction model on a colossal dataset of 12 trillion tokens, emphasizing both text and code. This training set is believed to be significantly more effective than those used in prior models, ensuring a rich understanding and response capability across varied prompts.
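
For readers unfamiliar with the objective, the toy snippet below shows what next-token prediction looks like as a loss: the model predicts token t+1 from tokens up to t, and training minimizes the cross-entropy against the shifted sequence. The random data and single linear layer are stand-ins for a real tokenized corpus and transformer stack.

```python
# Toy illustration of the next-token prediction objective (not DBRX code).
import torch
import torch.nn.functional as F

vocab, d_model = 1000, 32
tokens = torch.randint(0, vocab, (2, 16))   # (batch, seq_len) of token ids
embed = torch.nn.Embedding(vocab, d_model)
lm_head = torch.nn.Linear(d_model, vocab)

hidden = embed(tokens)                       # stand-in for the transformer stack
logits = lm_head(hidden)                     # (batch, seq_len, vocab)

# Predict token i+1 from position i: shift logits and targets by one.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab),
    tokens[:, 1:].reshape(-1),
)
print(loss.item())
```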

DBRX's architecture is not only a testament to Databricks' technical prowess but also highlights its application across multiple sectors. From enhancing chatbot interactions to powering complex data analysis tasks, DBRX can be integrated into diverse fields requiring nuanced language understanding.

Remarkably, DBRX Instruct even rivals some of the most advanced closed models on the market. According to Databricks' measurements, it surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro and Mistral Medium across various benchmarks, including general knowledge, commonsense reasoning, programming, and mathematical reasoning.

For instance, on the MMLU benchmark, which measures language understanding, DBRX Instruct achieved a score of 73.7%, outperforming GPT-3.5's reported score of 70.0%. On the HellaSwag commonsense reasoning benchmark, DBRX Instruct scored an impressive 89.0%, surpassing GPT-3.5's 85.5%.

DBRX Instruct truly shines, achieving a remarkable 70.1% accuracy on the HumanEval benchmark, outperforming not only GPT-3.5 (48.1%) but also the specialized CodeLLaMA-70B Instruct model (67.8%).

These exceptional results highlight DBRX's versatility and its ability to excel across a diverse range of tasks, from natural language understanding to complex programming and mathematical problem-solving.

Efficient Inference and Scalability

One of the key advantages of DBRX's MoE architecture is its efficiency during inference. Thanks to the sparse activation of parameters, DBRX can achieve inference throughput that is up to two to three times faster than dense models with the same total parameter count.

Compared to LLaMA2-70B, a popular open source LLM, DBRX not only demonstrates higher quality but also boasts nearly double the inference speed, despite having about half as many active parameters. This efficiency makes DBRX an attractive choice for deployment in a wide range of applications, from content creation to data analysis and beyond.

Moreover, Databricks has developed a robust training stack that allows enterprises to train their own DBRX-class models from scratch or continue training on top of the provided checkpoints. This capability empowers businesses to leverage the full potential of DBRX and tailor it to their specific needs, further democratizing access to cutting-edge LLM technology.

Databricks' development of the DBRX model marks a significant advancement in the field of machine learning, particularly through its utilization of innovative tools from the open-source community. This development journey is significantly influenced by two pivotal technologies: the MegaBlocks library and PyTorch's Fully Sharded Data Parallel (FSDP) system.

MegaBlocks: Enhancing MoE Efficiency

The MegaBlocks library addresses the challenges associated with the dynamic routing in Mixture-of-Experts (MoEs) layers, a common hurdle in scaling neural networks. Traditional frameworks often impose limitations that either reduce model efficiency or compromise on model quality. MegaBlocks, however, redefines MoE computation through block-sparse operations that adeptly manage the intrinsic dynamism within MoEs, thus avoiding these compromises.

This approach not only preserves token integrity but also aligns well with modern GPU capabilities, facilitating up to 40% faster training times compared to traditional methods. Such efficiency is crucial for the training of models like DBRX, which rely heavily on advanced MoE architectures to manage their extensive parameter sets efficiently.

PyTorch FSDP: Scaling Large Models

PyTorch’s Fully Sharded Data Parallel (FSDP) presents a robust solution for training exceptionally large models by optimizing parameter sharding and distribution across multiple computing devices. Co-designed with key PyTorch components, FSDP integrates seamlessly, offering an intuitive user experience akin to local training setups but on a much larger scale.

FSDP’s design cleverly addresses several critical issues:

  • User Experience: It simplifies the user interface, despite the complex backend processes, making it more accessible for broader usage.
  • Hardware Heterogeneity: It adapts to varied hardware environments to optimize resource utilization efficiently.
  • Resource Utilization and Memory Planning: FSDP enhances the usage of computational resources while minimizing memory overheads, which is essential for training models that operate at the scale of DBRX.

FSDP not only supports larger models than previously possible under the Distributed Data Parallel framework but also maintains near-linear scalability in terms of throughput and efficiency. This capability has proven essential for Databricks' DBRX, allowing it to scale across multiple GPUs while managing its vast number of parameters effectively.
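
For readers who have not used FSDP, here is a minimal sketch of wrapping a model with PyTorch’s FullyShardedDataParallel. It is a generic illustration of the API, assuming a multi-GPU launch via torchrun; it is not Databricks’ actual training code, and the model and hyperparameters are placeholders.

```python
# Minimal FSDP sketch: shard parameters, gradients, and optimizer state across
# GPUs. Launch with: torchrun --nproc_per_node=<num_gpus> fsdp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")                 # one process per GPU
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    model = torch.nn.TransformerEncoder(
        torch.nn.TransformerEncoderLayer(d_model=1024, nhead=16), num_layers=8
    ).cuda()

    # FSDP shards the module's parameters; each rank materializes full weights
    # only for the layer it is currently computing, then frees them again.
    sharded_model = FSDP(model)
    optimizer = torch.optim.AdamW(sharded_model.parameters(), lr=1e-4)

    x = torch.randn(8, 128, 1024, device="cuda")    # dummy batch
    loss = sharded_model(x).mean()
    loss.backward()
    optimizer.step()

if __name__ == "__main__":
    main()
```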

Accessibility and Integrations

In line with its mission to promote open access to AI, Databricks has made DBRX available through multiple channels. The weights of both the base model (DBRX Base) and the finetuned model (DBRX Instruct) are hosted on the popular Hugging Face platform, allowing researchers and developers to easily download and work with the model.

Additionally, the DBRX model repository is available on GitHub, providing transparency and enabling further exploration and customization of the model's code.
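
Getting started with the released weights is straightforward with the Hugging Face transformers library. The sketch below assumes the public repository id databricks/dbrx-instruct (which may require accepting the model license on Hugging Face first) and enough GPUs to hold the 132-billion-parameter weights; treat the exact arguments as illustrative rather than official usage.

```python
# Illustrative sketch of loading DBRX Instruct from Hugging Face (assumed repo
# id "databricks/dbrx-instruct"); the full 16-bit weights need hundreds of GB
# of GPU memory spread across several devices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "databricks/dbrx-instruct", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",           # shard the model across available GPUs
    trust_remote_code=True,      # may be required on older transformers versions
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```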

Databricks measured inference throughput for various model configurations on its optimized serving infrastructure using NVIDIA TensorRT-LLM at 16-bit precision with the best optimization flags it could find.

For Databricks customers, DBRX Base and DBRX Instruct are conveniently accessible via the Databricks Foundation Model APIs, enabling seamless integration into existing workflows and applications. This not only simplifies the deployment process but also ensures data governance and security for sensitive use cases.

Furthermore, DBRX has already been integrated into several third-party platforms and services, such as You.com and Perplexity Labs, expanding its reach and potential applications. These integrations demonstrate the growing interest in DBRX and its capabilities, as well as the increasing adoption of open LLMs across various industries and use cases.

Long-Context Capabilities and Retrieval Augmented Generation

One of the standout features of DBRX is its ability to handle long-context inputs, with a maximum context length of 32,768 tokens. This capability allows the model to process and generate text based on extensive contextual information, making it well-suited for tasks such as document summarization, question answering, and information retrieval.

In benchmarks evaluating long-context performance, such as KV-Pairs and HotpotQAXL, DBRX Instruct outperformed GPT-3.5 Turbo across various sequence lengths and context positions.

DBRX outperforms established open source models on language understanding (MMLU), Programming (HumanEval), and Math (GSM8K).

Limitations and Future Work

While DBRX represents a significant achievement in the field of open LLMs, it is essential to acknowledge its limitations and areas for future improvement. Like any AI model, DBRX may produce inaccurate or biased responses, depending on the quality and diversity of its training data.

Additionally, while DBRX excels at general-purpose tasks, certain domain-specific applications may require further fine-tuning or specialized training to achieve optimal performance. For instance, in scenarios where accuracy and fidelity are of utmost importance, Databricks recommends using retrieval augmented generation (RAG) techniques to enhance the model's output.
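
As a rough illustration of the RAG pattern that recommendation refers to, the sketch below retrieves the passages most relevant to a query and prepends them to the prompt before it is sent to the model. The TF-IDF retriever and toy documents are stand-ins for a production embedding model and vector store.

```python
# Minimal retrieval augmented generation (RAG) sketch: retrieve relevant
# passages, then build a grounded prompt for the language model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "DBRX uses a fine-grained mixture-of-experts architecture with 16 experts.",
    "DBRX was pretrained on 12 trillion tokens of text and code.",
    "DBRX supports a maximum context length of 32,768 tokens.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    vectorizer = TfidfVectorizer().fit(documents + [query])
    doc_vecs = vectorizer.transform(documents)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_vecs)[0]
    return [documents[i] for i in scores.argsort()[::-1][:k]]

query = "How long a context can DBRX handle?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be passed to DBRX Instruct
```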

Furthermore, DBRX's current training dataset primarily consists of English language content, potentially limiting its performance on non-English tasks. Future iterations of the model may involve expanding the training data to include a more diverse range of languages and cultural contexts.

Databricks is committed to continuously enhancing DBRX's capabilities and addressing its limitations. Future work will focus on improving the model's performance, scalability, and usability across various applications and use cases, as well as exploring techniques to mitigate potential biases and promote ethical AI use.

Additionally, the company plans to further refine the training process, leveraging advanced techniques such as federated learning and privacy-preserving methods to ensure data privacy and security.

The Road Ahead

DBRX represents a significant step forward in the democratization of AI development. It envisions a future where every enterprise has the ability to control its data and its destiny in the emerging world of generative AI.

By open-sourcing DBRX and providing access to the same tools and infrastructure used to build it, Databricks is empowering businesses and researchers to develop their own cutting-edge models tailored to their specific needs.

Through the Databricks platform, customers can leverage the company's suite of data processing tools, including Apache Spark, Unity Catalog, and MLflow, to curate and manage their training data. They can then utilize Databricks' optimized training libraries, such as Composer, LLM Foundry, MegaBlocks, and Streaming, to train their own DBRX-class models efficiently and at scale.

This democratization of AI development has the potential to unlock a new wave of innovation, as enterprises gain the ability to harness the power of large language models for a wide range of applications, from content creation and data analysis to decision support and beyond.

Moreover, by fostering an open and collaborative ecosystem around DBRX, Databricks aims to accelerate the pace of research and development in the field of large language models. As more organizations and individuals contribute their expertise and insights, the collective knowledge and understanding of these powerful AI systems will continue to grow, paving the way for even more advanced and capable models in the future.

Conclusion

DBRX is a game-changer in the world of open source large language models. With its innovative mixture-of-experts architecture, extensive training data, and state-of-the-art performance, it has set a new benchmark for what is possible with open LLMs.

By democratizing access to cutting-edge AI technology, DBRX empowers researchers, developers, and enterprises to explore new frontiers in natural language processing, content creation, data analysis, and beyond. As Databricks continues to refine and enhance DBRX, the potential applications and impact of this powerful model are truly limitless.

Happiest Minds Launches AI Chatbot ‘hAPPI’

Happiest Minds Technologies Limited, an IT company, today announced the launch of ‘hAPPI’, a Generative AI-powered chatbot developed by its Generative AI Business Services (GBS) unit for Happiest Health. The chatbot will engage with users in health and wellness knowledge conversations.

Joseph Anantharaju, executive vice chairman of Happiest Minds, said, “We firmly believe in the transformative potential of Generative AI for our customers and Happiest Minds’ future. Our Generative AI Business Unit has hit the ground running, already serving over 20 customers and in ongoing discussions with many more.”

Anindya Chowdhury, President and CEO of Happiest Health, explained that hAPPI's average response time is 3-4 seconds per query, delivering quick answers to users.

Happiest Minds has strategically invested in a dedicated Generative AI business unit, offering services across various domains such as EdTech, BFSI, and Healthcare. The company has a team of AI engineering experts and a repository of over 120 use cases.

Its data science team has over 300 members who collaborate closely with domain teams across diverse industry verticals, and the company has formed a dedicated task force to leverage generative AI in addressing industry-specific challenges.

Dot com to Dot AI: The New Tech Bubble?


The onset of the Generative AI era has wowed everyone – the technologists and the enthusiasts alike. There are several reports and playbooks on how to ride on the Generative AI wave which is touted as the “iPhone moment” of the industry.

Interestingly, it is not just limited to the facade but has become table stakes in boardroom discussions. The executives and technologists are facing a sense of urgency to embrace this revolutionary change and accelerate their business growth.

Some consider this “wow” factor as inflated expectations from AI and fear revisiting the dot com bubble.

Let’s Talk About Nvidia First!

Amid all such frenzy, one company has recently made the headlines, i.e., Nvidia, the chip manufacturer. Notably, Nvidia is the leading GPU (Graphics Processing Units) provider, which is in high demand following the surge in the AI world. The availability of these GPUs is crucial to building AI models that require high computation power.

Nvidia stock’s stellar performance is evidence of its success trajectory, as also highlighted below:

Nvidia stock performance (Source: The Motley Fool)

Its growth journey is a function of growing AI investments, which offers a good segue to compare today's Dot AI (.ai) world with the Dot Com (.com) era at the start of this millennium.

The Start of the Comparison

This “.ai” vs. “.com” comparison is inspired by a series of events, one of which is the latest news of a year-old AI startup that reportedly became the fastest company to gain unicorn status in India.

A similar sentiment floated around last year when Mistral AI raised $118 million in what was reported as Europe's largest seed round.

Notably, the enterprises training large language models require a significant quantum of funding to make big leaps, given that the likes of OpenAI, Anthropic, and others have also raised billions of dollars in this pursuit.

Such news creates a stir in the investors’ community, especially when AI is the much sought-after industry that can get investors a premium ROI aka generational returns.

HBR also highlights this by associating the investment thesis with the industry focus rather than the idea focus – “Venture capitalists must earn a consistently superior return on investments in inherently risky businesses. The myth is that they do so by investing in good ideas and good plans. In reality, they invest in good industries — that is, industries that are more competitively forgiving than the market as a whole. And they structure their deals in a way that minimizes their risk and maximizes their returns.”

One thing is clear, the world looks binary amid ChatGPT fever — GenAI and the rest of the world.

Bubble or Not?

Now comes the big question — is it a bubble?

Consider these statistics from Fortune Business Insights, which expects the global GenAI market to grow at a CAGR of ~40% to $967B by 2032.

With such potential, there are also reports comparing this “.ai” bubble to the “.com” bubble.

So, let’s discuss the rationale that makes the market think of AI as another impending bubble.

While AI is the sought-after industry, one needs to watch out for leading indicators of an upcoming bubble. Speculative investments, a lack of the right expertise, and no clear differentiator or innovation are the early warning signs.

Investors, in general, rely on a robust due diligence process, including but not limited to assessing the business model, financial and legal intricacies, and market demand, as a critical step in evaluating an investment opportunity.

Further, strong governance policies, a relevant product-market fit, and how viable the proposal is in terms of feasibility, scalability, and potential for achieving greater returns are some of the key factors driving investors' decisions. Additionally, the revenue-generating capability, understanding of the total addressable market, barriers to entry, the business moat, and the growth strategy also indicate a green signal.

Novelty and cutting-edge offerings like that of AI are seen as a golden opportunity for substantial returns on investment.

A Lot of Investments Go Rogue, but Why?

However, choosing the right investments is a challenging task. Let’s discuss some statistics that describe these risks:

  • ~75% of venture-backed firms fail to even break even on the investments
  • In the context of disruptive technology such as AI, reports suggest such startups carry an even higher rate of failure due to the inherent risks involved

CNN also reports that “some investors and people in the industry are worried the funding frenzy is turning into a bubble, with money thrown at companies that have neither earnings nor an innovative product nor the right expertise.”

Let’s see what investors typically look at. It is a common perception among investors that the success of an enterprise largely hinges on the founders’ resilience, integrity, and ability to execute innovative ideas into reality. Other factors concern the robustness of the business concept itself and its ability to address customers’ pain points.

In addition to these attributes, various psychological factors, such as confidence in the founders’ ability (which could be assessed based on whether they are first-time founders or have had successful exits in the past) or the founders’ receptiveness to contrarian views, also provide an additional, albeit non-quantitative, set of indicators.

However, human experts, investors in this case, can only consider limited factors at a time to make the most effective decision. That’s where the power of computing aka machines comes into the picture, helping investors make data-backed decisions.

Then vs. Now of the VC World

Due to the inherently high-risk, high-impact nature of the venture capital industry, AI could be used to augment the VC’s hunch with something based more on quantitative analysis of historical data points. These models assess the viability of a proposal and predict the likelihood of success of an investment.

Welcome to modern data-driven investing.

Quoting Gartner:

“The traditional pitch experience will significantly shift by 2025 and tech CEOs will need to face investors with AI-enabled models and simulations as traditional pitch decks and financials will be insufficient”

Building AI tools to evaluate attractive AI opportunities seems like an effective use of the technology among its many applications. It is a fair expectation that the investment community will benefit from such quantitative tools that support informed investment decisions, saving the industry from another bubble.

Vidhi Chugh is an AI strategist and a digital transformation leader working at the intersection of product, sciences, and engineering to build scalable machine learning systems. She is an award-winning innovation leader, an author, and an international speaker. She is on a mission to democratize machine learning and break the jargon for everyone to be a part of this transformation.

India Ranks Third in Developer Contributions on GitHub

GitHub has recently released its Innovation Graph with data from the fourth quarter of 2023, providing a detailed look at global developer activities over the past four years.

The updated data charts show the growing use of AI among developers, marked by a surge in project documentation. This trend is largely fueled by the adoption of chat-based generative AI tools such as GitHub Copilot Chat and ChatGPT.

The update provides a four-year span of data across eight dimensions: Git pushes, repositories, developers, organisations, programming languages, licences, topics, and international collaborations.

Key insights from India revealed by the data are:

  • Indian developers’ code contributions on GitHub grew from a little less than 5 million to 15 million in 2023
  • Over 13 million Indian developers are actively using GitHub
  • India hosts a little less than 30 million repositories
  • Around 500k Indian companies use GitHub, up from 247k in 2020
  • India sees its highest collaboration with the USA across all four years
  • JavaScript is the most popular programming language among Indian developers, followed by Python and Shell

The increase in documentation activities is likely influenced by the introduction of chat-based generative AI interfaces, which may be lowering the hurdles to maintaining documentation.

According to the graph, the most used programming languages are JavaScript and Python. Mike Linksvayer, the VP of developer policy at GitHub, said, “It’s not particularly surprising to those following developer trends, yet it remains fascinating to watch these shifts over time.”

Another surprising finding was that obscure programming languages like COBOL saw a popularity increase during hackathons like Advent of Code in December.

Launched in September 2023, the GitHub Innovation Graph aims to assist policymakers, researchers, and developers in analysing software development trends. With four years of comprehensive data now accessible, GitHub is advocating for further examination and the sharing of insights derived from this open dataset.

AI Con USA: Navigate the Future of AI 2024

Partnership Content


AI Con USA, happening June 2–7, 2024 in Las Vegas and online, is not to be missed.

The conference week includes in-depth pre-conference training, deep-dive tutorials, visionary keynotes, concurrent sessions on the latest topics, a leadership summit, an Expo packed with solutions, myriad networking opportunities, and more!

A Preview of Keynotes at AI Con USA:

  • I Got 99 Problems, but AI Ain’t One—Dona Sarkar, Microsoft
  • AI/ML Adoption Strategies for Enterprises—Hien Luu, DoorDash
  • The Unseen Engine of AI: How 5 Innovation-Minded Companies Optimized for Operational Efficiency—Nevra Ledwon, DecisionBrain
  • Operationalizing Disruptive Technologies: A Strategic Framework for Harnessing the Power of GenAI—Mary Thorn, S&P Global Ratings
  • AI: A Moderated Panel Discussion—Dionny Santiago, Indeed
  • Humanizing AI—Tariq King, Test IO
  • Realizing the Potential of AI Tools for Software Development—Matthew Gunter, GitHub
  • Embrace AI Holistically and Unlock Your Growth Potential—Tania Katan and Rob Nicoletti, HALO Strategies

Join us for a year’s worth of education packed into one amazing week.

Can’t join us in person? A curated, free virtual conference option is also available.

Apple lawsuit behind it, chip startup Rivos plots its next moves

Kyle Wiggers / 9 hours

Rivos made headlines in 2022 after Apple filed a trade secrets suit against it, which accused Rivos of hiring away dozens of Apple engineers and using confidential info to develop chips to rival the iPhone maker’s own.

Rivos denied the allegations and countersued Apple for unfair competition. Apple ended up settling its lawsuit in February. Around the same time, it ended separate litigation with several of the Apple engineers Rivos had hired.

Now, with the courtroom drama behind it, Rivos is redoubling its efforts to bring its chipset tech to market, CEO Puneet Kumar told TechCrunch.

“Rivos was founded with the mission of building industry-leading power-efficient, high-performance chips,” Kumar said. “We’re excited to be targeting customers who are building data driven solutions.”

A substantial new funding tranche will help to finance those efforts.

Rivos on Tuesday announced that it raised over $250 million in an oversubscribed, extended Series A led by Matrix Capital Management with participation from chip giants including Intel (via its corporate VC division) and MediaTek. Other backers included Cambium Capital, Hotung Venture Group, Walden Catalyst, Dell Technologies Capital and Koch Disruptive Technologies.

It’s quite the turnaround for Rivos, which was founded in 2021 and roughly a year ago was struggling to raise funds from investors and recruit employees under the shadow of the Apple suit. In August, Rivos laid off nearly two dozen employees, or 6% of its workforce at the time, and was forced to delay a planned $400 million Series A fundraising round, The Information reported at the time.

A custom server chip

The long-term goal with Rivos, Kumar said, is to build chips primarily for servers that can handle intensive data analytics and AI workloads, including generative AI workloads.

“We’re targeting customers building data-driven solutions, e.g., those utilizing generative AI and data analytics to drive decisions,” Kumar said. “There’re many companies targeting such markets; Rivos supports the intense hardware requirements of the AI models and analytics that will remake the enterprise.”

Rivos’ first chipset is built on RISC-V, the open standard instruction set architecture (ISA).

ISAs are a technical spec at the foundation of every chip, describing how software controls the chip’s hardware. For general-purpose computing, chip design teams typically license an existing ISA from an incumbent (e.g. Arm or Intel). But RISC-V presents an open, no-royalties-attached alternative.

Rivos’ chip features what Kumar describes as a “data parallel accelerator” to speed up AI- and big data-related computations, essentially a GPU designed for purposes beyond graphics processing. It was made using TSMC’s 3nm fabrication process. In chip manufacturing, “process” refers to the size of the smallest component that can be embedded on a chip.

That 3nm process is considered close to the cutting edge. While Qualcomm, MediaTek, Nvidia and AMD, among others, are expected to employ TSMC’s process for their upcoming chip families, Apple has so far been the only company to use it, in its M3 chipset series.

In addition to building the chip, Rivos is working on self-contained data center hardware based on the Open Compute Project modular standard, which will effectively serve as plug-and-play chip housing. And it’s creating a “firmware-to-app” software stack for programming the chip, Kumar said.

“Customer workloads can be easily deployed on our more efficient hardware, but still using their existing models and databases, giving them an immediate benefit,” Kumar added.

Rivos, which is pre-revenue at the moment, plans to make money by charging customers — chiefly large data center operators — for its hardware and complementary software solutions. David Goel, an early investor, said that Rivos’ “low-friction” adoption pipeline is a key differentiator in the cutthroat chip market.

“The Rivos team has adeptly integrated the groundbreaking new RISC-V architecture with an inventive accelerator, effectively bringing this vision to life,” Goel told TechCrunch. “Their prototype chip serves as a compelling demonstration of their unique capability.”

But is it differentiating enough?

Stiff competition

Big tech firms, one of Rivos’ potential customer segments, are racing to develop their own in-house chips for AI and big data analytics as the generative AI boom continues.

Google’s on its fifth-gen TPU and recently revealed Axion, its first custom Arm-based CPU for data centers. Amazon has several custom chip families under its belt. Microsoft last year jumped into the fray with the Azure Maia AI Accelerator and the Azure Cobalt 100 CPU. And Meta’s inching along with its own designs.

Startups by the dozens, meanwhile, are angling for a slice of a custom data center chip market that could reach $10 billion this year and double by 2025.

Groq, a company developing chips to run AI models faster than conventional hardware, recently formed a new business unit geared toward enterprise applications and use cases. AI hardware startup Tenstorrent, helmed by engineering luminary Jim Keller, is looking to build its chipsets into data centers. And Rebellions, a South Korean fabless AI chip firm, has raised hundreds of millions of dollars in capital to ramp up production of its data center-focused chip, Atom.

But Nvidia, the dominant force in chips right now, is proving to be a tough one to topple.

Nvidia briefly became a $2 trillion company this year, riding high on the demand for its GPUs for AI training. Wells Fargo Equity Research estimates that Nvidia has a 98% market share in data center GPUs, and the company’s data center business was up more than 400% in Q4 2023 as Nvidia builds a new unit to design bespoke chips for cloud computing firms and others.

Given the fierceness of the competition — and the chilling effect Nvidia’s supremacy has had on funding for would-be rivals — it’s been rough going for some custom server chip upstarts.

Graphcore, which reportedly had its valuation slashed by $1 billion after a deal with Microsoft fell through, a few months ago said that it was planning job cuts due to the “extremely challenging” macroeconomic environment. Habana Labs, the Intel-owned AI chip company, laid off an estimated 10% of its workforce last year. Also last year, SiFive — like Rivos, a RISC-V startup — let go 20% of its workforce and discontinued its core product line.

So will Rivos fare better? Maybe.

Kumar wouldn’t talk about customers, and Rivos’ chip isn’t anticipated to reach mass production until sometime next year. But with 375 employees and hundreds of millions of dollars in the bank, Kumar said that Rivos is well-positioned to expand manufacturing and double down on platform and software engineering.

“The rapid changes in generative AI and the merger with the data analytics stack makes it vital that accelerators be easy to program and debug, and that data can seamlessly move between CPU and accelerator,” Kumar said. “Rivos addresses this need through our ‘recompile-not-redesign’ approach.”

Excitement Builds as Industrial Metaverse Transforms Manufacturing


Sony, a company that commands a formidable share in the global AR/VR headset market, revealed its new headset at CES 2024 in partnership with Siemens Digital Industries Software.

Siemens said it will integrate the headset with its NX Immersive Designer software for product engineering. This integration will facilitate 3D design, review, and collaboration for Siemens’ industrial clientele, utilising the Xcelerator portfolio.

Sony’s collaboration with Siemens to introduce the new headset exemplifies a shifting emphasis towards the industrial metaverse, signalling a transition from consumer-oriented applications to industrial-focused solutions.

“There’s actually a lot of excitement around what the industrial metaverse can deliver to manufacturers around the world. We have a very structured view of this – the industrial metaverse takes that comprehensive digital twin and creates the next level of immersive environment,” Bob Jones, executive vice president, global sales and customer success at Siemens Digital Industries Software (DISW) told AIM.

Jones also believes the demand for Siemens’ industrial Metaverse offerings will further increase once Sony releases its headset. However, not everyone shares Jones’ optimism.

Last year, Microsoft laid off its entire ‘Industrial Metaverse Core’ team, which consisted of around 100 employees, to focus on building AI chatbots.

During the same time, Meta also downsized its workforce in its Reality Labs division, which is dedicated to the technology and advancement of the metaverse.

Sony and Siemens are not alone in putting time, effort and money into the metaverse.

Earlier this year, Apple released its Vision Pro mixed reality headset. Not so long ago, the Cupertino-based company also published a blog stating that it expects Vision Pro to lead to a new era of spatial computing for business.

Manufacturing leading the charge

Interestingly, the automobile industry is already at the forefront of leveraging the metaverse. It enables engineers to engage with a comprehensive digital twin, identifying potential issues or improvements and exploring alternative approaches in the virtual realm before constructing physical prototypes.

“They are using the metaverse for virtual prototyping of vehicle designs, and AR/VR/Web3D-based virtual showrooms/configurators as well as immersive training for assembly line workers,” Anuj Gupta, enterprise and sales lead at AutoVRse, told AIM.

Automobile companies are also leveraging the metaverse for remote maintenance support. The metaverse facilitates real-time transmission of repair data from dealerships and auto shops to manufacturers’ aftersales and development teams. This enhances car maintenance, repair, and aftersales development through seamless data integration.

Gupta pointed out that Tata AutoComp (TACO), one of the leading auto ancillary manufacturers, is using VR to train their assembly line workers on assembly processes for cockpits, seats, radiators, HVAC systems, and more.

“With Tata Motors, we utilised VR to deepen customer engagement, providing immersive glimpses into the brand’s vision. Similarly, with Bosch, we integrated immersive VR setups, including HTC VIVE headsets, into sales processes, training staff at flagship stores and airports,” Gupta revealed.

“Volvo leverages VR to exhibit its luxury cars to customers, facilitating customisation and offering a more informed and memorable buying experience,” he added.

It’s not just the automotive industry finding use cases in the industrial metaverse, but other manufacturing companies too. Jones said Freyr, a European battery manufacturer, utilises Siemens’ comprehensive digital twin to meticulously design every aspect of the gigafactories it is currently building before construction begins.

“They aim to leverage the industrial metaverse to enable potential customers to interact with the comprehensive digital twin, ensuring the factory aligns with their operational requirements and meets their needs effectively,” he said.

Siemens utilised the metaverse to create its Digital Native Factory in Nanjing, integrating digital technologies from inception.

Simulation with a digital twin optimised construction, avoiding costly errors, while ongoing simulation enhanced efficiency, yielding a 200% capacity increase and a 20% productivity boost.

The Rise of Industrial Metaverse

Gupta believes the Metaverse is poised to experience substantial growth in the coming years.

“As industries worldwide embrace digital transformation, there’s an increasing recognition of the potential of metaverse technologies to revolutionise operations, training, and collaboration in manufacturing and related sectors,” he said.

Besides the likes of Meta and Microsoft, a few other companies like AWS and NVIDIA are also still working on the metaverse. However, Siemens is among the few companies making significant strides with the technology.

Jones revealed his customers are reaching out directly to inquire about the company’s industrial metaverse offering. “It’s one of the products we didn’t have to spend on marketing.

“An automotive manufacturer in Japan has engaged with us on this initiative. We have also garnered significant interest from multiple customers in the US,” Jones said.

Moreover, he anticipates that the initial interest will likely extend into the realm of product development, especially after the launch of the Sony headset.

In India, AutoVRse has developed VRseBuilder – a VR-led, end-to-end one-stop, self-serve, modular, SaaS-style platform which empowers large organisations to effortlessly create, deploy, and manage AR/VR training solutions and applications at scale and in real-time.

The Bengaluru-based startup’s customer list also includes Volvo, UltraTech Cements, Tata Power, Vedanta, Accenture, Godrej MHE and TVS Motors, among others.

Meet India’s First AI Chatbot for UPSC Aspirants

PAiGPT, an AI-powered conversational chatbot for UPSC aspirants, recently released its app for Android and iOS. The team behind PAiGPT visited Old Rajinder Nagar, Delhi, known as the Mecca of UPSC preparation, to validate the app, and received good feedback from students and educators.

The app’s USP is its ability to fetch real-time information on various topics and current affairs, similar to Perplexity AI and Google Gemini. However, what sets it apart is its feature that provides trending topics and the option to create multiple choice questions based on the available information.

Moreover, aspirants can upload images of editorials from popular newspapers and the app can generate summaries. Additionally, the company plans to introduce a feature through which the app can generate the summary in Hindi, even if the uploaded image contains text in English.

“If you use ChatGPT and Perplexity AI with multimodal capabilities, such as vision, you have to pay $20 a month, which is very hefty for any UPSC aspirant. But if you come to our platform, that cost will be around Rs 250 per month,” said Eshank Agarwal, founder of PAiGPT, in an exclusive interview with AIM.

“With PAiGPT, you will receive everything related to current affairs, including real-time information. You can upload your article editorial and receive a summary, MCQs, and subjective questions. You will also get an idea of the main questions that may come up,” he added.

PAiGPT was founded in September 2022 by Agarwal, Addya Rai, Siddharth Singh, and Deepanshu Singh. The co-founders have a background in UPSC. Deepanshu, who has successfully cleared the UPSC exam, previously held the position of senior mentor and faculty at Unacademy, and is currently the chief strategy officer at PAiGPT.

Similarly, Siddharth Singh, serving as the CMO at PAiGPT, has a background with Adda24/7, where he developed exclusive and comprehensive UPSC IAS study materials.

“We are not competing with coaching centers. Our focus is on solving the pain points for Indian students and possibly expanding to serve western students in the future,” said Agarwal.

When asked if it was the next Perplexity AI of India, Agarwal was clear, “PAiGPT is PAiGPT”. “PAiGPT means personalized AI, India’s first answer engine for the students and world’s best in terms of fetching knowledge, and giving answers to the students”.

The tech behind PAiGPT

PAiGPT is built on top of Llama 2 7B. Agarwal clarified that they do not use OpenAI’s GPT model, citing the high cost associated with it. He added that they have fine-tuned Llama 2 and trained it with 600 million tokens.

“We have created our own premises with 100 million tokens, secondly, we have used Red Pajama, and thirdly, we have taken data from popular newspapers like The Hindu,” said Agarwal.

PAiGPT uses a RAG architecture and a search engine to fetch real-time information. “We crawl data from a browser and store it in RAG architecture,” said Agarwal. The company is also using activation quantisation techniques.

“If you use NVIDIA A10 and operate in FP32, that amounts to approximately 31.2 teraflops. However, our approach involves quantising activations from FP32 to INT8,” explained Agarwal.
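
The quantisation step Agarwal describes can be illustrated with a minimal symmetric INT8 scheme like the sketch below. This is a generic example of activation quantisation, not PAiGPT’s actual code; real deployments typically use calibrated, often per-channel, scales inside an inference engine.

```python
# Generic sketch of post-training activation quantisation from FP32 to INT8:
# scale activations into the int8 range, store them as int8, and dequantise
# before the next floating-point operation.
import torch

def quantize_int8(x: torch.Tensor):
    scale = x.abs().max() / 127.0                       # symmetric, per-tensor scale
    q = torch.clamp((x / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

activations = torch.randn(4, 1024)                      # stand-in FP32 layer output
q, scale = quantize_int8(activations)
recovered = dequantize(q, scale)
print((activations - recovered).abs().max())            # small quantisation error
```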

Moreover, Agarwal said that they have deployed their model on AWS. However, he added that it is not economical, as the company is paying for the NVIDIA A10 GPUs at $10 per hour for two clusters simultaneously. The company is planning to partner with local cloud service providers to reduce the costs.

“We are trying to collaborate with local data centers like ESDS, E2E Networks, and others and reduce prices,” said Agarwal.

What next?

In the first week of May, the company will introduce a new feature called Prep Section. With this feature, when users enter a keyword for any subject, the AI will recommend a list of questions. “For example, if you type ‘Indus Valley Civilisation’, you can choose how many MCQs to generate, and then you can attempt an AI-based customised test series,” said Agarwal.

Secondly, the company plans to introduce an answer evaluation tool. “The evaluation of the Mains answer key is a big problem in UPSC. At any coaching institute, if you attempt the Mains paper, their evaluation may take up to seven days because they have very few human resources. We are working on addressing this issue. If you upload your Mains answer key in PAiGPT, you will get the evaluation in real-time,” said Agarwal.

The company is also planning to help students with their mental health. “In the coming six months, you will see PAiGPT also work towards students’ mental health with the help of AI,” said Agarwal.

Talking about PAiGPT’s counterpart, YC-backed SuperKalam, Agarwal said that the company is using OpenAI’s GPT-4. Interestingly, PAiGPT’s chief strategy officer Deepanshu previously worked at SuperKalam.

The company is bootstrapped and has invested Rs 1 crore till date, with each co-founder contributing Rs 25 lakh. Currently, the company is seeking good strategic partners. Agarwal said that the company is also in touch with Biocon Biologics and is planning to meet founder Kiran Mazumdar Shaw soon.

“We are not constrained by funding,” said Agarwal, adding that the company can deliver PAiGPT to 10,000 users with just Rs 3 lakh. However, the company soon plans to raise funds with a valuation of Rs 30 crore. “We want to become a billion-dollar company in the next five years,” concluded Agarwal.

Microsoft Renews Funding for IWill GITA, World’s First Gen-AI Hindi Mental Health Program

IWill, a leading AI and Digital Health Startup from India, has secured fresh funding from Microsoft‘s ‘AI for Accessibility’ program. The funding is earmarked to accelerate IWill GITA, the World’s First Controlled Generative-AI Mental Health Companion in Hindi.

Back in 2022, Microsoft had initially backed IWill under the same program, facilitating the genesis of the project. IWill GITA, based on Cognitive Behavioral Therapy (CBT) principles, leverages the potential of Gen-AI while maintaining clinical flows and responsible AI use.

Commencing its pilot launch in January 2024, it aims to provide the 615 million-strong Hindi-speaking population across rural and urban landscapes with the most effective mental health support at the most affordable price. The initiative strives to bridge the staggering 80%+ treatment gap prevalent in the country.

“We are very excited to support the IWill team as they further develop IWill GITA. Our shared vision is to broaden the horizons of mental health support, reaching out to an even larger audience and serving more people within the Hindi-speaking community,” said Ioana Tanase, Accessibility PM at Microsoft.

“We’re so excited and humbled to receive the continued support from Microsoft. Microsoft’s immense support was invaluable in helping us create IWill GITA and we feel honored that basis its success and future vision, Microsoft AI4A extended the funding and support,” added Shipra Dawar, Founder & CEO of IWill and ePsyClinic.

IWill GITA, an acronym for ‘Gen-AI Inclusive Therapy Assistant,’ draws inspiration from the ethos of Lord Krishna’s teachings encapsulated in the Bhagavad Gita. Ms. Dawar emphasised the alignment of their mission with the national agenda for mental health awareness, citing PM Narendra Modi’s call for greater efforts in this domain. She underscored the pivotal role of Microsoft’s financial, technical, and mentorship support in enabling accessible and effective mental healthcare solutions for all.
