Axelera lands new funds as the AI chip market heats up


The generative AI boom is driving the demand for AI chips, which are purpose-built to train and run generative AI models. And major players, from VCs to startups, are scrambling to get in on the ground floor.

SoftBank’s Masayoshi Son is reportedly looking to raise $100 billion for a chip initiative that would compete with tech giant Nvidia. OpenAI, meanwhile, is said to be in talks with investment firms to launch an AI chip-making venture.

AI chip startup Axelera has kept a comparatively low profile. Nevertheless, it’s managed to win over backers including Samsung in part by focusing on a niche within the burgeoning AI chip market: chips that run AI on edge devices.

“There’s no denying that the AI industry has the potential to transform a multitude of sectors,” Fabrizio Del Maffeo, one of the co-founders of Axelera and its CEO, told TechCrunch in an interview. “However, to truly harness the value of AI, organizations need a solution that delivers high-performance and efficiency while balancing cost.”

Axelera — headquartered in the Netherlands, with a roughly 180-person workforce spread across offices in Belgium, Switzerland, Italy and the U.K. — designs AI-running chips and systems for applications like security, retail, automotive and robotics that it supplies to partners manufacturing B2B edge computing and internet of things products.

Axelera was born out of an effort led by Del Maffeo and a group at Imec, the Belgium-based technology lab, along with Evangelos Eleftheriou and a group of Zurich-based IBM researchers to build a highly efficient AI chip architecture. The founding team incubated much of Axelera within Bitfury Group, a blockchain company specializing in Bitcoin hardware.

The defining characteristics of Axelera’s AI hardware stack are the instruction set architecture (ISA) RISC-V and in-memory computing.

ISAs are a technical spec at the foundation of chips that describe how software controls the chip’s hardware. Chip designers typically license an existing ISA from a large chipmaker such as Arm or Intel, but RISC-V presents an open, no-royalties-attached alternative. As for in-memory computing, it refers to running calculations in a system’s RAM to reduce the latency introduced by storage devices.

Axelera isn’t the first to try its hand at an in-memory and/or RISC-V-based architecture for AI chips.

NeuroBlade is developing chips that combine compute and memory in a single hardware block for data processing. MemVerge, GigaSpaces, Hazelcast and H2O.ai also offer in-memory hardware solutions for AI and data analytics applications. Elsewhere, Tenstorrent, backed by Hyundai Motor Group and Samsung, sells AI processors and other related IP built around RISC-V.

One of Axelera’s accelerator cards.
Image Credits: Axelera

Axelera has attempted to differentiate itself by delivering both chip hardware and the software to manage and deploy AI models on that hardware. And by all appearances, the strategy is working.

Axelera on Thursday announced that it closed a $68 million Series B funding round that brings its total raised to $120 million. Contributors to the round included the European Innovation Council Fund, Innovation Industries Strategic Partnership Fund, Invest-NL and Samsung Catalyst Fund.

The new cash will be put toward expanding to new markets ahead of full production of Axelera’s flagship Metis AI platform in H2 2024, according to Del Maffeo. Axelera also has an eye on the data center chip market, with preliminary plans to fund R&D of chips aimed at high-performance compute use cases.

“Metis entered in full production in Q2 and will be delivered in volume in Q3,” Del Maffeo said. “Axelera AI is now developing a new generation of products for computer vision, large language models and large multimodal models. This new product family will be unveiled later this year and enter in full production in 2025.”

The challenge will be shipping its AI chips at scale — and competing against the countless others in the AI chip race. Many rivals have formidable backing; a Crunchbase report from June finds that VC-backed chip startups have raised nearly $5.3 billion in just 175 deals so far this year.

The reward could be substantial, however. According to Statista and Market.us data, the AI chip market might gross as much as $67 billion in revenue by 2027. Axelera has little chance of unseating entrenched vendors like Nvidia anytime soon, if ever. (Nvidia has an estimated 70% to 95% share of the AI chip market, per Mizuho Securities.) But nabbing even a fraction of the market would be a meaningful win.

“The funding supports our mission to democratize access to AI, from the edge to the cloud,” Del Maffeo said, adding that Axelera has “tens” of enterprise customers. “By expanding our product lines beyond the edge computing market, we are able to address industry challenges in AI inference and support current and future AI processing needs.”

Google Translate gets 110 new languages with AI’s help, bringing the total to 243


Google Translate is getting support for 110 new languages. In a blog post published on Thursday, Google revealed that this latest effort is its largest language expansion ever, almost doubling the 133 languages that the tool previously supported and bringing the total to 243.

The new languages are spoken by more than 614 million people across the globe, representing around 8% of the world's population, according to Google. Some of them are world languages with more than 100 million speakers, others are spoken by small groups of Indigenous people, and still others have no native speakers but are actively being revitalized.

Around 25% of the new languages come from Africa. In Google's biggest expansion of African languages, the new additions will cover Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof.


Google also highlighted other new languages that its translation tool now handles.

Cantonese has been one of the most requested languages, Google said, but it's a tricky one because its written characters overlap with those of Mandarin. Manx is a Celtic language spoken on the Isle of Man that almost died out in 1974 but has since been revived, with thousands of people now fluent in it. Punjabi (Shahmukhi) is the variety of Punjabi written in the Perso-Arabic (Shahmukhi) script and is the most widely spoken language in Pakistan.

Because of regional varieties, dialects, and different spelling standards, translating a single language can be challenging. Many languages have no one accepted variant, making it difficult to pick the "correct" one. As one example cited by Google, Romani is a language with many dialects throughout Europe.

To handle demanding translations, Google turned to AI, specifically its AI-powered PaLM 2 LLM (large language model). The company cited PaLM 2 as a key factor in helping Google Translate better understand languages closely connected to each other, such as French creole languages like Seychellois Creole and Mauritian Creole.

With around 7,000 languages spoken on the planet, Google Translate is still limited to a small fraction of them. However, the company is slowly making progress.


In 2022, Google added 24 new languages via a machine learning model that learns another language even without seeing an example of it. In the same year, the company announced the 1,000 Languages Initiative, with the goal of building AI models that can translate among the 1,000 most spoken languages in the world.

Google Translate is available on the web, in the Chrome browser, and as a mobile app for iOS and Android.


TWO AI Announces SUTRA for Startups Program, Offers 1 Billion Tokens for Free 

TWO AI has launched the SUTRA for Startups program, an initiative aimed at accelerating startups with access to its multilingual LLM-based services. The program will select 10 startups from India, Korea, Japan, or the Middle East to receive API access to SUTRA and 1 billion tokens for free.

The selected startups will leverage SUTRA’s scalable and cost-efficient text-based models to develop solutions targeting global markets. This marks the first round of the SUTRA for Startups program, designed to foster innovation by providing affordable multilingual LLM capabilities to emerging companies.

Eligible startups must have existing LLM-based services and be headquartered in the specified regions. Applications are open until July 31, 2024, at 11:59 PM PST. Winners will be announced soon after the submission deadline.

TWO will evaluate applicants based on factors such as the establishment date, number of employees, and revenue to ensure they meet the eligibility criteria. Selected startups will be contacted via email or phone to finalize details and provide API access.

Jio-backed TWO AI recently announced the launch of SUTRA through its new AI app, ChatSUTRA, available at two.chat.ai. It is now accessible via web and will soon be available on iOS and Android.

The startup raised a $20M seed fund in February 2022 from Jio Platforms and South Korean internet conglomerate Naver. “Jio has been one of our key partners for a long time and has invested in us from the very beginning,” said Pranav Mistry, the founder of TWO, in an exclusive interaction with AIM.

Another product from TWO is Geniya, which can browse data from the internet using Google, rivalling Perplexity AI. Mistry said that Geniya is still in public beta and that users can try it out now, with the official launch to follow soon.

The post TWO AI Announces SUTRA for Startups Program, Offers 1 Billion Tokens for Free appeared first on Analytics India Magazine.

AI Copyright Battles are a Waste of Time

In a potentially unprecedented lawsuit, the world’s largest record companies – Sony Music, Universal Music Group, and Warner Records – are suing two AI music-generation startups, Suno and Udio, alleging copyright violations.

The record companies claim that both startups trained their AI algorithms using songs they did not have the rights to.

Music-Generation Platforms Stand Their Ground

Interestingly, a day after the lawsuit, Udio came out with a statement emphasising the importance of using AI. They highlighted several vital use cases, even citing that a musician had used the product after losing the ability to use his hands.

The startup also cited examples of producers who have sampled AI-generated tracks to create hit songs, like ‘BBL Drizzy’, and of everyday music lovers who have used the technology to express the gamut of human emotions, from love to sorrow to joy.

Udio has been clear on its stance about using copyrighted material from the beginning. In an exclusive interview with AIM, Andrew Sanchez, co-founder and COO of Udio, said that the company has invested a significant amount of time in ensuring that their systems do not generate outputs that could infringe anyone’s rights.

Similarly, Suno’s CEO Mikey Shulman defended the technology, stating it creates new content and doesn’t replicate existing music.

Musicians Embrace AI

Several musicians have supported the use of AI. They argue that people’s creativity has long been held back by a lack of skills and resources, and that AI closes this gap by giving them a medium to make the impossible possible.

Singer and Udio investor will.i.am, who has long been an evangelist for AI’s musical possibilities, has also said that GenAI is “giving agency to dreamers”.

However, not all artists are on the same page.

A few months ago, over 200 musicians across genres, including Katy Perry, Billie Eilish, and Jon Bon Jovi, signed an open letter urging tech companies to stop using AI that ‘devalues music’ and violates the rights of human artists.

Interestingly, a recent study found that 71% of musicians fear AI.

Is There a Need for Copyright?

Should the law acknowledge the work of the programmer or the user of such a program? In the analogue world, this is equivalent to asking whether copyright should be granted to the pen manufacturer versus the writer.

In response to the music publishers’ motion, Anthropic stated that it is “confident that using copyrighted content as training data for a [large language model] is fair use under the law—meaning that it is not infringement at all.”

Similarly, Microsoft and GitHub, which are defendants in several copyright infringement cases over their AI operations, have a strong incentive to keep arguing that their actions require no licence.

What Does the Law Say?

At present, works exclusively created by AI, even if they stem from a human-written text prompt, are not copyright protected.

These AI systems, which include chatbots like ChatGPT and image and music generators, are not legally considered the authors of the content they produce. Their outputs are primarily the result of human-created material.

Currently, firms can copyright their characters, artwork, episodes, music, etc. However, if firms do not own these outputs, they cannot legally restrict how you use a work of art that they do not possess the copyright to.

Grey Areas Remain

Big tech companies have also had their fair share of copyright issues. As mentioned before, Microsoft, GitHub and Anthropic have their own stances in response to these issues.

The issue of whether ChatGPT and other generative AI technologies have violated copyright by using data sets for training is currently being debated in numerous lawsuits worldwide. However, it’s important to note that these claims are still under debate, with one US court ruling that AI-generated artwork cannot be copyrighted.

Because of copyright concerns, some organisations have taken steps to ban the use of AI-generated art. For instance, Getty Images has banned the upload and sale of illustrations made using AI art tools like DALL-E, Midjourney and Stable Diffusion.

One prominent example is Théâtre D’opéra Spatial, or Space Opera Theater, a painting created by Jason M. Allen using Midjourney. Many artists were unhappy with the fact that it won an annual fine art competition.

Last year, Hollywood writers also went on strike, partly to demand guidelines for how generative AI will be utilised within the industry.

While the strikes were successful in limiting the use of chatbots in film production, AI will continue to be used. Producers and writers agreed that it can be helpful in many parts of filmmaking, including scriptwriting.

Embrace AI, Don’t Fear It

A study done by distributor Ditto Music revealed that approximately 60% of the musicians polled already use AI in some way. The fact that most musicians have had favourable encounters with AI demonstrates the technology’s potential impact on the business.

While there are valid concerns about the use of copyrighted material to train AI systems, the technology also presents immense opportunities for creativity and artistic expression.

The AI genie is out of the bottle – the creative industry would be wise to learn how to make the most of it.

The post AI Copyright Battles are a Waste of Time appeared first on Analytics India Magazine.

How To Speed Up Python Code with Caching


In Python, you can use caching to store the results of expensive function calls and reuse them when the function is called with the same arguments again. This makes your code more performant.

Python provides built-in support for caching through the functools module: the decorators @cache and @lru_cache. And we'll learn how to cache function calls in this tutorial.

Why Is Caching Helpful?

Caching function calls can significantly improve the performance of your code. Here are some reasons why caching function calls can be beneficial:

  • Performance improvement: When a function is called with the same arguments multiple times, caching the result can eliminate redundant computations. Instead of recalculating the result every time, the cached value can be returned, leading to faster execution.
  • Reduction of resource usage: Some function calls may be computationally intensive or require significant resources (such as database queries or network requests). Caching the results reduces the need to repeat these operations.
  • Improved responsiveness: In applications where responsiveness is crucial, such as web servers or GUI applications, caching can help reduce latency by avoiding repeated calculations or I/O operations.

Now let’s get to coding.

Caching with the @cache Decorator

Let’s code a function that computes the n-th Fibonacci number. Here's the recursive implementation of the Fibonacci sequence:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Without caching, the recursive calls result in redundant computations. If the values are cached, it'd be much more efficient to look up the cached values. And for this, you can use the @cache decorator.

The @cache decorator from the functools module in Python 3.9+ is used to cache the results of a function. It works by storing the results of expensive function calls and reusing them when the function is called with the same arguments. Now let’s wrap the function with the @cache decorator:

from functools import cache

@cache
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
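To get a feel for how much redundant work the decorator removes, one option is to count how often the function body actually runs. The sketch below is illustrative only: the names fib_plain, fib_cached, and call_count are hypothetical and not part of the tutorial's code, and the count for the cached version is read from cache_info(), where each miss corresponds to one real computation.

```python
from functools import cache

call_count = 0  # hypothetical counter, used only for this illustration


def fib_plain(n):
    """Uncached recursive Fibonacci that counts every invocation."""
    global call_count
    call_count += 1
    if n <= 1:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)


@cache
def fib_cached(n):
    """Same recursion, but each distinct n is computed only once."""
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


plain_result = fib_plain(20)
cached_result = fib_cached(20)
info = fib_cached.cache_info()  # a miss means the body really ran

print("results:", plain_result, cached_result)
print("uncached calls:", call_count)
print("cached computations:", info.misses)
```

For n = 20, the uncached version makes 21,891 calls, while the cached version computes each of the 21 distinct arguments (0 through 20) exactly once.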

We’ll get to performance comparison later. Now let’s see another way to cache return values from functions using the @lru_cache decorator.

Caching with the @lru_cache Decorator

You can use the built-in functools.lru_cache decorator for caching as well. This uses the Least Recently Used (LRU) caching mechanism for function calls. In LRU caching, when the cache is full and a new item needs to be added, the least recently used item in the cache is removed to make room for the new item. This ensures that the most recently used items are retained in the cache, while items that have not been accessed for the longest time are discarded.

The @lru_cache decorator is similar to @cache but allows you to specify the maximum size—as the maxsize argument—of the cache. Once the cache reaches this size, the least recently used items are discarded. This is useful if you want to limit memory usage.
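The eviction behavior is easier to see with a tiny cache. The following is a minimal sketch, assuming a hypothetical square function and a computed list that records which arguments were actually calculated; neither appears in the original tutorial.

```python
from functools import lru_cache

computed = []  # hypothetical log of arguments that were really computed


@lru_cache(maxsize=2)
def square(x):
    computed.append(x)
    return x * x


square(1)  # miss: cache holds 1
square(2)  # miss: cache holds 1 and 2
square(1)  # hit: 1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(2)  # miss: 2 has to be recomputed, and 1 is evicted in turn

print(computed)
print(square.cache_info())
```

Here computed ends up as [1, 2, 3, 2]: the second call with 2 is a miss because that entry was evicted when 3 was added, while the repeated call with 1 was served from the cache.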

Here, the fibonacci function caches up to 7 most recently computed values:

from functools import lru_cache

@lru_cache(maxsize=7)  # Cache up to 7 most recent results
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

fibonacci(5)  # Computes Fibonacci(5) and caches intermediate results
fibonacci(3)  # Retrieves Fibonacci(3) from the cache

Here, the fibonacci function is decorated with @lru_cache(maxsize=7), specifying that it should cache up to 7 most recent results.

When fibonacci(5) is called, the results for fibonacci(5) and for every intermediate call, from fibonacci(4) down to fibonacci(0), are cached. That is six entries, which fits within the maxsize of 7. When fibonacci(3) is called subsequently, its value is retrieved from the cache since it is still among the most recently used entries, avoiding redundant computation.
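One way to check this behavior is the cache_info() method that functools attaches to decorated functions; cache_clear() empties the cache again. The variable names after_first and after_second below are just illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=7)
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


fibonacci(5)
after_first = fibonacci.cache_info()   # statistics after the first call

fibonacci(3)
after_second = fibonacci.cache_info()  # one additional hit, no new misses

print(after_first)
print(after_second)

fibonacci.cache_clear()                # reset entries and statistics
print(fibonacci.cache_info())
```

After fibonacci(5), the statistics show six misses (one per distinct argument from 0 to 5) and three hits from the overlapping recursive calls. The later fibonacci(3) call adds one more hit and no misses.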

Timing Function Calls for Comparison

Now let’s compare the execution times of the functions with and without caching. For this example, we don't set an explicit value for maxsize. So maxsize will be set to the default value of 128:

from functools import cache, lru_cache
import timeit

# without caching
def fibonacci_no_cache(n):
    if n <= 1:
        return n
    return fibonacci_no_cache(n-1) + fibonacci_no_cache(n-2)

# with cache
@cache
def fibonacci_cache(n):
    if n <= 1:
        return n
    return fibonacci_cache(n-1) + fibonacci_cache(n-2)

# with LRU cache
@lru_cache
def fibonacci_lru_cache(n):
    if n <= 1:
        return n
    return fibonacci_lru_cache(n-1) + fibonacci_lru_cache(n-2)

To compare the execution times, we’ll use the timeit function from the timeit module:

# Compute the n-th Fibonacci number
n = 35

no_cache_time = timeit.timeit(lambda: fibonacci_no_cache(n), number=1)
cache_time = timeit.timeit(lambda: fibonacci_cache(n), number=1)
lru_cache_time = timeit.timeit(lambda: fibonacci_lru_cache(n), number=1)

print(f"Time without cache: {no_cache_time:.6f} seconds")
print(f"Time with cache: {cache_time:.6f} seconds")
print(f"Time with LRU cache: {lru_cache_time:.6f} seconds")

Running the above code should give output similar to the following (exact timings vary by machine):

Output >>>
Time without cache: 2.373220 seconds
Time with cache: 0.000029 seconds
Time with LRU cache: 0.000017 seconds

We see a significant difference in the execution times. The function call without caching takes much longer to execute, especially for larger values of n, while the cached versions (both @cache and @lru_cache) execute much faster and have comparable execution times.
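One caveat when timing cached functions: the cache persists between calls, so any repeat measurement times a plain lookup rather than the computation. A minimal sketch of one way to handle this, assuming you want to compare a cold cache against a warm one, is to call cache_clear() between runs. The variable names cold_time, warm_time, and cold_again_time are illustrative.

```python
from functools import cache
import timeit


@cache
def fibonacci_cache(n):
    if n <= 1:
        return n
    return fibonacci_cache(n - 1) + fibonacci_cache(n - 2)


n = 35

# Cold cache: every distinct argument is computed once
cold_time = timeit.timeit(lambda: fibonacci_cache(n), number=1)

# Warm cache: the result for n is already stored, so this is a lookup
warm_time = timeit.timeit(lambda: fibonacci_cache(n), number=1)

# Clearing the cache makes the next measurement cold again
fibonacci_cache.cache_clear()
cold_again_time = timeit.timeit(lambda: fibonacci_cache(n), number=1)

print(f"Cold cache:  {cold_time:.6f} seconds")
print(f"Warm cache:  {warm_time:.6f} seconds")
print(f"After clear: {cold_again_time:.6f} seconds")
```

Timings this small are noisy, so treat the numbers as rough indications rather than precise benchmarks.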

Wrapping Up

By using the @cache and @lru_cache decorators, you can significantly speed up the execution of functions that involve expensive computations or recursive calls. You can find the complete code on GitHub.

If you’re looking for a comprehensive guide on best practices for using Python for data science, read 5 Python Best Practices for Data Science.

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.


Beyond Bangalore: Why Semiconductor Companies are Moving to Tier 2 & 3 Cities

In a significant shift from the norm, semiconductor businesses in India are starting to explore tier-2 cities across the country for new opportunities. This signals a strategic industry evolution driven by the quest for diversified talent pools, cost-effective operations, and broader market reach.

In a conversation with AIM, Srini Chinamilli, the CEO and co-founder of Tessolve Semiconductors, affirmed Bangalore’s status as the ‘Silicon Valley of India’. He also highlighted the growing potential of tier-2 cities for the company’s operations.

“Bangalore is still the hub, but people have to go outside because Bangalore faces different kinds of problems, can’t have all eggs in one basket,” Chinamilli said, acknowledging the challenges of concentrating all operations in one location.

Chinamilli noted that replicating Bangalore’s energy, ecosystem, and diversity is challenging.

Emerging Talent in Tier 2 and 3

When asked about the capabilities and potential of tier-2 and tier-3 cities, Chinamilli explained, “If you look at Bangalore, most engineers here are from elsewhere. So, if there is an opportunity for people to stay closer to where they are and still work, you know, the attrition levels are much lower.”

He also praised the work ethics and dedication of employees in tier-2 cities, saying, “These employees from tier-two cities tend to be very dedicated, similar to Bangalore. But the attrition levels are much lower.”

While the cost of engineers may be slightly lower in these cities, Chinamilli emphasised that Tessolve strives to maintain consistent compensation across locations.

He emphasised that Bangalore will remain Tessolve’s hub, while the company has been strategically expanding to cities like Coimbatore, Bhubaneswar, Cochin, Hubli, and Vizag for the past 17 years.

The expansion will help tap into local talent pools and foster a more stable workforce.

Tessolve Clients | Source: Tessolve

Developing Ecosystems in Tier 2 Cities / Higher Employee Retention

Industry experts see this shift to smaller cities as crucial for India’s ambition to be a semiconductor industry leader.

Satya Gupta, the president of the VLSI Society of India, emphasised the “need to establish at least 1,000 chip design companies and train 1 million electronics and chip design professionals annually”.

He proposes developing clusters in cities such as Bhubaneswar, Indore, Pune, Jaipur, Coimbatore, and Trivandrum, alongside incentivising the inception of at least one chip start-up in each of India’s 806 districts.

A notable development came when Synopsys announced plans to open a new office in Bhubaneswar, employing over 300 highly skilled VLSI and semiconductor design engineers.

Rituparna Mandal, VP for customer success group and head of Synopsys India, said, “Odisha’s support in fostering a strong semiconductor workforce and ecosystem, and the involvement of top industry leaders, academics, and researchers in the O-Chip initiative, exemplify a thriving semiconductor community in the state. It aligns with Synopsys’ priorities of expansion in India.”

Similarly, Signature IP has strategically chosen to nurture talent and establish a presence in tier 2 cities.

CEO Purna Mohanty attributes the early success and scaling of the company’s operations in India to Bhubaneswar’s role, noting that the lower attrition rates in tier 2 cities have facilitated long-term R&D projects.

Madhav Rao, the SVP of engineering at Tessolve, pointed out the intense competition for talent in Bangalore as a driving factor for their expansion into tier 2 cities like Coimbatore and Hubli in the last two years.

Smaller cities bring a notable economic advantage through lower operational costs, helping businesses allocate more resources towards innovation and development.

While infrastructure in tier-2 cities was a concern until a few years ago, the situation has changed. Rao noted that the network and overall infrastructure in tier-2 cities are now quite adequate for their needs.

Challenges Remain

Though the lower-tier cities seem promising, numerous challenges remain. For instance, some clients prefer to have employees in their own centres or in Bangalore for spontaneous meetings. Another challenge is that employees in smaller cities tend to come only from that city or its surrounding areas, which limits the talent pool.

To mitigate these challenges, companies are tying up with local universities and colleges. Mohanty pointed out that Signature IP has recruited professors from local engineering colleges in Bhubaneswar to lead their R&D efforts full-time, helping identify promising interns every year.

The post Beyond Bangalore: Why Semiconductor Companies are Moving to Tier 2 & 3 Cities appeared first on Analytics India Magazine.

Big Data is Back, and It’s Bigger Than Ever

While AI has hogged much of the limelight in the past few years, Databricks’ products SVP Adam Conway rightly points out that big data has finally made a comeback.

The influx of AI systems has highlighted the importance of data within businesses, with many pivoting towards actually utilising the data they generate on a day-to-day basis. For this, Conway has highlighted just how “big” big data has become, when compared to what someone would have meant when they used the term in the 2010s.

“I talk to enterprises regularly that have data lakes in the 10s to 100s of petabytes (PB), even a few enterprises in the over one exabyte (EB) range. Organisations often have single tables in the multi-PB range – in the heyday of ‘big data’ in 2010, 1 PB was huge, now, I would consider it on the smaller side,” he said.

Every Bit Counts

While data was and continues to be a big deal, enterprises seem to be taking active notice of how to actually deal with the data they have, thanks to the rise of AI.

This is further cemented by MotherDuck co-founder and CEO Jordan Tigani, who headed Google’s BigQuery team until 2020. In a post titled ‘Big Data is Dead’, Tigani stated that the often repeated adage of big data being “too big” has all but disappeared thanks to better tools used to query massive amounts of data.

Essentially, data is no longer considered “big”.

“Data sizes may have gotten marginally larger, but hardware has gotten bigger at an even faster rate. It (big data) had a good run, but now we can stop worrying about data size and focus on how we’re going to use it to make better decisions,” he said.

With many businesses pivoting to using AI when it comes to how they work, data is a big factor in how they assess the effectiveness of their products, as well as just gaining basic insights on their inner workings.

Conway also points out that the actual term ‘big data’ is specific, because with AI, aggregates no longer fulfil the needs of businesses. As we said, every little bit of data counts. “Aggregates are great for BI and reporting, like if you want to look at revenue by customer or by product, but terrible if you want to do AI.

“For example, if you want to predict that a customer will buy an iced drink on a hot day you need all the individual transactions to train that model, you need to join that with weather data and train a model,” he said.

So while AI is the reason why data has made such a big comeback, at least in how businesses use their own, it also needs to feed on raw data to effectively work.

Does This Mean Data Wasn’t Relevant Before?

Not really. Data obviously informed a lot of the decisions that businesses made and continue to make today. But with AI, data is now a massive resource, and everyone knows that.

A 240T tokens dataset is now available for your LLM training. 🤯
I don't even know how to go about downloading a 240T dataset lol. FineWeb's 15T comes out to 48 Terabytes. Can you imagine what a 240T looks like?
8× larger than previous SOTA (RedPajama-Data-v2 30T 125TB) pic.twitter.com/7xXRc6XPK3

— Rohan Paul (@rohanpaul_ai) June 24, 2024

This is further cemented by the amount of data acquisition that has occurred over the past couple of years. In the past year alone, OpenAI has managed to sign deals with several media organisations to make use of their data.

However, as one user points out, “Data needs to be curated!”

AI companies have put a lot of focus on properly annotating and structuring data for training purposes. With this tech more widely available, companies have also been restructuring their data so that it can easily train AI that they intend to use internally.

“Many of the most important revenue generating or cost saving AI workloads depend on massive data sets. In many cases, there is no AI without big data,” Conway said. And, of course, there is no resurgence of big data without AI.

With Big Data Comes Data Engineering

Thanks to this, there’s also a massive change in data and analyst roles, with data engineering becoming a much sought-after career for many.

Databricks knows this, which is why it’s currently betting big on data engineering rather than focusing on AI itself. It has also launched tools like LakeFlow to enhance the workflow of data engineers.

This doesn’t come as a surprise. As AIM reported previously, Databricks’ CEO Ali Ghodsi admitted that customers had asked for a focus on data over anything else. “Two years ago, at the CIO Forum, we asked our customers what they wanted most from Databricks, and the majority expressed a need for easier data integration,” he said.

Additionally, as Databricks vice president of field engineering APJ, Nick Eayrs, told AIM, the focus on data engineering itself means that building AI and implementing it has a solid foundation, thanks to AI’s heavy reliance on data.

This is why Conway advises, “So next time you see the data engineer in your company that works in Spark or Hadoop, ask them what they do, ask what kind of data your company has, and ask what is done with it. You will probably be pleasantly surprised. Big Data is probably quietly transforming your company.”

The post Big Data is Back, and It’s Bigger Than Ever appeared first on Analytics India Magazine.

Etched Takes a Risky Bet on Transformers to Topple NVIDIA

AI hardware company Etched has unveiled Sohu, the first specialised chip (ASIC) built exclusively for Transformer models. Today, every major AI product (ChatGPT, Claude, Gemini, Sora) is powered by Transformers and the company believes that within a few years, every large AI model will run on custom chips.

“An accelerator from NVIDIA or Google is flexible. It can be programmed to run many different kinds of AI models, like convolutional networks, TMAs, or Transformers. Our chips are different. They are only able to run this one very narrow class of model, which we call Transformers,” said co-founder and CEO Gavin Uberti.

Etched claims that Sohu can process over 500,000 tokens per second with Llama 70B. One 8xSohu server replaces 160 H100s. According to the company, “Sohu is more than 10 times faster and cheaper than even NVIDIA’s next-generation Blackwell (B200) GPUs.”

“We’re able to burn the Transformer algorithm into the chip. We’re building our own silicon and our own servers, and this enables us to get more than 20 times higher throughput in terms of output words per second for models like ChatGPT or Llama,” said Uberti.
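
The company's figures can be cross-checked with simple arithmetic. The sketch below assumes the 500,000 tokens per second figure refers to a single 8xSohu server, which the claims above imply but do not state outright; the variable names are illustrative only.

```python
# Figures as claimed by Etched (not independently verified).
SERVER_TOKENS_PER_S = 500_000  # Llama 70B throughput, assumed to be per 8xSohu server
SOHU_PER_SERVER = 8
H100_REPLACED = 160


def per_chip(total_tokens_per_s: float, chips: int) -> float:
    """Average throughput per chip, given an aggregate figure."""
    return total_tokens_per_s / chips


sohu_tokens_per_s = per_chip(SERVER_TOKENS_PER_S, SOHU_PER_SERVER)
implied_h100_tokens_per_s = per_chip(SERVER_TOKENS_PER_S, H100_REPLACED)
chip_ratio = H100_REPLACED / SOHU_PER_SERVER

print(f"Per Sohu chip:    {sohu_tokens_per_s:,.0f} tokens/s")
print(f"Implied per H100: {implied_h100_tokens_per_s:,.0f} tokens/s")
print(f"Sohu-to-H100 ratio: {chip_ratio:.0f}x")
```

On these numbers, one Sohu chip stands in for 20 H100s, which lines up with the "more than 20 times higher throughput" that Uberti cites.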

Worth the Risk?

Notably, Sohu can’t run CNNs, LSTMs, SSMs, or any other AI models. “If Transformers change dramatically or go away, then we’ll be in a bad place. But if we’re right and Transformers keep being the dominant way that AI models work, we will be the most performant chip in the market by an order of magnitude,” he said.

He added that Sohu can run image and video generation models as well. “I don’t know if you saw Sora from OpenAI, but that’s a Transformer as well,” he said.

On the software side, Etched says that customers don’t need to deal with the hassle of CUDA and PyTorch code, which require incredibly complicated compilers. “However, since Sohu only runs Transformers, we only need to write software for Transformers!” the company said.

Moreover, they assert that “Santa Clara’s dirty little secret is that GPUs haven’t improved, they’ve simply grown larger”. Uberti said that next-gen GPUs (NVIDIA B200, AMD MI300X, Intel Gaudi 3, AWS Trainium2, etc.) now count two chips as one card to “double” their performance.

Etched was founded by Harvard dropouts Uberti and Chris Zhu, both of whom have a background in AI and hardware development. They recently raised $120 million in a Series A funding round, bringing their total funding to $125.4 million. Notable investors include Peter Thiel, Thomas Dohmke (the CEO of GitHub), and Balaji Srinivasan (former CTO at Coinbase).

NVIDIA is Not the Only One

While NVIDIA maintains a dominant 95% share of the AI chip market, new competitors are fast emerging. Groq, like Etched, is gaining traction with its LPUs. Groq has demonstrated impressive performance, achieving throughput of 877 tokens/s on Llama 3 8B and 284 tokens/s on Llama 3 70B. A user on X compared Llama 3 (on Groq) and GPT-4 by tasking them with coding a snake game in Python, where Groq performed exceptionally fast.

In addition to Groq, major tech giants like Microsoft, Google, and Amazon are also advancing their own AI chips. Last year, Microsoft introduced Azure Maia 100 AI Accelerator, built for AI tasks and generative AI workloads in cloud computing environments.

Amazon has also entered the fray with its Trainium2 and Inferentia chips, designed specifically for training and running AI models such as those behind ChatGPT and its competitors. Startups like Databricks and Anthropic use Amazon’s Trainium2 chips to develop their models. Reportedly, Amazon is currently building its own ChatGPT-like model.

Google has been developing custom AI accelerator chips called Tensor Processing Units (TPUs) since 2015. At Google I/O 2024, the search giant unveiled its 6th generation Tensor Processing Units, called Trillium, delivering a 4.7x performance boost over its predecessor.

Will the Transformer Stay for Long?

The success of Etched is dependent on Transformers.

However, Cohere founder Aidan Gomez, one of the authors of the paper ‘Attention Is All You Need’, recently expressed his dissatisfaction with the current state of AI development, all of which is built on top of Transformers.

“It kind of disturbs me how similar to the original form we are. I think the world needs something better than the Transformer,” he said.

Nonetheless, researchers challenging Transformers is not new. The latest paper by Sepp Hochreiter, the inventor of LSTM, has unveiled a new LLM architecture featuring a significant innovation: xLSTM, which stands for Extended Long Short-Term Memory.

In December last year, researchers Albert Gu and Tri Dao from Carnegie Mellon and Together AI introduced Mamba, a state-space model (SSM) that demonstrates superior performance across various modalities, including language, audio, and genomics.

In April, Google DeepMind also unveiled RecurrentGemma 2B, part of a new family of open-weight language models based on the novel Griffin architecture.

Etched can only hope that Transformers are here to stay in the long run.

The post Etched Takes a Risky Bet on Transformers to Topple NVIDIA appeared first on Analytics India Magazine.

This Surat-based Company is Creating AI Hardware for India

Starting a hardware company based in India is no easy task. However, Surat-based Vicharak took on the herculean task of building hardware in-house, designed specifically for AI workloads. The company recently secured funding of INR 1 crore, boosting its valuation to INR 100 crore.

We’ve received ₹1 crore in funding at a ₹100 crore valuation. We’ve often heard that doing hardware is hard in a city like Surat or even in India. We’ve gone through hundreds of failed prototypes and iterations in our labs, but somebody has to start at some point, right?
This…

— Vicharak (@Vicharak_In) June 25, 2024

Speaking with AIM, founder and CEO Akshar Vastarpara said that Vicharak’s focus is not just on creating hardware but on redefining computing technology.

“Our first target is to develop a GPU-like technology that can be used in mobile phones, laptops, and servers. We are approaching this in a very different way, starting with the consumer base but scaling to servers and lower-level areas as well,” Vastarpara explained.

This led to the creation of Vaaman. It is a complete packaged computing board that boasts a six-core ARM CPU and an FPGA with 112,128 logic cells. Its distinctive design allows it to tackle challenges that existing products can’t. With a 300-MBps connection between the FPGA and CPU, Vaaman is optimised for hardware acceleration and excels in parallel computing.

India for the World

“Our goal is not to compete directly [with NVIDIA] but to offer something unique. FPGAs are reconfigurable chips, capable of doing many things that ASIC (Application-Specific Integrated Circuit) startups can’t. We can achieve 90% efficiency compared to what they offer,” Vastarpara said confidently.

This is such an insane white pill moment for me.
Back in college I used to such reviews by western YouTubers of western tech companies.
Now time watching an Indian YouTuber make review about a deep tech product of an Indian company.
We’re gonna make it folks.

— Varsh (@infinite_varsh) June 26, 2024

Vicharak’s products are poised to revolutionise single-board computing. “We are in the same industry as Raspberry Pi, but our boards include FPGAs alongside processors, offering a complete AI infrastructure,” he elaborated.

Priced at $180, their boards offer a competitive edge with advanced capabilities at an affordable cost.

Beyond hardware, Vicharak is also competing with NVIDIA’s CUDA through its focus on software. Its flagship product, Gati, exemplifies this vision.

“Gati is our AI exploration project. We’re writing our own infrastructure on top of FPGA, creating a stack similar to what NVIDIA does with CUDA,” said Vastarpara. The goal is to enable AI inference on FPGAs, offering a flexible and powerful alternative to traditional GPUs and CPUs.

Apart from Vaaman and Gati, Vicharak has also built Axon, a processor powered by the Rockchip RK3588S, an 8-core 64-bit SoC built on an 8 nm lithography process. It also integrates a 4-core GPU and a built-in NPU, which provides up to 6 TOPS of performance for AI workloads.

Future Outlook

Looking ahead, Vicharak aims to make its software completely free while maintaining proprietary IP on their hardware. “Our plan is to integrate FPGAs into every kind of computer, much like CPUs and GPUs today. The software we develop will be open source, but the IP for our FPGA designs will remain ours,” Vastarpara clarified.

Reflecting on his journey, Vastarpara shared, “I graduated as a software engineer in October 2016. While I could write software, I realised that my true passion lay in electronics and hardware. That’s how Vicharak was born.”

Initially, Vastarpara focused on consultancy projects, which allowed him to bootstrap his company. “We grew a team of 30 people, and by 2022, we had shifted our focus entirely to consumer-facing products,” he added.

Vicharak is already garnering interest from various sectors. “We are set to demo our product within two months, working with government contractors for smart traffic systems and robotics startups. We are nearing the launch stage and expect to see our technology in practical use soon,” he said.

The post This Surat-based Company is Creating AI Hardware for India appeared first on Analytics India Magazine.

TiDB Can Do What MongoDB or CockroachDB Can’t

Even though database solutions have evolved over time, developers are constantly seeking solutions that are flexible, easily scalable, and provide real-time analytics.

TiDB, an advanced distributed SQL database developed by PingCAP, claims to solve all these problems for developers. Its biggest selling point is that it offers a compelling blend of horizontal scalability, MySQL compatibility, and real-time analytics capabilities.

Competitors like CockroachDB lack a built-in real-time analytics engine. While MongoDB does support basic analytics capabilities, it may encounter difficulties when handling complex analytical workloads or extremely large datasets.

The concept behind TiDB, as described by Ed Huang, the co-founder and chief technology officer at PingCAP, originated nearly a decade ago from the challenges he personally encountered in leveraging databases.

Back then, he was employed by a startup where he managed database clusters that relied heavily on MySQL.

“Our business operations were deeply tied to relational databases due to their complex logic. However, our data was growing rapidly, necessitating sharding (a technique that spreads data across numerous MySQL instances),” Huang said in an exclusive interview with AIM.

This meant every few months, the database size would double, requiring them to rebalance and move data constantly.
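
To see why growth forces constant data movement, consider a minimal sketch of hash-based sharding. The article does not describe the scheme Huang's team actually used; the hash-modulo placement, the key format, and the function names below are assumptions chosen for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical keys standing in for rows spread across MySQL instances.
keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_shards=4, new_shards=8)
print(f"{moved:.0%} of keys must move when going from 4 to 8 instances")
```

With this naive scheme, doubling the number of instances relocates roughly half of all rows, and the application has to keep track of where each key lives throughout the move. A database that distributes and rebalances data internally, as Huang goes on to describe, is meant to take that burden off the application.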

Inspired by Google

Huang reveals that this was when he came across two Google papers, which served as an inspiration for TiDB (where ‘Ti’ stands for Titanium).

“About ten years ago, I came across Google’s papers on Spanner and F1—new SQL databases that offer traditional SQL interfaces but are incredibly scalable under the hood. I realised this was the direction we needed to go—a solution that could handle our scaling needs without sacrificing SQL functionality,” Huang said.

Hence, by merging the strengths of distributed or NoSQL databases with those of traditional databases, Huang aimed at creating a new database that application developers would embrace.

“We saw this integration as the future after being inspired by these research papers. This led us to embark on an open-source project to develop a new database from scratch, ensuring compatibility with MySQL. Our extensive experience with MySQL also motivated us to initiate what would become TiDB,” Huang added.

TiDB Architecture

The overall architecture of TiDB is decoupled into two layers: the SQL computing layer and the storage layer. “I’m really proud to say that I wrote the first line of code for TiDB. We built it completely from scratch, forming a brand new community around it,” Huang said.

TiDB’s architecture is designed to manage extensive datasets while accommodating both transactional and analytical workloads seamlessly.

It has a distributed key-value storage system similar to databases like Cassandra or MongoDB, ensuring data is stored across multiple servers for scalability and resilience against failures.

“Another notable aspect of TiDB is its capability to handle both OLTP (Online Transaction Processing) and MySQL-compatible workloads, as well as OLAP (Online Analytical Processing) or analytics workloads concurrently.

“This is made possible by its dual storage engine architecture within the storage layer. One is the key-value-packed TiKV storage engine, optimised for transactional processing. There’s another storage engine known as TiFlash, designed specifically for handling analytics queries efficiently,” Huang added.
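
A toy sketch can show why two storage layouts help. It contrasts a row-oriented key-value layout, which suits the point lookups typical of transactional work, with a column-oriented layout, which suits the scans and aggregates typical of analytics. This is not TiDB code: the sample data and function names are hypothetical, and treating the two engines as a row store versus a column store is a simplification for illustration.

```python
# Hypothetical order records used only for illustration.
orders = [
    {"id": 1, "customer": "a", "amount": 120.0},
    {"id": 2, "customer": "b", "amount": 75.5},
    {"id": 3, "customer": "a", "amount": 30.0},
]

# Row-oriented key-value layout: one key per row, the whole row as the value.
row_store = {row["id"]: row for row in orders}

# Column-oriented layout: one list per column, aligned by position.
column_store = {
    name: [row[name] for row in orders]
    for name in ("id", "customer", "amount")
}


def point_lookup(order_id: int) -> dict:
    """Transactional-style access: fetch one complete row by its key."""
    return row_store[order_id]


def total_amount() -> float:
    """Analytics-style access: scan a single column and ignore the rest."""
    return sum(column_store["amount"])


print(point_lookup(2))
print(total_amount())
```

The lookup touches exactly one row, while the aggregate reads only the `amount` column. Keeping both layouts over the same data is one way a single system can serve both kinds of query.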

Databricks Loves TiDB

Over 3,000 customers currently leverage TiDB, hundreds of whom are PingCAP’s paying customers. Some notable users of TiDB include Databricks, Airbnb, LinkedIn, Dailymotion, and Capcom.

Huang reveals that the US remains the biggest market for TiDB, though companies in other geographies also use the open-source database.

“Databricks is one of our biggest adopters in the US. Actually, all of Databricks’ metadata is supported by TiDB. Another big customer we have in the US is Pinterest. Currently, we manage hundreds of terabytes of data for Pinterest, assisting them in migrating from HBase to TiDB,” Huang revealed.

TiDB sees higher adoption among customers using legacy NoSQL databases. Most of the customers paying for TiDB services are from the Banking, Financial Services and Insurance (BFSI) sector.

“In the past, companies relied on Oracle, MySQL, or other legacy databases. Nowadays, with the shift towards mobile platforms, data volumes have significantly increased, posing challenges for infrastructure, especially in sectors like finance,” Huang said.

These industries often have extensive legacy code built on SQL, making it difficult to transition to NoSQL interfaces seamlessly.

“They still require SQL compatibility for their codebase but now need scalability and robust data consistency at financial-grade levels. Japan’s largest payment company relies on TiDB. We also see great adoption in the e-commerce and gaming industry,” Huang added.

TiDB in India

Flipkart, one of the largest e-commerce companies in India, revealed in a blog post that it has leveraged TiDB to scale to 1 million QPS. The e-commerce giant had faced scaling challenges, which it met by vertically scaling its MySQL cluster. However, it came to see TiDB as the solution.

“Flipkart has been using TiDB as a hot store in production since early 2021 for moderate throughput levels of 60k reads and 15k writes at DB level QPS. We set out to demonstrate the feasibility of using TiDB as a hot SQL data store for use cases with very high QPS and low latency requirements for the first time,” the company said in the blog post.

“Another large logistics company in India is also our customer, they are managing terabytes of data and using our real-time analytics capability. We also have a few SaaS companies using our cloud service,” Huang said.

India is home to many high-growth cloud-native companies that could benefit from TiDB. Moreover, TiDB’s real-time analytical capabilities could be an attractive prospect for many SaaS companies.

“They prefer not to establish multiple data warehouses or utilise various data sources separately for analytics. Our goal is to offer a unified platform where they can use a single system to gain real-time insights seamlessly. As far as I know, other databases like MongoDB or CockroachDB do not come with a real-time analytics engine,” he concluded.

The post TiDB Can Do What MongoDB or CockroachDB Can’t appeared first on Analytics India Magazine.