TechCrunch Minute: Perplexity AI could be worth up to $3B. Here’s why

By Alex Wilhelm

Perplexity AI’s latest large fundraising round could soon be superseded by an even larger chunk of capital, TechCrunch reports. Yes, the $62.7 million that the startup raised at just over a $1 billion valuation could quickly be eclipsed by a raise of as much as $250 million, at a valuation 2.5 to 3x larger.

What’s going on? Rapid revenue growth: the company has reportedly reached around $20 million in annual recurring revenue. Sure, at a $1 billion valuation that’s a 50x revenue multiple, but if the company is growing quickly enough, investors paying up to 150x its current ARR might not be as irrational as it looks on paper, even if similarly priced bets in the 2021 era often struggled.

The hype around Perplexity is a big deal because it shows that some startups are doing well enough to attract outsized venture investment. Good. A concern I have had for some time is that the AI boom would wind up merely enriching incumbents rather than lifting up enough startups to create a new class of tech giants; my view is that a permanent class of tech gods is not the best way to drive long-term innovation. And I think that search, in general, is a good indication of what happens when technology giants fail to meaningfully compete with one another.

So, news from Amazon and Microsoft and Meta and Adobe in the AI realm felt like a reminder this week that Big Tech is going to try to eat the AI moment. Perplexity, to bastardize Star Wars, could be among our key hopes to avoid merely seeing Microsoft or Alphabet add another trilly to their market cap. Hit play, let’s have a chat!

Now Run Programs in Real Time with Llama 3 on Groq

Groq is lightning fast. It recently achieved a throughput of 877 tokens/s on Llama 3 8B and 284 tokens/s on Llama 3 70B. A user on X compared Llama 3 (on Groq) with GPT-4 by asking both to code a snake game in Python, and Groq was exceptionally fast. “There is no comparison,” he said.

I'm amazed by this.
Llama 3 on Groq makes GPT-4 look like a grandpa.
I asked both models to list all the prime numbers from 1 to 1000.
Llama 3 hit over 830 tokens per second(!) pic.twitter.com/g1BcNpCBj4

— Alex Banks (@thealexbanks) April 20, 2024

Andrej Karpathy, a former OpenAI researcher, was also impressed by Groq’s speed and jokingly commented: “Ugh kids these days! Back in my days, we used to watch the tokens stream one at a time and wait for the output.”

Another user wrote, “Llama 3 8b on Groq is absurdly fast and good quality! Here, with a simple prompt, it pumps out interrogatories for a trademark case at 826 tokens/second. Not perfect, but useful, and the output approaches GPT-4 level quality.”

Llama 3 is a compelling choice for enterprises integrating LLMs into their operations. On Groq, Llama 3 is priced at $0.59 per 1 million tokens for input and $0.79 per 1 million tokens for output, significantly lower than the pricing for Claude 3 and GPT-4 by Anthropic and OpenAI.

Mind keeps getting blown every time I see this comparison between @OpenAI GPT-4, @AnthropicAI Claude Opus and @Meta Llama 3 70B on @GroqInc in a post I'm putting together…
~17x lower input cost 🤯
~38x lower output cost 🤯🤯
14x faster @ ~280 vs ~20 tokens per sec 🤯🤯🤯 pic.twitter.com/VGo4NgrAx9

— Hitarth Sharma (@iamhitarth) April 22, 2024
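Those multiples check out against the published numbers. Taking Groq’s Llama 3 prices from above and assuming GPT-4 Turbo list prices of $10 per million input tokens and $30 per million output tokens (an assumption; the tweet does not name its baseline), a quick calculation reproduces the ratios:

  # Back-of-the-envelope check of the cost ratios in the tweet above.
  groq_input, groq_output = 0.59, 0.79    # Groq's Llama 3 prices per 1M tokens
  gpt4_input, gpt4_output = 10.00, 30.00  # assumed GPT-4 Turbo list prices per 1M tokens

  print(f"input cost ratio:  ~{gpt4_input / groq_input:.0f}x")    # ~17x
  print(f"output cost ratio: ~{gpt4_output / groq_output:.0f}x")  # ~38x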

Groq doesn’t offer its LPU hardware directly as a standalone product. Instead, it provides access to the processing power of its LPUs through its cloud service, GroqCloud.

Recently, the company acquired Definitive Intelligence, a Palo Alto-based company that provides various AI solutions designed for businesses, such as chatbots, data analytics tools, and documentation builders.

In a recent interview, Groq founder Jonathan Ross said that within four weeks of launching the cloud service, the company had signed up 70,000 developers, and approximately 18,000 API keys had already been generated.

“It’s really easy to use and doesn’t cost anything to get started. You just use our API, and we’re compatible with most applications that have been built,” said Ross. He added that if any customer has a large-scale requirement and is generating millions of tokens per second, the company can deploy hardware for the customer on-premises.
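Groq’s API follows the familiar OpenAI-style chat-completions shape, so getting started takes only a few lines. Here is a minimal sketch using the official groq Python SDK (the model ID and key handling are illustrative; check GroqCloud’s model list for current names):

  # pip install groq
  import os
  from groq import Groq

  # Assumes a GroqCloud API key is set in the environment.
  client = Groq(api_key=os.environ["GROQ_API_KEY"])

  response = client.chat.completions.create(
      model="llama3-70b-8192",  # illustrative model ID
      messages=[{"role": "user", "content": "List the prime numbers from 1 to 100."}],
  )
  print(response.choices[0].message.content)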

What’s the secret?

Founded in 2016 by Ross, Groq distinguishes itself by eschewing GPUs in favour of its proprietary hardware, the language processing unit (LPU).

Prior to Groq, Ross worked at Google, where he created the tensor processing unit (TPU). He was responsible for designing and implementing the core elements of the original TPU chip, which played a pivotal role in Google’s AI efforts, including the AlphaGo competition.

LPUs are meant only to run LLMs, not to train them. “The LPUs are about 10 times faster than GPUs when it comes to inference or the actual running of the models,” said Ross, adding that training LLMs remains a task for GPUs.

When asked about the purpose of this speed, Ross said, “Human beings don’t like to read like this, as if something is being printed out like an old teletype machine. Eyes scan a page really quickly and figure out almost instantly whether or not they’ve got what they want.”

Groq’s LPU poses a significant challenge to traditional GPU manufacturers like NVIDIA, AMD, and Intel. Groq built its tensor streaming processor specifically to speed up deep learning computations, rather than modifying general-purpose processors for AI.

The LPU is designed to overcome the two main LLM bottlenecks: compute density and memory bandwidth. For LLM workloads, an LPU has greater compute capacity than a GPU or CPU, which reduces the time needed to calculate each word and allows text sequences to be generated much faster.

Additionally, eliminating external memory bottlenecks enables the LPU inference engine to deliver orders of magnitude better performance on LLMs compared to GPUs. The LPU is designed to prioritise the sequential processing of data, which is inherent in language tasks. This contrasts with GPUs, which are optimised for parallel processing tasks such as graphics rendering.

“You can’t produce the 100th word until you’ve produced the 99th so there is a sequential component to them that you just simply can’t get out of a GPU,” said Ross.
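That sequential dependency is visible in the decoding loop itself: each step consumes every token produced so far, so the steps cannot run in parallel no matter how wide the hardware is. A schematic sketch (illustrative pseudocode, not Groq’s implementation; predict_next is a hypothetical single-step API):

  def generate(model, prompt_tokens, n_new):
      """Schematic greedy decoding: step t depends on the output of step t-1."""
      tokens = list(prompt_tokens)
      for _ in range(n_new):
          next_token = model.predict_next(tokens)  # hypothetical single-step call
          tokens.append(next_token)                # token t feeds step t+1
      return tokens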

Moreover, he added that GPUs are notoriously thirsty for power, often requiring as much power as the average household per chip. “LPUs use as little as a tenth as much power,” he said.

What’s Next?

Groq recently partnered with Earth Wind & Power to develop the first European vertically integrated AI Compute Centre in Norway. Groq has committed to deploying and operating 21,600 LPUs at Earth Wind & Power’s AI Compute Center in 2024, with the option to increase this number to 129,600 LPUs in 2025.

“If we can deploy over 220,000 LPUs this year, given how much faster they are than GPUs, it would be equivalent to more than all of Meta’s compute,” said Ross, adding that next year they want to deploy 1.5 million LPUs, which would be more than all of the compute of all the tech hyperscalers combined.


Apple Releases Four Open Source LLMs with OpenELM Series of Models


Apple has open-sourced OpenELM, a collection of Efficient Language Models (ELMs). OpenELM utilises a layer-wise scaling approach to efficiently distribute parameters within each layer of the transformer model, resulting in improved accuracy.

The models are available on Hugging Face.

OpenELM models were pre-trained using the CoreNet library. They come in 270M, 450M, 1.1B, and 3B parameter sizes, each available in both pre-trained and instruction-tuned variants.

The pre-training dataset consists of RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totaling approximately 1.8 trillion tokens. Please review the licence agreements and terms of use for these datasets before utilising them.
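Loading a checkpoint looks roughly like this (a sketch: OpenELM ships custom modelling code, so trust_remote_code=True is required, and Apple’s model card pairs the models with a Llama 2 tokenizer, which is gated on Hugging Face; verify the exact IDs before use):

  from transformers import AutoModelForCausalLM, AutoTokenizer

  model = AutoModelForCausalLM.from_pretrained(
      "apple/OpenELM-270M-Instruct",
      trust_remote_code=True,  # OpenELM uses custom modelling code
  )
  # Apple's model card uses a Llama 2 tokenizer (gated; request access first).
  tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

  inputs = tokenizer("Once upon a time there was", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=32)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))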

Apple is joining the public AI game with 4 new models on the Hugging Face hub! https://t.co/oOefpK37J9

— clem 🤗 (@ClementDelangue) April 24, 2024

For instance, with a parameter budget of around one billion parameters, OpenELM demonstrates a remarkable 2.36% increase in accuracy compared to OLMo, while requiring only half the pre-training tokens.

In benchmarking, modern, consumer-grade hardware was used, with BFloat16 as the data type. CUDA benchmarks were conducted on a workstation equipped with an Intel i9-13900KF CPU, 64 GB of DDR5-4000 DRAM, and an NVIDIA RTX 4090 GPU with 24 GB of VRAM, running Ubuntu 22.04.

To benchmark OpenELM models on Apple silicon, an Apple MacBook Pro with an M2 Max system-on-chip and 64GiB of RAM, running macOS 14.4.1, was employed.

Token throughput was measured in tokens processed per second, covering both prompt processing (pre-fill) and token generation. All models were benchmarked sequentially, with a full “dry run” generating 1024 tokens performed for the first model, a warm-up that significantly increased generation throughput for the subsequent models.
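A comparable measurement is easy to reproduce for any causal LM. A rough sketch for a Hugging Face-style model and tokenizer, assumed already loaded (the warm-up call mirrors the dry run described above):

  import time

  def tokens_per_second(model, tokenizer, prompt, n_new=256):
      """Rough throughput: prompt processing (pre-fill) plus generation."""
      inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
      model.generate(**inputs, max_new_tokens=n_new)  # warm-up "dry run"
      start = time.perf_counter()
      out = model.generate(**inputs, max_new_tokens=n_new)
      elapsed = time.perf_counter() - start
      return out.shape[-1] / elapsed  # total tokens handled per second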

The entire framework, including training logs, multiple checkpoints, pre-training configurations, and MLX inference code, has been made open-source, aiming to empower and strengthen the open research community, facilitating future research efforts.


Integrating Generative AI in Content Creation

Image generated with Ideogram.ai

I know how hard it is to create content, especially content that captivates an audience.

No matter what medium we use (written, audio, or visual), every one requires thought and strategy to deliver value.

However, not everyone has enough time to produce quality content, especially working people who double as content creators, like me. Work and personal life can get in the way.

Things have changed in recent years as Generative AI has become mainstream. Used effectively, Generative AI has real potential to improve both the quality and the efficiency of our content.

Generative AI, like ChatGPT, can make content creation easier. You might wonder, “Why don’t we ask the tool to create our content fully?” Well, my personal experience says that’s a bad idea. Uncurated, entirely AI-generated content tends to be poor quality and needs rework. That’s why I suggest we use the tool to help us create content, not to replace our work.

This article will discuss how we can integrate Generative AI into content creation. Let’s learn together.

Generative AI in Content Creation

When we talk about Generative AI, we mean tools capable of generating new content that resembles human-made content.

Many tools fall into the Generative AI category. For example, ChatGPT is a Generative AI tool that produces text, while Midjourney produces images. Over time, these tools have evolved to generate more human-like content than ever.

We will look at integrating Generative AI tools into our content creation funnel, separated into two parts: text and image.

Text Content

Text-generation Generative AI tools produce human-like text from the input we give, and the model can follow the intent of our prompt closely: generating emails, creating content drafts, planning social media schedules, and so on.

Here are some of the ways a text-generation model can help your content creation.

Content Ideation and Brainstorming

Suppose you are a content creator with a specific niche (data science, data engineering, machine learning, and so on) and you need new ideas to engage your audience. You could prompt a Generative AI model such as ChatGPT to generate new ideas.

For example, I use ChatGPT to generate content ideas with this prompt:

“I am a data scientist content creator, and I want to post content about why data science is important for business.

Generate me 3 content ideas that I can post about.”

The result is a list of content ideas, each with guidance on how to approach the content. For example:

  • Case Studies on Data-Driven Decision Making
  • The Role of Data Science in Enhancing Customer Experience
  • Emerging Trends in Data Science and Business Intelligence

The actual result contains a longer explanation of how to create the content and what we should include.

ChatGPT here facilitates creative thinking, as we don’t ask it to create the content directly. It serves as a partner to spark ideas and validate our thoughts.
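If you want to fold this step into a script rather than the chat UI, the same prompt works through the API. A minimal sketch with the OpenAI Python SDK (the model name and key handling are illustrative):

  # pip install openai
  import os
  from openai import OpenAI

  client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

  prompt = (
      "I am a data scientist content creator, and I want to post content "
      "about why data science is important for business.\n"
      "Generate me 3 content ideas that I can post about."
  )

  response = client.chat.completions.create(
      model="gpt-4",  # illustrative; any chat-capable model works
      messages=[{"role": "user", "content": prompt}],
  )
  print(response.choices[0].message.content)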

Content Outline Draft

The Generative AI tool is capable of more than generating ideas; it can also help us create a more detailed outline.

While the output requires validation, having an outline for our content cuts down the time needed to write it.

For example, I ask ChatGPT to create a blog outline with this prompt:

“Write a draft outline for a 300-word blog on why data scientists should learn about data engineering.”

The result is a draft outline for us to fill. Here are my results with the prompt above:

Title: Bridging the Gap: The Importance of Data Engineering for Data Scientists

Introduction (50 words)

Body

  • Foundation of Data Science (60 words)
  • Enhanced Collaboration and Communication (60 words)
  • Increased Autonomy and Innovation (60 words)
  • Career Advancement and Versatility (60 words)

The result also contains a more detailed explanation of what we should write and how to write about it.

Content Enhancement

As a content creator, one of the best pieces of advice we ever get is to repurpose our content.

Sometimes, our content is already good enough, but a new development makes our previous content obsolete.

Or our existing content doesn’t get enough engagement because something is lacking.

Either way, the content we already have is still good enough to reuse for our next piece. However, we still need to change something to improve its quality.

This is where Generative AI comes in to help enhance our previous content. By reading and understanding our content, the tool can bring a new perspective and supply structure that was missing.

For example, here is a prompt template you can use, along with hypothetical results from ChatGPT.

I have my content in the text below:

##INSERT CONTENT##

Can you analyze what is wrong with it and provide detailed ideas on how I can improve it to increase audience engagement?

Here is the result:

  • Enhance the Title: Make the title more engaging and curiosity-evoking.
  • Improve the Introduction: Include a compelling fact, question, or anecdote.
  • Incorporate Real-world Examples: Use case studies or examples in the opportunities section.
  • Provide Solutions to Challenges: Discuss ongoing efforts and solutions to the challenges mentioned.
  • Add Visuals and Infographics: Use charts, visuals, and infographics to illustrate key points.
  • Strengthen the Conclusion: End with a strong call to action for the readers.

As you can see, ChatGPT could provide detailed information on why our content is not working as expected and how to improve it.
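A small helper makes this repeatable across a back-catalogue of posts (a sketch that simply fills the hypothetical template above):

  def enhancement_prompt(content: str) -> str:
      """Fill the content-review template above with a piece of content."""
      return (
          "I have my content in the text below:\n\n"
          f"{content}\n\n"
          "Can you analyze what is wrong with it and provide detailed ideas "
          "on how I can improve it to increase audience engagement?"
      )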

Image Content

Unlike Text Content, Image Content relies on the visual aspect to captivate the audience.

Using a Generative AI tool to generate images might seem controversial to some, but it’s still a valid way to engage the audience. However, image content often plays a supporting role alongside text content. That’s why we usually have the text content ready before the image content.

Let’s see a few use cases for Generative AI for Image content.

Images to Accompany Text Content

The simplest use case for Generative AI here is developing an image to accompany your text content.

On social media, posts with images usually get much more engagement and convey details the audience might otherwise miss. Amid endless scrolling, an image can make a viewer stop on your content.

With Generative AI, you can use the prompt to generate an image suitable for your content.

In the example below, I use ChatGPT with a prompt that would execute DALL·E for image generation.

“I want to post content about working as a data scientist. Can you generate a simple image appropriate for the content I would post?”

Image generated with DALL·E 3

The result is an image of a data scientist working in front of multiple screens. It’s a simple image, but we can always tweak it to meet our needs.
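The same generation can be scripted through OpenAI’s Images API instead of the ChatGPT UI (a sketch; the model name and parameters are illustrative):

  import os
  from openai import OpenAI

  client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

  image = client.images.generate(
      model="dall-e-3",  # illustrative model name
      prompt=(
          "A simple illustration of a data scientist working in front of "
          "multiple screens, suitable for a social media post."
      ),
      size="1024x1024",
      n=1,
  )
  print(image.data[0].url)  # temporary URL for downloading the image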

Speaking of tweaking, let’s discuss how Generative AI helps with image content personalization.

Personalized Content Creation

Personalized content can create more engagement than standard image generation because each piece is catered to specific aspects of an individual. You can personalize image content even further for email campaigns or customer gifts.

For example, say you want to send content celebrating your most valuable audience member, Charlie. You can generate an image that reflects his personality to show how much you care about your audience. Here is a sample prompt and the image generated with Ideogram.ai.

“Typography "Happy Birthday, Charlie" with library book background and cat playing around.”

Image generated with Ideogram.ai

Of course, the personalization varies for each content creator. Do you want each individual to get a personal or slightly more generic image? The choice is up to you.

Conclusion

As a content creator, sometimes content creation can be mundane and cost us a lot of time. With Generative AI, we can make our work more efficient by delegating parts of it to the model.

In this article, we learned how Generative AI integrates into content creation by helping us improve our text and image content. The generative model can help us enhance our current content or serve as a brainstorming partner.

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.


Nvidia acquires AI workload management startup Run:ai

By Kyle Wiggers

Nvidia is acquiring Run:ai, a Tel Aviv-based company that makes it easier for developers and operations teams to manage and optimize their AI hardware infrastructure, for an undisclosed sum.

Ctech reported earlier this morning that the companies were in “advanced negotiations” that could see Nvidia pay upwards of $1 billion for Run:ai. Evidently, those negotiations went off without a hitch.

Nvidia says that it’ll continue to offer Run:ai’s products “under the same business model” for the immediate future, and invest in Run:ai’s product roadmap as part of Nvidia’s DGX Cloud AI platform.

“Run:ai has been a close collaborator with Nvidia since 2020 and we share a passion for helping our customers make the most of their infrastructure,” Omri Geller, Run:ai’s CEO, said in a statement. “We’re thrilled to join Nvidia and look forward to continuing our journey together.”

Geller co-founded Run:ai with Ronen Dar several years ago after the two studied together at Tel Aviv University under professor Meir Feder, Run:ai’s third co-founder. Geller, Dar and Feder sought to build a platform that could “break up” AI models into fragments that run in parallel across hardware, whether on-premises, on clouds or at the edge.

While Run:ai has relatively few direct competitors, other startups are applying the concept of dynamic hardware allocation to AI workloads. For example, Grid.ai offers software that allows data scientists to train AI models across GPUs, processors and more in parallel.

But relatively early in its life, Run:ai managed to establish a large customer base of Fortune 500 companies — which in turn attracted VC investment. Prior to the acquisition, Run:ai had raised $118 million in capital from backers including Insight Partners, Tiger Global, S Capital and TLV Partners.

Everything You Need to Know About Llama 3 | Most Powerful Open-Source Model Yet | Concepts to Usage


Meta has recently released Llama 3, the next generation of its state-of-the-art open source large language model (LLM). Building on the foundations set by its predecessor, Llama 3 aims to enhance the capabilities that positioned Llama 2 as a significant open-source competitor to ChatGPT, as outlined in the comprehensive review in the article Llama 2: A Deep Dive into the Open-Source Challenger to ChatGPT.

In this article we will discuss the core concepts behind Llama 3, explore its innovative architecture and training process, and provide practical guidance on how to access, use, and deploy this groundbreaking model responsibly. Whether you are a researcher, developer, or AI enthusiast, this post will equip you with the knowledge and resources needed to harness the power of Llama 3 for your projects and applications.

The Evolution of Llama: From Llama 2 to Llama 3

Meta's CEO, Mark Zuckerberg, announced the debut of Llama 3, the latest AI model developed by Meta AI. This state-of-the-art model, now open-sourced, is set to enhance Meta's various products, including Messenger and Instagram. Zuckerberg highlighted that Llama 3 positions Meta AI as the most advanced freely available AI assistant.

Before we talk about the specifics of Llama 3, let's briefly revisit its predecessor. Introduced in 2023, Llama 2 was a significant milestone in the open-source LLM landscape, offering a powerful and efficient model that could be run on consumer hardware.

However, while Llama 2 was a notable achievement, it had its limitations. Users reported issues with false refusals (the model refusing to answer benign prompts), limited helpfulness, and room for improvement in areas like reasoning and code generation.

Enter Llama 3: Meta's response to these challenges and the community's feedback. With Llama 3, Meta has set out to build the best open-source models on par with the top proprietary models available today, while also prioritizing responsible development and deployment practices.

Llama 3: Architecture and Training

One of the key innovations in Llama 3 is its tokenizer, which features a significantly expanded vocabulary of 128,256 tokens (up from 32,000 in Llama 2). This larger vocabulary allows for more efficient encoding of text, both for input and output, potentially leading to stronger multilingualism and overall performance improvements.
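The efficiency gain is easy to measure directly: encode the same text with both tokenizers and compare lengths. A sketch (both model repos are gated on Hugging Face, so access must be requested first):

  from transformers import AutoTokenizer

  tok_llama2 = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
  tok_llama3 = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

  text = "Grouped-query attention helps models handle longer contexts efficiently."
  print("Llama 2 tokens:", len(tok_llama2.encode(text)))
  print("Llama 3 tokens:", len(tok_llama3.encode(text)))
  print("vocab sizes:", len(tok_llama2), "vs", len(tok_llama3))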

Llama 3 also incorporates Grouped-Query Attention (GQA), an efficient representation technique that enhances scalability and helps the model handle longer contexts more effectively. Meta adopted GQA across both the 8B and 70B sizes, and both models can process sequences of up to 8,192 tokens.

Training Data and Scaling

The training data used for Llama 3 is a crucial factor in its improved performance. Meta curated a massive dataset of over 15 trillion tokens from publicly available online sources, seven times larger than the dataset used for Llama 2. This dataset also includes a significant portion (over 5%) of high-quality non-English data, covering more than 30 languages, in preparation for future multilingual applications.

To ensure data quality, Meta employed advanced filtering techniques, including heuristic filters, NSFW filters, semantic deduplication, and text classifiers trained on Llama 2 to predict data quality. The team also conducted extensive experiments to determine the optimal mix of data sources for pretraining, ensuring that Llama 3 performs well across a wide range of use cases, including trivia, STEM, coding, and historical knowledge.

Scaling up pretraining was another critical aspect of Llama 3's development. Meta developed scaling laws that enabled it to predict the performance of its largest models on key tasks, such as code generation, before actually training them. This informed the decisions on data mix and compute allocation, ultimately leading to more efficient and effective training.

Llama 3's largest models were trained on two custom-built 24,000 GPU clusters, leveraging a combination of data parallelization, model parallelization, and pipeline parallelization techniques. Meta's advanced training stack automated error detection, handling, and maintenance, maximizing GPU uptime and increasing training efficiency by approximately three times compared to Llama 2.

Instruction Fine-tuning and Performance

To unlock Llama 3's full potential for chat and dialogue applications, Meta innovated its approach to instruction fine-tuning. Its method combines supervised fine-tuning (SFT), rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO).

The quality of the prompts used in SFT and the preference rankings used in PPO and DPO played a crucial role in the performance of the aligned models. Meta's team carefully curated this data and performed multiple rounds of quality assurance on annotations provided by human annotators.

Training on preference rankings via PPO and DPO also significantly improved Llama 3's performance on reasoning and coding tasks. Meta found that even when a model struggles to answer a reasoning question directly, it may still produce the correct reasoning trace. Training on preference rankings enabled the model to learn how to select the correct answer from these traces.

The results speak for themselves: Llama 3 outperforms many available open-source chat models on common industry benchmarks, establishing new state-of-the-art performance for LLMs at the 8B and 70B parameter scales.

Responsible Development and Safety Considerations

While pursuing cutting-edge performance, Meta also prioritized responsible development and deployment practices for Llama 3. The company adopted a system-level approach, envisioning Llama 3 models as part of a broader ecosystem that puts developers in the driver's seat, allowing them to design and customize the models for their specific use cases and safety requirements.

Meta conducted extensive red-teaming exercises, performed adversarial evaluations, and implemented safety mitigation techniques to lower residual risks in its instruction-tuned models. However, the company acknowledges that residual risks will likely remain and recommends that developers assess these risks in the context of their specific use cases.

To support responsible deployment, Meta has updated its Responsible Use Guide, providing a comprehensive resource for developers to implement model and system-level safety best practices for their applications. The guide covers topics such as content moderation, risk assessment, and the use of safety tools like Llama Guard 2 and Code Shield.

Llama Guard 2, built on the MLCommons taxonomy, is designed to classify LLM inputs (prompts) and responses, detecting content that may be considered unsafe or harmful. Meta has also released CyberSecEval 2, which expands on its predecessor by adding measures to prevent abuse of the model's code interpreter, offensive cybersecurity capabilities, and susceptibility to prompt injection attacks.

Code Shield, a new introduction with Llama 3, adds inference-time filtering of insecure code produced by LLMs, mitigating risks associated with insecure code suggestions, code interpreter abuse, and secure command execution.

Accessing and Using Llama 3

Meta has made Llama 3 models available through various channels, including direct download from the Meta Llama website, Hugging Face repositories, and popular cloud platforms like AWS, Google Cloud, and Microsoft Azure.

To download the models directly, users must first accept Meta's Llama 3 Community License and request access through the Meta Llama website. Once approved, users will receive a signed URL to download the model weights and tokenizer using the provided download script.

Alternatively, users can access the models through the Hugging Face repositories, where they can download the original native weights or use the models with the Transformers library for seamless integration into their machine learning workflows.

Here's an example of how to use the Llama 3 8B Instruct model with Transformers:

  # Install required libraries
  !pip install -U transformers accelerate
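A minimal sketch of the pattern shown on the Meta-Llama-3-8B-Instruct model card follows (assumes licence access has been granted on Hugging Face and a GPU is available; treat the exact arguments as illustrative):

  import torch
  from transformers import pipeline

  model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated: accept the licence first

  pipe = pipeline(
      "text-generation",
      model=model_id,
      model_kwargs={"torch_dtype": torch.bfloat16},
      device_map="auto",
  )

  messages = [
      {"role": "system", "content": "You are a concise technical assistant."},
      {"role": "user", "content": "Summarize grouped-query attention in two sentences."},
  ]

  # Build the chat prompt from the model's chat template, then generate.
  prompt = pipe.tokenizer.apply_chat_template(
      messages, tokenize=False, add_generation_prompt=True
  )
  outputs = pipe(prompt, max_new_tokens=128, do_sample=False)
  print(outputs[0]["generated_text"][len(prompt):])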

Deploying Llama 3 at Scale

In addition to providing direct access to the model weights, Meta has partnered with various cloud providers, model API services, and hardware platforms to enable seamless deployment of Llama 3 at scale.

One of the key advantages of Llama 3 is its improved token efficiency, thanks to the new tokenizer. Benchmarks show that Llama 3 requires up to 15% fewer tokens compared to Llama 2, resulting in faster and more cost-effective inference.

The integration of Grouped Query Attention (GQA) in the 8B version of Llama 3 contributes to maintaining inference efficiency on par with the 7B version of Llama 2, despite the increase in parameter count.

To simplify the deployment process, Meta has provided the Llama Recipes repository, which contains open-source code and examples for fine-tuning, deployment, model evaluation, and more. This repository serves as a valuable resource for developers looking to leverage Llama 3's capabilities in their applications.

For those interested in exploring Llama 3's performance, Meta has integrated its latest models into Meta AI, a leading AI assistant built with Llama 3 technology. Users can interact with Meta AI through various Meta apps, such as Facebook, Instagram, WhatsApp, Messenger, and the web, to get things done, learn, create, and connect with the things that matter to them.


What's Next for Llama 3?

While the 8B and 70B models mark the beginning of the Llama 3 release, Meta has ambitious plans for the future of this groundbreaking LLM.

In the coming months, we can expect to see new capabilities introduced, including multimodality (the ability to process and generate different data modalities, such as images and videos), multilingualism (supporting multiple languages), and much longer context windows for enhanced performance on tasks that require extensive context.

Additionally, Meta plans to release larger model sizes, including models with over 400 billion parameters, which are currently in training and showing promising trends in terms of performance and capabilities.

To further advance the field, Meta will also publish a detailed research paper on Llama 3, sharing its findings and insights with the broader AI community.

As a sneak preview of what's to come, Meta has shared some early snapshots of its largest LLM model's performance on various benchmarks. While these results are based on an early checkpoint and are subject to change, they provide an exciting glimpse into the future potential of Llama 3.

Conclusion

Llama 3 represents a significant milestone in the evolution of open-source large language models, pushing the boundaries of performance, capabilities, and responsible development practices. With its innovative architecture, massive training dataset, and cutting-edge fine-tuning techniques, Llama 3 establishes new state-of-the-art benchmarks for LLMs at the 8B and 70B parameter scales.

However, Llama 3 is more than just a powerful language model; it's a testament to Meta's commitment to fostering an open and responsible AI ecosystem. By providing comprehensive resources, safety tools, and best practices, Meta empowers developers to harness the full potential of Llama 3 while ensuring responsible deployment tailored to their specific use cases and audiences.

As the Llama 3 journey continues, with new capabilities, model sizes, and research findings on the horizon, the AI community eagerly awaits the innovative applications and breakthroughs that will undoubtedly emerge from this groundbreaking LLM.

Whether you're a researcher pushing the boundaries of natural language processing, a developer building the next generation of intelligent applications, or an AI enthusiast curious about the latest advancements, Llama 3 promises to be a powerful tool in your arsenal, opening new doors and unlocking a world of possibilities.

Snowflake Releases Open Enterprise LLM, Arctic with 480 Billion Parameters


After open-sourcing the Arctic family of text embedding models, Snowflake is now adding another LLM to the list for enterprise use cases. Snowflake Arctic sets a new standard for openness and enterprise-grade performance.

Designed with a unique Mixture-of-Experts (MoE) architecture, Arctic provides top-tier optimisation for complex enterprise workloads, surpassing several industry benchmarks in SQL code generation, instruction following, and more.

Arctic’s unique MoE design enhances both training systems and model performance with a carefully crafted data composition tailored to enterprise needs. With a breakthrough in efficiency, Arctic activates only 17 billion of its 480 billion parameters at a time, achieving industry-leading quality with unprecedented token efficiency.

“Despite using 17x less compute budget, Arctic is on par with Llama3 70B in language understanding and reasoning while surpassing in Enterprise Metrics,” said Baris Gultekin, Snowflake’s head of AI.

Compared to other models, Arctic activates approximately 50% fewer parameters than DBRX, and 80% fewer than Grok-1 during inference or training. Moreover, it outperforms leading open models such as DBRX, Llama 2 70B, Mixtral-8x7B, and more in coding (HumanEval+, MBPP+) and SQL generation (Spider and Bird-SQL), while also providing superior performance in general language understanding (MMLU).
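Those percentages line up with commonly reported active-parameter counts for the other models (the DBRX and Grok-1 figures below, roughly 36B and 86B active, are assumptions drawn from their public releases):

  # Sanity check of Arctic's active-parameter comparisons (billions).
  arctic, dbrx, grok1 = 17, 36, 86  # DBRX and Grok-1 values assumed
  print(f"vs DBRX:   {1 - arctic / dbrx:.0%} fewer active parameters")   # ~53%
  print(f"vs Grok-1: {1 - arctic / grok1:.0%} fewer active parameters")  # ~80%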

“This is a watershed moment for Snowflake, with our AI research team innovating at the forefront of AI,” said Sridhar Ramaswamy, CEO, Snowflake. “By delivering industry-leading intelligence and efficiency in a truly open way to the AI community, we are furthering the frontiers of what open source AI can do. Our research with Arctic will significantly enhance our capability to deliver reliable, efficient AI to our customers,” he said.

The best open model?

The best part is that Snowflake is releasing Arctic’s weights under an Apache 2.0 licence, along with details of the research behind its training, establishing a new level of openness for enterprise AI technology. “With the Apache 2 licensed Snowflake Arctic embed family of models, organisations now have one more open alternative to black-box API providers such as Cohere, OpenAI, or Google,” says Snowflake.

“The continued advancement and healthy competition between open source AI models is pivotal not only to the success of Perplexity, but the future of democratising generative AI for all,” said Aravind Srinivas, co-founder and CEO, Perplexity. “We look forward to experimenting with Snowflake Arctic to customise it for our product, ultimately generating even greater value for our end users.”

As part of the Snowflake Arctic model family, Arctic is the most open LLM available, allowing ungated personal, research, and commercial use with its Apache 2.0 licence. Snowflake goes further by providing code templates, along with flexible inference and training options, enabling users to deploy and customise Arctic quickly using their preferred frameworks, including NVIDIA NIM with NVIDIA TensorRT-LLM, vLLM, and Hugging Face.

Yoav Shoham, co-founder and co-CEO, AI21 Labs, said, “We are excited to see Snowflake help enterprises harness the power of open source models, as we did with our recent release of Jamba — the first production-grade Mamba-based Transformer-SSM model.”

For immediate use, Arctic is available for serverless inference in Snowflake Cortex, Snowflake’s fully managed service offering machine learning and AI solutions in the Data Cloud, alongside other model gardens and catalogues such as Hugging Face, Lamini, Microsoft Azure, NVIDIA API catalogue, Perplexity, Together, and more.

“We’re pleased to increase enterprise customer choice in the rapidly evolving AI landscape by bringing the robust capabilities of Snowflake’s new LLM model Arctic to the Microsoft Azure AI model catalogue,” said Eric Boyd, corporate vice president, Azure AI Platform, Microsoft.

Everyone loves the winter

Snowflake’s AI research team, comprising industry-leading researchers and system engineers, developed Arctic in less than three months, spending roughly one-eighth of the training cost of similar models. Snowflake has set a new benchmark for the speed at which state-of-the-art open, enterprise-grade models can be trained, enabling users to create cost-efficient custom models at scale.

Clement Delangue, CEO and co-founder of Hugging Face said, “We’re excited to see Snowflake contributing significantly with this release not only of the model with an Apache 2.0 licence but also with details on how it was trained. It gives the necessary transparency and control for enterprises to build AI and for the field as a whole to break new grounds.”

Snowflake Ventures has also recently invested in LandingAI, Mistral AI, Reka, and others, reinforcing its commitment to helping customers derive value from their enterprise data with LLMs and AI.

“Snowflake and Reka are committed to getting AI into the hands of every user, regardless of their technical expertise, to drive business outcomes faster,” said Dani Yogatama, co-founder and CEO, Reka. “With the launch of Snowflake Arctic, Snowflake is furthering this vision by putting world-class truly-open large language models at users’ fingertips.”

Additionally, Snowflake has expanded its partnership with NVIDIA to further AI innovation, combining the full-stack NVIDIA accelerated platform with Snowflake’s Data Cloud to provide a secure and powerful infrastructure and compute capabilities for unlocking AI productivity.


Adobe Launches Firefly Image 3 Beta With Auto Stylisation, Structure Reference Capabilities

Adobe officially released the beta version of the Firefly Image 3 Foundation Model during the company’s Creativity Conference on Tuesday.

Firefly Image 3 is the latest addition to Adobe’s creative generative AI models, and it brings several new features. The most interesting of these is greater user control over personalisation, with auto stylisation capabilities as well as the ability to reference style and structure.

“Structure Reference enables users to easily apply the structure of an existing image to newly generated images. You can now use an existing image as a structural reference template and generate multiple image variations with the same layout,” the company said.

Similarly, the style reference feature allows users to upload style preferences and generate images based on them.

Generative Expand, powered by #AdobeFirefly Image 3 and your wildest dreams. 🏝💎 Available now: https://t.co/ycJsFSHw9X pic.twitter.com/N5idUW4Tu0

— Adobe (@Adobe) April 23, 2024

Apart from this, Firefly boasts higher quality image generation with more variety, as the newer model relies on a new style engine: “Image outputs include new varieties of styles, colours, backgrounds, subject poses and more.”

At Firefly’s launch early last year, the model was criticised for barely being on par with other generative models on the market, such as Midjourney and Stable Diffusion. However, with the introduction of its structure reference ability, Firefly seems to be fast catching up to its competitors.

This advantage is amplified by the ease of access for users who already rely on Adobe for work and personal use.

“Firefly has been used to generate over 7 billion images worldwide since its initial debut in March 2023. Adobe built it for direct integration into workflows Adobe customers use every day, including Adobe Photoshop, Adobe Express, Adobe Illustrator, Adobe Substance 3D and now Adobe InDesign,” the company said.

Currently, the latest version of Firefly is available for beta testing in Photoshop and the Firefly web app. As is the case with all Firefly models, the company has also ensured that Content Credentials are automatically attached to every image.

Adobe has also stated that the models are trained on only licensed content, including Adobe Stock, to avoid issues surrounding copyright infringement.


5 Free Advanced Python Programming Courses

Image by Author

Getting started with Python or finding good introductory courses is relatively easy, but when it comes to mastering advanced concepts, finding free yet high-quality resources can be quite challenging. Most of the excellent content for advanced courses is typically behind a paywall. However, fear not! Today, I've got you covered. I'll be sharing a list of 5 advanced Python courses that you can take to level up your skills without spending a penny. So, without any further wait, let's dive in!

1. Python 3 Programming Specialization by University of Michigan

This specialization, available on Coursera, is well-known in the Python community, boasting a whopping 4.7 rating and over 16,000 reviews. It comprises 5 courses covering a wide range of advanced topics. Since you're already familiar with the basics of Python, feel free to skip the introductory course and explore the rest. Here's a brief overview:

Course 2: Python Functions, Files, and Dictionaries: Dive into dictionary data structures, user-defined functions, sorting techniques, and more.
Course 3: Data Collection and Processing with Python: Master Python list comprehensions, interact with REST APIs, and manipulate data efficiently.
Course 4: Python Classes and Inheritance: Learn about classes, instances, inheritance, and advanced class design principles.
Course 5: Python Project: pillow, tesseract, and OpenCV: Gain hands-on experience with image manipulation, text detection, and face recognition using third-party libraries.

Course Link: Python 3 Programming Specialization by University of Michigan

Note: You can audit this specialization to enjoy the content for free. However, you won't receive a certificate of completion unless you pay for the specialization.

2. Advanced Python by Patrick Loeber

Patrick Loeber, a software engineer and developer advocate at AssemblyAI, offers an advanced Python course through videos on his YouTube channel, which has over 263K subscribers. The code used in the explanations can be found on his website. His course covers a variety of topics, including:

  • Lists, Tuples, Dictionaries, Strings, Collections, and Sets
  • Functional Programming with Lambda functions and Itertools
  • Exception Handling, Logging, and JSON Manipulation
  • Multithreading, Multiprocessing, and Concurrency
  • The asterisk (*) operator
  • Shallow vs. Deep Copying
  • Context Managers
  • And much more!

Course Link: Advanced Python by Patrick Loeber

3. Learn Advanced Python 3 by Codecademy

Codecademy is a popular online platform that offers numerous free courses. This particular course takes 6 hours to complete and will take your Python programming skills to the next level. You'll learn new paradigms that will give you the flexibility to create clean, effective code and make you a truly advanced Python 3 programmer. The fun part about this course is that it includes mini-projects that deepen your understanding of the concepts under discussion.

Here's the course content:

  • Learn to debug and track software with logging, including an ATM project
  • Explore creating efficient programs using functional programming, with a focus on higher-order functions
  • Analyze hotel databases using SQLite 3 for a deeper understanding of Python's database capabilities
  • Implement code more efficiently through concurrent programming techniques
  • Discover how to package and deploy Python scripts using Flask for effective application distribution

If you find the content of the advanced course a bit challenging, you can step down to their Learn Intermediate Python 3 course. It covers topics like functions, OOP, unit testing, iterators and generators, specialized collections, and resource management in Python.

Course Link: Learn Advanced Python 3 by Codecademy

4. Python Programming MOOC 2023

This course material page offers both the Introduction to Programming course (BSCS1001, 5 cr) and the Advanced Course in Programming (BSCS1002, 5 cr) from the Department of Computer Science at the University of Helsinki. If you're already familiar with Python basics, you can use the first part of the course as a refresher or skip it entirely. However, the real gem lies in the second part, which focuses on advanced Python programming concepts. You'll find recordings, slides, and numerous exercises to sharpen your skills.

Here's what this course covers:

  • Objects and Methods, Encapsulation, Scope of Methods, and Class Attributes
  • Class Hierarchies, Access Modifiers, Object-Oriented Programming Techniques, and Developing a Larger Application
  • List Comprehensions and Recursion
  • Functions as Arguments, Generators, Functional Programming, and Regular Expressions
  • PyGame — Animation, Events, and Different Techniques
  • Gaming project in Python from Scratch

Course Link: Python Programming MOOC 2023

5. Scientific Computing with Python (Beta) — FreeCodeCamp

If you prefer project-based learning, this course is well-suited for you. The Scientific Computing with Python (Beta) curriculum will equip you with the skills to analyze and manipulate data using Python. You'll learn key concepts like data structures, algorithms, object-oriented programming, and how to perform complex calculations using a variety of tools.

Let's take a look at the course content:

  • Learn String Manipulation by Building a Cipher
  • Learn How to Work with Numbers and Strings by Implementing the Luhn Algorithm
  • Learn Lambda Functions by Creating an Expense Tracker
  • Learn Python List Comprehension by Building a Case Converter Program
  • Learn Regular Expressions by Building a Password Generator Program
  • Learn Algorithm Design by Building a Shortest Path Algorithm
  • Learn Recursion by Solving the Tower of Hanoi Mathematical Puzzle
  • Learn Data Structures by Building the Merge Sort Algorithm
  • Learn Classes and Objects by Building a Sudoku Solver
  • Learn Tree Traversal by Building a Binary Search Tree

After these guided projects, you'll be asked to work on some projects from scratch like an Arithmetic Formatter, Time Calculator, Budget App, Polygon Area Calculator, and Probability Calculator to put your knowledge to the test.

Course Link: Scientific Computing with Python (Beta) — FreeCodeCamp

Wrapping Up

These free courses offer a fantastic opportunity to advance your Python skills without breaking the bank. However, if you're eager to explore paid options for more in-depth learning, I recommend checking out the following resources:

  • Fluent Python by Luciano Ramalho, 2nd Edition — Shows you how to make your code shorter, faster, and more readable at the same time
  • Courses by Fred Baptiste on Udemy — To get better at everyday Python programming
  • Serious Python by Julien Danjou — Covers deployment, scalability, testing, and more
  • Architecture Patterns with Python by Harry Percival and Bob Gregory — Covers architectural design patterns

Here's a BONUS for you: You can access "Architecture Patterns with Python" for FREE on the authors' website. Happy learning!


C.P. Gurnani & InterGlobe’s Rahul Bhatia Announce AI Business Venture AIonOS 

InterGlobe’s Rahul Bhatia and Assago Group’s C.P. Gurnani have launched AIonOS, an AI business venture to transform businesses into AI-native enterprises. The joint venture will create an infrastructure, data, and generative AI ecosystem to boost productivity and profitability.

The company aims to empower businesses with advanced AI solutions that streamline workflows and elevate customer experience, redefining the way businesses operate in the digital age.

“AIonOS is aimed at enabling businesses to accelerate their digital transformation by enhancing human and system capabilities with AI-powered solutions,” said Rahul Bhatia, InterGlobe’s Group Managing Director.

He added, “By leveraging our deep sectoral expertise and the power of AI, we aim to revolutionise industries, redefine possibilities, and shape the future of businesses.”

The venture will onboard businesses onto its IntelliOS platform, which brings AI into every decision-making process and delivers tangible business benefits.

AIonOS will offer specialised AI products and technologies, including custom solutions, industry-specific products, data insight engines, and AI-led customer experience.

“At AIonOS, we are redefining industry standards with IntelliOS, our AI native platform that enables organizations to initiate their transformation towards cognitive enterprises,” commented C.P. Gurnani, Executive Vice Chairman of AIonOS. “Our approach to AI combines sophisticated technology with the nuances of human interaction in every solution we deliver.”

Headquartered in Singapore, AIonOS will have a global presence across North America, India, the Middle East, Europe, and Asia-Pacific. The venture will initially focus on the Travel, Transportation, Logistics, and Hospitality (TTLH) sector, having already onboarded several businesses as launch customers.

AIonOS plans to expand its reach and impact across additional sectors, aiming to equip enterprises with the tools to lead in today’s dynamic marketplace. The joint venture combines InterGlobe’s deep industry expertise with Assago’s technological prowess to deliver cutting-edge AI solutions.
