GenAI, RAG, LLM… Oh My!

I am pleased to announce the release of my updated “Thinking Like a Data Scientist” workbook. As a next step, I plan to work on a supplemental workbook incorporating GenAI tools such as OpenAI ChatGPT, Google Gemini, and Microsoft Copilot into the Thinking Like a Data Scientist approach. To get started, we need to improve our prompt engineering skills.

Prompt engineering is the practice of carefully crafting inputs to optimize the performance and accuracy of responses from large language models (LLMs).

Integrating Generative AI (GenAI) tools with the “Thinking Like a Data Scientist” (TLADS) methodology requires an understanding of prompt engineering. Using more relevant prompts, we can direct the GenAI tools to focus on specific hypotheses, test assumptions, and generate analyses aligning with our strategic business goals. Prompt engineering facilitates the iterative hypothesis generation and testing process, which is central to TLADS. It enhances the quality of insights by ensuring that the outputs from GenAI tools are relevant and strategically valuable in light of the targeted business initiative and supporting use cases. By tailoring prompts according to TLADS principles, we can bridge the gap between raw data and strategic insights, optimizing the business value derived from AI technologies.
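To make this concrete, here is a minimal sketch of a TLADS-style prompt sent to a public LLM. It assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment; the churn scenario, hypothesis, and model name are illustrative assumptions, not examples from the workbook.

```python
# Minimal sketch of a TLADS-style prompt, assuming the OpenAI Python SDK
# (pip install openai) and OPENAI_API_KEY set in the environment.
# The business scenario below is illustrative only.
from openai import OpenAI

client = OpenAI()

# Frame the prompt around a targeted business initiative and a hypothesis,
# rather than asking an open-ended question.
prompt = """You are a data science advisor.
Business initiative: reduce customer churn by 10% this fiscal year.
Hypothesis to test: customers with more than two support tickets
in 30 days are at elevated churn risk.
List the data sources, variables, and analyses needed to test this
hypothesis, and state the assumptions behind each."""

response = client.chat.completions.create(
    model="gpt-4o",  # any capable public model works here
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```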

Levels of Prompt Engineering Mastery

Prompt engineering is vital for creating prompts that utilize large language models (LLMs) for specific business and operational initiatives. The process of prompt engineering involves four levels of development, each building on the capabilities and understanding gained from the previous level (see Figure 1).

  • Level 1: Prompt Engineering on Public LLMs. This foundational level involves learning how to craft effective prompts for general-purpose LLMs such as OpenAI’s ChatGPT. It is an excellent starting point that emphasizes understanding how prompts shape responses and encourages experimenting with different prompt styles to observe the limitations and biases of these public LLMs.
  • Level 2: Retrieval-Augmented Generation (RAG). RAG involves integrating external, validated data sources to enhance the quality and accuracy of responses from LLMs. Level 2 requires understanding how to leverage credible databases, documents, or trusted Internet sites to fetch relevant information that augments the LLM’s base knowledge and delivers more relevant and accurate responses (a minimal sketch follows this list).
  • Level 3: Fine-Tuning. This level refers to customizing a large language model by further training it on a specific dataset to improve its performance and relevance for specialized tasks. This could involve training the model to better suit a particular domain, adapting it to reflect an organization’s unique tone and style, or tailoring it to meet the specific needs of a market or operational niche.
  • Level 4: Custom, Proprietary LLMs. Developing custom, proprietary LLMs represents the highest level of prompt engineering maturity. Organizations create these LLMs based on their own exclusive data and knowledge sources. This process involves careful data curation, experimentation with various architectures, and significant computational resources.
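
As a minimal illustration of the Level 2 pattern, the sketch below retrieves the most relevant passage from a small set of validated documents and prepends it to the prompt. The keyword-overlap scoring is a deliberate simplification of a real embedding-based retriever, and the document text and names are made up for illustration.

```python
# Minimal sketch of the RAG pattern: retrieve a validated passage, then
# ground the LLM's answer in it. Keyword overlap stands in for a real
# embedding-based retriever; all document text is illustrative.
validated_docs = [
    "Q3 churn rose 4% among customers with >2 support tickets.",
    "The loyalty program reduced churn 2% for enrolled customers.",
    "Support ticket volume peaks in the week after each release.",
]

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

query = "How do support tickets relate to churn?"
context = retrieve(query, validated_docs)

# The augmented prompt grounds the model in curated data instead of
# relying solely on its (possibly stale) training knowledge.
augmented_prompt = (
    f"Answer using only this context:\n{context}\n\nQuestion: {query}"
)
print(augmented_prompt)  # send to the LLM of your choice
```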

Figure 1: Stages of Prompt Engineering / LLM Mastery

As organizations strive to master prompt engineering, some important factors they should consider include 1) continuous learning and model updating and 2) ethical considerations and bias mitigation.

LLM Continuous Learning and Updating

Continuous Learning and Model Updating focus on keeping models’ accuracy and effectiveness intact over time as new data emerges or conditions evolve. This process is critical for maintaining the relevancy and accuracy of LLMs. It is also connected to Step 8 in my recent book, “The Art of Thinking Like a Data Scientist: Second Edition.”

  • Data Monitoring: Regularly monitor input data and model responses to identify shifts in data patterns or emerging inaccuracies in outputs.
  • Model Evaluation: Continuously assess the model’s performance using metrics that can signal when the model starts to drift or underperform (see the sketch after this list).
  • Feedback Loops: Implement mechanisms to collect feedback, either from users or automated systems, to inform adjustments in the model.
  • Update Strategies: Develop strategies for retraining or fine-tuning the model based on new data or after identifying performance issues. This can involve incremental learning, where the model is periodically updated with new data without retraining from scratch.
  • Testing and Validation: Before deploying updated models, test rigorously to ensure that changes improve the model without introducing new problems.
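
As a toy illustration of how these practices fit together, the sketch below tracks a rolling accuracy score from feedback and flags the model for retraining once it drops below a threshold. The window size, threshold, and metric are illustrative assumptions; a production system would track several metrics and drift statistics.

```python
# Toy monitoring loop: track rolling accuracy on labeled feedback and
# flag the model for retraining when it drifts below a threshold.
# Window size and threshold are illustrative assumptions.
from collections import deque

WINDOW, THRESHOLD = 100, 0.85
recent_outcomes: deque[int] = deque(maxlen=WINDOW)  # 1 = correct, 0 = wrong

def record_outcome(correct: bool) -> None:
    recent_outcomes.append(1 if correct else 0)

def needs_retraining() -> bool:
    if len(recent_outcomes) < WINDOW:
        return False  # not enough evidence yet
    accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return accuracy < THRESHOLD

# Feedback (user ratings, automated checks) feeds the loop; once the
# rolling accuracy dips, an incremental fine-tuning job can be queued.
record_outcome(True)
if needs_retraining():
    print("Queue incremental retraining on fresh data.")
```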

Figure 2: “The Art of Thinking Like a Data Scientist Second Edition”

Users’ Roles and Responsibilities

It is crucial for users to actively participate in the continuous learning cycle of large language models (LLMs) to ensure that these models stay relevant, accurate, and aligned with evolving real-world applications. This includes the following:

  • Providing Feedback: Users monitor and report discrepancies or shortcomings in the model’s performance. For example, an editor using a text-generating AI might note and report that the model frequently misinterprets technical jargon, prompting necessary refinements.
  • Validating Outputs: Users ensure that the model’s outputs remain accurate and practical for real-world applications. For instance, a user employing an AI for stock market predictions would verify the AI’s forecasts against actual market movements to ensure reliability.
  • Guiding Adjustments: Users play a crucial role in shaping the model’s development by providing specific insights on how it can better meet their needs. For example, a teacher using an educational AI could suggest enhancements to better align the AI’s tutoring responses with curriculum standards.

Reinforcement Learning through Human Feedback (RLHF) is a machine learning technique that utilizes human input to improve an LLM’s performance. RLHF combines reinforcement learning with feedback from people to create a more precise and reliable model. It is an automated companion that ensures our LLMs continually learn and improve through human feedback via:

  • Training Signal from Human Feedback: In RLHF, human evaluators rate or rank model outputs; those preferences (typically distilled into a reward model) become the training signal, teaching the model to replicate successful outcomes and avoid errors.
  • Iterative Improvement: This method emphasizes iterative adjustments to the model, closely aligning with the principles of continuous learning (a toy example follows).
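
The toy sketch below captures the spirit of that feedback signal: human ratings of candidate response styles are aggregated into scores that steer which style the system prefers. Real RLHF trains a separate reward model on human preference rankings and optimizes the LLM’s policy against it (commonly with PPO); everything here is a simplified stand-in.

```python
# Toy illustration of the RLHF feedback signal: aggregate human ratings
# per response style and prefer the highest-scoring one. Real RLHF trains
# a reward model on preference pairs and optimizes the policy (e.g., with
# PPO); this is a deliberately simplified sketch.
from collections import defaultdict

ratings: dict[str, list[int]] = defaultdict(list)

def give_feedback(style: str, thumbs_up: bool) -> None:
    ratings[style].append(1 if thumbs_up else -1)

def preferred_style() -> str:
    # Average reward per style; ties broken arbitrarily.
    return max(ratings, key=lambda s: sum(ratings[s]) / len(ratings[s]))

give_feedback("concise", True)
give_feedback("concise", True)
give_feedback("verbose", False)
print(preferred_style())  # -> "concise"
```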

TLADS Step 8: Create a Learning-based User Experience

In my updated TLADS, Step 8 focuses on creating a learning-based User Experience (UEX) that combines algorithmic learning with human learning to enhance decision-making processes. It accomplishes that by focusing on two aspects of creating a learning-based user experience (Figure 3).

  • Integrative Feedback Systems: Part of creating a learning-based UEX involves integrating feedback mechanisms that improve user experience and feed valuable data into the model.
  • User-Centric Design: Ensuring that the model updates and learning processes are driven by user needs and behaviors, creating a more effective and intuitive user experience.

Figure 3: TLADS Step 8: Create a Learning-based User Experience (UEX)

GenAI, RAG, LLM Summary

I am thrilled to be working on a supplement to my recently published second edition of “The Art of Thinking Like a Data Scientist” (TLADS) workbook. The supplement will explore integrating GenAI tools like OpenAI ChatGPT with the TLADS methodology to generate more accurate, meaningful, relevant, responsible, and ethical AI outcomes.

Our goal of integrating GenAI with TLADS begins with prompt engineering, the practice of crafting GenAI requests that not only drive but also significantly enhance the output quality of large language models (LLMs). With this alignment between GenAI and TLADS, we can encourage the exploration, imagination, and creativity necessary to deliver more relevant, meaningful, responsible, and ethical AI outcomes.

As we progress through the levels of prompt engineering mastery, from simple interactions with public LLMs to developing complex and exclusive models, we uncover a path of growth and depth in how we interact with GenAI tools. With each level, we get closer to a deeper integration where our prompts become more than just inputs; they become conversations that guide AI to generate accurate insights aligned with our specific business needs. This journey involves continuous learning and ethical diligence, ensuring that as our models learn and evolve, they do so by upholding values that respect and enhance user trust and regulatory compliance.

Ola Krutrim Takes On Microsoft Azure and AWS, Launches AI Cloud Platform

Ola Krutrim, India’s first AI unicorn, has launched Krutrim AI Cloud, its own cloud platform for enterprises, researchers, and developers. Additionally, the company has introduced a standalone Android app for its AI assistant, now available on the Google Play Store.

Krutrim Cloud provides access to state-of-the-art AI computing infrastructure, Krutrim’s foundational models, and other open-source models such as Meta’s Llama 3 and Mistral. This platform will enable developers to run and build LLMs at a fraction of the costs currently offered by other cloud service providers.

With this development, Ola Krutrim will compete with Microsoft Azure, Google Cloud, and Amazon Web Services. These platforms have all experienced robust revenue growth in recent quarters, driven by increasing demand from businesses and enterprises of varying sizes.

The company has also announced Model-as-a-Service (MaaS) and GPU-as-a-Service offerings, allowing access to its foundational models and AI compute infrastructure. Additionally, Krutrim has launched multiple APIs and SDKs for location services, including Places API, Tiles API, and Routing APIs.

“We are committed to developing full-stack AI capabilities in India, for the world,” said Bhavish Aggarwal, Founder of Krutrim. “We believe that India needs its own technology platforms to enable the emergence of world-class products at a fraction of current costs.”

The Krutrim assistant app, built on the company’s own LLM (trained on over 2 trillion tokens with the largest representation of Indic data), will simplify the use of AI for everyone. The app currently understands and generates intelligent responses in 10+ Indian languages and will be expanded to 22 official languages in the near future.

In the future, the Krutrim app will enable users to give voice commands, integrating text, voice, and visual data for enhanced functionality. The Krutrim assistant easily collaborates with other apps, facilitating tasks like cab bookings, setting reminders, and messaging without the need to switch between applications.

Krutrim’s vision is to create a comprehensive, reliable, and scalable Maps Platform that addresses local needs and supports broader initiatives for technological and economic advancement in India.

The company is also working on developing MapGPT to enable natural conversations powered by location-aware Krutrim AI intelligent assistants. The app will offer an immersive experience of location hotspots and integrate community-based features for real-time updates on traffic and road conditions.

Launched in December last year, Krutrim is touted as “India’s first full-stack AI” solution. It is trained on 2 trillion tokens and can understand over 20 Indian languages and generate content in about 10 languages, including Marathi, Hindi, Bengali, Tamil, Kannada, Telugu, Odia, Gujarati, and Malayalam.

SAP Has Over 1300 Customers in India

India is a sweet spot for German tech conglomerate SAP. The company has over 1300 customers based in India and over 425,000 globally, and its second-largest R&D centre outside of Germany is in Bengaluru.

“India is a big volume market. It plays a pivotal role in SAP’s global operations since over 80% of our business in India caters to the European market,” Subbu Ananth, CRO, SAP BTP, APJ, told AIM at the company’s flagship event SAP Now in Mumbai earlier this week.

Furthermore, SAP is deeply integrated into India’s economy: 60% of the country’s GDP is processed through SAP’s systems, and 80% of SAP’s customers in India are small and midsize businesses. Now, from small to large enterprises, everyone is excited about integrating AI.

Recently, Philipp Herzig, SAP’s chief AI officer, mentioned that SAP is expecting a 10 to 15% rise in efficiency through the use of generative AI. SAP has partnerships with leading AI companies like Google, Microsoft, Anthropic, Cohere, and Aleph Alpha.

Joule, SAP’s generative AI assistant that allows access to the company’s extensive cloud enterprise suite across different apps and programmes, is also available on SAP BTP now. Talking about the same, Ananth noted that the Indian team plays a major role in SAP’s generative AI initiatives. The company has developed over 30 generative AI use cases and has collected anonymous datasets from more than 27,000 customers via cloud storage to train its models.

“In terms of AI, our approach in India is to democratise the technology, making it accessible across various organisational levels and not just confined to large corporations,” added Subbu.

Why India is a Hotspot for AI

“India has been a leader in acquiring new names and logos globally,” said Ananth, highlighting the importance of its role in SAP’s global operations.

The growth potential for AI in India is substantial. According to EY, generative AI can add around $1.2-1.5 trillion to India’s GDP over the next seven years. Furthermore, India attracts major investments in digital native companies like Razorpay, Cure.fit, Swiggy, and Ola, which are inherently inclined to integrate AI, machine learning, and robotics into their operations.

The broader perspective reveals that if organisations do not engage with AI, they risk losing their competitive edge. Recognising this, there is a concerted effort within India to ensure that both large industry players and newer market entrants can effectively use AI. This inclusive strategy is expected to spur growth across all segments, reinforcing India’s position as a crucial hub in SAP’s global AI strategy.

This success is expected to pave the way for India to become a key hub for AI customer acquisition, thanks to the continuous enhancement of local skills and expertise. Ananth shared that with a strong foundation of AI talent and numerous mid-sized companies, India is well-positioned to take a leading role in global AI adoption.

Challenges Persist

The cloud migration services market size is expected to grow from $119.13 billion in 2020 to $448.34 billion by 2025, at a CAGR of 30.2% during the forecast period. But many customers are still struggling to move from on-prem to cloud services like SAP S/4HANA.

For many organisations, this transition can be daunting due to the scale of migration, the need for new skills, and potential disruptions to ongoing operations.

Additionally, there is a skills gap in cloud technologies among the workforce, which can hinder the adoption and effective use of cloud solutions. “So, we address this by upskilling the developer ecosystem and leveraging our centre of excellence for better implementation of the product,” explained the executive.

However, SAP is not alone in facing these challenges or deploying these strategies. Like SAP, Oracle has a strong foothold in ERP systems and has been aggressively pushing its cloud services. Oracle provides similar training and support through its Oracle University and professional certification programmes to ease the transition for its customers.

Looking ahead, SAP’s vision for the next six to 12 months is to promote the concept of ‘clean core’ extensively, emphasising the importance of maintaining a clean digital infrastructure to support AI-driven innovations.

“Our overarching goal is to integrate BTP extensively across various sectors, reinforcing its utility in sustainability and AI applications,” Ananth concluded.

10 Wild Use Cases for Llama-3

Meta dropped Llama-3 just a few weeks ago and it has taken everyone by surprise. People are coming up with wild use cases every day, pushing the model to its limits in incredible ways.

Here are 10 impressive examples of what it can do.

Llama-3 8B with a context length of over 1M

Developed by Gradient and sponsored by compute from Crusoe Energy, this model, called Llama-3 8B Gradient Instruct 1048k, extends Llama-3 8B’s context length from 8k to over 1M tokens. This model shows that SOTA LLMs can efficiently manage long contexts with minimal training by appropriately adjusting RoPE theta.

The model was trained progressively on increasing context lengths, drawing on techniques like NTK-aware interpolation and Ring Attention for efficient scaling. This approach allowed for a massive increase in training speed, making the model both powerful and efficient in handling extensive data.
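
To see why adjusting RoPE theta matters, consider a short illustrative calculation (these values are not Gradient’s actual training configuration): rotary position embeddings give each dimension pair a rotation frequency derived from a base theta, and raising theta stretches the longest wavelengths so that positions far beyond the original window remain distinguishable.

```python
# Illustration of why raising RoPE theta extends usable context: each
# dimension pair rotates at a frequency set by the base theta, and a
# larger theta stretches the longest wavelengths. Values are illustrative,
# not Gradient's actual configuration.
import numpy as np

def rope_wavelengths(dim: int, theta: float) -> np.ndarray:
    inv_freq = 1.0 / (theta ** (np.arange(0, dim, 2) / dim))
    return 2 * np.pi / inv_freq  # tokens per full rotation

for theta in (10_000.0, 500_000.0):  # Llama-3 ships with theta = 500k
    print(f"theta={theta:>9,.0f}  max wavelength ≈ "
          f"{rope_wavelengths(128, theta).max():,.0f} tokens")
```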

RAG App with Llama-3 running locally

You can build a RAG app with Llama-3 running locally on your computer (it’s 100% free and doesn’t require an internet connection).

The instructions include simple steps like installing the necessary Python libraries, setting up the Streamlit app, creating Ollama embeddings and a vector store using Chroma, and setting up the RAG chain, among other things.
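
A condensed sketch of those steps, minus the Streamlit UI, might look like the following. It assumes Ollama is installed with `ollama pull llama3` already run, plus `pip install ollama chromadb`; exact APIs can shift between library versions.

```python
# Condensed local RAG sketch: Ollama embeddings + Chroma + Llama-3.
# Assumes Ollama is running with `ollama pull llama3` done, and
# `pip install ollama chromadb`. APIs may shift between versions.
import ollama
import chromadb

docs = [
    "Llama-3 was released by Meta in April 2024.",
    "RAG grounds LLM answers in retrieved documents.",
]

collection = chromadb.Client().create_collection("docs")
for i, doc in enumerate(docs):
    emb = ollama.embeddings(model="llama3", prompt=doc)["embedding"]
    collection.add(ids=[str(i)], embeddings=[emb], documents=[doc])

question = "Who released Llama-3, and when?"
q_emb = ollama.embeddings(model="llama3", prompt=question)["embedding"]
hits = collection.query(query_embeddings=[q_emb], n_results=1)
context = hits["documents"][0][0]

answer = ollama.chat(
    model="llama3",
    messages=[{
        "role": "user",
        "content": f"Context: {context}\n\nQuestion: {question}",
    }],
)
print(answer["message"]["content"])
```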

Agri Vertical Dhenu 1.0 model fine-tuned on Llama3-8B

KissanAI’s Agri Vertical Dhenu1.0 model has been fine-tuned on Llama3-8B with 150K instructions. It is India-focused and available for anyone to download, tinker with, and provide feedback on.

Tool Calling Champion

Llama-3 70b on GroqInc is a tool-calling champion. In a challenge to extract financial quarters and years from a user query (e.g., “How did revenue change between Q4 2023 and the year before that?”), the 70b model passed the task, was very fast, and had the best pricing. It also performs well on benchmarks and tests.
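
A sketch of that extraction task using Groq’s OpenAI-compatible tool-calling API might look like this. It assumes `pip install groq`, a GROQ_API_KEY in the environment, and that the llama3-70b-8192 model ID is still current; the tool schema is an illustrative assumption.

```python
# Sketch of the financial-quarter extraction task via Groq's
# OpenAI-compatible tool-calling API. Assumes `pip install groq` and
# GROQ_API_KEY in the environment; the model ID may change over time.
import json
from groq import Groq

client = Groq()

tools = [{
    "type": "function",
    "function": {
        "name": "extract_quarters",
        "description": "Extract financial quarters and years from a query.",
        "parameters": {
            "type": "object",
            "properties": {
                "quarters": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "e.g. ['Q4 2023', 'Q4 2022']",
                },
            },
            "required": ["quarters"],
        },
    },
}]

resp = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content":
               "How did revenue change between Q4 2023 and the year before?"}],
    tools=tools,
    tool_choice="auto",
)

call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```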

Lightning-fast Copilot in VSCode

You can connect @GroqInc with VSCode, unlocking the full potential of Llama-3 as your Copilot.

Just create your account on the Groq console, head to the ‘API Key’ menu to generate your key, and download the CodeGPT extension from the VSCode marketplace. After this, open CodeGPT, select Groq as the provider, click on ‘Edit Connection’, paste your Groq API key, and then click ‘Connect’.

That’s how you can connect Groq to VSCode and access all the models offered by this service.

Llama-3 Function Calling

Llama-3 function calling works pretty well. Nous Research announced Hermes 2 Pro, which comes with Function Calling and Structured Output capabilities. The Llama-3 version now uses dedicated tokens for tool call parsing tags to make streaming function calls easier.

The model surpasses Llama-3 8B Instruct on AGIEval, GPT4All Suite, TruthfulQA, and BigBench.
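
Because the tool calls arrive inside dedicated tags, a caller can scan the output (even while streaming) and parse each call as its closing tag appears. The sketch below runs the model through Ollama and extracts the tagged JSON; the model tag and prompt template here are assumptions, so check Nous Research’s model card for the canonical format.

```python
# Sketch of parsing Hermes 2 Pro style tool calls: the model wraps each
# call in <tool_call> tags containing JSON. The model tag and prompt
# template are assumptions; see Nous Research's model card for the
# canonical format.
import json
import re
import ollama

SYSTEM = """You can call tools. To call one, emit:
<tool_call>{"name": ..., "arguments": {...}}</tool_call>
Available tool: get_weather(city: str)"""

resp = ollama.chat(
    model="hermes-2-pro-llama-3",  # illustrative; use the tag you pulled
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What's the weather in Mumbai?"},
    ],
)

# Dedicated tags make streaming parsers simple: scan for the closing tag.
for match in re.findall(r"<tool_call>(.*?)</tool_call>",
                        resp["message"]["content"], re.DOTALL):
    call = json.loads(match)
    print(call["name"], call["arguments"])
```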

TherapistAI, powered by Llama3-70B

TherapistAI.com now runs on Llama3-70B, which, according to the benchmarks, is almost as good as GPT-4. The Llama3-70B model significantly enhanced the app’s conversational capabilities, enabling a back-and-forth, ping-pong style interaction. The responses have become concise, direct, and highly focused on problem-solving.

With Llama-3, Therapist AI now actively engages by asking questions, which helps it understand better and address specific user needs. It also exhibits an impressive memory, allowing it to maintain context over longer conversations, thereby enhancing its ability to deliver relevant and actionable answers.

You can also use Llama-3 to build such applications. It delivers strong performance and is less expensive than a ChatGPT Plus subscription, which costs around $20 per month.

AI Coding assistant with Llama 3

It’s time to give your productivity a boost by building an AI Coding assistant with Llama3.

To develop an AI coding assistant using Llama3, start by downloading Llama3 via Ollama, and then add a system message to set it up as a Python coding assistant. Next, install the Continue VSCode extension, connect it to the my-python-assistant model, and activate the tab-autocomplete feature to enhance coding efficiency.
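
Concretely, the system-message step can be done with an Ollama Modelfile. The sketch below writes one and registers the custom model from Python; the persona wording is an assumption, and the model name simply mirrors the one mentioned above.

```python
# Sketch of the system-message step via an Ollama Modelfile: register a
# Llama-3 variant with a built-in Python-assistant persona, then point
# the Continue extension at it. The persona wording is an assumption.
import pathlib
import subprocess

modelfile = '''FROM llama3
SYSTEM """You are an expert Python coding assistant. Prefer idiomatic,
well-commented code and explain trade-offs briefly."""
'''
pathlib.Path("Modelfile").write_text(modelfile)

# `ollama create` registers the custom model locally under this name.
subprocess.run(
    ["ollama", "create", "my-python-assistant", "-f", "Modelfile"],
    check=True,
)
```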

Superfast Research Assistant using Llama 3

You can build a research assistant powered by Llama-3 models running on Groq. You can then take any complex topic, search the web for information about it, package it up, and send it to Llama-3 running on Groq. It will send back a proper research report.
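
A compressed sketch of that pipeline: search the web with Tavily, stuff the results into a Llama-3 prompt on Groq, and ask for a report. It assumes `pip install groq tavily-python` with both API keys in the environment; model and parameter names may drift across versions.

```python
# Compressed research-assistant sketch: Tavily web search feeds a
# Llama-3 report-writer on Groq. Assumes `pip install groq tavily-python`
# plus GROQ_API_KEY and TAVILY_API_KEY; names may drift across versions.
import os
from groq import Groq
from tavily import TavilyClient

topic = "impact of RoPE theta scaling on LLM context length"

search = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
results = search.search(topic, max_results=5)
sources = "\n".join(r["content"] for r in results["results"])

report = Groq().chat.completions.create(
    model="llama3-70b-8192",
    messages=[{
        "role": "user",
        "content": f"Write a short research report on: {topic}\n"
                   f"Use only these sources:\n{sources}",
    }],
)
print(report.choices[0].message.content)
```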

Building RAG Capabilities for Accessing Private Data

Subtl.ai is building in-house RAG capabilities for accessing private data. Founded with the goal of democratizing access to private data for specific professional needs, the platform significantly improves efficiency by offering 5x faster access to information. It does all this while maintaining data security through an AI that securely processes and recalls your data, combining AI-enhanced access with data protection.

The company will be releasing their AI bot built on Llama-3 soon.

10 Key Takeaways From Sam Altman’s Talk at Stanford

In a recent Q&A session at Stanford University, Sam Altman, the visionary CEO of OpenAI, shared invaluable insights on the future of artificial intelligence and its potential impact on society. As the co-founder of the research organization behind groundbreaking AI models like GPT and DALL-E, Altman's perspective holds immense significance for entrepreneurs, researchers, and anyone interested in the rapidly evolving field of AI.

This blog post delves into 10 key takeaways from his thought-provoking talk, offering a glimpse into the challenges and opportunities that lie ahead.

1. The best time for startups and AI research

Altman emphasized that the current AI landscape presents an unprecedented opportunity for entrepreneurs and researchers alike. He believes that now is the best time to start a company since the advent of the internet, and possibly in the entire history of technology. The potential for AI to revolutionize industries and solve complex problems has never been greater. Altman encouraged aspiring founders to seize this moment and contribute to the AI ecosystem, whether through starting a company or pursuing cutting-edge research.

2. OpenAI's iterative deployment strategy

One of the key strategies that has fueled OpenAI's success is their commitment to iterative deployment. Altman stressed the importance of shipping early and often, even if the products are imperfect. By putting AI models into the hands of users and gathering feedback, OpenAI can continuously improve their offerings and address real-world challenges. This approach allows them to learn from their mistakes, refine their models, and stay at the forefront of AI development. Altman encouraged entrepreneurs to embrace this mindset and be willing to learn from their products' shortcomings.

3. The trajectory of AI model capabilities

Altman provided a tantalizing glimpse into the future of AI model capabilities, particularly with the anticipated release of GPT-5 and beyond. He confidently stated that each successive iteration of these models will be significantly smarter than its predecessor, with no signs of slowing down. The implications of this rapid advancement are profound, as AI systems become increasingly capable of tackling complex tasks and understanding nuanced contexts. Altman emphasized that we are still in the early stages of this exponential growth curve, and the true potential of AI is yet to be fully realized.

Watch the full talk on YouTube: “The Possibilities of AI” [Entire Talk], Sam Altman (OpenAI).

4. Balancing compute power and equitable access

As AI models become more sophisticated, the demand for large-scale computing infrastructure continues to grow. Altman highlighted the need for powerful computers and data centers to support the training and deployment of these models. However, he also emphasized the importance of ensuring equitable access to AI resources on a global scale. OpenAI is committed to making their models accessible to people around the world, recognizing that the benefits of AI should not be limited to a select few. Altman suggested that access to compute power may eventually be considered a fundamental human right.

5. Adapting society to the pace of AI development

One of the most significant challenges posed by the rapid advancement of AI is society's ability to keep pace with the rate of change. Altman acknowledged that while the short-term impact of AI may be less disruptive than anticipated, the long-term consequences could be profound. He stressed the importance of resilience and adaptability, both at an individual and societal level. As AI transforms industries and reshapes the job market, people will need to develop new skills and embrace lifelong learning. Altman emphasized that fostering these qualities should be a priority in education and workforce development.

6. Subtle dangers of AI: a greater concern

While much of the public discourse surrounding AI focuses on the potential for cataclysmic events, Altman argued that the subtle dangers of AI deserve greater attention. He expressed concern about the unintended consequences and unknown unknowns that may arise as AI systems become more complex and integrated into our lives. These risks, such as the erosion of privacy or the amplification of biases, may be less dramatic than apocalyptic scenarios, but they could have far-reaching implications for society. Altman called for proactive efforts to identify and mitigate these subtle dangers.

7. The role of incentives and mission alignment

Altman shed light on OpenAI's unique organizational structure, which combines a nonprofit mission with a for-profit business model. He acknowledged that this approach has its challenges, but emphasized the importance of aligning incentives with the overall mission of responsible AI development. While financial interests play a role in sustaining OpenAI's work, Altman assured the audience that the gravity of their mission remains the primary driver. He stressed the need for transparency and accountability in balancing these competing priorities.

8. AI's potential impact on geopolitics and power dynamics

As AI continues to advance, its influence on global power structures becomes increasingly uncertain. Altman acknowledged the difficulty in predicting how AI will reshape geopolitics, but emphasized that its impact could be more significant than any other technology in history. The development of artificial general intelligence (AGI) could disrupt traditional power dynamics and create new opportunities for nations to assert their influence. Altman stressed the importance of international cooperation and the need for a global framework to navigate the geopolitical implications of AGI.

9. Embracing the transformative power of AI

Despite the challenges and uncertainties surrounding AI, Altman remained optimistic about its potential to augment human capabilities and drive progress. He likened AI to a tool that can be used to build upon the “scaffolding” of society, enabling future generations to achieve greater heights. Just as we stand on the shoulders of those who came before us, AI can help us create a foundation for even more remarkable advancements. Altman encouraged the audience to embrace the transformative power of AI and to actively participate in shaping its future.

10. Fostering a culture of innovation and collaboration

Altman highlighted the importance of cultivating a strong culture within organizations working on AI. He credited OpenAI's success to the shared sense of purpose and mission among its team members. By fostering an environment that encourages innovation, collaboration, and a willingness to tackle difficult challenges, organizations can attract top talent and drive meaningful progress in AI research and development. Altman emphasized the value of diversity and inclusivity in building teams that can approach problems from different perspectives and generate novel solutions.

The Future of AI Through Altman's Eyes

Sam Altman's insightful talk at Stanford University provided a captivating glimpse into the future of AI and its potential impact on society. From the unprecedented opportunities for startups and researchers to the challenges of adapting to the pace of change, Altman's statements offer valuable guidance for navigating the AI landscape. As we embrace the transformative power of AI, it is crucial to prioritize responsible development and deployment, ensuring that its benefits are widely accessible and its risks are carefully managed. The path ahead may be uncertain, but with visionary leaders like Altman at the forefront, we can work together to build a future in which AI empowers humanity to reach new heights.

Three things we learned about Apple’s AI plans from its earnings

Apple CEO Tim Cook didn’t give much away about the company’s AI plans on Thursday’s Q2 earnings call with investors, but he did confirm a few tidbits about how the tech giant plans to move forward with artificial intelligence.

Notably, his comments suggested that despite spending more than $100 billion on R&D over the last five years, Apple isn’t planning to spin up too many new data centers to run or train AI models. Instead, when it comes to AI, it will continue to pursue a “hybrid” approach, as it does with other cloud services, the company told investors.

AI will span devices beyond the iPhone

We also learned that Apple envisions AI as a key opportunity across the “vast majority” of the company’s device lineup, not just the iPhone. While we’ve known this for some time — after all, Apple has been calling its M3 MacBook Airs the “best consumer laptop for AI” — the company shouted out how AI is being used across its products on its earnings call.

“I think AI — generative AI and AI — both are big opportunities for us across our products, and we’ll talk more about it in the coming weeks. I think there are numerous ways there that are great for us, and we think that we’re well-positioned,” Cook said.

In addition to the MacBook Air, the Apple Watch uses AI and machine learning in features like its irregular heart rhythm notifications and fall detection, Cook noted. And when speaking about the enterprise, the CEO referenced big companies buying and exploring the use cases for Vision Pro, though he added that he wouldn’t want to “cabin that to AI only.”

“I would just say that we see generative AI as a very key opportunity across our products. And we believe that we have advantages that set us apart there,” Cook said.

AI won’t likely come up at the iPad event this month

However, customers itching to have an AI-powered Siri will have to wait a bit longer for that news, which has long been expected to be announced at Apple’s Worldwide Developers Conference (WWDC) in June. When Cook was asked Thursday about how AI will affect consumer demand for new devices like iPhone, he responded that, with regard to generative AI, we wouldn’t see any impact “within the next quarter or so,” but said he was “extremely optimistic” about the technology.

Apple isn’t planning to make its bigger AI announcements before WWDC.

This discovery came about through a correction to a CNBC news story, which had misinterpreted a statement Cook made to seemingly indicate there would be “big plans to announce” from an “AI point of view” at both upcoming events, including next week’s iPad event and WWDC in June. But as subsequent corrections show (likely after a lashing by a frantic Apple comms team), Cook had paused before saying “… from an AI point of view …” which was the start of his next thought and not connected to Apple’s plans for both events.

The story was updated with this correction so people didn’t think AI news would be announced at the iPad event scheduled for May 7. (You can read through the backstory on the corrections here on 9to5Mac.)

While we didn’t expect to hear much if anything about AI until at least WWDC, this correction basically confirms that timing.

Apple is taking a hybrid approach to AI investments

The biggest AI news, however, is something Cook said about Apple’s CapEx expenditures, which are funds spent on fixed assets, like servers and data centers, real estate and more.

While that’s not often the most interesting subject, this time the company’s response hinted toward Apple’s AI investment plans. As technology investor M.G. Siegler pointed out on his blog, Apple CFO Luca Maestri had answered a question about generative AI’s impact on Apple’s historical CapEx cadence by explaining that Apple pursues a hybrid model, “where we make some of the investments ourselves, in other cases we share them with our suppliers and partners …”

Plus, he added, Apple does “something similar on the data center side. We have our own data center capacity and then we use capacity from third parties.”

“It’s a model that has worked well for us historically, and we plan to continue along the same lines going forward,” Maestri said.

Siegler interpreted this to mean that Apple won’t need to spend on CapEx because Apple isn’t planning to immediately build and train LLMs (large language models) on its own servers.

And, if you squint a little, it could also be another signal that Apple could be looking at third parties to power its AI services. As Bloomberg reported in April, Apple has been holding discussions with ChatGPT maker OpenAI and Google to power an AI chatbot coming in an iOS 18 update.

With Apple confirming that its CapEx wouldn’t be affected by its near-term AI plans, it’s likely that Apple is planning to forge some sort of deal with partners for AI services in addition to what it can handle on-device and by itself. Whether Apple eventually shifts the balance to utilize more of its own servers and data centers over time still remains to be seen.


How Are APAC Tech Salaries Faring in 2024?

Working for a salary in tech has been somewhat of a wild ride in APAC in recent years.

First, there were the boom times leading into the year 2022, when the widespread pursuit of digitisation initiatives following the peak of the global pandemic combined with pervasive talent shortages put tech talent in the driver’s seat. Salaries rose, often at very high rates.

Then, economic headwinds hit the global tech market. This caused an about-face in fortunes in the region, as hiring freezes and layoffs were seen throughout 2022 and 2023. The jobs available and headline salary figures took a hit as demand for tech roles dried up.

The good news is that 2024 is bringing back more stable salary growth for tech workers. Recruiter Robert Half said there is 3-5% salary growth for tech workers overall. Meanwhile, bigger salary increases are being seen in high-demand segments like artificial intelligence.

Digital transformation drove APAC tech salaries before 2022

Melissa Lau, Director, Robert Half

Robert Half director Melissa Lau, based in Hong Kong, was an eyewitness to the exceptional increases in tech salaries that occurred before 2022. Back then, companies across the region were aggressively hiring tech talent to accelerate digital transformation efforts across various industries. Hong Kong also served as a prominent hub for crypto companies.

“At the time, the supply of skilled individuals in the field fell short of meeting the required headcount, creating a talent shortage,” Lau told TechRepublic. “As a result, salaries experienced an upward spike as companies competed to attract and retain the limited pool of highly skilled tech workers.”

APAC 2022 and 2023 tech salary crunch followed global tech sector woes

From 2022, the tech market in APAC experienced some turbulence. The global tech market ran into a period of “high inflation and elevated interest rates,” according to one summary from Deloitte, and macroeconomic uncertainty led to “softening consumer spending, lower product demand, falling market capitalisations and workforce reductions in 2022.”

This trend continued into 2023. “Several companies implemented retrenchments that adversely affected a significant number of individuals in the tech industry,” Lau explained. The 2022 crypto crash, which slashed almost three quarters from the value of cryptocurrency Bitcoin at the time, also ended the heady rush for tech pros in the crypto industry in Hong Kong, she said.

Software engineering salaries cut in 2023

The Asia Tech Salary Report from talent platform provider NodeFlair noted the “challenges the industry faced, including layoffs and hiring freezes,” in Asia during 2023. Using proprietary and external data from six countries in APAC, it found software engineer salaries decreased by an average of 0.99% during 2023, compared to an increase of 7.61% experienced in 2022.

Some software engineering disciplines fared worse than others. For instance, there was a 6.66% reduction in salaries for game engineer positions. Salaries for solutions engineers dropped by 5.7%, blockchain engineers by 5.4% and DevOps pros by 2%.

The news was not all bad; data science roles bucked the trend with 11.3% growth in salaries.

2024 is looking better for tech salaries across APAC

Tech sector salaries appear to have stabilised in 2024. In fact, Robert Half has seen a return to more steady, stable growth in the Hong Kong tech market. “The technology industry is showing signs of a gradual recovery after a decline in demand in 2023 caused by overhiring,” Lau said.

This means that, broadly, tech workers can now expect salary increases of between 3% and 5% if they remain at their current company and do not receive a promotion. Those who do receive promotions are seeing increases ranging anywhere between 5% and 10%, Lau said.

Those who make the effort to find new employment are also in a position to take advantage of higher increases, albeit lower than before the salary crunch. Lau said those who are changing positions could expect to be rewarded with salary increases of between 5% and 15%.

NodeFlair, too, expects salaries to recover. “Salaries for tech employees, in general, are poised for recovery in 2024 as the economy rebounds. The growth rate may vary across different roles, with a strong emphasis on the increasing demand for AI and data science professionals.”

APAC tech worker salaries in 2024 depend on location

NodeFlair’s report showed the salaries workers command across APAC depend on the market in which they reside. For instance, the median monthly base salary of a lead software engineer in Singapore is US $6,688, compared with US $1,937 in Vietnam.

Median monthly base salaries for tech workers vary by country in the APAC region. Image: NodeFlair

The tech roles with the best potential for salary increases in 2024

There are also differences in demand for different roles, with AI and data science expected to lead salary growth in 2024. “This sector-specific surge in demand may lead to competitive salary offers to attract and retain top talents in these specialised fields,” NodeFlair said.

AI and data science

The global rise of AI is driving hiring in Asian markets. NodeFlair’s report found salaries for data scientists at all levels rose by 11.3% year-on-year in the Singapore market during 2023, even amid weak demand for tech roles as a whole. For example, the median monthly salary for a lead data scientist in Singapore grew from S$12,500 (US$9,234) in 2022 to S$14,187 (US$10,480) in 2023, an increase of roughly 14%.

NodeFlair found salaries for data scientists, which includes AI professionals, grew by 11.3% in Singapore between 2022 and 2023. Source: NodeFlair

NodeFlair argued demand for skills in areas like machine learning, natural language processing and data analysis would be critical as businesses recognise the transformative potential of AI.

Lau said this is being reflected in Robert Half data showing tech salaries rising in Hong Kong. “This can be attributed to the industry’s increasing focus on hiring AI professionals,” she explained. “As AI continues to advance and shape various sectors, companies are actively seeking skilled AI specialists, leading to a growing demand and subsequent rise in salaries.”

Cybersecurity

The continuous growth and increasing complexity of cyber threats are leading to “extremely high demand” for cybersecurity professionals in Hong Kong and the region, Lau said.

“Companies across industries are investing heavily in protecting their digital assets and customer data, driving up salaries for cybersecurity specialists.”

NodeFlair found that salaries for cybersecurity engineering roles grew by 8.24% year-on-year in the Singapore market between 2022 and 2023.

Project management

In addition, IT project management is proving to be a lucrative skill. Lau said large corporations, like insurers or conglomerates, are offering attractive salaries for project management roles.

“These companies often require skilled professionals to oversee and lead large-scale projects such as system enhancements, upgrades, and business process reengineering, meaning their specialised expertise to manage these moving parts contributes to competitive salaries,” she said.

India Can Build Five GPT-4 Models Simultaneously on Yotta Infrastructure

Hiranandani Group’s data centre firm Yotta Data Services’ chief Sunil Gupta, in a recent interview with Forbes, said that India will be able to build five GPT-4 models simultaneously using their existing infrastructure.

“I have ordered 16,000 [GPUs], so if there is a customer or there are like five customers, each one of them who wants to make a GPT-4, I can simultaneously handle their load,” said Gupta.

Yotta Data Services has received the first tranche of 4,000 GPUs from NVIDIA, with plans to scale up to 32,768 units by the end of 2025.

This expansion is part of their partnership announced in December last year, where Yotta ordered 4,096 NVIDIA H100 Tensor Core GPUs initially, set to increase to 16,384 GPUs by June 2024, marking a significant investment of close to $1 billion.

The state-of-the-art chips procured from NVIDIA will power Yotta’s upcoming Shakti Cloud platform, positioning it as the 10th-fastest supercomputer globally.

In addition to enhancing their GPU capabilities, Yotta Data Services partnered with Deloitte India to offer clients access to NVIDIA GPU computing infrastructure for developing innovative Generative AI applications efficiently.

Moreover, Yotta Data Services recently announced a partnership with Nepal’s BLC Holdings to construct Nepal’s first supercloud data center, named ‘K1’, in Ramkot near Kathmandu. This supercloud data center will provide a comprehensive suite of cloud, managed IT, and cybersecurity services, catering to various use cases, including AI models and enterprise applications.

The multi-million-dollar K1 facility, located within 20 km of Kathmandu airport, will offer up to 4MW of critical IT load capacity across 60,000 sq ft on a 3-acre site.

Retired Air Marshal joins Synergy Quantum to Drive Growth

Synergy Quantum, a leading player in quantum technology, has announced the appointment of Air Marshal Gurcharan Singh Bedi (Retd) as Vice President of Business Development for Airforce. The move is a notable instance of a retired senior defence officer joining a tech company, signaling a significant shift in the intersection of public service and private sector innovation.

Air Marshal Bedi is a decorated fighter pilot with over 3,700 hours of flying experience. He is a recipient of the Vayu Sena Medal (Gallantry), Atti Vishisht Seva Medal, and Vishisht Seva Medal, along with numerous other prestigious honors.

He has held various key positions, including a three-year tenure as the Air Advisor in the High Commission of India in London, Air Officer Commanding Jammu & Kashmir, and Director General (Inspection and Safety).

At Synergy Quantum, Air Marshal Bedi’s appointment is viewed as a strategic move to leverage his vast experience and insights to drive business development initiatives, particularly within the realm of airforce applications of quantum technology.

In recent years, India has been ramping up its investments in quantum computing, recognizing its potential to drive innovation and address complex challenges. India has announced an investment of Rs 6,003.65 crore (approximately $740 million) over eight years, from 2023-24 to 2030-31, for its National Quantum Mission.

The primary objective of this initiative is to boost research and development efforts and foster an innovative quantum technology landscape within India. Originally announced in the 2020 Budget with an allocation of approximately Rs 8,000 crore, the initiative faced a delay of nearly two years.

Meta Spends $30 Billion on a Million NVIDIA GPUs to Train its AI Models

In a “staggering” revelation, Meta AI chief Yann LeCun confirmed that Meta has obtained $30 billion worth of NVIDIA GPUs to train its AI models. That is enough to run a small nation, or even to have put a man on the moon in 1969.

Speaking at the Forging the Future of Business with AI Summit organised by Imagination in Action, LeCun said that more variations of Llama-3 would be out over the next few months, with training and fine-tuning currently taking place.

“Despite all the computers we have on our hands, it still takes a lot of time to fine-tune, but a bunch of variations on those models are going to come out over the next few months,” he said.

Speaking of fine-tuning and training, host John Werner stated that Meta had bought an additional 500,000 GPUs from NVIDIA, taking the total number of NVIDIA GPUs up to a million, with a retail value of $30 billion.

Combining the total costs of the GPUs so far, Werner pointed out that the training of the model exceeded the cost of the entire Apollo space programme, which, back in the 1960s, amounted to about $25.4 billion.

Agreeing, LeCun said, “Yeah, it’s staggering, isn’t it? A lot of it, not just training, but deployment, is limited by computational abilities. One of the issues that we’re facing is the supply of GPUs and the cost of them at the moment.”

Adjusted for inflation, however, the Apollo programme still cost more than Meta’s outlay, with roughly $257 billion spent in today’s dollars. But it’s no secret that the cost of GPUs is a continuously growing expense for AI companies.

Recently, OpenAI’s Sam Altman said that he doesn’t care if the company spends upwards of $50 billion a year in developing AGI. The company, as of March, reportedly employs as many as 720,000 NVIDIA H100 GPUs for Sora alone, amounting to about $21.6 billion.

Similarly, all big tech companies are hoping to expand how many GPUs they can obtain by the end of the year, or even by 2025.

Microsoft is aiming for 1.8 million GPUs by the end of the year. Meanwhile, OpenAI hopes to use 10 million GPUs for their latest AI model.

In the meantime, NVIDIA has also been churning out GPUs, with its latest DGX H200 system being hand-delivered by CEO Jensen Huang to Altman.

Coming back to LeCun, he pointed out that the need of the hour was the ability to upscale learning algorithms so they could be parallelised across several GPUs. “Progress on this has been kind of slow in the community, so I think we’re kind of waiting for breakthroughs there,” he said.

If that occurs, costs could potentially fall for AI companies, though with demand scaling up just as quickly, overall spending could remain much the same.
