In Gartner’s Hype Cycle for Artificial Intelligence 2024, we learned AI has made it over the “Peak of Inflated Expectations” and is poised to slip into the “Trough of Disillusionment.” Relax, Tolkien fans, this isn’t another Middle-earth adventure sequel. It means despite record investment, AI hasn’t returned the expected business value. There’s been more hype than proof, and now, early adopters are hitting performance snags that lower their ROI.
As noted by the Harvard Business Review, up to 80% of AI projects fail, often because the underlying cloud infrastructure can't handle generative AI (GenAI) research and development. Given the right infrastructure, organizations can mine their unstructured data to discover industry trends, strengthen decision-making and product quality, drive marketing results, elevate customer experiences and more. The possibilities are vast, but only available to companies with environments specifically optimized for AI.
Getting your head in the cloud
Some believe that when it comes to GenAI, it's cheaper to run on-premises on high-end processing and networking technology. The problem is that the graphics processing units (GPUs) needed to accelerate processing are expensive and difficult to get hold of. And to justify that hardware investment, you have to run workloads every single day at roughly 90% resource utilization, a bar few organizations clear.
Most organizations want to develop incrementally, which the cloud allows. As for handling unpredictable workloads, the cloud's innate elasticity is a much better fit. A further benefit of the cloud is the range of GenAI models that can be deployed there. Both open-source and closed-source models are available; however, closed-source models can't be used on-premises, even though they run circles around their open-source equivalents.
Closed-source models have to be in the cloud, which thankfully has a low cost barrier to entry and is supported by a thriving community of managed services providers and a host of complementary technology partners.
Lifting your cloud infrastructure
There are ways organizations can make sure their compute resources and storage infrastructure are up to snuff for handling GenAI cost efficiently. To start, fine-tune applications for optimal performance and ensure files and metadata are in the right place. Cost-effective scaling follows from there.
Cleaning and consolidating large data sets makes the data easier to use and produces stronger insights, because complete sets are being analyzed. GenAI itself can help cross-reference and validate details, resulting in higher-quality data, particularly when paired with dedicated collection and analytics tooling.
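As a concrete illustration, here is a minimal sketch of that cleanup step using pandas. The file names and column names are hypothetical stand-ins; real sources might be object storage or a warehouse.

```python
import pandas as pd

# Hypothetical regional exports to be consolidated into one complete set.
frames = [pd.read_csv(path) for path in ["sales_us.csv", "sales_eu.csv", "sales_apac.csv"]]
df = pd.concat(frames, ignore_index=True)

# Basic cleaning: normalize key fields, drop exact duplicates,
# and remove rows missing the fields the analysis depends on.
df["customer_id"] = df["customer_id"].astype(str).str.strip()
df = df.drop_duplicates()
df = df.dropna(subset=["customer_id", "order_total"])

df.to_parquet("sales_clean.parquet")  # columnar format keeps later reads cheap
```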
Configuring compute and storage correctly can prevent financial surprises. That starts with understanding a model's size so it can be matched to a GPU with enough memory. On the storage side, workload adjustments can head off latency, but GenAI apps and models will always require continued optimization and tuning.
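To make "understanding a model's size" concrete, here is a back-of-the-envelope sketch, a common rule of thumb rather than any vendor's formula: weights at a given precision, plus headroom for activations and cache.

```python
def estimate_vram_gb(num_params_billions: float, bytes_per_param: int = 2,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weights at the given precision (2 bytes for
    FP16/BF16, 1 for INT8) plus ~20% headroom for activations and KV cache."""
    weights_gb = num_params_billions * bytes_per_param  # 1B params * 1 byte ~= 1 GB
    return weights_gb * overhead

# A 13B-parameter model served in FP16 needs roughly:
print(f"{estimate_vram_gb(13, bytes_per_param=2):.0f} GB")  # ~31 GB: won't fit a 24 GB GPU
```

Running that estimate before picking an instance type is what keeps the GPU bill from becoming one of those financial surprises.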
Cloud providers typically offer a few models for evaluation purposes, which can help users select the best model while lowering testing expenses. Providers also offer credits that can be redeemed to reduce computing costs even further.
Taking it up a grade
Many view the issues they're having with GenAI as a technology problem, when it's really a business concern. You have to figure out what's limiting you, then apply tools to fix it. Some also try to find a use case only after working out the technical kinks, which is putting the cart before the horse. Identifying the use case first clarifies your goals and determines what the return on investment should be.
What makes GenAI projects in the cloud appear so complex is often a lack of understanding, specifically about the goal and the path to get there. Because every workload and model is different, it's important to set output and performance benchmarks first, then work backward. Use the cloud credits you've built up with providers to thoroughly test your infrastructure.
Start your AI initiative with a proof of concept (PoC) involving no fewer than 10 users. You want feedback from those users, even when the experience wasn't positive. Be sure you have set standard benchmarks, then monitor every input your GenAI receives and every output it produces. Evaluating just these items gives you insight into the workload adjustments needed to take things up a grade.
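As one way to implement that monitoring, the sketch below wraps whatever generation call your PoC uses (the generate_fn callable is a stand-in, not any particular vendor SDK) and logs every prompt, response, and latency to a file you can compare against your benchmarks.

```python
import json
import time
from typing import Callable

def monitored(generate_fn: Callable[[str], str],
              log_path: str = "genai_poc.jsonl") -> Callable[[str], str]:
    """Wrap a text-generation callable so every input/output pair is recorded."""
    def wrapper(prompt: str) -> str:
        start = time.perf_counter()
        output = generate_fn(prompt)
        latency = time.perf_counter() - start
        with open(log_path, "a") as f:
            f.write(json.dumps({
                "prompt": prompt,
                "output": output,
                "latency_s": round(latency, 3),
            }) + "\n")
        return output
    return wrapper

# Usage: generate = monitored(my_model_call); generate("Summarize this ticket...")
# Afterward, compare the logged latencies and outputs against the benchmarks you set.
```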
Why go it alone?
My final piece of advice is to remember that you don't have to do this solo. There are managed services with built-in security that keep bad content out of your data. Major providers, including Google and Amazon, offer tools that can help. There are also consulting firms that can take it on, using hands-on experience to create a custom approach that's secure and fits your budget.
Eduardo Mota is senior cloud data architect – AI/ML specialist, at DoiT, a provider of technology and cloud expertise to buy, optimize, and manage AWS, GCP, and Azure cloud services. An accomplished cloud architect and machine learning specialist, he holds a Bachelor of Business Administration and multiple machine learning certifications, demonstrating his relentless pursuit of knowledge. Eduardo’s journey includes pivotal roles at DoiT and AWS, where his expertise in AWS and GCP cloud architecture and optimization strategies significantly impacted operational efficiency and cost savings for multiple organizations.