Back to Basics Bonus Week: Deploying to the Cloud

The team at KDnuggets hope you have been enjoying the ‘Back to Basics’ series. To round it off, we have a bonus week for those who want to go the extra mile and broaden their knowledge base.

If you haven’t already, have a look at:

  • Week 1: Python Programming & Data Science Foundations
  • Week 2: Database, SQL, Data Management and Statistical Concepts
  • Week 3: Introduction to Machine Learning
  • Week 4: Advanced Topics and Deployment

Moving on to the bonus week:

  • Bonus 1: Getting Started with Google Cloud Platform in 5 Steps
  • Bonus 2: Deploying your Machine Learning Model to Production in the AWS Cloud

Getting Started with Google Cloud Platform in 5 Steps

Bonus Week — Part 1: Getting Started with Google Cloud Platform in 5 Steps

Explore the essentials of Google Cloud Platform for data science and ML, from account setup to model deployment, with hands-on project examples.

This article aims to provide a step-by-step overview of getting started with Google Cloud Platform (GCP) for data science and machine learning. We'll give an overview of GCP and its key capabilities for analytics, walk through account setup, explore essential services like BigQuery and Cloud Storage, build a sample data project, and use GCP for machine learning.

Whether you're new to GCP or looking for a quick refresher, read on to learn the basics and hit the ground running with Google Cloud.

Deploying your Machine Learning Model to Production in the AWS Cloud

Bonus Week — Part 2: Deploying Your Machine Learning Model to Production in the Cloud

Learn a simple way to have a live model hosted on AWS.

AWS, or Amazon Web Services, is a cloud computing service used in many businesses for storage, analytics, applications, deployment services, and many others. It’s a platform that uses several services to support businesses in a serverless way with pay-as-you-go schemes.

Machine learning modeling is also one of the activities that AWS supports. Several of its services cover the modeling workflow, from developing the model to putting it into production. AWS has shown versatility, which is essential for any business that needs scalability and speed.

This article will discuss deploying a machine learning model in the AWS cloud into production. How could we do that? Let’s explore further.

Wrapping it Up

And that’s a wrap!

Congratulations on completing the Bonus Week of the Back to Basics series.

The team at KDnuggets hope that the Back to Basics pathway has provided readers with a comprehensive and structured approach to mastering the fundamentals of data science.

If you have enjoyed the Back to Basics series, let us know in the comments so the team can craft another series. Please drop suggestions too!

Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice or tutorials and theory based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence is/can benefit the longevity of human life. A keen learner, seeking to broaden her tech knowledge and writing skills, whilst helping guide others.

7 Hardware Devices for Edge Computing Projects

Edge computing has heightened the demand for robust hardware solutions in recent years. This shift towards decentralised processing, closer to data sources, aims to reduce latency and elevate real-time decision-making. As a result, efficient and powerful hardware has become a crucial focal point, driving advancements to meet the evolving needs of this dynamic computing landscape.

Here are some of the top hardware devices for edge computing projects in 2023:

Raspberry Pi 5

The Raspberry Pi 5, the latest iteration of the renowned single-board computer, marks a substantial leap forward with enhanced performance and capabilities. Featuring a faster quad-core Arm Cortex-A76 CPU, upgraded VideoCore VII GPU, and increased RAM capacity, it is well-suited for demanding applications like video editing, gaming, and machine learning.

Adding Gigabit Ethernet, Wi-Fi 5, USB 3.0 ports, and USB-C power delivery further elevates its connectivity and convenience. Whether used for learning, media streaming, gaming, robotics, home automation, web development, or machine learning, the Raspberry Pi 5 combines affordability, compact size, and versatility.

NVIDIA Jetson Series

Within the Jetson series, the NVIDIA Jetson Nano emerges as a formidable force in edge computing, particularly for AI and deep learning applications. Engineered with a compact design and robust features, it finds its niche in diverse sectors, from smart cameras and autonomous robots to industrial automation and medical imaging.

Armed with an NVIDIA Maxwell GPU, quad-core ARM Cortex-A57 CPU, and 4GB LPDDR4 RAM, it delivers high-performance graphics and computing capabilities, facilitating the real-time analysis of intricate data. The Jetson Nano’s comprehensive software ecosystem, including the JetPack SDK with CUDA Toolkit and TensorRT, empowers developers to create and deploy sophisticated AI applications seamlessly.

Google Coral Edge TPU

The Google Coral Edge TPU stands as the pinnacle of accelerating AI at the edge, offering specialized hardware designed for the efficient execution of TensorFlow Lite models on edge devices. Boasting high performance with up to 4 TOPS, it ensures swift inference for complex AI tasks while consuming a mere 0.5W per TOPS, catering to the needs of battery-powered devices.
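The two figures quoted above (4 TOPS of throughput at 0.5 W per TOPS) can be combined into a rough power budget. The following is a minimal sketch of that arithmetic; the 10 Wh battery capacity is a hypothetical value chosen only for illustration, and real-world draw depends on the workload and the host system.

```python
def peak_power_watts(tops: float, watts_per_tops: float) -> float:
    """Power draw at full throughput: TOPS multiplied by watts per TOPS."""
    return tops * watts_per_tops


def battery_hours(capacity_wh: float, power_w: float) -> float:
    """Idealised runtime on a battery, ignoring all other system power draw."""
    return capacity_wh / power_w


# Figures from the text: 4 TOPS at 0.5 W per TOPS.
coral_power = peak_power_watts(4.0, 0.5)

# Hypothetical 10 Wh battery, used only to illustrate the calculation.
hours = battery_hours(10.0, coral_power)

print(f"Peak accelerator power: {coral_power} W")
print(f"Idealised runtime on a 10 Wh battery: {hours} h")
```

The estimate covers the accelerator alone, so it is an upper bound on runtime rather than a prediction for a complete device.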

Available in multiple forms, including a USB stick and PCI-e card, it facilitates seamless integration into diverse systems. The Coral Edge TPU’s compatibility with TensorFlow Lite, coupled with Google’s user-friendly SDK and development tools, streamlines the implementation process for developers.

Notably, the accelerator brings tangible benefits such as reduced latency, enhanced privacy, and increased reliability to edge computing applications, making it a versatile solution for smart cameras, autonomous robots, industrial automation, healthcare, and retail.

Microsoft Azure IoT Edge

Microsoft Azure IoT Edge enables edge computing by seamlessly extending the power of Azure services, including Azure Functions and Azure Machine Learning, directly to edge devices. This cloud-based platform ensures reduced latency, improved efficiency, and offline functionality by enabling local processing of data generated at the edge.

The key features, such as container orchestration with Docker and centralized management through the Azure IoT Hub, simplify deployment and monitoring tasks, enhancing overall operational efficiency. With applications spanning industrial IoT, smart cities, retail, healthcare, and agriculture, Azure IoT Edge proves its versatility.

Intel NUC

Intel NUCs (Next Unit of Computing) are compact powerhouses ideally suited for edge computing applications. These mini PCs handle demanding tasks such as real-time video analytics, machine learning inference, and industrial automation control, boasting high-performance configurations, cutting-edge Intel Core processors, and Iris Xe graphics.

The compact 4×4-inch form factor facilitates easy integration into space-constrained environments, making them a go-to choice for diverse applications. Their flexibility, scalability, and various connectivity options provide adaptability for evolving computing demands.

Compatible with various operating systems, they find applications in smart cities, retail, industrial automation, healthcare, and education, showcasing their versatility as a dependable solution for critical edge computing tasks.

Amazon EC2 Local Edge

Amazon EC2 Local Edge enables the execution of AWS Lambda functions and containerised applications at the edge, offering unparalleled flexibility and control. It also addresses the need for low latency in real-time applications like robotics and gaming, ensuring faster response times by running applications locally.

It caters to concerns regarding data privacy by allowing users to keep sensitive information on their hardware while providing secure access and encryption. Moreover, the service facilitates offline operation in remote or unreliable locations, ensuring continuous functionality.

With features such as AWS Fargate for container orchestration, Amazon EC2 Local Volume for local data storage, and centralised management via AWS OpsHub, this solution emerges as a game-changer for various industries, including industrial IoT, smart cities, healthcare, retail, and media and entertainment.

Dell EMC PowerEdge XE240m

The Dell EMC PowerEdge XE240m emerges as a solution purpose-built for demanding edge computing applications in harsh environments. This compact 2U rack server ensures uninterrupted service reliability.

Powered by dual-socket Intel Xeon processors, the Dell EMC PowerEdge XE240m delivers the computing capabilities needed for real-time data processing in sectors like industrial IoT, oil and gas, defense, aerospace, smart cities, and telecommunications.

The server’s scalable storage options, versatile compatibility with various operating systems, and advanced security features underscore its adaptability. Additionally, the PowerEdge XE240m’s ease of management through iDRAC9 and front-accessible I/O enhances its practicality in challenging settings.

The post 7 Hardware Devices for Edge Computing Projects appeared first on Analytics India Magazine.

Mistral AI, a Paris-based OpenAI rival, closed its $415 million funding round

By Romain Dillet (@romaindillet)

French startup Mistral AI has officially closed its much anticipated Series A funding round. The company has raised €385 million, or $415 million at today’s exchange rate — according to Bloomberg, it values the company at roughly $2 billion. Mistral AI is also opening up its commercial platform today.

As a reminder, Mistral AI raised a $112 million seed round less than six months ago to set up a European rival to OpenAI. Co-founded by Google’s DeepMind and Meta alums, Mistral AI is working on foundational models with an open technology angle.

Andreessen Horowitz (a16z) is leading the most recent funding round, with Lightspeed Venture Partners investing once again in the AI company. That’s not all: a long list of other investors is also participating in the round, including Salesforce, BNP Paribas, CMA-CGM, General Catalyst, Elad Gil and Conviction.

“Since the creation of Mistral AI in May, we have been pursuing a clear trajectory: that of creating a European champion with a global vocation in generative artificial intelligence, based on an open, responsible and decentralised approach to technology,” Mistral AI co-founder and CEO Arthur Mensch said in a statement.

In September, Mistral AI released its first model, called Mistral 7B. This large language model isn’t meant to compete directly with GPT-4 or Claude 2, as it is a “small” model with around 7 billion parameters.

Instead of opening access to the Mistral 7B model via APIs, the company made it available as a free download so that developers could run it on their devices and servers.

The model was released under the Apache 2.0 license, an open-source license that has no restrictions on use or reproduction beyond attribution. While the model can be downloaded and run by anyone, it was developed behind closed doors with a proprietary, undisclosed training dataset.

Mistral AI also played an important role in shaping the discussions around the EU’s AI Act. The French AI startup has been lobbying for a total exemption for foundational models, saying that regulation should apply to use cases and companies working on products that are used by end users directly.

EU lawmakers reached a political deal just a couple of days ago. Companies working on foundational models will face some transparency requirements and will have to share technical documentation and summaries of what’s in the datasets.

Mistral AI’s best model is now only accessible via an API

The company still plans to make money from its foundational models. That’s why Mistral AI is opening up its developer platform in beta today. With this platform, other companies will be able to pay to use Mistral AI’s models via APIs.

In addition to the Mistral 7B model (“Mistral-tiny”), developers will be able to access the new Mixtral 8x7B model (“Mistral-small”). This model uses “a router network” to process input tokens and choose the most apt group of parameters to give an answer.

“This technique increases the number of parameters of a model while controlling cost and latency, as the model only uses a fraction of the total set of parameters per token. Concretely, Mixtral has 45B total parameters but only uses 12B parameters per token. It, therefore, processes input and generates output at the same speed and for the same cost as a 12B model,” the company wrote in a blog post.
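The routing idea described in the quote can be sketched in a few lines. This is a toy illustration, not Mistral AI's implementation: the eight experts, the top-2 selection, the router logits, and the scalar "experts" are all assumptions made for the example. It shows how a router picks a small group of experts per token so that only a fraction of the total parameters is used.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=2):
    """Pick the top_k experts for one token and normalise their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(token, experts, router_logits, top_k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(weight * experts[index](token)
               for index, weight in route_token(router_logits, top_k))


# Hypothetical toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Hypothetical router scores for a single token (one score per expert).
logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]

routes = route_token(logits)
output = moe_layer(1.0, experts, logits)

# Share of parameters used per token, from the figures quoted above.
active_fraction = 12 / 45

print("Selected experts:", [index for index, _ in routes])
print(f"Active parameter share: {active_fraction:.0%}")
```

Only the selected experts are evaluated, which is why per-token cost follows the active parameter count (12B in the quote) rather than the total (45B).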

Mixtral 8x7B has also been released under the Apache 2.0 license and is available as a free download. A third model, Mistral-medium, is available on Mistral’s developer platform. It supposedly performs better than Mistral AI’s other models and it is only available through the paid API platform — no download link available.

Generative AI Needs Vigilant Data Cataloging and Governance

Sponsored Content by Alation. December 11, 2023, by Kevin Petrie, VP of Research at Eckerson Group

Our industry’s breathless hype about generative AI tends to overlook the stubborn challenge of data governance. In reality, many GenAI initiatives will fail unless companies properly govern the text files that feed the language models they implement.

Data catalogs offer help. Data teams can use the latest generation of these tools to evaluate and control GenAI inputs on five dimensions: accuracy, explainability, privacy, IP friendliness, and fairness. This blog explores how data catalogs support these tasks, mitigate the risks of GenAI, and increase the odds of success.

What is GenAI?

GenAI refers to a type of artificial intelligence that generates digital content such as text, images, or audio after being trained on a corpus of existing content. The most broadly applicable form of GenAI centers on a large language model (LLM), which is a type of neural network whose interconnected nodes collaborate to interpret, summarize, and generate text. OpenAI’s release of ChatGPT (based on GPT-3.5) in November 2022 triggered an arms race among LLM innovators. Google released Bard, Microsoft integrated OpenAI code into its products, and GenAI specialists such as Hugging Face and Anthropic gained new prominence with their LLMs.

Now things get tricky

Companies are embedding LLMs into their applications and workflows to boost productivity and gain competitive advantage. They seek to address use cases such as customer service and document processing based on their own domain-specific data, especially natural language text. But text files introduce risks related to data quality, fairness, and privacy. They can cause GenAI models to hallucinate, propagate bias, or expose sensitive information unless properly cataloged and governed.

Data teams, more accustomed to database tables, must get a handle on governing all these PDFs, Google Docs, and other text files to ensure GenAI does more good than harm. And the stakes run high: 46% of data practitioners told Eckerson Group in a recent survey that their company does not have sufficient data quality and governance controls to support its AI/Machine Learning (ML) initiatives.

Data teams need to govern the natural-language text that feeds GenAI initiatives

Enter the data catalog

The data catalog has long assisted governance by enabling data analysts, scientists, engineers, and stewards to evaluate and control datasets in their environment. It centralizes a wide range of metadata—file names, database schemas, category labels, and more—so data teams can vet data inputs for all types of analytics projects. Modern catalogs go a step further to evaluate risk and control usage of text files for GenAI initiatives. This helps data teams fine-tune and prompt their LLMs with inputs that are accurate, explainable, private, IP friendly, and fair. Here’s how.

Accuracy

Figure: Catalogs help data teams govern LLM inputs to be accurate, explainable, private, IP friendly, and fair.

GenAI models need to minimize hallucinations by using inputs that are correct, complete, and fit for purpose. Catalogs centralize metadata to help data teams evaluate data objects according to these requirements. For example, data engineers might append accuracy scores to text files, rate their alignment with master data, or classify them by topic or sentiment. Such metadata helps the data scientist select the right files for fine-tuning or prompt enrichment via retrieval-augmented generation. This helps control the accuracy of LLM inputs and outputs.
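As a rough illustration of how such metadata might drive file selection, the sketch below filters catalog entries by topic and accuracy score. The `CatalogEntry` fields, the 0-to-1 score scale, the file paths, and the 0.8 threshold are all hypothetical; a real catalog would expose its own schema and API.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """Hypothetical catalog record for one text file."""
    path: str
    topic: str
    accuracy_score: float  # assumed scale: 0.0 (unreliable) to 1.0 (verified)


def select_inputs(entries, topic, min_accuracy=0.8):
    """Keep files on the requested topic that meet the accuracy threshold."""
    return [entry for entry in entries
            if entry.topic == topic and entry.accuracy_score >= min_accuracy]


catalog = [
    CatalogEntry("policies/refunds.pdf", "customer_service", 0.95),
    CatalogEntry("drafts/refunds_old.docx", "customer_service", 0.40),
    CatalogEntry("finance/q3_report.pdf", "finance", 0.90),
]

selected = select_inputs(catalog, "customer_service")
print([entry.path for entry in selected])
```

The same filter could feed either a fine-tuning dataset or a retrieval index, since both depend on choosing trusted files up front.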

Explainability

LLMs should provide transparent visibility into the sources of their answers. Catalogs help by enabling data scientists and ML engineers to evaluate the lineage of their source files. For example, the data scientist with a financial-services company might use a catalog to trace the lineage of sources for an LLM that processes mortgage applications. They can explain this lineage to customers, auditors, or regulators, which helps them trust the LLM’s outputs.

Privacy

Companies must maintain privacy standards and policies when creating LLMs. Data catalogs assist by identifying, evaluating, and tagging personally identifiable information (PII). Armed with this intelligence, data scientists and ML or natural language processing (NLP) engineers can work with data stewards to obfuscate PII before using those files. They also can collaborate with data stewards or security administrators to implement role-based access controls based on compliance risk.
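The tagging and obfuscation steps could look something like the following minimal sketch. The two regular expressions are simplified assumptions that catch only basic email addresses and US-style phone numbers; production PII detection needs far broader coverage and is usually handled by dedicated tooling.

```python
import re

# Simplified, hypothetical patterns; they will miss many real-world formats.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def tag_pii(text):
    """Return the sorted PII categories found in the text."""
    return sorted(label for label, pattern in PII_PATTERNS.items()
                  if pattern.search(text))


def mask_pii(text):
    """Replace each detected PII value with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


sample = "Contact jane.doe@example.com or 555-123-4567 for details"
tags = tag_pii(sample)
masked = mask_pii(sample)

print(tags)
print(masked)
```

The tags would be stored as catalog metadata, while the masked text is what would be passed on for fine-tuning or prompting.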

IP friendliness

Companies must protect intellectual property such as copyrights and trademarks to avoid liability risks. By evaluating data ownership and usage restrictions for text files, catalogs can help data engineers and data stewards ensure that data science teams do not overstep any legal boundaries as they fine-tune and implement LLMs.

Fairness

GenAI initiatives must not propagate bias by inadvertently delivering responses that unfairly represent certain populations or viewpoints. To prevent bias, data teams can evaluate, classify, and rank files according to their representation of different groups. By centralizing this metadata in a catalog, they can decide on a holistic basis whether they have the right balanced inputs for their LLMs. This helps companies control the level of fairness.
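One simple way to reason about balance is to compute each group's share of the input files and flag groups that fall below a chosen threshold. The sketch below assumes each file already carries a hypothetical `group` label in the catalog, and the 20% threshold is arbitrary; actual fairness assessments are considerably more nuanced.

```python
from collections import Counter


def representation_share(files):
    """Fraction of files associated with each group label."""
    counts = Counter(item["group"] for item in files)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(files, min_share):
    """Groups whose share of the files falls below min_share."""
    shares = representation_share(files)
    return sorted(group for group, share in shares.items() if share < min_share)


# Hypothetical catalog metadata: 6 files for group A, 3 for B, 1 for C.
files = ([{"group": "A"}] * 6) + ([{"group": "B"}] * 3) + ([{"group": "C"}] * 1)

shares = representation_share(files)
flagged = underrepresented(files, min_share=0.2)

print(shares)
print(flagged)
```

A flagged group is a prompt for the data team to source more material or re-weight inputs, not an automatic verdict on bias.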

Conclusion

Generative AI creates exciting opportunities for companies to make their workers more productive, their processes more efficient, and their offerings more competitive. But it also exacerbates long-standing risks such as data quality, privacy, and fairness. Data catalogs offer a critical platform for governing these risks and enabling companies to realize the promise of GenAI. And in a symbiotic fashion, GenAI can help catalogs achieve this goal. Check out Alation’s recent announcement to learn how its Allie AI co-pilot helps companies automatically document and curate datasets at scale.

Oracle’s Symbiotic Connection with AMD and NVIDIA

Oracle Cloud and AMD have fostered a long-standing collaboration in the realm of cloud computing, a partnership that Karan Batta, Senior Vice President at Oracle Cloud Infrastructure, shed light on at AMD’s Advancing AI event when speaking with Lisa Su, the CEO of AMD.

“We’re also excited to actually support MI300X as part of our generative AI service that’s going to be coming up live very soon as well,” said Batta. Oracle has announced that it would be hosting AMD’s MI300X GPUs for its cloud infrastructure, which were released at the event by Su and would be available starting next year.

AMD loves Oracle

Batta expressed his excitement about the partnership between Oracle and AMD, emphasising their journey since the inception of Oracle Cloud Infrastructure (OCI) in 2017. The partnership has seen the integration of every generation of AMD’s EPYC into OCI’s bare metal compute platform, garnering success with notable customers like Red Bull.

This success prompted an expansion across the entire portfolio of platform-as-a-service (PaaS) offerings, including Kubernetes and VMware. Additionally, this extends to Pensando DPUs, where offloading logic enhances performance and flexibility for customers.

Batta highlighted Oracle’s support for the MI300X in the bare-metal compute stack, underlining the company’s commitment to integrating the latest technologies into its offerings. Customer feedback on AMD’s MI300X has been positive, with early adopters like Databricks expressing enthusiasm about the upcoming generative AI service.

Lisa Su, CEO of AMD; Karan Batta, Senior Vice President at Oracle Cloud Infrastructure

Databricks, Lamini and Essential AI have also been long-running customers of AMD. All three companies have been using AMD’s earlier GPUs for AI workloads. Highlighting another significant milestone, Batta mentioned Oracle and AMD’s partnership on Exadata earlier in the year, signalling a promising trajectory for their future.

On the other hand, AMD has also partnered with Microsoft and Meta for offering MI300X on their platforms. AMD believes that AI is a collaborative frontier, and not just a competition.

Meta AI senior director of engineering Ajit Matthews announced that Meta is going to use MI300X in its data centres. Microsoft’s CTO Kevin Scott also said that Azure OpenAI Service will run on MI300X. This is similar to what Oracle has been doing with its multi-cloud approach.

Oracle loves NVIDIA, and everyone

When it comes to cloud, Oracle has played a multi-cloud gamble and is partnering with Microsoft to integrate its services. Moreover, the company has also announced that it hopes to partner with Google and AWS soon. This clearly showcases Oracle’s commitment to staying on top of the game via collaboration. The same goes for hardware providers.

Batta further highlighted that AMD’s MI300X GPUs will also be part of Oracle’s generative AI service, which is going live very soon. Just as AMD is all about partnership, Oracle is committed to building an ecosystem for generative AI instead of competing with others.

Interestingly, Oracle had announced a multi-year partnership with NVIDIA to accelerate AI adoption for enterprises. Chris Chelliah, Oracle’s spokesperson, highlighted that NVIDIA selected OCI as the first hyper-scale cloud provider to offer NVIDIA DGX Cloud, emphasising the strength of Oracle’s infrastructure.

The partnership leverages MySQL HeatWave data for real-time anomaly detection on NVIDIA clusters, showcasing the synergy between the two entities. Oracle Cloud Infrastructure customers now have simplified access to high-performance accelerated computing and software for production AI projects.

The best of both worlds

The inclusion of NVIDIA’s DGX Cloud AI supercomputing platform and NVIDIA AI Enterprise software in the Oracle Cloud Marketplace, along with the announcement of AMD’s MI300X, and support for ROCm, provides a streamlined path for end-to-end AI development and deployment.

Oracle’s intricate dance between AMD and NVIDIA reflects a nuanced strategy in the rapidly evolving landscape of generative AI and cloud computing. Oracle wants everything to be about collaboration, and to give its customers the best of both NVIDIA’s and AMD’s worlds.

The post Oracle’s Symbiotic Connection with AMD and NVIDIA appeared first on Analytics India Magazine.

Lean Co-pilot Lets You Use LLMs as Copilots in Lean

The LeanDojo team and California Institute of Technology have introduced Lean Co-pilot, a collaborative tool designed for LLM-human interaction to craft 100% accurate formal mathematical proofs.

The innovative system utilises LLMs to suggest proof tactics within the Lean theorem prover, providing a seamless environment for human intervention and modification.

The challenging landscape of automating theorem proving has long been hindered by the unreliability of current LLMs in mathematical and reasoning tasks, often prone to mistakes and hallucinations. Traditionally, mathematical proofs have predominantly relied on manual derivation, demanding meticulous verification.

A short demo of Lean Co-pilot by @KaiyuYang4 We are here at #NeurIPS2023 to talk about AI for theorem proving
– Tutorial on Machine Learning for Theorem Proving (https://t.co/eNMzVij15R): Monday 1:45–4:15 PM, Hall B2
– LeanDojo’s oral presentation: Tuesday 10 AM, Ballroom A-C… pic.twitter.com/sqC0wrZS06

— Prof. Anima Anandkumar (@AnimaAnandkumar) December 11, 2023

Lean, a powerful theorem prover, excels at formal verification, but writing proofs in Lean is laborious for humans. Lean Co-pilot addresses this issue by leveraging LLMs to automate the suggestion of Lean proof tactics, significantly expediting proof synthesis. The system asks for human input only when necessary, offering a balanced collaboration between machine and human intellect.
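For readers unfamiliar with Lean, a proof tactic is a single step that transforms the current goal. The snippet below is a small plain Lean 4 example, written for illustration and not taken from Lean Co-pilot; it shows the kind of step an assistant would be expected to suggest. The theorem name is hypothetical.

```lean
-- A small Lean 4 theorem proved with a single tactic.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- `rw` rewrites the goal using commutativity of addition on Nat;
  -- the resulting goal `b + a = b + a` is then closed by reflexivity.
  rw [Nat.add_comm]
```

Because Lean checks every step, a wrong suggestion is simply rejected by the prover, which is what makes LLM-generated tactics safe to try.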

Key features of Lean Co-pilot include LLM-driven suggestions for proof steps, searching for proofs, and selecting useful lemmas from an extensive mathematical library. The tool seamlessly integrates into Lean’s Visual Studio Code workflow, ensuring a user-friendly experience.

Users can set up Lean Co-pilot as a Lean package, utilising built-in models from LeanDojo or incorporating custom models that can run locally or on the cloud.

LeanDojo, the platform supporting Lean Co-pilot, encourages accessibility by providing open-source models and tools under the MIT licence. The tool operates on various platforms, including Linux, macOS, and Windows WSL, with optional support for CUDA-enabled GPUs.

Lean Co-pilot’s requirements include Git LFS, optional CUDA and cuDNN (recommended for GPU support), and CMake >= 3.7 along with a C++17 compatible compiler for building Lean Co-pilot itself.

Lean Co-pilot’s introduction aims to make LLMs more accessible to Lean users, fostering a positive feedback loop where proof automation contributes to enhanced data quality, ultimately driving improvements in LLMs for mathematical tasks.

The post Lean Co-pilot Lets You Use LLMs as Copilots in Lean appeared first on Analytics India Magazine.

OpenAI Hires Former X Head to Lead India Operations

OpenAI is planning to expand its footprint in India by roping in Rishi Jaitly, former VP of Elon Musk’s X, as reported by TechCrunch. Jaitly will be working as a senior advisor and will be responsible for guiding the company to navigate India’s AI policy and regulatory landscape.

Though the news has not been officially announced by OpenAI, Jaitly is helping the team set up an office in India. Interestingly, the development is said to have taken place soon after Altman’s inaugural trip to the country in June.

According to Jaitly’s LinkedIn, his extensive professional journey encompasses diverse roles, including founding the Virginia Tech Institute for Leadership in Technology, where he pioneered the world’s first executive degree in humanities. Currently serving as a distinguished humanities fellow and professor of practice at Virginia Tech, he engages in teaching and leadership at the intersection of humanities and technology.

As Principal at Alchmy LLC, he focuses on evangelism and excellence for companies globally. Notably, as the Co-Founder & CEO of Times Bridge, he scaled a venture capital business, bringing global ideas to India. Jaitly also played key roles at Twitter (he was the company’s first employee in India), leading operations and partnerships in Asia, and at Google, where he overturned internet censorship and advocated for an open web across India and South Asia.

Back in June, the company made its first international expansion to London, UK.

During Altman’s visit to India, he found himself unnecessarily entangled in a controversy after declaring it “hopeless” for Indian companies to compete with their American counterparts in AI, later clarifying that his remark was taken out of context; it referred specifically to the challenge of competing with a $10 million budget. In response, CP Gurnani, Tech Mahindra CEO, revealed that the IT giant is working on Project Indus, an indigenous LLM that would have the ability to speak in many Indic languages, most notably Hindi.

The post OpenAI Hires Former X Head to Lead India Operations appeared first on Analytics India Magazine.

From Data Defiance to Cyber Resilience: The Winners of Shell’s Cyber Threat Hackathon

Shell and MachineHack jointly hosted the ‘Cyber Threat Detection Hackathon’, which kicked off on September 15 and concluded on November 10, 2023, signifying a crucial initiative in advancing cybersecurity solutions.

The challenge was to develop models that identify hidden code in text to enhance web application security and resilience against cyber threats.

The hackathon focused on detecting code in text, a method commonly used by malicious entities to breach systems and access data.

Hackers often hide malicious code in innocuous media like images, videos, or text files. This embedded code can be executed unknowingly, compromising security.

In the hackathon, the participants received a text with hidden source code. The code might lack source control markers and might be split into segments within the text. Participants had to identify and extract this code, using strategies that included pattern recognition, machine learning models, and natural language processing techniques.

The challenge required analysing text structures, identifying anomalies, and using algorithms to detect the embedded code. The participants adapted their methods to diverse text formats and unexpected code placements.
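To make the pattern-recognition strategy concrete, here is a minimal heuristic sketch that flags lines containing several code-like markers. It is not any participant's solution: the marker list and the two-hit threshold are assumptions chosen for illustration, and the winning entries described below relied on fine-tuned language models instead.

```python
import re

# Hypothetical markers that appear far more often in code than in prose.
CODE_HINTS = re.compile(r"[{};]|==|\bdef\b|\breturn\b|\bimport\b|\(\)|=>")


def looks_like_code(line, min_hits=2):
    """Flag a line when it contains at least min_hits code-like markers."""
    return len(CODE_HINTS.findall(line)) >= min_hits


def extract_code(text):
    """Return the lines of the text that look like embedded code."""
    return [line for line in text.splitlines() if looks_like_code(line)]


sample = (
    "The meeting is at noon.\n"
    "def add(a, b): return a + b;\n"
    "Please bring snacks."
)

found = extract_code(sample)
print(found)
```

A rule-based filter like this is easy to fool, which helps explain why participants moved towards learned models that adapt to diverse formats and unexpected code placements.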

The hackathon was open to everyone except the employees and contractors of the organisers. The jury consisted of the top leadership at Shell.

The Prize Money

The stakes were high: the first-prize winner would receive $2500, while the second- and third-prize winners would take home $1200 and $700, respectively. That’s not all; the next ten runners-up would each receive $60.

The Winners

Winner: Ramashish Gupta

Ramashish Gupta, an undergraduate at IIT Kharagpur, won the first prize in the hackathon. His approach involved a two-step training process: addressing challenges in code repetition and improving accuracy through dynamic text matching. “At the Shell Hackathon, I started with a pre-trained T5 model from Salesforce. I fine-tuned it using a two-step training process involving seq2seq training,” Gupta explained.

The challenges included the model’s inability to repeat code, causing errors. To address this, he added a pre-training step to teach the model to replicate code before actual training. Additionally, he implemented text matching for code extraction using dynamic matching. Extensive data analysis helped reduce issues like data inconsistency.
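The article does not spell out how the dynamic matching worked. As a rough illustration only, and not Gupta’s actual method, the sketch below uses Python’s standard difflib module to snap a slightly noisy model output back onto an exact span of the original text. The function name snap_to_source and the min_block threshold are hypothetical.

```python
from difflib import SequenceMatcher


def snap_to_source(source: str, generated: str, min_block: int = 3) -> str:
    """Return the span of `source` that best lines up with `generated`.

    Hypothetical helper: the generated text may contain small errors, so we
    align it with the source and return the exact source characters instead.
    """
    matcher = SequenceMatcher(None, source, generated, autojunk=False)
    # Keep only matching blocks long enough to be meaningful. This also
    # drops the zero-length sentinel block that difflib appends at the end.
    blocks = [b for b in matcher.get_matching_blocks() if b.size >= min_block]
    if not blocks:
        return ""
    start = blocks[0].a                      # first aligned source position
    end = blocks[-1].a + blocks[-1].size     # one past the last aligned one
    return source[start:end]


text = "Hello world. for i in range(3): print(i) Thanks!"
noisy = "for i in range(3): print (i)"   # model inserted a stray space
print(snap_to_source(text, noisy))
```

Returning a slice of the source, rather than the generated string, guarantees that the extracted code is character-for-character present in the input text.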

Check out the solution here.

1st Runner up: Mohan Krishna Gupta

Mohan Krishna Gupta, a fresh BTech graduate and an NLP engineer at Textify AI, secured the second position at the hackathon. His approach was an NLP question-answering task using an ensemble of RoBERTa and DeBERTa models.

“I experimented with different models and finally built an ensemble using RoBERTa and DeBERTa, training each for 30 epochs on Google Colab,” Gupta said. He then used PyTorch and HuggingFace transformers libraries for training. His strategy to handle the large context sizes was to create smaller chunks and modify the indices of the answer for each chunk.
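The chunking idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the participant’s code: it works on character offsets rather than model tokens, and the function name chunk_with_answer along with the chunk_size and stride values are hypothetical.

```python
from typing import List, Optional, Tuple

Chunk = Tuple[str, Optional[Tuple[int, int]]]


def chunk_with_answer(
    context: str,
    answer_start: int,
    answer_end: int,
    chunk_size: int = 200,
    stride: int = 100,
) -> List[Chunk]:
    """Split `context` into overlapping windows and re-index the answer.

    Each item is (window_text, local_span). local_span is the answer span
    relative to the window start, or None when the answer does not lie
    fully inside that window.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks: List[Chunk] = []
    start = 0
    while True:
        end = min(start + chunk_size, len(context))
        if start <= answer_start and answer_end <= end:
            # Shift the global indices by the window offset.
            local = (answer_start - start, answer_end - start)
        else:
            local = None
        chunks.append((context[start:end], local))
        if end == len(context):
            break
        start += stride
    return chunks


code = "print('hi')"
context = "x" * 150 + code + "y" * 150
for window, span in chunk_with_answer(context, 150, 150 + len(code)):
    if span is not None:
        print(span, window[span[0]:span[1]])
```

Overlapping windows make it more likely that at least one chunk contains the whole answer, which is why the stride here is smaller than the chunk size.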

“I often participate in hackathons as they always provide a good exposure to using the latest technologies to solve problems,” he said.

Check out the solution here.

2nd Runner up: Jatin Yadav

Jatin Yadav, a GCP data engineer at Cognizant, secured third place in the hackathon. A graduate in computer science, Yadav initially attempted manual code extraction before shifting to using LLMs, specifically FLAN-T5. He enhanced the model’s tokeniser to recognise specific programming tokens, improving its accuracy.

“I tried multiple LLMs, including BERT, LLaMA and FLAN-T5 variants. At last, I stuck with FLAN-T5 (xl variant: 2.85 billion parameters) because of its portability, less training time and more accurate results.” Yadav also added tokens that were not recognised by the model (“{”, “}”, “”) and retrained it for better accuracy.
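To see why missing tokens matter, consider the toy example below. It is a deliberately simplified, whitespace-based tokeniser written for illustration, not FLAN-T5’s real tokeniser, and the ToyTokenizer class is hypothetical. In the Hugging Face transformers library, the equivalent step would typically be a call to the tokeniser’s add_tokens method followed by the model’s resize_token_embeddings, after which the new embeddings still have to be learned through further training.

```python
from typing import Dict, Iterable, List


class ToyTokenizer:
    """Whitespace tokeniser with a fixed vocabulary and an <unk> fallback."""

    UNK = "<unk>"

    def __init__(self, vocab: Iterable[str]) -> None:
        self.vocab: Dict[str, int] = {self.UNK: 0}
        self.add_tokens(vocab)

    def add_tokens(self, new_tokens: Iterable[str]) -> int:
        """Append unseen tokens to the vocabulary; return how many were added."""
        added = 0
        for tok in new_tokens:
            if tok not in self.vocab:
                self.vocab[tok] = len(self.vocab)
                added += 1
        return added

    def encode(self, text: str) -> List[int]:
        # Anything outside the vocabulary collapses to the <unk> id (0).
        return [self.vocab.get(tok, 0) for tok in text.split()]

    def decode(self, ids: List[int]) -> str:
        inverse = {i: tok for tok, i in self.vocab.items()}
        return " ".join(inverse[i] for i in ids)


snippet = "if x { return x }"
tok = ToyTokenizer(["if", "x", "return"])
print(tok.decode(tok.encode(snippet)))   # braces are lost as <unk>

tok.add_tokens(["{", "}"])
print(tok.decode(tok.encode(snippet)))   # braces now survive the round trip
```

Before the braces are added, the round trip loses them, so a model could never emit them in extracted code; afterwards the snippet is reproduced exactly.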

Check out the solution here.

Other winners

Apart from the top three winners, the challenge recognised Prabin Kumar Nayak, Roshan Rateria, Rajat Ranjan, Bhavyan Sahayata, Thangadurai Jayaraman, and Ayush Patel as the runners-up.

The ‘Cyber Threat Detection Hackathon’ provided a platform for emerging talents in AI and ML to demonstrate their skills in cybersecurity. It highlighted collaboration between the technology sector and the developer community, and showcased advances in applying AI to cybersecurity challenges, pointing to future collaborative solutions.

The post From Data Defiance to Cyber Resilience: The Winners of Shell’s Cyber Threat Hackathon appeared first on Analytics India Magazine.

Top 7 Noteworthy AI Innovations in Fashion in 2023

While we saw companies around the world jumping onto the bandwagon of implementing generative AI in their daily workflow, the fashion industry took its own sweet time. However, we have recently witnessed significant developments in this space, particularly over the last six months of 2023. AI has been a catalyst in analysing data for trend-driven collections, and it is pivotal in developing eco-friendly materials, optimising production for sustainability and more.

Now, let’s explore some of the key developments in the application of AI in the fashion industry this year.

Meta

Meta and luxury fashion brand EssilorLuxottica have launched the new Ray-Ban Meta smart glasses, which offer improved audio through custom-designed speakers and an ultra-wide 12 MP camera for higher-quality photos and 1080p videos, and are powered by the Qualcomm Snapdragon AR1 Gen1 Platform. Livestreaming to Facebook or Instagram is now possible, and the glasses feature hands-free convenience, water resistance (IPX4), and a sleek charging case that provides up to 36 hours of use. The glasses come in Wayfarer and Headliner styles, with various frame and lens combinations, and are prescription-lens compatible.

Humane AI

OpenAI-backed startup Humane AI unveiled its AI Pin at Paris Fashion Week, where it was introduced by supermodel Naomi Campbell, shedding light on the future of AI wearables.

Humane’s AI Pin is a voice-activated wearable device that uses LLMs for conversational AI interaction. Operating on the Cosmos system, it streamlines user experience by automatically directing queries to relevant tools, eliminating the need for manual app or settings management. Through the AI Mic software, the device connects to AI models, including OpenAI’s ChatGPT, enabling tasks such as making calls, translating conversations, taking photos, and delivering reminders.

Google

Google has launched two features to improve online clothes shopping. The Virtual Try-On for apparel uses generative AI to display clothes on diverse real models, aiding users in making informed decisions about how garments will look on different body types and skin tones. Currently available for women’s tops from brands like Anthropologie, Everlane, H&M, and LOFT, this feature is expected to expand to include more brands and items, including men’s tops.

The second feature, Guided Refinements, utilises machine learning and visual matching algorithms to help users narrow down online product searches based on factors like colour, style, and pattern, offering a more tailored shopping experience with options from various retailers. Both features aim to replicate the convenience of trying on clothes in-store and address common challenges faced by online shoppers.

Levi Strauss

Denim pioneer Levi Strauss announced a partnership with Lalaland.AI, an Amsterdam-based digital fashion studio specialising in AI-generated models. The collaboration aims to test AI-generated models later this year to complement human models, diversifying and expanding the representation of models for Levi’s products.

Lalaland.AI employs advanced AI to create hyper-realistic models representing various body types, ages, sizes, and skin tones, contributing to a more inclusive and sustainable shopping experience. While acknowledging the potential of AI, the company’s focus remains on using the technology to create a more diverse, personal, and engaging customer experience, in line with Levi’s ongoing efforts to foster diversity in both content creation and consumer representation.

Adobe

Adobe’s Project Primrose is an interactive dress crafted using wearable and flexible, non-emissive textiles, allowing the entire surface to display content made using Adobe Firefly, Adobe After Effects, Adobe Stock, and Adobe Illustrator. The dress is engineered to animate fabric, undergoing design and style changes quickly.

The technology at play involves light-diffusing modules for displays, utilising a non-emissive and flexible system employing reflective-backed polymer-dispersed liquid crystal (PDLC), a material commonly used in smart windows. This energy-efficient material, adaptable to various shapes, dynamically diffuses light.

Moreover, the dress integrates sensors that respond to the wearer’s movements. Designers can opt to employ Adobe’s creative AI tools to generate visuals based on textual prompts.

Shopify

Canadian e-commerce giant Shopify is also embracing generative AI through its Shopify Magic initiative. Under this umbrella, the company unveiled enhanced features that leverage generative AI. Shopify Magic can now deliver tailored responses to customer queries based on their interaction history and store policies. It extends its capabilities to generate content such as blog posts, product descriptions, and marketing emails.

The introduction of Sidekick, a chatbot-like AI tool, enables Shopify to comprehend and respond to queries related to business decision-making. The new capabilities of Shopify Magic utilise a combination of proprietary Shopify data, including merchant business data, and large language models like OpenAI’s GPT-4. Shopify Magic now allows merchants to automate the creation of blog posts for various occasions, offering customisation options for tone and language translation. Furthermore, it can generate content for customer emails based on brief prompts, automating the process of crafting weekly newsletters and announcements.

Microsoft

Microsoft has recently filed a patent for an innovative AI-powered smart backpack featuring advanced technology such as a camera, microphone, speaker, network interface, processor, and storage. The backpack, reminiscent of science fiction, is designed to enhance daily activities. In its patent application, Microsoft outlines various functionalities, including the ability to autonomously assess safety conditions for activities like skiing.

Beyond the backpack, Microsoft has also made its mark on the fashion industry through collaborations like the one with Portuguese startup XNFY Lab, resulting in the launch of AI Generated Fashion powered by Azure Machine Learning. This initiative addresses sustainability concerns by allowing brands to generate high-definition clothing models based on market trends and customer demands, reducing the risk of unsold items.

In parallel, Microsoft’s Azure AI Video Indexer contributes by enabling clothing detection in videos, aiding content creators in advertising and post-event analysis.

Read more: Top 7 Smart Wearables Powered by Generative AI

The post Top 7 Noteworthy AI Innovations in Fashion in 2023 appeared first on Analytics India Magazine.

6 Ways Ethical AI Took Centre Stage in 2023

In 2023, big decisions were made about making sure AI plays fair and doesn’t mess with people’s privacy. Instead of just talking about it, governments and companies got serious and felt the pressure to make new rules. Ethical AI, which used to be this fancy idea, got a makeover – it now had some real power behind it.

The people in charge wanted to make sure AI followed some strict rules which included protecting people’s privacy.

In 2023, the folks in charge drew up plans and checklists to make sure AI behaves. They didn’t just talk the talk – they walked the walk.

Here are 6 actions taken in 2023 to make sure AI is built and used ethically:

UNESCO Forms Business Council for Ethics of AI

The Business Council for the Ethics of AI made its debut officially at the Ministerial and High Authorities Summit on the Ethics of Artificial Intelligence in Latin America and the Caribbean, hosted in Santiago de Chile. Microsoft and Telefónica, serving as co-chairs, played an important role in the council’s inauguration.

This collaboration is spearheaded by UNESCO and Latin American companies at the forefront of AI development or application across diverse sectors, and has taken shape as a platform for corporate convergence. The Council will provide a forum for companies to convene, share insights, and champion ethical standards within the AI industry, aligning with UNESCO’s guidelines.

AI Bill of Rights in the US

The White House introduced the Blueprint for an AI Bill of Rights, a set of rules for responsible AI use. This blueprint involves collaboration with academics, human rights groups, the public, and major companies like Microsoft and Google. It aims to make AI transparent, fair, and safe, focusing on potential civil rights issues in areas like employment, education, healthcare, financial services, and commercial surveillance. By emphasising practical impacts on daily life, the AI Bill of Rights seeks to ensure AI behaves responsibly and respects fundamental rights, ushering in a new era where even artificial intelligence is bound by clear rules and guidelines.

The AI Bill of Rights is a set of guidelines for the responsible design and use of AI, created by the White House Office of Science and Technology Policy amid an ongoing global push to establish more regulations to govern AI.

EU AI Act

In the spring of 2021, the European Commission pitched the inaugural EU regulatory game plan for AI. They’re sorting AI systems into risk categories based on their potential impact, with more or less rule-mania depending on the danger level. If greenlit, these guidelines will be the global frontrunners in AI regulation.

Fast forward to June 14, 2023 – MEPs gave a nod to Parliament’s negotiating position on the AI Act. Now, the intricate dance begins with EU nations in the Council to hash out the final law draft. The goal is to handshake on an agreement before the year bows out, setting the stage for a new chapter in AI governance.

PM Narendra Modi Proposes Ethical AI

In August, during one of the most important events in India called the B20 Summit, Prime Minister Narendra Modi talked about AI. He said it’s important to use AI in a good and fair way.

Modi also had this idea of having a special day each year called ‘International Consumer Care Day’. And, instead of dealing with carbon credits, he suggested using something called ‘green credit’.

In his speech, Modi said that India is a leader in using tech in Industry 4.0. He also mentioned that India is a big deal when it comes to keeping the global supply chain running smoothly and reliably.

Pope Warns of Irresponsible Use of AI

In March, Pope Francis took centre stage at the ‘Minerva Dialogues,’ an annual brainiac gathering hosted by the Vatican’s Dicastery for Education and Culture, to call for ethical use of AI. While he acknowledged the benefits of AI when used for the common good, he also warned against unethical or irresponsible use.

This call to ethical arms echoed just a week after AI mischief-makers had a field day, generating AI images of the Pope that left many fooled. A slick coat may be a style statement, but misinformation? That’s a different AI story.

WHO Calls for Safe and Ethical AI

The World Health Organization (WHO) also stepped up to the mic, urging everyone to hit the brakes on the power of LLMs. In its official statement, WHO made it clear that using AI can be risky, be it as a decision-support tool or to upgrade diagnostic capacity in under-resourced settings.

‘Handle with care’ was the main message of WHO. The organisation also threw down the rulebook, stressing the importance of playing nice with ethical principles and solid governance. Because in the world of AI and health, a little caution and a lot of ethics can go a long way.

The post 6 Ways Ethical AI Took Centre Stage in 2023 appeared first on Analytics India Magazine.