World’s First Commissioned Music Video Created by OpenAI Sora Looks Awfully Nice (Not!)

The world’s first commissioned music video created entirely through OpenAI’s Sora, ‘The Hardest Part,’ directed by Paul Trillo, is now out.

"The Hardest Part" – directed by @paultrillo – out now. https://t.co/NlbyCvf5iO
The first commissioned music video created entirely through @OpenAI's Sora. pic.twitter.com/Gz44TdU7Bc

— Washed Out (@realwashedout) May 2, 2024

The music video’s debut comes at a time when OpenAI is pitching Sora to Hollywood and other entertainment giants. The AI startup has been actively arranging meetings in Los Angeles with Hollywood studios, media executives, and talent agencies. Their goal is clear: to forge partnerships and encourage filmmakers to integrate Sora into their creative processes.

OpenAI has also taken steps to showcase Sora’s capabilities in a blog post, illustrating how artists, filmmakers, and creative designers can leverage Sora to produce surreal videos. The blog contained new Sora-generated videos from various visual artists and directors, showcasing short films alongside their impressions of the technology.

This move hints at a potential paradigm shift in how Hollywood, advertising, and other creative industries approach content creation.

Sora expected later this year

In a recent interview, OpenAI chief technology officer Mira Murati said that Sora would be available this year, possibly within “a few months”. However, she appeared hesitant about delving into the specifics of the data Sora was trained on and dodged the question.

“I won’t go into the details of the data used, but it was either publicly available or licensed data,” she said. Murati was trolled for saying that she wasn’t sure whether it used videos from YouTube, Facebook, and Instagram. However, she confirmed that Sora uses content from Shutterstock, with which OpenAI has a partnership.

Murati said that Sora could take a few minutes to generate videos, depending on the complexity of the prompt, and that it is currently ‘much, much more expensive’ to run. “We don’t know what it’s going to look like exactly when we make it available to the public, but we’re trying to make it available at a similar cost eventually to what we saw with DALL-E,” she added.

The post World’s First Commissioned Music Video Created by OpenAI Sora Looks Awfully Nice (Not!) appeared first on Analytics India Magazine.

AWS Collaborates With ShellKode to Train 1 Lakh Women Developers in GenAI

AWS has partnered with Bengaluru-based born-in-the-cloud company ShellKode to train one lakh women developers in GenAI.

The two companies are collaborating to help women developers upskill during the current shift towards AI in the workplace. As part of this, ShellKode has launched “EmpowerHer”, the programme through which the training will be delivered.

“We’re empowering a generation of aspiring developers, particularly women, with the cutting-edge tools and knowledge of AI to transform India’s innovation landscape and shape the future of enterprise,” said Arun Kumar, ShellKode CEO.

The programme will pair the developers with GenAI mentors in order to help them better understand how GenAI works. In addition, the company stated that the mentors will be able to provide “personalised and invaluable guidance, career advice, and a supportive network.”

This will be done through networking events, seminars and online interactions with the GenAI community, where they will be able to connect with industry professionals.

The need to democratise the AI landscape in India has been a continuing conversation. Recently, several companies have pledged to increase the number of women developers in the country, including Microsoft. The tech giant had pledged to upskill and certify over 75,000 women developers in AI by 2025.

Likewise, the Karnataka government had also collaborated with JobsForHer in March this year towards a similar goal. The Karnataka Digital Economic Mission (KDEM) launched HerShakti, a programme specifically for women in tech, which aims to upskill 500 women in the next six months.

Similarly, in an era where mid-career employees are rushing to upskill, demand for such training has been fairly evenly split across genders. Several edtech startups have reported increased interest in AI and AI-adjacent courses, especially among women in Tier 2 and Tier 3 cities.

With this in mind, the EmpowerHer programme comes at an opportune time.

The post AWS Collaborates With ShellKode to Train 1 Lakh Women Developers in GenAI appeared first on Analytics India Magazine.

Dropbox, Figma CEOs back Lamini, a startup building a generative AI platform for enterprises


Lamini, a Palo Alto-based startup building a platform to help enterprises deploy generative AI tech, has raised $25 million from investors including Stanford computer science professor Andrew Ng.

Lamini, co-founded several years ago by Sharon Zhou and Greg Diamos, has an interesting sales pitch.

Many generative AI platforms are far too general-purpose, Zhou and Diamos argue, and don’t have solutions and infrastructure geared to meet the needs of corporations. In contrast, Lamini was built from the ground up with enterprises in mind, and is focused on delivering high generative AI accuracy and scalability.

“The top priority of nearly every CEO, CIO and CTO is to take advantage of generative AI within their organization with maximal ROI,” Zhou, Lamini’s CEO, told TechCrunch. “But while it’s easy to get a working demo on a laptop for an individual developer, the path to production is strewn with failures left and right.”

To Zhou’s point, many companies have expressed frustration with the hurdles to meaningfully embracing generative AI across their business functions.

According to a March poll from MIT Insights, only 9% of organizations have widely adopted generative AI despite 75% having experimented with it. Top hurdles run the gamut from a lack of IT infrastructure and capabilities to poor governance structures, insufficient skills and high implementation costs. Security is a major factor, too — in a recent survey by Insight Enterprises, 38% of companies said security was impacting their ability to leverage generative AI tech.

So what’s Lamini’s answer?

Zhou says that “every piece” of Lamini’s tech stack has been optimized for enterprise-scale generative AI workloads, from the hardware to the software, including the engines used to support model orchestration, fine-tuning, running and training. “Optimized” is a vague word, granted, but Lamini is pioneering one step that Zhou calls “memory tuning,” which is a technique to train a model on data such that it recalls parts of that data exactly.

Memory tuning can potentially reduce hallucinations, Zhou claims, or instances when a model makes up facts in response to a request.

“Memory tuning is a training paradigm — as efficient as fine-tuning, but goes beyond it — to train a model on proprietary data that includes key facts, numbers and figures so that the model has high precision,” Nina Wei, an AI designer at Lamini, told me via email, “and can memorize and recall the exact match of any key information instead of generalizing or hallucinating.”

I’m not sure I buy that. “Memory tuning” appears to be more a marketing term than an academic one; there aren’t any research papers about it — none that I managed to turn up, at least. I’ll leave Lamini to show evidence that its “memory tuning” is better than the other hallucination-reducing techniques that are being/have been attempted.

Fortunately for Lamini, memory tuning isn’t its only differentiator.

Zhou says the platform can operate in highly secured environments, including air-gapped ones. Lamini lets companies run, fine-tune, and train models on a range of configurations, from on-premises data centers to public and private clouds. And it scales workloads “elastically,” reaching over 1,000 GPUs if the application or use case demands it, Zhou says.

“Incentives are currently misaligned in the market with closed source models,” Zhou said. “We aim to put control back into the hands of more people, not just a few, starting with enterprises who care most about control and have the most to lose from their proprietary data owned by someone else.”

Lamini’s co-founders are, for what it’s worth, quite accomplished in the AI space. They’ve also separately brushed shoulders with Ng, which no doubt explains his investment.

Zhou was previously faculty at Stanford, where she headed a group that was researching generative AI. Prior to receiving her doctorate in computer science under Ng, she was a machine learning product manager at Google Cloud.

Diamos, for his part, co-founded MLCommons, the engineering consortium dedicated to creating standard benchmarks for AI models and hardware, as well as the MLCommons benchmarking suite, MLPerf. He also led AI research at Baidu, where he worked with Ng while the latter was chief scientist there. Diamos was also a software architect on Nvidia’s CUDA team.

The co-founders’ industry connections appear to have given Lamini a leg up on the fundraising front. In addition to Ng, Figma CEO Dylan Field, Dropbox CEO Drew Houston, OpenAI co-founder Andrej Karpathy, and — strangely enough — Bernard Arnault, the CEO of luxury goods giant LVMH, have all invested in Lamini.

AMD Ventures is also an investor (a bit ironic considering Diamos’ Nvidia roots), as are First Round Capital and Amplify Partners. AMD got involved early, supplying Lamini with data center hardware, and today, Lamini runs many of its models on AMD Instinct GPUs, bucking the industry trend.

Lamini makes the lofty claim that its model training and running performance is on par with equivalent Nvidia GPUs, depending on the workload. Since we’re not equipped to test that claim, we’ll leave it to third parties.

To date, Lamini has raised $25 million across seed and Series A rounds (Amplify led the Series A). Zhou says the money is being put toward tripling the company’s 10-person team, expanding its compute infrastructure, and kicking off development into “deeper technical optimizations.”

There are a number of enterprise-oriented, generative AI vendors that could compete with aspects of Lamini’s platform, including tech giants like Google, AWS and Microsoft (via its OpenAI partnership). Google, AWS and OpenAI, in particular, have been aggressively courting the enterprise in recent months, introducing features like streamlined fine-tuning, private fine-tuning on private data, and more.

I asked Zhou about Lamini’s customers, revenue and overall go-to-market momentum. She wasn’t willing to reveal much at this somewhat early juncture, but said that AMD (via the AMD Ventures tie-in), AngelList and NordicTrack are among Lamini’s early (paying) users, along with several undisclosed government agencies.

“We’re growing quickly,” she added. “The number one challenge is serving customers. We’ve only handled inbound demand because we’ve been inundated. Given the interest in generative AI, we’re not representative in the overall tech slowdown — unlike our peers in the hyped AI world, we have gross margins and burn that look more like a regular tech company.”

Amplify general partner Mike Dauber said, “We believe there’s a massive opportunity for generative AI in enterprises. While there are a number of AI infrastructure companies, Lamini is the first one I’ve seen that is taking the problems of the enterprise seriously and creating a solution that helps enterprises unlock the tremendous value of their private data while satisfying even the most stringent compliance and security requirements.”

Optimizing Memory for Large Language Model Inference and Fine-Tuning


Large language models (LLMs) like GPT-4, BLOOM, and LLaMA have achieved remarkable capabilities by scaling up to billions of parameters. However, deploying these massive models for inference or fine-tuning is challenging due to their immense memory requirements. In this technical blog, we will explore techniques for estimating and optimizing memory consumption during LLM inference and fine-tuning across various hardware setups.

Understanding Memory Requirements

The memory required to load an LLM is primarily determined by the number of parameters and the numerical precision used to store the parameters. A simple rule of thumb is:

  • Loading a model with X billion parameters requires roughly 4X GB of VRAM in 32-bit float precision
  • Loading a model with X billion parameters requires roughly 2X GB of VRAM in 16-bit bfloat16/float16 precision

For example, loading the 175B parameter GPT-3 model would require approximately 350GB of VRAM in bfloat16 precision. As of today, the largest commercially available GPUs like the NVIDIA A100 and H100 offer only 80GB of VRAM, necessitating tensor parallelism and model parallelism techniques.
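
As a quick sanity check, the rule of thumb above can be written down in a few lines of Python. This is a minimal sketch (the function name and the decimal-GB convention are ours, not from any library):

def estimate_load_memory_gb(num_params_billions, bytes_per_param=2):
    # bytes_per_param: 4 for float32, 2 for bfloat16/float16, 1 for int8, 0.5 for 4-bit
    return num_params_billions * 1e9 * bytes_per_param / 1e9

# GPT-3 with 175B parameters in bfloat16: ~350 GB just to hold the weights
print(estimate_load_memory_gb(175))                      # 350.0
# The same model quantized to 8-bit: ~175 GB
print(estimate_load_memory_gb(175, bytes_per_param=1))   # 175.0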

During inference, the memory footprint is dominated by the model parameters and the temporary activation tensors produced. A high-level estimate for the peak memory usage during inference is the sum of the memory required to load the model parameters and the memory for activations.

Quantifying Inference Memory

Let's quantify the memory requirements for inference using the OctoCode model, which has around 15 billion parameters in bfloat16 format (~ 31GB). We'll use the Transformers library to load the model and generate text:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

# Load OctoCode in bfloat16 and build a text-generation pipeline
model = AutoModelForCausalLM.from_pretrained("bigcode/octocoder", torch_dtype=torch.bfloat16, device_map="auto", pad_token_id=0)
tokenizer = AutoTokenizer.from_pretrained("bigcode/octocoder")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Question: Please write a Python function to convert bytes to gigabytes.\n\nAnswer:"
result = pipe(prompt, max_new_tokens=60)[0]["generated_text"][len(prompt):]

def bytes_to_gigabytes(bytes):
    return bytes / 1024 / 1024 / 1024

# Peak GPU memory allocated so far, in GB
bytes_to_gigabytes(torch.cuda.max_memory_allocated())

Output:

29.0260648727417

The peak GPU memory usage is around 29GB, which aligns with our estimate of 31GB for loading the model parameters in bfloat16 format.

Optimizing Inference Memory with Quantization

While bfloat16 is the common precision used for training LLMs, researchers have found that quantizing the model weights to lower precision data types like 8-bit integers (int8) or 4-bit integers can significantly reduce memory usage with minimal accuracy loss for inference tasks like text generation.

Let's see the memory savings from 8-bit and 4-bit quantization of the OctoCode model:

# 8-bit quantization
# (run in a fresh session, or free the previous model first, so the peak-memory
# measurement reflects the quantized model)
model = AutoModelForCausalLM.from_pretrained("bigcode/octocoder", load_in_8bit=True, pad_token_id=0)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe(prompt, max_new_tokens=60)[0]["generated_text"][len(prompt):]
bytes_to_gigabytes(torch.cuda.max_memory_allocated())

Output:

15.219234466552734

# 4-bit quantization
model = AutoModelForCausalLM.from_pretrained("bigcode/octocoder", load_in_4bit=True, low_cpu_mem_usage=True, pad_token_id=0)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = pipe(prompt, max_new_tokens=60)[0]["generated_text"][len(prompt):]
bytes_to_gigabytes(torch.cuda.max_memory_allocated())

Output:

9.543574333190918

With 8-bit quantization, the measured memory requirement drops from about 29GB to 15GB, while 4-bit quantization further reduces it to just 9.5GB! This allows running the 15B parameter OctoCode model on consumer GPUs like the RTX 3090 (24GB VRAM).

However, note that more aggressive quantization like 4-bit can sometimes lead to accuracy degradation compared to 8-bit or bfloat16 precision. There's a trade-off between memory savings and accuracy that users should evaluate for their use case.

Quantization is a powerful technique that can enable LLM deployment on resource-constrained environments like cloud instances, edge devices, or even mobile phones by drastically reducing the memory footprint.

Estimating Memory for Fine-Tuning

While quantization is primarily used for efficient inference, techniques like tensor parallelism and model parallelism are crucial for managing memory requirements during the training or fine-tuning of large language models.

The peak memory consumption during fine-tuning is typically 3-4 times higher than inference due to additional memory requirements for:

  • Gradients
  • Optimizer states
  • Activations from the forward pass stored for backpropagation

A conservative estimate is that fine-tuning an LLM with X billion parameters requires around 4 * (2X) = 8X GB of VRAM in bfloat16 precision.

For example, fine-tuning the 7B parameter LLaMA model would require approximately 7 * 8 = 56GB of VRAM per GPU in bfloat16 precision. This exceeds the capacity of most single GPUs (only the largest 80GB-class accelerators can hold it), often necessitating distributed fine-tuning techniques.
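
The same back-of-the-envelope arithmetic can be written down explicitly. The sketch below is a rough estimate only; the 4x overhead factor is the conservative rule of thumb from this section, not an exact accounting of gradients, optimizer states, and activations, and the helper name is ours:

def estimate_finetune_memory_gb(num_params_billions, bytes_per_param=2, overhead_factor=4):
    # Weights alone: num_params * bytes_per_param
    weight_gb = num_params_billions * bytes_per_param
    # Gradients, optimizer states, and stored activations add roughly 3-4x on top
    return overhead_factor * weight_gb

# 7B parameter model in bfloat16: roughly 4 * 14 GB = 56 GB
print(estimate_finetune_memory_gb(7))   # 56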

Distributed Fine-Tuning Techniques

Several distributed fine-tuning methods have been proposed to overcome GPU memory constraints for large models:

  1. Data Parallelism: The classic data parallelism approach replicates the entire model across multiple GPUs while splitting and distributing the training data batches. This reduces training time linearly with the number of GPUs but does not reduce the peak memory requirement on each GPU.
  2. ZeRO Stage 3: An advanced form of data parallelism that partitions the model parameters, gradients, and optimizer states across GPUs. It reduces memory compared to classic data parallelism by keeping only the required partitioned data on each GPU during different phases of training (a minimal configuration sketch follows this list).
  3. Tensor Parallelism: Instead of replicating the model, tensor parallelism divides the model parameters into rows or columns and distributes them across GPUs. Each GPU operates on a partitioned set of parameters, gradients, and optimizer states, leading to substantial memory savings.
  4. Pipeline Parallelism: This technique partitions the model layers across different GPUs/workers, with each device executing a subset of the layers. Activations are passed between workers, reducing peak memory but increasing communication overhead.
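
To make the second option above more concrete, here is a minimal sketch of a ZeRO Stage 3 setup using DeepSpeed through the Hugging Face Trainer. The configuration values are illustrative assumptions, not a tuned recipe; the fields you actually need depend on your model, hardware, and DeepSpeed version.

import json
from transformers import TrainingArguments

# Minimal ZeRO Stage 3 config: partition parameters, gradients, and optimizer
# states across GPUs, optionally offloading them to CPU RAM to stretch VRAM further.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
        "overlap_comm": True,
    },
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f)

# The Trainer picks up the DeepSpeed engine from this argument (illustrative values)
training_args = TrainingArguments(
    output_dir="finetune-output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,
    deepspeed="ds_config.json",
)

Launched with a distributed runner (for example, the deepspeed or accelerate launch CLI), each GPU then holds only its shard of the parameters, gradients, and optimizer states rather than a full replica.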

Estimating memory usage for these distributed methods is non-trivial as the distribution of parameters, gradients, activations, and optimizer states varies across techniques. Moreover, different components like the transformer body and language modeling head may exhibit different memory allocation behaviors.

The LLMem Solution

Researchers recently proposed LLMem, a solution that accurately estimates GPU memory consumption when applying distributed fine-tuning methods to LLMs across multiple GPUs.

Figure: Estimating GPU memory usage for fine-tuning a pre-trained LLM (overview of the LLMem approach).

LLMem considers factors like recombining parameters before computation (ZeRO Stage 3), output gathering in the backward pass (tensor parallelism), and the different memory allocation strategies for the transformer body and language modeling head.

Experimental results show that LLMem can estimate peak GPU memory usage for fine-tuning LLMs on a single GPU with error rates of up to 1.6%, outperforming the state-of-the-art DNNMem's average error rate of 42.6%. When applying distributed fine-tuning methods to LLMs with over a billion parameters on multiple GPUs, LLMem achieves an impressive average error rate of 3.0%.

By accurately estimating memory requirements upfront, LLMem can help users select the most efficient distributed fine-tuning method that avoids out-of-memory issues while minimizing training time.

Emerging Techniques

While quantization, tensor parallelism, and model parallelism are established techniques, researchers continue to explore novel methods to push the boundaries of efficient LLM training and deployment.

  1. LoRA and QLoRA: These techniques train a small low-rank adapter module to update the pre-trained LLM with new knowledge instead of directly fine-tuning the massive number of parameters. This can lead to substantial memory savings while retaining most of the model’s performance (see the sketch after this list).
  2. FlashAttention: The self-attention mechanism is a memory and compute bottleneck in transformer models. FlashAttention is an IO-aware, exact attention implementation that avoids materializing the full attention matrix, reducing the memory requirement from quadratic to linear in the input sequence length.
  3. Mixture-of-Experts: This approach conditionally routes each input data sample to a specialized expert model instead of processing it through the entire model. This dynamic sparsity can save memory by only activating a subset of experts for each sample.
  4. Model Surgery: Researchers have explored surgical model compression that iteratively removes less important components, such as attention heads, trading a small loss in accuracy for savings in memory and speed.
  5. Offloading: Finally, techniques that offload parameters, optimizer states, or activations to CPU RAM or disk can supplement limited GPU memory for large models.
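
As an illustration of the first technique in this list, the sketch below uses the Hugging Face PEFT library to wrap a causal language model with LoRA adapters. The model ID, adapter rank, and target module names are illustrative assumptions; the correct target modules depend on the architecture.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in half precision; its original weights stay frozen.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # example 7B model; any causal LM works
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach small low-rank adapter matrices to the attention projections.
# Only these adapters (typically well under 1% of total parameters) are trained,
# which sharply reduces gradient and optimizer-state memory during fine-tuning.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # valid for LLaMA-style models
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()   # reports trainable vs. total parameter counts

QLoRA goes one step further by loading the frozen base model in 4-bit precision (for example with load_in_4bit=True) and training the same small adapters on top of it.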

These cutting-edge methods illustrate the vibrant research ecosystem focused on democratizing efficient LLM training and deployment across diverse hardware environments.

Conclusion

The memory requirements of large language models pose significant challenges for their widespread adoption in real-world applications. By understanding memory estimation techniques and leveraging quantization, distributed training strategies, and emerging innovations, we can optimize LLM deployments on resource-constrained devices.

Tools like LLMem pave the way toward accurate memory estimation, enabling users to select the most suitable fine-tuning configuration. As hardware evolves and research advances, we can anticipate more efficient LLM training and inference, driving progress in natural language processing and artificial intelligence.

Striking the right balance between model capacity, accuracy, and resource utilization will be crucial for unlocking the full potential of large language models across diverse domains and use cases. By embracing memory optimization techniques, we move closer to a future where state-of-the-art language AI is accessible, scalable, and sustainable.

Atlassian Unveils Rovo, a new AI-Powered Knowledge Discovery Tool for Enterprise

Atlassian Corporation has announced the launch of Atlassian Rovo at its annual flagship conference – Team ’24.

Rovo can provide answers in seconds and the search results are personalized and contextual. Permissions are fully respected, so employees see only the information they’re supposed to see, while restricted data remains private.

Its API will also let organisations connect niche and home-grown apps, so Rovo can deliver relevant answers for any team or industry.

Moreover, with Rovo Chat, teams can engage in interactive conversations to ask questions until they get the answers they need, generate new ideas, get helpful feedback, and resolve issues while they work.

Rovo, powered by Atlassian Intelligence, takes human-AI collaboration to the next level, enabling teams to:

  • Find: Search helps teams find the exact information they need across huge volumes of data. Rovo Search can pull information from popular tools, including Google Drive, Microsoft Sharepoint, Microsoft Teams, GitHub, Slack, and Figma, to deliver even more comprehensive answers.
  • Learn: Gain a deeper understanding of their company’s data through AI-driven insights, knowledge cards, and AI chat for deeper data exploration.
  • Act: Add specialized agents to workflows to handle time-consuming tasks and to complete projects.

Rovo’s secret sauce is a common data model called the ‘teamwork graph’, which connects data from Atlassian tools and other SaaS apps to unlock a comprehensive view of any organization’s goals, knowledge, teams, and work.

With every new tool connection, team action, and project event, the teamwork graph draws more connections and expands its knowledge to deliver increasingly relevant results.

“AI presents a huge opportunity for Atlassian. We have over 20 years of insights into knowledge work – how teams go about planning and tracking work, goal setting and unleashing knowledge. A year ago, we launched Atlassian Intelligence to help teams boost productivity with AI. Since then, we’ve woven AI into the fabric of our products across the Atlassian portfolio,” Mike Cannon-Brookes, co-founder and co-CEO, Atlassian said.

The post Atlassian Unveils Rovo, a new AI-Powered Knowledge Discovery Tool for Enterprise appeared first on Analytics India Magazine.

Bloomberg Partners With AppliedXL, To Use AI in Generating Stories for Terminal Users

Bloomberg has partnered with New York-based computational journalism startup AppliedXL to provide insights and predictive outcomes to their Bloomberg Terminal users.

As part of the collaboration, AppliedXL will parse publicly available data to structure news stories that predict early trends and provide relevant market analysis. The startup, as of now, will focus solely on providing insights from the pharmaceutical industry.

“Our collaboration with innovative companies like AppliedXL helps decision makers quickly identify unique pharmaceutical news content by making it easily accessible on the Bloomberg Terminal,” said Chris Collins, chief product and technology officer at Bloomberg News.

Under the Hood

The startup, which makes use of AI in analysing data and detecting trends, will help Terminal users get ahead of potential catalytic events within the industry. The AI will also alert users to any anomalies or irregularities that could potentially affect the market as a whole.

One such example is AppliedXL’s detection of anomalies during a set of clinical trials by biopharmaceutical company Summit Therapeutics using publicly available data on the NIH public trial registry. The company later suffered a significant drop in shares due to the initial failure of its then-sole drug candidate, which was predicted months earlier by AppliedXL’s AI.

The startup will provide similar insights to users who have access to Bloomberg Terminal. In addition, the AI has been trained in regular editorial and journalistic practices, which will be exercised during the creation of their news stories.

The company is expected to access over 7,000 updates daily on ongoing clinical trials and generate upwards of 60 stories on the most relevant developments of the day. This will include both domestic and international interventional clinical trials.

“There’s no artificial intelligence without human wisdom. We use machines to understand the patterns in data, but we need humans to understand the contexts that influence them,” said AppliedXL CEO Francesco Marconi.

According to the company, to do so, their AI has been developed in collaboration with journalists during the training process “to review data, help develop interpretations, and validate the quality of the output.”

This is one instance of a growing number of media companies relying on AI. Last year, Bloomberg released a paper on its own LLM, BloombergGPT, which is focused on the financial industry.

On the other hand, media companies have also begun collaborating with big tech companies like OpenAI to use their reporting. Most recently, OpenAI partnered with the Financial Times in a licensing agreement, with the latter’s journalistic work now cited when using ChatGPT.

The post Bloomberg Partners With AppliedXL, To Use AI in Generating Stories for Terminal Users appeared first on Analytics India Magazine.

Containerize Python Apps with Docker in 5 Easy Steps


When building applications with Python, you’ll often run into dependency conflicts, version mismatches, and the like. With Docker, you can package applications—along with the required dependencies, runtime, and config—into a single portable artifact called an image, which you can then use to spin up a Docker container that runs the app.

So whether it is a simple Python application or a data science application, Docker makes managing dependencies simpler. This is especially helpful in data science projects where you need different libraries and specific versions of these libraries for your application to work without errors. With Docker you can have isolated, consistent, and reproducible environments for all your applications.

As a first step in this direction, let's learn how to containerize a Python application.

Step 1: Get Started

First, install Docker on the platform you use. You can run Docker on Windows, Linux, and macOS. Here are a couple of things you may want to do after you've installed Docker on your machine.

The Docker daemon binds to a Unix socket that is owned by the root user by default, so you can access it only using sudo. To avoid prefixing all your docker commands with sudo, create a docker group and add your user to it like so:

$ sudo groupadd docker
$ sudo usermod -aG docker $USER

For newer versions of Docker, BuildKit is the default builder. If you're using an older version of Docker, however, you may get deprecation warnings when you run the docker build command, because the legacy build client will be deprecated in future releases. As a workaround, you can install buildx, a CLI tool that exposes BuildKit's capabilities, and use the docker buildx build command to build with BuildKit.
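
For example, assuming the image name todo-app that we use later in this tutorial, the equivalent BuildKit-backed build would be:

$ docker buildx build -t todo-app .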

Step 2: Code Your Python Application

Next, code a Python application which we can containerize using Docker. Here we’ll containerize a simple command-line TO-DO list app. The code for this app is on GitHub: todo.py file.

You can containerize any Python app of your choice or follow along with the example we use here. If you’re interested in a step-by-step tutorial on building the command-line TO-DO application, read Build a Command-Line App with Python in 7 Easy Steps.

Step 3: Create the Dockerfile

Next, we’ll create a Dockerfile. Think of it as a recipe that defines how to build the Docker image for the application. Create a file named Dockerfile in your working directory with the following:

# Use Python 3.11 as the base image
FROM python:3.11-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Start an interactive shell when the container runs
CMD ["/bin/bash"]

Here, we use Python 3.11 as the base image. We then set the working directory for all the following instructions with the WORKDIR command. We then use the COPY command to copy files from the project into the container’s file system.

Because we’re containerizing a command-line app, we specify the command to execute as “/bin/bash”, which starts an interactive bash shell when we run the image and start a container.

Step 4: Build the Docker Image

We have our todo.py file and Dockerfile ready. Next, we can build the Docker image with the following command:

docker build -t todo-app .

With the -t option in the build command, you can specify both a name and a tag like so: docker build -t name:tag .

This command builds a Docker image named todo-app based on the instructions in the Dockerfile. The . at the end specifies that the build context is the current directory.

The build takes a couple of minutes:

Sending build context to Docker daemon  4.096kB
Step 1/4 : FROM python:3.11-slim
3.11-slim: Pulling from library/python
13808c22b207: Pull complete
6c9a484475c1: Pull complete
b45f078996b5: Pull complete
16dd65a710d2: Pull complete
fc35a8622e8e: Pull complete
Digest: sha256:dad770592ab3582ab2dabcf0e18a863df9d86bd9d23efcfa614110ce49ac20e4
Status: Downloaded newer image for python:3.11-slim
 ---> c516402fec78
Step 2/4 : WORKDIR /app
 ---> Running in 27d02ba3a48d
Removing intermediate container 27d02ba3a48d
 ---> 7747abda0fc0
Step 3/4 : COPY . /app
 ---> fd5cb75a0529
Step 4/4 : CMD ["/bin/bash"]
 ---> Running in ef704c22cd3f
Removing intermediate container ef704c22cd3f
 ---> b41986b633e6
Successfully built b41986b633e6
Successfully tagged todo-app:latest

Step 5: Run Your Docker Container

Once the image is built, you can start a Docker container from the built image with the following command:

docker run -it todo-app

The -it option is a combination of -i and -t:

  • The -i option is used to run containers interactively and keeps STDIN open even if not attached.
  • The -t option allocates a pseudo-TTY. So it provides a terminal interface within the container that you can interact with.

Now, our TO-DO app runs inside the Docker container, and we can interact with it at the command line:

root@9d85c09f01ec:/app# python3 todo.py
usage: todo.py [-h] [-a] [-l] [-r]

Command-line Todo List App

options:
  -h, --help      show this help message and exit
  -a , --add      Add a new task
  -l, --list      List all tasks
  -r , --remove   Remove a task by index
root@9d85c09f01ec:/app# python3 todo.py -a 'walk 2 miles'
root@9d85c09f01ec:/app# python3 todo.py -l
1. walk 2 miles

Wrapping Up

And there you have it! In this tutorial, you've successfully containerized a simple command-line Python application using Docker.

Because we built this application in Python without any external libraries, we did not need to define a requirements.txt file. A requirements.txt file usually lists the libraries your project depends on, along with their versions, which you can install using a simple pip install command. If you want a tutorial that focuses on Docker for data science, check out Docker Tutorial for Data Scientists.
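
If your own application does depend on external libraries, a typical pattern (sketched here under the assumption of a requirements.txt in the project root) is to copy and install the requirements before copying the rest of the source, so the dependency layer is cached across rebuilds:

# Install dependencies first so this layer is reused unless requirements.txt changes
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Then copy the application code
COPY . /app

With this ordering, a code-only change does not force Docker to reinstall the libraries on every rebuild.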

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.


Illuminating AI: The Transformative Potential of Neuromorphic Optical Neural Networks

Artificial intelligence (AI) has become a fundamental component of modern society, reshaping everything from daily tasks to complex sectors such as healthcare and global communications. As AI technology progresses, the intricacy of neural networks increases, creating a substantial need for more computational power and energy. This escalation not only heightens carbon emissions and generates more electronic waste but also adds to economic pressures through increased operational costs. In response, researchers are delving into a novel integration of two progressive fields: optical neural networks (ONNs) and neuromorphic computing. Known as Neuromorphic Optical Neural Networks, this innovative combination harnesses the swift data processing of light with the sophisticated, brain-like architecture of neuromorphic systems. This article delves into this integration, which could greatly improve AI's speed, efficiency, and scalability, potentially ushering in a new era of AI technology that seamlessly blends light and intelligence.

The Inherent Challenges of Traditional Electronic Computing for AI

The foundation of contemporary AI is built on electronic computing, which utilizes electrons to process and transmit information. While electronic computing has been pivotal in advancing AI capabilities, it faces several inherent limitations that could hinder future progress. One of the major issues is the substantial energy requirement and heat generation, which necessitates complex cooling solutions and leads to elevated operational costs. As neural networks become more intricate, the demand for energy escalates, exacerbating these challenges.

Moreover, scalability in electronic computing is a growing concern. Expanding AI systems to accommodate larger datasets or more sophisticated algorithms requires a significant increase in computational resources, which may not always be feasible due to cost and environmental impact considerations. Additionally, the longevity and reliability of electronic components are compromised under the strain of continuous operation, leading to frequent replacements, and further increasing maintenance expenses.

Optical Neural Networks: Harnessing the Speed of Light

In response to these challenges, there is a shift towards developing Optical Neural Networks (ONNs), which use light (photons) instead of electricity (electrons) to process data. This paradigm shift capitalizes on the inherent properties of light, such as its phase, polarization, and amplitude, to perform computations. The use of light potentially allows for faster data processing speeds and reduced power consumption.

Optical neural networks offer several compelling advantages over traditional electronic-based AI systems. One of the most striking benefits is speed; ONNs can process data at the speed of light, facilitating near-instantaneous computations crucial for real-time applications such as autonomous driving. They are also significantly more energy-efficient, operating at cooler temperatures and consuming less power, which not only reduces operational costs but also bolsters the sustainability of computing infrastructures.

Another major advantage is scalability and the capacity for parallel processing. ONNs can handle larger data volumes and execute numerous operations simultaneously through techniques like wavelength division multiplexing, which processes multiple data streams concurrently without a proportional increase in energy or space. These capabilities make ONNs exceptionally well-suited for scaling AI applications efficiently.

Von Neumann Bottleneck

Traditional electronic neural networks are built on the Von Neumann architecture, which distinctly separates processing and memory functions. This separation requires ongoing data exchanges that can hamper system efficiency. As neural networks grow in complexity and handle larger datasets, this architecture faces significant difficulties. The primary issue is the shared communication bus between the processing and memory units, which can significantly slow down AI computations and affect the speed of model training. Although GPUs can alleviate some of these challenges by enabling parallel processing, they also introduce inefficiencies related to data transfer. Moreover, frequent data exchanges, exacerbated by a complex memory hierarchy, negatively impact system performance. Large datasets exacerbate these issues, leading to extended memory access times. When combined with restricted memory bandwidth, these factors form critical performance bottlenecks. Consequently, these limitations place considerable stress on Von Neumann systems, resulting in increased energy use and higher carbon emissions.

The Rise of Neuromorphic Computing

To address the limitations of the Von Neumann architecture, researchers are advancing neuromorphic computing (NC). This innovative architecture draws inspiration from the human brain's neural networks to facilitate parallel and distributed processing. By emulating the brain's efficient processing capabilities and integrating memory and processing in a single location, NC effectively overcomes traditional computing bottlenecks. This approach not only speeds up computations but also reduces power consumption, enhancing the handling of complex tasks.

Neuromorphic ONNs: Bridging Light and Intelligence

In the quest to overcome the limitations inherent in traditional electronic computing for AI, researchers are pioneering the development of neuromorphic optical neural networks. This innovative field merges the rapid data transmission capabilities of optical neural networks (ONNs) with the advanced architectural and learning efficiencies of neuromorphic computing (NC). The synergy between these technologies not only enhances the speed and efficiency of data processing but also scales the biological intricacies of neuromorphic systems with the light-speed potential of optical computing.

Key Benefits of Neuromorphic ONNs

Some of the primary advantages of neuromorphic optical neural networks include:

  1. Enhanced Processing Speed and Efficiency: By utilizing light for both computation and data transmission within a neuromorphic framework, these networks achieve unparalleled processing speeds and heightened energy efficiency. This makes them exceptionally suitable for applications requiring rapid response times and substantial data handling.
  2. Scalability: The ability to multiplex and demultiplex optical signals enables these networks to scale efficiently. This feature allows for handling increased data volumes without significant losses in speed or system efficiency, addressing one of the critical challenges faced by traditional computing systems.
  3. Analog Computing Capabilities: Operating in an analog mode, neuromorphic optical neural networks closely mimic the natural processes of biological neural networks. This capability is particularly beneficial for complex tasks such as pattern recognition and sensory data interpretation, which require nuanced and adaptive processing beyond the binary constraints of traditional digital systems.

Impact of Neuromorphic ONNs Beyond AI Challenges

The potential of neuromorphic optical neural networks to transform industries that demand rapid data processing, low latency, and high energy efficiency is immense. Areas such as autonomous vehicles, which require the real-time processing of extensive sensor data; smart sensors and IoT applications, where efficient, on-device processing is critical in smart environments; and healthcare, particularly for quick diagnosis and data analysis in medical imaging, stand to benefit significantly from these advancements.

Challenges in the Path of Neuromorphic ONNs

Despite the potential, the development of Neuromorphic ONNs is not without challenges. The precision required in fabricating optical components is immense, with minor imperfections having the potential to drastically affect performance. Additionally, integrating these components with existing electronic systems to create a seamless interface poses significant technical challenges. Another concern is the adaptability and programmability of these systems once they are fabricated, as adjusting optical components can be complex and cumbersome.

The Road Ahead

As we advance, the integration of optical and neuromorphic technologies in AI systems holds the promise of redefining what is possible in technology and beyond. While there are hurdles to overcome, particularly in the areas of manufacturing precision and system integration, the potential benefits of Neuromorphic ONNs—such as increased processing speeds, reduced energy consumption, and greater scalability—offer compelling reasons to pursue this innovative approach. With ongoing research and development, these systems may soon lead to more sustainable, efficient, and powerful AI applications that could transform numerous aspects of society.

Denodo Partners with SITL to Provide Enterprises with Data Fabric and Data Mesh Capabilities

Denodo and Sonata Information Technology India Limited (SITL) have announced that they have entered into a partnership to provide Indian enterprises with advanced logical data fabric and data mesh capabilities.

As part of the engagement, customers will have seamless access to cutting-edge solutions that will enable them to unlock greater value from their distributed data sets, streamline operations, and enhance decision-making.

Today’s enterprises struggle with establishing data infrastructures that can seamlessly handle the volume, velocity, and variety of data from different sources. This challenge is compounded by the rapid evolution of technologies like generative AI, which many organisations struggle to incorporate effectively.

The key to overcoming these challenges, according to Denodo, lies in eliminating data silos and establishing a centralized logical data management layer enabled by data virtualisation.

Such an enterprise data layer enables easy access to real-time data, regardless of location or format, while facilitating timely insights crucial for agile decision-making.

“Sonata is well placed to help enterprises in creating and realising modernisation-driven hyper-growth. Working together, we will enable our mutual customers across all industries to improve decision making, operational excellence, and regulatory compliance, with faster time-to-data,” Ravi Shankar, senior vice president and chief marketing officer at Denodo, said.

The post Denodo Partners with SITL to Provide Enterprises with Data Fabric and Data Mesh Capabilities appeared first on Analytics India Magazine.

Google’s New Feature Lets Users Chat with Gemini Directly in Chrome’s Search Bar

Google has integrated a new feature in Chrome that lets users access Gemini AI directly from the search bar. Users just have to type “@” in the desktop address bar and select “Chat with Gemini”; they can then type a prompt, and Gemini will generate a response instantly.

This is the latest feature by Google to integrate AI into Chrome. Earlier, the search giant had announced three AI features for Chrome including Write Better, theme generation using AI, and smart tab organization.

By integrating seamlessly into existing workflows, the feature lets users quickly prompt Gemini without having to open a separate app or portal. This hints at the next wave of AI adoption, in which tech giants integrate major AI features into existing products and workflows, thereby reducing friction.

Google just integrated a new feature in Chrome that lets users type '@' to pull up Gemini.
I'm a sucker for AI features that integrate seamlessly into existing workflows.
1. Practically no learning curve for a new AI tool
2. Instantly improves the product and takes advantage of… https://t.co/A0k1bQKPbt

— Rowan Cheung (@rowancheung) May 1, 2024

Integrating Gemini into Chrome’s Omnibox is a prime example of Google tying its AI chatbot into one of its biggest existing products, one with massive distribution.

The announcement received a positive response from users online, with many excited to try out the new feature.

The Gemini mobile app is going global! 🚀 People in 100+ countries can now use Gemini on their phones to supercharge their creativity and productivity in more languages. We’ve also expanded our Extensions feature and enabled people to quickly start to #ChatWithGemini in the…

— Google (@Google) May 1, 2024

Recently, Google also expanded its Gemini app globally, making it available in more than 150 countries on both Android devices and iPhones, and adding support for more languages and Extensions.

The post Google’s New Feature Lets Users Chat with Gemini Directly in Chrome’s Search Bar appeared first on Analytics India Magazine.