Data storytelling – the art of telling stories through data

Sponsored Post

Great data presentations tell a story. Learn how to organize, visualize, and present data using Python, generative AI, and the cutting-edge Altair data visualization library.
After you’ve crunched, sliced, and organized your data behind the scenes, you need to present it in a way that tells a story. With Python’s Altair library and generative AI tools like Copilot and ChatGPT, it’s never been easier to create intuitive data presentations.

Data Storytelling with Python Altair and Generative AI teaches you how to turn raw data into effective, insightful data stories. You’ll learn exactly what goes into an effective data story, then combine your Python data skills with the Altair library and AI tools to rapidly create amazing visualizations. Your bosses and decision-makers will love your new presentations—and you’ll love how much generative AI speeds up the whole process!

Throughout the book, you’ll be guided by the principles of the DIKW (Data, Information, Knowledge, Wisdom) pyramid: a methodology that is the backbone of leading data management systems. Each chapter takes you hands-on with a new interesting data story, from tourist arrivals in Portugal to population growth in the USA. At every stage, you’ll see how tools like Copilot, ChatGPT, and DALL-E rapidly increase your productivity and streamline the visualization process.

Data Storytelling with Python Altair and Generative AI by Angelica Lo Duca is available from its publisher Manning Publications.

Our 35% discount code (good for all our products in all formats): nlkdnuggets21

One free eBook code for Data Storytelling with Python Altair and Generative AI: dsppag-8F80

Can Generative AI Be Used to Apply for a Job?

Image: A robot representing AI writing a resume on a computer. (Vectors/Adobe Stock)

One way to write a resume more efficiently might be to use generative AI to suggest a format and phrasing. Explore how ChatGPT fares when it comes to writing resumes, how common it is for job seekers to ask generative AI for help, and whether AI-written resumes cause problems for hiring software. Google Bard can help with resume writing, too, as can more conventional tools such as Grammarly.

Jump to:

  • Can I use generative AI to apply for a job?
  • How many job seekers use AI to write resumes?
  • How do AI-written resumes fare with applicant tracking software?

Can I use generative AI to apply for a job?

It’s certainly possible to use AI to write a cover letter or a resume for a job application. You’ll probably want to tweak the results, but it could be a useful resource.

When I asked ChatGPT for a resume template, it provided a conventional, effective template. Some of ChatGPT’s advice seemed basic, so I asked what adjustments it would recommend for a midcareer professional. Its advice was good:

  • Replace the objective section with a professional summary.
  • Emphasize work experience more than education.
  • Remove a section about relevant coursework.

The AI didn’t always keep track of its own conversation perfectly; for instance, it recommended that a midcareer professional add a certifications section, which was already in the template.

SEE: Skills-first hiring aims to make staffing decisions based on the talent someone actually possesses, not their job title. (TechRepublic)

Next, I provided information about my last few jobs manually. ChatGPT wrote an objective section for my resume and added a skills section.

ChatGPT was only as good as the information I provided, but with some tweaks, I can see it providing a good basic template that saves time.

How many job seekers use AI to write resumes?

In a ResumeBuilder.com survey, 46% of job seekers said they used ChatGPT to write their resumes or cover letters.

Beth Noveck, director of the Burnes Center for Social Change at Northeastern University and the GovLab, compared AI-written resumes to traditional templates that show students how a professional resume should look.

“[A resume] is not ‘War and Peace,’” Noveck said in an interview with TechRepublic. “We all copy the format of successful applicants in an industry. Having a good model, having a good template can make all the difference in your ability to apply, and I think that’s important from an equity perspective,” she said.

“Not enough job seekers are using AI just to play around with it and to be able to find different versions of their resume,” Chad Sowash, former recruiter and cohost of the HR industry podcast Chad and Cheese, said. “Whether they’re looking for VP of Marketing or CMO or any type of position, they can come up with really great rough drafts from generative AI.”

How do AI-written resumes fare with applicant tracking software?

MIT Sloan reported in a January 2023 study that using generative AI to write resumes improves job seekers’ chances of being hired by 7.8%. In particular, using the tool improved spelling and grammar.

Applicant tracking software will accept AI-written resumes and may use AI to assess them.

SEE: Should hiring managers use generative AI in their work? (TechRepublic)

Noveck pointed out that 90% of Fortune 500 companies already use automated hiring mechanisms of one kind or another. Algorithms have been in place to run analytics, targeting and discovering candidates for decades. Now, generative AI can fit into an established applicant tracking system, which itself could be made up of a variety of software tools in a tech stack.

“The system can contextualize what’s going on in the resume and whether you go to the next round or not,” Sowash said in an interview with TechRepublic.

Microsoft brings Bing Chat to the enterprise

By Kyle Wiggers

Bing Chat, Microsoft’s AI-powered chatbot experience, is heading to the enterprise.

At its annual Inspire conference, Microsoft unveiled Bing Chat Enterprise, a version of Bing Chat with business-focused data privacy and governance controls. With Bing Chat Enterprise, chat data isn’t saved, Microsoft can’t view a customer’s employee or business data and customer data isn’t used to train the underlying AI models.

“We’ve heard from many corporate customers who are excited to empower their organizations with powerful new AI tools but are concerned that their companies’ data will not be protected,” Frank X. Shaw, Microsoft’s chief communications officer, wrote in a blog post shared with TechCrunch. “[Using Bing Chat Enterprise,] what goes in — and comes out — remains protected, giving commercial customers managed access to better answers, greater efficiency and new ways to be creative.”

To Shaw’s point, where it concerns chatbots like Bing Chat, companies have expressed concerns about confidential data ending up with developers who trained the models on user data. Recently, Apple restricted the internal use of tools including OpenAI’s ChatGPT and Microsoft-owned GitHub’s Copilot, following on the heels of Samsung, Walmart, Verizon, Bank of America, JPMorgan and others.

According to a survey from Cyberhaven, as many as 6.5% of employees have pasted company data into ChatGPT while 3.1% have copied and pasted sensitive data to the chatbot.

Besides the data controls, Bing Chat Enterprise, which is available in preview beginning today, is functionally similar to Bing Chat — answering queries in text as well as with graphs, charts and images. For example, an employee can ask Bing Chat Enterprise to create messaging for a new product or compare a product with a competitor, and include in the prompt sensitive data like product specs and pricing.

Image: Bing Chat Enterprise (Image Credits: Microsoft)

Soon, via an upcoming feature called Visual Search, Bing Chat Enterprise will be able to respond to questions about uploaded images and search the web for related content. Visual Search began rolling out today in Bing Chat on mobile and the web.

Yusuf Mehdi, consumer chief marketing officer at Microsoft, and Jared Spataro, CVP of modern work and business apps, explain the ins and outs of Visual Search in a blog post:

“Visual Search lets anyone upload images and search the web for related content. Take a picture, or use one you find elsewhere and prompt Bing to tell you about it — Bing can understand the context of an image, interpret it and answer questions about it.”

Bing Chat Enterprise can be used wherever Bing Chat is supported — that is, Bing.com/chat, the Microsoft Edge sidebar and soon from Windows Copilot, the Windows-native version of Bing Chat. It’s free for customers subscribed to Microsoft 365 E3, E5, Business Standard and Business Premium and in the future will be available as a standalone offering for $5 per user per month.

Bing Chat Enterprise is enabled automatically when an employee logs into Bing with the Microsoft Account associated with their organization.

The pressure’s on to monetize viral AI-powered chatbot technologies such as Bing Chat. ChatGPT reportedly cost OpenAI tens of millions of dollars to process the millions of prompts people fed into the software in January alone. Meanwhile, financial analysts estimate that Bing Chat, which is underpinned by OpenAI’s GPT-4 model, needs at least $4 billion of infrastructure to serve responses to all of Bing’s users.

The launch of Bing Chat Enterprise comes several months after Microsoft-owned GitHub rolled out Copilot for Business, a $19-per-month enterprise version of the AI-powered code completion tool. For its part, OpenAI debuted ChatGPT Plus, a paid service that delivers a number of benefits over the base-level ChatGPT, including priority access to new features and improvements.

10 Must-Try AI Models for a Filmmaker 

While there have been ongoing debates and disputes within Hollywood regarding the integration of AI in the film industry, it’s important to recognise the immense power that AI wields. Apart from its potential to revolutionise the entertainment sector, AI serves as a formidable tool for experimentation and is far from posing any threat to the essence of the art itself.

Right now, the tools may not be up to the mark, but this is just the beginning. Soon enough, we may be able to make an entire movie with the help of AI.

In the meantime, here is a list of must-try AI models for filmmakers.

Synthesia

Synthesia stands out as an exceptional AI video-generation platform that empowers users to effortlessly create videos featuring AI avatars. With a wide range of capabilities, the platform offers support for over 60 languages, a diverse selection of templates, a screen recorder, a media library, and numerous other valuable features.

Gen-2

Gen-2 is an advanced AI system that excels in generating innovative videos by seamlessly combining elements such as text, images, and video clips. This multi-modal approach empowers Gen-2 to create captivating and unique video content that encompasses a diverse range of media formats.

Murf

Murf provides a versatile solution for converting text to speech, voice-overs, and dictations, catering to professionals across various fields including product developers, podcasters, educators, and business leaders. With Murf, users gain access to extensive customisation options, allowing them to create natural-sounding voices that suit their specific needs. The tool offers a wide selection of voices and dialects to choose from, and its user-friendly interface ensures a seamless experience throughout the content creation process.

Wav2Lip

Wav2Lip is a powerful tool that allows you to synchronize the speech segment of a video with the corresponding lip and facial expressions of the person featured. With Wav2lip, you can seamlessly align the audio and visual elements, ensuring that the movements of the lips and face accurately match the spoken words.

Retrieval-based-Voice-Conversion-WebUI

Retrieval-based Voice Conversion is a method that uses a specialised neural network to change one person’s voice into another person’s voice. It relies on the advanced VITS model, which is a cutting-edge system used for converting text into speech. RVC enables the creation of lifelike and expressive voice transformations, even when there is limited data and computing power available. In simpler terms, it can make someone sound like another person using a smart computer program.

so-vits-svc

You must have seen viral reels on Instagram featuring Drake’s voice on popular songs. That was done using this AI model. The SVC Fork, also known as so-vits-svc, is a remarkable open-source software available on GitHub. This software empowers individuals to train their very own AI model, enabling it to speak in any desired voice and language.

Pictory

You can enter a script or a link to your article in Pictory and it will convert it into a video. One of the remarkable advantages of this tool is its accessibility to users without any prior experience in video editing or design. Getting started is simple: you provide a script or article that forms the foundation of your video content.

DeepBrainAI

This is similar to Pictory. By inputting basic text, users can instantly create videos without any hassle. All you need to do is prepare your script and utilize the Text-to-Speech feature, which allows you to receive your first AI video in less than 5 minutes. This streamlined process enables users to swiftly transform their text into engaging video content with utmost ease.

ChatGPT based on GPT-4

OpenAI’s ChatGPT based on GPT-4 is a no-brainer if you want assistance while writing scripts. ChatGPT will provide you with an immense number of creative options while scriptwriting. You just need to give it a cue as to what your scene should look like, and it will take care of the rest.

MusicGen

Any film is incomplete without good music. Meta has recently unveiled MusicGen, an AI-powered music generator capable of transforming text descriptions into melodic compositions. The code for MusicGen has been made available by Meta, allowing users to access and experience the demo online with just a browser. The generated musical tunes show promising results, showcasing the significant advancements achieved by AI music models.

The post 10 Must-Try AI Models for a Filmmaker appeared first on Analytics India Magazine.

Oracle Cloud Drives 30% Reduction in Technology Support Costs for Tanishq

Jewellery brand Tanishq has migrated its inventory management system to Oracle Cloud Infrastructure (OCI), including Oracle Database, Oracle Application Express (APEX), Oracle Web Application Firewall, and OCI Flexible Load Balancing.

In order to meet the increasing customer demand for its products, Tanishq made the strategic decision to transfer its system, developed on Oracle APEX (a low-code application development platform), to OCI (Oracle Cloud Infrastructure). This migration to OCI has enabled Tanishq to achieve a cohesive, up-to-date overview of inventory across all stores, facilitating effective management of customer demand. Furthermore, it has allowed Tanishq to optimize inventory levels, ultimately leading to increased revenue generation.

By having instant access to information about available inventory and order completion, Tanishq aims to effectively manage the increased customer demand while also reducing technology support expenses by 30 percent with Oracle Cloud Infrastructure (OCI).

“With OCI and Oracle APEX, we can scale up easily and transform our inventory tracking and order fulfilment process enabling us to reduce operational costs. Previously, we operated with a time-consuming manual process that could take up to a few days to generate an inventory report, severely hindering our ability to offer our customers more products,” said Srinivasan K, Head of Information Technology, Tanishq.

Recently, Oracle and Uber Technologies, Inc. announced a seven-year strategic cloud partnership to accelerate Uber’s innovation, help deliver new products to market, and drive increased profitability.

Oracle Cloud Infrastructure, launched in October 2016, is a platform of cloud services that enables you to build and run a wide range of applications in a highly available, consistently high-performance environment.

The post Oracle Cloud Drives 30% Reduction in Technology Support Costs for Tanishq appeared first on Analytics India Magazine.

Leveraging AI for smarter electronic data interchange

Electronic Data Interchange (EDI) can be traced back to the late 1960s and early 1970s when businesses began to seek more efficient ways to exchange data electronically. Consequently, the concept of using computers to transmit and receive business documents emerged, aiming to replace manual paper-based processes. Then in the 1980s, standards organizations such as ANSI and UN/EDIFACT developed standardized formats and protocols for EDI, which led to the evolution of EDI as we know it today.

Over the years, EDI has matured, integrated with various communication technologies, and embraced new data formats, such as XML. Today, EDI continues to play a vital role in facilitating efficient B2B communication, supply chain optimization, and business process automation. The global EDI market is projected to grow from $1.98 billion in 2023 to $4.52 billion by 2030, at a CAGR of 12.5%.

While EDI is the perfect solution for seamless data exchange, it does come with its own set of integration challenges. However, now with AI in the picture, businesses can supercharge EDI transactions.

Exploring EDI

Electronic Data Interchange (EDI) serves as a fundamental technology that facilitates the seamless exchange of business information between trading partners in various industries. Its importance lies in reducing manual processes and improving data accuracy and efficiency. In the retail industry, EDI enables efficient order management, inventory control, and shipment tracking. For example, EDI 947 provides information regarding the quantity, location, and reason for inventory adjustments, allowing trading partners to stay informed in real time.

The technology also has applications in the healthcare sector, where EDI streamlines claims processing, automates medical record exchanges, and ensures compliance with healthcare standards. EDI 270 (Eligibility Inquiry/Response), for example, helps automate the exchange of medical records and insurance claims, reducing administrative burdens. Manufacturing industries benefit from EDI by improving procurement processes, enabling just-in-time inventory management, and enhancing collaboration with suppliers.

Common EDI integration challenges

  1. Complexity of data formats: EDI involves working with various data formats and standards, such as EDIFACT, X12, XML, and CSV. Each format has its own intricacies, and mapping data between different formats can be a complex task, requiring expertise and careful attention to detail.
  2. Data transformation and mapping: Integrating EDI often requires transforming data from one format to another to ensure compatibility with internal systems. The process involves mapping data elements, fields, and values, which can be time-consuming and prone to errors if not handled properly.
  3. Trading partner onboarding: Organizations need to establish EDI connections with multiple trading partners, each with their own specific requirements and protocols. Coordinating and managing the onboarding process can be challenging, especially when dealing with partners with varying levels of technical expertise and readiness.
  4. Connectivity and communication: EDI integration requires establishing reliable connections with trading partners, typically through secure networks or Value-Added Networks (VANs). Ensuring uninterrupted connectivity and timely data exchange can be challenging, especially when dealing with partner networks that may have differing infrastructure or technical limitations.
  5. Data validation and error handling: Validating the integrity and accuracy of incoming and outgoing EDI data is crucial. However, data validation can be complex, given the numerous data elements and business rules involved. Proper error handling mechanisms, such as notifications and automated error resolution, need to be in place to address validation errors and discrepancies effectively.
  6. Scalability and volume handling: As organizations grow and engage with more trading partners, the volume of EDI transactions increases. Ensuring scalability and handling high volumes of data within tight timeframes can strain internal systems, requiring robust infrastructure and efficient processing capabilities.
  7. Data security and privacy: EDI involves the exchange of sensitive business data, which necessitates robust security measures to protect against unauthorized access, data breaches, or tampering. Ensuring data encryption, secure file transfers, and adherence to privacy regulations are vital considerations during EDI integration.
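To make the data-validation challenge concrete, here is a simplified sketch of the kind of rule-based check an integration layer might run on a flattened, EDI-derived record before handing it to internal systems. The field names and rules are hypothetical, not from any real EDI toolkit:

```python
# Illustrative sketch (not a real EDI library): rule-based validation of a
# toy flattened EDI-style record.
def validate_record(record):
    """Return a list of validation errors for a flattened EDI-style record."""
    errors = []
    # Required fields must be present and non-empty.
    for field in ("partner_id", "doc_type", "quantity"):
        if not record.get(field):
            errors.append(f"missing required field: {field}")
    # Quantity must be a positive integer.
    qty = record.get("quantity")
    if qty is not None:
        try:
            if int(qty) <= 0:
                errors.append("quantity must be positive")
        except ValueError:
            errors.append(f"quantity is not numeric: {qty!r}")
    # Document type must be a known transaction set (e.g. 947 = inventory adjustment).
    if record.get("doc_type") not in {"947", "270", "850"}:
        errors.append(f"unknown doc_type: {record.get('doc_type')!r}")
    return errors

print(validate_record({"partner_id": "ACME", "doc_type": "947", "quantity": "12"}))  # []
print(validate_record({"doc_type": "999", "quantity": "-3"}))  # three errors
```

In production, rule sets like this are typically driven by partner-specific configuration rather than hard-coded, which is exactly the mapping burden the next section argues AI can help automate.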

AI and EDI

ChatGPT has become the talk of the town ever since its introduction. It has opened avenues in every industry. The technology has proved especially useful in data integration. By harnessing the power of machine learning algorithms and advanced analytics, AI brings intelligence and automation to the integration process.

The best part about AI is that it can analyze and understand complex EDI data formats and automate data transformation and mapping processes while learning from past mappings. AI algorithms can monitor connectivity and communication channels, proactively identifying and resolving issues to ensure uninterrupted data exchange. It also automates data validation and error detection as machine learning models learn from historical patterns to flag anomalies and suggest corrective actions.

Parting words

As the EDI landscape continues to evolve, AI will play an increasingly critical role in shaping the future of seamless data exchange and facilitating smarter business interactions. AI can automate data transformation, monitor connectivity, and provide automated error handling. With AI’s capabilities, organizations can enhance efficiency, accuracy, and security in EDI, paving the way for streamlined processes and smarter business interactions.

DataStax brings vector search to its Astra DB database service

By Frederic Lardinois (@fredericl)

DataStax, the well-funded Apache Cassandra-centric database company, is placing a lot of its current bets on AI and its technology’s ability to provide highly scalable vector search capabilities to provide real-time context to generative AI models. Today, following a short public preview, the company is launching the vector search capabilities of its hosted Astra DB service into general availability.

Vector databases have emerged as a foundational technology for generative AI. “If you’re a database company and you weren’t making this your top priority, I wouldn’t be able to understand that,” DataStax CPO Ed Anuff told me. “This is the most exciting thing that’s happened to databases in a very long time. It’s just super cool. Databases are pretty cool. They’re foundational and all that, but now being the system that provides memory for artificial intelligence — it completely changes why you get up in the morning.”

DataStax customers can now use Astra DB’s new vector search capabilities on AWS, Microsoft Azure and Google Cloud Platform, where it originally launched. DataStax Enterprises users who run the service in their own data centers will get access to vector search within the next month.

Anuff noted that DataStax saw a lot of uptake during the preview period and given the nature of the product, customers who use vector search also tend to be highly active users. Within a few days after the company launched the public preview, he told me, the company saw just over 1,000 signups and DataStax CEO Chet Kapoor said that the company started 50 new major enterprise POCs last week alone.

“I consider myself to be leaning forward and aggressive with our goals,” Kapoor said. “This blew my mind. So I am very surprised. We are the database-as-a service going into real-time AI — and now we are almost showing up in every Pinecone conversation, every Chroma conversation which is there. It’s happening with investors, as well as customers and with partners.”

Given the hype around generative AI and the importance of vector search for augmenting these models with more recent or personalized data, it’s no surprise that other database services are also trying to capitalize on this momentum. The DataStax team argues that its core technology based on Apache Cassandra, which allows it (and its database index) to reach the massive scale needed for many of these use cases, along with its wide range of certifications, gives it a competitive edge. Anuff also stressed that Astra DB now supports the popular LangChain framework for building LLM-based applications.
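Conceptually, the vector search these services provide boils down to ranking stored embeddings by similarity to a query embedding. A toy sketch of the idea follows; real systems index millions of vectors with approximate-nearest-neighbor structures rather than a linear scan, and the embeddings come from a model, not hand-written numbers:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "database" of (document, embedding) pairs with made-up 3-d embeddings.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "gift cards": [0.0, 0.2, 0.9],
}

query = [0.85, 0.15, 0.05]  # embedding of the user's question
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # prints "refund policy"
```

The retrieved document is then injected into the LLM prompt as context, which is how vector databases provide the "memory for artificial intelligence" Anuff describes.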

“The ability to trust the output of generative AI models will be critical to adoption by enterprises,” explained Matt Aslett, VP and Research Director, Ventana Research. “The addition of vector embeddings and vector search to existing data platforms enables organizations to augment generic models with enterprise information and data, reducing concerns about accuracy and trust.”

Mitsubishi Electric Contributes Towards India’s Semiconductor Push

To bolster India’s semiconductor industry, Mitsubishi Electric’s Semiconductor & Devices department has offered its backing to the nationwide semiconductors and devices lab project. This initiative aims to disseminate information and hands-on expertise regarding the role and utilisation of semiconductors.

In the initial phase of this project, two institutions of technical education were supported with educational material and advanced training methods.

As part of the second phase, the company prepared educational kits and equipment for three technical institutions: the Indian Institute of Technology in Delhi, BMS College of Engineering in Bengaluru, and the Institute of Technology, NIRMA University, in Ahmedabad. The aim was to establish practical knowledge of semiconductor devices as a technical project across institutions in India, and the programme has benefited around 700 students from these colleges.

The educational material prepared by the semiconductors & devices department includes IGBT(Insulated Gate Bipolar Transistor Module), SiC (Silicon Carbide) Module, IPMs (Intelligent Power Modules), DIPIPMs, DIPIPM (Dual In Line IPM) Evaluation PCBA with controller, IGBT Gate Drivers, Application notes, etc. which have been supplied at the labs of the three colleges.

The company’s efforts are directed towards bringing technological upliftment in the country. Hence, it is redirecting its CSR efforts to educate the youth and help them get familiar with technical advancements in the semiconductor industry. The Power Semiconductor devices set-up at the labs of these institutions was inaugurated by Hitesh Bhardwaj, General Manager of Mitsubishi Electric’s Semiconductor and Devices division.

Bhardwaj addressed the media, noting the growing demand for practical learning in the field, and said, “We are glad to collaborate with the three highly impactful institutions in the field of engineering and technology that strongly helps us to contribute towards the betterment of the society through our products & solutions.”

He also added that their skill development CSR initiative has empowered students from across the country, enabling them to contribute towards the development of indigenous technology that is sustainable.

The Semiconductor Landscape in India

In addition to Mitsubishi’s efforts, Lam Research has put forth a proposal to train 60,000 Indian engineers using its Semiverse Solution virtual fabrication platform. The objective of this initiative is to accelerate India’s goals in semiconductor education and workforce development. This would benefit India’s chip industry broadly, as an educated workforce could help bring about homegrown innovation in the field, powering advancements.

Recently, it seemed like the collapse of the mega semiconductor deal between Foxconn and Vedanta would have an adverse impact on India’s chip aspirations. However, Rajeev Chandrasekhar, Minister of State for Electronics and IT, addressing the deal falling through, said that the Taiwanese electronics manufacturer Foxconn’s decision to pull out of the Vedanta joint venture has no impact on India’s semiconductor fabrication plant goal.

“This decision of Foxconn to withdraw from its JV with Vedanta has no impact on India’s Semiconductor Fab goals. None,” he tweeted.
He added that this will also mean that the companies will pursue their plans independently. “It’s not for govt to get into why or how two private companies choose to partner or choose not to, but in simple terms it means both companies can & will now pursue their strategies in India independently, and with appropriate technology partners in Semicon n Electronics,” Chandrasekhar tweeted.

The post Mitsubishi Electric Contributes Towards India’s Semiconductor Push appeared first on Analytics India Magazine.

Ensuring Reliable Few-Shot Prompt Selection for LLMs

Authors: Chris Mauck, Jonas Mueller

In this article, we prompt the Davinci Large Language Model from OpenAI (the model underpinning GPT-3/ChatGPT) with few-shot prompts in an effort to classify the intent of customer service requests at a large bank. Following typical practice, we source the few-shot examples to include in the prompt template from an available dataset of human-labeled request examples. However, the resulting LLM predictions are unreliable — a close inspection reveals this is because real-world data is messy and error-prone. LLM performance in this customer service intent classification task is only marginally boosted by manually modifying the prompt template to mitigate potentially noisy data. The LLM predictions become significantly more accurate if we instead use data-centric AI algorithms like Confident Learning to ensure only high-quality few-shot examples are selected for inclusion in the prompt template.

Let’s consider how we can curate high-quality few-shot examples for prompting LLMs to produce the most reliable predictions. The need to ensure high-quality examples in the few-shot prompt may seem obvious, but many engineers don’t know there are algorithms/software to help you do this more systematically (in fact, there is an entire scientific discipline of Data-Centric AI). Such algorithmic data curation has many advantages: it is fully automated, systematic, and broadly applicable to general LLM applications beyond intent classification.
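As a rough illustration of the underlying idea, the sketch below is a much-simplified stand-in for Confident Learning (not the actual algorithm): flag examples whose model-predicted probability for their given label is low, and keep only confidently labeled examples as few-shot candidates. The probabilities and labels here are made up:

```python
import numpy as np

# Out-of-sample predicted class probabilities for 3 examples, 2 classes
# (in practice these come from a trained classifier via cross-validation).
pred_probs = np.array([
    [0.9, 0.1],   # example 0, labeled class 0 -> model agrees
    [0.2, 0.8],   # example 1, labeled class 0 -> likely mislabeled
    [0.1, 0.9],   # example 2, labeled class 1 -> model agrees
])
labels = np.array([0, 0, 1])

# Self-confidence: the model's probability for each example's *given* label.
self_confidence = pred_probs[np.arange(len(labels)), labels]

# Flag low-confidence examples as potential label issues.
likely_issues = np.where(self_confidence < 0.5)[0]
print(likely_issues)  # [1]
```

The real Confident Learning algorithm estimates per-class confidence thresholds and joint label-noise statistics rather than using a fixed cutoff, but the goal is the same: exclude suspect examples from the few-shot candidate pool.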

Banking Intent Dataset

This article studies a 50-class variant of the Banking-77 Dataset which contains online banking queries annotated with their corresponding intents (the label shown below). We evaluate models that predict this label using a fixed test dataset containing ~500 phrases, and have a pool of ~1000 labeled phrases which we consider as candidates to include amongst our few-shot examples.

[Image: sample queries from the dataset and their corresponding intent labels]

You can download the candidate pool of few-shot examples and test set here and here. Here’s a notebook you can run to reproduce the results shown in this article.

Few-shot Prompting

Few-shot prompting (also known as in-context learning) is an NLP technique that enables pretrained foundation models to perform complex tasks without any explicit training (i.e. updates to model parameters). In few-shot prompting, we provide the model with a small number of input-output pairs as part of a prompt template, which is prepended to the input to show the model how to handle it. The additional context provided by the prompt template helps the model better infer what type of output is desired. For example, given the input “Is San Francisco in California?”, an LLM will better know what type of output is desired if this prompt is augmented with a fixed template so that the new prompt looks something like:

Text: Is Boston in Massachusetts?

Label: Yes

Text: Is Denver in California?

Label: No

Text: Is San Francisco in California?

Label:
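A template like the one above can be assembled mechanically from labeled example pairs. The short sketch below is illustrative (the `build_prompt` helper is ours, not from the article's notebook):

```python
def build_prompt(examples, target_text):
    """Assemble a few-shot prompt: labeled examples followed by the target."""
    parts = []
    for text, label in examples:
        parts.append(f"Text: {text}\nLabel: {label}")
    # The final entry leaves the label blank for the model to fill in.
    parts.append(f"Text: {target_text}\nLabel:")
    return "\n\n".join(parts)

examples = [
    ("Is Boston in Massachusetts?", "Yes"),
    ("Is Denver in California?", "No"),
]
prompt = build_prompt(examples, "Is San Francisco in California?")
print(prompt)
```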

Few-shot prompting is particularly useful in text classification scenarios where your classes are domain-specific (as is typically the case in customer service applications within different businesses).

In our case, we have a dataset with 50 possible classes (intents) to provide context for, such that OpenAI’s pretrained LLM can learn the difference between classes in context. Using LangChain, we select one random example from each of the 50 classes (from our pool of labeled candidate examples) and construct a 50-shot prompt template. We also append a string that lists the possible classes before the few-shot examples to ensure the LLM output is a valid class (i.e. intent category).
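The article does this selection with LangChain; the standard-library sketch below shows just the sampling step, drawing one random labeled example per class from the candidate pool (the function name and the `(text, label)` pool format are our assumptions):

```python
import random
from collections import defaultdict

def sample_one_per_class(pool, seed=0):
    """Pick one random labeled example per class from the candidate pool."""
    by_class = defaultdict(list)
    for text, label in pool:
        by_class[label].append(text)
    rng = random.Random(seed)  # fixed seed so the prompt is reproducible
    return [(rng.choice(texts), label) for label, texts in sorted(by_class.items())]

# Toy pool with two intent classes.
pool = [
    ("card not arrived yet", "card_arrival"),
    ("where is my new card", "card_arrival"),
    ("my transfer failed", "failed_transfer"),
]
shots = sample_one_per_class(pool)
print(shots)  # one (text, label) pair per distinct class
```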

The above 50-shot prompt is used as input for the LLM to get it to classify each of the examples in the test set (target text above is the only part of this input that changes between different test examples). These predictions are compared against the ground truth labels to evaluate the LLM accuracy produced using a selected few-shot prompt template.
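The evaluation itself is just exact-match accuracy of predicted intents against ground-truth labels. A minimal version of `evaluate_preds` could look like the following (this signature takes a plain list of labels and is our sketch, not the notebook's):

```python
def evaluate_preds(preds, labels):
    """Exact-match accuracy of predicted intents against ground-truth labels."""
    correct = sum(p == y for p, y in zip(preds, labels))
    accuracy = correct / len(labels)
    print(f"Model Accuracy: {accuracy:.1%}")
    return accuracy

# Toy example: one of two predictions matches its ground-truth label.
acc = evaluate_preds(
    ["card_arrival", "failed_transfer"],
    ["card_arrival", "card_arrival"],
)
```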

Baseline Model Performance

# This method handles:
# - collecting each of the test examples
# - formatting the prompt
# - querying the LLM API
# - parsing the output
def eval_prompt(examples_pool, test, prefix="", use_examples=True, seed=0):
    texts = test.text.values
    responses = []
    examples = get_examples(examples_pool, seed) if use_examples else []
    for text in texts:
        prompt = get_prompt(examples_pool, text, examples, prefix)
        resp = get_response(prompt)
        responses.append(resp)
    return responses
# Evaluate the 50-shot prompt shown above.
preds = eval_prompt(examples_pool, test)
evaluate_preds(preds, test)
>>> Model Accuracy: 59.6%

Running each of the test examples through the LLM with the 50-shot prompt shown above, we achieve an accuracy of 59.6%, which is not bad for a 50-class problem. But this is not quite satisfactory for our bank’s customer service application, so let’s take a closer look at the dataset (i.e. pool) of candidate examples. When ML performs poorly, the data is often to blame!

Issues in Our Data

Via close inspection of the candidate pool of examples from which our few-shot prompt was drawn, we find mislabeled phrases and outliers lurking in the data. Here are a few examples that were clearly annotated incorrectly.

[Image: phrases from the candidate pool whose annotated intent is clearly incorrect]

Previous research has observed that many popular datasets contain incorrectly labeled examples because data annotation teams are imperfect.

It is also common for customer service datasets to contain out-of-scope examples that were accidentally included. Here we see a few strange-looking examples that do not correspond to valid banking customer service requests.

[Image: out-of-scope phrases that do not correspond to valid banking customer service requests]

Why Do These Issues Matter?

As the context size for LLMs grows, it is becoming common for prompts to include many examples. As such, it may not be possible to manually validate all of the examples in your few-shot prompt, especially with a large number of classes (or if you lack domain knowledge about them). If the data source of these few-shot examples contains issues like those shown above (as many real-world datasets do), then errant examples may find their way into your prompts. The rest of the article examines the impact of this problem and how we can mitigate it.

Can we warn the LLM the examples may be noisy?

What if we just include a “disclaimer warning” in the prompt telling the LLM that some labels in the provided few-shot examples may be incorrect? Here we consider the following modification to our prompt template, which still includes the same 50 few-shot examples as before.

prefix = 'Beware that some labels in the examples may be noisy and have been incorrectly specified.'
preds = eval_prompt(examples_pool, test, prefix=prefix)
evaluate_preds(preds, test)
>>> Model Accuracy: 62.0%

Using the above prompt, we achieve an accuracy of 62%. Marginally better, but still not good enough to use the LLM for intent classification in our bank’s customer service system!

Can we remove the noisy examples entirely?

Since we can’t trust the labels in the few-shot examples pool, what if we just remove them entirely from the prompt and only rely on the powerful LLM? Rather than few-shot prompting, we are doing zero-shot prompting in which the only example included in the prompt is the one the LLM is supposed to classify. Zero-shot prompting entirely relies on the LLM’s pretrained knowledge to get the correct outputs.
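Under the same prompt format, going zero-shot just means dropping the example pairs and keeping only the class list and the target text. A minimal sketch (the helper name and wording are ours):

```python
def zero_shot_prompt(classes, target_text):
    """Zero-shot prompt: list the valid intent labels, then the target text alone."""
    class_list = ", ".join(classes)
    return (
        f"Classify the text into one of these intents: {class_list}\n\n"
        f"Text: {target_text}\nLabel:"
    )

prompt = zero_shot_prompt(["card_arrival", "failed_transfer"], "where is my new card")
print(prompt)
```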

preds = eval_prompt(examples_pool, test, use_examples=False)
evaluate_preds(preds, test)
>>> Model Accuracy: 67.4%

After removing the poor-quality few-shot examples entirely, we achieve an accuracy of 67.4% which is the best that we’ve done so far!

It seems that noisy few-shot examples can actually harm model performance rather than boost it as they are supposed to do.

Can we identify and correct the noisy examples?

Instead of modifying the prompt or removing the examples entirely, the smarter (yet more complex) way to improve our dataset would be to find and fix the label issues by hand. This simultaneously removes a noisy data point that is harming the model and adds an accurate one that should improve its performance via few-shot prompting, but making such corrections manually is cumbersome. Here we instead effortlessly correct the data using Cleanlab Studio, a platform that implements Confident Learning algorithms to automatically find and fix label issues.
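The core idea behind Confident Learning can be sketched in a few lines: flag an example when a trained model's predicted probability for the example's given label falls below that class's average self-confidence, and the model is more confident in a different class. This is a simplified version of the thresholding in Confident Learning (Cleanlab Studio implements the full algorithm); all names and toy numbers below are illustrative:

```python
def flag_label_issues(probs, given_labels):
    """Flag indices of examples whose given label looks wrong, using a
    simplified Confident Learning-style per-class confidence threshold."""
    # Per-class threshold: mean predicted probability of the given label,
    # averaged over examples annotated with that label.
    thresholds = {}
    for c in set(given_labels):
        confs = [p[c] for p, y in zip(probs, given_labels) if y == c]
        thresholds[c] = sum(confs) / len(confs)
    return [
        i for i, (p, y) in enumerate(zip(probs, given_labels))
        if p[y] < thresholds[y] and max(p, key=p.get) != y
    ]

# Toy example: index 2 is annotated "a" but the model is confident it is "b".
probs = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.8, "b": 0.2},
    {"a": 0.2, "b": 0.8},
]
issues = flag_label_issues(probs, ["a", "a", "a"])
print(issues)
```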

After replacing the estimated bad labels with ones Confident Learning estimates to be more suitable, we re-run the original 50-shot prompt through the LLM with each test example, this time using the auto-corrected labels, which ensures we provide the LLM with 50 high-quality examples in its few-shot prompt.

# Source examples with the corrected labels.
clean_pool = pd.read_csv("studio_examples_pool.csv")

# Evaluate the original 50-shot prompt using the high-quality examples.
preds = eval_prompt(clean_pool, test)
evaluate_preds(preds, test)
>>> Model Accuracy: 72.0%

After doing this, we achieve an accuracy of 72% which is quite impressive for the 50-class problem.

We’ve now shown that noisy few-shot examples can considerably decrease LLM performance and that it is suboptimal to just manually change the prompt (via adding caveats or removing examples). To achieve the highest performance, you should also try correcting your examples using Data-centric AI techniques like Confident Learning.

Importance of Data-centric AI

This article highlights the significance of ensuring reliable few-shot prompt selection for language models, specifically focusing on customer service intent classification in the banking domain. Through the exploration of a large bank's customer service request dataset and the application of few-shot prompting techniques using the Davinci LLM, we encountered challenges stemming from noisy and erroneous few-shot examples. We demonstrated that modifying the prompt or removing examples alone cannot guarantee optimal model performance. Instead, data-centric AI algorithms like Confident Learning implemented in tools like Cleanlab Studio proved to be more effective in identifying and rectifying label issues, resulting in significantly improved accuracy. This study emphasizes the role of algorithmic data curation in obtaining reliable few-shot prompts, and highlights the utility of such techniques in enhancing Language Model performance across various domains.
Chris Mauck is Data Scientist at Cleanlab.

More On This Topic

  • Design effective & reliable machine learning systems!
  • Zero-shot Learning, Explained
  • 8 Free AI and LLMs Playgrounds
  • From Unstructured to Structured Data with LLMs
  • What are Vector Databases and Why Are They Important for LLMs?
  • The Art of Prompt Engineering: Decoding ChatGPT

Hugging Face Showcases Demos Based On Open Source Text-To-Video Models, Pinpoints Flaws

Hugging Face, the AI developers’ go-to platform, has released AI WebTV as its latest advancement in automatic video and music synthesis. The model aims to advocate for open-source, accessible text-to-video models like Zeroscope and MusicGen.

The technique excels at replacing backgrounds during camera panning or rotation. Moreover, it gives users creative freedom, granting control over the number of frames in the generation process, resulting in high-quality slow-motion effects. The prime video model behind WebTV is Zeroscope V2, which can be implemented in NodeJS and TypeScript.

The HF model works by taking video shot prompts, which a text-to-video model turns into a sequence of takes. To enhance the creative process further, a human-authored base theme and idea are fed into a large language model to generate diverse individual prompts for each video clip.

Prompt: 3D rendered animation showing a group of food characters forming a pyramid, with a banana standing triumphantly on top. In a city with cotton candy clouds and chocolate road, Pixar’s style, CGI, ambient lighting, direct sunlight, rich color scheme, ultra realistic, cinematic, photorealistic.

Talking about the ability of text-to-video models, the HF blog, authored by Julian Bilcke, stated, “We’ve seen it with large language models and their ability to synthesize convincing content that mimics human responses, but this takes things to a whole new dimension when applied to video.”

The video sequences released along with the demo are kept short, to present WebTV as a tech demo rather than an actual show with art direction or programming.

Even though the advancement is being lauded, HF has pointed out a few cases where the model fails. Firstly, it can have issues with movement and direction. For instance, a clip can sometimes be played in reverse. Also, in certain instances the modifier keyword is not taken into account. Furthermore, the model sometimes injects words from the prompt, which can appear in the video.

Source: https://huggingface.co/blog/ai-webtv

Similar to HF’s model, Meta AI released Make-A-Video last September, but the model remains closed source, like the majority of services announced by the tech giant.

Read more: Meta AI Releases A Multimodal Model “CM3leon” — But Won’t Release It

The post Hugging Face Showcases Demos Based On Open Source Text-To-Video Models, Pinpoints Flaws appeared first on Analytics India Magazine.