It’s Time For ChatGPT to Have Voice 

In the ever-evolving landscape of AI, OpenAI’s ChatGPT is on the cusp of a groundbreaking transformation. With the recent introduction of ‘custom instructions’, users can now experience a more personalized and intuitive interaction with the AI model. But the possibilities don’t end there. Imagine ChatGPT with its own unique voice, ushering in a new era of conversational AI, one that could avoid the past shortcomings of voice assistants like Alexa, which failed to deliver personalized experiences.

By incorporating custom instructions, users can get by with fewer prompts, making interactions more efficient and user-friendly: ChatGPT can now remember your conversation context based on your chosen preferences, allowing for a more personalized and tailored AI experience.
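The ‘remembered preferences’ idea can be approximated with a system message prepended to every request. Below is a minimal sketch assuming the 2023-era OpenAI chat message format; the preference text and the `build_messages` helper are illustrative assumptions, not OpenAI’s actual implementation of custom instructions:

```python
# Sketch: persistent user preferences are prepended as a system message on every
# call, so the model appears to "remember" them across prompts.
# The instruction text below is a hypothetical example.
CUSTOM_INSTRUCTIONS = (
    "I am a vegetarian living in Mumbai. "
    "Prefer concise answers and metric units."
)

def build_messages(user_prompt, history=None):
    """Prepend the stored instructions so the model sees the user's preferences."""
    messages = [{"role": "system", "content": CUSTOM_INSTRUCTIONS}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages

msgs = build_messages("Suggest a quick dinner recipe.")
print(msgs[0]["role"], "->", msgs[-1]["content"])
```

Built this way, every new conversation starts with the user’s stored preferences, so they never have to be restated in each prompt.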

With ChatGPT now able to remember conversations, integrating a voice interface seems like the next logical step, much as Amazon’s Alexa revolutionized the way we interact with voice-controlled devices.

In the past, voice assistants like Alexa could perform tasks like sharing news and playing songs, but they lacked a personalized touch. With ChatGPT’s memory feature, we can take this interaction to a new level. Imagine booking a restaurant or listening to your favorite songs effortlessly, as ChatGPT remembers your preferences and caters to your unique needs. This evolution in the field of generative AI brings us closer to a more natural and intuitive way of interacting with technology.

How Amazon Alexa made big promises and failed

Upon its initial rollout in 2015, Alexa faced a surge of users posing curious and offbeat questions, ranging from inquiries about the meaning of life to whimsical desires. However, as time passed, Alexa’s unsatisfactory responses failed to retain users’ interest. The devices lacked the ability to deliver personalized experiences or generate significant advertising revenue for the company. As a result, users’ engagement with Alexa gradually diminished over time.

It was speculated that generative AI would bring Alexa back from the dead; however, that doesn’t look likely to happen anytime soon. A few months ago, Alexa was declared dead: the company pulled the plug on its ‘Amazon Alexa’ voice-assistant feature, succumbing to huge operating losses.

In addition to Amazon, other companies like Google have also shifted their focus away from their respective voice assistants, as reported earlier this year by The Information. Similarly, Microsoft’s Cortana and Samsung’s Bixby, once considered Alexa competitors, have also faced challenges and reduced prominence in the market.

Right now, OpenAI has the first-mover advantage in generative AI, and it is not easy to catch up in this race. It is high time for OpenAI to tap the voice-assistant market before anyone else does.

Bard breathing down OpenAI’s neck

Recently, Google announced that Bard would be available in over 40 languages. With its latest updates, users can now listen to Bard’s responses and change their tone and style across five options: simple, short, long, professional or casual.

Google is making significant efforts to enhance Bard’s conversational abilities and enable generative AI to speak more naturally. Personalised voice interaction opens up numerous use cases that go well beyond asking for factual details like the weather outside.

For instance, asking for recipes based on the groceries you have, or asking questions and receiving quick insights on any topic, with ChatGPT holding a to-and-fro conversation just like another human.

Currently, Bard has access to real-time information and events, giving it an advantage in these scenarios. In contrast, ChatGPT relies on plugins, which are only available to Plus users, to access real-time data.

In the last few days, OpenAI has struck a slew of news-agency partnerships with the Associated Press and the American Journalism Project. Hopefully, this will give ChatGPT real-time information just like Bard, which will make personalized conversations even better.

To stay competitive and meet user expectations, OpenAI should consider incorporating voice control into ChatGPT, transforming it into a personal virtual assistant like Jarvis from Iron Man. Just like Tony Stark’s seamless conversations with Jarvis, the world is eagerly waiting for real-life AI interactions that provide personalized and natural responses.

The post It’s Time For ChatGPT to Have Voice appeared first on Analytics India Magazine.

This week in AI: Companies voluntarily submit to AI guidelines — for now

By Kyle Wiggers

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

This week in AI, we saw OpenAI, Anthropic, Google, Inflection, Microsoft, Meta and Amazon voluntarily commit to pursuing shared AI safety and transparency goals ahead of a planned Executive Order from the Biden administration.

As my colleague Devin Coldewey writes, there’s no rule or enforcement being proposed here — the practices agreed to are purely voluntary. But the pledges indicate, in broad strokes, the AI regulatory approaches and policies that each vendor might find amenable in the U.S. as well as abroad.

Among other commitments, the companies volunteered to conduct security tests of AI systems before release, share information on AI mitigation techniques and develop watermarking techniques that make AI-generated content easier to identify. They also said that they would invest in cybersecurity to protect private AI data and facilitate the reporting of vulnerabilities, as well as prioritize research on societal risks like systemic bias and privacy issues.

The commitments are an important step, to be sure — even if they’re not enforceable. But one wonders if there are ulterior motives on the part of the undersigners.


Reportedly, OpenAI drafted an internal policy memo that shows the company supports the idea of requiring government licenses from anyone who wants to develop AI systems. CEO Sam Altman first raised the idea at a U.S. Senate hearing in May, during which he backed the creation of an agency that could issue licenses for AI products — and revoke them should anyone violate set rules.

In a recent interview with the press, Anna Makanju, OpenAI’s VP of global affairs, insisted that OpenAI wasn’t “pushing” for licenses and that the company only supports licensing regimes for AI models more powerful than its current GPT-4. But government-issued licenses, should they be implemented in the way OpenAI proposes, set the stage for a potential clash with startups and open-source developers who may see them as an attempt to make it more difficult for others to break into the space.

Devin said it best, I think, when he described it to me as “dropping nails on the road behind them in a race.” At the very least, it illustrates the two-faced nature of AI companies that seek to placate regulators while shaping policy in their favor (in this case putting smaller challengers at a disadvantage) behind the scenes.

It’s a worrisome state of affairs. But, if policymakers step up to the plate, there’s hope yet for sufficient safeguards without undue interference from the private sector.

Here are other AI stories of note from the past few days:

  • OpenAI’s trust and safety head steps down: Dave Willner, an industry veteran who was OpenAI’s head of trust and safety, announced in a post on LinkedIn that he’s left the job and transitioned to an advisory role. OpenAI said in a statement that it’s seeking a replacement and that CTO Mira Murati will manage the team on an interim basis.
  • Customized instructions for ChatGPT: In more OpenAI news, the company has launched custom instructions for ChatGPT users so that they don’t have to write the same instruction prompts to the chatbot every time they interact with it.
  • Google news-writing AI: Google is testing a tool that uses AI to write news stories and has started demoing it to publications, according to a new report from The New York Times. The tech giant has pitched the AI system to The New York Times, The Washington Post and The Wall Street Journal’s owner, News Corp.
  • Apple tests a ChatGPT-like chatbot: Apple is developing AI to challenge OpenAI, Google and others, according to a new report from Bloomberg’s Mark Gurman. Specifically, the tech giant has created a chatbot that some engineers are internally referring to as “Apple GPT.”
  • Meta releases Llama 2: Meta unveiled a new family of AI models, Llama 2, designed to drive apps along the lines of OpenAI’s ChatGPT, Bing Chat and other modern chatbots. Trained on a mix of publicly available data, Meta claims that Llama 2’s performance has improved significantly over the previous generation of Llama models.
  • Authors protest against generative AI: Generative AI systems like ChatGPT are trained on publicly available data, including books — and not all content creators are pleased with the arrangement. In an open letter signed by more than 8,500 authors of fiction, non-fiction and poetry, the tech companies behind large language models like ChatGPT, Bard, LLaMa and more are taken to task for using their writing without permission or compensation.
  • Microsoft brings Bing Chat to the enterprise: At its annual Inspire conference, Microsoft announced Bing Chat Enterprise, a version of its Bing Chat AI-powered chatbot with business-focused data privacy and governance controls. With Bing Chat Enterprise, chat data isn’t saved, Microsoft can’t view a customer’s employee or business data and customer data isn’t used to train the underlying AI models.

More machine learnings

Technically this was also a news item, but it bears mentioning here in the research section. Fable Studios, which previously made CG and 3D short films for VR and other media, showed off an AI model it calls Showrunner that (it claims) can write, direct, act in and edit an entire TV show — in their demo, it was South Park.


I’m of two minds on this. On one hand, I think pursuing this at all, let alone during a huge Hollywood strike that involves issues of compensation and AI, is in rather poor taste. Though CEO Edward Saatchi said he believes that the tool puts power in the hands of creators, the opposite is also arguable. At any rate it was not received particularly well by people in the industry.

On the other hand, if someone on the creative side (which Saatchi is) does not explore and demonstrate these capabilities, then they will be explored and demonstrated by others with less compunction about putting them to use. Even if the claims Fable makes are a bit expansive for what they actually showed (which has serious limitations) it is like the original DALL-E in that it prompted discussion and indeed worry even though it was no replacement for a real artist. AI is going to have a place in media production one way or the other — but for a whole sack of reasons it should be approached with caution.

On the policy side, a little while back we had the National Defense Authorization Act going through with (as usual) some really ridiculous policy amendments that have nothing to do with defense. But among them was one addition: the government must host an event where researchers and companies can do their best to detect AI-generated content. This kind of thing is definitely approaching “national crisis” levels, so it’s probably good this got slipped in there.

Over at Disney Research, they’re always trying to find a way to bridge the digital and the real — for park purposes, presumably. In this case they have developed a way to map virtual movements of a character or motion capture (say for a CG dog in a film) onto an actual robot, even if that robot is a different shape or size. It relies on two optimization systems each informing the other of what is ideal and what is possible, sort of like a little ego and super-ego. This should make it much easier to make robot dogs act like regular dogs, but of course it’s generalizable to other stuff as well.

And here’s hoping AI can help us steer the world away from sea-bottom mining for minerals, because that is definitely a bad idea. A multi-institutional study put AI’s ability to sift signal from noise to work predicting the location of valuable minerals around the globe. As they write in the abstract:

In this work, we embrace the complexity and inherent “messiness” of our planet’s intertwined geological, chemical, and biological systems by employing machine learning to characterize patterns embedded in the multidimensionality of mineral occurrence and associations.

The study actually predicted and verified locations of uranium, lithium, and other valuable minerals. And how about this for a closing line: the system “will enhance our understanding of mineralization and mineralizing environments on Earth, across our solar system, and through deep time.” Awesome.

OpenAI and Google Are Out to Control Newsrooms

Amid Meta hogging the limelight with the Llama 2 reveal and Microsoft sharing the attention, OpenAI has also kept itself busy this week, in a different way. Apart from rolling out features such as custom instructions for ChatGPT and extending support for older OpenAI API models, the company has been focusing on ‘itself’. It is slowly building a path to influence media through a slew of news-agency partnerships with the Associated Press and the American Journalism Project. Interestingly, Google has reportedly demonstrated its AI tool ‘Genesis’ to executives at The New York Times, The Washington Post and News Corp.

What’s with big tech hovering over media publications? Is it a mere battle over media supremacy? Or, is it a ploy to support their individual needs?

The AI – Data Tradeoff

Local newsrooms have been struggling with shrinking viewership, falling advertising revenue and financial crunches, which have led to the closure of many outlets. By 2025, the US is set to lose a third of its newspaper outlets. In this bleeding market, aid offered by big tech in the form of money and AI tools is a rescue boat. In return, massive amounts of data are exchanged.

OpenAI recently signed a multi-million-dollar partnership with the American Journalism Project (AJP) – its biggest collaboration with any media organisation. The company announced a $5 million partnership with AJP and another $5 million in API credits for AJP’s grantee organisations – which means that OpenAI will indirectly tie up with AJP’s 41 news agencies. The agreement is said to benefit AJP by exploring AI tools for local news. In return, though not confirmed, OpenAI will get access to historical and real-time data from these news agencies – a coveted goldmine for training its future models.

A week ago, OpenAI announced a tie-up with one of the biggest news agencies in the US – the Associated Press (AP). The partnership is said to help examine potential use cases for generative AI in news products and services. At the same time, OpenAI will gain access to AP’s news archive dating back to 1985.

However, Google’s approach seems to be on a different tangent.

Building a Defence Strategy

Google has reportedly been demonstrating its AI tool to big news establishments such as The New York Times, The Washington Post and News Corp, which owns The Wall Street Journal, in a bid to help journalists write stories and automate certain tasks. Some of those who saw the Google pitch believe the move takes for granted the effort that goes into producing accurate news stories.

Unlike OpenAI, which has been tying up with not-for-profit and local news agencies, Google’s approach is two-pronged. Apart from data access, Google’s prospective tie-ups with major news publications are a probable move to avoid the ongoing payment tussles it faces over displaying news content without paying publishers. In Australia, Google pays news companies for putting their content on its platform. In Canada, Google plans to remove links to Canadian news if the law mandating payment to news publishers comes into effect. If the AI tech Google demonstrated is adopted by news publishers, the company can likely evade such problems.

Considering how companies such as Twitter, Reddit and Stack Overflow, the databanks of today, are restricting content from being scraped off their websites, training AI models will become more challenging. Recently, over 8,000 authors signed a letter urging companies such as OpenAI, Google, Meta and a few others to offer compensation for using their copyrighted work to train models. To hedge against a future shortage of data and preempt legal tussles over plagiarism and copyright, OpenAI and Google are building a proactive shield now.

The Unsolvable Setbacks

While charting a future path, OpenAI still has a long way to go in fixing its current hiccups. The company finds itself in one soup or another. Recently, the Federal Trade Commission (FTC) opened an expansive investigation into OpenAI’s activities over risks of security breaches and leaks of personal data. A similar data-leak issue led OpenAI to retract its Bing web-browsing feature within two weeks of releasing it on the ChatGPT mobile app.

With data-security concerns and rising accusations of churning out inaccurate information, how will OpenAI be able to work effectively with news media, where factual and accurate information is a basic necessity? While the larger plan behind the media tie-ups may be obscure at the moment, it currently looks like a plan to save their skin from future predicaments.

The post OpenAI and Google Are Out to Control Newsrooms appeared first on Analytics India Magazine.

Free From Google: Generative AI Learning Path


Are you interested in discovering the potential of Generative AI models and their applications? Luckily, Google Cloud released the Generative AI Learning Path, a great collection of free courses that starts by explaining the basic concepts of Generative AI and works up to more sophisticated tools, like Generative AI Studio, for building your own customized generative AI models.

This article will explore seven of the available courses, which will help you understand the concepts behind the Large Language Models that surround us every day and create new AI solutions. Let’s get started!

1. Introduction to Generative AI

Course link: Introduction to Generative AI

This first course is an introduction to Generative AI by Dr Gwendolyn Stripling, an AI Technical Curriculum Developer at Google Cloud. It teaches you what Generative AI is and how it’s applied. It begins with the basic concepts of data science (AI, Machine Learning, Deep Learning) and how Generative AI differs from these disciplines. Moreover, it explains the key concepts surrounding Generative AI, such as transformers, hallucinations and Large Language Models, with very intuitive illustrations.

Video duration: 22 minutes

Lecturer: Gwendolyn Stripling

Suggested readings:

  • Ask a Techspert: What is generative AI?
  • Build new generative AI powered search & conversational experiences with Gen App Builder
  • The implications of Generative AI for businesses

2. Introduction to Large Language Models

Course link: Introduction to Large Language Models

This second course introduces what Large Language Models are at a high level. In particular, it gives examples of LLM applications, like text classification, question answering and document summarization. Finally, it shows the potential of Google’s Generative AI development tools for building your applications with no code.

Video duration: 15 minutes

Lecturer: John Ewald

Suggested readings:

  • NLP's ImageNet moment has arrived
  • Google Cloud supercharges NLP with large language models
  • LaMDA: our breakthrough conversation technology

3. Introduction to Image Generation

Course link: Introduction to Image Generation

This third course focuses on diffusion models, one of the most important families of image-generation models. Other promising approaches include Variational Autoencoders, Generative Adversarial Networks and Autoregressive Models.

It also shows the use cases, which can be categorized into two types: unconditioned generation and conditioned generation. The first includes applications such as human-face synthesis and super-resolution, while examples of conditioned generation are generating images from a text prompt, image inpainting and text-guided image-to-image translation.

Video duration: 9 minutes

Lecturer: Kyle Steckler

4. Attention Mechanism

Course link: Attention Mechanism

In this short course, you will learn more about the attention mechanism, a very important concept behind transformers and Large Language Models. It has improved performance on tasks such as machine translation, text summarization and question answering. In particular, the course shows how the attention mechanism works to solve machine translation.

Video duration: 5 minutes

Lecturer: Sanjana Reddy
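The core of the mechanism covered in this course can be condensed into a few lines. Below is a minimal sketch of scaled dot-product attention in NumPy; the shapes and random inputs are illustrative only, not part of the course material:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention weights and the weighted sum of value vectors."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # softmax over the key axis (subtract max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# 3 query positions attending over 4 key/value positions, dimension 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Each output row is a weighted average of the value vectors, with weights determined by how well the corresponding query matches each key; this is what lets a translation model focus on the relevant source words when producing each target word.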

5. Transformer Models and BERT Model

Course link: Transformer Models and BERT Model

This course covers transformer architecture, which is an underlying concept behind the BERT model. After explaining the transformer, it gives an overview of BERT and how it’s applied to solve different tasks, such as single-sentence classification and question answering.

Unlike the previous courses, the theory here is accompanied by a lab, which requires prior knowledge of Python and TensorFlow.

Video duration: 22 minutes

Lecturer: Sanjana Reddy

Suggested readings:

  • Attention Is All You Need
  • Transformer: A Novel Neural Network Architecture for Language Understanding

6. Create Image Captioning Models

Course link: Create Image Captioning Models

This course explains image captioning models, generative models that take images as input and produce text captions. They exploit an encoder-decoder structure, the attention mechanism and transformers to predict a caption for a given image. Like the previous course, there is also a lab to put the theory into practice, again oriented toward data professionals with prior knowledge of Python and TensorFlow.

Video duration: 29 minutes

Lecturer: Takumi Ohyama

7. Introduction to Generative AI Studio

Course link: Introduction to Generative AI Studio

This last course introduces and explores Generative AI Studio. It starts by recapping what Generative AI is and its use cases, such as code generation, information extraction and virtual assistance. After this overview of the core concepts, Google Cloud shows the tools that help you solve Generative AI tasks even without an AI background. One of these tools is Vertex AI, a platform that lets you manage the machine learning lifecycle, from building to deploying models. This end-to-end platform includes two products, Generative AI Studio and Model Garden. The course focuses on Generative AI Studio, which allows you to easily build generative models with no code or low code.

Video duration: 15 minutes

Suggested readings:

  • Generative AI Studio
  • Overview of text prompt design
  • Lab: Get Started with Generative AI Studio

Final Thoughts

I hope you have found this quick overview of the Generative AI courses provided by Google Cloud useful. If you don’t know where to start in understanding the core concepts of Generative AI, this path covers every aspect. If you already have a machine learning background, there are surely models and use cases you can discover in one of these courses. Do you know other free courses about Generative AI? Drop your suggestions in the comments.

Eugenia Anello is currently a research fellow at the Department of Information Engineering of the University of Padova, Italy. Her research project is focused on Continual Learning combined with Anomaly Detection.


One Year of Midjourney: A Psychedelic Image Generator to Realistic Photo Album


Last week, Midjourney completed one year since its launch. In that time, the image-generation platform has gone through a massive transformation.

We all know about controversial photos like Donald Trump in prison, the Pope in a puffer jacket, and a bomb blast at the White House. For these images to look real, Midjourney made a lot of improvements, all within the span of a year.

In February 2022, Midjourney launched a closed beta of the product (for just 500 users) and, a few months later on July 12, launched the first public beta. The difference between the first version and the current ones (5 and beyond) is mind-boggling. We have mapped the journey of the platform as it metamorphosed from a psychedelic image generator into a realistic photo album.

evolution of Midjourney over the course of ~1 year pic.twitter.com/bkJ32voNeA

— Tanay Jaipuria (@tanayj) March 30, 2023

All photos below have the same prompt – “an artist in her artist studio”

Midjourney V1 – February 2022

Midjourney’s beta version was released to a limited group of 500 users, who were given the privilege of inviting another 500 users, for a total of 1,000. This initial release served as a testing phase to gather feedback and refine the platform. The founder, David Holz, encouraged users to share their images on social media, likely to generate interest and grow the user base organically.

At its initial release, Midjourney V1 was revolutionary for its time, surpassing Stable Diffusion v1.4 and DALL-E 2 in producing remarkable results. While “quality” wasn’t a primary focus, users were impressed with the platform’s capabilities, even if the images were relatively low in detail. Although lacking in dimensionality, V1 produced interesting textures in the environment, likely serving as an early exploration of the platform’s capabilities.

Midjourney V2 – April 12, 2022

With community feedback from the beta phase, Midjourney released new features, including “Upscaling” and “Variation” buttons, along with a solid pricing plan. The previous free generation option transitioned to a paid beta model. As a result, the platform’s popularity began to grow, and the waitlist for new users started to expand.

In V2, Midjourney improved the character rendering to achieve a more realistic appearance. However, artefacts still created a unique “psychedelic” or dreamlike quality. The process involved generating low-detail images initially and then using the Remaster feature to enhance details and overall image quality.

V2 improved significantly, providing a more realistic sense of space, resembling concepts used in architectural design. It demonstrated that newer models aren’t necessarily always better, and creativity could flourish with older versions.

Midjourney V3 – July 25, 2022

Version 3 of Midjourney was launched with “–stylize” and “–quality” parameters. These enhancements offered users more control over the image stylization process and improved the overall quality of the generated images. As Midjourney gained momentum, its Discord community grew rapidly and reached an impressive milestone of one million users, surpassing the popular Fortnite and Minecraft Discord servers in terms of size.

V3 marked a significant step up in image quality, particularly in lighting and shadow effects. It was the first version where Midjourney showed promise in handling real illustration tasks, expanding its utility beyond just creative exploration. This version showcased a big leap in proper lighting and reflections, making the generated images look more realistic, suitable for architectural visualisation.

Midjourney V4 – November 5, 2022

The V4 update was a major game-changer for Midjourney. It elevated the platform’s image-processing capabilities to an unprecedented level of quality, surpassing what was achievable with the newer versions of Stable Diffusion and making it the highest-quality model available at the time. V4 was capable of handling subject matter like logos and web design, showcasing the platform’s versatility and adaptability.

This was the first version to be considered actually “realistic,” with images resembling photographs and renders. While technically “nicer,” some users preferred the creativity, art style, and unique possibilities of older versions.

Midjourney V5 – March 15, 2023

Continuing the trajectory of improvement, Midjourney V5 maintains the focus on enhancing image quality and versatility, building upon the progress made in the previous versions. Just like with V4, the platform maintained its focus on aesthetics, ensuring that images produced were visually appealing and creative.

With Midjourney V5.1 released on May 3, 2023, followed by Midjourney V5.2 on June 23, 2023, Midjourney pushed the boundaries of aesthetics even further, leading to even more visually stunning results. V5.2 continued to enhance aesthetics, particularly character designs, with improved facial details and more cohesive designs. Water reflections and other intricate details, such as trees, were handled more effectively than in previous versions.

Midjourney V5.2 introduces several exciting new features, such as Zoom Out, Make Square, Shorten, Variation Mode, Stylise, and Pan. Using these new features, people have even been able to make short videos from Midjourney images. The future is going to be revolutionary!

Read: 6 Spectacular Features of Midjourney v5.2

The post One Year of Midjourney: A Psychedelic Image Generator to Realistic Photo Album appeared first on Analytics India Magazine.

Data Science Hiring Process at Naukri.com

Founded in 1997 by Sanjeev Bikhchandani, the founder of Info Edge and Ashoka University, Naukri.com is India’s largest platform for white-collar job seekers as well as recruiters.

To cater to these distinct user groups, Naukri developed state-of-the-art AI-powered vertical search and recommendation engines. Comprising a talented group of around 60 data scientists, 15 ML engineers, and 20 analytics professionals, the team operates under specialized units like Search, Recommendation, Language Model (Taxonomy), Pricing, User Acquisition, Content Generation, Job-CV Matching, and Information Extraction. Each unit is led by a Lead Data Scientist or an AVP of Data Science, guaranteeing strong leadership and focused proficiency.

“We take pride in having solved the challenges of ‘job seeker discovery’ and ‘job discovery’ for each group respectively,” Jatin Thukral, Executive Vice President & Head of Data Science at Naukri.com, told AIM.

Naukri.com has carved a unique path in the world of B2B data science and AI with its exceptional AI and Analytics team, which solves business problems by collaborating with product managers and engineering teams. The company has been investing in AI R&D since 2011, amassing over a decade of invaluable experience as a trailblazer in B2B data science and AI.

Introduced in 2022, Naukri’s ResDex is India’s largest resume database, built with heavy use of data science. Recruiters use the platform to handle over one million job mandates from a pool of eight crore applications available on Naukri.

The company is currently hiring for data scientist positions at the senior, lead, AVP, and VP levels.

The experience levels required for these roles are as follows: Data Scientist (0-4 years), Senior Data Scientist (4-8 years), Lead Data Scientist (6-10 years), AVP Data Science (8-12 years), VP Data Science (10-14 years), and SVP Data Science (10-16 years).

AI & Analytics Play at Naukri.com

Naukri.com’s data science efforts cater to both B2B and B2C needs. On the B2B front, the focus is on optimizing recruiter efficiency through an advanced CV search engine and precise candidate recommendations based on their unique journey. For B2C users, the company provides personalized job recommendations, salary-related insights, and an easy-to-navigate job search engine.

“Our approach to AI model development is unique because of its flexibility, as it avoids restricting itself to specific AI architectures. Instead, the team relies on a combination of various models to effectively address real-world challenges, fostering innovation and adaptability,” added Thukral.

One of the prime examples of their AI implementation is automatically generating job requirements tailored to recruiters’ search actions and patterns. This is made possible using multiple tiers of hierarchical agglomerative clustering models, where the first stack is based on single linkage and the subsequent stacks are based on multiple linkages.
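
Naukri hasn’t published this pipeline, but the first tier described — single-linkage agglomerative clustering — can be sketched with SciPy. The data, the two tiers, and the distance threshold below are invented purely for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D stand-ins for recruiter "search action" vectors.
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 0.1, (5, 2)),   # one behavior group
                    rng.normal(3, 0.1, (5, 2))])  # another behavior group

# First stack: single linkage (merge clusters by nearest neighbors).
labels_single = fcluster(linkage(points, method="single"),
                         t=1.0, criterion="distance")

# A subsequent stack could re-cluster with a different linkage,
# e.g. complete linkage over the same observations.
labels_complete = fcluster(linkage(points, method="complete"),
                           t=1.0, criterion="distance")

print(len(set(labels_single)), len(set(labels_complete)))  # 2 2
```

With well-separated groups like these, both linkages recover the same two clusters; the linkage choice matters when clusters are elongated or chained.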

With a dedicated ecosystem, the company focuses on developing, deploying, and monitoring AI and ML-based engines to address various business challenges and deliver superior client solutions.

To achieve this, a diverse range of AI models and techniques are employed, including deep learning models like LSTMs, Bi-Directional LSTMs, BERT, CNNs, Transformers, Attention Mechanisms, and GANs. The most commonly used classical machine learning and statistical models are Random Forests, XGBoost, Hierarchical and Divisive Agglomerative Clusterings, and Logistic Classifiers.

Through training these open-source models on proprietary data and combining them into AI architectures, the organization optimizes problem-solving capabilities and ensures effective deployment into products.

Their job recommendation engine, RecoClus, combines over 200 models to provide personalized job recommendations. It automatically extracts information from candidates’ resumes and matches them with the most suitable job openings, and it learns candidates’ preferences from their previous job applications across categories such as technical, HR, and finance. It then recommends jobs based on recent search behavior and optimises job notifications, prompting applicants to apply.

The ResDex Enterprise platform analyses recruiters’ past activity to personalise search results and surface the most relevant talent. For example, if a recruiter from Amazon acts heavily on job seekers from Flipkart, the model automatically learns this behavior and enhances the visibility of job seekers from Flipkart in that recruiter’s future searches.

Hiring Process

The hiring process for data science roles is rigorous and consists of five rounds of interviews to assess candidates’ understanding of statistics, linear algebra, classical machine learning, and deep learning. Candidates are also asked to solve real-world programming case studies live with a recruiter. The next round involves a senior Data Science Leader evaluating technical skills and data science temperament. The final round is an HR assessment for personality and cultural fit.

Candidates must have a degree (BTech/MTech/MSc/PhD) from prestigious institutions such as IIT, IISc, ISI, IIIT in Maths or Engineering. Prior experience in a top 10 AI team is advantageous.

Expectations

The ideal candidate is expected to have skills including Mathematics, Statistics (Probability, Hypothesis Testing, etc.), Linear Algebra, Classical ML, Algorithms (RF, Boosting, etc.), Testing (Performance Metrics), Loss Functions, Deep Learning (LSTMs, CNNs, etc.), Data Processing (NLP, Text Mining), DL Libraries (TensorFlow, PyTorch), LLM Frameworks (Vicuna, MPT-7B), Programming Languages (Python with a PySpark interface), and Databases (SQL, MongoDB, HDFS).

Joining Naukri’s data science team brings a fast-paced research environment, autonomy for independent projects, collaboration with smart colleagues, meritocracy, data-driven decision-making, and ample learning opportunities.

Work Culture

According to Thukral, the work culture at Naukri is young, dynamic, collegial, and fast-paced. When it comes to data science, it holds a crucial role as the core of the Naukri business and innovations. “This makes the job of a data scientist at Naukri very exciting, albeit challenging,” he added.

At all levels, data scientists fully embrace the responsibility of overseeing one or more projects from start to finish, having complete ownership of the final business outcomes. They collaborate closely with diverse stakeholders, including product managers, ML engineers, technologists, design teams, and more. Naukri aims to maintain a strong balance sheet and follows the highest level of professional ethics, along with conservative cash management.

Additionally, data science and AI play a crucial role in powering Naukri’s top-selling products, distinguishing it as one of the consistently lucrative internet enterprises in India.

By joining Naukri.com, you can focus on building a long-term career with the company, as it values stable growth over forced job hopping. Furthermore, Naukri benefits from the guidance of some of the best thought leaders in the internet industry, providing employees with an enriching and supportive environment for professional development.

Overall, Naukri.com presents an attractive opportunity for those seeking a rewarding and impactful career in the field of data science and AI, backed by a strong foundation and a thriving work culture.

Check out their openings now.

The post Data Science Hiring Process at Naukri.com appeared first on Analytics India Magazine.

Best Transformer-based LLMs on Hugging Face (Part 2)

In Part 1, we discussed how Transformers form the crux of NLP. So let’s take a look at autoregressive and sequence-to-sequence models.

Autoregressive Model

Autoregressive models are trained on the language modeling task, predicting the next word based on the context. They function as the decoder part of the original transformer model, using a mask to only consider previous words during attention. While these models can be fine-tuned for various tasks, their primary use is text generation. Let’s take a look at some of them.
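
The causal masking described above — each position attending only to itself and earlier positions — can be illustrated with a minimal NumPy sketch. The dimensions are toy values, and the learned query/key/value projections of a real model are omitted:

```python
import numpy as np

def causal_self_attention(x):
    """Scaled dot-product attention where each position may only
    attend to itself and earlier positions (a causal mask)."""
    seq_len, d = x.shape
    scores = x @ x.T / np.sqrt(d)                    # raw attention scores
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -np.inf                           # hide future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax per row
    return weights @ x, weights

x = np.random.default_rng(0).normal(size=(4, 8))
out, w = causal_self_attention(x)
# Future positions receive exactly zero attention weight:
print(np.allclose(np.triu(w, k=1), 0))  # True
```

Because masked scores are set to negative infinity before the softmax, the corresponding weights come out as exactly zero, which is what lets the model be trained to predict the next word without peeking ahead.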

GPT: Improving Language Understanding by Generative Pre-Training

In June 2018, OpenAI released GPT, the first pretrained Transformer model, which was fine-tuned on various NLP tasks and obtained state-of-the-art results. By training a language model on diverse, unlabeled text and using task-aware input transformations during fine-tuning, significant improvements were achieved across various tasks. Outperforming task-specific models, this approach demonstrates remarkable gains, such as 8.9% in commonsense reasoning, 5.7% in question answering, and 1.5% in textual entailment. It shows the potential of unsupervised (pre-)training in enhancing performance on discriminative tasks and provides insights into the effectiveness of Transformers with data containing long-range dependencies.

GPT-2: Language Models are Unsupervised Multitask Learners

An upgraded version of GPT, GPT-2 was introduced by OpenAI, pretrained on a large dataset called WebText. Given a document and questions, it performs remarkably well on the CoQA dataset without needing many training examples. Making the language model even bigger improves its performance on various tasks: GPT-2 outperformed other models on several language modeling datasets and produces more coherent text samples.

CTRL: A Conditional Transformer Language Model for Controllable Generation

In 2019, Salesforce introduced CTRL, a powerful 1.63 billion-parameter language model. CTRL used control codes that dictate style, content, and task-specific behavior. These codes are derived from natural text structure, enabling precise text generation while retaining unsupervised learning advantages. CTRL can also predict the most probable parts of training data for a sequence, offering a way to analyze vast data through model-based source attribution.

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Carnegie Mellon University and Google worked on Transformer-XL for learning long-range dependencies without disrupting temporal coherence. It incorporates segment-level recurrence and a novel positional encoding scheme, enabling it to capture dependencies 80% longer than RNNs and 450% longer than traditional Transformers. This leads to improved performance on both short and long sequences and achieves remarkable speedup during evaluation, up to 1,800+ times faster than vanilla Transformers.

Reformer: The Efficient Transformer

Also from Google Research, the Transformer-based Reformer is designed with new methods to reduce memory usage and computation time. These include axial position encoding, which handles long sequences without a huge positional encoding matrix, and LSH (locality-sensitive hashing) attention in place of traditional attention, saving computation in the attention layers. Intermediate results are not stored for each layer; instead they are recovered during the backward pass using reversible transformer layers, or recomputed as needed, saving memory. Additionally, feedforward operations are computed in smaller chunks rather than over the entire batch. As a result, this model can handle much longer sequences than traditional autoregressive transformers.
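
The core idea of LSH attention is hashing similar query vectors into the same bucket so attention is computed only within each bucket. A random-projection hash — a simplification of Reformer’s actual rotation-based scheme — conveys the intuition; the vectors and hyperplane count below are invented for illustration:

```python
import numpy as np

def lsh_buckets(vectors, n_hyperplanes=4, seed=0):
    """Random-projection LSH: vectors falling on the same side of
    every random hyperplane receive the same integer bucket id."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(vectors.shape[1], n_hyperplanes))
    bits = (vectors @ planes) > 0                    # sign pattern per vector
    return bits.astype(int) @ (1 << np.arange(n_hyperplanes))

vecs = np.array([[1.0, 0.0],     # query A
                 [0.99, 0.01],   # nearly identical to A
                 [-1.0, 0.0]])   # opposite direction
buckets = lsh_buckets(vecs)
# Opposite vectors land in different buckets, so attention between
# them would simply never be computed.
print(buckets[0] != buckets[2])  # True
```

Nearby vectors usually share a bucket while dissimilar ones rarely do, which is what lets Reformer skip most of the quadratic attention computation.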

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Google’s XLNet rearranges the sentence’s tokens and then predicts the next token using the preceding tokens. This is achieved through a masked approach, where a specific permutation of the sentence is hidden, allowing the model to understand the correct sequence. Moreover, XLNet employs Transformer-XL’s recurrence mechanism to establish connections between distant tokens for long-term dependencies. The library includes various versions of the model suitable for language modeling, token classification, sentence classification, multiple choice classification, and question answering tasks.

Sequence-to-Sequence Models

Sequence-to-sequence models combine the transformer’s encoder and decoder. They can be adapted to many tasks but are most commonly used for translation, summarization, and question answering.

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

BART by Meta is a sequence-to-sequence model for language tasks like writing, translation, and understanding. It consists of an encoder and a decoder. The encoder processes corrupted tokens, while the decoder works with the original tokens, masking future words. During pretraining, the encoder applies various transformations like masking random tokens, deleting tokens, masking spans of tokens with a single mask token, permuting sentences, and rotating the document to start at a specific token.
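
A few of these corruption transforms are easy to sketch in plain Python. These are toy, illustrative versions: BART’s real noising operates on subword tokens with tuned rates, and the masking probability and span below are chosen arbitrarily:

```python
import random

rng = random.Random(0)
tokens = "the quick brown fox jumps over the lazy dog".split()

# Token masking: each token is replaced by <mask> with some probability.
masked = ["<mask>" if rng.random() < 0.15 else t for t in tokens]

# Text infilling: an entire span collapses into a single <mask> token.
infilled = tokens[:1] + ["<mask>"] + tokens[4:]   # masks tokens 1-3

# Document rotation: the document is rotated to begin at a chosen token.
rotated = tokens[2:] + tokens[:2]

print(infilled)
# ['the', '<mask>', 'jumps', 'over', 'the', 'lazy', 'dog']
```

The decoder is then trained to reconstruct the original, uncorrupted sequence from inputs like these.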

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Following the same encoder-decoder model architecture as BART, PEGASUS employs two self-supervised objectives for pre-training: Masked Language Modeling (MLM) and Gap Sentence Generation (GSG) for summarization. In MLM, random tokens are masked and predicted by the encoder, akin to BERT. In GSG, entire encoder input sentences are masked and fed to the decoder, which has a causal mask to predict future words. Unlike BART, Pegasus’ pretraining task closely resembles summarization, where significant sentences are masked and generated together as one output sequence from the remaining sentences, resembling an extractive summary.

Marian: Fast Neural Machine Translation in C++

Microsoft developed Marian, a powerful and self-contained Neural Machine Translation system. It includes an integrated automatic differentiation engine based on dynamic computation graphs. Marian is entirely written in C++. The system’s encoder-decoder framework is designed to achieve both high training efficiency and fast translation speeds, making it a research-friendly toolkit.

T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

In T5 by Google, the traditional transformer model is modified with positional embeddings learned at each layer. It handles various NLP tasks by transforming them into text-to-text challenges using specific prefixes like “summarize:”, “question:”, “translate English to German:”, etc. The pretraining involves both supervised and self-supervised training. Supervised training uses GLUE and SuperGLUE benchmarks as downstream tasks, converted into text-to-text format. Self-supervised training involves corrupting tokens in the input sentence, randomly removing 15% of them, and replacing them with individual sentinel tokens. The encoder takes the corrupted sentence, the decoder takes the original sentence, and the target includes the dropped-out tokens delimited by their sentinel tokens.
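
The sentinel-token corruption described above can be sketched in plain Python. This is a simplified, deterministic version — real T5 selects ~15% of tokens at random — and the dropped positions here are chosen by hand for illustration:

```python
def corrupt(tokens, drop_positions):
    """Replace each dropped span with a sentinel token; the target
    lists the dropped tokens delimited by the same sentinels."""
    inp, target, sentinel = [], [], 0
    prev_dropped = False
    for i, tok in enumerate(tokens):
        if i in drop_positions:
            if not prev_dropped:                       # start of a new span
                inp.append(f"<extra_id_{sentinel}>")
                target.append(f"<extra_id_{sentinel}>")
                sentinel += 1
            target.append(tok)
            prev_dropped = True
        else:
            inp.append(tok)
            prev_dropped = False
    return " ".join(inp), " ".join(target)

inp, tgt = corrupt("Thank you for inviting me to your party".split(), {3, 7})
print(inp)  # Thank you for <extra_id_0> me to your <extra_id_1>
print(tgt)  # <extra_id_0> inviting <extra_id_1> party
```

The encoder sees the corrupted input, the decoder sees the original sentence, and the training target is exactly this sentinel-delimited list of dropped tokens.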

MBart: Multilingual Denoising Pre-training for Neural Machine Translation

Meta’s MBart has a similar structure and training goal as BART, but it stands out by being trained in 25 different languages. Its main purpose is to excel in both supervised and unsupervised machine translation tasks. MBart pioneers a novel approach by pre-training the entire sequence-to-sequence model on diverse languages, using denoising techniques on full texts.

Read more: Best Transformer-based LLMs on Hugging Face (Part 1)

The post Best Transformer-based LLMs on Hugging Face (Part 2) appeared first on Analytics India Magazine.

Meta’s Llama 2 Challenges OpenAI’s ChatGPT: A New Era in AI Development

In an intriguing turn of events, Meta has decided to open-source its large language model, Llama 2. This strategic decision not only positions Meta as a direct competitor to OpenAI's ChatGPT but also democratizes access to advanced AI tools. The implications of this decision are far-reaching, potentially fostering a new wave of innovation and experimentation in the AI community.

The announcement of Llama 2's open-sourcing came with news of support for Azure and Windows, signaling a collaboration between Meta and Microsoft. This partnership between two tech giants could potentially lead to a new era of AI development, and is a clear indication of the growing importance of AI in the tech industry and its potential for transforming various sectors.

In another exciting development, Qualcomm has joined forces with Meta to bring Llama 2 to a range of devices, including laptops, phones, and headsets, starting from 2024. This partnership aims to create AI-powered apps that operate independently of cloud services, marking a significant step towards the decentralization of AI technology. This move could potentially revolutionize the way we interact with technology, making AI an integral part of our everyday lives.

Democratizing AI and Ensuring Safety

Meta's decision to open-source Llama 2 is a testament to its commitment to democratizing AI. By providing businesses, startups, and researchers with access to advanced AI tools, Meta is fostering a culture of community-based innovation. Llama 2, which has been trained on 40% more data than its predecessor, reportedly outperforms other large language models in various tests, indicating its potential to drive significant advancements in the AI field.

The open-sourcing of Llama 2 also underscores Meta's commitment to safety and transparency in AI technology. Meta has disclosed that Llama 2 has undergone rigorous safety testing, both internally and externally. This comprehensive evaluation process ensures that the model is not only powerful but also safe to use. This focus on safety is crucial in the development of AI, as it ensures that the technology can be used responsibly and ethically.

The Future of AI with Open-Sourcing

The open-sourced Llama 2 will be available through Microsoft's Azure platform, with plans to make it accessible through AWS, Hugging Face, and other providers. This move signifies Meta's belief in the importance of an open approach to AI development, particularly in the rapidly advancing generative space.

The open-sourcing of Llama 2 is expected to have a far-reaching impact. With over 100,000 requests from researchers to use its first model, Meta's open-source Llama 2 is expected to have a much wider reach. This could potentially lead to a surge in AI research and development, as more researchers and developers gain access to this advanced tool.

A New Era in AI

Meta's decision to open-source Llama 2 marks a significant milestone in the AI landscape. As Llama 2 squares off against OpenAI's ChatGPT, the real winners will be the businesses, startups, and researchers who will gain access to more advanced AI tools. This move is likely to foster innovation and accelerate the development of AI technology, marking a new era in the story of AI. The battle between Llama 2 and ChatGPT is not just a competition between two AI models, but a testament to the rapid advancements in AI technology and its potential to transform our world.

Software developers’ dance with generative AI is still at that awkward stage

Artificial intelligence — especially generative AI — promises to reshape the roles and tasks of software developers and other IT professionals. But it's all relatively immature, and professionals are proceeding with both enthusiasm and caution.

The latest survey of 90,000 developers from Stack Overflow, released in June, finds 44% currently use AI tools in their work, with another 25% open to using AI soon. Still, they are split when it comes to trusting what AI delivers. Only 3% "highly trust" AI for their work, while 39% are lukewarm and "somewhat trust" AI. More than one in four (28%) simply don't trust AI yet.

Also: 92% of programmers are using AI tools, says GitHub developer survey

In other words, it's potentially a wonderful thing, but it's awkward.

"Today's generative AI party resembles a middle school dance more than a full-on college bash with a live band," says Luis Flynn, senior manager for AI and analytics at SAS. "Developers are rightfully proceeding with caution. Today ChatGPT users can rapidly and casually inquire about any code or syntax so they can begin prototyping applications in moments all from a tiny bit of dialogue. This type of digital push button is simultaneously impressive and scary."

As it stands, Flynn continues, "AI is a digital mirror of what humanity has learned using the internet. And it shows us humanity is inherently flawed. By blindly and hastily leveraging ChatGPT, we can misuse code or — at the very least — impose error into our workstreams."

But when responsibly vetted by seasoned developers, "the potential of generative AI is incredible," Flynn says. "Scrappy data scientists, data engineers and business analysts have mechanisms to fuel their productivity to new levels. But we're not quite there yet."

Also: How to use ChatGPT to write code

AI will help developers do their jobs better, but it is also increasingly a part of the solutions they will be building for clients or employers. Flynn has recommendations in terms of the skills IT pros should learn and emphasize to succeed in an increasingly AI-intensive world. "A profound understanding of your organizational data and where it fits into your business processes is key," he says. "If you couple data competence with ambition, resourcefulness and a curious approach to problem-solving, things will fall into place."

This also means changes in the way we work with one another. It's also notable that AI is not the province of technologists alone — many professionals from different disciplines should be involved. An AI-intensive world "requires cross-functional teams that include domain experts coupled with developers, data scientists, or business analysts who understand the power of tuning AI to a particular industry," Flynn points out. "These are the people who know how to navigate our "collective computational wisdom" but can trim the fat and train with smaller data sets tuned for the desired outcomes of a particular business in a specific industry."

Also: How to use ChatGPT to create an app

What types of roles will IT pros play as some app development and deployment becomes highly automated? "IT professionals will have various roles as app development and deployment is streamlined," says Flynn. "But there will always be someone to enforce compliance and uphold the transparency and ethical use of AI. Beyond the fears of privacy and ethical breaches, there will be a need for power user experience advocacy and design. The simplicity of ChatGPT is one of its most impressive features."

Importantly, it will be the jobs of developers and IT professionals to facilitate the democratization of AI, making it safe, useful, and accessible to all users.

Think about the implications when the metaverse came online, Flynn explained. "The barrier was getting people to buy virtual reality headsets. It's like throwing a destination wedding: If you make it hard to get to, you limit your audience. There will always be people who understand the human factors involved in any emerging technology. They'll know how to invoke time and space to fold generative AI into everyday workflows. Many of our roles in IT will stay the same, but we'll be more productive because powerful tools like generative AI will be just a click away."

How SAS can help catapult practitioners’ careers

Sponsored Post

As SAS continues to evolve at the forefront of the data revolution, it is making a significant impact on practitioners' careers and learning journeys. Let's explore the journeys of SAS users who harnessed the power of SAS to unlock new opportunities and achieve their career goals.

How SAS Helped Build Confidence in Data Science

Anja Vrečer, a determined university student, stumbled upon SAS while job seeking through the Business Club at her university. SAS software sparked her curiosity, and she then discovered SAS Skill Builder for Students to support her learning journey, which offers tutorial videos and problem-solving exercises. The self-guided program equipped her with the knowledge and skills to pass the certification exam. "Data Science certainly appeals to me, and it would definitely be a work path I will explore further," says Anja. "I believe that the SAS certification provides a perfect foundation for this." Achieving SAS certification not only boosted Anja's confidence but gave her an edge in tackling the challenges of this competitive field.

How SAS Helped Secure A Professional Career in Data Analytics

During his master's degree in Business Analytics, Jean de Villier recognized the immense value of SAS and its career potential. Encouraged by his employers, Jean pursued SAS certification and has since obtained an impressive twenty certifications over the past ten years, each contributing to his growth as a data scientist. "There is no substitute for experience, so get involved in anything that will help you apply what you learned," advises Jean. Equipped with extensive SAS knowledge, Jean now holds the prestigious position of Head of Analytics at SAS Institute in Ireland, utilizing his expertise to assist clients in solving complex problems and maximizing their analytics investments.

How SAS Helped Turn an Interest in Technology into a Successful Career

Geeta Kersellius was introduced to SAS in a biocomputing class during her time as a Master of Public Health student. Her current role as a data analyst within the Public Health Research Department at the US Air Force School of Aerospace Medicine gives her the chance to use those SAS skills she learned. Geeta thanks her mentors for inspiring and encouraging her to pursue her passions: “I think it is extremely important to identify those individuals who have been where you want to go and tap into their wisdom.” Geeta discovered SAS Certification after winning a base programming certification prep guide at a SAS users group meeting. This encouraged her to get certified and obtain her Base Programming credential, and led her to pursue an Advanced Programming certification. Ever since she learned the SAS macro language and how to use SQL code within the SAS environment, she has grown in her skillset and is able to create more efficient programs. One of the top three lessons she’s learned is to take advantage of any resources you have available to get to the next step.

By embracing the power of SAS, professionals can stay ahead in a rapidly changing industry. SAS skills and certifications offer tangible proof of your qualifications and expertise in the field of data analytics. Learn more now:

  • Why Learn SAS?
  • Why Get SAS Certified?
  • Start with free training
