TacticAI: Leveraging AI to Elevate Football Coaching and Strategy

Football, also known as soccer, stands out as one of the most widely enjoyed sports globally. Beyond the physical skills displayed on the field, it's the strategic nuances that bring depth and excitement to the game. As former German football striker Lukas Podolski famously remarked, “Football is like chess, but without the dice.”

DeepMind, known for its expertise in strategic gaming with successes in Chess and Go, has partnered with Liverpool FC to introduce TacticAI. This AI system is designed to support football coaches and strategists in refining game strategies, focusing specifically on optimizing corner kicks – a crucial aspect of football gameplay.

In this article, we'll take a closer look at TacticAI, exploring how this innovative technology is developed to enhance football coaching and strategy analysis. TacticAI utilizes geometric deep learning and graph neural networks (GNNs) as its foundational AI components. These components will be introduced before delving into the inner workings of TacticAI and its transformative impact on football strategy and beyond.

Geometric Deep Learning and Graph Neural Networks

Geometric Deep Learning (GDL) is a specialized branch of artificial intelligence (AI) and machine learning (ML) focused on learning from structured or unstructured geometric data, such as graphs and networks that have inherent spatial relationships.

Graph Neural Networks (GNNs) are neural networks designed to process graph-structured data. They excel at understanding relationships and dependencies between entities represented as nodes and edges in a graph.

GNNs leverage the graph structure to propagate information across nodes, capturing relational dependencies in the data. This approach transforms node features into compact representations, known as embeddings, which are utilized for tasks such as node classification, link prediction, and graph classification. For example, in sports analytics, GNNs take the graph representation of game states as input and learn player interactions for tasks such as outcome prediction, player valuation, identification of critical game moments, and decision analysis.
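The message-passing idea behind GNNs can be sketched in a few lines of NumPy. This is a generic illustration of a single GNN layer with mean aggregation, not TacticAI's architecture, and the weights here are random stand-ins for learned parameters:

```python
import numpy as np

# Toy graph: 4 nodes, adjacency matrix (1 = edge), 3-dim node features.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = np.random.rand(4, 3)          # node feature matrix
W = np.random.rand(3, 2)          # weight matrix (learned in practice, random here)

# One round of message passing: each node averages its neighbours'
# features (plus its own), then applies a linear map and a ReLU.
A_hat = A + np.eye(4)                       # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # normalise by node degree
H = np.maximum(D_inv @ A_hat @ X @ W, 0)    # node embeddings, shape (4, 2)

# A graph-level embedding (e.g. for graph classification) is just a pool:
graph_embedding = H.mean(axis=0)            # shape (2,)
```

Node embeddings `H` feed node-level tasks (such as predicting which player receives the ball), while the pooled `graph_embedding` feeds graph-level tasks (such as predicting whether a shot is taken).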

TacticAI Model

The TacticAI model is a deep learning system that processes player tracking data in trajectory frames to predict three aspects of corner kicks: the receiver (who is most likely to receive the ball), shot likelihood (whether a shot will be taken), and player positioning adjustments (how to position players to increase or decrease shot probability).

Here's how TacticAI is developed:

  • Data Collection: TacticAI uses a comprehensive dataset of over 9,000 corner kicks from Premier League seasons, curated from Liverpool FC's archives. The data draws on several sources: spatio-temporal trajectory frames (tracking data), event stream data (annotating game events), player profiles (heights, weights), and miscellaneous game data (stadium info, pitch dimensions).
  • Data Pre-processing: The data is aligned using game IDs and timestamps, with invalid corner kicks filtered out and missing values filled in.
  • Data Transformation: The collected data is transformed into graph structures, with players as nodes and edges representing their movements and interactions. Nodes are encoded with features such as player positions, velocities, heights, and weights; edges carry binary indicators of team membership (whether two players are teammates or opponents).
  • Data Modeling: GNNs process the data to uncover complex player relationships and predict the outputs. Using node classification, graph classification, and predictive modelling, GNNs identify receivers, predict shot probabilities, and determine optimal player positions, respectively. These outputs provide coaches with actionable insights to enhance strategic decision-making during corner kicks.
  • Generative Model Integration: TacticAI includes a generative tool that assists coaches in adjusting their game plans. It offers suggestions for slight modifications in player positioning and movements, aiming to either increase or decrease the chances of a shot being taken, depending on what's needed for the team's strategy.
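As a rough illustration of the graph encoding step described above, players can be turned into a node feature matrix plus teammate-indicator edges. The field names and values below are hypothetical, not TacticAI's actual schema:

```python
import numpy as np

# Hypothetical player records: position, velocity, physical profile, team.
players = [
    {"x": 0.90, "y": 0.50, "vx": 0.1, "vy": 0.0, "height": 1.85, "weight": 80, "team": 0},
    {"x": 0.80, "y": 0.40, "vx": 0.0, "vy": 0.2, "height": 1.78, "weight": 74, "team": 0},
    {"x": 0.85, "y": 0.55, "vx": -0.1, "vy": 0.1, "height": 1.90, "weight": 85, "team": 1},
]

# Node feature matrix: one row per player.
nodes = np.array([[p["x"], p["y"], p["vx"], p["vy"], p["height"], p["weight"]]
                  for p in players])

# Fully connected edges with a binary teammate indicator per edge.
edges, edge_feats = [], []
for i in range(len(players)):
    for j in range(len(players)):
        if i != j:
            edges.append((i, j))
            edge_feats.append(1.0 if players[i]["team"] == players[j]["team"] else 0.0)

edge_index = np.array(edges).T   # shape (2, num_edges), a common GNN convention
edge_attr = np.array(edge_feats)
```

In this form the data is ready for a GNN library: `nodes`, `edge_index`, and `edge_attr` correspond directly to the node features, connectivity, and edge features of the graph.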

Impact of TacticAI Beyond Football

The development of TacticAI, while primarily focused on football, has broader implications and potential impacts beyond football itself. Some potential future impacts are as follows:

  • Advancing AI in Sports: TacticAI could play a substantial role in advancing AI across different sports fields. It can analyze complex game events, better manage resources, and anticipate strategic moves, offering a meaningful boost to sports analytics. This can significantly improve coaching practices, performance evaluation, and player development in sports like basketball, cricket, rugby, and beyond.
  • Defense and Military AI Enhancements: Utilizing the core concepts of TacticAI, AI technologies could lead to major improvements in defense and military strategy and threat analysis. Through the simulation of different battlefield conditions, providing resource optimization insights, and forecasting potential threats, AI systems inspired by TacticAI's approach could offer crucial decision-making support, boost situational awareness, and increase the military's operational effectiveness.
  • Discoveries and Future Progress: TacticAI's development emphasizes the importance of collaboration between human insights and AI analysis. This highlights potential opportunities for collaborative advancements across different fields. As we explore AI-supported decision-making, the insights gained from TacticAI's development could serve as guidelines for future innovations. These innovations will combine advanced AI algorithms with specialized domain knowledge, helping address complex challenges and achieve strategic objectives across various sectors, expanding beyond sports and defense.

The Bottom Line

TacticAI represents a significant leap in merging AI with sports strategy, particularly in football, by refining the tactical aspects of corner kicks. Developed through a partnership between DeepMind and Liverpool FC, it exemplifies the fusion of human strategic insight with advanced AI technologies, including geometric deep learning and graph neural networks. Beyond football, TacticAI's principles have the potential to transform other sports, as well as fields like defense and military operations, by enhancing decision-making, resource optimization, and strategic planning. This pioneering approach underlines the growing importance of AI in analytical and strategic domains, promising a future where AI's role in decision support and strategic development spans across various sectors.

AI21 Labs’ new AI model can handle more context than most

By Kyle Wiggers

Increasingly, the AI industry is moving toward generative AI models with longer contexts. But models with large context windows tend to be compute-intensive. Or Dagan, product lead at AI startup AI21 Labs, asserts that this doesn’t have to be the case — and his company is releasing a generative model to prove it.

Contexts, or context windows, refer to input data (e.g. text) that a model considers before generating output (more text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger contexts avoid this pitfall — and, as an added benefit, better grasp the flow of data they take in.

AI21 Labs’ Jamba, a new text-generating and -analyzing model, can perform many of the same tasks that models like OpenAI’s ChatGPT and Google’s Gemini can. Trained on a mix of public and proprietary data, Jamba can write text in English, French, Spanish and Portuguese.

Jamba can handle up to 140,000 tokens while running on a single GPU with at least 80GB of memory (like a high-end Nvidia A100). That translates to around 105,000 words, or 210 pages — a decent-sized novel.

Meta’s Llama 2, by comparison, has a 4,096-token context window — on the smaller side by today’s standards — but only requires a GPU with ~12GB of memory in order to run. (Context windows are typically measured in tokens, which are bits of raw text and other data.)
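The word counts above follow a common rule of thumb of roughly 0.75 English words per token. A quick sketch of that conversion (the ratio is a heuristic and varies by tokenizer and language):

```python
# Back-of-the-envelope token-to-word conversion, assuming the common
# heuristic of ~0.75 English words per token.
WORDS_PER_TOKEN = 0.75   # rough rule of thumb; real ratios vary by tokenizer

def tokens_to_words(num_tokens: int) -> int:
    return int(num_tokens * WORDS_PER_TOKEN)

print(tokens_to_words(140_000))  # Jamba's window: 105000 words
```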

On its face, Jamba is unremarkable. Loads of freely available, downloadable generative AI models exist, from Databricks’ recently released DBRX to the aforementioned Llama 2.

But what makes Jamba unique is what’s under the hood. It uses a combination of two model architectures: transformers and state space models (SSMs).

Transformers are the architecture of choice for complex reasoning tasks, powering models like GPT-4 and Google’s Gemini, for example. They have several unique characteristics, but by far transformers’ defining feature is their “attention mechanism.” For every piece of input data (e.g. a sentence), transformers weigh the relevance of every other input (other sentences) and draw from them to generate the output (a new sentence).
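The attention mechanism can be sketched minimally in NumPy. This is generic scaled dot-product self-attention for illustration, not any particular model's implementation:

```python
import numpy as np

# Scaled dot-product attention: every token's output is a relevance-weighted
# mix of all tokens' values.
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax per row
    return weights @ V                                   # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))    # 4 tokens, 8-dim embeddings
out = attention(X, X, X)       # self-attention: Q = K = V = X
```

Because every token attends to every other token, the cost of this operation grows quadratically with sequence length — the main reason long context windows are compute-intensive for pure transformers.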

SSMs, on the other hand, combine several qualities of older types of AI models, such as recurrent neural networks and convolutional neural networks, to create a more computationally efficient architecture capable of handling long sequences of data.
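At their core, SSMs process a sequence with a linear recurrence, so the work per step is constant rather than growing with sequence length. A toy discrete SSM, purely illustrative:

```python
import numpy as np

# A discrete linear state space model in its simplest form:
#   h_t = A @ h_{t-1} + B @ x_t      (state update)
#   y_t = C @ h_t                    (readout)
rng = np.random.default_rng(0)
d_state, d_in = 4, 2
A = 0.9 * np.eye(d_state)                  # state transition
B = rng.normal(size=(d_state, d_in))       # input projection
C = rng.normal(size=(1, d_state))          # output projection

xs = rng.normal(size=(16, d_in))           # a sequence of 16 inputs
h = np.zeros(d_state)
ys = []
for x in xs:
    h = A @ h + B @ x                      # constant work per step
    ys.append((C @ h).item())
```

Unlike attention, the state `h` carries all past information forward in fixed size, which is what makes long sequences cheap; the trade-off is that the model must compress history into that state rather than looking back at every token directly.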

Now, SSMs have their limitations. But some of the early incarnations, including an open source model called Mamba from Princeton and Carnegie Mellon researchers, can handle larger inputs than their transformer-based equivalents while outperforming them on language generation tasks.

Jamba in fact uses Mamba as part of the core model — and Dagan claims it delivers three times the throughput on long contexts compared to transformer-based models of comparable sizes.

“While there are a few initial academic examples of SSM models, this is the first commercial-grade, production-scale model,” Dagan said in an interview with TechCrunch. “This architecture, in addition to being innovative and interesting for further research by the community, opens up great efficiency and throughput possibilities.”

Now, while Jamba has been released under the Apache 2.0 license, an open source license with relatively few usage restrictions, Dagan stresses that it’s a research release not intended to be used commercially. The model doesn’t have safeguards to prevent it from generating toxic text or mitigations to address potential bias; a fine-tuned, ostensibly “safer” version will be made available in the coming weeks.

But Dagan asserts that Jamba demonstrates the promise of the SSM architecture even at this early stage.

“The added value of this model, both because of its size and its innovative architecture, is that it can be easily fitted onto a single GPU,” he said. “We believe performance will further improve as Mamba gets additional tweaks.”

Google.org launches $20M generative AI accelerator program

By Kyle Wiggers

Google.org, Google’s charitable wing, is launching a new program to help fund nonprofits developing tech that leverages generative AI.

Called Google.org Accelerator: Generative AI, the program is to be funded by $20 million in grants and include 21 nonprofits to start, including Quill.org, a company creating AI-powered tools for student writing feedback, and World Bank, which is building a generative AI app to make development research more accessible.

In addition to funding, nonprofits in the six-week accelerator program will get access to technical training, workshops, mentors and guidance from an “AI coach.” And, through Google.org’s fellowship program, teams of Google employees will work with three of the nonprofits — Tarjimly, Benefits Data Trust and mRelief — full-time for up to six months to help launch their proposed generative AI tools.

Tarjimly aims to use AI to translate languages for refugees, while Benefits Data Trust is tapping AI to create assistants that support caseworkers in helping low-income applicants enroll in public benefits. mRelief, meanwhile, is designing a tool to streamline the U.S. SNAP benefits application process.

“Generative AI can help social impact teams be more productive, creative and effective in serving their communities,” Annie Lewin, director of global advocacy at Google.org, said in a blog post. “Google.org funding recipients report that AI helps them achieve their goals in one third of the time at nearly half the cost.”

According to a PwrdBy survey, 73% of nonprofits believe AI innovation aligns with their missions and 75% believe AI makes their lives easier, particularly in areas like donor categorization, routine back-office tasks and “mission-driven” initiatives. But there remain significant barriers for nonprofits looking to build their own AI solutions or adopt third-party products — chiefly cost, resources and time.

In the blog post, Lewin cites a Google.org survey that similarly found that, while four in five nonprofits think generative AI may be applicable to their work, nearly half currently aren’t using the tech as a result of a range of internal and external roadblocks. “[These nonprofits] cite a lack of tools, awareness, training and funding as the biggest barriers to adoption,” she said.

Encouragingly, the number of nonprofit AI-focused startups is beginning to tick up.

Nonprofit accelerator Fast Forward said that this year, more than a third of applicants for its latest class were AI companies. And Crunchbase reports that, more broadly, dozens of nonprofit organizations across the globe are dedicating work to ethical approaches to AI, like AI ethics lab AlgorithmWatch, virtual reading clinic JoyEducation, and conservation advocacy group Earth05.

The 7 Best AI Tools for Data Science Workflow

Image from DALL·E 3

It is now evident that those who adopt AI quickly will lead the way, while those who resist change will be replaced by those who are already using AI. Artificial intelligence is no longer just a passing fad; it is becoming an essential tool in various industries, including data science. Developers and researchers are increasingly using AI-powered tools to simplify their workflows, and one such tool that has gained immense popularity recently is ChatGPT.

In this blog, I will discuss the 7 best AI tools that have made my life as a data scientist easier. These tools are indispensable in my daily tasks, such as writing tutorials, researching, coding, analyzing data, and performing machine learning tasks. By sharing these tools, I hope to help fellow data scientists and researchers streamline their workflows and stay ahead of the curve in the ever-evolving field of AI.

1. PandasAI: Conversational Data Analysis

Every data professional is familiar with pandas, a Python package used for data manipulation and analysis. But what if I told you that instead of writing code, you can analyze and generate data visualizations by simply typing a prompt or a question? That's what PandasAI does — it's like an AI Agent for your Python workflow that automates data analysis using various AI models. You can even use locally run models.

In the code below, we have created an agent using the pandas dataframe and OpenAI model. This agent can perform various tasks on your dataframe using natural language. We asked it a simple question and then requested an explanation of how it arrived at the results.

import os

import pandas as pd
from pandasai.llm import OpenAI
from pandasai import Agent

sales_by_country = pd.DataFrame(
    {
        "country": [
            "United States",
            "United Kingdom",
            "France",
            "Germany",
            "Italy",
            "Spain",
            "Canada",
            "Australia",
            "Japan",
            "China",
        ],
        "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    }
)

llm = OpenAI(api_token=os.environ["OPENAI_API_KEY"])
pandas_ai_df = Agent(sales_by_country, config={"llm": llm})

response = pandas_ai_df.chat("Which are the top 5 countries by sales?")
explanation = pandas_ai_df.explain()

print("Answer:", response)
print("Explanation:", explanation)

The results are amazing. Doing this manually with my real-life data would have taken at least half an hour.

Answer: The top 5 countries by sales are: China, United States, Japan, Germany, United Kingdom
Explanation: I looked at the data we have and found a way to sort it based on sales. Then, I picked the top 5 countries with the highest sales numbers. Finally, I put those countries into a list and created a sentence to show them as the top 5 countries by sales.

2. GitHub Copilot: Your AI Code Assistant

GitHub Copilot is now a necessity if you are a full-time developer or work with code every day. Why? It enhances your ability to write clean, effective code faster. You can even chat with your file to debug faster or generate context-aware code.


GitHub Copilot includes an AI chatbot, an inline chat box, code generation, autocomplete, CLI autocompletion, and other GitHub-based features that help with code search and understanding.

GitHub Copilot is a paid tool, so if you don't want to pay $10/month, you should check out Top 5 AI Coding Assistants You Must Try.

3. ChatGPT: Chat Application Powered by GPT-4

ChatGPT has been dominating the AI space for 2 years now. People use it for writing emails, generating content, code generation, and all kinds of routine work-related tasks.


If you pay for a subscription, you get access to the state-of-the-art model GPT-4, which is excellent at solving complex problems.

I use it daily for code generation, for code explanation, for asking general questions, and for content generation. The work generated by AI is not always perfect. You may need to make some edits to present it to a wider audience.

ChatGPT is an essential tool for data scientists. Using it is not cheating. Instead, it saves you time that would otherwise be spent researching and hunting for solutions.

If you value privacy, consider running open source AI models on your laptop. Check out 5 Ways To Use LLMs On Your Laptop.

4. Colab AI: AI Powered Cloud Notebook

If you have ever trained a deep neural network for a complex machine learning task, chances are you did it on Google Colab, thanks to its freely accessible GPUs and TPUs. With the surge in generative AI, Google Colab has recently introduced features that help you generate code, debug faster, and autocomplete.


Colab AI is an integrated AI coding assistant in your workspace. You can generate code by simply prompting and asking follow-up questions. It also offers inline code prompting, though this is limited in the free version.

I would highly recommend getting the paid version as it provides better GPUs and an overall better coding experience.

Discover the Top 11 AI Coding Assistants for 2024 and try out all alternatives to Colab AI to find the best fit for you.

5. Perplexity AI: Smart Search Engine

I have been using Perplexity AI as my new search engine and research assistant. It helps me learn about new technologies and concepts by providing concise and up-to-date summaries with links to relevant blogs and videos. I can even ask follow-up questions and get a modified answer.


Perplexity AI offers various features to assist its users. It can answer a wide range of questions, from basic facts to complex queries, using the latest sources. Its Copilot feature allows users to explore their topics in-depth, enabling them to expand their knowledge and discover new areas of interest. Furthermore, users can organize their search results into "Collections" based on projects or topics, making it easier to find what they need in the future.

Check out 8 AI-powered search engines that can enhance your internet searching and research capabilities as an alternative to Google.

6. Grammarly: AI Writing Assistance

I want to let you know that Grammarly is an exceptional tool for individuals with dyslexia. It helps me write content quickly and accurately. I have been using Grammarly for almost 9 years now, and I love the features that correct my spelling, grammar, and the overall structure of my writing. Recently, they introduced Grammarly AI, which allows me to improve my writing with the help of generative AI models. This tool has made my life easier as I can now write better emails, direct messages, content, tutorials, and reports. It is a vital tool for me, much like Canva.

7. Hugging Face: Building the Future of AI

Hugging Face is not just a tool, but an entire ecosystem that has become an essential part of my daily work life. I use it to access datasets, models, machine learning demos, and APIs for AI models. Additionally, I rely on various Hugging Face Python packages for training, fine-tuning, evaluating, and deploying machine learning models.


Hugging Face is an open-source platform that's free for the community and allows people to host datasets, models, and AI demos. It even lets you deploy your models for inference and run them on GPUs. In the next few years, it's likely to become the primary platform for data discussions, research and development, and operations.

Discover the top 10 data science tools to use in 2024 and become a super data scientist, solving data problems better than anyone.

Conclusion

I have been using Travis, an AI-powered tutor, to conduct research on advanced topics such as MLOps, LLMOps, and data engineering. It provides simple explanations about these topics and you can ask follow-up questions just like with any chatbot. It's perfect for those who only want search results from top publications on Medium.

In this blog, we have explored 7 powerful AI tools that can significantly enhance the productivity and efficiency of data scientists and researchers — from conversational data analysis with PandasAI to code generation and debugging assistance with GitHub Copilot and Colab AI, offering game-changing capabilities to simplify complex code related tasks and save valuable time. ChatGPT's versatility allows for content generation, code explanation, and problem-solving, while Perplexity AI provides a smart search engine and research assistant. Grammarly AI offers invaluable writing assistance, and Hugging Face serves as a comprehensive ecosystem for accessing datasets, models, and APIs to develop and deploy machine learning solutions.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.

More On This Topic

  • RAPIDS cuDF to Speed up Your Next Data Science Workflow
  • 7 GPTs to Help Improve Your Data Science Workflow
  • Streamline Your Machine Learning Workflow with Scikit-learn Pipelines
  • Easily Integrate LLMs into Your Scikit-learn Workflow with Scikit-LLM
  • The Seven Best ELT Tools for Data Warehouses
  • A List of 7 Best Data Modeling Tools for 2023

Data Science Hiring Process at Confluent

In September last year, Confluent, a leading provider of data streaming solutions, introduced ‘Data Streaming for AI’, a new initiative to speed up organisations’ creation of real-time AI applications. More recently, the company announced the general availability of Confluent Cloud for Apache Flink. This fully-managed service enables customers to process data in real-time and create high-quality, reusable data streams.

Behind all the innovations in this space is Confluent’s strong and resilient AI and analytics team. “Building a truly data-driven culture is one of the top priorities for Confluent’s Data team. A critical part of achieving that is applying data science to address real-world requirements in business operations,” Ravi Kiran Yanamandra, manager of data science for product and growth marketing at Confluent, told AIM.

Yanamandra, along with Karthik Nair, director of international talent acquisition at Confluent, took us through the company’s AI applications, hiring process, skills needed, and work culture.

The company is seeking data scientists and engineers to further bolster its tech team.

Inside Confluent’s Data Science Wing

The data team at Confluent is structured into sub-teams specialising in data engineering, data science, and business intelligence.

The organisation has leveraged data science in experimentation to inform product decisions and optimise marketing investments across multiple channels. This involves a multi-channel attribution model, while machine learning forecasting models improve the predictability of critical business KPIs, enabling more precise planning.

In terms of implementation, Confluent’s data science team uses a combination of online and offline machine learning models to support various aspects of the business. For example, online algorithms are deployed to evaluate the quality of leads or new signups in real time, allowing for immediate actions based on the insights generated.

Furthermore, offline models are operationalised to assist business partners in making informed decisions, such as guiding marketing spend decisions through the marketing channel attribution model and providing predictive insights into future performance through revenue forecasting models.
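As a generic illustration of the online scoring idea described above (this is a synthetic sketch, not Confluent's actual model, features, or data), an online lead scorer might update a logistic regression one signup at a time:

```python
import numpy as np

# Online logistic regression via stochastic gradient descent: the model
# updates incrementally on each incoming signup, so scores are always
# based on the latest data. Features and labels here are synthetic.
rng = np.random.default_rng(0)
w = np.zeros(3)                # weights for 3 made-up features

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for _ in range(500):           # stream of signups, one at a time
    x = rng.normal(size=3)     # e.g. usage, company size, engagement
    y = 1.0 if x[0] + 0.5 * x[1] > 0 else 0.0   # synthetic "good lead" label
    p = sigmoid(w @ x)
    w += lr * (y - p) * x      # one online gradient step per observation

score = sigmoid(w @ np.array([1.2, 0.3, -0.1]))  # score a brand-new signup
```

The offline models described above would instead be retrained in batch on historical data and used for periodic decisions such as attribution and revenue forecasting.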

“While still in the experimental phase, we are actively exploring the potential of generative AI as a productivity tool,” said Yanamandra, highlighting that the initial applications include enhancing internal search capabilities and evaluating the quality of support provided through channels like chatbots and email communications.

Moreover, through Confluent's comprehensive data streaming platform, organisations can stream, connect, process, and manage data in real time, creating innovative solutions previously considered unattainable. By integrating generative AI with data streaming, organisations can swiftly address inquiries with up-to-date and comprehensive information.

In addition to leveraging existing technologies, the team also builds its own models using proprietary data assets to address specific business challenges, said Yanamandra. These models, such as consumption forecasting and lead scoring, are tailored to Confluent’s unique needs, further enhancing its competitive advantage in the market.

The team predominantly uses SQL and Python for modelling and analysis, supported by tools like dbt, Docker, and VertexAI for data pipeline management and production model deployment. Tableau is the primary platform for visualisation and reporting, enabling stakeholders to gain actionable insights from the data effectively.

Interview Process

“The hiring process for our data team focuses on assessing candidates in three areas – technical, analytical, and soft skills,” commented Yanamandra.

Candidates are evaluated based on their proficiency in Python and SQL, experience in ML algorithms and data modelling, and familiarity with A/B testing and data visualisation. Analytical skills are assessed through problem-solving abilities and structured thinking, while soft skills, such as business understanding and communication, are also crucial.

The interview process begins with technical screening focussed on SQL and Python, followed by a real-world business scenario assessment. For data science roles, there’s an additional stage dedicated to statistics knowledge and machine learning abilities. The final interview with the hiring manager evaluates project delivery experience, technical leadership, motivations, and cultural fit.

Expectations

When joining Confluent’s data science team, new members can expect to actively engage with business partners, focusing on solving their specific business challenges. Successful candidates join as subject matter experts for the company’s data tools and technologies, and training is provided to deepen their understanding of the relevant business domain.

New joiners can expect to work in a “highly collaborative, innovation-driven, and fast-paced environment on the data science team. We move quickly and prioritise translating data insights into tangible business impact”, Yanamandra added.

“Another unique aspect is that candidates are exposed to diverse domains, offering opportunities to collaborate across functions such as marketing, sales, finance, and product analytics,” Nair told AIM.

Mistakes to Avoid

While interviewing candidates, Yanamandra has noticed a common pattern. Candidates often assume that proficiency in technical skills such as SQL, Python, or machine learning is the sole criterion evaluated for data science roles.

However, while these skills are definitely crucial, Confluent equally prioritises problem-solving abilities and the capacity to apply data science concepts to practical business scenarios.

Work Culture

Confluent strives to prioritise hiring individuals who demonstrate empathy and interact well with others, fostering a collaborative and inclusive environment. “As a rapidly growing company, our employees are self-motivated and driven to seize market opportunities. We follow unified decision-making and communication with an open, hierarchy-free structure,” Nair told AIM.

Nair stated that the company also offers flexibility through a ‘remote-first’ policy, allowing employees to work from various locations. Alongside competitive benefits, it ensures each employee’s contributions are recognised and valued.

“Our team thrives on a culture of intellectual curiosity and innovation, where individuals will be encouraged to push the boundaries of what’s possible,” said Nair. The company strives to build an equal, diverse and inclusive work culture.

“We’re a high-growth company with incredible results and achievements, yet only scratching the surface of our potential impact in a rapidly growing market. Joining us promises an exhilarating journey of growth,” concluded Nair.

If you think you are the right fit for Confluent, apply here.

The post Data Science Hiring Process at Confluent appeared first on Analytics India Magazine.

Metaview’s tool records interview notes so that hiring managers don’t have to

By Kyle Wiggers

Siadhal Magos and Shahriar Tajbakhsh were working at Uber and Palantir, respectively, when they both came to the realization that hiring — particularly the process of interviewing — was becoming unwieldy for many corporate HR departments.

“It was clear to us that the most important part of the hiring process is the interviews, but also the most opaque and unreliable part,” Magos told TechCrunch. “On top of this, there’s a bunch of toil associated with taking notes and writing up feedback that many interviewers and hiring managers do everything they can to avoid.”

Magos and Tajbakhsh thought that the hiring process was ripe for disruption, but they wanted to avoid abstracting away too much of the human element. So they launched Metaview, an AI-powered note-taking app for recruiters and hiring managers that records, analyzes and summarizes job interviews.

“Metaview is an AI note-taker built specifically for the hiring process,” Magos said. “It helps recruiters and hiring managers focus more on getting to know candidates and less on extracting data from the conversations. As a consequence, recruiters and hiring managers save a ton of time writing up notes and are more present during interviews because they’re not having to multitask.”

Metaview integrates with apps, phone systems, videoconferencing platforms and tools like Calendly and GoodTime to automatically capture the content of interviews. Magos says the platform “accounts for the nuances of recruiting conversations” and “enriches itself with data from other sources,” such as applicant tracking systems, to highlight the most relevant moments.

“Zoom, Microsoft Teams and Google Meet all have transcription built in, which is a possible alternative to Metaview,” Magos said. “But the information that Metaview’s AI pulls out from interviews is far more relevant to the recruiting use case than generic alternatives, and we also assist users with the next steps in their recruiting workflows in and around these conversations.”

Image Credits: Metaview

Certainly, there’s plenty wrong with traditional job interviewing, and a note-taking and conversation-analyzing app like Metaview could help, at least in theory. As a piece in Psychology Today notes, the human brain is rife with biases that hinder our judgement and decision making, for example a tendency to rely too heavily on the first piece of information offered and to interpret information in a way that confirms our preexisting beliefs.

The question is, does Metaview work — and, more importantly, work equally well for all users?

Even the best AI-powered speech dictation systems suffer from their own biases. A Stanford study showed that error rates for Black speakers on speech-to-text services from Amazon, Apple, Google, IBM and Microsoft are nearly double those for white speakers. Another, more recent study published in the journal Computer Speech and Language found statistically significant differences in the way two leading speech recognition models treated speakers of different genders, ages and accents.

There’s also hallucination to consider. AI makes mistakes summarizing, including in meeting summaries. In a recent story, The Wall Street Journal cited an instance where, for one early adopter using Microsoft’s AI Copilot tool for summarizing meetings, Copilot invented attendees and implied calls were about subjects that were never discussed.

When asked what steps Metaview has taken, if any, to mitigate bias and other algorithmic issues, Magos claimed that Metaview’s training data is diverse enough to yield models that “surpass human performance” on recruitment workflows and perform well on popular benchmarks for bias.

I’m skeptical and a bit wary, too, of Metaview’s approach to how it handles speech data. Magos says that Metaview stores conversation data for two years by default unless users request that the data be deleted. That seems like an exceptionally long time, and candidates might well agree.

But none of this appears to have affected Metaview’s ability to get funding or customers.

Metaview this month raised $7 million from investors including Plural, Coelius Capital and Vertex Ventures, bringing the London-based startup’s total raised to $14 million. Metaview’s client count stands at 500 companies, Magos says, including Brex, Quora, Pleo and Improbable — and it’s grown 2,000% year-over-year.

“The money will be used to grow the product and engineering team primarily, and give more fuel to our sales and marketing efforts,” Magos said. “We will triple the product and engineering team, further fine-tune our conversation synthesis engine so our AI is automatically extracting exactly the right information our customers need and develop systems to proactively detect issues like inconsistencies in the interview process and candidates that appear to be losing interest.”

MongoDB has Over 3,000 Customers in India and Growing

MongoDB loves India. Boris Bialek, the field CTO of MongoDB, was in Bengaluru earlier this month and in an exclusive interaction with AIM, he said: “India’s market momentum is tremendous. The growth is huge, and we have over 3,000 customers here. We experience millions of downloads every month and have an exceptionally active developer community here.”

In India, MongoDB, the king of the NoSQL database, serves unicorns, smaller startups, and digital native companies, including those specialising in generative AI and digital transformation initiatives. Some of its notable customers include Zomato, Tata Digital, Canara HSBC Life Insurance, Tata AIG, and Devnagri.

India cares about data security like no other. With data and AI sovereignty on the rise, there’s a demand for keeping data within a controlled environment, either on-premises or in a cloud, but not over public APIs. This is where MongoDB comes in—it acts as a bridge, integrating various components into a cohesive system.

“Our approach emphasises simplicity, transparency, and trust. We make things clear around how vectors are used and provide transparency in data usage. This level of clarity addresses concerns about system trustworthiness and what is being built,” he added.

Other NoSQL databases like Redis and Apache Cassandra are also widely used by Indian developers. Redis sees over 12 million daily downloads and derives 60% of its revenue from national database projects. Apache Cassandra holds a strong 14.95% market share, with companies like Infosys, Fujitsu, and Panasonic using it. Amazon DynamoDB holds a market share of approximately 9.5%.

Focuses on Real-time

“We no longer see ourselves just as a NoSQL database. We’re part of a larger picture, providing services for banking transactions, multi-document handling, search capabilities, and integrating edge components for global manufacturing lines,” said Bialek.

This is where the concept of a ‘developer data platform’ emerges, and it’s a significant change the team has observed over recent years. It’s about accelerating integration without the need to maintain multiple disconnected systems.

“We’re discussing real-time data. Everything today demands immediacy, whether it’s a UPI transaction that needs decisions in milliseconds or providing individualised, real-time data like online stock trades,” he added. Behind this is a vast amount of JSON data, which is what the JSON document model in MongoDB is about.

Several companies achieved remarkable improvements in efficiency and customer satisfaction through MongoDB solutions. Bialek spoke about an Italian energy company that was able to slash its service desk response time from a day to two minutes!

Meanwhile, a German company pioneered chatbot systems utilising self-trained LLMs for tailored client interactions, such as rapid insurance claim processing. In the retail sector, data analysis led to a 50% reduction in return shipments for an online shoe retailer by suggesting optimal shoe sizes, demonstrating the mutual benefits of predictive services for customers and businesses.

Generative AI Struggles

During recent customer interactions, Bialek noted several challenges regarding speeding up and implementing new technology like generative AI.

“No matter if the company is small or large, the main challenge is accelerating business processes. Developers are in high demand, so the focus is on how fast we can create new platforms like payment systems and integrate new technologies like UPI LITE with limited help. We’re collaborating with major payment providers to address these challenges,” Bialek added.

Secondly, customers are experiencing a stark difference between companies that are aggressively implementing generative AI and have clear directions, vis à vis others that are still figuring it out. So, MongoDB assists them in understanding and applying use cases.

MongoDB to the Rescue

In November last year, the company added new features to MongoDB Atlas Vector Search that offer several benefits for generative AI application development.

“The feedback has been overwhelmingly positive, citing its ease of use. Some clients have even built solutions in half a day, a task that would have previously taken six months and a consultant,” Bialek commented.

Furthermore, LLMs require storage for vectors, and MongoDB simplifies access to these. The emphasis is on delivering a holistic solution by integrating data layers, applications, and underlying technologies.

In January of this year, MongoDB partnered with California-based Patronus AI to provide automated LLM evaluation and testing for business clients. This partnership merges Patronus AI’s functions with MongoDB’s Atlas Vector Search tool, creating a retrieval system solution for dependable document-based LLM workflows.

Customers can build these systems through MongoDB Atlas and utilise Patronus AI for evaluation, testing, and monitoring, enhancing accuracy and reliability.

“I would say that MongoDB is a bit like the glue in the system, bringing everything together in a straightforward manner,” said Bialek.

What’s Next?

Going forward, Bialek noted that the primary goal is to focus on improving user-friendliness through simplicity, real-time data processing, and application development.

“We are releasing a new version of our product this year, as we do annually. We’re currently showcasing our Atlas Streams in public preview, which is crucial for real-time streams, processing, and vector search integration. There’s a lot more in development, too,” concluded Bialek.

Read more: Is MongoDB Vector Search the Panacea for all LLM Problems?


Mastering Python for Data Science: Beyond the Basics

Mastering Python for Data Science: Beyond the Basics
Image from Freepik

Python reigns supreme in the data science world, yet many aspiring (and even veteran) data scientists only scratch the surface of its true capabilities. To truly master data analysis with Python, you must venture beyond the basics and use advanced techniques tailored for efficient data manipulation, parallel processing, and leveraging specialized libraries.

The large, complex datasets and computationally intensive tasks that you’ll run into demand more than entry-level Python skills.

This article serves as a detailed guide aimed at enhancing your Python skills. We'll delve into techniques for speeding up your code, using Python with large data sets, and turning models into web services. Throughout, we'll explore ways to handle complex data problems effectively.

Mastering Advanced Python Techniques for Data Science

Mastering advanced Python techniques for data science is essential in the current job market. Most companies require data scientists who have a knack for Python and for web frameworks built on it, such as Django and Flask.

These frameworks streamline the inclusion of key security features, especially in adjacent niches such as running PCI-compliant hosting, building a SaaS product for digital payments, or even accepting payments on a website.

So, what about practical steps? Here are some of the techniques you can start mastering now:

Efficient Data Manipulation with Pandas

Efficient data manipulation with Pandas revolves around leveraging its powerful DataFrame and Series objects for handling and analyzing data.

Pandas excels in tasks like filtering, grouping, and merging datasets, allowing for intricate data manipulation operations with minimal code. Its indexing functionality, including multi-level indexing, enables quick data retrieval and slicing, making it ideal for working with large datasets.

Additionally, Pandas' integration with other data analysis and visualization libraries in the Python ecosystem, such as NumPy and Matplotlib, further enhances its capability for efficient data analysis.
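To make this concrete, here is a minimal sketch that chains filtering, grouping, and merging on a small, made-up sales table (all column names and figures are invented for illustration):

```python
import pandas as pd

# Hypothetical sales data for illustration
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "product": ["A", "A", "B", "B"],
    "revenue": [100, 150, 200, 50],
})
targets = pd.DataFrame({"region": ["North", "South"], "target": [250, 250]})

# Filter, group, and aggregate with minimal code
by_region = (
    sales[sales["revenue"] > 60]           # filtering
    .groupby("region", as_index=False)     # grouping
    .agg(total=("revenue", "sum"))         # aggregation
)

# Merge the aggregate with the targets table
report = by_region.merge(targets, on="region")
report["hit_target"] = report["total"] >= report["target"]
print(report)
```

The same pipeline written with plain Python dictionaries and loops would take several times the code; method chaining keeps each transformation step readable and testable.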

These functionalities make Pandas an indispensable tool in the data science toolkit. So, even though Python is an extremely common language, you shouldn’t view this as a drawback. It is as versatile as it is ubiquitous — and mastery of Python allows you to do everything from statistical analysis, data cleaning, and visualization to more “niche” things like using VAPT (vulnerability assessment and penetration testing) tools and even natural language processing applications.

High-Performance Computing with NumPy

NumPy significantly enhances Python's capability for high-performance computing, especially through its support for large, multi-dimensional arrays and matrices. It achieves this by providing a comprehensive array of mathematical functions designed for efficient operations on these data structures.

One of the key features of NumPy is its implementation in C, which allows for rapid execution of complex mathematical computations using vectorized operations. This results in a notable performance improvement compared to using Python's native data structures and loops for similar tasks. For instance, tasks like matrix multiplication, which are common in many scientific computations, can be executed swiftly using functions like np.dot().
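As a rough sketch of the difference, compare np.dot() with an equivalent pure-Python triple loop (the matrix sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((40, 30))
b = rng.standard_normal((30, 20))

# Vectorized: a single call that runs in optimized C
fast = np.dot(a, b)  # equivalently, a @ b

# Pure-Python loops computing the same product, far more slowly
def slow_matmul(x, y):
    rows, inner, cols = x.shape[0], x.shape[1], y.shape[1]
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                out[i, j] += x[i, k] * y[k, j]
    return out

# The results agree up to floating-point rounding
print(np.allclose(slow_matmul(a, b), fast))  # prints True
```

On realistically sized matrices the vectorized version is typically orders of magnitude faster, since the looping happens inside compiled C rather than the Python interpreter.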

Data scientists can use NumPy's efficient handling of arrays and powerful computational capabilities to achieve significant speedups in their Python code, making it viable for applications requiring high levels of numerical computation.

Enhancing Performance Through Multiprocessing

Enhancing performance through multiprocessing in Python involves using the ‘multiprocessing’ module to run tasks in parallel across multiple CPU cores instead of sequentially on a single core.

This is particularly advantageous for CPU-bound tasks that require significant computational resources, as it allows for the division and concurrent execution of tasks, thereby reducing the overall execution time. The basic usage involves creating ‘Process’ objects and specifying the target function to execute in parallel.

Additionally, the ‘Pool’ class can be used to manage multiple worker processes and distribute tasks among them, which abstracts much of the manual process management. Inter-process communication mechanisms like ‘Queue’ and ‘Pipe’ facilitate the exchange of data between processes, while synchronization primitives such as ‘Lock’ and ‘Semaphore’ ensure that processes do not interfere with each other when accessing shared resources.
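A minimal sketch of the ‘Pool’ pattern described above (the worker function here is an arbitrary stand-in for a CPU-bound task):

```python
from multiprocessing import Pool

def cpu_bound(n):
    """A stand-in for a CPU-heavy task: sum of squares below n."""
    return sum(i * i for i in range(n))

def run_parallel(inputs, workers=2):
    # Pool distributes the inputs across worker processes and
    # collects the results in input order
    with Pool(processes=workers) as pool:
        return pool.map(cpu_bound, inputs)

if __name__ == "__main__":
    print(run_parallel([10_000, 20_000, 30_000]))
```

Note the `if __name__ == "__main__":` guard: on platforms that spawn rather than fork new processes (such as Windows), it is required to prevent workers from re-executing the module’s top-level code.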

To further enhance code execution, techniques like JIT compilation with libraries such as Numba can significantly speed up Python code by dynamically compiling parts of the code at runtime.

Leveraging Niche Libraries for Elevated Data Analysis

Using specific Python libraries for data analysis can significantly boost your work. For instance, Pandas is perfect for organizing and manipulating data, while PyTorch offers advanced deep-learning capabilities with GPU support.

On the other hand, Plotly and Seaborn can help make your data more understandable and engaging when creating visualizations. For more computationally demanding tasks, libraries like LightGBM and XGBoost offer efficient implementations of gradient-boosting algorithms that handle large datasets with high dimensionality.

Each of these libraries specializes in different aspects of data analysis and machine learning, making them valuable tools for any data scientist.

Data Visualization Techniques

Data visualization in Python has advanced significantly, offering a wide array of techniques for showcasing data in meaningful and engaging ways.

Advanced data visualization not only enhances the interpretation of data but also aids in uncovering underlying patterns, trends, and correlations that might not be evident through traditional methods.

Mastering what you can do with Python individually is indispensable — but having an overview of how a Python platform can be utilized to the fullest extent in an enterprise setting is a point that is sure to set you apart from other data scientists.

Here are some advanced techniques to consider:

  • Interactive visualizations. Libraries like Bokeh and Plotly allow for creating dynamic plots that users can interact with, such as zooming in on specific areas or hovering over data points to see more information. This interactivity can make complex data more accessible and understandable.
  • Complex chart types. Beyond basic line and bar charts, Python supports advanced chart types like heat maps, box plots, violin plots, and even more specialized plots like raincloud plots. Each chart type serves a specific purpose and can help highlight different aspects of the data, from distributions and correlations to comparisons between groups.
  • Customization with matplotlib. Matplotlib offers extensive customization options, allowing for precise control over the appearance of plots. Techniques like adjusting plot parameters with plt.getp and plt.setp functions or manipulating the properties of plot components enable the creation of publication-quality figures that convey your data in the best light possible.
  • Time series visualization. For temporal data, time series plots can effectively display values over time, helping to identify trends, patterns, or anomalies across different periods. Libraries like Seaborn make creating and customizing time series plots straightforward, enhancing the analysis of time-based data.
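As one small example of the customization point above, the snippet below styles an existing line with plt.setp and renders off-screen (the data is synthetic, and the file name is arbitrary):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Synthetic daily values for illustration
days = np.arange(30)
values = np.cumsum(np.random.default_rng(1).standard_normal(30))

fig, ax = plt.subplots()
(line,) = ax.plot(days, values, label="metric")

# plt.setp adjusts properties of existing plot components in bulk
plt.setp(line, linewidth=2.5, linestyle="--", color="tab:blue")
ax.set_xlabel("Day")
ax.set_ylabel("Value")
ax.legend()
fig.savefig("trend.png", dpi=150)
```

The complementary plt.getp function lists or reads the current properties of any artist, which is handy when you are not sure which settings a component exposes.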

Visualization Tools for Data Science

Alongside the libraries above, a number of standalone visualization tools deserve a place in the data science toolkit.

Different solutions are suited for different purposes — from creating simple line charts to complex interactive dashboards and everything in between. Here are some of the popular ones:

  1. Infogram stands out for its user-friendly interface and diverse template library, catering to a wide range of industries, including media, marketing, education, and government. It offers a free basic account and various pricing plans for more advanced features.
  2. FusionCharts allows for the creation of over 100 different types of interactive charts and maps, designed for both web and mobile projects. It supports customization and offers various exporting options.
  3. Plotly offers a simple syntax and multiple interactivity options, suitable even for those with no technical background, thanks to its GUI. However, its community version does have limitations, such as visualizations being public and a limited number of aesthetics.
  4. RAWGraphs is an open-source framework emphasizing no-code, drag-and-drop data visualization, making complex data visually easy to understand for everyone. It's particularly suited for bridging the gap between spreadsheet applications and vector graphics editors.
  5. QlikView is favored by well-established data scientists for analyzing large-scale data. It integrates with a wide range of data sources and is extremely fast in data analysis.

Conclusion

Mastering advanced Python techniques is crucial for data scientists to unlock the full potential of this powerful language. While basic Python skills are invaluable, mastering sophisticated data manipulation, performance optimization, and leveraging specialized libraries elevates your data analysis capabilities.

Continuous learning, embracing challenges, and staying updated on the latest Python developments are key to becoming a proficient practitioner.

So, invest time in mastering Python's advanced features to empower yourself to tackle complex data analysis tasks, drive innovation, and make data-driven decisions that create real impact.

Nahla Davies is a software developer and tech writer. Before devoting her work full time to technical writing, she managed—among other intriguing things—to serve as a lead programmer at an Inc. 5,000 experiential branding organization whose clients include Samsung, Time Warner, Netflix, and Sony.


Neuralink would Need up to a Million Electrodes to Make Humans Immortal

Noland Arbaugh, a 29-year-old paralysed man who became the first person to receive a Neuralink implant, is busy playing video games: Chess, Civilization, or his recent obsession, Mario Kart, in which he controls the character Bowser, navigating the track and shooting down other players.

The catch is, this is all happening without the use of hands – with just his thoughts.

“It makes being paralysed really not that bad,” Arbaugh said, describing the impact of the technology on his life. He even put out his first post on X after the implant, saying: “Twitter banned me because they thought I was a bot, X and Elon Musk reinstated me because I am.”

The implant, named Telepathy, has allowed him to regain a degree of independence and participate in activities he enjoyed before his accident.

How far are we from achieving immortality?

Musk’s vision for Neuralink goes beyond just helping individuals with paralysis. In fact, his goal is to achieve a symbiosis between human intelligence and artificial intelligence, and ultimately achieve immortality in the form of robots and humanoids.

Citing Iain M Banks’ ‘Culture’ series of novels and taking inspiration from a featured technology called ‘neural lace’, Musk imagines Neuralink as a high-bandwidth brain-computer interface, where “it retains all of your memories and brain state, so even if your physical body dies, you can kind of reincorporate in another physical body and retain… pretty much your original memories”.

He said, however, that Neuralink is a long way from achieving this, but it is off to a great start. “Our current Neuralink has about 1,000 electrodes. Ultimately, you need 100,000 or a million electrodes,” said Musk of the ultimate goal. He said these electrodes are tiny wires, thinner than a human hair.

Musk believes that by merging with AI through a brain-computer interface, humans can overcome the limitations of their biological brains and achieve a new level of cognitive capabilities.

“It’s sort of like if you can’t beat ‘em join ‘em,” he said, in a recent interview with the founder of Abundance360, Peter H Diamandis, emphasising that the human brain can’t compare to the rate at which artificial intelligence systems are advancing.

The journey so far

Arbaugh, who is paralysed from the neck down due to a swimming accident, became the first person to receive the Neuralink brain chip implant in January as part of a clinical trial.

The first human Neuralink patient, who is paralysed, controlling a computer and playing chess just by thinking. pic.twitter.com/eMt159JoIg

— Historic Vids (@historyinmemes) March 21, 2024

In May 2023, Neuralink received FDA approval to begin human clinical trials, and the company has since recruited participants with quadriplegia due to spinal cord injuries or ALS.

Paul Nuyujukian, a professor of bioengineering and neurosurgery at Stanford University, said, “The difference that the previous implants have with Neuralink is that this is fully implantable, battery-powered, and wireless. All of this is being done over Bluetooth protocol.”

Concerns have also been raised about the treatment of animals during Neuralink’s trials, with reports of rushed experiments leading to unnecessary suffering and deaths.

Meanwhile, Musk has announced plans for a second product called Blindsight, which aims to cure blindness by sending visual information directly to the brain.

As clinical trials progress and more individuals participate, Nuyujukian said, “A technology like this has the potential to transform our treatments, not just for stroke, paralysis, degenerative and motor degenerative diseases, but for pretty much every kind of brain disease, from Parkinson’s to epilepsy, to dementias, and Alzheimer’s.”

Though Musk acknowledges that any of this is still a long way off, Neuralink, for him, is the first step towards this vision. He said, “If your brain state is essentially stored, you’re kind of backed up on a hard drive. Then you can always restore that brain state into a biological body or maybe a robot or something.”

Mind blown!


LatentView Analytics Acquires Decision Point Analytics for Generative AI Solutions


LatentView Analytics, a global digital analytics consulting and solutions firm, has announced its acquisition of Decision Point, a leader in AI-led Business Transformation and Revenue Growth Management (RGM) solutions.

The strategic move, approved by the board, involves the acquisition of 70% of Decision Point’s outstanding equity capital for a total consideration of $39.1 million, with the remaining 30% to be acquired over the next two years.

Established in 2012, Decision Point boasts a workforce of over 300 employees worldwide and specialises in RGM, Demand Forecasting, Pricing Analytics, Promotion Analytics, Retail Segmentation, and Marketing Mix Models, particularly focusing on Consumer Packaged Goods (CPG) brands. The company’s expertise has garnered recognition from industry giants such as Microsoft and the Promotion Optimization Institute (POI).

Read: LatentView Analytics Exceeds Indian IT GenAI Confidence

One of Decision Point’s notable achievements is the development of Beagle GPT, a conversational GenAI app utilised by Fortune 500 CPG customers to enhance data analytics usage within their organisations.

LatentView Analytics, renowned for its business transformation consulting and agile analytics solutions, views the acquisition as a significant opportunity to strengthen its capabilities in data engineering, data science, and data visualisation. Additionally, it will enable LatentView to expand its footprint in North America and Europe, key markets for Decision Point’s solutions.

Commenting on the acquisition, Rajan Sethuraman, CEO of LatentView Analytics, emphasised the appeal of Decision Point’s Revenue Growth Management solutions in driving sustainable and profitable growth through data. “Additionally, this deal will bring 300+ highly skilled employees into LatentView’s CPG practice and help us expand into the Latin America market,” he added.

Echoing Sethuraman’s sentiments, Rajan Venkatesan, CFO of LatentView Analytics, affirmed the company’s commitment to its core verticals and announced that the acquisition would be fully funded from existing cash reserves. He anticipates the transaction to be EBITDA accretive, yielding long-term benefits for clients.

Ravi Shankar, Founder & CEO of Decision Point, expressed excitement about the integration, ensuring that the existing management team would continue to lead Decision Point with the support of LatentView’s robust go-to-market presence in North America and Europe. Shankar also highlighted the potential for synergies between the organisations and emphasised the opportunity to introduce LatentView’s marketing and supply chain analytics solutions to Decision Point’s clientele.

Krishnan Venkata, LatentView’s Chief Client Officer, emphasised the increasing importance of GenAI applications in driving sustainable growth for global leaders. With this acquisition, LatentView is well-positioned to enhance its technology and data analytics offerings, thus adding value to its clients’ strategies.

“We had two big wins this year. One of them is really on the back of LASER, the generative AI solution that we have. In fact, LASER is going to be the main solution that will be deployed,” said Sethuraman in an exclusive interview with AIM.

During the third quarter, the Chennai-based data analytics firm secured two significant new clients. One of these clients is exclusively leveraging the generative AI solutions offered by LatentView.

Sethuraman said that being a smaller organisation compared to Indian IT companies, they were able to understand the technology quicker. “I think it’s an advantage we have over larger organisations that we are a smaller company, focused exclusively on AI and analytics.”

“Larger organisations will have the capability to invest more money in getting things done. However, some of this also calls for nimbleness in terms of how quickly you’re able to understand and appreciate a new technology and how you can quickly connect the dots with a business problem that an organisation is trying to solve,” he added.
