GenAI May Code, But Can it Think Like a Data Scientist?

Generative AI has entered the data science workflow with speed and scale. From drafting code snippets to helping with initial brainstorming, its utility is evident. Yet, when the conversation turns to judgment, nuance, or accountability, many data scientists argue that the boundaries become clear.

While GenAI has already changed how certain tasks are done, there are times when relying on it makes little sense or can even be harmful, said Aashutosh Nema, lead data scientist at Dell Technologies, in an exclusive interaction with AIM.

“As a data scientist, I often use GenAI to help draft ideas or code quickly. But there are many parts of my work [for which] I still don’t rely on GenAI since it involves deeper judgment, nuance, or responsibility,” he said.

The Spaces AI Cannot Reach

Problem definition remains one of the most human-driven responsibilities.

“AI can suggest ideas, but it doesn’t understand the real-world consequences of those choices,” Nema said.

Figuring out which question to ask, what metric matters, and what trade-offs are acceptable takes real business knowledge and understanding, he said, adding that this is where domain expertise and responsibility take precedence.

Experiment design follows a similar pattern. GenAI might propose a testing framework or surface standard metrics, but questions of sample size, risk, and bias rest with experienced scientists.

Nema added, “GenAI can suggest an experiment structure or surface metrics, but deciding how to test, what technique to use, and when a model is good enough requires human judgment.” In regulated or high-risk contexts, accountability cannot be shifted onto a model.

Working with raw data also exposes GenAI’s limits. “Data often has missing values, mislabeled entries, or silent shifts over time. Cleaning, validating, and understanding context is judgment-heavy and relies on domain knowledge, something AI lacks.” For Nema, data quality remains a human-driven task that underpins the reliability of any model.
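
To make the point concrete, here is a minimal, illustrative sketch of the kind of checks Nema describes, flagging missing values and silent shifts in a dataset. The DataFrame, column choices, and thresholds are hypothetical; the judgment about whether a shift is noise or a real problem still sits with the analyst.

```python
# Illustrative sketch only: the data-quality checks that still need human judgment.
# Column names and thresholds here are hypothetical.
import pandas as pd

def basic_quality_report(df: pd.DataFrame, reference: pd.DataFrame) -> dict:
    """Flag missing values and silent distribution shifts against a reference sample."""
    return {
        # Share of missing values per column
        "missing_ratio": df.isna().mean().to_dict(),
        # Crude drift signal: shift in mean relative to the reference period,
        # scaled by the reference standard deviation
        "mean_shift": {
            col: float(abs(df[col].mean() - reference[col].mean())
                       / (reference[col].std() or 1.0))
            for col in df.select_dtypes("number").columns
        },
    }

# A human still decides whether a given shift is noise or a genuine problem.
```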

Sujatha S Iyer, head of AI security at Zoho Corp, reinforced this point in an interaction with AIM.

“While generative AI has advanced rapidly, it still largely lacks the real-world judgment, causal reasoning, and grasp of unquantifiable variables that define true human expertise,” she said.

She explained that enterprise data is typically structured and domain-specific, differing from the unstructured data that LLMs process effectively. These models frequently overlook the nuanced knowledge embedded in structured data unless it is meticulously provided.

Iyer said that the effectiveness of AI still hinges on careful feature engineering and the selection of relevant data. Running LLMs is resource-intensive, so guiding them with domain expertise is critical.

Furthermore, due to significant social implications, rigorous benchmarking for fairness and bias is essential.

“This means skilled data scientists are still needed to guide the process, apply domain knowledge, and ensure responsible AI use,” she said.

The Risks of Overreliance

Where GenAI does perform well is in brainstorming or drafting. It can quickly assemble lists of options or generate baseline code logic. But here, too, Nema cautioned about its weaknesses.

“GenAI is great for getting an initial draft list of ideas, but it mostly agrees with the user and rarely challenges assumptions. In data science, progress often comes from active disagreement and asking ‘what if we’re wrong?’, a step GenAI doesn’t handle well.”

Edge cases in coding highlight another danger. He explained that while GenAI can generate initial code logic or templates similar to an entry-level data scientist, it struggles with reliably handling edge cases or understanding its integration into existing systems. Human oversight remains essential for determining testing parameters, connection points, and potential failure areas.

Finally, there is the ethical dimension. “In regulated use cases, someone must sign off on fairness, privacy, and compliance. You can’t hand those responsibilities off to a model.”

Nema emphasised that even as performance improves, reliable measurement, evaluation, and guardrails remain human responsibilities.

Iyer also cautioned against falling for the hype cycle, arguing that method selection should start from the specific use case. For instance, traditional statistical machine learning methods often suffice for forecasting sales revenue: they are computationally efficient and offer clear explanations, a critical advantage over opaque ‘black-box’ models in such applications. Accuracy, explainability, and compliance, she said, should guide the choice of method rather than a default to generative AI.
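
As a rough illustration of the classical route Iyer points to, the sketch below fits a simple Holt's linear-trend model to a made-up monthly revenue series using statsmodels. It is not her implementation, just an example of a transparent, inexpensive statistical forecast that needs no generative model.

```python
# A minimal sketch of a classical statistical forecast for sales revenue.
# The revenue figures below are invented purely for illustration.
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical monthly revenue series
revenue = pd.Series(
    [120, 132, 128, 140, 151, 147, 160, 172, 168, 181, 195, 190],
    index=pd.date_range("2024-01-01", periods=12, freq="MS"),
)

# Holt's linear-trend model: cheap to run and easy to explain
model = ExponentialSmoothing(revenue, trend="add").fit()
print(model.forecast(3))  # forecast for the next three months
```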

A Human-Driven Core

GenAI is advancing quickly, but its reach is not universal. For many data scientists, the message may be that AI can assist, accelerate, and augment, but it cannot yet replace the human judgment that defines the discipline.

The critical parts of the workflow, whether in problem definition, experiment design, raw data validation, coding edge cases, or ensuring ethics, remain firmly in human hands.

Looking ahead, Iyer pointed out that domain and contextual knowledge will remain central. “At the end, it’s all about delivering better results to the customer.”

She highlighted an example from financial anomaly detection, where a model might flag a large transaction as suspicious. However, if the customer had pre-notified the bank, immediately blocking it would harm customer experience and operations. Providing this context to the model would significantly reduce false alarms.
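
A minimal, hypothetical sketch of that rule: a transaction flagged by a model is escalated only when the customer has not pre-notified the bank. The threshold, identifiers, and data structures here are assumptions for illustration, not Zoho's system.

```python
# Illustrative sketch of the context Iyer describes: an anomalous transaction
# is only blocked if the customer has not pre-notified the bank.
from dataclasses import dataclass

@dataclass
class Transaction:
    customer_id: str
    amount: float

ANOMALY_THRESHOLD = 10_000.0   # hypothetical cut-off from a detection model
pre_notified = {"cust_42"}     # customers who told the bank in advance

def should_block(txn: Transaction) -> bool:
    """Block only if the amount looks anomalous AND there is no prior notice."""
    is_anomalous = txn.amount > ANOMALY_THRESHOLD
    return is_anomalous and txn.customer_id not in pre_notified

print(should_block(Transaction("cust_42", 25_000)))  # False: customer pre-notified
print(should_block(Transaction("cust_7", 25_000)))   # True: escalate for review
```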

“That’s where a data scientist really adds value, by combining domain knowledge with AI to make smarter decisions,” she said.