This Sequoia-Backed Startup Uses AI to Help You Sleep Better 

Last week, at its first-ever media conference in India, Sequoia-backed Indian startup Wakefit announced its intention to improve sleep health and quality for Indian consumers, this time using AI.

Regul8 is one of the flagship products of the company’s recently launched Wakefit Zense, the country’s first AI-powered sleep solutions suite.

It is India’s first mattress temperature controller, which allows users to manually set the temperature between 15°C and 40°C or choose from presets like neutral, cold, warm, ice, and even fire. The icing on the cake: It also supports dual preferences, so individuals can customise their sides of the bed independently, eliminating the common household dispute over the AC remote.

On average, a home air conditioner in India can draw about 3,000 watts of power. However, thanks to Regul8, Wakefit’s recently launched mattress temperature controller, you don’t need an AC anymore. Most importantly, it is 60% more energy-efficient than a 1.5-ton air conditioner.

The other flagship product is an AI-powered contactless sleep-tracking device called Track8.

Explaining the technology behind Track8, Yash Dayal, the CTO of Wakefit, told AIM, “The tracker works by placing a passive sensor below the mattress, and as a person sleeps, it leverages ballistocardiography. This means that tiny vibrations from heartbeats, snoring, or any movement are read by our sensor. These raw signals are then processed through AI and ML models to derive sleep metrics, restfulness, and other body vitals.”
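To give a rough feel for how a vibration trace can turn into a vital sign, here is a minimal Python sketch. It is not Wakefit's pipeline: the signal is synthetic, the function names are made up for illustration, and a real ballistocardiography system would need filtering and trained models to separate heartbeats from breathing, snoring and movement.

```python
import math

def synthetic_bcg(duration_s, sample_rate_hz, heart_rate_bpm):
    """Toy vibration trace: a single clean sinusoid standing in for heartbeats."""
    n = int(duration_s * sample_rate_hz)
    beats_per_second = heart_rate_bpm / 60.0
    return [math.sin(2 * math.pi * beats_per_second * i / sample_rate_hz)
            for i in range(n)]

def estimate_bpm(signal, sample_rate_hz):
    """Count upward zero crossings (one per beat) and scale to beats per minute."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a <= 0 < b)
    duration_min = len(signal) / sample_rate_hz / 60.0
    return crossings / duration_min

# Hypothetical recording: 60 seconds sampled at 50 Hz with a 72 bpm heartbeat.
trace = synthetic_bcg(duration_s=60, sample_rate_hz=50, heart_rate_bpm=72)
print(round(estimate_bpm(trace, sample_rate_hz=50)))
```

Counting zero crossings only works here because the toy signal is noise-free, which is why production systems lean on signal processing and ML models instead.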

Integrating both Track8 and Regul8 can create a system where temperature regulation is based on how you sleep. According to Dayal, the products, including AI and ML models, were made by the in-house team consisting of about 80 to 100 tech experts.
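As a sketch of what such a feedback loop might look like, the Python snippet below nudges each side of the bed cooler when that sleeper is restless. Everything in it is an assumption made for illustration: the rule itself, the threshold, the step size and the function names. Only the 15°C to 40°C range and the idea of independent sides come from the article.

```python
MIN_C, MAX_C = 15.0, 40.0  # manual temperature range quoted for Regul8

def clamp(value, low=MIN_C, high=MAX_C):
    """Keep a setpoint inside the supported range."""
    return max(low, min(high, value))

def next_setpoint(current_c, movement_index, restless_above=0.5, step_c=0.5):
    """Hypothetical rule: cool slightly when the sleeper is restless, else hold."""
    if movement_index > restless_above:
        return clamp(current_c - step_c)
    return clamp(current_c)

# Each side of the bed keeps its own setpoint and its own tracker reading.
setpoints = {"left": 24.0, "right": 27.0}
movement = {"left": 0.8, "right": 0.1}  # made-up movement-index readings
setpoints = {side: next_setpoint(setpoints[side], movement[side])
             for side in setpoints}
print(setpoints)
```

A real controller would presumably use richer signals and smoother adjustments; the point is only that tracker output becomes the input to the temperature decision.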

Discussing the broader AI strategy, Dayal said, “We don’t rely on generative AI models for these consumer products, but we are experimenting with models like those from OpenAI and Gemini internally for efficient supply chain management, demand planning, and forecasting.”

What about Data Privacy?

“Users have the option to choose whether to share their data or not. Their data is used solely to provide insights on sleep quality, such as heart rate variability, respiratory rate, movement index, and snoring index. It is anonymised, encrypted, and used only to improve their sleep quality,” said the director and co-founder Chaitanya Ramalingegowda, in an exclusive interview with AIM.

Making High-Quality Products Affordable for All

Founded in 2015, Wakefit’s philosophy is rooted in making high-quality products affordable for middle-class consumers. Led by co-founders Ankit Garg and Ramalingegowda, Wakefit has created a foundation for itself by building coming-of-age products like orthopaedic mattresses and dual comfort solutions, but at affordable prices.

“Ankit and I come from middle-class backgrounds, so we understand the constraints of operating on a budget. From the beginning, our philosophy has been to make high-quality products affordable. For example, why should an orthopaedic memory foam mattress cost INR 50,000 when it can cost INR 12,000?” said Ramalingegowda.

This principle extends to their latest tech-enabled sleep solutions, which are designed to offer cutting-edge technology at a fraction of the cost of similar products in the US and Europe.

“The core idea was to leverage this success in non-tech products by applying material sciences, to create something uniquely beneficial for sleep,” added Ramalingegowda.

“In markets like the US and Europe, similar products cost around INR 4.5 lakh. We built our product from the ground up to make it available for INR 45,000 to 50,000. We are middle-class folks building for middle-class India,” explained Ramalingegowda.

For example, high-end sleep technology brands like Eight Sleep offer smart mattresses with advanced sleep tracking and temperature control features, at around INR 2.8 lakh for the Pod 3 model.

Similarly, Sleep Number provides adjustable mattresses with integrated sleep tracking and temperature regulation, often exceeding INR 3 lakh. Chili Technology and Bryte also offer premium sleep solutions with advanced features, with prices reaching up to approximately INR 3.75 lakh.

By offering its products at a mere fraction of that price, Wakefit positions itself as a cost-effective alternative to these luxury competitors, making sleep technology more accessible to the middle-class Indian market. This strategic pricing is an effort to bridge the gap between affordability and high-quality sleep solutions.

Future Prospects

Looking ahead, Wakefit plans to integrate generative AI and work with voice-based ecosystems to provide more actionable insights rather than passive data reporting. “Over the next two years, we will steadily launch new features and products, adding more value to our customers,” concluded Ramalingegowda.

Wakefit’s commitment to innovation and accessibility positions it as a leader in the sleep technology market in India. With the launch of Wakefit Zense, the company is set to revamp how Indians experience sleep, making AI-powered sleep solutions an integral part of everyday life.

5 Free Templates for Data Science Projects on Jupyter Notebook

Image generated by Author with DALL·E 3

For many professional data scientists, Jupyter Notebook has become their staple working environment. Even for me, it’s always the first place I go to for any data science experiment and workflow.

As a data science workspace, Jupyter Notebook is unusual among IDEs because the code in each cell can be executed independently, and the author can explain each cell alongside it. This allows a notebook to be reused by others and to become a project template.

In this article, we will discuss five free templates for building a data science project on Jupyter Notebook. So, what are these Jupyter Notebook templates? Let’s get into them.

1. Cookiecutter Template for Python Data Science Projects

The first template is not a single notebook that you fill in. Instead, it is a complete project scaffold built around Jupyter Notebook: the Python data science project template by AWS.

This template creates a complete data science project structure that is ready to be used for your actual project. Using the Cookiecutter CLI, you can generate the directory structure of a data science project, which is similar to the structure below.

|-- bin/
|-- notebooks                    # A directory to place all notebooks files.
|   |-- *.ipynb
|   `-- my_nb_path.py            # Imported by *.ipynb to treat src/ as PYTHONPATH
|-- requirements/
|-- src
|   |-- my_custom_module         # Your custom module
|   |-- my_nb_color.py           # Imported by *.ipynb to colorize their outputs
|   `-- source_dir               # Additional codes such as SageMaker source dir
|-- tests/                       # Unit tests
|-- MANIFEST.in                  # Required by setup.py (if module name specified)
|-- setup.py                     # To pip install your Python module (if module name specified)

# These sample configuration files are auto-generated too:
|-- .editorconfig                # Sample editor config (for IDE / editor that supports this)
|-- .gitattributes               # Sample .gitattributes
|-- .gitleaks.toml               # Sample Gitleaks config (if pre_commit is advanced)
|-- .gitignore                   # Sample .gitignore
|-- .pre-commit-config.yaml      # Sample precommit hooks
|-- LICENSE                      # Boilerplate (auto-generated)
|-- README.md                    # Template for you to customize
|-- pyproject.toml               # Sample configurations for Python toolchains
`-- tox.ini                      # Sample configurations for Python toolchains
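The tree mentions a helper that notebooks import so that src/ behaves like part of PYTHONPATH. The sketch below shows one way such a helper could work in Python. It is an illustration only, not the template's actual my_nb_path.py, and the function name add_src_to_path is hypothetical.

```python
import sys
from pathlib import Path

def add_src_to_path(notebook_dir, src_name="src"):
    """Prepend <project>/src to sys.path, assuming notebooks/ sits beside src/."""
    src_dir = Path(notebook_dir).resolve().parent / src_name
    src_str = str(src_dir)
    if src_str not in sys.path:
        sys.path.insert(0, src_str)  # make modules under src/ importable
    return src_dir

# From a notebook stored in notebooks/, this would expose ../src:
# add_src_to_path(Path.cwd())
```

With something like this in place, a notebook can import your custom module directly instead of copying code into cells.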

If you are interested in how the template is applied to a real project, check its actual usage in this Reinforcement Learning Energy Storage use case.

2. Data Science Notebooks Templates by Coen Meintjes

The next Jupyter Notebook template we will discuss is the one by Coen Meintjes. It is a basic Jupyter Notebook collection from Data Exploration to Model Evaluation. It’s not a project-specific kind of template; in fact, it mostly consists of the essential code, nothing more. But I would say it is good. Why is that?

It’s a basic template that anyone can use for different kinds of projects with a few tweaks. You can use this Jupyter Notebook template to develop any project idea. Moreover, the templates explain many of the processes in the notebook in depth, so both beginners and professionals can benefit from them.

3. Data Science Projects by Yusuf Cinarci

Let’s move on to a more project-specific template, the Data Science Projects Jupyter Notebook templates by Yusuf Cinarci. You can use these templates to develop simple projects for your portfolio or any business needs.

There are many project templates you can choose from, ranging from a simple Salary Data Exploration with Python to the development of a Fake News Detection and Movie Recommendation System. These notebooks are perfect if you want to kickstart your projects easily.

What I like about the Yusuf Cinarci template collections is that they are not overly complicated, so beginners can start their projects while they learn about data science. However, many of the projects are aimed at beginners, so the collection might fall short if you are looking for more advanced data science projects.

4. Data Science Projects by Sukman Singh

If you need a more complex project Jupyter Notebook template, then the Data Science Projects Jupyter Notebook template by Sukman Singh could be the one for you. It’s perfect for those who want to develop prediction models easily but need inspiration for their ideas.

The template collection contains many data science project templates, including customer churn prediction, loan approval prediction, and claim fraud projects. These are standard business projects that you can use to enrich your data project portfolio.

The projects might seem one-dimensional, but you can extend them. Each is a template that you can build on, swapping in another dataset that better fits your business problem.

5. Awesome Notebooks by Jupyter Naas

Lastly, we will discuss Awesome Notebooks by Jupyter Naas. Awesome Notebooks is a project by Jupyter Naas to create the largest catalogue of production-ready Jupyter Notebook templates. There is an abundance of free Jupyter Notebook templates from which you can choose.

The project consists of many professional Jupyter Notebook templates with specific use cases. From AI development to analysing business funnels to YouTube video download, there are so many templates you can choose from.

Many of the templates require you to understand how to work as a data scientist, so learning how to use Python as a data scientist would help you use these templates. Once you understand what you need, this template collection will help your work.

Conclusion

Jupyter Notebook is an environment used by many professional data scientists, and many data science breakthroughs happen on this platform. One of its nice features is that a notebook is easily shareable and can become a template that many can use.

In this article, we have discussed five free Jupyter Notebook templates that can be used to boost your data science activity. The templates are:

  1. Cookiecutter Template for Python data science projects
  2. Data Science Notebooks Templates by Coen Meintjes
  3. Data Science Projects by Yusuf Cinarci
  4. Data Science Projects by Sukman Singh
  5. Awesome Notebooks by Jupyter Naas

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.


All AI Startups Have the Potential to be the Next OpenAI

In a recent interview, OpenAI CTO Mira Murati said that the models the company uses in its labs are not far ahead of those currently available to the public.

Speaking on the release of GPT-4o, she said that it was a massive deal for OpenAI to be able to make this kind of technology accessible to the public. “I don’t think there is enough emphasis on how unique that is for the stage where the technology is today.

“In the sense that inside the labs, we have these capable models and they’re not that far ahead of what the public has free access to. That’s a completely different trajectory for bringing technology into the world than what we’ve seen historically,” she said.

This brings up an interesting point of how, while many AI startups struggle to survive past their early stages, it is still possible to rise up, much like OpenAI did in the last few years.

While Murati’s point is specifically about OpenAI’s ability to make advanced models widely available, she also emphasised that this is to ensure that the general public is fully aware of how the technology is progressing.

“It’s a great opportunity because it brings people along; it gives them an intuitive sense of the capabilities and risks. The opportunities are huge now,” she said. Murati’s stance implies that startups are not far behind in terms of the technology needed to develop advanced AI systems.

Behind OpenAI’s Success

OpenAI is widely regarded as an AI success story. Initially founded as a non-profit towards the goal of developing beneficial AGI, the company quickly rose through the ranks of the startup world to become a household name.

However, several factors helped OpenAI gain the attention it did. Namely, access to funding and a large talent pool allowed the startup to focus largely on quality research.

One of the key points to remember about OpenAI is that its big break with ChatGPT came only a few years after the company received a $1 billion investment from Microsoft. Additionally, the timing was near perfect, as they had focused their attention on the development of Transformer models, shortly after the release of the 2017 paper, ‘Attention is All You Need’.

Of course, it’s key to note that all of these factors rely on ensuring that your problem statement is one that can result in something ground-breaking.

What Needs to Be Done?

Murati’s implication that AI models are much more widely available now than ever before means that startups need to try harder to stand out. OpenAI appeared as a pioneer in making an easy-to-use AI product widely available to the public for free. However, only years later, doing the same doesn’t have the same impact anymore.

In India, several startups have managed to make a splash, albeit not as big as OpenAI’s. Successful Indian startups like Krutrim, Kissan and Sarvam have been able to leverage gaps between the AI revolution and use cases specific to India, allowing them to break ground as pioneers in the country.

Being able to identify certain gaps and plug them in a way that ensures a startup’s continuity is a tough task. This is especially true as rapid advancements from bigger companies mean that startups with a poorly thought-out plan and product find themselves obsolete almost overnight.

With the proper resources, however, this can be remedied. One must keep in mind that OpenAI’s ability to get ahead of the curve was solely due to a team of researchers that tried to leverage then-unexplored technology for use among the general public.

AI advancements mean that a lot more students and professionals are pivoting towards studying AI, thereby creating a large talent pool, especially in India. Besides, with startup funding increasing exponentially, it seems that the only roadblock now is trying to find a way to stand out in a sea of problem statements.

However, as Murati said, “If you can push human knowledge forward, you can push society and civilisation forward.”

MIT Prof Calls for AI Experts to Revamp Kumbh Mela

Ramesh Raskar, an MIT associate professor and head of the MIT Media Lab’s Camera Culture research group, has issued a compelling call for AI and ML experts worldwide to participate in the Kumbh Mela project. His message calls for innovators with expertise in AI, machine learning, computer vision, the Internet of Things, robotics, drones, autonomous telecom data, and mobile devices to join the project.

This project aims to address the significant logistical challenges of the massive gathering, such as healthcare access, crowd management, sanitation, and transportation.

He also highlighted the use of AI to enhance the experience and management of the event. He discussed using AI for crowd management, health monitoring, and improving the overall safety and efficiency of the festival, emphasising the potential of these technologies to transform large-scale gatherings.

AI-first Kumbh Mela

Raskar, an Indian-origin computer scientist, began working on projects related to the Kumbh Mela in 2015. His efforts focused on utilising technology to enhance crowd management and safety during the massive religious gathering.

Raskar worked with local innovators, students, city officials, civic groups, the police and historians in his hometown of Nashik to draw up a shortlist of key concerns, from a list of multiple issues received through an open crowdsourced platform.

His team focused on developing both high-tech and low-tech solutions. These included apps for real-time crowd monitoring, platforms for dynamic data analysis of transportation and accommodation, and systems to connect festival-goers with quality food suppliers.

The KumbhaThon Project

Back in 2015, the Kumbhathon team held three-month workshops and innovation camps to discuss, test and implement a range of smart solutions to address challenges in the areas of access to healthcare, transportation, food and sanitation, housing and crowd control.

Kumbhathon, a one-of-its-kind initiative, is headed by Sunil Khandbahale, a Nashik-based innovator who joined hands with Raskar to use technology to effectively micro-manage this mega religious congregation.

After at least 39 people were killed in a stampede in 2003, the team developed apps designed specifically for the event to curb the problems faced by attendees. One such app helps manage crowds and avoid stampedes, a recurring problem at the festival.

Modelled on the lines of the world’s largest community-based real time traffic and navigation app, Waze, the app used mobile phone location data to help local police redirect pedestrians away from saturated areas.
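To illustrate the idea of spotting saturated areas from location data, here is a small Python sketch that buckets positions into grid cells and flags the crowded ones. It is not the Kumbhathon app's method: the grid approach, the cell size, the capacity and the function name are all assumptions made for the example.

```python
from collections import Counter

def saturated_cells(locations, cell_size_m, capacity):
    """Bucket (x, y) positions in metres into square cells; return cells over capacity."""
    counts = Counter(
        (int(x // cell_size_m), int(y // cell_size_m)) for x, y in locations
    )
    return {cell: n for cell, n in counts.items() if n > capacity}

# Made-up positions: three phones in one 10 m cell, one phone elsewhere.
phones = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0), (12.0, 1.0)]
print(saturated_cells(phones, cell_size_m=10, capacity=2))
```

Cells returned by such a check are the ones police could steer pedestrians away from.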

Known for his pioneering work in computational photography and AI-driven healthcare technologies, Raskar emphasises the power of innovation to drive positive change in an increasingly interconnected world.

Stone Ridge Expands Reservoir Simulation Options with AMD Instinct™ Accelerators

Stone Ridge Technology (SRT) pioneered the use of GPUs for high performance reservoir simulation (HPC) nearly a decade ago with ECHELON, its flagship software product. ECHELON, the first of its kind, engineered from the outset to harness the full potential of massively parallel GPUs, stands apart in the industry for its power, efficiency, and accuracy. Now, ECHELON has added support for AMD Instinct accelerators into its simulation engine, offering new flexibility and optionality to its clients.

Reservoir simulators model the flow of hydrocarbons and water in the subsurface of the earth in the presence of wells. Energy companies use them to create and assess field development strategies. Emily Fox, SRT Director of Communications remarks, “SRT pioneered the integration of GPU accelerators in high-performance reservoir simulations and continues to lead by now offering AMD Instinct as a computational platform.” SRT’s CTO and ECHELON developer Ken Esler adds, “Many in the field were skeptical about the efficacy of GPUs, however, ECHELON’s GPU-native design overcame these doubts and marked a significant leap in simulation speed and performance, firmly establishing SRT’s position in the industry.”

Esler explains, “Over a decade ago we set out to supercharge the capability of a CPU-based in-house simulator at a major oil company. We could see the enormous potential of GPUs with their high memory-bandwidth and floating-point performance. We also realized that to unlock the true potential of GPUs, we needed to build a new simulator from scratch, specifically designed for GPU acceleration. This was the genesis of ECHELON, around 2013. Since then, we’ve continuously enhanced its features, robustness, and performance, first alone, and since 2018 together with our consortium partner Eni S.p.A”

Expanding GPU Options to Meet Customer Demands

SRT’s latest development effort was to port ECHELON from CUDA to the AMD HIP platform, enabling ECHELON to use AMD Instinct GPUs like the MI210, MI250X, and the upcoming MI300 Series. This strategic decision broadens ECHELON’s hardware compatibility and offers clients increased flexibility and choice for their high-performance computing needs. Vincent Natoli, SRT’s CEO says, “Companies need flexibility in selecting their hardware technology and should not be locked into a single vendor. We want our clients to have convenient access to the hardware of their choice when it comes to implementing business critical workflows.”

Esler explains the timing of SRT’s decision: “The impressive specifications of AMD’s Instinct processors, particularly the MI210 and MI300, presented an increasingly compelling, competitive solution that caught our attention. The MI210’s memory bandwidth at 1.6 terabytes per second was quite competitive, and the MI300’s subsequent leap to over five terabytes per second is even more exciting. AMD’s innovative approach in processor packaging and its use of chiplets have allowed for larger GPUs without compromising yields, resulting in highly competitive products. The engineering behind these developments is quite impressive.”
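A quick back-of-the-envelope calculation in Python shows why memory bandwidth matters for codes limited by data movement. The 64 GB working set is a hypothetical figure chosen for illustration, and "over five terabytes per second" is rounded down to 5.0. The result is a lower bound on data movement time, not a prediction of ECHELON's performance.

```python
def stream_time_ms(bytes_moved, bandwidth_tb_per_s):
    """Lower-bound time, in milliseconds, to move the data once at peak bandwidth."""
    bytes_per_second = bandwidth_tb_per_s * 1e12
    return bytes_moved / bytes_per_second * 1e3

WORKING_SET_BYTES = 64e9  # hypothetical 64 GB of simulation state

for label, bandwidth in [("1.6 TB/s", 1.6), ("5.0 TB/s", 5.0)]:
    ms = stream_time_ms(WORKING_SET_BYTES, bandwidth)
    print(f"{label}: {ms:.1f} ms per full pass over the data")
```

Under these assumptions, the higher bandwidth cuts the time for each pass over the data by roughly a factor of three.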

The Right Software for a Smooth Port

SRT had been contemplating the idea of adapting ECHELON for use on AMD platforms for a while. “The maturation of the ROCm and HIP ecosystem significantly lowered the barriers to adopting AMD GPUs,” says Esler. The integration with ROCm, AMD’s open software platform, was essential in ensuring that ECHELON could fully leverage AMD’s GPU capabilities. “I appreciated AMD’s strategy with ROCm,” says Esler. “Instead of creating a new, proprietary and incompatible language for accelerated computing, AMD embraced existing frameworks. That significantly reduced the effort needed to adapt our existing code, allowing us to avoid more complex alternatives that are even further removed from ECHELON.”

“We develop most of our code internally, but we do rely on Thrust. AMD’s rocThrust turned out to be an effective drop-in replacement. We were also impressed with the ROCm LLVM compiler and Clang in combination with HIP extensions, which enhanced productivity. The support for debuggers and profilers in ROCm has been beneficial. Overall, we are seeing impressive progress in AMD tool development.”

Collaboration Yields Fast Results

The project began in earnest in spring 2023. Erik Greenwald of SRT described porting from CUDA to HIP as initially straightforward. “We created a wrapper to retarget each build. The initial version was created relatively quickly,” he says. Although there were a few challenges, like adjusting from CUDA’s 32-thread warps to AMD’s 64-thread wavefronts, Greenwald found these issues manageable. “It was quite painless and quick to achieve initial results,” he reflects. Esler adds, “With AMD’s support, we steadily enhanced ECHELON’s performance and progress in a very acceptable timeframe. We were pleased and satisfied with the performance.”
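The warp-width difference is easy to underestimate, so here is a small Python simulation of a shuffle-style tree reduction across the lanes of one warp. It is a sketch for illustration, not ECHELON code or real GPU code, and the function name is hypothetical. It shows how a width of 32 baked into the algorithm silently gives a wrong answer on 64 lanes.

```python
def tree_reduce_lane0(values, assumed_width):
    """Simulate a shuffle-down sum across lanes; the total ends up in lane 0.

    `assumed_width` is the warp width the code was written for, which may
    differ from the actual number of lanes in `values`.
    """
    lanes = list(values)
    n = len(lanes)
    offset = assumed_width // 2
    while offset > 0:
        # Every lane adds the value held by the lane `offset` positions above it.
        lanes = [lanes[i] + (lanes[i + offset] if i + offset < n else 0)
                 for i in range(n)]
        offset //= 2
    return lanes[0]

ones_64 = [1] * 64  # 64 lanes, each holding the value 1
print(tree_reduce_lane0(ones_64, assumed_width=64))  # width matches the hardware
print(tree_reduce_lane0(ones_64, assumed_width=32))  # stale 32-lane assumption
```

The second call covers only half of the lanes, which is the kind of issue a port has to find wherever the warp size was hard-coded.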

ECHELON Gets to Work on AMD Instinct accelerators

ECHELON is developed in a consortium framework, with charter members SRT and Eni S.p.A., the Italian integrated energy company. The ECHELON Consortium is a collaboration of industry partners committed to advancing high-performance subsurface flow simulation and is currently open to participation by new member organizations that would like a role in shaping the development of ECHELON.

Eni S.p.A. recently announced its new HPC6 high-performance computing (HPC) system at its Green Data Center. Each of the 3472 computing nodes in HPC6 comprises a 64-core AMD EPYC™ CPU and four high-performance AMD Instinct™ MI250X GPUs, offering unmatched computational efficiency and versatility for a wide array of applications.

“With a peak computing power of over 600 PetaFlop/s, HPC6 reaffirms Eni’s leadership position in the field of supercomputing among industrial entities,” says Sergio Zazzera, Head of Technical Computing for Geosciences & Subsurface Operations, Eni. “It enables highly complex simulations with enormous volumes of data, such as those needed for studying new geological basins, forecasting sub-surface flows in complex geologies, researching new materials for CO2 capture, and ensuring plasma stability in the field of magnetically confined fusion. Additionally, HPC6 will take advantage of specialized Generative AI solutions in the energy sector.”

So, what’s ahead for the world’s fastest GPU-powered reservoir simulation software? According to Esler, “We still have the lead in performance relative to our competitors by a good margin, but we are always looking to enhance ECHELON’s performance and extend its features.” There are also exciting prospects in fields beyond traditional hydrocarbon recovery, such as CO₂ and hydrogen storage, emerging areas that represent new frontiers for ECHELON as the energy sector moves towards greener technologies.

EPAM Acquires Health Data Analytics Company Odysseus Data Services

EPAM Systems, Inc. has announced its acquisition of Odysseus Data Services, Inc., a top health data analytics company.

Odysseus will expand EPAM’s ability to transform the life sciences value chain through advanced data analytics, data methods and AI.

Headquartered in Cambridge, Massachusetts, Odysseus generates healthcare data insights and evidence for clients through skilled data science and analytics, software engineering, data management, and ontology and vocabulary management.

The company’s focus on a standardised and systematic approach to healthcare data analytics is the foundation for a better understanding of the inner workings of healthcare interventions in drug treatment, drug safety and efficacy, epidemiological research, provider support, quality measurements and cost reduction.

Odysseus is an active member of the Observational Health Data Sciences and Informatics (OHDSI) collaborative and is intimately involved in the open standards and open science community through participation in research and development, including OMOP CDM, open source tools and methods.

“We see the next wave of innovation based on standardised data powering AI and GenAI to improve life sciences research, clinical studies and post-market surveillance. Based on the combined strengths of EPAM and Odysseus, we are well positioned to lead that innovation,” Greg Killian, senior vice president of life sciences at EPAM, said.

“With EPAM’s strong foundation in AI, machine learning, data analytics and data management and cloud infrastructure combined with our healthcare data analytics and Real World Evidence expertise, we can address the whole life sciences value chain more comprehensively,” Gregory Klebanov, CEO of Odysseus, added.

Alexa, Can Ethical AI Be the Pathway to Better Models?

Do you ever wonder what happened to Alexa? Mihail Eric, a former research scientist from Alexa AI, wrote a tell-all post about why Alexa is no longer at the forefront of voice assistants, which is particularly true in this era of rapid advancements.

While Siri got ChatGPT integration, Eric set out to explain why its prime competitor, Alexa, failed. A key reason was that Alexa became embroiled in bureaucracy due to its commitment to pushing an ethical product.

“We had all the resources, talent, and momentum to become the unequivocal market leader in conversational AI. But most of that tech never saw the light of day and never received any noteworthy press,” Eric said on X.

How Alexa dropped the ball on being the top conversational system on the planet

A few weeks ago OpenAI released GPT-4o ushering in a new standard for multimodal, conversational experiences with sophisticated reasoning capabilities.
Several days later, my good friends at PolyAI…

— Mihail Eric (@mihail_eric) June 11, 2024

Eric was part of the Conversational Modelling team at Alexa AI in 2019. He worked on improving Alexa’s capabilities through the power of AI. However, since this was well before ChatGPT, the necessary compute and other infrastructure were not in place.

However, this is not what effectively killed Alexa as an AI competitor. Only a few years ago, parents were panicking about their kids learning the name “Alexa” before the word “mama”; since then, things have nosedived rapidly.

Eric says this could be attributed to several issues, most notably to several bureaucratic hurdles, including a huge focus on ethically sourcing user data.

Alexa Serves as a Cautionary Tale

While the commonly held belief at the moment is that AI should be built on ethical frameworks, Alexa’s story shows that it is easier said than done.

Data privacy is one of the most significant issues raised regarding ethical AI. Eric stated that in a bid to ensure data privacy was preserved, the company inadvertently set up several roadblocks for itself, effectively halting any advancements in training the voice assistant.

“Definitely a crucial practice, but one consequence was that the internal infrastructure for developers was agonisingly painful to work with. It would take weeks to get access to any internal data for analysis of experiments,” he said.

Interestingly, his next point goes into another major problem that companies face today – the issue of poorly annotated data. Eric stated that despite repeated attempts to ensure that data was properly annotated, this was again bogged down by layers of bureaucracy, leading to further delays in development.

Data Plays a Major Role Too

Currently, major AI companies have begun a mad scramble for functional datasets. Data as a product (DaaP) has slowly become a point of consideration, especially with customer-facing companies that accrue data.

Meanwhile, companies like OpenAI and Google have struck several partnerships with media companies to have reliable datasets that they can train their LLMs on. However, Alexa largely relied on crowd-sourced data, as well as data taken from Alexa users and employees to train it. This data needed to be properly annotated.

“I remember, on one occasion, our team did an analysis demonstrating that the annotation scheme for some subset of utterance data was completely wrong, leading to incorrect data labels,” Eric said.

However, correcting this proved to be even worse, as it required approval from several teams and a proper justification for why it needed to be done, apart from just “it’s scientifically the right thing to do and could lead to better models for some other team”.

The episode shows why accurate datasets are hard to come by and worth their weight in gold. This is precisely why OpenAI has partnered with several media organisations over the past year: it needs quality datasets that can be reliably used to train ChatGPT.

Meanwhile, Google was an example of what poor datasets could do, as their integration of the Reddit API led to baffling answers to some really innocuous queries from users.

Google AI overview suggests adding glue to get cheese to stick to pizza, and it turns out the source is an 11 year old Reddit comment from user F*cksmith 😂 pic.twitter.com/uDPAbsAKeO

— Peter Yang (@petergyang) May 23, 2024

Obviously, there were several other issues with the company that ultimately led to the fall of Alexa. These included a lack of communication between teams as well as a mindset that leaned towards the consumer side rather than the scientific side.

As Eric put it, “The success metric imposed by senior leadership had no scientific grounding and was borderline impossible to achieve.”

With Google, OpenAI and Apple all announcing major upgrades to their multi-modal capabilities, it seems that Alexa is nowhere in the race. However, not all is lost.

How Do You Avoid This?

Eric’s post, while critical of Alexa, also proves to be a valuable lesson for AI companies and startups. As mentioned before, one of the biggest talking points surrounding the industry is keeping in mind ethical AI, but again, this is easier said than done.

However, this doesn’t mean that ethical AI as a whole is impossible and should be abandoned. Eric believes that better data infrastructures need to be put into place to ensure that better models are built.

Further, he said that rapid advancements mean that both companies and startups feel the pressure to ship products quickly. “Of course, you should conduct research aggressively, but don’t have delivery cycles measured in quarters, as this will produce inferior systems to meet the deadlines,” he concluded.

With everyone rushing to stay ahead in the AI race, this can also be remedied if startups work to ensure their products stay relevant and sustainable in the long run.

It’s Too Early to Write Off LLMs

These days, everyone seems to have strong opinions on LLMs. While some are grounded in research by experts such as Yann LeCun, others just follow the hype to criticise them. Some say they’re our ticket to AGI, while others think they’re just glorified text-producing algorithms with a fancy name.

One of the biggest arguments against LLMs achieving AGI is that they’re just not like us. As one user puts it in a Reddit discussion, “Human intelligence develops from small amounts of data, in real-time, on 20W of power, using metacognition. By contrast, LLMs work with massive amounts of data, are pre-trained, use massive power, and operate without any cognitive awareness. Therefore, AGI requires a different paradigm.”

Meanwhile, LeCun advises people getting into the AI space to work on something apart from LLMs. “If you are a student interested in building the next generation of AI systems, don’t work on LLMs. This is in the hands of large companies, there’s nothing you can bring to the table,” said LeCun.

Francois Chollet, the creator of Keras, recently shared similar thoughts. “OpenAI has set back the progress towards AGI by 5-10 years because frontier research is no longer being published and LLMs are an offramp on the path to AGI,” he said in an interview.

Interestingly, the discussion on alternatives to LLM-based models has been ongoing for a while now. Recently, Mufeed VH, the young creator of Devika, an alternative to Devin, spoke about how people should move away from Transformer models, build new architectures, and innovate more.

LLMs are the projections of the world

But here’s the kicker according to people on Reddit: human intelligence also developed with tons of data and energy. Our cognitive architecture is like the universe’s ultimate information superhighway, built over hundreds of millions of years and encoded in our DNA.

Ilya Sutskever, the former chief scientist at OpenAI, has noted several times that text is the projection of the world. But how much of that is linked with LLMs is still questionable. LLMs, in a way, are building cognitive architecture from scratch, echoing the evolutionary and real-time learning processes, albeit with a bit more electricity.

One insightful comment noted, “An essential similarity between the human brain and LLMs is that they are essentially compression algorithms… compressing massive amounts of world data into worldviews that provide predictive models to guide action.”

The brain’s architecture is that of a finely tuned machine, learning efficiently from small amounts of data in real time. LLMs, on the other hand, need vast amounts of data and computational power to get anywhere close to that level of performance.

That said, it’s a misconception that LLMs are all about scaling up datasets. In fact, progress is happening with ever-smaller datasets and clever techniques like synthetic data generation with positive-feedback cycles, and even very small models such as Phi-3 and Llama.

One user wisely noted, “Our brain isn’t one architecture for all processes. There are different parts of our brain that process things differently to do different things in life.” Indeed, while LLMs emulate parts of the brain related to language, there’s still much work to be done.

Diffusion models are tackling visual processing, and retrieval-augmented generation (RAG) is trying to mimic the hippocampus. But the road to AGI is long and winding, with many more brain regions to cover.

Then what to focus on? A little bit more of LLMs

Just as LeCun suggests, there are many people working in different fields of AI to figure out things other than LLMs.

A Reddit thread sparked this conversation, asking, “What is it like to work on niche topics that aren’t LLM or Vision?” One user shared, “I shifted my focus from computer vision to ML theory. Now I’m working on kernel methods which are nascent but hold promise to explain and interpret large and over-parameterised networks.”

While LLMs grab the headlines, the world of AI research is full of unsung heroes working on everything from speech synthesis to climate modelling. As one comment wisely put it, “Most of the cooler AI applications aren’t chatbots or generating bounding boxes on camera feeds.”

The bottom line is – LLMs are algorithms that utilise extensive datasets, rely on unsupervised learning, generalise skills not explicitly trained for, and have broad applicability to various downstream tasks. This approach resembles human intelligence, but current LLMs are not upgrading themselves every second like humans.

This is one discussion that can go on for the longest time. Humans can continuously learn, while models trained on static datasets with current training methods cannot. But it might be too soon to write off LLMs.

AI Can Never Replace Excel

The world is built on spreadsheets. Even before the introduction of modern machine learning tools, most of the data of the world was stored on Excel spreadsheets. Cut to today, and that’s still the case.

Just like any other tool, though, with the advent of applications such as ChatGPT, it was thought that there would be no need for Excel sheets anymore, since data science could easily be done by just allowing LLMs to make sense of the data.

Probably true … pic.twitter.com/s2aSya4vxl

— Wall Street Silver (@WallStreetSilv) June 12, 2024

The AI Revolution in Excel

One of the strongest arguments against Excel was that it was becoming obsolete. Excel is not user-friendly, and the application rounds off very large figures, which reduces the accuracy of its computations.
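
To make the rounding complaint concrete: Excel keeps only 15 significant digits of a number and changes any digits after the fifteenth to zeroes. The Python sketch below imitates that behaviour for whole numbers. The helper name `keep_15_significant_digits` and the sample number are made up for illustration, and this is only an approximation of the documented limit, not Excel's actual implementation.

```python
def keep_15_significant_digits(value: int) -> int:
    """Hypothetical helper: zero out every digit after the 15th significant one."""
    sign = -1 if value < 0 else 1
    digits = str(abs(value))
    if len(digits) <= 15:
        # Short enough to be kept exactly.
        return value
    # Keep the first 15 digits and pad the rest with zeroes.
    truncated = digits[:15] + "0" * (len(digits) - 15)
    return sign * int(truncated)


# A made-up 16-digit identifier, the kind often pasted into a spreadsheet.
card_number = 1111222233334444
stored = keep_15_significant_digits(card_number)

print(stored)                # the last digit has become 0
print(card_number - stored)  # how much was lost
```

This is why long identifiers such as card or account numbers are usually kept as text rather than as numbers in a spreadsheet.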

Excel is also a stand-alone application that is not fully integrated with other corporate systems. It does not provide sufficient control because users do not have a clear and consistent view of the quotations sent by their representatives, as well as the history of those quotes.

So, what does the future of Excel look like? We can anticipate a significant role for AI. Microsoft has been steadily integrating AI aspects into Excel, and with its ever-expanding capabilities, it is set to continue revolutionising the software, sparking excitement about the future possibilities.

Microsoft unveiled Copilot, its innovative AI tool, in mid-March 2023. This cutting-edge technology is set to revolutionise the functionality of Microsoft Office programs, including Word, Excel, PowerPoint, Outlook, and Teams.

Consider the practical implications of Copilot in Excel. By harnessing natural language processing (NLP) AI techniques, it empowers users to ask questions in plain language and receive accurate, context-aware answers. This transformative technology enhances Excel’s usability, providing valuable recommendations and precise results.

By integrating Copilot into the Microsoft 365 suite, Microsoft is placing generative AI tools in front of over a billion of its users, possibly changing how large segments of the global workforce communicate with one another.

One example is the Analyze Data feature found in the most recent versions of Excel. NLP can also recommend functions, formulas, and features the user may not know, making it easier to identify the best answer.

Python in Excel

The introduction of Python in Excel has given the application a major boost. Python in Excel uses Anaconda, a prominent repository notable for allowing developers to run multiple Python environments.

Now users can do advanced data analysis within the familiar Excel interface leveraging Python, which is available on the Excel ribbon.

Access to Python allows users to use Python objects within cell functions and calculations. Consider a Python object being referenced or its data used in a PivotTable.

Even popular libraries such as scikit-learn, Seaborn, and Matplotlib can be utilised with Excel. This allows Python-created visualisations, data models, and statistical calculations to be combined with Excel functions and plugins.

The integration of Python cements Excel’s position in data analytics, suggesting that Excel’s utility in the workplace is far from diminishing. Last year, Microsoft announced that they would be experimenting with GPT in their office applications.

Interestingly, Microsoft has been following this path all along: acquiring GitHub, investing in OpenAI (which helped it build Copilot for writing Python code and draw on ChatGPT’s Code Interpreter), adding AI assistance to Power BI, and, now, bringing Python to Excel.

Looking at the progress of Excel, a future where AI writes Python in our Excel sheets cannot be ruled out. Moreover, the talk of AGI being built around Python makes the future of Excel more secure.

Why AI in Excel Is a Win-Win Situation

Microsoft Excel is the world’s most popular spreadsheet software. Approximately 54% of organisations use Excel. According to a recent report by corporate planning company Board International, 55% of organisations undertake their enterprise planning, including budgeting and sales forecasting, on spreadsheets.

Four out of five Fortune 500 companies use Excel, and over two billion people worldwide use spreadsheets.

Excel and spreadsheets became popular in the 1980s, and despite attempts by competitors, Excel and its ilk continue to dominate. Office workers today utilise tools like Excel to visualise, analyse, organise, distribute, and present chunks of corporate data—whether exported from databases or created on the fly.

Businesses continue to rely on Excel, so much so that financial specialists claim you’ll “have to pry Excel out of their cold, dead hands” before they ever stop using it.

How to use ChatGPT to analyze PDFs for free

From contracts to research papers, most important documents come in lengthy PDFs — regardless of what you do, you're guaranteed to encounter them in your lifetime, and they nearly always contain verbose language. AI can help you better understand the content and save time doing so.

ChatGPT can act as your assistant by parsing through the PDFs and being on standby for anything you need. It can answer questions, provide summaries, and even generate content from the text, including outlines, emails, and more. The best part is that the feature is entirely free.

In May 2024, about a year and a half after ChatGPT launched, OpenAI upgraded the free version of the chatbot with several GPT-4o features typically reserved for its paying customers, including the ability to upload screenshots, photos, and documents. Getting started with the tool is easy, and in the long run, it will save you lots of time and effort.

Here's how to get started.

1. Log into ChatGPT

Even though OpenAI allows you to access ChatGPT without logging in, you will need to sign in to your account to use GPT-4o and its advanced features, including Browse, Vision, data analysis, file uploads, and GPTs.

If you have never created a ChatGPT account, you can easily do so from the sign-in page or log in with your existing Google or Microsoft account. I opt for the latter option so I don't have to memorize another username and password.

2. Upload your PDF

Once you sign in, you will be brought to the ChatGPT interface. Next to the textbox where you would typically insert text to start chatting, you will see a little paper clip icon. When you click that, you will have several options: Connect to Google Drive, Connect to Microsoft OneDrive, or Upload from a computer.

Select the best option for where your document lives. Since I typically upload a document I just downloaded, I always opt for upload from my computer. Then, you can click on the PDF you want assistance with. For this example, I am using a PDF of my latest ZDNET article.

3. Add your question

Once you upload the PDF, you can accompany it with a text query that indicates what you'd like ChatGPT to do. You can make these prompts as adventurous or simple as you'd like. I kept it simple and asked for a summary: "Can you summarize what this article is about?"

I immediately got a comprehensive four-sentence summary, and since I wrote the article, I can verify that it was accurate.

For a more complex task, however, you could ask the chatbot to "Take out the action items, place them in bullet points, and format them into an email addressed to my boss."

Overall, using ChatGPT to help parse through dense PDFs can help you save time and sift through long-winded paragraphs. Of course, always make sure to check ChatGPT's work for hallucinations, just in case. If you want to explore more document-summarizing tools, stay tuned for similar capabilities coming to Apple OS later this year.
