Elon thinks AI could become humanity’s uber-nanny: excerpts from a dinner convo

Ingrid Lunden @ingridlunden / 8 hours

After months of moving fast and breaking things at Twitter, Elon Musk’s been on a crash course of a different sort in the last several weeks, doing the rounds of interviews across a range of media events and with different media platforms (although TC’s invite may have gotten lost in the mail…).

Yesterday, it was the turn of the WSJ, which featured him as its headliner to close off the opening night dinner of their CEO Council conference in London. Dialing in over video, showing up late for his half hour slot, Musk then proceeded to talk for more than an hour about Twitter (or “X slash Twitter” as Musk tried to call it), AI, Mars and more.

The interview kicked off just minutes after it was announced that Musk would be interviewing Ron DeSantis on Wednesday (today) on Twitter’s Spaces audio streaming platform, and that DeSantis would be making, in Musk’s words, “quite an announcement” as part of it.

Everyone widely expects that announcement to be about DeSantis officially entering the race for U.S. President, which prompted Thorold Barker, WSJ’s EMEA editor, to ask Musk if he was endorsing DeSantis.

Musk avoided a direct yes or no answer, which speaks to something else about him: he likes to talk in circles and his strategy sometimes goes that way, too. Musk replied that he sees Twitter as a “town square,” where he wants everyone to participate. Yes, he wants to bring more audience to the platform at a time when so many appear to be running away; but he also wants to position Twitter as a new media platform in its own right.

Musk’s actions and words should sound extremely familiar. In fact, it often feels like he has spent a lot of time tearing apart the company only to land on an idea that is a repeat of what it was already trying to do: build a media company. That is what he is now doing, complete with a new CEO with advertising chops whom he’s poached from a media company, a strategy to court advertisers, and a push to build entertaining content.

“Ranging from the left, moderate, to what’s considered right… I do think it’s important that Twitter be about the reality and the perception of a level playing field, a place where all voices are heard, and where there’s the kind of dynamic interaction that you don’t really see anywhere else,” he said. “I mean, today on Twitter, for example, AOC and Ted Cruz got into an argument, which was, independent of which side you agree with, very entertaining.” (Cue audience laughter.) “I [would] really just like someone fairly normal and sensible to be the President,” he added later, to more chuckling from the audience.

For Musk, the Twitter story he wants to tell — or at least hopes to sell — is that Twitter is on the return path from the layoffs, internal restructuring, and big changes. And that it’s entertaining.

It’s hard to know how close that story is to reality at this point.

In the meantime, and before investors run out of patience, Musk remains exasperating and compelling in equal measure. As his half-hour conversation stretched past the hour mark, he ranged from ambitions to make Mars a self-sustaining civilization to who might succeed him at any of his companies, and from the political situation in China to the future of humanity.

There aren’t many people who would be asked this range of questions in one sitting, and even fewer whose answers anyone would be interested to hear. Yet that’s what you get with this guy. Here’s a selection of excerpts from the conversation:

Musk says he has only one part-time assistant.

To an audience of CEOs and other executives “scheduled within an inch of their lives,” as Barker put it, Musk described the chaos of being at the helm of three different companies: Twitter, SpaceX and Tesla. (No wonder he dreams about building X, an everything app: it could help consolidate some of what he has to oversee.)

“My days are very long and complicated, as you might imagine, and there’s this great deal of context switching… Switching context is quite painful. But I do generally try to divide companies so it’s predominantly one company on one day, so today’s a Tesla day for example, although I might end up at Twitter late tonight… Time management is extremely difficult. And this is going to sound pretty strange, but I only have one part-time assistant… but I do most of the scheduling myself. And the reason is because it’s impossible for someone else to know what the priorities are. So since the most valuable thing I have is time, I schedule it myself.”

Musk believes civilization is less robust than people assume, and that AI might accelerate its destruction.

For someone who is time-poor, it’s interesting how bearish Musk has become on AI. He touts the AI built inside Tesla as the best in the world, and he was an early backer of OpenAI, but it seems he’s not a fan of artificial general intelligence, not least because it’s coming and seems to be out of control. “I think that it’s unnecessary for everything, but it is happening and happening very quickly. There is a risk that advanced AI either eliminates or constrains humanity’s growth.”

But don’t take any of that too seriously! Much later in the same conversation he contradicted himself to say, “I don’t think AI is going to try to destroy humanity, but it might put us under strict controls,” describing a scenario of “AI assuming control for the safety of all the humans and taking over all the computing systems and weapon systems of Earth and effectively being like some sort of uber-nanny.”

Musk claims that Twitter has gotten rid of hate speech and 90% of the “scam and spam entrepreneurs” on its platform, which will make it more attractive to advertisers.

The pair touched lightly on Musk’s “courtship” of Linda Yaccarino, who was named as Twitter’s new CEO last week (she has yet to step into the job). It sounded like their interaction first started as a conversation about advertising, but Musk didn’t get into the specifics of how, and when, that turned into a recruitment exercise. Instead, he shifted direction to talk about how the company’s ailing ads business is coming out of rehab.

“Linda felt that it would be very helpful for advertisers to see me in person, so [she] invited me down to a conference in Miami, which was very helpful,” he said of his appearance on stage with Yaccarino several weeks before he named her CEO.

He said that there, he “met with a number of advertisers personally to assure and show them that Twitter is a good place to advertise.”

He claimed that “hate speech has declined” and that “the quality of the system, especially with respect to scammers and spammers, is dramatically better than it used to be… We’ve gotten rid of, at this point, well over 90% of scam and spam entrepreneurs. It should be quite rare at this point that you see a scam.”

He also said that it’s figured out how to shut down bot farms, presumably in aid of improving authentic sentiment on the platform, although he didn’t spell out exactly why these were a problem for advertisers; what if, for example, they were brigading in favor of a brand?

He says that he and the new CEO will work together on moderation.

Musk has been quite outspoken before about free speech, and specifically about his own right to say what he wants, which doesn’t sound like it will change going forward. While “the general principle is that we will hew close to the law. So for any given country, we will try to adhere as close to the law as possible,” the company is working on “adjacency controls” to ensure that “if you’re Disney, for example… the content nearby [will be] family friendly. That’s totally understandable.”

MDAU wasn’t a great metric.

The problem with “monetizable daily active users,” Twitter’s self-defined metric that it adopted when still a public company to describe its audience and how it was growing, was, Musk said, that “a bunch of those users… would see a notification on their phone, about a tweet, but they wouldn’t actually click through to the site. Certainly what really matters is true user seconds of screen time.” That is what the company tracks now, he said. “That’s based on the screen time as reported to us by iOS, Android and the browser.” He says the feedback from advertisers for this measurement has been positive.

Is there a limit on free speech — at least Elon’s free speech?

The topic of George Soros came up, specifically Musk’s characterization of him and the huge amount of outcry and argument that it generated. Elon really does not feel he should be reined in.

“I’m not going to mitigate what I say because that would be inhibiting for your freedom of speech. That doesn’t mean you have to agree with what I say… The point is to have a divergent set of views. Free speech is only relevant if it’s speech by someone you don’t like. Is that allowed? If so, you have free speech. Otherwise you do not. And for those who would advocate censorship, I would say it is … only a matter of time before the censor gets turned on you.” Easy to say, harder to see played out.

Twitter will be hiring again.

The conversation then veered into a totally different area, the subject of Twitter staffing — with unspoken questions about moderation, and who is left to handle it, possibly serving as the bridge. With the company now down to about 1,500 employees from the 7,500+ it had when he took over last year, there are indeed a lot of questions about how it can recover, build and grow. There will be more people added, he said, not least if it plans to revive that advertising business and court big advertisers. Oh, and build that “everything” app.

“We are going to start adding people to the company, and we have started adding some number of people to the company,” he said. “But there’s still a lot of change that has to happen. So I think 1,500 is probably a reasonable number.”

On Twitter’s valuation.

“All’s well that ends well,” was his short answer to whether he regretted buying Twitter. On the ambition/projection that the company could be worth $250 billion after he bought it for $44 billion, he described Twitter as “on the comeback arc.” He didn’t have an answer for whether Twitter could go public again, nor whether it would even stay headquartered in San Francisco.

Content moderation: bad; AI regulation: good

Musk is okay with people speaking their minds, but he is less okay with AI doing the same.

“I’ve been pushing hard for a long time and met with a number of senior senators and congressmen, people in Congress in the White House to advocate for AI regulation, starting with an inside committee that is formed of independent parties as well as perhaps just bids from the leaders in industry,” he noted.

One area that he’s particularly concerned with is the presence of AI in social media. On the subject of weaponizing AI: “the pen is mightier than the sword. So one of the first places to be careful of AI being used is in social media, to manipulate public opinion.”

He said this was one of the reasons he wants to turn Twitter into a primarily subscriber-based system: “because it is dramatically harder to create” fake accounts, and thus more likely that any given account is an authentic person, or so the thinking goes. “It’s like 10,000 times harder to create an account that has a verified phone number from a credible carrier, that has a credit card, and that pays a small amount of money per month. And to have those credit cards and phone numbers be highly distributed, not clustered, is incredibly difficult. Whereas in the past someone could create a million fake accounts for a penny apiece and then manipulate, or have something appear to be very, very much liked by the public when in fact it is not, or promoted and retweeted when in fact it is not. The popularity is not real; it’s essentially gaming the system.

“So the bias towards a subscription based verification, I think, is very powerful. And that really, you won’t be able to trust any social media company that does not do this, because it will simply be overrun with bots to such an extreme extreme degree.”
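As a back-of-the-envelope illustration of the argument, the economics look roughly like this. The penny-per-account figure is Musk’s; the $8/month subscription price is an assumption based on Twitter Blue’s advertised pricing, not something he stated here:

```python
# Rough cost comparison for running a million fake accounts.
# Assumptions: $0.01 per throwaway account (Musk's figure) and an
# assumed $8/month verified subscription kept alive for a year.
accounts = 1_000_000

legacy_cost = accounts * 0.01          # old-style disposable accounts
subscription_cost = accounts * 8 * 12  # paid, verified accounts for a year

print(f"legacy botnet: ${legacy_cost:,.0f}")
print(f"subscribed botnet (1 yr): ${subscription_cost:,.0f}")
print(f"cost multiple: {subscription_cost / legacy_cost:,.0f}x")
```

Under those assumptions the paid botnet costs nearly four orders of magnitude more, which is roughly the “10,000 times harder” Musk claims, before even counting the difficulty of sourcing distributed phone numbers and credit cards.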

BLOOM Finally Blossoms into a Multilingual Chatbot

BLOOMChat

Just when we thought BLOOM wasn’t going anywhere, Palo Alto-based SambaNova, in collaboration with Together, launched BLOOMChat, an open-source, multilingual chatbot built on top of BLOOM-176B, the multilingual LLM pre-trained by the BigScience group.

Built for research and commercial use, BLOOMChat can be used for various purposes, as it is offered under a modified version of the Apache 2.0 licence that incorporates RAIL usage restrictions derived from BLOOM.

Click here to check out BLOOMChat.

The blog explains that to make this model, SambaNova fine-tuned BLOOM-176B on LAION’s OIG dataset from OpenChatKit, Dolly 2.0, and OASST1.

BLOOM is one of the largest open-source multilingual models, covering 46 languages, and by fine-tuning it on open conversation and alignment datasets, the company was able to turn it into a user-interactive chatbot.
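For a sense of what “user-interactive” means in practice, chat-tuned models like this are prompted with role tags wrapped around each turn. The `<human>:`/`<bot>:` tags below follow the prompt format described on BLOOMChat’s model card, though the exact template should be verified there; the helper function itself is just an illustrative sketch:

```python
# Minimal sketch of a chat prompt builder using the <human>:/<bot>:
# turn tags BLOOMChat's model card describes (verify against the card;
# other chat-tuned models use different templates).
def build_prompt(turns, user_message):
    """turns: list of (user, assistant) pairs from earlier in the chat."""
    parts = []
    for user, assistant in turns:
        parts.append(f"<human>: {user}")
        parts.append(f"<bot>: {assistant}")
    parts.append(f"<human>: {user_message}")
    parts.append("<bot>:")  # the model generates its reply from here
    return "\n".join(parts)

prompt = build_prompt([("Hola, ¿qué tal?", "¡Muy bien!")],
                      "Translate that to English.")
print(prompt)
```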

The chatbot was developed using SambaNova RDUs (Reconfigurable Dataflow Units). Its training has resulted in impressive performance, with a win rate of 45.25% against GPT-4’s 54.75% in a human preference study conducted across six languages.

SambaNova said that the need for a multilingual LLM has been there all along, since most new models, like OpenChatKit, Dolly 2.0, LLaMA-Adapter-V2-65B, and Vicuna, have focused only on English. The model is still a little behind GPT-4 on performance, but shows competitive results.

Moreover, BLOOMChat has showcased strong capabilities in WMT translation tasks, outperforming other BLOOM variants as well as mainstream open-source chat models, thus establishing its prominence in this domain.

SambaNova also said that the model is preferred 66% of the time compared to mainstream open-source chat LLMs.

The company admits that there are still several limitations in the model, such as hallucination, code switching, repetition, weak maths, and a propensity for toxicity.

BLOOM-ing Slowly, but Surely

Unleashing the power of open source, BLOOM was developed by a group of 1,000 researchers from 60+ countries and 250+ institutions. The 176-billion-parameter LLM can generate text in 46 natural languages and dialects and 13 programming languages. The work, however, has been slow compared to other open-source models mushrooming in the space.

Meta’s LLaMA, for instance, has been making noise in the open-source ecosystem ever since its leak on GitHub a few months ago, with developers experimenting with it to reduce the model’s memory requirements and more. Soon came Vicuna, Alpaca and others, which have been instrumental in significantly lowering training and inference costs compared to the closed-door, multimillion-dollar-backed companies training similar models.

Recently, Hugging Face also launched HuggingChat. Just like BLOOMChat, this chatbot too provides various functionalities and integrations catering to both developers and researchers, making OpenAI count its ChatGPT days. Read: OpenAI has Stopped Caring About ‘Open AI’ Altogether.

The post BLOOM Finally Blossoms into a Multilingual Chatbot appeared first on Analytics India Magazine.

How to use search operators to refine your Bing AI search results

Image: Mark Kaelin/TechRepublic

Microsoft has spent a great deal of time and resources developing a new and improved AI-enabled version of Bing. Presumably, the plan is to make Bing more useful and more relevant when compared to its well-established competition. Only time will tell if this gambit will be successful.

SEE: Find the right artificial intelligence architect for your team using this hiring kit from TechRepublic Premium.

In the meantime, Windows 11 users will benefit from quick access to the new AI-enabled Bing directly from their standard desktop search box located on the taskbar. To get the best search results, users should take advantage of built-in search operators.

These 10 common search operators help you refine your search inputs, which in turn refine your search results, saving time and energy and increasing your productivity.

Use search operators to refine your Bing AI search results

To work properly, search operators for Bing have a specific syntax that must be followed. The search operator is always followed by a colon, which is immediately followed by a parameter. There are no blank spaces before or after the colon, as shown here:

operator:parameter

This convention takes some getting used to, but it should be followed every time. Capitalization is not important.

SEE: Use Google? Try these tips to get better Google search results.

Note: There are more search operators available, but these 10 are likely the most useful.

site:

Adding the site: operator to a search query will limit that search to a specific website. For example, windows 11 site:techrepublic.com will find Windows 11 related articles only on TechRepublic. The site: operator only shows results for two levels of subdomains.

domain:

The operator domain: will limit a search query to an entire domain, including all indexed subdomains.

contains:

The operator contains: limits search query results to pages containing links to specific file types. For example, if you want to find links to .pdf files, use the operator contains:pdf.

filetype:

If you want to search for results with a specific file type, you will use the filetype: operator. Note the subtle difference between contains: (finding links to PDFs) and filetype: (finding PDFs).

define:

If you only want the definition of a word or phrase, add the define: operator (Figure A). For example, define:artificial intelligence.

Figure A

Using the define: operator, you can quickly get definitions of terms.

imagesize:

The imagesize: operator will search Bing Images for images related to your search query limited to your specified size. The size parameter can be small (less than 200 pixels), medium (200 to 500 pixels) or large (greater than 500 pixels).

inanchor:

The inanchor: operator will limit your results to just webpages with your query word or phrase in their anchor text, such as titles, subtitles and headings.

inbody:

Use the inbody: operator to limit your search query word or phrase to each indexed website’s body text.

intitle:

The intitle: operator will limit results to pages where the queried word or phrase appears in the webpage’s title.

location:

The location: operator will limit a search to a specific location. For example, birds location:us will limit results to websites identifying themselves as residing in the United States.
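To see how these operators compose into a single query, here is a small sketch that joins free-text terms with operator:parameter pairs and escapes them for a Bing search URL. The `bing_query` helper and its names are illustrative, not a Microsoft API; only the operator syntax and the `/search?q=` URL shape come from Bing itself:

```python
from urllib.parse import urlencode

def bing_query(terms, **operators):
    """Build a Bing search URL from free-text terms plus operator:parameter
    pairs, e.g. site="techrepublic.com". Note there is no space around the
    colon, per Bing's operator syntax."""
    parts = list(terms)
    parts += [f"{op}:{param}" for op, param in operators.items()]
    return "https://www.bing.com/search?" + urlencode({"q": " ".join(parts)})

# Find Windows 11 PDFs on TechRepublic:
url = bing_query(["windows", "11"], site="techrepublic.com", filetype="pdf")
print(url)
```

Pasting the generated `q` value into the Bing search box has the same effect; the URL form is just convenient for bookmarks and scripts.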

Better searches yield better results

For general web surfing, these Bing search operators might seem to be extra work for little gain, but for serious searches performed as part of your job duties and under time constraints, they can save substantial amounts of time and effort.

Better search results, the kind of results you can actually use, require better search queries, and these search operators can be the key to getting those desired results.

Read next: See how Microsoft might turn Bing Chat into your AI personal assistant.


The most important reason you should be using Linux at home


Let's be clear: Linux powers pretty much everything you use. Cloud? Linux. Social media? Linux. Google? Linux.

You cannot escape it. In fact, without Linux and open-source software, businesses across the globe wouldn't be nearly as competitive. This is a fact, not an opinion.

What is an opinion, however, is that Linux isn't viable for consumers or home users.

I would argue that opinion is a bit shortsighted because the current state of Linux not only makes for an ideal operating system for your home computers, but also can bring you considerable value as a server OS on your home network.

Don't stop reading yet. I know you're probably thinking, "I don't know how to set up a server!" What you might not know is that it's far easier than you might think.

Also: How to choose the right Linux desktop distribution

And beyond that, there's a world of possibility to unleash when you allow Linux and open-source software into your home.

And it's not just about cost. Yes, Linux is a free operating system. You can download a single ISO image, burn it to a USB drive, and install Linux on as many computers as you like. More than anything, Linux is about freedom. Instead of having to do things the Apple or Microsoft way, you can do it your way.

It's the Burger King of operating systems.

Or not.

Anyway, what I'm talking about is the freedom to use it how you need it, where you need it, and when you need it. It's also about security. Although Linux is one of the most secure operating systems on the market, that's not exactly the kind of security I'm talking about.

Let me explain.

You probably use Google Workspace, Office 365, or iCloud. I consider myself a Google Workspace power user because I've used it for hours every day and have been doing so for a very long time. I use Google Workspace knowing that everything I create or save to that service is available to a third party. For most of what I create, that's fine. However, there are certain sensitive documents I create that I'm not okay with sharing or saving via a third-party hosted solution. For that, I would much rather keep things in-house.

Also: My idea for a great new beginner-friendly Linux distribution

That's where Linux and open-source come in for me and should be considered for you as well.

On my home network, I have a number of Linux servers deployed that I use as in-house cloud systems, invoicing and billing platforms, project management tools, and more. Beyond the operating system, the software I use for these purposes includes the following:

  • Nextcloud: As my in-house cloud service (for storing, sharing, editing, and creating documents).
  • InvoicePlane: For billing and invoicing clients.
  • OpenProject: For project management.
  • Portainer: For container deployments.

I also employ Samba on all of my Linux machines for file sharing across systems.
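For a flavor of how little configuration that takes, a basic home file share boils down to a short stanza in `/etc/samba/smb.conf`. The share name, path, and user below are illustrative examples, not defaults:

```ini
# Example share definition for /etc/samba/smb.conf
# (share name, path, and user are illustrative; adjust for your setup)
[homeshare]
   # directory to export to the network
   path = /srv/share
   browseable = yes
   read only = no
   # replace with a user you have added via `smbpasswd -a`
   valid users = jack
```

After editing the file, restarting the Samba service makes the share visible to the other machines on the LAN.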

I'm not saying that just anyone can get the likes of Nextcloud up and running, but it's not nearly as hard as you think. In fact, with simple instructions (that I will provide in upcoming tutorials), you'd be surprised that yes, you can successfully install those platforms and make use of some powerful and flexible applications without leaving your home network.

Also: How to set up a cloud service at home

The privacy and security of your information

That's power. And the security you get by not saving sensitive data on a third-party, public service cannot be overstated. You could effectively replace all of those third-party services (some of which you pay for) with free, open-source tools on your network. By doing this, you're not relying on Google, Microsoft, Dropbox, Slack, or Apple to keep the security of your data as a top priority. Although the chances of Google getting hacked are slim, it's not impossible. But more than that, one of the issues that's causing me great concern is AI.

Also: Cool things you can do with the Linux desktop that you can't do with MacOS or Windows

Consider this: In order to be effective, artificial intelligence must be trained. With Google, Microsoft, and who knows who else using more and more AI, they need content to train their systems. Who's to say they are not using documents saved to their systems as fodder for training? Personally, I don't want my novels being used for such purposes. Because of that, I'm seriously considering migrating from Google Docs to an in-house Nextcloud instance. Nextcloud includes all the features I need to develop and write fiction, without having to worry those books are being used to train AI.

That may not be a make-or-break reason for you, but it is for me. I write for a living and do not want my work to be used for anything other than its original purpose. Along with that level of assurance, I also prefer bringing those needs in-house because I am in complete control. On top of that, should I lose my internet connection, I can still reach the servers on my network, so I can continue to work.

Don't forget the desktop

I'm not saying you must use Linux on the desktop if you plan on using Linux as a server OS for your network. That's not the case, because most of those services you deploy with Linux would be used via a web browser. But the thing about Linux on the desktop is that it removes a number of frustrations you've probably experienced with other operating systems. As I said earlier, with Linux you get to do things your way. If you don't like the way a Linux distribution works, you can change it. You won't find that level of freedom with MacOS or Windows.

Also: 4 ways Windows gets MacOS wrong

There are also tons of software titles you can install and use for free, some of which are even proprietary. You can install Spotify, Slack, and more… so you're not really missing out on anything. And if you like games, there's Steam. Yes, once upon a time, there were glaring holes in the available software options for Linux. Given nearly everything today is handled by way of a web browser, those glaring holes are far fewer. Plus, with the rise in popularity of Snap, Flatpak, and AppImages, even a number of proprietary apps have made their way to the operating system.

Not only do you have all the apps you need, but you can also deploy numerous different services, and keep all of your important data in-house. Doing that with proprietary OSes isn't nearly as easy as it is with Linux and open-source.

So, what are you waiting for? Let's get Linux installed and start deploying those services you want to run within the confines of your LAN. It's secure, reliable, and as flexible as you want.


KDnuggets News, May 24: Free ChatGPT Course: Use The OpenAI API to Code 5 Projects • Super Bard: The AI That Can Do It All and Better

Features

  • Free ChatGPT Course: Use The OpenAI API to Code 5 Projects by Kanwal Mehreen
  • Super Bard: The AI That Can Do It All and Better by Abid Ali Awan
  • Bayesian vs Frequentist Statistics in Data Science by Nisha Arya

From Our Partners

  • Design effective & reliable machine learning systems! by Manning
  • Innovations in Measuring Community Perceptions Challenge by Intellibridge

This Week's Posts

  • A Beginner's Guide to Anomaly Detection Techniques in Data Science by Eugenia Anello
  • Introduction to Correlation by Benjamin O. Tayo
  • 5 ChatGPT features to boost your daily work by Josep Ferrer
  • IT Staff Augmentation: How AI Is Changing the Software Development Industry by Santiago Alonso
  • How to Efficiently Scale Data Science Projects with Cloud Computing by Nate Rosidi
  • The Future of AI: Exploring the Next Generation of Generative Models by Nisha Arya
  • How I Did Automatic Image Labeling Using Grounding DINO by Parthiban Marimuthu
  • Exploring Data Distributions with Histograms by Benjamin O. Tayo
  • WebLLM: Bring LLM Chatbots to the Browser by Bala Priya C
  • StarCoder: The Coding Assistant That You Always Wanted by Abid Ali Awan
  • What Are Foundation Models and How Do They Work? by Saturn Cloud

More On This Topic

  • Free ChatGPT Course: Use The OpenAI API to Code 5 Projects
  • KDnuggets News, September 21: 7 Machine Learning Portfolio Projects to…
  • Super Bard: The AI That Can Do It All and Better
  • 8 Open-Source Alternative to ChatGPT and Bard
  • ChatGPT vs Google Bard: A Comparison of the Technical Differences
  • New ChatGPT and Whisper APIs from OpenAI

Opera launches new integrated AI sidebar powered by OpenAI’s ChatGPT

Aisha Malik / 7 hours

Opera announced today that it’s introducing an AI side panel in its browser called “Aria” that is powered by OpenAI’s ChatGPT. The company says Aria is both a web and a browser expert that makes it easier to find information on the web, generate text or code, or get your product queries answered. The new feature is currently available for testing.

Aria is also able to answer questions about Opera itself, as the company says it’s knowledgeable about the browser’s “whole database of support documentation.” Opera says Aria is based on its “composer” infrastructure and connects to OpenAI’s GPT technology and is enhanced by additional capabilities, such as adding live results from the web.

The company outlines that Aria is a free service with up-to-date information, which means it’s connected to the internet and not limited to content prior to 2021, as is the case with standard GPT-based solutions. Aria is launching in more than 180 countries.

You can test Aria by downloading the newest version of Opera One. Android users can test Aria in the latest beta version of the browser. Opera notes that, as a tester, the only thing you need to do to access Aria is create an Opera account, if you do not already have one. You will then be notified via email or inside the product about your whitelisting status. Once your account is whitelisted, you can access Aria through the settings of Opera for Android beta or through the browser sidebar of Opera One.

Image Credits: Opera

Opera’s latest launch builds on the browser’s current AI features. Earlier this year, Opera integrated generative AI chatbots powered by ChatGPT and ChatSonic into its desktop browsers, Opera and Opera GX. The company also launched a feature that lets you generate AI prompts by highlighting text on a website or typing it in. These chatbots can summarize articles or web pages, write social media posts for you or help you ideate through prompts.

Last month, the company launched Opera One, a redesigned version of its flagship browser that it says has elements that will make it ready for a “generative AI-based future.”

“We’re now forging ahead by introducing a browser AI that will allow you to interact with the web aided by AI directly in the browser,” Opera wrote in a blog post. “Aria’s current form of a chat interface that communicates directly with you, the user, marks the first stage of the project. The AI-based service is set to become even more integrated into Opera in the coming versions of the browser, with the ultimate aim of being natively blended into the browser to help you perform cross-browser tasks.”

The new sidebar is somewhat similar to the features that Microsoft has introduced in Edge. In March, Microsoft announced that its Edge web browser will include a new Bing AI chatbot in a sidebar. Of course, Opera and Microsoft aren’t the only companies integrating AI-powered tools into their browsers: Brave introduced summarization features in its search engine earlier this year and is experimenting with AI-focused features for its browser.


Jeffrey Ullman’s Unsettling Ultimatum 

“Looking back, it’s amazing how easy things were for researchers when I was a young man, in comparison to just how competitive the field has become,” said the 80-year-old American computer scientist Jeffrey Ullman in an exclusive interview with AIM.

Recalling the state of research in the ’60s, he said that everywhere you looked, there was something new. Now a lot of new directions come up, but so much of research these days is squeezing a little bit of performance out of something just to get a small improvement.

Further, talking about research, he said, “Part of the problem is that too many people are led to believe that success in life depends upon them doing research. What has happened is that people who could be excellent teachers are basically forced to do second- and third-rate research, because that’s how they get promoted.”

He humbly opined, “If there’s money to be made from that case, just make your money. Go do something else.”

Ullman won the Turing Award in 2020 for co-creating the building blocks of computer programming along with his longtime friend Alfred Aho. The duo refined one of the key components of a computer: the “compiler”.

Bias In, Bias Out

Ullman’s contributions to computer science have a direct impact on the foundations of data science. His research underpins many key concepts and techniques employed in the storage, analysis, and interpretation of data, making him an influential figure.

Emphasising the relevance of data science, he said, “The thing that makes machine learning powerful is that you can feed so much data to these interesting models. For example, multi-layer neural nets only make sense if you look at, typically, millions or billions of training examples.” He also noted that “for years people have been trying to exploit data to solve problems”.

Clarifying the bias problem in language models, he said, “Often people think the machine learning algorithms introduce bias. 50 years ago, everybody knew ‘garbage in, garbage out’. In this particular case, it is ‘bias in, bias out’.”

He cited the canonical example of a company that decided to write software to sift through the resumes of job applicants. “If the data shows women have not been promoted in proportion to their ability, then the AI is going to learn that being female is a negative indication of success, and they’re gonna throw out females. If it’s a good algorithm, it’s going to learn and pick that up,” he said.

The fault is not with the AI but with the company policy and the data that policy has generated, Ullman believes. “If they had built an LLM in the time of Galileo, the chat would have said that the sun revolves around the Earth, because everybody believed that,” he chuckled.
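Ullman’s resume example can be sketched in a few lines of code. The following is a deliberately naive, hypothetical scorer (not anything the interview describes being built): it simply estimates hiring rates from biased historical records, and the bias in the data comes straight back out as the model’s “learned” preference.

```python
from collections import defaultdict

# Toy illustration of "bias in, bias out": a naive scorer fed biased
# historical hiring data learns gender as a signal. All records here
# are hypothetical and chosen only to make the effect visible.

def train_scorer(history):
    """Estimate P(hired | gender) from (gender, was_hired) records."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for gender, was_hired in history:
        total[gender] += 1
        hired[gender] += was_hired
    return {g: hired[g] / total[g] for g in total}

# Historical data in which equally able women were hired less often.
history = [("F", 0), ("F", 0), ("F", 1), ("F", 0),
           ("M", 1), ("M", 1), ("M", 0), ("M", 1)]

scores = train_scorer(history)
print(scores)  # the scorer has absorbed the bias: F scores 0.25, M scores 0.75
```

The algorithm itself is neutral; it is the skewed history that makes “F” look like a negative indicator, which is exactly Ullman’s point.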

ChatGPT and Eric Schmidt

“I am surprised you didn’t ask if these language models are becoming sentient,” he said jokingly. “The deceptive thing is that the people who created the text on which these models are built are sentient. Because the LLM is regurgitating the consensus of what sentient beings say, it is going to sound sentient,” he added.

A couple of days earlier, Ullman had been discussing with Eric Schmidt the problem of what happens when you ask ChatGPT for the best way to kill a million people.

The former Google CEO told Ullman that it wouldn’t answer that question; it knows not to let that question be answered. But that doesn’t mean you can’t rephrase it in such a way that you can get some really dangerous information out of it.

“It’s a very hard problem. If you understand that there is something these generative models shouldn’t be saying, you can stop them from saying it. But how do you understand everything that could cause trouble?” Ullman pondered.

Drawing a parallel, he recalled social networks’ attempts to rid their platforms of hate speech. The tech giants have put in effort for years, but how to write an algorithm that disallows anything we would all agree is hate speech and allows anything we would all agree is not hate speech still remains an unsolved problem.

Current work

Currently, Ullman is trying to calculate how much text is available; he has heard that the newest model, GPT-4, essentially uses everything. “At some point, they will run out of new data. You’ve got 8 billion people. Let’s say half of the people type around 100,000 words a year. You’ve got a couple of hundred trillion words of data, and that might support a model with significantly more than a trillion parameters.”

Ullman believes that there are many occupations where an LLM could be used as a tool, but as for the idea that lawyers and physicians are going to be completely replaced by these models, “I do not think it’s going to happen,” he said.

“In fact, what I suspect will happen is exactly what happened when tools like Microsoft Word became available. You’ll spend just as much time writing or doing your job, but the output will be better quality. The jobs won’t really go away, the results may be better if these models are used,” he concluded.

The post Jeffrey Ullman’s Unsettling Ultimatum appeared first on Analytics India Magazine.

Data Engineering Landscape in the AI-Driven World


One of the biggest impacts of generative AI has been the wider adoption of “prompt engineering,” essentially the skill of prompting AI to assist in coding-related tasks. I’ve seen Andrej Karpathy joke on Twitter, “The hottest new programming language is English.”

Generative AI has also kick-started a gold rush, with dozens of very early start-up companies racing to develop an AI that can query the data warehouse and return an intelligent answer to the ad hoc questions data consumers ask in their natural language. "This would radically simplify the self-service analytics process and further democratize data, but it will be difficult to solve beyond basic 'metric fetching,' given the complexity of data pipelines for more advanced analytics," commented Monte Carlo CTO Shane Murray.

"When I evaluate data engineering candidates for a role, I’m looking for their track record of making an impact and hitting the ground running," Murray mentioned. That could be in their primary occupation or by contributing to open-source projects. In either case, it’s not that you were there, but what impact did you make?

If you don’t like change, data engineering is not for you. "Little in this space has escaped reinvention," Murray remarked. It’s clear that the process of building and maintaining data pipelines will become much easier, as will the ability for data consumers to access and manipulate data.

However, what hasn’t changed is the data lifecycle. "It is emitted, it is transformed for a use, and then it is archived," Murray pointed out. "While the underlying infrastructure may change and automation will shift time and attention to the right or left, human data engineers will continue to play a crucial role in extracting value from data, whether architecting scalable and reliable data systems or as specialist engineers within a chosen domain of data."
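The lifecycle Murray describes (data is emitted, transformed for a use, then archived) can be sketched as a minimal pipeline. The stage names, records, and the in-memory "warehouse" below are purely illustrative, not any real framework's API.

```python
# Minimal sketch of the data lifecycle: emit -> transform -> archive.
# Everything here is a hypothetical illustration in plain Python.

def emit():
    """A source system emits raw events (here, page-load timings in ms)."""
    return [{"user": "a", "ms": 1200}, {"user": "b", "ms": 800}]

def transform(events):
    """Shape raw events for an analytical use case (ms -> seconds)."""
    return [{"user": e["user"], "seconds": e["ms"] / 1000} for e in events]

def archive(rows, store):
    """Persist the transformed rows in a store (a list stands in here)."""
    store.extend(rows)
    return store

warehouse = []
archive(transform(emit()), warehouse)
print(warehouse)  # two rows, with timings converted to seconds
```

However the infrastructure underneath changes, each of these three stages still needs a human engineer deciding what the data means and how it should be shaped.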

Data Platform Teams Offer Opportunities

I've found that data platform teams, which are now quite common in organizations of various sizes, are great places for data engineers to cut their teeth.

Murray further explained, "Here, you can specialize in a specific domain of data that is central to the business operations, such as customer data or product / behavioral data. In this role, you should aim to gain an understanding of the end-to-end problem—from source to the analytical use case—as it'll make you an asset to the team and the business."

"Alternatively, one might specialize in a specific capability of the data platform, such as reliability engineering, business intelligence, experimentation, or feature engineering," Murray specified. "These types of roles typically give a broader, but shallower, understanding of each business use case, but may be an easier jump from a software engineering role into data."

Another path I'm seeing more often for data engineers is the data product manager role, said Murray. If one is growing data engineering skills but finds they are more compelled by talking to end users, articulating the problems to be solved, and distilling the vision and roadmap for the team, then a product management role may be a future prospect.

Data teams are beginning to invest in this skillset as we move to treat "data as a product," ranging from critical dashboards and decision-support tools to applications of machine learning that are critical to business operations or customer experience. "Great data product managers will have an understanding of how to build a reliable and scalable data product, but also apply product thinking to drive the vision, roadmap, and adoption," Murray affirmed.

Modern Data Stack

The modern data stack is quickly becoming the dominant, trending tech stack in the data engineering field, Murray articulated. This stack has a cloud-based data warehouse or lake at the center and complementary cloud-based solutions for data ingestion, transformation, orchestration, visualization, and data observability.

It’s advantageous because it has a quick time to value, is fundamentally more user-friendly than the prior generation of tools, is extensible to a wide range of analytical and machine learning use cases, and can scale to the size and complexity of data managed in today’s world.

"The exact solutions will vary depending on organizational size and specific data use cases, but generally the most common modern data stack is Snowflake, Fivetran, dbt, Airflow, Looker, and Monte Carlo. There may also be Atlan and Immuta to address data catalog and access, respectively," Murray explained. "Larger organizations or those with more machine learning use cases will typically have data stacks that more heavily utilize Databricks and Spark."

A Potential Disruption

"The modern data stack era kicked off by Snowflake and Databricks hasn’t even reached a point of consolidation yet, and already we are seeing ideas that may further disrupt the status quo of modern data pipelines," Murray reflected. "On the near horizon are the more widespread adoption of streaming data, zero-ETL, data sharing, and a unified metrics layer." Zero-ETL and data sharing are particularly interesting as they have the potential to simplify the complexity of modern data pipelines, which have multiple points of integration and thus failure.

Tech Job Landscape

The tech industry job market is projected to experience a significant shift in 2023, driven by the growth of big data analytics. According to Dice Media's analysis, this shift will occur as the global big data analytics market is expected to grow at an impressive rate of 30.7 percent, reaching a projected value of $346.24 billion by 2030. This growth is anticipated to create numerous opportunities for skilled professionals in the field, such as data engineers, business analysts, and data analysts.

"I strongly believe that data engineering jobs will not be solely about writing code, but rather, they will involve more communication with business stakeholders and designing end-to-end systems," commented Deexith Reddy, an experienced data engineer and open-source enthusiast. "Therefore, to ensure job security, one must focus on both the breadth of data analytics and the depth of data engineering."

Generative AI is likely to make the data engineering field more competitive. However, during our call, Reddy also emphasized that contributing to open-source projects will always be beneficial for building a strong portfolio, considering technological advancements and recent AI breakthroughs.

Reddy shed further light on the critical role data engineers play in enhancing an organization's capabilities by utilizing open-source technologies. For instance, there has been widespread adoption of open-source technologies like Apache Spark, Apache Kafka, and Elasticsearch among data engineers, as well as Kubernetes among data scientists for data science practices. These OSS technologies help meet the computational requirements for deep learning and machine learning workloads, as well as MLOps workflows.

Companies often identify and recruit top contributors from open-source projects like these, fostering an environment that values and encourages open-source contributions. This approach helps retain skilled data engineers and allows organizations to benefit from their expertise.
Saqib Jan is a writer and technology analyst with a passion for data science, automation, and cloud computing.


Quantum resistant cryptography – bolstering cyber security against the threats posed by quantum computing


Cyber security experts face a tough challenge from a new type of computer: quantum machines capable of easily breaking through security codes. Quantum computers, based on principles of quantum physics instead of standard electronic systems, are still nascent and do not have enough processing power to crack encryption keys. However, the experts at QDex Labs believe that the threat from quantum computers could soon become real. Therefore, they are ready with a new standard of security system that protects computer systems from existing cyber threats as well as the security threats posed by quantum computers.

How cryptography works

Solid cryptographic protocols have been working well to protect computer systems from cyber threats. The principle of encryption relies on a series of complex mathematical formulas that render an original piece of information unreadable by converting it into something that looks like gibberish. Digital ciphers can encrypt all types of data effectively, maintaining confidentiality and ensuring total data safety.

All data remains encrypted during storage and transmission, protecting it from harm and thereby enhancing the confidence of all stakeholders.

Types of encryption

There are two main types of cryptography – symmetric and asymmetric or public key.

  • Symmetric encryption – Symmetric encryption uses the same key for encrypting and decrypting data. It is fast and widely used to encrypt stored data and communications.
  • Public-key cryptography – This type of cryptography uses a pair of mathematically linked keys. The public key is shared with anyone who wants to encrypt messages for the owner, while the owner keeps the private key that decrypts them. This kind of cryptography also makes it possible to sign documents, messages, and certificates by linking the owner’s identity with the public key.

The mathematics behind the two types of cryptography is different, which impacts security. Almost all internet applications use both types of cryptography to ensure comprehensive data and system security.
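The difference between the two types can be shown with two deliberately tiny toy schemes. Both are pedagogical sketches only, far too weak for real use: the symmetric example is a simple XOR keystream, and the asymmetric one is textbook RSA with the small primes p = 61, q = 53.

```python
# Symmetric: one shared key both encrypts and decrypts (toy XOR cipher).
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key; NOT secure, demo only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret = b"shared-key"
ciphertext = xor_cipher(b"hello", secret)
assert xor_cipher(ciphertext, secret) == b"hello"  # the same key reverses it

# Asymmetric: textbook RSA with tiny primes (p=61, q=53, so n=3233).
# The pair (n, e) is public; d is the owner's private key.
n, e, d = 3233, 17, 2753
message = 65
cipher = pow(message, e, n)          # anyone can encrypt with the public key
assert pow(cipher, d, n) == message  # only the private key decrypts
```

The symmetric scheme is fast because it touches each byte once; the asymmetric one rests on modular exponentiation, whose security comes from the difficulty of factoring n, which is why real keys use primes hundreds of digits long rather than 61 and 53.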

The threat from quantum computers

While everything is fine with the existing encryption methods, scientists fear that rapid technological advancements in quantum computing could break the codes and rupture the shield of encryption that protects data securely.

It is practically impossible for conventional computers to break such codes. Breaking a cipher by brute force means trying keys until one works, and a symmetric 64-bit key already has 2^64 (roughly 18 quintillion) possibilities; a 128-bit key has 2^128, far more combinations than any conventional computer could ever try. Therefore, making keys longer creates a solid defense of more robust encryption with essentially no chance of being broken.

However, the threat looms large with quantum computers running Grover’s algorithm, which can search N possibilities in roughly √N steps.

This quadratic speedup effectively halves a symmetric key’s security: against a quantum attacker, a 128-bit key offers only the equivalent of 64-bit security. The solution lies in using longer keys, such as 256-bit keys, which retain 128-bit security even against quantum computing.
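The key-halving effect is just square-root arithmetic on powers of two, as this short sketch shows:

```python
import math

# Grover's algorithm searches N = 2**key_bits possibilities in about
# sqrt(N) steps, so the effective security of a symmetric key is halved.

def effective_bits(key_bits: int) -> int:
    """Bit-security of a key against a Grover-equipped attacker."""
    classical_tries = 2 ** key_bits              # brute force: try every key
    quantum_tries = math.isqrt(classical_tries)  # Grover: ~sqrt(N) queries
    return quantum_tries.bit_length() - 1        # log2 of the quantum cost

print(effective_bits(128))  # 64  — a 128-bit key gives 64-bit quantum security
print(effective_bits(256))  # 128 — why 256-bit symmetric keys are recommended
```

Note this applies to symmetric ciphers; public-key schemes such as RSA face the far more damaging Shor’s algorithm, which is why they need entirely new, quantum-resistant designs rather than just longer keys.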

Microsoft’s Blizzard Acquisition Comes With An AI Twist

Microsoft’s acquisition of Activision Blizzard is all set to be one of the biggest deals in the history of the gaming market. However, the reason for this acquisition might go beyond Blizzard’s war chest of intellectual property and award-winning game franchises. Reports have emerged that Blizzard is using AI in the game development process, and if there’s one thing Microsoft loves right now, it’s AI.

One of the key takeaways from the recent Microsoft Build conference was the introduction of Copilot into all Microsoft products. While Microsoft CEO Satya Nadella has expressed his desire to integrate AI systems into all products in the future, it seems that this strategy goes beyond just enterprise and customer-facing products. Microsoft’s gaming division, Xbox, seems to be next in line for an AI makeover.

Acquiring Blizzard goes beyond games

The regulatory smoke around Microsoft’s Blizzard acquisition is beginning to clear as more regulators approve the deal. Apart from getting ownership of blockbuster franchises, Microsoft will also get a foothold in the mobile gaming market through King, which is part of Activision Blizzard. However, considering Blizzard’s current creative strategy, there might be even more of an overlap than first predicted by regulators.

In a leaked email, Blizzard’s chief design officer, Allen Adham, announced the launch of a new internal tool for game design. Titled Blizzard Diffusion, this image generator is completely trained on art from Blizzard’s games. In the email, Adham stated, “Prepare to be amazed. We are on the brink of a major evolution in how we build and manage our games.”

Reportedly, the tool will be used to generate concept art for game environments, along with artwork depicting new characters and the possible outfits they could wear. In addition to this tool, Blizzard is also testing a variety of AI tools for various applications in gaming, including autonomous in-game NPCs (non-playable characters), AI assisted voice cloning, and anti-toxicity measures in online games.

While other game studios have only explored AI on a surface level, Blizzard’s attitude seems to be that AI can fundamentally change game design. In this, Microsoft and Blizzard seem to be on the same page. Microsoft’s gaming arm, Xbox, has an internal AI team to keep track of how AI can change the way they make games.

Along these lines, Microsoft Gaming CEO Phil Spencer recently appeared in an interview to speak about how AI can impact the game-making process. This seems to align with Microsoft’s larger strategy of supercharging creative tasks with the addition of AI. There is no bigger showcase of this than Microsoft’s recent build event.

Building for the future

If there was one keyword that stood out during Microsoft’s Build conference, it was Copilot: an AI assistant for cognitive tasks. From Windows, to Office, to Bing, Microsoft has added OpenAI’s GPT-4 capabilities to all of its products. It seems that this is now going to extend not only to Xbox, but to Blizzard as well.

Phil Spencer stated, “The intersection of AI and gaming has always been there…I type something in DALL-E and all of a sudden I get an image, you can think about stuff like that for 3D assets as well.”

He went on to speak about how AI can also be used to create more believable characters that players can interact with in games. This comes after a Microsoft-owned property, the Elder Scrolls V: Skyrim, got an integration with LLMs through Inworld AI and the modding (modifying) community.

It seems that Microsoft is learning from the community when it comes to how AI can be used to change gaming as we know it. This also aligns with Blizzard’s strategy of using AI in a variety of game dev applications, sweetening the pot for Microsoft.

As the Redmond giant acquires Activision Blizzard, they will not only get access to a wide bevy of profitable franchises, but also to valuable internal game dev tools. As discussed previously, these tools will also include AI superchargers, aligning the interests between Microsoft and Blizzard even further.

While the full impact AI is already having on the game development process is not public knowledge, there is also a negative side to using this technology. Video game voice actors have already come out against the usage of their likenesses in AI voice models, and generative AI also has a bevy of copyright issues to wade through before it becomes mainstream. However, it seems that the path for Microsoft is clear now, as it adds even more Copilots to its arsenal.

The post Microsoft’s Blizzard Acquisition Comes With An AI Twist appeared first on Analytics India Magazine.