The week in AI: The pause request heard around the world

by Ana Lopez

Keeping up in an industry that evolves as fast as AI is quite a task. So until an AI can do it for you, here’s a handy recap of the past week’s stories in the world of machine learning, along with some notable research and experiments that we didn’t cover on their own.

In one of the more surprising stories of the past week, Italy’s data protection authority (DPA) blocked OpenAI’s viral AI-powered chatbot, ChatGPT, citing concerns that the tool violates the European Union’s General Data Protection Regulation. The authority is reportedly opening an investigation into whether OpenAI unlawfully processed people’s data, and into the lack of a system to prevent minors from accessing the technology.

It’s unclear what the outcome might be; OpenAI has 20 days to respond to the order. But the DPA’s move could have significant implications for companies deploying machine learning models, not just in Italy, but across the European Union.

As Natasha notes in her piece on the news, many of OpenAI’s models have been trained on data scraped from the Internet, including social networks like Twitter and Reddit. Assuming the same goes for ChatGPT, and given that the company doesn’t appear to have informed the people whose data it reused to train the AI, it could very well be in breach of the GDPR.

GDPR is just one of many potential legal hurdles facing AI, especially generative AI (e.g., text- and art-generating AI like ChatGPT). With each passing day it becomes clearer that it will take some time for the dust to settle. But that doesn’t deter VCs, who keep pouring capital into the technology like there’s no tomorrow.

Will those turn out to be wise investments? It’s hard to say at the moment. But rest assured, we will report back on whatever happens.

Here are the other AI headlines from the past few days:

  • Ads coming to Bing Chat: Microsoft said last week it is “exploring” placing ads in the responses of Bing Chat, the search tool powered by OpenAI’s GPT-4 language model. As Devin points out, while the sponsored responses are clearly labeled as such, it’s a new and potentially more subversive form of advertising that may not be so easily delineated — or ignored. In addition, it could further erode confidence in language models, which already make enough factual errors to cast doubt on the veracity of their answers.
  • A request for a break: A letter with more than 1,100 signatories, including Elon Musk, published Tuesday, called on “all AI labs to immediately suspend training of AI systems more powerful than GPT-4 for at least six months.” But the circumstances surrounding it turned out to be murkier than expected. In the following days, some signatories reversed their positions, and a report revealed that other notable signatures, such as that of Chinese President Xi Jinping, turned out to be fakes.
  • And a response to the pause request: Prominent AI ethicists point out that worrying about distant, hypothetical issues is dangerous and self-defeating if we don’t address the problems AI contributes to today.
  • Twitter reveals its algorithm: As repeatedly promised by Twitter CEO Elon Musk, Twitter opened part of its source code to public scrutiny, including the algorithm used to recommend tweets in users’ timelines. Interestingly, Twitter appears to rank tweets in part using a neural network that is continuously trained on tweet interactions to optimize for positive engagement, such as likes and replies. But there’s a lot of nuance to it, as the researchers digging into the codebase note.
  • Summarizing meetings with AI: Following companies like Otter and Zoom, meeting intelligence tool Read introduced a new feature that condenses an hour-long meeting into a two-minute clip accompanied by key highlights. The company says it uses large language models — it didn’t specify which ones — combined with video analytics to pick out the most salient parts of the meeting, a useful feature.
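As a rough illustration of the engagement-weighted ranking described in the Twitter item above, here is a toy sketch in Python. It stands in for Twitter’s actual neural network with a simple linear scorer; the feature names and weights are invented for illustration and are not taken from the open-sourced codebase.

```python
# Toy sketch of engagement-weighted timeline ranking.
# NOTE: the features and weights below are illustrative assumptions,
# not Twitter's real model, which is a continuously trained neural network.

def engagement_score(tweet, weights=None):
    """Score a tweet by a weighted sum of engagement signals."""
    weights = weights or {"likes": 1.0, "replies": 13.5, "retweets": 1.0}
    return sum(weights[k] * tweet.get(k, 0) for k in weights)

def rank_timeline(tweets):
    """Return tweets sorted by descending predicted engagement."""
    return sorted(tweets, key=engagement_score, reverse=True)

timeline = [
    {"id": 1, "likes": 120, "replies": 2, "retweets": 10},
    {"id": 2, "likes": 5, "replies": 40, "retweets": 1},
    {"id": 3, "likes": 300, "replies": 0, "retweets": 80},
]
print([t["id"] for t in rank_timeline(timeline)])  # → [2, 3, 1]
```

In a real system the scorer would be a learned model predicting the probability of each interaction, but the ranking step — sort candidates by predicted positive engagement — is the same shape.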


More machine learning

BioNeMo, from AI enabler Nvidia, is an example of the company’s strategy: the advantage is not so much that the technology is new, but that it is increasingly accessible to companies. The new version of this biotech platform adds a polished web UI and improved fine-tuning for a number of models.

“A growing portion of pipelines has to deal with heaps of data, amounts we’ve never seen before, hundreds of millions of sequences that we need to feed into these models,” said Amgen’s Peter Grandsard, who leads a research division that uses AI technology. “We try to achieve operational efficiency in research as well as in production. With the acceleration that technology like Nvidia offers, what you could have done for one project last year, you can now do five or ten with the same investment in technology.”

This book excerpt from Meredith Broussard at Wired is worth reading. She was curious about an AI model that had been used in her cancer diagnosis (she’s fine) and found the process of trying to take ownership of and understand that data incredibly clumsy and frustrating. Medical AI processes clearly need to take the patient more into account.

Nefarious AI applications create new risks as well, for example attempts to influence discourse. We’ve seen what GPT-4 is capable of, but it was an open question whether such a model could produce effective persuasive text in a political context. This Stanford study suggests it can: when people were exposed to essays arguing a case on issues like gun control and carbon taxes, “AI-generated messages on all subjects were at least as compelling as human-generated messages.” These messages were also perceived as more logical and factual. Will AI-generated text change anyone’s mind? Hard to say, but it seems very likely that people will increasingly use it for these kinds of agendas.

Examples of text used to see if AI can be persuasive.

Machine learning has been used by another group at Stanford to better simulate the brain — as in the tissue of the organ itself. Not only is the brain complex and heterogeneous, but it is “very similar to Jell-O, which makes both testing and modeling of physical effects on the brain very challenging,” explains Professor Ellen Kuhl in a press release. Their new model picks and chooses among thousands of brain modeling methods, mixing and matching to identify the best way to interpret or project the given data. It doesn’t reinvent brain damage modeling, but it should make any study of it faster and more effective.

Out in the natural world, a new Fraunhofer approach to seismic imaging applies ML to an existing data pipeline that processes terabytes of hydrophone and airgun output. Normally this data would need to be simplified or abstracted, losing some of its precision in the process, but the new ML-powered process allows for analysis of the unabridged dataset.

Image Credits: Fraunhofer

Interestingly, the researchers note that while this would normally be a boon to oil and gas companies seeking deposits, with the shift away from fossil fuels it could be deployed for more climate-friendly purposes, such as identifying potential sites for CO2 storage or locating potentially harmful gas structures.

Monitoring forests is another important task for climate and conservation research, and measuring tree size is part of it. But this job normally involves manually checking trees one by one. A team in Cambridge has built an ML model that uses a smartphone lidar sensor to estimate trunk diameter, after training it on some manual measurements. Just point the phone at the trees around you and boom. The system is more than four times faster than manual measurement, and more accurate than expected, said lead study author Amelia Holcomb: “I was surprised that the app works as well as it does. Sometimes I like to challenge it with a particularly crowded piece of forest, or a particularly odd-shaped tree, and I think it will be impossible to get it right, but it does.”

Because it’s fast and doesn’t require special training, the team hopes it can be widely released as a way to collect data for tree surveys, or to make existing efforts faster and easier. For now, it’s Android only.

Finally, enjoy this interesting experiment by Eigil zu Tage-Ravn to see what a generative art model makes of the famous painting in the Spouter-Inn, described in Chapter 3 of Moby-Dick.

Image Credits: The Public Domain Review
