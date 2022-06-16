

A celebrated AI has learned a new trick: How to do chemistry

(The Conversation is an independent and nonprofit source of news, analysis and commentary from academic experts.)

Marc Zimmer, Connecticut College

(THE CONVERSATION) Artificial intelligence has changed the way science is done by allowing researchers to analyze the massive amounts of data modern scientific instruments generate. It can find a needle in a million haystacks of information and, using deep learning, it can learn from the data itself. AI is accelerating advances in gene hunting, medicine, drug design and the creation of organic compounds.

Deep learning uses algorithms, often neural networks trained on large amounts of data, to extract information from new data. It is very different from traditional computing, which follows step-by-step instructions; deep learning instead learns from the data. It is also far less transparent than traditional computer programming, leaving important questions unanswered: What has the system learned? What does it know?

As a chemistry professor I like to design tests that have at least one difficult question that stretches the students’ knowledge to establish whether they can combine different ideas and synthesize new ideas and concepts. We have devised such a question for the poster child of AI advocates, AlphaFold, which has solved the protein-folding problem.

Protein folding

Proteins are present in all living organisms. They provide the cells with structure, catalyze reactions, transport small molecules, digest food and do much more. They are made up of long chains of amino acids like beads on a string. But for a protein to do its job in the cell, it must twist and bend into a complex three-dimensional structure, a process called protein folding. Misfolded proteins can lead to disease.

In his chemistry Nobel acceptance speech in 1972, Christian Anfinsen postulated that it should be possible to calculate the three-dimensional structure of a protein from the sequence of its building blocks, the amino acids.

Just as the order and spacing of the letters in this article give it sense and message, so the order of the amino acids determines the protein’s identity and shape, which results in its function.

Because of the inherent flexibility of the amino acid building blocks, a typical protein can adopt an estimated 10 to the power of 300 different forms. This is a massive number, more than the number of atoms in the universe. Yet within a millisecond every protein in an organism will fold into its very own specific shape – the lowest-energy arrangement of all the chemical bonds that make up the protein. Change just one amino acid in the hundreds of amino acids typically found in a protein and it may misfold and no longer work.
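To get a feel for that number, here is a minimal back-of-the-envelope sketch in Python. It assumes, purely for illustration, a chain of 300 amino acids in which each one can independently take 10 orientations (which reproduces the 10 to the power of 300 estimate), a commonly cited rough figure of 10 to the power of 80 atoms in the observable universe, and a hypothetical search rate of 10 trillion shapes per second. These parameters and the function names are illustrative assumptions, not measured values.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def log10_conformations(residues: int, states_per_residue: int) -> float:
    """log10 of the number of shapes if every residue picks a state independently."""
    return residues * math.log10(states_per_residue)


def log10_years_to_search(log10_count: float, samples_per_second: float) -> float:
    """log10 of the years needed to try every shape once at a fixed rate."""
    return log10_count - math.log10(samples_per_second) - math.log10(SECONDS_PER_YEAR)


# Assumed toy parameters: 300 residues, 10 states each -> 10**300 shapes.
log10_forms = log10_conformations(residues=300, states_per_residue=10)

# Assumed rough estimate of atoms in the observable universe: 10**80.
log10_atoms = 80

# Assumed (hypothetical) search rate: 10 trillion shapes per second.
log10_years = log10_years_to_search(log10_forms, samples_per_second=1e13)

print(f"shapes: about 10^{log10_forms:.0f}")
print(f"atoms in the universe: about 10^{log10_atoms}")
print(f"years for an exhaustive search: about 10^{log10_years:.0f}")
```

Under these assumptions, trying every shape one by one would take vastly longer than the age of the universe, which suggests that a protein folding within a millisecond cannot be doing so by random trial and error.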

AlphaFold

For 50 years computer scientists have tried to solve the protein-folding problem – with little success. Then in 2016 DeepMind, an AI subsidiary of Google parent Alphabet, initiated its AlphaFold program. It used the protein databank as its training set, which contains the experimentally determined structures of over 150,000 proteins.

In less than five years AlphaFold had the protein-folding problem beat – at least the most useful part of it, namely, determining the protein structure from its amino acid sequence. AlphaFold does not explain how the proteins fold so quickly and accurately. It was a major win for AI, because it not only accrued huge scientific prestige, it also was a major scientific advance that could affect everyone’s lives.

Today, thanks to programs like AlphaFold2 and RoseTTAFold, researchers like me can determine the three-dimensional structure of proteins from the sequence of amino acids that make up the protein – at no cost – in an hour or two. Before AlphaFold2 we had to crystallize the proteins and solve the structures using X-ray crystallography, a process that took months and cost tens of thousands of dollars per structure.

We now also have access to the AlphaFold Protein Structure Database, where DeepMind has deposited the 3D structures of nearly all the proteins found in humans, mice and more than 20 other species. To date it has solved more than a million structures and plans to add another 100 million structures this year alone. Knowledge of proteins has skyrocketed. The structure of half of all known proteins is likely to be documented by the end of 2022, among them many new unique structures associated with new useful functions.

Thinking like a chemist

AlphaFold2 was not designed to predict how proteins would interact with one another, yet it has been able to model how individual proteins combine to form large complex units composed of multiple proteins. We had a challenging question for AlphaFold – had its structural training set taught it some chemistry? Could it tell whether amino acids would react with one another – a rare yet important occurrence?

I am a computational chemist interested in fluorescent proteins. These are proteins found in hundreds of marine organisms like jellyfish and coral. Their glow can be used to illuminate and study diseases.

There are 578 fluorescent proteins in the protein databank, of which 10 are “broken” and don’t fluoresce. Proteins rarely attack themselves, a process called autocatalytic posttranslational modification, and it is very difficult to predict which proteins will react with themselves and which ones won’t.

Only a chemist with a significant amount of fluorescent protein knowledge would be able to use the amino acid sequence to find the fluorescent proteins that have the right amino acid sequence to undergo the chemical transformations required to make them fluorescent. When we presented AlphaFold2 with the sequences of 44 fluorescent proteins that are not in the protein databank, it folded the fixed fluorescent proteins differently from the broken ones.

The result stunned us: AlphaFold2 had learned some chemistry. It had figured out which amino acids in fluorescent proteins do the chemistry that makes them glow. We suspect that the protein databank training set and multiple sequence alignments enable AlphaFold2 to “think” like chemists and look for the amino acids required to react with one another to make the protein fluorescent.

A folding program learning some chemistry from its training set also has wider implications. By asking the right questions, what else can be gained from other deep learning algorithms? Could facial recognition algorithms find hidden markers for diseases? Could algorithms designed to predict spending patterns among consumers also find a propensity for minor theft or deception? And most important, is this capability – and similar leaps in ability in other AI systems – desirable?

This article is republished from The Conversation under a Creative Commons license. Read the original article here: https://theconversation.com/a-celebrated-ai-has-learned-a-new-trick-how-to-do-chemistry-182031.

Licenced as Creative Commons - attribution, no derivatives.
