The Language of Biology

By Suzanne Ross, Writer, Microsoft Research

If you want to go to another country, it would behoove you to learn the language of the land. Luca Cardelli, an Italian researcher working in England, knows this lesson well. He wants to help scientists travel to an unknown country — the membranes and cells of our bodies — and feel right at home. To do this, he is developing a computer language to model the processes of biology.

Cardelli, a senior researcher at Microsoft Research Cambridge who specializes in designing programming languages, has been diving into cell biology to learn the best way to represent organic processes.

“I want to study languages that can precisely and concisely represent biological processes,” said Cardelli.

Once you have a way to describe an environment, be it a city in a foreign land or the way viruses interact with membranes, you can start to understand that environment. “A living cell is, to a rather surprising extent, an information processing device,” said Cardelli.

For instance, the algorithm that a virus follows to reproduce itself in a biological system can be represented as a sequence of steps. The steps involve the transport of materials and information, much as a computer network must transport bytes and translate them into something we can understand.

“We can learn a lot about computer systems by modeling biological systems and vice versa,” said Cardelli. “Formal modeling at the level that can drive a simulator is becoming of central importance in biology.”

Cardelli has modeled the invasion of a cell by a virus as a first example of how to write a bioalgorithm.

When a virus invades a cell, it follows some basic steps, with variations depending on the type of virus. A much-simplified explanation: the first step is for the virus to attach to the surface of the host target cell. It does this by binding a surface protein to a specific receptor on the host cell. Then the virus enters the cell.

Once inside its target, the virus breaks free of the membrane and replicates within the host cell. Eventually the virus has replicated itself many times over, and it ruptures the cell, sending out many more viruses. Each new virus goes on to invade more cells.
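The steps just described can be written down as a tiny state machine. The sketch below is only an illustration in Python, not Cardelli's language or his model: the stage names, the `receptor_matches` flag, and the `burst_size` threshold are all hypothetical, and replication is reduced to doubling a counter.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    FREE = auto()         # virus is outside the host cell
    ATTACHED = auto()     # surface protein bound to a host receptor
    INSIDE = auto()       # virus has entered the cell
    REPLICATING = auto()  # virus has broken free of the membrane
    RELEASED = auto()     # cell has ruptured, sending out new viruses


@dataclass
class Infection:
    receptor_matches: bool = True  # hypothetical: does the host carry the right receptor?
    burst_size: int = 4            # hypothetical: copies at which the cell ruptures
    stage: Stage = Stage.FREE
    copies: int = 1
    log: list = field(default_factory=list)

    def step(self) -> Stage:
        """Advance the infection by one step and record the resulting stage."""
        if self.stage is Stage.FREE:
            # Attachment only happens when protein and receptor match.
            if self.receptor_matches:
                self.stage = Stage.ATTACHED
        elif self.stage is Stage.ATTACHED:
            self.stage = Stage.INSIDE
        elif self.stage is Stage.INSIDE:
            self.stage = Stage.REPLICATING
        elif self.stage is Stage.REPLICATING:
            # Simplification: each step doubles the number of copies.
            self.copies *= 2
            if self.copies >= self.burst_size:
                self.stage = Stage.RELEASED
        self.log.append(self.stage)
        return self.stage


def run(infection: Infection, max_steps: int = 20) -> Infection:
    """Step the infection until the cell ruptures or max_steps is reached."""
    for _ in range(max_steps):
        if infection.step() is Stage.RELEASED:
            break
    return infection


if __name__ == "__main__":
    result = run(Infection())
    print([stage.name for stage in result.log], result.copies)
```

Calling `run(Infection(receptor_matches=False))` leaves the virus in the `FREE` stage, which mirrors the point that attachment depends on a specific receptor on the host cell.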

Cardelli’s work is a step towards being able to model some of these interactions so that they can be studied.

The world of systems biology is much bigger than that of a cell. Biologists also need to study tissues, organs, organisms, and colonies. A formal programming language can help to represent the many levels of abstraction.

A formal language will be more precise than the systems of notation biologists use today. One of the challenges is making sure that the language isn’t at too low a level of abstraction, where the model gets lost in a mess of details. Start too high, however, and too many details are ignored. Cardelli is finding that the language needs to be able to model several levels of abstraction.

“The connection to computing is that the many levels of organization in biological systems are similar to software systems, both in complexity and in algorithmic-like, information-driven behavior,” said Cardelli.
