
The artificial intelligence system that can identify cancer-causing mutations

BoostDM uses self-learning algorithms and is capable of searching through the mutational profiles of 28,000 genomes in 66 types of tumors

The BoostDM research team (l-r): Ferran Muiños, Francisco Martinez-Jimenez, Abel González-Pérez, Oriol Pich and Núria López-Bigas. IRB Barcelona

A team of Spanish scientists led by Núria López-Bigas has developed an artificial intelligence system that can identify the mutations that cause cancer in different kinds of tumors. Known as BoostDM, it uses self-learning algorithms to search through the mutational profiles of 28,000 genomes from 66 types of cancer. It is now available free of charge for doctors and scientists around the world to incorporate into their research. López-Bigas, who heads the Biomedical Genomics Research Group at the Institute for Research in Biomedicine in Barcelona, explains by telephone that the new system will contribute to a better understanding of the initial processes governing the formation of tumors in different tissues. “One of the objectives of BoostDM is to help doctors to make better decisions in prescribing specific therapies for each individual patient,” she says.

The research, published on July 28 in the scientific journal Nature, shows that with sufficient data it is possible to determine which of the thousands of mutations present in a tumor actually cause the disease. This can be done without the expensive and time-consuming experiments needed to study the effects of each and every mutation, as currently happens in most hospitals. “One single tumor can have 50,000 mutations, but only two or three of these are the ones that trigger the disease; identifying them is key to improving treatments,” says López-Bigas.

The problem the new system seeks to solve is how to find these two or three mutations. López-Bigas explains that there are currently two ways to identify them. One is to test every mutation in cell cultures in the laboratory to see which of them generate tumors. The other is to use artificial intelligence to analyze data from 28,000 tumors, information donated to science by 28,000 volunteer patients. “We chose the second option. Analyzing the database we have in the correct way allows us to learn which are the characteristics of the tumorous mutations in each gene,” she says.

To design the algorithm that identifies the mutations that cause the disease, the scientists based their research on a key concept in evolution: positive selection. Mutations that favor the growth and development of cancer appear across tumor samples more often than would be expected if mutations occurred at random. “The development of a tumor follows classical Darwinian biology in the evolution of species and has, above all, two basic characteristics: variation and selection,” says López-Bigas.
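The idea of positive selection described above can be illustrated with a toy calculation. The sketch below is not BoostDM's method: it assumes, purely for illustration, that under neutrality every site in a gene is equally likely to mutate, and then asks how surprising each site's observed count would be under a Poisson model. The site names, the counts and the function names are all hypothetical.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """Probability of seeing k or more events when lam are expected (Poisson)."""
    below_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below_k)


def recurrence_excess(observed: dict[str, int], n_sites: int) -> dict[str, tuple[int, float, float]]:
    """Compare each site's mutation count with a uniform neutral expectation.

    Assumption (illustrative only): all n_sites positions mutate at the same
    rate, so the expected count per site is total mutations / n_sites.
    Returns {site: (observed count, expected count, tail probability)}.
    """
    total = sum(observed.values())
    expected = total / n_sites
    return {
        site: (count, expected, poisson_tail(count, expected))
        for site, count in observed.items()
    }


# Hypothetical counts of how many tumors carry a mutation at each site.
observed = {"site_A": 12, "site_B": 1, "site_C": 2}
result = recurrence_excess(observed, n_sites=30)

for site, (count, expected, p) in result.items():
    print(f"{site}: observed={count}, expected={expected:.2f}, p={p:.3g}")
```

A site mutated far more often than the neutral expectation, such as the hypothetical site_A here, is the kind of signal that points to positive selection. Real analyses have to account for the fact that mutation rates vary across the genome, which this sketch ignores.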

Ferran Muiños, a postdoctoral researcher and co-lead author of the paper, explains it like this: “We started working on the premise that we are only able to see some mutations because the tumorous cells containing that mutation guide the development of the tumor, and we asked ourselves what distinguishes these mutations from the rest of the possible mutations. Doing this manually would be incredibly laborious, but there are computer-based strategies that allow this information to be organized systematically and efficiently.”

In short, BoostDM is designed to learn, based on the stored data, which attributes distinguish the mutations that favor the growth of cancer, insight that could inform the development of new therapeutic approaches. The mutations the algorithm looks for allow cells to replicate more quickly, to spread and even to evade elimination by treatments such as chemotherapy. “We discovered that these malignant mutations have a selective advantage because they divide more rapidly than their neighboring cells and that can lead to them generating a tumoral mass,” says López-Bigas.

To date, the system designed by López-Bigas and her team has generated 185 models, each of which identifies mutations in one gene in one type of cancer. For example, it has developed a model that has identified all of the possible tumor-initiating mutations of the EGFR (epidermal growth factor receptor) gene in some types of lung cancer, and another for the same gene in glioblastoma, an extremely aggressive type of brain cancer.

The researchers hope that as the volume of data on sequenced tumors made available in the public domain increases and is incorporated into the system, it will be possible to have models for every cancer gene within the next few years. “When a model is developed, it can be interrogated over every possible mutation of a cancer gene in a tissue type to determine if it is relevant or not for the development of the disease,” the team said in a press release.
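To make the idea of interrogating a model over every possible mutation more concrete, here is a minimal sketch. It enumerates every single-base change in a short reference sequence and passes each one to a scoring function. Everything in it is an assumption made for illustration: the sequence is invented, `toy_model` is a placeholder rather than a trained BoostDM model, and the 0.5 threshold is arbitrary.

```python
from typing import Callable, Iterator, NamedTuple

BASES = "ACGT"


class Mutation(NamedTuple):
    position: int  # 0-based index into the reference sequence
    ref: str       # base found in the reference
    alt: str       # base it is changed to


def all_single_base_changes(reference: str) -> Iterator[Mutation]:
    """Yield every possible single-base substitution of the reference."""
    for position, ref in enumerate(reference):
        for alt in BASES:
            if alt != ref:
                yield Mutation(position, ref, alt)


def interrogate(
    reference: str,
    model: Callable[[Mutation], float],
    threshold: float = 0.5,
) -> list[Mutation]:
    """Return the mutations whose model score reaches the threshold."""
    return [m for m in all_single_base_changes(reference) if model(m) >= threshold]


def toy_model(mutation: Mutation) -> float:
    """Placeholder scorer: pretends positions 3 to 5 form a mutational hotspot."""
    return 0.9 if 3 <= mutation.position <= 5 else 0.1


reference = "ATGGCCAAG"  # invented sequence, not a real gene
flagged = interrogate(reference, toy_model)

for m in flagged:
    print(f"position {m.position}: {m.ref}>{m.alt}")
```

In a real setting the scoring function would be a model trained for one gene in one tumor type, and the same loop would be repeated for each gene and tissue pair for which a model exists.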

The researchers agree that the BoostDM system will help to create valuable knowledge for the personalized treatment of cancer and support professionals in making medical decisions. The tool has been integrated into two other systems, the IntOGen Platform and the Cancer Genome Interpreter, both also developed by López-Bigas’ team and more focused on clinical decision-making by medical oncologists.

English version by Rob Train.
