Genes still emerging, science finds

By Carl Zimmer / New York Times News Service

Each of us carries just over 20,000 genes that encode everything from the keratin in our hair down to the muscle fibers in our toes. It’s no great mystery where our own genes came from: Our parents bequeathed them to us. And our parents, in turn, got their genes from their parents.

But where along that genealogical line did each of those 20,000 protein-coding genes get its start?

That question has hung over the science of genetics ever since its dawn a century ago. “It’s a basic question of life: how evolution generates novelty,” said Diethard Tautz of the Max Planck Institute for Evolutionary Biology in Plön, Germany.

New studies are now bringing the answer into focus. Some of our genes are immensely old, perhaps dating all the way back to the earliest chapters of life on Earth.

But a surprising number of genes emerged more recently — many in just the past few million years. The youngest evolved after our own species broke off from our cousins, the apes.

Scientists are finding that new genes come into being at an unexpectedly fast clip. And once they evolve, they can quickly take on essential functions. Investigating how new genes become so important may help scientists understand the role they may play in diseases like cancer.

“It’s premature to make any grandiose claims, but there’s a coherence that’s emerging,” said David Begun, an evolution scientist at the University of California, Davis.

Identifying gene families

Scientists first speculated about the origin of genes in the early 20th century. Some proposed that when cells duplicate their DNA, they accidentally copy some genes twice. At first the two genes are identical. But later, they evolve into different sequences.

At the end of the century, as geneticists gained the ability to read the precise sequence of DNA, they found that this hunch was correct. “It became clear that gene duplication played a role in evolution,” Tautz said. As genes duplicate over millions of years, they can grow into so-called gene families, each containing hundreds of similar genes.

The case for gene duplications became so strong that many scientists grew convinced that it was the source of all new genes. They speculated that when life originally emerged billions of years ago, the first primordial microbes had a tiny set of genes. Those genes then duplicated over and over again to give rise to all the genes on Earth today.

But when scientists gained the ability to sequence entire genomes, there was a surprise waiting for them. They started to find genes that existed in the genome of just one species. According to the duplication theory, these solitary genes shouldn’t exist; they would have to have been copied from earlier genes in other organisms.

“They looked like perfectly normal genes, except they were only found in one species,” said Anne-Ruxandra Carvunis, an evolutionary biologist at the University of California, San Diego. “There was no explanation for how a gene could be in one species and not in other ones.”

These genes came to be known as “orphan genes.” As scientists sequenced more genomes, they tried to return these orphans to their gene families. Sometimes they succeeded. But very often the orphans remained orphans.

For some scientists, like Tautz, the data pointed to an inescapable conclusion: Orphan genes had not been passed down through the generations for billions of years. They had come into existence much later.

“It’s almost like Sherlock Holmes,” said Tautz, citing the detective’s famous dictum: “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.”

‘De novo genes’

Begun and his colleagues renamed orphan genes “de novo genes,” from the Latin for new. He found that many of his fellow scientists weren’t ready to accept this idea.

“It took a while for people to believe this was occurring,” he said. “It seems kind of nutty to people when they first hear of it.”

One reason it no longer seems so improbable is that Begun and other researchers have documented the step-by-step process by which a new gene can come into existence.

In many species, ours included, protein-coding genes make up a tiny portion of the genome. New genes can emerge from the vast expanse of noncoding DNA.

The first step is for a tiny bit of DNA to mutate into what scientists call a “start sequence.” All protein-coding genes have start sequences, which enable cells to recognize where genes begin.

Once a cell recognizes the start of a gene, it can make a copy of the gene’s DNA. It can then use that copy as a guide for building a protein.

The new protein may turn out to be toxic, or it may serve no purpose. But once it emerges, new mutations to the new gene may make it more useful.

“Once they’re produced, there’s an opportunity for natural selection to sculpt them,” said Aoife McLysaght, a geneticist at Trinity College Dublin.

Begun and his colleagues are now getting a look at these early stages in the birth of de novo genes. They can do so by looking for such genes in different populations of a single fruit fly species, Drosophila melanogaster.

The scientists found 142 de novo genes that were present in some populations of flies but not in others, meaning that they must have evolved recently: They’ve had only enough time to spread across part of the species.

Begun suspects that the true number of de novo genes in the flies is higher. He and his colleagues used very strict guidelines about what stretches of DNA they put on their list, and so they may have missed some genes. “I think we have a lower bound here,” he said.

Fast-paced evolution

Begun’s research indicates that new genes can evolve at a remarkably fast rate — a finding supported by another study, published in the journal eLife.

Christian Schlötterer of the University of Veterinary Medicine in Vienna and his colleagues surveyed five closely related species of Drosophila flies whose common ancestor lived about 10 million years ago. The researchers found that as the species diverged from one another, hundreds of new genes evolved along each lineage.

Far from being a fluke, these studies suggest that de novo genes are abundant. In fact, scientists are now wondering why these fast-evolving genes aren’t swelling the genomes of animals and plants.

Schlötterer and his colleagues found the answer in their study: Along each lineage, many de novo genes are also lost. In some cases, a mutation disables a new gene, so that cells can no longer read it. In other cases, a mutation deletes the entire stretch of DNA where the new gene sits.

While many de novo genes ultimately vanish, some cling to existence and take on essential jobs. Tautz said the rise of these genes might be as important a factor in evolution as gene duplication.

Some scientists are investigating how that force has shaped our own biology, though it is harder to study de novo genes in humans because many experiments that can be done on flies cannot be done on humans.

Some clues come from diseases. Japanese researchers, for example, have found a de novo gene involved in cancer. The gene, called NCYM, is found only in humans and chimpanzees, suggesting that it arose several million years ago in our common ancestor.

Yusuke Suenaga of the Chiba Cancer Center Research Institute in Japan and his colleagues found that NCYM plays an important role in childhood brain tumors; its role in ordinary brain cells remains to be discovered.

NCYM is just one of many de novo genes we carry. McLysaght and her colleagues estimate there are 40 such genes in the human genome, although other researchers have come up with much higher estimates. But what does that mean to our species? Carvunis, the evolutionary biologist in San Diego, says the answers may still be far in the future.

“The true impact of de novo genes in what makes us humans,” she said, “remains to be fully investigated.”