

New Scientist: The baffling human genome

Medicine  abeycd  2021-08-08

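Back in the 1960s, a beautiful picture emerged. Our DNA, it turned out, consisted of recipes for proteins. The double helix could be unzipped so that RNA could copy these recipes and carry them to the protein factories inside cells. But by the 1970s it had become clear that only a small part of our DNA codes for proteins: just 1.2 per cent, we now know. So what is the rest doing? Some assumed it must perform some function, while others considered it junk. "At least 90 per cent of human DNA is useless, or junk," the geneticist Susumu Ohno wrote in 1972.
Ohno knew even then that some of the DNA that does not code for proteins still plays an important role. In transcription, for instance, the process of making RNA copies of genes, clusters of proteins bind to specific sequences near the genes. These proteins, called transcription factors, control genes by boosting or blocking transcription, and the sequences they bind to are known as regulatory DNA or switches.
So how much DNA acts as a switch, or has some other function? To provide an overall picture of what the different parts of the genome do, the ENCODE (Encyclopedia of DNA Elements) project was set up in 2003. It involves teams around the world using a variety of techniques. The results of a pilot study covering just 1 per cent of the genome were released in 2007. This week, the results for the entire genome were released, with more than 30 papers published in Nature and other journals.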

Linda Geddes, New Scientist reporter

              The ever deepening mystery of the human genome

                   05 September 2012 by Linda Geddes


  • The more we learn about the secrets of the human genome, the less we seem to know about what all that DNA actually does. But there are some clues


             "Don't junk the 'junk DNA' just yet"

       "HEART disorder: 99 per cent probability, early fatal potential. Life expectancy: 30.2 years."

        At birth, the time and cause of Vincent's death were already known. His inferior genes meant that the best job he could hope to get was as a cleaner, rather than realising his ambition of becoming an astronaut.

Thus begins the film Gattaca, set in a future when a person's potential is thought to be determined by their genes. Gattaca was released in 1997, in the middle of the Human Genome Project, and its plot reflected what many believed at the time: we'd soon be able to predict all kinds of things about people based on their genes. "There was this belief that we could answer huge amounts of things just by studying genes and gene variants," says geneticist Tim Spector of King's College London, who was involved in the project.

  Yet today, this prospect seems more distant than ever. After the genome was sequenced, another major project was launched to try to understand which bits of the genome do what. The results, released this week, reveal that our genome is far more complex and mysterious than biologists imagined just a decade ago.

Back in the 1960s, a beautifully simple picture emerged. Our DNA consisted of recipes for proteins. The double helix could be unzipped to allow RNA copies of these recipes to be made and sent to the protein-making factories in cells. But by the 1970s, it had become clear that only a tiny proportion of our DNA codes for proteins - just 1.2 per cent, we now know. What about all the rest? Some assumed it must do something, others suggested it was mostly junk. "At least 90 per cent of our genomic DNA is 'junk' or 'garbage' of various sorts," the geneticist Susumu Ohno wrote in 1972.

   Ohno knew, though, that some of the DNA that didn't code for proteins still played a vital role. For instance, the process of making RNA copies of genes - transcription - involves clusters of proteins binding to specific sequences near the genes. These proteins - called transcription factors - control the activity of genes by either boosting or blocking transcription, so the sequences to which they bind are known as regulatory DNA or switches.

So how much DNA acts as a switch, or has some other function? To provide an overall picture of which parts of the genome do what, the Encyclopedia of DNA Elements (ENCODE) project was set up in 2003. It involves many teams around the world using a variety of techniques. The results of a pilot study looking at just 1 per cent of the genome were released in 2007. This week, the results of its study of the entire genome were released, with the publication of more than 30 papers in Nature and other journals.

 Among other things, ENCODE looked for switches that control gene activity. The researchers did this by taking known transcription factors and seeing which bits of DNA these proteins bound to. So far, they have found 4 million sites, covering 8.5 per cent of the genome - far more than anyone expected.

     Even this is likely to be a gross underestimate of the true number, because ENCODE hasn't yet looked at every cell type, or every known transcription factor. "When we extrapolate up, it's more like 18 or 19 per cent," says Ewan Birney of the European Bioinformatics Institute in Cambridge, UK, who is coordinating the data analysis for ENCODE. "We see way more switches than we were expecting, and nearly every part of the genome is close to a switch."

    But - and it is a big but - these findings do not show whether these switches actually do anything useful. Many of them may have played a role in the past, for instance, but are now "disconnected".

        The other big surprise is that these regulatory regions are widely dispersed throughout the genome, with many lying in the middle of long stretches between genes that were thought to be barren wastelands. More than 95 per cent of the genome may lie within 10,000 base pairs of a switch. "It means that nearly all of the genome is in play for doing something, or if you change it maybe it would have an effect on something somewhere," Birney says.
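To make the "within 10,000 base pairs of a switch" figure concrete, here is a minimal Python sketch of how such a fraction could be computed from a list of switch positions. It is not ENCODE's method: the function name, the toy genome length and the switch positions are all hypothetical, chosen only to show the arithmetic of merging overlapping windows.

```python
def fraction_near_switch(switch_positions, genome_length, window=10_000):
    """Return the fraction of a genome lying within `window` bases of any switch.

    Positions are treated as 0-based coordinates on a single linear sequence
    (a simplifying assumption; real genomes are split into chromosomes).
    """
    # Build one window around each switch, clipped to the ends of the genome.
    intervals = sorted(
        (max(0, p - window), min(genome_length, p + window))
        for p in switch_positions
    )

    covered = 0
    current_start, current_end = None, None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # No overlap with the running window: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping windows are merged so no base is counted twice.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / genome_length


# Toy example: three hypothetical switches on a 100,000-base sequence.
toy_fraction = fraction_near_switch([20_000, 25_000, 80_000], 100_000)
print(f"{toy_fraction:.0%} of the toy genome is within 10,000 bases of a switch")
```

The merging step matters because switches cluster: two switches 5,000 bases apart share most of their surrounding window, so simply adding up window sizes would overstate the coverage.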

       The way in which these switches work is also turning out to be vastly more complicated than thought. One ENCODE study found that individual switches interact with many genes. What's more, most genes are being influenced by numerous switches at the same time. "Almost every gene we look at is physically touching other pieces of DNA, and it's never just one, it tends to be five, eight, 10 sites, and each site in turn has RNAs on it, proteins on it, histones on it," says team member Job Dekker of the University of Massachusetts Medical School in Worcester.

   This might help to explain one of biology's biggest puzzles: the mystery of the "missing heritability". We know there's a big genetic component to traits and diseases such as height and diabetes, but the genetic variants found so far typically account for only a tiny percentage of this heritability.

              The missing inheritance

    The assumption has been that genetic variants work in isolation, so their effects are additive: if you've got one variant that increases the risk of, say, heart disease 5 per cent and another that increases it 10 per cent, your overall risk is 15 per cent. But Dekker's discovery suggests that the effects of some variants can multiply: these variants may have a small effect on their own, but a much bigger effect if a person has certain other variants too.
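The difference between the two pictures comes down to simple arithmetic, sketched below in Python. This is illustrative only: the function names and the `synergy` multiplier are hypothetical and are not taken from ENCODE or from Dekker's work. Only the 5 and 10 per cent figures come from the example above.

```python
def additive_risk_increase(increases):
    """Variants acting in isolation: individual risk increases simply add up."""
    return sum(increases)


def interacting_risk_increase(increases, synergy):
    """Hypothetical interaction model.

    When more than one variant is present, their summed effect is scaled by
    `synergy` (an assumed multiplier, not a measured quantity). A single
    variant on its own keeps its small effect.
    """
    total = sum(increases)
    return total * synergy if len(increases) > 1 else total


variants = [0.05, 0.10]  # the 5 and 10 per cent increases from the text

additive = additive_risk_increase(variants)
interacting = interacting_risk_increase(variants, synergy=3.0)  # assumed value

print(f"additive model:    {additive:.0%} increase in risk")
print(f"interacting model: {interacting:.0%} increase in risk")
```

Under the additive assumption, studying each variant alone tells you everything. Under the interacting one, a variant that looks negligible in isolation can matter a great deal in the right company, which is one way heritability could go "missing" from studies that examine variants one at a time.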

   "I firmly believe that much of the missing heritability is due to complex interactions between multiple genes, multiple non-coding variants and multiple environmental factors," saysJason Moore of the Geisel School of Medicine at Dartmouth in Hanover, New Hampshire. "The reason we've missed a lot of the heritability for complex diseases could be because we've ignored the complexity of the interactions that we know exist in biology."

 So up to 20 per cent of the genome may consist of regulatory switches, working or otherwise. What about the rest? ENCODE tried to address this by mapping what proportion of the genome is involved in some kind of biochemical event, which might suggest how much of it is in daily use. The results suggest that up to 80 per cent of the genome is active, with much of it being transcribed into RNAs.

        This RNA is not carrying the codes for making proteins, so what is it for? We know that there are lots of different kinds of functional RNAs, many of which are involved in regulating gene activity, such as microRNAs. What's more, some non-coding RNA is turning out to perform other unexpected jobs.

 "They can work like taxi drivers to deliver proteins around the genome, but they can also tether one part of the genome to another and act as a bridge," says Kevin Morris of the Scripps Research Institute in La Jolla, California. Yet others act as decoys, reducing protein output by soaking up coding RNAs.

        But so far all the RNAs with known functions do not begin to add up to 80 per cent of the genome. One explanation is that most RNA transcripts are useless, being mere "noise" generated by overzealous enzymes that don't know when to stop transcribing DNA into RNA. It's like getting a cat to kill mice in your house, and not being able to stop it killing birds in the neighbourhood too.

  "Transcription of non-coding DNA does not automatically indicate function," says Ryan Gregory of the University of Guelph in Ontario, Canada, who studies genome evolution. "I don't think ENCODE will show that the majority of the 98 per cent of non-coding DNA in the human genome is functional for regulation. It would be astonishing if it took so much to regulate a mere 20,000 genes."

     Only a handful of biologists, the most vocal being John Mattick of the University of Queensland in Brisbane, Australia, think most non-coding RNAs will turn out to have an important role. "It's now up to the proponents of the noise theory to explain why there's so much of the genome that's showing functional signatures," says Mattick.

    But doing something is not the same as doing something useful, and there are good reasons to think that most of our DNA does not play a vital role. For starters, at conception we all have dozens of new mutations. Most of us have one to five mutations that adversely affect gene function in our protein-coding DNA alone, says Joseph Nadeau of the Institute for Systems Biology in Seattle, Washington.

  If most of our DNA were vital, populations would acquire harmful mutations faster than they lose them through the death of embryos and children with lots of nasty mutations. "If the fraction of the genome that's functional increases, the question is: how do we tolerate that? Why aren't we dead many times over?" asks Nadeau.

One way to get a sense of the importance of a given bit of DNA is to look at whether it can accumulate mutations without consequence or whether it remains unchanged in a population because natural selection eliminates any individuals with mutations. "Something like seven out of ten nucleotide changes in [protein] coding sequences get kicked out because they're deleterious, but nine out of ten changes in non-coding sequence don't get kicked out," says Chris Ponting of the University of Oxford. "That's telling us something about the importance of coding changes versus non-coding changes."

Birney agrees, although he says ENCODE has looked at the transcribed RNA that's specific to primates and found that some of it seems to be under selective pressure in humans.

              Another way to assess the importance of a given bit of DNA is to delete it to see what happens. This obviously can't be done in people, but in mice huge chunks of non-coding DNA that appeared to be functional have been deleted without any obvious effect. Then again, it is possible to delete many protein-coding genes from organisms such as yeast without any obvious effect, too.

One explanation is that in cosseted lab conditions, organisms can manage without DNA that is essential for survival in more challenging environments. Another is that there is a lot of redundancy in the genome. Although you would expect mutations to eliminate redundancy, there may be some circumstances in which it can be maintained.

       "[Redundancy] could be what makes the system robust," says Dekker. "If any given piece of DNA is only part of the context, deleting it may have very limited effects. I think it may explain why we can tolerate huge variation between individuals yet we're all still walking down the street."

A third reason to think most of our DNA isn't vital is that although there is enormous variation in genome size between species, there is very little correlation between the complexity of animals and their genome size. There is no obvious reason why the marbled lungfish needs around 40 times as much DNA as we do, and nearly 400 times as much as the green pufferfish, for instance.

 Because they can   

Gregory has found that some kinds of animals, such as metamorphosing amphibians whose cells need to divide rapidly at times, have smaller genomes than other animals. This suggests that organisms tend to accumulate DNA until its size becomes detrimental, rather like the way people in a big house tend to fill the attic with junk whereas those in small houses have to keep throwing things out.     

              So we are left with something of a mystery. Although several lines of evidence suggest that most of our DNA is far from essential, ENCODE's results suggest that most regions of our genome do do something. One answer could be that most of these regions do not do anything of any great consequence. "They may still have effects. They may change someone's facial anatomy," says Ponting. "They may have very small effects which evolution isn't acting upon."

Birney thinks some of these regions do matter, though. "We don't yet have a definitive answer to how much of it is important, but we've discovered a lot more things that could be important than anybody had ever suspected," he says. "People often say it's the protein coding regions, plus a bit more. It's not a bit more, it's a lot more."

When ENCODE's work started, Birney himself was also highly sceptical about the role of non-protein-coding RNAs in genome function. He even bet Mattick a case of vintage champagne that less than 20 per cent of non-coding RNAs would turn out to be useful. "I am definitely closer to losing this bet," he says.

It could be a long time before one of them gets the champagne, though. Working out which of the millions of regions ENCODE has identified as functional are actually important is an immense task that could take many decades. Ultimately, the only way to prove that a particular region is vital is to show that variants in this region have some effect on people, which is far from easy. In some cases, though, the evidence already exists: the positions of significant genetic variants identified by studies looking for associations between genetic variants and diseases often coincide with ENCODE regions. "Most of the time we have something that's either bang on top of those regions identified by [these] studies, or really close by," says Birney.

              This is already giving us clues about the causes of diseases. For example, some variants associated with Crohn's disease are in switches that ENCODE found are active in immune cells called T-helper cells.

 It seems, though, that the more we learn about the genome, the less we know. "Our genome knows how to make a human, but I think it is hubris to think that that recipe book would be simple and well laid out," says Birney. "We are one of the most complicated things that we know about, and indeed it does look very complicated."

  We're certainly a long way from the complete understanding of the genome portrayed in Gattaca. In fact, we may never achieve it. "It may well be too complex," says Moore.

              Part of this complexity comes from the many ways in which our genome interacts with the environment. "I think DNA will become more predictive, but the flip side of this research is understanding how much of complex traits are due to environmental effects, or free will - things that we can change," says Birney. "DNA is not destiny."

        To some extent, even the writers of Gattaca agreed. (Spoiler alert.) Despite Vincent's genetic inadequacies, he refused to accept his genetic fate and ultimately achieved his goal of leaving Earth.

Linda Geddes is a reporter for New Scientist     
