What Is Genome Annotation?

Mary McMahon

Genome annotation tags sections of a genome with information about the genetic data they contain. It is a core part of genome projects, where the goal is not just to sequence the DNA of a target organism, but to understand what that DNA does and how it functions. Researchers can conduct annotation in their own labs and may share data with other scientists to pool resources and information. Publicly accessible online databases are available, and some also allow members of the general public to submit their own annotations.

The first step in genome annotation is sequencing, where researchers determine the order of nucleotides (the bases A, C, G, and T) in an organism's DNA. Sequencing a whole genome takes a long time, and it is common for scientists to start annotating before the entire genome has been sequenced. With a section of sequenced DNA in hand, a researcher can start annotating. Scientists can note where genes appear to start and stop, paying attention to distinctive strings of DNA that contain information about the function of the genome.

Computers are capable of performing some genome annotation on their own. They can seek known patterns, such as the short nucleotide sequences (start and stop codons) that appear at the beginning and end of protein-coding genes. In automated annotation, the computer can add notes to different sections of a string of DNA to provide information about it. It is also possible to compare segments from different organisms to look for differences that might provide important information about the species as a whole.
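
The pattern search described above can be illustrated with a small Python sketch. It assumes the standard start codon ATG and the stop codons TAA, TAG, and TGA, and it scans only the forward strand. The function name `find_orfs` and the `min_codons` parameter are hypothetical, chosen for this example; real gene-finding software uses far more sophisticated models.

```python
# A minimal sketch of automated pattern search: find stretches of DNA
# that begin with a start codon and end with an in-frame stop codon.
# Assumes the standard genetic code and the forward strand only.

START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for candidate coding regions.

    start is the 0-based index of the start codon, end is the index
    just past the stop codon, and frame is 0, 1, or 2.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon (three bases) at a time.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the region only if it is long enough (stop included).
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATAGGG"):
        print(f"candidate gene at {start}-{end} in frame {frame}")
```

Each tuple returned is, in effect, a simple automated annotation: a note attached to a specific section of the sequence.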

Manual genome annotation involves a person reviewing the sequence data directly. Many researchers use computers to display the information and tag it, allowing them to enter it into databases as they work. In some cases, a manual review may be necessary after automated annotation, to confirm that the computer's annotations are correct. This can be a painstaking process and errors do occur, which is one reason researchers like to pool their data. If an annotation doesn't match others on the same section of DNA, people can evaluate the information to determine what happened, and fix the error.
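
As a rough illustration of how pooled annotations can expose errors, the following Python sketch compares two sets of annotations and reports pairs that cover overlapping DNA but disagree. The `Annotation` record and the `find_conflicts` function are hypothetical names for this example and do not reflect any particular database's format.

```python
# A minimal sketch: flag annotations from two sources that overlap on
# the same stretch of DNA but disagree on coordinates or label.
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int    # 0-based start position on the sequence
    end: int      # position just past the last base
    label: str    # e.g. "gene"
    source: str   # who produced it, e.g. "automated" or "manual"


def find_conflicts(a_list, b_list):
    """Return (a, b) pairs that overlap but are not identical."""
    conflicts = []
    for a in a_list:
        for b in b_list:
            overlaps = a.start < b.end and b.start < a.end
            same = (a.start, a.end, a.label) == (b.start, b.end, b.label)
            if overlaps and not same:
                conflicts.append((a, b))
    return conflicts


if __name__ == "__main__":
    automated = [Annotation(100, 400, "gene", "automated"),
                 Annotation(900, 1200, "gene", "automated")]
    manual = [Annotation(100, 400, "gene", "manual"),
              Annotation(880, 1200, "gene", "manual")]
    for a, b in find_conflicts(automated, manual):
        print(f"review needed: {a} vs {b}")
```

A reported pair does not say which annotation is wrong. It only marks a section that a person should evaluate.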

It is not always possible to determine what a gene does during the genome annotation process. Scientists can still flag genes and separate them from other components of the genome, such as non-coding DNA like repeats. Researchers can use this information as they develop hypotheses about different segments of the genome, and they can later add to the annotations to note a gene's function once it is known.
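
One way to picture this is as a list of records in which a gene's function can be left blank and filled in later. The Python sketch below is an assumption-laden illustration: the `Feature` record, its field names, and the `genes_without_function` helper are hypothetical, and the function text is a placeholder rather than real data.

```python
# A minimal sketch: annotation records where a gene's function may be
# unknown at first and added later. All names here are illustrative.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    start: int
    end: int
    kind: str                        # e.g. "gene" or "repeat"
    function: Optional[str] = None   # None means not yet determined


def genes_without_function(features):
    """Return the genes whose function has not been annotated yet."""
    return [f for f in features if f.kind == "gene" and f.function is None]


if __name__ == "__main__":
    features = [
        Feature(10, 250, "gene"),      # flagged as a gene, function unknown
        Feature(300, 360, "repeat"),   # non-coding repeat, kept separate
    ]
    pending = genes_without_function(features)
    print(len(pending), "gene(s) still need a function annotation")

    # Later research supplies a function, and the annotation is extended.
    pending[0].function = "placeholder function description"
    print(len(genes_without_function(features)), "gene(s) remaining")
```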

Mary McMahon

Ever since she began contributing to the site several years ago, Mary has embraced the exciting challenge of being an AllTheScience researcher and writer. Mary has a liberal arts degree from Goddard College and spends her free time reading, cooking, and exploring the great outdoors.
