Why DNA Is Spelled ATGC
11 September 2002
Of the 16 nucleotide bases that could pair up to make DNA, why do only A, T, G, and C make up the genomic alphabet? Researchers have long put it down to the composition of the primordial soup in which the first life arose. But Dónall Mac Dónaill of Trinity College Dublin says the choice incorporates a tactic for minimizing errors, similar to the error-detecting codes built into credit card numbers, bank accounts, and airline tickets.
In the error-coding theory first developed in 1950 by Bell Telephone Laboratories researcher Richard Hamming, a so-called parity bit is appended to a binary number so that its digits add up to an even number. For example, when transmitting the number 100110, you would add an extra 1 onto the end (100110,1); the number 100001 would have a zero added (100001,0). The most likely transmission error, a single digit switched from 1 to 0 or vice versa, makes the sum of the digits odd, so a recipient whose digits sum to an odd number can assume that an error occurred.
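The parity check described above can be sketched in a few lines of Python. This is only an illustration of the even-parity idea, not code from Hamming's or Mac Dónaill's work; the function names `add_parity_bit` and `looks_corrupted` are made up for this example.

```python
def add_parity_bit(bits: str) -> str:
    """Append one bit so that the total number of 1s becomes even."""
    parity = bits.count("1") % 2  # 1 if the count of 1s is currently odd
    return bits + str(parity)


def looks_corrupted(word: str) -> bool:
    """An odd number of 1s means at least one digit was flipped."""
    return word.count("1") % 2 == 1


sent = add_parity_bit("100110")   # the article's first example
received = "1101101"              # same word with its second digit flipped

print(sent, looks_corrupted(sent))
print(received, looks_corrupted(received))
```

A single flipped digit always changes the digit sum from even to odd, so it is caught. Two flipped digits restore even parity and slip through, which is why a parity bit detects only the most likely kind of error.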
Mac Dónaill asserts, in a forthcoming issue of Chemical Communications, that a similar process was at work in the choice of bases in the genetic alphabet. To demonstrate this, he represented each nucleotide as a four-digit binary number. The first three digits represent the three bonding sites that each nucleotide presents to its partner. Each site is either a hydrogen donor (1) or a hydrogen acceptor (0); a nucleotide offering donor-acceptor-acceptor sites would be represented as 100 and would only bond with an acceptor-donor-donor nucleotide, or 011. The fourth digit is 1 if the nucleotide is a single-ringed pyrimidine type and 0 if it is a double-ringed purine type. Nucleotides bond most readily with members of the other type, purine with pyrimidine.
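As a rough sketch of this encoding, assuming a donor is written as 1 and an acceptor as 0 (which is what the 100 example implies), the four-digit code and its ideal partner could be computed as follows. The helper names `encode` and `ideal_partner` are hypothetical and chosen only for illustration.

```python
def encode(sites: str, ring: str) -> str:
    """Build the four-digit code for a nucleotide.

    sites: three letters, 'D' for hydrogen donor or 'A' for acceptor
    ring:  'pyrimidine' (single ring) or 'purine' (double ring)
    """
    bonding = "".join("1" if s == "D" else "0" for s in sites)
    ring_bit = "1" if ring == "pyrimidine" else "0"
    return bonding + ring_bit


def ideal_partner(code: str) -> str:
    """Flip every digit: opposite bonding sites and the other ring type."""
    return "".join("1" if digit == "0" else "0" for digit in code)


code = encode("DAA", "pyrimidine")   # donor-acceptor-acceptor, single ring
print(code, ideal_partner(code))
```

In this representation, a perfect pairing is simply a code and its digit-by-digit complement: donors face acceptors at all three sites, and a pyrimidine faces a purine.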
Mac Dónaill noticed that the final digit acts as a parity bit: The four digits of A, T, G, and C all add up to an even number. Banishing all odd-parity nucleotides from the DNA alphabet reduces errors, Mac Dónaill says. For example, nucleotide C (100,1) binds naturally to nucleotide G (011,0), but it might accidentally bind to the odd-parity nucleotide X (010,0), because only one of the three bonding sites is mismatched. Such a bond would be weak compared to C-G but not impossible. However, C is highly unlikely to bond to any other even-parity nucleotide, such as the idealized amino-adenine (101,0), because there are two mismatches.
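The mismatch counts in this example can be checked with a short Python sketch that uses only the four codes quoted in the article. The names `CODES`, `parity`, and `mismatches` are invented for this illustration, and counting a clash of ring types as a mismatch alongside the bonding sites is an assumption of the sketch rather than something the article spells out.

```python
# Four-digit codes as given in the article: three bonding sites + ring-type digit.
CODES = {
    "C": "1001",
    "G": "0110",
    "X": "0100",
    "amino-adenine": "1010",
}


def parity(code: str) -> str:
    """Return 'even' or 'odd' according to the sum of the digits."""
    return "even" if code.count("1") % 2 == 0 else "odd"


def mismatches(a: str, b: str) -> int:
    """Count positions where the two codes are NOT complementary.

    Identical digits at a position mean donor faces donor, acceptor faces
    acceptor, or (assumed here) the two nucleotides share a ring type.
    """
    return sum(x == y for x, y in zip(a, b))


for name in ("G", "X", "amino-adenine"):
    code = CODES[name]
    print(name, parity(code), mismatches(CODES["C"], code))
```

Any two distinct even-parity codes differ in at least two digits, so an even-parity nucleotide other than the correct partner always shows at least two mismatches, while an odd-parity one can come within a single mismatch.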
"It is a novel idea which should provoke others to explore aspects of informatics in the genetic code," says computational chemist Graham Richards of Oxford University. "Instinctively, one feels that the DNA code should have evolved systems to minimize errors. Mac Dónaill's work shows how this could have been achieved."