Genetic code: description, characteristics, research history. Biosynthesis of protein and nucleic acids

Lecture 5 Genetic code

Concept definition

The genetic code is a system for recording information about the sequence of amino acids in proteins using the sequence of nucleotides in DNA.

Since DNA is not directly involved in protein synthesis, the code is written in the language of RNA. RNA contains uracil instead of thymine.

Properties of the genetic code

1. Triplet nature

Each amino acid is encoded by a sequence of 3 nucleotides.

Definition: A triplet or codon is a sequence of three nucleotides that codes for one amino acid.

The code cannot be a singlet code, since 4 (the number of different nucleotides in DNA) is less than 20. It cannot be a doublet code, because 16 (the number of ordered pairs of 4 nucleotides, 4^2) is less than 20. It can be a triplet code, because 64 (the number of ordered triples of 4 nucleotides, 4^3) is greater than 20.
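The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the function name min_codon_length and its parameters are made up for this example.

```python
def min_codon_length(n_bases: int = 4, n_meanings: int = 20) -> int:
    """Smallest word length L such that n_bases ** L >= n_meanings."""
    length = 1
    while n_bases ** length < n_meanings:
        length += 1
    return length


# Number of distinct "words" of length 1, 2 and 3 over a 4-letter alphabet.
for length in (1, 2, 3):
    words = 4 ** length
    print(f"length {length}: {words} words, enough for 20 amino acids: {words >= 20}")

print("minimal codon length:", min_codon_length())
```

With 4 nucleotides and 20 amino acids the shortest sufficient word length is 3, which is the triplet argument in numeric form.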

2. Degeneracy.

All amino acids, with the exception of methionine and tryptophan, are encoded by more than one triplet:

2 amino acids x 1 triplet = 2.

9 amino acids x 2 triplets = 18.

1 amino acid x 3 triplets = 3.

5 amino acids x 4 triplets = 20.

3 amino acids x 6 triplets = 18.

A total of 61 triplets code for 20 amino acids.
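The tally above can be reproduced from the standard codon table. The sketch below assumes the standard code written as a 64-letter string in the base order U, C, A, G (one-letter amino acid symbols, "*" for stop); the variable names are chosen only for this illustration.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons does each amino acid have (stop codons excluded)?
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
# How many amino acids fall into each degeneracy class?
classes = Counter(codons_per_aa.values())

for n_codons in sorted(classes):
    n_aa = classes[n_codons]
    print(f"{n_aa} amino acids x {n_codons} triplets = {n_aa * n_codons}")
print("sense codons in total:", sum(codons_per_aa.values()))
```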

3. The presence of intergenic punctuation marks.

Definition:

A gene is a segment of DNA that codes for one polypeptide chain or for one molecule of tRNA, rRNA or sRNA.

Genes for tRNA, rRNA and sRNA do not code for proteins.

At the end of each gene encoding a polypeptide there is at least one of 3 triplets that act as stop codons, or stop signals. In mRNA they look like this: UAA, UAG, UGA. They terminate (end) translation.

Conventionally, the codon AUG, the first codon after the leader sequence, is also counted among the punctuation marks (see lecture 8). It performs the function of a capital letter. In this position, it codes for formylmethionine (in prokaryotes).

4. Uniqueness.

Each triplet encodes only one amino acid or is a translation terminator.

The exception is the codon AUG. In prokaryotes, in the first position (capital letter), it codes for formylmethionine, and in any other position it codes for methionine.

5. Compactness, or the absence of intragenic punctuation marks.
Within a gene, each nucleotide is part of a significant codon.

In 1961, Francis Crick, Sydney Brenner and their colleagues experimentally proved that the code is triplet and compact.

The essence of the experiment: a "+" mutation is the insertion of one nucleotide; a "-" mutation is the loss of one nucleotide. A single "+" or "-" mutation at the beginning of a gene corrupts the entire gene. A double "+" or "-" mutation also corrupts the entire gene.

A triple "+" or "-" mutation at the beginning of the gene corrupts only part of it. A quadruple "+" or "-" mutation again corrupts the entire gene.

The experiment proves that the code is triplet and there are no punctuation marks inside the gene. The experiment was carried out on two adjacent phage genes and showed, in addition, the presence of punctuation marks between genes.
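The logic of the experiment can be imitated on a toy sequence. The sketch below is not the original experimental procedure: the 30-nucleotide "gene", the inserted base and the insertion positions are invented for illustration, and the standard codon table is assumed.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

GENE = "GCACCGAACGGCACACCAGGCAACCGAGAC"  # hypothetical gene, 10 codons


def translate(rna: str) -> str:
    """Read complete triplets from the first nucleotide onward."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def with_insertions(rna: str, n: int, base: str = "C") -> str:
    """Insert n single nucleotides ("+" mutations) near the start of the gene."""
    positions = [4 + 3 * k for k in range(n)]  # indices in the original sequence
    for pos in reversed(positions):            # right to left keeps indices valid
        rna = rna[:pos] + base + rna[pos:]
    return rna


wild_type = translate(GENE)
print("wild type:", wild_type)
for n in (1, 2, 3, 4):
    mutant = translate(with_insertions(GENE, n))
    restored = mutant[-6:] == wild_type[-6:]
    print(f"{n} insertion(s): {mutant}  tail restored: {restored}")
```

Only the triple insertion brings the downstream part of the sequence back into the original reading frame, so only there does the tail of the protein match the wild type.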

6. Universality.

The genetic code is the same for all creatures living on Earth.

In 1979, Barrell discovered the "ideal" human mitochondrial code.

Definition:

A genetic code is called "ideal" if it obeys the degeneracy rule of a quasi-doublet code: if the first two nucleotides in two triplets coincide, and the third nucleotides belong to the same class (both are purines or both are pyrimidines), then these triplets encode the same amino acid.

There are two exceptions to this rule in the universal code. Both deviations from the ideal code concern fundamental points: the beginning and the end of protein synthesis:

[Table: codon meanings in the universal code and in the mitochondrial codes of vertebrates, invertebrates, yeast and plants. Only fragments of the table survive: the codons CUA and AGA and several STOP entries.]
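The quasi-doublet rule defined above can be tested mechanically against the universal code. The sketch below assumes the standard codon table ("*" marks a stop codon) and simply lists the codon pairs that break the rule; all names are illustrative.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = "AG"
PYRIMIDINES = "UC"

# For every pair of first two nucleotides, compare the two codons whose
# third nucleotides belong to the same chemical class.
violations = []
for first, second in product(BASES, repeat=2):
    for third_class in (PURINES, PYRIMIDINES):
        codon_a, codon_b = (first + second + third for third in third_class)
        if CODON_TABLE[codon_a] != CODON_TABLE[codon_b]:
            violations.append((codon_a, codon_b))

for codon_a, codon_b in violations:
    print(codon_a, CODON_TABLE[codon_a], "vs", codon_b, CODON_TABLE[codon_b])
```

Under these assumptions the two pairs found are AUA/AUG (isoleucine versus methionine, the start) and UGA/UGG (stop versus tryptophan), in line with the statement that both deviations concern the beginning and the end of protein synthesis.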

230 substitutions do not change the class of the encoded amino acid.

Non-overlapping.

In 1954, George Gamow proposed a variant of an overlapping code. According to Gamow's code, each nucleotide, starting from the third in the gene, is part of 3 codons. When the genetic code was deciphered, it turned out to be non-overlapping, i.e. each nucleotide is part of only one codon.

Advantages of an overlapping genetic code: compactness and lesser dependence of the protein structure on the insertion or deletion of a nucleotide.

Disadvantages: high dependence of the protein structure on nucleotide substitutions and restrictions on which amino acids can be neighbors.

In 1976, the DNA of the φX174 phage was sequenced. It has a single stranded circular DNA of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

It turned out that there is an overlap. Gene E lies completely within gene D. Its initiation codon appears as a result of a shift of the reading frame by one nucleotide. Gene J starts where gene D ends. The initiation codon of gene J overlaps the termination codon of gene D owing to a shift of two nucleotides. This arrangement is called a "reading frame shift" by a number of nucleotides that is not a multiple of three. To date, overlap has been shown only for a few phages.

Information capacity of DNA

There are 6 billion people on Earth. The hereditary information about them is contained in 6x10^9 spermatozoa. According to various estimates, a person has from 30 to 50 thousand genes. All humans together have ~30x10^13 genes, or 30x10^16 base pairs, which make up 10^17 codons. The average book page contains 25x10^2 characters. The DNA of 6x10^9 spermatozoa contains information equal in volume to approximately 4x10^13 book pages. These pages would occupy the volume of 6 NSU buildings. 6x10^9 spermatozoa take up half of a thimble. Their DNA takes up less than a quarter of a thimble.
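The arithmetic behind this estimate can be retraced step by step. The sketch below rests on assumptions that the text only implies: the upper estimate of 50 thousand genes per person, about 1000 base pairs per gene (this is what turns 30x10^13 genes into 30x10^16 base pairs), and one codon counted as one printed character.

```python
PEOPLE = 6 * 10**9             # one spermatozoon per person
GENES_PER_PERSON = 50_000      # upper estimate from the text
BASE_PAIRS_PER_GENE = 1_000    # assumption implied by the text's numbers
CHARS_PER_PAGE = 25 * 10**2    # average book page

total_genes = PEOPLE * GENES_PER_PERSON
total_base_pairs = total_genes * BASE_PAIRS_PER_GENE
total_codons = total_base_pairs // 3
# Assumption: one codon corresponds to one printed character.
pages = total_codons // CHARS_PER_PAGE

print(f"genes:      {total_genes:.0e}")
print(f"base pairs: {total_base_pairs:.0e}")
print(f"codons:     {total_codons:.0e}")
print(f"pages:      {pages:.0e}")
```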

The genetic code is a way of encoding the sequence of amino acids in a protein molecule using the sequence of nucleotides in a nucleic acid molecule. The properties of the genetic code follow from the features of this coding.

Each amino acid of a protein corresponds to three successive nucleotides of a nucleic acid, a triplet or codon. Each nucleotide can contain one of four nitrogenous bases. In RNA these are adenine (A), uracil (U), guanine (G) and cytosine (C). By combining the nitrogenous bases (that is, the nucleotides containing them) in different ways, one can obtain many different triplets: AAA, GAU, UCC, GCA, AUC, etc. The total number of possible combinations is 64, i.e. 4^3.

The proteins of living organisms contain about 20 amino acids. If nature had "decided" to encode each amino acid not with three but with two nucleotides, the variety of such pairs would not be sufficient, since there would be only 16 of them, i.e. 4^2.

Thus, the main property of the genetic code is its triplet nature. Each amino acid is encoded by a triplet of nucleotides.

Since there are significantly more possible triplets than amino acids used in biological molecules, the genetic code has the property of redundancy. Many amino acids are encoded not by one codon but by several. For example, the amino acid glycine is encoded by four different codons: GGU, GGC, GGA, GGG. Redundancy is also called degeneracy.

The correspondence between amino acids and codons is usually presented in the form of tables.

With respect to nucleotides, the genetic code has the property of unambiguity (or specificity): each codon corresponds to only one amino acid. For example, the GGU codon can code only for glycine and for no other amino acid.

To repeat: redundancy means that several triplets can encode the same amino acid; specificity means that each particular codon can code for only one amino acid.

There are no special punctuation marks in the genetic code (except for the stop codons that indicate the end of polypeptide synthesis). The function of punctuation marks is performed by the triplets themselves: the end of one means that the next one begins. This implies two further properties of the genetic code: continuity and non-overlapping. Continuity means that triplets are read immediately one after another. Non-overlapping means that each nucleotide can be part of only one triplet. So the first nucleotide of the next triplet always comes after the third nucleotide of the previous triplet. A codon cannot start at the second or third nucleotide of the preceding codon. In other words, the code does not overlap.

The genetic code has the property of universality. It is the same for all organisms on Earth, which indicates the unity of the origin of life. There are very rare exceptions. For example, some triplets in mitochondria and chloroplasts code for amino acids other than their usual ones. This may indicate that at the dawn of life there were slightly different variants of the genetic code.

Finally, the genetic code has noise immunity, which is a consequence of its redundancy. Point mutations, which sometimes occur in DNA, usually result in the replacement of one nitrogenous base with another. This changes the triplet. For example, it was AAA and after the mutation became AAG. However, such changes do not always change the amino acid in the synthesized polypeptide, since, owing to the redundancy of the genetic code, both triplets can correspond to the same amino acid. Given that mutations are more often harmful, noise immunity is a useful property.
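How much of this noise immunity comes from redundancy can be estimated by enumerating every single-nucleotide substitution in every sense codon and counting those that leave the amino acid unchanged. The sketch below assumes the standard codon table; the function name substitution_stats is made up for this example.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_stats(table):
    """Return (total substitutions, synonymous substitutions per codon position)."""
    total = 0
    synonymous = [0, 0, 0]
    for codon, amino_acid in table.items():
        if amino_acid == "*":          # start only from sense codons
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if table[mutant] == amino_acid:
                    synonymous[pos] += 1
    return total, synonymous


total, synonymous = substitution_stats(CODON_TABLE)
print("single-nucleotide substitutions:", total)
print("synonymous by position (1st, 2nd, 3rd):", synonymous)
print(f"share of synonymous substitutions: {sum(synonymous) / total:.1%}")
print("AAA -> AAG keeps the amino acid:", CODON_TABLE["AAA"] == CODON_TABLE["AAG"])
```

Almost all synonymous substitutions fall on the third position of the codon, which is where most of the redundancy of the code is concentrated.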

Earlier we emphasized that nucleotides have a feature that was important for the formation of life on Earth: when one polynucleotide chain is present in solution, a second (complementary) chain forms spontaneously through complementary pairing of matching nucleotides. An equal number of nucleotides in both chains and their chemical kinship are indispensable conditions for such reactions. During protein synthesis, however, when information from mRNA is realized in the structure of a protein, the principle of complementarity cannot apply. In mRNA and in the synthesized protein not only is the number of monomers different but, more importantly, there is no structural similarity between them (nucleotides on the one hand, amino acids on the other). Clearly, a new principle is needed for the exact translation of information from a polynucleotide into the structure of a polypeptide. Evolution produced such a principle, and the genetic code lies at its basis.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in a protein.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also offer other properties of the code related to the chemical features of the nucleotides included in the code or to the frequency of occurrence of individual amino acids in the proteins of the body, etc. However, these properties follow from the above, so we will consider them there.

A. Triplet nature. The genetic code, like many complexly organized systems, has a smallest structural unit and a smallest functional unit. A triplet is the smallest structural unit of the genetic code; it consists of three nucleotides. A codon is the smallest functional unit of the genetic code. As a rule, the triplets of mRNA are called codons. In the genetic code a codon performs several functions. First, its main function is to code for one amino acid. Second, a codon may not code for an amino acid, in which case it has a different function (see below). As the definitions show, a triplet characterizes the elementary structural unit of the genetic code (three nucleotides), while a codon characterizes its elementary semantic unit: three nucleotides determine the attachment of one amino acid to the polypeptide chain.

The elementary structural unit was first deduced theoretically, and its existence was then confirmed experimentally. Indeed, 20 amino acids cannot be encoded by one or two nucleotides, since there are only 4 kinds of nucleotide. Triplets of four nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids present in living organisms (see Table 1).

The combinations of nucleotides presented in Table 1 have two features. First, of the 64 triplet variants, only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids but are stop signals marking the end of translation. There are three such triplets: UAA, UAG and UGA; they are also called "meaningless" (nonsense) codons.

[Table 1. Messenger RNA codons and their corresponding amino acids.]

As a result of a mutation that replaces one nucleotide in a triplet with another, a nonsense codon can arise from a sense codon. This type of mutation is called a nonsense mutation. If such a stop signal is formed inside a gene (in its informational part), protein synthesis will be interrupted at this place every time, and only the first part of the protein (before the stop signal) will be synthesized. A person with such a pathology will lack the protein and will have the symptoms associated with this lack. For example, a mutation of this kind was found in the gene encoding the beta chain of hemoglobin. A shortened, inactive hemoglobin chain is synthesized and is rapidly destroyed. As a result, a hemoglobin molecule devoid of the beta chain is formed. Such a molecule clearly cannot fully perform its function. A serious disease develops that resembles hemolytic anemia (beta-zero thalassemia, from the Greek word "thalassa", the sea, referring to the Mediterranean, where this disease was first discovered).

The mechanism of action of stop codons is different from the mechanism of action of sense codons. This follows from the fact that for all the codons encoding amino acids, the corresponding tRNAs were found. No tRNAs were found for nonsense codons. Therefore, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but is also the initiator of translation.

B. Degeneracy or redundancy.

61 of the 64 triplets code for 20 amino acids. Such a threefold excess of the number of triplets over the number of amino acids suggests that two coding options can be used in the transfer of information. Firstly, not all 64 codons can be involved in encoding 20 amino acids, but only 20, and secondly, amino acids can be encoded by several codons. Studies have shown that nature used the latter option.

The reason for this preference is clear. If only 20 of the 64 triplet variants were involved in coding amino acids, then 44 triplets would remain non-coding, i.e. meaningless (nonsense codons). Earlier we pointed out how dangerous it is for the life of a cell when a mutation turns a coding triplet into a nonsense codon: this significantly disrupts normal protein synthesis and ultimately leads to the development of disease. There are currently three nonsense codons in our genome; now imagine what would happen if their number increased about 15 times. Clearly, in such a situation the probability of a normal codon turning into a nonsense codon would be far higher.

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA, CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are encoded by one codon each. The property of recording the same information with different symbols is called degeneracy.

The number of codons assigned to one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is present in the genome, the higher the probability of its damage by mutagenic factors. Therefore, it is clear that a mutated codon is more likely to code for the same amino acid if it is highly degenerate. From these positions, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Since most of the information in a codon is carried by the first two nucleotides, the base in the third position of the codon is of little importance. This phenomenon is called "degeneracy of the third base". This feature minimizes the effect of mutations.

For example, the main function of red blood cells is known to be the transport of oxygen from the lungs to the tissues and of carbon dioxide from the tissues to the lungs. This function is carried out by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. It consists of a protein part, globin, which is encoded by the corresponding gene. In addition to protein, hemoglobin contains heme, which contains iron. Mutations in globin genes lead to the appearance of different hemoglobin variants. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may code for a new amino acid in the hemoglobin polypeptide chain. Any nucleotide of a triplet can be replaced by a mutation: the first, the second or the third.

Several hundred mutations are known to affect the integrity of globin genes. About 400 of them involve the replacement of single nucleotides in the gene and a corresponding amino acid substitution in the polypeptide. Of these, only 100 substitutions lead to hemoglobin instability and to diseases ranging from mild to very severe. 300 (approximately 64%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One of the reasons for this is the "degeneracy of the third base" mentioned above: the replacement of the third nucleotide in a triplet encoding serine, leucine, proline, arginine or certain other amino acids produces a synonymous codon encoding the same amino acid. Phenotypically, such a mutation will not manifest itself. In contrast, any replacement of the first or second nucleotide in a triplet leads in 100% of cases to the appearance of a new hemoglobin variant. But even in this case there may be no severe phenotypic disorders. The reason is the replacement of an amino acid in hemoglobin with another that is similar to it in physicochemical properties, for example when a hydrophilic amino acid is replaced by another hydrophilic amino acid.

Hemoglobin consists of the iron-porphyrin group heme (oxygen and carbon dioxide molecules attach to it) and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located on the short arm of chromosome 16, and the β-gene on the short arm of chromosome 11. A change of the first or second nucleotide in the gene encoding the β-chain of hemoglobin almost always leads to the appearance of new amino acids in the protein, disruption of hemoglobin function and serious consequences for the patient. For example, replacing "C" in one of the CAU (histidine) triplets with "U" produces a new triplet, UAU, encoding another amino acid, tyrosine. Phenotypically this manifests itself as a serious illness. Such a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and the disease methemoglobinemia develops. The replacement, as a result of mutation, of glutamic acid by valine at position 6 of the β-chain is the cause of a severe disease, sickle cell anemia. Let us not continue the sad list.

We only note that when one of the first two nucleotides is replaced, the new amino acid may be similar in physicochemical properties to the previous one. Thus, replacing the 2nd nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with "U" produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with "A" forms the triplet AAA encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid. Therefore, the replacement of hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to sickle cell anemia, whereas the replacement of hydrophilic glutamic acid with hydrophilic lysine changes the function of hemoglobin to a lesser extent, and patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil is replaced by cytosine in the CAU triplet and the triplet CAC arises, practically no phenotypic changes will be detected in a person. This is understandable, because both triplets code for the same amino acid, histidine.
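The three codon changes discussed above can be classified with a small script. This is a rough sketch: the standard codon table is assumed, and the split of amino acids into "hydrophobic" and "hydrophilic" is a deliberately simplified assumption made only for this illustration.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Simplified assumption: these one-letter codes are treated as hydrophobic,
# every other amino acid as hydrophilic.
HYDROPHOBIC = set("AVLIMFWP")


def classify(before: str, after: str) -> str:
    """Describe the effect of replacing codon `before` with codon `after`."""
    old, new = CODON_TABLE[before], CODON_TABLE[after]
    if old == new:
        return "synonymous"
    same_class = (old in HYDROPHOBIC) == (new in HYDROPHOBIC)
    return "conservative" if same_class else "radical"


for before, after in (("GAA", "GUA"), ("GAA", "AAA"), ("CAU", "CAC")):
    print(before, "->", after, ":",
          CODON_TABLE[before], "->", CODON_TABLE[after], classify(before, after))
```

Under these assumptions GAA to GUA (glutamic acid to valine) is reported as radical, GAA to AAA (glutamic acid to lysine) as conservative, and CAU to CAC as synonymous.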

In conclusion, it is appropriate to emphasize that the degeneracy of the genetic code and the degeneracy of the third base from a general biological point of view are protective mechanisms that are incorporated in evolution in the unique structure of DNA and RNA.

C. Unambiguity.

Each triplet (except for meaningless ones) encodes only one amino acid. Thus, in the direction of codon - amino acid, the genetic code is unambiguous, in the direction of amino acid - codon - it is ambiguous (degenerate).

codon -> amino acid: unambiguous

amino acid -> codon: degenerate

And in this case, the need for unambiguity in the genetic code is obvious. In another variant, during the translation of the same codon, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. The cell's metabolism would switch to the "one gene - several polypeptides" mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

D. Polarity.

Reading information from DNA and from mRNA occurs only in one direction. Polarity is essential for determining higher-order structures (secondary, tertiary, etc.). Earlier we said that lower-order structures determine higher-order ones. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain moves away from the DNA molecule or the polypeptide chain moves away from the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on the ribosome (if a polypeptide is being translated).

Therefore, the unidirectional process of reading information (in the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but for the rigid determination of secondary, tertiary, etc. structures.

E. Non-overlapping.

The code may or may not overlap. In most organisms, the code is non-overlapping. An overlapping code has been found in some phages.

The essence of a non-overlapping code is that the nucleotide of one codon cannot be the nucleotide of another codon at the same time. If the code were overlapping, then the sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A) as in the case of a non-overlapping code, but three (if one nucleotide is common) (Fig. 33, B) or five (if two nucleotides are common) (see Fig. 33, C). In the last two cases, a mutation of any nucleotide would lead to a violation in the sequence of two, three, etc. amino acids.

However, it has been found that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument in favor of the fact that the code is non-overlapping.

Let us explain this with Figure 34. Bold lines show the triplets encoding amino acids in the case of a non-overlapping and an overlapping code. Experiments have unambiguously shown that the genetic code is non-overlapping. Without going into the details of the experiment, we note that if the third nucleotide, U (marked with an asterisk), in the nucleotide sequence (see Fig. 34) is replaced with some other nucleotide, then:

1. With a non-overlapping code, the protein controlled by this sequence would have a replacement of one (the first) amino acid (marked with asterisks).

2. With an overlapping code in option B, a replacement would occur in two (the first and second) amino acids (marked with asterisks). Under option C, the substitution would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is altered, only one amino acid in the protein is ever affected, which is typical of a non-overlapping code.

A (non-overlapping code): GCUGCUG -> GCU* GCU -> Ala - Ala

B (overlapping code, one shared nucleotide): GCUGCUG -> GCU* UGC* CUG -> Ala - Cys - Leu

C (overlapping code, two shared nucleotides): GCUGCUG -> GCU* CUG* UGC* GCU CUG -> Ala - Leu - Cys - Ala - Leu

(Asterisks mark the codons that contain the third nucleotide of the sequence.)

Fig. 34. Scheme explaining the presence of a non-overlapping code in the genome (explanation in the text).
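The three ways of reading the sequence GCUGCUG can be reproduced with a short script. The sketch below assumes the standard codon table; the helper names read_codons and codons_containing are invented for this illustration, and `step` is the distance between the starts of neighboring triplets (3 for a non-overlapping code, 2 or 1 for overlapping ones).

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SEQUENCE = "GCUGCUG"


def read_codons(rna: str, step: int) -> list:
    """Split rna into triplets whose start positions are `step` apart."""
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, step)]


def codons_containing(rna: str, index: int, step: int) -> int:
    """Count the triplets that include the nucleotide at position `index`."""
    return sum(1 for i in range(0, len(rna) - 2, step) if i <= index <= i + 2)


for label, step in (("A", 3), ("B", 2), ("C", 1)):
    codons = read_codons(SEQUENCE, step)
    peptide = "-".join(CODON_TABLE[c] for c in codons)
    affected = codons_containing(SEQUENCE, 2, step)  # index 2 = third nucleotide
    print(f"{label}: {' '.join(codons)} -> {peptide}; "
          f"codons hit by a change in nucleotide 3: {affected}")
```

A substitution of the third nucleotide touches one codon in variant A, two in variant B and three in variant C, which is the distinction the experiments rely on.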

The non-overlapping of the genetic code is associated with another property: the reading of information begins from a definite point, the initiation signal. In mRNA this initiation signal is the codon AUG, which encodes methionine.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.

F. Compactness.

There are no punctuation marks between codons. In other words, the triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of "punctuation marks" in the genetic code has been proven in experiments.

G. Universality.

The code is the same for all organisms living on Earth. Direct proof of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that the same sets of code values are used in all bacterial and eukaryotic genomes. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. They concerned the terminator codon UGA, which was read in the same way as the UGG codon encoding the amino acid tryptophan. Other, rarer deviations from universality have also been found.


RNA uses the same nucleotides, except that the nucleotide containing thymine is replaced by a similar nucleotide containing uracil, which is denoted by the letter U (У in Russian-language literature). In DNA and RNA molecules the nucleotides line up in chains, and thus sequences of genetic letters are obtained.

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

However, in the early 1960s, new data revealed the failure of the “comma-free code” hypothesis. Then experiments showed that codons, considered by Crick to be meaningless, could provoke protein synthesis in a test tube, and by 1965 the meaning of all 64 triplets had been established. It turned out that some codons are simply redundant, that is, a number of amino acids are encoded by two, four or even six triplets.

Properties

Correspondence tables of mRNA codons and amino acids

The genetic code common to most pro- and eukaryotes. The table lists all 64 codons and the corresponding amino acids. The base order is from the 5' to the 3' end of the mRNA.

Standard genetic code

1st   2nd base                                                                                                  3rd
base  U                         C                         A                         G                         base
U     UUU Phe/F Phenylalanine   UCU Ser/S Serine          UAU Tyr/Y Tyrosine        UGU Cys/C Cysteine        U
      UUC                       UCC                       UAC                       UGC                       C
      UUA Leu/L Leucine         UCA                       UAA Stop (Ochre)          UGA Stop (Opal)           A
      UUG                       UCG                       UAG Stop (Amber)          UGG Trp/W Tryptophan      G
C     CUU Leu/L Leucine         CCU Pro/P Proline         CAU His/H Histidine       CGU Arg/R Arginine        U
      CUC                       CCC                       CAC                       CGC                       C
      CUA                       CCA                       CAA Gln/Q Glutamine       CGA                       A
      CUG                       CCG                       CAG                       CGG                       G
A     AUU Ile/I Isoleucine      ACU Thr/T Threonine       AAU Asn/N Asparagine      AGU Ser/S Serine          U
      AUC                       ACC                       AAC                       AGC                       C
      AUA                       ACA                       AAA Lys/K Lysine          AGA Arg/R Arginine        A
      AUG Met/M Methionine      ACG                       AAG                       AGG                       G
G     GUU Val/V Valine          GCU Ala/A Alanine         GAU Asp/D Aspartic acid   GGU Gly/G Glycine         U
      GUC                       GCC                       GAC                       GGC                       C
      GUA                       GCA                       GAA Glu/E Glutamic acid   GGA                       A
      GUG                       GCG                       GAG                       GGG                       G
The AUG codon codes for methionine and is also the site of translation initiation: the first AUG codon in the mRNA coding region serves as the start of protein synthesis.

Reverse table (the codons for each amino acid, and the start and stop codons)

Ala/A  GCU, GCC, GCA, GCG              Leu/L  UUA, UUG, CUU, CUC, CUA, CUG
Arg/R  CGU, CGC, CGA, CGG, AGA, AGG    Lys/K  AAA, AAG
Asn/N  AAU, AAC                        Met/M  AUG
Asp/D  GAU, GAC                        Phe/F  UUU, UUC
Cys/C  UGU, UGC                        Pro/P  CCU, CCC, CCA, CCG
Gln/Q  CAA, CAG                        Ser/S  UCU, UCC, UCA, UCG, AGU, AGC
Glu/E  GAA, GAG                        Thr/T  ACU, ACC, ACA, ACG
Gly/G  GGU, GGC, GGA, GGG              Trp/W  UGG
His/H  CAU, CAC                        Tyr/Y  UAU, UAC
Ile/I  AUU, AUC, AUA                   Val/V  GUU, GUC, GUA, GUG
START  AUG                             STOP   UAG, UGA, UAA
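The tables above are enough to write a minimal translator. The sketch below is a simplification that assumes the standard code and treats the first AUG in the string as the start; real initiation also depends on the sequence context around the start codon, which is ignored here. The example mRNA is invented.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG or UGA ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: 2-nucleotide leader, then AUG GCU GAA UGG UAA, then a trailer.
print(translate("GGAUGGCUGAAUGGUAAGC"))
```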

Variations on the Standard Genetic Code

The first example of a deviation from the standard genetic code was discovered in 1979 during the study of human mitochondrial genes. Since then several such variants have been found, including a variety of alternative mitochondrial codes and, for example, the reading of the stop codon UGA as a codon for tryptophan in mycoplasmas. In bacteria and archaea, GUG and UUG are often used as start codons. In some cases, genes start coding for a protein at a start codon that differs from the one normally used by the species.

In some proteins, non-standard amino acids such as selenocysteine and pyrrolysine are inserted by a ribosome reading a stop codon, depending on sequences in the mRNA. Selenocysteine is now regarded as the 21st and pyrrolysine as the 22nd of the amino acids that make up proteins.

Despite these exceptions, the genetic code of all living organisms has common features: codons consist of three nucleotides, where the first two are defining, codons are translated by tRNA and ribosomes into a sequence of amino acids.

Deviations from the standard genetic code.

Example | Codon | Usual meaning | Read as
Some yeasts of the genus Candida | CUG | Leucine | Serine
Mitochondria, in particular of Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Threonine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Nuclear genome of the ciliate Euplotes | UGA | Stop | Cysteine or selenocysteine
Mitochondria of mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine = Start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop
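A variant code can be modelled as the standard table with a few entries overridden. The sketch below applies three of the mammalian mitochondrial rows from the table above (UGA read as tryptophan, AUA as methionine, AGA and AGG as stop); the dictionary name MAMMALIAN_MITO and the test sequence are made up for the example, and the other rows of the table are deliberately left out.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Standard code with the mammalian mitochondrial reassignments applied on top.
MAMMALIAN_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def translate(rna: str, table: dict) -> str:
    """Read triplets from the first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = table[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


sequence = "AUAUGAAGA"  # hypothetical: AUA UGA AGA
print("standard code:     ", translate(sequence, STANDARD))
print("mitochondrial code:", translate(sequence, MAMMALIAN_MITO))
```

The same nine nucleotides give a one-residue peptide under the standard code (UGA stops translation) and a two-residue peptide under the mitochondrial variant (AGA stops it instead).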

Evolution

It is believed that the triplet code was formed quite early in the course of the evolution of life. But the existence of differences in some organisms that appeared at different evolutionary stages indicates that it was not always so.

According to some models, at first the code existed in a primitive form, when a small number of codons denoted a relatively small number of amino acids. A more precise codon value and more amino acids could be introduced later. At first, only the first two of the three bases could be used for recognition [which depends on the structure of the tRNA].

- Lewin B. Genes. Moscow, 1987. P. 62.


The genetic code is the way in which information about the sequence of twenty amino acids is encoded using a sequence of four nucleotides.

Properties of the genetic code

1) Triplet nature
One amino acid is encoded by three nucleotides. In DNA they are called triplets, in mRNA codons, in tRNA anticodons. In total there are 64 triplets; 61 of them encode amino acids, and 3 are stop signals that show the ribosome where protein synthesis should be stopped.

2) Degeneracy (redundancy)
There are 61 codons that code for amino acids but only 20 amino acids, so most amino acids are coded for by more than one codon. For example, the amino acid alanine is encoded by four codons: GCU, GCC, GCA, GCG. The exceptions are tryptophan, encoded only by UGG, and methionine, encoded by the single codon AUG, which in eukaryotes is also the start codon during translation.

3) Uniqueness
Each codon codes for only one amino acid. For example, the GCC codon codes for only one amino acid, alanine.

4) Continuity
There are no separators ("punctuation marks") between individual triplets. Because of this, when one nucleotide is dropped or inserted, a “reading frame shift” occurs: starting from the mutation site, the reading of the triplet code is disturbed, and a completely different protein is synthesized.

5) Universality
The genetic code is the same for all living organisms on Earth.