By default, MOZYME only recognizes the standard twenty amino-acid residues, but some proteins contain residues which have extra molecular fragments attached. In order to allow these unusual species to be recognized, the keyword XENO is provided. If an unknown residue is detected, then a message of the type: "Unknown residue: 19 atom 296 C:40 N: 5 O: 6 S: 2 see keyword 'XENO'" will be printed.
XENO, from the Greek xenos, for 'stranger', defines the unusual species in terms of the extra atoms which are added to the normal residue. The format of XENO is:
XENO=(nC,nN,nO,nS,name[;nC,nN,nO,nS,name][;nC,nN,nO,nS,name]...)
For example: XENO=(38,3,4,2,HEM;4,3,2,0,RES)
Up to ten fragments can be defined. Each fragment is specified by the number of extra carbon, nitrogen, oxygen, and sulfur atoms it contains, in that order, followed by a name that describes the fragment. If the name has exactly three letters, then the residue name will be replaced by that name.
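The format can be assembled mechanically. The following Python sketch is illustrative only and is not part of MOPAC: the helper name `build_xeno` and the tuple layout are assumptions made for this example. It joins the fragments with semicolons and enforces the ten-fragment limit.

```python
def build_xeno(fragments):
    """Return a XENO keyword string for a list of (nC, nN, nO, nS, name) tuples.

    Hypothetical helper for illustration; it only formats text.
    """
    if not 1 <= len(fragments) <= 10:
        raise ValueError("XENO accepts between one and ten fragments")
    parts = []
    for n_c, n_n, n_o, n_s, name in fragments:
        # Counts are written in the fixed order C, N, O, S, then the name.
        parts.append(f"{n_c},{n_n},{n_o},{n_s},{name}")
    return "XENO=(" + ";".join(parts) + ")"


print(build_xeno([(38, 3, 4, 2, "HEM"), (4, 3, 2, 0, "RES")]))
# prints XENO=(38,3,4,2,HEM;4,3,2,0,RES)
```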
As an example, consider the extra fragment in bacteriorhodopsin, on Lys216:
-COCHRNH-; R = CH2CH2CH2CH2N=CHCH=CMeCH=CHCH=CMeCH=CHC9H15
Not counting hydrogens, the empirical formula for a lysine residue is C6N2O (see the table of residues below). For residue 216, the empirical formula is C26N2O, so the extra fragment, the Schiff base, accounts for C20. Therefore the number of extra atoms, in the order C, N, O, S, is '20,0,0,0'. To specify the retinal fragment, the keyword would be XENO=(20,0,0,0,RETINAL). Because RETINAL does not have exactly three letters, residue 216 would still be identified as LYS in the output or in any PDB file generated. If XENO=(20,0,0,0,RET) were used, residue 216 would be identified as RET.
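The subtraction for residue 216 can be written out as a short Python sketch. The function name `extra_atoms` and the dictionary layout are assumptions made for this illustration; the atom counts are the ones quoted in the text.

```python
ELEMENTS = ("C", "N", "O", "S")


def extra_atoms(observed, base):
    """Return the (C, N, O, S) counts left after removing the base residue."""
    return tuple(observed.get(el, 0) - base.get(el, 0) for el in ELEMENTS)


lysine = {"C": 6, "N": 2, "O": 1, "S": 0}        # standard lysine residue
residue_216 = {"C": 26, "N": 2, "O": 1, "S": 0}  # lysine plus retinal

extra = extra_atoms(residue_216, lysine)
print("XENO=(" + ",".join(str(n) for n in extra) + ",RET)")
# prints XENO=(20,0,0,0,RET)
```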
Quite often the side-chain cannot be related to any of the 20 standard amino acids. In that case, the simplest option is to define the unmodified residue as being glycine (GLY). The XENO keyword can then be constructed by subtracting the atoms of glycine (two carbon atoms, one nitrogen atom, and one oxygen atom) from the atom counts given in the message. Thus if the message was "Unknown residue: 19 atom 296 C:40 N: 5 O: 6 S: 2 see keyword 'XENO'", the keyword would be XENO=(38,4,5,2,HEME). If the side-chain can be related to a standard amino acid, then the atom counts of that amino acid should be subtracted instead. The following table gives the atom counts for the standard amino acids.
Number of atoms in each residue for use in working out the XENO keyword
| Amino acid | Number C | Number N | Number O | Number S | Amino acid | Number C | Number N | Number O | Number S |
|---|---|---|---|---|---|---|---|---|---|
| Glycine | 2 | 1 | 1 | 0 | Glutamic acid | 5 | 1 | 3 | 0 |
| Alanine | 3 | 1 | 1 | 0 | Glutamine | 5 | 2 | 2 | 0 |
| Valine | 5 | 1 | 1 | 0 | Arginine | 6 | 4 | 1 | 0 |
| Leucine | 6 | 1 | 1 | 0 | Histidine | 6 | 3 | 1 | 0 |
| Isoleucine | 6 | 1 | 1 | 0 | Phenylalanine | 9 | 1 | 1 | 0 |
| Serine | 3 | 1 | 2 | 0 | Cysteine | 3 | 1 | 1 | 1 |
| Threonine | 4 | 1 | 2 | 0 | Tryptophan | 11 | 2 | 1 | 0 |
| Aspartic acid | 4 | 1 | 3 | 0 | Tyrosine | 9 | 1 | 2 | 0 |
| Asparagine | 4 | 2 | 2 | 0 | Methionine | 5 | 1 | 1 | 1 |
| Lysine | 6 | 2 | 1 | 0 | Proline | 5 | 1 | 1 | 0 |
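When the unknown residue is treated as a modified glycine, the keyword can be derived directly from the printed message. The Python sketch below is a hedged illustration: the function name `xeno_from_message` is hypothetical, and the regular expression assumes the message layout is exactly as quoted above, which may differ between program versions.

```python
import re

# Heavy-atom counts for glycine, taken from the table above.
GLYCINE = {"C": 2, "N": 1, "O": 1, "S": 0}


def xeno_from_message(message, name, base=GLYCINE):
    """Build a XENO keyword from an 'Unknown residue' message.

    Assumes the counts appear as 'C:40 N: 5 O: 6 S: 2' in the message.
    """
    counts = {el: int(n) for el, n in re.findall(r"\b([CNOS]):\s*(\d+)", message)}
    # Subtract the base residue element by element, in the order C, N, O, S.
    extra = [counts.get(el, 0) - base[el] for el in "CNOS"]
    return "XENO=(" + ",".join(str(n) for n in extra) + "," + name + ")"


msg = "Unknown residue: 19 atom 296 C:40 N: 5 O: 6 S: 2 see keyword 'XENO'"
print(xeno_from_message(msg, "HEME"))
# prints XENO=(38,4,5,2,HEME)
```

Passing a different `base` dictionary, filled in from the table, covers the case where the side-chain can be related to another standard amino acid.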
Only the number of extra atoms for the elements C, N, O, and S need be specified, because these are the elements used in identifying the residues. Residues with other atoms will not be recognized.
If atoms are missing from a particular residue, the "missing" atoms can be defined using negative numbers. Thus, if a residue should be lysine, but the terminal amino group is missing, then it can be identified using XENO=(0,-1,0,0,Lysine). However, if, as a result of missing atoms, the damaged residue corresponds to a known residue, then it will be incorrectly recognized as that known residue. For example, if the residue should be lysine, but the group -CH2-CH2-CH2-NH2 is missing, so that the residue is -NH-CH(CH3)-CO-, the keyword XENO=(-3,-1,0,0,Lysine) will not identify it, because the damaged residue corresponds to alanine, and would be recognized as such. This error cannot be corrected.
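A quick composition check can warn of this situation before a calculation is run. The Python sketch below is an assumption-laden illustration, not MOZYME's recognition logic: the helper name `composition_clash` is hypothetical, only three residues from the table are included, and the check compares heavy-atom counts alone, ignoring how the atoms are bonded.

```python
# (C, N, O, S) counts for a few standard residues, taken from the table above.
STANDARD = {
    "glycine": (2, 1, 1, 0),
    "alanine": (3, 1, 1, 0),
    "lysine": (6, 2, 1, 0),
}


def composition_clash(base_name, missing):
    """Return the name of a standard residue whose atom counts equal those of
    the damaged residue, or None if no residue in STANDARD matches."""
    damaged = tuple(b + m for b, m in zip(STANDARD[base_name], missing))
    for name, counts in STANDARD.items():
        if name != base_name and counts == damaged:
            return name
    return None


# Lysine missing -CH2-CH2-CH2-NH2, written as negative XENO counts.
print(composition_clash("lysine", (-3, -1, 0, 0)))
# prints alanine
```

A match is only a warning sign. Because the sketch looks at composition and not at bonding, it may also flag cases that the program itself handles correctly.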
In the output, all the atoms of the residue are labeled with the three-letter abbreviation of the amino acid. Ideally, the atoms of the extra fragment would be labeled differently, but it is not easy to recognize the fragment algorithmically. Instead, the unusual residue is indicated in the residue sequence by an asterisk (*), and a one-line description is given immediately before the sequence is printed.
If XENO is not used, the calculation will still work, but the label for the modified fragment will be UNK, instead of the more descriptive label which would result from using XENO.