================ Indexing in MELD ================ When interacting with MELD we often need to give the index of a specific *atom* or *residue*. This document describes how indexing works in MELD and explains the various methods of indexing. MELD uses zero-based indexing internally ---------------------------------------- MELD is based on the python programming language, which, like most modern programming languages, uses zero-based indexing. However, in structural biology, we often use one-based indexing. The difference is that zero-based indexing starts counting from zero, while one-based indexing starts from one. **Internally, MELD uses zero-based indexing**, but provides various methods for using one-based indexing. To help eliminate errors, all functions in meld that take an atom index require that it is of type :code:`AtomIndex`. This is effectively just an integer, but it has be labeled as an :code:`AtomIndex` to indicate that it is a zero-based absolute atom index. Similarly, functions that take a residue index require that it has type :code:`ResidueIndex`. Functions for indexing ---------------------- The two primary ways for indexing are both methods of the sytem object: - :code:`system.index.atom(resid, atom_name, expected_resname=None, chainid=None, one_based=False)` - :code:`system.index.residue(resid, expected_resname=None, chainid=None, one_based=False)` Calls to :code:`inex.atom` will return a zero-based absolute :code:`AtomIndex`. Calls to :code:`index.residue` will return a zero-based absolute :code:`ResidueIndex`. Specifying :code:`resname` to catch errors ------------------------------------------ Indexing can be tricky and errors can result in strange behavior, as e.g. restraints may be created between the wrong atoms. To help catch errors, it is possible to specify :code:`expected_resname`. When :code:`rexpcected_resname` is specified, calls to :code:`index.atom` and :code:`index.residue` will check that actual residue name that is found matches :code:`expected_resname`. Note that the residue names will be those after processing by ``tleap``, so they may not correspond exactly to those in a pdb file. Normally, the :code:`expected_resname` will be three characters in all-caps, e.g. :code:`"ALA"`. Using one-based indexing ------------------------ By default both :code:`index.atom` and :code:`index.residue` use zero-based indexing, where both :code:`chainid` and :code:`resid` start from zero. To use one-based indexing set :code:`one_based=True`, which will cause both :code:`resid` and :code:`chainid` to be interpreted as one-based. Using relative indexing ----------------------- By default, the :code:`resid` refers to *absolute* residue index, which starts from zero (one for one-based indexing) and does not consider which chain the residue resides in. The ordering of residues corresponds to the order that sub-systems were added when the system was built. If :code:`chainid` is set, then :code:`resid` refers to the relative index of a residue within the corresponding chain. So, :code:`resid=0, chainid=0` would refer to the first residue in the first chain (assuming zero-based indexing). Ordering of :code:`chainids` ---------------------------- Chains are indexed sequentially starting from zero (one for one-based indexing). The order of chains is partially determined by the order that sub-systems are added in. When created by sequence, each sub-system corresponds to exactly one chain. When created from a pdb file, each sub-system will have the same number of chains as the pdb file has unique chain indentifiers. The ordering of the chains is alphabetical with a blank chain identifier coming first, followed by "A", etc. To be more concrete, consider the following example: - A sub-system is added from sequence - A second subsystem is added from a pdb file - The pdb file contains two chain identifiers, "A" and "B". In this case, the :code:`chainid` would be defined as follows: - **0**: the chain added by sequence - **1**: chain "A" from the pdb file - **2**: chain "B" from the pdb file In some cases, MELD will add additional residues that were not present in either the sequence or pdb file. Examples include extra residues added to encode RDC alignment tensors, which are added the :code:`RdcAlignmentPatcher` and solvent and ions that are added when explicit solvent calculations are specified. These additional residues are considered to be in an additional chain that is added in the final position.