Selection algebra
The services on this website can run calculations on specific segments of a structure rather than only on whole models. Most of the time this is used to select a single chain. However, the selection algebra is considerably more powerful than that: it allows selection of both chain and residue ranges, "wildcards", and logical negation of segments.
Below are a test area and examples of the most common uses of the selection syntax.
Test area
Here you can test the selection algebra. The color of the field indicates whether the input string is valid.
All job submission forms use the same validation procedure. To select the whole model, leave the selection string empty.
Single chain and single residue range selections
| Goal | Selection string |
|---|---|
| Select all residues from chain A | A |
| Select residues in 100 to 200 range (inclusive) from chain B | B(100-200) |
| Select all residues from the beginning to 200 (inclusive) from chain C | C(-200) |
| Select all residues from 100 to the end (inclusive) from chain D | D(100-) |
| Select only residue 150 from chain E | E(150) |
More information
- The selection algebra always considers all residues with the given chain name, both polymer (protein / nucleic acid) and ligands.
- It is not possible to use residue insertion codes in the selection (also because they are evil).
- Chain names are case-sensitive – A maps to a different entity than a.
- Chain names may comprise multiple characters – AB is valid syntax representing a single entity.
- Chain names may also contain underscore characters (_) or whitespace.
- Residue ranges may include negative numbers – A(-20-50) is valid syntax.
- A single residue with a negative number cannot currently be selected directly, because A(-30) is read as "from the beginning to 30". The only way to do it is the A(-30--30) syntax (though there is rarely a reason or an occasion to do so).
- No whitespace is permitted within the parentheses – A(20 - 50) is not valid syntax.
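The residue range rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the validator used by the website: the function name `parse_range` and both regular expressions are hypothetical, the sketch handles only the text inside the parentheses, and rejecting a lone `-` is a choice made for this example.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns for this sketch (not taken from the website's code).
# A bare non-negative number selects a single residue: "150".
_SINGLE = re.compile(r"\d+")
# An optional start, a dash, an optional end: "100-200", "-200", "100-", "-20-50".
_RANGE = re.compile(r"(-?\d+)?-(-?\d+)?")


def parse_range(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (start, end) for one residue range; None marks an open end."""
    if _SINGLE.fullmatch(text):
        number = int(text)
        return number, number
    match = _RANGE.fullmatch(text)
    # Assumption: a lone "-" (both ends missing) is treated as invalid here.
    if match is None or (match.group(1) is None and match.group(2) is None):
        raise ValueError(f"invalid residue range: {text!r}")
    start = int(match.group(1)) if match.group(1) is not None else None
    end = int(match.group(2)) if match.group(2) is not None else None
    return start, end


if __name__ == "__main__":
    for example in ["100-200", "-200", "100-", "150", "-20-50", "-30--30"]:
        print(example, "->", parse_range(example))
```

In this sketch `-200` parses as an open start up to 200 rather than as residue -200, which mirrors why a single negative residue needs the doubled form `-30--30`. A string containing whitespace, such as `20 - 50`, matches neither pattern and raises `ValueError`.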
Multiple chain and residue range selections
| Goal | Selection string |
|---|---|
| Select multiple residues from chain A | A(-50,100-200,300-) |
| Select multiple chains (whole chains) | {-C,E-J,M,X-} |
| Select multiple residues from multiple chains | {A-D}(-50,100-200) |
| Select multiple residues from all chains in the model | (100-200) |
More information
- The comma operator creates a logical sum (union) of its operands; for example, -30,60- means -30 or 60-.
- Chain and residue ranges may overlap and don't have to be present in the model.
- A residue is included in the calculation as long as it belongs to any of the selected ranges.
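The "any of the selected ranges" rule can be illustrated with a short Python sketch. The names `in_range` and `in_any_range` and the `(start, end)` tuple representation (with `None` for an open end) are assumptions made for this example, not part of the website's implementation.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start, end), both inclusive; None = open end.
Range = Tuple[Optional[int], Optional[int]]


def in_range(number: int, rng: Range) -> bool:
    """True if the residue number lies within one inclusive range."""
    start, end = rng
    return (start is None or number >= start) and (end is None or number <= end)


def in_any_range(number: int, ranges: List[Range]) -> bool:
    """Logical sum: the residue is kept if any range contains it."""
    return any(in_range(number, rng) for rng in ranges)


# The residue part of A(-50,100-200,300-) written as tuples.
ranges: List[Range] = [(None, 50), (100, 200), (300, None)]

if __name__ == "__main__":
    for residue in (-5, 50, 75, 100, 250, 300):
        print(residue, in_any_range(residue, ranges))
```

Because the test is a plain "any", overlapping ranges and ranges that match nothing in the model need no special handling in this sketch.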
Sum and negation of selections
| Goal | Selection string |
|---|---|
| Select whole chains A and B | A+B |
| Select whole chains A-C and a segment of chain E | {A-C}+E(100-200) |
| Select whole chain B except for residues 100-200 | B+!B(100-200) |
| Select all chains in the model except for residues 100-200 from each chain | !(100-200) |
More information
- The + operator creates a logical sum of its operands (the same as the comma, but at a "higher level").
- A+B is the same as {A,B} and A(-100)+A(200-) is the same as A(-100,200-).
- The ! operator negates the selection it precedes – matching residues are excluded from the calculation.
- Negative selections always take priority – if a residue is both selected and excluded, it stays excluded regardless of where the negation appears in the string.
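The interplay of sums and negations can be sketched in Python as follows. Everything here is an assumption made for illustration: the `Term` class, the `is_selected` function and the already-parsed representation are hypothetical, and the rule that a string with only negated terms starts from the whole model is inferred from the !(100-200) example rather than from the website's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Range = Tuple[Optional[int], Optional[int]]  # inclusive; None = open end


@dataclass
class Term:
    """One operand of '+', already parsed (hypothetical structure)."""
    chains: Optional[Set[str]] = None                  # None = every chain
    ranges: List[Range] = field(default_factory=list)  # empty = whole chain
    negated: bool = False                              # True if prefixed with '!'

    def matches(self, chain: str, number: int) -> bool:
        if self.chains is not None and chain not in self.chains:
            return False
        if not self.ranges:
            return True
        return any(
            (lo is None or number >= lo) and (hi is None or number <= hi)
            for lo, hi in self.ranges
        )


def is_selected(terms: List[Term], chain: str, number: int) -> bool:
    """Negated terms win; otherwise any positive term selects the residue."""
    positives = [t for t in terms if not t.negated]
    negatives = [t for t in terms if t.negated]
    if any(t.matches(chain, number) for t in negatives):
        return False
    # Assumption: with no positive terms, the whole model is the baseline.
    return not positives or any(t.matches(chain, number) for t in positives)


# B+!B(100-200) and !(100-200), written as parsed terms.
b_without_segment = [Term({"B"}), Term({"B"}, [(100, 200)], negated=True)]
all_without_segment = [Term(ranges=[(100, 200)], negated=True)]
```

Since the negated terms are checked first, the order of the terms does not affect the result, which matches the priority rule described above.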