dials.algorithms.symmetry

Methods for symmetry determination.

This module provides a base class for symmetry determination algorithms.

class dials.algorithms.symmetry.symmetry_base(intensities, normalisation='ml_aniso', lattice_symmetry_max_delta=2.0, d_min=<libtbx.AutoType object>, min_i_mean_over_sigma_mean=4, min_cc_half=0.6, relative_length_tolerance=None, absolute_angle_tolerance=None)[source]

Bases: object

Base class for symmetry analysis.

static kernel_normalisation(intensities)[source]

Kernel normalisation of the input intensities.

Parameters:intensities (cctbx.miller.array) – The intensities to be normalised.
Returns:The normalised intensities.
Return type:cctbx.miller.array
static ml_aniso_normalisation(intensities)[source]

Anisotropic maximum-likelihood normalisation of the input intensities.

Parameters:intensities (cctbx.miller.array) – The intensities to be normalised.
Returns:The normalised intensities.
Return type:cctbx.miller.array
static ml_iso_normalisation(intensities)[source]

Isotropic maximum-likelihood normalisation of the input intensities.

Parameters:intensities (cctbx.miller.array) – The intensities to be normalised.
Returns:The normalised intensities.
Return type:cctbx.miller.array
static quasi_normalisation(intensities)[source]

Quasi-normalisation of the input intensities.

Parameters:intensities (cctbx.miller.array) – The intensities to be normalised.
Returns:The normalised intensities.
Return type:cctbx.miller.array

Algorithms for determination of Laue group symmetry.

class dials.algorithms.symmetry.determine_space_group.CorrelationCoefficientAccumulator(x=None, y=None)[source]

Bases: object

Class for incremental computation of correlation coefficients.

Uses the single-pass formula for Pearson correlation coefficient:
https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#For_a_sample
accumulate(x, y)[source]

Accumulate the x and y values provided.

Parameters:
  • x (list) – The list of x values to accumulate.
  • y (list) – The list of y values to accumulate.
coefficient()[source]

Calculate the correlation coefficient.

Returns:The correlation coefficient.
Return type:float
denominator()[source]

“Calculate the denominator of the correlation coefficient formula.

\[\sqrt{n \sum{x^2} - \sum{x}^2} \sqrt{n \sum{y^2} - \sum{y}^2}\]
Returns:The value of the denominator.
Return type:float
n()[source]

Return the number of values contributing to the correlation coefficient.

Returns:n (int)
numerator()[source]

“Calculate the numerator of the correlation coefficient formula.

\[n \sum{x y} - \sum{x} \sum{y}\]
Returns:The value of the numerator.
Return type:float
class dials.algorithms.symmetry.determine_space_group.ScoreCorrelationCoefficient(cc, sigma_cc, expected_cc, lower_bound=-1, upper_bound=1, k=2)[source]

Bases: object

p_cc_given_not_s

Probability of observing this CC if the sym op is NOT present, p(CC; !S).

p_cc_given_s

Probability of observing this CC if the sym op is present, p(CC; S).

Modelled by a Cauchy distribution centred on cc_true and width gamma = sigma_cc

p_s_given_cc

The likelihood of this symmetry element being present.

p(S; CC) = p(CC; S) / (p(CC; S) + p(CC; !S))

class dials.algorithms.symmetry.determine_space_group.ScoreSubGroup(subgroup, sym_op_scores)[source]

Bases: object

Score the probability of a given subgroup being the true subgroup.

  1. Calculates the combined correlation coefficients for symmetry operations present/absent from the subgroup.
  2. Calculates overall Zcc scores for symmetry elements present/absent from the subgroup.
  3. Calculates the overall likelihood for this subgroup.

See appendix A2 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.

as_dict()[source]

Return a dictionary representation of the subgroup scoring.

The dictionary will contain the following keys:
  • patterson_group: The current subgroup
  • likelihood: The likelihood of the subgroup being correct
  • confidence: The confidence of the subgroup being correct
  • z_cc_for: The combined Z-scores for all symmetry elements present in the subgroup
  • z_cc_against: The combined Z-scores for all symmetry elements present in the lattice group but not in the subgroup
  • z_cc_net: The net Z-score, i.e. z_cc_for - z_cc_against
  • cc_for: The overall correlation coefficient for all symmetry elements present in the subgroup
  • cc_against: The overall correlation coefficient for all symmetry elements present in the lattice group but not in the subgroup
  • max_angular_difference: The maximum angular difference between the symmetrised unit cell and the P1 unit cell.
  • cb_op: The change of basis operation from the input unit cell to the ‘best’ unit cell.
Returns:
Return type:dict
class dials.algorithms.symmetry.determine_space_group.ScoreSymmetryElement(intensities, sym_op, cc_true, cc_sig_fac)[source]

Bases: object

Analyse intensities for presence of a given symmetry operation.

  1. Calculate the correlation coefficient, CC, for the given sym op.
  2. Calculate the probability of observing this CC if the sym op is present, p(CC; S), modelled by a Cauchy distribution centred on cc_true and width gamma = sigma_cc.
  3. Calculate the probability of observing this CC if the sym op is NOT present, p(CC; !S).
  4. Calculate the likelihood of symmetry element being present, p(S; CC) = p(CC; S) / (p(CC; S) + p(CC; !S))

See appendix A1 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.

as_dict()[source]

Return a dictionary representation of the symmetry element scoring.

The dictionary will contain the following keys:
  • likelihood: The likelihood of the symmetry element being present
  • z_cc: The Z-score for the correlation coefficent
  • cc: The correlation coefficient for the symmetry element
  • n_ref: The number of reflections contributing to the correlation coefficient
  • operator: The xyz representation of the symmetry element
Returns:
Return type:dict
class dials.algorithms.symmetry.determine_space_group.determine_space_group(intensities, normalisation='ml_aniso', lattice_symmetry_max_delta=2.0, d_min=<libtbx.AutoType object>, min_i_mean_over_sigma_mean=4, min_cc_half=0.6, relative_length_tolerance=None, absolute_angle_tolerance=None)[source]

Bases: dials.algorithms.symmetry.symmetry_base

Determination of Laue group symmetry using algorithms similar to POINTLESS.

as_dict()[source]

Return a dictionary representation of the results.

Returns:dict
as_json(filename=None, indent=2)[source]

Return a json representation of the results.

Parameters:
  • filename (str) – Optional filename to export the json representation of the results.
  • indent (int) – The indent level for pretty-printing of the json. If None is the most compact representation.
Returns:

Return type:

str

dials.algorithms.symmetry.determine_space_group.trunccauchy_pdf(x, a, b, loc=0, scale=1)[source]

Calculate a truncated Cauchy probability density function.

Parameters:
  • x (float) – The point at which to calculate the PDF.
  • a (float) – The lower bound of the truncated distribution.
  • b (float) – The upper bound of the truncated distribution.
  • loc (float) – The location parameter for the Cauchy distribution.
  • scale (float) – The scale parameter for the Cauchy distribution.
Returns:

The value of the probability density function.

Return type:

float

Functions to help with reindexing against a reference dataset.

dials.algorithms.symmetry.reindex_to_reference.determine_reindex_operator_against_reference(test_miller_set, reference_miller_set)[source]

Reindex a miller set to match a reference miller set.

This function takes two miller arrays, a reference and a test array. The space group is checked to see if any reindexing may be required to give consistent indexing between both datasets. If possible twin operators exist, the different indexing options are tested against the reference set, using the correlation between datasets as the test.

Parameters:
Returns:

The change of basis operator which should be applied to the test dataset to give consistent indexing with the reference.

Return type:

cctbx.sgtbx.change_of_basis_op

dials.algorithms.symmetry.cosym

Methods for symmetry determination from partial datasets.

This module implements the methods of Gildea, R. J. & Winter, G. (2018). Acta Cryst. D74, 405-410 for determination of Patterson group symmetry from sparse multi-crystal data sets in the presence of an indexing ambiguity.

class dials.algorithms.symmetry.cosym.CosymAnalysis(intensities, params)[source]

Bases: dials.algorithms.symmetry.symmetry_base, dials.util.observer.Subject

Peform cosym analysis.

Peform cosym analysis on the input intensities using the methods of Gildea, R. J. & Winter, G. (2018). Acta Cryst. D74, 405-410 for determination of Patterson group symmetry from sparse multi-crystal data sets in the presence of an indexing ambiguity.

as_dict()[source]

Return a dictionary representation of the results.

Returns:dict
as_json(filename=None, indent=2)[source]

Return a json representation of the results.

Parameters:
  • filename (str) – Optional filename to export the json representation of the results.
  • indent (int) – The indent level for pretty-printing of the json. If None is the most compact representation.
Returns:

Return type:

str

run()[source]
class dials.algorithms.symmetry.cosym.ScoreSubGroup(subgroup, sym_op_scores)[source]

Bases: object

Score the probability of a given subgroup being the true subgroup.

  1. Calculates overall Zcc scores for symmetry elements present/absent from the subgroup.
  2. Calculates the overall likelihood for this subgroup.

See appendix A2 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.

as_dict()[source]

Return a dictionary representation of the subgroup scoring.

The dictionary will contain the following keys:
  • patterson_group: The current subgroup
  • likelihood: The likelihood of the subgroup being correct
  • confidence: The confidence of the subgroup being correct
  • z_cc_for: The combined Z-scores for all symmetry elements present in the subgroup
  • z_cc_against: The combined Z-scores for all symmetry elements present in the lattice group but not in the subgroup
  • z_cc_net: The net Z-score, i.e. z_cc_for - z_cc_against
  • max_angular_difference: The maximum angular difference between the symmetrised unit cell and the P1 unit cell.
  • cb_op: The change of basis operation from the input unit cell to the ‘best’ unit cell.
Returns:
Return type:dict
stars
class dials.algorithms.symmetry.cosym.ScoreSymmetryElement(cc, sigma_cc, cc_true)[source]

Bases: object

Analyse intensities for presence of a given symmetry operation.

  1. Calculate the probability of observing this CC if the sym op is present, p(CC; S), modelled by a Cauchy distribution centred on cc_true and width gamma = sigma_cc.
  2. Calculate the probability of observing this CC if the sym op is NOT present, p(CC; !S).
  3. Calculate the likelihood of symmetry element being present, p(S; CC) = p(CC; S) / (p(CC; S) + p(CC; !S))

See appendix A1 of Evans, P. R. (2011). Acta Cryst. D67, 282-292.

as_dict()[source]

Return a dictionary representation of the symmetry element scoring.

The dictionary will contain the following keys:
  • likelihood: The likelihood of the symmetry element being present
  • z_cc: The Z-score for the correlation coefficent
  • cc: The correlation coefficient for the symmetry element
  • operator: The xyz representation of the symmetry element
Returns:
Return type:dict
stars
class dials.algorithms.symmetry.cosym.SymmetryAnalysis(coords, sym_ops, subgroups, cb_op_inp_min)[source]

Bases: object

as_dict()[source]

Return a dictionary representation of the results.

Returns:dict
subgroups_table()[source]
sym_ops_table()[source]

Target function for cosym analysis.

class dials.algorithms.symmetry.cosym.target.Target(intensities, lattice_ids, weights=None, min_pairs=None, lattice_group=None, dimensions=None, nproc=1)[source]

Bases: object

Target function for cosym analysis.

dim

The number of dimensions used in the analysis.

Type:int
compute_functional(x)[source]

Compute the target function at coordinates x.

Parameters:x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
Returns:The value of the target function at coordinates x.
Return type:f (float)
compute_functional_and_gradients(x)[source]

Compute the target function and gradients at coordinates x.

Parameters:x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
Returns:f: The value of the target function at coordinates x. grad: The gradients of the target function with respect to the parameters.
Return type:Tuple[float, scitbx.array_family.flex.double]
compute_gradients_fd(x, eps=1e-06)[source]

Compute the gradients at coordinates x using finite differences.

Parameters:
  • x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
  • eps (float) – The value of epsilon to use in finite difference calculations.
Returns:

The gradients of the target function with respect to the parameters.

Return type:

grad (scitbx.array_family.flex.double)

curvatures(x)[source]

Compute the curvature of the target function.

Parameters:x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
Returns:The curvature of the target function with respect to the parameters.
Return type:curvs (scitbx.array_family.flex.double)
curvatures_fd(x, eps=1e-06)[source]

Compute the curvatures at coordinates x using finite differences.

Parameters:
  • x (scitbx.array_family.flex.double) – a flattened list of the N-dimensional vectors, i.e. coordinates in the first dimension are stored first, followed by the coordinates in the second dimension, etc.
  • eps (float) – The value of epsilon to use in finite difference calculations.
Returns:

The curvature of the target function with respect to the parameters.

Return type:

curvs (scitbx.array_family.flex.double)

get_sym_ops()[source]

Get the list of symmetry operations used in the analysis.

Returns:The list of symmetry operations.
Return type:List[cctbx.sgtbx.rt_mx]
set_dimensions(dimensions)[source]

Set the number of dimensions for analysis.

Parameters:dimensions (int) – The number of dimensions to be used.

LBFGS refinement engine for cosym analysis.

class dials.algorithms.symmetry.cosym.engine.lbfgs_with_curvs(target, coords, use_curvatures=True, termination_params=None)[source]

Bases: object

Minimise a target function using the LBFGS minimiser.

Implementation of an LBFGS minimiser using curvature information, according to the interface defined by scitbx.lbfgs.

callback_after_step(minimizer)[source]

Log progress after each successful step of the minimisation.

compute_functional_and_gradients()[source]

Compute the functional and gradients.

Returns:A tuple of the functional and gradients.
Return type:tuple
compute_functional_gradients_and_curvatures()[source]

Compute the functional, gradients and curvatures.

Returns:A tuple of the functional, gradients and curvatures.
Return type:tuple
compute_functional_gradients_diag()[source]

Compute the functional, gradients and diagonal.

Returns:A tuple of the functional, gradients and diagonal, where the diagonal is the reciprocal of the curvatures.
Return type:tuple
curvatures()[source]

Return the curvatures.

Seed clustering method for cosym analysis.

class dials.algorithms.symmetry.cosym.seed_clustering.seed_clustering(coordinates, n_datasets, n_sym_ops, min_silhouette_score, n_clusters=<libtbx.AutoType object>)[source]

Bases: object

Perform seed clustering of coordinates.

Labels points into clusters such that cluster contains exactly one copy of each dataset, then performs silhouettete analysis on the resulting clusters to determine the true number of clusters present, under the constraint that only equal-sized clusterings are valid, i.e. each dataset should appear an equal number of times in each cluster.

cluster_labels

A label for each coordinate.

Type:scitbx.array_family.flex.int