# Scaling of Beta-Lactamase dataset

## Overview

Once the reflections have been integrated, each measured reflection has a Miller index, an intensity and an intensity error estimate, and the unit cell properties are known. However, before the data can be reduced for structure solution, the intensity values must be corrected for experimental effects which occur prior to the reflection being measured on the detector. These primarily include sample illumination/absorption effects and radiation damage, which result in symmetry-equivalent reflections having unequal measured intensities (i.e. a systematic effect in addition to any variance due to counting statistics). The purpose of scaling is therefore to determine a scale factor to apply to each reflection, such that the scaled intensities are representative of the ‘true’ scattering intensity from the contents of the unit cell.

Scaling depends on the assigned space group symmetry, which can be assessed now that we have integrated intensities. We shall therefore first run dials.symmetry on the integrated.pickle and integrated_experiments.json files:

dials.symmetry integrated_experiments.json integrated.pickle


As can be seen from the output, the best solution is given by C 1 2/m 1, in agreement with the result from dials.refine_bravais_settings.

To run scaling, any reflection files containing integrated reflections can be passed to dials.scale. In the example below, we shall use the output files of dials.symmetry, reindexed_experiments.json and reindexed_reflections.pickle. When run, dials.scale performs scaling on the dataset, and calculates an inverse scale factor for each reflection (i.e. the corrected intensities are given by $$I^{cor}_i = I^{obs}_i / g_i$$). The updated dataset is saved to scaled.pickle, while details of the scaling model are saved in an updated experiments file scaled_experiments.json. This can then be used to produce an MTZ file for structure solution.
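
The correction formula above can be illustrated with a minimal sketch. This is not the dials.scale implementation: the function name `apply_inverse_scales` and the example numbers are made up for illustration, and dividing the error estimate by the same factor is a simplifying assumption that ignores any uncertainty in the scale factor itself.

```python
import numpy as np


def apply_inverse_scales(intensity, sigma, inverse_scale):
    """Return corrected intensities and sigmas using I_cor = I_obs / g."""
    intensity = np.asarray(intensity, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    g = np.asarray(inverse_scale, dtype=float)
    if np.any(g <= 0):
        raise ValueError("inverse scale factors must be positive")
    # Assumption: the error estimate is rescaled by the same factor.
    return intensity / g, sigma / g


# Three observations with illustrative (not real) inverse scale factors.
i_obs = [100.0, 50.0, 80.0]
sig_obs = [10.0, 5.0, 8.0]
g = [2.0, 1.0, 0.5]
i_cor, sig_cor = apply_inverse_scales(i_obs, sig_obs, g)
print(i_cor, sig_cor)
```

Because intensity and error are divided by the same factor, the I/sigma of each individual observation is unchanged in this sketch; only the relative scale between observations changes.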

## The scaling process

First, a scaling model must be created, from which we derive scale factors for each reflection. By default, three components are used to create a physical model for scaling (model=physical), in a similar manner to that used in the program aimless. This model consists of a smoothly varying scale factor as a function of rotation angle (scale_term), a smoothly varying B-factor to account for radiation damage as a function of rotation angle (decay_term) and an absorption surface correction, dependent on the direction of the incoming and scattered beam vector relative to the crystal (absorption_term).
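
To make the structure of such a model concrete, the sketch below multiplies the three terms into a single inverse scale factor. It rests on several assumptions that are not taken from the text above: the decay term is written as exp(B / (2d²)), piecewise-linear interpolation between knots stands in for whatever smoothing dials.scale actually uses, and the absorption term is passed in as a ready-made per-reflection factor. The function and argument names are hypothetical.

```python
import numpy as np


def physical_inverse_scale(phi, d, knots, scale_values, b_values, absorption=1.0):
    """Combine scale, decay and absorption terms into one inverse scale factor.

    phi           rotation angle of each reflection (degrees)
    d             resolution of each reflection (Angstrom)
    knots         rotation angles at which the model parameters are defined
    scale_values  scale parameter at each knot
    b_values      relative B-factor at each knot
    absorption    absorption factor per reflection (1.0 means no correction)
    """
    phi = np.asarray(phi, dtype=float)
    d = np.asarray(d, dtype=float)
    # Assumption: linear interpolation stands in for the real smoother.
    scale = np.interp(phi, knots, scale_values)
    b_factor = np.interp(phi, knots, b_values)
    # Assumption: decay term of the form exp(B / (2 d^2)).
    decay = np.exp(b_factor / (2.0 * d**2))
    return scale * decay * absorption


knots = [0.0, 90.0, 180.0]
g = physical_inverse_scale(
    phi=[0.0, 45.0, 90.0],
    d=[2.0, 2.0, 2.0],
    knots=knots,
    scale_values=[1.0, 2.0, 1.0],
    b_values=[0.0, 0.0, 0.0],
)
print(g)
```

With all B-factors set to zero the decay term is 1, so the result simply follows the interpolated scale term; a non-zero B-factor makes the correction resolution dependent.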

Let’s run dials.scale on the Beta-lactamase dataset, using a d_min cutoff:

dials.scale reindexed_experiments.json reindexed_reflections.pickle d_min=1.4


As can be seen from the log, a subset of reflections is selected for scale factor determination, which helps to speed up the algorithm. In a typical rotation dataset, between 10 and 40 parameters are used for each term of the model, so the problem is heavily overdetermined and a subset of reflections is sufficient to determine the model components. Outlier rejection is performed at several stages, as outliers have a disproportionately large effect during scaling and can lead to poor scaling results.
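
One common way to identify outliers is to compare each observation with the weighted mean of its other symmetry equivalents. The sketch below shows that idea only; it is not necessarily the algorithm used by dials.scale, and the function name `flag_outliers`, the threshold `zmax` and the example values are all hypothetical.

```python
import numpy as np


def flag_outliers(intensity, sigma, zmax=6.0):
    """Flag observations that disagree with their symmetry equivalents.

    Each observation is compared with the inverse-variance weighted mean
    of the *other* observations in the group (leave-one-out).
    """
    intensity = np.asarray(intensity, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if intensity.size < 2:
        # Nothing to compare against, so nothing can be flagged.
        return np.zeros(intensity.shape, dtype=bool)
    weights = 1.0 / sigma**2
    w_others = weights.sum() - weights
    mean_others = ((weights * intensity).sum() - weights * intensity) / w_others
    # Variance of the difference: own variance plus variance of the mean.
    variance = sigma**2 + 1.0 / w_others
    z = (intensity - mean_others) / np.sqrt(variance)
    return np.abs(z) > zmax


# One group of symmetry equivalents with an obvious outlier at the end.
flags = flag_outliers([100.0, 102.0, 98.0, 101.0, 200.0], [5.0] * 5)
print(flags)
```

Leaving the tested observation out of the mean matters here: a single large outlier would otherwise pull the mean towards itself and appear less deviant than it is.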

Once the model has been initialised and a reflection subset chosen, the model parameters are refined to give the best fit to the data, and are then used to calculate the scale factor for all reflections in the dataset. An error model is also optimised, to transform the intensity errors to an expected normal distribution. An error estimate for each scale factor is also determined based on the covariances of the model parameters. Finally, a table and summary of the merging statistics are presented, which give indications of the quality of the scaled dataset.
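
As an illustration of what an error model does, the sketch below applies a two-parameter adjustment of the form σ'² = a²(σ² + (bI)²), a form commonly used in scaling programs. Whether dials.scale uses exactly this form is an assumption here, and the parameter values are invented; in practice they would be optimised so that the normalised deviations of symmetry equivalents follow a unit normal distribution.

```python
import numpy as np


def adjust_sigmas(intensity, sigma, a, b):
    """Return adjusted errors: sigma' = a * sqrt(sigma^2 + (b * I)^2)."""
    intensity = np.asarray(intensity, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return a * np.sqrt(sigma**2 + (b * intensity) ** 2)


# Illustrative parameters only: a rescales all errors, while b adds a
# contribution proportional to intensity.
adjusted = adjust_sigmas([100.0, 0.0], [3.0, 3.0], a=2.0, b=0.04)
print(adjusted)
```

The `b` term limits the I/sigma that a strong reflection can reach, while `a` corrects a uniform under- or over-estimation of the errors.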

           ----------Overall merging statistics (non-anomalous)----------

Resolution: 69.19 - 1.40

Observations: 274776

Unique reflections: 41140

Redundancy: 6.7

Completeness: 94.11%

Mean intensity: 80.0

Mean I/sigma(I): 15.5

R-merge: 0.065

R-meas:  0.071

R-pim:   0.027
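
For reference, the three R-factors in this table can be computed from groups of symmetry-equivalent observations using their standard definitions. The sketch below assumes unweighted group means and Miller indices already mapped to the asymmetric unit; the function name and the toy data are hypothetical, and the exact conventions used by dials.scale may differ. The last lines check that the reported redundancy is consistent with the observation counts in the table.

```python
import math
from collections import defaultdict


def merging_r_factors(observations):
    """Return (R-merge, R-meas, R-pim) for (hkl, intensity) pairs."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    merge = meas = pim = total = 0.0
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # single observations carry no agreement information
        mean = sum(values) / n
        deviation = sum(abs(v - mean) for v in values)
        merge += deviation
        meas += math.sqrt(n / (n - 1)) * deviation
        pim += math.sqrt(1 / (n - 1)) * deviation
        total += sum(values)
    return merge / total, meas / total, pim / total


# Toy data: two unique reflections, measured 2 and 3 times.
toy = [
    ((1, 0, 0), 90.0), ((1, 0, 0), 110.0),
    ((0, 1, 0), 40.0), ((0, 1, 0), 50.0), ((0, 1, 0), 60.0),
]
r_merge, r_meas, r_pim = merging_r_factors(toy)
print(r_merge, r_meas, r_pim)

# Redundancy is observations divided by unique reflections.
redundancy = 274776 / 41140
print(round(redundancy, 1))
```

R-meas corrects R-merge for its dependence on multiplicity, and R-pim estimates the precision of the merged intensity, which is why R-pim is the smallest of the three in the table.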


## Inspecting the results

To see what the scaling is telling us about the dataset, plots of the scaling model should be viewed. These can be generated by passing the output files to the utility program dials.plot_scaling_models:

dials.plot_scaling_models scaled_experiments.json scaled.pickle
open scale_model.png absorption_surface.png


What is immediately apparent is the periodic nature of the scale term, with peaks and troughs 90° apart. This indicates that the illuminated volume was changing significantly during the experiment: a reflection would be measured as twice as intense at a rotation angle of ~120° as at ~210°. The absorption surface also shows a similar periodicity, as may be expected. What is less clear is the form of the relative B-factor, which also has a periodic nature. As a B-factor can be understood to represent radiation damage, it would not be expected to be periodic, and it is likely that this model component is accounting for variation that could be described by a scale and absorption term alone. To test this, we can repeat the scaling process but turn off the decay_term:

dials.scale reindexed_experiments.json reindexed_reflections.pickle d_min=1.4 decay_term=False

           ----------Overall merging statistics (non-anomalous)----------

Resolution: 69.19 - 1.40

Observations: 274585

Unique reflections: 41140

Redundancy: 6.7

Completeness: 94.11%

Mean intensity: 76.6

Mean I/sigma(I): 16.1

R-merge: 0.063

R-meas:  0.069

R-pim:   0.027


By inspecting the statistics in the output, we can see that removing the decay term has had the effect of causing around 200 more reflections to be marked as outliers (taking the outlier count from 0.75% to 0.82% of the data), while improving some of the R-factors and mean I/sigma(I). Therefore it is probably best to exclude the decay correction for this dataset. Other options which could be explored are the numbers of parameters used for the various components, for example by changing the scale_interval, or by adjusting the outlier rejection criterion with a different outlier_zmax.
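
The comparison can be made explicit with a few lines of arithmetic. The numbers below are copied from the two merging statistics tables above; the dictionary keys are just labels chosen for this sketch.

```python
# Values copied from the two "Overall merging statistics" tables.
with_decay = {
    "observations": 274776,
    "mean_i_over_sigma": 15.5,
    "r_merge": 0.065,
    "r_meas": 0.071,
    "r_pim": 0.027,
}
without_decay = {
    "observations": 274585,
    "mean_i_over_sigma": 16.1,
    "r_merge": 0.063,
    "r_meas": 0.069,
    "r_pim": 0.027,
}

# Change in each statistic when the decay term is switched off.
changes = {key: without_decay[key] - with_decay[key] for key in with_decay}

# Fewer observations survive, i.e. more were rejected as outliers.
extra_outliers = -changes["observations"]
print(extra_outliers)
for key in ("mean_i_over_sigma", "r_merge", "r_meas", "r_pim"):
    print(f"{key}: {changes[key]:+.3f}")
```

The drop of 191 observations matches the "around 200" additional outliers, while R-merge and R-meas fall slightly, mean I/sigma(I) rises and R-pim is unchanged at the reported precision.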

## Exporting for further processing

Once we are happy with the results from scaling, the data can be exported as an unmerged MTZ file, for further symmetry analysis with pointless or to start structure solution. To obtain an unmerged MTZ file, dials.export should be run, passing in the output from scaling, with the option intensity=scale:

dials.export scaled.pickle scaled_experiments.json intensity=scale