Multi-crystal symmetry analysis and scaling with DIALS
Introduction
Recent additions to DIALS and xia2 have enabled multi-crystal analysis to be
performed following integration. These tools are particularly relevant
for the analysis of many partial datasets, which may be the only practical way of
collecting data from certain crystals. After integration, the
space group symmetry can be investigated by testing for the presence of symmetry
operations relating the integrated intensities of groups of reflections. The
program that performs this analysis is dials.symmetry (with algorithms
similar to those of the program Pointless).
For certain space groups (polar space groups), there is also an inherent
ambiguity in the way that the diffraction pattern can be indexed. In order to
combine multiple datasets from these space groups, one must reindex all data to
a consistent setting, which can be done with the program dials.cosym
(see Gildea and Winter for details).
Finally, the data must be scaled, to correct for experimental effects such as
differences in crystal size/illuminated volume and radiation damage - this can
be done with the program dials.scale (with algorithms similar to those
of the program Aimless). After the data have been scaled, a resolution limit
can be chosen to exclude regions of the data that may be negatively affected
by radiation damage.
In this tutorial, we shall investigate a multi-crystal dataset collected on
the VMXi beamline, Diamond’s automated facility for data collection from
crystallisation experiments in-situ. The dataset consists of four repeats of
a 60-degree rotation measurement on a crystal of Proteinase K, taken at different
locations on the crystal. We shall start with the integrated reflections and
experiments files generated by running the automated processing software
xia2 with pipeline=dials.
Have a look at the Processing in Detail tutorial if you
want to know more about the different processing steps up to this point.
Note
To obtain the data for this tutorial you can run
dials.data get vmxi_proteinase_k_sweeps. If you are at Diamond
Light Source on BAG training then the data are already available.
After typing module load bagtraining you’ll be moved to a working
folder, with the data already located in the tutorial-data/ccp4/integrated_files
subdirectory. The processing in this tutorial will produce quite a few files,
so it’s recommended to make and move to a new directory:
mkdir multi_crystal
cd multi_crystal
xia2.multiplex
The easiest way to run these tools for a multi-dataset analysis is through the
program xia2.multiplex.
This runs several DIALS programs, including the programs described above, while
producing useful plots and output files.
To run xia2.multiplex, we must provide the paths to the input integrated files
from dials.integrate:
xia2.multiplex experiments_0.expt experiments_1.expt experiments_2.expt experiments_3.expt reflections_0.refl reflections_1.refl reflections_2.refl reflections_3.refl
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
input {
experiments = experiments_0.expt
experiments = experiments_1.expt
experiments = experiments_2.expt
experiments = experiments_3.expt
reflections = reflections_0.refl
reflections = reflections_1.refl
reflections = reflections_2.refl
reflections = reflections_3.refl
}
Selecting 4 experiments with profile-fitted reflections
Selecting 4 experiments with refined reflections
Unit cell: (68.3603, 68.3603, 103.953, 90, 90, 90)
0 singletons:
Point group a b c alpha beta gamma
1 cluster:
Cluster_id N_xtals Med_a Med_b Med_c Med_alpha Med_beta Med_gamma Delta(deg)
4 in P422.
cluster_1 4 68.36 (0.01 ) 68.36 (0.01 ) 103.95(0.02 ) 90.00 (0.00) 90.00 (0.00) 90.00 (0.00)
P 4/m m m (No. 123) 68.36 68.36 103.95 90.00 90.00 90.00 0.0
Standard deviations are in brackets.
Each cluster:
Input lattice count, with integration Bravais setting space group.
Cluster median with Niggli cell parameters (std dev in brackets).
Highest possible metric symmetry and unit cell using LePage (J Appl Cryst 1982, 15:255) method, maximum delta 3deg.
Using all data sets for subsequent analysis
Laue group determined by dials.cosym: P 4 2 2
Resolution limit: 1.78 (cc_half > 0.3)
Space group determined by dials.symmetry: P 41 21 2
Overall merging statistics:
+--------------------+--------------+------------------+-------------------+
| | Overall | Low resolution | High resolution |
|--------------------+--------------+------------------+-------------------|
| Resolution (Å) | 68.37 - 1.78 | 68.42 - 4.83 | 1.81 - 1.78 |
| Observations | 216866 | 21170 | 137 |
| Unique reflections | 20799 | 1380 | 126 |
| Multiplicity | 10.4 | 15.3 | 1.1 |
| Completeness | 85.67% | 100.00% | 10.61% |
| Mean I/σ(I) | 30.0 | 56.6 | 2.4 |
| Rmerge | 0.057 | 0.048 | 0.155 |
| Rmeas | 0.060 | 0.050 | 0.212 |
| Rpim | 0.016 | 0.012 | 0.144 |
| CC½ | 0.999 | 0.998 | 0.958 |
+--------------------+--------------+------------------+-------------------+
Resolution shells:
+------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------+
| Resolution (Å) | N(obs) | N(unique) | Multiplicity | Completeness | Mean I | Mean I/σ(I) | Rmerge | Rmeas | Rpim | Ranom | CC½ | CCano |
|------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------|
| 68.42 - 4.83 | 21170 | 1380 | 15.34 | 100 | 284.2 | 56.6 | 0.048 | 0.05 | 0.012 | 0.026 | 0.998* | -0.102 |
| 4.83 - 3.84 | 20595 | 1269 | 16.23 | 100 | 417 | 60.2 | 0.045 | 0.047 | 0.011 | 0.024 | 0.999* | -0.032 |
| 3.84 - 3.35 | 20441 | 1258 | 16.25 | 100 | 315.8 | 55.9 | 0.049 | 0.05 | 0.012 | 0.026 | 0.999* | -0.162 |
| 3.35 - 3.05 | 20531 | 1226 | 16.75 | 100 | 217.4 | 51.1 | 0.054 | 0.056 | 0.013 | 0.028 | 0.999* | -0.076 |
| 3.05 - 2.83 | 20409 | 1221 | 16.71 | 100 | 149.7 | 44.6 | 0.06 | 0.062 | 0.015 | 0.032 | 0.999* | -0.081 |
| 2.83 - 2.66 | 20944 | 1230 | 17.03 | 100 | 124.2 | 42.2 | 0.067 | 0.069 | 0.016 | 0.036 | 0.998* | -0.183 |
| 2.66 - 2.53 | 20254 | 1200 | 16.88 | 100 | 98.3 | 36 | 0.075 | 0.077 | 0.019 | 0.041 | 0.998* | 0.054 |
| 2.53 - 2.42 | 16541 | 1212 | 13.65 | 100 | 87.6 | 30.7 | 0.081 | 0.084 | 0.023 | 0.048 | 0.997* | -0.048 |
| 2.42 - 2.32 | 12327 | 1208 | 10.2 | 100 | 78.4 | 26.2 | 0.084 | 0.088 | 0.027 | 0.06 | 0.996* | -0.001 |
| 2.32 - 2.24 | 10345 | 1209 | 8.56 | 100 | 75.1 | 22.9 | 0.086 | 0.091 | 0.031 | 0.065 | 0.993* | -0.079 |
| 2.24 - 2.17 | 8653 | 1193 | 7.25 | 99.92 | 66.2 | 19.5 | 0.09 | 0.097 | 0.036 | 0.07 | 0.992* | -0.135 |
| 2.17 - 2.11 | 6964 | 1189 | 5.86 | 99.41 | 56.4 | 15.5 | 0.102 | 0.111 | 0.045 | 0.093 | 0.990* | -0.128 |
| 2.11 - 2.06 | 5481 | 1148 | 4.77 | 96.15 | 51.1 | 13 | 0.102 | 0.114 | 0.05 | 0.107 | 0.990* | -0.003 |
| 2.06 - 2.01 | 4333 | 1108 | 3.91 | 92.72 | 44 | 10.7 | 0.114 | 0.131 | 0.062 | 0.13 | 0.983* | -0.026 |
| 2.01 - 1.96 | 3142 | 1028 | 3.06 | 86.46 | 37.7 | 8.4 | 0.126 | 0.151 | 0.08 | 0.171 | 0.973* | -0.034 |
| 1.96 - 1.92 | 2119 | 928 | 2.28 | 77.46 | 34.4 | 6.9 | 0.126 | 0.156 | 0.09 | 0.175 | 0.962* | -0.412 |
| 1.92 - 1.88 | 1174 | 719 | 1.63 | 60.27 | 28.9 | 4.9 | 0.143 | 0.189 | 0.122 | 0.253 | 0.946* | 0.95 |
| 1.88 - 1.84 | 819 | 566 | 1.45 | 49 | 25.4 | 4.2 | 0.162 | 0.221 | 0.148 | 0.368 | 0.936* | 0 |
| 1.84 - 1.81 | 487 | 381 | 1.28 | 31.54 | 23.1 | 3.6 | 0.197 | 0.27 | 0.184 | 0.328 | 0.902* | 0 |
| 1.81 - 1.78 | 137 | 126 | 1.09 | 10.61 | 15.7 | 2.4 | 0.155 | 0.212 | 0.144 | 9.324 | 0.958* | 0 |
+------------------+----------+-------------+----------------+----------------+----------+---------------+----------+---------+--------+---------+--------+---------+
Intensity correlation clustering summary:
========= ============== ========== ======== ============== ==============
Cluster No. datasets Datasets Height Multiplicity Completeness
========= ============== ========== ======== ============== ==============
1 2 0 3 0.0022 3.3 0.68
2 3 0 2 3 0.0037 5.9 0.76
3 4 0 1 2 3 0.0044 8.1 0.83
========= ============== ========== ======== ============== ==============
Cos(angle) clustering summary:
========= ============== ========== ======== ============== ==============
Cluster No. datasets Datasets Height Multiplicity Completeness
========= ============== ========== ======== ============== ==============
1 2 1 2 0.00088 5.6 0.79
2 2 0 3 0.014 3.3 0.68
3 4 0 1 2 3 0.048 8.1 0.83
========= ============== ========== ======== ============== ==============
xia2.multiplex runs dials.cosym to analyse the Laue symmetry and reindex all
datasets consistently, scales the data with dials.scale, calculates a
resolution limit with dials.estimate_resolution and reruns dials.scale
with the determined resolution cutoff. The final dataset is exported to an
unmerged MTZ file and an HTML report is generated. The easiest way to see the
results is to open the HTML report in your browser of choice, e.g.:
firefox xia2.multiplex.html
The report provides a summary of the merging statistics as well as several plots; please explore these for a few minutes now. This dataset results in good merging statistics. However, if you navigate to the “Analysis by batch” tab in “All data”, you will see that the fourth dataset has poorer statistics than the others. Let’s repeat the processing manually to explore the different steps and address this issue.
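For reference, the R-factors in the merging statistics tables compare repeated measurements of each unique reflection with their mean. The sketch below applies the standard textbook definitions of Rmerge, Rmeas and Rpim to made-up intensities. The function name merging_r_factors and the toy data are illustrative assumptions, and this is not the code that DIALS uses to produce its tables.

```python
import math
from collections import defaultdict


def merging_r_factors(observations):
    """Return (Rmerge, Rmeas, Rpim) for an iterable of (hkl, intensity).

    Unique reflections measured only once say nothing about internal
    consistency, so they are skipped in both numerator and denominator.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    sum_merge = sum_meas = sum_pim = sum_i = 0.0
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue
        mean = sum(intensities) / n
        abs_dev = sum(abs(i - mean) for i in intensities)
        sum_merge += abs_dev
        # Rmeas corrects Rmerge for multiplicity; Rpim estimates the
        # precision of the merged (averaged) intensity.
        sum_meas += math.sqrt(n / (n - 1)) * abs_dev
        sum_pim += math.sqrt(1 / (n - 1)) * abs_dev
        sum_i += sum(intensities)
    return sum_merge / sum_i, sum_meas / sum_i, sum_pim / sum_i


# Toy data (hypothetical): two unique reflections, invented intensities.
toy = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((1, 0, 0), 90.0), ((1, 0, 0), 100.0),
    ((2, 1, 0), 50.0), ((2, 1, 0), 54.0),
]
r_merge, r_meas, r_pim = merging_r_factors(toy)
print(f"Rmerge={r_merge:.3f} Rmeas={r_meas:.3f} Rpim={r_pim:.3f}")
```

With these definitions Rpim falls as multiplicity rises while Rmeas does not, which is consistent with the low-resolution shells in the table above having the highest multiplicity and the smallest Rpim.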
Manual reprocessing
The first step is Laue/Patterson group analysis using dials.cosym:
dials.cosym experiments_0.expt experiments_1.expt experiments_2.expt experiments_3.expt reflections_0.refl reflections_1.refl reflections_2.refl reflections_3.refl
Scoring all possible sub-groups
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
| Patterson group | | Likelihood | NetZcc | Zcc+ | Zcc- | delta | Reindex operator |
|-------------------+-----+--------------+----------+--------+--------+---------+--------------------|
| P 4/m m m | *** | 1 | 9.82 | 9.82 | 0 | 0 | -a,b,-c |
| C m m m | | 0 | 0.03 | 9.84 | 9.81 | 0 | a+b,-a+b,c |
| P m m m | | 0 | 0.01 | 9.83 | 9.82 | 0 | -a,b,-c |
| P 4/m | | 0 | -0.01 | 9.82 | 9.82 | 0 | -a,b,-c |
| P 1 2/m 1 | | 0 | 0.03 | 9.85 | 9.82 | 0 | -a,-c,-b |
| C 1 2/m 1 | | 0 | 0.03 | 9.85 | 9.82 | 0 | a-b,a+b,c |
| P 1 2/m 1 | | 0 | 0.01 | 9.83 | 9.82 | 0 | -b,-a,-c |
| C 1 2/m 1 | | 0 | -0.01 | 9.82 | 9.82 | 0 | a+b,-a+b,c |
| P 1 2/m 1 | | 0 | -0.02 | 9.81 | 9.82 | 0 | -a,b,-c |
| P -1 | | 0 | -9.82 | 0 | 9.82 | 0 | -a,b,-c |
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
Best solution: P 4/m m m
Unit cell: (68.3603, 68.3603, 103.953, 90, 90, 90)
Reindex operator: -a,b,-c
Laue group probability: 1.000
Laue group confidence: 1.000
Reindexing operators:
x,y,z: [0, 1, 2, 3]
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
input {
experiments = experiments_0.expt
experiments = experiments_1.expt
experiments = experiments_2.expt
experiments = experiments_3.expt
reflections = reflections_0.refl
reflections = reflections_1.refl
reflections = reflections_2.refl
reflections = reflections_3.refl
}
Hierarchical clustering of unit cells
Using Andrews-Bernstein distance from Andrews & Bernstein J Appl Cryst 47:346 (2014)
Distances have been calculated
Unit cell: (68.3603, 68.3603, 103.953, 90, 90, 90)
0 singletons:
Point group a b c alpha beta gamma
1 cluster:
Cluster_id N_xtals Med_a Med_b Med_c Med_alpha Med_beta Med_gamma Delta(deg)
4 in P422.
cluster_1 4 68.36 (0.01 ) 68.36 (0.01 ) 103.95(0.02 ) 90.00 (0.00) 90.00 (0.00) 90.00 (0.00)
P 4/m m m (No. 123) 68.36 68.36 103.95 90.00 90.00 90.00 0.0
Standard deviations are in brackets.
Each cluster:
Input lattice count, with integration Bravais setting space group.
Cluster median with Niggli cell parameters (std dev in brackets).
Highest possible metric symmetry and unit cell using LePage (J Appl Cryst 1982, 15:255) method, maximum delta 3deg.
Filtering reflections for dataset 0
Read 76079 predicted reflections
Selected 54367 reflections integrated by profile and summation methods
Combined 1127 partial reflections with other partial reflections
Removed 20 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 1
Read 75607 predicted reflections
Selected 54845 reflections integrated by profile and summation methods
Combined 1284 partial reflections with other partial reflections
Removed 50 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 14 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 2
Read 77983 predicted reflections
Selected 54461 reflections integrated by profile and summation methods
Combined 1404 partial reflections with other partial reflections
Removed 38 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 8 intensity.prf.value reflections with I/Sig(I) < -5
Filtering reflections for dataset 3
Read 76468 predicted reflections
Selected 53877 reflections integrated by profile and summation methods
Combined 1062 partial reflections with other partial reflections
Removed 8 reflections below partiality threshold
Removed 0 intensity.sum.value reflections with I/Sig(I) < -5
Removed 5 intensity.prf.value reflections with I/Sig(I) < -5
Patterson group: P 4/m m m
--------------------------------------------------------------------------------
Normalising intensities for dataset 1
ML estimate of overall B value:
13.52 A**2
ML estimate of -log of scale factor:
-3.04
--------------------------------------------------------------------------------
Normalising intensities for dataset 2
ML estimate of overall B value:
11.06 A**2
ML estimate of -log of scale factor:
-3.50
--------------------------------------------------------------------------------
Normalising intensities for dataset 3
ML estimate of overall B value:
11.38 A**2
ML estimate of -log of scale factor:
-2.96
--------------------------------------------------------------------------------
Normalising intensities for dataset 4
ML estimate of overall B value:
12.14 A**2
ML estimate of -log of scale factor:
-2.67
--------------------------------------------------------------------------------
Estimation of resolution for Laue group analysis
Removing 3 Wilson outliers with E^2 >= 16.0
Resolution estimate from <I>/<σ(I)> > 4.0 : 2.16
Resolution estimate from CC½ > 0.60: 1.83
High resolution limit set to: 1.83
Selecting 148639 reflections with d > 1.83
================================================================================
Automatic determination of number of dimensions for analysis
+--------------+--------------+
| Dimensions | Functional |
|--------------+--------------|
| 1 | 14.747 |
| 2 | 15.8561 |
| 3 | 14.5986 |
| 4 | 14.9292 |
| 5 | 15.3244 |
| 6 | 14.8874 |
| 7 | 15.2239 |
| 8 | 14.8612 |
+--------------+--------------+
Best number of dimensions: 8
Using 8 dimensions for analysis
Principal component analysis:
Explained variance: 0.0083, 0.0022, 0.0018, 0.0015, 0.0013, 0.00077, 0.00022, 2.2e-05
Explained variance ratio: 0.52, 0.14, 0.11, 0.093, 0.079, 0.048, 0.014, 0.0014
Scoring individual symmetry elements
+--------------+--------+------+-----+-----------------+
| likelihood | Z-CC | CC | | Operator |
|--------------+--------+------+-----+-----------------|
| 0.939 | 9.78 | 0.98 | *** | 4 |(0, 0, 1) |
| 0.942 | 9.83 | 0.98 | *** | 4^-1 |(0, 0, 1) |
| 0.942 | 9.83 | 0.98 | *** | 2 |(1, 0, 0) |
| 0.941 | 9.81 | 0.98 | *** | 2 |(0, 1, 0) |
| 0.943 | 9.85 | 0.98 | *** | 2 |(0, 0, 1) |
| 0.943 | 9.85 | 0.98 | *** | 2 |(1, 1, 0) |
| 0.941 | 9.82 | 0.98 | *** | 2 |(-1, 1, 0) |
+--------------+--------+------+-----+-----------------+
Scoring all possible sub-groups
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
| Patterson group | | Likelihood | NetZcc | Zcc+ | Zcc- | delta | Reindex operator |
|-------------------+-----+--------------+----------+--------+--------+---------+--------------------|
| P 4/m m m | *** | 1 | 9.82 | 9.82 | 0 | 0 | -a,b,-c |
| C m m m | | 0 | 0.03 | 9.84 | 9.81 | 0 | a+b,-a+b,c |
| P m m m | | 0 | 0.01 | 9.83 | 9.82 | 0 | -a,b,-c |
| P 4/m | | 0 | -0.01 | 9.82 | 9.82 | 0 | -a,b,-c |
| P 1 2/m 1 | | 0 | 0.03 | 9.85 | 9.82 | 0 | -a,-c,-b |
| C 1 2/m 1 | | 0 | 0.03 | 9.85 | 9.82 | 0 | a-b,a+b,c |
| P 1 2/m 1 | | 0 | 0.01 | 9.83 | 9.82 | 0 | -b,-a,-c |
| C 1 2/m 1 | | 0 | -0.01 | 9.82 | 9.82 | 0 | a+b,-a+b,c |
| P 1 2/m 1 | | 0 | -0.02 | 9.81 | 9.82 | 0 | -a,b,-c |
| P -1 | | 0 | -9.82 | 0 | 9.82 | 0 | -a,b,-c |
+-------------------+-----+--------------+----------+--------+--------+---------+--------------------+
Best solution: P 4/m m m
Unit cell: (68.3603, 68.3603, 103.953, 90, 90, 90)
Reindex operator: -a,b,-c
Laue group probability: 1.000
Laue group confidence: 1.000
Reindexing operators:
x,y,z: [0, 1, 2, 3]
Writing html report to: dials.cosym.html
Writing json to: dials.cosym.json
Saving reindexed experiments to symmetrized.expt
Saving reindexed reflections to symmetrized.refl
As you can see, the \(P\,4/m\,m\,m\) Patterson group is found with the highest confidence. For the corresponding space group, the mirror symmetries are removed to give \(P\,4\,2\,2\), as the chiral nature of macromolecules means we have a restricted choice of space groups. In this example, all datasets were indexed consistently, but this is not the case in general.
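To make the reindex operators in the table concrete, the sketch below applies an operator to a Miller index. It assumes the operator is read in the usual basis-vector sense, (a', b', c') = (a, b, c) P, in which case Miller indices transform the same way, (h', k', l') = (h, k, l) P. The names REINDEX_OP, CMMM_OP and reindex are hypothetical helpers for illustration, not part of the DIALS API.

```python
# Columns of each matrix give the new basis vectors in terms of the old ones.
REINDEX_OP = ((-1, 0, 0),
              (0, 1, 0),
              (0, 0, -1))   # "-a,b,-c", the operator reported for P 4/m m m

CMMM_OP = ((1, -1, 0),
           (1, 1, 0),
           (0, 0, 1))       # "a+b,-a+b,c", the operator listed for C m m m


def reindex(hkl, op=REINDEX_OP):
    """Return (h, k, l) P, i.e. the Miller index in the new setting."""
    return tuple(sum(hkl[i] * op[i][j] for i in range(3)) for j in range(3))


print(reindex((1, 2, 3)))            # index after applying -a,b,-c
print(reindex((1, 2, 3), CMMM_OP))   # index in the C-centred setting
```

Applying -a,b,-c twice returns the original index, as expected for a two-fold operation.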
Next, the data can be scaled:
dials.scale symmetrized.expt symmetrized.refl
From the merging statistics it is clear that the data quality is good out to the furthest resolution (\(CC_{1/2} > 0.3\)), which can be confirmed by a resolution analysis:
dials.estimate_resolution scaled.expt scaled.refl
Resolution cc_half: 1.78
The following parameters have been modified:
input {
experiments = scaled.expt
reflections = scaled.refl
}
DIALS 3.dev.617-g669c71566-release
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Read 74952 predicted reflections
Selected 54370 scaled reflections
Combined 2 partial reflections with other partial reflections
Read 74323 predicted reflections
Selected 54522 scaled reflections
Combined 3 partial reflections with other partial reflections
Read 76579 predicted reflections
Selected 54045 scaled reflections
Combined 4 partial reflections with other partial reflections
Read 75406 predicted reflections
Selected 53975 scaled reflections
Combined 2 partial reflections with other partial reflections
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Removing 9 Wilson outliers with E^2 >= 16.0
Resolution cc_half: 1.78
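The idea of a CC½-based cutoff can be illustrated with the resolution shells printed by xia2.multiplex earlier in this tutorial. The sketch below simply scans the shells from low to high resolution and stops at the first one whose CC½ falls below a threshold. This is not how dials.estimate_resolution is implemented; the function resolution_cutoff is a hypothetical helper that only shows why a threshold of 0.3 leaves the limit at the edge of the data here.

```python
# (d_min in Angstrom, CC1/2) for each shell, copied from the
# "Resolution shells" table in the xia2.multiplex log above.
SHELLS = [
    (4.83, 0.998), (3.84, 0.999), (3.35, 0.999), (3.05, 0.999),
    (2.83, 0.999), (2.66, 0.998), (2.53, 0.998), (2.42, 0.997),
    (2.32, 0.996), (2.24, 0.993), (2.17, 0.992), (2.11, 0.990),
    (2.06, 0.990), (2.01, 0.983), (1.96, 0.973), (1.92, 0.962),
    (1.88, 0.946), (1.84, 0.936), (1.81, 0.902), (1.78, 0.958),
]


def resolution_cutoff(shells, min_cc_half=0.3):
    """Return d_min of the last shell before CC1/2 first drops below the threshold.

    Shells must be ordered from low to high resolution. Returns None if
    even the first shell fails the test.
    """
    limit = None
    for d_min, cc_half in shells:
        if cc_half < min_cc_half:
            break
        limit = d_min
    return limit


print(resolution_cutoff(SHELLS))         # every shell passes CC1/2 > 0.3
print(resolution_cutoff(SHELLS, 0.95))   # a much stricter, illustrative threshold
```

Every shell in this dataset has CC½ well above 0.3, so the scan runs to the final shell at 1.78 Å, in line with the value reported above.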
If the estimated resolution limit fell short of the full extent of the data, scaling would be rerun with that limit applied, for example:
dials.scale scaled.expt scaled.refl d_min=1.78
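The log of this scaling run reports a "basic" error model of the form σ'² = a²(σ² + (bI)²), together with an estimated asymptotic I/sigma limit. As a minimal sketch, assuming only the formula printed in the log, the limit follows from letting I grow large, so that σ' approaches a·b·I and I/σ' approaches 1/(a·b). The function names below are illustrative and are not DIALS functions.

```python
import math


def corrected_sigma(intensity, sigma, a, b):
    """Apply the error model from the log: sigma'^2 = a^2 (sigma^2 + (b I)^2)."""
    return a * math.sqrt(sigma ** 2 + (b * intensity) ** 2)


def asymptotic_i_over_sigma(a, b):
    """Limit of I / sigma' for very strong reflections."""
    return 1.0 / (a * b)


# Parameters of the loaded error model, as printed in the log.
a, b = 0.78765, 0.05985
print(round(asymptotic_i_over_sigma(a, b), 3))  # 21.213, as in the log
```

The practical consequence is that, under this model, even the strongest reflections cannot have I/σ above roughly 1/(a·b), however well they were measured.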
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
cut_data {
d_min = 1.78
}
input {
experiments = scaled.expt
reflections = scaled.refl
}
Checking for the existence of a reflection table
containing multiple datasets
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Found 4 reflection tables & 4 experiments in total.
Dataset ids are: 0,1,2,3
Space group being used during scaling is P 4 2 2
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Scaling models have been initialised for all experiments.
================================================================================
The experiment id for this dataset is 0.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 20037/74952 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12546
criterion: excluded for scaling, reflections: 20037
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.78765, b = 0.05985
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.213
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 1.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 19172/74323 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 11927
criterion: excluded for scaling, reflections: 19172
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.78765, b = 0.05985
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.213
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 2.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 21795/76579 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 14121
criterion: excluded for scaling, reflections: 21795
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.78765, b = 0.05985
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.213
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 3.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 20789/75406 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12917
criterion: excluded for scaling, reflections: 20789
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.78765, b = 0.05985
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 21.213
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
Configuring a MultiScaler to handle the individual Scalers.
Determining symmetry equivalent reflections across datasets.
Using quasi-random reflection selection. Selecting from 17132 symmetry groups
with <I/sI> > 1.0 (203702 reflections)). Selection target of 716.56 reflections
from each dataset, with a total number between 11465.02 and 13758.03.
Summary of cross-dataset reflection groups chosen (1277 groups, 13392 reflections):
+---------------+------------+----------+-----+-----+-----+-----+
| d-range | n_groups | n_refl | 0 | 1 | 2 | 3 |
|---------------+------------+----------+-----+-----+-----+-----|
| 68.43 - 7.961 | 42 | 689 | 105 | 203 | 178 | 203 |
| 7.961 - 5.649 | 31 | 727 | 70 | 241 | 199 | 217 |
| 5.649 - 4.617 | 33 | 831 | 72 | 277 | 231 | 251 |
| 4.617 - 4.001 | 34 | 912 | 70 | 304 | 262 | 276 |
| 4.001 - 3.58 | 37 | 982 | 72 | 324 | 285 | 301 |
| 3.58 - 3.269 | 31 | 885 | 72 | 296 | 242 | 275 |
| 3.269 - 3.027 | 34 | 846 | 70 | 277 | 228 | 271 |
| 3.027 - 2.832 | 31 | 775 | 70 | 253 | 209 | 243 |
| 2.832 - 2.67 | 34 | 818 | 71 | 263 | 220 | 264 |
| 2.67 - 2.533 | 34 | 773 | 71 | 248 | 207 | 247 |
| 2.533 - 2.415 | 31 | 560 | 70 | 197 | 190 | 103 |
| 2.415 - 2.313 | 38 | 551 | 104 | 177 | 200 | 70 |
| 2.313 - 2.222 | 41 | 523 | 121 | 165 | 167 | 70 |
| 2.222 - 2.141 | 46 | 505 | 133 | 164 | 137 | 71 |
| 2.141 - 2.069 | 77 | 594 | 153 | 176 | 105 | 160 |
| 2.069 - 2.003 | 104 | 582 | 150 | 150 | 105 | 177 |
| 2.003 - 1.943 | 137 | 657 | 159 | 217 | 141 | 140 |
| 1.943 - 1.889 | 205 | 614 | 141 | 172 | 161 | 140 |
| 1.889 - 1.838 | 213 | 474 | 83 | 125 | 132 | 134 |
| 1.838 - 1.792 | 44 | 94 | 19 | 28 | 32 | 15 |
+---------------+------------+----------+-----+-----+-----+-----+
Summary of reflections chosen for minimisation from each dataset (52727 total):
+--------------+------------------+----------------------+----------------------+--------------------+
| Dataset id | reflections | randomly selected | randomly selected | combined number |
| | connected to | reflections | reflections | of reflections |
| | other datasets | within dataset | across datasets | |
|--------------+------------------+----------------------+----------------------+--------------------|
| 0 | 1876 | 4603 | 6250 | 11840 |
| 1 | 4257 | 4759 | 6416 | 13937 |
| 2 | 3631 | 4673 | 6395 | 13354 |
| 3 | 3628 | 4801 | 6405 | 13596 |
| total | 13392 | 18836 | 25466 | 52727 |
+--------------+------------------+----------------------+----------------------+--------------------+
Completed configuration of MultiScaler.
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 1.53
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52727 | 1.1283 |
| 1 | 52727 | 1.1281 |
| 2 | 52727 | 1.1279 |
| 3 | 52727 | 1.1277 |
| 4 | 52727 | 1.1275 |
| 5 | 52727 | 1.1274 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
3223 outliers have been identified.
Performing multi-dataset profile/summation intensity optimisation.
+-----------------+---------+---------+
| Combination | CC1/2 | Rmeas |
|-----------------+---------+---------|
| prf only | 0.99924 | 0.05433 |
| sum only | 0.99898 | 0.061 |
| Imid = 850.18 | 0.99571 | 0.17467 |
| Imid = 23374.11 | 0.86957 | 0.26031 |
| Imid = 2337.41 | 0.98762 | 0.2736 |
| Imid = 233.74 | 0.99841 | 0.09012 |
+-----------------+---------+---------+
Profile intensities determined to be best for scaling.
Combined outlier rejection has been performed across multiple datasets,
448 outliers have been identified.
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 1.11
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52626 | 1.061 |
| 1 | 52626 | 1.0609 |
| 2 | 52626 | 1.0607 |
| 3 | 52626 | 1.0607 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
452 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.78272, b = 0.06434
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.856
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10337.23 - 1839.37 | 1435 | 33.605 | 1.022 |
| 1839.37 - 1409.14 | 1435 | 25.237 | 0.982 |
| 1409.14 - 1185.17 | 1435 | 14.464 | 0.956 |
| 1185.17 - 928.58 | 2770 | 13.131 | 0.993 |
| 928.58 - 508.36 | 12281 | 8.108 | 1.015 |
| 508.36 - 278.31 | 21018 | 4.739 | 1.098 |
| 278.31 - 152.36 | 29020 | 2.742 | 1.194 |
| 152.36 - 83.41 | 31615 | 1.723 | 1.204 |
| 83.41 - 45.67 | 26243 | 1.271 | 1.216 |
| 45.67 - 24.99 | 16271 | 0.917 | 1.087 |
+--------------------------+----------+------------------------+----------------------+
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 14.03
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52626 | 1.0311 |
| 1 | 52626 | 1.0295 |
| 2 | 52626 | 1.0291 |
| 3 | 52626 | 1.0289 |
| 4 | 52626 | 1.0287 |
| 5 | 52626 | 1.0286 |
| 6 | 52626 | 1.0286 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 3.94
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52626 | 1.0286 |
| 1 | 52626 | 1.0286 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Calculating error estimates of inverse scale factors.
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
391 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.77852, b = 0.06513
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.723
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10271.35 - 1840.85 | 1435 | 33.359 | 0.995 |
| 1840.85 - 1412.28 | 1435 | 25.965 | 1.024 |
| 1412.28 - 1186.78 | 1435 | 14.742 | 0.96 |
| 1186.78 - 925.05 | 2838 | 13.292 | 0.976 |
| 925.05 - 506.75 | 12336 | 8.072 | 1.008 |
| 506.75 - 277.61 | 21033 | 4.813 | 1.108 |
| 277.61 - 152.08 | 29020 | 2.781 | 1.193 |
| 152.08 - 83.31 | 31611 | 1.742 | 1.213 |
| 83.31 - 45.64 | 26190 | 1.279 | 1.222 |
| 45.64 - 24.99 | 16251 | 0.922 | 1.098 |
+--------------------------+----------+------------------------+----------------------+
The reflection table variances have been adjusted to account for the
uncertainty in the scaling models for all datasets
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Total time taken: 26.9728s
================================================================================
Warning: Over half (50.60%) of model parameters have significant
uncertainty (sigma/abs(parameter) > 0.5), which could indicate a
poorly-determined scaling problem or overparameterisation.
Summary of dataset partialities
+------------------+----------+
| Partiality (p) | n_refl |
|------------------+----------|
| all reflections | 301260 |
| p > 0.99 | 246904 |
| 0.5 < p < 0.99 | 2533 |
| 0.01 < p < 0.5 | 5463 |
| p < 0.01 | 46360 |
+------------------+----------+
Reflections below a partiality_cutoff of 0.4 are not considered for any
part of the scaling analysis or for the reporting of merging statistics.
Additionally, if applicable, only reflections with a min_partiality > 0.95
were considered for use when refining the scaling model.
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21417 1380 15.52 100.00 291.1 55.1 0.048 0.050 0.013 0.026 0.999* 0.111*
4.83 3.84 20911 1271 16.45 100.00 426.2 58.5 0.046 0.048 0.012 0.025 0.999* -0.039
3.84 3.35 20681 1256 16.47 100.00 324.2 54.8 0.050 0.051 0.012 0.028 0.999* -0.056
3.35 3.05 20822 1226 16.98 100.00 223.1 50.6 0.056 0.057 0.014 0.030 0.999* -0.033
3.05 2.83 20697 1221 16.95 100.00 153.7 44.6 0.061 0.063 0.015 0.033 0.999* -0.087
2.83 2.66 21290 1233 17.27 100.00 127.6 42.4 0.067 0.069 0.016 0.038 0.998* -0.082
2.66 2.53 20399 1197 17.04 100.00 100.9 36.2 0.075 0.077 0.019 0.042 0.999* 0.022
2.53 2.42 16664 1212 13.75 100.00 90.0 30.9 0.081 0.084 0.023 0.049 0.997* 0.049
2.42 2.32 12512 1213 10.31 99.92 81.2 26.5 0.082 0.087 0.027 0.061 0.996* 0.062
2.32 2.24 10403 1207 8.62 99.92 76.6 23.0 0.084 0.089 0.030 0.068 0.996* 0.021
2.24 2.17 8636 1191 7.25 99.75 68.3 19.5 0.086 0.093 0.034 0.069 0.993* -0.185
2.17 2.11 6956 1189 5.85 99.33 58.0 15.5 0.097 0.106 0.042 0.093 0.991* -0.111
2.11 2.06 5477 1147 4.78 95.98 52.7 13.0 0.097 0.108 0.047 0.107 0.992* -0.019
2.06 2.01 4340 1110 3.91 92.65 45.4 10.8 0.109 0.124 0.059 0.127 0.986* -0.008
2.01 1.96 3144 1029 3.06 86.40 38.9 8.5 0.120 0.143 0.076 0.168 0.977* -0.048
1.96 1.92 2115 929 2.28 77.35 35.1 6.9 0.121 0.150 0.086 0.168 0.967* -0.319
1.92 1.88 1171 716 1.64 60.17 29.7 5.1 0.137 0.181 0.117 0.249 0.950* 0.947
1.88 1.84 816 564 1.45 48.83 26.2 4.3 0.156 0.213 0.143 0.363 0.942* 0.000
1.84 1.81 489 383 1.28 31.63 23.5 3.7 0.183 0.252 0.171 0.291 0.910* 0.000
1.81 1.78 136 125 1.09 10.52 16.3 2.5 0.136 0.188 0.130 9.566 0.963* 0.000
68.36 1.78 219076 20799 10.53 85.44 133.2 29.8 0.058 0.060 0.016 0.041 0.999* -0.022
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.4 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 29.8 55.1 2.5
Rmerge(I) 0.058 0.048 0.136
Rmerge(I+/-) 0.056 0.048 0.112
Rmeas(I) 0.060 0.050 0.188
Rmeas(I+/-) 0.060 0.050 0.155
Rpim(I) 0.016 0.013 0.130
Rpim(I+/-) 0.021 0.016 0.106
CC half 0.999 0.999 0.963
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.022 0.111 0.000
Anomalous slope 1.019
dF/F 0.037
dI/s(dI) 0.850
Total observations 219076 21417 136
Total unique 20799 1380 125
Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options
To explore the scaling results, a wide variety of scaling and merging plots
can be found in the dials.scale.html report generated by dials.scale.
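The error model reported in the log can be checked by hand. The snippet below is a minimal sketch (the function names are illustrative and are not part of DIALS) that applies the logged formula σ'² = a²(σ² + (bI)²). For strong reflections the (bI)² term dominates, so I/σ' tends towards 1/(ab), which should agree with the logged asymptotic limit to within the rounding of the printed parameters:

```python
import math

# Error model parameters copied from the dials.scale log above.
a = 0.77852
b = 0.06513


def corrected_sigma(intensity, sigma):
    """Apply the logged error model: sigma'^2 = a^2 * (sigma^2 + (b*I)^2)."""
    return a * math.sqrt(sigma**2 + (b * intensity) ** 2)


def asymptotic_i_over_sigma():
    """Limit of I/sigma' as I becomes very large: 1 / (a * b)."""
    return 1.0 / (a * b)


limit = asymptotic_i_over_sigma()
print(f"asymptotic I/sigma limit: {limit:.2f}")

# For a very strong reflection, I/sigma' approaches the limit from below.
strong = 1.0e6 / corrected_sigma(1.0e6, 1.0e3)
print(f"I/sigma' for a very strong reflection: {strong:.2f}")
```

In practice this limit means that, however strong a reflection is, its corrected I/sigma cannot exceed roughly 1/(ab) under this error model.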
Almost there¶
As mentioned previously, the fourth dataset gives significantly higher
R-merge values and much lower I/sigma than the others.
The question, therefore, is whether it would be better to exclude this dataset.
We can get some useful information about the agreement between datasets by
running the program dials.compute_delta_cchalf. This program implements
a version of the algorithms described in Assmann et al.:
dials.compute_delta_cchalf scaled.refl scaled.expt
Read 301260 predicted reflections
Selected 219076 scaled reflections
Combined 18 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Removed 2113 reflections below partiality threshold
Resolution bins
0: 68.363, 5.616
1: 5.616, 3.978
2: 3.978, 3.250
3: 3.250, 2.815
4: 2.815, 2.518
5: 2.518, 2.299
6: 2.299, 2.129
7: 2.129, 1.991
8: 1.991, 1.878
9: 1.878, 1.781
Summary of input data:
# Datasets: 4
# Groups: 4
# Reflections: 216878
# Unique reflections: 20793
CC 1/2 mean: 99.280
CC 1/2 excluding group 0: 99.315
CC 1/2 excluding group 1: 99.294
CC 1/2 excluding group 2: 99.256
CC 1/2 excluding group 3: 99.187
Dataset: 0, ΔCC½: -0.035
Dataset: 1, ΔCC½: -0.014
Dataset: 2, ΔCC½: 0.025
Dataset: 3, ΔCC½: 0.093
mean delta_cc_half: 0.017
stddev delta_cc_half: 0.049
cutoff value: -0.178
Writing table to delta_cchalf.dat
Saving 301260 reflections to filtered.refl
Saving the experiments to filtered.expt
Writing html report to: compute_delta_cchalf.html
A dataset with a large negative \(\Delta CC_{1/2}\) is one whose removal raises the overall \(CC_{1/2}\), which makes it a candidate for exclusion. In the output above, however, no dataset falls below the cutoff value of -0.178: datasets 0 and 1 are only marginally negative (-0.035 and -0.014), while the final dataset (dataset 3) has the largest positive value (0.093), so \(CC_{1/2}\) falls from 99.280 to 99.187 when it is left out. More generally, how bad is too bad to warrant exclusion? Unfortunately this is a difficult question to answer, and one may need to refine several structures with different data excluded to address it properly. If we had many datasets and only a small fraction had a very large negative \(\Delta CC_{1/2}\), then one could argue that these measurements are not drawn from the same population as the rest of the data and should be excluded.
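The summary values printed by dials.compute_delta_cchalf can be reproduced from the per-dataset numbers. The following is a minimal sketch, not the program's own implementation. It takes \(\Delta CC_{1/2}\) as the \(CC_{1/2}\) of all data minus the \(CC_{1/2}\) obtained without a given dataset, and it assumes that the cutoff is the mean minus four population standard deviations. That factor of four is an assumption which happens to reproduce the reported cutoff; the setting actually used is not shown in the log.

```python
import statistics

# CC1/2 values (in percent) as reported by dials.compute_delta_cchalf.
cc_half_all = 99.280
cc_half_excluding = {0: 99.315, 1: 99.294, 2: 99.256, 3: 99.187}

# Delta CC1/2 for a dataset: CC1/2 of all data minus CC1/2 without that dataset.
# A negative value means CC1/2 rises when the dataset is left out.
delta_from_cc = {i: cc_half_all - cc for i, cc in cc_half_excluding.items()}

# Delta CC1/2 values as printed in the log (derived there from unrounded CC1/2,
# so they can differ from delta_from_cc in the last digit).
delta_logged = [-0.035, -0.014, 0.025, 0.093]

mean = statistics.fmean(delta_logged)
stddev = statistics.pstdev(delta_logged)  # population standard deviation

# Assumption: cutoff = mean - 4 * stddev (chosen because it matches the log).
n_sigma = 4.0
cutoff = mean - n_sigma * stddev

print(f"mean = {mean:.3f}, stddev = {stddev:.3f}, cutoff = {cutoff:.3f}")

# Datasets whose Delta CC1/2 lies below the cutoff would be flagged.
flagged = [i for i, d in enumerate(delta_logged) if d < cutoff]
print("datasets below cutoff:", flagged)
```

With only four datasets the standard deviation is poorly determined, so a cutoff of this kind is more informative when many datasets are available.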
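To see the effect of removing the last dataset (dataset ‘3’), we can rerun
dials.scale with the option exclude_datasets=3 (note that this will overwrite
the previous scaled files). The command and log reproduced here apply only a
resolution cutoff of d_min=1.78, so all four datasets are still present in this run: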
dials.scale scaled.expt scaled.refl d_min=1.78
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
cut_data {
d_min = 1.78
}
input {
experiments = scaled.expt
reflections = scaled.refl
}
Checking for the existence of a reflection table
containing multiple datasets
Detected existence of a multi-dataset reflection table
containing 4 datasets.
Found 4 reflection tables & 4 experiments in total.
Dataset ids are: 0,1,2,3
Space group being used during scaling is P 4 2 2
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Scaling models have been initialised for all experiments.
================================================================================
The experiment id for this dataset is 0.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 20037/74952 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12546
criterion: excluded for scaling, reflections: 20037
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.77852, b = 0.06513
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.723
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 1.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 19172/74323 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 11927
criterion: excluded for scaling, reflections: 19172
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.77852, b = 0.06513
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.723
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 2.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 21795/76579 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 14121
criterion: excluded for scaling, reflections: 21795
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.77852, b = 0.06513
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.723
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
The experiment id for this dataset is 3.
The scaling model type being applied is physical.
Applying filter of min_isigi > -5.0, partiality > 0.4
Excluding 20789/75406 reflections
Reflections passing individual criteria:
criterion: user excluded, reflections: 12917
criterion: excluded for scaling, reflections: 20789
The following corrections will be applied to this dataset:
+--------------+----------------+
| correction | n_parameters |
|--------------+----------------|
| scale | 10 |
| decay | 8 |
| absorption | 24 |
+--------------+----------------+
Loaded error model:
Error model details:
Type: basic
Parameters: a = 0.77852, b = 0.06513
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.723
Using previously determined optimal intensity choice: profile intensities
Completed preprocessing and initialisation for this dataset.
================================================================================
Configuring a MultiScaler to handle the individual Scalers.
Determining symmetry equivalent reflections across datasets.
Using quasi-random reflection selection. Selecting from 17132 symmetry groups
with <I/sI> > 1.0 (203702 reflections)). Selection target of 716.56 reflections
from each dataset, with a total number between 11465.02 and 13758.03.
Summary of cross-dataset reflection groups chosen (1277 groups, 13392 reflections):
+---------------+------------+----------+-----+-----+-----+-----+
| d-range | n_groups | n_refl | 0 | 1 | 2 | 3 |
|---------------+------------+----------+-----+-----+-----+-----|
| 68.43 - 7.961 | 42 | 689 | 105 | 203 | 178 | 203 |
| 7.961 - 5.649 | 31 | 727 | 70 | 241 | 199 | 217 |
| 5.649 - 4.617 | 33 | 831 | 72 | 277 | 231 | 251 |
| 4.617 - 4.001 | 34 | 912 | 70 | 304 | 262 | 276 |
| 4.001 - 3.58 | 37 | 982 | 72 | 324 | 285 | 301 |
| 3.58 - 3.269 | 31 | 885 | 72 | 296 | 242 | 275 |
| 3.269 - 3.027 | 34 | 846 | 70 | 277 | 228 | 271 |
| 3.027 - 2.832 | 31 | 775 | 70 | 253 | 209 | 243 |
| 2.832 - 2.67 | 34 | 818 | 71 | 263 | 220 | 264 |
| 2.67 - 2.533 | 34 | 773 | 71 | 248 | 207 | 247 |
| 2.533 - 2.415 | 31 | 560 | 70 | 197 | 190 | 103 |
| 2.415 - 2.313 | 38 | 551 | 104 | 177 | 200 | 70 |
| 2.313 - 2.222 | 41 | 523 | 121 | 165 | 167 | 70 |
| 2.222 - 2.141 | 46 | 505 | 133 | 164 | 137 | 71 |
| 2.141 - 2.069 | 77 | 594 | 153 | 176 | 105 | 160 |
| 2.069 - 2.003 | 104 | 582 | 150 | 150 | 105 | 177 |
| 2.003 - 1.943 | 137 | 657 | 159 | 217 | 141 | 140 |
| 1.943 - 1.889 | 205 | 614 | 141 | 172 | 161 | 140 |
| 1.889 - 1.838 | 213 | 474 | 83 | 125 | 132 | 134 |
| 1.838 - 1.792 | 44 | 94 | 19 | 28 | 32 | 15 |
+---------------+------------+----------+-----+-----+-----+-----+
Summary of reflections chosen for minimisation from each dataset (52727 total):
+--------------+------------------+----------------------+----------------------+--------------------+
| Dataset id | reflections | randomly selected | randomly selected | combined number |
| | connected to | reflections | reflections | of reflections |
| | other datasets | within dataset | across datasets | |
|--------------+------------------+----------------------+----------------------+--------------------|
| 0 | 1876 | 4603 | 6250 | 11840 |
| 1 | 4257 | 4759 | 6416 | 13937 |
| 2 | 3631 | 4673 | 6395 | 13354 |
| 3 | 3628 | 4801 | 6405 | 13596 |
| total | 13392 | 18836 | 25466 | 52727 |
+--------------+------------------+----------------------+----------------------+--------------------+
Completed configuration of MultiScaler.
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 0.80
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52727 | 1.094 |
| 1 | 52727 | 1.094 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
3164 outliers have been identified.
Performing multi-dataset profile/summation intensity optimisation.
+-----------------+---------+---------+
| Combination | CC1/2 | Rmeas |
|-----------------+---------+---------|
| prf only | 0.99924 | 0.05429 |
| sum only | 0.99901 | 0.06091 |
| Imid = 850.52 | 0.99576 | 0.17464 |
| Imid = 23374.11 | 0.88695 | 0.25985 |
| Imid = 2337.41 | 0.98739 | 0.27416 |
| Imid = 233.74 | 0.99842 | 0.09007 |
+-----------------+---------+---------+
Profile intensities determined to be best for scaling.
Combined outlier rejection has been performed across multiple datasets,
385 outliers have been identified.
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with an LBFGS minimizer.
Time taken for refinement 0.78
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52640 | 1.0319 |
| 1 | 52640 | 1.0319 |
+--------+--------+----------+
RMSD no longer decreasing
lbfgs minimizer stop: callback_after_step is True
================================================================================
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
388 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.77675, b = 0.06542
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.680
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10276.98 - 1840.30 | 1435 | 33.432 | 0.992 |
| 1840.30 - 1411.96 | 1435 | 25.957 | 1.02 |
| 1411.96 - 1186.53 | 1435 | 14.751 | 0.956 |
| 1186.53 - 925.34 | 2836 | 13.306 | 0.973 |
| 925.34 - 506.88 | 12330 | 8.094 | 1.006 |
| 506.88 - 277.66 | 21037 | 4.816 | 1.106 |
| 277.66 - 152.10 | 29022 | 2.783 | 1.192 |
| 152.10 - 83.32 | 31610 | 1.746 | 1.215 |
| 83.32 - 45.64 | 26201 | 1.279 | 1.225 |
| 45.64 - 24.99 | 16249 | 0.925 | 1.102 |
+--------------------------+----------+------------------------+----------------------+
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 3.99
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52641 | 1.0323 |
| 1 | 52641 | 1.0322 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Components to be refined in this cycle for all datasets: scale, decay, absorption
Performing a round of scaling with a Levenberg-Marquardt minimizer.
Time taken for refinement 3.96
Refinement steps:
+--------+--------+----------+
| Step | Nref | RMSD_I |
| | | (a.u) |
|--------+--------+----------|
| 0 | 52641 | 1.0322 |
| 1 | 52641 | 1.0322 |
+--------+--------+----------+
RMSD no longer decreasing
================================================================================
Calculating error estimates of inverse scale factors.
Scale factors determined during minimisation have now been
applied to all datasets.
Combined outlier rejection has been performed across multiple datasets,
390 outliers have been identified.
Determining a combined error model for all datasets
Performing a round of error model refinement.
Error model details:
Type: basic
Parameters: a = 0.77657, b = 0.06540
Error model formula: σ'² = a²(σ² + (bI)²)
estimated I/sigma asymptotic limit: 19.690
Results of error model refinement. Uncorrected and corrected variances
of normalised intensity deviations for given intensity ranges. Variances
are expected to be ~1.0 for reliable errors (sigmas).
+--------------------------+----------+------------------------+----------------------+
| Intensity range (<Ih>) | n_refl | Uncorrected variance | Corrected variance |
|--------------------------+----------+------------------------+----------------------|
| 10267.95 - 1840.84 | 1435 | 33.166 | 0.988 |
| 1840.84 - 1412.34 | 1435 | 26.027 | 1.026 |
| 1412.34 - 1186.57 | 1435 | 14.577 | 0.946 |
| 1186.57 - 924.85 | 2842 | 13.243 | 0.971 |
| 924.85 - 506.66 | 12345 | 8.092 | 1.009 |
| 506.66 - 277.56 | 21031 | 4.803 | 1.104 |
| 277.56 - 152.06 | 29009 | 2.779 | 1.192 |
| 152.06 - 83.30 | 31612 | 1.747 | 1.215 |
| 83.30 - 45.64 | 26187 | 1.279 | 1.225 |
| 45.64 - 24.99 | 16257 | 0.925 | 1.103 |
+--------------------------+----------+------------------------+----------------------+
The reflection table variances have been adjusted to account for the
uncertainty in the scaling models for all datasets
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Total time taken: 15.7917s
================================================================================
Warning: Over half (51.19%) of model parameters have significant
uncertainty (sigma/abs(parameter) > 0.5), which could indicate a
poorly-determined scaling problem or overparameterisation.
Summary of dataset partialities
+------------------+----------+
| Partiality (p) | n_refl |
|------------------+----------|
| all reflections | 301260 |
| p > 0.99 | 246904 |
| 0.5 < p < 0.99 | 2533 |
| 0.01 < p < 0.5 | 5463 |
| p < 0.01 | 46360 |
+------------------+----------+
Reflections below a partiality_cutoff of 0.4 are not considered for any
part of the scaling analysis or for the reporting of merging statistics.
Additionally, if applicable, only reflections with a min_partiality > 0.95
were considered for use when refining the scaling model.
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21417 1380 15.52 100.00 291.7 55.0 0.048 0.050 0.013 0.026 0.999* 0.109*
4.83 3.84 20912 1271 16.45 100.00 427.0 58.5 0.046 0.048 0.012 0.025 0.999* -0.039
3.84 3.35 20681 1256 16.47 100.00 324.9 54.8 0.050 0.051 0.012 0.028 0.999* -0.054
3.35 3.05 20823 1226 16.98 100.00 223.6 50.6 0.056 0.057 0.014 0.030 0.999* -0.032
3.05 2.83 20697 1221 16.95 100.00 154.0 44.6 0.061 0.063 0.015 0.033 0.999* -0.087
2.83 2.66 21290 1233 17.27 100.00 127.8 42.5 0.067 0.069 0.016 0.038 0.998* -0.083
2.66 2.53 20398 1197 17.04 100.00 101.1 36.2 0.075 0.077 0.019 0.042 0.999* 0.016
2.53 2.42 16664 1212 13.75 100.00 90.2 30.9 0.081 0.084 0.023 0.049 0.997* 0.049
2.42 2.32 12512 1213 10.31 99.92 81.4 26.6 0.082 0.087 0.027 0.061 0.996* 0.061
2.32 2.24 10403 1207 8.62 99.92 76.7 23.0 0.084 0.089 0.030 0.068 0.996* 0.021
2.24 2.17 8636 1191 7.25 99.75 68.4 19.5 0.086 0.093 0.034 0.069 0.993* -0.186
2.17 2.11 6956 1189 5.85 99.33 58.1 15.6 0.097 0.106 0.042 0.093 0.991* -0.110
2.11 2.06 5477 1147 4.78 95.98 52.8 13.1 0.097 0.108 0.047 0.107 0.992* -0.019
2.06 2.01 4340 1110 3.91 92.65 45.5 10.8 0.109 0.124 0.059 0.127 0.986* -0.009
2.01 1.96 3144 1029 3.06 86.40 39.0 8.5 0.120 0.143 0.076 0.168 0.977* -0.048
1.96 1.92 2115 929 2.28 77.35 35.2 6.9 0.121 0.149 0.086 0.168 0.967* -0.321
1.92 1.88 1171 716 1.64 60.17 29.8 5.1 0.137 0.181 0.117 0.249 0.950* 0.947
1.88 1.84 816 564 1.45 48.83 26.2 4.3 0.156 0.213 0.143 0.363 0.942* 0.000
1.84 1.81 489 383 1.28 31.63 23.5 3.7 0.183 0.252 0.171 0.291 0.910* 0.000
1.81 1.78 136 125 1.09 10.52 16.4 2.5 0.136 0.188 0.130 9.577 0.964* 0.000
68.36 1.78 219077 20799 10.53 85.44 133.5 29.8 0.058 0.060 0.016 0.041 0.999* -0.021
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.4 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 29.8 55.0 2.5
Rmerge(I) 0.058 0.048 0.136
Rmerge(I+/-) 0.056 0.048 0.112
Rmeas(I) 0.060 0.050 0.188
Rmeas(I+/-) 0.060 0.050 0.155
Rpim(I) 0.016 0.013 0.130
Rpim(I+/-) 0.021 0.016 0.106
CC half 0.999 0.999 0.964
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.021 0.109 0.000
Anomalous slope 1.020
dF/F 0.037
dI/s(dI) 0.851
Total observations 219077 21417 136
Total unique 20799 1380 125
Writing html report to dials.scale.html
Saving the scaled experiments to scaled.expt
Saving the scaled reflections to scaled.refl
See dials.github.io/dials_scale_user_guide.html for more info on scaling options
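With dataset 3 excluded, the overall merging statistics look significantly improved, and therefore one would probably proceed with the first three datasets. In the comparison below, each line gives the value for all four datasets followed by the value for the first three: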
Resolution: 68.40 - 1.78 > 68.40 - 1.79
Observations: 222563 > 166095
Unique reflections: 16534 > 16285
Redundancy: 13.5 > 10.2
Completeness: 68.18% > 67.56%
Mean intensity: 45.3 > 46.0
Mean I/sigma(I): 25.0 > 26.1
R-merge: 0.132 > 0.059
R-meas: 0.136 > 0.062
R-pim: 0.033 > 0.017
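As a quick consistency check on the comparison above, the redundancy (multiplicity) is simply the number of observations divided by the number of unique reflections. The short sketch below recomputes it from the listed counts; the variable names are illustrative only:

```python
# Values copied from the comparison above, as (first value, second value).
observations = (222563, 166095)
unique_reflections = (16534, 16285)
reported_redundancy = (13.5, 10.2)

computed = []
for n_obs, n_unique, reported in zip(
    observations, unique_reflections, reported_redundancy
):
    # Multiplicity: average number of measurements per unique reflection.
    multiplicity = n_obs / n_unique
    computed.append(round(multiplicity, 1))
    print(f"{n_obs} / {n_unique} = {multiplicity:.1f} (reported {reported})")
```

The drop in redundancy is what one would expect when roughly a quarter of the observations are removed while the number of unique reflections barely changes.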
We could have also excluded a subset of images, for example using the option
exclude_images=3:301:600 to exclude the last 300 images of dataset 3.
This option could be used to exclude the end of a dataset that was showing
significant radiation damage, or if the crystal had moved out of the beam part-way
through the measurement.
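The exclude_images value above reads as a dataset id, a first image and a last image, separated by colons. The sketch below illustrates that reading. Note that parse_exclude_images and ImageExclusion are hypothetical helpers written for this example, not DIALS functions, and treating the range as inclusive is an assumption that is consistent with 301:600 covering 300 images:

```python
from typing import NamedTuple


class ImageExclusion(NamedTuple):
    dataset_id: int
    first_image: int
    last_image: int

    @property
    def n_images(self) -> int:
        # Inclusive range (assumption): 301..600 covers 300 images.
        return self.last_image - self.first_image + 1


def parse_exclude_images(value: str) -> ImageExclusion:
    """Split a 'dataset:first:last' string into its three integer fields."""
    dataset_id, first, last = (int(part) for part in value.split(":"))
    if first > last:
        raise ValueError(f"first image {first} is after last image {last}")
    return ImageExclusion(dataset_id, first, last)


exclusion = parse_exclude_images("3:301:600")
print(exclusion, exclusion.n_images)
```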
It is also worth checking the assigned space group using dials.symmetry.
In dials.cosym, only the Laue/Patterson group was tested to determine a space
group of \(P\,4\,2\,2\). However, a number of other MX space groups are possible for the
Laue group (due to the possibility of screw-axes), such as \(P\,4\,2_1\,2\),
\(P\,4_1\,2\,2\) etc. The screw-axes tests are performed by dials.symmetry, and we can disable the
Laue group testing as we are already confident about this:
dials.symmetry scaled.expt scaled.refl laue_group=None
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
laue_group = None
input {
experiments = scaled.expt
reflections = scaled.refl
}
Detected existence of a multi-dataset reflection table
containing 4 datasets.
================================================================================
Analysing systematic absences
Laue group: P 4/m m m
Read 74952 predicted reflections
Selected 54860 scaled reflections
Combined 18 partial reflections with other partial reflections
Read 74323 predicted reflections
Selected 55048 scaled reflections
Combined 23 partial reflections with other partial reflections
Read 76579 predicted reflections
Selected 54658 scaled reflections
Combined 21 partial reflections with other partial reflections
Read 75406 predicted reflections
Selected 54511 scaled reflections
Combined 23 partial reflections with other partial reflections
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Removing 9 Wilson outliers with E^2 >= 16.0
Resolution estimate from <I>/<σ(I)> > 4.0 : 1.84
Resolution estimate from CC½ > 0.60: 1.78
Performing systematic absence checks on scaled data
Read 301260 predicted reflections
Selected 219077 scaled reflections
Removed 1 reflections with d <= 1.78
Combined 18 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Removed 2113 reflections below partiality threshold
Laue group: P 4/m m m
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis | Score | No. present | No. absent | <I> present | <I> absent | <I/sig> present | <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 41c | 1 | 11 | 34 | 659.526 | 0.054 | 32.99 | 0.471 |
| 21a | 1 | 14 | 14 | 862.165 | 0.462 | 22.191 | 1.155 |
| 42c | 1 | 23 | 22 | 315.197 | 0.322 | 16.156 | 0.332 |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group | score |
|---------------+---------|
| P 4 2 2 | 0 |
| P 4 21 2 | 0 |
| P 41 2 2 | 0 |
| P 42 2 2 | 0 |
| P 41 21 2 | 1 |
| P 42 21 2 | 0 |
+---------------+---------+
Recommended space group: P 41 21 2
Space group with equivalent score (enantiomorphic pair): P 43 21 2
Saving reindexed experiments to symmetrized.expt in space group P 41 21 2
Saving 301260 reindexed reflections to symmetrized.refl
By analysing the sets of reflections we expect to be present and absent, the existence of the \(4_1\) and \(2_1\) screw axes is confirmed, hence the space group is assigned as \(P\,4_1\,2_1\,2\). The enantiomorphic space group \(P\,4_3\,2_1\,2\) has the same systematic absences and receives an equivalent score, so the two cannot be distinguished at this stage. Note that we can do this analysis before or after scaling, as only the Laue group needs to be known for scaling. However, it is preferable to do it after scaling, as outliers may have been removed by scaling.
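The "present" and "absent" classes in the screw axis table follow from the standard reflection conditions for axial reflections. The sketch below encodes those conditions for the three axes tested. The dictionary and function names are illustrative only; this is not how dials.symmetry computes its scores, only the rule that decides which axial reflections are expected to be absent:

```python
# Reflection conditions for axial reflections under each screw axis listed in
# the table: (class of axial reflection, index must be a multiple of this).
SCREW_AXIS_CONDITIONS = {
    "41c": ("00l", 4),  # 4_1 along c: 00l present only when l = 4n
    "42c": ("00l", 2),  # 4_2 along c: 00l present only when l = 2n
    "21a": ("h00", 2),  # 2_1 along a: h00 present only when h = 2n
}


def expected_present(axis: str, index: int) -> bool:
    """True if the axial reflection with this index is allowed by the axis."""
    _, multiple = SCREW_AXIS_CONDITIONS[axis]
    return index % multiple == 0


# Compare the 4_1 and 4_2 conditions for the first few 00l reflections.
for index in range(1, 9):
    allowed_41 = expected_present("41c", index)
    allowed_42 = expected_present("42c", index)
    print(f"(0, 0, {index}): 4_1 allows {allowed_41}, 4_2 allows {allowed_42}")
```

Every index that is a multiple of four is also even, so data that obey the \(4_1\) condition also satisfy the \(4_2\) condition. The evidence specific to \(4_1\) is that reflections with l = 4n + 2 are absent as well.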
Finally, we must merge the data and produce an MTZ file for downstream structure solution:
dials.merge symmetrized.expt symmetrized.refl
DIALS 3.dev.617-g669c71566-release
The following parameters have been modified:
input {
experiments = symmetrized.expt
reflections = symmetrized.refl
}
Using median unit cell across experiments : (68.3603, 68.3603, 103.953, 90, 90, 90)
Merging scaled reflection data
Read 301260 predicted reflections
Selected 219077 scaled reflections
Combined 18 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Combined 21 partial reflections with other partial reflections
Combined 23 partial reflections with other partial reflections
Running systematic absences check
Laue group: P 4/m m m
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
| Screw axis | Score | No. present | No. absent | <I> present | <I> absent | <I/sig> present | <I/sig> absent |
|--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------|
| 41c | 1 | 12 | 34 | 603.737 | 0.086 | 30.59 | 0.482 |
| 21a | 1 | 14 | 14 | 862.165 | 0.462 | 22.191 | 1.155 |
| 42c | 1 | 24 | 22 | 301.695 | 0.322 | 15.673 | 0.332 |
+--------------+---------+---------------+--------------+---------------+--------------+-------------------+------------------+
+---------------+---------+
| Space group | score |
|---------------+---------|
| P 4 2 2 | 0 |
| P 4 21 2 | 0 |
| P 41 2 2 | 0 |
| P 42 2 2 | 0 |
| P 41 21 2 | 1 |
| P 42 21 2 | 0 |
+---------------+---------+
Recommended space group: P 41 21 2
Space group with equivalent score (enantiomorphic pair): P 43 21 2
Performing French-Wilson treatment of scaled intensities
Total number of rejected intensities 2
===================== Absolute scaling and Wilson analysis ====================
----------Maximum likelihood isotropic Wilson scaling----------
ML estimate of overall B value:
12.47 A**2
Estimated -log of scale factor:
-2.89
The overall B value ("Wilson B-factor", derived from the Wilson plot) gives
an isotropic approximation for the falloff of intensity as a function of
resolution. Note that this approximation may be misleading for anisotropic
data (where the crystal is poorly ordered along an axis). The Wilson B is
strongly correlated with refined atomic B-factors but these may differ by
a significant amount, especially if anisotropy is present.
----------Maximum likelihood anisotropic Wilson scaling----------
ML estimate of overall B_cart value:
11.71, 0.00, 0.00
11.71, 0.00
13.77
Equivalent representation as U_cif:
0.15, -0.00, -0.00
0.15, 0.00
0.17
Eigen analyses of B-cart:
-------------------------------------------------
| Eigenvector | Value | Vector |
-------------------------------------------------
| 1 | 13.767 | ( 0.00, 0.00, 1.00) |
| 2 | 11.707 | (-0.71, 0.71, -0.00) |
| 3 | 11.707 | ( 0.71, 0.71, -0.00) |
-------------------------------------------------
ML estimate of -log of scale factor:
-2.89
----------Anisotropy analyses----------
For the resolution shell spanning between 1.93 - 1.78 Angstrom,
the mean I/sigI is equal to 4.79. 58.1 % of these intensities have
an I/sigI > 3. When sorting these intensities by their anisotropic
correction factor and analysing the I/sigI behavior for this ordered
list, we can gauge the presence of 'anisotropy induced noise amplification'
in reciprocal space.
The quarter of Intensities *least* affected by the anisotropy correction show
<I/sigI> : 5.15e+00
Fraction of I/sigI > 3 : 6.23e-01 ( Z = 1.87 )
The quarter of Intensities *most* affected by the anisotropy correction show
<I/sigI> : 3.70e+00
Fraction of I/sigI > 3 : 4.58e-01 ( Z = 5.60 )
Z-scores are computed on the basis of a Bernoulli model assuming independence
of weak reflections with respect to anisotropy.
----------Wilson plot----------
The Wilson plot shows the falloff in intensity as a function in resolution;
this is used to calculate the overall B-factor ("Wilson B-factor") for the
data shown above. The expected plot is calculated based on analysis of
macromolecule structures in the PDB, and the distinctive appearance is due to
the non-random arrangement of atoms in the crystal. Some variation is
natural, but major deviations from the expected plot may indicate pathological
data (including ice rings, detector problems, or processing errors).
----------Mean intensity analyses----------
Inspired by: Morris et al. (2004). J. Synch. Rad.11, 56-59.
The following resolution shells are worrisome:
-----------------------------------------------------------------
| Mean intensity by shell (outliers) |
|---------------------------------------------------------------|
| d_spacing | z_score | completeness | <Iobs>/<Iexp> |
|---------------------------------------------------------------|
| 2.033 | 4.79 | 0.76 | 0.822 |
-----------------------------------------------------------------
Possible reasons for the presence of the reported unexpected low or elevated
mean intensity in a given resolution bin are :
- missing overloaded or weak reflections
- suboptimal data processing
- satellite (ice) crystals
- NCS
- translational pseudo symmetry (detected elsewhere)
- outliers (detected elsewhere)
- ice rings (detected elsewhere)
- other problems
Note that the presence of abnormalities in a certain region of reciprocal
space might confuse the data validation algorithm throughout a large region
of reciprocal space, even though the data are acceptable in those areas.
----------Possible outliers----------
Inspired by: Read, Acta Cryst. (1999). D55, 1759-1764
Acentric reflections:
None
Centric reflections:
-----------------------------------------------------------------------------------------------------
| Centric reflections |
|---------------------------------------------------------------------------------------------------|
| d_spacing | H K L | |E| | p(wilson) | p(extreme) |
|---------------------------------------------------------------------------------------------------|
| 2.628 | 26, 0, 1 | 4.19 | 2.79e-05 | 7.50e-02 |
-----------------------------------------------------------------------------------------------------
p(wilson) : 1-(erf[|E|/sqrt(2)])
p(extreme) : 1-(erf[|E|/sqrt(2)])^(n_acentrics)
p(wilson) is the probability that an E-value of the specified
value would be observed when it would selected at random from
the given data set.
p(extreme) is the probability that the largest |E| value is
larger or equal than the observed largest |E| value.
Both measures can be used for outlier detection. p(extreme)
takes into account the size of the dataset.
----------Ice ring related problems----------
The following statistics were obtained from ice-ring insensitive resolution
ranges:
mean bin z_score : 1.40
( rms deviation : 1.14 )
mean bin completeness : 0.86
( rms deviation : 0.29 )
The following table shows the Wilson plot Z-scores and completeness for
observed data in ice-ring sensitive areas. The expected relative intensity
is the theoretical intensity of crystalline ice at the given resolution.
Large z-scores and high completeness in these resolution ranges might
be a reason to re-assess your data processsing if ice rings were present.
-------------------------------------------------------------
| d_spacing | Expected rel. I | Data Z-score | Completeness |
-------------------------------------------------------------
| 3.897 | 1.000 | 2.76 | 1.00 |
| 3.669 | 0.750 | 1.08 | 1.00 |
| 3.441 | 0.530 | 3.41 | 0.99 |
| 2.671 | 0.170 | 1.23 | 0.99 |
| 2.249 | 0.390 | 1.57 | 0.98 |
| 2.072 | 0.300 | 0.88 | 0.81 |
| 1.948 | 0.040 | 1.64 | 0.51 |
| 1.918 | 0.180 | 0.07 | 0.37 |
| 1.883 | 0.030 | 0.40 | 0.30 |
-------------------------------------------------------------
Abnormalities in mean intensity or completeness at resolution ranges with a
relative ice ring intensity lower than 0.10 will be ignored.
No ice ring related problems detected.
If ice rings were present, the data does not look worse at ice ring related
d_spacings as compared to the rest of the data set.
----------Merging statistics by resolution bin----------
d_max d_min #obs #uniq mult. %comp <I> <I/sI> r_mrg r_meas r_pim r_anom cc1/2 cc_ano
68.41 4.83 21411 1380 15.52 100.00 291.7 55.0 0.048 0.050 0.012 0.026 0.999* -0.118
4.83 3.84 20902 1271 16.45 100.00 427.0 58.5 0.046 0.048 0.012 0.025 0.999* -0.044
3.84 3.35 20671 1256 16.46 100.00 324.9 54.8 0.049 0.051 0.012 0.028 0.999* -0.026
3.35 3.05 20816 1226 16.98 100.00 223.6 50.6 0.056 0.057 0.014 0.030 0.999* -0.029
3.05 2.83 20681 1221 16.94 100.00 154.0 44.6 0.061 0.063 0.015 0.033 0.999* -0.123
2.83 2.66 21279 1233 17.26 100.00 127.8 42.5 0.067 0.069 0.016 0.038 0.998* -0.092
2.66 2.53 20385 1197 17.03 100.00 101.1 36.2 0.075 0.077 0.019 0.042 0.999* 0.013
2.53 2.42 16659 1212 13.75 100.00 90.2 30.9 0.081 0.084 0.023 0.049 0.997* -0.001
2.42 2.32 12509 1213 10.31 100.00 81.4 26.6 0.082 0.087 0.027 0.061 0.996* 0.049
2.32 2.24 10399 1207 8.62 100.00 76.7 23.0 0.084 0.089 0.030 0.068 0.996* -0.033
2.24 2.17 8636 1191 7.25 99.92 68.4 19.5 0.086 0.093 0.034 0.069 0.993* -0.186
2.17 2.11 6956 1189 5.85 99.41 58.1 15.6 0.097 0.106 0.042 0.093 0.991* -0.110
2.11 2.06 5477 1147 4.78 96.14 52.8 13.1 0.097 0.108 0.047 0.107 0.992* -0.019
2.06 2.01 4340 1110 3.91 92.73 45.5 10.8 0.109 0.124 0.059 0.127 0.986* -0.009
2.01 1.96 3144 1029 3.06 86.47 39.0 8.5 0.120 0.143 0.076 0.168 0.977* -0.048
1.96 1.92 2115 929 2.28 77.48 35.2 6.9 0.121 0.149 0.086 0.168 0.967* -0.321
1.92 1.88 1171 716 1.64 60.22 29.8 5.1 0.137 0.181 0.117 0.249 0.950* 0.947
1.88 1.84 816 564 1.45 48.87 26.2 4.3 0.156 0.213 0.143 0.363 0.942* 0.000
1.84 1.81 489 383 1.28 31.65 23.5 3.7 0.183 0.252 0.171 0.291 0.910* 0.000
1.81 1.78 136 125 1.09 10.53 16.4 2.5 0.136 0.188 0.130 9.577 0.964* 0.000
68.36 1.78 218992 20799 10.53 85.66 133.5 29.8 0.058 0.060 0.016 0.041 0.999* -0.019
-------------Summary of merging statistics--------------
Overall Low High
High resolution limit 1.78 4.83 1.78
Low resolution limit 68.36 68.41 1.81
Completeness 85.7 100.0 10.5
Multiplicity 10.5 15.5 1.1
I/sigma 29.8 55.0 2.5
Rmerge(I) 0.058 0.048 0.136
Rmerge(I+/-) 0.056 0.047 0.112
Rmeas(I) 0.060 0.050 0.188
Rmeas(I+/-) 0.060 0.050 0.155
Rpim(I) 0.016 0.012 0.130
Rpim(I+/-) 0.021 0.016 0.106
CC half 0.999 0.999 0.964
Anomalous completeness 69.3 100.0 0.1
Anomalous multiplicity 6.2 9.4 1.1
Anomalous correlation -0.019 -0.118 0.000
Anomalous slope 1.020
dF/F 0.037
dI/s(dI) 0.851
Total observations 218992 21411 136
Total unique 20799 1380 125
Size of anomalous differences
+---------+---------+----------------+
| d_max | d_min | <|ΔF|/σ(ΔF)> |
|---------+---------+----------------|
| 68.41 | 4.83 | 0.963 |
| 4.83 | 3.84 | 0.899 |
| 3.84 | 3.35 | 0.969 |
| 3.35 | 3.05 | 0.944 |
| 3.05 | 2.83 | 0.939 |
| 2.83 | 2.66 | 0.94 |
| 2.66 | 2.53 | 0.906 |
| 2.53 | 2.42 | 0.874 |
| 2.42 | 2.32 | 0.908 |
| 2.32 | 2.24 | 0.871 |
| 2.24 | 2.17 | 0.731 |
| 2.17 | 2.11 | 0.741 |
| 2.11 | 2.06 | 0.707 |
| 2.06 | 2.01 | 0.726 |
| 2.01 | 1.96 | 0.725 |
| 1.96 | 1.92 | 0.696 |
| 1.92 | 1.88 | 0.669 |
| 1.88 | 1.84 | 0.77 |
| 1.84 | 1.81 | 0.665 |
| 1.81 | 1.78 | 1.201 |
+---------+---------+----------------+
Writing reflections to merged.mtz
Title: From dials.merge
Space group symbol from file: P41212
Space group number from file: 92
Space group from matrices: P 41 21 2 (No. 92)
Point group symbol from file: 422
Number of crystals: 1
Number of Miller indices: 20799
Resolution range: 68.3603 1.78117
History:
From DIALS 3.dev.617-g669c71566-release, run on 2022-01-08 at 08:23:38 GMT
Crystal 1:
Name: XTAL
Project: AUTOMATIC
Id: 1
Unit cell: (68.3603, 68.3603, 103.953, 90, 90, 90)
Number of datasets: 1
Dataset 1:
Name: NATIVE
Id: 1
Wavelength: 0.979493
Number of columns: 19
label #valid %valid min max type
H 20799 100.00% 0.00 37.00 H: index h,k,l
K 20799 100.00% 0.00 24.00 H: index h,k,l
L 20799 100.00% 0.00 56.00 H: index h,k,l
IMEAN 20799 100.00% -26.58 5622.43 J: intensity
SIGIMEAN 20799 100.00% 0.12 203.60 Q: standard deviation
I(+) 20519 98.65% -26.58 5622.43 K: I(+) or I(-)
SIGI(+) 20519 98.65% 0.12 203.60 M: standard deviation
I(-) 14678 70.57% -30.50 3681.68 K: I(+) or I(-)
SIGI(-) 14678 70.57% 0.35 78.46 M: standard deviation
N(+) 20519 98.65% 1.00 21.00 I: integer
N(-) 14678 70.57% 1.00 20.00 I: integer
F 20798 100.00% 0.26 74.61 F: amplitude
SIGF 20798 100.00% 0.04 1.41 Q: standard deviation
F(+) 20518 98.65% 0.26 74.61 G: F(+) or F(-)
SIGF(+) 20518 98.65% 0.05 1.41 L: standard deviation
F(-) 14677 70.57% 0.36 60.62 G: F(+) or F(-)
SIGF(-) 14677 70.57% 0.06 1.34 L: standard deviation
DANO 14397 69.22% -3.87 3.45 D: anomalous difference
SIGDANO 14397 69.22% 0.08 1.61 Q: standard deviation
Writing html report to dials.merge.html
This merges the data and applies a truncation procedure (the French-Wilson treatment reported in the log) to give a merged MTZ file, merged.mtz, containing intensities and strictly positive structure factor amplitudes (Fs).
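To see why a truncation step is needed, note that some merged intensities are negative (the IMEAN column in the log has a negative minimum), so taking a plain square root cannot give an amplitude for every reflection. The Python sketch below illustrates the idea behind a French-Wilson style treatment for a single acentric reflection: a Wilson prior on the true intensity is combined with a Gaussian measurement error, and the posterior mean of the amplitude is taken. This is a simplified illustration under stated assumptions, not the routine that dials.merge runs. The function name posterior_mean_amplitude, the crude numerical integration and the example numbers are all hypothetical.

```python
import math


def posterior_mean_amplitude(i_obs, sigma, mean_intensity, n_steps=20000):
    """Posterior mean of F = sqrt(J) for one acentric reflection.

    Assumed prior on the true intensity J >= 0: exp(-J / mean_intensity).
    Assumed likelihood: Gaussian in (i_obs - J) with standard deviation sigma.
    """
    # Integrate over J from 0 to a point well beyond the measurement.
    upper = max(i_obs, 0.0) + 10.0 * sigma
    step = upper / n_steps
    grid = [(k + 0.5) * step for k in range(n_steps)]  # midpoint rule
    log_w = [
        -j / mean_intensity - (i_obs - j) ** 2 / (2.0 * sigma ** 2)
        for j in grid
    ]
    shift = max(log_w)  # subtract the maximum to avoid numerical underflow
    weights = [math.exp(lw - shift) for lw in log_w]
    numerator = sum(math.sqrt(j) * w for j, w in zip(grid, weights))
    return numerator / sum(weights)


if __name__ == "__main__":
    # Made-up measurements sharing sigma = 2 and a shell mean intensity of 20.
    for i_obs in (-5.0, 0.0, 5.0, 1000.0):
        amplitude = posterior_mean_amplitude(i_obs, 2.0, 20.0)
        print(f"I = {i_obs:8.1f}  ->  F = {amplitude:.3f}")
```

Because the prior gives no weight to negative true intensities, the estimated amplitude is positive even for a negative measurement, while for a strong reflection it stays close to the square root of the measured intensity.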