source: sasmodels/doc/developer/calculator.rst @ 7150766

Last change on this file since 7150766 was e822c97, checked in by Paul Kienzle <pkienzle@…>, 8 years ago

internal docs: clean up description of the new polydispersity calculator

  • Property mode set to 100644
File size: 13.5 KB
RevLine 
[e822c97]1.. currentmodule:: sasmodels
2
[d5ac45f]3Calculator Interface
4====================
5
[e822c97]6This document describes the layer between the form factor kernels and the
7model calculator which implements the polydispersity and magnetic SLD
8calculations.  There are three separate implementations of this layer,
9*kernelcl.py* for OpenCL, which operates on a single Q value at a time,
10*kerneldll.c* for the DLL, which loops over a vector of Q values, and
11*kernelpy.py* for python models which operates on vector Q values.
12
13Each implementation provides three different calls *Iq*, *Iqxy* and *Imagnetic*
14for 1-D, 2-D and 2-D magnetic kernels respectively.
15
16The C code is defined in *kernel_iq.c* and *kernel_iq.cl* for DLL and OpenCL
17respectively.  The kernel call looks as follows::
18
19  kernel void KERNEL_NAME(
20      int nq,                  // Number of q values in the q vector
21      int pd_start,            // Starting position in the polydispersity loop
22      int pd_stop,             // Ending position in the polydispersity loop
23      ProblemDetails *details, // Polydispersity info
24      double *values,          // Value and weights vector
25      double *q,               // q or (qx,qy) vector
26      double *result,          // returned I(q), with result[nq] = pd_weight
27      double cutoff)           // polydispersity weight cutoff
28
29The details for OpenCL and the python loop are slightly different, but these
30data structures are common.
31
32*nq* indicates the number of q values that will be calculated.
33
34The *pd_start* and *pd_stop* parameters set the range of the polydispersity
35loop to compute for the current kernel call.   Give a polydispersity
36calculation with 30 weights for length and 30 weights for radius for example,
37there are a total of 900 calls to the form factor required to compute the
38kernel.  These might be done in chunks of 100, so the first call would start
39at zero and stop after 100 iterations.  The second call would then have to set
40the length index to 3 and the radius index to 10 for a position of 3*30+10=100,
41and could then proceed to position 200.  This allows us to interrupt the
42calculation in the middle of a long polydispersity loop without having to
43do special tricks with the C code.  More importantly, it stops the OpenCL
44kernel in a reasonable time; because the GPU is used by the operating
45system to show its windows, if a GPU kernel runs too long then it will be
46automatically killed and no results will be returned to the caller.
47
48The *ProblemDetails* structure is a direct map of the
49:class:`details.CallDetails` buffer.  This indicates which parameters are
50polydisperse, and where in the values vector the values and weights can be
51found.  For each polydisperse parameter there is a parameter id, the length
52of the polydispersity loop for that parameter, the offset of the parameter
53values in the pd value and pd weight vectors and the 'stride' from one index
54to the next, which is used to translate between the position in the
55polydispersity loop and the particular parameter indices.  The *num_eval*
56field is the total size of the polydispersity loop.  *num_weights* is the
57number of elements in the pd value and pd weight vectors.  *num_active* is
58the number of non-trivial pd loops (polydisperse parameters should be ordered
59by decreasing pd vector length, with a length of 1 meaning no polydispersity).
60Oriented objects in 2-D need a cos(theta) spherical correction on the angular
61variation in order to preserve the 'surface area' of the weight distribution.
62*theta_par* is the id of the polar coordinate parameter if there is one.
63
64
65The *values* vector consists of the fixed values for the model plus pd value
66and pd weight vectors.  There are *NUM_VALUES* fixed values for the model,
67which includes the initial two slots for scale and background, the NUM_PARS
68values for the kernel parameters, three slots for the applied magnetism, and
69three slots for each of the SLD parameters for the sample magnetiziation
70*(Mx, My, Mz)*.  Sample magnetization is translated from *(M, theta, phi)*
71to *(Mx, My, Mz)* before the kernel is called.   After the fixed values comes
72the pd value vector, with the polydispersity values for each parameter
73stacked one after the other.  The order isn't important since the location
74for each parameter is stored in the *pd_offset* field of the *ProblemDetails*
75structure, but the values do need to be contiguous.  After *num_weights*
76values, the pd weight vector is stored, with the same configuration as the
77pd value vector.  Note that the pd vectors can contain values that are not
78in the polydispersity loop; this is used by :class:`mixture.MixtureKernel`
79to make it easier to call the various component kernels.
80
81The *q* vector contains one value for each q for *Iq* kernels, or a pair
82of *(qx,qy)* values for each q for *Iqxy* and *Imagnetic* kernels.  OpenCL
83pads all vectors to 32 value boundaries just to be safe, including the
84*values* vector and the the *results* vector.
85
86The *results* vector contains one slot for each of the *nq* values, plus
87one extra slot at the end for the current polydisperse normalization.  This
88is required when the polydispersity loop is broken across several kernel
89calls.
90
91*cutoff* is a importance cutoff so that points which contribute negligibly
92to the total scattering can be skipped without calculating them.
93
94:func:`generate.make_source` defines the following C macros:
[d5ac45f]95
96- USE_OPENCL is defined if running in opencl
[6fd8de0]97- MAX_PD is the maximum depth of the polydispersity loop [model specific]
[e822c97]98- NUM_PARS is the number of parameter values in the kernel.  This may be
99  more than the number of parameters if some of the parameters are vector
100  values.
101- NUM_VALUES is the number of fixed values, which defines the offset in the
102  value list to the polydisperse value and weight vectors.
103- NUM_MAGNETIC is the number of magnetic SLDs
104- MAGNETIC_PARS is a comma separated list of the magnetic SLDs, indicating
105  their locations in the values vector.
106- MAGNETIC_PAR0 to MAGNETIC_PAR2 are the first three magnetic parameter ids
107  so we can hard code the setting of magnetic values if there are only a
108  few of them.
109- KERNEL_NAME is the name of the function being declared
110- PARAMETER_TABLE is the declaration of the parameters to the kernel:
[d5ac45f]111
[e822c97]112    Cylinder::
[d5ac45f]113
114        #define PARAMETER_TABLE \
115        double length; \
116        double radius; \
117        double sld; \
[e822c97]118        double sld_solvent;
[d5ac45f]119
120    Note: scale and background are never included
121
[e822c97]122    Multi-shell cylinder (10 shell max)::
[d5ac45f]123
124        #define PARAMETER_TABLE \
125        double num_shells; \
126        double length; \
127        double radius[10]; \
128        double sld[10]; \
[e822c97]129        double sld_solvent;
[d5ac45f]130
[e822c97]131- CALL_IQ(q, i, var) is the declaration of a call to the kernel:
[d5ac45f]132
[e822c97]133    Cylinder::
[d5ac45f]134
[03cac08]135        #define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]136        var.length, \
137        var.radius, \
138        var.sld, \
139        var.sld_solvent)
140
[e822c97]141    Multi-shell cylinder::
[d5ac45f]142
[03cac08]143        #define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]144        var.num_shells, \
145        var.length, \
146        var.radius, \
147        var.sld, \
148        var.sld_solvent)
149
[e822c97]150    Cylinder2D::
[03cac08]151
152        #define CALL_IQ(q, i, var) Iqxy(q[2*i], q[2*i+1], \
153        var.length, \
154        var.radius, \
155        var.sld, \
156        var.sld_solvent, \
157        var.theta, \
158        var.phi)
159
[d5ac45f]160- CALL_VOLUME(var) is similar, but for calling the form volume::
161
162        #define CALL_VOLUME(var) \
163        form_volume(var.length, var.radius)
164
[e822c97]165There is an additional macro that can be defined within the model.c file:
[d5ac45f]166
[e822c97]167- INVALID(var) is a test for model parameters in the correct range:
168
169    Cylinder::
[d5ac45f]170
171        #define INVALID(var) 0
172
[e822c97]173    BarBell::
[d5ac45f]174
[6fd8de0]175        #define INVALID(var) (var.bell_radius < var.radius)
[d5ac45f]176
[e822c97]177    Model with complicated constraints::
[d5ac45f]178
179        inline bool constrained(p1, p2, p3) { return expression; }
180        #define INVALID(var) constrained(var.p1, var.p2, var.p3)
181
182Our design supports a limited number of polydispersity loops, wherein
183we need to cycle through the values of the polydispersity, calculate
184the I(q, p) for each combination of parameters, and perform a normalized
185weighted sum across all the weights.  Parameters may be passed to the
186underlying calculation engine as scalars or vectors, but the polydispersity
187calculator treats the parameter set as one long vector.
188
[e822c97]189Let's assume we have 8 parameters in the model, with two polydisperse.  Since
190this is a 1-D model the orientation parameters won't be used::
[d5ac45f]191
192    0: scale        {scl = constant}
193    1: background   {bkg = constant}
[e822c97]194    2: radius       {r = vector of 10pts}
195    3: length       {l = vector of 30pts}
[6fd8de0]196    4: sld          {s = constant/(radius**2*length)}
197    5: sld_solvent  {s2 = constant}
[e822c97]198    6: theta        {not used}
199    7: phi          {not used}
200
201This generates the following call to the kernel.  Note that parameters 4 and
2025 are treated as polydisperse even though they are not --- this is because
203it is harmless to do so and simplifies the looping code::
204
205    MAX_PD = 4
206    NUM_PARS = 6          // kernel parameters only
207    NUM_VALUES = 17       // all values, including scale and background
208    NUM_MAGNETIC = 2      // two parameters might be magnetic
209    MAGNETIC_PARS = 4, 5  // they are sld and sld_solvent
210    MAGNETIC_PAR0 = 4     // sld index
211    MAGNETIC_PAR1 = 5     // solvent index
212
213    details {
214        pd_par = {3, 2, 4, 5}         // parameters *radius* and *length* vary
215        pd_length = {30, 10, 1, 1}    // *length* has more, so it is first
216        pd_offset = {10, 0, 31, 32}   // *length* starts at index 10 in weights
[d5ac45f]217        pd_stride = {1, 30, 300, 300} // cumulative product of pd length
[e822c97]218        num_eval = 300   // 300 values in the polydispersity loop
219        num_weights = 42 // 42 values in the pd vector
220        num_active = 2   // only the first two pd are active
221        theta_var =  6   // spherical correction
[d5ac45f]222    }
223
[e822c97]224    values = { scl, bkg,                                  // universal
225               r, l, s, s2, theta, phi,                   // kernel pars
226               in spin, out spin, spin angle,             // applied magnetism
227               mx s, my s, mz s, mx s2, my s2, mz s2,     // magnetic slds
228               r0, .., r9, l0, .., l29, s, s2,            // pd values
229               r0, .., r9, l0, .., l29, s, s2}            // pd weights
[d5ac45f]230
231    nq = 130
232    q = { q0, q1, ..., q130, x, x }  # pad to 8 element boundary
[e822c97]233    result = {r1, ..., r130, pd_norm, x }
[d5ac45f]234
235The polydisperse parameters are stored in as an array of parameter
236indices, one for each polydisperse parameter, stored in pd_par[n].
237Non-polydisperse parameters do not appear in this array. Each polydisperse
[6fd8de0]238parameter has a weight vector whose length is stored in pd_length[n].
[d5ac45f]239The weights are stored in a contiguous vector of weights for all
240parameters, with the starting position for the each parameter stored
241in pd_offset[n].  The values corresponding to the weights are stored
242together in a separate weights[] vector, with offset stored in
243par_offset[pd_par[n]]. Polydisperse parameters should be stored in
244decreasing order of length for highest efficiency.
245
[48fbd50]246We limit the number of polydisperse dimensions to MAX_PD (currently 4),
247though some models may have fewer if they have fewer polydisperse
[e822c97]248parameters.  The main reason for the limit is to reduce code size.
249Each additional polydisperse parameter requires a separate polydispersity
250loop.  If more than 4 levels of polydispersity are needed, then kernel_iq.c
251and kernel_iq.cl will need to be extended.
252
253Constraints between parameters are not supported.  Instead users will
254have to define a new model with the constraints built in by making a
255copy of the existing model.  Mac provides OpenCL and we are supplying
256the tinycc compiler for Windows so this won't be a complete limitation,
257but it is rather inconvenient.  The process could perhaps be automated
258so that there is no code copying required, just an alternate CALL_IQ
259macro that implements the constraints.  Think carefully about constraining
260theta since the polar coordinates normalization is tied to this parameter.
[d5ac45f]261
262If there is no polydispersity we pretend that it is polydisperisty with one
263parameter, pd_start=0 and pd_stop=1.  We may or may not short circuit the
264calculation in this case, depending on how much time it saves.
265
[e822c97]266The problem details structure could be allocated and sent in as an integer
267array using the read-only flag.  This would allow us to copy it once per fit
[d5ac45f]268along with the weights vector, since features such as the number of
269polydisperity elements per pd parameter or the coordinated won't change
[e822c97]270between function evaluations.  A new parameter vector must be sent for
271each I(q) evaluation.  This is not currently implemented, and would require
272some resturcturing of the :class:`sasview_model.SasviewModel` interface.
[d5ac45f]273
274The results array will be initialized to zero for polydispersity loop
275entry zero, and preserved between calls to [start, stop] so that the
276results accumulate by the time the loop has completed.  Background and
277scale will be applied when the loop reaches the end.  This does require
278that the results array be allocated read-write, which is less efficient
279for the GPU, but it makes the calling sequence much more manageable.
280
[03cac08]281For accuracy we may want to introduce Kahan summation into the integration::
282
283    double accumulated_error = 0.0;
284    ...
285    #if USE_KAHAN_SUMMATION
286        const double y = next - accumulated_error;
287        const double t = ret + y;
288        accumulated_error = (t - ret) - y;
289        ret = t;
290    #else
291        ret += next;
292    #endif
[e822c97]293
294This will require accumulated error for each I(q) value to be preserved
295between kernel calls to implement this fully.  The kernel_iq.c code, which
296loops over q for each parameter set in the polydispersity loop, will need
297also need the accumalation vector.
Note: See TracBrowser for help on using the repository browser.