source: sasmodels/doc/developer/calculator.rst @ b966a96

core_shell_microgelscostrafo411magnetic_modelrelease_v0.94release_v0.95ticket-1257-vesicle-productticket_1156ticket_1265_superballticket_822_more_unit_tests
Last change on this file since b966a96 was 6fd8de0, checked in by wojciech, 9 years ago

Updated polydispersity loop documentation

  • Property mode set to 100644
File size: 9.4 KB
RevLine 
[d5ac45f]1Calculator Interface
2====================
3
4The environment needs to provide the following #defines:
5
6- USE_OPENCL is defined if running in opencl
7- KERNEL declares a function to be available externally
8- KERNEL_NAME is the name of the function being declared
[6fd8de0]9- MAX_PD is the maximum depth of the polydispersity loop [model specific]
[d5ac45f]10- NPARS is the number of parameters in the kernel
11- PARAMETER_TABLE is the declaration of the parameters to the kernel::
12
13    Cylinder:
14
15        #define PARAMETER_TABLE \
16        double length; \
17        double radius; \
18        double sld; \
19        double sld_solvent
20
21    Note: scale and background are never included
22
23    Multi-shell cylinder (10 shell max):
24
25        #define PARAMETER_TABLE \
26        double num_shells; \
27        double length; \
28        double radius[10]; \
29        double sld[10]; \
30        double sld_solvent
31
[6fd8de0]32- CALL_IQ(q, i, var) is the declaration of a call to the kernel::
[d5ac45f]33
34    Cylinder:
35
[03cac08]36        #define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]37        var.length, \
38        var.radius, \
39        var.sld, \
40        var.sld_solvent)
41
42    Multi-shell cylinder:
43
[03cac08]44        #define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]45        var.num_shells, \
46        var.length, \
47        var.radius, \
48        var.sld, \
49        var.sld_solvent)
50
[03cac08]51    Cylinder2D:
52
53        #define CALL_IQ(q, i, var) Iqxy(q[2*i], q[2*i+1], \
54        var.length, \
55        var.radius, \
56        var.sld, \
57        var.sld_solvent, \
58        var.theta, \
59        var.phi)
60
[d5ac45f]61- CALL_VOLUME(var) is similar, but for calling the form volume::
62
63        #define CALL_VOLUME(var) \
64        form_volume(var.length, var.radius)
65
66- INVALID(var) is a test for model parameters in the correct range::
67
68    Cylinder:
69
70        #define INVALID(var) 0
71
72    BarBell:
73
[6fd8de0]74        #define INVALID(var) (var.bell_radius < var.radius)
[d5ac45f]75
76    Model with complicated constraints:
77
78        inline bool constrained(p1, p2, p3) { return expression; }
79        #define INVALID(var) constrained(var.p1, var.p2, var.p3)
80
81Our design supports a limited number of polydispersity loops, wherein
82we need to cycle through the values of the polydispersity, calculate
83the I(q, p) for each combination of parameters, and perform a normalized
84weighted sum across all the weights.  Parameters may be passed to the
85underlying calculation engine as scalars or vectors, but the polydispersity
86calculator treats the parameter set as one long vector.
87
88Let's assume we have 6 parameters in the model, with two polydisperse::
89
90    0: scale        {scl = constant}
91    1: background   {bkg = constant}
[6fd8de0]92    2: length       {l = vector of 30pts}
93    3: radius       {r = vector of 10pts}
94    4: sld          {s = constant/(radius**2*length)}
95    5: sld_solvent  {s2 = constant}
[d5ac45f]96
97This generates the following call to the kernel (where x stands for an
98arbitrary value that is not used by the kernel evaluator)::
99
100    NPARS = 4  // scale and background are in all models
101    problem {
[6fd8de0]102        pd_par = {3, 2, x, x}         // parameters *radius* and *length* vary
[d5ac45f]103        pd_length = {30, 10, 0, 0}    // *length* has more, so it is first
104        pd_offset = {10, 0, x, x}     // *length* starts at index 10 in weights
105        pd_stride = {1, 30, 300, 300} // cumulative product of pd length
[6fd8de0]106        pd_isvol = {True, True, x, x}       // true if weight is a volume weight
[d5ac45f]107        par_offset = {2, 3, 303, 313}  // parameter offsets
108        par_coord = {0, 3, 2, 1} // bitmap of parameter dependencies
[6fd8de0]109        fast_coord_index = {3, 5, x, x} // radius and sld have fast index
[d5ac45f]110        fast_coord_count = 2  // two parameters vary with *length* distribution
111        theta_var = -1   // no spherical correction
112        fast_theta = 0   // spherical correction angle is not pd 1
113    }
114
[6fd8de0]115    weight = { l0, .., l29, r0, .., r9} //length comes first as the longest vec
[d5ac45f]116    pars = { scl, bkg, l0, ..., l29, r0, r1, ..., r9,
117             s[l0,r0], ... s[l0,r9], s[l1,r0], ... s[l29,r9] , s2}
[6fd8de0]118             //where s[x,y] stands for material sld, s2 = solvent sld
[d5ac45f]119
120    nq = 130
121    q = { q0, q1, ..., q130, x, x }  # pad to 8 element boundary
122    result = {r1, ..., r130, norm, vol, vol_norm, x, x, x, x, x, x, x}
123
124
125The polydisperse parameters are stored in as an array of parameter
126indices, one for each polydisperse parameter, stored in pd_par[n].
127Non-polydisperse parameters do not appear in this array. Each polydisperse
[6fd8de0]128parameter has a weight vector whose length is stored in pd_length[n].
[d5ac45f]129The weights are stored in a contiguous vector of weights for all
130parameters, with the starting position for the each parameter stored
131in pd_offset[n].  The values corresponding to the weights are stored
132together in a separate weights[] vector, with offset stored in
133par_offset[pd_par[n]]. Polydisperse parameters should be stored in
134decreasing order of length for highest efficiency.
135
[48fbd50]136We limit the number of polydisperse dimensions to MAX_PD (currently 4),
137though some models may have fewer if they have fewer polydisperse
138parameters. This cuts the size of the structure in half compared to
139allowing a separate polydispersity for each parameter.  This will
140help a little bit for models with large numbers of parameters, such
141as the onion model.
[d5ac45f]142
143Parameters may be coordinated.  That is, we may have the value of one
144parameter depend on a set of other parameters, some of which may be
145polydisperse.  For example, if sld is inversely proportional to the
146volume of a cylinder, and the length and radius are independently
147polydisperse, then for each combination of length and radius we need a
148separate value for the sld.  The caller must provide a coordination table
149for each parameter containing the value for each parameter given the
150value of the polydisperse parameters v1, v2, etc.  The tables for each
151parameter are arranged contiguously in a vector, with offset[k] giving the
152starting location of parameter k in the vector.  Each parameter defines
[6fd8de0]153par_coord[k] as a bit mask indicating which polydispersity parameters the
[d5ac45f]154parameter depends upon. Usually this is zero, indicating that the parameter
155is independent, but for the cylinder example given, the bits for the
156radius and length polydispersity parameters would both be set, the result
157being a (#radius x #length) table, or maybe a (#length x #radius) table
158if length comes first in the polydispersity table.
159
160NB: If we can guarantee that a compiler and OpenCL driver are available,
161we could instead create the coordination function on the fly for each
162parameter, saving memory and transfer time, but requiring a C compiler
163as part of the environment.
164
165In ordering the polydisperse parameters by decreasing length we can
166iterate over the longest dispersion weight vector first.  All parameters
167coordinated with this weight vector (the 'fast' parameters), can be
168updated with a simple increment to the next position in the parameter
169value table.  The indices of these parameters is stored in fast_coord_index[],
170with fast_coord_count being the number of fast parameters.  A total
171of NPARS slots is allocated to allow for the case that all parameters
172are coordinated with the fast index, though this will likely be mostly
173empty.  When the fast increment count reaches the end of the weight
174vector, then the index of the second polydisperse parameter must be
175incremented, and all of its coordinated parameters updated.  Because this
176operation is not in the inner loop, a slower algorithm can be used.
177
178If there is no polydispersity we pretend that it is polydisperisty with one
179parameter, pd_start=0 and pd_stop=1.  We may or may not short circuit the
180calculation in this case, depending on how much time it saves.
181
182The problem details structure can be allocated and sent in as an integer
183array using the read-only flag.  This allows us to copy it once per fit
184along with the weights vector, since features such as the number of
185polydisperity elements per pd parameter or the coordinated won't change
186between function evaluations.  A new parameter vector is sent for
187each I(q) evaluation.
188
189To protect against expensive evaluations taking all the GPU resource
190on large fits, the entire polydispersity will not be computed at once.
191Instead, a start and stop location will be sent, indicating where in the
192polydispersity loop the calculation should start and where it should
193stop.  We can do this for arbitrary start/stop points since we have
194unwound the nested loop.  Instead, we use the same technique as array
195index translation, using div and mod to figure out the i,j,k,...
196indices in the virtual nested loop.
197
198The results array will be initialized to zero for polydispersity loop
199entry zero, and preserved between calls to [start, stop] so that the
200results accumulate by the time the loop has completed.  Background and
201scale will be applied when the loop reaches the end.  This does require
202that the results array be allocated read-write, which is less efficient
203for the GPU, but it makes the calling sequence much more manageable.
204
205Scale and background cannot be coordinated with other polydisperse parameters
206
207Oriented objects in 2-D need a spherical correction on the angular variation
208in order to preserve the 'surface area' of the weight distribution.
209
[6fd8de0]210cutoff parameter limits integration area within polydispersity hypercude,
211which speeds calculations
[03cac08]212
213For accuracy we may want to introduce Kahan summation into the integration::
214
215
216    double accumulated_error = 0.0;
217    ...
218    #if USE_KAHAN_SUMMATION
219        const double y = next - accumulated_error;
220        const double t = ret + y;
221        accumulated_error = (t - ret) - y;
222        ret = t;
223    #else
224        ret += next;
225    #endif
Note: See TracBrowser for help on using the repository browser.