Changeset 9ee2756 in sasmodels for doc/developer/calculator.rst


Timestamp:
Oct 19, 2017 12:31:30 PM (7 years ago)
Author:
Paul Kienzle <pkienzle@…>
Branches:
master, core_shell_microgels, magnetic_model, ticket-1257-vesicle-product, ticket_1156, ticket_1265_superball, ticket_822_more_unit_tests
Children:
2c108a3
Parents:
94f4543
Message:

simplify kernel wrapper code and combine OpenCL with DLL in one file

File:
1 edited

  • doc/developer/calculator.rst

    r870a2f4 r9ee2756  
    77 
    88This document describes the layer between the form factor kernels and the 
    9 model calculator which implements the polydispersity and magnetic SLD 
     9model calculator which implements the dispersity and magnetic SLD 
    1010calculations.  There are three separate implementations of this layer, 
    1111:mod:`kernelcl` for OpenCL, which operates on a single Q value at a time, 
     
    1414 
    1515Each implementation provides three different calls *Iq*, *Iqxy* and *Imagnetic* 
    16 for 1-D, 2-D and 2-D magnetic kernels respectively. 
    17  
    18 The C code is defined in *kernel_iq.c* and *kernel_iq.cl* for DLL and OpenCL 
    19 respectively.  The kernel call looks as follows:: 
     16for 1-D, 2-D and 2-D magnetic kernels respectively. The C code is defined 
     17in *kernel_iq.c*, with the minor differences between OpenCL and DLL handled 
     18by #ifdef statements. 
     19 
     20The kernel call looks as follows:: 
    2021 
    2122  kernel void KERNEL_NAME( 
    2223      int nq,                  // Number of q values in the q vector 
    23       int pd_start,            // Starting position in the polydispersity loop 
    24       int pd_stop,             // Ending position in the polydispersity loop 
    25       ProblemDetails *details, // Polydispersity info 
     24      int pd_start,            // Starting position in the dispersity loop 
     25      int pd_stop,             // Ending position in the dispersity loop 
     26      ProblemDetails *details, // dispersity info 
    2627      double *values,          // Value and weights vector 
    2728      double *q,               // q or (qx,qy) vector 
    2829      double *result,          // returned I(q), with result[nq] = pd_weight 
    29       double cutoff)           // polydispersity weight cutoff 
     30      double cutoff)           // dispersity weight cutoff 
    3031 
    3132The details for OpenCL and the python loop are slightly different, but these 
     
    3435*nq* indicates the number of q values that will be calculated. 
    3536 
    36 The *pd_start* and *pd_stop* parameters set the range of the polydispersity 
    37 loop to compute for the current kernel call.   Give a polydispersity 
     37The *pd_start* and *pd_stop* parameters set the range of the dispersity 
     38loop to compute for the current kernel call.   Given a dispersity 
    3839calculation with 30 weights for length and 30 weights for radius for example, 
    3940there are a total of 900 calls to the form factor required to compute the 
     
    4243the length index to 3 and the radius index to 10 for a position of 3*30+10=100, 
    4344and could then proceed to position 200.  This allows us to interrupt the 
    44 calculation in the middle of a long polydispersity loop without having to 
     45calculation in the middle of a long dispersity loop without having to 
    4546do special tricks with the C code.  More importantly, it stops the OpenCL 
    4647kernel in a reasonable time; because the GPU is used by the operating 
     
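The flat loop position described in this hunk can be illustrated with a minimal Python sketch. It assumes radius is the fastest-varying index and length the slowest, as the 3*30+10=100 example implies. The helper names `loop_position` and `loop_ranges` are hypothetical and do not appear in sasmodels.

```python
# Hypothetical helpers illustrating the pd_start/pd_stop bookkeeping.
N_RADIUS = 30   # weights for radius (assumed fast index)
N_LENGTH = 30   # weights for length (assumed slow index)

def loop_position(length_index, radius_index, n_radius=N_RADIUS):
    """Flatten (length_index, radius_index) into a single loop position."""
    return length_index * n_radius + radius_index

def loop_ranges(num_eval, step):
    """Return the (pd_start, pd_stop) pairs that cover [0, num_eval)."""
    return [(start, min(start + step, num_eval))
            for start in range(0, num_eval, step)]

position = loop_position(3, 10)               # the example from the text
ranges = loop_ranges(N_RADIUS * N_LENGTH, 100)
```

Each pair in `ranges` would correspond to one kernel call, so the 900 form factor evaluations are spread over nine calls of 100 evaluations each.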
    4950 
    5051The *ProblemDetails* structure is a direct map of the 
    51 :class:`details.CallDetails` buffer.  This indicates which parameters are 
    52 polydisperse, and where in the values vector the values and weights can be 
    53 found.  For each polydisperse parameter there is a parameter id, the length 
    54 of the polydispersity loop for that parameter, the offset of the parameter 
     52:class:`details.CallDetails` buffer.  This indicates which parameters have 
     53dispersity, and where in the values vector the values and weights can be 
     54found.  For each parameter with dispersity there is a parameter id, the length 
     55of the dispersity loop for that parameter, the offset of the parameter 
    5556values in the pd value and pd weight vectors and the 'stride' from one index 
    5657to the next, which is used to translate between the position in the 
    57 polydispersity loop and the particular parameter indices.  The *num_eval* 
    58 field is the total size of the polydispersity loop.  *num_weights* is the 
     58dispersity loop and the particular parameter indices.  The *num_eval* 
     59field is the total size of the dispersity loop.  *num_weights* is the 
    5960number of elements in the pd value and pd weight vectors.  *num_active* is 
    60 the number of non-trivial pd loops (polydisperse parameters should be ordered 
    61 by decreasing pd vector length, with a length of 1 meaning no polydispersity). 
     61the number of non-trivial pd loops (parameters with dispersity should be ordered 
     62by decreasing pd vector length, with a length of 1 meaning no dispersity). 
    6263Oriented objects in 2-D need a cos(theta) spherical correction on the angular 
    6364variation in order to preserve the 'surface area' of the weight distribution. 
     
    7273*(Mx, My, Mz)*.  Sample magnetization is translated from *(M, theta, phi)* 
    7374to *(Mx, My, Mz)* before the kernel is called.   After the fixed values comes 
    74 the pd value vector, with the polydispersity values for each parameter 
     75the pd value vector, with the dispersity values for each parameter 
    7576stacked one after the other.  The order isn't important since the location 
    7677for each parameter is stored in the *pd_offset* field of the *ProblemDetails* 
     
    7879values, the pd weight vector is stored, with the same configuration as the 
    7980pd value vector.  Note that the pd vectors can contain values that are not 
    80 in the polydispersity loop; this is used by :class:`mixture.MixtureKernel` 
     81in the dispersity loop; this is used by :class:`mixture.MixtureKernel` 
    8182to make it easier to call the various component kernels. 
    8283 
     
    8788 
    8889The *results* vector contains one slot for each of the *nq* values, plus 
    89 one extra slot at the end for the current polydisperse normalization.  This 
    90 is required when the polydispersity loop is broken across several kernel 
    91 calls. 
     90one extra slot at the end for the weight normalization accumulated across 
     91all points in the dispersity mesh.  This is required when the dispersity 
     92loop is broken across several kernel calls. 
    9293 
    9394*cutoff* is an importance cutoff so that points which contribute negligibly 
     
    9798 
    9899- USE_OPENCL is defined if running in OpenCL 
    99 - MAX_PD is the maximum depth of the polydispersity loop [model specific] 
     100- MAX_PD is the maximum depth of the dispersity loop [model specific] 
    100101- NUM_PARS is the number of parameter values in the kernel.  This may be 
    101102  more than the number of parameters if some of the parameters are vector 
    102103  values. 
    103104- NUM_VALUES is the number of fixed values, which defines the offset in the 
    104   value list to the polydisperse value and weight vectors. 
     105  value list to the dispersity value and weight vectors. 
    105106- NUM_MAGNETIC is the number of magnetic SLDs 
    106107- MAGNETIC_PARS is a comma separated list of the magnetic SLDs, indicating 
    107108  their locations in the values vector. 
    108 - MAGNETIC_PAR0 to MAGNETIC_PAR2 are the first three magnetic parameter ids 
    109   so we can hard code the setting of magnetic values if there are only a 
    110   few of them. 
     109- MAGNETIC_PAR1, ... are the first three magnetic parameter ids so we can 
     110  hard code the setting of magnetic values if there are only a few of them. 
    111111- KERNEL_NAME is the name of the function being declared 
    112112- PARAMETER_TABLE is the declaration of the parameters to the kernel: 
     
    152152    Cylinder2D:: 
    153153 
    154         #define CALL_IQ(q, i, var) Iqxy(q[2*i], q[2*i+1], \ 
     154        #define CALL_IQ(q, i, var) Iqxy(qa, qc, \ 
    155155        var.length, \ 
    156156        var.radius, \ 
    157157        var.sld, \ 
    158         var.sld_solvent, \ 
    159         var.theta, \ 
    160         var.phi) 
     158        var.sld_solvent) 
    161159 
    162160- CALL_VOLUME(var) is similar, but for calling the form volume:: 
     
    182180        #define INVALID(var) constrained(var.p1, var.p2, var.p3) 
    183181 
    184 Our design supports a limited number of polydispersity loops, wherein 
    185 we need to cycle through the values of the polydispersity, calculate 
     182Our design supports a limited number of dispersity loops, wherein 
     183we need to cycle through the values of the dispersity, calculate 
    186184the I(q, p) for each combination of parameters, and perform a normalized 
    187185weighted sum across all the weights.  Parameters may be passed to the 
    188 underlying calculation engine as scalars or vectors, but the polydispersity 
     186underlying calculation engine as scalars or vectors, but the dispersity 
    189187calculator treats the parameter set as one long vector. 
    190188 
    191 Let's assume we have 8 parameters in the model, with two polydisperse.  Since 
    192 this is a 1-D model the orientation parameters won't be used:: 
     189Let's assume we have 8 parameters in the model, two of which allow dispersity. 
     190Since this is a 1-D model the orientation parameters won't be used:: 
    193191 
    194192    0: scale        {scl = constant} 
     
    196194    2: radius       {r = vector of 10pts} 
    197195    3: length       {l = vector of 30pts} 
    198     4: sld          {s = constant/(radius**2*length)} 
     196    4: sld          {s1 = constant/(radius**2*length)} 
    199197    5: sld_solvent  {s2 = constant} 
    200198    6: theta        {not used} 
     
    202200 
    203201This generates the following call to the kernel.  Note that parameters 4 and 
    204 5 are treated as polydisperse even though they are not --- this is because 
     2025 are treated as having dispersity even though they don't --- this is because 
    205203it is harmless to do so and simplifies the looping code:: 
    206204 
     
    218216        pd_offset = {10, 0, 31, 32}   // *length* starts at index 10 in weights 
    219217        pd_stride = {1, 30, 300, 300} // cumulative product of pd length 
    220         num_eval = 300   // 300 values in the polydispersity loop 
     218        num_eval = 300   // 300 values in the dispersity loop 
    221219        num_weights = 42 // 42 values in the pd vector 
    222220        num_active = 2   // only the first two pd are active 
     
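The stride values in this example can be reproduced with a short Python sketch. The `pd_length` line is elided from this hunk, so the value `[30, 10, 1, 1]` is an assumption inferred from the 30-point length vector, the 10-point radius vector and the decreasing-length ordering. The function `decode` is a hypothetical helper, not the kernel's actual code.

```python
from itertools import accumulate
from operator import mul

# Assumed loop lengths: length (30 pts), radius (10 pts), sld, sld_solvent.
pd_length = [30, 10, 1, 1]

# The running product of the lengths gives the stride of the *next* loop,
# so shift it right by one and start from a stride of 1.
running = list(accumulate(pd_length, mul))
pd_stride = [1] + running[:-1]
num_eval = running[-1]          # total number of points in the loop
num_weights = sum(pd_length)    # total entries in the pd value/weight vectors

def decode(position, strides=pd_stride, lengths=pd_length):
    """Translate a flat loop position into one index per pd parameter."""
    return [(position // s) % n for s, n in zip(strides, lengths)]

last = decode(num_eval - 1)
```

With these assumptions the sketch reproduces `pd_stride = {1, 30, 300, 300}`, `num_eval = 300` and `num_weights = 42` from the listing.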
    225223 
    226224    values = { scl, bkg,                                  // universal 
    227                r, l, s, s2, theta, phi,                   // kernel pars 
     225               r, l, s1, s2, theta, phi,                  // kernel pars 
    228226               in spin, out spin, spin angle,             // applied magnetism 
    229                mx s, my s, mz s, mx s2, my s2, mz s2,     // magnetic slds 
     227               mx s1, my s1, mz s1, mx s2, my s2, mz s2,  // magnetic slds 
    230228               r0, .., r9, l0, .., l29, s1, s2,           // pd values 
    231229               r0, .., r9, l0, .., l29, s1, s2}           // pd weights 
     
    235233    result = {r1, ..., r130, pd_norm, x } 
    236234 
    237 The polydisperse parameters are stored in as an array of parameter 
    238 indices, one for each polydisperse parameter, stored in pd_par[n]. 
    239 Non-polydisperse parameters do not appear in this array. Each polydisperse 
     235The dispersity parameters are stored as an array of parameter 
     236indices, one for each parameter, stored in pd_par[n]. Parameters which do 
     237not support dispersity do not appear in this array. Each dispersity 
    240238parameter has a weight vector whose length is stored in pd_length[n]. 
    241239The weights are stored in a contiguous vector of weights for all 
     
    243241in pd_offset[n].  The values corresponding to the weights are stored 
    244242together in a separate weights[] vector, with offset stored in 
    245 par_offset[pd_par[n]]. Polydisperse parameters should be stored in 
     243par_offset[pd_par[n]]. Dispersity parameters should be stored in 
    246244decreasing order of length for highest efficiency. 
    247245 
    248 We limit the number of polydisperse dimensions to MAX_PD (currently 4), 
    249 though some models may have fewer if they have fewer polydisperse 
     246We limit the number of dispersity dimensions to MAX_PD (currently 4), 
     247though some models may have fewer if they have fewer dispersity 
    250248parameters.  The main reason for the limit is to reduce code size. 
    251 Each additional polydisperse parameter requires a separate polydispersity 
    252 loop.  If more than 4 levels of polydispersity are needed, then kernel_iq.c 
    253 and kernel_iq.cl will need to be extended. 
     249Each additional dispersity parameter requires a separate dispersity 
     250loop.  If more than 4 levels of dispersity are needed, then we need to 
     251switch to a Monte Carlo importance sampling algorithm with better 
     252performance for high-dimensional integrals. 
    254253 
    255254Constraints between parameters are not supported.  Instead users will 
     
    262261theta since the polar coordinates normalization is tied to this parameter. 
    263262 
    264 If there is no polydispersity we pretend that it is polydisperisty with one 
    265 parameter, pd_start=0 and pd_stop=1.  We may or may not short circuit the 
    266 calculation in this case, depending on how much time it saves. 
     263If there is no dispersity we pretend that we have a dispersity mesh over 
     264a single parameter with a single point in the distribution, giving 
     265pd_start=0 and pd_stop=1. 
    267266 
    268267The problem details structure could be allocated and sent in as an integer 
    269268array using the read-only flag.  This would allow us to copy it once per fit 
    270269along with the weights vector, since features such as the number of 
    271 polydisperity elements per pd parameter or the coordinated won't change 
    272 between function evaluations.  A new parameter vector must be sent for 
    273 each I(q) evaluation.  This is not currently implemented, and would require 
    274 some resturcturing of the :class:`sasview_model.SasviewModel` interface. 
    275  
    276 The results array will be initialized to zero for polydispersity loop 
     270dispersity points per pd parameter won't change between function evaluations. 
     271A new parameter vector must be sent for each I(q) evaluation.  This is 
     272not currently implemented, and would require some restructuring of 
     273the :class:`sasview_model.SasviewModel` interface. 
     274 
     275The results array will be initialized to zero for dispersity loop 
    277276entry zero, and preserved between calls to [start, stop] so that the 
    278277results accumulate by the time the loop has completed.  Background and 
     
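A minimal Python sketch of this accumulation follows. It assumes a single dispersity parameter and a made-up form factor. `toy_form` and `kernel_call` are hypothetical stand-ins, not the real kernel, and scale and background handling is left out.

```python
def toy_form(q, radius):
    """Made-up form factor, used only to have something to accumulate."""
    return 1.0 / (1.0 + (q * radius) ** 2)

def kernel_call(pd_start, pd_stop, points, q, result):
    """Accumulate weighted I(q) for loop entries pd_start..pd_stop-1.

    result has len(q) + 1 slots; the last slot holds the summed weight.
    """
    nq = len(q)
    if pd_start == 0:
        # Loop entry zero: clear the accumulators.
        result[:] = [0.0] * (nq + 1)
    for weight, radius in points[pd_start:pd_stop]:
        for i, qi in enumerate(q):
            result[i] += weight * toy_form(qi, radius)
        result[nq] += weight

q = [0.01, 0.1, 1.0]
points = [(0.1, 10.0), (0.4, 20.0), (0.3, 30.0), (0.2, 40.0)]  # (weight, radius)

# One call over the whole loop.
single = [0.0] * (len(q) + 1)
kernel_call(0, len(points), points, q, single)

# The same loop broken across two calls, reusing the result buffer.
split = [0.0] * (len(q) + 1)
kernel_call(0, 2, points, q, split)
kernel_call(2, len(points), points, q, split)

# Normalize by the weight accumulated in the extra slot.
normalized = [value / split[-1] for value in split[:-1]]
```

Because the buffer is cleared only at loop entry zero and the summed weight sits in the extra slot, the split calls give the same result as the single call.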
    295294 
    296295This will require accumulated error for each I(q) value to be preserved 
    297 between kernel calls to implement this fully.  The kernel_iq.c code, which 
    298 loops over q for each parameter set in the polydispersity loop, will need 
    299 also need the accumalation vector. 
     296between kernel calls to implement this fully.  The *kernel_iq.c* code, which 
     297loops over q for each parameter set in the dispersity loop, will also need 
     298the accumulation vector. 