Changeset e822c97 in sasmodels


Ignore:
Timestamp:
Aug 30, 2016 4:21:51 PM (8 years ago)
Author:
Paul Kienzle <pkienzle@…>
Branches:
master, core_shell_microgels, costrafo411, magnetic_model, release_v0.94, release_v0.95, ticket-1257-vesicle-product, ticket_1156, ticket_1265_superball, ticket_822_more_unit_tests
Children:
3a45c2c
Parents:
51241113
Message:

internal docs: clean up description of the new polydispersity calculator

File:
1 edited

Legend:

Unmodified
Added
Removed
  • doc/developer/calculator.rst

    r6fd8de0 re822c97  
     1.. currentmodule:: sasmodels 
     2 
    13Calculator Interface 
    24==================== 
    35 
    4 The environment needs to provide the following #defines: 
     6This document describes the layer between the form factor kernels and the 
     7model calculator which implements the polydispersity and magnetic SLD 
     8calculations.  There are three separate implementations of this layer, 
     9*kernelcl.py* for OpenCL, which operates on a single Q value at a time, 
     10*kerneldll.c* for the DLL, which loops over a vector of Q values, and 
     11*kernelpy.py* for python models which operates on vector Q values. 
     12 
     13Each implementation provides three different calls *Iq*, *Iqxy* and *Imagnetic* 
     14for 1-D, 2-D and 2-D magnetic kernels respectively. 
     15 
     16The C code is defined in *kernel_iq.c* and *kernel_iq.cl* for DLL and OpenCL 
     17respectively.  The kernel call looks as follows:: 
     18 
     19  kernel void KERNEL_NAME( 
     20      int nq,                  // Number of q values in the q vector 
     21      int pd_start,            // Starting position in the polydispersity loop 
     22      int pd_stop,             // Ending position in the polydispersity loop 
     23      ProblemDetails *details, // Polydispersity info 
     24      double *values,          // Value and weights vector 
     25      double *q,               // q or (qx,qy) vector 
     26      double *result,          // returned I(q), with result[nq] = pd_weight 
     27      double cutoff)           // polydispersity weight cutoff 
     28 
     29The details for OpenCL and the python loop are slightly different, but these 
     30data structures are common. 
     31 
     32*nq* indicates the number of q values that will be calculated. 
     33 
     34The *pd_start* and *pd_stop* parameters set the range of the polydispersity 
     35loop to compute for the current kernel call.   Give a polydispersity 
     36calculation with 30 weights for length and 30 weights for radius for example, 
     37there are a total of 900 calls to the form factor required to compute the 
     38kernel.  These might be done in chunks of 100, so the first call would start 
     39at zero and stop after 100 iterations.  The second call would then have to set 
     40the length index to 3 and the radius index to 10 for a position of 3*30+10=100, 
     41and could then proceed to position 200.  This allows us to interrupt the 
     42calculation in the middle of a long polydispersity loop without having to 
     43do special tricks with the C code.  More importantly, it stops the OpenCL 
     44kernel in a reasonable time; because the GPU is used by the operating 
     45system to show its windows, if a GPU kernel runs too long then it will be 
     46automatically killed and no results will be returned to the caller. 
     47 
     48The *ProblemDetails* structure is a direct map of the 
     49:class:`details.CallDetails` buffer.  This indicates which parameters are 
     50polydisperse, and where in the values vector the values and weights can be 
     51found.  For each polydisperse parameter there is a parameter id, the length 
     52of the polydispersity loop for that parameter, the offset of the parameter 
     53values in the pd value and pd weight vectors and the 'stride' from one index 
     54to the next, which is used to translate between the position in the 
     55polydispersity loop and the particular parameter indices.  The *num_eval* 
     56field is the total size of the polydispersity loop.  *num_weights* is the 
     57number of elements in the pd value and pd weight vectors.  *num_active* is 
     58the number of non-trivial pd loops (polydisperse parameters should be ordered 
     59by decreasing pd vector length, with a length of 1 meaning no polydispersity). 
     60Oriented objects in 2-D need a cos(theta) spherical correction on the angular 
     61variation in order to preserve the 'surface area' of the weight distribution. 
     62*theta_par* is the id of the polar coordinate parameter if there is one. 
     63 
     64 
     65The *values* vector consists of the fixed values for the model plus pd value 
     66and pd weight vectors.  There are *NUM_VALUES* fixed values for the model, 
     67which includes the initial two slots for scale and background, the NUM_PARS 
     68values for the kernel parameters, three slots for the applied magnetism, and 
     69three slots for each of the SLD parameters for the sample magnetiziation 
     70*(Mx, My, Mz)*.  Sample magnetization is translated from *(M, theta, phi)* 
     71to *(Mx, My, Mz)* before the kernel is called.   After the fixed values comes 
     72the pd value vector, with the polydispersity values for each parameter 
     73stacked one after the other.  The order isn't important since the location 
     74for each parameter is stored in the *pd_offset* field of the *ProblemDetails* 
     75structure, but the values do need to be contiguous.  After *num_weights* 
     76values, the pd weight vector is stored, with the same configuration as the 
     77pd value vector.  Note that the pd vectors can contain values that are not 
     78in the polydispersity loop; this is used by :class:`mixture.MixtureKernel` 
     79to make it easier to call the various component kernels. 
     80 
     81The *q* vector contains one value for each q for *Iq* kernels, or a pair 
     82of *(qx,qy)* values for each q for *Iqxy* and *Imagnetic* kernels.  OpenCL 
     83pads all vectors to 32 value boundaries just to be safe, including the 
     84*values* vector and the the *results* vector. 
     85 
     86The *results* vector contains one slot for each of the *nq* values, plus 
     87one extra slot at the end for the current polydisperse normalization.  This 
     88is required when the polydispersity loop is broken across several kernel 
     89calls. 
     90 
     91*cutoff* is a importance cutoff so that points which contribute negligibly 
     92to the total scattering can be skipped without calculating them. 
     93 
     94:func:`generate.make_source` defines the following C macros: 
    595 
    696- USE_OPENCL is defined if running in opencl 
    7 - KERNEL declares a function to be available externally 
     97- MAX_PD is the maximum depth of the polydispersity loop [model specific] 
     98- NUM_PARS is the number of parameter values in the kernel.  This may be 
     99  more than the number of parameters if some of the parameters are vector 
     100  values. 
     101- NUM_VALUES is the number of fixed values, which defines the offset in the 
     102  value list to the polydisperse value and weight vectors. 
     103- NUM_MAGNETIC is the number of magnetic SLDs 
     104- MAGNETIC_PARS is a comma separated list of the magnetic SLDs, indicating 
     105  their locations in the values vector. 
     106- MAGNETIC_PAR0 to MAGNETIC_PAR2 are the first three magnetic parameter ids 
     107  so we can hard code the setting of magnetic values if there are only a 
     108  few of them. 
    8109- KERNEL_NAME is the name of the function being declared 
    9 - MAX_PD is the maximum depth of the polydispersity loop [model specific] 
    10 - NPARS is the number of parameters in the kernel 
    11 - PARAMETER_TABLE is the declaration of the parameters to the kernel:: 
    12  
    13     Cylinder: 
     110- PARAMETER_TABLE is the declaration of the parameters to the kernel: 
     111 
     112    Cylinder:: 
    14113 
    15114        #define PARAMETER_TABLE \ 
     
    17116        double radius; \ 
    18117        double sld; \ 
    19         double sld_solvent 
     118        double sld_solvent; 
    20119 
    21120    Note: scale and background are never included 
    22121 
    23     Multi-shell cylinder (10 shell max): 
     122    Multi-shell cylinder (10 shell max):: 
    24123 
    25124        #define PARAMETER_TABLE \ 
     
    28127        double radius[10]; \ 
    29128        double sld[10]; \ 
    30         double sld_solvent 
    31  
    32 - CALL_IQ(q, i, var) is the declaration of a call to the kernel:: 
    33  
    34     Cylinder: 
     129        double sld_solvent; 
     130 
     131- CALL_IQ(q, i, var) is the declaration of a call to the kernel: 
     132 
     133    Cylinder:: 
    35134 
    36135        #define CALL_IQ(q, i, var) Iq(q[i], \ 
     
    40139        var.sld_solvent) 
    41140 
    42     Multi-shell cylinder: 
     141    Multi-shell cylinder:: 
    43142 
    44143        #define CALL_IQ(q, i, var) Iq(q[i], \ 
     
    49148        var.sld_solvent) 
    50149 
    51     Cylinder2D: 
     150    Cylinder2D:: 
    52151 
    53152        #define CALL_IQ(q, i, var) Iqxy(q[2*i], q[2*i+1], \ 
     
    64163        form_volume(var.length, var.radius) 
    65164 
    66 - INVALID(var) is a test for model parameters in the correct range:: 
    67  
    68     Cylinder: 
     165There is an additional macro that can be defined within the model.c file: 
     166 
     167- INVALID(var) is a test for model parameters in the correct range: 
     168 
     169    Cylinder:: 
    69170 
    70171        #define INVALID(var) 0 
    71172 
    72     BarBell: 
     173    BarBell:: 
    73174 
    74175        #define INVALID(var) (var.bell_radius < var.radius) 
    75176 
    76     Model with complicated constraints: 
     177    Model with complicated constraints:: 
    77178 
    78179        inline bool constrained(p1, p2, p3) { return expression; } 
     
    86187calculator treats the parameter set as one long vector. 
    87188 
    88 Let's assume we have 6 parameters in the model, with two polydisperse:: 
     189Let's assume we have 8 parameters in the model, with two polydisperse.  Since 
     190this is a 1-D model the orientation parameters won't be used:: 
    89191 
    90192    0: scale        {scl = constant} 
    91193    1: background   {bkg = constant} 
    92     2: length       {l = vector of 30pts} 
    93     3: radius       {r = vector of 10pts} 
     194    2: radius       {r = vector of 10pts} 
     195    3: length       {l = vector of 30pts} 
    94196    4: sld          {s = constant/(radius**2*length)} 
    95197    5: sld_solvent  {s2 = constant} 
    96  
    97 This generates the following call to the kernel (where x stands for an 
    98 arbitrary value that is not used by the kernel evaluator):: 
    99  
    100     NPARS = 4  // scale and background are in all models 
    101     problem { 
    102         pd_par = {3, 2, x, x}         // parameters *radius* and *length* vary 
    103         pd_length = {30, 10, 0, 0}    // *length* has more, so it is first 
    104         pd_offset = {10, 0, x, x}     // *length* starts at index 10 in weights 
     198    6: theta        {not used} 
     199    7: phi          {not used} 
     200 
     201This generates the following call to the kernel.  Note that parameters 4 and 
     2025 are treated as polydisperse even though they are not --- this is because 
     203it is harmless to do so and simplifies the looping code:: 
     204 
     205    MAX_PD = 4 
     206    NUM_PARS = 6          // kernel parameters only 
     207    NUM_VALUES = 17       // all values, including scale and background 
     208    NUM_MAGNETIC = 2      // two parameters might be magnetic 
     209    MAGNETIC_PARS = 4, 5  // they are sld and sld_solvent 
     210    MAGNETIC_PAR0 = 4     // sld index 
     211    MAGNETIC_PAR1 = 5     // solvent index 
     212 
     213    details { 
     214        pd_par = {3, 2, 4, 5}         // parameters *radius* and *length* vary 
     215        pd_length = {30, 10, 1, 1}    // *length* has more, so it is first 
     216        pd_offset = {10, 0, 31, 32}   // *length* starts at index 10 in weights 
    105217        pd_stride = {1, 30, 300, 300} // cumulative product of pd length 
    106         pd_isvol = {True, True, x, x}       // true if weight is a volume weight 
    107         par_offset = {2, 3, 303, 313}  // parameter offsets 
    108         par_coord = {0, 3, 2, 1} // bitmap of parameter dependencies 
    109         fast_coord_index = {3, 5, x, x} // radius and sld have fast index 
    110         fast_coord_count = 2  // two parameters vary with *length* distribution 
    111         theta_var = -1   // no spherical correction 
    112         fast_theta = 0   // spherical correction angle is not pd 1 
     218        num_eval = 300   // 300 values in the polydispersity loop 
     219        num_weights = 42 // 42 values in the pd vector 
     220        num_active = 2   // only the first two pd are active 
     221        theta_var =  6   // spherical correction 
    113222    } 
    114223 
    115     weight = { l0, .., l29, r0, .., r9} //length comes first as the longest vec 
    116     pars = { scl, bkg, l0, ..., l29, r0, r1, ..., r9, 
    117              s[l0,r0], ... s[l0,r9], s[l1,r0], ... s[l29,r9] , s2} 
    118              //where s[x,y] stands for material sld, s2 = solvent sld 
     224    values = { scl, bkg,                                  // universal 
     225               r, l, s, s2, theta, phi,                   // kernel pars 
     226               in spin, out spin, spin angle,             // applied magnetism 
     227               mx s, my s, mz s, mx s2, my s2, mz s2,     // magnetic slds 
     228               r0, .., r9, l0, .., l29, s, s2,            // pd values 
     229               r0, .., r9, l0, .., l29, s, s2}            // pd weights 
    119230 
    120231    nq = 130 
    121232    q = { q0, q1, ..., q130, x, x }  # pad to 8 element boundary 
    122     result = {r1, ..., r130, norm, vol, vol_norm, x, x, x, x, x, x, x} 
    123  
     233    result = {r1, ..., r130, pd_norm, x } 
    124234 
    125235The polydisperse parameters are stored in as an array of parameter 
     
    136246We limit the number of polydisperse dimensions to MAX_PD (currently 4), 
    137247though some models may have fewer if they have fewer polydisperse 
    138 parameters. This cuts the size of the structure in half compared to 
    139 allowing a separate polydispersity for each parameter.  This will 
    140 help a little bit for models with large numbers of parameters, such 
    141 as the onion model. 
    142  
    143 Parameters may be coordinated.  That is, we may have the value of one 
    144 parameter depend on a set of other parameters, some of which may be 
    145 polydisperse.  For example, if sld is inversely proportional to the 
    146 volume of a cylinder, and the length and radius are independently 
    147 polydisperse, then for each combination of length and radius we need a 
    148 separate value for the sld.  The caller must provide a coordination table 
    149 for each parameter containing the value for each parameter given the 
    150 value of the polydisperse parameters v1, v2, etc.  The tables for each 
    151 parameter are arranged contiguously in a vector, with offset[k] giving the 
    152 starting location of parameter k in the vector.  Each parameter defines 
    153 par_coord[k] as a bit mask indicating which polydispersity parameters the 
    154 parameter depends upon. Usually this is zero, indicating that the parameter 
    155 is independent, but for the cylinder example given, the bits for the 
    156 radius and length polydispersity parameters would both be set, the result 
    157 being a (#radius x #length) table, or maybe a (#length x #radius) table 
    158 if length comes first in the polydispersity table. 
    159  
    160 NB: If we can guarantee that a compiler and OpenCL driver are available, 
    161 we could instead create the coordination function on the fly for each 
    162 parameter, saving memory and transfer time, but requiring a C compiler 
    163 as part of the environment. 
    164  
    165 In ordering the polydisperse parameters by decreasing length we can 
    166 iterate over the longest dispersion weight vector first.  All parameters 
    167 coordinated with this weight vector (the 'fast' parameters), can be 
    168 updated with a simple increment to the next position in the parameter 
    169 value table.  The indices of these parameters is stored in fast_coord_index[], 
    170 with fast_coord_count being the number of fast parameters.  A total 
    171 of NPARS slots is allocated to allow for the case that all parameters 
    172 are coordinated with the fast index, though this will likely be mostly 
    173 empty.  When the fast increment count reaches the end of the weight 
    174 vector, then the index of the second polydisperse parameter must be 
    175 incremented, and all of its coordinated parameters updated.  Because this 
    176 operation is not in the inner loop, a slower algorithm can be used. 
     248parameters.  The main reason for the limit is to reduce code size. 
     249Each additional polydisperse parameter requires a separate polydispersity 
     250loop.  If more than 4 levels of polydispersity are needed, then kernel_iq.c 
     251and kernel_iq.cl will need to be extended. 
     252 
     253Constraints between parameters are not supported.  Instead users will 
     254have to define a new model with the constraints built in by making a 
     255copy of the existing model.  Mac provides OpenCL and we are supplying 
     256the tinycc compiler for Windows so this won't be a complete limitation, 
     257but it is rather inconvenient.  The process could perhaps be automated 
     258so that there is no code copying required, just an alternate CALL_IQ 
     259macro that implements the constraints.  Think carefully about constraining 
     260theta since the polar coordinates normalization is tied to this parameter. 
    177261 
    178262If there is no polydispersity we pretend that it is polydisperisty with one 
     
    180264calculation in this case, depending on how much time it saves. 
    181265 
    182 The problem details structure can be allocated and sent in as an integer 
    183 array using the read-only flag.  This allows us to copy it once per fit 
     266The problem details structure could be allocated and sent in as an integer 
     267array using the read-only flag.  This would allow us to copy it once per fit 
    184268along with the weights vector, since features such as the number of 
    185269polydisperity elements per pd parameter or the coordinated won't change 
    186 between function evaluations.  A new parameter vector is sent for 
    187 each I(q) evaluation. 
    188  
    189 To protect against expensive evaluations taking all the GPU resource 
    190 on large fits, the entire polydispersity will not be computed at once. 
    191 Instead, a start and stop location will be sent, indicating where in the 
    192 polydispersity loop the calculation should start and where it should 
    193 stop.  We can do this for arbitrary start/stop points since we have 
    194 unwound the nested loop.  Instead, we use the same technique as array 
    195 index translation, using div and mod to figure out the i,j,k,... 
    196 indices in the virtual nested loop. 
     270between function evaluations.  A new parameter vector must be sent for 
     271each I(q) evaluation.  This is not currently implemented, and would require 
     272some resturcturing of the :class:`sasview_model.SasviewModel` interface. 
    197273 
    198274The results array will be initialized to zero for polydispersity loop 
     
    203279for the GPU, but it makes the calling sequence much more manageable. 
    204280 
    205 Scale and background cannot be coordinated with other polydisperse parameters 
    206  
    207 Oriented objects in 2-D need a spherical correction on the angular variation 
    208 in order to preserve the 'surface area' of the weight distribution. 
    209  
    210 cutoff parameter limits integration area within polydispersity hypercude, 
    211 which speeds calculations 
    212  
    213281For accuracy we may want to introduce Kahan summation into the integration:: 
    214  
    215282 
    216283    double accumulated_error = 0.0; 
     
    224291        ret += next; 
    225292    #endif 
     293 
     294This will require accumulated error for each I(q) value to be preserved 
     295between kernel calls to implement this fully.  The kernel_iq.c code, which 
     296loops over q for each parameter set in the polydispersity loop, will need 
     297also need the accumalation vector. 
Note: See TracChangeset for help on using the changeset viewer.