calculator.rst @ 7e11c3a

core_shell_microgelscostrafo411magnetic_modelticket-1257-vesicle-productticket_1156ticket_1265_superballticket_822_more_unit_tests

Last change on this file since 7e11c3a was 870a2f4, checked in by Paul Kienzle <pkienzle@…>, 7 years ago
fix doc builds
Property mode set to `100644`
File size: 13.5 KB

Rev	Line
[e822c97]	1	.. currentmodule:: sasmodels
	2
[870a2f4]	3	.. _Calculator_Interface:
	4
[d5ac45f]	5	Calculator Interface
	6	====================
	7
[e822c97]	8	This document describes the layer between the form factor kernels and the
	9	model calculator which implements the polydispersity and magnetic SLD
	10	calculations. There are three separate implementations of this layer,
[2e66ef5]	11	:mod:`kernelcl` for OpenCL, which operates on a single Q value at a time,
	12	:mod:`kerneldll` for the DLL, which loops over a vector of Q values, and
	13	:mod:`kernelpy` for python models which operates on vector Q values.
[e822c97]	14
	15	Each implementation provides three different calls Iq, Iqxy and Imagnetic
	16	for 1-D, 2-D and 2-D magnetic kernels respectively.
	17
	18	The C code is defined in kernel_iq.c and kernel_iq.cl for DLL and OpenCL
	19	respectively. The kernel call looks as follows::
	20
	21	kernel void KERNEL_NAME(
	22	int nq, // Number of q values in the q vector
	23	int pd_start, // Starting position in the polydispersity loop
	24	int pd_stop, // Ending position in the polydispersity loop
	25	ProblemDetails *details, // Polydispersity info
	26	double *values, // Value and weights vector
	27	double *q, // q or (qx,qy) vector
	28	double *result, // returned I(q), with result[nq] = pd_weight
	29	double cutoff) // polydispersity weight cutoff
	30
	31	The details for OpenCL and the python loop are slightly different, but these
	32	data structures are common.
	33
	34	nq indicates the number of q values that will be calculated.
	35
	36	The pd_start and pd_stop parameters set the range of the polydispersity
	37	loop to compute for the current kernel call. Give a polydispersity
	38	calculation with 30 weights for length and 30 weights for radius for example,
	39	there are a total of 900 calls to the form factor required to compute the
	40	kernel. These might be done in chunks of 100, so the first call would start
	41	at zero and stop after 100 iterations. The second call would then have to set
	42	the length index to 3 and the radius index to 10 for a position of 3*30+10=100,
	43	and could then proceed to position 200. This allows us to interrupt the
	44	calculation in the middle of a long polydispersity loop without having to
	45	do special tricks with the C code. More importantly, it stops the OpenCL
	46	kernel in a reasonable time; because the GPU is used by the operating
	47	system to show its windows, if a GPU kernel runs too long then it will be
	48	automatically killed and no results will be returned to the caller.
	49
	50	The ProblemDetails structure is a direct map of the
	51	:class:`details.CallDetails` buffer. This indicates which parameters are
	52	polydisperse, and where in the values vector the values and weights can be
	53	found. For each polydisperse parameter there is a parameter id, the length
	54	of the polydispersity loop for that parameter, the offset of the parameter
	55	values in the pd value and pd weight vectors and the 'stride' from one index
	56	to the next, which is used to translate between the position in the
	57	polydispersity loop and the particular parameter indices. The num_eval
	58	field is the total size of the polydispersity loop. num_weights is the
	59	number of elements in the pd value and pd weight vectors. num_active is
	60	the number of non-trivial pd loops (polydisperse parameters should be ordered
	61	by decreasing pd vector length, with a length of 1 meaning no polydispersity).
	62	Oriented objects in 2-D need a cos(theta) spherical correction on the angular
	63	variation in order to preserve the 'surface area' of the weight distribution.
	64	theta_par is the id of the polar coordinate parameter if there is one.
	65
	66
	67	The values vector consists of the fixed values for the model plus pd value
	68	and pd weight vectors. There are NUM_VALUES fixed values for the model,
	69	which includes the initial two slots for scale and background, the NUM_PARS
	70	values for the kernel parameters, three slots for the applied magnetism, and
	71	three slots for each of the SLD parameters for the sample magnetiziation
	72	(Mx, My, Mz). Sample magnetization is translated from (M, theta, phi)
	73	to (Mx, My, Mz) before the kernel is called. After the fixed values comes
	74	the pd value vector, with the polydispersity values for each parameter
	75	stacked one after the other. The order isn't important since the location
	76	for each parameter is stored in the pd_offset field of the ProblemDetails
	77	structure, but the values do need to be contiguous. After num_weights
	78	values, the pd weight vector is stored, with the same configuration as the
	79	pd value vector. Note that the pd vectors can contain values that are not
	80	in the polydispersity loop; this is used by :class:`mixture.MixtureKernel`
	81	to make it easier to call the various component kernels.
	82
	83	The q vector contains one value for each q for Iq kernels, or a pair
	84	of (qx,qy) values for each q for Iqxy and Imagnetic kernels. OpenCL
	85	pads all vectors to 32 value boundaries just to be safe, including the
	86	values vector and the the results vector.
	87
	88	The results vector contains one slot for each of the nq values, plus
	89	one extra slot at the end for the current polydisperse normalization. This
	90	is required when the polydispersity loop is broken across several kernel
	91	calls.
	92
	93	cutoff is a importance cutoff so that points which contribute negligibly
	94	to the total scattering can be skipped without calculating them.
	95
	96	:func:`generate.make_source` defines the following C macros:
[d5ac45f]	97
	98	- USE_OPENCL is defined if running in opencl
[6fd8de0]	99	- MAX_PD is the maximum depth of the polydispersity loop [model specific]
[e822c97]	100	- NUM_PARS is the number of parameter values in the kernel. This may be
	101	more than the number of parameters if some of the parameters are vector
	102	values.
	103	- NUM_VALUES is the number of fixed values, which defines the offset in the
	104	value list to the polydisperse value and weight vectors.
	105	- NUM_MAGNETIC is the number of magnetic SLDs
	106	- MAGNETIC_PARS is a comma separated list of the magnetic SLDs, indicating
	107	their locations in the values vector.
	108	- MAGNETIC_PAR0 to MAGNETIC_PAR2 are the first three magnetic parameter ids
	109	so we can hard code the setting of magnetic values if there are only a
	110	few of them.
	111	- KERNEL_NAME is the name of the function being declared
	112	- PARAMETER_TABLE is the declaration of the parameters to the kernel:
[d5ac45f]	113
[e822c97]	114	Cylinder::
[d5ac45f]	115
	116	#define PARAMETER_TABLE \
	117	double length; \
	118	double radius; \
	119	double sld; \
[e822c97]	120	double sld_solvent;
[d5ac45f]	121
	122	Note: scale and background are never included
	123
[e822c97]	124	Multi-shell cylinder (10 shell max)::
[d5ac45f]	125
	126	#define PARAMETER_TABLE \
	127	double num_shells; \
	128	double length; \
	129	double radius[10]; \
	130	double sld[10]; \
[e822c97]	131	double sld_solvent;
[d5ac45f]	132
[e822c97]	133	- CALL_IQ(q, i, var) is the declaration of a call to the kernel:
[d5ac45f]	134
[e822c97]	135	Cylinder::
[d5ac45f]	136
[03cac08]	137	#define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]	138	var.length, \
	139	var.radius, \
	140	var.sld, \
	141	var.sld_solvent)
	142
[e822c97]	143	Multi-shell cylinder::
[d5ac45f]	144
[03cac08]	145	#define CALL_IQ(q, i, var) Iq(q[i], \
[d5ac45f]	146	var.num_shells, \
	147	var.length, \
	148	var.radius, \
	149	var.sld, \
	150	var.sld_solvent)
	151
[e822c97]	152	Cylinder2D::
[03cac08]	153
	154	#define CALL_IQ(q, i, var) Iqxy(q[2i], q[2i+1], \
	155	var.length, \
	156	var.radius, \
	157	var.sld, \
	158	var.sld_solvent, \
	159	var.theta, \
	160	var.phi)
	161
[d5ac45f]	162	- CALL_VOLUME(var) is similar, but for calling the form volume::
	163
	164	#define CALL_VOLUME(var) \
	165	form_volume(var.length, var.radius)
	166
[e822c97]	167	There is an additional macro that can be defined within the model.c file:
[d5ac45f]	168
[e822c97]	169	- INVALID(var) is a test for model parameters in the correct range:
	170
	171	Cylinder::
[d5ac45f]	172
	173	#define INVALID(var) 0
	174
[e822c97]	175	BarBell::
[d5ac45f]	176
[6fd8de0]	177	#define INVALID(var) (var.bell_radius < var.radius)
[d5ac45f]	178
[e822c97]	179	Model with complicated constraints::
[d5ac45f]	180
	181	inline bool constrained(p1, p2, p3) { return expression; }
	182	#define INVALID(var) constrained(var.p1, var.p2, var.p3)
	183
	184	Our design supports a limited number of polydispersity loops, wherein
	185	we need to cycle through the values of the polydispersity, calculate
	186	the I(q, p) for each combination of parameters, and perform a normalized
	187	weighted sum across all the weights. Parameters may be passed to the
	188	underlying calculation engine as scalars or vectors, but the polydispersity
	189	calculator treats the parameter set as one long vector.
	190
[e822c97]	191	Let's assume we have 8 parameters in the model, with two polydisperse. Since
	192	this is a 1-D model the orientation parameters won't be used::
[d5ac45f]	193
	194	0: scale {scl = constant}
	195	1: background {bkg = constant}
[e822c97]	196	2: radius {r = vector of 10pts}
	197	3: length {l = vector of 30pts}
[6fd8de0]	198	4: sld {s = constant/(radius*2length)}
	199	5: sld_solvent {s2 = constant}
[e822c97]	200	6: theta {not used}
	201	7: phi {not used}
	202
	203	This generates the following call to the kernel. Note that parameters 4 and
	204	5 are treated as polydisperse even though they are not --- this is because
	205	it is harmless to do so and simplifies the looping code::
	206
	207	MAX_PD = 4
	208	NUM_PARS = 6 // kernel parameters only
	209	NUM_VALUES = 17 // all values, including scale and background
	210	NUM_MAGNETIC = 2 // two parameters might be magnetic
	211	MAGNETIC_PARS = 4, 5 // they are sld and sld_solvent
	212	MAGNETIC_PAR0 = 4 // sld index
	213	MAGNETIC_PAR1 = 5 // solvent index
	214
	215	details {
	216	pd_par = {3, 2, 4, 5} // parameters radius and length vary
	217	pd_length = {30, 10, 1, 1} // length has more, so it is first
	218	pd_offset = {10, 0, 31, 32} // length starts at index 10 in weights
[d5ac45f]	219	pd_stride = {1, 30, 300, 300} // cumulative product of pd length
[e822c97]	220	num_eval = 300 // 300 values in the polydispersity loop
	221	num_weights = 42 // 42 values in the pd vector
	222	num_active = 2 // only the first two pd are active
	223	theta_var = 6 // spherical correction
[d5ac45f]	224	}
	225
[e822c97]	226	values = { scl, bkg, // universal
	227	r, l, s, s2, theta, phi, // kernel pars
	228	in spin, out spin, spin angle, // applied magnetism
	229	mx s, my s, mz s, mx s2, my s2, mz s2, // magnetic slds
	230	r0, .., r9, l0, .., l29, s, s2, // pd values
	231	r0, .., r9, l0, .., l29, s, s2} // pd weights
[d5ac45f]	232
	233	nq = 130
	234	q = { q0, q1, ..., q130, x, x } # pad to 8 element boundary
[e822c97]	235	result = {r1, ..., r130, pd_norm, x }
[d5ac45f]	236
	237	The polydisperse parameters are stored in as an array of parameter
	238	indices, one for each polydisperse parameter, stored in pd_par[n].
	239	Non-polydisperse parameters do not appear in this array. Each polydisperse
[6fd8de0]	240	parameter has a weight vector whose length is stored in pd_length[n].
[d5ac45f]	241	The weights are stored in a contiguous vector of weights for all
	242	parameters, with the starting position for the each parameter stored
	243	in pd_offset[n]. The values corresponding to the weights are stored
	244	together in a separate weights[] vector, with offset stored in
	245	par_offset[pd_par[n]]. Polydisperse parameters should be stored in
	246	decreasing order of length for highest efficiency.
	247
[48fbd50]	248	We limit the number of polydisperse dimensions to MAX_PD (currently 4),
	249	though some models may have fewer if they have fewer polydisperse
[e822c97]	250	parameters. The main reason for the limit is to reduce code size.
	251	Each additional polydisperse parameter requires a separate polydispersity
	252	loop. If more than 4 levels of polydispersity are needed, then kernel_iq.c
	253	and kernel_iq.cl will need to be extended.
	254
	255	Constraints between parameters are not supported. Instead users will
	256	have to define a new model with the constraints built in by making a
	257	copy of the existing model. Mac provides OpenCL and we are supplying
	258	the tinycc compiler for Windows so this won't be a complete limitation,
	259	but it is rather inconvenient. The process could perhaps be automated
	260	so that there is no code copying required, just an alternate CALL_IQ
	261	macro that implements the constraints. Think carefully about constraining
	262	theta since the polar coordinates normalization is tied to this parameter.
[d5ac45f]	263
	264	If there is no polydispersity we pretend that it is polydisperisty with one
	265	parameter, pd_start=0 and pd_stop=1. We may or may not short circuit the
	266	calculation in this case, depending on how much time it saves.
	267
[e822c97]	268	The problem details structure could be allocated and sent in as an integer
	269	array using the read-only flag. This would allow us to copy it once per fit
[d5ac45f]	270	along with the weights vector, since features such as the number of
	271	polydisperity elements per pd parameter or the coordinated won't change
[e822c97]	272	between function evaluations. A new parameter vector must be sent for
	273	each I(q) evaluation. This is not currently implemented, and would require
	274	some resturcturing of the :class:`sasview_model.SasviewModel` interface.
[d5ac45f]	275
	276	The results array will be initialized to zero for polydispersity loop
	277	entry zero, and preserved between calls to [start, stop] so that the
	278	results accumulate by the time the loop has completed. Background and
	279	scale will be applied when the loop reaches the end. This does require
	280	that the results array be allocated read-write, which is less efficient
	281	for the GPU, but it makes the calling sequence much more manageable.
	282
[03cac08]	283	For accuracy we may want to introduce Kahan summation into the integration::
	284
	285	double accumulated_error = 0.0;
	286	...
	287	#if USE_KAHAN_SUMMATION
	288	const double y = next - accumulated_error;
	289	const double t = ret + y;
	290	accumulated_error = (t - ret) - y;
	291	ret = t;
	292	#else
	293	ret += next;
	294	#endif
[e822c97]	295
	296	This will require accumulated error for each I(q) value to be preserved
	297	between kernel calls to implement this fully. The kernel_iq.c code, which
	298	loops over q for each parameter set in the polydispersity loop, will need
	299	also need the accumalation vector.

Note: See TracBrowser for help on using the repository browser.

SasView

source: sasmodels/doc/developer/calculator.rst @ 7e11c3a

Download in other formats: