assembly.py @ 69032d4

ESS_GUIESS_GUI_DocsESS_GUI_batch_fittingESS_GUI_bumps_abstractionESS_GUI_iss1116ESS_GUI_iss879ESS_GUI_iss959ESS_GUI_openclESS_GUI_orderingESS_GUI_sync_sascalccostrafo411magnetic_scattrelease-4.1.1release-4.1.2release-4.2.2release_4.0.1ticket-1009ticket-1094-headlessticket-1242-2d-resolutionticket-1243ticket-1249ticket885unittest-saveload

Last change on this file since 69032d4 was 5ab5cae, checked in by Peter Parker, 9 years ago
Refs #202 - Try and fix some Sphinx warnings.
Property mode set to `100644`
File size: 22.8 KB

Line
1	# This program is public domain
2	"""
3	An assembly is a collection of fitting functions. This provides
4	the model representation that is the basis of the park fitting engine.
5
6	Models can range from very simple one dimensional theory functions
7	to complex assemblies of multidimensional datasets from different
8	experimental techniques, each with their own theory function and
9	a common underlying physical model.
10
11	Usage
12	=====
13
14	First define the models you want to work with. In the example
15	below we will use an example of a simple multilayer system measured by
16	specular reflection of xrays and neutrons. The gold depth is the only
17	fitting parameter, ranging from 10-30 A. The interface depths are
18	tied together using expressions. In this case the expression is
19	a simple copy, but any standard math functions can be used. Some
20	model developers may provide additional functions for use with the
21	expression.
22
23	Example models::
24
25	import reflectometry.model1d as refl
26	xray = refl.model('xray')
27	xray.incident('Air',rho=0)
28	xray.interface('iAu',sigma=5)
29	xray.layer('Au',rho=124.68,depth=[10,30])
30	xray.interface('iSi',sigma=5)
31	xray.substrate('Si',rho=20.07)
32	datax = refl.data('xray.dat')
33
34	neutron = refl.model('neutron')
35	neutron.incident('Air',rho=0)
36	neutron.interface('iAu',sigma='xray.iAu')
37	neutron.layer('Au',rho=4.66,depth='xray.Au.depth')
38	neutron.interface('iSi',sigma='xray.iSi')
39	neutron.substrate('Si',rho=2.07)
40	datan = refl.data('neutron.dat')
41
42	As you can see from the above, parameters can be set to a value if
43	the parameter is fixed, to a range if the parametemr is fitted, or
44	to a string expression if the parameter is calculated from other
45	parameters. See park.Parameter.set for further details.
46
47	Having constructed the models, we can now create an assembly::
48
49	import park
50	assembly = park.Assembly([(xray,datax), (neutron,datan)])
51
52	Note: this would normally be done in the context of a fit
53	using fit = park.Fit([(xray,datax), (neutron,datan)]), and later referenced
54	using fit.assembly.
55
56	Individual parts of the assembly are accessable using the
57	model number 0, 1, 2... or by the model name. In the above,
58	assembly[0] and assembly['xray'] refer to the same model.
59	Assemblies have insert and append functions for adding new
60	models, and "del model[idx]" for removing them.
61
62	Once the assembly is created computing the values for the system
63	is a matter of calling::
64
65	assembly.eval()
66	print "Chi**2",assembly.chisq
67	print "Reduced chi**2",assembly.chisq/assembly.degrees_of_freedom
68	plot(arange(len(assembly.residuals)), assembly.residuals)
69
70	This defines the attributes residuals, degrees_of_freedom and chisq,
71	which is what the optimizer uses as the cost function to minimize.
72
73	assembly.eval uses the current values for the parameters in the
74	individual models. These parameters can be changed directly
75	in the model. In the reflectometry example above, you could
76	set the gold thickness using xray.layer.Au.depth=156, or
77	something similar (the details are model specific). Parameters
78	can also be changed through the assembly parameter set. In the same
79	example, this would be assembly.parameterset['xray']['Au']['depth'].
80	See parameter set for details.
81
82	In the process of modeling data, particularly with multiple
83	datasets, you will sometimes want to temporarily ignore
84	how well one of the datasets matches so that you
85	can more quickly refine the model for the other datasets,
86	or see how particular models are influencing the fit. To
87	temporarily ignore the xray data in the example above use::
88
89	assembly.parts[0].isfitted = False
90
91	The model itself isn't ignored since its parameters may be
92	needed to compute the parameters for other models. To
93	reenable checking against the xray data, you would assign
94	a True value instead. More subtle weighting of the models
95	can be controlled using assembly.parts[idx].weight, but
96	see below for a note on model weighting.
97
98	A note on model weighting
99	-------------------------
100
101	Changing the weight is equivalent to scaling the error bars
102	on the given model by a factor of weight/n where n is the
103	number of data points. It is better to set the correct error
104	bars on your data in the first place than to adjust the weights.
105	If you have the correct error bars, then you should expect
106	roughly 2/3 of the data points to lie within one error bar of
107	the theory curve. If consecutive data points have largely
108	overlapping errorbars, then your uncertainty is overestimated.
109
110	Another case where weights are adjusted (abused?) is to
111	compensate for systematic errors in the data by forcing the
112	errorbars to be large enough to cover the systematic bias.
113	This is a poor approach to the problem. A better strategy
114	is to capture the systematic effects in the model, and treat
115	the measurement of the independent variable as an additional
116	data point in the fit. This is still not statistically sound
117	as there is likely to be a large correlation between the
118	uncertainty of the measurement and the values of all the
119	other variables.
120
121	That said, adjusting the weight on a dataset is a quick way
122	of reducing its influence on the entire fit. Please use it
123	with care.
124	"""
125
126	__all__ = ['Assembly', 'Fitness']
127	import numpy
128
129	import park
130	from park.parameter import Parameter,ParameterSet
131	from park.fitresult import FitParameter
132	import park.expression
133
134
135
136	class Fitness(object):
137	"""
138	Container for theory and data.
139
140	The fit object compares theory with data.
141
142	TODO: what to do with fittable metadata (e.g., footprint correction)?
143	"""
144	data = None
145	model = None
146	def __init__(self, model=None,data=None):
147	self.data,self.model = data,model
148	def _parameterset(self):
149	return self.model.parameterset
150	parameterset = property(_parameterset)
151	def residuals(self):
152	return self.data.residuals(self.model.eval)
153	def residuals_deriv(self, pars=[]):
154	return self.data.residuals_deriv(self.model.eval_derivs,pars=pars)
155	def set(self, **kw):
156	"""
157	Set parameters in the model.
158
159	User convenience function. This allows a user with an assembly
160	of models in a script to for example set the fit range for
161	parameter 'a' of the model::
162
163	assembly[0].set(a=[5,6])
164
165	Raises KeyError if the parameter is not in parameterset.
166	"""
167	self.model.set(**kw)
168	def abort(self):
169	if hasattr(self.model,'abort'): self.model.abort()
170
171	class Part(object):
172	"""
173	Part of a fitting assembly. Part holds the model itself and
174	associated data. The part can be initialized with a fitness
175	object or with a pair (model,data) for the default fitness function.
176
177	fitness (Fitness)
178	object implementing the `park.assembly.Fitness` interface. In
179	particular, fitness should provide a parameterset attribute
180	containing a ParameterSet and a residuals method returning a vector
181	of residuals.
182	weight (dimensionless)
183	weight for the model. See comments in assembly.py for details.
184	isfitted (boolean)
185	True if the model residuals should be included in the fit.
186	The model parameters may still be used in parameter
187	expressions, but there will be no comparison to the data.
188	residuals (vector)
189	Residuals for the model if they have been calculated, or None
190	degrees_of_freedom
191	Number of residuals minus number of fitted parameters.
192	Degrees of freedom for individual models does not make
193	sense in the presence of expressions combining models,
194	particularly in the case where a model has many parameters
195	but no data or many computed parameters. The degrees of
196	freedom for the model is set to be at least one.
197	chisq
198	sum(residuals**2); use chisq/degrees_of_freedom to
199	get the reduced chisq value.
200
201	Get/set the weight on the given model.
202
203	assembly.weight(3) returns the weight on model 3 (0-origin)
204	assembly.weight(3,0.5) sets the weight on model 3 (0-origin)
205	"""
206
207	def __init__(self, fitness, weight=1., isfitted=True):
208	if isinstance(fitness, tuple):
209	fitness = park.Fitness(*fitness)
210	self.fitness = fitness
211	self.weight = weight
212	self.isfitted = isfitted
213	self.residuals = None
214	self.chisq = numpy.Inf
215	self.degrees_of_freedom = 1
216
217
218	class Assembly(object):
219	"""
220	Collection of fit models.
221
222	Assembly implements the `park.fit.Objective` interface.
223
224	See `park.assembly` for usage.
225
226	Instance variables:
227
228	residuals : array
229	a vector of residuals spanning all models, with model
230	weights applied as appropriate.
231	degrees_of_freedom : integer
232	length of the residuals - number of fitted parameters
233	chisq : float
234	sum squared residuals; this is not the reduced chisq, which
235	you can get using chisq/degrees_of_freedom
236
237	These fields are defined for the individual models as well, with
238	degrees of freedom adjusted to the length of the individual data
239	set. If the model is not fitted or the weight is zero, the residual
240	will not be calculated.
241
242	The residuals fields are available only after the model has been
243	evaluated.
244	"""
245
246	def __init__(self, models=[]):
247	"""Build an assembly from a list of models."""
248	self.parts = []
249	for m in models:
250	self.parts.append(Part(m))
251	self._reset()
252
253	def __iter__(self):
254	"""Iterate through the models in order"""
255	for m in self.parts: yield m
256
257	def __getitem__(self, n):
258	"""Return the nth model"""
259	return self.parts[n].fitness
260
261	def __setitem__(self, n, fitness):
262	"""Replace the nth model"""
263	self.parts[n].fitness = fitness
264	self._reset()
265
266	def __delitem__(self, n):
267	"""Delete the nth model"""
268	del self.parts[n]
269	self._reset()
270
271	def weight(self, idx, value=None):
272	"""
273	Query the weight on a particular model.
274
275	Set weight to value if value is supplied.
276
277	:Parameters:
278	idx : integer
279	model number
280	value : float
281	model weight
282	:return: model weight
283	"""
284	if value is not None:
285	self.parts[idx].weight = value
286	return self.parts[idx].weight
287
288	def isfitted(self, idx, value=None):
289	"""
290	Query if a particular model is fitted.
291
292	Set isfitted to value if value is supplied.
293
294	:param idx: model number
295	:type idx: integer
296	:param value:
297	"""
298	if value is not None:
299	self.parts[idx].isfitted = value
300	return self.parts[idx].isfitted
301
302	def append(self, fitness, weight=1.0, isfitted=True):
303	"""
304	Add a model to the end of set.
305
306	:param fitness: the fitting model
307	The fitting model can be an instance of `park.assembly.Fitness`,
308	or a tuple of (`park.model.Model`,`park.data.Data1D`)
309	:param weight: model weighting (usually 1.0)
310	:param isfitted: whether model should be fit (equivalent to weight 0.)
311	"""
312	self.parts.append(Part(fitness,weight,isfitted))
313	self._reset()
314
315	def insert(self, idx, fitness, weight=1.0, isfitted=True):
316	"""Add a model to a particular position in the set."""
317	self.parts.insert(idx,Part(fitness,weight,isfitted))
318	self._reset()
319
320	def _reset(self):
321	"""Adjust the parameter set after the addition of a new model."""
322	subsets = [m.fitness.parameterset for m in self]
323	self.parameterset = ParameterSet('root',subsets)
324	self.parameterset.setprefix()
325	#print [p.path for p in self.parameterset.flatten()]
326
327	def eval(self):
328	"""
329	Recalculate the theory functions, and from them, the
330	residuals and chisq.
331
332	:note: Call this after the parameters have been updated.
333	"""
334	# Handle abort from a separate thread.
335	self._cancel = False
336
337	# Evaluate the computed parameters
338	self._fitexpression()
339
340	# Check that the resulting parameters are in a feasible region.
341	if not self.isfeasible(): return numpy.inf
342
343	resid = []
344	k = len(self._fitparameters)
345	for m in self.parts:
346	# In order to support abort, need to be able to propagate an
347	# external abort signal from self.abort() into an abort signal
348	# for the particular model. Can't see a way to do this which
349	# doesn't involve setting a state variable.
350	self._current_model = m
351	if self._cancel: return numpy.inf
352	if m.isfitted and m.weight != 0:
353	m.residuals = m.fitness.residuals()
354	N = len(m.residuals)
355	m.degrees_of_freedom = N-k if N>k else 1
356	m.chisq = numpy.sum(m.residuals**2)
357	resid.append(m.weight*m.residuals)
358	self.residuals = numpy.hstack(resid)
359	N = len(self.residuals)
360	self.degrees_of_freedom = N-k if N>k else 1
361	self.chisq = numpy.sum(self.residuals**2)
362	return self.chisq
363
364	def jacobian(self, pvec, step=1e-8):
365	"""
366	Returns the derivative wrt the fit parameters at point p.
367
368	Numeric derivatives are calculated based on step, where step is
369	the portion of the total range for parameter j, or the portion of
370	point value p_j if the range on parameter j is infinite.
371	"""
372	# Make sure the input vector is an array
373	pvec = numpy.asarray(pvec)
374	# We are being lazy here. We can precompute the bounds, we can
375	# use the residuals_deriv from the sub-models which have analytic
376	# derivatives and we need only recompute the models which depend
377	# on the varying parameters.
378	# Meanwhile, let's compute the numeric derivative using the
379	# three point formula.
380	# We are not checking that the varied parameter in numeric
381	# differentiation is indeed feasible in the interval of interest.
382	range = zip(*[p.range for p in self._fitparameters])
383	lo,hi = [numpy.asarray(v) for v in range]
384	delta = (hi-lo)*step
385	# For infinite ranges, use p*1e-8 for the step size
386	idx = numpy.isinf(delta)
387	#print "J",idx,delta,pvec,type(idx),type(delta),type(pvec)
388	delta[idx] = pvec[idx]*step
389	delta[delta==0] = step
390
391	# Set the initial value
392	for k,v in enumerate(pvec):
393	self._fitparameters[k].value = v
394	# Gather the residuals
395	r = []
396	for k,v in enumerate(pvec):
397	# Center point formula:
398	# df/dv = lim_{h->0} ( f(v+h)-f(v-h) ) / ( 2h )
399	h = delta[k]
400	self._fitparameters[k].value = v + h
401	self.eval()
402	rk = self.residuals
403	self._fitparameters[k].value = v - h
404	self.eval()
405	rk -= self.residuals
406	self._fitparameters[k].value = v
407	r.append(rk/(2*h))
408	# return the jacobian
409	return numpy.vstack(r).T
410
411
412	def cov(self, pvec):
413	"""
414	Return the covariance matrix inv(J'J) at point p.
415	"""
416
417	# Find cov of f at p
418	# cov(f,p) = inv(J'J)
419	# Use SVD
420	# J = U S V'
421	# J'J = (U S V')' (U S V')
422	# = V S' U' U S V'
423	# = V S S V'
424	# inv(J'J) = inv(V S S V')
425	# = inv(V') inv(S S) inv(V)
426	# = V inv (S S) V'
427	J = self.jacobian(pvec)
428	u,s,vh = numpy.linalg.svd(J,0)
429	JTJinv = numpy.dot(vh.T.conj()/s**2,vh)
430	return JTJinv
431
432	def stderr(self, pvec):
433	"""
434	Return parameter uncertainty.
435
436	This is just the sqrt diagonal of covariance matrix inv(J'J) at point p.
437	"""
438	return numpy.sqrt(numpy.diag(self.cov(pvec)))
439
440	def isfeasible(self):
441	"""
442	Returns true if the parameter set is in a feasible region of the
443	modeling space.
444	"""
445	return True
446
447	# Fitting service interface
448	def fit_parameters(self):
449	"""
450	Return an alphabetical list of the fitting parameters.
451
452	This function is called once at the beginning of a fit,
453	and serves as a convenient place to precalculate what
454	can be precalculated such as the set of fitting parameters
455	and the parameter expressions evaluator.
456	"""
457	self.parameterset.setprefix()
458	self._fitparameters = self.parameterset.fitted
459	self._restraints = self.parameterset.restrained
460	pars = self.parameterset.flatten()
461	context = self.parameterset.gather_context()
462	self._fitexpression = park.expression.build_eval(pars,context)
463	#print "constraints",self._fitexpression.__doc__
464
465	self._fitparameters.sort(lambda a,b: cmp(a.path,b.path))
466	# Convert to fitparameter a object
467	fitpars = [FitParameter(p.path,p.range,p.value)
468	for p in self._fitparameters]
469	return fitpars
470
471	def set_result(self, result):
472	"""
473	Set the parameters resulting from the fit into the parameter set,
474	and update the calculated expression.
475
476	The parameter values may be retrieved by walking the assembly.parameterset
477	tree, checking each parameter for isfitted, iscomputed, or isfixed.
478	For example::
479
480	assembly.set_result(result)
481	for p in assembly.parameterset.flatten():
482	if p.isfitted():
483	print "%s %g in [%g,%g]"%(p.path,p.value,p.range[0],p.range[1])
484	elif p.iscomputed():
485	print "%s computed as %g"%(p.path.p.value)
486
487	This does not calculate the function or the residuals for these parameters.
488	You can call assembly.eval() to do this. The residuals will be set in
489	assembly[i].residuals. The theory and data are model specific, and can
490	be found in assembly[i].fitness.data.
491	"""
492	for n,p in enumerate(result.parameters):
493	self._fitparameters[n] = p.value
494	self._fitexpression()
495
496	def all_results(self, result):
497	"""
498	Extend result from the fit with the calculated parameters.
499	"""
500	calcpars = [FitParameter(p.path,p.range,p.value)
501	for p in self.parameterset.computed]
502	result.parameters += calcpars
503
504	def result(self, status='step'):
505	"""
506	Details to send back to the fitting client on an improved fit.
507
508	status is 'start', 'step' or 'end' depending if this is the
509	first result to return, an improved result, or the final result.
510
511	[Not implemented]
512	"""
513	return None
514
515	def fresiduals(self, pvec):
516	chisq = self.__call__(pvec)
517	return self.residuals
518
519	def __call__(self, pvec):
520	"""
521	Cost function.
522
523	Evaluate the system for the parameter vector pvec, returning chisq
524	as the cost function to be minimized.
525
526	Raises a runtime error if the number of fit parameters is
527	different than the length of the vector.
528	"""
529	# Plug fit parameters into model
530	#print "Trying",pvec
531	pars = self._fitparameters
532	if len(pvec) != len(pars):
533	raise RuntimeError("Unexpected number of parameters")
534	for n,value in enumerate(pvec):
535	pars[n].value = value
536	# Evaluate model
537	chisq = self.eval()
538	# Evaluate additional restraints based on parameter value
539	# likelihood
540	restraints_penalty = 0
541	for p in self._restraints:
542	restraints_penalty += p.likelihood(p.value)
543	# Return total cost function
544	return self.chisq + restraints_penalty
545
546	def abort(self):
547	"""
548	Interrupt the current function evaluation.
549
550	Forward this to the currently executing model if possible.
551	"""
552	self._cancel = True
553	if hasattr(self._current_model,'abort'):
554	self._current_model.abort()
555
556	class _Exp(park.Model):
557	"""
558	Sample model for testing assembly.
559	"""
560	parameters = ['a','c']
561	def eval(self,x):
562	return self.anumpy.exp(self.cx)
563	class _Linear(park.Model):
564	parameters = ['a','c']
565	def eval(self,x):
566	#print "eval",self.a,self.c,x,self.a*x+self.c
567	return self.a*x+self.c
568	def example():
569	"""
570	Return an example assembly consisting of a pair of functions,
571	M1.aexp(M1.cx), M2.aexp(2M1.c*x)
572	and ideal data for
573	M1.a=1, M1.c=1.5, M2.a=2.5
574	"""
575	import numpy
576	import park
577	from numpy import inf
578	# Make some fake data
579	x1 = numpy.linspace(0,1,11)
580	x2 = numpy.linspace(0,1,12)
581	# Define a shared model
582	if True: # Exp model
583	y1,y2 = numpy.exp(1.5x1),2.5numpy.exp(3*x2)
584	M1 = _Exp('M1',a=[1,3],c=[1,3])
585	M2 = _Exp('M2',a=[1,3],c='2*M1.c')
586	#M2 = _Exp('M2',a=[1,3],c=3)
587	else: # Linear model
588	y1,y2 = x1+1.5, 2.5*x2+3
589	M1 = _Linear('M1',a=[1,3],c=[1,3])
590	M2 = _Linear('M2',a=[1,3],c='2*M1.c')
591	if False: # Unbounded
592	M1.a = [-inf,inf]
593	M1.c = [-inf,inf]
594	M2.a = [-inf,inf]
595	D1 = park.Data1D(x=x1, y=y1)
596	D2 = park.Data1D(x=x2, y=y2)
597	# Construct the assembly
598	assembly = park.Assembly([(M1,D1),(M2,D2)])
599	return assembly
600
601	class _Sphere(park.Model):
602	parameters = ['a','b','c','d','e']
603	def eval(self,x):
604	return self.ax2+self.bx+self.c + exp(self.d) - 3*sin(self.e)
605
606	def example5():
607	import numpy
608	import park
609	from numpy import inf
610	# Make some fake data
611	x = numpy.linspace(0,1,11)
612	# Define a shared model
613	S = _Sphere(a=1,b=2,c=3,d=4,e=5)
614	y = S.eval(x1)
615	Sfit = _Sphere(a=[-inf,inf],b=[-inf,inf],c=[-inf,inf],d=[-inf,inf],e=[-inf,inf])
616	D = park.Data1D(x=x, y=y)
617	# Construct the assembly
618	assembly = park.Assembly([(Sfit,D)])
619	return assembly
620
621	def test():
622	assembly = example()
623	assert assembly[0].parameterset.name == 'M1'
624
625	# extract the fitting parameters
626	pars = [p.name for p in assembly.fit_parameters()]
627	assert set(pars) == set(['M1.a','M1.c','M2.a'])
628	# Compute chisq and verify constraints are updated properly
629	assert assembly([1,1.5,2.5]) == 0
630	assert assembly[0].model.c == 1.5 and assembly[1].model.c == 3
631
632	# Try without constraints
633	assembly[1].set(c=3)
634	assembly.fit_parameters() # Fit parameters have changed
635	assert assembly([1,1.5,2.5]) == 0
636
637	# Check that assembly.cov runs ... still need to check that it is correct!
638	C = assembly.cov(numpy.array([1,1.5,2.5]))
639
640	if __name__ == "__main__": test()

Note: See TracBrowser for help on using the repository browser.

SasView

source: sasview/park-1.2.1/park/assembly.py @ 69032d4

Download in other formats: