assembly.py @ e63a411

ESS_GUIESS_GUI_DocsESS_GUI_batch_fittingESS_GUI_bumps_abstractionESS_GUI_iss1116ESS_GUI_iss879ESS_GUI_iss959ESS_GUI_openclESS_GUI_orderingESS_GUI_sync_sascalccostrafo411magnetic_scattrelease-4.1.1release-4.1.2release-4.2.2release_4.0.1ticket-1009ticket-1094-headlessticket-1242-2d-resolutionticket-1243ticket-1249ticket885unittest-saveload

Last change on this file since e63a411 was 3570545, checked in by Mathieu Doucet <doucetm@…>, 13 years ago
Adding park Part 2
Property mode set to `100644`
File size: 22.8 KB

Rev	Line
[3570545]	1	# This program is public domain
	2	"""
	3	An assembly is a collection of fitting functions. This provides
	4	the model representation that is the basis of the park fitting engine.
	5
	6	Models can range from very simple one dimensional theory functions
	7	to complex assemblies of multidimensional datasets from different
	8	experimental techniques, each with their own theory function and
	9	a common underlying physical model.
	10
	11	Usage
	12	=====
	13
	14	First define the models you want to work with. In the example
	15	below we will use an example of a simple multilayer system measured by
	16	specular reflection of xrays and neutrons. The gold depth is the only
	17	fitting parameter, ranging from 10-30 A. The interface depths are
	18	tied together using expressions. In this case the expression is
	19	a simple copy, but any standard math functions can be used. Some
	20	model developers may provide additional functions for use with the
	21	expression.
	22
	23	Example models::
	24
	25	import reflectometry.model1d as refl
	26	xray = refl.model('xray')
	27	xray.incident('Air',rho=0)
	28	xray.interface('iAu',sigma=5)
	29	xray.layer('Au',rho=124.68,depth=[10,30])
	30	xray.interface('iSi',sigma=5)
	31	xray.substrate('Si',rho=20.07)
	32	datax = refl.data('xray.dat')
	33
	34	neutron = refl.model('neutron')
	35	neutron.incident('Air',rho=0)
	36	neutron.interface('iAu',sigma='xray.iAu')
	37	neutron.layer('Au',rho=4.66,depth='xray.Au.depth')
	38	neutron.interface('iSi',sigma='xray.iSi')
	39	neutron.substrate('Si',rho=2.07)
	40	datan = refl.data('neutron.dat')
	41
	42	As you can see from the above, parameters can be set to a value if
	43	the parameter is fixed, to a range if the parametemr is fitted, or
	44	to a string expression if the parameter is calculated from other
	45	parameters. See park.Parameter.set for further details.
	46
	47	Having constructed the models, we can now create an assembly::
	48
	49	import park
	50	assembly = park.Assembly([(xray,datax), (neutron,datan)])
	51
	52	Note: this would normally be done in the context of a fit
	53	using fit = park.Fit([(xray,datax), (neutron,datan)]), and later referenced
	54	using fit.assembly.
	55
	56	Individual parts of the assembly are accessable using the
	57	model number 0, 1, 2... or by the model name. In the above,
	58	assembly[0] and assembly['xray'] refer to the same model.
	59	Assemblies have insert and append functions for adding new
	60	models, and "del model[idx]" for removing them.
	61
	62	Once the assembly is created computing the values for the system
	63	is a matter of calling::
	64
	65	assembly.eval()
	66	print "Chi**2",assembly.chisq
	67	print "Reduced chi**2",assembly.chisq/assembly.degrees_of_freedom
	68	plot(arange(len(assembly.residuals)), assembly.residuals)
	69
	70	This defines the attributes residuals, degrees_of_freedom and chisq,
	71	which is what the optimizer uses as the cost function to minimize.
	72
	73	assembly.eval uses the current values for the parameters in the
	74	individual models. These parameters can be changed directly
	75	in the model. In the reflectometry example above, you could
	76	set the gold thickness using xray.layer.Au.depth=156, or
	77	something similar (the details are model specific). Parameters
	78	can also be changed through the assembly parameter set. In the same
	79	example, this would be assembly.parameterset['xray']['Au']['depth'].
	80	See parameter set for details.
	81
	82	In the process of modeling data, particularly with multiple
	83	datasets, you will sometimes want to temporarily ignore
	84	how well one of the datasets matches so that you
	85	can more quickly refine the model for the other datasets,
	86	or see how particular models are influencing the fit. To
	87	temporarily ignore the xray data in the example above use::
	88
	89	assembly.parts[0].isfitted = False
	90
	91	The model itself isn't ignored since its parameters may be
	92	needed to compute the parameters for other models. To
	93	reenable checking against the xray data, you would assign
	94	a True value instead. More subtle weighting of the models
	95	can be controlled using assembly.parts[idx].weight, but
	96	see below for a note on model weighting.
	97
	98	A note on model weighting
	99	-------------------------
	100
	101	Changing the weight is equivalent to scaling the error bars
	102	on the given model by a factor of weight/n where n is the
	103	number of data points. It is better to set the correct error
	104	bars on your data in the first place than to adjust the weights.
	105	If you have the correct error bars, then you should expect
	106	roughly 2/3 of the data points to lie within one error bar of
	107	the theory curve. If consecutive data points have largely
	108	overlapping errorbars, then your uncertainty is overestimated.
	109
	110	Another case where weights are adjusted (abused?) is to
	111	compensate for systematic errors in the data by forcing the
	112	errorbars to be large enough to cover the systematic bias.
	113	This is a poor approach to the problem. A better strategy
	114	is to capture the systematic effects in the model, and treat
	115	the measurement of the independent variable as an additional
	116	data point in the fit. This is still not statistically sound
	117	as there is likely to be a large correlation between the
	118	uncertainty of the measurement and the values of all the
	119	other variables.
	120
	121	That said, adjusting the weight on a dataset is a quick way
	122	of reducing its influence on the entire fit. Please use it
	123	with care.
	124	"""
	125
	126	__all__ = ['Assembly', 'Fitness']
	127	import numpy
	128
	129	import park
	130	from park.parameter import Parameter,ParameterSet
	131	from park.fitresult import FitParameter
	132	import park.expression
	133
	134
	135
	136	class Fitness(object):
	137	"""
	138	Container for theory and data.
	139
	140	The fit object compares theory with data.
	141
	142	TODO: what to do with fittable metadata (e.g., footprint correction)?
	143	"""
	144	data = None
	145	model = None
	146	def __init__(self, model=None,data=None):
	147	self.data,self.model = data,model
	148	def _parameterset(self):
	149	return self.model.parameterset
	150	parameterset = property(_parameterset)
	151	def residuals(self):
	152	return self.data.residuals(self.model.eval)
	153	def residuals_deriv(self, pars=[]):
	154	return self.data.residuals_deriv(self.model.eval_derivs,pars=pars)
	155	def set(self, **kw):
	156	"""
	157	Set parameters in the model.
	158
	159	User convenience function. This allows a user with an assembly
	160	of models in a script to for example set the fit range for
	161	parameter 'a' of the model::
	162	assembly[0].set(a=[5,6])
	163
	164	Raises KeyError if the parameter is not in parameterset.
	165	"""
	166	self.model.set(**kw)
	167	def abort(self):
	168	if hasattr(self.model,'abort'): self.model.abort()
	169
	170	class Part(object):
	171	"""
	172	Part of a fitting assembly. Part holds the model itself and
	173	associated data. The part can be initialized with a fitness
	174	object or with a pair (model,data) for the default fitness function.
	175
	176	fitness (Fitness)
	177	object implementing the `park.assembly.Fitness` interface. In
	178	particular, fitness should provide a parameterset attribute
	179	containing a ParameterSet and a residuals method returning a vector
	180	of residuals.
	181	weight (dimensionless)
	182	weight for the model. See comments in assembly.py for details.
	183	isfitted (boolean)
	184	True if the model residuals should be included in the fit.
	185	The model parameters may still be used in parameter
	186	expressions, but there will be no comparison to the data.
	187	residuals (vector)
	188	Residuals for the model if they have been calculated, or None
	189	degrees_of_freedom
	190	Number of residuals minus number of fitted parameters.
	191	Degrees of freedom for individual models does not make
	192	sense in the presence of expressions combining models,
	193	particularly in the case where a model has many parameters
	194	but no data or many computed parameters. The degrees of
	195	freedom for the model is set to be at least one.
	196	chisq
	197	sum(residuals**2); use chisq/degrees_of_freedom to
	198	get the reduced chisq value.
	199
	200	Get/set the weight on the given model.
	201
	202	assembly.weight(3) returns the weight on model 3 (0-origin)
	203	assembly.weight(3,0.5) sets the weight on model 3 (0-origin)
	204	"""
	205
	206	def __init__(self, fitness, weight=1., isfitted=True):
	207	if isinstance(fitness, tuple):
	208	fitness = park.Fitness(*fitness)
	209	self.fitness = fitness
	210	self.weight = weight
	211	self.isfitted = isfitted
	212	self.residuals = None
	213	self.chisq = numpy.Inf
	214	self.degrees_of_freedom = 1
	215
	216
	217	class Assembly(object):
	218	"""
	219	Collection of fit models.
	220
	221	Assembly implements the `park.fit.Objective` interface.
	222
	223	See `park.assembly` for usage.
	224
	225	Instance variables:
	226
	227	residuals : array
	228	a vector of residuals spanning all models, with model
	229	weights applied as appropriate.
	230	degrees_of_freedom : integer
	231	length of the residuals - number of fitted parameters
	232	chisq : float
	233	sum squared residuals; this is not the reduced chisq, which
	234	you can get using chisq/degrees_of_freedom
	235
	236	These fields are defined for the individual models as well, with
	237	degrees of freedom adjusted to the length of the individual data
	238	set. If the model is not fitted or the weight is zero, the residual
	239	will not be calculated.
	240
	241	The residuals fields are available only after the model has been
	242	evaluated.
	243	"""
	244
	245	def __init__(self, models=[]):
	246	"""Build an assembly from a list of models."""
	247	self.parts = []
	248	for m in models:
	249	self.parts.append(Part(m))
	250	self._reset()
	251
	252	def __iter__(self):
	253	"""Iterate through the models in order"""
	254	for m in self.parts: yield m
	255
	256	def __getitem__(self, n):
	257	"""Return the nth model"""
	258	return self.parts[n].fitness
	259
	260	def __setitem__(self, n, fitness):
	261	"""Replace the nth model"""
	262	self.parts[n].fitness = fitness
	263	self._reset()
	264
	265	def __delitem__(self, n):
	266	"""Delete the nth model"""
	267	del self.parts[n]
	268	self._reset()
	269
	270	def weight(self, idx, value=None):
	271	"""
	272	Query the weight on a particular model.
	273
	274	Set weight to value if value is supplied.
	275
	276	:Parameters:
	277	idx : integer
	278	model number
	279	value : float
	280	model weight
	281	:return: model weight
	282	"""
	283	if value is not None:
	284	self.parts[idx].weight = value
	285	return self.parts[idx].weight
	286
	287	def isfitted(self, idx, value=None):
	288	"""
	289	Query if a particular model is fitted.
	290
	291	Set isfitted to value if value is supplied.
	292
	293	:param idx: model number
	294	:type idx: integer
	295	:param value:
	296	"""
	297	if value is not None:
	298	self.parts[idx].isfitted = value
	299	return self.parts[idx].isfitted
	300
	301	def append(self, fitness, weight=1.0, isfitted=True):
	302	"""
	303	Add a model to the end of set.
	304
	305	:param fitness: the fitting model
	306	The fitting model can be an instance of `park.assembly.Fitness`,
	307	or a tuple of (`park.model.Model`,`park.data.Data1D`)
	308	:param weight: model weighting (usually 1.0)
	309	:param isfitted: whether model should be fit (equivalent to weight 0.)
	310	"""
	311	self.parts.append(Part(fitness,weight,isfitted))
	312	self._reset()
	313
	314	def insert(self, idx, fitness, weight=1.0, isfitted=True):
	315	"""Add a model to a particular position in the set."""
	316	self.parts.insert(idx,Part(fitness,weight,isfitted))
	317	self._reset()
	318
	319	def _reset(self):
	320	"""Adjust the parameter set after the addition of a new model."""
	321	subsets = [m.fitness.parameterset for m in self]
	322	self.parameterset = ParameterSet('root',subsets)
	323	self.parameterset.setprefix()
	324	#print [p.path for p in self.parameterset.flatten()]
	325
	326	def eval(self):
	327	"""
	328	Recalculate the theory functions, and from them, the
	329	residuals and chisq.
	330
	331	:note: Call this after the parameters have been updated.
	332	"""
	333	# Handle abort from a separate thread.
	334	self._cancel = False
	335
	336	# Evaluate the computed parameters
	337	self._fitexpression()
	338
	339	# Check that the resulting parameters are in a feasible region.
	340	if not self.isfeasible(): return numpy.inf
	341
	342	resid = []
	343	k = len(self._fitparameters)
	344	for m in self.parts:
	345	# In order to support abort, need to be able to propagate an
	346	# external abort signal from self.abort() into an abort signal
	347	# for the particular model. Can't see a way to do this which
	348	# doesn't involve setting a state variable.
	349	self._current_model = m
	350	if self._cancel: return numpy.inf
	351	if m.isfitted and m.weight != 0:
	352	m.residuals = m.fitness.residuals()
	353	N = len(m.residuals)
	354	m.degrees_of_freedom = N-k if N>k else 1
	355	m.chisq = numpy.sum(m.residuals**2)
	356	resid.append(m.weight*m.residuals)
	357	self.residuals = numpy.hstack(resid)
	358	N = len(self.residuals)
	359	self.degrees_of_freedom = N-k if N>k else 1
	360	self.chisq = numpy.sum(self.residuals**2)
	361	return self.chisq
	362
	363	def jacobian(self, pvec, step=1e-8):
	364	"""
	365	Returns the derivative wrt the fit parameters at point p.
	366
	367	Numeric derivatives are calculated based on step, where step is
	368	the portion of the total range for parameter j, or the portion of
	369	point value p_j if the range on parameter j is infinite.
	370	"""
	371	# Make sure the input vector is an array
	372	pvec = numpy.asarray(pvec)
	373	# We are being lazy here. We can precompute the bounds, we can
	374	# use the residuals_deriv from the sub-models which have analytic
	375	# derivatives and we need only recompute the models which depend
	376	# on the varying parameters.
	377	# Meanwhile, let's compute the numeric derivative using the
	378	# three point formula.
	379	# We are not checking that the varied parameter in numeric
	380	# differentiation is indeed feasible in the interval of interest.
	381	range = zip(*[p.range for p in self._fitparameters])
	382	lo,hi = [numpy.asarray(v) for v in range]
	383	delta = (hi-lo)*step
	384	# For infinite ranges, use p*1e-8 for the step size
	385	idx = numpy.isinf(delta)
	386	#print "J",idx,delta,pvec,type(idx),type(delta),type(pvec)
	387	delta[idx] = pvec[idx]*step
	388	delta[delta==0] = step
	389
	390	# Set the initial value
	391	for k,v in enumerate(pvec):
	392	self._fitparameters[k].value = v
	393	# Gather the residuals
	394	r = []
	395	for k,v in enumerate(pvec):
	396	# Center point formula:
	397	# df/dv = lim_{h->0} ( f(v+h)-f(v-h) ) / ( 2h )
	398	h = delta[k]
	399	self._fitparameters[k].value = v + h
	400	self.eval()
	401	rk = self.residuals
	402	self._fitparameters[k].value = v - h
	403	self.eval()
	404	rk -= self.residuals
	405	self._fitparameters[k].value = v
	406	r.append(rk/(2*h))
	407	# return the jacobian
	408	return numpy.vstack(r).T
	409
	410
	411	def cov(self, pvec):
	412	"""
	413	Return the covariance matrix inv(J'J) at point p.
	414	"""
	415
	416	# Find cov of f at p
	417	# cov(f,p) = inv(J'J)
	418	# Use SVD
	419	# J = U S V'
	420	# J'J = (U S V')' (U S V')
	421	# = V S' U' U S V'
	422	# = V S S V'
	423	# inv(J'J) = inv(V S S V')
	424	# = inv(V') inv(S S) inv(V)
	425	# = V inv (S S) V'
	426	J = self.jacobian(pvec)
	427	u,s,vh = numpy.linalg.svd(J,0)
	428	JTJinv = numpy.dot(vh.T.conj()/s**2,vh)
	429	return JTJinv
	430
	431	def stderr(self, pvec):
	432	"""
	433	Return parameter uncertainty.
	434
	435	This is just the sqrt diagonal of covariance matrix inv(J'J) at point p.
	436	"""
	437	return numpy.sqrt(numpy.diag(self.cov(pvec)))
	438
	439	def isfeasible(self):
	440	"""
	441	Returns true if the parameter set is in a feasible region of the
	442	modeling space.
	443	"""
	444	return True
	445
	446	# Fitting service interface
	447	def fit_parameters(self):
	448	"""
	449	Return an alphabetical list of the fitting parameters.
	450
	451	This function is called once at the beginning of a fit,
	452	and serves as a convenient place to precalculate what
	453	can be precalculated such as the set of fitting parameters
	454	and the parameter expressions evaluator.
	455	"""
	456	self.parameterset.setprefix()
	457	self._fitparameters = self.parameterset.fitted
	458	self._restraints = self.parameterset.restrained
	459	pars = self.parameterset.flatten()
	460	context = self.parameterset.gather_context()
	461	self._fitexpression = park.expression.build_eval(pars,context)
	462	#print "constraints",self._fitexpression.__doc__
	463
	464	self._fitparameters.sort(lambda a,b: cmp(a.path,b.path))
	465	# Convert to fitparameter a object
	466	fitpars = [FitParameter(p.path,p.range,p.value)
	467	for p in self._fitparameters]
	468	return fitpars
	469
	470	def set_result(self, result):
	471	"""
	472	Set the parameters resulting from the fit into the parameter set,
	473	and update the calculated expression.
	474
	475	The parameter values may be retrieved by walking the assembly.parameterset
	476	tree, checking each parameter for isfitted, iscomputed, or isfixed.
	477	For example::
	478
	479	assembly.set_result(result)
	480	for p in assembly.parameterset.flatten():
	481	if p.isfitted():
	482	print "%s %g in [%g,%g]"%(p.path,p.value,p.range[0],p.range[1])
	483	elif p.iscomputed():
	484	print "%s computed as %g"%(p.path.p.value)
	485
	486	This does not calculate the function or the residuals for these parameters.
	487	You can call assembly.eval() to do this. The residuals will be set in
	488	assembly[i].residuals. The theory and data are model specific, and can
	489	be found in assembly[i].fitness.data.
	490	"""
	491	for n,p in enumerate(result.parameters):
	492	self._fitparameters[n] = p.value
	493	self._fitexpression()
	494
	495	def all_results(self, result):
	496	"""
	497	Extend result from the fit with the calculated parameters.
	498	"""
	499	calcpars = [FitParameter(p.path,p.range,p.value)
	500	for p in self.parameterset.computed]
	501	result.parameters += calcpars
	502
	503	def result(self, status='step'):
	504	"""
	505	Details to send back to the fitting client on an improved fit.
	506
	507	status is 'start', 'step' or 'end' depending if this is the
	508	first result to return, an improved result, or the final result.
	509
	510	[Not implemented]
	511	"""
	512	return None
	513
	514	def fresiduals(self, pvec):
	515	chisq = self.__call__(pvec)
	516	return self.residuals
	517
	518	def __call__(self, pvec):
	519	"""
	520	Cost function.
	521
	522	Evaluate the system for the parameter vector pvec, returning chisq
	523	as the cost function to be minimized.
	524
	525	Raises a runtime error if the number of fit parameters is
	526	different than the length of the vector.
	527	"""
	528	# Plug fit parameters into model
	529	#print "Trying",pvec
	530	pars = self._fitparameters
	531	if len(pvec) != len(pars):
	532	raise RuntimeError("Unexpected number of parameters")
	533	for n,value in enumerate(pvec):
	534	pars[n].value = value
	535	# Evaluate model
	536	chisq = self.eval()
	537	# Evaluate additional restraints based on parameter value
	538	# likelihood
	539	restraints_penalty = 0
	540	for p in self._restraints:
	541	restraints_penalty += p.likelihood(p.value)
	542	# Return total cost function
	543	return self.chisq + restraints_penalty
	544
	545	def abort(self):
	546	"""
	547	Interrupt the current function evaluation.
	548
	549	Forward this to the currently executing model if possible.
	550	"""
	551	self._cancel = True
	552	if hasattr(self._current_model,'abort'):
	553	self._current_model.abort()
	554
	555	class _Exp(park.Model):
	556	"""
	557	Sample model for testing assembly.
	558	"""
	559	parameters = ['a','c']
	560	def eval(self,x):
	561	return self.anumpy.exp(self.cx)
	562	class _Linear(park.Model):
	563	parameters = ['a','c']
	564	def eval(self,x):
	565	#print "eval",self.a,self.c,x,self.a*x+self.c
	566	return self.a*x+self.c
	567	def example():
	568	"""
	569	Return an example assembly consisting of a pair of functions,
	570	M1.aexp(M1.cx), M2.aexp(2M1.c*x)
	571	and ideal data for
	572	M1.a=1, M1.c=1.5, M2.a=2.5
	573	"""
	574	import numpy
	575	import park
	576	from numpy import inf
	577	# Make some fake data
	578	x1 = numpy.linspace(0,1,11)
	579	x2 = numpy.linspace(0,1,12)
	580	# Define a shared model
	581	if True: # Exp model
	582	y1,y2 = numpy.exp(1.5x1),2.5numpy.exp(3*x2)
	583	M1 = _Exp('M1',a=[1,3],c=[1,3])
	584	M2 = _Exp('M2',a=[1,3],c='2*M1.c')
	585	#M2 = _Exp('M2',a=[1,3],c=3)
	586	else: # Linear model
	587	y1,y2 = x1+1.5, 2.5*x2+3
	588	M1 = _Linear('M1',a=[1,3],c=[1,3])
	589	M2 = _Linear('M2',a=[1,3],c='2*M1.c')
	590	if False: # Unbounded
	591	M1.a = [-inf,inf]
	592	M1.c = [-inf,inf]
	593	M2.a = [-inf,inf]
	594	D1 = park.Data1D(x=x1, y=y1)
	595	D2 = park.Data1D(x=x2, y=y2)
	596	# Construct the assembly
	597	assembly = park.Assembly([(M1,D1),(M2,D2)])
	598	return assembly
	599
	600	class _Sphere(park.Model):
	601	parameters = ['a','b','c','d','e']
	602	def eval(self,x):
	603	return self.ax2+self.bx+self.c + exp(self.d) - 3*sin(self.e)
	604
	605	def example5():
	606	import numpy
	607	import park
	608	from numpy import inf
	609	# Make some fake data
	610	x = numpy.linspace(0,1,11)
	611	# Define a shared model
	612	S = _Sphere(a=1,b=2,c=3,d=4,e=5)
	613	y = S.eval(x1)
	614	Sfit = _Sphere(a=[-inf,inf],b=[-inf,inf],c=[-inf,inf],d=[-inf,inf],e=[-inf,inf])
	615	D = park.Data1D(x=x, y=y)
	616	# Construct the assembly
	617	assembly = park.Assembly([(Sfit,D)])
	618	return assembly
	619
	620	def test():
	621	assembly = example()
	622	assert assembly[0].parameterset.name == 'M1'
	623
	624	# extract the fitting parameters
	625	pars = [p.name for p in assembly.fit_parameters()]
	626	assert set(pars) == set(['M1.a','M1.c','M2.a'])
	627	# Compute chisq and verify constraints are updated properly
	628	assert assembly([1,1.5,2.5]) == 0
	629	assert assembly[0].model.c == 1.5 and assembly[1].model.c == 3
	630
	631	# Try without constraints
	632	assembly[1].set(c=3)
	633	assembly.fit_parameters() # Fit parameters have changed
	634	assert assembly([1,1.5,2.5]) == 0
	635
	636	# Check that assembly.cov runs ... still need to check that it is correct!
	637	C = assembly.cov(numpy.array([1,1.5,2.5]))
	638
	639	if __name__ == "__main__": test()

Note: See TracBrowser for help on using the repository browser.

SasView

source: sasview/park-1.2.1/park/assembly.py @ e63a411

Download in other formats: