.. pd_help.rst

.. This is a port of the original SasView html help file to ReSTructured text
.. by S King, ISIS, during SasView CodeCamp-III in Feb 2015.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

.. _polydispersityhelp:

Polydispersity Distributions
----------------------------

With some models in sasmodels we can calculate the average intensity for a
population of particles that exhibit size and/or orientational
polydispersity. The resultant intensity is normalized by the average
particle volume such that

.. math::

  P(q) = \text{scale} \langle F^* F \rangle / V + \text{background}

where $F$ is the scattering amplitude and $\langle\cdot\rangle$ denotes an
average over the size distribution.
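
As an illustration of this normalization, the following minimal sketch
computes the number-averaged intensity of homogeneous spheres for a small set
of radii and weights. It is not the sasmodels implementation: the helper names
*sphere_volume*, *sphere_amplitude* and *average_intensity* are invented for
this example, and the scattering contrast is assumed to be unity::

    from math import sin, cos, pi

    def sphere_volume(r):
        """Volume of a sphere of radius r."""
        return 4.0 / 3.0 * pi * r**3

    def sphere_amplitude(q, r):
        """Scattering amplitude F(q) of a sphere with unit contrast."""
        qr = q * r
        if qr == 0.0:
            return sphere_volume(r)  # limit of F as q*r -> 0
        return sphere_volume(r) * 3.0 * (sin(qr) - qr * cos(qr)) / qr**3

    def average_intensity(q, radii, weights, scale=1.0, background=0.0):
        """Return scale * <F^2> / <V> + background for number weights."""
        norm = sum(weights)
        mean_f2 = sum(w * sphere_amplitude(q, r)**2
                      for r, w in zip(radii, weights)) / norm
        mean_v = sum(w * sphere_volume(r)
                     for r, w in zip(radii, weights)) / norm
        return scale * mean_f2 / mean_v + background

    # Five radii around 60 with hand-picked (hypothetical) weights
    radii = [54.0, 57.0, 60.0, 63.0, 66.0]
    weights = [0.1, 0.2, 0.4, 0.2, 0.1]
    intensity = average_intensity(0.01, radii, weights)

Because both averages are divided by the same sum of weights, the weights do
not need to be normalized beforehand.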

Each distribution is characterized by a center value $\bar x$ or
$x_\text{med}$, a width parameter $\sigma$ (note this is *not necessarily*
the standard deviation, so read the description carefully), the number of
sigmas $N_\sigma$ to include from the tails of the distribution, and the
number of points used to compute the average. The center of the distribution
is set by the value of the model parameter. The meaning of a polydispersity
parameter *PD* (not to be confused with a molecular weight distribution
in polymer science) in a model depends on the type of parameter it is being
applied to.

The distribution width applied to *volume* (i.e., shape-describing) parameters
is relative to the center value such that $\sigma = \mathrm{PD} \cdot \bar x$.
However, the distribution width applied to *orientation* (i.e.,
angle-describing) parameters is just $\sigma = \mathrm{PD}$.
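
The two conventions can be summarized in a few lines of Python. This is only
a sketch: the function name *distribution_width* is hypothetical, and the
*relative* flag is used here simply to mark whether the parameter is a volume
parameter or an orientation parameter::

    def distribution_width(pd, center, relative=True):
        """Return sigma for a polydispersity PD about the given center."""
        # Volume parameters: sigma scales with the center value.
        # Orientation parameters: sigma is PD itself.
        return pd * center if relative else pd

    # radius = 60 with PD = 0.1 gives sigma = 0.1 * 60
    sigma_radius = distribution_width(0.1, 60.0, relative=True)
    # an angle centered on 30 with PD = 15 gives sigma = 15
    sigma_theta = distribution_width(15.0, 30.0, relative=False)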

$N_\sigma$ determines how far into the tails to evaluate the distribution,
with larger values of $N_\sigma$ required for heavier tailed distributions.
The scattering in general falls rapidly with $qr$ so the usual assumption
that $G(r - 3\sigma_r)$ is tiny and therefore $f(r - 3\sigma_r)G(r - 3\sigma_r)$
will not contribute much to the average may not hold when particles are large.
This, too, will require increasing $N_\sigma$.
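
To see why heavier tails need a larger $N_\sigma$, the sketch below compares
the fraction of the distribution lying within $\pm N_\sigma \sigma$ of the
center for a Gaussian and for a two-sided exponential of the form
$\exp(-|x - \bar x|/\sigma)$. The function names are made up for this
example, and the calculation deliberately ignores the additional effect of
the scattering growing with particle size::

    from math import erf, exp, sqrt

    def gaussian_coverage(nsigmas):
        """Fraction of a Gaussian within +/- nsigmas * sigma of the center."""
        return erf(nsigmas / sqrt(2.0))

    def exponential_coverage(nsigmas):
        """Same fraction for weights exp(-|x - center| / sigma)."""
        return 1.0 - exp(-nsigmas)

    # (nsigmas, Gaussian fraction, exponential fraction)
    coverage = [(n, gaussian_coverage(n), exponential_coverage(n))
                for n in (3, 5, 8)]

At the same $N_\sigma$ the exponential form always leaves more of the
distribution outside the evaluated range than the Gaussian does.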

Users should note that the averaging computation is very intensive. Applying
polydispersion to multiple parameters at the same time or increasing the
number of points in the distribution will require patience! However, the
calculations are generally more robust with more data points or more angles.

The following distribution functions are provided:

*  *Uniform Distribution*
*  *Rectangular Distribution*
*  *Gaussian Distribution*
*  *Boltzmann Distribution*
*  *Lognormal Distribution*
*  *Schulz Distribution*
*  *Array Distribution*
*  *User-defined Distributions*

These are all implemented as *number-average* distributions.


Suggested Applications
^^^^^^^^^^^^^^^^^^^^^^

If applying polydispersion to parameters describing particle sizes, use
the Lognormal or Schulz distributions.

If applying polydispersion to parameters describing interfacial thicknesses
or angular orientations, use the Gaussian or Boltzmann distributions.

If applying polydispersion to parameters describing angles, use the Uniform
distribution. Beware of using distributions that are always positive (e.g., the
Lognormal) because angles can be negative!

The array distribution provides a very simple means of implementing a
user-defined distribution, but without any fittable parameters. Greater
flexibility is conferred by the user-defined distribution.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Uniform Distribution
^^^^^^^^^^^^^^^^^^^^

The Uniform Distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}}
    \begin{cases}
        1 & \text{for } |x - \bar x| \leq \sigma \\
        0 & \text{for } |x - \bar x| > \sigma
    \end{cases}

where $\bar x$ ($x_\text{mean}$ in the figure) is the mean of the
distribution, $\sigma$ is the half-width, and *Norm* is a normalization
factor which is determined during the numerical calculation.

The polydispersity in sasmodels is given by

.. math:: \text{PD} = \sigma / \bar x

.. figure:: pd_uniform.jpg

    Uniform distribution.

The value $N_\sigma$ is ignored for this distribution.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Rectangular Distribution
^^^^^^^^^^^^^^^^^^^^^^^^

The Rectangular Distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}}
    \begin{cases}
        1 & \text{for } |x - \bar x| \leq w \\
        0 & \text{for } |x - \bar x| > w
    \end{cases}

where $\bar x$ ($x_\text{mean}$ in the figure) is the mean of the
distribution, $w$ is the half-width, and *Norm* is a normalization
factor which is determined during the numerical calculation.

Note that the standard deviation and the half width $w$ are different!

The standard deviation is

.. math:: \sigma = w / \sqrt{3}

whilst the polydispersity in sasmodels is given by

.. math:: \text{PD} = \sigma / \bar x
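
The relation between the half-width and the standard deviation can be checked
numerically. The following sketch samples a flat distribution on an even grid
and compares its standard deviation with $w / \sqrt{3}$; the function name
*uniform_std* and the values of $\bar x$ and $w$ are chosen only for
illustration::

    from math import sqrt

    def uniform_std(center, half_width, npts=20001):
        """Standard deviation of a flat distribution sampled on a grid."""
        step = 2.0 * half_width / (npts - 1)
        xs = [center - half_width + i * step for i in range(npts)]
        mean = sum(xs) / npts
        return sqrt(sum((x - mean)**2 for x in xs) / npts)

    center, w = 60.0, 12.0
    sigma_numeric = uniform_std(center, w)  # approaches w / sqrt(3) as npts grows
    sigma_formula = w / sqrt(3.0)
    pd = sigma_formula / center             # polydispersity for this half-width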

.. figure:: pd_rectangular.jpg

    Rectangular distribution.

.. note:: The Rectangular Distribution is deprecated in favour of the
            Uniform Distribution above and is described here for backwards
            compatibility with earlier versions of SasView only.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Gaussian Distribution
^^^^^^^^^^^^^^^^^^^^^

The Gaussian Distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}}
            \exp\left(-\frac{(x - \bar x)^2}{2\sigma^2}\right)

where $\bar x$ ($x_\text{mean}$ in the figure) is the mean of the
distribution and *Norm* is a normalization factor which is determined
during the numerical calculation.

The polydispersity in sasmodels is given by

.. math:: \text{PD} = \sigma / \bar x

.. figure:: pd_gaussian.jpg

    Normal distribution.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Boltzmann Distribution
^^^^^^^^^^^^^^^^^^^^^^

The Boltzmann Distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}}
            \exp\left(-\frac{ | x - \bar x | }{\sigma}\right)

where $\bar x$ ($x_\text{mean}$ in the figure) is the mean of the
distribution and *Norm* is a normalization factor which is determined
during the numerical calculation.

The width is defined as

.. math:: \sigma=\frac{k T}{E}

which is the inverse Boltzmann factor, where $k$ is the Boltzmann constant,
$T$ the temperature in Kelvin and $E$ a characteristic energy per particle.

.. figure:: pd_boltzmann.jpg

    Boltzmann distribution.

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Lognormal Distribution
^^^^^^^^^^^^^^^^^^^^^^

The Lognormal Distribution describes a function of $x$ where $\ln (x)$ has
a normal distribution. The result is a distribution that is skewed towards
larger values of $x$.

The Lognormal Distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}}\frac{1}{x\sigma}
            \exp\left(-\frac{1}{2}
                        \bigg(\frac{\ln(x) - \mu}{\sigma}\bigg)^2\right)

where *Norm* is a normalization factor which will be determined during
the numerical calculation, $\mu=\ln(x_\text{med})$ and $x_\text{med}$
is the *median* value of the *lognormal* distribution, but $\sigma$ is
a parameter describing the width of the underlying *normal* distribution.

$x_\text{med}$ will be the value given for the respective size parameter
in sasmodels, for example, *radius=60*.

The polydispersity in sasmodels is given by

.. math:: \text{PD} = \sigma = p / x_\text{med}

The mean value of the distribution is given by $\bar x = \exp(\mu+ \sigma^2/2)$
and the peak value by $\max x = \exp(\mu - \sigma^2)$.

The variance (the square of the standard deviation) of the *lognormal*
distribution is given by

.. math::

    \nu = [\exp({\sigma}^2) - 1] \exp({2\mu + \sigma^2})
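
These expressions are straightforward to evaluate. The sketch below does so
for a given median and width; the function name *lognormal_stats* is
hypothetical, and the values *radius=60* and PD=0.1 are used purely as an
example::

    from math import exp, log

    def lognormal_stats(x_med, sigma):
        """Return (mean, peak position, variance) of a lognormal."""
        mu = log(x_med)
        mean = exp(mu + sigma**2 / 2.0)
        peak = exp(mu - sigma**2)
        variance = (exp(sigma**2) - 1.0) * exp(2.0 * mu + sigma**2)
        return mean, peak, variance

    mean, peak, variance = lognormal_stats(60.0, 0.1)

For any non-zero width the peak lies below the median and the mean lies above
it, which is the skew towards larger $x$ described above.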

Note that larger values of PD might need a larger number of points
and $N_\sigma$.

.. figure:: pd_lognormal.jpg

    Lognormal distribution for PD=0.1.

For further information on the Lognormal distribution see:
http://en.wikipedia.org/wiki/Log-normal_distribution and
http://mathworld.wolfram.com/LogNormalDistribution.html

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Schulz Distribution
^^^^^^^^^^^^^^^^^^^

The Schulz (sometimes written Schultz) distribution is similar to the
Lognormal distribution in that it is also skewed towards larger values of
$x$, but it has computational advantages over the Lognormal distribution.

The Schulz distribution is defined as

.. math::

    f(x) = \frac{1}{\text{Norm}} (z+1)^{z+1}(x/\bar x)^z
            \frac{\exp[-(z+1)x/\bar x]}{\bar x\Gamma(z+1)}

where $\bar x$ ($x_\text{mean}$ in the figure) is the mean of the
distribution, *Norm* is a normalization factor which is determined
during the numerical calculation, and $z$ is a measure of the width
of the distribution such that

.. math:: z = (1-p^2) / p^2

where $p$ is the polydispersity in sasmodels given by

.. math:: PD = p = \sigma / \bar x

and $\sigma$ is the RMS deviation from $\bar x$.
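
As a numerical check of these definitions, the following sketch evaluates the
Schulz distribution for a chosen mean and PD and recovers both values by
simple midpoint integration. The function names and the integration grid are
assumptions made for this example, and the logarithm of the gamma function is
used so that large values of $z$ (small PD) do not overflow::

    from math import exp, lgamma, log, sqrt

    def schulz_z(pd):
        """Width parameter z for a given polydispersity PD."""
        return (1.0 - pd**2) / pd**2

    def schulz_pdf(x, mean, pd):
        """Schulz distribution f(x) with Norm = 1, evaluated via logarithms."""
        z = schulz_z(pd)
        log_f = ((z + 1.0) * log(z + 1.0) + z * log(x / mean)
                 - (z + 1.0) * x / mean - log(mean) - lgamma(z + 1.0))
        return exp(log_f)

    mean, pd = 60.0, 0.2
    step = 0.01
    xs = [(i + 0.5) * step for i in range(30000)]  # midpoints on 0 < x < 300
    fs = [schulz_pdf(x, mean, pd) for x in xs]

    norm = sum(fs) * step
    mean_numeric = sum(x * f for x, f in zip(xs, fs)) * step / norm
    var_numeric = sum((x - mean_numeric)**2 * f
                      for x, f in zip(xs, fs)) * step / norm
    pd_numeric = sqrt(var_numeric) / mean_numeric  # RMS deviation / mean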

Note that larger values of PD might need a larger number of points
and $N_\sigma$. For example, for PD=0.7 with radius=60 |Ang|, at least
Npts>=160 and Nsigmas>=15 are required.

.. figure:: pd_schulz.jpg

    Schulz distribution.

For further information on the Schulz distribution see:
M Kotlarchyk & S-H Chen, *J Chem Phys*, (1983), 79, 2461 and
M Kotlarchyk, RB Stephens, and JS Huang, *J Phys Chem*, (1988), 92, 1533

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Array Distribution
^^^^^^^^^^^^^^^^^^

This user-definable distribution should be given as a simple ASCII text
file where the array is defined by two columns of numbers: $x$ and $f(x)$.
The $f(x)$ will be normalized to 1 during the computation.

Example of what an array distribution file should look like:

====  =====
 30    0.1
 32    0.3
 35    0.4
 36    0.5
 37    0.6
 39    0.7
 41    0.9
====  =====

Only these array values are used in the computation, therefore the parameter
value given for the model will have no effect, and will be ignored when
computing the average.  This means that any parameter with an array
distribution will not be fittable.
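
A file of this form can be written and checked with a few lines of Python.
The sketch below is only illustrative: the file name and the
*read_array_distribution* helper are hypothetical, and "normalized to 1" is
taken here to mean that the weights are rescaled to sum to one::

    import os
    import tempfile

    def read_array_distribution(path):
        """Read two columns, x and f(x), and rescale f(x) to sum to one."""
        xs, fs = [], []
        with open(path) as handle:
            for line in handle:
                fields = line.split()
                if len(fields) != 2:
                    continue  # skip blank or malformed lines
                xs.append(float(fields[0]))
                fs.append(float(fields[1]))
        total = sum(fs)
        return xs, [f / total for f in fs]

    # The example values from the table above
    rows = [(30, 0.1), (32, 0.3), (35, 0.4), (36, 0.5),
            (37, 0.6), (39, 0.7), (41, 0.9)]
    path = os.path.join(tempfile.mkdtemp(), "radius_distribution.txt")
    with open(path, "w") as handle:
        for x_value, f_value in rows:
            handle.write("%g %g\n" % (x_value, f_value))

    x, weights = read_array_distribution(path)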

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

User-defined Distributions
^^^^^^^^^^^^^^^^^^^^^^^^^^

You can also define your own distribution by creating a Python file defining a
*Dispersion* class with a *_weights* method.  The *_weights* method takes
*center*, *sigma*, *lb* and *ub* as arguments, and can access *self.npts*
and *self.nsigmas* from the distribution.  They are interpreted as follows:

* *center* the value of the shape parameter (for size dispersity) or zero
  if it is an angular dispersity.  This parameter may be fitted.

* *sigma* the width of the distribution, which is the polydispersity parameter
  times the center for size dispersity, or the polydispersity parameter alone
  for angular dispersity.  This parameter may be fitted.

* *lb*, *ub* are the parameter limits (lower & upper bounds) given in the model
  definition file.  For example, a radius parameter has *lb* equal to zero.  A
  volume fraction parameter would have *lb* equal to zero and *ub* equal to one.

* *self.nsigmas* the distance to go into the tails when evaluating the
  distribution.  For a two parameter distribution, this value could be
  co-opted to use for the second parameter, though it will not be available
  for fitting.

* *self.npts* the number of points to use when evaluating the distribution.
  The user will adjust this to trade calculation time for accuracy, but the
  distribution code is free to return more or fewer, or use it for the third
  parameter in a three parameter distribution.

As an example, the following code wraps the Laplace distribution from scipy.stats::

    import numpy as np
    from scipy.stats import laplace

    from sasmodels import weights

    class Dispersion(weights.Dispersion):
        r"""
        Laplace distribution

        .. math::

            w(x) = e^{-|x - \mu| / \sigma}
        """
        type = "laplace"
        default = dict(npts=35, width=0, nsigmas=3)  # default values
        def _weights(self, center, sigma, lb, ub):
            x = self._linspace(center, sigma, lb, ub)
            wx = laplace.pdf(x, center, sigma)
            return x, wx

You can plot the weights for a given value and width using the following::

    from numpy import inf
    from matplotlib import pyplot as plt
    from sasmodels import weights

    # reload the user-defined weights
    weights.load_weights()
    x, wx = weights.get_weights('laplace', n=35, width=0.1, nsigmas=3, value=50,
                                limits=[0, inf], relative=True)

    # plot the weights
    plt.interactive(True)
    plt.plot(x, wx, 'x')

The *self.nsigmas* and *self.npts* parameters are normally used to control
the accuracy of the distribution integral. The *self._linspace* function
uses them to define the *x* values (along with the *center*, *sigma*,
*lb*, and *ub* which are passed as parameters).  If you repurpose npts or
nsigmas you will need to generate your own *x*.  Be sure to honour the
limits *lb* and *ub*, for example to disallow a negative radius or constrain
the volume fraction to lie between zero and one.
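
The following standalone sketch shows one way of generating the *x* values by
hand while respecting the limits. It does not use sasmodels, and it is not a
description of what *self._linspace* does internally; the function name
*laplace_weights* and the choice of an evenly spaced grid are assumptions
made for this example::

    from math import exp, inf

    def laplace_weights(center, sigma, lb, ub, npts=35, nsigmas=3):
        """Return x values and unnormalized weights, clipped to [lb, ub]."""
        lo = max(center - nsigmas * sigma, lb)
        hi = min(center + nsigmas * sigma, ub)
        if npts < 2 or hi <= lo:
            return [center], [1.0]  # no width: a single point at the center
        step = (hi - lo) / (npts - 1)
        x = [lo + i * step for i in range(npts)]
        wx = [exp(-abs(xi - center) / sigma) for xi in x]
        return x, wx

    # A radius of 10 with sigma = 5 would reach -5 at three sigma;
    # the lower bound of zero truncates the grid instead.
    x, wx = laplace_weights(center=10.0, sigma=5.0, lb=0.0, ub=inf)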

To activate a user-defined distribution, set the following environment variable::

    SASMODELS_WEIGHTS=path/to/folder/name_of_distribution.py

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

Note about DLS polydispersity
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many commercial Dynamic Light Scattering (DLS) instruments produce a size
polydispersity parameter, sometimes even given the symbol $p$\ ! This
parameter is defined as the relative standard deviation (coefficient of
variation) of the size distribution and is NOT the same as the polydispersity
parameters in the Lognormal and Schulz distributions above (though they are
all related) except when the DLS polydispersity parameter is <0.13.

.. math::

    p_\text{DLS} = \sqrt{\nu / \bar x^2}

where $\nu$ is the variance of the distribution and $\bar x$ is the mean
value of $x$.
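
For the Lognormal distribution this can be made concrete. Substituting the
mean and variance given in the Lognormal section yields
$\nu / \bar x^2 = \exp(\sigma^2) - 1$, independent of $\mu$. The sketch below
evaluates the result for a few widths; the function name is hypothetical::

    from math import exp, sqrt

    def dls_polydispersity_lognormal(sigma):
        """sqrt(variance) / mean for a lognormal of width sigma."""
        return sqrt(exp(sigma**2) - 1.0)

    # (lognormal width sigma, corresponding p_DLS)
    comparison = [(sigma, dls_polydispersity_lognormal(sigma))
                  for sigma in (0.05, 0.1, 0.2, 0.5)]

The two quantities agree closely for small widths and diverge as the width
increases, with the DLS value always the larger of the two.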

For more information see:
S King, C Washington & R Heenan, *Phys Chem Chem Phys*, (2005), 7, 143

.. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

*Document History*

| 2015-05-01 Steve King
| 2017-05-08 Paul Kienzle
| 2018-03-20 Steve King
| 2018-04-04 Steve King