Changeset af6de50 in sasview
- Timestamp:
- Oct 25, 2016 7:21:44 PM (8 years ago)
- Branches:
- master, ESS_GUI, ESS_GUI_Docs, ESS_GUI_batch_fitting, ESS_GUI_bumps_abstraction, ESS_GUI_iss1116, ESS_GUI_iss879, ESS_GUI_iss959, ESS_GUI_opencl, ESS_GUI_ordering, ESS_GUI_sync_sascalc, costrafo411, magnetic_scatt, release-4.1.1, release-4.1.2, release-4.2.2, ticket-1009, ticket-1094-headless, ticket-1242-2d-resolution, ticket-1243, ticket-1249, ticket885, unittest-saveload
- Children:
- 907186d
- Parents:
- cf1910f
- File:
-
- 1 edited
Legend:
- Unmodified
- Added
- Removed
-
src/sas/sasgui/perspectives/fitting/media/plugin.rst
r3e1c9e5 raf6de50 11 11 There are essentially three ways to generate new fitting models for SasView: 12 12 13 * Using the SasView :ref:`New_Plugin_Model` helper dialog (best for beginners and/or relatively simple models) 14 * By copying/editing an existing model (this can include models generated by the *New Plugin Model* dialog) in the :ref:`Python_shell` or :ref:`Advanced_Plugin_Editor` as described below (suitable for all use cases) 15 * By writing a model from scratch outside of SasView (only recommended for code monkeys!) 13 * Using the SasView :ref:`New_Plugin_Model` helper dialog (best for beginners 14 and/or relatively simple models) 15 * By copying/editing an existing model (this can include models generated by 16 the *New Plugin Model* dialog) in the :ref:`Python_shell` or 17 :ref:`Advanced_Plugin_Editor` as described below (suitable for all use cases) 18 * By writing a model from scratch outside of SasView (only recommended for 19 code monkeys!) 16 20 17 21 Overview … … 479 483 } 480 484 481 The C model operates on a single $q$ value at a time. The code will be482 run in parallel across different $q$ values, either on the graphics card483 or the processor.484 485 Rather than returning NAN from Iq, you must define the *INVALID(v)*. The486 *v* parameter lets you access all the parameters in the model using487 *v.par1*, *v.par2*, etc. For example::488 489 #define INVALID(v) (v.bell_radius < v.radius)490 491 485 *Iqxy* is similar to *Iq*, except it uses parameters *qx, qy* instead of *q*, 492 and it includes orientation parameters. As in python models, *form_volume* 493 includes only the volume parameters. *Iqxy* will default to 494 *Iq(sqrt(qx**2 + qy**2), par1, ...)* and *form_volume* will default to 1.0. 495 496 The C code follows the C99 standard, including the usual math functions, 497 as defined in 498 `OpenCL <https://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/mathFunctions.html>`_. 499 500 The standard constants and functions include the following:: 501 502 M_PI = pi 503 M_PI_2 = pi/2 504 M_PI_4 = pi/4 505 M_E = e 506 M_SQRT1_2 = 1/sqrt(2) 507 NAN = NaN 508 INFINITY = 1/0 509 erf(x) = error function 510 erfc(x) = 1-erf(x) 511 expm1(x) = exp(x) - 1 512 tgamma(x) = gamma function 513 514 Some non-standard constants and functions are also provided:: 515 516 M_PI_180 = pi/180 517 M_4PI_3 = 4pi/3 518 square(x) = x*x 519 cube(x) = x*x*x 520 sinc(x) = sin(x)/x, with sin(0)/0 -> 1 521 SINCOS(x, s, c) sets s=sin(angle) and c=cos(angle) 522 powr(x, y) = x^y for x >= 0 523 pown(x, n) = x^n for n integer 524 525 **source=['lib/fn.c', ...]** includes the listed C source files in the 486 and it includes orientation parameters. 487 488 *form_volume* defines the volume of the shape. As in python models, 489 includes only the volume parameters. 490 491 *Iqxy* will default to *Iq(sqrt(qx**2 + qy**2), par1, ...)* and 492 *form_volume* will default to 1.0. 493 494 **source=['fn.c', ...]** includes the listed C source files in the 526 495 program before *Iq* and *Iqxy* are defined. This allows you to extend the 527 library of available C functions. Additional special functions and 528 scattering calculations are defined in 529 `sasmodels/models/lib <https://github.com/SasView/sasmodels/tree/master/sasmodels/models/lib>`_, 530 including:: 531 532 sph_j1c(x) = 3 j1(x)/x = 3 (sin(x) - x cos(x))/x^3 [spherical bessel function] 533 sas_J1c(x) = 2 J1(x)/x [bessel function of the first kind] 534 sas_gamma(x) = gamma function [tgamma is unstable below 1] 535 sas_erf(x) = error function [erf is broken on some Intel OpenCL drivers] 536 sas_erfc(x) = 1-erf(x) 537 sas_J0(x) = J0(x) 538 sas_J1(x) = J1(x) 539 sas_JN(x) = JN(x) 540 Si(x) = integral sin(z)/z from 0 to x 541 Gauss76Wt = gaussian quadrature weights for 76 point integral 542 Gauss76Z = gaussian quadrature values for 76 point integral 543 544 These functions have been tuned to be fast and numerically stable down 545 to $q=0$ even in single precision. In some cases they work around bugs 546 which appear on some platforms but not others. So use them where needed!!! 496 library of C functions available to your model. 547 497 548 498 Models are defined using double precision declarations for the 549 parameters and return values. Declarations and constants will be converted 550 to float or long double depending on the precision requested. 499 parameters and return values. When a model is run using single 500 precision or long double precision, each variable is converted 501 to the target type, depending on the precision requested. 551 502 552 503 **Floating point constants must include the decimal point.** This allows us … … 558 509 use the builtin constant M_PI rather than 4*atan(1); it is faster and smaller! 559 510 560 FLOAT_SIZE is the number of bytes in the converted variables. If your 561 algorithm depends on precision (which is not uncommon for numerical 562 algorithms), use the following:: 563 564 #if FLOAT_SIZE>4 565 ... code for double precision ... 566 #else 567 ... code for single precision ... 568 #endif 569 570 A value defined as SAS_DOUBLE will stay double precision; this should 571 not be used since some graphics cards do not support double precision. 572 511 The C model operates on a single $q$ value at a time. The code will be 512 run in parallel across different $q$ values, either on the graphics card 513 or the processor. 514 515 Rather than returning NAN from Iq, you must define the *INVALID(v)*. The 516 *v* parameter lets you access all the parameters in the model using 517 *v.par1*, *v.par2*, etc. For example:: 518 519 #define INVALID(v) (v.bell_radius < v.radius) 520 521 Special Functions 522 ................. 523 524 The C code follows the C99 standard, with the usual math functions, 525 as defined in 526 `OpenCL <https://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/mathFunctions.html>`_. 527 This includes the following: 528 529 M_PI, M_PI_2, M_PI_4, M_SQRT1_2, M_E: 530 $\pi$, $\pi/2$, $\pi/4$, $1/\sqrt{2}$ and Euler's constant $e$ 531 exp, log, pow(x,y), expm1, sqrt: 532 Power functions $e^x$, $\ln x$, $x^y$, $e^x - 1$, $\sqrt{x}$. 533 The function expm1(x) is accurate across all $x$, including $x$ 534 very close to zero. 535 sin, cos, tan, asin, acos, atan: 536 Trigonometry functions and inverses, operating on radians. 537 sinh, cos, tanh, asinh, acosh, atanh: 538 Hyperbolic trigonometry functions. 539 atan2(y,x): 540 Angle from the $x$\ -axis to the point $(x,y)$, which is equal to 541 $\tan^{-1}(y/x)$ corrected for quadrant. That is, if $x$ and $y$ are 542 both negative, then atan2(y,x) returns a value in quadrant III where 543 atan(y/x) would return a value in quadrant I. Similarly for 544 quadrants II and IV when $x$ and $y$ have opposite sign. 545 fmin(x,y), fmax(x,y), trunc, rint: 546 Floating point functions. rint(x) returns the nearest integer. 547 NAN: 548 NaN, Not a Number, $0/0$. Use isnan(x) to test for NaN. Note that 549 you cannot use :code:`x == NAN` to test for NaN values since that 550 will always return false. NAN does not equal NAN! 551 INFINITY: 552 $\infty, 1/0$. Use isinf(x) to test for infinity, or isfinite(x) 553 to test for finite and not NaN. 554 erf, erfc, tgamma, lgamma: **do not use** 555 Special functions that should be part of the standard, but are missing 556 or inaccurate on some platforms. Use sas_erf, sas_erfc and sas_gamma 557 instead (see below). Note: lgamma(x) has not yet been tested. 558 559 Some non-standard constants and functions are also provided: 560 561 M_PI_180, M_4PI_3: 562 $\pi/{180}$, $\tfrac{4}{3}\pi$ 563 SINCOS(x, s, c): 564 Macro which sets s=sin(x) and c=cos(x). The variables *c* and *s* 565 must be declared first. 566 square(x): 567 $x^2$ 568 cube(x): 569 $x^3$ 570 sinc(x): 571 $\sin(x)/x$, with limit $\sin(0)/0 = 1$. 572 powr(x, y): 573 $x^y$ for $x \ge 0$; this is faster than general $x^y$ on some GPUs. 574 pown(x, n): 575 $x^n$ for $n$ integer; this is faster than general $x^n$ on some GPUs. 576 FLOAT_SIZE: 577 The number of bytes in a floating point value. Even though all 578 variables are declared double, they may be converted to single 579 precision float before running. If your algorithm depends on 580 precision (which is not uncommon for numerical algorithms), use 581 the following:: 582 583 #if FLOAT_SIZE>4 584 ... code for double precision ... 585 #else 586 ... code for single precision ... 587 #endif 588 SAS_DOUBLE: 589 A replacement for :code:`double` so that the declared variable will 590 stay double precision; this should generally not be used since some 591 graphics cards do not support double precision. There is no provision 592 for forcing a constant to stay double precision. 593 594 The following special functions and scattering calculations are defined in 595 `sasmodels/models/lib <https://github.com/SasView/sasmodels/tree/master/sasmodels/models/lib>`_. 596 These functions have been tuned to be fast and numerically stable down 597 to $q=0$ even in single precision. In some cases they work around bugs 598 which appear on some platforms but not others. So use them where needed!!! 599 600 polevl(x, c, n): 601 Polynomial evaluation $p(x) = \sum_{i=0}^n c_i x^{n-i}$ using Horner's 602 method so it is faster and more accurate. 603 604 :code:`source = ["lib/polevl.c", ...]` 605 606 sas_gamma: 607 Gamma function $\text{sas_gamma}(x) = \Gamma(x)$. The standard math 608 library gamma function, tgamma(x) is unstable below 1 on some platforms. 609 610 :code:`source = ["lib/sasgamma.c", ...]` 611 612 erf, erfc: 613 Error function 614 $\text{erf}(x) = \frac{1}{\sqrt\pi}\int_0^x e^{-t^2}\,dt$ 615 and complementary error function 616 $\text{erfc}(x) = \frac{1}{\sqrt\pi}\int_x^\inf e^{-t^2}\,dt$. 617 The standard manth library erf and erfc are slower and broken 618 on some platforms. 619 620 :code:`source = ["lib/polevl.c", "lib/sas_erf.c", ...]` 621 622 sas_J0: 623 Bessel functions of the first kind where 624 $J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin(\tau))\,d\tau$. 625 626 :code:`source = ["lib/polevl.c", "lib/sas_J0.c", ...]` 627 628 sas_J1: 629 Bessel functions of the first kind where 630 $J_1(x) = \frac{1}{\pi}\int_0^\pi \cos(\tau - x\sin(\tau))\,d\tau$. 631 632 :code:`source = ["lib/polevl.c", "lib/sas_J1.c", ...]` 633 634 sas_JN: 635 Bessel functions of the first kind where 636 $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin(\tau))\,d\tau$. 637 638 :code:`source = ["lib/polevl.c", "lib/sas_J0.c", "lib/sas_J1.c", "lib/sas_JN.c", ...]` 639 640 Si: 641 Sine integral $\text{Si}(x) = \int_0^x \tfrac{\sin t}{t}\,dt$. 642 643 :code:`soure = ["lib/Si.c", ...]` 644 645 sph_j1c(qr): 646 Spherical Bessel form 647 $F(qr) = 3 j_1(qr)/(qr) = 3 (\sin(qr) - qr \cos(qr))/{(qr)^3}$, 648 with a limiting value of 1 at $qr=0$. This function uses a Taylor 649 series for small $qr$ for numerical accuracy. 650 651 :code:`source = ["lib/sph_j1c.c", ...]` 652 653 sas_J1c(qr): 654 Bessel form $F(qr) = 2 J_1(qr)/{(qr)}$, with a limiting value of 1 at $qr=0$. 655 656 :code:`source = ["lib/polevl.c", "lib/sas_J1c.c", ...]` 657 658 Gauss76z[i], Gauss76Wt[i]: 659 Points $z_i$ and weights $w_i$ for 76-point Gaussian quadrature, 660 computing $\int_{-1}^1 f(z)\,dz \approx \sum_{i=1}^{76} w_i f(z_i)$. 661 Similar arrays are available in :code:`gauss20.c` for 20 point 662 quadrature and in :code:`gauss150.c` for 150 point quadrature. 663 664 :code:`source = ["gauss76.c", ...]` 665 666 Problems with C models 667 ...................... 668 669 The graphics processor (GPU) in your computer is a specialized computer tuned 670 for certain kinds of problems. This leads to strange restrictions that you 671 need to be aware of. Your code may work fine on some platforms or for some 672 models, but then return bad values on other platforms. Some examples of 673 particular problems: 674 675 (1) Code is too complex, or uses too much memory. GPU devices only have a 676 limited amount of memory available for each processor. If you run programs 677 which take too much memory, then rather than running multiple values in parallel 678 as it usually does, the GPU may only run a single version of the code at a 679 time, making it slower than running on the CPU. It may fail to run on 680 some platforms, or worse, cause the screen to go blank or the system to reboot. 681 682 (2) Code takes too long. Because GPU devices are used for the computer 683 display, the OpenCL drivers are very careful about the amount of time they 684 will allow any code to run. For example, on OS X, the model will stop running 685 after 5 seconds regardless if the computation is complete. You may end up 686 with only some of your 2-D array defined, with the rest containing random 687 data. Or it may cause the screen to go blank or the system to reboot. 688 689 (3) Memory is not *aligned*. The GPU hardware is specialized to operate on 690 multiple values simultaneously. To keep the GPU simpler the values in memory 691 must be aligned with the different GPU compute engines. Not following these 692 rules can lead to unexpected values being loaded into memory, and wrong answers 693 computed. The conclusion from a very long and strange debugging session was 694 that any arrays that you declare in your model should be a multiple of four. 695 For example 696 697 double Iq(q, p1, p2, ...) 698 { 699 double vector[8]; // Only going to use seven slots, but declare 8 700 ... 701 } 702 703 The first step when your model is behaving strangely is to set **single=False**. 704 This automatically restricts the model to only run on the CPU, or on high end 705 GPU cards. There can still be problems even on high end cards, so you can force 706 the model off the GPU by setting **opencl=False**. This runs the model 707 as a normal C program without any GPU restrictions so you know that 708 strange results are probably from your code rather than the environment. Once 709 the code is debugged, you can compare your output to the output on the GPU. 710 711 Although it can be difficult to get your model to work on the GPU, the reward 712 can be a model that runs 1000x faster on a good card. Even your laptop may 713 show a 50x improvement or more over the equivalent pure python model. 573 714 574 715 External C Models … … 577 718 External C models are very much like embedded C models, except that 578 719 *Iq*, *Iqxy* and *form_volume* are defined in an external source file 579 loaded using the *source=[...]* method. You need to supply the function720 loaded using the *source=[...]* statement. You need to supply the function 580 721 declarations for each of these that you need instead of building them 581 722 automatically from the parameter table.
Note: See TracChangeset
for help on using the changeset viewer.