Changeset 599993b9 in sasmodels
- Timestamp: Oct 30, 2018 10:51:20 AM (6 years ago)
- Branches: master, core_shell_microgels, magnetic_model, ticket-1257-vesicle-product, ticket_1156, ticket_1265_superball, ticket_822_more_unit_tests
- Children: 63d4dd1
- Parents: 2a12d8d8 (diff), c6084f1 (diff)

Note: this is a merge changeset; the changes displayed below correspond to the merge itself. Use the (diff) links above to see all the changes relative to each parent.

- Files: 1 added, 7 edited
doc/guide/plugin.rst (r2015f02 → r57c609b)
- The description of the model-specific random() function is reflowed. It documents
  that random() returns a parameter set showing model features in the USANS to SANS
  range; for example, core-shell sphere sets the outer radius of the sphere
  logarithmically in [20, 20,000], which sets the Q value for the transition from
  flat to falling, and then uses a beta distribution to set the percentage of the
  shape which is shell, giving a preference for very thin or very thick shells (but
  never 0% or 100%). Using -sets=10 in sascomp should show a reasonable variety of
  curves over the default sascomp q range. The parameter set is returned as a
  dictionary of {parameter: value, ...}; any model parameters not included in the
  dictionary default according to _randomize_one() in sasmodels/compare.py. A sketch
  of such a function follows this list.
- The "do not use" entry for erf, erfc, tgamma and lgamma now reads "Use sas_erf,
  sas_erfc, sas_gamma and sas_lgamma instead (see below)" and drops the note that
  lgamma(x) has not yet been tested.
- New documentation for sas_gammaln(x), the log gamma function
  sas_gammaln $(x) = \log \Gamma(|x|)$ (the standard lgamma(x) is incorrect for
  single precision on some platforms), and for the incomplete gamma functions
  sas_gammainc $(a, x) = \int_0^x t^{a-1}e^{-t}\,dt / \Gamma(a)$ and
  sas_gammaincc $(a, x) = \int_x^\infty t^{a-1}e^{-t}\,dt / \Gamma(a)$, all
  provided by lib/sas_gammainc.c.
- New accuracy warnings: JN(n, x) can be very inaccurate (0.1%) for x not in
  [0.1, 100], and Si(x) can be very inaccurate (0.1%) for x in [0.1, 100].
- Minor wording fixes: "The standard math function, tgamma(x), is unstable for
  $x < 1$ on some platforms." and, for the sine integral expansion, "For large
  arguments use the following Taylor series, ...".
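For reference, a minimal sketch of what such a random() function might look like for a hypothetical core-shell model. The parameter names (radius, thickness) and the exact distributions are illustrative assumptions, not the sasmodels implementation:

    import numpy as np

    def random():
        # Outer radius sampled logarithmically over [20, 20000] Ang so that the
        # flat-to-falling transition sweeps across the USANS to SANS Q range.
        outer_radius = 10 ** np.random.uniform(np.log10(20), np.log10(20000))
        # Beta distribution favouring very thin or very thick shells,
        # but (almost surely) never exactly 0% or 100%.
        shell_fraction = np.random.beta(0.5, 0.5)
        pars = dict(
            radius=outer_radius * (1 - shell_fraction),   # core radius
            thickness=outer_radius * shell_fraction,      # shell thickness
        )
        # Parameters omitted from the dictionary fall back to the defaults
        # chosen by _randomize_one() in sasmodels/compare.py.
        return pars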
explore/precision.py (r2a7e20e → rfba9ca0)
- The docstring notes that an arbitrary range can be given as
  "start:stop:steps:scale", where scale is one of log, lin, or linear.
- The lin_min/lin_max/lin_steps and log_min/log_max/log_steps locals are unified
  into start/stop/steps, and a new branch parses the custom
  "start:stop:steps:scale" form (default 400 steps, linear unless the fourth field
  is "log"); a sketch of the parsing appears after this list.
- The relative-error plot guards against division by zero:
  err = abs((t-a)/t) when t != 0, otherwise a.
- A module-level parameter A (default 1) can now be set from the command line as
  A=value; parse_extra_pars() reads it and strips it from sys.argv. It supplies
  the second argument of the two-parameter gamma functions.
- New function entries: gammaln(x) (mp.loggamma, scipy.special.gammaln, and
  sas_gammaln from lib/sas_gammainc.c), plus gammainc(x) and gammaincc(x)
  (mp.gammainc normalized by mp.gamma(a), scipy.special.gammainc/gammaincc, and
  sas_gammainc/sas_gammaincc with the parameter A baked into the OpenCL call).
- The Lanczos gamma coefficient table is reflowed so that each pair of
  coefficients sits on one line, and the log gamma entry is named "loggamma(x)".
- The comment on ALL_FUNCTIONS.discard("loggamma") changes from "OCL version not
  ready yet" to "use cephes-based gammaln instead".
- The usage text re-indents the range descriptions, documents the new
  start:stop:n[:stepping] range (an n-step plot in [start, stop], or in
  [10^start, 10^stop] if stepping is "log", default n=400), notes that some
  functions (notably gammainc/gammaincc) take an additional parameter A settable
  as A=value (default A=1), and replaces 'and name is "all" or one of:' with
  "Name is one of:".
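A standalone restatement of the new range parsing, assuming the same "start:stop:steps:scale" convention; the helper name is hypothetical, since in precision.py the logic lives inline in the plotting code:

    def parse_xrange(xrange, default_steps=400):
        """Parse "start:stop[:steps[:scale]]" into (start, stop, steps, linear)."""
        parts = xrange.split(':')
        # Linear stepping unless the optional fourth field is "log".
        linear = parts[3] != "log" if len(parts) == 4 else True
        steps = int(parts[2]) if len(parts) > 2 else default_steps
        start, stop = float(parts[0]), float(parts[1])
        # For log stepping, start and stop are exponents: [10^start, 10^stop].
        return start, stop, steps, linear

    # e.g. parse_xrange("-4:1:500:log") -> (-4.0, 1.0, 500, False)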
sasmodels/kernelpy.py (r91bd550 → r12eec1e)
- The python model constructor now logs the model being built:
  logger.info("make python model " + self.info.name).
sasmodels/model_test.py (r012cd34 → r12eec1e)
- traceback is now imported at module level.
- The part of make_suite() that builds the test cases for one model is factored
  out into a new helper, _add_model_to_suite(loaders, suite, model_info);
  make_suite() now loads each (non-skipped) model's info and delegates to it.
  Test names are built from model_info.name rather than the raw model name, and
  the "stash" comparison between the OpenCL and DLL smoke tests is unchanged.
- run_one(model_name) is kept as the old interface (marked CRUFT, to be
  deprecated): it loads the model info, returning the formatted traceback as its
  output if loading fails, and otherwise delegates to the new
  check_model(model_info).
- check_model(model_info) runs the tests for a single model while capturing the
  output: it builds a suite via _add_model_to_suite() (using opencl if available,
  otherwise dll), warns when the model has no user defined tests, and returns the
  success status together with the captured output
  (result.wasSuccessful(), output). A usage sketch follows this entry.
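A hedged usage sketch of the new interface; it assumes load_model_info can be imported from sasmodels.core, as it is elsewhere in sasmodels:

    from sasmodels.core import load_model_info
    from sasmodels.model_test import check_model

    model_info = load_model_info("cylinder")    # built-in model as an example
    success, output = check_model(model_info)   # runs the model's tests, captures output
    print("PASS" if success else "FAIL")
    print(output)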
sasmodels/sasview_model.py (rbd547d0 → rce1eed5)
- New classmethod runTests() on the sasview model class: it imports check_model
  from .model_test, runs any tests built into the model while capturing the test
  output, and returns the success flag and output
  (return check_model(cls._model_info)). A possible usage follows this entry.
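One possible way to exercise the new classmethod from a SasView-style model class; the _make_standard_model() factory name is an assumption based on how sasview_model builds its own test classes:

    from sasmodels.sasview_model import _make_standard_model

    # Build the SasView wrapper class for a built-in model, then run its tests.
    CylinderModel = _make_standard_model('cylinder')
    success, output = CylinderModel.runTests()
    print(output if success else "tests failed:\n" + output)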
sasmodels/special.py (rdf69efa → rfba9ca0)
- The module docstring documents the new functions: sas_gammaln(x), the log gamma
  function sas_gammaln $(x) = \log \Gamma(|x|)$ (the standard lgamma(x) is
  incorrect for single precision on some platforms), and the incomplete gamma
  functions sas_gammainc $(a, x) = \int_0^x t^{a-1}e^{-t}\,dt / \Gamma(a)$ and
  sas_gammaincc $(a, x) = \int_x^\infty t^{a-1}e^{-t}\,dt / \Gamma(a)$.
- The python implementations are taken from scipy.special: gammaln as
  sas_gammaln, gammainc as sas_gammainc, and gammaincc as sas_gammaincc.
  A quick numerical check follows this entry.
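A quick numerical sanity check of the documented definitions, using the scipy implementations that back the new names; the identity holds because both integrals are normalized by Gamma(a):

    from scipy.special import gammainc as sas_gammainc
    from scipy.special import gammaincc as sas_gammaincc

    a, x = 2.0, 1.5
    # Regularized incomplete + complementary incomplete gamma sum to 1.
    assert abs(sas_gammainc(a, x) + sas_gammaincc(a, x) - 1.0) < 1e-12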
sasmodels/kernelcl.py (rd86f0fc → r95f62aa)
- F32 and F64 are imported directly from .generate and used in has_type().
- GpuEnvironment is restructured around a cache of device-bound objects. Because
  the environment can be reset during a live program (e.g., if the user changes
  the active GPU device in the GUI), everything associated with the device
  context (compiled kernels, allocated buffers) is kept in a *cache* dictionary
  in the environment and recreated if the environment changes. Running code
  should look objects up in the cache and create them if they are not there,
  holding references only for the duration of a call; each cl buffer gets its own
  cache entry, and every cached item's release() is called when the environment
  is destroyed.
- Context creation moves to a module-level _create_some_context(), which honours
  SAS_OPENCL/PYOPENCL_CTX and otherwise falls back to _get_default_context().
  The environment now keeps one context and one command queue per precision (F32
  and F64; F16 is not good enough), sharing the queue when both map to the same
  context. has_type() checks this mapping, and the old get_queue()/get_context()
  helpers are removed. New helpers: free_buffer(key), a __del__ that releases all
  cached items, and a module-level unique_id() counter used for cache keys.
- GpuModel no longer holds the compiled program itself: it records the source
  timestamp and a cache key, and fetch_kernel(name) compiles the program on
  demand, stores the 'Iq', 'Iqxy' and 'Imagnetic' kernels (plus a handle to the
  program so it is not garbage collected) in the environment cache, and returns
  the requested kernel. make_kernel() now simply returns GpuKernel(self,
  q_vectors), and the explicit release()/__del__ are dropped.
- GpuInput creates its q buffer lazily through a q_b property so that it can
  survive a context reset; release() now calls environment().free_buffer().
- GpuKernel is constructed from the model and the q vectors, keeps the dtype
  conversion in _as_dtype, creates its result buffer lazily (_result_b), picks
  the 'Iq', 'Iqxy' or 'Imagnetic' kernel by name at call time, fetches the
  command queue from the environment on every call, and passes wait_for to
  enqueue_copy. release() frees the cached buffers. A sketch of the lazy-buffer
  caching pattern follows this list.
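A minimal sketch of the lazy-buffer caching pattern described above. It assumes the GpuEnvironment from this changeset (with its per-dtype context mapping, cache dictionary and free_buffer()) and the module-level unique_id(); the class itself is illustrative, not part of kernelcl.py:

    import pyopencl as cl
    from sasmodels.kernelcl import unique_id   # counter added in this changeset

    mf = cl.mem_flags

    class CachedBuffer:
        """Host array whose device buffer is created on demand and can be
        recreated transparently after the OpenCL environment is reset."""
        def __init__(self, env, dtype, host_array):
            self._env = env            # GpuEnvironment with .context, .cache, .free_buffer
            self._dtype = dtype        # F32 or F64, used to pick the context
            self._host = host_array
            self._key = unique_id()    # key into the environment cache

        @property
        def buffer(self):
            # Look the buffer up in the environment cache, creating it if it is
            # missing (first use, or the cache was cleared by a device change).
            if self._key not in self._env.cache:
                context = self._env.context[self._dtype]
                self._env.cache[self._key] = cl.Buffer(
                    context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self._host)
            return self._env.cache[self._key]

        def release(self):
            # Drop the cache entry; the environment releases the cl.Buffer.
            self._env.free_buffer(self._key)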
Note: See TracChangeset for help on using the changeset viewer.