Changes in / [508475a:2a12d8d8] in sasmodels


Files: 1 deleted, 10 edited

  • doc/guide/gpu_setup.rst

    r8b31efa r63602b1  
    9494Device Selection 
    9595================ 
    96 **OpenCL drivers** 
    97  
    9896If you have multiple GPU devices you can tell the program which device to use. 
    9997By default, the program looks for one GPU and one CPU device from available 
     
    106104was used to run the model. 
    107105 
    108 If you want to use a specific driver and devices, you can run the following 
     106**If you don't want to use OpenCL, you can set** *SAS_OPENCL=None* 
     107**in your environment settings, and it will only use normal programs.** 
     108 
     109If you want to use one of the other devices, you can run the following 
    109110from the python console:: 
    110111 
     
    114115This will provide a menu of different OpenCL drivers available. 
    115116When one is selected, it will say "set PYOPENCL_CTX=..." 
    116 Use that value as the value of *SAS_OPENCL=driver:device*. 
    117  
    118 To use the default OpenCL device (rather than CUDA or None), 
    119 set *SAS_OPENCL=opencl*. 
    120  
    121 In batch queues, you may need to set *XDG_CACHE_HOME=~/.cache*  
    122 (Linux only) to a different directory, depending on how the filesystem  
    123 is configured.  You should also set *SAS_DLL_PATH* for CPU-only modules. 
    124  
    125     -DSAS_MODELPATH=path sets directory containing custom models 
    126     -DSAS_OPENCL=vendor:device|cuda:device|none sets the target GPU device 
    127     -DXDG_CACHE_HOME=~/.cache sets the pyopencl cache root (linux only) 
    128     -DSAS_COMPILER=tinycc|msvc|mingw|unix sets the DLL compiler 
    129     -DSAS_OPENMP=1 turns on OpenMP for the DLLs 
    130     -DSAS_DLL_PATH=path sets the path to the compiled modules 
    131  
    132  
    133 **CUDA drivers** 
    134  
    135 If OpenCL drivers are not available on your system, but NVidia CUDA 
    136 drivers are available, then set *SAS_OPENCL=cuda* or 
    137 *SAS_OPENCL=cuda:n* for a particular device number *n*.  If no device 
    138 number is specified, then the CUDA drivers looks for look for 
    139 *CUDA_DEVICE=n* or a file ~/.cuda-device containing n for the device number. 
    140  
    141 In batch queues, the SLURM command *sbatch --gres=gpu:1 ...* will set 
    142 *CUDA_VISIBLE_DEVICES=n*, which ought to set the correct device 
    143 number for *SAS_OPENCL=cuda*.  If not, then set 
    144 *CUDA_DEVICE=$CUDA_VISIBLE_DEVICES* within the batch script.  You may 
    145 need to set the CUDA cache directory to a folder accessible across the 
    146 cluster with *PYCUDA_CACHE_DIR* (or *PYCUDA_DISABLE_CACHE* to disable 
    147 caching), and you may need to set environment specific compiler flags 
    148 with *PYCUDA_DEFAULT_NVCC_FLAGS*.  You should also set *SAS_DLL_PATH*  
    149 for CPU-only modules. 
    150  
    151 **No GPU support** 
    152  
    153 If you don't want to use OpenCL or CUDA, you can set *SAS_OPENCL=None* 
    154 in your environment settings, and it will only use normal programs. 
    155  
    156 In batch queues, you may need to set *SAS_DLL_PATH* to a directory 
    157 accessible on the compute node. 
    158  
     117Use that value as the value of *SAS_OPENCL*. 
    159118 
    160119Device Testing 
     
    195154*Document History* 
    196155 
    197 | 2018-10-15 Paul Kienzle 
     156| 2017-09-27 Paul Kienzle 
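
The documentation restored by this changeset says that *SAS_OPENCL=None* turns OpenCL off. The following is a minimal sketch of that check, modeled on the `use_opencl()` function shown in the sasmodels/kernelcl.py hunk of this changeset. The `have_opencl` and `environ` parameters are hypothetical; they stand in for the module-level `HAVE_OPENCL` flag and for `os.environ` so the sketch can run without pyopencl installed.

```python
import os

def use_opencl(have_opencl=True, environ=None):
    """Return True unless OpenCL is missing or SAS_OPENCL is 'none' (any case)."""
    # have_opencl and environ are hypothetical parameters for this sketch; the
    # function in the changeset reads HAVE_OPENCL and os.environ directly.
    if environ is None:
        environ = os.environ
    return have_opencl and environ.get("SAS_OPENCL", "").lower() != "none"

# The comparison is case-insensitive, so "None", "none" and "NONE" all disable OpenCL.
print(use_opencl(environ={"SAS_OPENCL": "None"}))   # False
print(use_opencl(environ={}))                       # True
print(use_opencl(environ={"SAS_OPENCL": "0:1"}))    # True
```

Any value other than "none" leaves OpenCL enabled, which is why the `cuda` and `opencl` values removed by this changeset no longer have a special meaning.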
  • sasmodels/compare.py

    r4de14584 r610ef23  
    4141from . import kerneldll 
    4242from . import kernelcl 
    43 from . import kernelcuda 
    4443from .data import plot_theory, empty_data1D, empty_data2D, load_data 
    4544from .direct_model import DirectModel, get_mesh 
     
    116115    === environment variables === 
    117116    -DSAS_MODELPATH=path sets directory containing custom models 
    118     -DSAS_OPENCL=vendor:device|cuda:device|none sets the target GPU device 
     117    -DSAS_OPENCL=vendor:device|none sets the target OpenCL device 
    119118    -DXDG_CACHE_HOME=~/.cache sets the pyopencl cache root (linux only) 
    120119    -DSAS_COMPILER=tinycc|msvc|mingw|unix sets the DLL compiler 
     
    726725        set_integration_size(model_info, ngauss) 
    727726 
    728     if (dtype != "default" and not dtype.endswith('!')  
    729             and not (kernelcl.use_opencl() or kernelcuda.use_cuda())): 
     727    if dtype != "default" and not dtype.endswith('!') and not kernelcl.use_opencl(): 
    730728        raise RuntimeError("OpenCL not available " + kernelcl.OPENCL_ERROR) 
    731729 
  • sasmodels/core.py

    rb0de252 r2dcd6e7  
    2121from . import mixture 
    2222from . import kernelpy 
    23 from . import kernelcuda 
    2423from . import kernelcl 
    2524from . import kerneldll 
     
    211210        #print("building dll", numpy_dtype) 
    212211        return kerneldll.load_dll(source['dll'], model_info, numpy_dtype) 
    213     elif platform == "cuda": 
    214         return kernelcuda.GpuModel(source, model_info, numpy_dtype, fast=fast) 
    215212    else: 
    216213        #print("building ocl", numpy_dtype) 
     
    248245    # type: (ModelInfo, str, str) -> (np.dtype, bool, str) 
    249246    """ 
    250     Interpret dtype string, returning np.dtype, fast flag and platform. 
     247    Interpret dtype string, returning np.dtype and fast flag. 
    251248 
    252249    Possible types include 'half', 'single', 'double' and 'quad'.  If the 
     
    256253    default for the model and platform. 
    257254 
    258     Platform preference can be specfied ("ocl", "cuda", "dll"), with the 
    259     default being OpenCL or CUDA if available, otherwise DLL.  If the dtype 
    260     name ends with '!' then platform is forced to be DLL rather than GPU. 
    261     The default platform is set by the environment variable SAS_OPENCL, 
    262     SAS_OPENCL=driver:device for OpenCL, SAS_OPENCL=cuda:device for CUDA 
    263     or SAS_OPENCL=none for DLL. 
     255    Platform preference can be specfied ("ocl" vs "dll"), with the default 
     256    being OpenCL if it is availabe.  If the dtype name ends with '!' then 
     257    platform is forced to be DLL rather than OpenCL. 
    264258 
    265259    This routine ignores the preferences within the model definition.  This 
     
    274268    if platform is None: 
    275269        platform = "ocl" 
     270    if not kernelcl.use_opencl() or not model_info.opencl: 
     271        platform = "dll" 
    276272 
    277273    # Check if type indicates dll regardless of which platform is given 
     
    279275        platform = "dll" 
    280276        dtype = dtype[:-1] 
    281  
    282     # Make sure model allows opencl/gpu 
    283     if not model_info.opencl: 
    284         platform = "dll" 
    285  
    286     # Make sure opencl is available, or fallback to cuda then to dll 
    287     if platform == "ocl" and not kernelcl.use_opencl(): 
    288         platform = "cuda" if kernelcuda.use_cuda() else "dll" 
    289277 
    290278    # Convert special type names "half", "fast", and "quad" 
     
    297285        dtype = "float16" 
    298286 
    299     # Convert dtype string to numpy dtype.  Use single precision for GPU 
    300     # if model allows it, otherwise use double precision. 
     287    # Convert dtype string to numpy dtype. 
    301288    if dtype is None or dtype == "default": 
    302         numpy_dtype = (generate.F32 if model_info.single and platform in ("ocl", "cuda") 
     289        numpy_dtype = (generate.F32 if platform == "ocl" and model_info.single 
    303290                       else generate.F64) 
    304291    else: 
    305292        numpy_dtype = np.dtype(dtype) 
    306293 
    307     # Make sure that the type is supported by GPU, otherwise use dll 
     294    # Make sure that the type is supported by opencl, otherwise use dll 
    308295    if platform == "ocl": 
    309296        env = kernelcl.environment() 
    310     elif platform == "cuda": 
    311         env = kernelcuda.environment() 
    312     else: 
    313         env = None 
    314     if env is not None and not env.has_type(numpy_dtype): 
    315         platform = "dll" 
    316         if dtype is None: 
    317             numpy_dtype = generate.F64 
     297        if not env.has_type(numpy_dtype): 
     298            platform = "dll" 
     299            if dtype is None: 
     300                numpy_dtype = generate.F64 
    318301 
    319302    return numpy_dtype, fast, platform 
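
The platform selection that results from the sasmodels/core.py hunks above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in the repository: `choose_platform` and its boolean parameters are hypothetical names replacing `kernelcl.use_opencl()` and `model_info.opencl`, and the condition that tests for a trailing `!` is elided in the diff, so it is reconstructed here from the docstring.

```python
def choose_platform(platform, dtype, opencl_available, model_allows_opencl):
    """Return (platform, dtype) following the order of checks in the new code."""
    # Default to OpenCL when no platform preference is given.
    if platform is None:
        platform = "ocl"
    # Fall back to the DLL if OpenCL is unavailable or the model forbids it.
    if not opencl_available or not model_allows_opencl:
        platform = "dll"
    # A trailing '!' forces the DLL regardless of the requested platform.
    # Assumption: the exact test is elided in the diff; this follows the docstring.
    if dtype is not None and dtype.endswith("!"):
        platform = "dll"
        dtype = dtype[:-1]
    return platform, dtype

print(choose_platform(None, "single", True, True))    # ('ocl', 'single')
print(choose_platform(None, "double!", True, True))   # ('dll', 'double')
print(choose_platform("ocl", None, False, True))      # ('dll', None)
```

With the CUDA branch removed, there are only two outcomes: "ocl" when OpenCL is usable and permitted, and "dll" otherwise.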
  • sasmodels/kernel_header.c

    r74e9b5f r108e70e  
    11#ifdef __OPENCL_VERSION__ 
    22# define USE_OPENCL 
    3 #elif defined(__CUDACC__) 
    4 # define USE_CUDA 
    53#elif defined(_OPENMP) 
    64# define USE_OPENMP 
    75#endif 
    8  
    9 // Use SAS_DOUBLE to force the use of double even for float kernels 
    10 #define SAS_DOUBLE dou ## ble 
    116 
    127// If opencl is not available, then we are compiling a C function 
    138// Note: if using a C++ compiler, then define kernel as extern "C" 
    149#ifdef USE_OPENCL 
    15  
    16    #define USE_GPU 
    17    #define pglobal global 
    18    #define pconstant constant 
    19  
    2010   typedef int int32_t; 
    21  
    22    #if defined(USE_SINCOS) 
    23    #  define SINCOS(angle,svar,cvar) svar=sincos(angle,&cvar) 
    24    #else 
    25    #  define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
    26    #endif 
     11#  if defined(USE_SINCOS) 
     12#    define SINCOS(angle,svar,cvar) svar=sincos(angle,&cvar) 
     13#  else 
     14#    define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
     15#  endif 
    2716   // Intel CPU on Mac gives strange values for erf(); on the verified 
    2817   // platforms (intel, nvidia, amd), the cephes erf() is significantly 
     
    3524   #  define erfcf erfc 
    3625   #endif 
    37  
    38 #elif defined(USE_CUDA) 
    39  
    40    #define USE_GPU 
    41    #define local __shared__ 
    42    #define pglobal 
    43    #define constant __constant__ 
    44    #define pconstant const 
    45    #define kernel extern "C" __global__ 
    46  
    47    // OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
    48    // OpenCL pown(a,b) = C99 pow(a,b), b integer 
    49    #define powr(a,b) pow(a,b) 
    50    #define pown(a,b) pow(a,b) 
    51    //typedef int int32_t; 
    52    #if defined(USE_SINCOS) 
    53    #  define SINCOS(angle,svar,cvar) sincos(angle,&svar,&cvar) 
    54    #else 
    55    #  define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
    56    #endif 
    57  
    58 #else // !USE_OPENCL && !USE_CUDA 
    59  
    60    #define local 
    61    #define pglobal 
    62    #define constant const 
    63    #define pconstant const 
    64  
    65    #ifdef __cplusplus 
     26#else // !USE_OPENCL 
     27// Use SAS_DOUBLE to force the use of double even for float kernels 
     28#  define SAS_DOUBLE dou ## ble 
     29#  ifdef __cplusplus 
    6630      #include <cstdio> 
    6731      #include <cmath> 
     
    8751     #endif 
    8852     inline void SINCOS(double angle, double &svar, double &cvar) { svar=sin(angle); cvar=cos(angle); } 
    89    #else // !__cplusplus 
     53#  else // !__cplusplus 
    9054     #include <inttypes.h>  // C99 guarantees that int32_t types is here 
    9155     #include <stdio.h> 
     
    11276     #define kernel 
    11377     #define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
    114    #endif  // !__cplusplus 
    115    // OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
    116    // OpenCL pown(a,b) = C99 pow(a,b), b integer 
    117    #define powr(a,b) pow(a,b) 
    118    #define pown(a,b) pow(a,b) 
    119  
     78#  endif  // !__cplusplus 
     79#  define global 
     80#  define local 
     81#  define constant const 
     82// OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
     83// OpenCL pown(a,b) = C99 pow(a,b), b integer 
     84#  define powr(a,b) pow(a,b) 
     85#  define pown(a,b) pow(a,b) 
    12086#endif // !USE_OPENCL 
    12187 
  • sasmodels/kernel_iq.c

    r74e9b5f r70530778  
    278278    const int32_t pd_start,     // where we are in the dispersity loop 
    279279    const int32_t pd_stop,      // where we are stopping in the dispersity loop 
    280     pglobal const ProblemDetails *details, 
    281     pglobal const double *values, 
    282     pglobal const double *q, // nq q values, with padding to boundary 
    283     pglobal double *result,  // nq+1 return values, again with padding 
     280    global const ProblemDetails *details, 
     281    global const double *values, 
     282    global const double *q, // nq q values, with padding to boundary 
     283    global double *result,  // nq+1 return values, again with padding 
    284284    const double cutoff     // cutoff in the dispersity weight product 
    285285    ) 
    286286{ 
    287 #if defined(USE_GPU) 
     287#ifdef USE_OPENCL 
    288288  // who we are and what element we are working with 
    289   #if defined(USE_OPENCL) 
    290289  const int q_index = get_global_id(0); 
    291   #else // USE_CUDA 
    292   const int q_index = threadIdx.x + blockIdx.x * blockDim.x; 
    293   #endif 
    294290  if (q_index >= nq) return; 
    295291#else 
     
    344340  // seeing one q value (stored in the variable "this_result") while the dll 
    345341  // version must loop over all q. 
    346   #if defined(USE_GPU) 
     342  #ifdef USE_OPENCL 
    347343    double pd_norm = (pd_start == 0 ? 0.0 : result[nq]); 
    348344    double this_result = (pd_start == 0 ? 0.0 : result[q_index]); 
    349   #else // !USE_GPU 
     345  #else // !USE_OPENCL 
    350346    double pd_norm = (pd_start == 0 ? 0.0 : result[nq]); 
    351347    if (pd_start == 0) { 
     
    356352    } 
    357353    //if (q_index==0) printf("start %d %g %g\n", pd_start, pd_norm, result[0]); 
    358 #endif // !USE_GPU 
     354#endif // !USE_OPENCL 
    359355 
    360356 
     
    379375  const int n4 = pd_length[4]; 
    380376  const int p4 = pd_par[4]; 
    381   pglobal const double *v4 = pd_value + pd_offset[4]; 
    382   pglobal const double *w4 = pd_weight + pd_offset[4]; 
     377  global const double *v4 = pd_value + pd_offset[4]; 
     378  global const double *w4 = pd_weight + pd_offset[4]; 
    383379  int i4 = (pd_start/pd_stride[4])%n4;  // position in level 4 at pd_start 
    384380 
     
    566562  const int n##_LOOP = details->pd_length[_LOOP]; \ 
    567563  const int p##_LOOP = details->pd_par[_LOOP]; \ 
    568   pglobal const double *v##_LOOP = pd_value + details->pd_offset[_LOOP]; \ 
    569   pglobal const double *w##_LOOP = pd_weight + details->pd_offset[_LOOP]; \ 
     564  global const double *v##_LOOP = pd_value + details->pd_offset[_LOOP]; \ 
     565  global const double *w##_LOOP = pd_weight + details->pd_offset[_LOOP]; \ 
    570566  int i##_LOOP = (pd_start/details->pd_stride[_LOOP])%n##_LOOP; 
    571567 
     
    591587// Pointers to the start of the dispersity and weight vectors, if needed. 
    592588#if MAX_PD>0 
    593   pglobal const double *pd_value = values + NUM_VALUES; 
    594   pglobal const double *pd_weight = pd_value + details->num_weights; 
     589  global const double *pd_value = values + NUM_VALUES; 
     590  global const double *pd_weight = pd_value + details->num_weights; 
    595591#endif 
    596592 
     
    652648      BUILD_ROTATION(); 
    653649 
    654 #if !defined(USE_GPU) 
     650#ifndef USE_OPENCL 
    655651      // DLL needs to explicitly loop over the q values. 
    656652      #ifdef USE_OPENMP 
     
    658654      #endif 
    659655      for (q_index=0; q_index<nq; q_index++) 
    660 #endif // !USE_GPU 
     656#endif // !USE_OPENCL 
    661657      { 
    662658 
     
    701697//printf("q_index:%d %g %g %g %g\n", q_index, scattering, weight0); 
    702698 
    703         #if defined(USE_GPU) 
     699        #ifdef USE_OPENCL 
    704700          this_result += weight * scattering; 
    705         #else // !USE_GPU 
     701        #else // !USE_OPENCL 
    706702          result[q_index] += weight * scattering; 
    707         #endif // !USE_GPU 
     703        #endif // !USE_OPENCL 
    708704      } 
    709705    } 
     
    729725 
    730726// Remember the current result and the updated norm. 
    731 #if defined(USE_GPU) 
     727#ifdef USE_OPENCL 
    732728  result[q_index] = this_result; 
    733729  if (q_index == 0) result[nq] = pd_norm; 
    734730//if (q_index == 0) printf("res: %g/%g\n", result[0], pd_norm); 
    735 #else // !USE_GPU 
     731#else // !USE_OPENCL 
    736732  result[nq] = pd_norm; 
    737733//printf("res: %g/%g\n", result[0], pd_norm); 
    738 #endif // !USE_GPU 
     734#endif // !USE_OPENCL 
    739735 
    740736// ** clear the macros in preparation for the next kernel ** 
  • sasmodels/kernelcl.py

    r95f62aa r95f62aa  
    11""" 
    22GPU driver for C kernels 
    3  
    4 TODO: docs are out of date 
    53 
    64There should be a single GPU environment running on the system.  This 
     
    6159 
    6260 
    63 # Attempt to setup opencl. This may fail if the pyopencl package is not 
     61# Attempt to setup opencl. This may fail if the opencl package is not 
    6462# installed or if it is installed but there are no devices available. 
    6563try: 
     
    134132 
    135133def use_opencl(): 
    136     sas_opencl = os.environ.get("SAS_OPENCL", "OpenCL").lower() 
    137     return HAVE_OPENCL and sas_opencl != "none" and not sas_opencl.startswith("cuda") 
     134    return HAVE_OPENCL and os.environ.get("SAS_OPENCL", "").lower() != "none" 
    138135 
    139136ENV = None 
     
    183180        cl.kernel_work_group_info.PREFERRED_WORK_GROUP_SIZE_MULTIPLE, 
    184181        queue.device) 
     182 
     183def _stretch_input(vector, dtype, extra=1e-3, boundary=32): 
     184    # type: (np.ndarray, np.dtype, float, int) -> np.ndarray 
     185    """ 
     186    Stretch an input vector to the correct boundary. 
     187 
     188    Performance on the kernels can drop by a factor of two or more if the 
     189    number of values to compute does not fall on a nice power of two 
     190    boundary.   The trailing additional vector elements are given a 
     191    value of *extra*, and so f(*extra*) will be computed for each of 
     192    them.  The returned array will thus be a subset of the computed array. 
     193 
     194    *boundary* should be a power of 2 which is at least 32 for good 
     195    performance on current platforms (as of Jan 2015).  It should 
     196    probably be the max of get_warp(kernel,queue) and 
     197    device.min_data_type_align_size//4. 
     198    """ 
     199    remainder = vector.size % boundary 
     200    if remainder != 0: 
     201        size = vector.size + (boundary - remainder) 
     202        vector = np.hstack((vector, [extra] * (size - vector.size))) 
     203    return np.ascontiguousarray(vector, dtype=dtype) 
     204 
    185205 
    186206def compile_model(context, source, dtype, fast=False): 
     
    322342    Uses SAS_OPENCL or PYOPENCL_CTX if they are set in the environment, 
    323343    otherwise scans for the most appropriate device using 
    324     :func:`_get_default_context`.  Ignore *SAS_OPENCL=OpenCL*, which 
    325     indicates that an OpenCL device should be used without specifying 
    326     which one (and not a CUDA device, or no GPU). 
    327     """ 
    328     # Assume we do not get here if SAS_OPENCL is None or CUDA 
    329     sas_opencl = os.environ.get('SAS_OPENCL', 'opencl') 
    330     if sas_opencl.lower() != 'opencl': 
    331         # Setting PYOPENCL_CTX as a SAS_OPENCL to create cl context 
    332         os.environ["PYOPENCL_CTX"] = sas_opencl 
     344    :func:`_get_default_context` 
     345    """ 
     346    if 'SAS_OPENCL' in os.environ: 
     347        #Setting PYOPENCL_CTX as a SAS_OPENCL to create cl context 
     348        os.environ["PYOPENCL_CTX"] = os.environ["SAS_OPENCL"] 
    333349 
    334350    if 'PYOPENCL_CTX' in os.environ: 
     
    628644                current_time = time.clock() 
    629645                if current_time - last_nap > 0.5: 
    630                     time.sleep(0.001) 
     646                    time.sleep(0.05) 
    631647                    last_nap = current_time 
    632648        cl.enqueue_copy(queue, self.result, result_b, wait_for=wait_for) 
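
The padding arithmetic in `_stretch_input`, added in the sasmodels/kernelcl.py hunk above, can be illustrated without numpy. This is a sketch using plain lists; `padded_size` and `stretch` are hypothetical helper names, and the function in the changeset returns a contiguous numpy array of the requested dtype instead of a list.

```python
def padded_size(n, boundary=32):
    """Round n up to the next multiple of boundary (n itself if already aligned)."""
    remainder = n % boundary
    if remainder != 0:
        n += boundary - remainder
    return n

def stretch(values, extra=1e-3, boundary=32):
    """Pad values with copies of extra until the length is a multiple of boundary."""
    values = list(values)
    return values + [extra] * (padded_size(len(values), boundary) - len(values))

print(padded_size(100))           # 128
print(padded_size(64))            # 64
print(len(stretch([0.1] * 5)))    # 32
```

The kernel still evaluates the model at each padded entry, so the caller keeps only the first `len(values)` results.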
  • sasmodels/model_test.py

    r74e9b5f r012cd34  
    55Usage:: 
    66 
    7     python -m sasmodels.model_test [opencl|cuda|dll] model1 model2 ... 
     7    python -m sasmodels.model_test [opencl|dll|opencl_and_dll] model1 model2 ... 
    88 
    99    if model1 is 'all', then all except the remaining models will be tested 
     
    6363from .modelinfo import expand_pars 
    6464from .kernelcl import use_opencl 
    65 from .kernelcuda import use_cuda 
    6665 
    6766# pylint: disable=unused-import 
     
    8180    Construct the pyunit test suite. 
    8281 
    83     *loaders* is the list of kernel drivers to use (dll, opencl or cuda). 
    84     For python model the python driver is always used. 
     82    *loaders* is the list of kernel drivers to use, which is one of 
     83    *["dll", "opencl"]*, *["dll"]* or *["opencl"]*.  For python models, 
     84    the python driver is always used. 
    8585 
    8686    *models* is the list of models to test, or *["all"]* to test all models. 
     
    135135 
    136136            # test using dll if desired 
    137             if 'dll' in loaders: 
     137            if 'dll' in loaders or not use_opencl(): 
    138138                test_name = "%s-dll"%model_name 
    139139                test_method_name = "test_%s_dll" % model_info.id 
     
    156156                                     test_method_name, 
    157157                                     platform="ocl", dtype=None, 
    158                                      stash=stash) 
    159                 #print("defining", test_name) 
    160                 suite.addTest(test) 
    161  
    162             # test using cuda if desired and available 
    163             if 'cuda' in loaders and use_cuda(): 
    164                 test_name = "%s-cuda"%model_name 
    165                 test_method_name = "test_%s_cuda" % model_info.id 
    166                 # Using dtype=None so that the models that are only 
    167                 # correct for double precision are not tested using 
    168                 # single precision.  The choice is determined by the 
    169                 # presence of *single=False* in the model file. 
    170                 test = ModelTestCase(test_name, model_info, 
    171                                      test_method_name, 
    172                                      platform="cuda", dtype=None, 
    173158                                     stash=stash) 
    174159                #print("defining", test_name) 
     
    235220 
    236221                # Check for missing tests.  Only do so for the "dll" tests 
    237                 # to reduce noise from both opencl and cuda, and because 
     222                # to reduce noise from both opencl and dll, and because 
    238223                # python kernels use platform="dll". 
    239224                if self.platform == "dll": 
     
    383368 
    384369    # Build a test suite containing just the model 
    385     loader = 'opencl' if use_opencl() else 'cuda' if use_cuda() else 'dll' 
     370    loaders = ['opencl'] if use_opencl() else ['dll'] 
    386371    models = [model] 
    387372    try: 
    388         suite = make_suite([loader], models) 
     373        suite = make_suite(loaders, models) 
    389374    except Exception: 
    390375        import traceback 
     
    449434        loaders = ['opencl'] 
    450435        models = models[1:] 
    451     elif models and models[0] == 'cuda': 
    452         if not use_cuda(): 
    453             print("cuda is not available") 
    454             return 1 
    455         loaders = ['cuda'] 
    456         models = models[1:] 
    457436    elif models and models[0] == 'dll': 
    458437        # TODO: test if compiler is available? 
    459438        loaders = ['dll'] 
    460439        models = models[1:] 
     440    elif models and models[0] == 'opencl_and_dll': 
     441        loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
     442        models = models[1:] 
    461443    else: 
    462         loaders = ['dll'] 
    463         if use_opencl(): 
    464             loaders.append('opencl') 
    465         if use_cuda(): 
    466             loaders.append('cuda') 
     444        loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
    467445    if not models: 
    468446        print("""\ 
    469447usage: 
    470   python -m sasmodels.model_test [-v] [opencl|cuda|dll] model1 model2 ... 
     448  python -m sasmodels.model_test [-v] [opencl|dll] model1 model2 ... 
    471449 
    472450If -v is included on the command line, then use verbose output. 
    473451 
    474 If no platform is specified, then models will be tested with dll, and 
    475 if available, OpenCL and CUDA; the compute target is ignored for pure python models. 
     452If neither opencl nor dll is specified, then models will be tested with 
     453both OpenCL and dll; the compute target is ignored for pure python models. 
    476454 
    477455If model1 is 'all', then all except the remaining models will be tested. 
     
    493471    Run "nosetests sasmodels" on the command line to invoke it. 
    494472    """ 
    495     loaders = ['dll'] 
    496     if use_opencl(): 
    497         loaders.append('opencl') 
    498     if use_cuda(): 
    499         loaders.append('cuda') 
     473    loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
    500474    tests = make_suite(loaders, ['all']) 
    501475    def build_test(test): 
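
The command-line handling in the sasmodels/model_test.py hunks above picks the kernel loaders from the first argument. A minimal sketch of that selection follows. `select_loaders` is a hypothetical name, `opencl_available` stands in for `use_opencl()`, and raising an error when `opencl` is requested but unavailable is an assumption, since that part of the branch is elided in the diff.

```python
def select_loaders(args, opencl_available):
    """Return (loaders, remaining_model_names) for a model_test argument list."""
    # Default used both for 'opencl_and_dll' and when no platform is named.
    default = ['opencl', 'dll'] if opencl_available else ['dll']
    if args and args[0] == 'opencl':
        if not opencl_available:
            # Assumption: the diff elides what happens here.
            raise RuntimeError("opencl is not available")
        return ['opencl'], args[1:]
    if args and args[0] == 'dll':
        return ['dll'], args[1:]
    if args and args[0] == 'opencl_and_dll':
        return default, args[1:]
    return default, args

print(select_loaders(['dll', 'sphere'], True))       # (['dll'], ['sphere'])
print(select_loaders(['sphere', 'cylinder'], False)) # (['dll'], ['sphere', 'cylinder'])
```

After this changeset `cuda` is no longer a recognized first argument, so in this sketch it would be treated as a model name.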
  • sasmodels/models/lib/gauss76.c

    r74e9b5f r99b84ec  
    1111 
    1212// Gaussians 
    13 constant double Gauss76Wt[76] = { 
     13constant double Gauss76Wt[76]={ 
    1414        .00126779163408536,             //0 
    1515        .00294910295364247, 
     
    9090}; 
    9191 
    92 constant double Gauss76Z[76] = { 
     92constant double Gauss76Z[76]={ 
    9393        -.999505948362153,              //0 
    9494        -.997397786355355, 
  • sasmodels/models/lib/polevl.c

    r74e9b5f r447e9aa  
    5151*/ 
    5252 
    53 static 
    54 double polevl( double x, pconstant double *coef, int N ) 
     53double polevl( double x, constant double *coef, int N ); 
     54double polevl( double x, constant double *coef, int N ) 
    5555{ 
    5656 
     
    7272 */ 
    7373 
    74 static 
    75 double p1evl( double x, pconstant double *coef, int N ) 
     74double p1evl( double x, constant double *coef, int N ); 
     75double p1evl( double x, constant double *coef, int N ) 
    7676{ 
    7777    int i=0; 
  • sasmodels/models/lib/sas_J1.c

    r74e9b5f r5181ccc  
    4242#if FLOAT_SIZE>4 
    4343//Cephes double pression function 
     44double cephes_j1(double x); 
    4445 
    4546constant double RPJ1[8] = { 
     
    105106    0.0 }; 
    106107 
    107 static 
    108108double cephes_j1(double x) 
    109109{ 
     
    155155#else 
    156156//Single precission version of cephes 
     157float cephes_j1f(float x); 
     158 
    157159constant float JPJ1[8] = { 
    158160    -4.878788132172128E-009, 
     
    188190    }; 
    189191 
    190 static 
    191192float cephes_j1f(float xx) 
    192193{ 
     
    239240 
    240241//Finally J1c function that equals 2*J1(x)/x 
    241 static 
     242double sas_2J1x_x(double x); 
    242243double sas_2J1x_x(double x) 
    243244{ 