Changes in / [2a12d8d8:508475a] in sasmodels


Files:
1 added
10 edited

  • doc/guide/gpu_setup.rst

    r63602b1 r8b31efa  
    9494Device Selection 
    9595================ 
     96**OpenCL drivers** 
     97 
    9698If you have multiple GPU devices you can tell the program which device to use. 
    9799By default, the program looks for one GPU and one CPU device from available 
     
    104106was used to run the model. 
    105107 
    106 **If you don't want to use OpenCL, you can set** *SAS_OPENCL=None* 
    107 **in your environment settings, and it will only use normal programs.** 
    108  
    109 If you want to use one of the other devices, you can run the following 
     108If you want to use a specific driver and devices, you can run the following 
    110109from the python console:: 
    111110 
     
    115114This will provide a menu of different OpenCL drivers available. 
    116115When one is selected, it will say "set PYOPENCL_CTX=..." 
    117 Use that value as the value of *SAS_OPENCL*. 
     116Use that value as the value of *SAS_OPENCL=driver:device*. 
     117 
     118To use the default OpenCL device (rather than CUDA or None), 
     119set *SAS_OPENCL=opencl*. 
     120 
     121In batch queues, you may need to set *XDG_CACHE_HOME=~/.cache*  
     122(Linux only) to a different directory, depending on how the filesystem  
     123is configured.  You should also set *SAS_DLL_PATH* for CPU-only modules. 
     124 
     125    -DSAS_MODELPATH=path sets directory containing custom models 
     126    -DSAS_OPENCL=vendor:device|cuda:device|none sets the target GPU device 
     127    -DXDG_CACHE_HOME=~/.cache sets the pyopencl cache root (linux only) 
     128    -DSAS_COMPILER=tinycc|msvc|mingw|unix sets the DLL compiler 
     129    -DSAS_OPENMP=1 turns on OpenMP for the DLLs 
     130    -DSAS_DLL_PATH=path sets the path to the compiled modules 
     131 
     132 
     133**CUDA drivers** 
     134 
     135If OpenCL drivers are not available on your system, but NVidia CUDA 
     136drivers are available, then set *SAS_OPENCL=cuda* or 
     137*SAS_OPENCL=cuda:n* for a particular device number *n*.  If no device 
     138number is specified, then the CUDA driver looks for 
     139*CUDA_DEVICE=n* or a file ~/.cuda-device containing n for the device number. 
     140 
     141In batch queues, the SLURM command *sbatch --gres=gpu:1 ...* will set 
     142*CUDA_VISIBLE_DEVICES=n*, which ought to set the correct device 
     143number for *SAS_OPENCL=cuda*.  If not, then set 
     144*CUDA_DEVICE=$CUDA_VISIBLE_DEVICES* within the batch script.  You may 
     145need to set the CUDA cache directory to a folder accessible across the 
     146cluster with *PYCUDA_CACHE_DIR* (or *PYCUDA_DISABLE_CACHE* to disable 
     147caching), and you may need to set environment specific compiler flags 
     148with *PYCUDA_DEFAULT_NVCC_FLAGS*.  You should also set *SAS_DLL_PATH*  
     149for CPU-only modules. 
     150 
     151**No GPU support** 
     152 
     153If you don't want to use OpenCL or CUDA, you can set *SAS_OPENCL=None* 
     154in your environment settings, and it will only use normal programs. 
     155 
     156In batch queues, you may need to set *SAS_DLL_PATH* to a directory 
     157accessible on the compute node. 
     158 
    118159 
    119160Device Testing 
     
    154195*Document History* 
    155196 
    156 | 2017-09-27 Paul Kienzle 
     197| 2018-10-15 Paul Kienzle 
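
The *SAS_OPENCL* settings described in the gpu_setup.rst hunk above can be summarized in a short Python sketch. The helper name `classify_sas_opencl` and its return convention are hypothetical and not part of sasmodels. Treating an unset or empty value as the default OpenCL device is also an assumption.

```python
def classify_sas_opencl(value):
    """Map a SAS_OPENCL setting to (platform, device).

    Hypothetical helper for illustration only; it is not a sasmodels API.
    platform is "dll" (no GPU), "cuda" or "ocl"; device is None when the
    setting does not name a specific device.
    """
    # Assumption: an unset or empty variable means "default OpenCL device".
    setting = (value or "opencl").strip()
    lowered = setting.lower()
    if lowered == "none":
        # SAS_OPENCL=None disables GPU support entirely.
        return ("dll", None)
    if lowered.startswith("cuda"):
        # SAS_OPENCL=cuda or SAS_OPENCL=cuda:n
        _, _, device = setting.partition(":")
        return ("cuda", device or None)
    if lowered == "opencl":
        # Default OpenCL device, no explicit driver:device pair.
        return ("ocl", None)
    # Anything else is taken to be a driver:device string for pyopencl.
    return ("ocl", setting)
```

For example, `classify_sas_opencl("cuda:1")` yields the CUDA platform with device string `"1"`, while `classify_sas_opencl("0:1")` keeps the whole string as an OpenCL driver:device selection.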
  • sasmodels/compare.py

    r610ef23 r4de14584  
    4141from . import kerneldll 
    4242from . import kernelcl 
     43from . import kernelcuda 
    4344from .data import plot_theory, empty_data1D, empty_data2D, load_data 
    4445from .direct_model import DirectModel, get_mesh 
     
    115116    === environment variables === 
    116117    -DSAS_MODELPATH=path sets directory containing custom models 
    117     -DSAS_OPENCL=vendor:device|none sets the target OpenCL device 
     118    -DSAS_OPENCL=vendor:device|cuda:device|none sets the target GPU device 
    118119    -DXDG_CACHE_HOME=~/.cache sets the pyopencl cache root (linux only) 
    119120    -DSAS_COMPILER=tinycc|msvc|mingw|unix sets the DLL compiler 
     
    725726        set_integration_size(model_info, ngauss) 
    726727 
    727     if dtype != "default" and not dtype.endswith('!') and not kernelcl.use_opencl(): 
     728    if (dtype != "default" and not dtype.endswith('!')  
     729            and not (kernelcl.use_opencl() or kernelcuda.use_cuda())): 
    728730        raise RuntimeError("OpenCL not available " + kernelcl.OPENCL_ERROR) 
    729731 
  • sasmodels/core.py

    r2dcd6e7 rb0de252  
    2121from . import mixture 
    2222from . import kernelpy 
     23from . import kernelcuda 
    2324from . import kernelcl 
    2425from . import kerneldll 
     
    210211        #print("building dll", numpy_dtype) 
    211212        return kerneldll.load_dll(source['dll'], model_info, numpy_dtype) 
     213    elif platform == "cuda": 
     214        return kernelcuda.GpuModel(source, model_info, numpy_dtype, fast=fast) 
    212215    else: 
    213216        #print("building ocl", numpy_dtype) 
     
    245248    # type: (ModelInfo, str, str) -> (np.dtype, bool, str) 
    246249    """ 
    247     Interpret dtype string, returning np.dtype and fast flag. 
     250    Interpret dtype string, returning np.dtype, fast flag and platform. 
    248251 
    249252    Possible types include 'half', 'single', 'double' and 'quad'.  If the 
     
    253256    default for the model and platform. 
    254257 
    255     Platform preference can be specfied ("ocl" vs "dll"), with the default 
    256     being OpenCL if it is availabe.  If the dtype name ends with '!' then 
    257     platform is forced to be DLL rather than OpenCL. 
      258    Platform preference can be specified ("ocl", "cuda", "dll"), with the 
     259    default being OpenCL or CUDA if available, otherwise DLL.  If the dtype 
     260    name ends with '!' then platform is forced to be DLL rather than GPU. 
     261    The default platform is set by the environment variable SAS_OPENCL, 
     262    SAS_OPENCL=driver:device for OpenCL, SAS_OPENCL=cuda:device for CUDA 
     263    or SAS_OPENCL=none for DLL. 
    258264 
    259265    This routine ignores the preferences within the model definition.  This 
     
    268274    if platform is None: 
    269275        platform = "ocl" 
    270     if not kernelcl.use_opencl() or not model_info.opencl: 
    271         platform = "dll" 
    272276 
    273277    # Check if type indicates dll regardless of which platform is given 
     
    275279        platform = "dll" 
    276280        dtype = dtype[:-1] 
     281 
     282    # Make sure model allows opencl/gpu 
     283    if not model_info.opencl: 
     284        platform = "dll" 
     285 
     286    # Make sure opencl is available, or fallback to cuda then to dll 
     287    if platform == "ocl" and not kernelcl.use_opencl(): 
     288        platform = "cuda" if kernelcuda.use_cuda() else "dll" 
    277289 
    278290    # Convert special type names "half", "fast", and "quad" 
     
    285297        dtype = "float16" 
    286298 
    287     # Convert dtype string to numpy dtype. 
     299    # Convert dtype string to numpy dtype.  Use single precision for GPU 
     300    # if model allows it, otherwise use double precision. 
    288301    if dtype is None or dtype == "default": 
    289         numpy_dtype = (generate.F32 if platform == "ocl" and model_info.single 
     302        numpy_dtype = (generate.F32 if model_info.single and platform in ("ocl", "cuda") 
    290303                       else generate.F64) 
    291304    else: 
    292305        numpy_dtype = np.dtype(dtype) 
    293306 
    294     # Make sure that the type is supported by opencl, otherwise use dll 
     307    # Make sure that the type is supported by GPU, otherwise use dll 
    295308    if platform == "ocl": 
    296309        env = kernelcl.environment() 
    297         if not env.has_type(numpy_dtype): 
    298             platform = "dll" 
    299             if dtype is None: 
    300                 numpy_dtype = generate.F64 
     310    elif platform == "cuda": 
     311        env = kernelcuda.environment() 
     312    else: 
     313        env = None 
     314    if env is not None and not env.has_type(numpy_dtype): 
     315        platform = "dll" 
     316        if dtype is None: 
     317            numpy_dtype = generate.F64 
    301318 
    302319    return numpy_dtype, fast, platform 
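
The fallback order introduced in the core.py hunk above (forced DLL via a trailing '!', then the model's own GPU flag, then OpenCL, CUDA and finally DLL) can be sketched as a pure function. This is a simplified illustration and not the sasmodels code: `choose_platform` and its boolean arguments are hypothetical stand-ins for `model_info.opencl`, `kernelcl.use_opencl()` and `kernelcuda.use_cuda()`, and the numpy dtype handling is left out.

```python
def choose_platform(requested, dtype, model_allows_gpu, have_opencl, have_cuda):
    """Return (platform, dtype) following the fallback order in the diff.

    Hypothetical sketch: the boolean flags replace the real availability
    checks, and dtype stays a plain string (or None).
    """
    # No explicit preference means "try OpenCL first".
    platform = requested if requested is not None else "ocl"

    # A trailing '!' on the dtype name forces the DLL platform.
    if dtype is not None and dtype.endswith("!"):
        platform = "dll"
        dtype = dtype[:-1]

    # Models that disallow GPU evaluation always use the DLL.
    if not model_allows_gpu:
        platform = "dll"

    # If OpenCL was chosen but is unavailable, fall back to CUDA, then DLL.
    if platform == "ocl" and not have_opencl:
        platform = "cuda" if have_cuda else "dll"

    return platform, dtype
```

As in the hunk, only the "ocl" choice triggers the availability fallback; an explicit "cuda" request is passed through unchanged by this step.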
  • sasmodels/kernel_header.c

    r108e70e r74e9b5f  
    11#ifdef __OPENCL_VERSION__ 
    22# define USE_OPENCL 
     3#elif defined(__CUDACC__) 
     4# define USE_CUDA 
    35#elif defined(_OPENMP) 
    46# define USE_OPENMP 
    57#endif 
     8 
     9// Use SAS_DOUBLE to force the use of double even for float kernels 
     10#define SAS_DOUBLE dou ## ble 
    611 
    712// If opencl is not available, then we are compiling a C function 
    813// Note: if using a C++ compiler, then define kernel as extern "C" 
    914#ifdef USE_OPENCL 
     15 
     16   #define USE_GPU 
     17   #define pglobal global 
     18   #define pconstant constant 
     19 
    1020   typedef int int32_t; 
    11 #  if defined(USE_SINCOS) 
    12 #    define SINCOS(angle,svar,cvar) svar=sincos(angle,&cvar) 
    13 #  else 
    14 #    define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
    15 #  endif 
     21 
     22   #if defined(USE_SINCOS) 
     23   #  define SINCOS(angle,svar,cvar) svar=sincos(angle,&cvar) 
     24   #else 
     25   #  define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
     26   #endif 
    1627   // Intel CPU on Mac gives strange values for erf(); on the verified 
    1728   // platforms (intel, nvidia, amd), the cephes erf() is significantly 
     
    2435   #  define erfcf erfc 
    2536   #endif 
    26 #else // !USE_OPENCL 
    27 // Use SAS_DOUBLE to force the use of double even for float kernels 
    28 #  define SAS_DOUBLE dou ## ble 
    29 #  ifdef __cplusplus 
     37 
     38#elif defined(USE_CUDA) 
     39 
     40   #define USE_GPU 
     41   #define local __shared__ 
     42   #define pglobal 
     43   #define constant __constant__ 
     44   #define pconstant const 
     45   #define kernel extern "C" __global__ 
     46 
     47   // OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
     48   // OpenCL pown(a,b) = C99 pow(a,b), b integer 
     49   #define powr(a,b) pow(a,b) 
     50   #define pown(a,b) pow(a,b) 
     51   //typedef int int32_t; 
     52   #if defined(USE_SINCOS) 
     53   #  define SINCOS(angle,svar,cvar) sincos(angle,&svar,&cvar) 
     54   #else 
     55   #  define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
     56   #endif 
     57 
     58#else // !USE_OPENCL && !USE_CUDA 
     59 
     60   #define local 
     61   #define pglobal 
     62   #define constant const 
     63   #define pconstant const 
     64 
     65   #ifdef __cplusplus 
    3066      #include <cstdio> 
    3167      #include <cmath> 
     
    5187     #endif 
    5288     inline void SINCOS(double angle, double &svar, double &cvar) { svar=sin(angle); cvar=cos(angle); } 
    53 else // !__cplusplus 
     89   #else // !__cplusplus 
    5490     #include <inttypes.h>  // C99 guarantees that int32_t types is here 
    5591     #include <stdio.h> 
     
    76112     #define kernel 
    77113     #define SINCOS(angle,svar,cvar) do {const double _t_=angle; svar=sin(_t_);cvar=cos(_t_);} while (0) 
    78 #  endif  // !__cplusplus 
    79 #  define global 
    80 #  define local 
    81 #  define constant const 
    82 // OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
    83 // OpenCL pown(a,b) = C99 pow(a,b), b integer 
    84 #  define powr(a,b) pow(a,b) 
    85 #  define pown(a,b) pow(a,b) 
     114   #endif  // !__cplusplus 
     115   // OpenCL powr(a,b) = C99 pow(a,b), b >= 0 
     116   // OpenCL pown(a,b) = C99 pow(a,b), b integer 
     117   #define powr(a,b) pow(a,b) 
     118   #define pown(a,b) pow(a,b) 
     119 
    86120#endif // !USE_OPENCL 
    87121 
  • sasmodels/kernel_iq.c

    r70530778 r74e9b5f  
    278278    const int32_t pd_start,     // where we are in the dispersity loop 
    279279    const int32_t pd_stop,      // where we are stopping in the dispersity loop 
    280     global const ProblemDetails *details, 
    281     global const double *values, 
    282     global const double *q, // nq q values, with padding to boundary 
    283     global double *result,  // nq+1 return values, again with padding 
     280    pglobal const ProblemDetails *details, 
     281    pglobal const double *values, 
     282    pglobal const double *q, // nq q values, with padding to boundary 
     283    pglobal double *result,  // nq+1 return values, again with padding 
    284284    const double cutoff     // cutoff in the dispersity weight product 
    285285    ) 
    286286{ 
    287 #ifdef USE_OPENCL 
     287#if defined(USE_GPU) 
    288288  // who we are and what element we are working with 
     289  #if defined(USE_OPENCL) 
    289290  const int q_index = get_global_id(0); 
     291  #else // USE_CUDA 
     292  const int q_index = threadIdx.x + blockIdx.x * blockDim.x; 
     293  #endif 
    290294  if (q_index >= nq) return; 
    291295#else 
     
    340344  // seeing one q value (stored in the variable "this_result") while the dll 
    341345  // version must loop over all q. 
    342   #ifdef USE_OPENCL 
     346  #if defined(USE_GPU) 
    343347    double pd_norm = (pd_start == 0 ? 0.0 : result[nq]); 
    344348    double this_result = (pd_start == 0 ? 0.0 : result[q_index]); 
    345   #else // !USE_OPENCL 
     349  #else // !USE_GPU 
    346350    double pd_norm = (pd_start == 0 ? 0.0 : result[nq]); 
    347351    if (pd_start == 0) { 
     
    352356    } 
    353357    //if (q_index==0) printf("start %d %g %g\n", pd_start, pd_norm, result[0]); 
    354 #endif // !USE_OPENCL 
     358#endif // !USE_GPU 
    355359 
    356360 
     
    375379  const int n4 = pd_length[4]; 
    376380  const int p4 = pd_par[4]; 
    377   global const double *v4 = pd_value + pd_offset[4]; 
    378   global const double *w4 = pd_weight + pd_offset[4]; 
     381  pglobal const double *v4 = pd_value + pd_offset[4]; 
     382  pglobal const double *w4 = pd_weight + pd_offset[4]; 
    379383  int i4 = (pd_start/pd_stride[4])%n4;  // position in level 4 at pd_start 
    380384 
     
    562566  const int n##_LOOP = details->pd_length[_LOOP]; \ 
    563567  const int p##_LOOP = details->pd_par[_LOOP]; \ 
    564   global const double *v##_LOOP = pd_value + details->pd_offset[_LOOP]; \ 
    565   global const double *w##_LOOP = pd_weight + details->pd_offset[_LOOP]; \ 
     568  pglobal const double *v##_LOOP = pd_value + details->pd_offset[_LOOP]; \ 
     569  pglobal const double *w##_LOOP = pd_weight + details->pd_offset[_LOOP]; \ 
    566570  int i##_LOOP = (pd_start/details->pd_stride[_LOOP])%n##_LOOP; 
    567571 
     
    587591// Pointers to the start of the dispersity and weight vectors, if needed. 
    588592#if MAX_PD>0 
    589   global const double *pd_value = values + NUM_VALUES; 
    590   global const double *pd_weight = pd_value + details->num_weights; 
     593  pglobal const double *pd_value = values + NUM_VALUES; 
     594  pglobal const double *pd_weight = pd_value + details->num_weights; 
    591595#endif 
    592596 
     
    648652      BUILD_ROTATION(); 
    649653 
    650 #ifndef USE_OPENCL 
     654#if !defined(USE_GPU) 
    651655      // DLL needs to explicitly loop over the q values. 
    652656      #ifdef USE_OPENMP 
     
    654658      #endif 
    655659      for (q_index=0; q_index<nq; q_index++) 
    656 #endif // !USE_OPENCL 
     660#endif // !USE_GPU 
    657661      { 
    658662 
     
    697701//printf("q_index:%d %g %g %g %g\n", q_index, scattering, weight0); 
    698702 
    699         #ifdef USE_OPENCL 
     703        #if defined(USE_GPU) 
    700704          this_result += weight * scattering; 
    701         #else // !USE_OPENCL 
     705        #else // !USE_GPU 
    702706          result[q_index] += weight * scattering; 
    703         #endif // !USE_OPENCL 
     707        #endif // !USE_GPU 
    704708      } 
    705709    } 
     
    725729 
    726730// Remember the current result and the updated norm. 
    727 #ifdef USE_OPENCL 
     731#if defined(USE_GPU) 
    728732  result[q_index] = this_result; 
    729733  if (q_index == 0) result[nq] = pd_norm; 
    730734//if (q_index == 0) printf("res: %g/%g\n", result[0], pd_norm); 
    731 #else // !USE_OPENCL 
     735#else // !USE_GPU 
    732736  result[nq] = pd_norm; 
    733737//printf("res: %g/%g\n", result[0], pd_norm); 
    734 #endif // !USE_OPENCL 
     738#endif // !USE_GPU 
    735739 
    736740// ** clear the macros in preparation for the next kernel ** 
  • sasmodels/kernelcl.py

    r95f62aa r95f62aa  
    11""" 
    22GPU driver for C kernels 
     3 
     4TODO: docs are out of date 
    35 
    46There should be a single GPU environment running on the system.  This 
     
    5961 
    6062 
    61 # Attempt to setup opencl. This may fail if the opencl package is not 
     63# Attempt to setup opencl. This may fail if the pyopencl package is not 
    6264# installed or if it is installed but there are no devices available. 
    6365try: 
     
    132134 
    133135def use_opencl(): 
    134     return HAVE_OPENCL and os.environ.get("SAS_OPENCL", "").lower() != "none" 
     136    sas_opencl = os.environ.get("SAS_OPENCL", "OpenCL").lower() 
     137    return HAVE_OPENCL and sas_opencl != "none" and not sas_opencl.startswith("cuda") 
    135138 
    136139ENV = None 
     
    180183        cl.kernel_work_group_info.PREFERRED_WORK_GROUP_SIZE_MULTIPLE, 
    181184        queue.device) 
    182  
    183 def _stretch_input(vector, dtype, extra=1e-3, boundary=32): 
    184     # type: (np.ndarray, np.dtype, float, int) -> np.ndarray 
    185     """ 
    186     Stretch an input vector to the correct boundary. 
    187  
    188     Performance on the kernels can drop by a factor of two or more if the 
    189     number of values to compute does not fall on a nice power of two 
    190     boundary.   The trailing additional vector elements are given a 
    191     value of *extra*, and so f(*extra*) will be computed for each of 
    192     them.  The returned array will thus be a subset of the computed array. 
    193  
    194     *boundary* should be a power of 2 which is at least 32 for good 
    195     performance on current platforms (as of Jan 2015).  It should 
    196     probably be the max of get_warp(kernel,queue) and 
    197     device.min_data_type_align_size//4. 
    198     """ 
    199     remainder = vector.size % boundary 
    200     if remainder != 0: 
    201         size = vector.size + (boundary - remainder) 
    202         vector = np.hstack((vector, [extra] * (size - vector.size))) 
    203     return np.ascontiguousarray(vector, dtype=dtype) 
    204  
    205185 
    206186def compile_model(context, source, dtype, fast=False): 
     
    342322    Uses SAS_OPENCL or PYOPENCL_CTX if they are set in the environment, 
    343323    otherwise scans for the most appropriate device using 
    344     :func:`_get_default_context` 
    345     """ 
    346     if 'SAS_OPENCL' in os.environ: 
    347         #Setting PYOPENCL_CTX as a SAS_OPENCL to create cl context 
    348         os.environ["PYOPENCL_CTX"] = os.environ["SAS_OPENCL"] 
     324    :func:`_get_default_context`.  Ignore *SAS_OPENCL=OpenCL*, which 
     325    indicates that an OpenCL device should be used without specifying 
     326    which one (and not a CUDA device, or no GPU). 
     327    """ 
     328    # Assume we do not get here if SAS_OPENCL is None or CUDA 
     329    sas_opencl = os.environ.get('SAS_OPENCL', 'opencl') 
     330    if sas_opencl.lower() != 'opencl': 
     331        # Setting PYOPENCL_CTX as a SAS_OPENCL to create cl context 
     332        os.environ["PYOPENCL_CTX"] = sas_opencl 
    349333 
    350334    if 'PYOPENCL_CTX' in os.environ: 
     
    644628                current_time = time.clock() 
    645629                if current_time - last_nap > 0.5: 
    646                     time.sleep(0.05) 
     630                    time.sleep(0.001) 
    647631                    last_nap = current_time 
    648632        cl.enqueue_copy(queue, self.result, result_b, wait_for=wait_for) 
  • sasmodels/model_test.py

    r012cd34 r74e9b5f  
    55Usage:: 
    66 
    7     python -m sasmodels.model_test [opencl|dll|opencl_and_dll] model1 model2 ... 
     7    python -m sasmodels.model_test [opencl|cuda|dll] model1 model2 ... 
    88 
    99    if model1 is 'all', then all except the remaining models will be tested 
     
    6363from .modelinfo import expand_pars 
    6464from .kernelcl import use_opencl 
     65from .kernelcuda import use_cuda 
    6566 
    6667# pylint: disable=unused-import 
     
    8081    Construct the pyunit test suite. 
    8182 
    82     *loaders* is the list of kernel drivers to use, which is one of 
    83     *["dll", "opencl"]*, *["dll"]* or *["opencl"]*.  For python models, 
    84     the python driver is always used. 
     83    *loaders* is the list of kernel drivers to use (dll, opencl or cuda). 
      84    For python models, the python driver is always used. 
    8585 
    8686    *models* is the list of models to test, or *["all"]* to test all models. 
     
    135135 
    136136            # test using dll if desired 
    137             if 'dll' in loaders or not use_opencl(): 
     137            if 'dll' in loaders: 
    138138                test_name = "%s-dll"%model_name 
    139139                test_method_name = "test_%s_dll" % model_info.id 
     
    156156                                     test_method_name, 
    157157                                     platform="ocl", dtype=None, 
     158                                     stash=stash) 
     159                #print("defining", test_name) 
     160                suite.addTest(test) 
     161 
     162            # test using cuda if desired and available 
     163            if 'cuda' in loaders and use_cuda(): 
     164                test_name = "%s-cuda"%model_name 
     165                test_method_name = "test_%s_cuda" % model_info.id 
     166                # Using dtype=None so that the models that are only 
     167                # correct for double precision are not tested using 
     168                # single precision.  The choice is determined by the 
     169                # presence of *single=False* in the model file. 
     170                test = ModelTestCase(test_name, model_info, 
     171                                     test_method_name, 
     172                                     platform="cuda", dtype=None, 
    158173                                     stash=stash) 
    159174                #print("defining", test_name) 
     
    220235 
    221236                # Check for missing tests.  Only do so for the "dll" tests 
    222                 # to reduce noise from both opencl and dll, and because 
     237                # to reduce noise from both opencl and cuda, and because 
    223238                # python kernels use platform="dll". 
    224239                if self.platform == "dll": 
     
    368383 
    369384    # Build a test suite containing just the model 
    370     loaders = ['opencl'] if use_opencl() else ['dll'] 
     385    loader = 'opencl' if use_opencl() else 'cuda' if use_cuda() else 'dll' 
    371386    models = [model] 
    372387    try: 
    373         suite = make_suite(loaders, models) 
     388        suite = make_suite([loader], models) 
    374389    except Exception: 
    375390        import traceback 
     
    434449        loaders = ['opencl'] 
    435450        models = models[1:] 
     451    elif models and models[0] == 'cuda': 
     452        if not use_cuda(): 
     453            print("cuda is not available") 
     454            return 1 
     455        loaders = ['cuda'] 
     456        models = models[1:] 
    436457    elif models and models[0] == 'dll': 
    437458        # TODO: test if compiler is available? 
    438459        loaders = ['dll'] 
    439460        models = models[1:] 
    440     elif models and models[0] == 'opencl_and_dll': 
    441         loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
    442         models = models[1:] 
    443461    else: 
    444         loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
     462        loaders = ['dll'] 
     463        if use_opencl(): 
     464            loaders.append('opencl') 
     465        if use_cuda(): 
     466            loaders.append('cuda') 
    445467    if not models: 
    446468        print("""\ 
    447469usage: 
    448   python -m sasmodels.model_test [-v] [opencl|dll] model1 model2 ... 
     470  python -m sasmodels.model_test [-v] [opencl|cuda|dll] model1 model2 ... 
    449471 
    450472If -v is included on the command line, then use verbose output. 
    451473 
    452 If neither opencl nor dll is specified, then models will be tested with 
    453 both OpenCL and dll; the compute target is ignored for pure python models. 
     474If no platform is specified, then models will be tested with dll, and 
     475if available, OpenCL and CUDA; the compute target is ignored for pure python models. 
    454476 
    455477If model1 is 'all', then all except the remaining models will be tested. 
     
    471493    Run "nosetests sasmodels" on the command line to invoke it. 
    472494    """ 
    473     loaders = ['opencl', 'dll'] if use_opencl() else ['dll'] 
     495    loaders = ['dll'] 
     496    if use_opencl(): 
     497        loaders.append('opencl') 
     498    if use_cuda(): 
     499        loaders.append('cuda') 
    474500    tests = make_suite(loaders, ['all']) 
    475501    def build_test(test): 
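
The loader selection in the model_test.py hunk above (an optional leading platform name, otherwise dll plus whichever GPU platforms are available) can be sketched as follows. The function `select_loaders` is hypothetical; the boolean arguments stand in for `use_opencl()` and `use_cuda()`, and raising an exception replaces the print-and-return-1 behaviour of the command-line entry point.

```python
def select_loaders(args, have_opencl, have_cuda):
    """Split command-line args into (loaders, models).

    Hypothetical sketch: if the first argument names a platform, only that
    platform is tested; otherwise dll is always tested, followed by opencl
    and cuda when they are available.
    """
    available = {"dll": True, "opencl": have_opencl, "cuda": have_cuda}
    if args and args[0] in available:
        platform = args[0]
        if not available[platform]:
            raise RuntimeError("%s is not available" % platform)
        return [platform], args[1:]
    # Default: every available platform, dll first.
    loaders = [name for name in ("dll", "opencl", "cuda") if available[name]]
    return loaders, args
```

With OpenCL present and CUDA absent, `select_loaders(["sphere"], True, False)` gives the loaders `["dll", "opencl"]` and leaves `["sphere"]` as the model list.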
  • sasmodels/models/lib/gauss76.c

    r99b84ec r74e9b5f  
    1111 
    1212// Gaussians 
    13 constant double Gauss76Wt[76]={ 
     13constant double Gauss76Wt[76] = { 
    1414        .00126779163408536,             //0 
    1515        .00294910295364247, 
     
    9090}; 
    9191 
    92 constant double Gauss76Z[76]={ 
     92constant double Gauss76Z[76] = { 
    9393        -.999505948362153,              //0 
    9494        -.997397786355355, 
  • sasmodels/models/lib/polevl.c

    r447e9aa r74e9b5f  
    5151*/ 
    5252 
    53 double polevl( double x, constant double *coef, int N ); 
    54 double polevl( double x, constant double *coef, int N ) 
     53static 
     54double polevl( double x, pconstant double *coef, int N ) 
    5555{ 
    5656 
     
    7272 */ 
    7373 
    74 double p1evl( double x, constant double *coef, int N ); 
    75 double p1evl( double x, constant double *coef, int N ) 
     74static 
     75double p1evl( double x, pconstant double *coef, int N ) 
    7676{ 
    7777    int i=0; 
  • sasmodels/models/lib/sas_J1.c

    r5181ccc r74e9b5f  
    4242#if FLOAT_SIZE>4 
    4343//Cephes double precision function 
    44 double cephes_j1(double x); 
    4544 
    4645constant double RPJ1[8] = { 
     
    106105    0.0 }; 
    107106 
     107static 
    108108double cephes_j1(double x) 
    109109{ 
     
    155155#else 
    156156//Single precision version of cephes 
    157 float cephes_j1f(float x); 
    158  
    159157constant float JPJ1[8] = { 
    160158    -4.878788132172128E-009, 
     
    190188    }; 
    191189 
     190static 
    192191float cephes_j1f(float xx) 
    193192{ 
     
    240239 
    241240//Finally J1c function that equals 2*J1(x)/x 
    242 double sas_2J1x_x(double x); 
     241static 
    243242double sas_2J1x_x(double x) 
    244243{ 