Changeset 8b31efa in sasmodels

Oct 15, 2018 1:27:14 PM
beta_approx, cuda-test, py3, ticket-1015-gpu-mem-error, ticket-1157, ticket-608-user-defined-weights, ticket_1156
508475a, d5ce7fa

document cuda device selection; fix cuda speed issue

3 edited


  • doc/guide/gpu_setup.rst

    r63602b1 r8b31efa  
      94   94   Device Selection
           96 + **OpenCL drivers**
      96   98   If you have multiple GPU devices you can tell the program which device to use.
      97   99   By default, the program looks for one GPU and one CPU device from available
     104  106   was used to run the model.
     106      - **If you don't want to use OpenCL, you can set** *SAS_OPENCL=None*
     107      - **in your environment settings, and it will only use normal programs.**
     109      - If you want to use one of the other devices, you can run the following
          108 + If you want to use a specific driver and devices, you can run the following
     110  109   from the python console::
     115  114   This will provide a menu of different OpenCL drivers available.
     116  115   When one is selected, it will say "set PYOPENCL_CTX=..."
     117      - Use that value as the value of *SAS_OPENCL*.
          116 + Use that value as the value of *SAS_OPENCL=driver:device*.
          118 + To use the default OpenCL device (rather than CUDA or None),
          119 + set *SAS_OPENCL=opencl*.
          121 + In batch queues, you may need to set *XDG_CACHE_HOME=~/.cache*
          122 + (Linux only) to a different directory, depending on how the filesystem
          123 + is configured.  You should also set *SAS_DLL_PATH* for CPU-only modules.
          125 +     -DSAS_MODELPATH=path sets directory containing custom models
          126 +     -DSAS_OPENCL=vendor:device|cuda:device|none sets the target GPU device
          127 +     -DXDG_CACHE_HOME=~/.cache sets the pyopencl cache root (linux only)
          128 +     -DSAS_COMPILER=tinycc|msvc|mingw|unix sets the DLL compiler
          129 +     -DSAS_OPENMP=1 turns on OpenMP for the DLLs
          130 +     -DSAS_DLL_PATH=path sets the path to the compiled modules
          133 + **CUDA drivers**
          135 + If OpenCL drivers are not available on your system, but NVidia CUDA
          136 + drivers are available, then set *SAS_OPENCL=cuda* or
          137 + *SAS_OPENCL=cuda:n* for a particular device number *n*.  If no device
          138 + number is specified, then the CUDA drivers look for
          139 + *CUDA_DEVICE=n* or a file ~/.cuda-device containing n for the device number.
          141 + In batch queues, the SLURM command *sbatch --gres=gpu:1 ...* will set
          142 + *CUDA_VISIBLE_DEVICES=n*, which ought to set the correct device
          143 + number for *SAS_OPENCL=cuda*.  If not, then set
          144 + *CUDA_DEVICE=$CUDA_VISIBLE_DEVICES* within the batch script.  You may
          145 + need to set the CUDA cache directory to a folder accessible across the
          146 + cluster with *PYCUDA_CACHE_DIR* (or *PYCUDA_DISABLE_CACHE* to disable
          147 + caching), and you may need to set environment specific compiler flags
          148 + with *PYCUDA_DEFAULT_NVCC_FLAGS*.  You should also set *SAS_DLL_PATH*
          149 + for CPU-only modules.
          151 + **No GPU support**
          153 + If you don't want to use OpenCL or CUDA, you can set *SAS_OPENCL=None*
          154 + in your environment settings, and it will only use normal programs.
          156 + In batch queues, you may need to set *SAS_DLL_PATH* to a directory
          157 + accessible on the compute node.
     119  160   Device Testing
     154  195   *Document History*
     156      - | 2017-09-27 Paul Kienzle
          197 + | 2018-10-15 Paul Kienzle
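
The *SAS_OPENCL* values described in the documentation change above can be summarized in a short sketch. The helper name `classify_sas_opencl` is hypothetical and is not part of sasmodels; treating an empty setting as the default OpenCL device is also an assumption made for illustration, not something the changeset states.

```python
def classify_sas_opencl(value):
    """Hypothetical helper: map a SAS_OPENCL setting to (backend, device).

    Mirrors the documented values only:
      none          -> no GPU support (compiled CPU modules)
      cuda, cuda:n  -> CUDA, optionally with device number n
      opencl        -> default OpenCL device
      driver:device -> specific OpenCL driver and device
    """
    text = (value or "").strip()
    lowered = text.lower()
    if lowered == "none":
        return ("cpu", None)
    if lowered == "cuda":
        # No device number given; the docs say CUDA_DEVICE or
        # ~/.cuda-device is consulted in that case.
        return ("cuda", None)
    if lowered.startswith("cuda:"):
        return ("cuda", int(lowered.split(":", 1)[1]))
    if lowered in ("", "opencl"):
        # Assumption: an unset or empty value behaves like "opencl".
        return ("opencl", None)
    # Anything else is taken as an OpenCL driver:device string.
    return ("opencl", text)
```

For example, `classify_sas_opencl("cuda:1")` would select CUDA device 1, while `classify_sas_opencl("0:1")` would be passed through as an OpenCL driver:device pair.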
  • sasmodels/

    rb0de252 r8b31efa  
     227  227           self.context = None
     228  228           if 'SAS_OPENCL' in os.environ:
     229      -             #Setting PYOPENCL_CTX as a SAS_OPENCL to create cl context
     230      -             os.environ["PYOPENCL_CTX"] = os.environ["SAS_OPENCL"]
          229 +             # Set the PyOpenCL environment variable PYOPENCL_CTX
          230 +             # from SAS_OPENCL=driver:device.  Ignore the generic
          231 +             # SAS_OPENCL=opencl, which is used to select the default
          232 +             # OpenCL device.  Don't need to check for "none" or
          233 +             # "cuda" since use_opencl() would return False if they
          234 +             # were defined, and we wouldn't get here.
          235 +             dev_str = os.environ["SAS_OPENCL"]
          236 +             if dev_str and dev_str.lower() != "opencl":
          237 +                 os.environ["PYOPENCL_CTX"] = dev_str
     231  239           if 'PYOPENCL_CTX' in os.environ:
     232  240               self._create_some_context()
     568  576                   current_time = time.clock()
     569  577                   if current_time - last_nap > 0.5:
     570      -                     time.sleep(0.05)
          578 +                     time.sleep(0.001)
     571  579                       last_nap = current_time
     572  580           cl.enqueue_copy(self.queue, self.result, self.result_b)
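
The new environment handling in this file can be exercised in isolation with a minimal sketch. The function name `apply_sas_opencl` is hypothetical, and it works on a copy of a plain dictionary rather than on `os.environ`. As in the changeset's comment, it assumes the caller has already excluded the "none" and "cuda" settings.

```python
def apply_sas_opencl(environ):
    """Hypothetical stand-in for the changed block: return a copy of
    environ with PYOPENCL_CTX derived from SAS_OPENCL=driver:device."""
    env = dict(environ)
    dev_str = env.get("SAS_OPENCL", "")
    # The generic value "opencl" means "use the default OpenCL device",
    # so it must not be forwarded to PyOpenCL as a context string.
    if dev_str and dev_str.lower() != "opencl":
        env["PYOPENCL_CTX"] = dev_str
    return env
```

With this sketch, `apply_sas_opencl({"SAS_OPENCL": "0:1"})` gains `PYOPENCL_CTX="0:1"`, while `SAS_OPENCL=opencl` leaves any existing `PYOPENCL_CTX` untouched.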
  • sasmodels/

    r74e9b5f r8b31efa  
     444  444           self.q_input = q_input # allocated by GpuInput above
     446      -         self._need_release = [self.result_b, self.q_input]
          446 +         self._need_release = [self.result_b]
     447  447           self.real = (np.float32 if dtype == generate.F32
     448  448                        else np.float64 if dtype == generate.F64
     467  467           # Call kernel and retrieve results
     468  468           last_nap = time.clock()
     469      -         step = 1000000//self.q_input.nq + 1
          469 +         step = 100000000//self.q_input.nq + 1
     470  470           #step = 1000000000
     471  471           for start in range(0, call_details.num_eval, step):
     479  479                   current_time = time.clock()
     480  480                   if current_time - last_nap > 0.5:
     481      -                     time.sleep(0.05)
          481 +                     time.sleep(0.001)
     482  482                       last_nap = current_time
     483  483           sync()
     500  500           Release resources associated with the kernel.
     501  501           """
     502      -         if self.result_b is not None:
     504      -             self.result_b = None
          502 +         for p in self._need_release:
          504 +         self._need_release = []
     506  506       def __del__(self):
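
The chunked evaluation loop touched by this change can be sketched on its own. The function `run_chunks`, its parameters, and the `run_chunk` callback are hypothetical names; the lines between the `for` statement and the nap check are not shown in the changeset, so the `stop < num_eval` guard is an assumption. The sketch uses `time.perf_counter()` in place of `time.clock()`, which was removed in Python 3.8.

```python
import time


def run_chunks(num_eval, nq, run_chunk, budget=100000000,
               nap_interval=0.5, nap_length=0.001):
    """Hypothetical sketch: evaluate num_eval points in chunks.

    The chunk size follows the changed line, step = budget//nq + 1,
    so a larger budget means fewer and larger kernel launches.
    Returns the number of chunks that were run.
    """
    step = budget // nq + 1
    last_nap = time.perf_counter()
    chunks = 0
    for start in range(0, num_eval, step):
        stop = min(start + step, num_eval)
        run_chunk(start, stop)  # stand-in for the kernel launch
        chunks += 1
        if stop < num_eval:  # assumption: only nap between chunks
            now = time.perf_counter()
            if now - last_nap > nap_interval:
                # Brief pause so other work can use the device.
                time.sleep(nap_length)
                last_nap = now
    return chunks
```

With the default budget and `nq=128`, the step is 781251 evaluations, so a 1000-point calculation runs as a single chunk.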