Opened 3 years ago

Closed 2 years ago

#814 closed defect (fixed)

OpenCL calculation errors are not being identified

Reported by: butler
Owned by: butler
Priority: blocker
Milestone: SasView 4.1.0
Component: SasView
Keywords:
Cc:
Work Package: SasView Bug Fixing

Description (last modified by butler)

While working through ticket #792 I discovered that fractal_core_shell.c is receiving 0.0 for thickness rather than the default value or the value set in the GUI. I went back and verified that this is true for the latest development build (Windows 7 build 601). This was verified on my laptop, whose log file reports:

 OpenCL Intel(R) HD Graphics 4400

I then downloaded the same build to my non-OpenCL machine and, judging by its behavior, it does not appear to be passing a thickness of 0. In other words, it behaves correctly. Finally, I went back to the development version that has a print statement in the C code and added

opencl = False

and now the value for thickness is being passed. Thus it seems to be a problem with the C code for OpenCL.

Further, version 4.01 on the same OpenCL laptop also does not seem to have the problem of passing 0 thickness, so it must have been introduced since 4.01.

Attachments (1)

TEST-ModelTestCase-20170226141044.xml (7.3 KB) - added by butler 2 years ago.
xml output of running python script to test models for OpenCL


Change History (10)

comment:1 Changed 3 years ago by butler

  • Description modified (diff)

comment:2 Changed 3 years ago by pkienzle

Cannot reproduce (OS/X, HD Graphics 4000).

The usual source of these sorts of errors (array lengths not a multiple of 4) seems not to apply in this case since none of this code uses arrays.

Which other models are failing?

Try the following from the sasmodels directory to check all models through OpenCL:

$ python -m sasmodels.model_test -v opencl all

comment:3 Changed 3 years ago by butler

  • Owner changed from pkienzle to butler
  • Status changed from new to assigned

Paul Butler to do further testing - possibly move to 4.2 as opencl=off will be default in 4.1

Changed 2 years ago by butler

xml output of running python script to test models for OpenCL

comment:4 Changed 2 years ago by butler

Actually, I think this is a lot more insidious than that, since no failure is detected anywhere. It seems a thickness is properly handed off but received as zero. The only way to know is:

  1. Try varying the thickness, with and without OpenCL, from 1 to some extremely large number. If it makes no difference there is likely a problem, or
  2. Put a print statement in the C code to trap the value of thickness it is receiving.
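
Check 1 can be sketched in a few lines. This is a minimal illustration, not SasView code: `parameter_has_effect` and both toy kernels are hypothetical stand-ins for a real model evaluation, and `broken_kernel` only mimics a backend that silently receives 0.0 for thickness.

```python
import math

def parameter_has_effect(kernel, q_values, name, trial_values, **fixed):
    """Return True if changing parameter `name` changes the kernel output.

    `kernel` is any callable kernel(q, **pars) -> float; here it is a
    hypothetical stand-in for evaluating a model on one backend.
    """
    curves = []
    for value in trial_values:
        pars = dict(fixed, **{name: value})
        curves.append([kernel(q, **pars) for q in q_values])
    reference = curves[0]
    # Any point that differs from the first curve proves the parameter is used.
    return any(
        not math.isclose(a, b, rel_tol=1e-9, abs_tol=0.0)
        for curve in curves[1:]
        for a, b in zip(reference, curve)
    )

def good_kernel(q, radius, thickness):
    # Toy function whose output depends on thickness (not a real form factor).
    return math.exp(-((q * (radius + thickness)) ** 2) / 5.0)

def broken_kernel(q, radius, thickness):
    # Mimics the reported bug: thickness arrives as 0.0 whatever was passed.
    return good_kernel(q, radius, 0.0)

q_values = [0.001, 0.01, 0.1]
trial = [1.0, 100.0, 1.0e4]
print(parameter_has_effect(good_kernel, q_values, "thickness", trial, radius=60.0))
print(parameter_has_effect(broken_kernel, q_values, "thickness", trial, radius=60.0))
```

A parameter that reports no effect across several orders of magnitude is a strong hint that the backend is not receiving it.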

Using the new OpenCL dialog I ran the tests which gave the following output:

132 tests completed.
1 tests failed.
Failing tests: ["Model: hollow_cylinder, Kernel: OpenCL"]

Platform Details:

Sasmodels version: 0.94

Platform used: ["Windows", "DeGennes", "8", "6.2.9200", "AMD64", "Intel64 Family 6 Model 69 Stepping 1, GenuineIntel"]

OpenCL driver: [["Intel(R) Corporation", "OpenCL 1.2 ", "Intel(R) Corporation", "Intel(R) HD Graphics 4400", "OpenCL 1.2 "], ["Intel(R) Corporation", "OpenCL 1.2 ", "Intel(R) Corporation", "Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz", "OpenCL 1.2 (Build 10094)"]]

Note that indeed the fractal core shell does NOT show up as failing.

I also ran the code suggested above from sasmodels, with the following output (only 75 tests, so I assume it stopped testing when it hit a failure, which is the same model that failed from within the SasView application).

  test_two_power_law_python (__main__.ModelTestCase) ... OK (0.015s)
  test_unified_power_Rg_python (__main__.ModelTestCase) ... OK (0.016s)
  test_vesicle_opencl (__main__.ModelTestCase) ... OK (0.076s)

FAIL [0.070s]: test_hollow_cylinder_opencl (__main__.ModelTestCase)
Traceback (most recent call last):
  File "C:\Users\Paul\git\sasmodels\sasmodels\", line 181, in run_all
    self.run_one(model, test)
  File "C:\Users\Paul\git\sasmodels\sasmodels\", line 229, in run_one
    'invalid f(%s): %s' % (xi, actual_yi))
AssertionError: invalid f((0.1, 0.1)): nan Model: hollow_cylinder, Kernel: OpenCL

Ran 75 tests in 9.017s

FAILED (failures=1)

Generating XML reports...


comment:5 Changed 2 years ago by butler

OK … confirmed this is a problem with the HD4400 and/or gen 4 i7. On the office desktop the model does work (including thickness) with OpenCL on.

2017-02-27 08:53:19,163 INFO building fractal_core_shell-float32-D574FE7E for OpenCL Intel(R) HD Graphics 4000
2017-02-27 08:55:11,944 INFO building fractal_core_shell-float32-D574FE7E for OpenCL Intel(R) Core(TM) i7-3770S CPU @ 3.10GHz

This may be something that cannot be addressed because it is card dependent, but surely this is a solved problem? It is very worrisome that it can fail quietly without letting the user know anything is wrong!

comment:6 Changed 2 years ago by pkienzle

There are no tests for specific q values in the model.

We can add a couple of lines to sasmodels.model_test (around line 184) so that all models require test cases. It needs to check that there are 1D, 2D, ER and VR tests depending on the model, so a simple "if not tests" check will not suffice.
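
A minimal sketch of such a check, under stated assumptions: each test is taken to be a `[params, x, expected]` triple in which `x` is a number or list of numbers for 1D, a `(qx, qy)` tuple or list of tuples for 2D, or the string `'ER'` or `'VR'`. The names `classify_test` and `missing_test_kinds` and the `supports_*` flags are hypothetical; the actual change in sasmodels.model_test may look quite different.

```python
def classify_test(x):
    """Classify the x entry of one [params, x, expected] test triple."""
    if isinstance(x, str):
        return x if x in ("ER", "VR") else "unknown"
    if isinstance(x, tuple):
        return "2D"  # a single (qx, qy) point
    if isinstance(x, list) and x and isinstance(x[0], tuple):
        return "2D"  # a list of (qx, qy) points
    return "1D"      # a single q value or a list of q values

def missing_test_kinds(tests, supports_2d=True, supports_er=False, supports_vr=False):
    """Return the sorted list of test kinds the model supports but does not test."""
    required = {"1D"}
    if supports_2d:
        required.add("2D")
    if supports_er:
        required.add("ER")
    if supports_vr:
        required.add("VR")
    present = {classify_test(test[1]) for test in tests}
    return sorted(required - present)

example_tests = [
    [{}, 0.1, 1.0],         # 1D test
    [{}, (0.1, 0.1), 2.0],  # 2D test
]
print(missing_test_kinds(example_tests, supports_er=True))
```

An empty test list reports every supported kind as missing, which is the case the simple emptiness check would also catch; the per-kind bookkeeping is what catches a model that has 1D tests only.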

As to why fractal_core_shell is failing on this card, that is more challenging. It is not a particularly complicated function, so this is not an issue with array lengths or the like.

Fractal core shell uses the following special functions:

    cube, square
    sas_gamma, sas_3j1x_x
    pow, sin, atan

Hollow cylinder uses the following:

    sas_2J1x_x, sas_sinx_x

These do not seem like they will cause errors.

comment:7 Changed 2 years ago by pkienzle

The ticket-814 branch of sasmodels now raises an error if missing 1D, 2D, ER or VR tests when the model supports those features. To run the tests:

python -m sasmodels.model_test -v dll all

To identify all models for which OpenCL is failing on your GPU, has been extended with an option to restrict the list of models to only the opencl 1D single models.

To use it, do the following:

# unix
./ opencl+single+1d 10 1d100 mono single single! > single1d.csv
# windows
multi_compare opencl+single+1d 10 1d100 mono single single! > single1d.csv

For 2D use:

# unix
./ opencl+single+2d 10 2d10 mono single single! > single2d.csv
# windows
multi_compare opencl+single+2d 10 2d10 mono single single! > single2d.csv

The resulting csv files show the difference between GPU and DLL calculations for each of the models.
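
The kind of comparison these files summarize can be sketched as follows. This is an illustration only: `max_relative_error` is a hypothetical helper, and both the metric and the NaN handling are assumptions rather than a description of what the comparison script actually computes.

```python
import math

def max_relative_error(reference, candidate):
    """Largest |candidate - reference| / |reference| over paired values.

    Returns infinity if any candidate value is NaN or infinite, so that a
    backend producing invalid numbers cannot pass unnoticed.
    """
    worst = 0.0
    for ref, val in zip(reference, candidate):
        if not math.isfinite(val):
            return math.inf
        # Fall back to absolute error where the reference is exactly zero.
        scale = abs(ref) if ref != 0.0 else 1.0
        worst = max(worst, abs(val - ref) / scale)
    return worst

dll_values = [1.0, 2.0, 4.0]  # made-up numbers standing in for DLL results
gpu_values = [1.0, 2.2, 4.0]  # made-up numbers standing in for GPU results
print(max_relative_error(dll_values, gpu_values))
```

What counts as an acceptable error depends on the precision being compared, so any threshold applied to this number has to be chosen per device and per precision.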

If your card supports double, replace single with double in the commands to check the double-precision only models.

comment:8 Changed 2 years ago by pkienzle

  • Summary changed from OpenCL error in fractal_core_shell to OpenCL calculation errors are not being identified

Even without certified values for all the models, the GPU vs. CPU comparison can be run as part of the GPU test suite in the OpenCL device selection dialog, at least for models which do not have specific test values in the file.


comment:9 Changed 2 years ago by GitHub <noreply@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 0011eccbe83d86b849216e2b13344fd6bea8cae3/sasmodels:

Merge pull request #35 from SasView/ticket-814

Finds models that throw errors when running the test routine. Fixes #814
