Opened 8 years ago
Closed 8 years ago
#814 closed defect (fixed)
OpenCL calculation errors are not being identified
Reported by: | butler | Owned by: | butler |
---|---|---|---|
Priority: | blocker | Milestone: | SasView 4.1.0 |
Component: | SasView | Keywords: | |
Cc: | | Work Package: | SasView Bug Fixing |
Description (last modified by butler)
While working through ticket #792, I discovered that fractal_core_shell.c is receiving 0.0 for thickness rather than the input from the defaults in fractal_core_shell.py or as changed from the GUI. I went back and verified that this is true for the latest development build (Windows 7 build 601). This was verified on my laptop, which according to the log file uses:
OpenCL Intel(R) HD Graphics 4400
I then downloaded the same build to my non-OpenCL machine, and judging by its behavior it does not appear to be passing 0 thickness. In other words, it behaves correctly. Finally, I went back to the development version that has a print statement in the C code and added
opencl = False
and now the value for thickness is being passed. Thus it seems to be a problem with the C code for OpenCL.
Further, version 4.01 on the same OpenCL laptop also does not seem to have the problem of passing 0 thickness, so it must have been introduced since 4.01.
Attachments (1)
Change History (10)
comment:1 Changed 8 years ago by butler
- Description modified (diff)
comment:2 Changed 8 years ago by pkienzle
comment:3 Changed 8 years ago by butler
- Owner changed from pkienzle to butler
- Status changed from new to assigned
Paul Butler to do further testing - possibly move to 4.2 as opencl=off will be default in 4.1
comment:4 Changed 8 years ago by butler
Actually, I think this is a lot more insidious than that, since no failure is detected anywhere. It seems a thickness is properly handed off but received as zero. The only ways to know are:
- Try varying the thickness with and without OpenCL from 1 to some extremely large number. If it makes no difference, there is likely a problem, or
- Put a print statement in the C code to trap the value of thickness it is receiving.
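The first check in the list above can be sketched in Python. This is only a minimal sketch, not sasmodels code: `parameter_has_effect` and the two toy model functions are hypothetical stand-ins for whatever callable evaluates I(q) on a given backend. The point is that a parameter silently replaced by zero shows up as a curve that never changes, however far the parameter is swept.

```python
import numpy as np

def parameter_has_effect(model, q, name, values, **fixed):
    """Return True if changing parameter `name` changes model(q, ...).

    `model` is any callable model(q, **pars) -> array. It is a stand-in
    (an assumption) for a real kernel evaluation on one backend.
    """
    curves = [np.asarray(model(q, **dict(fixed, **{name: v}))) for v in values]
    reference = curves[0]
    # The parameter has an effect if any curve differs from the first one.
    return any(
        not np.allclose(reference, c, rtol=1e-6, atol=0.0, equal_nan=True)
        for c in curves[1:]
    )

# Toy models (hypothetical): one uses thickness, one silently ignores it.
def good_model(q, radius=60.0, thickness=10.0):
    return 1.0 / (1.0 + (q * (radius + thickness)) ** 2)

def broken_model(q, radius=60.0, thickness=10.0):
    thickness = 0.0  # mimics the kernel receiving 0.0 regardless of input
    return 1.0 / (1.0 + (q * (radius + thickness)) ** 2)

q = np.logspace(-3, -1, 50)
sweep = [1.0, 10.0, 100.0, 1e4]
print(parameter_has_effect(good_model, q, "thickness", sweep))
print(parameter_has_effect(broken_model, q, "thickness", sweep))
```

Running the same sweep once per backend (OpenCL on, OpenCL off) and comparing the two answers would flag a parameter that is dropped on only one of them.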
Using the new OpenCL dialog I ran the tests which gave the following output:
132 tests completed.
1 tests failed.
Failing tests: ["Model: hollow_cylinder, Kernel: OpenCL"]
Platform Details:
Sasmodels version: 0.94
Platform used: ["Windows", "DeGennes", "8", "6.2.9200", "AMD64", "Intel64 Family 6 Model 69 Stepping 1, GenuineIntel"]
OpenCL driver: [["Intel(R) Corporation", "OpenCL 1.2 ", "Intel(R) Corporation", "Intel(R) HD Graphics 4400", "OpenCL 1.2 "], ["Intel(R) Corporation", "OpenCL 1.2 ", "Intel(R) Corporation", "Intel(R) Core(TM) i7-4500U CPU @ 1.80GHz", "OpenCL 1.2 (Build 10094)"]]
Note that indeed the fractal core shell does NOT show up as failing.
I also ran the code suggested above from sasmodels with the following output (only 75 tests so assume it stopped testing when it hit a failure … which is the same model that failed from within the SasView application).
......
.......
test_two_power_law_python (__main__.ModelTestCase) ... OK (0.015s)
test_unified_power_Rg_python (__main__.ModelTestCase) ... OK (0.016s)
test_vesicle_opencl (__main__.ModelTestCase) ... OK (0.076s)
======================================================================
FAIL [0.070s]: test_hollow_cylinder_opencl (__main__.ModelTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Paul\git\sasmodels\sasmodels\model_test.py", line 181, in run_all
    self.run_one(model, test)
  File "C:\Users\Paul\git\sasmodels\sasmodels\model_test.py", line 229, in run_one
    'invalid f(%s): %s' % (xi, actual_yi))
AssertionError: invalid f((0.1, 0.1)): nan
Model: hollow_cylinder, Kernel: OpenCL
----------------------------------------------------------------------
Ran 75 tests in 9.017s
FAILED (failures=1)
Generating XML reports...
C:\Users\Paul\git\sasmodels>
comment:5 Changed 8 years ago by butler
OK … confirmed this is a problem with the HD4400 and/or gen 4 i7. On the office desktop the model does work (including thickness) with OpenCL on.
2017-02-27 08:53:19,163 INFO building fractal_core_shell-float32-D574FE7E for OpenCL Intel(R) HD Graphics 4000
2017-02-27 08:55:11,944 INFO building fractal_core_shell-float32-D574FE7E for OpenCL Intel(R) Core(TM) i7-3770S CPU @ 3.10GHz
This may be something that cannot be addressed because it is card dependent, but surely this must be a solved problem? It is very worrisome that it can quietly fail without letting the user know anything is wrong!
comment:6 Changed 8 years ago by pkienzle
There are no tests for specific q values in the model.
We can add a couple of lines to sasmodels.model_test (around line 184) so that all models require test cases. It needs to check that there are 1D, 2D, ER and VR tests depending on the model, so the simple if not tests check will not suffice.
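A minimal sketch of such a coverage check is shown below. It is not the code in sasmodels.model_test: `classify_test` and `missing_test_kinds` are hypothetical helpers, and the sketch assumes each test is a `[pars, x, expected]` entry in which `x` is the string 'ER' or 'VR', a (qx, qy) tuple for 2D, or a scalar q (or list of scalars) for 1D. The expected values are placeholders, not certified results.

```python
def classify_test(x):
    """Classify one test input as '1D', '2D', 'ER' or 'VR' (assumed layout)."""
    if isinstance(x, str):
        return x  # 'ER' or 'VR'
    if isinstance(x, tuple):
        return "2D"  # a single (qx, qy) point
    if isinstance(x, list) and x and isinstance(x[0], tuple):
        return "2D"  # a list of (qx, qy) points
    return "1D"  # a scalar q or a list of scalar q values

def missing_test_kinds(tests, supported):
    """Return the sorted test kinds the model supports but never tests."""
    seen = {classify_test(x) for _pars, x, _expected in tests}
    return sorted(set(supported) - seen)

# Placeholder tests: 1D and VR are covered, 2D and ER are not.
tests = [
    [{}, 0.1, 1.0],
    [{}, "VR", 1.0],
]
print(missing_test_kinds(tests, {"1D", "2D", "ER", "VR"}))  # ['2D', 'ER']

# A bare `if not tests` only catches the completely empty case.
print(missing_test_kinds([], {"1D", "2D"}))  # ['1D', '2D']
```

A test runner could raise an error whenever the returned list is non-empty, which covers both the empty case and the partially tested case.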
As to why fractal_core_shell is failing on this card, that is more challenging. It is not a particularly complicated function, so not an issue with array lengths, et al.
Fractal core shell uses the following special functions:
cube, square
sas_gamma, sas_3j1x_x
pow, sin, atan
Hollow cylinder uses the following:
square
sas_2J1x_x, sas_sinx_x
These do not seem like they will cause errors.
comment:7 Changed 8 years ago by pkienzle
The ticket-814 branch of sasmodels now raises an error if missing 1D, 2D, ER or VR tests when the model supports those features. To run the tests:
python -m sasmodels.model_test -v dll all
To identify all models for which OpenCL is failing on your GPU, multi_compare.sh has been extended with an option to restrict the list of models to only the opencl 1D single models.
To use it, do the following:
# unix
./multi_compare.sh opencl+single+1d 10 1d100 mono single single! > single1d.csv
# windows
multi_compare opencl+single+1d 10 1d100 mono single single! > single1d.csv
For 2D use:
# unix
./multi_compare.sh opencl+single+2d 10 2d10 mono single single! > single2d.csv
# windows
multi_compare opencl+single+2d 10 2d10 mono single single! > single2d.csv
The resulting csv files show the difference between GPU and DLL calculations for each of the models.
If your card supports double, replace single with double in the commands to check the double-precision only models.
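The kind of GPU versus DLL comparison described here can be sketched as follows. This is an illustration only, not what multi_compare computes: `compare_backends` is a hypothetical helper, the arrays are made-up numbers rather than real model output, and the tolerance of 1e-4 is an assumed value for single precision. The sketch treats any NaN or infinity from the GPU as a failure, since a silent NaN is exactly what the hollow_cylinder test tripped over.

```python
import numpy as np

def compare_backends(gpu, cpu, rtol=1e-4):
    """Compare GPU results against a CPU reference.

    Returns (ok, max_rel_err, n_bad) where n_bad counts non-finite GPU
    values. `rtol` is an assumed tolerance, not a sasmodels setting.
    """
    gpu = np.asarray(gpu, dtype=float)
    cpu = np.asarray(cpu, dtype=float)
    finite = np.isfinite(gpu)
    # Guard against division by zero where the reference is exactly 0.
    scale = np.maximum(np.abs(cpu), np.finfo(float).tiny)
    rel_err = np.abs(gpu - cpu) / scale
    max_err = float(rel_err[finite].max()) if finite.any() else float("nan")
    ok = bool(finite.all() and max_err <= rtol)
    return ok, max_err, int((~finite).sum())

cpu = np.array([1.0, 2.0, 4.0])                            # made-up reference
print(compare_backends(cpu * (1 + 1e-6), cpu))             # small rounding difference
print(compare_backends(np.array([1.0, np.nan, 4.0]), cpu)) # silent NaN on the GPU
print(compare_backends(np.array([1.0, 2.0, 8.0]), cpu))    # wrong value on the GPU
```

Only the first call reports `ok` as True; the other two fail on the NaN count and on the relative error respectively.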
comment:8 Changed 8 years ago by pkienzle
- Summary changed from OpenCL error in fractal_core_shell to OpenCL calculation errors are not being identified
Even without certified values for all the models, the GPU vs. CPU comparison can be run as part of the GPU test suite in the OpenCL device selection dialog.
comment:9 Changed 8 years ago by GitHub <noreply@…>
- Resolution set to fixed
- Status changed from assigned to closed
Cannot reproduce (OS/X, HD Graphics 4000).
The usual source of these sorts of errors (array lengths not a multiple of 4) seems not to apply in this case since none of this code uses arrays.
Which other models are failing?
Try the following from the sasmodels directory to check all models through OpenCL: