Opened 6 years ago

Closed 6 years ago

#1037 closed defect (fixed)

data loader crop not working? & all fits crashing

Reported by: richardh Owned by: krzywon
Priority: major Milestone: SasView 4.2.0
Component: SasView Keywords:
Cc: pkienzle Work Package: SasView Bug Fixing

Description

Correspondence with a user from Soleil regarding an issue with 4.1.2; the user says 4.1.0 is OK.

The data set has a lot of nan at small Q; the loader ought to either ignore them or flag a problem. Fitting then goes awry: even if it returns a failed message or you Stop the fit, subsequent fits also fail, even in a different tab with good data that worked previously, so the fitting is somehow not recovering after the initial error. See details below; original and cropped data files attached. Note there are a lot of data points, as this is SAXS data.


Dear all,

Thank you for sending the data set. I have been trying to fit it in 4.1.2 with just a simple cylinder model and seeing paranoid behaviour as you note.

I think that this is a data loader issue, which needs to be investigated.

If you edit out all the "nan" data at smallest Q from the data file then it seems to work.

Alas the "Q range" boxes in the gui then also seem to be not working properly, so even if you think you use them to select a particular Q range, it does not work (likely the internal data format is already upset).

However, once the fitting has hit this issue, it does not seem to "clear itself" for later fits, even on a different fitting tab that previously worked (so you need to restart the whole program).

Turning on "OpenCL" in the fitting options to use the GPU will speed things up. Rebinning the data to reduce the number of points could also speed things up considerably.

If you don't mind I will "anonymise" your data set to share it with some of the developer team.

Richard


Dear all,

Maciej Baranowski, post-doc from Swing beamline here.

I gotta admit that with SasView 4.1.2 the program behaved very erratically and so it's kinda hard to sum up.

  1. One thing which occurred pretty consistently is that trying to fit the scale parameter alone tended to bring the scale unreasonably low - below 0.0002. If one brought the scale manually to a roughly appropriate position, and then tried to fit the scale for fine tuning, the fitting procedure would bring it down again - moving the displayed theoretical curve well below the displayed data.
  2. Changing parameters manually and hitting enter would result in SasView:
     • calculating for a few seconds
     • displaying some curve - usually very very nicely fitted
     • calculating some more
     • displaying a different curve - usually much worse than before
  3. I've tried several different fitting procedures, but with none of them did I manage to obtain anything even close to a reasonable fit.
  4. Even reasonably simple fitting procedures like Marquardt work very slowly, even when fitting a single parameter.

As per suggestion of dr Orion, I've reinstalled SasView with v. 4.1.0 and I did not observe problems 1 and 2 so far. etc

I've attached our data.

Thanks a lot for all Your help!

Attachments (2)

test_data_Baranowski_original.dat (336.3 KB) - added by richardh 6 years ago.
test_data_Baranowski_cropped.dat (332.1 KB) - added by richardh 6 years ago.

Change History (8)

Changed 6 years ago by richardh

Changed 6 years ago by richardh

comment:1 Changed 6 years ago by krzywon

In 8475d16e523e9464967c2cfde813c92b8475b437/sasview:

Coerce all nan values to 0 when loading data files. refs #1037

comment:2 Changed 6 years ago by krzywon

In cc626073d268a7360c381f1c62e4481a66aab70a/sasview:

Add unit tests to show nan values are coerced to 0. refs #1037

comment:3 Changed 6 years ago by butler

from github comments on the pull request:
Paul Kienzle says:

Setting nan values to zero seems like a bad idea since zeros are valid but NaNs are not. How can you then later trim the bad data out of the fit and plots? Zeros in the data will drive the fit unless you set their uncertainty to infinity.

Also, zeros will cause errors in the log plots, or will make the points invisible, in which case they will drive the fit without the user seeing them.
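To illustrate the two points above with numbers, here is a minimal sketch using plain NumPy. The data values, the theory curve and the helper `chi2_terms` are all made up for illustration; nothing here is taken from the SasView loader or fitting code.

```python
import numpy as np

def chi2_terms(y, dy, theory):
    """Per-point contributions to a weighted chi-square sum."""
    return ((y - theory) / dy) ** 2

# Hypothetical data: the first two intensities were "nan" in the file.
y_raw = np.array([np.nan, np.nan, 100.0, 80.0, 60.0])
dy = np.array([1.0, 1.0, 5.0, 4.0, 3.0])
theory = np.array([110.0, 105.0, 98.0, 82.0, 61.0])

# Coercing NaN to zero turns the bad points into "valid" measurements.
y_zeroed = np.where(np.isnan(y_raw), 0.0, y_raw)
terms_zeroed = chi2_terms(y_zeroed, dy, theory)

# Dropping the bad points instead leaves only the real measurements.
good = ~np.isnan(y_raw)
terms_good = chi2_terms(y_raw[good], dy[good], theory[good])

# The two zeroed points contribute far more than all good points together,
# so any optimiser will try to pull the model towards zero at small Q.
zeroed_share = terms_zeroed[:2].sum()
good_total = terms_good.sum()

# On a log axis the zeroed points map to -inf and cannot be drawn.
with np.errstate(divide="ignore"):
    log_y = np.log10(y_zeroed)
```

With these made-up numbers the zeroed points dominate the sum by several orders of magnitude, which would be consistent with the runaway scale parameter reported by the user, although that link is not confirmed in this ticket.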

Could you instead set the data.mask to values which are NaN? Or strip them from the file entirely? Or just do it for files from one facility (where nan really are zero) instead of doing it for every file?

If you decide to leave the points in as zero, can you please create a good data index so that user scripts can trim it automatically? Something like:

self.good_data_index = None
...
bad_data_index = np.isnan(x)
array[bad_data_index] = 0.
if self.good_data_index is None:
    self.good_data_index = ~bad_data_index
else:
    self.good_data_index &= ~bad_data_index

Do you need to do the same for 2D data?
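The two alternatives raised in this comment (carry a mask alongside the data, or strip the bad points) can be sketched as follows. This is only an illustration with plain NumPy arrays: `nan_mask` and the variable names are hypothetical and are not the SasView data object attributes.

```python
import numpy as np

def nan_mask(*arrays):
    """Return a boolean array that is True wherever any input array is NaN."""
    bad = np.zeros(np.shape(arrays[0]), dtype=bool)
    for a in arrays:
        bad |= np.isnan(a)
    return bad

# Hypothetical 1D data set with NaN in the intensity and uncertainty columns.
q = np.array([0.001, 0.002, 0.003, 0.004])
intensity = np.array([np.nan, 12.0, 9.0, 7.0])
d_intensity = np.array([np.nan, 0.5, np.nan, 0.3])

bad = nan_mask(q, intensity, d_intensity)

# Option 1: keep the arrays untouched and carry an index of good points,
# so that fits and plots can select intensity[good_data_index] on demand.
good_data_index = ~bad

# Option 2: strip the bad points once, at load time, from every column.
q_s, intensity_s, d_intensity_s = (
    a[good_data_index] for a in (q, intensity, d_intensity)
)
```

A point is treated as bad if any of its columns is NaN, so all columns stay the same length after stripping. Option 1 preserves the original file contents, while Option 2 means downstream code never sees the bad points at all.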

comment:4 Changed 6 years ago by butler

I agree with Paul Kienzle. Setting NaN to zero is a very bad idea in general. As Paul points out, zeros are valid values while NaN are not (hence the term). I would strongly prefer stripping them on loading, though providing a mask could also work.

comment:5 Changed 6 years ago by richardh

Have tested the 64 bit Windows build 23 from Jenkins.

It now loads the original problem data set from Soleil, without any knock-on effects on other data sets as far as I can tell from comparing fits to sets loaded before and after the problem one.

So looks fixed to me.

comment:6 Changed 6 years ago by GitHub <noreply@…>

  • Resolution set to fixed
  • Status changed from new to closed

In a40e913a125b97dabfb031780b14dd8aabc8aa00/sasview:

Merge pull request #135 from SasView/ticket-1037

Ticket 1037: Remove nan values from 1D and 2D data objects

closes #1037
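For readers wondering what removing nan values from a 2D data object might look like, below is a rough sketch. It is not the code from the merged pull request, whose contents are not shown in this ticket; the function `strip_nan_2d` and its arguments are hypothetical, and it assumes the 2D data can be handled as flattened per-pixel arrays.

```python
import numpy as np

def strip_nan_2d(qx, qy, intensity, err):
    """Flatten 2D detector arrays and drop pixels with NaN intensity or error."""
    qx, qy, intensity, err = (
        np.asarray(a, dtype=float).ravel() for a in (qx, qy, intensity, err)
    )
    keep = ~(np.isnan(intensity) | np.isnan(err))
    return qx[keep], qy[keep], intensity[keep], err[keep]

# Hypothetical 2x2 detector image with one bad intensity and one bad error.
qx_grid, qy_grid = np.meshgrid([-0.1, 0.1], [-0.1, 0.1])
image = np.array([[np.nan, 2.0], [3.0, 4.0]])
image_err = np.array([[0.1, 0.2], [np.nan, 0.4]])

qx_c, qy_c, i_c, err_c = strip_nan_2d(qx_grid, qy_grid, image, image_err)
```

Because the pixels are flattened before filtering, the result no longer has a rectangular shape. Each surviving pixel keeps its own (qx, qy) coordinates instead.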
