Opened 5 years ago

Closed 5 years ago

#196 closed defect (fixed)

Be able to read semi-colon delimited files

Reported by: ajj
Owned by:
Priority: minor
Milestone: SasView 3.0.0
Component: SasView
Keywords:
Cc:
Work Package:

Description

As per email discussion:

On 18 nov 2013, at 10:25, stephen.king@… wrote:

If it’s any consolation, Mantid won’t read Greg’s CSV files either; I just tried.

It’s almost certainly a more complicated version of the ‘semicolon’ problem Andrew refers to (I’ve been there with Users on that too!). It’s all to do with the international settings in Windows. (For the benefit of those in the US: continental Europeans, particularly the Germans, tend to use semicolons in their CSV files rather than commas.)

In Greg’s case he’s got thousands separators for number formatting ‘turned on’, which means when the program that wrote those files came to write them it needed a way to distinguish between the comma field separator and the comma thousands separator, so it encased the fields with comma thousands separators in speech quotes. Kind of understandable, but a right pain nonetheless.
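For illustration only, here is a minimal Python sketch of how such quoted, thousands-separated fields could be parsed with the standard `csv` module. The function name `parse_quoted_row` and the sample line are hypothetical, and this is not what the SasView reader does.

```python
import csv
import io


def parse_quoted_row(line):
    """Parse one comma-delimited line in which numbers containing a
    thousands separator are wrapped in double quotes.

    Hypothetical helper for illustration; not part of SasView.
    """
    # csv.reader treats a quoted field as one value, so the comma
    # inside "1,234.5" is not taken as a field separator.
    fields = next(csv.reader(io.StringIO(line)))
    # Drop the thousands separators before converting to float.
    return [float(field.replace(",", "")) for field in fields]


row = parse_quoted_row('0.0123,"1,234.5",12.3')
print(row)
```

The sketch assumes a decimal point and a comma thousands separator; files written with a decimal comma would need different handling.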

But I don’t see this as a ‘failure’ or ‘inadequacy’ of SasView. The User can easily rectify the situation themselves.

Have to say that in 20-odd years this is the first time I’ve seen a User producing datafiles like Greg’s…

As an aside, will the SasView ASCII reader read a CSV file that is semicolon-delimited? That we SHOULD cater for.


Steve

From: Andrew Jackson andrew.jackson@…
Sent: 06 November 2013 13:16
To: Paul Butler
Cc: sasview-developers@…
Subject: Re: [Sasview-developers] csv files - reader?

Well, CSV is not really a strict format - I've been dealing recently with the fact that Excel will sometimes export "csv" with the fields separated by semicolons rather than commas.
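One way a reader could tolerate both variants is to treat semicolons, commas and whitespace all as field separators. The sketch below is an assumption about a possible approach, not the change made in SasView; the name `split_fields` is hypothetical.

```python
import re

# Any run of semicolons, commas or whitespace counts as one separator.
_SEPARATORS = re.compile(r"[;,\s]+")


def split_fields(line):
    """Split a data line on semicolons, commas or whitespace.

    Hypothetical helper for illustration; not part of SasView.
    """
    return [tok for tok in _SEPARATORS.split(line.strip()) if tok]


print(split_fields("0.01;123.4;5.6"))
print(split_fields("0.01, 123.4, 5.6"))
```

Because the comma is treated as a separator, this approach would mis-split files that use a decimal comma (for example `0,01;123,4`), so it only suits data written with a decimal point.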

Unfortunately, the real answer is that "Excel can read it" should really be our basis. What it means, however, is that if a file cannot be auto-loaded, we would have to have a dialog that allowed some choice about which lines to skip, what separator to use, how to handle numbers, etc., much in the way Excel does.

In the immediate case, the user will have to go back to the data source and re-export with sensible output.

Andrew

On 6 nov 2013, at 14:01, Paul Butler <butler@…> wrote:

Merci!!

Yes, I should have remembered that CSV files would be parsed by the generic ASCII reader. This can only be the most basic of ASCII parsers, since to be general the reader has to apply some logic to figure out where the data is. As I recall, it looks for the first instance of 5 or more rows containing X columns of numbers only AND where X is greater than 2 AND where X is exactly the same for all rows read. It then ignores everything before and after that block as being header and footer information of unknown format. This would also explain why SasView did not read the first oscillation in the data where 2 points dropped below 1000 (and therefore did not have quotes).
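The block-finding logic recalled above could be sketched roughly as follows. This is a reconstruction from the description in this email, not the actual SasView code. The names `find_data_block`, `min_rows` and `min_cols` are hypothetical, and the column threshold is left as a parameter because it is quoted from memory in the thread.

```python
def _to_floats(line):
    """Return the whitespace-separated tokens of a line as floats,
    or None if any token is not a number."""
    try:
        return [float(tok) for tok in line.split()]
    except ValueError:
        return None


def find_data_block(lines, min_rows=5, min_cols=2):
    """Return the first run of at least `min_rows` consecutive lines that
    all hold the same number (>= `min_cols`) of purely numeric columns.

    Everything before and after the run is treated as header or footer.
    Hypothetical sketch; thresholds are assumptions, not SasView's values.
    """
    block = []
    for line in lines:
        row = _to_floats(line)
        valid = row is not None and len(row) >= min_cols
        if valid and (not block or len(row) == len(block[0])):
            block.append(row)
            continue
        # The run is broken: keep it if long enough, otherwise start over.
        if len(block) >= min_rows:
            return block
        block = [row] if valid else []
    return block if len(block) >= min_rows else []


sample = [
    "Sample: test", "Q I dI",
    "0.01 100 1", "0.02 90 1", "0.03 80 1", "0.04 70 1", "0.05 60 1",
    "end of file",
]
print(find_data_block(sample))
```

Under this logic a line containing a quoted number fails the numeric test and breaks the run, which is consistent with a short stretch of unquoted points being discarded as too short to count as data.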

In principle I agree with Miguel that adding a special ".csv" reader seems like unneeded complexity (at a time when we are trying to simplify things :-). Also, adding a check for quotes in the current reader would be adding complexity (especially since there are several permutations one might now have to consider).

Which brings me to Andrew's point. I don't know the CSV rules, but from what I can find, quotes are not used for strings or numbers unless they include a comma that is not a delimiter. It is not clear to me whether a quote around a number with a comma is part of the strict definition of CSV or not. However, I note that Excel does read it correctly.

That said, it seems from what I'm hearing that we think that it is not necessary to support that variant? Certainly if the strict CSV definition does not support this, or if it is effectively not used this way, ignoring it is the right answer?

Cheers,

Paul

On Wed, Nov 6, 2013 at 3:57 AM, gonzalez <gonzalezm@…> wrote:

Salut Paul,

you are completely right. All the lines having quotation marks are skipped.
The CSV files are read by the ASCII reader (ascii_reader.py), which basically reads a line, splits it into its components and
tries to convert the first two tokens, assumed to correspond to Q and Intensity, respectively, to floating-point numbers.
When it fails to do this, the line is simply omitted, and this is what happens with all the lines having quotation marks.
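As a rough illustration of the behaviour described above (a sketch written from this description, not the code in ascii_reader.py), a reader that silently drops any line whose first two tokens are not numbers could look like this. The function name `read_q_i` and the choice of separators are assumptions.

```python
import re


def read_q_i(lines):
    """Collect (Q, I) pairs, skipping every line whose first two tokens
    cannot be converted to float.

    Hypothetical sketch; splitting on commas and whitespace is an assumption.
    """
    q_values, intensities = [], []
    for line in lines:
        tokens = re.split(r"[,\s]+", line.strip())
        try:
            q, intensity = float(tokens[0]), float(tokens[1])
        except (IndexError, ValueError):
            # Header text, blank lines and quoted numbers all end up here.
            continue
        q_values.append(q)
        intensities.append(intensity)
    return q_values, intensities


q, i = read_q_i(['Q,I', '0.01,950.2', '0.02,"1,234.5"', '0.03,870.0'])
print(q, i)
```

In the sample call, the line containing `"1,234.5"` is dropped because its second token starts with a quotation mark, so only the unquoted points survive.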

I guess that trying to treat this particular case could cause more problems than benefits, as we may end up with a lot of
different possibilities in the data file that will be difficult to handle, so I would suggest that the easiest way forward is to
edit or export the files avoiding the quotation marks and any separator in the numbers other than the decimal point.

Cheers,
Miguel.

On 06/11/2013 04:07, Paul Butler wrote:
Has anybody got experience with CSV files? As I mentioned in my response to Greg, the problem he is having is that SasView does not recognize numbers enclosed in quotation marks. Does anybody know how we read CSV files? Are we using a CSV reading package or are we trying to parse the file ourselves? Jeff - you have been working on the reader, but focusing on cansas, I know.

While I cannot see the data "as read" with the datainfo (still broken), using the plot x,y printout in the info bar it would seem that the intensity is a bit lower than in the file (at least on the data sets sent). Anybody have any thoughts on that?


Sasview-developers mailing list
Sasview-developers@…
https://lists.sourceforge.net/lists/listinfo/sasview-developers


Dr. Miguel A. Gonzalez
Institut Laue Langevin (ILL)
6, rue Jules Horowitz, BP 156
38042 Grenoble Cedex 9, FRANCE

Tel: +33 (0)4.76.20.71.66
Fax: +33 (0)4.76.48.39.06
E-mail: gonzalez@…




Andrew Jackson
Instrument Scientist - Small Angle Scattering Adjunct Lecturer - Physical Chemistry

European Spallation Source ESS AB Lund University
P.O Box 176, SE-221 00 Lund, Sweden P.O. Box 124, SE-221 00 Lund, Sweden
Street address: Tunavägen 24 (Medicon Village) Street address: Getingevägen 60

Phone: +46 46 888 3015
Mobile: +46 72 179 2015
E-mail: andrew.jackson@…

www.esss.se

Change History (1)

comment:1 Changed 5 years ago by ajj

  • Resolution set to fixed
  • Status changed from new to closed

Fixed as of revision #6827
