Opened 11 years ago
Closed 11 years ago
#196 closed defect (fixed)
Be able to read semi-colon delimited files
Reported by: | ajj | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | SasView 3.0.0 |
Component: | SasView | Keywords: | |
Cc: | | Work Package: | |
Description
As per email discussion:
On 18 nov 2013, at 10:25, stephen.king@… wrote:
If it’s any consolation, Mantid won’t read Greg’s CSV files either; I just tried.
It’s almost certainly a more complicated version of the ‘semicolon’ problem Andrew refers to (I’ve been there with Users on that too!). It’s all to do with the international settings in Windows. (For the benefit of those in the US, the continental Europeans – particularly the Germans – tend to use semicolons in their CSV files rather than commas.)
In Greg’s case he’s got thousands separators for number formatting ‘turned on’, which means when the program that wrote those files came to write them it needed a way to distinguish between the comma field separator and the comma thousands separator, so it encased the fields with comma thousands separators in speech quotes. Kind of understandable, but a right pain nonetheless.
But I don’t see this as a ‘failure’ or ‘inadequacy’ of SasView. The User can easily rectify the situation themselves.
Have to say that in 20-odd years this is the first time I’ve seen a User producing datafiles like Greg’s…
As an aside, will the SasView ASCII reader read a CSV file that is semi-colon delimited??? That we SHOULD cater for.
Steve
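The quoting behaviour Steve describes can be illustrated with Python's standard `csv` module, which treats a quoted field as a single value even when it contains the field separator. This is only a sketch: the function name `parse_quoted_row` is hypothetical, it is not taken from SasView, and it assumes that any comma left inside a field is a thousands separator.

```python
import csv
import io


def parse_quoted_row(line, delimiter=","):
    """Parse one CSV line whose numeric fields may be quoted.

    Hypothetical helper for illustration; not part of SasView.
    """
    # csv.reader honours the quotes, so '"1,234.5"' arrives as the
    # single field '1,234.5' rather than being split at the comma.
    fields = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    # Assumption: inside a field, ',' is only ever a thousands separator.
    return [float(field.replace(",", "")) for field in fields]


row = parse_quoted_row('0.0123,"1,234.5",12.3')
```

Here `row` holds three floats, with the quoted middle field read as 1234.5.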
From: Andrew Jackson andrew.jackson@…
Sent: 06 November 2013 13:16
To: Paul Butler
Cc: sasview-developers@…
Subject: Re: [Sasview-developers] csv files - reader?
Well, CSV is not really a strict format - I've been dealing recently with the fact that Excel will sometimes export "csv" with the fields separated by semi-colons rather than commas.
Unfortunately, the real answer is that "Excel can read it" should really be our basis. What it means, however, is that if a file cannot be auto-loaded, we would have to have a dialog that allowed some choice about which lines to skip, what separator to use, how to handle number formatting, etc., much in the way Excel does.
In the immediate case, the user will have to go back to the data source and re-export with sensible output.
Andrew
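One simple way to tolerate both separators Andrew mentions is to try a short list of candidate delimiters before falling back to whitespace. The sketch below is an assumption about how this could be done, not a description of the fix that went into SasView; `split_fields` and its `delimiters` parameter are hypothetical names. Semicolons are tried first so that a semicolon-separated line is never split on commas.

```python
def split_fields(line, delimiters=(";", ",", "\t")):
    """Split a data line on the first candidate delimiter it contains.

    Hypothetical helper for illustration; falls back to whitespace.
    """
    line = line.strip()
    for delim in delimiters:
        if delim in line:
            # Drop empty tokens produced by trailing or doubled delimiters.
            return [tok.strip() for tok in line.split(delim) if tok.strip()]
    return line.split()


semicolon_row = split_fields("0.01;1234.5;12.3")
comma_row = split_fields("0.01, 1234.5")
space_row = split_fields("0.01   1234.5")
```

This does not address decimal commas or quoted fields; those would still need the kind of user choice described above.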
On 6 nov 2013, at 14:01, Paul Butler <butler@…> wrote:
Merci!!
Yes, I should have remembered that csv files would be parsed by the generic ASCII reader. This can only be the most basic of ASCII parsers, since to be general the reader has to apply some logic to figure out where the data is. As I recall, it looks for the first instance of 5 or more rows containing X columns of numbers only AND where X is greater than 2 AND where X is exactly the same for all rows read. It then ignores everything before and after that block as being header and footer information of unknown format. This would also explain why SasView did read the first oscillation in the data, where 2 points dropped below 1000 (and therefore did not have quotes), but nothing else.
In principle I agree with Miguel that adding a special ".csv" reader seems like unneeded complexity (at a time when we are trying to simplify things :-). Also, adding a check for quotes in the current reader would be adding complexity (especially since there are several permutations one might now have to consider).
Which brings me to Andrew's point. I don't know the csv rules, but from what I can find, quotes are not used for strings or numbers unless the value includes a comma that is not a delimiter. It is not clear to me whether a quote around a number with a comma is part of the strict definition of csv or not. However, I note that Excel does read it correctly.
That said, it seems from what I'm hearing that we think it is not necessary to support that variant? Certainly if the strict csv definition does not support this, or if it is effectively not used this way, ignoring it is the right answer?
Cheers,
Paul
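The block-finding heuristic Paul recalls can be sketched roughly as follows. This follows his description from memory rather than the actual ascii_reader.py code, and the names `find_data_block`, `min_rows` and `min_cols` are hypothetical. The default `min_cols=2` is an assumption chosen so that plain Q, I files qualify; Paul's recollection is "greater than 2", so the threshold is left as a parameter.

```python
def is_number(token):
    """Return True if the token can be converted to a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def find_data_block(lines, min_rows=5, min_cols=2):
    """Return (start, end) of the first qualifying block, or None.

    A block is a run of at least min_rows consecutive lines that all
    split into the same number (>= min_cols) of numeric tokens.
    'end' is exclusive, so lines[start:end] is the block.
    """
    start, width = None, None
    # The extra blank line acts as a sentinel that closes a trailing run.
    for i, line in enumerate(list(lines) + [""]):
        tokens = line.split()
        numeric = len(tokens) >= min_cols and all(is_number(t) for t in tokens)
        if numeric and start is not None and len(tokens) == width:
            continue  # the current run continues
        # Any run in progress ends at this line.
        if start is not None and i - start >= min_rows:
            return start, i
        if numeric:
            start, width = i, len(tokens)
        else:
            start, width = None, None
    return None


sample = ["header text", "1 2", "2 3", "3 4", "4 5", "5 6", "footer text"]
block = find_data_block(sample)
```

Everything before `start` and from `end` onwards would then be treated as header and footer.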
On Wed, Nov 6, 2013 at 3:57 AM, gonzalez <gonzalezm@…> wrote:
Salut Paul,
you are completely right. All the lines having quotation marks are skipped.
The csv files are read by the ASCII reader (ascii_reader.py), which basically reads a line, splits it into its components and then tries
to convert the first two tokens, assumed to correspond to Q and Intensity respectively, to floating-point numbers.
When it fails to do this, the line is simply omitted, and this is what happens with all the lines having quotation marks.
I guess that trying to treat this particular case could cause more problems than benefits, as we may end up with a lot of
different possibilities in the data file that will be difficult to handle, so I would suggest that the easiest way forward is to
edit or export the files, avoiding the quotation marks and any separator in the numbers other than the decimal point.
Cheers,
Miguel.
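The behaviour Miguel describes can be reproduced with a minimal sketch. It is not the actual ascii_reader.py code: the function name `read_q_i` is hypothetical, and treating commas as whitespace before splitting is an assumption made only to show why quoted values fail.

```python
def read_q_i(lines):
    """Read Q and Intensity from the first two tokens of each line.

    Hypothetical sketch; lines that cannot be converted are skipped.
    """
    q, intensity = [], []
    for line in lines:
        # Assumption: commas are treated like whitespace when splitting.
        tokens = line.replace(",", " ").split()
        try:
            q_val, i_val = float(tokens[0]), float(tokens[1])
        except (ValueError, IndexError):
            # A quoted value such as '"1' is not a float: omit the line.
            continue
        q.append(q_val)
        intensity.append(i_val)
    return q, intensity


q, intensity = read_q_i(['0.01,"1,234.5"', "0.02,987.6"])
```

Only the second line survives, which matches the observation in the thread that points below 1000 (written without quotes) were the only ones read.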
On 06/11/2013 04:07, Paul Butler wrote:
Has anybody got experience with csv files? As I mentioned in my response to Greg, the problem he is having is that SasView does not recognize numbers enclosed in quotation marks. Does anybody know how we read csv files? Are we using a csv reading package or are we trying to parse the file ourselves? Jeff - you have been working on the reader, but focusing on cansas, I know.
While I cannot see the data "as read" with the datainfo (still broken), using the plot x,y printout in the info bar it would seem that the intensity is a bit lower than in the file (at least on the data sets sent). Anybody have any thoughts on that?
Sasview-developers mailing list
Sasview-developers@…
https://lists.sourceforge.net/lists/listinfo/sasview-developers
—
Dr. Miguel A. Gonzalez
Institut Laue Langevin (ILL)
6, rue Jules Horowitz, BP 156
38042 Grenoble Cedex 9, FRANCE
Tel: +33 (0)4.76.20.71.66
Fax: +33 (0)4.76.48.39.06
E-mail: gonzalez@…
Andrew Jackson
Instrument Scientist - Small Angle Scattering Adjunct Lecturer - Physical Chemistry
European Spallation Source ESS AB Lund University
P.O Box 176, SE-221 00 Lund, Sweden P.O. Box 124, SE-221 00 Lund, Sweden
Street address: Tunavägen 24 (Medicon Village) Street address: Getingevägen 60
Phone: +46 46 888 3015
Mobile: +46 72 179 2015
E-mail: andrew.jackson@…
www.esss.se
Change History (1)
comment:1 Changed 11 years ago by ajj
- Resolution set to fixed
- Status changed from new to closed
Fixed as of revision #6827