[plug] Slightly OT: library to read/write Excel data under Linux?
Denis Brown
dsbrown at cyllene.uwa.edu.au
Fri Aug 25 15:07:04 WST 2000
List,
Since this isn't strictly Linux-admin, Linux-user-query stuff, please reply
off list to keep the traffic down.
Situation: MS Excel spreadsheet has several columns containing date
info. The data in this spreadsheet has been cobbled together from a variety
of sources, possibly including earlier versions of Excel, so cell formats in
a given column are not guaranteed to be identical. Cells in affected
column(s) have been selected and their format forced to something
sensible. When the spreadsheet is read into SPSS, a statistical programme
which normally reads Excel sheets without hiccup, any date column
containing at least one aberrant cell is treated as a
number-of-days-since-epoch set. I tried saving the Excel sheet as CSV, etc.,
and re-saving as XLS, without any improvement.
The "aberrant data" can be difficult to eyeball in the Excel sheet (690
rows and 70-odd columns) and some is quite subtle. For example most date
cells are of the form dd-Mmm-yy and if all are this form, no worries for
SPSS. If one or more cells have something like dd/mm/yy instead, SPSS
gags. Note that dd/mm/yy is still Excel-acceptable. So it seems that
there are at least two different internal Excel date representations. Excel
gurus please correct me if I'm wrong.
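To make the distinction concrete, here is a minimal Python sketch that sorts the displayed text of a date cell into the two forms mentioned above. It is only an illustration: the pattern names and the function classify_date_text are hypothetical, and it looks at the text as displayed, not at whatever Excel stores internally.

```python
import re

# Hypothetical patterns for the two displayed forms described above.
DD_MMM_YY = re.compile(r"^\d{1,2}-[A-Za-z]{3}-\d{2}$")   # e.g. 25-Aug-00
DD_MM_YY = re.compile(r"^\d{1,2}/\d{1,2}/\d{2}$")        # e.g. 25/08/00


def classify_date_text(text):
    """Return a label for the displayed form of a date cell."""
    stripped = text.strip()
    if DD_MMM_YY.match(stripped):
        return "dd-Mmm-yy"
    if DD_MM_YY.match(stripped):
        return "dd/mm/yy"
    return "other"


if __name__ == "__main__":
    for cell in ["25-Aug-00", "25/08/00", "36763"]:
        print(cell, "->", classify_date_text(cell))
```

A cell that shows up as "other" (a bare serial number, for instance) would be a candidate for hand inspection.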
The Linux-related part: I'd like to write (or have written -- hint, hint) a
programme which examines the Excel sheet and parses the cell data,
rewriting the sheet under a new name and flagging rogue formats with some
sort of cell highlighting (red characters instead of black for example) and
allowing them to be fixed by hand. Auto-fix would be a nice pipe dream.
In order for this to happen, I need to know if Linux-friendly Excel-reading
C-libraries exist and if anyone on the list has +ve, -ve or other
experiences with them.
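Pending a suitable Excel-reading library, the flagging idea can be sketched against a CSV export using only the Python standard library. This is a rough sketch under stated assumptions: it assumes the CSV export keeps the displayed text of each date cell, it assumes dd-Mmm-yy is the wanted form, and the names EXPECTED and find_rogue_cells are hypothetical. It only reports the offending cells; it does not rewrite or highlight anything in the XLS file.

```python
import csv
import io
import re

# Assumption: dd-Mmm-yy is the form SPSS accepts, per the description above.
EXPECTED = re.compile(r"^\d{1,2}-[A-Za-z]{3}-\d{2}$")


def find_rogue_cells(rows, date_columns):
    """Return (row_number, column_index, value) for each non-conforming cell.

    rows is any iterable of lists of strings (for example a csv.reader);
    row_number counts data rows starting at 1.
    """
    rogue = []
    for row_number, row in enumerate(rows, start=1):
        for col in date_columns:
            if col >= len(row):
                continue  # short row: nothing to check in this column
            value = row[col].strip()
            if value and not EXPECTED.match(value):
                rogue.append((row_number, col, value))
    return rogue


# Small in-memory sample standing in for the real CSV export.
sample = io.StringIO("id,start\n1,25-Aug-00\n2,26/08/00\n3,27-Aug-00\n")
reader = csv.reader(sample)
next(reader)  # skip the header row
rogue = find_rogue_cells(reader, date_columns=[1])

for row_number, col, value in rogue:
    print("data row %d, column %d: %r" % (row_number, col, value))
```

The resulting list of row and column positions could be handed to the data entry person as a checklist of cells to repair in Excel.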
Freshmeat listed "abs" and "catdoc" in the console/text region but they
both seem to require X or Tk or other graphical environment. My preference
would be a utility which re-writes the sheet with flags as necessary, which
is then given back to the data entry person to repair on his/her
Winbox. The other thought: any other languages up to this task? For
example, Perl. Does that have Excel-aware functionality?
Thanks in advance,
Denis
dsbrown at cyllene.uwa.edu.au