[plug] Slightly OT: library to read/write Excel data under Linux?

Denis Brown dsbrown at cyllene.uwa.edu.au
Fri Aug 25 15:07:04 WST 2000


List,

Since this isn't strictly Linux-admin, Linux-user-query stuff, please reply 
off list to keep the traffic down.

Situation:  MS Excel spreadsheet has several columns containing date 
info.  The data in this spreadsheet has been cobbled together from a variety 
of sources possibly including earlier versions of Excel so cell formats in 
a given column are not guaranteed to be identical.  Cells in affected 
column(s) have been selected and their format forced to something 
sensible.  When the spreadsheet is read into SPSS, a statistical programme 
which normally reads Excel sheets without hiccup, any date column 
containing at least one aberrant cell is treated as a 
number-of-days-since-epoch set.  I tried saving the Excel sheet as CSV, etc. 
and re-saving as XLS, without any improvement.

The "aberrant data" can be difficult to eyeball in the Excel sheet (690 
rows and 70-odd columns) and some is quite subtle.  For example most date 
cells are of the form dd-Mmm-yy and if all are this form, no worries for 
SPSS.  If one or more cells have something like dd/mm/yy instead, SPSS 
gags.  Note that dd/mm/yy is still Excel-acceptable.  So it seems that 
there are at least two different internal Excel date representations. Excel 
gurus please correct me if I'm wrong.
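
To make the distinction above concrete, here is a minimal Python sketch that
classifies a cell's displayed text as dd-Mmm-yy, dd/mm/yy or something else.
It is an illustration only: it assumes the cell text is available as a plain
string (for instance from a CSV export), it says nothing about how Excel
stores dates internally, and the name classify_date_text is hypothetical.

```python
import re

# The two patterns mirror the formats named in the post; they check shape
# only and do not validate that the day or month is a real calendar value.
DD_MMM_YY = re.compile(r"^\d{1,2}-[A-Za-z]{3}-\d{2}$")
DD_MM_YY = re.compile(r"^\d{1,2}/\d{1,2}/\d{2}$")


def classify_date_text(text):
    """Return a label for the textual shape of a date cell (hypothetical helper)."""
    s = text.strip()
    if DD_MMM_YY.match(s):
        return "dd-Mmm-yy"
    if DD_MM_YY.match(s):
        return "dd/mm/yy"
    return "other"
```

A column would count as uniform when every non-empty cell gets the same
label; a bare day count would fall under "other".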

The Linux-related part: I'd like to write (or have written -- hint, hint) a 
programme which examines the Excel sheet and parses the cell data, 
rewriting the sheet under a new name and flagging rogue formats with some 
sort of cell highlighting (red characters instead of black for example) and 
allowing them to be fixed by hand.   Auto-fix would be a nice pipe dream. 
In order for this to happen, I need to know if Linux-friendly Excel-reading 
C-libraries exist and if anyone on the list has +ve, -ve or other 
experiences with them.
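
As a rough sketch of the flagging step, the Python below works on a CSV export
rather than on the XLS file itself, and it reports the offending cells instead
of colouring them.  It assumes the export has a header row and keeps the dates
as displayed text; the function name find_rogue_cells and the column name
"admitted" are made up for the example.

```python
import csv
import io
import re

# Expected shape for a good date cell: dd-Mmm-yy, as described in the post.
EXPECTED = re.compile(r"^\d{1,2}-[A-Za-z]{3}-\d{2}$")


def find_rogue_cells(csv_file, date_columns):
    """Return (row_number, column_name, value) for each non-empty cell in
    the named columns whose text does not look like dd-Mmm-yy."""
    rogue = []
    reader = csv.DictReader(csv_file)
    # Row 1 is the header, so the first data row is spreadsheet row 2.
    for row_number, row in enumerate(reader, start=2):
        for name in date_columns:
            value = (row.get(name) or "").strip()
            if value and not EXPECTED.match(value):
                rogue.append((row_number, name, value))
    return rogue


if __name__ == "__main__":
    # Tiny in-memory stand-in for the exported sheet (hypothetical data).
    sample = io.StringIO("id,admitted\n1,25-Aug-00\n2,26/08/00\n3,27-Aug-00\n")
    for row_number, name, value in find_rogue_cells(sample, ["admitted"]):
        print(f"row {row_number}, column {name}: {value!r}")
```

The row and column in each report line give the data entry person a list of
cells to fix by hand in the original sheet.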

Freshmeat listed "abs" and "catdoc" in the console/text region but they 
both seem to require X or Tk or other graphical environment.  My preference 
would be a utility which re-writes the sheet with flags as necessary, so it 
can be given back to the data entry person to repair on his/her 
Winbox.  The other thought: are any other languages up to this task?  For 
example, Perl.  Does that have Excel-aware functionality?

Thanks in advance,
Denis
dsbrown at cyllene.uwa.edu.au



