[Octave-cvsupdate] SF.net SVN: octave:[6680] trunk/octave-forge/main/io/inst

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Revision: 6680
          http://octave.svn.sourceforge.net/octave/?rev=6680&view=rev
Author:   prnienhuis
Date:     2009-12-30 15:06:54 +0000 (Wed, 30 Dec 2009)

Log Message:
-----------
Revised and improved, now also based on odftoolkit & xercesImpl.
Only reading from .ods supported, adding write support is not yet feasible

Modified Paths:
--------------
    trunk/octave-forge/main/io/inst/calccelladdress.m
    trunk/octave-forge/main/io/inst/ods2oct.m
    trunk/octave-forge/main/io/inst/odsclose.m
    trunk/octave-forge/main/io/inst/odsopen.m
    trunk/octave-forge/main/io/inst/odsread.m
    trunk/octave-forge/main/io/inst/parsecell.m

Modified: trunk/octave-forge/main/io/inst/calccelladdress.m
===================================================================

--- trunk/octave-forge/main/io/inst/calccelladdress.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/calccelladdress.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -17,20 +17,22 @@
 ## calccelladdress - compute spreadsheet style cell address from
 ## row & column index.
 ## 
-## Max column index currently set to 1024 (ODS)
+## Max column index currently set to 1378 (ODS has max 1024)
 
 ## Author: Philip Nienhuis <prnienhuis at users.sf.net>
 ## Created: 2009-12-12
+## Last updated: 2009-12-27
 
 function [ celladdress ] = calccelladdress (trow, lcol, row, column)
 
+	if (nargin < 4) error ("calccelladdress: not enough arguments.") endif
 	colnr = lcol + column - 1;
-	if (colnr> 1024) error ("Column nr >1024"); endif
+	if (colnr> 1378) error ("Column nr > 1378"); endif
 	str = char (rem ((colnr-1), 26) + 'A');
 	if (colnr > 26 && colnr < 703) 
 		tmp = char (floor ((colnr - 27) / 26) + 'A');
 		str = [tmp str];
-	elseif (colnr > 702)
+	elseif (colnr > 702 && colnr < 1379)
 		tmp = char (floor ((colnr - 703) / 26) + 'A');
 		str = ['A' tmp str];
 	endif

Modified: trunk/octave-forge/main/io/inst/ods2oct.m
===================================================================
--- trunk/octave-forge/main/io/inst/ods2oct.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/ods2oct.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -14,6 +14,380 @@
 ## along with Octave; see the file COPYING.  If not, see
 ## <http://www.gnu.org/licenses/>.
 
+## -*- texinfo -*-
+## @deftypefn {Function File} [ @var{rawarr}, @var{ods}, @var{rstatus} ] = ods2oct (@var{ods})
+## @deftypefnx {Function File} [ @var{rawarr}, @var{ods}, @var{rstatus} ] = ods2oct (@var{ods}, @var{wsh})
+## @deftypefnx {Function File} [ @var{rawarr}, @var{ods}, @var{rstatus} ] = ods2oct (@var{ods}, @var{wsh}, @var{range})
+##
+## Read data contained within range @var{range} from worksheet @var{wsh}
+## in an OpenOffice.org spreadsheet file pointed to in struct @var{ods}.
+##
+## ods2oct is a mere wrapper for interface-dependent scripts (e.g.,
+## ods2jotk2oct and ods2jod2oct) that do the actual reading.
+##
+## @var{wsh} is either numerical or text, in the latter case it is 
+## case-sensitive and it may be max. 31 characters long.
+## Note that in case of a numerical @var{wsh} this number refers to the
+## position in the worksheet stack, counted from the left in a Calc
+## window. The default is numerical 1, i.e. the leftmost worksheet
+## in the ODS file.
+##
+## @var{range} is expected to be a regular spreadsheet range format,
+## or "" (empty string, indicating all data in a worksheet).
+##
+## If only the first argument is specified, ods2oct will try to read
+## all contents from the first = leftmost (or the only) worksheet (as
+## if a range of @'' (empty string) was specified).
+## 
+## If only two arguments are specified, ods2oct assumes the second
+## argument to be @var{wsh}. In that case ods2oct will try to read
+## all data contained in that worksheet.
+##
+## Return argument @var{rawarr} contains the raw spreadsheet cell data.
+## Optional return argument @var{ods} contains the pointer struct. Field
+## @var{ods}.limits contains the outermost column and row numbers of the
+## actually read cell range.
+## @var{rstatus} will be set to 1 if the requested data have been read
+## successfully, 0 otherwise.
+## Use parsecell() to separate numeric and text values from @var{rawarr}.
+##
+## @var{ods} is supposed to have been created earlier by odsopen in the
+## same octave session. It is only referred to, not changed.
+##
+## Erroneous data and empty cells turn up empty in @var{rawarr}.
+## Date/time values in OpenOffice.org are returned as numerical values
+## with base 1-1-000 (same as octave). But beware that Excel spreadsheets
+## rewritten by OpenOffice.org into .ods format may have numerical date
+## cells with base 01-01-1900 (same as MS-Excel).
+##
+## Be aware that ods2oct trims @var{rawarr} from empty outer rows & columns, 
+## so any returned cell array may turn out to be smaller than requested
+## in @var{range}.
+##
+## When reading from merged cells, all array elements NOT corresponding 
+## to the leftmost or upper OpenOffice.org cell will be treated as if the
+## "corresponding" cells are empty.
+##
+## Examples:
+##
+## @example
+##   A = ods2oct (ods1, '2nd_sheet', 'C3:ABS40000');
+##   (which returns the numeric contents in range C3:ABS40000 in worksheet
+##   '2nd_sheet' from a spreadsheet file pointed to in pointer struct ods1,
+##   into numeric array A) 
+## @end example
+##
+## @example
+##   [An, ods2, status] = ods2oct (ods2, 'Third_sheet');
+## @end example
+##
+## @seealso odsopen, odsclose, parsecell, odsread, odsfinfo
+##
+## @end deftypefn
+
+## Author: Philip Nienhuis
+## Created: 2009-12-13
+## Latest update: 2009-12-30
+
+function [ rawarr, ods ] = ods2oct (ods, wsh=1, datrange=[])
+
+	if (strcmp (ods.xtype, 'OTK'))
+		# Read xls file tru Java & ODF toolkit
+		[rawarr, ods, rstatus] = ods2jotk2oct (ods, wsh, datrange);
+		
+	elseif (strcmp (ods.xtype, 'JOD'))
+		[rawarr, ods, rstatus] = ods2jod2oct (ods, wsh, datrange);
+		
+#	elseif ---- < Other interfaces here >
+
+	else
+		error (sprintf ("ods2oct: unknown OpenOffice.org .ods interface - %s.", xls.xtype));
+
+	endif
+
+endfunction
+
+#=====================================================================
+
+## Copyright (C) 2009 Philip Nienhuis <prnienhuis _at- users.sf.net>
+## 
+## This program is free software; you can redistribute it and/or modify
+## it under the terms of the GNU General Public License as published by
+## the Free Software Foundation; either version 2 of the License, or
+## (at your option) any later version.
+## 
+## This program is distributed in the hope that it will be useful,
+## but WITHOUT ANY WARRANTY; without even the implied warranty of
+## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+## GNU General Public License for more details.
+## 
+## You should have received a copy of the GNU General Public License
+## along with Octave; see the file COPYING.  If not, see
+## <http://www.gnu.org/licenses/>.
+
+## odf2jotk2oct - read ODS spreadsheet data using Java & odftoolkit
+## You need proper java-for-octave & odfdom.jar + xercesImpl.jar
+## in your javaclasspath.
+
+## Author: Philip Nenhuis <pr.nienhuis at users.sf.net>
+## Created: 2009-12-24
+## Last update: 2009-12-30
+
+function [ rawarr, ods, status ] = ods2jotk2oct (ods, wsh, crange = [])
+
+	# Parts after user gfterry in
+	# http://www.oooforum.org/forum/viewtopic.phtml?t=69060
+
+	# Get contents and table stuff from the workbook
+	odfcont = ods.workbook;		# Use a local copy just to be sure. octave 
+								# makes physical copies only when needed (?)
+	xpath = ods.app.getXPath;
+	
+	# AFAICS ODS spreadsheets have the following hierarchy:
+	# <table:table> - table nodes, the actual worksheets;
+	# <table:table-row> - row nodes, the rows in a worksheet;
+	# <table:table-cell> - cell nodes, the cells in a row;
+	# Styles (formatting) are defined in a section "settings" outside the
+	# contents proper but are referenced in the nodes.
+	
+	# Create an instance of type NODESET for use in subsequent statement
+	NODESET = java_get ('javax.xml.xpath.XPathConstants', 'NODESET');
+	# Parse sheets ("tables") from ODS file
+	sheets = xpath.evaluate ("//table:table", odfcont, NODESET);
+	nr_of_sheets = sheets.getLength ();
+
+	# Check user input & find sheet pointer (1-based), using ugly hacks
+	if (~isnumeric (wsh))
+		# Search in sheet names, match sheet name to sheet number
+		ii = 0;
+		while (++ii <= nr_of_sheets && ischar (wsh))	
+			# Look in first part of the sheet nodeset
+			tmp1 = char (sheets.item(ii-1))(1:150);
+			ist = index (tmp1, 'table:name=') + 12;
+			ien = index (tmp1(ist:end), '" table') - 2 + ist;
+			if (strcmp (tmp1(ist:ien), wsh))
+				# Convert local copy of wsh into a number (pointer)
+				wsh = ii + i;
+			endif
+		endwhile
+		if (ischar (wsh)) error (sprintf ("No worksheet '%s' found in file %s", wsh, ods.filename)); endif
+	elseif (wsh > nr_of_sheets || wsh < 1)
+		# We already have a numeric sheet pointer. If it's not in range:
+		error (sprintf ("Worksheet no. %d out of range (1 - %d)", wsh, nr_of_sheets));
+	endif
+		
+	# Get table-rows in sheet no. wsh. Sheet count = 1-based (!)
+	str = sprintf ("//table:table[%d]/table:table-row", wsh);
+	sh = xpath.evaluate (str, odfcont, NODESET);
+	nr_of_rows = sh.getLength(); 
+	# Create storage for data content. We can't know max row length yet so expect the worst
+	rawarr = cell (nr_of_rows, 1024);
+
+	# Either parse (given cell range) or prepare (unknown range) help variables 
+	if (isempty (crange))
+		trow = 1;				# Top row
+		brow = nr_of_rows;		# Bottom row (ODS max = 65535, Xpath's guess = better)
+		nrows = brow;			# Number of rows to be read
+		lcol = 1;				# Leftmost column of range
+		rcol = 1024;			# Rightmost columns (1024 on ODS)
+		ncols = rcol;			# Number of columns to be read
+	else
+		[dummy, nrows, ncols, trow, lcol] = parse_sp_range (crange);
+		brow = min (trow + nrows - 1, nr_of_rows);
+		# Check ODS column limits
+		if (lcol > 1024 || trow > 65536) error ("ods2oct: invalid range; max 1024 columns & 65536 rows."); endif
+		# Truncate range silently if needed
+		rcol = min (lcol + ncols - 1, 1024);
+		ncols = min (ncols, 1024 - lcol + 1);
+		nrows = min (nrows, 65536 - trow + 1);
+	endif
+
+	# Read from worksheet row by row. Row numbers are 0-based
+	rightmcol = 0;		# Used to find actual rightmost column
+	ii = trow - 1;		# Spreadsheet row counter
+	rowcnt = ii;		# xpath row counter (multiple rows may be condensed in one!)
+	while (++ii <= brow)
+		row = sh.item(rowcnt++);
+		nr_of_cells = min (row.getLength (), rcol);
+		rightmcol = max (rightmcol, nr_of_cells);	# Keep track of max row length
+
+		# Read column (cell, "table-cell" in ODS speak) by column
+		jj = lcol; r_cols = 0;
+		while (jj + r_cols <= 1024 && jj <= rcol)
+			tcell = char (row.getCellAt(jj-1));
+			cellcont = getcellcont (tcell);		# Parse cell contents, func = below
+			if (~isempty (cellcont.cvalue))
+
+				# Get data from cell
+				switch cellcont.ctype
+					case  {'float', 'currency', 'percentage'}
+						rawarr(ii, jj) = str2double (cellcont.cvalue);
+					case 'date'
+						# Dates are returned as octave datenums, i.e. 0-0-0000 based
+						str = cellcont.cvalue;
+						yr = str2num (str(1:4));
+						mo = str2num (str(6:7));
+						dy = str2num (str(9:10));
+						if (index (str, 'T'))
+							hh = str2num (str(12:13));
+							mm = str2num (str(15:16));
+							ss = str2num (str(18:19));
+							rawarr(ii, jj) = datenum (yr, mo, dy, hh, mm, ss);
+						else
+							rawarr(ii, jj) = datenum (yr, mo, dy);
+						endif
+					case 'time'
+						str = cellcont.cvalue;
+						if (index (str, 'PT'))
+							hh = str2num (str(3:4));
+							mm = str2num (str(6:7));
+							ss = str2num (str(9:10));
+							rawarr(ii, jj) = datenum (0, 0, 0, hh, mm, ss);
+						endif
+					case 'boolean'
+						if (strcmp (cellcont.cvalue, 'true'))
+							rawarr(ii, jj) = true;
+						else
+							rawarr(ii, jj) = false;
+						endif
+					case 'string'
+						rawarr(ii, jj) = cellcont.cvalue;
+					otherwise
+						# Nothing
+				endswitch
+			endif
+			++jj;						# Next cell
+			r_cols = cellcont.cols;		# Number of repeated columns.
+		endwhile
+
+		# Check for repeated rows (i.e. condensed in one table-row)
+		mm = index (char(row), 'number-rows-repeated');
+		if (mm)
+			str = char (row) (1:82);
+			ist = mm + 22; ien = index (str(ist:end), '"') + ist - 2; 
+			extrarows = str2num (str(ist:ien));
+			if (extrarows > 0 && ii + extrarows < 65535)
+				# Expand table-row
+				nr_of_rows = nr_of_rows + extrarows - 1;
+				ii = ii + extrarows - 1;
+				nrows = min (65536, nrows + extrarows - 1);
+				brow = min (trow + nrows - 1, nr_of_rows);
+				# Increase return argument size if needed
+				tmp = cell (extrarows, 1024);
+				rawarr = [rawarr; tmp];
+			endif
+		endif
+		
+	endwhile
+
+	# Pre-crop rawarr from right (max was 1024) and bottom
+	rawarr = rawarr (1:nr_of_rows, 1:rightmcol);
+
+	# Crop rawarr from all empty outer rows & columns just like Excel does
+	# & keep track of limits
+	emptr = cellfun('isempty', rawarr);
+	irowt = 1;
+	while (all (emptr(irowt, :))), irowt++; endwhile
+	irowb = nr_of_rows;
+	while (all (emptr(irowb, :))), irowb--; endwhile
+	icoll = 1;
+	while (all (emptr(:, icoll))), icoll++; endwhile
+	icolr = rightmcol;
+	while (all (emptr(:, icolr))), icolr--; endwhile
+	# Crop textarray
+	rawarr = rawarr(irowt:irowb, icoll:icolr);
+	status = 1;
+
+	ods.limits = [lcol+icoll-1, lcol+icolr-1; trow+irowt-1, trow+irowb-1];
+
+endfunction
+
+
+## Copyright (C) 2009 by Philip Nienhuis <prnienhuis at users.sf.net>
+## 
+## Parse some data from ODS spreadsheet cells.
+## Input = character string <table:table-cell>...</table:table-cell>
+## 
+## Author: Philip Nenhuis <pr.nienhuis at users.sf.net>
+## Created: 2009-12-24
+## Last update: 2009-12-27
+
+function cellcont = getcellcont (tcell)
+
+	cellcont = struct ("ctype", '', "cvalue", '', "cols", 0);
+
+	# Check for repeated columns (often empty columns, viz. to right of data)
+	ii = index (tcell, 'number-columns-repeated');
+	if (ii)
+		ist = ii+25; ien = index(tcell(ist:end), '"') + ist - 2; 
+		cellcont.cols = str2num (tcell(ist:ien));
+	endif
+
+	# Get cell value type: float / boolean / string / date/.... Skip errors.
+	ii = index (tcell, 'value-type=');
+	if (ii && ~index (tcell, 'text:p>Err:') && ~index (tcell, 'text:p>#DIV'))
+		# Then cell contains useful data.
+		ist = ii + 12; ien = index (tcell(ist:end), '"') + ist - 2; 
+		cellcont.ctype = tcell(ist:ien);
+
+		if (strcmp (cellcont.ctype, 'string'))
+			# Get string value from between <text:p|r> </text:p|r> tags
+			ii = index (tcell, '<text');
+			if (ii)
+				ist = ii + 8; ien = index (tcell(ist:end), '</text') + ist - 2;
+				cellcont.cvalue = tcell(ist:ien);
+			endif
+
+		elseif (strcmp (cellcont.ctype, 'date'))
+			# Get date string
+			ii = index (tcell, 'date-value=');
+			if (ii)
+				ist = ii + 12; ien = index (tcell(ist:end), '"') + ist - 2; 
+				# Return string for the time being
+				cellcont.cvalue = tcell(ist:ien);
+			endif
+
+		elseif (strcmp (cellcont.ctype, 'time'))
+			# Get time string
+			ii = index (tcell, 'time-value=');
+			if (ii)
+				ist = ii + 12; ien = index (tcell(ist:end), '"') + ist - 2; 
+				# Return string for the time being (no pun intended)
+				cellcont.cvalue = tcell(ist:ien);
+			endif
+
+		else
+			# Get cell value as string (float, currency, percentage)
+			ii = index (tcell, 'value=');
+			if (ii)
+				ist = ii + 7; ien = index (tcell(ist:end), '"') + ist - 2; 
+				cellcont.cvalue = tcell(ist:ien);
+			endif
+
+		endif
+	endif
+
+endfunction
+
+
+#===========================================================================
+
+## Copyright (C) 2009 Philip Nienhuis <pr.nienhuis at users.sf.net>
+## 
+## This program is free software; you can redistribute it and/or modify
+## it under the terms of the GNU General Public License as published by
+## the Free Software Foundation; either version 2 of the License, or
+## (at your option) any later version.
+## 
+## This program is distributed in the hope that it will be useful,
+## but WITHOUT ANY WARRANTY; without even the implied warranty of
+## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+## GNU General Public License for more details.
+## 
+## You should have received a copy of the GNU General Public License
+## along with Octave; see the file COPYING.  If not, see
+## <http://www.gnu.org/licenses/>.
+
 ## ods2oct - get data out of an ODS spreadsheet into octave.
 ## Watch out, no error checks, and spreadsheet formula error results
 ## are conveyed as 0 (zero).
@@ -21,11 +395,22 @@
 ## Author: Philip Nienhuis
 ## Created: 2009-12-13
 
-function [ rawarr, ods ] = ods2oct (ods, wsh, range)
+function [ rawarr, ods, rstatus] = ods2jod2oct (ods, wsh, crange)
 
+	persistent months;
+	months = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"};
+
+	if (isempty (crange)) error ("Empty cell range not allowed w. jOpenDocument."); endif
+
+	if (isnumeric(wsh)) wsh = max (wsh - 1, 1); endif	# Sheet COUNT starts at 0!
 	sh	= ods.workbook.getSheet (wsh);
 	
-	[dummy, nrows, ncols, toprow, lcol] = parse_sp_range (range);
+	[dummy, nrows, ncols, toprow, lcol] = parse_sp_range (crange);
+	if (lcol > 1024 || toprow > 65536) error ("ods2oct: invalid range; max 1024 columns & 65536 rows."); endif
+	# Truncate range silently if needed
+	rcol = min (lcol + ncols - 1, 1024);
+	ncols = min (ncols, 1024 - lcol + 1);
+	nrows = min (nrows, 65536 - toprow + 1);
 
 	# Placeholder for data
 	rawarr = cell (nrows, ncols);
@@ -41,16 +426,28 @@
 			if (~isempty (val))
 				if (ischar (val))
 					# Text string
-					rawarr (ii, jj) = val;
+					rawarr(ii, jj) = val;
 				elseif (isnumeric (val))
 					# Boolean
-					if (val) rawarr (ii, jj) = true; else; rawarr (ii, jj) = false; endif 
+					if (val) rawarr(ii, jj) = true; else; rawarr(ii, jj) = false; endif 
 				else
 					try
 						val = sh.getCellAt (celladdress).getValue ().doubleValue ();
-						rawarr (ii, jj) = val;
+						rawarr(ii, jj) = val;
 					catch
-						# Probably empty Cell
+						val = char (val);
+						if (isempty (val))
+							# Probably empty Cell
+						else
+							# Maybe date / time value. Dirty hack to get values:
+							mo = strmatch (toupper (val(5:7)), months);
+							dd = str2num (val(9:10));
+							yy = str2num (val(25:end));
+							hh = str2num (val(12:13));
+							mm = str2num (val(15:16));
+							ss = str2num (val(18:19));
+							rawarr(ii, jj) = datenum (yy, mo, dd, hh, mm,ss);
+						endif
 					end_try_catch
 				endif
 			endif
@@ -58,5 +455,7 @@
 	endfor
 	
 	ods.limits = [ lcol, lcol+ncols-1; toprow, toprow+nrows-1 ];
+	
+	rstatus = ~isempty (rawarr);
 
 endfunction

Modified: trunk/octave-forge/main/io/inst/odsclose.m
===================================================================
--- trunk/octave-forge/main/io/inst/odsclose.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/odsclose.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -14,19 +14,19 @@
 ## along with Octave; see the file COPYING.  If not, see
 ## <http://www.gnu.org/licenses/>.
 
-## odsclose - close an ods spreadsheet file & if needed write out to disk
-## Basic version - no checks yet
+## odsclose - close an ods spreadsheet file
+## usage: ods = odsclose (ods)
 
 ## Author: Philip Nienhuis
 ## Created: 2009-12-13
+## Last update: 2009-12-27
 
 function [ ods ] = odsclose (ods)
 
-	if (ods.changed > 0)
-		file = java_new ('java.io.File', ods.filename);
-		ods.workbook.saveAs (file);
-	endif
+	# If needed warn that dangling spreadsheet pointers may be left
+	if (nargout < 1) warning ("return argument missing - ods invocation not reset."); endif
 	
+	# Until proper ODS write support has been implemented, not much need be done
 	ods = [];
 
 endfunction

Modified: trunk/octave-forge/main/io/inst/odsopen.m
===================================================================
--- trunk/octave-forge/main/io/inst/odsopen.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/odsopen.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -14,17 +14,252 @@
 ## along with Octave; see the file COPYING.  If not, see
 ## <http://www.gnu.org/licenses/>.
 
-## odsopen - open ODS spreadsheet
-## Basic version, no checks yet
+## -*- texinfo -*-
+## @deftypefn {Function File} @var{ods} = odsopen (@var{filename})
+## @deftypefnx {Function File} @var{ods} = odsopen (@var{filename}, @var{readwrite})
+## @deftypefnx {Function File} @var{ods} = odsopen (@var{filename}, @var{readwrite}, @var{reqintf})
+## Get a pointer to an OpenOffice.org spreadsheet in the form of return
+## argument @var{ods}.
+##
+## Calling odsopen without specifying a return argument is fairly useless!
+##
+## To make this function work at all, you need the Java package > 1.2.5 plus
+## either ODFtoolkit > 3.5 & xercesImpl, or jOpenDocument installed on your
+## computer + proper javaclasspath set. These interfaces are referred to as
+## OTK and JOD, resp., and are preferred in that order by default (depending
+## on their presence).
+##
+## @var{filename} must be a valid .ods OpenOffice.org file name. If @var{filename}
+## does not contain any directory path, the file is saved in the current
+## directory.
+##
+## @var{readwrite} is currently ignored until ODS writesupport is implemented.
+##
+## Optional input argument @var{reqintf} can be used to override the ODS
+## interface automatically selected by odsopen. Currently implemented interfaces
+## are 'OTK' (Java / ODFtoolkit) or 'JOD' (Java / jOpenDocument).
+##
+## Examples:
+##
+## @example
+##   ods = odsopen ('test1.ods');
+##   (get a pointer for reading from spreadsheet test1.ods)
+##
+##   ods = odsopen ('test2.ods', [], 'JOD');
+##   (as above, indicate test2.ods will be read from; in this case using
+##    the jOpenDocument interface is requested)
+## @end example
+##
+## @seealso odsclose, odsread, ods2oct, odsfinfo
+##
+## @end deftypefn
 
 ## Author: Philip Nienhuis
 ## Created: 2009-12-13
+## Last update: 2009-12-30
 
-function [ ods ] = odsopen (filename, rw=0)
+function [ ods ] = odsopen (filename, rw=0, reqinterface = [])
 
-	file = java_new ('java.io.File', filename);
-	wb = java_invoke ('org.jopendocument.dom.spreadsheet.SpreadSheet', 'createFromFile', file);
+	persistent odsinterfaces; persistent chkintf;
+	if (isempty (chkintf))
+		odsinterfaces = struct ( "OTK", [], "JOD", [] );
+		chkintf = 1;
+	endif
+	
+	if (nargout < 1) usage ("ODS = odsopen (ODSfile, [Rw]). But no return argument specified!"); endif
 
-	ods = struct ("xtype", 'JOD', "app", file, "filename", filename, "workbook", wb, "changed", 0, "limits", []);
+	if (~isempty (reqinterface))
+		# Try to invoke requested interface for this call. Check if it
+		# is supported anyway by emptying the corresponding var.
+		if (strcmp (tolower (reqinterface), tolower ('OTK')))
+			printf ("Java/ODFtoolkit interface requested... ");
+			odsinterfaces.OTK = []; odsinterfaces.JOD = 0;
+		elseif (strcmp (tolower (reqinterface), tolower ('JOD')))
+			printf ("Java/jOpenDocument interface requested... "); 
+			odsinterfaces.OTK = 0; odsinterfaces.JOD = [];
+		else
+			usage (sprintf ("Unknown .ods interface \"%s\" requested. Only OTK or JOD supported", reqinterface));
+		endif
+		odsinterfaces = getodsinterfaces (odsinterfaces);
+		# Well, is the requested interface supported on the system?
+		if (~odsinterfaces.(toupper (reqinterface)))
+			# No it aint
+			error (" ...but that's not supported!");
+		endif
+	endif
+	
+	# Var xwrite is really used to avoid creating files when wanting to read, or
+	# not finding not-yet-existing files when wanting to write.
+	
+#	if (xwrite) xwrite = 1; endif		# Be sure it's either 0 or 1 initially
 
+	# Check if ODS file exists
+	# Write statements temporarily disabled!
+	fid = fopen (filename, 'rb');
+	if (fid < 0)
+#		if (~xwrite)
+			err_str = sprintf ("File %s not found\n", filename);
+			error (err_str)
+#		else
+#			printf ("Creating file %s\n", filename);
+#			xwrite = 2;
+#		endif
+	else
+		# close file anyway to avoid Java errors
+		fclose (fid);
+	endif
+	
+# Check for the various ODS interfaces. No problem if they've already
+# been checked, getodsinterfaces (far below) just returns immediately then.
+
+	odsinterfaces = getodsinterfaces (odsinterfaces);
+
+# Supported interfaces determined; now check ODS file type.
+
+	chk1 = strcmp (tolower (filename(end-3:end)), '.ods');
+	if (~chk1)
+		error ("Currently ods2oct can only read reliably from .ods files")
+	endif
+
+	ods = struct ("xtype", [], "app", [], "filename", [], "workbook", [], "changed", 0, "limits", []);
+
+	if (odsinterfaces.OTK)
+		# Parts after user gfterry in
+		# http://www.oooforum.org/forum/viewtopic.phtml?t=69060
+		wb = java_invoke ('org.odftoolkit.odfdom.doc.OdfDocument', 'loadDocument', filename);
+		ods.xtype = 'OTK';
+		ods.app = wb;
+		ods.filename = filename;
+		ods.workbook = wb.getContentDom();		# Reads the entire spreadsheet
+
+	elseif (odsinterfaces.JOD)
+		file = java_new ('java.io.File', filename);
+		wb = java_invoke ('org.jopendocument.dom.spreadsheet.SpreadSheet', 'createFromFile', file);
+		ods.xtype = 'JOD';
+		ods.app = 'file';
+		ods.filename = filename;
+		ods.workbook = wb;
+		
+#	elseif 
+#		<other interfaces here>
+
+	else
+		warning ("No support for OpenOffice.org .ods I/O"); 
+
+	endif
+
+	if (~isempty (reqinterface))
+		# Reset found interfaces for re-testing in the next call. Add interfaces if needed.
+		chkintf = [];
+	endif
+
 endfunction
+
+
+## Copyright (C) 2009 Philip Nienhuis <prnienhuis at users.sf.net>
+## 
+## This program is free software; you can redistribute it and/or modify
+## it under the terms of the GNU General Public License as published by
+## the Free Software Foundation; either version 2 of the License, or
+## (at your option) any later version.
+## 
+## This program is distributed in the hope that it will be useful,
+## but WITHOUT ANY WARRANTY; without even the implied warranty of
+## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+## GNU General Public License for more details.
+## 
+## You should have received a copy of the GNU General Public License
+## along with Octave; see the file COPYING.  If not, see
+## <http://www.gnu.org/licenses/>.
+
+## -*- texinfo -*-
+## @deftypefn {Function File} @var{odsinterfaces} = getodsinterfaces (@var{odsinterfaces})
+## Get supported OpenOffice.org .ods file read/write interfaces from
+## the system.
+## Each interface for which the corresponding field is set to empty
+## will be checked. So by manipulating the fields of input argument
+## @var{odsinterfaces} it is possible to specify which
+## interface(s) should be checked.
+##
+## Currently implemented interfaces comprise:
+## - Java & ODFtoolkit (www.apache.org)
+## - Java & jOpenDocument (www.jopendocument.org)
+##
+## Examples:
+##
+## @example
+##   odsinterfaces = getodsinterfaces (odsinterfaces);
+## @end example
+
+## Author: Philip Nienhuis
+## Created: 2009-12-27
+## Last updated 2009-12-27
+
+function [odsinterfaces] = getodsinterfaces (odsinterfaces)
+
+	if (isempty (odsinterfaces.OTK) && isempty (odsinterfaces.JOD))
+		chk1 = 1;
+		printf ("Supported interfaces: ");
+	else
+		chk1= 0;
+	endif
+
+	# Try Java & ODF toolkit
+	if (isempty (odsinterfaces.OTK))
+		odsinterfaces.OTK = 0;
+		try
+			tmp1 = javaclasspath;
+			# If we get here, at least Java works. Now check for proper entries
+			# in class path. Under *nix the classpath must first be split up
+			if (isunix) tmp1 = strsplit (char(tmp1), ":"); endif
+			jpchk = 0; entries = {"rt.jar", "odfdom.jar", "xercesImpl.jar"};
+			for ii=1:size (tmp1, 2)
+				tmp2 = strsplit (char (tmp1(1, ii)), "\\/");
+				for jj=1:size (entries, 2)
+					if (strmatch (entries{1, jj}, tmp2{size (tmp2, 2)})), ++jpchk; endif
+				endfor
+			endfor
+			if (jpchk > 2)
+				odsinterfaces.OTK = 1;
+				printf (" Java/ODFtoolkit (OTK) OK. ");
+				chk1 = 1;
+			else
+				warning ("\n Java support OK but not all required classes (.jar) in classpath");
+			endif
+		catch
+			# ODFtoolkit support nonexistent
+		end_try_catch
+	endif
+
+	# Try Java & jOpenDocument
+	if (isempty (odsinterfaces.JOD))
+		odsinterfaces.JOD = 0;
+		try
+			tmp1 = javaclasspath;
+			# If we get here, at least Java works. Now check for proper entries
+			# in class path. Under unix the classpath must first be split up
+			if (isunix) tmp1 = strsplit (char(tmp1), ":"); endif
+			jpchk = 0; entries = {"rt.jar", "jOpenDocument"};
+			for ii=1:size (tmp1, 2)
+				tmp2 = strsplit (char (tmp1(1, ii)), "\\/");
+				for jj=1:size (entries, 2)
+					if (strmatch (entries{1, jj}, tmp2{size (tmp2, 2)})), ++jpchk; endif
+				endfor
+			endfor
+			if (jpchk > 1)
+				odsinterfaces.JOD = 1;
+				printf (" Java/jOpenDocument (JOD) OK. ");
+				chk1 = 1;
+			else
+				warning ("\nJava support OK but required classes (.jar) not all in classpath");
+			endif
+		catch
+			# No jOpenDocument support
+		end_try_catch
+	endif
+	
+	# ---- Other interfaces here, similar to the ones above
+
+	if (chk1) printf ("\n"); endif
+	
+endfunction
\ No newline at end of file

Modified: trunk/octave-forge/main/io/inst/odsread.m
===================================================================
--- trunk/octave-forge/main/io/inst/odsread.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/odsread.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -15,47 +15,104 @@
 ## <http://www.gnu.org/licenses/>.
 
 ## -*- texinfo -*-
-## @deftypefn {Function File} @var{rawarr} = odsread (@var{filename}, @var{wsh}, @var{range})
-## .
+## @deftypefn {Function File} [@var{numarr}, @var{txtarr}, @var{rawarr},  @var{limits}] = odsread (@var{filename})
+## @deftypefnx {Function File} [@var{numarr}, @var{txtarr}, @var{rawarr}, @var{limits}] = odsread (@var{filename}, @var{wsh})
+## @deftypefnx {Function File} [@var{numarr}, @var{txtarr}, @var{rawarr}, @var{limits}] = odsread (@var{filename}, @var{wsh}, @var{range})
+## @deftypefnx {Function File} [@var{numarr}, @var{txtarr}, @var{rawarr}, @var{limits}] = odsread (@var{filename}, @var{wsh}, @var{range}, @var{reqintf})
 ##
-## Proof-of-concept function for reading data from an ODS spreadsheet.
-## It works but there are no error checks at all; erroneous function 
-## results and empty cells are not ignored but returned as 0.
+## Read data contained in range @var{range} from worksheet @var{wsh}
+## in OpenOffice.org Calc spreadsheet file @var{filename}.
 ##
-## You need to have jOpenDocument-1.2b2.jar in your javaclasspath (get it at
-## http://www.jopendocument.org/) and the octave-forge java package installed.
+## You need the octave-forge java package (> 1.2.5) and one or both of
+## jopendocument.jar or preferrably: (odfdom.jar & xercesImpl.jar) in
+## your javaclasspath.
 ##
-## @var{filename} must be a valid ODS spreadsheet file name. If @var{filename}
-## does not contain any directory path, the file is saved in the current
-## directory.
+## Return argument @var{numarr} contains the numeric data, optional
+## return arguments @var{txtarr} and @var{rawarr} contain text strings
+## and the raw spreadsheet cell data, respectively, and @var{limits} is
+##a struct containing the data origins of the various returned arrays.
 ##
-## @var{wsh} is the name or index number of a worksheet in the spreadsheet file.
+## If @var{filename} does not contain any directory, the file is
+## assumed to be in the current directory.
 ##
-## @var{range} must be a valid spreadsheet range and cannot be empty.
+## @var{wsh} is either numerical or text, in the latter case it is 
+## case-sensitive and it should conformtoOpenOffice.org Calc sheet
+## name requirements.
+## Note that in case of a numerical @var{wsh} this number refers to the
+## position in the worksheet stack, counted from the left in a Calc
+## window. The default is numerical 1, i.e. the leftmost worksheet
+## in the Calc file.
 ##
-## Return args @var{numarr} and @var{txtarr} contain numeric & text data, 
-## resp.; @var{rawarr} contains the raw data and lim the actual cell ranges
-## from where the data originated.
+## @var{range} is expected to be a regular spreadsheet range format,
+## or "" (empty string, indicating all data in a worksheet).
 ##
-## Example:
+## If only the first argument is specified, odsread will try to read
+## all contents from the first = leftmost (or the only) worksheet (as
+## if a range of @'' (empty string) was specified).
+## 
+## If only two arguments are specified, odsread assumes the second
+## argument to be @var{wsh} and to refer to a worksheet. In that case
+## odsread tries to read all data contained in that worksheet.
 ##
+## The optional last argument @var{reqintf} can be used to override 
+## the automatic selection by odsread of one interface out of the
+## supported ones: Java/ODFtoolkit or Java/jOpenDocument.
+##
+## Erroneous data and empty cells are set to NaN in @var{numarr} and
+## turn up empty in @var{txtarr} and @var{rawarr}. Date/time values
+## in date/time formatted cells are returned as numerical values in
+## @var{obj} with base 1-1-000. Note that OpenOfice.org and MS-Excel
+## have different date base values (1/1/0000 & 1/1/1900, resp.) and
+## internal representation so MS-Excel spreadsheets rewritten into
+## .ods format by OpenOffice.org Calc may have different date base
+## values.
+##
+## @var{numarr} and @var{txtarr} are trimmed from empty outer rows
+## and columns, so any returned array may turn out to be smaller than
+## requested in @var{range}.
+##
+## When reading from merged cells, all array elements NOT corresponding 
+## to the leftmost or upper Calc cell will be treated as if the
+## "corresponding" Calc cells are empty.
+##
+## odsread is just a wrapper for a collection of scripts that find out
+## the interface to be used and do the actual reading. For each call
+## to odsread the interface must be started and the Calc file read into
+## memory. When reading multiple ranges (in optionally multiple worksheets)
+## a significant speed boost can be obtained by invoking those scripts
+## directly (odsopen / ods2oct [/ parsecell] / ... / odsclose).
+##
+## Examples:
+##
 ## @example
-## [ n, t, r, l ] = odsread ('test1.ods', '2ndSh', 'A1:AF250');
+##   A = odsread ('test4.ods', '2nd_sheet', 'C3:AB40');
+##   (which returns the numeric contents in range C3:AB40 in worksheet
+##   '2nd_sheet' from file test4.ods into numeric array A) 
 ## @end example
 ##
+## @example
+##   [An, Tn, Ra, limits] = odsread ('Sales2009.ods', 'Third_sheet');
+##   (which returns all data in worksheet 'Third_sheet' in file test4.ods
+##   into array An, the text data into array Tn, the raw cell data into
+##   cell array Ra and the ranges from where the actual data came in limits)
+## @end example
+##
+## @seealso odsopen, ods2oct, odsclose, odsfinfo, parsecell
+##
 ## @end deftypefn
 
 ## Author: Philip Nienhuis <prnienhuis at users.sf.net>
 ## Created: 2009-12-12
+## Last update: 2009-12-29
 
-function [ numarr, txtarr, rawarr, lim ] = odsread (filename, wsh, range)
+function [ numarr, txtarr, rawarr, lim ] = odsread (filename, wsh=1, datrange=[], reqintf=[])
 
-	ods = odsopen (filename, 0);
+	ods = odsopen (filename, 0, reqintf);
 	
-	[rawarr, ods] = ods2oct (ods, wsh, range);
+	[rawarr, ods] = ods2oct (ods, wsh, datrange);
 	
 	[numarr, txtarr, lim] = parsecell (rawarr, ods.limits);
 
-	odsclose (ods);
+	ods = odsclose (ods);
 
 endfunction

Modified: trunk/octave-forge/main/io/inst/parsecell.m
===================================================================
--- trunk/octave-forge/main/io/inst/parsecell.m	2009-12-30 15:04:15 UTC (rev 6679)
+++ trunk/octave-forge/main/io/inst/parsecell.m	2009-12-30 15:06:54 UTC (rev 6680)
@@ -16,16 +16,25 @@
 
 ## -*- texinfo -*-
 ## @deftypefn {Function File} [ @var{numarr}, @var{txtarr}, @var{lim} ] = parsecell (@var{rawarr})
+## @deftypefnx {Function File} [ @var{numarr}, @var{txtarr}, @var{lim} ] = parsecell (@var{rawarr}, @var{limits})
 ##
 ## Divide a heterogeneous 2D cell array into a 2D numeric array and a
 ## 2D cell array containing only strings. Both returned arrays are
 ## trimmed from empty outer rows and columns.
+## This function is particularly useful for parsing cell arrays returned
+## by functions reading spreadsheets (e.g., xlsread, odsread).
 ##
 ## Optional return argument @var{lim} contains two field with the outer
 ## column and row numbers of @var{numarr} and @var{txtarr} in the
-## original @var{rawarr}.
+## original array @var{rawarr}.
+## If optional input argument @var{limits} contained the spreadsheet
+## data limits returned in the spreadsheet file pointer struct
+## (field xls.limits or ods.limits), optional return argument @var{lim}
+## contains the real spreadsheet row & column numbers enclosing the
+## origins of the numerical and text data returned in @var{numarr}
+## and @var{txtarr}.
 ##
-## Example:
+## Examples:
 ##
 ## @example
 ##   [An, Tn] = parsecell (Rn);
@@ -33,12 +42,19 @@
 ##    text data into array Tn)
 ## @end example
 ##
-## @seealso xlsread, xls2oct
+## @example
+##   [An, Tn, lims] = parsecell (Rn, xls.limits);
+##   (which returns the numeric contents of Rn into array An and the
+##    text data into array Tn.)
+## @end example
 ##
+## @seealso xlsread, odsread, xls2oct, ods2oct
+##
 ## @end deftypefn
 
 ## Author: Philip Nienhuis
 ## Created: 2009-12-13
+## Last updated: 2009-12-29
 
 function [ numarr, txtarr, lim ] = parsecell (rawarr, rawlimits=[])
 


This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.




[Octave-cvsupdate] SF.net SVN: octave:[6680] trunk/octave-forge/main/io/inst

A collection of packages providing extra functionality for GNU Octave

[Octave-cvsupdate] SF.net SVN: octave:[6680] trunk/octave-forge/main/io/inst