[Squirrel-sql-develop] Updating data in the tables

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Wow!  What a lot of mail!

Thanks to everyone for the comments on my suggestions.  One topic in
particular seems to have generated the most interest, so for my own
sanity (such as it is) I think I'll focus on that issue and defer
dealing with the others.

I like to define the desired user experience and work from that
into the code, but since there have been several emails about
the proposed UPDATE mechanism I'll discuss the "Primary Key issue" first.

Simply put, I won't need to locate the Primary Keys because I will 
always use every column of the row in the WHERE clause, so any columns 
that have been designated as Primary Keys will be included 
automatically.  Therefore we don't need code to explicitly check for 
Primary Keys having been defined.  From a performance perspective this 
is ok because it is a single update operation that is happening once in 
a human-level time frame (seconds), not something that is being repeated 
many times or that is blocking a program waiting for the results.  Thus, 
the expected poor performance of this operation should not be an issue.
The (Squirrel-like) dbtool that I built in a previous life worked this 
way and ran for 2 years with about 20 users and never had any problems, 
so I think we should be all right.

Now on with the show...

GENERAL APPROACH

Johan Compagner wrote:
   > For editing tables i preffer something else (what i also see in other
tools)
   > I dont like direct editting in a table.
   >
   > you have a table where you can select a row.
   > That row is loaded in a panel with:
   > [label] [value]
   > [label] [value]

I understand the concept.  Personally I prefer
direct manipulation of the data in the table, as is done in
spreadsheets.  This seems to me to much more "transparent" and less work
for the user.  There is no need to request an editing panel,
reloacate the data item to change in that panel (which may require
scrolling) and then dismiss the panel (ie: "OK").  Also, with direct
manipulation it would be easier for the user to see relationships
across multiple rows (e.g. the values in one column) as they are editing
the data.

Without having looked at the Squirrel code, I believe that direct
manipulation requires less work to implement, mostly because there is no
extra panel or controls ("OK", "Cancel") to implement.  The user
experience that I hope to implement is:
	- user clicks on a cell in table; cell highlights
	- user edits data (more on this below)
	- user leaves the cell by tabbing to next cell or by clicking on another 
cell.  This triggers the DB update.

There are two other conditions that we will need to handle:
	- user "leaves" this view of the data, by closing the window, closing the 
connection, or some other operation.
	- shutdown of the application.
If the user has edited a cell and not updated the DB, we will need to
decide whether we want to update the DB when these conditions occur.

One advantage of using the panel is that it gives the user the chance to
say "Oops, I didn't mean to change that" and click on "Cancel".  With
the direct manipulation method, this is a bit more awkward.  We would
need to provide either an "abort" function of some kind, or an "undo"
mechanism.  In spite of this, the users of my previous db tool were very
happy with the direct manipulation approach.

If it is important enough to the Squirrel user community, we could
certainly implement both methods.  The user could select preferences as:
	[checkbox] Table contents are editable
		(radio button) in the table
		(radio button) in a separate panel
Obviously implementing both methods would be extra work.  I'm not
strongly motivated to provide both, so I would appreciate some guage of
interest from the user community.  Are there a lot of people that would
prefer to update the data in a separate panel?

On another topic, as a streach goal we could also make some of the
tables generated by the SQL requests editable.  Any SQL request of the
form "SELECT * FROM <table> [WHERE...]" would contain enough info for us
to do the UPDATE.  We will need to address 3 questions:
	- is it useful to have some of the SQL result tables editable while 
others are not?
	- would that difference be confusing to the user?
	- is the code needed to do this too much trouble?
All that is work for another day.

EDITING DATA IN THE TABLE CELL

There are a lot of DB types (see java.sql.Types), some of which I've
never worked with, and a few I've never even heard of!

For simple text, integer and date fields, the user should be able to
change the data using the normal mouse select/highlight and keyboard
entry operations.  With this approach I think we should be able to 
handle: BIGINT, BOOLEAN,
CHAR, DATE, DECIMAL, DOUBLE, FLOAT, INTEGER, LONGVARCHAR, NUMERIC, REAL,
SMALLINT TIME, TIMESTAMP, TINYINT, and VARCHAR.  However, I'm not
familiar with all the implications of these different types, so I might
handle some of them incorrectly (especially during validation).

Johan Compagner made a suggestion for editing certain fields:
   >
   > and if possible FK are know and the value is showed in a combobox
with the list
   > filled by the pk table, Then i need some info what columns you store
   > This feature i have pretty much working for another project, but it
uses some different
   > code underneath (i didn't have to generated sql ect because we use OJB)

I'm not sure I understand this correctly, so let me restate what I think
Johan is saying:
	- use getImportedKeys() to identify the foreign keys used in
this table, the tables those keys point to, and the primary key column
in that other table
	- get the contents of the primary key field(s)
	- generate a combobox in which the user may select one of the
existing values or enter a value not on the list.
(I include the ability to enter a value not on the list because a
user may be entering data in one table before adding the appropriate
values in the referenced table.  This assumes, of course, that there are
no DB constraints on that relation, and it is not the ideal sequence for 
entering data,
but hey, it's a messy world out there, and people do this kind of thing
all the time.)

This is an interesting idea.  It can help the user by showing what
values are currently available in the other table.  Unfortunately, those
values are likely to be codes (e.g. numbers) that the user still needs
to look up to understand what the code is refering to.  For example, if
the foreign table is a simple list of country codes:
	1 Austria
	2 Australia
	3 Belgium
	etc...
Then the primary key in that table would likely be the numeric code, not
the (human readable) name of the country.  Thus, when editing a
"Customer" table that references the country code, the values in the
combo box list would be numbers, not the names.  For more complicated
tables, the primary keys may be even less mnemonic of the row in the
foreign table.

One thought I have that could help this would be to make the foreign key
values links that would show the contents of the rows in the other table
when clicked.  That sound's like it could be a performance problem, but
it might be worth persuing.

I see other issues with this as well.  One is that, as Colin pointed
out, many people do not identify the primary keys in their databases.
Also, I suspect, many people do not formally specify their foreign keys. 
  However, assuming that the foreign keys have all been specified, what
do we do with a referenced table that contains thousands or millions of 
rows?  For example, what if a table contains a reference to a particular
transaction entry in a transaction log containing hundreds of thousands
of entries?  To create the combo box would seem to require retrieving
data for each of those entries and adding that data to the combo box
list.  My gut feeling is that trying to do that would hang up Squirrel
for a very long time, and possibly cause it to die (e.g. out of memory).

As an alternative, I would offer the following suggestion.  On any field
the user could click the right mouse button, and in the menu we could
have an item "Treat as Foreign Key".  Selecting that would bring up a
list of table names, and selecting that table name would bring up a new
panel with the contents of that table in it.  The user could scroll
through that table to the item desired, then click on the key value in
that table to be entered in the original table.  I can see problems with
this proposal as well, not the least of which is the handling of
multi-column primary keys.

One thing about these proposals is that it is not essential to
implement
either of them initially, and we should be able to add them on later,
provided we can
solve the performance issues.  My recommendation is to defer this.

Editing of binary fields is interesting (as in the old Chinese curse: 
"May you live in interesting times").  There are several different ways 
to view binary data, and the user may wish to use different views of the 
binary data in each column.  My suggestion is that when the user clicks
on a binary field that we pop up a small window containing the value in
a editable text field and a set of radio buttons that lets the user 
view/edit the data as:
	- bits ("10110011" - I would put spaces between each byte)
	- hex ("B3")
	- octal ("1 7 3" - I'm not sure of the best byte alignment here)
	- ascii (i.e. as the ascii chars for those nybbles)
The default would be based on the Session Preferences.  After the window
pops up, the user could select any of the representations, and editing
would be allowed using the characters used in that representation.  For
consistancy with the other data fields, I would use a tab or click on
another data item as indication of editing being completed and pop-down
the window at that point.  This is as opposed to using an "Ok" button in
the window itself, though we could also include that if desired.  This
approach should work for types BINARY, BIT, LONGVARBINARY and VARBINARY.

For BLOBs I would use the binary type's editing window except that:
	- the data space would be much larger (a multi-line window rather than 
one line)
	- there would be another editable field at the top of the window with a 
label "Replace with file:", and possibly a "Browse" button.
	I'm a bit nervous about reading a table with BLOB or CLOB columns 
directly into memory because of the potential size of the data.  This 
could adversely impact both memory usage and speed of data retrieval.  I 
see from the Session Preferences that Squirrel lets the user limit the 
amount of the BLOB/CLOB that is read, and that would be a problem if the 
user wants to edit the data.  For these reasons, I think that if the 
user selects a BLOB/CLOB field to edit, we should re-read the database 
for that item at that time.  (This will be a problem if there are two 
rows in the DB that are identical except for the BLOB/CLOB.) Once the 
BLOB is in memory, we can apply the same view/edit formats as discussed 
for the binary data.
	The additional field for a file name would allow the user to point to a 
binary file containing the data to replace the current contents of the
field.  For example, if the BLOB is actually a .gif, then the user could
replace it with a different picture by identifying the file.

CLOBs I would handle in the same way as BLOBs, except that there is no
need for the multiple formats for editing (is there?).

The data types that I do not now know how to handle in a database are:
	- ARRAY
	- DATALINK
	- DISTINCT
	- JAVA_OBJECT
	- OTHER
	- REF
	- STRUCT
I don't know what some of these mean, and I don't know whether they are
all legal data types for columns.  If anyone has suggestions on how to
handle these, please pass them on.

I'ld like to toss out another open issue here: Should we consider adding
an "Undo"/"Redo" mechanism to Squirrel?  If so, then editing of the data
would be one thing that should be tied into that mechanism.

DATA VALIDATION AND DB UPDATE

Once the user signals that editing is done (by leaving the table cell),
we should do the following:
	- if the data has not changed, then do nothing
	- if the field is marked as a foreign key, and if the referenced table
does not contain the data in the primary key field, then display a
warning to the user and let the user decide whether to proceed or not.
	- validate that data is of the appropriate type and convert any external
formats to the appropriate internal format (eg: change hex "1001" into
binary for 9).  If validation fails, put up an error message and return
to the editing step without updating the database.
	- Use getBestRowIdentifier to see if there are any Pseudo columns used
for identification of the rows.  I suspect that for us to use the Pseudo
columns, we will need to modify the table retrieval to get that data,
and then keep it in a hidden column in the table.  If the column is not 
a Pseudo column, then we already have it and thus can ignore the 
getBestRowIdentifier results.  If it is a Pseudo columm (or multiple 
columns?), it is probably a unique key, but since no one is guaranteeing 
that, we proceed to the next step.
	- Create two WHERE clauses containing all of the table columns, one with 
the current data for the row, and one with the proposed update.  If 
getBestRowIdentifier returned a Pseudo column, then include that as one 
of the fields in the WHERE clauses.
	- Then (expanding on Colin's suggestion)
we search for how many rows have an exact match for the existing data in
the row being edited.  If there are more than one, then we post a
warning to the user that "This same update will be made to N identical 
rows" and ask if the user wants to proceed.  If not, then do not update 
the database and revert to the original data in the table.
	- Otherwise, search the DB for how many rows currently in the DB exactly 
match the proposed updated data in the current row.  If more than 0, 
then post a warning to the user saying "This update will cause this row 
to become identical to N other rows" and ask if the user wants to 
proceed.  If not, then do not update the database and revert to the 
original data.
	- If we get this far, then update the database using the current row 
contents in the WHERE clause.

By using this approach, we at least "do no harm".  If the user's request
is going to do something "unusual", we tell the user and let them abort
the operation.  If the user wants to modify only one of two identical 
rows, then they will have to find a way to do it through SQL rather than 
using the table view.

One potential modification of the above: I'm not sure if it
is a good idea to include BLOBs and CLOBs in all the comparisons in this
process.  Would that be too much of a performance problem?  If so, then 
we would probably need to treat all BLOB/CLOB data as if it had been 
changed, whether it was or not.

NEW ROW and DELETE ROW OPERATIONS

While these features are separate from the data edit feature, they are
related so I'll discuss them here.

How adding a new row will work depends on whether we use a separate
panel for input or direct manipulation.  If we use a separate panel,
then we need a way for the user to ask for that panel.  I would suggest
adding a menu on the main window labeled "Edit" and and item in that
menu titled "New Row".  This would be somewhat consistant with other
tools.  Then, as Johan says, we can use the same window for input as for
editing.

If we use direct manipulation, I would like to try always including a
blank row at the bottom of the table.  If the user wants to add a new
row, they just enter data into that blank row.  We could still have the
"Edit"->"New Row" menu item (which would auto-scroll the table to the
end) so that new users can find the feature easilly.

For deletion,
Johan wrote:
  > delete key should be added to the table as a column yes.

My training and experience is that it is not good UI design to include 
both data (the contents of the DB) and controls (the delete entry) in 
the same window.  Even if we make the column visually distinctive in 
some way (which could be distracting), it is still confusing for the user.

Colin included a piece of someone else's email regarding delete:
 > A very nice feature which I picked up from DbVisualizer
 > is the concepts of shortcuts. (I am not sure that
 > "shortcuts" is what they call it.)
 >
 > What basically happens is that when one clicks on the
 > content tab of a table or view (and see the data in the
 > table) one can right click and get options like Select
 > all, Select count, Select, Delete, Insert, Update,

[IMHO I feel strongly that the table should be Updateable by default, 
and thus we should not have an Update menu item. - Glenn]

 >
 > Say I chose the first three columns of a row in a table
 > and select Delete. It will then take me to the SQL
 > section, where a delete from table_name, where ...
 > clause will have been pasted with the thee columns
 > names and pre-populated values

I would prefer the select-and-delete mechanism, which is the one used by 
most tools (think Word, Excel, Powerpoint, etc.).  The user should be 
able to click on a row (or, more accuratly, on one of the cells in the 
table), then select "delete" from somewhere. I have two suggestions, 
based on what other
tools do:
	- put it under the right mouse button menu
	- put it under the main menu "Edit" item
My recommendation would be that we do both.  Also, a "Select All" should 
be included under the "Edit" menu.

One question for the Squirrel user community is whether the delete
operation should request confirmation from the user.  Some people like
this, but it just irritates other folk.

As a stretch goal, it would be nifty if we could let the user select
multiple rows and delete them all.

I would say that delete definitely deserves discussion. (Nice 
alliteration, eh?)

As far as using the "Edit" menu goes, someone in one of the recent mails 
said that they don't like menus and "never" use them.  I am not a big 
fan of menus myself.  My previous dbtool had all the controls available 
as buttons on a top-level panel.  The problem is that the functions in 
Squirrel are growing too numerous to handle without using a better 
method of organization, and menus are the method of choice for other 
complex programs (think IDEs, Word, etc.).  Also, there is a standard 
for using menu items ("File, Edit, View"), and it would make Squirrel 
easier to use if we use the same GUI conventions as other programs.  As 
things stand now, it is often not obvious to me as a Squirrel newbie 
where to find the various functions that I want.  The easier we can make 
that, the better for everyone.

In a wildly out-of-my-mind moment, I've had another thought:  How about 
doing Cut-and-Paste of random sized chunks of the table, and 
automatically updating all those entries in the DB?  Wow!

Whew! What a lot of test to describe something we hope will look really
simple to the user!  Now its time to figure out how to do it.  I guess I
got to start reading code ;-(

Glenn

[Squirrel-sql-develop] Updating data in the tables

A Java SQL client for any JDBC compliant database

[Squirrel-sql-develop] Updating data in the tables