From: Kern S. <ke...@si...> - 2006-08-22 13:36:26
|
On Tuesday 22 August 2006 15:22, Bill Moran wrote: > In response to Kern Sibbald <ke...@si...>: > > > On Tuesday 22 August 2006 14:23, Dan Langille wrote: > > > On 22 Aug 2006 at 13:57, Kern Sibbald wrote: > > > > > > > Thanks, I found a solution -- unfortunately to make it "PostgreSQL > > (POSIX?)" > > > > compliant meant making the query less efficient as it returns twice as > > much > > > > data as I need ... > > > > > > Twice as many rows? Duplicates? > > > > Twice as many columns. PostgreSQL as opposed to MySQL forces me to return a > > field (column) I do not want: > > > > const char *sql_jobids_from_client = > > "SELECT DISTINCT Job.JobId,Job.StartTime FROM Job,Pool" > > " WHERE Client.Name='%s' AND Pool.Name='%s' AND Job.PoolId=Pool.PoolId" > > " AND Job.ClientId=Client.ClientId " > > " ORDER by Job.StartTime"; > > > > I don't care to see StartTime, only that the JobId is sorted on it. > > > > In MySQL, > > > > const char *sql_jobids_from_client = > > "SELECT DISTINCT Job.JobId FROM Job,Pool" > > " WHERE Client.Name='%s' AND Pool.Name='%s' AND Job.PoolId=Pool.PoolId" > > " AND Job.ClientId=Client.ClientId " > > " ORDER by Job.StartTime"; > > > > works fine, and I am not smart enough to understand the nuances of why > > PostgreSQL (or POSIX or whoever) requires the column be returned in order to > > have is sorted. Someone explained it, but for my feeble brain, it is just > > not logical, and this what make programming "correct" SQL far more difficult > > than programming C (for me). If there are two JobIds with the same > > StartTime, which is never the case, I want the SQL to simply take the one > > with the smaller StartTime and throw the second one away (but I know that > > will never happen and unfortunately "correct" SQL as far as I can see has no > > clean way for me to do that -- a incomplete design decision IMO). > > It's ANSI SQL (SQL99, I believe it's called) that's dictating this. Yes, thanks. > My > understanding is how can you sort on something you don't use? Easy. Here are the instructions: Get records with all the following conditions xx and yy and zz, ...; sort on StartTime; remove all duplicate records of JobId (this is called "uniq" in *nix scripting land); return JobId. That's what I want to do. ANSI SQL makes it impossible to do exactly as I want -- or at least, let's say that no one has proposed a solution other than I *must* also get back StartTime (and maybe you have some idea with multiple selects below). > In > PostgreSQL, the planner is going to analyze its information on the table > to determine whether it makes more sense to sort the table first, then > apply the distinct, or apply the distinct, then the sort. Real world > example: > bacula=> explain select distinct job.jobid,job.starttime from job,pool,client > bacula-> where client.name='mindwipe-fd' and pool.name='OffSiteRotation' and job.poolid=pool.poolid > bacula-> AND Job.ClientId=Client.ClientId ORDER by Job.StartTime; > QUERY PLAN > ------------------------------------------------------------------------------------ > Unique (cost=20.93..20.95 rows=2 width=12) > -> Sort (cost=20.93..20.93 rows=2 width=12) > Sort Key: job.starttime, job.jobid > -> Nested Loop (cost=1.28..20.92 rows=2 width=12) > Join Filter: ("inner".poolid = "outer".poolid) > -> Seq Scan on pool (cost=0.00..1.09 rows=1 width=4) > Filter: (name = 'OffSiteRotation'::text) > -> Hash Join (cost=1.28..19.61 rows=18 width=16) > Hash Cond: ("outer".clientid = "inner".clientid) > -> Seq Scan on job (cost=0.00..16.10 rows=410 width=20) > -> Hash (cost=1.27..1.27 rows=1 width=4) > -> Seq Scan on client (cost=0.00..1.27 rows=1 width=4) > Filter: (name = 'mindwipe-fd'::text) > (13 rows) > > Notice that the unique is the last thing PostgreSQL does. Makes sense, but > what if the contents of the table were different in a way that it would be > faster to apply the distinct first -- then it would be impossible to sort. > > My instinct in PostgreSQL would be to alter the query to contain a subselect, Thanks, but that's more complicated than I want to get right now. > which would return the correct # of columns. Does MySQL support subselects > yet? Yes, for a long time now (at least several years). > > -- > Bill Moran > Collaborative Fusion Inc. > > ------------------------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Bacula-devel mailing list > Bac...@li... > https://lists.sourceforge.net/lists/listinfo/bacula-devel > |