Thread: [SQLObject] Select Single Random Record
SQLObject is a Python ORM.
Brought to you by:
ianbicking,
phd
From: Adam B. <ada...@gm...> - 2012-04-18 16:30:21
|
What is the best way to select a single random record from a table and falls within the "where" criteria i have defined? I don't want to return a huge result set each time and then pick a random one, as the database is located outside of my network and accessed over the internet. -- Adam Berg |
From: Oleg B. <ph...@ph...> - 2012-04-18 16:38:50
|
On Wed, Apr 18, 2012 at 12:30:10PM -0400, Adam Berg <ada...@gm...> wrote: > What is the best way to select a single random record from a table and > falls within the "where" criteria i have defined? I don't want to return a > huge result set each time and then pick a random one Please define "random" in the context of your task. Do you want a different record every time? Or some non-deterministic record that could be the same every time is a possible solution? Oleg. -- Oleg Broytman http://phdru.name/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: Hans G. S. <geo...@sc...> - 2012-04-18 17:20:54
|
On Wed, Apr 18, 2012 at 08:38:42PM +0400, Oleg Broytman wrote: > On Wed, Apr 18, 2012 at 12:30:10PM -0400, Adam Berg <ada...@gm...> wrote: > > What is the best way to select a single random record from a table and > > falls within the "where" criteria i have defined? I don't want to return a > > huge result set each time and then pick a random one > > Please define "random" in the context of your task. Do you want a > different record every time? Or some non-deterministic record that could > be the same every time is a possible solution? Obviously, i cannot defined «random» in his context, but I have a somewhat related problem. To me «random» means that every record is equally likely as the return of the next execution of the query, where «likely» can be entirely subjective seen from the user's point of view. In my context the issue is a queue of number crunching jobs, where a number of clients (10-100; maybe more) will query for the next parameter set to run the calculation for. A client receiving a job will set a field to indicate that it is taken, and transaction locking is used to avoid multiple clients snitching the same job at the same time. Ideally, I would have liked a query to retrieve the first unlocked record (it is useful but not necessary to order them by timestamp). Random selection, which makes it unlikely that a client will have to wait for the lock would be a useful second best. Or maybe there is another ingenious approach which I have not thought of. Admittedly, this is a quick and dirty solution which seems to work. -- :-- Hans Georg |
From: Adam B. <ada...@gm...> - 2012-04-18 17:25:22
|
Oleg Broytman <phd <at> phdru.name> writes: > > On Wed, Apr 18, 2012 at 12:30:10PM -0400, Adam Berg <adamberg <at> gmail.com> wrote: > > What is the best way to select a single random record from a table and > > falls within the "where" criteria i have defined? I don't want to return a > > huge result set each time and then pick a random one > > Please define "random" in the context of your task. Do you want a > different record every time? Or some non-deterministic record that could > be the same every time is a possible solution? > > Oleg. It should be a different record each time. I don't need a way to produce the same results again. |
From: Oleg B. <ph...@ph...> - 2012-04-18 18:31:57
|
On Wed, Apr 18, 2012 at 05:18:41PM +0000, Adam Berg <ada...@gm...> wrote: > Oleg Broytman <phd <at> phdru.name> writes: > > On Wed, Apr 18, 2012 at 12:30:10PM -0400, Adam Berg <adamberg <at> gmail.com> > wrote: > > > What is the best way to select a single random record from a table and > > > falls within the "where" criteria i have defined? I don't want to return a > > > huge result set each time and then pick a random one > > > > Please define "random" in the context of your task. Do you want a > > different record every time? Or some non-deterministic record that could > > be the same every time is a possible solution? > > It should be a different record each time. Do it in two steps. First, count the number of objects that satisfy your criteria: count = MyTable.select(whereClause).count() Optimize your whereClause and indices. Generate an int in range 0..count-1 (see module `random`). And select the item: row = MyTable.select(whereClause, orderBy=MyTable.q.id)[random_int] With this query SQLObject uses OFFSET and LIMIT=1 to select exactly that row. > To give context to the issue, I am building my own icecast2 broadcaster. I want > to have an Auto DJ feature which will pick a random song from the database which > hasn't recently been played plus some other criteria. It really doesn't matter > which record i get back from the database, I just need one at random. Aha, you mark played songs in the DB, so your whereClause certainly contains "if was not played yet" criteria. In this case selecting just the first item without ordering the result set ought to be enough: row = MyTable.select(whereClause)[0] You have to experiment. Oleg. -- Oleg Broytman http://phdru.name/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: Adam B. <ada...@gm...> - 2012-04-18 17:44:56
|
Adam Berg <adamberg <at> gmail.com> writes: > > Oleg Broytman <phd <at> phdru.name> writes: > > > > > On Wed, Apr 18, 2012 at 12:30:10PM -0400, Adam Berg <adamberg <at> gmail.com> > wrote: > > > What is the best way to select a single random record from a table and > > > falls within the "where" criteria i have defined? I don't want to return a > > > huge result set each time and then pick a random one > > > > Please define "random" in the context of your task. Do you want a > > different record every time? Or some non-deterministic record that could > > be the same every time is a possible solution? > > > > Oleg. > > It should be a different record each time. > I don't need a way to produce the same results again. > > ------------------------------------------------------------------------------ > Better than sec? Nothing is better than sec when it comes to > monitoring Big Data applications. Try Boundary one-second > resolution app monitoring today. Free. > http://p.sf.net/sfu/Boundary-dev2dev To give context to the issue, I am building my own icecast2 broadcaster. I want to have an Auto DJ feature which will pick a random song from the database which hasn't recently been played plus some other criteria. It really doesn't matter which record i get back from the database, I just need one at random. |
From: Petr J. <pet...@tp...> - 2012-04-18 18:46:57
|
> To give context to the issue, I am building my own icecast2 broadcaster. I > want > to have an Auto DJ feature which will pick a random song from the database > which > hasn't recently been played plus some other criteria. It really doesn't > matter > which record i get back from the database, I just need one at random. > *Get the list of valid IDs from the remote database using the raw SQL command something like:* con = connectionForURI("firebird://yourUserName:yourPassword@serverAddress :3050/pat/to/database/database.file") validIDs= "SELECT ID FROM SONGS_TABLE where ......" returnedIDs = con.queryAll(validIDs) # you get the list of valid IDs, something like this: [(1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,), (11,)] *than:* import random randomID = random.choice(returnedIDs)[0] randomSong = SongsTable.get(randomID) HTH Regards Petr |
From: Pʀoмᴇтнᴇυs 0. <pro...@gm...> - 2012-04-18 19:15:27
|
I'll see about adding a recipe to the SQLObject Community Doc site for this. ( https://sites.google.com/site/sqlobject/ ) On Wed, Apr 18, 2012 at 12:46 PM, Petr Jakeš <pet...@tp...> wrote: > > To give context to the issue, I am building my own icecast2 broadcaster. I >> want >> to have an Auto DJ feature which will pick a random song from the >> database which >> hasn't recently been played plus some other criteria. It really doesn't >> matter >> which record i get back from the database, I just need one at random. >> > > *Get the list of valid IDs from the remote database using the raw SQL > command > something like:* > > con = connectionForURI("firebird://yourUserName:yourPassword@serverAddress > :3050/pat/to/database/database.file") > > validIDs= "SELECT ID FROM SONGS_TABLE where ......" > > returnedIDs = con.queryAll(validIDs) > > # you get the list of valid IDs, something like this: [(1,), (2,), (3,), > (4,), (5,), (6,), (7,), (8,), (9,), (11,)] > > *than:* > > import random > randomID = random.choice(returnedIDs)[0] > randomSong = SongsTable.get(randomID) > > HTH > > Regards > > Petr > > > ------------------------------------------------------------------------------ > Better than sec? Nothing is better than sec when it comes to > monitoring Big Data applications. Try Boundary one-second > resolution app monitoring today. Free. > http://p.sf.net/sfu/Boundary-dev2dev > _______________________________________________ > sqlobject-discuss mailing list > sql...@li... > https://lists.sourceforge.net/lists/listinfo/sqlobject-discuss > > |
From: Hans G. S. <geo...@sc...> - 2012-04-18 19:23:27
|
On Wed, Apr 18, 2012 at 08:46:47PM +0200, Petr Jakeš wrote: > *Get the list of valid IDs from the remote database using the raw SQL > command > something like:* > > con = connectionForURI("firebird://yourUserName:yourPassword@serverAddress > :3050/pat/to/database/database.file") > > validIDs= "SELECT ID FROM SONGS_TABLE where ......" > > returnedIDs = con.queryAll(validIDs) > > # you get the list of valid IDs, something like this: [(1,), (2,), (3,), > (4,), (5,), (6,), (7,), (8,), (9,), (11,)] > > *than:* > > import random > randomID = random.choice(returnedIDs)[0] > randomSong = SongsTable.get(randomID) Although this might work in the particular application, but it is hardly ideal to make two queries, nor to load a long list of IDs. Is there a way to do it with a single query? -- :-- Hans Georg |
From: Petr J. <pet...@tp...> - 2012-04-18 19:43:34
|
> Although this might work in the particular application, but > it is hardly ideal to make two queries, nor to load a long list > of IDs. Is there a way to do it with a single query? > > 1) Is your task really "time critical"? How many records (IDs) do you expect to be returned using the first select? How many kb will it take? Is the list size a real problem for your Internet connection? How often do you expect to release the select querry? 2) what is your target database engine? some databases allow "rand" in the select query. see: http://stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql Petr |
From: Hans G. S. <geo...@sc...> - 2012-04-18 20:14:56
|
On Wed, Apr 18, 2012 at 09:43:23PM +0200, Petr Jakeš wrote: > > Although this might work in the particular application, but > > it is hardly ideal to make two queries, nor to load a long list > > of IDs. Is there a way to do it with a single query? > > > > 1) > Is your task really "time critical"? In my case, yes; it is number crunching. Question should be «is it /that/ "time critical"» :-) I have had some difficulties with slow operations due to multiple queries, and it feels more satisfactory get rid of any avoidable query, even if the saving is only a fraction of a percent of the total running time. > How many records (IDs) do you expect to be returned using the first select? thousands in current tests; hundreds of thousands is likely in future tests, and millions is plausible. > How many kb will it take? Is the list size a real problem for your Internet > connection? Might be; especially when it is over the Internet as opposed to over a LAN. Yet, I am as concerned with query processing time than transmission time. > How often do you expect to release the select querry? Could vary. About 50 clients querying once a minute each may be typical, but I am sure I will see faster jobs completing in seconds, in which case each client would query correspondingly often. > 2) > what is your target database engine? postgres at the moment. > some databases allow "rand" in the select query. > see: > http://stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql Ah. Thanks for the tip. I will look into it; obviously, I would prefer the solution to be engine independent, but this feature might be more valuable :-) :-- George > > Petr -- :-- Hans Georg |
From: Petr J. <pet...@tp...> - 2012-04-18 20:51:16
|
> > > Ah. Thanks for the tip. > I will look into it; obviously, I would prefer the solution to > be engine independent, but this feature might be more valuable :-) > > As usually, some good search engine (Google??? :D ) is your friend.... IMHO you have to always know which SQL engine is on the on the other side (in other words you have to know how to set up the connection string). If you know the engine, you can decide which kind of SQL "slang" to use: http://www.petefreitag.com/item/466.cfm Petr |