From: Lionel B. <lio...@bo...> - 2016-02-09 17:28:43
|
Hi, Le 09/02/2016 14:38, incredibleh0lg a écrit : > [...] > Another point made in this discussion was to give a use case: Third party applications using the data in the database. In my case, I’d like to provide a similar functionality as SQLgrey Webinterface provides. So present the user with relevant entries - e.g. in the connect table - and let him decide what to do with it. I don't think it's a good example because you definitely don't want to modify the connect table content in the back of SQLgrey : this is the table it uses to implement the basic greylisting algorithm. So if you want to modify the connect table you actually want to modify the greylisting algorithm. It seems to me that the two sane approaches if that's what you really want would be to : - implement your changes in SQLgrey itself, not in a separate application, - replace SQLgrey completely or fork it to implement your own algorithm if what you want to do is very different or solves a specific problem which isn't commonly encountered by SQLgrey uers. The same limitations apply to the config/from_awl/domain_awl tables (awl means *automatic* white list), if you try to modify them behind SQLgrey's back you are actually designing a new greylisting program. The only tables I created for external configuration are the opt[in|out]_[email|domain] tables. > In the current format this means, that the request has to carry the WHOLE entry as parameter to get it moved or deleted. In fact the connect table is a special case and doesn't even guarantee that each entry is unique so you can't event select a single entry deterministically. The tables that are supposed to be managed externally only have a single column : so the "WHOLE" entry is actually a single value which is obviously known by any application which would be designed to insert or delete entries (update doesn't make sense for these tables)... See below if "obviously" is not a given for you. > This is pretty wasteful and it is the reason why I asked if somebody tested a slightly adjusted schema where generated primary keys are added by the database.In theory, this shouldn’t break the application. It wouldn't break SQLgrey but it would make developing the application managing SQLgrey's configuration tables more complex and bug-prone (see a detailed explanation below). I think you are probably approaching the SQLgrey database and an external application's database as a single entity (see why I think so below too), which I don't think is a good way to approach the external configuration of SQLgrey. In a single database for any moderately complex application surrogate keys are nearly always the best choice for various reasons. For a database dedicated to a trivial service like SQLgrey this is clearly not the case. Let's use a practical example to demonstrate. Take the optout_email table (the other configuration tables are similar: all have a single column) and create an application managing its content (creating entries when users want to optout and deleting them when they want to activate greylisting again). If you use surrogate keys in the optout_email table the first question you have to ask yourself is : does my application separate database's user objects store : - user emails, - references to the surrogate keys in optout_email, - both and what is used among them to reference the optout_email entries. You made clear that don't want to use the user emails to reference optout_email entries so the only 2 choices for you remaining are the last ones. But actually if you store the references to surrogate keys you will most probably want to keep a copy of the email address too. Not only because it would be convenient in your application which would probably access the Postfix user database (or a database used to create the Postfix one which obviously must have the email addresses itself) but also because optout_email won't have the email if the user didn't choose to opt out. If you want to avoid storing the address at all costs you would make the application and Postfix depend on a SQLgrey change to store all your users' email addresses to solve this. This is not SQLgrey's role: it's a greylisting service, not an email user database. If you want to create your user database you obviously won't try to do so by installing a greylisting service. In conclusion there's no real choice once you have chosen to use surrogate keys : you will have to store both the email and the reference to the surrogate key in the external database in one way or another. So now if you really want to use the surrogate keys instead of the email itself you have to maintain consistency between your reference and the email addresses stored in two separate databases with 2 different codebases managed by 2 different sets of people. This is clearly (I hope so anyway) not something you want to do *at all*. By avoiding surrogate keys SQLgrey actually forces a potential separate database to be inherently consistent with its own database. In case this isn't crystal clear, as an example of the consistency problems consider these common scenarios : - admins create entries manually until enough users ask for direct access and install the configuration application, - admins use batches to create initial entries or maintain a subset of the optout_email table in parallel to the configuration application, these batches probably aren't aware of the existence of the configuration application and don't know how to update its database, - after a crash, on restoring the 2 separate databases (SQLgrey and configuration application) you can get a more recent SQLgrey database with entries not (yet) referenced by the other restored database version, - ... All these scenario imply that there will be entries in optout_email not yet referenced by the configuration application. When creating new users with addresses already in optout_email, how will you create the references to these entries ? The only way is to check that for each user you create in your application there isn't already an entry in the optout_email table with the same email address. So for new users your reference to the optout_email table is the email address and for existing users the reference is the surrogate key : it's bad design and a loud and clear invitation to bugs. Now let's see how you would design the application without surrogate keys. The application has its user objects with their own email attribute. - when you want to activate optout, you use your application's email attribute to find or create an entry in the optout_email table if needed, - when you want to deactivate it, you find and delete the entry with the same email, - when you change the email address of a user, you remove the old entry (if it exists) and create a new one if applicable. This is *trivial* logic. An environment where this is difficult to do would have to prevent you to make the most basic SQL queries imaginable. Why should we even consider a change only useful to people starting to develop external tools by shooting themselves in the foot when choosing their development environment ? Best regards, Lionel |