Thread: [Cronometer-development] Idea for database schema and model for eater &c
|
From: Chris R. <of...@gm...> - 2005-04-24 15:20:57
|
So I had a stroke of insight this morning on the topic of our DB schema and model, and I think we're both looking at this the wrong way -- more specifically, my idea of separate databases is... flawed.

Here's what I see us actually needing.

First, we need some non-SQL concept of a FoodDataSource. This has to have all of the functionality we are currently providing with SQL calls (which I've enumerated, and have replacement calls for below). This would be an abstract class/interface that will provide relevant meta-info (source, name, version, etc...) as well as the methods I'll outline below, and possibly more as we come up with them.

Following from that, we need a few concrete implementations, such as SQLDataSource and XMLDataSource and maybe WWWDataSource. Those can all be queried the same way, which will make adding websites as sources much easier.

We'll need a subclass of FoodDataSource called LocalFoodDataSource that will have a few more user-specific methods such as consumeFood() and changeFood() that will not b0rk the main DB.

The best part is, though, while our SQL schema might look like ass underneath all this, it never has to leak over into the program. We can use this interface to provide a unified view of the user's data.

Here's the methods I see being needed (based on a search for SQLStatement.execute and SQLStatement.executeQuery in the code):

FoodDataSource
============
addFood(Food)
removeFood(Food)
changeFood(Food)
findFood(... some parameters ...)
findAllFoods()
getServingsFor(Food)
getNutrients(Food)
getSources() --> All "sources" from the source, ie: SELECT DISTINCT source FROM Foods;
getFoodGroups()
getFood() *

LocalFoodDataSource (also maybe MutableFoodDataSource?)
================
consumeFood(Food, Serving)
changeConsumedAmount(Food, newServing)
unConsumeFood(Food)
changeServing(Food, newServing)
addServingSize(Food, newServing)
removeServingSize(Food, oldServing)
getTimesConsumed(Food)
getTimesConsumed(Food, Date start, Date end)
getConsumedOnDate(Date)

* Not sure if this is necessary, and even if it is -- there's not really guaranteed to be a unified way to get at this info, since not all sources will have the concept of a unique ID -- we might have to go with lists of Foods and pull from that.

=======

Between these two ops, that seems to be the bulk of what we ask the DB. I think that our BioMarkers and such should be reached through a different interface, although it might actually hit the same backing store. Probably will, in fact.

I can probably code up skeletal implementations of this this morning, if you think it's a decent idea. I don't want to start on it unless we're more or less in agreement, though.

What think you?

--
Chris R.
======
Not to be taken literally, internally, or seriously.
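[Editor's sketch] The proposal above could look roughly like the following in Java. This is a minimal sketch under stated assumptions, not the project's code: the Food fields, the in-memory implementation, and the use of a plain gram amount in place of a Serving object are all hypothetical, and only a handful of the listed methods are shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value type; a real Food would carry nutrients, groups, etc. */
final class Food {
    final String name;
    final String source;
    Food(String name, String source) { this.name = name; this.source = source; }
}

/** Query-side view shared by SQL, XML and web-backed sources. */
interface FoodDataSource {
    String getName();                          // meta-info about the source itself
    List<Food> findFood(String nameFragment);  // assumed search parameter: a name fragment
    List<Food> findAllFoods();
    List<String> getSources();                 // cf. SELECT DISTINCT source FROM Foods
}

/** User-specific operations that only a local store supports. */
interface LocalFoodDataSource extends FoodDataSource {
    void consumeFood(Food food, double grams); // assumption: grams instead of a Serving
    int getTimesConsumed(Food food);
}

/** Toy in-memory implementation standing in for something like SQLDataSource. */
final class InMemoryFoodDataSource implements LocalFoodDataSource {
    private final List<Food> foods = new ArrayList<>();
    private final List<Food> consumed = new ArrayList<>();

    void addFood(Food food) { foods.add(food); }

    public String getName() { return "in-memory"; }

    public List<Food> findFood(String nameFragment) {
        String needle = nameFragment.toLowerCase();
        List<Food> hits = new ArrayList<>();
        for (Food f : foods) {
            if (f.name.toLowerCase().contains(needle)) hits.add(f);
        }
        return hits;
    }

    public List<Food> findAllFoods() { return new ArrayList<>(foods); }

    public List<String> getSources() {
        List<String> sources = new ArrayList<>();
        for (Food f : foods) {
            if (!sources.contains(f.source)) sources.add(f.source);
        }
        return sources;
    }

    // The sketch only records that the food was eaten; the amount is ignored.
    public void consumeFood(Food food, double grams) { consumed.add(food); }

    public int getTimesConsumed(Food food) {
        int count = 0;
        for (Food f : consumed) {
            if (f == food) count++;
        }
        return count;
    }
}
```

Callers that only need lookups can hold a FoodDataSource reference, so nothing SQL-specific has to appear outside the implementing class.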
|
From: Aaron D. <dav...@cs...> - 2005-04-24 17:35:08
|
On Apr 24, 2005, at 9:20 AM, Chris Rose wrote:

> So I had a stroke of insight this morning on the topic of our DB
> schema and model, and I think we're both looking at this the wrong way
> -- more specifically, my idea of separate databases is... flawed.

I partially agree. I wouldn't go so far as to say flawed. I do agree that it is more proper to call these searchable collections FoodDataSources rather than just a database. I also like hiding SQL and database-specific stuff as far away from the rest of the program as we can get, without sacrificing too much speed or creating too huge and unwieldy an object hierarchy. It is possible to over-engineer a program ;-)

> Here's the methods I see being needed (based on a search for
> SQLStatement.execute and SQLStatement.executeQuery in the code):
>
> FoodDataSource
> ============
> addFood(Food)
> removeFood(Food)
> changeFood(Food)

Why are these part of the ImmutableFoodDataSource? You can't add, remove, or change a website, and [the user] can't modify the default food sources either.

> LocalFoodDataSource (also maybe MutableFoodDataSource?)
> ================
> consumeFood(Food, Serving)
> changeConsumedAmount(Food, newServing)
> unConsumeFood(Food)
> changeServing(Food, newServing)
> addServingSize(Food, newServing)
> removeServingSize(Food, oldServing)
> getTimesConsumed(Food)
> getTimesConsumed(Food, Date start, Date end)
> getConsumedOnDate(Date)

Hmmm. Aside. I have never liked my choice of 'Consumed' for an eaten food. I like how you used Serving for weights, but I would like it even more as a replacement for consumedFoods, because that's what they are -- Servings of foods.

The weights are just gram to label conversions. The USDA database calls their table 'Weights' which I just mirrored. I noticed the Canadian food database calls theirs 'Measures'. I like 'Measurement' for this. What do you think?

> =======
> Between these two ops, that seems to be the bulk of what we ask the
> DB. I think that our BioMarkers and such should be reached through a
> different interface, although it might actually hit the same backing
> store. Probably will, in fact.

Yah, I agree.

> I can probably code up skeletal implementations of this this morning,
> if you think it's a decent idea. I don't want to start on it unless
> we're more or less in agreement, though.
>
> What think you?

With my comments above in mind, yes, I think it's a good idea. I should be free to work on this all day.
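[Editor's sketch] The naming suggested above, with Measurement as the gram-to-label conversion and Serving as an eaten amount of a food, might be modelled as below. This is only an illustration: the field names, the kcalPer100g attribute, and the numbers used in the usage note are assumptions, not values from the USDA or Canadian databases.

```java
/** Hypothetical food type, reduced to a name and energy per 100 g. */
final class Food {
    final String name;
    final double kcalPer100g;
    Food(String name, double kcalPer100g) {
        this.name = name;
        this.kcalPer100g = kcalPer100g;
    }
}

/** A gram-to-label conversion, i.e. one row of a 'Weights' / 'Measures' table. */
final class Measurement {
    final String label;   // e.g. "1 cup"
    final double grams;   // weight in grams of one unit of that label
    Measurement(String label, double grams) {
        this.label = label;
        this.grams = grams;
    }
}

/** An eaten amount of a food, replacing the earlier 'consumed food' notion. */
final class Serving {
    private final Food food;
    private final Measurement measurement;
    private final double quantity;   // how many units of the measurement were eaten

    Serving(Food food, Measurement measurement, double quantity) {
        this.food = food;
        this.measurement = measurement;
        this.quantity = quantity;
    }

    Food getFood() { return food; }
    Measurement getMeasurement() { return measurement; }

    /** Total weight of the serving in grams. */
    double getGrams() { return quantity * measurement.grams; }

    /** Energy of the serving, scaled from the per-100 g figure. */
    double getKcal() { return food.kcalPer100g * getGrams() / 100.0; }
}
```

For example, `new Serving(new Food("Milk", 60.0), new Measurement("1 cup", 250.0), 2.0)` describes two cups; with those made-up figures, getGrams() gives 500 g and getKcal() gives 300 kcal.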
|
From: Chris R. <of...@gm...> - 2005-04-24 17:42:16
|
On 4/24/05, Aaron Davidson <ada...@po...> wrote:
>
> On Apr 24, 2005, at 9:20 AM, Chris Rose wrote:
>
> > So I had a stroke of insight this morning on the topic of our DB
> > schema and model, and I think we're both looking at this the wrong way
> > -- more specifically, my idea of separate databases is... flawed.
>
> I partially agree. I wouldn't go so far as to say flawed. I do agree
> that it is more proper to call these searchable collections
> FoodDataSources rather than just a database. I also like hiding SQL and
> database specific stuff as much away from the rest of the program as we
> can get, without sacrificing too much speed, or creating too huge of an
> unwieldy object hierarchy. It is possible to over-engineer a program ;-)

I concur, but in this case I think that abstracting away the nature of the data is critical to being able to handle things like the websites you were talking about. It can even hide hideous things like screen scraping. One thing that will have to be added, though, is an isSearchable() and an isListable() method on each source, so that we can determine whether searching or listing is possible at all. Some backing stores will be more amenable than others.

> > Here's the methods I see being needed (based on a search for
> > SQLStatement.execute and SQLStatement.executeQuery in the code):
> >
> > FoodDataSource
> > ============
> > addFood(Food)
> > removeFood(Food)
> > changeFood(Food)
>
> Why are these part of the ImmutableFoodDataSource? You can't add, remove,
> or change a website, and [the user] can't modify the default food
> sources either.

You make a good point. I'm working those interfaces up now, and I'll replace that.

> > LocalFoodDataSource (also maybe MutableFoodDataSource?)
> > ================
> > consumeFood(Food, Serving)
> > changeConsumedAmount(Food, newServing)
> > unConsumeFood(Food)
> > changeServing(Food, newServing)
> > addServingSize(Food, newServing)
> > removeServingSize(Food, oldServing)
> > getTimesConsumed(Food)
> > getTimesConsumed(Food, Date start, Date end)
> > getConsumedOnDate(Date)
>
> Hmmm. Aside. I have never liked my choice of 'Consumed' for an eaten
> food. I like how you used Serving for weights, but I like it even more
> to replace consumedFoods, because that's what they are -- Servings of
> foods.
>
> The weights are just gram to label conversions. The USDA database calls
> their table 'Weights' which I just mirrored. I noticed the Canadian
> food database calls theirs 'Measures'. I like 'Measurement' for this.
> What do you think?

I think that's not a bad idea -- a Serving can contain a Food and a Measure object, and have accessor methods as appropriate. I was thinking we should probably have our own list/table/combo models too, so that for all of those we can just throw in lists of the appropriate object to populate them.

Can you get all the Food/Serving things in order, and I'll move all the Data-related ops into a SQLDatasource class (see the latest CVS) and start implementing all of it -- I'll then systematically replace all of the current accesses. Maybe hop on MSN and fire off any ideas you have while I'm working on it -- I'll try to check in code relatively often (until I integrate it, which will be one big schwack) so you can keep track.

> > =======
> > Between these two ops, that seems to be the bulk of what we ask the
> > DB. I think that our BioMarkers and such should be reached through a
> > different interface, although it might actually hit the same backing
> > store. Probably will, in fact.
>
> Yah, I agree.
>
> > I can probably code up skeletal implementations of this this morning,
> > if you think it's a decent idea. I don't want to start on it unless
> > we're more or less in agreement, though.
> >
> > What think you?
>
> With my comments above in mind, yes, I think it's a good idea. I should
> be free to work on this all day.

I'm on it.

--
Chris R.
======
Not to be taken literally, internally, or seriously.
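[Editor's sketch] One way the isSearchable() / isListable() checks mentioned above could be used by calling code is sketched here. FakeWebDataSource and FoodBrowser are hypothetical names invented for the illustration, foods are reduced to plain name strings, and the "website" is a hard-coded list rather than real screen scraping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Read-only source that advertises what it is able to do. */
interface FoodDataSource {
    boolean isSearchable();
    boolean isListable();
    List<String> findFood(String query);   // food names only, to keep the sketch short
    List<String> findAllFoods();
}

/** Stand-in for a website: it can answer queries but cannot enumerate everything. */
final class FakeWebDataSource implements FoodDataSource {
    private final List<String> remote =
            Arrays.asList("Oat bran", "Oatmeal cookie", "Olive oil");

    public boolean isSearchable() { return true; }
    public boolean isListable() { return false; }

    public List<String> findFood(String query) {
        String needle = query.toLowerCase();
        List<String> hits = new ArrayList<>();
        for (String name : remote) {
            if (name.toLowerCase().contains(needle)) hits.add(name);
        }
        return hits;
    }

    public List<String> findAllFoods() {
        throw new UnsupportedOperationException("source is not listable");
    }
}

final class FoodBrowser {
    /** Lists everything when the source allows it, otherwise falls back to a search. */
    static List<String> browse(FoodDataSource source, String fallbackQuery) {
        if (source.isListable()) return source.findAllFoods();
        if (source.isSearchable()) return source.findFood(fallbackQuery);
        return new ArrayList<>();
    }
}
```

The caller never needs to know whether it is talking to a database or a scraped page; it only asks the source what it supports before choosing an operation.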
|
From: Chris R. <of...@gm...> - 2005-04-24 17:44:52
|
One more thing...

What should we allow searching by? I need methods to do all of that relatively simply, without exposing underlying data formats.

--
Chris R.
======
Not to be taken literally, internally, or seriously.