Thread: [Objectbridge-jdo-dev] Status?
From: Chris S. <zh...@ma...> - 2002-04-15 12:08:07
I am interested in joining a move to JDO compliance for objectbridge. At my previous employer I implemented a persistence library based on the JDO spec which has been in use for over a year now in a production environment. I have no restrictions on repeating that for this project.

I was wondering what the status is? I have noticed that the JDO interfaces are in CVS, but there is no other code in yet?

chris
From: Thomas M. <tho...@ho...> - 2002-04-16 18:25:04
Hi Chris,

Chris Stevenson wrote:
> I am interested in joining a move to JDO compliance for objectbridge.

Great, every helping hand is welcome!

> At my previous employer I implemented a persistence library based on the JDO spec which has been in use for over a year now in a production environment.

Cool! So you did a full implementation with instance lifecycle, code enhancement etc.?

> I have no restrictions on repeating that for this project.

This would be a great help!

> I was wondering what the status is? I have noticed that the JDO interfaces are in CVS, but there is no other code in yet?

That's right. Most OJB developers are currently focused on getting Version 1.0 finished by June! JDO is not in the 1.0 scope. Currently we are discussing design issues etc.

We have also been discussing a close collaboration with the Sparrow project. Sparrow wants to provide a "JDO front end" with pluggable Store implementations. Their main focus is integrating OODBMS backends. We will provide the O/R backend. (I assume you are familiar with Sun's JDORI. The RI has a similar design: it consists of a slim JDO implementation, incl. lifecycle, with pluggable StoreManagers.) I think this is a solid approach. What do you think? How did your implementation work?

You are most welcome to share your ideas, concepts and to contribute your code!

thanks,
Thomas
From: Chris S. <sk...@us...> - 2002-04-17 13:04:18
Hi guys,

I was pointed to this group by Thomas Mahler of the objectbridge group. I am interested in helping with the sparrow/objectbridge efforts to implement JDO in open source. What follows is a description of what we did, with some thoughts about the JDO spec as it currently stands, and what I think is important to target. I am moving between countries at the moment and don't have a permanent Internet connection, so this is off the top of my head.

At my former employer we implemented a persistence mechanism based on the JDO early drafts. It has been in use for over a year now on our website and coped well with a large amount of traffic (30,000 distinct page views in a day, with heavy database access, running at a peak of 25+ pages per second). The library interfaces with a MySQL server on the web site, and MS SQL Server on the internal side. Data is synchronized in real time through the persistence libraries. I can vouch for the suitability of the spec for real world use :-)

Code Modification

We did not do code modification, but I had a play with the BCEL library and I think that it would be a candidate for use. There is (was?) a better library from IBM alphaWorks that presents an API similar to reflection but allows modification of byte-code; I think that was not open source, though. The enhancement would need to be pluggable, to allow us to try out some different implementations.

We used source-code modification and reflection, i.e. the jdoXXX calls were in a PersistenceCapableImpl superclass from which all our db classes descended, and the ClassManagers used reflection to read the actual values. Classes encapsulated all fields with getXXX and setXXX methods and explicitly called jdoMarkDirty() methods on modification.
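The hand-coded superclass approach described above might have looked roughly like the following sketch; the class and method names here are illustrative, not taken from the actual codebase:

```java
// Illustrative sketch of a hand-written persistence superclass in the
// style Chris describes. All names are hypothetical.
abstract class PersistenceCapableImpl {
    private boolean jdoDirty = false;

    // Every setter must call this so the PersistenceManager's dirty
    // cache can pick the object up at commit time.
    protected void jdoMarkDirty() {
        jdoDirty = true;
    }

    public boolean jdoIsDirty() {
        return jdoDirty;
    }
}

class Customer extends PersistenceCapableImpl {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
        jdoMarkDirty(); // easy to forget -- the bug class mentioned below
    }
}
```

The weakness of this approach is visible here: nothing forces a newly added setter to call jdoMarkDirty(), which is exactly what bytecode enhancement automates.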
Having done this I can see why code modification is necessary - we had a few bugs when a method was added but the jdoMarkDirty() was not :-) I would suggest that an implementation should aim at hand-modified source and worry about code-modification in parallel.

HOLLOW state

We did not implement the HOLLOW state, which made the code for 1-1 relationships nasty. Every relationship getter required a jdoLoad("relationshipName") call to trigger loading of the appropriate object. I would say the HOLLOW state is a must.

Caching

Our PersistenceManagerImpl maintained two caches - a memory-sensitive cache like objectbridge's, and a dirtyCache which held only objects marked PERSISTENT_DIRTY, PERSISTENT_NEW or PERSISTENT_DELETED in a list. This meant that db checking touched only the small number of changed objects. This is a great boon for JDO.

savepoint()

As hinted above, we extended the PM with a checkpoint() method, which proved extremely useful. This is less so now, as the spec has been modified to allow the user to call setRetainValues(true) so that objects are not made HOLLOW on commit. However, there is still a potential problem with objects that outlive a PM. For example, if you create an object and then want to cache it in a singleton, changes to the object are not automatically persisted unless the singleton makes a commit(). It then has to re-associate all objects with a new PM in order to record new changes to the objects. It was much simpler to have a PM associated with the singleton and call checkpoint() periodically. Note that this is on the TO-DO list of the 1.0 spec as savepoint().

The new optional 'optimistic transactions' part of the spec is a good fit for this functionality too, and better in that it doesn't maintain a connection to the database for a long-lived PM. This allows usage patterns like creating a PM, storing it in the session of a web transaction, and managing all interactions through this PM. We had to abandon this idea since our implementation maintained an open connection for each open PM, and we rapidly ran out of connections.

Change logging

One of the interesting things about our implementation was the changelog. Any change to the database was recorded by a Change object that held the className, fieldNames and the old and new values of changed fields. The fieldName was used so that the Change object could be used to apply changes to a database with a different schema, or where the mapping had changed. This was used to synchronize our web database (MySQL) with our internal database (MS SQL Server). It worked extremely well, and I would heartily recommend this approach. It was implemented by having the PersistenceManagerFactory (NB, not the individual PMs) fire transactionCommitted events with a list of changes. This allows the maximum amount of flexibility in monitoring changes on the database. The changes were logged in an XML format for transfer.

Queries

The most we ever did with queries was to filter for objects. It was never necessary to use complex queries, since the relationships maintained by JDO allow you to navigate to the necessary object. I'm sure anyone who has had to embed queries in code and debug them would agree that it's easier to avoid them than to work with them. So I don't think complex query support is essential for early versions.

Having said that, over time it would be nice to support variants like OQL and EJB QL for the implementation, since this would allow an isql-like app that manipulates live object data. Now that would be a killer app.

IMHO the implementation of querying should be aimed at two uses:

1. I have a collection of instances and I want to use a query construct to sort/filter them. So the filter must be applicable to a set of objects without hitting the database.

2. I want to fetch data from the database according to a filter.
This is easily implemented by navigating an extent and using the same filter as in (1) above, but this is extremely inefficient to the point of being useless in large systems. Thus the system must be able to pass all or part of the query 'down the chain' to a lower-level component that can use all or part of it to generate an SQL query. The full filter can then be applied to the smaller set of objects returned. This will be crucial for real world use.
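The two uses above, plus the pushdown requirement, could be sketched as a filter abstraction that can be evaluated in memory and can also contribute a SQL fragment. All names here are hypothetical, not from any existing project:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical filter abstraction: the same predicate can be evaluated
// in memory (use 1) or partially translated to SQL (use 2).
interface ObjectFilter<T> {
    boolean matches(T candidate);  // in-memory evaluation
    String toSqlWhere();           // pushdown fragment, or null if untranslatable
}

class MinAgeFilter implements ObjectFilter<Integer> {
    private final int minAge;
    MinAgeFilter(int minAge) { this.minAge = minAge; }

    public boolean matches(Integer age) { return age >= minAge; }

    public String toSqlWhere() { return "age >= " + minAge; }
}

class QueryEngine {
    // Use 1: filter an existing collection without touching the database.
    static <T> List<T> filterInMemory(List<T> candidates, ObjectFilter<T> f) {
        List<T> out = new ArrayList<>();
        for (T c : candidates) {
            if (f.matches(c)) out.add(c);
        }
        return out;
    }

    // Use 2: push the translatable part of the filter down into SQL; the
    // full filter is then re-applied to the (smaller) result set.
    static String toSql(String table, ObjectFilter<?> f) {
        String where = f.toSqlWhere();
        return where == null
            ? "SELECT * FROM " + table
            : "SELECT * FROM " + table + " WHERE " + where;
    }
}
```

The key design point is that toSqlWhere() is allowed to return null: a filter that cannot be translated simply falls back to the full-extent scan, while translatable filters shrink the candidate set at the database.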
From: Joel S. <jo...@ik...> - 2002-04-18 17:34:33
> Code Modification
>
> We did not do code modification, but I had a play with the BCEL library and I think that it would be a candidate for use. There is (was?) a better library from IBM alphaWorks that presents an API similar to reflection but allows modification of byte-code; I think that was not open source, though. The enhancement would need to be pluggable, to allow us to try out some different implementations.

I've looked at various open source bytecode manipulation APIs and they're much the same. I think BCEL will probably have the largest community and best support (because it's Apache), and so I think that would be the best choice.

> We used source-code modification and reflection, i.e. the jdoXXX calls were in a PersistenceCapableImpl superclass from which all our db classes descended,

This isn't possible. That would require multiple inheritance when the PC already extends something else.

> and the ClassManagers used reflection to read the actual values. Classes encapsulated all fields with getXXX and setXXX methods and explicitly called jdoMarkDirty() methods on modification.

With the way the spec is set up, the state manager will know everything (it knows more than the PC does itself!), so we won't need reflection anywhere (that I know of).

> Having done this I can see why code modification is necessary - we had a few bugs when a method was added but the jdoMarkDirty() was not :-) I would suggest that an implementation should aim at hand-modified source and worry about code-modification in parallel.

Perhaps, though we have some work already done towards it -- and it appears there is a separate body that Andy knows of that we may be able to use some code (including an enhancer) from. Either way, code enhancement is not a big project because it's pretty much spelled out in the spec, and I'm not sure we want to deviate from that because of compatibility requirements.
> HOLLOW state
>
> We did not implement the HOLLOW state, which made the code for 1-1 relationships nasty. Every relationship getter required a jdoLoad("relationshipName") call to trigger loading of the appropriate object. I would say the HOLLOW state is a must.
>
> Caching
>
> Our PersistenceManagerImpl maintained two caches - a memory-sensitive cache like objectbridge's, and a dirtyCache which held only objects marked PERSISTENT_DIRTY, PERSISTENT_NEW or PERSISTENT_DELETED in a list. This meant that db checking touched only the small number of changed objects. This is a great boon for JDO.
>
> savepoint()
>
> As hinted above, we extended the PM with a checkpoint() method, which proved extremely useful. This is less so now, as the spec has been modified to allow the user to call setRetainValues(true) so that objects are not made HOLLOW on commit. However, there is still a potential problem with objects that outlive a PM. For example, if you create an object and then want to cache it in a singleton, changes to the object are not automatically persisted unless the singleton makes a commit(). It then has to re-associate all objects with a new PM in order to record new changes to the objects. It was much simpler to have a PM associated with the singleton and call checkpoint() periodically. Note that this is on the TO-DO list of the 1.0 spec as savepoint().

I think I understand you. So you're talking about usage outside of transactions?

> The new optional 'optimistic transactions' part of the spec is a good fit for this functionality too, and better in that it doesn't maintain a connection to the database for a long-lived PM. This allows usage patterns like creating a PM, storing it in the session of a web transaction, and managing all interactions through this PM. We had to abandon this idea since our implementation maintained an open connection for each open PM, and we rapidly ran out of connections.
Does the spec require one connection per PM? I would think that would be a bad thing. I would think the db connection pool should be separated from the PM, so the PM could just grab a connection when it needed one.

> Change logging
>
> One of the interesting things about our implementation was the changelog. Any change to the database was recorded by a Change object that held the className, fieldNames and the old and new values of changed fields. The fieldName was used so that the Change object could be used to apply changes to a database with a different schema, or where the mapping had changed. This was used to synchronize our web database (MySQL) with our internal database (MS SQL Server). It worked extremely well, and I would heartily recommend this approach. It was implemented by having the PersistenceManagerFactory (NB, not the individual PMs) fire transactionCommitted events with a list of changes. This allows the maximum amount of flexibility in monitoring changes on the database. The changes were logged in an XML format for transfer.

Interesting.

> Queries
>
> The most we ever did with queries was to filter for objects. It was never necessary to use complex queries, since the relationships maintained by JDO allow you to navigate to the necessary object. I'm sure anyone who has had to embed queries in code and debug them would agree that it's easier to avoid them than to work with them. So I don't think complex query support is essential for early versions.
>
> Having said that, over time it would be nice to support variants like OQL and EJB QL for the implementation, since this would allow an isql-like app that manipulates live object data. Now that would be a killer app.
>
> IMHO the implementation of querying should be aimed at two uses:
>
> 1. I have a collection of instances and I want to use a query construct to sort/filter them. So the filter must be applicable to a set of objects without hitting the database.
>
> 2. I want to fetch data from the database according to a filter. This is easily implemented by navigating an extent and using the same filter as in (1) above, but this is extremely inefficient to the point of being useless in large systems. Thus the system must be able to pass all or part of the query 'down the chain' to a lower-level component that can use all or part of it to generate an SQL query. The full filter can then be applied to the smaller set of objects returned. This will be crucial for real world use.

Okay. As you suggested, I envision the querying engine being one of the later modules developed, as it is rather complicated and because you can have a very useful piece of software without it (see Ozone, for example).

Thanks for your comments!

--
Joel Shellman
Comprehensive Internet Solutions -- Building business dreams.
[ web design | database | e-commerce | hosting | marketing ]
iKestrel, Inc.
http://www.ikestrel.com/
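The pooling arrangement Joel suggests above (the PM borrowing a connection per commit rather than pinning one open for its whole lifetime, which is what exhausted the pool in Chris's implementation) could be sketched like this; every class name here is hypothetical, not from OJB or Sparrow:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a JDBC connection.
class PooledConnection {
    boolean inUse = false;
}

// A minimal pool decoupled from the PersistenceManager: many PMs can
// share a small, fixed set of connections.
class ConnectionPool {
    private final Deque<PooledConnection> idle = new ArrayDeque<>();

    ConnectionPool(int size) {
        for (int i = 0; i < size; i++) idle.push(new PooledConnection());
    }

    synchronized PooledConnection acquire() {
        PooledConnection c = idle.poll();
        if (c == null) throw new IllegalStateException("pool exhausted");
        c.inUse = true;
        return c;
    }

    synchronized void release(PooledConnection c) {
        c.inUse = false;
        idle.push(c);
    }

    synchronized int idleCount() { return idle.size(); }
}

class PersistenceManagerSketch {
    private final ConnectionPool pool;

    PersistenceManagerSketch(ConnectionPool pool) { this.pool = pool; }

    // Borrow a connection only for the duration of the commit, then
    // return it, instead of holding one for the PM's whole lifetime.
    void commit() {
        PooledConnection c = pool.acquire();
        try {
            // ... write dirty objects via c ...
        } finally {
            pool.release(c);
        }
    }
}
```

With this shape, a long-lived PM stored in a web session (the usage pattern Chris had to abandon) consumes no connection between commits.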