
Using in Anger

2002-03-13
2002-03-20
  • Neil Fitzsimmons

    Is there anybody out there thinking of using DiGIR schema for their database federation project?

    My initial impression is that it is extensive enough without breaking the 'being a pain to implement' rule.

    Neil Fitzsimmons

     
    • Ricardo Scachetti Pereira

      I'm not sure what the intent of your message was. I am probably missing some context.
      Anyway, I believe that the answer to your question is yes, there are other groups interested in using DiGIR and also interested in participating in its development.
      I'm part of a group in Brazil trying to establish a network of biological collection databases, and
      DiGIR would fit part of our needs very well.

       
    • Neil Fitzsimmons

      Thanks Ricardo. Sorry for the confusion in the message. I would be interested in discussing how you plan to use DiGIR. You can reply to the list or to me personally at neil.fitzsimmons@csiro.au.

       
      • Ricardo Scachetti Pereira

        Neil,
        First of all, thanks for your reply.
        I believe that this discussion can be useful for the DiGIR development and user community, so I'll post my answer here.

        I work for an NGO in Brazil, called CRIA (Reference Center for Environmental Information), that is establishing a biological collections network in Sao Paulo state.
        The network will basically follow the same model as the Species Analyst network, with some differences.

        We plan to use DiGIR as the main network protocol and software for distributing queries to a number of providers (12 in the pilot phase, to be expanded to many more in the future). So all features of DiGIR will be used in our architecture (portal, providers, registry, federated schema, etc).
        We plan to comply with the Darwin Core v2 federation schema, at least in a first phase.
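
A minimal sketch of the portal role described above, assuming in-process stand-ins for the providers. The provider names, records and field names are all made up for illustration and are not taken from DiGIR or Darwin Core v2; a real portal would talk to providers over the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical providers: each name maps to the records that provider holds.
# Field names and values are illustrative only.
PROVIDERS = {
    "herbarium_a": [{"ScientificName": "Tabebuia alba", "Locality": "Campinas"}],
    "museum_b": [
        {"ScientificName": "Tabebuia alba", "Locality": "Santos"},
        {"ScientificName": "Puma concolor", "Locality": "Iguape"},
    ],
}


def query_provider(name, field, value):
    """Return one provider's matching records, each tagged with its source."""
    return [
        dict(record, provider=name)
        for record in PROVIDERS[name]
        if record.get(field) == value
    ]


def portal_query(field, value):
    """Send the same query to every provider and merge the answers."""
    names = sorted(PROVIDERS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        per_provider = list(
            pool.map(lambda name: query_provider(name, field, value), names)
        )
    # Flatten the per-provider lists into one result set.
    return [record for records in per_provider for record in records]


results = portal_query("ScientificName", "Tabebuia alba")
```

The point of the sketch is only the fan-out and merge step; registry lookup, timeouts and partial failures are left out.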

        However, DiGIR won't solve the entire problem for us. We are planning to implement the idea of having the collection data mirrored on dedicated servers (this idea was formally stated in this context by MaNIS - Mammal Networked Information System, I believe), for many reasons: performance of the network, security of collection server, autonomy of data custodians, lack of computational power and Internet connectivity (on the collections), among others.

        To accomplish this plan, we are planning to implement an additional level on the network, just below the providers, that will handle the transfer of data from the collection to the dedicated server. Once on this dedicated server, the data will be served using DiGIR provider software.

        Our implementation will probably be different from MaNIS in the sense that our collection community in Brazil does not have IT departments within the Museums and Herbaria, so we believe that the development of custom scripts is not feasible in our case. So we plan to develop a
        Java application that controls data migration from collections to the dedicated server, and also provides other features that let the collection manager control the way his data is served.
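
A rough sketch of the migration step described above, assuming both the collection database and the dedicated server can be reached as SQL databases. It is written in Python with in-memory SQLite only to keep it short; the table layout, column names and sample rows are hypothetical and not part of the actual plan.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical specimens table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE specimens ("
        "catalog_number TEXT PRIMARY KEY, scientific_name TEXT, modified TEXT)"
    )
    return conn


def sync(source, mirror, since):
    """Copy records modified after `since` to the mirror; return how many."""
    rows = source.execute(
        "SELECT catalog_number, scientific_name, modified "
        "FROM specimens WHERE modified > ?",
        (since,),
    ).fetchall()
    # INSERT OR REPLACE overwrites a mirrored record that has changed.
    mirror.executemany("INSERT OR REPLACE INTO specimens VALUES (?, ?, ?)", rows)
    mirror.commit()
    return len(rows)


source = open_db()  # stands in for the collection's own database
mirror = open_db()  # stands in for the dedicated server
source.executemany(
    "INSERT INTO specimens VALUES (?, ?, ?)",
    [("A-1", "Puma concolor", "2002-03-01"), ("A-2", "Tabebuia alba", "2002-03-15")],
)
copied = sync(source, mirror, "2002-03-10")
```

Dates are stored as ISO strings so that a plain string comparison orders them correctly. Deletions and the controls offered to the collection manager are not covered here.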

        Although we will have a pretty heterogeneous set of DiGIR providers, DiGIR software will also be useful for us to connect well-established nodes (institutions with fast Internet connections, good servers and a decent DBMS). We plan to have some nodes like that, and in this case we will connect them using the standard DiGIR approach (not using our additional network level).

        If you have further questions about our approach, please, let me know.

        Best regards to all.

        Ricardo

         
        • Ricardo Scachetti Pereira

          Just to add to my last message, the focus of our project is not to develop network software but to develop applications that will use the integrated data served by the network.
          Among the applications we are planning, we can list:
          1) Simple text based queries returning XML as result (processed in several formats using XSLT);
          2) Simple mapping tool using University of Minnesota's MapServer, to plot species occurrence data points on a base map;
          3) Georeferencing tools, using automated gazetteer lookups and validation using GARP and BioClim models (to look for outliers). We also plan to analyze MaNIS's excellent work on collaborative georeferencing and error estimation.
          4) Species' distribution modeling tools such as GARP and BioClim, generating models and maps on-the-fly or in batch for other analysis (following Lifemapper approach - www.lifemapper.org);
          5) Decision support tools, using occurrence data points and species distribution models (GARP and BioClim) calculated in batch, to evaluate impacts of human intervention in biodiversity.
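
To illustrate the outlier check mentioned in item 3, here is a much-simplified envelope test, loosely in the spirit of envelope-style models. It is not the BioClim or GARP algorithm, and the variable names and values are invented for the example.

```python
def build_envelope(trusted):
    """Return {variable: (min, max)} computed over the trusted records."""
    variables = trusted[0].keys()
    return {
        v: (min(r[v] for r in trusted), max(r[v] for r in trusted))
        for v in variables
    }


def outside_envelope(record, envelope):
    """Return the sorted names of variables where the record leaves the envelope."""
    return sorted(
        v for v, (low, high) in envelope.items() if not low <= record[v] <= high
    )


# Hypothetical climate values at already validated occurrence points.
trusted = [
    {"mean_temp_c": 21.0, "annual_rain_mm": 1300},
    {"mean_temp_c": 23.5, "annual_rain_mm": 1500},
    {"mean_temp_c": 22.0, "annual_rain_mm": 1400},
]
envelope = build_envelope(trusted)

# A newly georeferenced point whose temperature looks suspicious.
suspect = {"mean_temp_c": 9.0, "annual_rain_mm": 1450}
flags = outside_envelope(suspect, envelope)
```

A point flagged this way is only a candidate for review, since it may be a georeferencing error or a genuine record at the edge of the species' range.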

          All those tools will be communicating to the network using DiGIR query schema.

          So, we would greatly appreciate it if we could use DiGIR for our upper network architecture. However, depending on its availability and the time frame for delivery of a stable release, we might need to implement a simple substitute for DiGIR for the time being. If needed, we are thinking about implementing a simple Portal-Provider software, not very fancy, that implements the DiGIR schema and hardcodes Darwin Core v2. That way we can focus on the application development.

          So that is basically our current plan.

          Ricardo

           
          • Neil Fitzsimmons

            This sounds very impressive and the focus of the project seems to be on the mark as well. The first goal should be to build the tools that process the data to provide something useful to the communities that are providing the data. When these provider communities are happy and actively participating, other uses and users of the data will emerge.

            The issue I have with following the full DiGIR protocol is the network communication layer. It seems to me that using SOAP would be more efficient from a programming effort point of view.

            I would be interested in your take on this.

             
            • Ricardo Scachetti Pereira

              Well, I've been following the DiGIR protocol from a certain distance. I checked out the portal source code that is in CVS yesterday, so I'm still getting to know the implementation of the portal. It seems to me that this implementation uses HTTP POST for communication, instead of SOAP. Is that right?
              If that is the case, I agree with you that using SOAP would make much more sense in terms of systems interoperability. SOAP provides an interface that is much more transparent than HTTP POST.
              There are probably other issues, not clear to me, that made the DiGIR group decide against using SOAP.
              Would someone be interested in clarifying that?
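
To make the comparison concrete, the sketch below sends the same XML query two ways: as the raw body of an HTTP POST, and wrapped in a SOAP 1.1 envelope. The query element names are invented for the example and are not the DiGIR schema; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
ET.register_namespace("soap", SOAP_ENV_NS)


def build_query(field, value):
    """Build a made-up XML query; these element names are NOT the DiGIR schema."""
    request = ET.Element("request")
    search = ET.SubElement(request, "search")
    equals = ET.SubElement(search, "equals", {"field": field})
    equals.text = value
    return request


def as_plain_post_body(query):
    """Serialize the query as-is, to be sent as the body of an HTTP POST."""
    return ET.tostring(query, encoding="unicode")


def as_soap_body(query):
    """Wrap the same query in a SOAP Envelope/Body pair before serializing."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    body.append(query)
    return ET.tostring(envelope, encoding="unicode")


query = build_query("ScientificName", "Puma concolor")
plain = as_plain_post_body(query)
soap = as_soap_body(query)
```

SOAP is itself normally carried over HTTP POST, so the difference on the wire is the envelope; the interoperability gain comes from toolkits on both sides knowing how to produce and unpack it.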

              Ricardo

               
              • Neil Fitzsimmons

                I'm fairly certain that an HTTP POST is the mechanism currently used.

                From my vague memory, PJ told me at TDWG last year that SOAP was too immature to use when the DiGIR group first started. From reading the other thread, it seems there may also have been some technical reason against it. If there is, any comments from out there would be appreciated.

                I'm pleased that you think SOAP is a good network protocol. If implementations such as ours and yours can agree on something like SOAP, then we have the beginnings (politics and IP issues aside) of a true global network.

                Cheers
                Neil

                 
