[Pattern-recognition] Starting up...
From: Fernando B. <fb...@de...> - 2002-11-26 13:23:01
This message is just to check whether everybody has already subscribed to the SourceForge Numerical Cruncher project list.

To get started, we should begin by developing the subsystems that will provide the infrastructure the rest of the project needs. I would like each of you to choose the subsystems you are most interested in. That way, we will be able to develop the following subsystems in parallel:

- Data modeling subsystem: We can assume we will work on numeric datasets that may come from different data sources, so we could use the composite design pattern and implement several wrappers (such as the JDBC, ASCII, and raw image wrappers already implemented, and other standards such as DSTP, XML, etc.). This subsystem should provide the basic sequential and random access methods to the patterns/records/tuples in each dataset. Continuous/numeric attributes should be separated from categorical/nominal ones.

- Process modeling subsystem: Base abstract classes for the kinds of techniques we will implement to solve classification, regression, and clustering problems. For instance, classification could be seen as a particular case of regression, while clustering is a special case of classification. All of them can be considered agents/processes that take some kind of input and generate the appropriate output. This subsystem should support introspection (i.e. the ability to discover the internal structure of the different agents, such as their parameters and the types of those parameters).

- Bridges: Once the process modeling subsystem interface is defined, we should also develop bridges to existing systems and collections of algorithms such as WEKA, MLC++, etc.

- User interface subsystem: Using reflection, we should be able to generate standard windows/web interfaces for the available components in the system (datasets & pattern recognition algorithms). The main design principle here should be "generate, don't code", so that the development of new algorithms requires NO interface work (unless needed for particular applications of our framework).

- Framework infrastructure subsystem: We should also develop some infrastructure to decouple the components in our system from the actual method call techniques (simple method invocation, RMI, CORBA...) and provide some transparency to the users of the algorithms we implement (e.g. location transparency). This could be useful if we want our system to control its own performance by assigning resources to competing processes, or even to work in distributed environments.

- Last, but not least, some of us will have to develop the documentation that will make our system usable. This is essential if we want our system to grow and more people to collaborate in the development of techniques and tools for pattern recognition / machine learning.

I hope we will do a great job, learn a lot, and have a good time while working on this project.

Best regards,

Fernando
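
As a rough illustration of the introspection and "generate, don't code" ideas in the message above, here is a minimal sketch in Java. It assumes that each algorithm exposes its parameters as ordinary JavaBeans properties (getter/setter pairs) and uses the standard `java.beans.Introspector` to discover them. The class names `KnnClassifier` and `ParameterInspector` are hypothetical and chosen only for this example; nothing here is taken from the project's actual code.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical algorithm: its tunable parameters are plain JavaBeans
// properties, so no hand-written interface code is needed to expose them.
class KnnClassifier {
    private int k = 3;
    private boolean weighted = false;

    public int getK() { return k; }
    public void setK(int k) { this.k = k; }
    public boolean isWeighted() { return weighted; }
    public void setWeighted(boolean weighted) { this.weighted = weighted; }
}

// Hypothetical helper: discovers the parameters of any algorithm class.
class ParameterInspector {
    // Returns parameter name -> type name for every readable and writable
    // property, sorted by name so a generated form has a stable layout.
    static Map<String, String> describe(Class<?> algorithm) {
        Map<String, String> params = new TreeMap<>();
        try {
            // Stop at Object so the inherited "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(algorithm, Object.class);
            for (PropertyDescriptor p : info.getPropertyDescriptors()) {
                if (p.getReadMethod() != null && p.getWriteMethod() != null) {
                    params.put(p.getName(), p.getPropertyType().getSimpleName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot inspect " + algorithm, e);
        }
        return params;
    }

    public static void main(String[] args) {
        // A user interface generator could turn each entry into a form field.
        for (Map.Entry<String, String> e : describe(KnnClassifier.class).entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

A user interface generator built along these lines would map each discovered parameter to a widget by type (for example, a numeric field for `int` and a checkbox for `boolean`), so adding a new algorithm would need no extra interface code.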