From: dmg <dm...@uv...> - 2007-03-23 00:47:47
|
With respect to the re-architecting of panotools (GSoC), I have been thinking about the "computational" stack of panotools.

As I have stated in the past, I strongly believe that the optimization and the projection should be divided into two independent modules.

This is probably best exemplified by Flexify, which takes an equirectangular image (I haven't used it yet, so this is my understanding) and produces an output image after applying a transformation. Flexify does not need to know anything about registration and optimization.

What I am proposing here is a "programmable Flexify", or a super math-map. This is the direction in which I want to take Tlalli.

Let me elaborate:

The current transformation model, for an equirectangular as input, is a computation for each pixel in the output image. That is, given an image I and a list of optional parameters [p], compute its projection I':

    I' = f(I, [p])

What if we have two such functions and compose them? For instance, the output of one projection is used as the input of another projection:

    I'' = g(f(I, [p]), [p'])

This only makes sense if the output of f is compatible with the input of g. For example, compute the Cassini of an equirectangular (rolling it by 90 degrees), and then compute the Mercator of the Cassini; the result is a transverse Mercator. We have implemented transverse Mercator as a composition of rolling the equirectangular + Mercator. Think of the possibilities.

In the current computational model this is done in steps: generate I', then apply g to I'. There are two disadvantages to this model:

* Error increases with every intermediate image.
* I/O operations are proportional to the number of functions.

What I envision is a system that will allow me to add my own functions to the computational stack, in the same way that layers work in Photoshop. So the composition happens at the pixel level, not the image level.

For example, if you would like to have your logo in the nadir, then you can create a function "Logo" that you insert into the computational stack. This function can take a string as a parameter. Then, when the panorama is projected/computed, the logo is inserted right on the spot.

If the architecture is open enough, this can lead to "plugin functions" that do things we can't even imagine today. We will only require the developer to create a function with certain properties, and register it.

This system will be very powerful and useful beyond panoramas.

Comments?

--
Daniel M. German
"Operating systems are like underwear, nobody really wants to look at them." -- Bill Joy
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .
|
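A minimal sketch of the composition described above, done on coordinates rather than on whole images. All names here (`roll_90`, `mercator`, `quarter_turn`, `compose`) are hypothetical and are not panotools functions. The sketch assumes a unit sphere with angles in radians, and the final quarter turn of the plane is an assumption made only to match the usual transverse Mercator axis convention.

```python
import math


def roll_90(lon, lat):
    """Rotate the sphere 90 degrees about the X axis (hypothetical stage)."""
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # Rotation about X: (x, y, z) -> (x, -z, y)
    x2, y2, z2 = x, -z, y
    # Clamp guards asin against rounding slightly outside [-1, 1].
    return math.atan2(y2, x2), math.asin(max(-1.0, min(1.0, z2)))


def mercator(lon, lat):
    """Spherical Mercator: longitude is kept, latitude is stretched."""
    return lon, math.atanh(math.sin(lat))


def quarter_turn(u, v):
    """Rotate the plane by 90 degrees so the axes match the usual convention."""
    return v, -u


def compose(*stages):
    """Build one coordinate function that applies the stages in order."""
    def composed(u, v):
        for stage in stages:
            u, v = stage(u, v)
        return u, v
    return composed


# The stack: roll the sphere, project with Mercator, turn the plane.
transverse_mercator = compose(roll_90, mercator, quarter_turn)


def transverse_mercator_closed_form(lon, lat):
    """Textbook spherical transverse Mercator, used here only as a check."""
    return (math.atanh(math.cos(lat) * math.sin(lon)),
            math.atan2(math.tan(lat), math.cos(lon)))
```

Because only coordinates flow through the stack, an image would be sampled once, at the end, however many stages are composed.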
From: JD S. <jd...@as...> - 2007-03-23 22:06:06
|
On Thu, 22 Mar 2007 16:46:30 -0700, dmg wrote:
> What I envision is a system that will allow me to add my own functions
> to the computational stack, in the same way that layers work in
> Photoshop. So the composition happens at the pixel level, not the
> image level.
> [...]
> If the architecture is open enough, this can lead to "plugin
> functions" that do things we can't even imagine today. We will only
> require the developer to create a function with certain properties,
> and register it.
>
> This system will be very powerful and useful beyond panoramas.

Sounds quite powerful. In particular, if the system is flexible enough to allow plug-ins implementing portions of the easier projections on GPUs, it could make future optimizations quite a bit more robust. I would only worry that focusing too much on an arbitrarily deep transformation stack might come at the cost of performance/robustness of the "single layer" transformations that are the bread and butter of PanoTools.

BTW, it would be *very* useful for you to enter a description of Tlalli at:

http://wiki.panotools.org/wiki/index.php?title=Tlalli&action=edit

This link already exists in several other places on the wiki, including the SoC page, where it may very well confuse students new to the community.

JD
|
From: Daniel M. G. <dm...@uv...> - 2007-03-24 00:06:22
|
JD> BTW, it would be *very* useful for you to enter in a description of JD> tlalli at: JD> http://wiki.panotools.org/wiki/index.php?title=Tlalli&action=edit done. JD> This link already exists in several other places on the wiki, JD> including the SoC page, where it may very well confuse students new to JD> the community. JD> JD -- Daniel M. German "If I were stuck on a desert island with only one compiler Brian Kernighan -> I'd want a C compiler" http://turingmachine.org/ http://silvernegative.com/ dmg (at) uvic (dot) ca replace (at) with @ and (dot) with . |
From: Pablo d'A. <pab...@we...> - 2007-03-24 17:05:12
|
Hi Daniel,

> On Thu, 22 Mar 2007 16:46:30 -0700, dmg wrote:
>
>> The current transformation model, for an equirectangular as input, is
>> a computation for each pixel in the output image. That is, given an
>> image I, a list of optional parameters, compute its projection I':
>>
>> I' = f(I, [p])
>>
>> What if we have two fs, and compose them?

What exactly do you mean by f()? A function that transforms coordinates, (a,b) -> (c,d), or a complete image processing function?

>> [...]
>> What I envision is a system that will allow me to add my own functions
>> to the computational stack, in the same way that layers work in
>> Photoshop. So the composition happens at the pixel level, not the
>> image level.

I didn't fully understand what you mean by that. Is it:

1. Making the coordinate transformation stack more flexible by allowing more dynamic construction of it?

2. Moving to a new image processing framework based on lazy evaluation, where operations are performed only for the pixels that contribute to the requested output pixels? This means that "pulling" at an output pixel triggers the whole image processing stack for that pixel, and no intermediate images are required (only where the operations used in the stack allow it, obviously).

I think step 1 is very important. It does not make sense to apply multiple geometric transforms one after another, since each one loses quality due to interpolation.

Step 2 has already been implemented by a number of other applications, which should be evaluated in detail before starting yet another one. This especially includes VIPS, which could be easily extended with the operations supported by panotools and hugin.

http://www.vips.ecs.soton.ac.uk/index.php?title=VIPS

Please take a look at that before starting something new from scratch. I hope we will get a student to work on porting the panotools operations to VIPS, since this is similar to what you are describing here.

>> For example, if you like to have your logo in the nadir, the you can
>> create a function "Logo" that you insert into the computational stack.
>> This function can take as a parameter a string. Then when the panorama
>> is projected/computed, the logo is inserted right on the spot.

This sounds very much like using VIPS, or a similar architecture, such as GEGL.

>> This system will be very powerful and useful beyond panoramas.

This is why something similar has been written already :-)

ciao
Pablo
|
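The "pull" model in point 2 can be illustrated with a small sketch. This is not how VIPS or GEGL are implemented; the classes `Source`, `FlipHorizontal` and `Gain` are hypothetical and only show that asking for one output pixel touches exactly one input pixel, with no intermediate image.

```python
class Source:
    """Hypothetical image source that counts how many pixels are read."""

    def __init__(self, rows):
        self.rows = rows
        self.width = len(rows[0])
        self.height = len(rows)
        self.reads = 0

    def pixel(self, x, y):
        self.reads += 1
        return self.rows[y][x]


class FlipHorizontal:
    """Geometric stage: output (x, y) pulls input (width - 1 - x, y)."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.width, self.height = upstream.width, upstream.height

    def pixel(self, x, y):
        return self.upstream.pixel(self.width - 1 - x, y)


class Gain:
    """Point stage: scales whatever value the upstream stage delivers."""

    def __init__(self, upstream, factor):
        self.upstream = upstream
        self.factor = factor
        self.width, self.height = upstream.width, upstream.height

    def pixel(self, x, y):
        return self.factor * self.upstream.pixel(x, y)


source = Source([[1, 2, 3],
                 [4, 5, 6]])
stack = Gain(FlipHorizontal(source), 10)

# Pulling one output pixel runs the whole stack for that pixel only.
value = stack.pixel(0, 1)
reads_after_one_pull = source.reads
```

Rendering the full output is then just a loop over output coordinates that calls `stack.pixel(x, y)` for each one.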
From: Daniel M. G. <dm...@uv...> - 2007-03-25 19:31:20
|
Hi Pablo,

Pablo> I didn't fully understand what you mean with that. Is it:
Pablo> 1. make the coordinate transformation stack more flexible by allowing more
Pablo> dynamic construction of it?

Last night I spent some time on this. I looked at Flexify and Mathmap.

I think it should be restricted to the computational stack of the coordinate system; otherwise it becomes an image processing tool. If it is restricted to the computational stack, both can live with each other (a VIPS implementation that uses the computational stack for its coordinate transformation). In a way I see them as orthogonal to each other. I haven't looked at VIPS yet.

The functions in the stack could be provided via source code (like the Gimp plug-ins) or via a specific language (using the mathmap language and parser, for example).

Pablo> I think step 1 is very important. It does not make sense to do multiple
Pablo> geometric transforms (which always include quality loss due to
Pablo> interpolation) after each other.

I agree.

--
Daniel M. German
"I cannot but conclude the bulk of your natives [human beings] to be the most pernicious race of little odious vermin that nature ever suffered to crawl upon the surface of the earth" -- Jonathan Swift
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .
|
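The quality loss from chaining geometric transforms can be seen in a small one-dimensional sketch. The `resample` helper is hypothetical and uses plain linear interpolation with clamped edges, which is an assumption of this example and not the interpolator panotools uses. Shifting a signal by half a sample twice blurs it, while composing the two shifts into one whole-sample shift and resampling once does not.

```python
def resample(signal, shift):
    """Shift a 1-D signal to the right by `shift` samples.

    Each output sample i is read from position i - shift in the input,
    using linear interpolation; positions are clamped to the valid range.
    """
    n = len(signal)
    out = []
    for i in range(n):
        pos = min(max(i - shift, 0.0), n - 1.0)
        lo = int(pos)                 # pos >= 0, so int() is a floor
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


impulse = [0, 0, 0, 1, 0, 0, 0, 0]

# Two separate transforms: the signal is interpolated twice.
two_pass = resample(resample(impulse, 0.5), 0.5)

# The same total shift composed first, then a single interpolation.
one_pass = resample(impulse, 0.5 + 0.5)
```

Here `one_pass` is still a clean impulse, moved by one sample, whereas `two_pass` has been smeared over three samples and its peak has dropped to half the original height.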
From: Pablo d'A. <pab...@we...> - 2007-03-25 21:39:05
|
Hi Daniel,

Daniel M. German wrote:
> I think it should be restricted to the computational stack of the
> coordinate system, otherwise it becomes a image processing tool. If it
> is restricted to the computation stack both can live with each other
> (a VIPS implementation that uses the computational stack for its
> coordinate transformation). In a way I see them orthogonal to each
> other.

I fully agree.

ciao
Pablo
|
From: Ippei U. <ipp...@ma...> - 2007-03-24 00:33:52
|
On 2007-03-22, at 23:46, dmg wrote:
> What I envision is a system that will allow me to add my own functions
> to the computational stack, in the same way that layers work in
> Photoshop. So the composition happens at the pixel level, not the
> image level.
> [...]
> This system will be very powerful and useful beyond panoramas.
>
> Comments?

This sounds really cool!

Would it be like Quartz Composer, where you can flexibly pipeline multiple operations and the computation is optimised for the required result? I love how it works, and I am quite sure anything like that for panorama imaging would be really, really cool and popular.

Ippei

--
->> 鵜飼 一平 (UKAI Ippei) ->>>>>>>>>>>>>>>>>>>>>>>>
MSN & AIM: ipp...@ma...
Skype: ippei_ukai
Homepage: http://homepage.mac.com/ippei_ukai/
|
From: Daniel M. G. <dm...@uv...> - 2007-03-24 00:41:47
|
Ippei> This sounds really cool!

Ippei> Would it be like Quartz Composer where you can flexibly pipeline
Ippei> multiple operations and the computation happens optimised to the
Ippei> required result? I love how it works and am quite sure anything like
Ippei> that for Panorama imaging would be really really cool and popular.

Exactly, a pipeline architecture.

We have to be careful, though. Some operations might not be able to operate at the pixel-region level and might require the entire previous result, so this somehow has to be taken into account. But I think you get the idea. It is like in databases: some results are "pipelineable" and some have to be "materialized". This is no different.

dmg

--
Daniel M. German
"A work/computing environment without a foot of assorted junk piled on top isn't a true environment." -- Slashdot poster
http://turingmachine.org/
http://silvernegative.com/
dmg (at) uvic (dot) ca
replace (at) with @ and (dot) with .
|
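The distinction between "pipelineable" and "materialized" stages can be sketched as follows. The classes `Ramp`, `Scale` and `NormalizeByMax` are hypothetical and only illustrate the idea: a per-pixel scale can be answered one pixel at a time, while a normalization by the global maximum has to evaluate the entire previous result once and keep it.

```python
class Ramp:
    """Hypothetical source whose pixel value is x + y; counts evaluations."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.evaluations = 0

    def pixel(self, x, y):
        self.evaluations += 1
        return float(x + y)


class Scale:
    """Pipelineable stage: one output pixel needs one upstream pixel."""

    def __init__(self, upstream, gain):
        self.upstream, self.gain = upstream, gain
        self.width, self.height = upstream.width, upstream.height

    def pixel(self, x, y):
        return self.gain * self.upstream.pixel(x, y)


class NormalizeByMax:
    """Stage that must materialize: it needs the global maximum."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.width, self.height = upstream.width, upstream.height
        self._buffer = None
        self._peak = None

    def _materialize(self):
        # Evaluate the whole upstream result once and cache it.
        if self._buffer is None:
            self._buffer = [[self.upstream.pixel(x, y)
                             for x in range(self.width)]
                            for y in range(self.height)]
            self._peak = max(max(row) for row in self._buffer)

    def pixel(self, x, y):
        self._materialize()
        return self._buffer[y][x] / self._peak


ramp = Ramp(4, 3)
stack = NormalizeByMax(Scale(ramp, 2.0))
corner = stack.pixel(3, 2)        # forces one full pass over the upstream
after_first_pull = ramp.evaluations
origin = stack.pixel(0, 0)        # served from the cached buffer
after_second_pull = ramp.evaluations
```

A scheduler for such a stack would need each stage to declare which kind it is, so that buffers are allocated only where a stage cannot be pipelined.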
From: <jm...@we...> - 2007-03-24 14:26:30
|
Beware of GPUs: they are really fast (usually something like 10x faster than a CPU), but they are only that fast when working alone. Exchanging information between CPU and GPU is extremely difficult to pipeline unless you have a continuous queue of jobs to process.

Basically, a GPU can't quite be used as a "coprocessor". The best way to use one is to do everything that has to be done on the CPU first, then do everything that has to be done on the GPU, and display. If results go to disk, then it's more like: do a lot on the CPU, do a lot on the GPU, read back to the CPU (quite slow), do more work on the CPU, flush to disk.

What does the CPU do while the GPU is working? Best approach: prepare the next job. What does this mean? That pipeline architectures are the way to go: split jobs into separate steps and have separate units work on different phases (example: core 1 sets data up, the GPU processes the parallel stuff, core 2 consolidates results and sends them to disk).

What's the main difficulty? Error creep. Each step in a pipeline introduces its own margin of error... It can become terribly difficult to stack many layers of "shaders" without losing too much precision.

My suggestion is that you work neither at the image level nor at the pixel level. You should work at the "tile" level, something that is easy for the processor to work with. Someone mentioned dealing with "images larger than available memory" before, and I'd like to point out that for heavy processing it makes sense to consider available memory to be whatever your L2 cache size is, i.e. somewhere between 512K and 4MB. It's entirely possible to write applications pretending the cache is the actual memory and the RAM is more like disk space... A 16MP image is too big for the cache, and an individual pixel is too small to be practical... 128x128 float RGB tiles can work: at 192K per tile (128 x 128 pixels x 3 channels x 4 bytes), you can have five of them in a 1MB cache (one you write to and four being read from).

GPU use is really the way forward, especially for processing photos...
The difficulty is dealing with platforms, APIs and hardware capabilities. Things may be getting even more complicated with nVidia and ATI each coming out with their own "GPGPU" APIs... I guess there is a way to be sufficiently platform independent with OpenGL (I'm not sure, being more of a DirectX person myself). It's worth investigating, as I am sure a lot of your projection work could benefit: after all, that's what GPUs do: project textures...

Anyway, all this to say: stack of pipelined processes = good idea!

--
Jerome Muffat-Meridol LRPS - http://www.webphotomag.com - the online magazine about photographs, not cameras -

Daniel M. German wrote:
> Exactly, a pipeline architecture.
>
> We have to be carefull, though. Some operations might not be able to
> operate at the pixel region level and might require the entire
> previous result, so this has somehow to be taken into account. But I
> think you get the idea. But this is like in databases: some results
> are "pipelineable" and some have to be "materialized". This is no
> different.
>
> dmg
>
> _______________________________________________
> PanoTools-devel mailing list
> Pan...@li...
> https://lists.sourceforge.net/lists/listinfo/panotools-devel
|
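The tile-size arithmetic in Jerome's message can be checked with a few lines. The helper names (`tile_bytes`, `tiles_that_fit`, `iter_tiles`) are hypothetical, and the sketch assumes 4-byte floats, three channels, and that "1MB" means 1024 x 1024 bytes.

```python
def tile_bytes(side, channels=3, bytes_per_sample=4):
    """Memory needed for one square tile of float RGB samples."""
    return side * side * channels * bytes_per_sample


def tiles_that_fit(cache_bytes, side, channels=3, bytes_per_sample=4):
    """How many whole tiles fit in a cache of the given size."""
    return cache_bytes // tile_bytes(side, channels, bytes_per_sample)


def iter_tiles(width, height, side):
    """Yield (left, top, tile_width, tile_height) covering the image.

    Tiles on the right and bottom edges are cropped to the image bounds.
    """
    for top in range(0, height, side):
        for left in range(0, width, side):
            yield (left, top,
                   min(side, width - left),
                   min(side, height - top))


one_tile = tile_bytes(128)                         # bytes for a 128x128 tile
in_one_megabyte = tiles_that_fit(1024 * 1024, 128)
layout = list(iter_tiles(300, 200, 128))           # a small 300x200 example
```

With these assumptions a 128x128 tile takes 196,608 bytes, which is the 192K quoted above, and five such tiles fit in 1MB.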