<h1>How It Works</h1>
<p class="author">By Andrew Mihal, 17 October 2004</p>
<p><b>Note:</b> This page refers to
the older 2.X releases of Enblend. The new 3.X versions feature a new
seam line optimization algorithm that tries to automatically avoid
placing the seam in areas where the input images mismatch. This reduces
the chance of having ghosts and cut-off people in the final output.
This feature will be documented in a future update to this page.</p>
<p>Enblend is a tool for compositing images. Given a set
of images that overlap in some irregular way, Enblend overlays them in
such a way that the seam between the images is invisible, or at least
very difficult to see. Enblend does <b>not</b> line up the
images for you. Use a tool like
<a href="http://hugin.sourceforge.net/" target="_blank">Hugin</a>
to do that.</p>
<p>Enblend uses a multiresolution spline to blend images together [1,2]. The
basic idea is that each image feature should be blended across a transition
zone whose width matches the scale of the feature, that is, a zone inversely
proportional to the feature's spatial frequency. Big, smooth objects like the
sky and clouds have low spatial frequency and should be blended across a very
wide region. Our eyes expect the sky to be uniform in appearance, so any sudden
color change there is very noticeable, and it is important to smooth out the
difference over as large a zone as possible. Areas with high spatial
frequency, such as trees and windowpanes, have sudden changes from light to
dark. Our eyes expect to see color changes here, and blending over a wide area
risks noticeable ghosting, so high-frequency components are blended across a
narrow transition zone. Treating the frequency components separately gives
better results than I was ever able to achieve by hand in the GIMP.</p>
<h3>Finding a Transition Line</h3>
<p>The first step is to calculate a transition line between the images. This
line will be used as a template for creating narrow blending masks (for
high-frequency details) and wide blending masks (for low-frequency areas).
Ideally, the transition line should be near the middle of the intersection
region between the images. This way, there will be plenty of room on the left
side of the line for the right image to fade out, and plenty of room on the
right side for the left image to fade out.</p>
<p>Enblend uses an algorithm suggested in [4] based on the Nearest Feature
Transform [3] to find the transition line. The algorithm finds a line
which is as far away as possible from the edges of the area where two images
intersect. Here is an example:</p>
<div class="illustration">
<div class="image">
<img src="images/mask_example1.jpg" border="0" height="300" width="603">
</div>
<div class="image">
<img src="images/mask_example1_mask.jpg" border="0" height="300" width="602">
</div>
</div>
<p>The red and green outlines show the alpha channel of the input images. The
black area of the mask indicates where the left image will have priority over
the right image. The white area of the mask indicates where the right image will
have priority over the left image. To avoid spatial confusion, I will usually
refer to the input images as the "black" image and the "white" image from now
on. Now you know where the Enblend logo comes from.</p>
<p>You can see from the double image that Jerry moved between these two
photos. In the black image, he is half missing. By adjusting the alpha channels
of the input images, we can make sure Jerry appears whole in the output, and
that the half-Jerry won't adversely affect the blending. I'll erase Jerry from
the black image by making its alpha channel transparent in the affected areas.
It is important to erase the entire area where the images disagree. Here is
the result:</p>
<div class="illustration">
<div class="image">
<img src="images/mask_example2.jpg" border="0" height="300" width="602">
</div>
<div class="image">
<img src="images/mask_example2_mask.jpg" border="0" height="300" width="602">
</div>
</div>
<p>You can see how Enblend re-routed the transition line to avoid the part of
the image I cut out.</p>
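<p>To make the idea concrete, here is a minimal Python sketch of one way to
build such a mask. It is not Enblend's actual code: the function name
<code>transition_mask</code>, the list-of-lists alpha channels, and the
breadth-first search (which measures distance in horizontal and vertical steps
rather than true Euclidean distance) are assumptions made for illustration.
Pixels covered by only one image keep that image's color, and every pixel in
the overlap takes the color of the nearest such pixel, which puts the seam
roughly in the middle of the overlap.</p>

```python
from collections import deque

def transition_mask(black_alpha, white_alpha):
    """Label every covered pixel 0 (black image) or 1 (white image)."""
    rows, cols = len(black_alpha), len(black_alpha[0])
    mask = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with pixels that only one of the two images covers.
    for r in range(rows):
        for c in range(cols):
            if black_alpha[r][c] != white_alpha[r][c]:
                mask[r][c] = 1 if white_alpha[r][c] else 0
                queue.append((r, c))
    # Grow both regions into the overlap one step at a time, so each
    # overlap pixel takes the label of the closest single-image pixel.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and mask[nr][nc] is None and black_alpha[nr][nc] and white_alpha[nr][nc]:
                mask[nr][nc] = mask[r][c]
                queue.append((nr, nc))
    return mask

# One row, ten columns: black covers columns 0-6, white covers columns 3-9.
black_alpha = [[c <= 6 for c in range(10)]]
white_alpha = [[c >= 3 for c in range(10)]]
mask = transition_mask(black_alpha, white_alpha)

# Erasing columns 5-6 from the black alpha channel re-routes the seam.
edited_alpha = [[c <= 4 for c in range(10)]]
rerouted = transition_mask(edited_alpha, white_alpha)
```

<p>In this toy example the seam lands between columns 4 and 5, the middle of
the overlap. After the alpha edit it moves to between columns 3 and 4, away
from the erased area.</p>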
<h3>Creating the Laplacian Pyramids</h3>
<p>Next, Enblend builds three pyramids, one each from the black image, the
white image, and the blend mask. The black and white images are turned into
Laplacian pyramids. A Laplacian pyramid breaks up an image into components
based on spatial frequency. The top level of the pyramid (level 0) contains
just the highest spatial frequency components: the edgiest of the edges. The
bottom level contains the lowest spatial frequency components: smooth areas
like the sky. The intermediate levels contain features gradually decreasing in
spatial frequency from high to low.</p>
<p>A Laplacian pyramid is made by repeatedly applying a high-pass filter to the
image. The high-pass filter picks out all of the high spatial frequency
components of the image and passes everything else down to the next level. The
image that gets passed down actually contains less information (because the
edges have been removed) so we can downsample it. This reduces the size of the
next level by half in each dimension. This shrinking is what gives the pyramid
its pyramidal shape.</p>
<p>At the next level, the filter picks out the next-highest spatial frequency
components, and so on. After we have created the number of levels we want, the
bottom level is left with only the lowest spatial frequency components.</p>
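<p>The construction described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not Enblend's
implementation: it works on a single row of pixels instead of a full image,
the 5-tap filter weights are an assumed choice, and the helper names
(<code>blur</code>, <code>reduce_level</code>, <code>expand_level</code>,
<code>laplacian_pyramid</code>) are made up for this example. The high-pass
step is computed as the difference between a level and a smoothed copy of
itself.</p>

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed 5-tap low-pass filter

def blur(signal):
    """Low-pass filter a 1-D signal, repeating the end samples at the borders."""
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - 2, 0), last)] for k, w in enumerate(KERNEL))
        for i in range(len(signal))
    ]

def reduce_level(signal):
    """Blur, then keep every second sample so the level shrinks by half."""
    return blur(signal)[::2]

def expand_level(signal, size):
    """Stretch a level back to `size` samples: repeat each sample, then blur."""
    return blur([signal[i // 2] for i in range(size)])

def laplacian_pyramid(signal, levels):
    """Return `levels` bands: high-frequency detail first, smooth residual last."""
    pyramid = []
    current = list(signal)
    for _ in range(levels - 1):
        smaller = reduce_level(current)
        smooth = expand_level(smaller, len(current))
        # Whatever the low-pass version cannot explain is this level's detail.
        pyramid.append([c - s for c, s in zip(current, smooth)])
        current = smaller
    pyramid.append(current)  # bottom level: lowest spatial frequencies only
    return pyramid

# A flat "sky" on the left and a sharp edge into a bright region on the right.
row = [10.0] * 8 + [200.0] * 8
pyramid = laplacian_pyramid(row, 3)
sizes = [len(level) for level in pyramid]
```

<p>Each level is half the size of the one above it, and the detail levels are
zero wherever the input is flat: only the neighborhood of the edge produces
nonzero values.</p>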
<div class="illustration">
<table style="border-collapse: collapse;" border="2" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="justify">&nbsp;</td>
<td align="center">Black Laplacian Pyramid</td>
<td align="center">White Laplacian Pyramid</td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 0</td>
<td align="center"><img src="images/black_lp0.jpg" border="0" height="241" width="242"></td>
<td align="center"><img src="images/white_lp0.jpg" border="0" height="241" width="279"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 1</td>
<td align="center"><img src="images/black_lp1.jpg" border="0" height="121" width="121"></td>
<td align="center"><img src="images/white_lp1.jpg" border="0" height="121" width="139"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 2</td>
<td align="center"><img src="images/black_lp2.jpg" border="0" height="60" width="60"></td>
<td align="center"><img src="images/white_lp2.jpg" border="0" height="60" width="69"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 3</td>
<td align="center"><img src="images/black_lp3.jpg" border="0" height="30" width="30"></td>
<td align="center"><img src="images/white_lp3.jpg" border="0" height="30" width="34"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 4</td>
<td align="center"><img src="images/black_lp7.jpg" border="0" height="13" width="13"></td>
<td align="center"><img src="images/white_lp7.jpg" border="0" height="13" width="15"></td>
</tr>
</tbody>
</table>
</div>
<p>These scaled-down images don't look like much. Here is a bigger version of
the black pyramid level zero. This should make it clear that the biggest level
of the Laplacian pyramid contains only the highest spatial frequency features in
the image. I have enhanced the contrast of all of these images to bring out the
detail.</p>
<div class="illustration">
<div class="image">
<img src="images/black_lp0_crop.jpg" border="0" height="292" width="585">
</div>
</div>
<p>The deepest levels of the pyramids have the <b>smallest</b> number of pixels,
but they represent the <b>biggest, smoothest</b> features in the image.
Consequently, each pixel in these bottom levels will influence a large number of
pixels in the final result. To demonstrate this, I will reverse the Laplacian
pyramid process. This is called "collapsing" the pyramid. This recombines all of
the pyramid levels and gives you back the original image. Turning an image into
a Laplacian pyramid and then collapsing it is a lossless transformation.
However, to show how the bottom level contains the lowest spatial frequency
components in the image, I will first zero out all of the pyramid levels except
for the bottom one. This is equivalent to throwing away all but the lowest
spatial frequency components of the image. When this is collapsed, we will see
only the contribution of the bottom level. Here is what you get:</p>
<div class="illustration">
<table style="border-collapse: collapse;" border="2" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="right">&nbsp;</td>
<td align="center">Level 4</td>
<td align="center">Collapsed</td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Black<br>
Laplacian<br>
Pyramid</td>
<td align="center">
<p align="center"><img src="images/black_lp7.jpg" border="0" height="13" width="13"></p>
</td>
<td align="center"><img src="images/black_collapse.jpg" border="0" height="241" width="279"></td>
</tr>
<tr>
<td align="right" nowrap="nowrap">White<br>
Laplacian<br>
Pyramid</td>
<td align="center"><img src="images/white_lp7.jpg" border="0" height="13" width="15"></td>
<td align="center"><img src="images/white_collapse.jpg" border="0" height="241" width="279"></td>
</tr>
</tbody>
</table>
</div>
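<p>The same experiment can be tried on a single row of pixels. The sketch
below is a simplified illustration, not Enblend's code: the filter weights and
all function names are assumptions, and it uses one dimension instead of two.
It collapses a pyramid once as-is, and once with every level except the bottom
one zeroed out.</p>

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed 5-tap low-pass filter

def blur(signal):
    """Low-pass filter a 1-D signal, repeating the end samples at the borders."""
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - 2, 0), last)] for k, w in enumerate(KERNEL))
        for i in range(len(signal))
    ]

def expand_level(signal, size):
    """Stretch a level back to `size` samples: repeat each sample, then blur."""
    return blur([signal[i // 2] for i in range(size)])

def laplacian_pyramid(signal, levels):
    """Detail levels first, low-frequency residual last."""
    pyramid, current = [], list(signal)
    for _ in range(levels - 1):
        smaller = blur(current)[::2]
        smooth = expand_level(smaller, len(current))
        pyramid.append([c - s for c, s in zip(current, smooth)])
        current = smaller
    return pyramid + [current]

def collapse(pyramid):
    """Rebuild a signal by expanding from the bottom and adding each detail level."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        expanded = expand_level(current, len(detail))
        current = [d + e for d, e in zip(detail, expanded)]
    return current

row = [10.0] * 8 + [200.0] * 8
pyramid = laplacian_pyramid(row, 3)

# Collapsing the untouched pyramid gives back the original row.
restored = collapse(pyramid)

# Zero every level except the bottom one, then collapse again.
low_only = [[0.0] * len(level) for level in pyramid[:-1]] + [pyramid[-1]]
smooth = collapse(low_only)
```

<p>Here <code>restored</code> matches <code>row</code> up to floating-point
rounding, while <code>smooth</code> is a blurred ramp: the four samples of the
bottom level end up influencing all sixteen output pixels.</p>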
<h3>Creating the Gaussian Pyramid</h3>
<p>The key to the multiresolution spline technique is to blend image features
across a transition zone whose width is inversely proportional to the spatial
frequency of the features. This is accomplished by blending the black
Laplacian pyramid and the white Laplacian pyramid together, one level at a
time. Each level uses a different blending mask. At the top level we want a
sharp blend mask so that high-frequency details are blended over a narrow
region. At the bottom level we want a wide blend mask so that low-frequency
details are blended over a large region.</p>
<p>These blend masks are constructed from the transition line template we
calculated above by creating a Gaussian pyramid. This process is similar to
making a Laplacian pyramid. Instead of high-pass filtering each level, we use a
low-pass filter. At the top level we start with the sharp blend mask we get from
the transition line template itself. The low-pass filter makes this transition
line blurrier and blurrier as we go down the levels. This gives us the
progression of blending zones that we want. We get a sharp blending zone at the
top and a wide blending zone at the bottom.</p>
<p>Because the low-pass filter removes detail from the image, we can still
downsample each level as before. Here is the result:</p>
<div class="illustration">
<table style="border-collapse: collapse;" border="2" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="right" nowrap="nowrap">&nbsp;</td>
<td align="center">Mask Gaussian Pyramid</td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Level 0</td>
<td align="center"><img src="images/mask_gp0.jpg" border="0" height="241" width="279"></td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Level 1</td>
<td align="center"><img src="images/mask_gp1.jpg" border="0" height="121" width="139"></td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Level 2</td>
<td align="center"><img src="images/mask_gp2.jpg" border="0" height="60" width="69"></td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Level 3</td>
<td align="center"><img src="images/mask_gp3.jpg" border="0" height="30" width="34"></td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Level 4</td>
<td align="center"><img src="images/mask_gp7.jpg" border="0" height="13" width="15"></td>
</tr>
</tbody>
</table>
</div>
<p>As before, the downsampling makes it hard to see that the blending mask is
really getting wider. Here is the actual influence that the bottom mask exerts
on the final result: </p>
<div class="illustration">
<table style="border-collapse: collapse;" border="2" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="right">&nbsp;</td>
<td align="center">Level 4</td>
<td align="center">Collapsed</td>
</tr>
<tr>
<td align="right" nowrap="nowrap">Mask<br>
Gaussian<br>
Pyramid</td>
<td align="center">
<p align="center"><img src="images/mask_gp7.jpg" border="0" height="13" width="15"></p>
</td>
<td align="center"><img src="images/mask_collapse.jpg" border="0" height="241" width="279"></td>
</tr>
</tbody>
</table>
</div>
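<p>A one-dimensional Python sketch shows the widening numerically. It is an
illustration only: the filter weights and the names <code>blur</code>,
<code>reduce_level</code> and <code>gaussian_pyramid</code> are assumptions,
and the mask is a single row in which 0 means "black image" and 1 means
"white image".</p>

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed 5-tap low-pass filter

def blur(signal):
    """Low-pass filter a 1-D signal, repeating the end samples at the borders."""
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - 2, 0), last)] for k, w in enumerate(KERNEL))
        for i in range(len(signal))
    ]

def reduce_level(signal):
    """Blur, then keep every second sample so the level shrinks by half."""
    return blur(signal)[::2]

def gaussian_pyramid(mask, levels):
    """Level 0 is the sharp mask; each later level is blurrier and half the size."""
    pyramid = [list(mask)]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

# Sharp transition line in the middle of a 16-pixel row.
mask = [0.0] * 8 + [1.0] * 8
mask_gp = gaussian_pyramid(mask, 3)

# Count the samples at each level that are neither fully black nor fully white.
soft = [sum(1 for m in level if 0.0 < m < 1.0) for level in mask_gp]
```

<p>The number of partially blended samples grows with depth even though each
level has fewer samples in total. Since a sample at level <i>n</i> stands for
2<sup><i>n</i></sup> pixels of the original row, the blending zone measured in
original pixels grows faster still.</p>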
<h3>Blending the Pyramids</h3>
<p>The next step is to blend the Laplacian pyramids together, one level at a
time. Each level will use the corresponding blend mask from the mask Gaussian
pyramid. Here is the result:</p>
<div class="illustration">
<table style="border-collapse: collapse;" border="2" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="justify">&nbsp;</td>
<td colspan="4" align="center" nowrap="nowrap">
<div align="center">
<center>
<table style="border-collapse: collapse;" id="AutoNumber11" border="0" cellpadding="0" cellspacing="0" width="100%">
<tbody>
<tr>
<td align="center" width="25%">Result =</td>
<td align="center" width="25%">Mask Gaussian Pyramid (</td>
<td align="center" width="25%">Black Laplacian Pyramid,</td>
<td width="25%">
<p align="center">White Laplacian Pyramid)</p>
</td>
</tr>
</tbody>
</table>
</center>
</div>
</td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 0</td>
<td align="center" nowrap="nowrap">
<img src="images/blend_lp0.jpg" border="0" height="241" width="242"></td>
<td align="center" nowrap="nowrap"><img src="images/mask_gp0.jpg" border="0" height="241" width="279"></td>
<td align="center"><img src="images/black_lp0.jpg" border="0" height="241" width="242"></td>
<td align="center"><img src="images/white_lp0.jpg" border="0" height="241" width="279"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 1</td>
<td align="center" nowrap="nowrap">
<img src="images/blend_lp1.jpg" border="0" height="121" width="121"></td>
<td align="center" nowrap="nowrap"><img src="images/mask_gp1.jpg" border="0" height="121" width="139"></td>
<td align="center"><img src="images/black_lp1.jpg" border="0" height="121" width="121"></td>
<td align="center"><img src="images/white_lp1.jpg" border="0" height="121" width="139"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 2</td>
<td align="center" nowrap="nowrap">
<img src="images/blend_lp2.jpg" border="0" height="60" width="60"></td>
<td align="center" nowrap="nowrap"><img src="images/mask_gp2.jpg" border="0" height="60" width="69"></td>
<td align="center"><img src="images/black_lp2.jpg" border="0" height="60" width="60"></td>
<td align="center"><img src="images/white_lp2.jpg" border="0" height="60" width="69"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 3</td>
<td align="center" nowrap="nowrap">
<img src="images/blend_lp3.jpg" border="0" height="30" width="30"></td>
<td align="center" nowrap="nowrap"><img src="images/mask_gp3.jpg" border="0" height="30" width="34"></td>
<td align="center"><img src="images/black_lp3.jpg" border="0" height="30" width="30"></td>
<td align="center"><img src="images/white_lp3.jpg" border="0" height="30" width="34"></td>
</tr>
<tr>
<td align="justify" nowrap="nowrap">Level 4</td>
<td align="center" nowrap="nowrap">
<img src="images/blend_lp7.jpg" border="0" height="13" width="13"></td>
<td align="center" nowrap="nowrap"><img src="images/mask_gp7.jpg" border="0" height="13" width="15"></td>
<td align="center"><img src="images/black_lp7.jpg" border="0" height="13" width="13"></td>
<td align="center"><img src="images/white_lp7.jpg" border="0" height="13" width="15"></td>
</tr>
</tbody>
</table>
</div>
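<p>Per level, the blend is a weighted average of the two Laplacian levels,
with the mask level supplying the weights. The Python sketch below assumes
mask values between 0 (use the black image) and 1 (use the white image) and
pyramids stored as lists of rows; the function name
<code>blend_pyramids</code> and the tiny hand-written pyramids are
hypothetical and only serve to show the arithmetic.</p>

```python
def blend_pyramids(black_lp, white_lp, mask_gp):
    """Blend two Laplacian pyramids level by level using the mask pyramid."""
    blended = []
    for black, white, mask in zip(black_lp, white_lp, mask_gp):
        # m = 0 keeps the black image, m = 1 keeps the white image,
        # and values in between mix the two.
        blended.append([(1.0 - m) * b + m * w for b, w, m in zip(black, white, mask)])
    return blended

# Two-level toy pyramids: a 4-sample top level and a 2-sample bottom level.
black_lp = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0]]
white_lp = [[5.0, 5.0, 5.0, 5.0], [6.0, 6.0]]
mask_gp = [[0.0, 0.0, 1.0, 1.0], [0.25, 0.75]]

result = blend_pyramids(black_lp, white_lp, mask_gp)
```

<p>The sharp mask at the top level switches abruptly from black to white,
while the softer mask at the bottom level mixes the two images on both sides
of the seam.</p>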
<h3>Collapsing the Result</h3>
<p>The final step is to collapse the blended Laplacian pyramid. This is then
pasted on top of the parts of the input images that were not involved in the
blending. This gives us the output image:</p>
<div class="illustration">
<div class="image">
<img src="images/blend_out.jpg" border="0" height="300" width="602">
</div>
</div>
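<p>Putting the steps together, the following Python sketch runs the whole
procedure on one row of pixels and compares it with a hard cut at the
transition line. It is a simplified illustration under several assumptions,
not Enblend's implementation: it is one-dimensional, the filter weights and
all function names are made up for the example, and it skips the final step of
pasting the result over the parts of the inputs that were not blended.</p>

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed 5-tap low-pass filter

def blur(signal):
    """Low-pass filter a 1-D signal, repeating the end samples at the borders."""
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - 2, 0), last)] for k, w in enumerate(KERNEL))
        for i in range(len(signal))
    ]

def expand_level(signal, size):
    """Stretch a level back to `size` samples: repeat each sample, then blur."""
    return blur([signal[i // 2] for i in range(size)])

def gaussian_pyramid(signal, levels):
    """Each level is a blurred, half-size copy of the previous one."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2])
    return pyramid

def laplacian_pyramid(signal, levels):
    """Detail left over between neighboring Gaussian levels, plus the residual."""
    gp = gaussian_pyramid(signal, levels)
    bands = [
        [f - e for f, e in zip(fine, expand_level(coarse, len(fine)))]
        for fine, coarse in zip(gp, gp[1:])
    ]
    return bands + [gp[-1]]

def collapse(pyramid):
    """Expand from the bottom level upward, adding back each detail level."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = [d + e for d, e in zip(band, expand_level(current, len(band)))]
    return current

def multires_blend(black, white, mask, levels):
    """Blend two rows; mask is 0 where black wins and 1 where white wins."""
    black_lp = laplacian_pyramid(black, levels)
    white_lp = laplacian_pyramid(white, levels)
    mask_gp = gaussian_pyramid(mask, levels)
    blended = [
        [(1.0 - m) * b + m * w for b, w, m in zip(bl, wl, ml)]
        for bl, wl, ml in zip(black_lp, white_lp, mask_gp)
    ]
    return collapse(blended)

def largest_jump(signal):
    """Biggest difference between two neighboring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

# Two shots of the same flat sky with slightly different exposure.
black = [100.0] * 32
white = [120.0] * 32
mask = [0.0] * 16 + [1.0] * 16

hard_cut = [w if m >= 0.5 else b for b, w, m in zip(black, white, mask)]
blended = multires_blend(black, white, mask, 4)
```

<p>The hard cut shows the full exposure difference as a single step at the
seam. In the blended row the same difference is spread across many pixels, so
<code>largest_jump(blended)</code> is smaller than
<code>largest_jump(hard_cut)</code>.</p>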
<h3>References</h3>
<table style="border-collapse: collapse;" border="0" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td align="left" valign="top">[1]</td>
<td align="left" valign="top">P. Burt and E. Adelson. "A
Multiresolution Spline With Application to Image Mosaics". ACM
Transactions on Graphics, Vol. 2, No. 4, October 1983, pp. 217-236.</td>
</tr>
<tr>
<td align="left" valign="top">[2]</td>
<td align="left" valign="top">P. Burt and E. Adelson. "The Laplacian
Pyramid as a Compact Image Code". IEEE Transactions on
Communications, April 1983.</td>
</tr>
<tr>
<td align="left" valign="top">[3]</td>
<td align="left" valign="top">M. Alsuwaiyel and M. Gavrilova. "On
the Distance Transform of Binary Images". International Conference
on Imaging Science, Systems, and Technology. 2000.</td>
</tr>
<tr>
<td align="left" valign="top">[4]</td>
<td align="left" valign="top">Y. Xiong and K. Turkowski.
"Registration, Calibration, and Blending in Creating High Quality
Panoramas". 4th IEEE Workshop on Applications of Computer Vision.
October, 1998.</td>
</tr>
<tr>
<td align="left" valign="top">[5]</td>
<td align="left" valign="top">J. Cychosz. "Efficient Binary Image
Thinning using Neighborhood Maps". In Graphics Gems IV, P. Heckbert
editor. Academic Press, 1994.</td>
</tr>
</tbody>
</table>