Flood algorithm

From GIMP Developer Wiki
Revision as of 19:49, 3 April 2016 by Scl (Talk | contribs)

Jump to: navigation, search

Author: Ell

Year: 2016


The flood operation eliminates "holes" -- darker areas surrounded by lighter areas -- in single-component (grayscale) images. It is particularly useful for eliminating such holes from selections; see GIMP bug #761060 for more details.

The conceptual model considers the input image as the height-map of a hilly terrain. After heavy rain completely floods the terrain, the remaining water form lakes in its depressions.

                              /.\    /\_____________
                 ____________/...\  /..\  Water  /..\
                /....\      /.....\/....\__     /....\____
               /......\____/....Ground.....\___/......\  /\

Figure: A depiction of a one-dimensional flood. Valleys correspond to "holes" in the input image, filled with "water" according to their surrounding.

The result of the flood operation is the height-map of the terrain after the flood, taking both ground- and water-level into account.

More formally, the flood operation assigns to each pixel the minimum of the maximal input-image values along all paths from the pixel to the "outside". That is, the output value at pixel `x`, `o(x)`, is given by

<math>o(x)   =   \min_{p \in P(x)} \max_{y \in p} i(y)</math>,

where `P(x)` is the set of all paths from `x` to the outside, and `i(y)` is the value of the input image at pixel `y`.


In accord with the conceptual flood model, we refer to the values of the input image as the "ground level", and to the values of the output image as the "water level"; these values range from 0 to 1, and are considered to be 0 outside the bounds of the image. Note not to confuse "water level" with "water depth"; we use the term "water level" simply to refer to the elevation of either the ground or the water at a certain point.

Our starting point is modeling the problem as a cellular automaton. The state of each cell (pixel) is its current water level, which is initially 1 everywhere inside the image. The water level at each cell is updated according to the rule

<math>w_{n+1} (x) = \max { g(x), \min_{y \in N(x)} w_n (y) }</math>,

where `wn(x)` is the water level at pixel `x` on generation `n`, `g(x)` is the ground level at `x`, and `N(x)` is the set of (orthogonal) neighbors of `x`, including itself. In other words, the new water level at each pixel is the maximum of its ground level, and the minimum of its own water level, and that of its neighbors. This automaton converges to the output of the operation.

The automaton converges after, at most, `n` generations, where `n` is the number of pixels in the image. Therefore, a naive implementation, where at most `O(n)` cells are processed on each generation, has a worst-case time complexity of `O(n^2)`. By making a few observations, we can do better, at least in the most common cases.

First, note that the final state doesn't depend on the order of the individual steps. That is, we don't actually have to update the water level an entire generation at a time, but rather we can apply the transformation rule to any pixel, at any time, arbitrarily, until convergence. Furthermore, we don't even have to consider all the neighbors of a pixel each time we apply the rule: as long as we make sure to never increase the water level of a pixel, i.e., as long as we consider the pixel's own water level as part of the minimum term, we can take a different subset of the neighbors into account each time.

Second, using the above observation, note that we can solve a one- dimensional automaton (i.e., compute its final state) in linear time, using two passes: On the first pass, we iterate over the pixels left-to-right, applying the transformation rule while considering only the left neighbor of each pixel (using the water level assigned to the neighbor on the previous iteration; recall that the water level of the left neighbor of the leftmost pixel of the image is considered to be 0.) On the second pass, we work in reverse -- we iterate over the pixels right-to-left, applying the rule while considering only the right neighbors.

                                  /.\    /\           __
                     ____        /...\  /..\         /..\
           (a)      /....\      /.....\/....\__     /....\
                   /......\____/...............\___/......\  /\
                                  /.\    /\           __
                     ____________/...\  /..\         /..\
           (b)      /....\      /.....\/....\__     /....\
                   /......\____/...............\___/......\  /\
                                  /.\    /\_____________
                     ____________/...\  /..\         /..\
           (c)      /....\      /.....\/....\__     /....\____
                   /......\____/...............\___/......\  /\

Figure: Water level of a one-dimensional automaton. (a) initially; (b) after the first pass; (c) after the second pass.

While this technique doesn't extend directly to two dimensions, we can leverage it by processing one-dimensional strips of pixels in batch, as described above, instead of transforming pixels individually.

Finally, another obvious way to speed things up is to minimize the amount of unnecessary work we're doing. In particular, we only need to process pixels whose neighbors' state changed.

Taking all of the above into consideration, this is what we do:

We maintain a queue of "segments" -- one-dimensional, contiguous, strips of pixels -- whose water level needs to be updated, as a result of a change in the water level of the pixels of a neighboring segment, referred to as the "source segment". Although, for efficiency reasons, we allow segments to be either horizontal or vertical, for simplicity, we treat all segments as though they're horizontal, and perform the necessary coordinate-system transformation when dealing with vertical segments, as explained later.

Each segment is processed using the following steps:

1. Vertical propagation: The segment's pixels are updated, using the above transformation rule, considering only the corresponding

neighboring pixels of the source segment. During the process, we inspect which of the segment's pixels actually changed, and create a
list of "dirty ranges" of modified pixels. We construct the ranges such that all pixels of each range have the same water level; this
becomes important in the next step.
               - - -+-----+-----+-----+-----+-----+-----+-----+- - -
      Source        |     |     |     |     |     |     |     |
      Segment       |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
               - - -+--|--+--|--+--|--+--|--+--|--+--|--+--|--+- - -
      Current       |  V  |  V  |  V  |  V  |  V  |  V  |  V  |
      Segment       |     |  x  |  x  |     |  y  |  z  |     |
               - - -+-----+-----+-----+-----+-----+-----+-----+- - -
      Dirty ranges        |-----------|     |-----|-----|
                The current segment's pixels are updated
                according to the neighboring pixels of the
                source segment, and contiguous runs of
                modified, equivalent pixels form a list of
                dirty ranges.

2. Horizontal propagation: The segment's pixels are updated, considering

only their left and right neighbors, using the two-pass process
described above. Though semantically equivalent, this process
slightly more intricate than the one described above, since we use the
dirty ranges from the previous step to take a few shortcuts.
Recall that all pixels of a single dirty range are equal, and
therefore, unless modified as part of the current pass, don't affect
each other's state. On the other hand, all pixels outside any dirty
range didn't change, and therefore, unless modified as part of the
current pass, don't affect each other's state either. As a result,
initially, only the pixels that directly neighbor a dirty range, in the
direction of the pass, need to be updated. If the water level of such
pixel changes, we need to update the following pixel, and so on. Once
the water level of a pixel remains the same, we don't have to update
the next pixel, but can rather jump directly to the pixel at the edge
of the next dirty range, and so on.
For example, when scanning left-to-right, we start at the pixel
directly to the right of the first (leftmost) dirty range. We apply
the transformation rule to this pixel, and to the pixels to its right,
until the water level of one of them is unaffected. At this point, we
jump directly to the pixel to the right of the next dirty range.
           - -+---+---+---+---+---+---+---+---+---+---+---+- -
              |   |   |   | 1 | 2 |(3)|   |   | 4 |(5)|   |
           - -+---+---+---+---+---+---+---+---+---+---+---+- -
                  |-------|               |---|
                Pixel traversal order on a left-to-right
                pass.  Traversal starts to the right of
                the first dirty range, at pixel `1`.
                Pixel `(3)` is unaffected, and so we jump
                directly to the right of the second dirty
Of course, when scanning right-to-left, it all reverses, and we start
to the left of the last (rightmost) dirty range, etc.
During each pass, we extend the dirty ranges, in the direction of the
scan, to include the newly modified pixels. Note that, while scanning
a sequence of pixels next to one of the dirty ranges, we may reach the
edge of the next range. In such case, we keep scanning the pixels of
the second range, but we don't extend the previous range any further,
so that the two ranges meet, but don't overlap.
                - -+---+---+---+---+---+---+---+---+---+---+---+- -
                   |   |   |   | 1 | 2 |(3)|   |   | 4 | 5 | 6 |
                - -+---+---+---+---+---+---+---+---+---+---+---+- -
      Original         |-------|               |---|   |---|
      Extended         |---------------|       |-------|-------|
                The dirty ranges are extended, in the
                direction of the scan, to include the
                newly modified pixels. The scan can
                "leak" into existing ranges (notice the
                third range in the illustration), in which
                case the previous range is only extended
                as far as the leaked-into range.
Note that the rightmost and leftmost ranges may be extended past the
bounds of the segment, during the left-to-right and right-to-left
passes, respectively (recall that a segment doesn't necessarily span an
entire row.)
Also note that, if a dirty range is extended, or if its existing pixels
are modified, during the first, left-to-right, pass, then it's possible
that its water level will not be uniform come the second, right-to-
left, pass; this seems to break our assumption about the state of the
dirty ranges, which allowed us to take the shortcut described above.
This shortcut is still valid on the second pass, though. It turns out
that we only need the ranges to meet a weaker condition -- it's enough
for the water level of the pixels of each dirty range to be
monotonically decreasing in the direction of the scan (right-to-left,
in our case). This condition is still met at the end of the first
One final detail: each dirty range has an associated `modified` flag,
which is initially cleared. If, during the above process, the range is
extended, or its existing pixels are modified, then its `modified` flag
is set. This flag is used by the next step.

3. Distribution: The changes to the current segment's water level may

affect the two neighboring rows. For each dirty range, we push two new
segments into the queue -- one for each neighboring row -- using the
current row as their source segment.
There's one exception to this, however: if a dirty range hasn't been
modified during the horizontal propagation step, i.e., if its
`modified` flag is clear, then it necessarily doesn't affect the
neighboring pixels of the source segment. Hence, in this case, we can
avoid pushing a corresponding segment for the row of the source

                 +---+---+---+---+   .   .   .   +---+---+   .   .
        Source   |   |   |   |   |               |   |   |
                 +---+---+---+---+---+---+---+---+---+---+   .   .
        Current          |   |   |   |   |   |   |   |   |
                 +---+---+---+---+---+---+---+---+---+---+   .   .
                 |   |   |   |   |       |   |   |   |   |
                 +---+---+---+---+   .   +---+   +---+---+   .   .
                 |---------------|       |---|   |-------|
                     Modified                    Modified
                 New segments, corresponding to the dirty
                 ranges, are pushed into the queue for each
                 of the current segment's neighboring rows.
                 No segments are pushed for the row of the
                 source segment for non-modified dirty
To amortize the cost of maintaining and processing multiple separate
segments, dirty ranges that are separated by a small-enough gap are
coalesced into a single range prior to this step; the gap between the
ranges, if exists, becomes part of the coalesced range; the `modified`
flag of the coalesced range is the logical-OR of the `modified` flags
of the individual ranges.

Start and Termination

Recall that segments are pushed into the queue as a result of a change in the water level of a neighboring segment. To kick this process off, we pretend that the water level outside the image instantaneously dropped from 1 to 0, and push four segments, referred to as the "seed segments", corresponding to the four edges of the image (there may, in fact, be less than four seed segments, if the image is 1- or 2-pixel wide or high.) The source segment of the seed segments, hypothetically, lies outside the image; in particular, the water level of the neighboring pixels in the vertical propagation step is taken to be 0 for the seed segments.

                   |                                   |
                   |   |                           |   |
                   |   |                           |   |
                   |   |                           |   |
                   |   |                           |   |
                   |   |                           |   |
                   |   |                           |   |
                   |   |                           |   |
                   |                                   |

Figure: The four seed segments -- one for each edge of the image.

The process terminates when there are no more segments left in the queue. At this point, the automaton has converged, and the water level corresponds to the output of the flood operation.

Coordinate Systems

As mentioned above, segments can be either horizontal or vertical, but are treated internally as horizontal. Additionally, the region-of-interest (ROI) of the operation might not span the entire image; in this case, the operation is performed on the ROI in isolation, and what we've been calling the "image" up until now is in fact the ROI (in particular, the ground level outside the ROI is considered to be 0, even if the input image isn't completely black outside the ROI.)

To deal with this, we employ three coordinate systems:

  • Image-physical: This is the "real" coordinate system of the operation,
used when talking to the outside world (i.e., GEGL). Its origin is at
the top-left corner of the image, its x-axis points right, and its y-
axis points down.
  • Image-virtual: This is the same as the image-physical coordinate
system, except that the x- and y-coordinates are swapped when dealing
with vertical segments. In other words, when processing a vertical
segment, we pretend that image is transposed (i.e., reflected along the
south-east diagonal). We transform to/from this coordinate system on
the boundary between GEGL and the rest of the algorithm.
  • ROI-virtual: This is the same as the image-virtual coordinate system,
except that its origin is translated to the top-left corner of the ROI.
Internal coordinates, that aren't communicated to GEGL, are given in
this coordinate system.
     x                        y
   +----> - - - - - - -+    +----> - - - - - - -+    +- - - - - - - - - -+
 y |                   |  x |                   |    |     y             |
   |   +- - - - -+          |   +- - - - -+              +----> - -+
   v   |         |     |    v   |         |     |    | x |         |     |
           ROI                      ROI                  |   ROI
   |   |         |     |    |   |         |     |    |   v         |     |
       +- - - - -+              +- - - - -+              +- - - - -+
   |                   |    |                   |    |                   |
   +- - - - - - - - - -+    +- - - - - - - - - -+    +- - - - - - - - - -+
            (a)                      (b)                      (c)
                 The three coordinate systems: (a) image-
                 physical, (b) image-virtual (here shown
                 transposed), and (c) ROI-virtual.

To sum it up, internal coordinates (e.g., the y-coordinate of the current segment, or the x-coordinates of the dirty ranges) are given in the ROI- virtual coordinate system. Coordinates of `GeglRectangle`s (such as the ROI rectangle, or the rectangles used when reading and writing to the GEGL buffers) are given in the image-virtual coordinate system, but are transformed to/from the image-physical coordinate system before being passed-to/received-from GEGL.

Return to Index