Flood algorithm
Author: Ell
Year: 2016
Abstract
The flood operation eliminates "holes" -- darker areas surrounded by lighter areas -- in single-component (grayscale) images. It is particularly useful for eliminating such holes from selections; see GIMP bug #761060 for more details.
The conceptual model considers the input image as the height-map of a hilly terrain. After heavy rain completely floods the terrain, the remaining water form lakes in its depressions.
_______ /.\ /\_____________ ____________/...\ /..\ Water /..\ /....\ /.....\/....\__ /....\____ /......\____/....Ground.....\___/......\ /\ __/........................................\/..\_
Figure: A depiction of a one-dimensional flood. Valleys correspond to "holes" in the input image, filled with "water" according to their surrounding.
The result of the flood operation is the height-map of the terrain after the flood, taking both ground- and water-level into account.
More formally, the flood operation assigns to each pixel the minimum of the maximal input-image values along all paths from the pixel to the "outside". That is, the output value o(x) at pixel 'x' is given by
<math display="block">o(x) = \min_{p \in P(x)} \max_{y \in p} i(y)</math>,
where P(x) is the set of all paths from x to the outside, and i(y) is
the value of the input image at pixel y.
Algorithm
In accord with the conceptual flood model, we refer to the values of the input image as the "ground level", and to the values of the output image as the "water level"; these values range from 0 to 1, and are considered to be 0 outside the bounds of the image. Note not to confuse "water level" with "water depth"; we use the term "water level" simply to refer to the elevation of either the ground or the water at a certain point.
Our starting point is modeling the problem as a cellular automaton. The state of each cell (pixel) is its current water level, which is initially 1 everywhere inside the image. The water level at each cell is updated according to the rule
<math display="block">w_{n+1} (x) = \max { g(x), \min_{y \in N(x)} w_n (y) }</math>,
where w_{n}(x) is the water level at pixel x on generation n, g(x) is the ground level at x, and N(x) is the set of (orthogonal) neighbors of x, including itself. In other words, the new water level at each pixel is the maximum of its ground level, and the minimum of its own water level, and that of its neighbors. This automaton converges to the output of the operation.
The automaton converges after, at most, n generations, where n is the number of pixels in the image. Therefore, a naive implementation, where at most O(n) cells are processed on each generation, has a worst-case time complexity of O(n^{2}). By making a few observations, we can do better, at least in the most common cases.
First, note that the final state doesn't depend on the order of the individual steps. That is, we don't actually have to update the water level an entire generation at a time, but rather we can apply the transformation rule to any pixel, at any time, arbitrarily, until convergence. Furthermore, we don't even have to consider all the neighbors of a pixel each time we apply the rule: as long as we make sure to never increase the water level of a pixel, i.e., as long as we consider the pixel's own water level as part of the minimum term, we can take a different subset of the neighbors into account each time.
Second, using the above observation, note that we can solve a one- dimensional automaton (i.e., compute its final state) in linear time, using two passes: On the first pass, we iterate over the pixels left-to-right, applying the transformation rule while considering only the left neighbor of each pixel (using the water level assigned to the neighbor on the previous iteration; recall that the water level of the left neighbor of the leftmost pixel of the image is considered to be 0.) On the second pass, we work in reverse -- we iterate over the pixels right-to-left, applying the rule while considering only the right neighbors.
_________________________________________________ /.\ /\ __ ____ /...\ /..\ /..\ (a) /....\ /.....\/....\__ /....\ /......\____/...............\___/......\ /\ __/........................................\/..\_ ______________________________ /.\ /\ __ ____________/...\ /..\ /..\ (b) /....\ /.....\/....\__ /....\ /......\____/...............\___/......\ /\ __/........................................\/..\_ _______ /.\ /\_____________ ____________/...\ /..\ /..\ (c) /....\ /.....\/....\__ /....\____ /......\____/...............\___/......\ /\ __/........................................\/..\_
Figure: Water level of a one-dimensional automaton. (a) initially; (b) after the first pass; (c) after the second pass.
While this technique doesn't extend directly to two dimensions, we can leverage it by processing one-dimensional strips of pixels in batch, as described above, instead of transforming pixels individually.
Finally, another obvious way to speed things up is to minimize the amount of unnecessary work we're doing. In particular, we only need to process pixels whose neighbors' state changed.
Taking all of the above into consideration, this is what we do:
We maintain a queue of "segments" -- one-dimensional, contiguous, strips of pixels -- whose water level needs to be updated, as a result of a change in the water level of the pixels of a neighboring segment, referred to as the "source segment". Although, for efficiency reasons, we allow segments to be either horizontal or vertical, for simplicity, we treat all segments as though they're horizontal, and perform the necessary coordinate-system transformation when dealing with vertical segments, as explained later.
Each segment is processed using the following steps:
1. Vertical propagation: The segment's pixels are updated, using the above transformation rule, considering only the corresponding
- neighboring pixels of the source segment. During the process, we inspect which of the segment's pixels actually changed, and create a
- list of "dirty ranges" of modified pixels. We construct the ranges such that all pixels of each range have the same water level; this
- becomes important in the next step.
- - -+-----+-----+-----+-----+-----+-----+-----+- - - Source | | | | | | | | Segment | | | | | | | | | | | | | | | - - -+--|--+--|--+--|--+--|--+--|--+--|--+--|--+- - - Current | V | V | V | V | V | V | V | Segment | | x | x | | y | z | | - - -+-----+-----+-----+-----+-----+-----+-----+- - - Dirty ranges |-----------| |-----|-----| The current segment's pixels are updated according to the neighboring pixels of the source segment, and contiguous runs of modified, equivalent pixels form a list of dirty ranges.
2. Horizontal propagation: The segment's pixels are updated, considering only their left and right neighbors, using the two-pass process
- described above. Though semantically equivalent, this process slightly more intricate than the one described above, since we use the
- dirty ranges from the previous step to take a few shortcuts.
- Recall that all pixels of a single dirty range are equal, and therefore, unless modified as part of the current pass, don't affect
- each other's state. On the other hand, all pixels outside any dirty range didn't change, and therefore, unless modified as part of the
- current pass, don't affect each other's state either. As a result, initially, only the pixels that directly neighbor a dirty range, in the
- direction of the pass, need to be updated. If the water level of such pixel changes, we need to update the following pixel, and so on. Once
- the water level of a pixel remains the same, we don't have to update the next pixel, but can rather jump directly to the pixel at the edge
- of the next dirty range, and so on.
- For example, when scanning left-to-right, we start at the pixel directly to the right of the first (leftmost) dirty range. We apply
- the transformation rule to this pixel, and to the pixels to its right, until the water level of one of them is unaffected. At this point, we
- jump directly to the pixel to the right of the next dirty range.
- -+---+---+---+---+---+---+---+---+---+---+---+- - | | | | 1 | 2 |(3)| | | 4 |(5)| | - -+---+---+---+---+---+---+---+---+---+---+---+- - |-------| |---| Pixel traversal order on a left-to-right pass. Traversal starts to the right of the first dirty range, at pixel 1. Pixel (3) is unaffected, and so we jump directly to the right of the second dirty range.
- Of course, when scanning right-to-left, it all reverses, and we start to the left of the last (rightmost) dirty range, etc.
- During each pass, we extend the dirty ranges, in the direction of the scan, to include the newly modified pixels. Note that, while scanning
- a sequence of pixels next to one of the dirty ranges, we may reach the edge of the next range. In such case, we keep scanning the pixels of
- the second range, but we don't extend the previous range any further, so that the two ranges meet, but don't overlap.
- -+---+---+---+---+---+---+---+---+---+---+---+- - | | | | 1 | 2 |(3)| | | 4 | 5 | 6 | - -+---+---+---+---+---+---+---+---+---+---+---+- - Original |-------| |---| |---| Extended |---------------| |-------|-------| The dirty ranges are extended, in the direction of the scan, to include the newly modified pixels. The scan can "leak" into existing ranges (notice the third range in the illustration), in which case the previous range is only extended as far as the leaked-into range.
- Note that the rightmost and leftmost ranges may be extended past the bounds of the segment, during the left-to-right and right-to-left
- passes, respectively (recall that a segment doesn't necessarily span an entire row.)
- Also note that, if a dirty range is extended, or if its existing pixels are modified, during the first, left-to-right, pass, then it's possible
- that its water level will not be uniform come the second, right-to-left, pass; this seems to break our assumption about the state of the
- dirty ranges, which allowed us to take the shortcut described above.
- This shortcut is still valid on the second pass, though. It turns out that we only need the ranges to meet a weaker condition -- it's enough
- for the water level of the pixels of each dirty range to be monotonically decreasing in the direction of the scan (right-to-left,
- in our case). This condition is still met at the end of the first pass.
- One final detail: each dirty range has an associated modified flag, which is initially cleared. If, during the above process, the range is
- extended, or its existing pixels are modified, then its modified flag is set. This flag is used by the next step.
3. Distribution: The changes to the current segment's water level may affect the two neighboring rows. For each dirty range, we push two new
- segments into the queue -- one for each neighboring row -- using the current row as their source segment.
- There's one exception to this, however: if a dirty range hasn't been modified during the horizontal propagation step, i.e., if its
- modified flag is clear, then it necessarily doesn't affect the neighboring pixels of the source segment. Hence, in this case, we can
- avoid pushing a corresponding segment for the row of the source segment.
+---+---+---+---+ . . . +---+---+ . . Source | | | | | | | | +---+---+---+---+---+---+---+---+---+---+ . . Current | | | | | | | | | +---+---+---+---+---+---+---+---+---+---+ . . | | | | | | | | | | +---+---+---+---+ . +---+ +---+---+ . . |---------------| |---| |-------| Modified Modified New segments, corresponding to the dirty ranges, are pushed into the queue for each of the current segment's neighboring rows. No segments are pushed for the row of the source segment for non-modified dirty ranges.
- To amortize the cost of maintaining and processing multiple separate segments, dirty ranges that are separated by a small-enough gap are
- coalesced into a single range prior to this step; the gap between the ranges, if exists, becomes part of the coalesced range; the modified
- flag of the coalesced range is the logical-OR of the modified flags of the individual ranges.
Start and Termination
Recall that segments are pushed into the queue as a result of a change in the water level of a neighboring segment. To kick this process off, we pretend that the water level outside the image instantaneously dropped from 1 to 0, and push four segments, referred to as the "seed segments", corresponding to the four edges of the image (there may, in fact, be less than four seed segments, if the image is 1- or 2-pixel wide or high.) The source segment of the seed segments, hypothetically, lies outside the image; in particular, the water level of the neighboring pixels in the vertical propagation step is taken to be 0 for the seed segments.
+-----------------------------------+ | | +---+---------------------------+---+ | | | | | | | | | | | | | | | | | | | | | | | | | | | | +---+---------------------------+---+ | | +-----------------------------------+
Figure: The four seed segments -- one for each edge of the image.
The process terminates when there are no more segments left in the queue. At this point, the automaton has converged, and the water level corresponds to the output of the flood operation.
Coordinate Systems
As mentioned above, segments can be either horizontal or vertical, but are treated internally as horizontal. Additionally, the region-of-interest (ROI) of the operation might not span the entire image; in this case, the operation is performed on the ROI in isolation, and what we've been calling the "image" up until now is in fact the ROI (in particular, the ground level outside the ROI is considered to be 0, even if the input image isn't completely black outside the ROI.)
To deal with this, we employ three coordinate systems:
- Image-physical: This is the "real" coordinate system of the operation, used when talking to the outside world (i.e., GEGL). Its origin is at
- the top-left corner of the image, its x-axis points right, and its y-axis points down.
- Image-virtual: This is the same as the image-physical coordinate system, except that the x- and y-coordinates are swapped when dealing
- with vertical segments. In other words, when processing a vertical segment, we pretend that image is transposed (i.e., reflected along the
- south-east diagonal). We transform to/from this coordinate system on the boundary between GEGL and the rest of the algorithm.
- ROI-virtual: This is the same as the image-virtual coordinate system, except that its origin is translated to the top-left corner of the ROI.
- Internal coordinates, that aren't communicated to GEGL, are given in this coordinate system.
x y +----> - - - - - - -+ +----> - - - - - - -+ +- - - - - - - - - -+ y | | x | | | y | | +- - - - -+ | +- - - - -+ +----> - -+ v | | | v | | | | x | | | ROI ROI | ROI | | | | | | | | | v | | +- - - - -+ +- - - - -+ +- - - - -+ | | | | | | +- - - - - - - - - -+ +- - - - - - - - - -+ +- - - - - - - - - -+ (a) (b) (c) The three coordinate systems: (a) image- physical, (b) image-virtual (here shown transposed), and (c) ROI-virtual.
To sum it up, internal coordinates (e.g., the y-coordinate of the current segment, or the x-coordinates of the dirty ranges) are given in the ROI- virtual coordinate system. Coordinates of GeglRectangles (such as the ROI rectangle, or the rectangles used when reading and writing to the GEGL buffers) are given in the image-virtual coordinate system, but are transformed to/from the image-physical coordinate system before being passed-to/received-from GEGL.