The general definition of the Generalized Hough Transform, is that it's an extension of the "standard" Hough transform, used to detect shapes which cannot be described by simple analytical formulae. Instead, a more complex representation is used, e.g. matrices or other mathematical constructs representing patterns, shapes or vectors.
Without dwelling into too much detail, at a computational level the Generalized Hough Transform rougly consists in making each examined pixel in an image to "project" a copy of the searched pattern at various angles and scales, (usually, projection and comparison takes place starting from the center of a certain object, but it's possible to start elsewhere under special conditions) and then keeping track of how many pixel matches for a given scale and angle occured between the "projection" and the tested image. Of course, this can be quite a challenge to implement with limited memory and computational resources, as the problems faced are simply too many (e.g. need to keep track of separate voting spaces for various scales, including orientation information, performing comparisons for different orientations and scales).
The resulting "Generalized" Hough spectrum is generally a grayscale image ridden with shadowy ghost images of the searched object, projecting around the original image's contrours at various scales and orientations, with some points or groups of points "brighter " than the others. These are the most likely centers (or starting points) of the searched images. Even if it sounds strange, this is true for any image shape examined. Of course, a more complex object such as a teapot or a human siluette would create much more bizzare spectra, but the underlying principles would remain exactly the same.
|
![]() |
| Image: a Generalized Hough Transform spectrum, for circles with a diameter of 40 pixels. The possible circle centers stand out as brighter dots. A separate spectrum needs to be calculated for every diameter (or "scale" ) examined. Also, the bright white "glob" at the bottom right is created from a thick non-circular closed shape, and will lead to false detections. |
The most general algorithm definition says to do exactly that: creating a special reference data structure (usually a binary image, in the form of a table, called R-Table) and essentially comparing its "boundary" or "contour" with groups of pixels (having a central or boundary pixel for reference). Of course, the code must handle all the necessary rotations and scalings while doing this.
For circle detection, things are made a bit easier for two reasons:
Here rise a few questions: why drawing a circle for every supposed edge pixel (that would be, each white pixel in the binary image) instead of just checking if a circle exists AROUND each pixel?
This can be done, but FIRST we need to know where a probable center for a circle is...which we don't (a center pixel doesn't have to be white). And, unsuprisingly, checking for a possible circle around EVERY pixel a binary image, is just a redundant complement to drawing a circle only around the "white" pixels. In fact, the final voting spectrum looks surprisingly similar, only done in a much longer time, since every pixel has to be checked.
A more interesting variation would be, for each pixel encountered, trying to follow a circle, and see how probably this pixel is part of a certain circle, with given radius. Unluckily, since we don't know on what part of a circle we are starting or what radius the supposed "circle" is, we would have to check for all possible projections and orientations. That equates, unsurprisingly, to applying the GHT with no optimizations whatsoever.
Now...a top-to-bottom and left-to-right scanning of our image should, in theory, leave us with "just" the orientations from -90 to +90 degrees to check, thanks to the circle's symmetry.
However, even with this "relief", the extra computational load and the quantization noise and distortion involved in rotating binary images would make this approach pointless and with no real advantage, and it would defeat the simplicity of avoiding orientation checks for circles.
Also, in the end it would gives us exactly the same Hough spectrum, supposing we didn't suppress "insufficient" votes for a certain pixel (something which would not be a good idea, anyway).
It also wouldn't help us "rule out" groups of pixels earlier by marking them as part of a circle. If we did that, then the detection of concentric, merged, crossing or enclosed circles would be quite problematic.
A cleverer variation of this approach however is followed in the "EHT" method used in this program, which detects possible circle centers first and then does pixel-by-pixel comparisons (which is exactly the opposite of the generalized hough transform approach!).