# Angling for Angles

In our previous post, we discussed enhancing edges, sometimes known as “edge detection” – although, as we said, that is a slightly misleading term. In this post we will explore one of the more interesting image processing algorithms: one that finds the actual locations of lines in an image, along with their slopes.

The algorithm we will explore is the “Hough Transform”, and I am always amazed that someone (Paul Hough, for whom it is named) could come up with a method such as this one. Many books don’t explain it thoroughly, so I will attempt to pick it apart so that you can really get the hang of it.

First, recall that a line can be represented by the formula y = mx + b. Ok, now forget that, because that formula doesn’t work when the line is vertical – the slope becomes infinite and the representation breaks down – technically known as a singularity.

So, what’s another way to represent a line? Well, I will jump right to the way the Hough Transform thinks about lines. There are three main parts to this representation: first, the point at the origin of the image, (0,0); second, a line segment of a certain distance (we’ll call it r) that starts at the origin; and third, an angle we’ll call theta (θ), which is the amount r is rotated so that it is perpendicular to the line we are defining – i.e. the line we are looking for. So, a theoretically infinite line can be uniquely identified by the origin, r, and theta – and actually all we need are theta and r, because the origin can be assumed to be (0,0) throughout.

So, as we can see in the figure below, we have our original white line (let’s pretend we don’t see it) in our image that we’re trying to locate. If we knew where it was, we could describe it by the angle θ and the line r.

So, we can now define lines two different ways – in a coordinate space based on x and y values, i.e. the Cartesian system, or by angle and distance from the origin. How does this help us, though? Well, what Mr. Hough came up with was an algorithm that is both sophisticated and “brute force”, if that’s possible.

What the algorithm does is walk through every pixel in the image, looking for a pixel that could be part of a line. In the case of our image above, that means any pixel that is not black. So, in our image we look through the first row and we find the white pixel at x = 60, y = 0.

Now, recall that we don’t really know where the white line is – we’re looking for it. So, we start by creating lines that pass through this point and calculating the r value for each line.

We can start by looping θ from 0 to 180 degrees. We know that x (the column) is 60, and y (the row) is 0.

So, let’s compute r for this point with zero rotation:

r = 60 cos(0) + 0 sin(0) = 60
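We can check these numbers with a quick script. This is a Python sketch (the post’s own code is Matlab), and `hough_r` is just an illustrative name for the r = x cos(θ) + y sin(θ) formula applied to the example pixel:

```python
import math

# r for the line through (x, y) whose perpendicular from the origin
# makes angle theta (in degrees) with the x axis
def hough_r(x, y, theta_deg):
    t = math.radians(theta_deg)
    return x * math.cos(t) + y * math.sin(t)

# the example pixel (60, 0), at the rotations discussed below
for theta in (0, 10, 45):
    print(theta, round(hough_r(60, 0, theta), 3))
# → 0 60.0, 10 59.088, 45 42.426
```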

So, the first potential line candidate is perpendicular to our horizontal line at (60,0):

So, as we said, we can represent lines using an x, y coordinate system, or by the angle theta and the line r. So, let’s make a new two-dimensional coordinate system, with one dimension theta and the other the length of r. Since we’re doing image processing, we can create an “image” to hold this data, with one dimension being theta and the other r. This type of image is known as the “Hough Space”. We are transforming points in the Cartesian space into information in the Hough Space. This information consists of “votes” for an angle and an r distance: each “pixel” in the Hough Space holds the sum of the votes for a candidate line with that theta and r. Above we have our first candidate line, so in the Hough Space we add 1 vote at (theta, r) = (0, 60).

Hough Space (0,60,1) – angle 0, r 60, vote 1

Now, we continue this process for the next degree of theta, 1, and then 2, etc. Let’s move ahead to 10 degrees.

Now recall, we’re still on the first pixel that we found – the one at (60,0) – and this is the brute force part of the algorithm. For each degree from 0 to 180, we’re going to add a vote at the corresponding (theta, r) location in the Hough Space.

So, here we get a vote for (10, 59.09, 1).

So still staying at this first pixel location, we continue this process for each degree step. Since we already know our answer, let’s skip ahead to theta = 45 degrees:

Here we can see that the line is coincident with the line we’re looking for – and so we have our Hough Space vote at (45,42.42,1).

Now, we can see the pattern. As we “find” the white pixels in our line, we spin the blue line around, and each time the blue line is coincident with the line we are looking for, the votes “add up” at that point – whereas the other positions of the blue line generate smaller numbers of votes. Let’s try another white pixel on the 45 degree line – let’s pick x = 48 and y = 12.

We calculate r as 48 cos(45) + 12 sin(45)

and sure enough we get 42.42, so we have another vote for Hough Space (45, 42.42), which means the vote count at that coordinate increases by 1. So in the Hough Space image, in the row for r = 42.42 and the column for theta = 45, we will have 2 votes from the two pixels we have examined so far.
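The voting procedure just described can be sketched in a few lines. This is a Python sketch (the post’s own code is Matlab); the two pixel coordinates are taken from the example line above, and the accumulator is an ordinary dictionary rather than an image:

```python
import math
from collections import Counter

# accumulate (theta, r) votes for two pixels that lie on the example line x + y = 60
votes = Counter()
for (x, y) in [(60, 0), (48, 12)]:
    for theta in range(180):
        t = math.radians(theta)
        r = round(x * math.cos(t) + y * math.sin(t))
        votes[(theta, r)] += 1

print(votes[(45, 42)])  # both pixels vote in the same bin → 2
```

Most other (theta, r) bins receive a vote from only one of the two pixels; the bin for the true line is where the votes pile up.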

Now let’s run a version of the Hough Transform on our image, and see the resulting Hough Space image:

Here we see a small blob; the brightness of each pixel represents the number of votes for that r and theta. Let’s look at the underlying data in the image:

Here the columns represent the angles, and the rows represent the distance r. The value in each entry is the number of votes, which becomes the brightness in the resulting Hough Space image.

The image data in the spreadsheet view above has been normalized to between 0 and 1.0, so our “brightest” spot, a 1.0, is at 45 degrees! But wait – the distance r, as seen in the row numbers, is 44, and we had calculated 42.42. It turns out that there is a fair amount of inaccuracy in the Hough Transform, and special care needs to be taken to improve it. So, we are getting a very good result on the angle, but due to rounding and other types of errors, our r values are not quite exact. There are methods for improving these results – see the book mentioned at the end of this post for variations that can improve accuracy.

The final step of the Hough Transform is to walk through the Hough Space image and find the maximum (or maxima) – and you will have your line or lines. How one does this depends on precisely what you are trying to accomplish. Perhaps you are looking for lines with specific angles, or at specific locations.
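Once voting is complete, extracting the strongest line is just a search for the largest accumulator entry. A minimal Python sketch, assuming the votes live in a dictionary keyed by (theta, r) as in the earlier voting sketch (the sample dictionary contents here are illustrative, not taken from the real image):

```python
# return the (theta, r) bin with the most votes, plus its count
def strongest_line(votes):
    (theta, r), count = max(votes.items(), key=lambda kv: kv[1])
    return theta, r, count

votes = {(45, 42): 37, (44, 42): 9, (10, 59): 3}
print(strongest_line(votes))  # → (45, 42, 37)
```

To find multiple lines, you would repeat the search, masking out a neighborhood around each peak you accept.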

This covers the basics of the Hough Transform for locating lines. There are many variations on the Hough Transform – variations for finding circles, ellipses, and even general “shapes”. For more detail an excellent book is “Feature Extraction and Image Processing” by Mark S. Nixon and Alberto S Aguado.

Rick Frank

Dominion Software, Inc.

Hough Space Image of the Construction image in the first Edge Detection Post:

Matlab Code:

```matlab
function [ houghImage ] = lineHough( image )
%lineHough Create a line Hough transform image from the input image.

[rows, cols] = size(image);

% Our maximum radius is the euclidean distance across the image.
radiusMax = round(sqrt(rows * rows + cols * cols));

thresh = max(max(image)) / 2;

% Allocate our Hough image space (rows: r, columns: theta).
houghImage = zeros(radiusMax, 180);

for row = 1:rows
    for col = 1:cols
        % Did we find a potential line pixel?
        if image(row, col) > thresh
            for theta = 1:180
                r = round(col * cos(theta * pi/180) + row * sin(theta * pi/180));
                if (r < radiusMax && r > 0)
                    houghImage(r, theta) = houghImage(r, theta) + 1;
                end
            end
        end
    end
end

% We have our image - scale it between 0.0 and 1.0 so that it
% is pleasing to the eye.
houghImage = houghImage / max(max(houghImage));

end
```
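For readers who would rather experiment without Matlab, here is a rough Python port of the same brute-force loops, using only the standard library. The image is a list of rows of numbers, and the accumulator is indexed as [r][theta]; like the Matlab version, theta runs from 1 to 180 degrees (the normalization step is omitted here):

```python
import math

# a Python sketch of the lineHough function: vote for (theta, r)
# for every above-threshold pixel in the image
def line_hough(image):
    rows, cols = len(image), len(image[0])
    # our maximum radius is the euclidean distance across the image
    radius_max = round(math.hypot(rows, cols))
    thresh = max(max(row) for row in image) / 2
    acc = [[0] * 181 for _ in range(radius_max + 1)]  # acc[r][theta]
    for row in range(rows):
        for col in range(cols):
            # did we find a potential line pixel?
            if image[row][col] > thresh:
                for theta in range(1, 181):
                    t = math.radians(theta)
                    r = round(col * math.cos(t) + row * math.sin(t))
                    if 0 < r < radius_max:
                        acc[r][theta] += 1
    return acc
```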

# Looking at the edge of Edge Detection – Part 1.

If you’ve done a bit of image processing, or read about it, you’ve probably run into the concept of “edge detection”. In the first part of this article I’ll take a closer look at the basics of edge detection, and explore in more detail what’s going on in those little boxes of numbers.

When people hear of the concept of edge detection in images, they sometimes think of finding objects in images and obtaining the actual coordinates of those objects. However, the various algorithms called edge “detection” would probably better be called edge “enhancement”. These algorithms amplify or mark the places where the values in an image change, and the result is a new image showing the locations of these changes – the “edges” in the image. So, the “detection” part does not include creating a map of coordinates of various regions; other algorithms are needed for that task. However, edge detection is a useful tool in preparing images for location detection, and for enhancing the lines, edges, and corners in images for other image processing tasks.

If we start with a very simple example, it is easier to understand the basic idea for edge enhancement. Here we have an image with a single, simple edge, where the white section meets the black section in the middle:

We will use the convention that the value of the color black on the left is 0, and the value of the white on the right is 1.  Let’s take this image down to 1 dimension, and look at it from the edge:

Here we can see the black area as all zeros, and the white area as all ones. To detect where that change happens, we have to bring in the dreaded “C” word, Calculus.  But, we really only need to know a bit about that topic to understand how to find the edge of the image.

We are looking for a change – a difference – in the image, and we want to note where those differences occur. The ideas of Calculus help us find this difference. We can move along the data from left to right, one step at a time, looking for these changes.

To be clear, we have a discontinuous function,  as seen by the big jump between 0 and 1. Calculus doesn’t like to deal with this type of data. However, we are in the computer world,  where we always have to deal with these types of discontinuities between values.

The concepts we will borrow from Calculus are the derivative, which is the rate of change, and the limit of the step size as it approaches zero. The rate of change is how fast our data is changing – how much change versus how many steps. Taking the limit of the step size simply means shrinking the step to its smallest theoretical value, which gives us the most accurate measurement. In our computer world, we have a simpler model: our step size is just 1 pixel, and as we step across the data we can estimate the rate of change across small sections of it. This approach is known as Finite Differences, and since our world inside the computer has finite steps and step sizes, it is the concept we will use.

Imagine walking along inside a building, writing down the difference in your position vertically in the building as you walk.  As you start to walk your position is 0 feet off the floor, and you look ahead and measure the difference between where you are and where you will be the next step forward. As you move along the floor for four steps, say, you note at each step that there is no difference in the height between where you are and where you will be, so you write down a 0 difference for that step. Then you reach a stair, and you look forward and measure the difference between the height of the  first stair and the floor. Let’s say it’s 1 foot. The difference is 1 (where you will be) – 0 (where you are now). So you write 1 down and then take a step.  The floor is level after this first step  and so as you continue along there is no change in your vertical position, so you write down zeros.  So, now your notebook has something like the following written in it:

0 0 0 0 1 0 0 0 0

Congratulations! You detected the edge in the floor in the building! I hope you didn’t trip over it.

In more mathematical terms, you’ve taken the “Forward Difference” of the data along your path. This can be written as:

(Height(where you are + 1) – Height(where you are)) / Step Size

Here, our step size is conveniently just “1” step, so we can just pay attention to the top part of the fraction.

It turns out, however, that this method is not quite as accurate as it could be. If you are measuring a surface that has a rapidly changing shape, rather than our simple floor,  your measurements of the change will be limited by the size of your step.  Using our walking example, you could improve your estimate if as you go along you stop and look forward,  and then staying where you are look backwards, and then average these two measurements. This is known as a “Centered Difference”, and it is a bit intuitive that this kind of measurement might be a better estimate. It’s like the phrase “splitting the difference.”

(Height(where you are + 1) – Height(where you are – 1)) / (2 x Step Size)

There is also a “Backward Difference”, which as you can imagine takes the measurement where you are and the measurement in back of you and takes the difference between them. These Finite Differences provide an estimate of the first derivative of our data, i.e.  the rate of change of the data.
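The three flavors of finite difference can be written directly from the formulas above. A Python sketch with a unit step size, using the staircase walk as the data (the function names are mine, for illustration):

```python
# forward, backward, and centered differences with step size 1
def forward_diff(h, i):
    return h[i + 1] - h[i]

def backward_diff(h, i):
    return h[i] - h[i - 1]

def centered_diff(h, i):
    return (h[i + 1] - h[i - 1]) / 2

# the walk along the floor with one 1-foot stair
floor = [0, 0, 0, 0, 1, 1, 1, 1]
print([forward_diff(floor, i) for i in range(len(floor) - 1)])
# → [0, 0, 0, 1, 0, 0, 0]
```

Note that the centered difference spreads the stair across two positions (0.5 at each), which is the “splitting the difference” behavior described above.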

If you wish to see a more detailed discussion of Finite Differences, you can find more information at http://en.wikipedia.org/wiki/Finite_difference or see the book “Computational Science and Engineering” by Gilbert Strang.

So, we now have a method – the Centered Difference  – to estimate the change in our data as we move along through it.

So, let’s apply this to our data. If we take a section of our “edge” data, we have an array like this:

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

So, how are we going to efficiently apply the Centered Difference to this data?

In image processing, there is the concept of a “kernel”, which is a small and usually square block, or matrix, of numbers used to step along an image and provide a new pixel based on the values in the kernel and the values in the pixels that are covered by the kernel as it moves along. So, for instance, a kernel might conceptually look like this:

-1 0 1
-1 0 1
-1 0 1

Here we have a 3 by 3 matrix of numbers. But, for the moment, we are in one dimension – so, let us take one row from this matrix:

 -1 0 1

This row of numbers can be used to quickly move through our data and give us a Centered Difference. Here is our centered difference formula again:

(Height(where you are + 1) – Height(where you are – 1)) / (2 x Step Size)

We’re not going to use this exactly as above. The divide by two gives a more accurate measurement of the rate of change, but in our simple case we’re outputting a black and white image, and for now we can skip the divide by 2.

In order to match our Centered Difference kernel above (with a 0 in the center), we can add another term – 0 x Height(where you are) – just for clarity. So now we have:

Height(where you are + 1) + (0 x Height(where you are)) – Height(where you are – 1)

Let’s change the above terms so we are only using addition, and then reorder them to match our one-row kernel above:

(-1 x Height(where you are – 1)) + (0 x Height(where you are)) + (1 x Height(where you are + 1))

In the above we take -1 of the height where you are -1 step, plus none of the height where you are, + 1 of the height where you are plus 1 step. Now we can see how to apply the kernel – we step the kernel along our data, multiply the data under the kernel by the number in the kernel, sum that data up, and set the new data at the center of the kernel to that answer:
-1 0 1  MOVE ->
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Let’s see what happens when we run through a simple Matlab function (code below) to multiply the array and the kernel:

0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

We can see that the edge is located (approximately) by showing the change across the span of the edge. This is not a perfect fit; however, it’s a pretty close fit for a small amount of computing work.
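The kernel-stepping procedure just described can also be sketched in Python (the post’s own version is the FirstDifference Matlab function at the end; the function name here is mine). It steps the [-1 0 1] kernel along the edge data, staying one sample inside each end:

```python
# step a 3-element kernel along 1D data, skipping the first and last samples
def apply_kernel_1d(data, kernel=(-1, 0, 1)):
    out = [0] * len(data)
    for i in range(1, len(data) - 1):
        # multiply the data under the kernel by the kernel values and sum
        out[i] = sum(k * data[i + j] for j, k in zip((-1, 0, 1), kernel))
    return out

edge = [0] * 5 + [1] * 16
print(apply_kernel_1d(edge))  # nonzero (1 1) only at the two positions straddling the edge
```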

Now that we’ve covered the basic theories, let’s move right into a 2 dimensional image and filter.

Here we have a grayscale image with a lot of edges. Let’s move our filter into 2D, but we’ll limit it to enhancing only the vertical edges by using this pattern:

0 0 0
-1 0 1
0 0 0

We will modify and use the ApplyFilter Matlab function (code provided at the end) on this image with this filter:

Above we have the image with the vertical lines enhanced. Now we’ll try to pick up some diagonals that point from bottom right to top left. To do that we’ll add another set of coefficients, along the diagonal perpendicular to the lines we want to enhance:

0 0 1
-1 0 1
-1 0 0

Here we can see the additional diagonal lines picked up. Let’s pull out the other diagonals:

-1 0 0
-1 0 1
0 0 1

OK, lastly we’ll pick up all the edges with the full set of coefficients.

-1 0 1
-1 0 1
-1 0 1

We can see the lines, but, they seem a bit bright. We can put our divide by 2 back in now that we have a “real” image and bring the relative brightness of the lines back into proportion:

This is a more accurate, but perhaps less useful image. We could just as well skip the divide by two if we want the lines to stand out more.
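The 2D version works the same way: step a 3 by 3 kernel over the image, multiply, sum, and (optionally) scale by one half. A pure-Python sketch, equivalent in spirit to the ApplyFilter Matlab function below (the image is a list of rows, and `apply_filter` is my name for it):

```python
# apply a 3x3 kernel to a 2D image, scaling each result (the divide by two)
def apply_filter(image, kernel, scale=0.5):
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):          # stay one pixel inside the border
        for c in range(1, cols - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += image[r + dr][c + dc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc * scale
    return out

# a vertical edge, filtered with the full set of coefficients
img = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
full = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
result = apply_filter(img, full)
print(result[2])  # → [0.0, 0.0, 1.5, 1.5, 0.0, 0.0]
```

Passing scale=1.0 reproduces the brighter, unscaled output discussed above.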

So, that’s all for Part 1!

In Part 2 I’ll discuss  2nd derivative kernels, and also how to get the angles of the enhanced lines.

Rick Frank

Dominion Software, Inc.

Matlab code:

```matlab
function [ RESULT ] = FirstDifference( inputVector )
%FirstDifference Take the centered first difference of the input vector, as
%a 1D example of an image processing edge enhancement.

kernel = [-1 0 1];

[rows, cols] = size(inputVector);

RESULT = zeros(1, cols);

% Loop through the data, but we must stay 1 pixel inside
% at both ends.
for i = 2:cols-1
    temp = 0;
    % Loop over the kernel, which implements a centered first difference.
    for j = -1:1
        data = inputVector(i + j);
        mult = kernel(j + 2);
        temp = temp + (data * mult);
    end
    RESULT(i) = temp;
end

end
```

```matlab
function [ RESULT ] = ApplyFilter( inputImage )
%ApplyFilter Apply a 3 by 3 filter to the input image.

% Modify this kernel as desired.
kernel = [-1 0 1; -1 0 1; -1 0 1];

dbImage = im2double(inputImage);

[rows, cols] = size(dbImage);

RESULT = zeros(rows, cols);

% Loop through the data, but we must stay 1 pixel inside
% at both ends.
for aRow = 2:rows-1
    for aCol = 2:cols-1
        temp = 0.0;
        % Loop over the kernel.
        for rowOffset = -1:1
            for colOffset = -1:1
                imageRow = aRow + rowOffset;
                imageCol = aCol + colOffset;
                data = dbImage(imageRow, imageCol);
                mult = kernel(rowOffset + 2, colOffset + 2);
                % Divide by two if we wish.
                temp = temp + ((data * mult) * 0.5);
            end
        end
        RESULT(aRow, aCol) = temp;
    end
end

end
```