A couple of days ago I was worried because I did not know how to calculate the scaling factor for my OpenCV project. Today I realized that it is easier than I first thought.

Let's go over what I am trying to do: when a picture is taken of the same place over a period of time (months), each picture seems to be a little off. One picture could be taken facing north and another facing south. Moreover, some pictures are close to the subject whereas others are far away. I started this little project in OpenCV to mitigate these effects and end up with a “normalized” set of pictures :)

In previous posts I already talked about how to un-rotate a rotated picture, so one can transform all the pictures in such a way that they all seem to be taken facing south (for example). The other problem was scaling: how to resize the pictures so they all seem to be taken from the same distance to the subject. In this case I am talking about height, because the pictures I am working with are of plants taken from above. At first I was a bit worried because I thought I would have to calculate some other intrinsic camera values, but I think I can avoid that approach.

First we have to realize that we will be using the same distance from the image plane (in the camera) to a certain subject (in this case the chessboard). So if we have two pictures, one taken 5 cm away from the subject and the other 2 cm away, we should down-scale (or up-scale, depending on your point of view) the second picture (the closer one). We should scale it in such a way that the distance to the chessboard appears to be the same as in the first. We choose the closest one because it is the one that has more detail and can actually be down-scaled. We could not zoom into the one that is farther away because we don't have enough information.

So, how do we calculate the resulting size of the closer image? Let's use a little image to help us understand how the relationships work.

- Our problem is how to change S2 into S1. S2 is the projection, on the image plane, of the same object as S1. The only difference is that S1 describes an object that is farther away.
- Remember that we are trying to downscale S2 into S1. We do this because S2 has enough information for it (the opposite is not true).
- The functions used in OpenCV give us the distance of the object with respect to the point of origin (the pinhole). I am actually not 100% sure of this; the distance might be to the image plane, but as we will see, this detail is not that important anyway.
- D1 is the distance of the picture that is farthest away in a list of pictures. At the end of the image adjustment analysis, all the pictures should have this distance. D2 is the distance of the picture that we are going to modify.
- A1 and A2 are the angles formed by the projections with the optical axis. As we will see, they are also not that important.

A few things that we can say about the figure above:

1. tan(A2) = S2/f
2. tan(A2) = H1/D2
3. S2*D2 = f*H1
4. tan(A1) = S1/f
5. tan(A1) = H1/D1
6. S1*D1 = f*H1

If we put 3. and 6. together we end up with:

S2*D2 = S1*D1

Remember that we know both distances, and we also know the size S2, which is the width and height of the image (the relation is the same for the width and for the height of the projection on the image plane, so one equation covers both). S1 likewise represents both width and height. So the equation ends up being:

S1 = S2 * (D2/D1)
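A minimal sketch of that formula in Python (the function name and the 640×480 example size are my own; D2 and D1 would come from the pose estimation mentioned above):

```python
def scaled_size(width, height, d2, d1):
    """S1 = S2 * (D2/D1): new size for the picture taken at distance d2,
    so it matches the picture taken at the farther distance d1."""
    ratio = d2 / d1  # short distance over long distance, <= 1
    return round(width * ratio), round(height * ratio)

# The 5 cm vs 2 cm example from above: the closer (2 cm) picture
# shrinks to 2/5 = 0.4 of its original size.
print(scaled_size(640, 480, 2.0, 5.0))  # (256, 192)
```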

We can say that the ratio we need is the short distance divided by the long distance, and that ratio should be multiplied by the height (to calculate the new height) and by the width (to calculate the new width).

After the calculation we will end up with an image that is *ratio* times smaller than the original. What we can do (if the OpenCV function has not already done it for us) is fill in the rest of the image with black pixels.
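A rough sketch of that padding step, assuming the images are NumPy arrays; the helper name is my own, and in the real pipeline the down-scaled image would come from something like `cv2.resize`:

```python
import numpy as np

def pad_to_original(scaled, orig_h, orig_w):
    """Place the down-scaled image on a black canvas of the original
    size (top-left corner here; centering is another option)."""
    canvas = np.zeros((orig_h, orig_w, 3), dtype=scaled.dtype)
    h, w = scaled.shape[:2]
    canvas[:h, :w] = scaled
    return canvas

# A dummy white 256x192 image stands in for the resized picture.
small = np.full((192, 256, 3), 255, dtype=np.uint8)
padded = pad_to_original(small, 480, 640)
print(padded.shape)  # (480, 640, 3)
```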

Going to implement this to see if my calculations were actually correct :)