Project 4A: Image Warping and Mosaicing

Background

In this project, I create image mosaics from multiple photos. I start by shooting and digitizing the pictures. Then, I compute the homography between each pair and warp the left image into the space of the right image. This lets me create a canvas and place both images on it, forming a mosaic. For the overlapping regions, I also use alpha blending to get a smooth transition without sudden, sharp edges.

Shooting the Pictures

I began by taking pairs of pictures for the different mosaics I wanted to create. When shooting, I kept the center of projection (COP) fixed by holding the camera in place and only rotating my phone about the camera's axis, never translating it. The images I took are shown below. The first 3 are for rectification, and the rest are for mosaicing.

[Figure: the pictures taken for rectification and mosaicing]

Recover Homographies

In this step, I write the function computeH(im1_points, im2_points), which recovers the homography H mapping im1_points to im2_points. To find H, I use the equation p' = Hp, where p and p' are corresponding points from the two images. Since H is only defined up to scale, its bottom-right entry H_{3,3} can be set to 1, leaving 8 unknown entries. Each correspondence p'_i = Hp_i gives us 2 equations, so we need at least 4 pairs of points. However, to be robust to noise, I select more correspondences (around 15-20), set up an overdetermined linear system in the entries of H, and solve for H with least squares.
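
As a concrete illustration, here is a minimal sketch of how such a system can be set up and solved with least squares (numpy is assumed, and points are assumed to be given as (x, y) rows; the exact bookkeeping in my code may differ):

```python
import numpy as np

def computeH(im1_points, im2_points):
    """Recover H such that p' ~ H p, where p are rows of im1_points and
    p' are rows of im2_points, with H[2, 2] fixed to 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(im1_points, im2_points):
        # x' (h31 x + h32 y + 1) = h11 x + h12 y + h13, and similarly for y',
        # rearranged so the 8 unknowns appear linearly:
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```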

Warping the Images

In this part, I implement the warpImage(im, H) function, which uses the homography H to warp im. The implementation is similar to the inverse warp from project 3, but with added logic for determining the output image's size. The overall outline: first, pipe the corners of the image through H to get a bounding box for the warped image. This bounding box defines the destination grid of points we want values at. Then, using inverse warping and interpolation, we determine the pixel values on this destination grid. Some destination pixels have no corresponding source point, since the bounding box is larger than the warped image's footprint; I marked those pixels as black.
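
Here is a minimal sketch of that outline, assuming scipy is available and im is an H x W x 3 float image (again, the details of my code differ, but the structure is the same):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warpImage(im, H):
    """Inverse-warp im through homography H. Returns the warped image and
    the bounding box (xmin, ymin, xmax, ymax) of its footprint."""
    h, w = im.shape[:2]
    # 1. Pipe the corners through H to find the output bounding box.
    corners = np.array([[0, 0, 1], [w - 1, 0, 1],
                        [w - 1, h - 1, 1], [0, h - 1, 1]], dtype=float).T
    mapped = H @ corners
    mapped = mapped[:2] / mapped[2]
    xmin, ymin = np.floor(mapped.min(axis=1)).astype(int)
    xmax, ymax = np.ceil(mapped.max(axis=1)).astype(int)

    # 2. Build the destination grid and inverse-map it to source coordinates.
    xs, ys = np.meshgrid(np.arange(xmin, xmax), np.arange(ymin, ymax))
    dest = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ dest
    src = src[:2] / src[2]

    # 3. Interpolate; destination pixels with no source point stay black.
    out = np.zeros((ymax - ymin, xmax - xmin, im.shape[2]))
    for c in range(im.shape[2]):
        vals = map_coordinates(im[..., c], [src[1], src[0]],  # (row, col) order
                               order=1, cval=0.0)
        out[..., c] = vals.reshape(ys.shape)
    return out, (xmin, ymin, xmax, ymax)
```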

Image Rectification

Here, I verify that computeH and warpImage work as expected before using them for mosaicing. To do this, I used 3 example images where I knew the geometry of the objects. Since homographies can map any quadrilateral to any other quadrilateral, my first test was to map an arbitrary quadrilateral to a square. Then, I tested warping an image of my MacBook taken from an angle to a top-down view (ground-plane rectification). Finally, I rectified a photo of a poster taken from a fairly sharp angle so that the poster appears head-on. Below are the results: each original image with its correspondences labeled, alongside its rectification.
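
In code, the quadrilateral-to-square test amounts to something like the following (the coordinates here are made up for illustration; computeH and warpImage are the functions sketched above, and im is a source photo loaded elsewhere):

```python
import numpy as np

# Hypothetical coordinates: a hand-clicked quadrilateral in the source image,
# and the 300x300 square it should map to.
quad = np.array([[120, 85], [410, 60], [430, 350], [100, 330]], dtype=float)
square = np.array([[0, 0], [300, 0], [300, 300], [0, 300]], dtype=float)

H = computeH(quad, square)          # homography sending the quad to the square
rectified, bbox = warpImage(im, H)  # the rectified (warped) image
```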

Quadrilateral -> Square


Laptop -> Top-Down View


Poster -> Front-Facing


Blend the Images into a Mosaic

Now, I finally use the warping to create a mosaic! I start by using the web tool to select corresponding points between the left and right images for each scene. Then, I find the H that warps the left image into the space of the right image. Now I have two images in the same space, warped_im_1 and im_2, and can add them. To combine them cleanly, I first create a canvas that is large enough. My warp function also returns the bounding box, and I use the bounding-box coordinates to figure out where the two images overlap. Knowing the overlap region lets me size the canvas correctly and fill in the corresponding regions: I compute row and column offsets from the bounding boxes, place warped_im_1 on the canvas, and then place im_2 shifted by those offsets.
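
The offset bookkeeping looks roughly like this sketch (variable names such as bbox1, warped_im_1, and im_2 are illustrative; bbox1 is the (xmin, ymin, xmax, ymax) box returned by the warp function, and im_2 sits at the origin of the right image's coordinate frame):

```python
import numpy as np

xmin, ymin, xmax, ymax = bbox1
h2, w2 = im_2.shape[:2]

# Canvas large enough for both images in the right image's frame.
canvas_h = max(ymax, h2) - min(ymin, 0)
canvas_w = max(xmax, w2) - min(xmin, 0)
canvas = np.zeros((canvas_h, canvas_w, 3))

# Shift everything so all coordinates are non-negative.
row_off, col_off = -min(ymin, 0), -min(xmin, 0)
canvas[ymin + row_off:ymax + row_off, xmin + col_off:xmax + col_off] = warped_im_1
canvas[row_off:h2 + row_off, col_off:w2 + col_off] = im_2
```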

However, as seen below, this alone is not enough: it can leave artifacts in the overlapping region, like sharp edges. To remedy this, I use alpha blending in the overlap. Since I know which part of the canvas corresponds to the overlapping region, I go back to that section and replace its pixel values with convex combinations of the left and right images. My coefficient alpha decreases linearly as I move left to right across the overlap. This means that at the left edge of the overlap the result is mostly warped_im_1, at the right edge it is mostly im_2, with an appropriate convex combination in between.
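
The blend itself is short once the overlap is cropped out; a minimal sketch, assuming left_region and right_region are same-sized crops of the two images over the overlap:

```python
import numpy as np

def alpha_blend_overlap(left_region, right_region):
    """Convex combination across the overlap: weight 1 for the left image
    at the left edge, falling linearly to 0 at the right edge."""
    h, w = left_region.shape[:2]
    alpha = np.linspace(1.0, 0.0, w)[None, :, None]  # (1, w, 1) broadcasts
    return alpha * left_region + (1.0 - alpha) * right_region
```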

Below, I demonstrate the results for each of my scenes. For each scene, I show: 1a, 1b, the left and right images with their correspondences; 2a, the mosaic without alpha blending (left); and 2b, the mosaic with alpha blending (right).

[Figures: for each scene, the correspondences and the mosaics without and with alpha blending]

Project 4B: Feature Matching for Autostitching

Step 1A: Harris Interest Points

In this part, I mostly used the provided harris.py code to implement section 2 of the paper. The code computes the corner strength function by taking the harmonic mean of the eigenvalues of the structure tensor (the 2x2 second moment matrix at each pixel). It then picks out the interest points: those that are a local maximum of corner strength in a 3x3 neighborhood and above a threshold of t=10. I also wrote code to compute the orientation theta of each interest point, as described in section 2 of the paper, but this wasn't used anywhere else since we stuck with axis-aligned rectangular patches in the next step.
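
For reference, here is a sketch of the kind of corner strength harris.py computes; the smoothing sigmas are my own assumptions, not the provided code's exact values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def corner_strength(im):
    """Harmonic-mean corner strength f = det(M) / tr(M), where M is the
    smoothed 2x2 structure tensor at each pixel of a grayscale image."""
    gy, gx = np.gradient(gaussian_filter(im, 1.0))   # image gradients
    Ixx = gaussian_filter(gx * gx, 1.5)              # structure tensor entries
    Iyy = gaussian_filter(gy * gy, 1.5)
    Ixy = gaussian_filter(gx * gy, 1.5)
    det = Ixx * Iyy - Ixy ** 2
    tr = Ixx + Iyy
    return det / (tr + 1e-8)                         # avoid division by zero
```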

Step 1B: Adaptive Non-Maximal Suppression

Here, we significantly filter down the large set of interest points we just collected. The idea is to pick "strong" interest points that are also spatially spread out across the image, which later yields nicer correspondences. I followed the approach outlined in Section 3 of the paper to implement this behavior. Specifically, here's what I did:

  1. I implemented a suppression_radius function that takes a point (x, y) and finds its suppression radius, i.e., the smallest distance to an interest point with a stronger corner strength. For robustness, the neighbor's strength must exceed that of (x, y) by a factor of 1/c_robust. Furthermore, by passing this function the sorted_corners list (the interest points sorted in decreasing order of their h-values), we can improve efficiency a bit: as soon as we hit a point with lower strength than (x, y), we know all remaining points are weaker too, and we can break out of the loop early.
  2. filter_corners uses suppression_radius to compute the suppression radius of every interest point, then picks and returns the top_n corners with the largest radii.

The end result is that we filter out most of the interest points (from about 2000 down to 500) while making sure the survivors are spread throughout the image.
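
Here is a compact sketch of the whole ANMS step, written as a vectorized variant of the two functions described above (c_robust = 0.9 follows the paper, and top_n = 500 matches the count quoted above):

```python
import numpy as np

def anms(corners, strengths, top_n=500, c_robust=0.9):
    """corners: (N, 2) array of (x, y); strengths: (N,) corner strengths.
    Returns the top_n corners with the largest suppression radii."""
    order = np.argsort(-strengths)           # strongest first
    corners, strengths = corners[order], strengths[order]
    radii = np.full(len(corners), np.inf)    # strongest point is never suppressed
    for i in range(1, len(corners)):
        # Only points sufficiently stronger than i can suppress it.
        stronger = c_robust * strengths[:i] > strengths[i]
        if stronger.any():
            d = np.linalg.norm(corners[:i][stronger] - corners[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:top_n]
    return corners[keep]
```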

Step 2: Feature Descriptor Extraction

Once we have our filtered set of interest points from before, we can compute a descriptor patch for each of them. I do this by iterating through the interest points, centering a 40x40 window on each (with appropriate bounds checking), and using skimage.transform.resize to subsample it down to an 8x8 patch. I then bias/gain-normalize this patch and add it to my descriptors array. The normalization ensures the descriptor doesn't depend on external factors like brightness, and the subsampling ensures that we match the general pattern around the interest point rather than the specifics of this particular image.
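
Per interest point, the extraction looks roughly like this sketch (which skips the bounds checking mentioned above):

```python
import numpy as np
from skimage.transform import resize

def get_descriptor(im, x, y):
    """Extract a bias/gain-normalized 8x8 descriptor from the 40x40 window
    centered on (x, y); assumes the window lies fully inside im."""
    patch = im[y - 20:y + 20, x - 20:x + 20]
    small = resize(patch, (8, 8), anti_aliasing=True)   # subsample 40x40 -> 8x8
    return (small - small.mean()) / (small.std() + 1e-8)
```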

After this part, I now have descriptors for each of the filtered interest points from the previous step.

Step 3: Feature Matching

In this step, we use the get_descriptors() method implemented above on our two images to get 2 sets of interest points and their corresponding descriptors. I then use the provided dist2 function to compute pairwise distances between all descriptors of im1 and all descriptors of im2. Instead of just taking the nearest neighbor of each interest point, we use the Lowe trick here: to ensure that we only select points in im1 that actually have a partner in the other image, we look at the ratio of the 1-NN error to the 2-NN error. If a point truly has a partner, that partner will be its 1-NN with a very small error, while the 2-NN will be some other, unrelated point with a relatively high error. The ratio (1-NN error) / (2-NN error) therefore gives a good indication of whether a point has a genuine match in the other image. From Figure 6b of the paper, which shows the distribution of this ratio, I selected my threshold to be 0.3: points with a ratio below 0.3 are added to the final_correspondences array.
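
A sketch of the ratio test (here I compute the pairwise squared distances inline rather than calling the provided dist2 helper, but the role is the same):

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.3):
    """desc1: (N1, 64) and desc2: (N2, 64) flattened descriptors.
    Returns (i, j) index pairs that pass the Lowe ratio test."""
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)  # (N1, N2)
    matches = []
    for i in range(len(desc1)):
        nn = np.argsort(d[i])                        # nearest neighbors of i
        if d[i, nn[0]] / d[i, nn[1]] < ratio_thresh: # 1-NN error / 2-NN error
            matches.append((i, nn[0]))
    return matches
```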

The end result of this part is two small arrays, im1_points and im2_points, that hold only the locations of correspondences between the two images. This is the input to our code from part 4A! However, before we can blend, we must do one more thing...

Step 4: Robust Homography Estimate using RANSAC

At this point, if we simply plugged our im1_points and im2_points from the previous step into our computeH function from 4A, we would get a fairly inaccurate homography. The issue is that these point sets may still contain outliers, and least squares, whose objective is to minimize the total squared error, gets pulled toward them, corrupting the homography matrix we find. To address this, we use RANSAC. Here is the outline (a code sketch follows the list):

  1. Since a homography maps any quadrilateral to any other quadrilateral, we randomly sample 4 points without replacement from our correspondences, compute a candidate H from them with computeH, and pretend for the moment that it is the "correct" homography. Then, we count the number of inliers w.r.t. this H.
  2. To count the inliers, we apply H to each point (normalizing the homogeneous coordinate) and count the points such that norm(H @ p - p') < epsilon. I chose epsilon = 2.
  3. Store these 4 points, the corresponding inlier count, AND the inlier list (all the points that "agreed" with the homography found from the 4 samples). Then repeat the process with a newly sampled quadrilateral, for num_iters iterations.
  4. Pick the sample with the highest inlier count and treat its homography as the "correct" one. But instead of simply taking the H found from those 4 points, to get a smoother result we recompute H on the inlier list, i.e., the list of all points that agreed with the original 4-point homography.
  5. We run the regular least squares approach on this inlier list to find our final homography H and return it.
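
Putting the five steps together, the loop looks roughly like this (a sketch that reuses the computeH from part 4A sketched earlier):

```python
import numpy as np

def ransac_homography(pts1, pts2, num_iters=1000, epsilon=2.0):
    """pts1, pts2: (N, 2) matched points. Returns H refit on the inliers
    of the best 4-point candidate."""
    n = len(pts1)
    p1_h = np.hstack([pts1, np.ones((n, 1))]).T    # homogeneous, 3 x N
    best_inliers = np.array([], dtype=int)
    for _ in range(num_iters):
        idx = np.random.choice(n, 4, replace=False)
        H = computeH(pts1[idx], pts2[idx])         # candidate homography
        proj = H @ p1_h
        proj = (proj[:2] / proj[2]).T              # normalize, back to (N, 2)
        errs = np.linalg.norm(proj - pts2, axis=1)
        inliers = np.nonzero(errs < epsilon)[0]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers of the best candidate for a smoother estimate.
    return computeH(pts1[best_inliers], pts2[best_inliers])
```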

Step 5: Mosaicing

Now that we have automatically recovered H, I reuse much of the code from 4A to warp the left image, place the two images on the canvas, blend the overlapping regions, etc., to get our final mosaic.

Results

Time for some results!!
Below are the results for each mosaic, displayed in the following order:
1a, 1b: The initial Harris interest points (left) and the filtered interest points after Adaptive Non-Maximal Suppression (right) on the left image.
2a, 2b: The initial Harris interest points (left) and the filtered interest points after Adaptive Non-Maximal Suppression (right) on the right image.
3a, 3b: The correspondences that remain in the two images after using our Lowe thresholding to match the feature descriptors.
4: The final image mosaic produced by our fully automated procedure, with alpha blending enabled.
5: The image mosaic produced from manually defined correspondences (same as 4A), with alpha blending enabled.

BWW Mosaic Results

[Figures: results 1a through 5 for this mosaic, in the order listed above]

Takeaways

The coolest thing I learned from this project was the idea of using Lowe thresholding to find correspondences between the two sets of interest points. Using both the first AND the second nearest neighbor lets us estimate, with pretty good accuracy, whether a point truly has a partner in the other image, and that simple test is enough to filter the points down to just those in the overlapping regions of the images.