CIS 565 Final Project: Preliminary Comparison of OpenSURF and CUDA SURF

OpenSURF [1] is an implementation of the SURF feature detector, descriptor, and matcher in C++/C#. CUDA SURF [2] is an implementation of OpenSURF using the CUDA SDK and CUDPP. Both use OpenCV for basic image operations. CUDA SURF exposes exactly the same function interface as OpenSURF, so the two make a reasonable pair for a performance comparison.

Here's a brief test of the SURF algorithm on CPU vs. GPU, run on the same computer (Intel Xeon 3.60GHz / 4GB / Nvidia Quadro FX 5800 / Ubuntu 11.04 32-bit). The input images are shown below.

Test Images from OpenSURF[1]

The preliminary test shows that both implementations produce good and similar results, but the CPU-based OpenSURF (0.65s) is about 2.6x faster than the GPU-based CUDA SURF (1.72s). I was quite surprised at first, so I added timing probes to find where the two implementations differ. It turns out that CUDA SURF spends almost all of its time in initialization, allocating device memory (about 1.68s, the Time(2) probe in the log below), while the rest of the pipeline is far faster than OpenSURF. It is therefore potentially feasible for real-time processing, since the initialization is needed only once. More details will be tested and discussed later, and I will try to optimize CUDA SURF on this specific computer.
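To make the "initialization only once" argument concrete, here is a rough back-of-the-envelope sketch in C++. It assumes (my assumption, not something measured in this test) that the 1.68s reported by the Time(2) probe is paid only once per process, and that every later image pair costs the remaining 0.04s (1.72s minus 1.68s), against 0.65s per pair for OpenSURF. The function names are hypothetical helpers, not part of OpenSURF or CUDA SURF.

```cpp
#include <cassert>
#include <cmath>

// Total time for n_pairs image pairs when a fixed setup cost is paid once.
double total_seconds(double setup_s, double per_pair_s, int n_pairs) {
    assert(n_pairs >= 0);
    return setup_s + per_pair_s * n_pairs;
}

// Smallest number of image pairs at which the GPU path is strictly faster
// overall than the CPU path. Returns -1 if the GPU path never catches up.
int break_even_pairs(double gpu_setup_s, double gpu_per_pair_s,
                     double cpu_per_pair_s) {
    if (cpu_per_pair_s <= gpu_per_pair_s) {
        return -1;  // no per-pair advantage, so the setup cost is never repaid
    }
    double n = gpu_setup_s / (cpu_per_pair_s - gpu_per_pair_s);
    return static_cast<int>(std::floor(n)) + 1;
}
```

With the numbers above, break_even_pairs(1.68, 0.04, 0.65) gives 3, so under these assumptions CUDA SURF would be ahead overall from the third image pair on. The timer used here only resolves 0.01s, so treat this as an estimate rather than a measurement.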

Code in the Time(2) section

    // Allocate device memory
    int img_width = src->width;
    int img_height = src->height;
    size_t rgb_img_pitch, gray_img_pitch, int_img_pitch, int_img_tr_pitch;
    CUDA_SAFE_CALL( cudaMallocPitch((void**)&d_rgb_img, &rgb_img_pitch, img_width * sizeof(unsigned int), img_height) );
    CUDA_SAFE_CALL( cudaMallocPitch((void**)&d_gray_img, &gray_img_pitch, img_width * sizeof(float), img_height) );
    CUDA_SAFE_CALL( cudaMallocPitch((void**)&d_int_img, &int_img_pitch, img_width * sizeof(float), img_height) );
    CUDA_SAFE_CALL( cudaMallocPitch((void**)&d_int_img_tr, &int_img_tr_pitch, img_height * sizeof(float), img_width) );
    CUDA_SAFE_CALL( cudaMallocPitch((void**)&d_int_img_tr2, &int_img_tr_pitch, img_height * sizeof(float), img_width) );

CPU-based OpenSURF(0.65s)
Matches: 76
Time(load):0.03000
Time(descriptor):0.56000
        Time(Integral):0.00000
        Time(FastHessian):0.00000
        Time(getIpoints):0.09000
        Time(descriptor):0.33000
        Time(cvReleaseImage):0.00000
        --------------------------------------
        Time(Integral):0.00000
        Time(FastHessian):0.00000
        Time(getIpoints):0.03000
        Time(descriptor):0.11000
        Time(cvReleaseImage):0.00000
Time(match):0.02000
Time(plot):0.00000
Time(save):0.04000


GPU-based CUDA SURF(1.72s)
Matches: 66
Time(load):0.02000
Time(descriptor):1.69000
        Time(Integral):1.68000
                Time(1):0.0000000000
                Time(2):1.6800000000
                Time(3):0.0000000000
                Time(4):0.0000000000
                Time(5):0.0000000000
                Time(6):0.0000000000
                Time(7):0.0000000000
                Time(8):0.0000000000
        Time(FastHessian):0.00000
        Time(getIpoints):0.00000
        Time(descriptor):0.00000
        Time(freeCudaImage):0.00000
        --------------------------------------
        Time(Integral):0.00000
                Time(1):0.0000000000
                Time(2):0.0000000000
                Time(3):0.0000000000
                Time(4):0.0000000000
                Time(5):0.0000000000
                Time(6):0.0000000000
                Time(7):0.0000000000
                Time(8):0.0000000000
        Time(FastHessian):0.00000
        Time(getIpoints):0.01000
        Time(descriptor):0.00000
        Time(freeCudaImage):0.00000
Time(match):0.01000
Time(plot):0.00000
Time(save):0.03000

CPU-based OpenSURF(0.65s)

GPU-based CUDA SURF(1.72s)

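BTW, there may be a better way to do the timing that would improve accuracy [3]. Most of the probes above read 0.00000, so the 0.01s resolution of the current timer is too coarse for the individual stages.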
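As one possible direction, here is a minimal sketch of a finer-grained timer using std::chrono::steady_clock from C++11. This is not what the probes in this post use; time_seconds and best_of are hypothetical helper names, and taking the best of several runs is just one common way to reduce noise.

```cpp
#include <cassert>
#include <chrono>
#include <limits>

// Hypothetical helper: run a callable once and return the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: run the callable `repeats` times and return the
// smallest elapsed time, which is less sensitive to one-off delays.
template <typename F>
double best_of(int repeats, F&& work) {
    assert(repeats > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        double t = time_seconds(work);
        if (t < best) {
            best = t;
        }
    }
    return best;
}
```

A stage could then be timed as, say, best_of(10, [&] { /* stage under test */ }). For the GPU stages a host-side timer like this is only meaningful if the device has finished its work before the clock is stopped, because kernel launches return before the kernel completes; calling cudaDeviceSynchronize() inside the timed callable, or timing with CUDA events instead, are the usual options.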

References

[1] OpenSURF, http://www.chrisevansdev.com/computer-vision-opensurf.html

[2] CUDA SURF, http://www.d2.mpi-inf.mpg.de/surf

[3] Measuring Computing Times and Operation Counts of Generic Algorithms, http://www.cs.rpi.edu/~musser/gp/timing.html

1 comment:

Nasr AlshazlyOctober 31, 2012 at 1:10 AM
Dear Mr. Yedong Niu

Hope you have nice day,
can you help me

I need an OCR piece of code that will be used in my application for 5 different mobile platforms (Android, iPhone, windows phone , Blackberry, Symbian).

Input: image captured through mobile camera.

Output: text that contains the identified .

I’d like to inform you that after my search through various works I started to use Tesseract OCR engine as it’s open source on android platform, I’m trying now to compile the library using Android NDK plus Cygwin package (Unix environment emulator), then I’ll use the compiled library in my project.

So, I should say that my criteria if you have another suggestion should be offline, free licensee and open source solution.

Best Regards

Nasr

Wednesday, March 28, 2012