Question

I've always taken it for granted that iterative search is the go-to method for finding the maximum value in an unsorted list.

The thought came to me rather randomly, but in a nutshell: I believe I can accomplish the task in O(log n) time, where n is the size of the input array.

The approach piggy-backs on merge sort: divide and conquer.

Step 1: divide the findMax() task into two sub-tasks, findMax(leftHalf) and findMax(rightHalf). This division should be finished in O(log n) time.

Step 2: merge the two maximum candidates back up. Each layer in this step should take constant time O(1), and there are, per the previous step, O(log n) such layers. So it should also be done in O(1) * O(log n) = O(log n) time (pardon the abuse of notation). (Edit: this is wrong. Each comparison is done in constant time, but there are 2^j / 2 such comparisons to be done at the j-th level, since there are 2^j candidates to pair up at that level.)

Thus, the whole task should be completed in O(log n) time. (Edit: make that O(n) time.)
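To spell out that corrected count (assuming, for simplicity, that n is a power of two): pairing up the 2^j candidates at level j costs 2^j / 2 comparisons, so summing over all levels gives

1 + 2 + 4 + ... + n/2 = n - 1

comparisons in total, i.e. linear rather than logarithmic work.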

However, when I try to time it, I get results (times in milliseconds) that clearly reflect a linear O(n) running time:

size = 100000000    max = 0    time = 556
size = 200000000    max = 0    time = 1087
size = 300000000    max = 0    time = 1648
size = 400000000    max = 0    time = 1990
size = 500000000    max = 0    time = 2190
size = 600000000    max = 0    time = 2788
size = 700000000    max = 0    time = 3586

How come?

Here's the code. (I left the arrays uninitialized to save on pre-processing time; as far as I've tested it, the method accurately identifies the maximum value in unsorted arrays.)

public static short findMax(short[] list) {
    return findMax(list, 0, list.length);
}

// Finds the maximum of list[start .. end-1] by divide and conquer.
public static short findMax(short[] list, int start, int end) {
    if(end - start == 1) {
        // Base case: a single element is its own maximum.
        return list[start];
    }
    else {
        // Recurse on the two halves, then merge the two candidates with one comparison.
        short leftMax = findMax(list, start, start + (end - start) / 2);
        short rightMax = findMax(list, start + (end - start) / 2, end);
        return (leftMax <= rightMax) ? rightMax : leftMax;
    }
}

public static void main(String[] args) {
    for(int j = 1; j < 10; j++) {
        int size = j * 100000000; // 100 million to 900 million elements
        short[] x = new short[size]; // left zero-filled, so the reported max is 0
        long start = System.currentTimeMillis();
        int max = findMax(x);
        long end = System.currentTimeMillis();
        System.out.println("size = " + size + "\t\t\tmax = " + max + "\t\t\t time = " + (end - start)); // elapsed ms
        System.out.println();
    }
}

Solution

You should count the number of comparisons that actually take place:

In the final step, after you find the maximum of the first n/2 numbers and the maximum of the last n/2 numbers, you need one more comparison to find the maximum of the entire set of numbers.

In the previous step, you have to find the maximum of the first and second groups of n/4 numbers and the maximum of the third and fourth groups of n/4 numbers, so you have 2 comparisons.

Finally, at the deepest level of the recursion, you have n/2 pairs of numbers, and you have to compare each pair, so you have n/2 comparisons.

When you sum them all you get:

1 + 2 + 4 + ... + n/2 = n-1 = O(n)
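One quick way to check this empirically is to instrument the recursion with a counter, one increment per merge step, and confirm that it reports n - 1 comparisons. This is a minimal sketch, not from the original post; the CountingMax class name and the static comparisons field are just illustrative choices:

public class CountingMax {
    static long comparisons = 0; // incremented once per merge step

    static short findMax(short[] list, int start, int end) {
        if (end - start == 1) {
            return list[start]; // base case: no comparison needed
        }
        int mid = start + (end - start) / 2;
        short leftMax = findMax(list, start, mid);
        short rightMax = findMax(list, mid, end);
        comparisons++; // one comparison per internal node of the recursion tree
        return (leftMax <= rightMax) ? rightMax : leftMax;
    }

    public static void main(String[] args) {
        for (int n = 1_000_000; n <= 8_000_000; n *= 2) {
            comparisons = 0;
            findMax(new short[n], 0, n);
            // should print n - 1 each time, i.e. the work grows linearly with n
            System.out.println("n = " + n + "  comparisons = " + comparisons);
        }
    }
}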

Other tips

You indeed create log(n) layers.

But at the end of the day, you still go through each element of every bucket you create, which means you still touch every element. So overall you are still O(n).

With Eran's answer, you already know what's wrong with your reasoning.

But anyway, there is a theorem called the Master Theorem, which aids in the running time analysis of recursive functions.

It concerns recurrences of the following form:

T(n) = a*T(n/b) + O(n^d)

Where T(n) is the running time for a problem of size n.

In your case, the recurrence equation would be T(n) = 2*T(n/2) + O(1), so a = 2, b = 2, and d = 0. That is the case because, for each n-sized instance of your problem, you break it into 2 (a) subproblems of size n/2 (b), and combine them in O(1) = O(n^0) time.

The master theorem simply states three cases:

if a = b^d, then the total running time is O(n^d*log n)

if a < b^d, then the total running time is O(n^d)

if a > b^d, then the total running time is O(n^(log a / log b))

Your case matches the third, so the total running time is O(n^(log 2 / log 2)) = O(n).
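As a sanity check (not part of the original answer), you can also unroll the recurrence directly, writing c for the constant combine cost and assuming n is a power of two:

T(n) = 2*T(n/2) + c
     = 4*T(n/4) + 2c + c
     = ...
     = n*T(1) + (n/2 + ... + 2 + 1)*c
     = n*T(1) + (n - 1)*c
     = O(n)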

It is a nice exercise to try to understand the reason behind these three cases. They correspond to the following situations:

1st) We do the same total amount of work at each recursion level (this is the case for mergesort), so we just multiply the merging time, O(n^d), by the number of levels, log n.

2nd) We do less work at the second recursion level than at the first, and so on. Therefore the total work is dominated by the top level of the recursion (the final merge step), which is O(n^d).

3rd) We do more work at deeper levels (your case), so the running time is O(number of leaves in the recursion tree). In your case there are n leaves at the deepest recursion level, so O(n).
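To make that leaf count explicit (a step the answer leaves implicit): the recursion branches into a subproblems at every level and the input shrinks by a factor of b, so the tree has log_b n levels and

a^(log_b n) = n^(log_b a) = n^(log a / log b)

leaves. With a = b = 2 this gives 2^(log_2 n) = n leaves, each costing O(1) work.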

There are some short videos in a Stanford Coursera course that explain the Master Method very nicely, available at https://www.coursera.org/course/algo. I believe you can always preview the course, even if you are not enrolled.

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow