
Hey folks :)

I am studying for my final exam, and I have some issues with this Big Oh notation. Really - I've been reading about it like crazy, but I just can't seem to understand it all properly.

What I know is that it's some kind of measurement of growth rate - how much time it takes to compute an input of a certain size.

And I can't seem to grasp how it really is measured with loops.

If I have a loop like this:

for (int i = 0; i < n; i++)
    do this...

then I know it's O(N).

for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        do this...

then I know it's O(N^2).

HOWEVER... there are many different growth rates... and the teacher mentioned the square root of N... N log log N... N log^2 N... N^2 log N... and many more. How do I know what those look like?

Also with divide and conquer algorithms... you can divide a problem into subproblems of size N/2. How? I reeeeeally don't have any feel for this stuff whatsoever. And I've been reading some books about it - they keep talking and talking... about everything except the Big Oh notation! Seriously... can someone please help me understand this thing once and for all? And at least get a feeling for how I can compute it? :S

>square root of N... N log log N... N log^2 N... N^2 log N...
>and many more. How do I know what those look like?

They're generally not as clear cut as N and N^2 because you typically don't see this kind of loop directly:

for (int i = 0; i < n; i++) {
  for (int j = 0; j < log10(n); j++) {
    // ...
  }
}

Obviously it's O(N log N) - the base of the logarithm is irrelevant in Big O, since log10(n) is just log2(n) times a constant - but usually you need to look a little deeper. For example:

for (int i = 0; i < n; i++) {
  bst.insert(load_value());
}

This is O(N log N) as well because the insert algorithm for the binary search tree is assumed to be logarithmic. It's really the same thing, but not as obvious. Simple loops are used to convey the idea of growth because they're easy to illustrate.
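To see where that log N comes from, here's a minimal sketch of a binary search tree insert (a bare integer tree of my own for illustration - the actual bst object above isn't shown in the thread). Each comparison moves down one level, so an insert costs O(height): about log2(N) comparisons when the tree is reasonably balanced, though a fully unbalanced tree decays to O(N).

#include <stdlib.h>

struct node {
    int value;
    struct node *left, *right;
};

/* Each comparison descends one level, so an insert costs O(height),
   which is roughly log2(N) in a reasonably balanced tree. */
struct node *insert(struct node *root, int value)
{
    if (root == NULL) {
        struct node *leaf = malloc(sizeof *leaf);
        leaf->value = value;
        leaf->left = leaf->right = NULL;
        return leaf;
    }

    if (value < root->value)
        root->left = insert(root->left, value);
    else
        root->right = insert(root->right, value);

    return root;
}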

>Also with divide and conquer algorithms... you can divide a problem into subproblems of size N/2. How?
Binary search is the quintessential example of divide and conquer. The idea is to minimize your working set. In the case of binary search, this is immediately obvious: by comparing against the middle value, you can lop off half of the remaining set with a single test.
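Here's a minimal sketch of binary search over a sorted array of ints (my own illustration, not code from the thread). Each trip around the loop throws away half of what's left, so at most about log2(N) comparisons are needed:

/* Returns the index of key in the sorted array a[0..n-1], or -1
   if it isn't there. Each comparison halves the remaining range,
   so the loop runs O(log N) times. */
int binary_search(const int a[], int n, int key)
{
    int lo = 0, hi = n - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            lo = mid + 1;  /* lop off the lower half */
        else
            hi = mid - 1;  /* lop off the upper half */
    }

    return -1;
}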

For something like quicksort it's less obvious, but the principle is the same. Divide the working set in half so that you're sorting two smaller sets. The benefits are twofold:

  1. It's faster to sort two small sets than one big one, or four small sets than two big ones, and so on. Thus quicksort recursively divides the sets.
  2. It's faster to finish off a small, nearly sorted set than to sort one big set in random order. That's why quicksort implementations typically switch to insertion sort for small partitions: insertion sort is efficient with almost sorted sets.

So quicksort divides the problem recursively and sorts all of the smaller sets in place; thanks to the partitioning, the pieces are already in the right order once the recursion finishes, with no merge step needed. Divide and conquer at its finest, as shown by quicksort's dominance as a general-purpose sort since its introduction in the 1960s.
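Here's a rough sketch of that recursive structure (my own illustration - the Lomuto partition and the cutoff of 16 are arbitrary choices, not the one true quicksort):

/* Insertion sort on a[lo..hi]: quadratic in general, but cheap on
   the small, nearly sorted ranges that quicksort leaves behind. */
static void insertion_sort(int a[], int lo, int hi)
{
    for (int i = lo + 1; i <= hi; i++) {
        int v = a[i];
        int j = i - 1;

        while (j >= lo && a[j] > v) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v;
    }
}

/* Lomuto partition: moves everything <= the pivot to the left of it
   and returns the pivot's final resting place. */
static int partition(int a[], int lo, int hi)
{
    int pivot = a[hi];
    int i = lo;

    for (int j = lo; j < hi; j++) {
        if (a[j] <= pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }

    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

/* Sort a[lo..hi]: split around the pivot, recurse on the two halves,
   and let insertion sort mop up once a range gets small. */
void quicksort(int a[], int lo, int hi)
{
    if (hi - lo < 16) {
        insertion_sort(a, lo, hi);
        return;
    }

    int p = partition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}

On average the recursion goes about log N levels deep, with O(N) partitioning work per level, which is where quicksort's typical O(N log N) cost comes from.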

>Seriously... can someone please help me understand this thing once and for all?
I wrote a quick tutorial for mortals (most articles on Big-O seem to be written for people with PhDs). It may or may not help.


Narue, I can't explain how much you have helped me. Thank you. :) Really sweet of you and I appreciate it very much.

Since my final exam is approaching, I will be practicing Big-Oh a little today, and I will post some loops and the Big-Oh I think they have later. Hopefully you - or someone else - could check them and see if I got it straight. I'd be very grateful. As I already am. :)

If you want to compliment Narue, call her a coldhearted *****; she will appreciate it more.

;)


Alright, here are some that I did. Are they correct? :)

(a)

for ( i = 0; i < n; ++i)
    ++k;

MY ANSWER: O(N)

(b)

for ( i = 1; i < n; i *= 2)
    ++k;

MY ANSWER: O(2N)

(c)

for ( i = 0; i < n; ++i)
    for ( j = 0; j < n; ++j)
        ++k;

MY ANSWER: O(N log N)

(d)

for (i = 0; i < n; ++i)
    for ( j = i; j < n; ++j)
        ++k;

MY ANSWER: O(N^2)

(e)

for ( i = 0; i < n; ++i)
    for (int j = 0; j < i * i; ++j)
        ++k;

MY ANSWER: O(3N^2)

>(b)
>for ( i = 1; i < n; i *= 2) ++k;
>MY ANSWER: O(2N)

Constants are typically removed in Big O notation, so you're saying this loop is O(N)[1]. Let's make it easier and say you're iterating over values in the range of [1, N) and printing them:

void print_range1 ( int n )
{
  for ( int i = 1; i < n; i *= 2 )
    printf ( "%d ", i );
  putchar ( '\n' );
}

Let's now say you call this function with an argument of 10. It'll print {1, 2, 4, 8}. The asymptotic growth of an algorithm refers to how the work increases as N grows towards infinity, but a good empirical measurement can be made by doubling N. If you call print_range1 with an argument of 20, the output is {1, 2, 4, 8, 16}. If you call it with 40, the output is {1, 2, 4, 8, 16, 32}. Hmm, by doubling N you've gained a single item in the output rather than the N extra items you'd expect from linear growth.

Compare that with a loop that does run in O(N) time (the classic example):

void print_range2 ( int n )
{
  for ( int i = 1; i < n; i++ )
    printf ( "%d ", i );
  putchar ( '\n' );
}

This function prints n-1 values. If n doubles, so does the number of values printed. That's O(N): the growth of the algorithm is proportional to N.

Now, in your (b) loop, the range of the loop is [1, N). The important part here is how you skip over parts of that range. So if the loop is bound by N and you ignore large chunks of that range, wouldn't that mean your loop has sublinear growth? It's easy to see that the number of items printed in print_range1 increases when N reaches the next power of 2, so you can use that as a starting point for coming up with the growth.
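If you want to see that empirically, here's a quick counting harness (a throwaway test scaffold of my own, not code from the exercise) that prints how many times the (b) loop body runs as n doubles:

#include <stdio.h>

/* Count how many times the body of loop (b) executes for a given n. */
static int count_iterations(int n)
{
    int count = 0;
    for (int i = 1; i < n; i *= 2)
        count++;
    return count;
}

int main(void)
{
    /* Keep doubling n: the count creeps up by only 1 each time. */
    for (int n = 10; n <= 1280; n *= 2)
        printf("n = %4d -> %d iterations\n", n, count_iterations(n));
    return 0;
}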

>(c)
>for ( i = 0; i < n; ++i) for ( j = 0; j < n; ++j) ++k;
>MY ANSWER: O(N log N)

Which part of this algorithm is log N? The inner loop is clearly O(N) and the outer loop is also clearly O(N). That means for each iteration of the outer loop (an O(N) loop), you run the inner loop in its entirety. Thus, you have the classic example of O(N^2): quadratic complexity.

>(d)
>for (i = 0; i < n; ++i) for ( j = i; j < n; ++j) ++k;
>MY ANSWER: O(N^2)

Your answer is correct, but I want to make sure you understand why. Big O is an upper bound on growth. The inner loop starts out doing N iterations and shrinks to a single iteration on the last pass, but the upper bound on each pass is still N, and N passes of at most N iterations each gives O(N^2). The largest complexity rules in Big O.
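You can also count (d)'s iterations exactly as a sanity check: the inner loop runs n-i times, so the total is n + (n-1) + ... + 1 = n(n+1)/2. That grows like N^2/2, and dropping the constant leaves O(N^2).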

>(e)
>for ( i = 0; i < n; ++i) for (int j = 0; j < i * i; ++j) ++k;
>MY ANSWER: O(3N^2)

Without the constant, you're saying the algorithm is O(N^2). The outer loop is O(N), clearly. The upper bound rule means that to get the inner growth, you take the largest iteration count, which happens as i approaches n. You probably know that n*n is another way of saying n^2, so O(N^2) overall is a smidge optimistic. ;)

[1] Mathematical notation is the most common, so 2N really means (2*N). If you meant 2^N, you need to state it explicitly or use TeX tags to superscript the N.


Hey - thanks for the quick reply! Quite frankly, I'm happy you did - because my final exam is tomorrow :) And I still have a couple of pieces of unfinished business which I'd like to perfect.


b.) Hmm - ok, so if I understood it correctly, then it really wouldn't matter if i *= 2? It will still be O(N)? O(2N), that is, with the constant removed.

c.) Yeah - I noticed that later on. Changed it to O(N^2). I am still unsure about the inner loop's comparison though - the j < n. All of this is still a little "shadowy" to me.

e.) Hmm, shouldn't it be like O(N^3)?? Sorry, I don't really understand... I'm still confused about how such comparisons and other changes (like i *= 2 in b.) affect the growth rate. I'm trying to keep it all in my head but it keeps jumping out. Haven't been sleeping much the last 2 days. LOL

>b.) Hmm - ok, so if I understood it correctly, then it
>really wouldn't matter if i *= 2? It will still be O(N)?

It's actually quite a bit harder to nail down. Unless I'm missing something simple (very possible), the formula is tricky enough for my little brain that I'd just call it O(# of powers of 2 below n). :D

>I am still unsure about the inner loop's comparison though - the j < n.
Evaluate it independently from the outer loop, then combine the results. Nested means multiply, in sequence means add:

for ( int i = 0; i < n; i++ ) k++;
for ( int j = 0; j < n; j++ ) k++;

The loops are in sequence, so you evaluate each (they're both O(N)) and add them together to get O(N) + O(N) = O(N+N) = O(2N). The 2 is irrelevant as N grows, so in toto you've got O(N).

for ( int i = 0; i < n; i++ ) {
  for ( int j = 0; j < n; j++ )
    k++;
}

Both loops are still O(N), but now the j loop is nested inside the i loop. That means you can multiply their complexities together. O(N) * O(N) = O(N*N) = O(N^2).

>e.) Hmm, shouldn't it be like O(N^3)?
Yes, exactly. The inner loop is O(N^2) and the outer loop is O(N). Multiply those two together and you get O(N^3).
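To see why with an exact count: the inner loop runs i*i times, so the total is 0 + 1 + 4 + ... + (n-1)^2 = (n-1)n(2n-1)/6, which grows like N^3/3. Drop the constant and you're left with O(N^3).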


Your little brain? I wouldn't think so. ;) I've read through some tutorials and know what you had to go through to learn what you did. And that requires quite a mind. :) Besides, size doesn't matter - it's the quality. ;) You're great.

b.) You mean the square root of N? :)

Alright - so I understood those. But there are a couple more that cause me some headache. Like these two, for example:

f.)

for( i = 0; i < n; i++)
    for( j = 0; j < n * n; j++)
        for( k = 0; k < j; k++)
            count++;

g.)

for ( i = 0; i < n * n; i++)
    for (j = 0; j < i; j++)
        for ( k = 0; k < j; k++)
            count++;

f.) As far as I understand... shouldn't it be O(N*N^3) = O(N^4)?

g.) And this one... O(N^4) as well?

>b.) You mean the square root of N?
Assuming you're talking about my "O(# of powers of 2 below n)" estimate, no, I really do mean powers of 2. The square root of N is close at first (but not quite right), and it gets worse as N grows.
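A couple of data points (my own, just to make the divergence concrete): at N = 100, sqrt(N) is 10 while there are only 7 powers of 2 below N (1, 2, 4, 8, 16, 32, 64); at N = 1,000,000, sqrt(N) is 1000 while there are only 20 powers of 2 below N.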

>f.) As far as I understand... shouldn't it be O(N*N^3) = O(N^4)?
Let's take a look:

for( i = 0; i < n; i++) {
  for( j = 0; j < n * n; j++) {
    for( k = 0; k < j; k++)
      count++;
  }
}

The i loop is O(N), no need to explain why. ;) The j loop is O(N^2) as explained earlier in the thread. The k loop is an instance of taking the upper bound. To get the complexity of the k loop, you choose the largest value that k would reach before the loop ends, which in this case would be the j loop's upper bound: n*n.

So, you've got an O(N^2) loop inside another O(N^2) loop inside an O(N) loop. The k loop isn't quadratic the whole time, but that's the upper bound, so we'll call it O(N^2). Put it all together and you get O(N^5).
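Counting it out as a check: for each of the n values of i, the j loop runs n^2 times, and for each j the k loop runs j times. The total is n * (0 + 1 + ... + (n^2 - 1)) = n * n^2(n^2 - 1)/2, which grows like N^5/2. Same verdict: O(N^5).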

>g.) And this one... O(N^4) as well?

for ( i = 0; i < n * n; i++) {
  for (j = 0; j < i; j++) {
    for ( k = 0; k < j; k++)
      count++;
  }
}

The i loop is quadratic, the j loop depends on i and thus is also quadratic, and the k loop depends on j which makes it quadratic too. Three quadratic loops multiplied together gives N^2 * N^2 * N^2, so O(N^6)... Your algorithms keep getting slower. ;)

You'll notice that much of the time in these algorithms, the inner loops aren't running at their full quadratic bound. A lot of time is spent on the lower ranges, so the actual iteration count is a constant fraction of what the Big O bound suggests. While it's possible to refine the estimate to something more precise, in my experience it's not really worth it.
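One way to convince yourself of both points - the N^6 growth and the small constant in front of it - is to tally the increments directly. This is a throwaway measurement harness of my own, not part of the original exercises; it replaces the k loop with count += j, which adds the same total without the extra loop:

#include <stdio.h>

int main(void)
{
    /* For loop (g), count the innermost increments exactly. The k
       loop runs j times, so adding j in its place gives the same
       total while keeping the measurement itself fast. */
    for (long long n = 4; n <= 64; n *= 2) {
        long long count = 0;

        for (long long i = 0; i < n * n; i++)
            for (long long j = 0; j < i; j++)
                count += j;

        double n6 = (double)n * n * n * n * n * n;
        printf("n = %2lld: count = %lld, count / n^6 = %.3f\n",
               n, count, (double)count / n6);
    }
    return 0;
}

The ratio settles near 1/6 as n grows: the growth really is N^6, just with a small constant in front.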


Hi there. I'm taking a test on big O this Friday, so I was going through your thread. First of all, thanks Narue - the problem is now much clearer and I get all of the examples that Ajantis had. (My problem was the way I was looking at it, e.g. bubble sort - n^2 seemed wrong to me because the actual number of comparisons is n*(n-1), but that's something completely different, I guess.)

There's just that b) Ajantis mentioned. You never said the actual complexity, and O(# of powers of 2 below n) isn't really an answer you'd give to a teacher, is it? :) Isn't it just log n (base 2)? At least that's how I understand it, and it's another way of saying # of powers of 2 below n, after all. (And sorry if it's a dumb question - it's after midnight here already.) Again, thanks for the explanation; went through your tutorial too :)
