This came up in another thread. I gave some advice that I'm no longer sure of. Rather than hijacking that thread, I figured I'd start my own. I advised against using == in that thread. The context was this:

string subChoice ="";
getline(cin,subChoice);

if (subChoice == "100")
{
     // code
}

Ancient Dragon said the above was fine and that compare was unnecessary. So I wondered if maybe == was only bad when comparing two string variables, so I did a little test program and it worked:

#include <iostream>
#include <string>
using namespace std;

int main ()
{
    string a = "hello";
    string b = "hello";
    if (a == b)
         cout << "a and b are the same." << endl;
         
    return 0;
}

The line displayed to the screen. So my question is when is it bad to use == when comparing strings? I thought the program above was not supposed to work. Apparently I was mistaken. If it's always fine to use ==, do we ever need to use the "compare" function?

>>string subChoice ="";
It isn't necessary to provide an initializer for strings because that's the default. Just string subChoice; is sufficient.

>> do we ever need to use the "compare" function
You use it when you need to know whether one string is less than, equal to, or greater than a second string, such as in sorting algorithms. You could also use the < and > operators, but compare() gives you all three answers from a single call, whereas an if/else chain built from < and > may have to walk the strings more than once.

O.K. Just ran this through the debugger. Looks like my understanding of strings was incorrect. I had thought that a and b were addresses. I thought a pointed to some array of characters and that b pointed to some different array of characters. I thought that a pointed to where 'h' was stored in the first array of characters and b pointed to where 'h' was stored in the second array of characters, which would be two separate addresses and thus a and b would not be equal. Apparently this is incorrect?

>>Apparently this is incorrect?
Yes. You are confusing std::string with C character arrays. Had a and b been declared like this, you would be right:

const char* a = "hello";
const char* b = "hello";
// to compare the two strings you have to call strcmp() (declared in <cstring>)
int n = strcmp(a, b);   // n == 0 means the characters are the same
// if you use == on a and b you compare the pointer values,
// not the characters they point to
if (a == b) // <<<<<< wrong when comparing two strings

Note that the term string above does NOT mean std::string but null-terminated character arrays. In the above a and b are pointers and they point to character arrays.

Well, it is possible that using == there will get the right answer, maybe, in that particular case, because the constant string literals "hello" and "hello" could evaluate to the same pointer if the compiler is smart enough.

Here's the same basic information using a slightly different explanation:

Given:
string a = "hello";

a is not an address. The STL string object may have within it a char * (to be used as a C-style string) as a member variable, in addition to other member variables such as length, etc., but it is not the same as a standalone char *. The relevant STL string constructor may be set up something like this:

string::string(const char * input)
{
   length = strlen(input);
   embeddedCharPtr = new char[length + 1];
   // copy length + 1 characters so the terminating '\0' comes along too
   strncpy(embeddedCharPtr, input, length + 1);
}

(I can only hope that I used the correct const syntax as I generally screw it up one way or the other).

The STL compare() function probably works much like strcmp() and can return the same range of values (negative, zero, positive), whereas the == operator probably only asks whether that result is zero: if so, == returns true, otherwise it returns false. (It can't literally be a call to strcmp(), since a std::string knows its own length and may contain embedded '\0' characters, but the idea is the same.)

>>the constant string literals "hello" and "hello" could evaluate to the same pointer if the compiler is smart enough.

a == b compares the values of the two pointers to see whether they both point to the same object. You can't count on what a compiler might or might not do with those two string literals; whether identical literals share storage is left up to the implementation. Correct C or C++ code does not and cannot depend on what one compiler might or might not do. If you do that, the same program may well stop working when compiled by a different compiler, or even by the same compiler with different options.

In short: making such a comparison is just plain stupid and idiotic.

I plugged this program into Dev C++, fully expecting it to fail:

#include <iostream>
#include <string>
using namespace std;

int main ()
{
    const char* a = "hello";   // string literals are const in C++
    const char* b = "hello";

    if (a == b)
         cout << "a and b are the same." << endl;

    return 0;
}

When it displayed "a and b are the same.", I started to question my existence because I had been told to NEVER do this.


I guess the Dev C++ compiler was smart enough.

So I guess it will fail on most compilers, but I just lucked out on that one. I'll continue to use strcmp. Frankly I was flabbergasted that it worked.

It "works" because the pointers were the same, because the compiler decided that it didn't need two immutable arrays that had the same contents.

Thanks to everyone. I think I have a better handle on this than I did before. It would be an exaggeration for me to think I've conquered this subject, but I'm going to mark this thread solved. I've gotten better at using the debugger in the last day trying to see the ins and outs of strings, so that's good. Lerner, you're right. There appear to be several things stored in a string. Somewhere there's an array of characters. There has to be. It shows up in the debugger, but I haven't found the actual memory address and how it relates to a and b yet, but I'll leave that one for another day. :) Again, thanks everyone!

Note that some implementations of the std::string datatype actually make a string object consist of a single pointer value that points to an array of characters. Gcc's string implementation does this (at least in the version I have), and mingw's probably does too. However, just before the array of characters sit the length and the amount of allocated space.

#include <string>
#include <cstdio>

void draw_word(unsigned char* z) {
  printf("%02x %02x %02x %02x ", z[0], z[1], z[2], z[3]);
}

void draw_memory(void* p) {
  unsigned char* z = (unsigned char*) p;
  draw_word(z - 12);
  draw_word(z - 8);
  draw_word(z - 4);
  printf("  ");
  draw_word(z);
  draw_word(z + 4);
  draw_word(z + 8);
  puts("");
}

int main() {

  std::string s("Hello");

  // Draw the memory on the stack, at the string object s
  draw_memory(&s);

  // Draw the memory pointed to by the pointer that s contains.
  draw_memory(*(void**) &s);

  // Change the string so that it has some space reserved.
  s.reserve(11);

  // Draw it again.
  draw_memory(&s);
  draw_memory(*(void**) &s);
}

Here's the output on my computer:

d0 c5 e4 8f ec 10 05 90 c2 6d e0 8f   fc 02 30 00 08 f7 ff bf 08 f7 ff bf 
05 00 00 00 05 00 00 00 00 00 00 00   48 65 6c 6c 6f 00 00 00 00 00 00 00 
d0 c5 e4 8f ec 10 05 90 c2 6d e0 8f   5c 03 30 00 08 f7 ff bf 08 f7 ff bf 
05 00 00 00 0b 00 00 00 00 00 00 00   48 65 6c 6c 6f 00 00 00 00 00 00 00

You can see that the reserve operation has caused a reallocation, since the pointer contained in s changed from 0x003002fc to 0x0030035c. Looking at the target of the pointer, you can see that it points directly at the front of the array of characters, shown by their ASCII values, 48 65 6c 6c 6f.

The 32-bit word of memory before the array is zero for some reason -- I have no idea why. Then we see that the 32-bit word before that changes from 0x00000005 to 0x0000000b after we reserved eleven bytes of memory for the string. 0xb is eleven, so I guess that's where the allocated space is stored. Then, before that, we have 0x00000005, which does not change, so it looks like that's the length of the string.

You could try pushing characters onto the back of the string to see how things change. You could try changing the reserve number to 10, then 9, then 8, and see how things change. (On my computer, it will reserve 10 characters even if you ask for 9. Maybe that means it defaults to doubling the reservation.)

Anyway, the reason == behaves differently for strings and char*s is that a different function actually gets called, because the types are different. You can implement your own == operator by defining an operator== function (or member function) for the types on which you want it defined.

Thank you for the post. It was very helpful.