I have been reading 'Inside the C++ Object Model' by Lippman, and one of the sections has me a little confused. It is the section where the author explains how virtual tables are created and virtual pointers assigned under multilevel, single inheritance. If anyone here has the book, I'm referring to the diagram on page 129, figure 4.1 (I think there's only one edition), and the statement on the next page that if a derived class adds a new virtual function, a slot containing the address of that function is added to the end of the virtual table. Since there's only one vptr in the object, how does the compiler stop a base pointer that points to a child object from reaching the new virtual function? If I'm not wrong, the vptr is the same.

If needed I can upload an example object model.

I'm not sure I understand the question. Can you give an example of what you expect to happen?

I'm not sure if this is what you mean; if not, can you give an example of what you're confused about?

struct base {
   virtual void test() {}
};

struct derived : public base {
   void test() { }            // overrides base::test
   virtual void test2() { }   // new virtual function, unknown to base
};

int main() {
   derived* d = new derived;
   base* b    = d;   // derived-to-base conversion is implicit, no cast needed

   // b->test2();  // compile error if uncommented: test2 is not a member of base
   d->test2();     // fine: 'd' is a derived*, and derived declares test2

   b->test();      // calls derived::test: 'b' still points to a derived object in memory

   delete d;
}

derived 'adds' a new virtual function, 'test2'.
It's not like base and derived share the same virtual table; they are two distinct classes that happen to share a common interface, and each class has its own table.

Sure,

Let's say we have a base class, 'Base', and another class, 'Derived', that derives from it.

class Base
{
     public:
        int _x;
        virtual void y();
        virtual void z();
        virtual ~Base();
};

class Derived : public Base   // public, so a Base* can point to a Derived
{
     public:
      virtual void y();
      virtual void newfnc();
      virtual ~Derived();
};

Now, the Derived class's virtual table contains the addresses of the virtual functions inherited or overridden from Base, plus the address of the new virtual function 'newfnc' declared in Derived.

The virtual pointer in a Derived object points to this vtable. If I have a Base pointer pointing to the Derived object, will it not still be using the same virtual pointer? If yes, then how does the compiler prevent us from calling 'newfnc' through the base pointer, since the virtual table has the address of newfnc and we have a pointer to the table?

I'm probably too confused; if it's still not clear I'll try to get the diagram in.

You can't access newfnc from a pointer to Base because it's not available for that type, regardless of which actual object type you're pointing to. This is where downcasting comes into the picture, because you need to safely determine that the pointer is actually pointing to a Derived object before trying to access unique members of that class.
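A minimal sketch of that downcast, reusing the Base/Derived shapes from the question. The function bodies, the string return values, and the helper name try_newfnc are made up for illustration and are not from the book.

```cpp
#include <string>

class Base {
public:
    virtual std::string y() { return "Base::y"; }
    virtual ~Base() = default;
};

class Derived : public Base {
public:
    std::string y() override { return "Derived::y"; }
    virtual std::string newfnc() { return "Derived::newfnc"; }
};

// Hypothetical helper: reaches newfnc only after checking what p points to.
std::string try_newfnc(Base* p) {
    // p->newfnc();  // would not compile: Base has no member named newfnc
    if (Derived* d = dynamic_cast<Derived*>(p)) {
        return d->newfnc();      // safe: p really points to a Derived
    }
    return "not a Derived";      // dynamic_cast produced a null pointer
}
```

Passing the address of a Derived object reaches newfnc; passing the address of a plain Base object takes the fallback branch instead of calling a function that isn't there.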

Thanks Narue. I think I knew that but got confused looking at the way the virtual function call was transformed in the example, something like (*ptr->vptr[4])(ptr), where ptr is a pointer of base type pointing to a derived class object. If I'm not wrong, the vtable pointed to by vptr also contains the addresses of the virtual functions unique to the child class? I was wondering what stops the compiler from finding the address of a child class function which might be in the next slot in the table. How is the type of the object used here to determine which functions are not a part of the base class?

I think I knew that but got confused looking at the way the virtual function call was transformed in the example, something like (*ptr->vptr[4])(ptr), and ptr is the pointer to base type pointing to a derived class object.

There are two mechanisms going on here. You have the virtual type and the actual type (the C++ standard calls these the static type and the dynamic type). The virtual type is the type of the pointer or reference to your object. The actual type is the type of the object itself. These two types can be different, thanks to polymorphism.

If you have a pointer to a base class that actually points to an object of a derived class, you can call virtual member functions through the base class pointer and the virtual table mechanism will call the derived class override. But you can't call a member function that does not exist in the base class, virtual or not.

So we can say that access is determined by the virtual type, but behavior is determined by the actual type.

what stops the compiler from finding the address of a child class function which might be in the next slot in the table?

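Each class has its own virtual table. The base class virtual table won't contain new virtual functions added in a child class. When a base pointer refers to a child object, the vptr does lead to the child's table, but the compiler picks the slot number at compile time from the base class declaration, so a call through the base pointer never indexes past the slots the base class knows about.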

How is the type of object used here to determine which functions are not a part of base class ?

The type of the object determines which virtual table is queried. Obviously you shouldn't get a handle to the child class virtual table from a base class object, and vice versa.
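To see why the extra slot is harmless, here is a hand-written imitation of the mechanism using plain function-pointer tables. Everything in it (Object, Slot, the table names, the call_* helpers, the return values) is hypothetical scaffolding; real compilers use ABI-specific layouts, so treat it as a sketch of the idea rather than what any compiler actually emits.

```cpp
struct Object;                           // stand-in for a polymorphic object
using Slot = int (*)(Object*);           // each "virtual function" receives this

struct Object { const Slot* vptr; };     // one vptr per object

int base_y(Object*)         { return 1; }
int base_z(Object*)         { return 2; }
int derived_y(Object*)      { return 10; }  // replaces slot 0
int derived_newfnc(Object*) { return 30; }  // new slot appended at the end

// One table per class. The derived table starts with the base slots.
const Slot base_vtable[]    = { base_y,    base_z };
const Slot derived_vtable[] = { derived_y, base_z, derived_newfnc };

// Slot numbers are fixed at compile time from the class declarations.
enum BaseSlots    { Y = 0, Z = 1 };      // all that code holding a Base* can name
enum DerivedSlots { NEWFNC = 2 };        // only nameable through a Derived*

// Roughly what p->y() and p->z() become for a Base* p: the index comes from
// the static type, the table comes from the object at run time.
int call_y(Object* p) { return p->vptr[Y](p); }
int call_z(Object* p) { return p->vptr[Z](p); }

// Roughly what d->newfnc() becomes; only generated when d is a Derived*.
int call_newfnc(Object* d) { return d->vptr[NEWFNC](d); }
```

With an Object whose vptr is derived_vtable, call_y returns 10 and call_z returns 2, yet nothing written against the base interface ever produces index 2. The slot is physically reachable, but no expression of base type can name it.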

Thanks Narue, I think I've understood the concept now.

I have another question on a slightly different part of the same book. This is regarding the adjustment to the 'this' pointer. According to the book, in the case of multiple inheritance, if we point a pointer to the second base class at a derived class object,

class Derived : public Base1, public Base2 {

};

Base2* ptr = new Derived();

the compiler has to adjust the 'this' pointer, using something called a thunk, to get to the Base2 subobject within the Derived object. The adjustment looks something like

Derived* temp = new Derived();
Base2* ptr = temp ? (Base2*)((char*)temp + sizeof(Base1)) : 0;   // offset is in bytes
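As a compilable check of that adjustment, the sketch below lets the compiler do the conversion and then measures how far the pointer moved. The data members and the helper names to_base2 and base2_offset are invented for the example, and the exact offset depends on the implementation.

```cpp
#include <cstddef>

struct Base1 { int a = 1; };
struct Base2 { int b = 2; };
struct Derived : public Base1, public Base2 { int c = 3; };

// The ordinary Derived* -> Base2* conversion. The compiler supplies both the
// null check and the byte offset shown in the pseudocode above.
Base2* to_base2(Derived* d) { return d; }

// Distance in bytes from the start of the Derived object to its Base2 part.
std::ptrdiff_t base2_offset(Derived& d) {
    return reinterpret_cast<char*>(static_cast<Base2*>(&d))
         - reinterpret_cast<char*>(&d);
}
```

On common implementations base2_offset comes out equal to sizeof(Base1), and a null Derived* converts to a null Base2* rather than to a small nonzero address, which is why the null test is part of the adjustment.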

However it doesn't say if a similar strategy is used in case of multilevel inheritance, something like

class Base1 {
};

class Base2 : public Base1 {
};

class Derived : public Base2 {
};

Base2* ptr = new Derived(); // will this also require a 'this' pointer adjustment?

To be honest, I'm not really clear on why multiple inheritance is more expensive than multilevel inheritance.


However it doesn't say if a similar strategy is used in case of multilevel inheritance

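With multilevel inheritance the only adjustment involved is the offset used to reach a data member after a cast; in the typical implementation the cast itself doesn't have to move the pointer, so the thunk solution isn't needed and would be too heavy. For example: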

struct Top { int a; };
struct Middle: Top { int b; };
struct Bottom: Middle { int c; int d; };

int main()
{
    Top *p = new Bottom;
}

In the typical implementation of single inheritance, new levels are simply appended to the object. So the object pointed to by p would look something like this:

     -----
p -> | a |
     -----
     | b |
     -----
     | c |
     | d |
     -----

To access each of the data members below the top-level base you'd need a cast to force visibility. The cast leaves the address alone; the member access that follows applies an offset to reach the appropriate sub-object:

p->a = 0;
((Middle*)p)->b = 1;
((Bottom*)p)->c = 2;
((Bottom*)p)->d = 3;

Each access is equivalent to adding an offset corresponding to the size of the sub-objects stacked above the one you're casting to, followed by an offset into that sub-object to get to the specified data member (this assumes no padding between the levels):

const int _Top_offset = sizeof(Top);
const int _Middle_offset = sizeof(Middle);

const int _Top_a_offset = 0;
const int _Middle_b_offset = 0;
const int _Bottom_c_offset = 0;
const int _Bottom_d_offset = sizeof(int);

*(int*)(((char*)p + 0) + _Top_a_offset) = 0;
*(int*)(((char*)p + _Top_offset) + _Middle_b_offset) = 1;
*(int*)(((char*)p + _Middle_offset) + _Bottom_c_offset) = 2;
*(int*)(((char*)p + _Middle_offset) + _Bottom_d_offset) = 3;
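Rather than trusting hand-computed constants, the offsets can be measured on a live object. This is only a sketch: offset_in and upcasts_keep_address are hypothetical helpers, and the layout they observe is what common compilers produce, not something the standard guarantees for classes like these.

```cpp
#include <cstddef>

struct Top    { int a; };
struct Middle : Top    { int b; };
struct Bottom : Middle { int c; int d; };

// Byte distance from the start of obj to one of its members.
template <class T, class M>
std::ptrdiff_t offset_in(const T& obj, const M& member) {
    return reinterpret_cast<const char*>(&member)
         - reinterpret_cast<const char*>(&obj);
}

// True when converting a Bottom* up the chain leaves the address unchanged.
bool upcasts_keep_address(Bottom& obj) {
    void* self = &obj;
    return self == static_cast<Middle*>(&obj)
        && self == static_cast<Top*>(&obj);
}
```

On a typical implementation a sits at offset 0, the members follow in declaration order down the chain, and upcasts_keep_address returns true, which matches the appended layout drawn earlier.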

i'm not really clear of why multiple inheritance is more expensive than multilevel inheritance

It really boils down to the complexity of maintaining a 1-to-1 relationship between an object and its immediate base sub-objects. With single inheritance this is easy because there's only one immediate sub-object. But with multiple inheritance, it becomes tricky to model and still maintain expected behavior.

The most obvious problem is address equivalency. With single inheritance you can cast the address of an object to a pointer to the base and get the same address, but the same is not possible with multiple immediate bases because both sub-objects can't have an offset of 0 or there would be data overlap:

#include <iostream>

struct A { int a; };
struct B { int b; };
struct C: A, B { int c; int d; };

struct D { int d; };
struct E: D { int e; };
struct F: E { int f; };

int main()
{
    C c;
    F f;

    std::cout<< (void*)&c <<'\t'<< (void*)(A*)&c <<'\n';
    std::cout<< (void*)&c <<'\t'<< (void*)(B*)&c <<"\n\n";

    std::cout<< (void*)&f <<'\t'<< (void*)(D*)&f <<'\n';
    std::cout<< (void*)&f <<'\t'<< (void*)(E*)&f <<'\n';
    std::cout<< (void*)&f <<'\t'<< (void*)(F*)&f <<'\n';
}

The compiler needs to introduce some overhead to get around this limitation when there's an implicit use of the this pointer, such as with member function calls.
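That overhead can be observed without looking at generated code. In the sketch below (class members and the helper raw are invented for the example), a virtual call made through a Base2* lands in the Derived override with 'this' already moved back to the start of the full object. How the compiler gets it there, whether by a thunk or an offset stored in the table, is an implementation detail.

```cpp
struct Base1 {
    virtual ~Base1() = default;
    int x = 1;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual const void* self() const { return this; }
    int y = 2;
};

struct Derived : Base1, Base2 {
    // Inside the override, 'this' is a Derived*, not a Base2*.
    const void* self() const override { return this; }
    int z = 3;
};

// The address held by the Base2* itself, i.e. the Base2 sub-object.
const void* raw(const Base2* p) { return p; }
```

For a Derived d and a Base2* p pointing to it, raw(p) differs from the address of d, while p->self() equals it, so something adjusted 'this' between the call site and the function body.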


Thanks Narue for the explanation; the examples were very helpful.
