I have a design question which I have not been able to find a satisfactory answer for by searching.

Essentially, I have a class - Base - that contains a vector of pointers to another class - Node. However, I would like to be able to extend both the Base and Node classes. This will involve adding new variables and functions to the classes derived from Node, and utilising these additional functions in classes derived from Base.

I can think of a few ways to accomplish this, but am unsure which is the best from a design point of view, as they each have their flaws. It has led me to think that perhaps there is a different design altogether which can accomplish what I need more elegantly.

Here is some example code illustrating my problem:

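#include <cstddef>
#include <iostream>
#include <vector>

class Node
{
protected:

int value;

public:

Node(int v): value(v)
	{ }
virtual ~Node() // nodes are deleted through a Node*, so the destructor is virtual
	{ }
int GetValue() const
	{ return value; }
virtual void Print()
	{ 
	std::cout << "Value is: " << value << std::endl;
	}
};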


class BetterNode: public Node
{
protected:

double another_value;

public:

BetterNode(int v, double v2): Node(v), another_value(v2)
	{ }
double GetBetterValue() const
	{ return another_value; }
virtual void Print()
	{
	Node::Print();
	std::cout << "Better Value is: " << another_value << std::endl;
	}
};


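class Base
{
protected:

std::vector< Node* > some_nodes;

virtual void MakeNodes()
	{
	for(int i = 0; i < 10; i++)
		some_nodes.push_back( new Node(i) );
	}

void DeleteNodes()
	{
	for(std::size_t i = 0; i < some_nodes.size(); i++)
		delete some_nodes[i];
	some_nodes.clear();
	}

// A virtual call made from a constructor dispatches to the class being
// constructed, so each constructor calls this to rebuild the nodes as its own type.
void Initialise()
	{
	DeleteNodes();
	MakeNodes();
	}

public:

Base()
	{
	Initialise();
	}
virtual ~Base()
	{
	DeleteNodes();
	}
void PrintNodes()
	{
	for(std::size_t i = 0; i < some_nodes.size(); i++)
		{
		std::cout << "Node " << i << ":" << std::endl;
		some_nodes[i]->Print();
		}
	}

int AddValues()
	{
	int temp = 0;
	for(std::size_t i = 0; i < some_nodes.size(); i++)
		temp += some_nodes[i]->GetValue();
	return temp;
	}

};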


class BetterBase: public Base
{

virtual void MakeNodes()
	{
	for(int i = 0; i < 10; i++)
		some_nodes.push_back( new BetterNode(i, i / 2.0) ); // 2.0 avoids integer division
	}

public:

BetterBase()
	{
	Initialise();
	}

double AddBetterValues()
	{
	double temp = 0;
	for(std::size_t i = 0; i < some_nodes.size(); i++)
		temp += some_nodes[i]->GetBetterValue(); // THE PROBLEM LIES HERE: Node has no GetBetterValue().
	return temp;
	}

};


int main(void)
{

Base * b = new Base;
Base * better_b = new BetterBase;

// Output details about Nodes:
b->PrintNodes();
// Virtual function in BetterNode takes over and outputs additional detail:
better_b->PrintNodes();

// Let's find out what the sum of the node values from b is:
int b_values = b->AddValues();
std::cout << b_values << std::endl;

// We *know* that better_b contains additional information. What is it?
double better_values = better_b->AddBetterValues();
std::cout << better_values << std::endl;
// But of course, this fails to compile, because AddBetterValues does not exist in the Base class.

return 0;
}

So, my derived BetterNode class is naturally going to extend the base Node class, and will need new values, and consequently new functions to access them, that are unknown to the base Node class. However, it will share the common base properties.

In addition, the derived BetterBase class wishes to implement these BetterNodes and utilise their additional functionality in performing what it does. It will fill its some_nodes vector with pointers to BetterNodes. It can therefore use all of its parent Base class functions, as the BetterNode is derived from the Node.

I have tried to illustrate this problem in the above code. I do not need to reimplement my Print function, as classes derived from Node can override the virtual Print function, which saves me effort. This is good. However, my derived BetterBase class wants to make use of the extended functionality of BetterNode, in this example by implementing AddBetterValues(). It obviously cannot, as the vector some_nodes holds pointers to the base Node class, which does not know about this function.

So, how can I pull this off?

I have thought of a few ways, but am unsure which is best:

1. Make the Base class a template class, whereby I select which type of Node the vector some_nodes will hold.
example:

template<class T = Node> class Base
{
protected:

std::vector<T*> some_nodes;

//...other code.
};

class BetterBase: public Base<BetterNode>
{
// now, AddBetterValues would work, as the vector is not of base class Node, but of BetterNode.
};

The problem here is that it allows for mistakes to be made. What if I derive from the Base class like this:

class AnotherBase: public Base<NonNodeClass>

Templating allows me to fill that vector with anything, not just classes derived from Node, which could screw things up. As such, this seems like a sloppy and potentially error-prone workaround.

2. Whenever I add functions to classes derived from Node, I can also prototype/define these functions as virtual in the base Node class. Now, the base class knows about the additional functions, and there are no issues.

However, this seems like bad design practice, and people who derive from my classes should not need to go back and add relevant code into my base Node class. In addition, bulking out the base Node class with functions it does not itself need is just stupid.

3. Casting in derived classes. If, knowing that I will be using BetterNodes in the BetterBase class, I simply cast from Node* to BetterNode* when necessary, I would provide the additional BetterBase functions access to the required additional BetterNode functions. The downside to this approach is the requirement for an increasing amount of casting as the class derived from Node grows more complex, which seems sloppy. Also, people tend to say that this is a sign of a design flaw. As it seems like the simplest way to overcome my problem, does this mean that there is a better way I could structure my code?
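For reference, option 3 might be sketched roughly as below. This is only an illustration: the classes are pared-down stand-ins for the ones above, and the free function AddBetterValues is a hypothetical helper rather than the member function from the example. It uses dynamic_cast so that anything that is not really a BetterNode is skipped instead of being misread.

```cpp
#include <cstddef>
#include <vector>

// Pared-down stand-ins for the Node and BetterNode classes in the question.
class Node
{
public:
    explicit Node(int v) : value(v) {}
    virtual ~Node() {}
    int GetValue() const { return value; }
private:
    int value;
};

class BetterNode : public Node
{
public:
    BetterNode(int v, double v2) : Node(v), another_value(v2) {}
    double GetBetterValue() const { return another_value; }
private:
    double another_value;
};

// Hypothetical helper: sums GetBetterValue() over every element that really
// is a BetterNode; dynamic_cast yields a null pointer for anything else.
double AddBetterValues(const std::vector<Node*>& nodes)
{
    double total = 0.0;
    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        if (const BetterNode* better = dynamic_cast<const BetterNode*>(nodes[i]))
            total += better->GetBetterValue();
    }
    return total;
}
```

The cast is needed in every function that touches the extra members, which is the growth in casting described in this option.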


So, which method is best? Or, do I need to think about redesigning my code entirely?

Thanks in advance for your help!
James,

It sounds like you're looking to expand the polymorphic behavior you have already created. Pay specific attention to the concept of a "Pure Virtual" function.

Your second option is probably closest to what you'll want. But instead, change the name of BetterBase::AddBetterValues() to BetterBase::AddValues() and make Base::AddValues() a virtual function, similar to MakeNodes(). You'll want to do something similar with your Node and BetterNode classes as well.
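A rough sketch of that suggestion, under one loud assumption: an override cannot change a return type from int to double, so here every GetValue() returns double. The classes are simplified stand-ins, and AddValues is written as a free function only to keep the example short.

```cpp
#include <cstddef>
#include <vector>

class Node
{
public:
    explicit Node(int v) : value(v) {}
    virtual ~Node() {}
    // One shared signature for the whole hierarchy. Returning double
    // everywhere is an assumption made for this sketch.
    virtual double GetValue() const { return value; }
protected:
    int value;
};

class BetterNode : public Node
{
public:
    BetterNode(int v, double v2) : Node(v), another_value(v2) {}
    // Same name and signature, different behaviour.
    double GetValue() const override { return another_value; }
private:
    double another_value;
};

// Works for any mix of Node and BetterNode through the common interface.
double AddValues(const std::vector<Node*>& nodes)
{
    double total = 0.0;
    for (std::size_t i = 0; i < nodes.size(); ++i)
        total += nodes[i]->GetValue();
    return total;
}
```

This only helps when the new functionality fits an interface the base class can reasonably declare.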

Hiya, thanks for your reply.

I have some understanding of polymorphic behaviour and am aware of abstract base classes. My examples were perhaps somewhat simplified in order to try and get my problem across.

I realise that I can define an abstract base class to hold all of the functions that one might need in the Node class and any classes derived from it. However, this seems sloppy as it involves me bulking out the abstract class with every function that every class derived from Node might need. Is this really the best way?

Also, in the example, AddValues() returns an int, whereas AddBetterValues() returns a double. My Node's Print() function is virtual, as that makes sense, but there will be cases that require me to define new functions for classes derived from the Node class which have no relationship to the existing ones: If I add a vector of values to the BetterNode class for example.

My main aim was just to reduce the amount of code I have to duplicate, by putting things common to all nodes in the base Node class, and likewise for the Base class. If I have to have an abstract base class with all functions from every derived Node class, why don't I just have one Node class with heaps of functions in? It would seem to suffer the same issues (though would probably be harder to read and more annoying having to initialise loads of unused variables in one class)?

Thanks,
James

>If I have to have an abstract base class with all functions from every derived
>Node class, why don't I just have one Node class with heaps of functions in?

Indeed. Polymorphism is meant to share a common interface with varying behavior. Your problem is trying to use polymorphism with a varying interface. Most likely your design is weak for what you're trying to accomplish. Can you describe the program requirements that brought you to this design?

I tried to compile your code for verification purposes. I think you have bigger issues to address first: my compiler returned 40+ errors, ranging from undeclared identifiers to invalid return types.

My suggestion would be to correct your errors until you can either get a functional program or a single remaining error related to the thread. Once you have done that, re-post your code and we can take another look.

>I think you have bigger issues to address first. My compiler returned 40+ errors.
I got the distinct impression that the code is for illustrative purposes only. It's not his actual code, merely an ad hoc example of the overall design. Simple bugs can be ignored in this case, I believe.

>>I got the distinct impression that the code is for illustrative purposes only. It's not his actual code,
You're probably right, but it does make it difficult to test a corrective action related to the initial query.

>Templating allows me to fill that vector with anything, not just classes derived from Node, which could screw things up. As such, this seems like a sloppy and potentially error-prone workaround.

Use SFINAE to restrict the types to just Node or derived classes of Node.

>Also, in the example, AddValues() returns an int, whereas AddBetterValues() returns a double. My Node's Print() function is virtual, as that makes sense, but there will be cases that require me to define new functions for classes derived from the Node class which have no relationship to the existing ones: If I add a vector of values to the BetterNode class for example.
>
>My main aim was just to reduce the amount of code I have to duplicate, by putting things common to all nodes in the base Node class, and likewise for the Base class.

For AddValues() to be polymorphic, the function must have the same name (but different implementations) in different classes. To make it (compile-time) polymorphic on the result type (int in one case and double in the other), have the node and its derived classes announce the type of the result.

An elided example (using boost::enable_if and boost::type_traits):

#include <boost/utility.hpp>
#include <boost/type_traits.hpp>
#include <iostream>

using boost::is_base_of ;
using boost::enable_if ;

struct node
{
    typedef int value_type ;

    explicit node( int i ) : v(i) {}

    value_type value() const { return v ; }

    int v ;
} ;

// generalization of base<T>; declared, not defined
template< typename T, typename ENABLER = void > struct base ;

// a specialization for base<T> is defined
//      when T is either a node or a derived class of node
template < typename T >
struct base< T, typename enable_if< is_base_of<node,T> >::type >
{
    base( const T& aa, const T& bb ) : a(aa), b(bb) {}

    typename T::value_type total_value() const
    { return a.value() + b.value() ; }

    T a ;
    T b ;
};

struct better_node : node
{
    typedef double value_type ;

    explicit better_node( int i, double d ) : node(i), v2(d) {}

    value_type value() const { return v2 ; }

    double v2 ;
} ;

int main()
{
     // fine, better_node is derived from node
     base<better_node> ok( better_node(2,4.7), better_node(0,8.5) ) ;
     std::cout << ok.total_value() << '\n' ;

     // also fine, use the base class node directly
     base<node> also_ok( node(7), node(19) ) ;
     std::cout << also_ok.total_value() << '\n' ;

     // error: base<double,void> is an incomplete type;
     // a variable of an incomplete type can't be defined
     base<double> not_ok ;
}
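If a C++11 compiler is available, a similar restriction can probably be had without Boost. The sketch below swaps SFINAE for a plain static_assert on std::is_base_of, which is a different technique with a similar effect: a non-node type is rejected at compile time with a readable message. Holding the nodes in a std::vector member is an assumption made for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

struct node
{
    typedef int value_type;
    explicit node(int i) : v(i) {}
    value_type value() const { return v; }
    int v;
};

struct better_node : node
{
    typedef double value_type;
    better_node(int i, double d) : node(i), v2(d) {}
    value_type value() const { return v2; }
    double v2;
};

template <typename T>
struct base
{
    // Rejects any T that is neither node nor derived from node.
    static_assert(std::is_base_of<node, T>::value,
                  "T must be node or a class derived from node");

    // The result type follows whatever the node type announces.
    typename T::value_type total_value() const
    {
        typename T::value_type total{};
        for (std::size_t i = 0; i < nodes.size(); ++i)
            total += nodes[i].value();
        return total;
    }

    std::vector<T> nodes;
};
```

With this version, base<double> fails with the static_assert message rather than an incomplete-type error.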

Hey,

Here is a description of my problem, as I believe someone asked.

In general, the program I am creating will take in N dimensional data. Each node in the program will represent one set of N dimensional values. There will be a limited number of nodes, and so the aim of the program will be to gradually alter the values stored in each node in order to minimize the error between an input, and the values stored within the node.

In general then, there will be a number of Nodes, each capable of storing an N dimensional pattern. It is useful if each node can also remember the error between its own pattern and the input pattern presented to the program at any given time, in order to find which node is closest and update the patterns stored within each node accordingly. There will also be a class containing these nodes, which is responsible for carrying out the learning algorithm on them by adjusting their values in some way so that they best approximate the N dimensional patterns presented to it.
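To make the general case concrete, here is a minimal sketch of a node that stores a pattern and remembers its error, plus a search for the closest node. The names PatternNode, UpdateError and FindClosest are hypothetical, and squared Euclidean distance is assumed as the error measure since the description above does not fix one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node: an N-dimensional pattern plus the error last measured
// against an input. Squared Euclidean distance is assumed as the error.
struct PatternNode
{
    std::vector<double> pattern;
    double error;

    explicit PatternNode(const std::vector<double>& p) : pattern(p), error(0.0) {}

    void UpdateError(const std::vector<double>& input)
    {
        error = 0.0;
        for (std::size_t d = 0; d < pattern.size(); ++d)
        {
            const double diff = input[d] - pattern[d];
            error += diff * diff;
        }
    }
};

// Refreshes every node's error and returns the index of the closest node.
// Assumes nodes is non-empty and input has the same dimension as each pattern.
std::size_t FindClosest(std::vector<PatternNode>& nodes,
                        const std::vector<double>& input)
{
    std::size_t best = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        nodes[i].UpdateError(input);
        if (nodes[i].error < nodes[best].error)
            best = i;
    }
    return best;
}
```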

That is the general case. However, there are various specialisations of this. Here are a couple for an example:

1. A "Self Organizing Map", or SOM. In this approach, each node stores an input pattern as above, and the aim is to find the node most closely matching a given input pattern, as above. However, in this approach, nodes also have a position on a grid. This position is used for the learning process. Given that each node has a fixed position on a grid, we find the node closest to the input pattern presented, and then update that node and nodes nearby it on the grid, by moving the patterns stored in them closer to the input pattern. How much the nodes' patterns are moved towards the input pattern depends on how far away the node is, on the fixed grid, from the node closest to the input pattern.

This requires nodes to also remember their position on a grid. This also requires the object containing the nodes to perform a learning algorithm utilising that position, and to create a given number of nodes, each with a fixed position on a grid.
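The grid-based update could be sketched along these lines. Everything specific here is an assumption for illustration: the SomNode struct, a 2-D grid, the Gaussian fall-off with grid distance, and the learning_rate and radius parameters.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical SOM node: a pattern plus a fixed position on a 2-D grid.
struct SomNode
{
    std::vector<double> pattern;
    int row;
    int col;
};

// Moves every node's pattern towards the input, scaled by how far the node
// sits on the grid from the winning node. The Gaussian fall-off is an
// illustrative choice, not something the description above prescribes.
void UpdateSom(std::vector<SomNode>& nodes, std::size_t winner,
               const std::vector<double>& input,
               double learning_rate, double radius)
{
    const int win_row = nodes[winner].row;
    const int win_col = nodes[winner].col;
    for (std::size_t i = 0; i < nodes.size(); ++i)
    {
        const double dr = nodes[i].row - win_row;
        const double dc = nodes[i].col - win_col;
        const double influence =
            std::exp(-(dr * dr + dc * dc) / (2.0 * radius * radius));
        for (std::size_t d = 0; d < input.size(); ++d)
            nodes[i].pattern[d] +=
                learning_rate * influence * (input[d] - nodes[i].pattern[d]);
    }
}
```

The winner gets the full learning rate and nodes further away on the grid get progressively less.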

1a. The self organizing map from (1) can be modified by altering the learning approach in smaller ways. This will lead to classes derived from the SOM class (which itself is derived from a Base class), which overload the appropriate functions in order to perform the modified learning method.

2. "Growing Neural Gas". In this approach, each node stores an input pattern as above, and once again, the aim is to find the node most closely matching a given input pattern, as above. In this approach, nodes don't have a position on a grid as with (1). However, nodes are linked with each other, and so the Node class will have to have a vector of links (either pointers, or integers describing their position in a vector or their ID). The learning process will involve the closest node and nodes linked to it having their patterns moved towards a given input pattern.
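The link-based update might look something like the sketch below, which stores links as indices into the node vector. It covers only the step described above (moving the closest node and its linked nodes); the names LinkedNode, MoveTowards and UpdateLinked, and the two separate rates, are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical linked node: a pattern plus the indices of its neighbours.
struct LinkedNode
{
    std::vector<double> pattern;
    std::vector<std::size_t> links;
};

// Moves a pattern a fraction `rate` of the way towards the input.
void MoveTowards(std::vector<double>& pattern,
                 const std::vector<double>& input, double rate)
{
    for (std::size_t d = 0; d < pattern.size(); ++d)
        pattern[d] += rate * (input[d] - pattern[d]);
}

// Updates the closest node and every node linked to it. Using a smaller
// rate for neighbours than for the winner is an illustrative assumption.
void UpdateLinked(std::vector<LinkedNode>& nodes, std::size_t winner,
                  const std::vector<double>& input,
                  double winner_rate, double neighbour_rate)
{
    MoveTowards(nodes[winner].pattern, input, winner_rate);
    const std::vector<std::size_t>& links = nodes[winner].links;
    for (std::size_t i = 0; i < links.size(); ++i)
        MoveTowards(nodes[links[i]].pattern, input, neighbour_rate);
}
```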

As you can see, there appears to be a hierarchical layout, by which nodes take on additional functionality in order to work with a learning class that itself has taken on additional functionality from the base one. Options 1 and 2 are both derived from some base, which can perform loading/saving/outputting of details etc. that are common to every learning class. Likewise, the types of node required for options 1 and 2 can be derived from a base node, so that they can be output/saved/etc. by using functions defined in the base Node class (and overridden to provide additional functionality when required).

I would very much welcome any other ideas on how to design this program. My hierarchical design was intended to minimize the amount of code I needed to duplicate and to provide as much base functionality as possible for things that do not necessarily need changing. Also, this hierarchy allows me to extend the base load and save functions easily by just appending any additional details to the save file and loading them back.

I hope this has been informative enough for a response, please let me know if you require any more detail!

Also, I realise my code will have errors; I just quickly typed it up for illustrative purposes as an example of my problem. Sorry if that's proved an inconvenience!

Thanks,
James

Conclusion

This thread has been very useful, and caused me to reevaluate my design somewhat. I found the mention of SFINAE particularly helpful, and the example code a nice illustration of the boost SFINAE usage.

I think that templates will get tricky down the line, and I've decided that the best way to allow any derived Base class to access the required derived Node functionality is to implement a simple virtual cast function, overridden in each derived class, to serve as an intermediary between the stored nodes and the Base functions. This allows as much polymorphism as I'd be able to make use of, while cutting the cost of extending code (worrying about more templating etc. down the line) and ensuring it cannot be misused at the same time (anything not derived from Node won't work with the base Node* vector anyway).
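For anyone reading later, one possible reading of that approach is sketched below. It is a guess at the shape rather than my actual code: NodeAt is a hypothetical name, the classes are cut down, and it relies on covariant return types so that the cast lives in exactly one place per derived class.

```cpp
#include <cstddef>
#include <vector>

class Node
{
public:
    explicit Node(int v) : value(v) {}
    virtual ~Node() {}
    int GetValue() const { return value; }
private:
    int value;
};

class BetterNode : public Node
{
public:
    BetterNode(int v, double v2) : Node(v), another_value(v2) {}
    double GetBetterValue() const { return another_value; }
private:
    double another_value;
};

class Base
{
public:
    Base() = default;
    Base(const Base&) = delete;             // owns raw pointers, so no copying
    Base& operator=(const Base&) = delete;
    virtual ~Base()
    {
        for (std::size_t i = 0; i < some_nodes.size(); ++i)
            delete some_nodes[i];
    }
    // Intermediary between the stored pointers and the functions using them.
    virtual Node* NodeAt(std::size_t i) const { return some_nodes[i]; }
protected:
    std::vector<Node*> some_nodes;
};

class BetterBase : public Base
{
public:
    BetterBase()
    {
        for (int i = 0; i < 4; ++i)
            some_nodes.push_back(new BetterNode(i, i / 2.0));
    }
    // Covariant override: BetterBase only ever stores BetterNode objects,
    // so the downcast is confined to this one function.
    BetterNode* NodeAt(std::size_t i) const override
    {
        return static_cast<BetterNode*>(some_nodes[i]);
    }
    double AddBetterValues() const
    {
        double total = 0.0;
        for (std::size_t i = 0; i < some_nodes.size(); ++i)
            total += NodeAt(i)->GetBetterValue();
        return total;
    }
};
```

The static_cast is only safe as long as BetterBase is the sole thing filling some_nodes, which is the assumption this design rests on.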

Thanks for your help and suggestions, they have been most useful in helping me explore alternate routes :)

James,
