I have been working through the Accelerated C++ book. In the fifth chapter I am beginning to construct my own programs without the aid of step by step instructions from the book. There is an exercise that requests for me to write a program that produces a permuted index. The program below is what I have come up with.

  1. Is my code well-written and readable?
  2. In split.h, should splitBySpace and splitBySep be part of two different classes, or is the fact that they both have similar functions enough to keep them in the same class?
  3. For the most part I have been using include directives in the header files. If the header file has the include directive, then I have been opting to omit that same include directive from the source file (my current thought is that it may be redundant). Is this correct?

I know that my program functions, what I am trying to do is build good habits as a C++ programmer and I would like to build those habits as early as possible if there are any vets that could give a student any advice or praise.

merge.h

#ifndef GUARD_merge_h
#define GUARD_merge_h

// merge.h
#include <string>
#include <vector>

std::string merge(const std::vector<std::string>&);
#endif

merge.cpp

#include "merge.h"

using std::string;
using std::vector;

string merge(const vector<string>& vec)
{
    string s = "";
    vector<string>::const_iterator iter = vec.begin();
    // merge the strings in the vector
    while(iter != vec.end()) {
        s += *iter + " ";
        ++iter;
    }
    return s;
}

split.h

#ifndef GUARD_split_h
#define GUARD_split_h

// split.h
#include <string>
#include <vector>

bool issep(char);
std::vector<std::string> splitBySpace(const std::string& s);
std::string splitBySep(std::string& s);
#endif

split.cpp

#include <algorithm>
#include "split.h"

using std::string;
using std::vector;
using std::max;

bool issep(char x)
{
    return x == ',';
}

vector<string> splitBySpace(const string& s)
{
    vector<string> ret;
    typedef string::size_type string_size;
    string_size i = 0;

    while(i != s.size()) {
        // ignore leading blanks
        while(i != s.size() && isspace(s[i]))
            ++i;
        // find end of next word
        string_size j = i;
        while(j != s.size() && !isspace(s[j]))
            ++j;

        // if we found some nonwhitespace characters
        if(i != j) {
            // copy from s starting at i and taking j - i chars
            ret.push_back(s.substr(i, j - i));
            i = j;
        }
    }
    return ret;
}

string splitBySep(string& s)
{
    string left = "";
    string right = "";
    typedef string::size_type string_size;
    string_size i = 0;


    // if the rotation does not need to be split,
    // return the full rotation as the right column
    if(issep(s[s.size() - 2])) {
        s = s.substr(0, s.size() - 2);
        left = "";
        return left;
    } else
        ++i;

    // split the string into the two columns
    while(i != s.size()) {

        if(issep(s[i])) {
            right = s.substr(0, i);
            left = s.substr(i + 1, s.size());
        }
        ++i;
    }
    s = right;
    return left;
}

permuted_index.cpp

#include <algorithm>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>
#include "merge.h"
#include "split.h"

using std::cin;
using std::cout;
using std::domain_error;
using std::endl;
using std::getline;
using std::max;
using std::rotate;
using std::size_t;
using std::string;
using std::vector;

// permuted_index.cpp
// This program finds the permuted index of any number of
// given phrases.
// The algorithm that I use involves creating rotations of
// the words of each line in order to alphabetize them.

int main() {
    vector<string> rotations; 
    string::size_type maxlen = 0;
    string s;

    // read and split the words in the phrase and create word 
    // rotations from them
    while(getline(cin, s)) {
        size_t found = s.find(',');
        // throw domain error if the separator character ',' is found
        if (found != string::npos)
            throw domain_error("the phrase must not contain the separator character: ','");

        // split the phrase into words, appending a space and comma
        // to denote the end of the phrase and append the separator
        vector<string> words = splitBySpace(s + ",");

        vector<string>::iterator wordsIter = words.begin();
        // use the words to create rotations
        while(wordsIter != words.end()) {
            string x = merge(words);
            rotations.push_back(x);
            rotate(words.begin(),words.begin() + 1,words.end());
            ++wordsIter;

            // find the phrase with the most characters
            maxlen = max(x.size(), maxlen);
        }
    }
    // alphabetize the rotations
    sort(rotations.begin(), rotations.end());

    vector<string>::iterator rotationsIter = rotations.begin();
    // split the rotations by separator and output in 2 columns
    while(rotationsIter != rotations.end()) {
        string q = splitBySep(*rotationsIter);
        cout << q << string(maxlen - q.size(), ' ') << *rotationsIter << endl;
        ++rotationsIter;
    }
    return 0;
}

You certainly deserve a lot of praise for this piece of code! We rarely see beginners producing code at that level of "good practices" of programming.

Is my code well-written and readable?

Yes, it's well-written and readable. You get an A+ here. I really can't find anything to critique in that aspect of your code. It's well spaced out, consistently styled, the names are well chosen and meaningful, and you have a healthy amount of comments in your code (without being excessive).

In split.h, should splitBySpace and splitBySep be part of two different classes, or is the fact that they both have similar functions enough to keep them in the same class?

They are not part of the same class, they are part of the same header / source. This is totally fine. Headers and source files are meant to regroup things that are very much related or very similar. The most important thing is that logical separation. After that, you can be concerned about size (not making them too large), but that's not a problem for you now ("large" starts at maybe one thousand of lines of code). So, the "one class one header" rule is only a rule of thumb to help you achieve the other two objectives (logically related, and not too big).

For the most part I have been using include directives in the header files. If the header file has the include directive, then I have been opting to omit that same include directive from the source file (my current thought is that it may be redundant). Is this correct?

Well, this is a bit more subtle. The reality is that pretty much everyone does as you do, even seasoned or professional programmers. But technically-speaking, you should be repeating the includes in the source file. This is because you should not depend on the fact that a header will necessarily include another one that you need. This is because it might not be true. You probably don't know enough yet about C++, but there are ways to avoid including the headers for things that you don't actually completely need yet. What this means is that a header could declare a function taking a parameter of some type, without including the header that declares that type (by using a "forward declaration" instead), which means that you have to include that necessary header in the source file. So, the general rule is that you should not assume that a header includes anything else. It's not really redundant because of the header guards, but it is kind of tedious to always include everything, and that's why it is not a rule that is too rigorously followed, especially when it works regardless.

So, the guidelines there is to try to minimize the headers you include inside your headers, and to try not to assume that the headers you include include any other headers. But don't lose sleep over it.

if there are any vets that could give a student any advice or praise.

It's really hard to find much to critique in your code, but I will give it a try.

1) Your header guards should be in all upper-case letters. This is because of the very wide-spread convention that pre-processor macros (defines) should be in all upper-case. There are dangers associated to macros and therefore, the all upper-case lettering makes them stand out to the programmers reading the code. For header guards, this is not as critical (because those macros are not used anywhere else), but you might as well do it everywhere.

2) Your consistent use of using statements like using std::string; is good and bad. It's good in the sense that it's much better than a blanket using namespace std; statement. It's bad in the sense that it's verbose and not often that useful.

If you use the class or function only once, what's the point of the using statement?

Part of the problem with this is also similar to the problem with header inclusions. You are essentially listing, at the top of the source file, all the things you use later on inside a bunch of functions. If you decide to move those functions from one source file to another, you'll have to figure out which using-statements you need to bring along as well. It makes the code less "copy-pastable".

I generally prefer to either use the std:: prefix everywhere, or issue the using-statements inside the body of the function that uses that class or function often.

3) You should prefer idiomatic code as much as possible. Use for-loops instead of while-loops for the idiomatic "iterate through elements" loops, like in your merge function:

for(vector<string>::const_iterator iter = vec.begin(); 
    iter != vec.end(); ++iter) {
    s += *iter + " ";
}

Or, some of these "advanced" alternatives:

for(vector<string>::const_iterator iter = vec.begin(), iter_end = vec.end();
    iter != iter_end; ++iter) {
    s += *iter + " ";
}

Or (range-based for-loops (C++11)):

for(auto x : vec) {
    s += x + " ";
}

A big part of making code readable is making the code look obvious to most programmers. Most programmers see a for-loop and assume it's some basic linear traversal of an array or list. When you see a while-loop, you expect something a bit more complex (i.e., you think "there must be a reason he is not using a for-loop here", and you waste time looking for that special reason, if there is none).

I don't think I can find anything else to critique.... it's pretty good code you have there (in fact, if this was an assignment in a course, I would assume that the student cheated!). There are a few minor inefficiencies in the code, but nothing to cry about.

I dreamed of such a response to my post, Mike! I appreciate your praise and the details in your critique, which I will undoubtedly use... I am now reading about the range-based for-loops.

This is not a critique of your code but a way to make it look a little better. In your SplitBySpace function you can make it a lot simpler by using stringstreams.

vector<string> SplitBySpace(const string & s)
{
    stringstream ss;
    string temp;
    // load s into the stringstream
    ss << s;
    vector<string> ret;
    // while there a parts in ss load them into ret
    while(ss >> temp)
    {
        ret.push_back(temp);
        temp.clear();
    }
    return ret;
}

Using a stringstream takes advantage of the fact that white space is ignored just like using cin. I do have to agree with Mike that your code is very well done for a begginer. If only more people could put as much effort into there code when starting out.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.