So, I'm trying out a project on my own because I have no real clue how to do anything complex or useful in C++ (applications for example). This project calls for creating a simple console text editor with basic functionality (buffer, command set, redisplay). The problem is that I don't understand the buffer step. How do you decide where the gap starts in a gap buffer? I have searched online for ages but can't find any definitive source on how the algorithm for the data structure actually works.

I guess what I'm trying to say is that this is my frist crack at trying to develop a rudimentary application and I'm getting stuck at the very first step; I really don't understand how I can use C++ to develop more involved applications.

By gap buffer do you mean a character array that holds the characters that are typed from the keyboard? All you need is an integer whose value is the position in the character array where the next character is to be stored. Where the gap starts? At the beginning.

char buffer[255];
int nextChar = 0;

cout << "Enter a character\n";
cin >> buffer[nextChar];
++nextChar; // point to the next available char in the buffer

OK; now how do I determine the size of the gap?

You should probably start by making a Dynamic Array, which is what is implemented in the std::vector class. The idea is simple. Instead of increasing the size of the array at every insertion, you keep it always bigger than it needs to be, using a base-2 rule for the size. In other words, you start with a size of 2, and when there is a request to add a third element to the array, you increase the "capacity" to 4 (2x2). When more than 4 elements are needed, you increase the capacity to 8, and then to 16, and so on so forth. At the end, you are never using more than twice the memory that you actually need, and you only have to very rarely (logN) increase the capacity.

A gap buffer works the exact same way (the whole power-of-2 rule for increasing capacity), but instead of having all the valid data at the start, you have part of it at the beginning (up to where the cursor is currently located) and part of it at the end (from the cursor to the end of the text), leaving a gap in-between. The size of the gap is just a matter of (capacity - size) where the capacity is whatever power of 2 is greater than the size. And to keep track of the various positions that are relevant, you simply keep different pointers within your class, like a pointer to the start of the array, one to the start of the gap (or end of first part of text), and one to the start of the second chunk of text. And you just update those pointers whenever there is a change in the buffer (insertions, deletions, relocations, etc.).

OK...So let's say you have a line of text 16 characters long and a gap buffer 80 bytes (chars) in size. How would you split that text since the gap is supposed to be 64 bytes (chars) long? Do you just cut the text in half across the gap or something like that? In my implementation I was gonna make a 4K buffer and double that every time the user needed more space; I just don't know how insertion would work in relation to the gap since the buffer is going to be huge.

Here is a very simple implementation of a gap buffer:
(Disclaimer: If this is for a school assignment, don't hand in the code below, because it intentionally uses things that would be immediately spotted as plagarism)

class GapBuffer {
  private:
    char* start_ptr;
    char* cursor_ptr;
    char* gap_end_ptr;
    char* end_ptr;
    std::size_t capacity;

    void updateCapacity() {
      if(cursor_ptr == gap_end_ptr) {
        capacity *= 2;  // double the capacity.
        char* tmp_ptr = new char[capacity];
        std::copy(start_ptr, cursor_ptr, tmp_ptr); // copy the before-gap part.
        std::copy(gap_end_ptr, end_ptr, tmp_ptr + capacity - (end_ptr - gap_end_ptr)); // copy the after-gap part.
        // update the pointers:
        cursor_ptr = tmp_ptr + (cursor_ptr - start_ptr);
        gap_end_ptr = tmp_ptr + capacity - (end_ptr - gap_end_ptr);
        end_ptr = tmp_ptr + capacity;
        // swap the memory:
        delete[] start_ptr;
        start_ptr = tmp_ptr;
      };
    };

  public:
    GapBuffer(std::size_t aInitialCapacity = 256) {
      capacity = aInitialCapacity;
      start_ptr = new char[capacity];
      cursor_ptr = start_ptr;
      end_ptr = start_ptr + capacity;
      gap_end_ptr = end_ptr;
    };

    ~GapBuffer() {
      delete[] start_ptr;
    };

    GapBuffer(const GapBuffer& rhs) {
      capacity = rhs.capacity;
      start_ptr = new char[capacity];
      std::copy(rhs.start_ptr, rhs.end_ptr, start_ptr);
      cursor_ptr = start_ptr + (rhs.cursor_ptr - rhs.start_ptr);
      end_ptr = start_ptr + capacity;
      gap_end_ptr = start_ptr + (rhs.gep_end_ptr - rhs.start_ptr);
    };

    friend
    void swap(GapBuffer& lhs, GapBuffer& rhs) {
      using std::swap;
      swap(lhs.capacity, rhs.capacity);
      swap(lhs.start_ptr, rhs.start_ptr);
      swap(lhs.cursor_ptr, rhs.cursor_ptr);
      swap(lhs.end_ptr, rhs.end_ptr);
      swap(lhs.gap_end_ptr, rhs.gap_end_ptr);
    };

    GapBuffer& operator=(GapBuffer rhs) {
      swap(*this, rhs);
      return *this;
    };

    // insert char function:
    void insert(char c) {
      (*cursor_ptr++) = c;
      updateCapacity();
    };

    // remove char function:
    void removeLastChar() {
      --cursor_ptr;  // you just need to move back the cursor-pointer.
    };

    // move cursor forward:
    void moveCursor(std::ptrdiff_t aOffset = 1) {
      if(aOffset > 0) {
        // need to copy memory from the back to the front.
        std::copy(gap_end_ptr, gap_end_ptr + aOffset, cursor_ptr);
        gap_end_ptr += aOffset;
        cursor_ptr += aOffset;
      } else if(aOffset < 0) {
        // need to copy memory from the front to the back.
        std::copy(cursor_ptr + aOffset, cursor_ptr, gap_end_ptr + aOffset);
        gap_end_ptr += aOffset;
        cursor_ptr += aOffset;
      };
    };

    // ... so on for more fancy functions.

};

So, you see, the basic idea is very simple, you just move the cursor around by copying memory either from front to back or from back to front, and moving the cursor pointer. For insertions and deletions, it is just a matter of writing and moving the cursor forward, or simply moving the cursor back (discarding whatever ends up in the gap between cursor_ptr and gap_end_ptr). When needed (i.e., during an insertion), the gap can be checked for being zero (when cursor_ptr == gap_end_ptr), at which point the capacity is increased by a factor of two. This is really all it is, it is just a matter of keeping track of a few key points in the buffer, via pointers, and maintaining them correctly.

So the basic premise is that you start the gap at the beginning, and decrease it when inserting characters. When you move the cursor pointer, what happens to the gap? If we start from an empty array, how does the buffer get filled?

Ok, below is the basic logic, illustrated. I will mark the start pointer with "s", the end pointer with "e", the cursor pointer with "c" and the pointer to the end of the gap with "g". All characters marked as "X" are currently part of the gap (which is only logical, of course, as the actual values of the memory in the gap is just whatever was there before).

So, starting with an empty buffer of length 8:

XXXXXXXX
s       e
c       g

Notice how g == e and s == c at that point, so, the gap fills the entire buffer (capacity == 8, size == 0).

If the user enters the characters "Hi my", you get:

Hi myXXX
s       e
     c  g

Notice how the cursor pointer "c" always points to the place where a new character can be inserted. In other words, every inserted character is written to that place, and then the cursor pointer is incremented, effectively shrinking the gap.

Now, if the user realizes that there needs to be a comma between "Hi" and "my", he moves the cursor back to right after the word "Hi", in which case, a copy occurs which sends a few characters to the end of the buffer, and you get this:

HiXXX my
s       e
  c  g

Notice how the end-of-gap pointer has been moved back as valid characters were copied to the end of the buffer.

So, the user enters a comma:

Hi,XX my
s       e
   c g

Then, the user decides that he wanted to write "Hi y'all," instead of just "Hi,", so, he first back-tracks:

HiXX, my
s       e
  c g

Then, he writes in the characters " y":

Hi y, my
s   c   e
    g

Now, the gap buffer detects that the cursor pointer and the end-of-gap pointer are coincident, so, the capacity of the buffer is increased to twice what it was before (it was 8, now it will be 16), and updates the pointers accordingly:

Hi yXXXXXXXX, my
s               e
    c       g

Now, there is enough place to keep on writing to get:

Hi y'allXXXX, my
s               e
        c   g

Then, the user moves the cursor to the end (after "my"), which causes the memory to be copied from after the gap to before the gap:

Hi y'all, myXXXX
s               e
            c   g

Then, the user writes " nam", which gives this:

Hi y'all, my nam
s               e
                c
                g

Which makes the cursor and end-of-gap pointer coincident again, so, the capacity is increased again (now to 32 characters in total):

Hi y'all, my namXXXXXXXXXXXXXXXX
s                               e
                c               g

And then the user completes the sentence:

Hi y'all, my name is fred!XXXXXX
s                               e
                          c     g

Et voilĂ !

That is pretty much the complete cycle of possible things that can happen and how to deal with it. And that is pretty much what the class I posted earlier does. The more fancy gap buffers would just involve doing batch insertions (instead of one character at a time), delaying the copying (when the gap is shifted by a move of the cursor) until the next insertion / deletion, and so on so forth.

commented: Very nice explanation and illustration +8
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.