I want to create a word processor. It seems like a useful project because I have strong feelings about how big modern word processors like MS Word and OpenOffice don't do things right, so my solution is to make one with more modest features but with the huge advantage of working the way I want it to work. It seems like a fun and rewarding project, but it is also challenging and I am doing it single-handedly, so I am looking to the internet for help and ideas.
A project like this is too big and complicated to just leap into. I need to break it into small, well-designed pieces that I can work on independently and build up into a working word processor. This seems obvious, and it is a basic principle of good design, but it is troublesome because a word processor seems so monolithic by nature. I can't see the natural dividing lines where I should break up my project, and every time I attempt to break it into pieces, the pieces end up confused and tightly coupled. I think I may need advice from someone wiser than myself in the ways of object-oriented design, since I'm sure that object-oriented programming is the way to go here.
Best of all would be to look at the design of an existing word processor, perhaps something open-source, but it's really not easy to find a good one. I don't want something monolithic that I can't understand, and I don't want a mere text editor that's so simple that it could never be expanded to work on proper documents. I intend to use a piece table data structure for the text, because I consider it better suited to large documents, but the examples I find on the web seem to prefer gap buffers or other structures that I consider inferior. A gap buffer may work, but every time the editing position moves, the text between the old and new positions has to be copied across the gap, which seems needless to me.
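To make the copying concrete, here is a minimal gap buffer sketch in Python. It is only an illustration and not taken from any particular editor; the class name GapBuffer, the copies counter, and the default gap size of 16 are all made up for this example.

```python
class GapBuffer:
    """Toy gap buffer (hypothetical names): text lives in one array with a movable hole."""

    def __init__(self, text, gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)      # first slot of the gap
        self.gap_end = len(self.buf)    # one past the last slot of the gap
        self.copies = 0                 # characters moved so far, for illustration

    def move_gap(self, index):
        """Move the gap so that it starts at character position `index`."""
        while self.gap_start > index:   # shift characters right, across the gap
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
            self.copies += 1
        while self.gap_start < index:   # shift characters left, across the gap
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1
            self.copies += 1

    def insert(self, index, text):
        """Insert `text` before character position `index`."""
        self.move_gap(index)
        for ch in text:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def _grow(self, extra=16):
        # Re-open an empty gap by inserting fresh slots at its position.
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def char_at(self, index):
        """Constant-time lookup by character index."""
        if index >= self.gap_start:
            index += self.gap_end - self.gap_start
        return self.buf[index]

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


gb = GapBuffer("Hello world")
gb.insert(5, ",")        # the gap has to travel from the end of the text to position 5
print(gb.text())         # Hello, world
print(gb.copies)         # 6 characters were copied just to move the gap
```

Lookup by index stays cheap, but the cost of an edit grows with how far the gap has to travel from the previous edit.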
The principles of a piece table are simple enough that there's no reason not to use it, in my opinion, but using it tends to flavor everything else in the project. A piece table tends to be represented using a linked list, and so it is unsuited to using character index numbers for access. When you are using a gap buffer, you can access the 521st character in the document (for example) in constant time, but when you have a piece table you would need to walk the list from the beginning of the document, adding up piece lengths, to find which piece contains the 521st character. Fortunately there is nothing in the nature of a word processor that says being able to look up the 521st character should be fast. A word processor only needs to know where the currently visible part of the document starts and iterate through the document from that point until the screen is filled with text, which is more suited to a linked list than an array. Unfortunately, the best examples that I've found on the internet so far throw character index numbers all over the place, making it very difficult to use any of them in a piece table design. Most frustrating of all are the piece tables that conform to a character-index interface, and thereby practically destroy the advantages of the piece table.
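Here is a rough sketch, in Python, of the kind of interface I have in mind: a linked-list piece table where a position is a (piece, offset) pair instead of a character index. All of the names (Piece, PieceTable, start, insert, text_from) are made up for illustration, deletion is left out, and a real version would use a proper growable buffer instead of Python string concatenation.

```python
class Piece:
    """One run of text: a slice of either the original buffer or the add buffer."""

    def __init__(self, buf, start, length):
        self.buf = buf          # "orig" or "add"
        self.start = start      # offset of the run inside that buffer
        self.length = length
        self.prev = None
        self.next = None


class PieceTable:
    def __init__(self, text):
        # The original text is never modified; new text is only appended to "add".
        self.buffers = {"orig": text, "add": ""}
        # Empty sentinel pieces so that edits never special-case the ends.
        self.head = Piece("orig", 0, 0)
        self.tail = Piece("orig", 0, 0)
        first = Piece("orig", 0, len(text))
        self._link(self.head, first)
        self._link(first, self.tail)

    @staticmethod
    def _link(a, b):
        a.next, b.prev = b, a

    def start(self):
        """Position of the start of the document, as a (piece, offset) pair."""
        return self.head.next, 0

    def insert(self, piece, offset, text):
        """Insert non-empty text at (piece, offset); return the position just after it.

        The old piece is replaced, so other positions that pointed into it
        go stale. A real editor would need to deal with that.
        """
        add_start = len(self.buffers["add"])
        self.buffers["add"] += text
        new = Piece("add", add_start, len(text))
        left = Piece(piece.buf, piece.start, offset)
        right = Piece(piece.buf, piece.start + offset, piece.length - offset)
        # Relink the neighbours around the split, dropping empty halves.
        chain = [piece.prev]
        chain += [p for p in (left, new, right) if p.length > 0]
        chain.append(piece.next)
        for a, b in zip(chain, chain[1:]):
            self._link(a, b)
        return new, new.length

    def text_from(self, piece, offset, limit):
        """Return up to `limit` characters starting at (piece, offset)."""
        out = []
        while piece is not self.tail and limit > 0:
            data = self.buffers[piece.buf]
            chunk = data[piece.start + offset:piece.start + piece.length][:limit]
            out.append(chunk)
            limit -= len(chunk)
            piece, offset = piece.next, 0
        return "".join(out)


doc = PieceTable("Hello world")
piece, _ = doc.start()
cursor = doc.insert(piece, 5, ",")        # no existing text is copied or moved
cursor = doc.insert(*cursor, " big")      # keep typing at the returned position
print(doc.text_from(*doc.start(), 100))   # Hello, big world
print(doc.text_from(*cursor, 4))          #  wor
```

The display code would hold a (piece, offset) position for the top of the visible area and call something like text_from until the screen is full, so nothing ever needs to count characters from the start of the document.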
Even if it makes the project more difficult, I am committed to making something of quality, something that uses a piece table, even if that means that I can't take advantage of existing libraries that depend on character index numbers. I have reluctantly discarded more than one such library.
With the right design, any project, no matter how huge, becomes simple. I'd just like to find something that clearly shows how a word processor is organized. I imagine it would take the form of a graph that shows data flowing along arrows through nodes that manipulate the data, like you would have with a compiler or a graphics renderer. Or perhaps it should be represented as layers, like an operating system. Has anyone ever seen a picture of a word processor broken down into pieces like that?