Reference reading directly related to this post:
http://www.math.princeton.edu/matalive/Error/index.html

http://www.math.princeton.edu/matalive/Error/index.html

http://www.math.princeton.edu/matalive/Error/ErrorLab2/LZCompressor.html

http://www.math.princeton.edu/matalive/Error/ErrorLab2/LZCompressor.html

Hi all.

I've just started learning the ways of compression. I decided to start with the LZ algorithm. The resource I use is:

http://www.math.princeton.edu/matalive/Error/index.html

http://www.math.princeton.edu/matalive/Error/index.html

I understand the fundamentals of parsing the incoming text. Let's call the container that holds the unique sub-strings: a dictionary.
On the page where one can play online with a fully implemented LZ compressor, they use a number of ASCII characters to indicate the index of the unique sub-substring in the "dictionary". Index 1 is the ASCII character 33 ('!' = index 1, so 'SPACE' = not encountered before). ASCII-char 127 (DEL) is the 95th index. To me, a few steps in this online tutorial went too fast, missing a few steps.

Let's play with the online compression tool, I'll ask my questions along the way. I am accurate, so observe the spaces as well :)

I enter: "a"
output: "aa"

QUESTION: why the double a? I see only one letter, not encountered before, etcetera. So I expect to see a space then "a" => " a"

I enter: "aa"
output: "aa"

QUESTION: Why this output? To me the second "a" is a string in the dictionary on index 1. Since index 1 is ASCII-char 33, the
output to me should be: " a!".

I enter: "aaa"
output: " a!a!a"

QUESTION: this is how I think:

1) The dictonary looks like this:

index   substring   in_terms_of_index
"!"     "a"             NONE
""""        "aa"            !a

So the output must be (according to me): " a!a"

I enter: "aaaa"
output: " a!a a"

QUESTION: To my logic the output should be: " a!a!" . I'll show you my dictionary:

index    substring    in_terms_of_index
"!"        "a"                NONE
""""        "aa"               !a

So the output must be (according to me): SPACE for 1st "a" + "!a" for double "aa" + "!" for first index "a" = " a!a!"

And so forth.

I have another question:
Suppose my dictionary size exceeds 95? How would I code something with index 96 or higher? What is the logic? I can't just pad a
fixed amount of characters can I?

Thank your for your reading,

*bump*
I am terribly sorry for the bump. But I really need assistance. It isn't even for me.
- Val -

>It isn't even for me.
You sure used "I" a lot if it's not for you. You also seem to care quite a bit even though it's not for you. This strikes me as one of those "Hey, Doctor, my friend has this embarrassing problem" stories. :D

It seems to be you should start with simple encoding schemes and work your way up. Run-Length encoding is a good start in the field of compression. LZ is probably the next step up from Run-Length, but they both follow a similar scheme: mark repeating patterns with some kind of special encoding. With Run-Length, that encoding would be a simple integral value in front of or behind a character marking how many of that character the encoding is to be replaced with. In LZ, repeating patterns are indexed and filed at the beginning or end of the sequence so that each pattern only appears once, and the index replaces all patterns in the sequence.

>It isn't even for me.
You sure used "I" a lot if it's not for you. You also seem to care quite a bit even though it's not for you. This strikes me as one of those "Hey, Doctor, my friend has this embarrassing problem" stories. :D

If you read it right, you would have found that I am not interested in code whatsoever. I never asked for a single code snippet, and I can assure you I don't need it.

But I do need to know how the indexing issue is solved from an algorithmic point of view. It is exactly here where the tutorials stop. So this is the point where I start to post.
The reason is not that interesting, but in the end it is not for me indeed. I don't need compression algo's :).

>If you read it right, you would have found that I am not interested in code whatsoever.
And if you read it right, you would have discovered that I made no mention of code whatsoever. I was questioning your ferver about a problem that you claim means nothing to you. Rest assured that when I think you're a brainless moron who believes that everyone is obliged to give you everything you ask for, I'll inform you of it in no uncertain terms. And I won't be nearly as pleasant as I've been thusfar.

>But I do need to know how the indexing issue is solved from an algorithmic point of view.
From an algorithmic point of view, it's solved with a table.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.