18th March 2005

 

Wehner’s

“Fibonacci”

Data Compression

And the New Calculus of Sets

by Charles Douglas Wehner

© Patents pending


There is no genuine portrait of Leonardo Fibonacci. He was taken as a young boy to Algeria in the Byzantine empire, and portraiture was forbidden under Islam. He studied mathematics under various Muslim scholars - the Hisab al gebr wa'l muqqabalah of Al-Kwarizmi having arrived four centuries before.

His real name was Leonardo Bonacci - the son of Guilielmo Bonacci. So his various names - Ibn Bonacci, Bin Bonacci, Figlio Bonacci and Fi Bonacci - are just nicknames in the Arabic style (SON of Bonacci).

His father was an ambassador for the Republic of Pisa. He also became a kind of ambassador, writing at least four books including in 1202 the Liber abaci (book of calculations) in which his famous number series appears. He was the “mathematics ambassador of the Republic of Pisa”, delivering the al gebr to a mathematically illiterate Europe - the right man, in the right place at the right time.

To his various nicknames, he added “Leonardo Bigollo”. It seems that the word bigollo signifies a tramp - possibly a wry comment on his way of life, wandering through the Muslim lands of North Africa. His self-deprecating style was not lost on the government of Pisa, who wrote to him “Dear honourable and respected Leonardo Tramp”, and awarded him a salary in 1240 in recognition of his work. Shortly afterwards, he died.

It was in Book 6 of “The Elements” that Euclid, around 300 BC, had posed the problem of dividing a line C into parts A and B in such a manner that the ratio C over A is equal to the ratio A over B.

It turns out that this ratio - Phi - is the square root of five, plus one, all divided by two. As the square root of a prime number is irrational, and remains irrational even when incremented and halved, so Phi must be irrational. That is to say, it cannot be expressed by the ratio of two whole numbers.


The ratio is known as the GOLDEN SECTION, and is often found in Greek painting, and in the height-to-width ratio of Greek urns. It is said to bestow great elegance upon those artefacts that are so divided.

With typical humour, Fibonacci asked in his Liber Abaci how many rabbits one would get in a year from a single pair, if each pair could breed another pair each month, but only after becoming two months old.



Exponentiation

The first pair breeds a pair, which is 1. Next month it breeds a pair, which again is 1 - its first brood is growing up. On the third month, it breeds a pair, as does its firstborn pair - that is 2. The other pair is growing up. On the fourth month, three pairs breed (3) and two are growing up. Then there are 5, then 8, and so on.

Each number is the sum of the two previous.

These numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, and so on, are the Fibonacci Series.


Fibonacci studied the ratio of each of his numbers to the previous. 
If it were fixed, it would be Phi, the Golden Section. 
So each ratio is a Fibonacci Approximation to Phi.
These are 1, 2, 1.5, 1.66666, 1.6, 1.625 &c.

There can be no ratio of whole numbers (rational number) that equals an irrational one. Fibonacci knew this, but had discovered that there is now the possibility of finding approximations, like the popular 22/7 for pi. He studied the square root of two to find approximations.

As can be seen from the Fibonacci Approximations, as the numbers get bigger the approximation gets better.

Set Theory

When Gottlob Frege heard of Boolean Arithmetic, he set out to put all the principia of mathematics into a book, subject to the rigours of the new mathematical logic. It was Bertrand Russell (pictured) who wrote to him to say that this is impossible. Frege discontinued his work.

The reasoning is founded in SET THEORY.


What Frege had in mind was that all of logic and mathematics could be encapsulated into a single self-contained SET of principia.

Russell realised that the SET that was being constructed was an open set, whilst Frege’s set was a closed set.


What this meant was that as Frege started with his principia F, and created his proofs, R, he would need to create proofs of his proofs, proofs of his proofs of his proofs and proofs of his proofs of his proofs of his.......


It is only when the SETS that are being incorporated into the new set - the SUPERSET - are closed, that we know how many components there are in each, and therefore that the superset will be of finite size.

Russell received the Nobel Prize - but not for his mathematics (there is none). He received the 1950 prize for literature “In recognition of his varied and significant writings in which he champions humanitarian ideals and freedom of thought.”

On 9th July 1955, Lord Russell and Albert Einstein issued the Russell-Einstein Manifesto, signed by many eminences including double Nobel-prize winner Linus Pauling. It opens with a warning about the perils of “WEAPONS OF MASS DESTRUCTION” being developed by America and Russia.

The British establishment immediately ordered the production of so-called “documentary” films, exposing the alleged sex-lives of Russell and Einstein, in order to try to besmirch their monumental reputations. Russell’s huge following, particularly amongst members of the Campaign for Nuclear Disarmament (CND), obliged them to delay publication. Russell kept them waiting - he lived to be 97.

Russell’s Rule states simply that a set cannot contain itself.

When Lempel, Ziv and Welsh designed the LZW system of data-compression for Unix, with applications in the “Graphics Interchange Format” (GIF) and “Run Length Encoding” (RLE) image compression standards, they broke Russell’s Rule.

From the original data, the first byte, F, as above, is put into the output buffer. If that kind of byte repeats many times, the repeat set, R, as above, is opened. Then the whole of F is put into R.

Hovever, the set R cannot contain all of itself. We are obliged to step outside of arithmetic, to discover how much of R can go into itself.

Does it make sense to create an empty set? If you are trying to compress data, would you despatch compression codes that represent zero data? No. So the MINIMUM size of an LZW set is ONE.

Under this condition, we can license ourselves to break Russell’s Rule. We can put ALL of F into R, and then the FIRST - and only the first - of R. In the case of a blue-sky in a cartoon image, for example, F might contain a blue pixel and R would contain TWO blue pixels.

This restriction then propagates throughout the entire coding, so that any new set can contain all of a previous set plus ONE ONLY from the set that follows that previous set.

The consequence is a LINEAR GROWTH of coding efficiency.

Wehner coding puts TWO items from the original data into the output buffer. These are F and R. Even if R is a repeat of the first item F, no attempt is made to compress at this stage.

Wehner coding then opens up a new set, G, as pictured above. If in a graphic image there is a lot of blue sky, then F will be one blue dot, R will be one blue dot instead of two, and G will be two blue dots.

Should the blue expanse continue, the next set will contain the second blue dot plus the pair - giving three dots. Then the next set will contain the two plus the three, giving five. Then we have three plus five - giving eight. We have entered into the Fibonacci Series, with its “Golden rate of exponential growth”, hence the nickname “Fibonacci” for the method.

We will see later that there is more to the system than mere Fibonacci expansion.

But first, let us examine this 600 by 120 pixel graphic:


Each code of 1500 in the output can be depicted as a coloured line on a graph:


Notice first of all, the exponential growth at the start. Also, notice how the compression strings - which we will call Differals for reasons that will become clear - often contain colour.

Now compare the 2130 codes of an LZW file, such as a GIF:


There is a linear ramp at the beginning, followed by a series of spikes that contain no colour except at their tips. It is difficult for LZW to consume coloured patches into itself, because these are usually completed SETS, and LZW accepts only additional pixels, not additional sets.

The start is particularly interesting:


Wehner coding rises exponentially, and the codes quickly reach about two thousand pixels in this graphic. Although the image is 512 dots wide, the graph goes off screen.

Coherent data consists of bytes of uniform type. Here the grey is 192. So there is a Fibonacci expansion on coherent data, the divergence of compression efficiency, followed by a sudden collapse or convergence.

The divergence has already been shown to be approximately at the rate of 1.618 to one - the Golden rate. However, the convergence is different.

If we run out of grey pixels, we might fail to reach, say, 144. So we use the Fibonacci number 89 for a second time. This cannot be followed by 55, because we failed to reach 144. So we will fail to reach 55, and must use 34. After 34, we cannot use 21 because 34 plus 21 give 55. Just as we failed to reach 55, so we fail to reach 21.

We find that in the limit we are going backwards in TWOS. Each step is 1.618 TIMES the step before, so two steps are 1.618 times 1.618 or 2.618 (har, har, har)!

Going backwards one step divides by 1.618, (Phi) - or multiplies by 0.618, (phi), har, har, har! Going backwards two steps multiplies by 0.618 squared. This is 0.381966. But that is 1 MINUS 0.618 (har, har, har)!

So the divergence is 1 plus phi, and the convergence is better than or equal to 1 minus phi.


The LZW system, however, never leaves the 512-dot wide page. The slow divergence is compensated for by a convergence that is either instantaneous or needs just a single code. However, this is a poor return.

On this graphic, the compression was 1.4 times worse than with Wehner coding.

Phase Lock

Consider a situation where we have a repeating pattern of RGB, over and over again.

The first two codes will be R and G. Then, the B will be unable to find a match. Then we have RG.

The B that follows has RG after it, giving BRG.

So there is a phase lock onto the triad of colours, but not in the sequence we expect. After the initial “learning” stage, the machine has compressed three bytes into a single code.

However, there is now the need for another triad that starts on Blue. The system can find no way out other than to repeat the previous triad. Thereafter, however, the system gives TWO triads for a single code, THREE triads for one code, FIVE triads for a code, the Fibonacci series ... for as long as the repetitive data lasts.

Compressed Data

Wehner codes contain the essence of repetition compaction. That is to say, it may seem to be a clever idea to pack two pixels of 4 bits each into a byte, throughout the file, prior to compression.

However, when the graphic “Compression” was coded from a 4-bit bitmap, it came to 1448 codes. This is just 49 less than the 1497 codes of the 8-bit bitmap.

This may seem like a saving until one realises that there is still a task for the central processor to do. Before a graphic image can be displayed, the CPU must unpack every one of the 36000 codes into two bytes - but the “Fibonacci” 8-bit file delivers the data already unpacked as bytes.

Indexed Data

If we are dealing with 8-bit data that is not eight-bit greyscale but a restricted range of colours indexed into a palette, we are again at a disadvantage. A test was made of the “Compression” graphic coded as a 24-bit bitmap. It required 1791 Wehner codes. This showed that the data took 294 codes to learn to connect the colours into an internal “virtual palette”.

What you get in return for those 294 codes is a package that needs no explicit palette - thereby saving perhaps as much as the 768 bytes of a 256-colour 24-bit palette. The data is delivered after “Fibonacci” ready decoded into RGB, thereby saving the processor the additional work, and can even deliver 257 colours at little extra cost.

Every time a new colour arrives, the Wehner system goes through its “learning phase” - but is simultaneously building the output, if inefficiently. Having paid the price of this small inefficiency, one then finds that the extra colour is now firmly part of the Fibonacci indexing system.

Conventional wisdom has to be thrown away, now that exponential compression has arrived. Although bit-compressed data and indexed data can be coded, it is a matter of preference. When the data is presented as three bytes per pixel, Wehner coding learns to wrap it up as one code per pixel and then two, three, five and onward. When the coding is reversed, RGB is delivered ready for display.

The efficiency of a restricted palette depends on statistics rather than bit-packing in the Fibonacci system. How many ways can you toss a coin before a previous result is guaranteed to repeat? And how many times can you toss a die? Coins have two sides, so the answer is 2. Dice have six sides, so that answer is 6. And how many PAIRS of throwings are possible? HEAD, HEADS then HEADS, TAILS then TAILS, HEADS then TAILS, TAILS. So it is 2 squared for coins and 6 squared for dice. The two-colour data will start repeating, and therefore compressing, sooner than will the six-colour data.

So a sixteen-megacolour 24-bit system when used for just sixteen colours costs a few hundred extra codes. This matters on the Internet with a small graphic that itself has few bytes. However, with animated cartoons sent over a television channel it does not matter. The system pays a tiny price, but is always ready to extend its palette - such as if a subtitle in unfamiliar colours has suddenly to be broadcast together with that cartoon.

The maximum string-length on 4-bit packed data was 1974 bytes - equivalent to 3948 pixels at one go. The maximum string-length for that same “Compression” graphic, packed as one byte per pixel, was 6765. The maximum byte-length for that graphic packed as 24-bit data was 10946, equivalent to 3648.66666 pixels.

Again, conventional wisdom can be discarded. These results - 3948, 6765 and 3648 - can be looked upon as virtually identical. That is because they differ by only one or two steps in the Fibonacci series, two steps being a ratio of 2.168 to one. And their sheer magnitude contrasts dramatically with the 273 of the LZW coding.

Disparate Data

It may seem unlikely that this system is of any use for data other than of the repetitive kind. However, that is not the case.

Firstly, we must consider disparate data - data in which nothing pairs up. We could lay out the bytes 0, 1, 2, 3, and so on as pairs in a grid in order to look for pairs that match:


What emerges from this study is that pairs like 00 and 11 must be eliminated. This puts an oblique line through the grid.

Then we find that all pairs that are earlier than the matched pairs must also be eliminated. That is, 10 must be eliminated because it already appeared in the sequence 0102. Similarly, 20 appeared in the sequence 0203, and 21 appeared in 1213.

So we are left with SUMMA 9 in this 10 by 10 grid - just 45 out of a hundred. Similarly, we have SUMMA 255 in a grid of bytes. That is just 32640.

Bytes can be disparate, therefore, up to 32640. After that, any new bytes are bound to replicate a pre-existing pair.

The science of combinations can become quite complex. Some people call it bombastically “combinatorial mathematics”. The number of ways of arranging X items in a row is FACTORIAL X. Factorials quickly become quite large.

Fortunately, in real-world data there is a lot of repetition. The individual frame of a video may be disparate, but as each frame is very much like the next, a “Fibonacci” system can capitalise on this - with astonishing results.

Consider the word “Random”, with a full-stop and space:

The letters of the word are themselves random. They are disparate. However, if the word comes around again, the “Ra” will be gathered up as one code. Then the “nd”, then the “om”, and finally the “. ” (stop, space).

Another appearance, and the “Rand” is collected, followed by the “om. ”. Finally, the entire word with stop and space, “Random. ” is coded as a single code.

This is exponential coding, but NOT at the Fibonacci “Golden” rate. It is BINARY - even quicker compression.

It was said before that there is more to Wehner coding than Fibonacci alone. Now you can see the reason. Being based upon the gathering up of pairs of strings into each superstring, it is intrinsically binary.

But what happens after the disparate data has been consumed into a single code?

That’s right! Just as with the triad of RGB dots, it is forced to pause - and then continues at the Fibonacci rate.

Video

Polly, the star of black-and-white silent movies, will take you through the rudiments of cinematography.

We will study the effect that her presence will have upon the sets of pixels in a Wehner-coded video stream. In effect, this is a dress-rehearsal for high-definition, non-degrading colour television.

Each compression-string (or Differal) is coloured either red or green, alternately, in the coded image.

The Static Image

The static image consists of disparate data. When the frame is repeated, as with the word “Random” above, data will be consumed frame-by-frame at the binary rate. The video also shows the switch-on transient.


92 grey shades. 136,240 bytes become 10,242 codes.
Maximum code length 2043.

The Pan

The pan is a sideways movement, which breaks up the phase of the coding. It requires a new SET to commence in each new position, so the leading edge of Polly is preserved even as her features are consumed into the coding.


92 grey shades. 136,240 bytes become 12,816 codes.
Maximum code length 5760.

The Tilt

The tilt is similar to the pan, but in the vertical direction. Clues to the vertical nature of the movement will be retained in the coding, even whilst the features are consumed.


92 grey shades. 136,240 bytes become 9,942 codes.
Maximum code length 8544.

The Zoom

The zoom introduces a change of scale to the image. The effect of this movement on the Fibonacci coding is very hard to quantify.


124 grey shades. 136,240 bytes become 39,687 codes.
Maximum code length 87.

When these movements are put together into a movie, we can see that all that is old is stored in the video data earlier on, whilst all that is new is displayed as a difference. This is the reason for inventing a new Calculus. It is the Calculus of Sets.

The term has been previously used for the combinations and permutations of sets. But that is really the calculus applied to sets.

The new calculus DIFFERATES rather than differentiates. It seeks out plural sets that match other plural sets. When it finds a match, it “wraps” that set up as a single SUPERSET, under a new name. That means that the old when seen again is constantly being turned into the new. The data is constantly being made different.

Differation has nothing to do with subtracting one number from another. It is not in the amplitude domain. Nor is it statistics - they are in the amplitude domain even when applied to sets. Differation uses the Babbage “IF” construct (from his Inference Engine) to decide that “If they match, they are made different”.

The lowest form of plurality is two, so the system turns two sets into one if they appear again. As, by Russell’s Rule, they must be closed sets, they have to be completed sets - hence the Wehner coding, which in the limit delivers the Fibonacci series.

 

 

 

 

There were 1,097,920 bytes compressed into 67,443 codes when the left part of this image was compressed into the right.

Miniature videos of this kind do not show the system off to its best advantage - but it is necessary to work this way in order to keep the demonstration within manageable proportions.

Artificial Awareness

From the above, it can be seen that the system goes to sleep when repetitive data arrives, and wakes up when disparate data appears.

The effect when data is delivered, such as from a video camera, at a regular rate, is just like neurology. The slow delivery of Differals is like the slow Beta waves of the human brain. The explosion of small sets and primitives, such as when the zoom commences, is like the Theta Storm of the roused human mind.

Consciousness is consciousness of time. It is the comparison of the present with the past. However, the past is constantly slipping away. So on first reflection, it seems that awareness is impossible.

However, if events are stored, we have a different situation. Storage brings its contents out of the domain of time. So we have a new pair of antonyms - time and storage. Just as left is the opposite of right, and up of down, so time is the opposite of storage.

The conscious mind lives in the time domain. The subconscious exists in the storage domain.

In Man and the higher animals, there are of course other activities in the subconscious. There is the endocrine function - to maintain life via the pituitary glands. Also there is a motor function - to run the leg drivers automatically, so that you can attend to other matters whilst walking. Also, there is further data-compaction, and also unconscious problem-solving. Have you ever walked away from a crossword puzzle, only to find that you could solve it easily later?

However, at the level of this study, the primitive coding Al-Kwarism (algorithm) has no antecedent life. It has no subconscious feelings. The animal must be self-feeding - going out into the world and finding food. The program when run in silicon needs only electricity, which is supplied.

It has less wit, personality and charisma than an ant. Yet it is a "conscious" being of sorts. For this is the quintessential awareness.

Consider the image when the first pixel of each Differal (compression string) is coloured green, and all non-primitives are red:


You can clearly see how Polly arrives on screen, with her black-and-white pixels appearing and then being “consumed”. She ploughs across the screen, her face being “consumed” into the system, but her “bow wave” travels ahead of her. When she stops, she vanishes - eight Differals becoming four, then two, then one. Her disappearance is exponential.

Set Calculus

The phenomenon of exponential decay is well known in the world of mathematics and electronics. A simple differentiator consisting of a capacitor and resistor causes the direct voltage to decay in this manner.

But what we are seeing in the above is not the decay of some magnitude. It is the decay of some pattern.

An examination of the leading-edge Differal display of Polly (above) will show that, yes, the number of Differals, such as after the zoom, becomes half, and half and half again.

The conventional arithmetic of exponential decay is quite adequate to quantify what happens after it happens. However, it was a binary process of pattern-matching that caused it to happen. That process is “DIFFERATION”. A new word becomes necessary when a new science is revealed.

The reverse process is “SUMMARATION”, from the Latin SUMMA. It “unwraps” each “Differal” to reveal the sum of its parts.

Differation is a process of logical comparison, not of arithmetic - so it is the logical parallel of differentiation.

A data-stream that has been differated cannot be differated again. This is because of Russell’s Rule. A set that has had all pairs of pairs removed cannot be simplified by eliminating pairs of pairs. Fibonacci will not compress Fibonacci.

However, if the sets are modified, such as by scrubbing the disparate data within them and emphasising only the leading edge, then we can differate again:


What we see is that the “bow wave” of Polly has vanished entirely. However, by carrying out a process of exclusive-or (XOR) with the original leading-edge Differal file, it may be possible to separate it out.

Such studies are necessary when examining data that changes in the time-domain, in order to identify the pan, the tilt and (hopefully) the zoom. However, time did not allow the study to be taken to completion.

The differation process involves deciding what the domain of the set of primitives is. Here, the primitives are bytes in the domain [0 to 255]. Sets must be labelled with numbers outside of that domain.

LZW reserve two numbers outside of the domain of bytes - the 256 and 257. Wehner coding reserves only one, the 256.

If 16-bit words are used in the coding, the number 65536 will be used. Thus a Wehner “Data Package” begins with a byte that states 8, for two to the power eight, or 16 for two to the power 16, and so on.

When a pair of primitives is found to match another pair, the FIRST AVAILABLE number outside the set is taken, to point at it. It would be possible to point at the second primitive with the implicit assumption that the preceding primitive will precede it, but for convenience it was decided to point at the first byte with the implicit assumption that the following byte will follow.

The process continues, with primitives and sets being compared with the fresh data, and the longest possible match accepted.

So the first byte is code-address 257. The second byte is code-address 258. The third may be a byte or a set. If it is a set, it must contain 257, and be located at 259. The fourth is at location 260. It may be a primitive byte, a pointer to 257 (carrying 258 with it, giving two bytes) or a pointer to 258 with 259, giving either two or three.

So it goes on. Every time a pointer points outside of 0 to 255, it points to a plural set of bytes.

Although the method is extremely simple, it appears to embody natural laws that heretofore were not discovered.

Unnamed Notions

Noam Chomsky, Professor of Linguistics and Modern Languages at MIT since 1961, discovered a “Silent Language” in the mind of man. This language has no words. However, Chomsky’s Universal Grammar is waiting for a child to learn from its environment what the customary words and usages are, and so to become the language that we use.

Like Russell and Einstein before him, Chomsky has become world famous as a campaigner for freedom of speech and for world peace.

That a Japanese child quickly learns fluent Japanese does not of itself suggest that they are more clever than we are. When we learn English, German, French or another European mother-tongue, we occupy that universal grammar with European words and usage. To learn another language as an adult requires that we STOP using that grammar in the European way - effectively to stop thinking. So European adults are slower to learn Japanese than are Japanese children.

What concerns us at this stage is not the mind of man, but the idea of a conceptual language. We find that the Calculus of Sets, by clumping data together is generating concepts. The German for a concept is “der Begriff”, from the verb begreifen, to understand.

So just as in the mind of a child, so in the compression strings (Differals) of this system, there are assemblies of data ready-labelled and waiting to be comprehended. Often they are meaningless clumps of data, but sometimes they are things that make sense to our human minds. They need only to be linked to a word in a relational database. They are ready-numbered.

Experience with the system is leading the author to the belief that Chomsky’s Universal Grammar is not a result of the anatomy and physiology of the brain, but rather due to a natural law of data-clumping.

It is true that the eyes have facilities for edge-enhancement (acutancing), and that the visual cortex has the ability to deconstruct images into lines. It is true that the ears have a basilar membrane for spectrum analysis and that the eighth cranial nerve-pair has facilities for fourier synthesis, but all of this is simply refining the data in readiness for the process of comprehension.

Comprehension, once the data is adequately refined, may be no more than differating.

If this is true, the author has stumbled upon a natural law that goes much deeper than mere data compression. And Chomsky, by studying linguistics, had found a truth of universal logic.

Artificial Intelligence

The parts of speech are many and various, but we will concern ourselves with the nouns and verbs.

Differating “consumes” anything immobile into its compression-strings. Such things are nouns, or substantives. That which remains is a shower of data due to changes in the domain of time.

The German word for a verb is “das Zeitwort” - literally, the time-word. Verbs have tenses, from the Latin tempus (time).

Whereas the evaluation of nouns is fairly simple, verbs seem to be less simple to find. However, there are three verbs that are quite fundamental to the process of differating.

Consider again the graphic “Random. ”. We will lay the codes out in sequence. They are:

R, a, n, d, o, m, ., (space), 1, 3, 5, 7, 9, 13, 13, 15, 16, 17, 18....

We wish to find the rate of change. This is traditionally done by means of the conventional calculus. However, we cannot use the infinitesimal calculus because these numbers are advancing by unit step.

The calculus of unit step does not differentiate, but simply differences the numbers. It is the simplest calculus. Here, we will difference whatsoever can be differenced:

-, -, -, -, -, -, -, -, 2, 2, 2, 2, 2, 0, 2, 1, 1, 1....

So we have the number 2 whilst it is assimilating data. This is an unspoken verb - to learn, to think, to concentrate, although it is none of these things, because it has no brains.

Then we have the number 0 when the data has been consumed. This is the verb to gasp, to be amazed, or the phrase “Heureka, I’ve found it”, which encompasses the verb to find. Yet, it has found nothing.

Then we have 1, which is the verb to be bored, to relax, to idle (like a car). However, it is not bored, nor relaxed, nor resting.

How do I know that it was not learning, was not amazed, and did not relax after its accomplishment? I know this because - even though the word “Random” is a plan for artificial intelligence in silicon - it is not in silicon. It is on the printed page. The printed page cannot learn, nor get excited nor relax.


So, from the difference of Differals we can reveal the deep “spsychology” (simulated psychology) of the system, and discover the zeroes - the moments of enlightenment of this heuristic algorithm. It may deliver the Differal of an empty studio.

Then, an announcer may sit in the chair. If he remains still for long enough, the system can capture the Chomsky (ie. unspoken) noun “announcer in studio”.

At the “zero moments”, the machine may catalogue all the discoveries. Then, an operative may go to the computer and examine the images. Using LISP (List Processing), or a similar language, he gives a name to the images. Thus, the studio becomes known as the studio.

The LISP program then reverse-differates (summarates) by a number of ply. The first ply will reveal two components. One component may indeed be a primitive byte - but usually will be a set of bytes. Each of the two sets then receives a percentage. If the first Differal (for the top of the screen) is thirty percent, it is labelled as thirty percent. The second Differal receives the remainder - in this example 70%.

One can back-track several ply, and so give percentages to quite small parts of the screen.

Next, the studio with announcer is labelled as (logically) studio with announcer. The LISP program can quite easily remove from this image all background parts due to the room, and the remainder (announcer without studio) can have its pixels counted.

The component Differals of the announcer can now be given percentages. So if that announcer re-appears at a later date, the percentage of the image that is made out of the earlier strings is easily evaluated.

This produces a crude, stochastic image recognition - which is in need of considerable refinement. Nevertheless, the system may well distinguish John (eg. 8% John, 5% Jane, 1% Jean) from Jane (4% John, 9% Jane, 2% Jean) and Jean (3% John, 1% Jane, 7% Jean).

We then begin to think of fitting a NEURAL NET. This makes much more sense at the Differal level than at the level of pixels.

Audio

We have examined the behaviour of the system when fed with video data. However, audio compression is every bit as important in the world of engineering.

It was found convenient to take “The Microsoft Sound” and examine the first 18600 bytes. Most people with Microsoft software will have this WAV file on their computer. The amount examined is below nine tenths of a second - due to the non-availability of the specialised hardware (below) that is needed for bulk data:


This study shows exactly what was expected. There are two white dots, representing the bytes 128. Then the system alternates between red and green, as previously with video, to distinguish the lengths of the Differals.

The codes expand from two, three, five and so on to 233 (a Fibonacci number) before converging with 34, 8 and 3. There is a tiny glitch (coloured white) and after a few more bytes, the (white) disparate data of the sound begins.

As we travel further to the right, we find more and more areas of green and red. This means that the data is becoming ever more compressed.

The tiny piece of music seems to have compressed into 8,213 codes. This is, as expected, a rather poor performance at about two bytes per code. However, things would be very different if a minute or two were compressed instead of less than a second.

One suspects that due to the laws of combinatorial mathematics there will be less combinations of numbers with small deviations from the “quiet level” of byte 128 than there are on large deviations. So perhaps the system will lock onto the beat of music, compressing better on the quiet parts between the beats.

It was not possible to try this, so the behaviour of the system on music has yet to be properly evaluated.

In England, the author had created a tiny file of about 186 bytes. This delivered a second of sound - but only a test tone. The British government stole everything from the author, who fled the country - and so these researches had to begin again from the start.

Experience showed, however, that if the sinusoids are locked to the sampling grid, compression is excellent. This led the author to conceive of the Digital Tempered Scale, based upon the Equitempered Scale of Johan Sebastian Bach, but slightly shifted to lock onto the sampler.

Here is a kiloHerz signal that is not exactly a kiloHerz. It is exactly 22 cycles, on a 22,050 sample-per-second file. This gives 9,977.32 cycles per second. This is about one 26th part of a semitone off tune - impossible to hear. However, it locks perfectly to the grid.

What we see is exactly what was described for the RGB triads - a “learning period” followed by a Chomsky notion. Without thinking, the mechanism has found the noun “cycle”. Thereafter, it gives one more cycle, then two, three, five....

This in turn led the author to conceive of the Rationalised Acoustic Parameters system of sound compression. If a sound recording is comb-filtered to leave only those components that are locked to the sampler, then it will compress more easily. The RAP system is still under study.

Beyond Fibonacci

Kurt Gödel was a friend of Einstein. He investigated Russell’s Rule, that a set cannot enclose itself, in the matter of the SET of mathematical principia.

To a subset of principia he ascribed a series of prime numbers. Primes are numbers that have no different pairs of factors. For example, seven has factors 7 and 1. 7 is not different, and 1 is trivial.

So “There exists” might be 7, “a number” might be 11, and “such that” might be 13. In this way, “There exists a number such that” would be 1001. You multiply the primes together as you work, and so derive the “Gödel Number” for the entire proof.

To find out what components there are in the Gödel proof, you can always factorise them out.

The “Gödel Number” for a proof and for its opposite were found to be identical. This meant that mathematics cannot prove mathematics, and that one must seek ones proofs in fields that extend beyond mathematics.

This ingenious proof not only verified Russell’s Rule, but also proved the incompleteness of any system of axioms.

Differation removes any redundancy due to repeats. However, Gödel’s Theorem shows that no system can ever be complete. Sure enough, we find redundancy in the output of a Wehner-coded data stream:


The redundancy comes about because we are obliged to use a consecutive series of numbers if we desire to retain control of the system.

The address of a set is an unspoken one in the series. The address is quoted, with the implication that its neighbour joins it, every time it is used in a superset. Such tacit numbering brings serial redundancy into the system in place of static redundancy.

Static redundancy - the repetition on things that do not change - is naturally the greater part of the redundancy. We have seen how, with the word “Random. ”, we were coding 2, 2, 2, 2, 4, 4, and then 8 bytes with each code. So the static redundancy vanishes exponentially, leaving serial redundancy. Note how the coding (of the “Polly” video) delivered 58560 bytes in a single Differal.

In other words, the residual serial redundancy varies logarithmically with the original static redundancy.

We can further reduce the bandwidth of the coded output by leaving the “Fibonacci” system, stating where the serial sequence starts, and for how many numbers, and then re-entering the “Fibonacci” system. It was for this reason and others that the stop-code (usually 256) was reserved.

Logically, one would exit from a stream of Differals with the number 256. Then one would use the number 2 - the difference of differals - to signify that a sequence of twos will follow. The sequence 2, 67176, 100 might deliver all the numbers from 67176 to 67374 inclusive, for example.

This paper DOES NOT set out to define standards - only principles. There may be technical reasons why the concept as described may need to be refined and altered. However, what is seen at once is that if 67176 and the rest are four-byte number, and the 2 is a byte, we will have received a hundred four-byte numbers for the price of nine bytes.

Such forty-four-fold compression of the code is in additional to the original compression. That is to say, the output would be the exponential of forty-four times the code size at such points.

Bit Compression

The tacit numbering system also suggests further space-saving as with GIF. We started with two bytes of the original code. Then, at the third position, we needed to be ready to code the number 257. That is a nine-bit number. For this, and 254 codes more, we will need 9 bits. Then, when we have passed tacit address 511 we will need 10.

512 codes later, and we reach 1023. Then we change to 11 bits. And so it goes on. At each position, we bit-compress the data into the output as a long string of 1s and 0s. That string is then sliced into eights (bytes) for Internet transmission or storage.

Where we exit from the Fibonacci sequence in order to compress the consecutive codes, we continue to count our way up the sequence in order to keep track of how many bits are needed. In this way, we can re-enter the Fibonacci sequence and not lose track of bit-compaction.

The sequence 256 (Fibonacci), 1 (byte), start (4 bytes), quantity (4 bytes) could be interposed into the system in order to compress the consecutive numbers caused by coherent data.

All the Chomsky concepts contained within the coding have not been discovered. Accordingly, it seems to the author that the escape from the system should be with a byte 128 for 8-bit data or with a 32768 for two-byte data. This means that if any significance is found in a difference of differals other than 0, 1 and 2, the place will not have been taken by engineering codes.

However, as stated, the standard is NOT, at the time of writing, complete.

Implementation

The system is best implemented with bulk data. That means that short bursts of sound or video are not ideal. High-definition colour television is one field where the method can be shown to advantage.

Disparate data in the frame will compress, but best compression occurs from frame to frame. A collection of perhaps as many as two hundred and fifty frames should be assembled in the buffer.

This leads to major problems. The iterative search for matching strings is an N-SQUARED algorithm, where N is the size of the data. We will find that we need a supercomputer if we are to compress the data in real time.


Although other schemes can be found, by which memory space can be economised, the simplest is shown in order that the reasoning process can be understood.

To avoid bus-contention, banks of memory are loaded in parallel by the “server”, or video processor. Each bank of memory has a processor of its own - a string comparator, or string-matching unit (SMU).

The address lines were not shown in the above diagram, and the data lines were shown as eight simply in order to keep the diagram uncluttered.

An array of processors working in this way is traditionally known as a farm. They are under the control of a root processor which distributes the tasks.

Although full-specification processors can be arranged into a farm, they would have to run a program. Thus, instruction-fetches will be interleaved with data-fetches, which would slow the system down. The SMUs are designed to carry out only one task. They are part of a “Single Instruction Multi Data” system (SIMD), and so can be fabricated as a specialised chip - preferably combined with the memory in order to minimise propagation delays.

It will be seen, therefore, that the String Matching Units are a form of hard-wired computer. The program is built into their structure, and they have no instruction set.

Double String-Matching Units may be even more efficient than single ones. The root processor would sort the strings to be searched in order of size. The diagram above shows four SMUs, and the top SMU would receive the longest string, the next one down receives the second, the third receives the third and the fourth the fourth and fifth.

Thereafter, the third SMU receives the sixth string, the second receives the seventh and the top processor receives the shortest.

If a processor matches its first string to the data, it stops all processors below it (in the direction of the light green arrow). All those SMUs then signal “FALSE” to the root procesor, and only the successful one signals “TRUE”. However, higher processors can continue, and possibly signal “TRUE” whilst cancelling the “TRUE” of the one that stopped earlier.

When the first string has been searched, and a match not found, the processors attend to searching the second string. The signals on the farm now run upwards, from low processor to high.

This kind of scheme helps to balance the workload of the processors. There will often be considerable wastage if processors run synchronously, and schemes have to be thought up by which this wastage is kept to a minimum. A non-synchronous algorithm, in which the root processor re-uses SMUs as soon as they have stopped is another means for avoiding waste.


The SMUs are essentially similar to a Pentium, but having only the one instruction CMPSB, CMPSW or CMPSD. That instruction uses the SI and DI registers as pointers, the ALU (Arithmetic and Logic Unit) for comparison and the ECX register as a counter.

We will need only the SI register (but we can call it the NOW register because it points to incoming data), and the DI (which we will call THEN because it deals with past data) and the counter.

CoMPare String Bytes (CMPSB) and the others are often used for the sorting of text. They need a greater than, less than and equality test - and so use arithmetic. Our interest is ONLY in the matching. We want a YES or NO. So it is far better to use EXCLUSIVE-OR gates (XOR) than the arithmetic of an ALU. For one thing, we do not have to wait for the ripple-through of the carry.

The series of J-K flip-flops that make up the NOW, THEN and COUNTER are not shown. The diagram shows the data lines. An address, say, the NOW address, is put on the address bus during Phase 1 of the clock. The Phi symbol is a standard one in electronics for PHASE, and is not to be confused with the Golden Section.

During Phase 2, the NOW data is latched into the RESET-SET (RS) flip-flops. These are assembled as the simplest-possible data latches (D-type flip-flops). During Phase 3, the THEN address appears on the address bus. During Phase 4, the THEN data on the data bus is compared with the stored NOW data. If any single bit is wrong, the whole byte has mismatched - so all eight XOR gates feed into an eight-input OR gate, and it is only the single output that is latched after Phase 4.


The counter, the NOW index register and the THEN index register are made out of J-K flip-flops. A good arrangement is for the NOW address to be held permanently in a register until changed by the root processor. In this way, the NOW index register can automatically recharge at the start of each search.

The THEN address, however, varies with every search, and so this and the count must be defined repeatedly by the root processor.

The counter is shown in the image above, with just eight flip-flops to avoid clutter. If the SMU has a 512 megabyte addressing range, that would be 29 bits. However, if the SMU can fetch sixteen bytes at each memory-accesss, four bits of address would not be needed. That would leave 25.

As shown, when the counter is at nil, its ones complement side shows all 1s. An AND gate then delivers a MODULUS ENABLE output, of which more will be said later.

At the trailing edge of phase 4, or a separate phase 5 of the clock, the counter counts down to -1. If it has reached this far, it has successfully matched the two strings. It is the output from an AND gate, which detects the -1 condition, which throws the “SUCCESS” latch, stops the clock and disables any processors below it in the hierarchy. The system may have taken four or five ticks per fetch.


A refinement that may be added is the incorporation of LAXITY of coding. Unfortunately, it would require some AND gates with their associated propagation delays.

The diagram shows the six least-significant bits equipped for lax coding. If any of those AND gates has a zero on its laxity pin instead of a one, the gate cannot transmit the “mismatch alarm” to the output. The result is that data-strings that are a near match, without being exact, are passed as acceptable by the unit.

Here we see what happens when only three bits are used for the comparison. At first it may seem as if the entire spectrum is divided into eight bands, and that the outcome is arbitrary.

In fact, we have not altered the incoming data. The population of the various colours is depicted here (right) as brightness. Lax coding will select existing colours because there are only existing colours to select from. In addition, it will select the most popular colour in each band because it is looking for the longest strings.

The longest strings will contain those colours that have been seen most often. So Wehner coding even contains an automatic palette selection. Views about palette selection vary, and this may not be the system of choice. However, it is useful to have automatic popular colour selection without having to burden the CPU, and without having to code for it.

Laxity is required when there are too many “EVENTS” in the data. Events are changes in the image, in a video, or in the sound, in audio. The “Fibonacci” system removes all repetition from the data beyond three. Sudden bursts of disparate data, that appear and never recur, would overload a system that was designed for narrow bandwidth.

Everyday data does not behave in this way, but there is noise that can generate a “Factorial Calamity”. An exponential system might double its efficiency, then double again because its base is 2. A factorial system will have a base of 2, then 3, then 4, then more - and will always overtake an exponential.

Totally random noise could cheat the data compression. That is one reason for looking at ways of making quieter Charge-Coupled Devices (CCDs) such as the author’s photoelectric CCD. The other approach is to clip the noise off the data, if it threatens an overload.

With dynamic laxity, the system runs at full precision until an output overload occurs. Then it automatically switches to laxity. With progressive laxity, one bit is removed from the precision, and if this is not enough, another bit of precision is dropped, and so on.

A system based upon visual psychology can also be devised. Accuracy is required in the green, less in the red, and least in the blue. Accordingly, a scheme that gives 574 (five bits red, seven bits green and four bit blue) would reduce the precision to 64k colours with the least visible effect.


In summary, one would design the processor to sit on a memory chip, together with the largest amount of memory that is practicable, and with as many bytes being checked simultaneously as possible.

The Differator program consists of a root processor giving the NOW address to all SMUs on the farm. It then starts at the top of the address list of processed data, and by comparing the start of a string with the start of the next-but-one, evaluates its length. The start addresses and lengths are then farmed out.

Every time the farm stops, the root processor checks to see if a string has been matched. If so, that length becomes the new target length, and all strings equal to or less than that length are ignored.

When the longest match is found, NOW is advanced by that amount, and given as a new address to the entire farm. The address where it matched (the THEN) is coded into the output stream, for example on the basis that the output stream begins with code 257.

In the absence of a match, a primitive byte is issued to the output stream, and the NOW pointer advanced by one.

The diagram shows eight bytes being tested at the same time. In a practical version, it might be sixteen bytes or even thirty-two. The bottleneck is in the wiring. At the time of writing, the Intel Pentium has reached three-and-a-half gigaHerz, and a 64-bit AMD Athlon processor is on the market. Windows XP-64 is available as a Beta edition for testing.

Accordingly, a crude system of XOR string-matching does not place any huge burden on industry.

The diagram shows how, when the J-K counter reaches zero, the lower addresses are decoded and fed to the AND gates associated with the XORs. Thus, only those gates that have a 1 can deliver the “mismatch alarm”. In this way, one might have a function CMPSLD (compare string long double) all the way through the string, followed by something like CMPS3B (compare string three bytes) to deal with the modulus.

We see therefore why the modulus enable bit was evaluated earlier. The design for a modulus-decoder with enable input is sufficiently trivial that it is not described in detail here.

Those who are skilled in the art of logic design will be able from the foregoing to put together a hard-wired string-matching unit.

In a practical implementation, such as for colour television, perhaps as much as 512 megabytes of data would be fed into the memory. The SMUs would compress this data, and the data would be broadcast. Any viewer switching his receiver on would have to wait for a maximum of ten seconds before such a “package” of data starts, is expanded by blockmoving and displayed.

By extension of the logic, it can be seen that a SUMMARATOR mechanism may also be integrated, for mass-marketing as the central component of a television receiver.

Chaos

The mathematical use of the term chaos is different to the vernacular. Noise is random data having no underlying order, but chaos is apparent noise.

There is always a key to chaos, so that if the data is unlocked, rational order will return. This was the discovery of Gaston Julie and Benoit Mandelbrot. An iterative process of multiplication and addition led to such confused numbers that nobody could unravel the key. Yet, when the number of successful repeats was colour-coded on a chart, the amazing Mandelbrot Images emerged.

Differation removes repetition from a data set, leaving only the events. When in addition, progressive redundancy is removed, the result appears chaotic. Reversing the coding then restores the original data.

The system may be used for pictures, sound, text, facsimile (FAX), archiving (eg. ZIP) indeed for any kinds of data. The greater the redundancy, the better the coding.

Bedrock

This is also the bedrock of awareness. The conscious mind compares current events with the past. So the Sconscious (simulated conscious) does the same. The subconscious is a record of the past. The Subsconscious ia a record of Differals that were previously evaluated.

(C) 2005 Charles Douglas Wehner.
The information given on this page is supplied for educational purposes,
and for academic research. Commercial use is forbidden.


HOME

Appendix A


To obtain small phi, simply divide into 1.

1.618033988749894848204586834
36563811772030917980576286213
54486227052604628189024497072
07204189391137484754088075386
89175212663386222353693179318
00607667263544333890865959395
82905638322661319928290267880
67520876689250171169620703222
10432162695486262963136144381
49758701220340805887954454749
24618569536486444924104432077
13449470495658467885098743394
42212544877066478091588460749
98871240076521705751797883416
62562494075890697040002812104
27621771117778053153171410117
04666599146697987317613560067
08748071013179523689427521948
43530567830022878569978297783
47845878228911097625003026961
56170025046433824377648610283
83126833037242926752631165339
24731671112115881863851331620
38400522216579128667529465490
68113171599343235973494985090
40947621322298101726107059611
64562990981629055520852479035
24060201727997471753427775927
78625619432082750513121815628
55122248093947123414517022373
58057727861600868838295230459
26478780178899219902707769038
95321968198615143780314997411
06926088674296226757560523172
77752035361393621076738937645
56060605921658946675955190040
05559089502295309423124823552
12212415444006470340565734797
66397239494994658457887303962
30903750339938562102423690251
38680414577995698122445747178
03417312645322041639723213404
44494873023154176768937521030
68737880344170093954409627955
89867872320951242689355730970
45095956844017555198819218020
64052905518934947592600734852
28210108819464454422231889131
92946896220023014437702699230
07803085261180754519288770502
10968424936271359251876077788
46658361502389134933331223105
33923213624319263728910670503
39928226526355620902979864247
27597725655086154875435748264
71814145127000602389016207773
22449943530889990950168032811
21943204819643876758633147985
71911397815397807476150772211
75082694586393204565209896985
55678141069683728840587461033
78105444390943683583581381131
16899385557697548414914453415
09129540700501947754861630754
22641729394680367319805861833
91832859913039607201445595044
97792120761247856459161608370
59498786006970189409886400764
436170933417270919143365013716

...or subtract 1.


Appendix B

Colorization

Because the compression strings travel with the image, it is possible to arrange that a monochrome film is compressed, and then the key frame of each scene coloured in, to allow the colour to spread to later frames.

The Differals from each successive frame are “unwrapped”, and their contents examined to ensure that they come from a similar region of the key frame. If so, the equivalent pixel-strings from the coloured image are taken, and the image repainted.

Where sections are missing, these can be shown on screen in a garish colour. This draws the attention of a computer operative, who either moves a “blend” tool over the affected area, by which an interpolation of the chroma (but not of the luma) is undertaken from adjacent areas, or paints in the colour by hand.

Such ancillary software can be written to run both forwards and backwards in the time domain. For example, where Polly is panned into frame (or “walks” onto the screen), one would colour the entire horse, and run the film backwards so that the coloured areas move off screen. With the film re-painted in reverse, it may be shown to the audience forwards.


Appendix C

Travelling Matte

In the film studio, it is quite usual to use a blue screen (“Blue-Screen Process”), a green screen or some other colour (“Chroma Key”) in order to generate a mask that moves with the actors. A new background can then be substituted. This allows trick effects, and sometimes saves location expenses.

Because the Differals have an address giving frame number, line number and column number, they are themselves a kind of travelling matte. Travelling Matte programs may therefore be written to capitalise on this fact.


Appendix D

Vibration compensation

Each Differal begins at a unique position in each frame. For example, with the various mini videos, we had frames that were 94 by 146 dots, or 13,724 pixels per frame.

Given the position in the data where a Differal begins, one would divide modulo 13,724 to find the quotient (frame number) and modulus 0 to 13,723.

Dividing that modulus modulo 94 gives the quotient (line number) and modulus 0 to 93.

That final modulus is the column-number in pixels.

So vertical and horizontal shake can be evaluated by comparing the line and column numbers of Differals in successive frames. Where a dramatic difference is found, this will be due to some on-screen event such as a mouth suddenly speaking. The Differals of this section of image are simply ignored.

Extremely high co-relation should be found between the Differals in other places, showing consistently how the image has moved up, down, left or right.

Simply re-painting the frames with the vibration removed will steady the image. There may, however, be problems with anti-aliasing across the rows and columns for tiny amounts of shake.

In addition, when there is a pan or a tilt, the vibration data will have to be analysed mathematically to preserve the underlying trend in the movement, in order that the pan or tilt is preserved.

Camera shake introduced a few lines of data outside of the frame. Accordingly, the shake-removal by the calculus of sets will enable the full frame-size to be preserved, particularly if the algorithm runs both forwards and backwards in time. Cruder methods involve simply slicing off a row or a few rows from the top or bottom of the frame - and this results in reduction of the frame size.


Appendix E

Snow and Scratch Removal

According to similar principles to the above, Differals may be machine-examined to find their starting positions within the frame. If a long string of pixels is broken up into shorter lengths by being interrupted by a single pixel or group, it may be offered to an operative as a possible site for mechanical or manual correction.

If however, the interruption of that string of pixels takes place in the same column as the interruption of another string above it, it is probably a scratch.

If the camera pans, or if the scratch moves sideways, a clean section of pixels may be found, with which to replace the damaged section. The Differals of the Calculus of Sets make such software much easier to write, because the various strings have already been evaluated.


Appendix F

Surveillance

The artificial awareness can find immediate application in surveillance systems.

Consider a surveillance camera having two high-capacity string-matching memory units (SMMUs). It feeds data to the start of the memory of one memory, whilst feeding that same data to the middle of the second. When the second SMMU is full, data is fed again to the start whilst being fed to the middle of the first SMMU.

In this way, differation is taking place in an interleaved fashion. Processed data from the unit that is more full will be more sparse than that from the unit that is just starting its cycle.

When a surge of data is found to emerge from the fuller unit, software has the option of sending data from either of the two units down a telephone link, or other communication channel.

In this way a remote host may receive a video recording either of the action leading to the on-screen event and its consequences, or of the action that began about the time of that event.

In addition, as each differal of the on-screen event has a position in memory, it has an X and Y co-ordinate in the image. A background program can direct the camera to zoom in on the on-screen event. This leads to automatic tracking of intrusions.

The camera should not, of course, be directed towards a window with a street-scene outside, or towards any other source of false alarms.


Appendix G

Summary of Calculus of Sets:

At its essence, this simple logic is based upon bits, where bits can be 1 or 0. There are just two states.

A system based upon bits is described as binary. Everything is pairs, including pairs of pairs, pairs of pairs of pairs &c.

According to Georg Cantor in 1872, when he conceived the “Mengenlehre” (Set Theory), “Eine Menge ist eine Zusammensetzung von bestimmten wohl unterschiedenen Objekten der Anschauung oder des Denkens, welche die Elemente der Menge genannt werden, zu einem Ganzen”. A set is a combination of certain well-defined objects of perception or thought, which are described as the elements of the set, into a whole.

There may be multiple types of element. One uses the term tuple for the types, because of the words pentuple, heptuple and octuple for types with five, seven or eight varieties. Binary is the lowest form of tuple - having only two types.

Now we have combinations of bits. To start with, these can be fourfold: 00, 01, 10 and 11.

The Calculus of Sets requires that the nature of the elements, and also the order of the elements, be taken into account. Thus, it is not just important whether there is a 1 and an 0, but the numbers 1 and 0 do not match the numbers 0 and 1 because of the reversed order.

The domain of the elements is the number of possible variants. With bits, there are only two possibles in the domain.

The co-domain is some region outside of the domain, where an ordered array of elements is available. With binary, the co-domain might be the real numbers starting at 2.

Two elements from the data must be put into the co-domain at its start. Then comparison with further data may begin.

Comparison is made by logic, not by arithmetic. For example, we might use the exclusive-or function, which returns a 1 (a true) as if to the question “ARE THEY DIFFERENT”?

Comparison is made between one pair of elements and another pair. Thus, there must be at least four elements. We have already put a pair of elements in the co-domain. This pair is now defined as a pair of closed sets. The fate of the next pair is undecided.

If they match, the new pair receives a pointer to the pair of closed sets. There will need to be a convention as to whether the pointer is to the first set (forward Calculus of Sets) or second set (reversed Calculus of Sets).

The pointer is the position in the co-domain. Thus, with forward calculus, we point to the first of the pair. That pointer is now inserted into the next available position in the co-domain, and declared to be a closed set.

If they do not match, the third bit is put into the third position in the co-domain, and declared to be closed.

Further data is examined, and compared with pairs of closed sets in the co-domain. Step-by-step the co-domain receives more data, and the more similar data there had been, the more these numbers become pointers to pointers to pointers, representing as a single number in the co-domain a very large collection of elements in the original data.

This process of suppressing similarity, and exaggerating differences, is described as DIFFERATING. Each set that enters the co-domain is defined as a DIFFERAL. If it is a single bit, it is defined as a PRIMITIVE, because it is a set of just one.

In the Calculus of Sets, a differal can be either a set of two or a (primitive) set of one. There are no other variants.

Opening up a set (a differal) is defined as SUMMARATING. The parts contained within that set are the SUMMARAL. If the summaral is one, the set was a primitive. If two, it was not.

If the summaral has two parts, both may be primitives, or one may be primitive, or neither may be primitive. If both are not primitives, the sets in the summaral may themselves be opened.

Each stage of differation, or of summaration is defined as a PLY.

When a differal can be summarated by two ply, it has more than two elements. The more it can be summarated, the more elements.

As differals are sometimes primitive, summaration will often deliver indivisible elements. Each ply of summaration cannot therefore be relied upon to deliver a doubling of the data.

Given extensive data, the binary calculus described will ultimately deliver differals for groups of four bits, if they existed in the original data. Ultimately, it will deliver differals for groups of eight if they also exist.

As modern computers already work at the level of nybbles (4 bits) and bytes (8 bits), we usually bypass the natural evolution of the differation from bits to a higher level, and start comparing bytes directly.

Even at the level of bytes, the Calculus of Sets is comparing pairs of elements with pairs of elements. It eliminates from its results all repetition beyond three.

The method used to produce the demonstrations of compression of sound and pictures was the Bytewise Calculus of Sets. Here the domain of the elements was the range [0-255], which can also be expressed as [0-256[. The enclosing bracket symbolises inclusion, so the non-enclosing bracket in [0-256[ means zero to two-five-six without the two-five-six.

In the system used for the demonstration, the co-domain was the set of reals beginning at 257. The number 256 had been reserved as a stop code.


Appendix H

Patching LZW

It would be possible to patch the LZW code in systems such as GIF, by means of exception-handling.

A system that arranged for a set to contain all of two previous sets would be a Fibonacci system, even when the special exception is allowed that a set contained within the very next set is extended only by one.

There is little benefit in trying to patch the LZW system. This is because the special exception of the contents of a set being one place earlier than the address of that set would have to be tested for at every set. This slows the processor.


Appendix I

Some Technical Terms

Terms given in RED are new.

ARGUMENT. A number that is to be worked on by a function to deliver a result.

BIT A binary digit. It can take the value 0 or 1. In the positive logic, 0 represents false and 1 represents true. In switching circuitry, a bit is a single switch.

BYTE Eight bits. These can represent some pattern, as in a bitmap, or some number. When used numerically, each bit is a flag to represent the truth of whether or not a power of two is present. Thus, we have “take 1?” (true or false); “add 2?” (true or false); “add 4?” (true or false); up to 128, allowing all numbers from zero to 255 to be coded.

CALCULUS The Latin word for a stone. Stones were often used as an aid to counting. By extension, calculus is a form of counting - arithmetical calculus. However, stones may be sorted by colour, size and position. By extension, calculus is a form of pattern-matching - the new logical calculus.

CHAOS A set of data having the appearance of noise, yet having concealed order.

CO-DOMAIN The set that a function enters in order to deliver its result. For example, a function delivering only bytes will obtain them from the co-domain zero to 255.

COHERENT Able to cohere. Having properties that enable it to join with others.

DERIVATIVE A meaningless word for a DIFFERENTIAL. It leads to confusion when asked to “Derive the equation”.

DIFFERAL A set, which may be a superset, and is the result of differation.

DIFFERATION The process, by examining sets, of finding two new sets that match two old sets, in type and in sequence, and of renaming them as a single set, which may be a superset, under a new name. This hides repetition and exaggerates difference.

DIFFERATOR A program or device such as an integrated circuit, which carries out the process of differation.

DIFFERENCE (numerical) A number which is the result of differencing (subtracting) one number from another. It may also be an arithmetical expression.

DIFFERENCE (logical) Having different properties, such as colour, position or size.

DIFFERENCING Subtracting one number from another.

DIFFERENTIAL A number which is the result of differencing one number from another, and then dividing by an infinitesimal number. The result of differentiation. It may also be an arithmetical expression.

DIFFERENTIATION The processing of differencing one number from another and dividing by an infinitesimal number.

DISPARATE Having no parity with anything else. Alone. Separate.

DOMAIN The set that a function enters in order to find its argument. For example, a function dealing only with bytes will obtain its argument from numbers in the range zero to 255.

EVENT (or DATA-EVENT) Such data as enters a differating process having neither the properties of a repeated set of disparate data, nor of coherent data, and so represents new information.

EXCLUSIVE-OR A logical function that takes the arguments A and B and delivers a true if they are not the same. Otherwise it delivers a false. The domain may be bits, bytes or strings for example. The co-domain is usually a flag.

FALSE The result of a logical function. In positive logic, it is a 0 when the co-domain is bits, but may be a string of 0s when the co-domain is not bits.

FLAG Usually a single bit to signify a logical true or a logical false.

FLIP-FLOP A switch. An astable flip-flop keeps switching back and forth. Used as a clock. A monostable flip-flop when triggered changes states for a defined period and then switches back. A bistable can be used as memory (RS), or in a more refined form as a divide-by-two circuit in a counter (JK). There are many types - synchronous, asynchronous, edge-triggered, D-type (data) &c.

FUNCTION A logical or mathematical process that turns an argument into a result.

INFINITESIMAL A tiny number that is too small to understand. This may be the step in the Leibnitz infinitesimal calculus, or one of the “Alephs” of the “Transfinite Numbers” of Cantor. The word is usually used as an adjective.

INTEGRAL A number which is the result of adding one number to another, and then multiplying by an infinitesimal number. The result of integration, often wrongly described as a primitive. It may also be an arithmetical expression.

INTEGRATION The process of adding one number to another and then multiplying by an infinitesimal number. It may also be applied to arithmetical expressions.

INTEGRATION The result of adding pieces of circuitry to others, usually on a silicon chip.

NOISE A set of data having no intrinsic order. Yet there may be a probability of all spectral components being equal (white noise), of high-frequencies being more common (blue noise), or of low-frequencies being more common (pink noise).

NYBBLE Four bits. Half a byte (har, har, har)!

PLY A layer of analysis. The term is used in the design of chess programs, for example, as a layer of possible moves. It has been adopted in the Calculus of Sets as a layer of differation or summaration.

PRIMITIVE In the Calculus of sets, a set which has no subsets. It is a set of one.

PRIMITIVE A wrong term for an INTEGRAL. In the infinitesimal arithmetical calculus, for example, X to the power of four integrates to one fifth of X to the power of five. This is a higher-order term, and therefore more complicated rather than more primitive.

SET “A set is a combination of certain well-defined objects of perception or thought, which are described as the elements of the set, into a whole” - Georg Cantor, 1872.

REAL-TIME A system where the data output is less than or equal to the data input, allowing continuous operation. Television, for example is a real-time system. The human brain is not a real-time system, in that after data-gathering, further data-processing is needed by a process known as sleep.

RESULT The argument of a function after it has been altered by that function.

STRING A sequence. Usually of bytes.

SUBTRACTION Another word for differencing.

SUMMARAL The contents of a set, which in the Calculus of Sets will be two sets. The result of summaration.

SUMMARATION The process, by examining a set, of finding what elements are contained within it. Those elements will be sets, but may be primitive.

SUMMARATOR A program, or device such as an integrated circuit, which carries out the process of summaration.

SUPERSET A set which itself contains at least one set.

TRUE The result of a logical function. In positive logic, it is a 1 when the co-domain is bits, but may be a string of 1s, or a string that is not all zeroes, when the co-domain is not bits.

TUPLE A group rather than a single. Double is a tuple of 2, treble of 3, quadruple is of 4, quintuple of 5 and so on.


Appendix J

Further Series

There exist other series similar to the Fibonacci series.

For example, instead of a new number being the sum of the previous two, it might be the sum of the previous three:

1, 1, 1, 3, 5, 9, 17, 31, 57, 105, 193, 355, 653, 1201, 2209, 4063, 7473, 13745, 25281, 46499, 85525, 157305, 289329, 532159, 978793, 1800281....

Similarly, it may be the sum of the previous four:

1, 1, 1, 1, 4, 7, 13, 25, 49, 94, 181, 349, 673, 1297, 2500, 4819, 9289, 17905, 34513, 66526, 128233, 247177, 476449, 918385, 1770244....

These also are part of the Calculus of Sets, but due to the fact that they are not as fundamental as the binary version - where two sets, which is the lowest form of plurality, combine into one - no emphasis has been placed upon these other variants.


Appendix K

Here are some Fibonacci numbers:

2 3 5 8
13 21 34 55
89 144 233 377
610 987 1597 2584
4181 6765 10946 17711
28657 46368 75025 121393
196418 317811 514229 832040
1346269 2178309 3524578 5702887
9227465 14930352 24157817 39088169
63245986 102334155 165580141 267914296
433494437 701408733 1134903170 1836311903
2971215073 4807526976 7778742049 12586269025


HOME