Lately I’ve been exploring the idea of a vector space with a large number of dimensions (but few degrees of freedom). A model was presented with five degrees of freedom in 500 dimensions (neurons, as it happens).

The question is whether, given that the axes are bit-level, normal vector manipulation semantics make sense. My contention is that they have severe problems.

To test and illustrate why I think this is problematic, I wrote a simple bit of Python code that I think comes close to modeling the situation. The model puts in stark relief the problem of creating valid new vectors from existing (known valid) vectors.

The TL;DR is that combining existing valid vectors generally produces invalid ones.

§

My design involves an “Address Space” capable of holding the valid street address of any location on Earth. The input is a fixed-length (UTF-8) string — a valid street address. Max byte size is adjustable; currently 25.

The intent is to represent the 500-neuron model, albeit with far fewer dimensions (which won’t matter for this purpose). The vectors here are plenty big enough to illustrate the issue.

Rather than five degrees of freedom, my model has four, because it was much easier to implement that way. But, again, it won’t matter here.

In any event, the model’s parameters are adjustable, so I can show you what happens with, say, eight or sixteen degrees of freedom. The issue is apparent regardless.

§

The idea is to break a street address string into clusters of bits to give us a vector of small values.

For example, using two-bit clusters, the street address [1234 Main Street] (with space padding out to 25 chars) gives us the following vector:

[1, 0, 3, 0, 2, 0, 3, 0, 3, 0, 3, 0, 0, 1, 3, 0, 0, 0, 2, 0, 1, 3, 0, 1, 1, 0, 2, 1, 1, 2, 2, 1, 2, 3, 2, 1, 0, 0, 2, 0, 3, 0, 1, 1, 0, 1, 3, 1, 2, 0, 3, 1, 1, 1, 2, 1, 1, 1, 2, 1, 0, 1, 3, 1, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0]

The two-bit clusters give us values of 0–3 for the components of the vector.

Given some other vector, say for [87 Bentley Place], which is:

[0, 2, 3, 0, 3, 1, 3, 0, 0, 0, 2, 0, 2, 0, 0, 1, 1, 1, 2, 1, 2, 3, 2, 1, 0, 1, 3, 1, 0, 3, 2, 1, 1, 1, 2, 1, 1, 2, 3, 1, 0, 0, 2, 0, 0, 0, 1, 1, 0, 3, 2, 1, 1, 0, 2, 1, 3, 0, 2, 1, 1, 1, 2, 1, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 2, 0]

What happens when we combine them?

Here are some of the possible answers:

  • [42?? ??hi???eabd]
  • [02 ? M?he ?P?aad]
  • [973venumnyct?egu]

The question marks represent illegal street address characters.

The combinations involve taking, respectively, the average, the min, and the max, of the values. I’m open to other ideas on ways to combine them.

§

Here’s the code:

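class AddressVector (object):
    """A street address as a vector of small bit-chunk values."""

    def __init__ (self, _addr):
        # Pad with spaces, then truncate, so every address is AddrSize chars.
        addr = (_addr+(AddrSize*' '))[0:AddrSize]
        self.data = []
        for ch in [ord(a) for a in addr]:
            # Peel off BitParts chunks per character, lowest bits first.
            for _ in range(BitParts):
                self.data.append(ch & BitMask)
                ch >>= BitShift

    def __call__ (self):
        # Return a copy of the components as plain ints.
        return [int(d) for d in self.data]

    def __len__ (self):
        return len(self.data)

    def __str__ (self):
        # Rebuild characters from the end of the vector: pop() yields each
        # character's highest chunk first, so shift-and-add restores the code.
        data = list(self.data)
        a = []
        while len(data):
            b = 0
            for _ in range(BitParts):
                d = data.pop()
                b <<= BitShift
                b += d
            a.append(LegalChar(b))
        # Characters were collected last-to-first, so reverse them.
        return ''.join(reversed(a))
#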

The AddressVector class embodies the idea of a vector in Address Space. The constructor takes a street address, pads or truncates it to the fixed length, and breaks it into chunks of bits to create a vector.

Note that the original string is not preserved. Successfully printing the vector as an address demonstrates the conversion code works. That gives us confidence newly created vectors are correctly converted to strings.
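Here’s a minimal stand-alone sketch of that round trip, assuming two-bit chunks. The names encode and decode are hypothetical helpers for illustration, not methods of the class, and decode skips the LegalChar filter so the raw characters come back.

```python
# Stand-alone sketch; encode/decode are hypothetical helper names.
ADDR_SIZE = 25   # characters per address, as in the post
BIT_SHIFT = 2    # bits per chunk
BIT_MASK = 3     # (1 << BIT_SHIFT) - 1
BIT_PARTS = 4    # chunks per 8-bit character

def encode(addr):
    """Pad or truncate to ADDR_SIZE, then split each character, low bits first."""
    padded = (addr + ADDR_SIZE * ' ')[:ADDR_SIZE]
    data = []
    for code in (ord(ch) for ch in padded):
        for _ in range(BIT_PARTS):
            data.append(code & BIT_MASK)
            code >>= BIT_SHIFT
    return data

def decode(data):
    """Reassemble each group of BIT_PARTS chunks into one character."""
    chars = []
    for i in range(0, len(data), BIT_PARTS):
        code = 0
        for chunk in reversed(data[i:i + BIT_PARTS]):  # highest chunk first
            code = (code << BIT_SHIFT) + chunk
        chars.append(chr(code))
    return ''.join(chars)

vec = encode('1234 Main Street')
print(vec[:8])             # chunks for '1' and '2': [1, 0, 3, 0, 2, 0, 3, 0]
print(repr(decode(vec)))   # the padded address comes back unchanged
```

The only difference between input and output is the space padding out to 25 characters.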

The class requires a few support definitions:

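AddrSize = 25   # characters per address (padded or truncated)
BitShift = 2    # bits per chunk
BitMask  = 3    # (1 << BitShift) - 1
BitParts = 4    # chunks per 8-bit character

def LegalChar (c):
    ch = chr(c)
    # Compare against explicit ASCII ranges; ch.upper() would let through
    # non-ASCII characters such as 'ß', whose upper() is 'SS'.
    if 'A' <= ch <= 'Z' or 'a' <= ch <= 'z':
        return ch
    if '0' <= ch <= '9':
        return ch
    if ch in ' -.,;:#':
        return ch
    return '?'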
#

The LegalChar function is just a helper for converting a vector back to a string. Many of the bit patterns that combining produces are not legal address characters (or even printable ones), so it replaces them with a question mark.

By (carefully) varying the three Bit* parameters, the chunk size can be adjusted.

[It’s bad coding to have three dependent variables like that. I should calculate two based on one (BitShift), but I’m lazy and don’t want to think about the math involved.]
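For what it’s worth, the math is short. Here’s a sketch that derives the other two values from the chunk width, assuming eight bits per character (which is what the values used in this post imply). The function name bit_params is made up for illustration.

```python
def bit_params(bit_shift, char_bits=8):
    """Return (BitShift, BitMask, BitParts) for a given chunk width.

    Assumes each character occupies char_bits bits (8 here).
    """
    bit_mask = (1 << bit_shift) - 1          # lowest bit_shift bits set
    bit_parts = -(-char_bits // bit_shift)   # ceiling division
    return bit_shift, bit_mask, bit_parts

for shift in (2, 3, 4):
    print(bit_params(shift))
```

Three-bit chunks need three parts per character because two would only cover six of the eight bits.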

Using them is simple enough:

addr1 = '1234 Main Street'
addr2 = '87 Bentley Place'

av1 = AddressVector(addr1)
av2 = AddressVector(addr2)
#

Which results in the two vectors shown above.

Combining them gives the results shown above.
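The combining step itself isn’t shown in the class, so here’s a stand-alone sketch of one way to do it. The helper names (encode, decode, combine) are hypothetical, the legality check mirrors LegalChar for ASCII, and rounding the average down is an assumption, though it is consistent with the outputs listed earlier.

```python
ADDR_SIZE, BIT_SHIFT, BIT_MASK, BIT_PARTS = 25, 2, 3, 4
LEGAL_PUNCT = ' -.,;:#'

def encode(addr):
    """Pad or truncate, then split each character into chunks, low bits first."""
    padded = (addr + ADDR_SIZE * ' ')[:ADDR_SIZE]
    return [(ord(ch) >> (BIT_SHIFT * i)) & BIT_MASK
            for ch in padded for i in range(BIT_PARTS)]

def decode(data):
    """Rebuild the string, showing '?' for characters illegal in an address."""
    out = []
    for i in range(0, len(data), BIT_PARTS):
        code = 0
        for chunk in reversed(data[i:i + BIT_PARTS]):  # highest chunk first
            code = (code << BIT_SHIFT) + chunk
        ch = chr(code)
        legal = code < 128 and (ch.isalnum() or ch in LEGAL_PUNCT)
        out.append(ch if legal else '?')
    return ''.join(out)

def combine(a, b, op):
    """Apply op to each pair of components."""
    return [op(x, y) for x, y in zip(a, b)]

v1 = encode('1234 Main Street')
v2 = encode('87 Bentley Place')

average = combine(v1, v2, lambda x, y: (x + y) // 2)  # assumed: round down
lowest = combine(v1, v2, min)
highest = combine(v1, v2, max)

for vec in (average, lowest, highest):
    print('[' + decode(vec).rstrip() + ']')
```

With two-bit chunks this prints the same three strings listed earlier: [42?? ??hi???eabd], [02 ? M?he ?P?aad], and [973venumnyct?egu].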

§

The problem is that we’re trying to create a new valid semantic object by combining low-level parts of two others.

But there is no semantic connection between the bit-level components and the address they spell out, so combining low-level parts results in gibberish.

§

FWIW, setting the parameters for three-bit chunks (BitShift = 3, BitMask = 7, BitParts = 3, giving eight degrees of freedom) gives us these vectors:

[1234 Main Street] = [1, 6, 0, 2, 6, 0, 3, 6, 0, 4, 6, 0, 0, 4, 0, 5, 1, 1, 1, 4, 1, 1, 5, 1, 6, 5, 1, 0, 4, 0, 3, 2, 1, 4, 6, 1, 2, 6, 1, 5, 4, 1, 5, 4, 1, 4, 6, 1, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0]

[87 Bentley Place] = [0, 7, 0, 7, 6, 0, 0, 4, 0, 2, 0, 1, 5, 4, 1, 6, 5, 1, 4, 6, 1, 4, 5, 1, 5, 4, 1, 1, 7, 1, 0, 4, 0, 0, 2, 1, 4, 5, 1, 1, 4, 1, 3, 4, 1, 5, 4, 1, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 0, 4, 0]

Note how the components now have values from 0–7.

The same three methods of combining vectors give us:

  • [04????jje??bkcdl]
  • [02 ? Maie ?Pjacd]
  • [973tentlnyctteeu]

§

For grins, here it all is in four-bit chunks (because why not):

[1234 Main Street] = [1, 3, 2, 3, 3, 3, 4, 3, 0, 2, 13, 4, 1, 6, 9, 6, 14, 6, 0, 2, 3, 5, 4, 7, 2, 7, 5, 6, 5, 6, 4, 7, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2]

[87 Bentley Place] = [8, 3, 7, 3, 0, 2, 2, 4, 5, 6, 14, 6, 4, 7, 12, 6, 5, 6, 9, 7, 0, 2, 0, 5, 12, 6, 1, 6, 3, 6, 5, 6, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2]

Now the vector values can go from 0–15. The combinations are:

  • [44?3B?bjiD1bgcdd]
  • [12 2 Maie  Pbacd]
  • [873DentlnySt?eeu]

So the chunk size doesn’t matter; the degrees of freedom don’t matter.

What matters is the decoupling of the bit patterns from the semantics.

§

I did give some thought to regular vector operations, say member-wise adding or multiplying. The problem there is it results in vectors with illegal values — values for the components higher than allowed.

(Normalizing might be an option, but it would require rounding off, and I think averaging the values did largely the same thing.)

Just adding and multiplying gives (as expected) gibberish:

  • [iiSv??????s??????????????]
  • [?????f???? p?????????????]

Those vector operations don’t even preserve the trailing spaces!
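To see why the padding doesn’t survive, here’s a small sketch that follows a single space character through member-wise addition and multiplication with two-bit chunks. The helper names (encode, to_code, in_range) are hypothetical.

```python
BIT_SHIFT, BIT_MASK, BIT_PARTS = 2, 3, 4

def encode(text):
    """Split each character into chunks, low bits first (no padding here)."""
    return [(ord(ch) >> (BIT_SHIFT * i)) & BIT_MASK
            for ch in text for i in range(BIT_PARTS)]

def to_code(chunks):
    """Shift-and-add one character's chunks back into a character code."""
    code = 0
    for chunk in reversed(chunks):  # highest chunk first
        code = (code << BIT_SHIFT) + chunk
    return code

def in_range(vec):
    """True if every component fits in a single chunk."""
    return all(0 <= v <= BIT_MASK for v in vec)

space = encode(' ')                                  # [0, 0, 2, 0]
added = [x + y for x, y in zip(space, space)]        # [0, 0, 4, 0]
multiplied = [x * y for x, y in zip(space, space)]   # [0, 0, 4, 0]

print(space, in_range(space), to_code(space))        # in range, code 32
print(added, in_range(added), to_code(added))        # out of range, code 64
print(multiplied, in_range(multiplied), to_code(multiplied))
```

A space plus a space (or times a space) has a component of 4, which doesn’t fit in two bits. Decoded anyway, it comes out as character code 64, the at sign, which isn’t a legal address character.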

So, again, I’m open to other ideas on combining them. I’m willing to try, but I don’t see how combining two valid addresses ever results in a new valid address.

§

The bottom line is that semantic pointers require a semantic basis. Without one, vector operations just result in random gibberish.

There is also the fact that quantizing and limiting the axes results in a closed space, and normal vector operations create vectors outside that space.


Note to readers: In conversations that may (or may not) follow, the term “pointer” is synonymous with “vector” because “pointers” are arrows and vectors are commonly visualized as arrows.

Additionally, the allusion to memory pointers is deliberate. The semantic vector is imagined to “point” (perhaps literally) to the concept it represents in the “memory” of the mind.