Words of Life

A variant of Conway's Game of Life

First Published: 15/12/22

“I've actually written about it, he [Wittgenstein] talks about how language developed … couple of people are together. One of them points to a rock and says “rock”, and the other one says a “rock”. It's just not even remotely like that. That's just off the spectrum of discussion. None of these things happen. That's not the way language develops at all. In fact, the concepts of our mind, you can easily show, are much richer than anything that's presented.” - Noam Chomsky
This morning, I was browsing Hacker News when I came across this link from Saharan, who created a recursive Game of Life. Mesmerized by its wonderful patterns, I started thinking - what would happen if we replaced each live cell with a letter? More specifically, I propose a small change to Conway's Game of Life: instead of being simply alive, every live cell carries a letter of the alphabet. Things get fascinating when you ask yourself -
Assuming a certain probability distribution over the alphabet, what kind of words can we construct from the letters that survive? Is there a word of life?
So like every curious programmer, I wrote some code to explore this problem. I started with a simple script that generates an 8 x 8 grid and runs the classic Game of Life on it. I chose to use the glider pattern for this experiment.
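
A minimal sketch of that starting point is below. The grid size and the glider come from the text; the edge handling (cells beyond the grid count as dead) and the glider's placement are my own choices, and the actual script may differ.

def life_step(grid):
    # One classic Game of Life step on a list-of-lists grid.
    # Cells outside the edges are treated as dead (no wrap-around).
    n = len(grid)
    new = [[False] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            live = sum(
                grid[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dy, dx) != (0, 0)
                and 0 <= y + dy < n and 0 <= x + dx < n
            )
            # Birth on exactly 3 neighbours; survival on 2 or 3.
            new[y][x] = live == 3 or (grid[y][x] and live == 2)
    return new

grid = [[False] * 8 for _ in range(8)]
for y, x in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:  # a glider
    grid[y][x] = True

for _ in range(4):
    grid = life_step(grid)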

Next, I replaced the starting live cells with letters. For the first experiment, my probability distribution was based on the relative frequency of each letter in English. I also modified the rules to accommodate the change mentioned above. All that was left now was to see it in action!
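
Seeding the letters might look something like this. The frequency table is abridged to the most common letters (the real run would use all twenty-six), and the names are mine. In my reading of the modified rules, a surviving cell keeps its letter and a newborn cell draws a fresh one from the same distribution, though other conventions (say, inheriting a neighbour's letter) would work too.

import random

# Approximate relative letter frequencies in English (per cent),
# abridged here for brevity.
FREQ = {'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
        's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'u': 2.8}
LETTERS, WEIGHTS = zip(*FREQ.items())

def random_letter():
    # random.choices draws with replacement according to the weights.
    return random.choices(LETTERS, weights=WEIGHTS)[0]

# Seed an 8 x 8 grid: the glider's live cells get letters, the rest None.
grid = [[None] * 8 for _ in range(8)]
for y, x in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    grid[y][x] = random_letter()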

I ran each Game of Life for four iterations and repeated this with a thousand different starting grids. Here's what I found (a sketch of one possible tallying step follows the list) -
[
    ("tea", 178),
    ("eta", 178),
    ("eat", 178),
    ("ate", 178),
    ("toe", 176),
    ("tie", 174),
    ("rte", 160),
    ("ten", 159),
    ("net", 159),
    ("est", 152),
    ("set", 152),
    ("the", 144),
    ("one", 144),
    ("eon", 144),
    ("roe", 143),
    ("ore", 143),
    ("sea", 137),
    ("ear", 133),
    ("are", 133),
    ("era", 133),
    ("nae", 130),
    ("oat", 123),
    ("tel", 121),
    ("let", 121),
    ("hoe", 121),
    ("sat", 120),
    ("hen", 120),
    ("hie", 116),
    ("res", 116),
    ("ire", 115),
    ("ode", 115),
    ("doe", 115),
    ("ted", 113),
    ("sot", 112),
    ("hes", 112),
    ("she", 112),
    ("ale", 110),
    ("lea", 110),
    ("its", 106),
    ("sit", 106),
    ("her", 106),
    ("ton", 105),
    ("not", 105),
    ("ens", 105),
    ("sen", 105),
    ("tor", 105),
    ("rot", 105),
    ("hit", 104),
    ("hat", 102),
    ("oar", 100),
    ("tan", 99),
    ("ant", 99),
    ("int", 98),
    ("nit", 98),
    ("tin", 98),
    ("hot", 98),
    ("tho", 98),
    ("die", 94),
    ("tar", 93),
    ("lie", 93),
    ("art", 93),
    ("lei", 93),
    ("air", 93),
    ("rat", 93),
    ("ole", 93),
    ("ion", 90),
    ("nth", 90),
    ("den", 90),
    ("end", 90),
    ("alt", 89),
    ("lat", 89),
    ("eds", 89),
    ("nor", 88),
    ("dot", 88),
    ("has", 87),
    ("ash", 87),
    ("etc", 87),
    ("hos", 86),
    ("ohs", 86),
    ("rel", 85),
    ("tad", 85),
    ("red", 84),
    ("met", 82),
    ("ail", 77),
    ("til", 75),
    ("lit", 75),
    ("his", 74),
    ("sir", 74),
    ("ran", 73),
    ("ado", 73),
    ("ace", 73),
    ("lot", 73),
    ("nah", 72),
    ("dos", 72),
    ("sod", 72),
    ("aid", 72),
    ("ans", 71),
    ("hrs", 71),
    ("rent", 70),
    ("tern", 70),
]            
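
The exact rule for reading words off a grid isn't spelled out above, so take this as one plausible tallying step: a word counts once per run if it can be spelled from the letters left alive after the four iterations. Both word_list and can_form are stand-ins I've introduced for illustration.

from collections import Counter

def can_form(word, letters):
    # True if `word` can be spelled from the multiset of `letters`.
    pool = Counter(letters)
    need = Counter(word)
    return all(pool[ch] >= need[ch] for ch in need)

def tally(runs, word_list):
    # Count, across runs, how often each word is formable from the
    # letters left alive at the end of a run.
    counts = Counter()
    for letters in runs:
        for word in word_list:
            if can_form(word, letters):
                counts[word] += 1
    return counts.most_common()

# Toy example: two runs' surviving letters against a tiny word list.
print(tally([['t', 'e', 'a', 'r'], ['e', 'a', 't']], ['tea', 'ate', 'rat']))
# [('tea', 2), ('ate', 2), ('rat', 1)]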

In hindsight, this seems kind of obvious, as the letters with the highest frequency are 'e', 't', and 'a', followed by 'o' and 'i'. However, it was surprising to see how important 'r' (only the ninth most frequent letter) seemed to be in the simulation. There were only two four-letter words in the top 100, and both had an 'r' in them - rent and tern. I ran this again, and the results were more or less the same.

This got me thinking - what would happen if all the letters had equal probability? So I ran the same experiment with a uniform distribution, only this time with four thousand different starting grids! Here are the results -
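
Switching to the uniform case is a one-line change to the sampler, something like:

import random
import string

def random_letter_uniform():
    # Each of the twenty-six letters is drawn with probability 1/26.
    return random.choice(string.ascii_lowercase)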

Most common words:
[('ltd', 108), ('ult', 104), ('jut', 103), ('icy', 103), ('tab', 103), ('bat', 103), ('fut', 102), ('flt', 100), ('fro', 99), ('for', 99)]

Most common four-letter words:
[('iron', 30), ('ship', 30), ('phis', 30), ('hips', 30), ('bets', 30), ('best', 30), ('pawn', 30), ('tube', 29), ('glut', 29), ('cult', 29)]

Most common five-letter words:
[('bahts', 10), ('baths', 10), ('donas', 9), ('wafer', 9), ('gaits', 9), ('nitro', 8), ('intro', 8), ('prawn', 8), ('bytes', 8), ('chats', 8)]

Most common six-letter words:
[('ovular', 4), ('ration', 3), ('shadow', 3), ('thrive', 3), ('josher', 3), ('thorns', 3), ('bracts', 3), ('beauty', 3), ('prefab', 3), ('ticker', 3)]

Now, in all honesty, none of these seem to mean anything significant - it all looks quite chaotic. The results could easily have been different with a different random number generator (I was using Python's built-in random module). I am curious how the results would look on a different architecture with, say, ten thousand different starting grids! Running more than four iterations of the Game of Life seems to blow up the search space.
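
One small aside: if you want a run to be repeatable while playing with the script, seeding the built-in generator before sampling is enough.

import random

random.seed(2022)  # any fixed seed makes the letter draws reproducible
print(random.choice('abc'), random.choice('abc'))  # same output every run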

I guess, for now, you could say that my machine has decided that its words of life are - ltd, iron, bahts, and ovular. What a sequence! I am open-sourcing the script used to conduct the above experiment. I would love to know your thoughts on this. Happy exploring!