GPT-2 seems to be a fancy Lorem Ipsum generator

Ahti Ahde
6 min read · Jan 23, 2021


Lorem Ipsum allows us to “hallucinate” meaning into a total lack of meaning. image source

I have started developing a World of Warcraft story generator for the NaNoWriMo AI competition, and I plan to publish a kind of dev diary of the project. This is update vol. 1:

Earlier this week I scraped all the World of Warcraft short stories that Blizzard publishes on their website for free. I may add Warcraft and Hearthstone lore as well, since they belong to the same “canon”. The full-scale implementation needs an aesthetic sentence generator, for which I thought I would use GPT-2. The non-linear narrative dramatic elements I would get from an improved version of a Partial Order Causal Link (POCL) planner, an approach under active academic research started by Michael Young and Mark Owen Riedl.

Having now gone through thousands of words of generated text, it seems my worst fears about GPT-2 are coming true. I hope I will discover a way to prove myself wrong about my hypothesis: that GPT-2 is able to make us believe it generates natural text while actually conveying zero meaning in the sentences it writes.

As human beings, it is very easy for us to fill meaningless nonsense with meaning, and it seems the hype around GPT-2 was more of a publicity stunt by Elon Musk than something based on genuine scientific merit (as opposed to BERT, which actually solves interesting engineering problems; I will write more about “BERT-powered” Dialogflow, as it is very much related to machine-generated narrative structures).

In natural language generation there are two distinct goals to accomplish: delivering meaning, and achieving a decent level of human aesthetics so that the language is engaging rather than a series of pure logical claims (a POCL reduces a story to logical claims about agents and their state changes between start and end conditions, based on an action space).

In each paragraph by a professional writer, at least one unit of meaning is delivered. This is achieved by claiming at least one change in the state of the world, where some subject causes a change in another subject or in the supposed cognition of the reader (these modes of state are known as fabula and sujet).

A purely logical presentation of story-point progress. image source

POCLs are an extreme version of natural language generators (or should I say natural logic generators) that do this meaning delivery well: each sentence changes a state, because it describes a dramatically significant agent performing a state-transforming action on another dramatically significant agent or on the world. (In a sense, objects in the world can be considered agents too, even when they exist passively as obstacles with a single purpose: to have their state changed to non-obstacle. Narratively this makes a lot of sense.)

POCLs have two problems: 1) you have to predefine the action space and the agents, and 2) they are combinatorially explosive, so you need a method of “relevancy realization” that cuts down the exploration paths by splitting the big narrative into smaller units and limiting the action space and agents per narrative scene. In return, they can deliver meaningful story arcs from nothing more than a definition of the start and end states of a scenario. The problem is that the generated sentences are devoid of human aesthetics; they are pure predicate logic.
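To make this concrete, here is a deliberately tiny sketch of planning over a predefined action space. It is not a real POCL implementation (a POCL reasons over partially ordered steps and causal links); it is a plain breadth-first state-space search, and all the fact and action names are invented. It does, however, show both properties discussed above: a plan emerges from nothing but a start and goal state, and the frontier grows combinatorially with the action space.

```python
from collections import deque

# Toy action space over boolean facts: (preconditions, add effects, delete effects).
# All names are invented for illustration.
ACTIONS = {
    "hero_travels_to_cave": ({"hero_at_village"}, {"hero_at_cave"}, {"hero_at_village"}),
    "hero_finds_sword":     ({"hero_at_cave"}, {"hero_has_sword"}, set()),
    "hero_slays_dragon":    ({"hero_at_cave", "hero_has_sword"}, {"dragon_dead"}, set()),
}

def plan(start, goal):
    """Return a shortest action sequence turning `start` into a state containing `goal`."""
    frontier = deque([(frozenset(start), [])])
    seen = {frozenset(start)}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for name, (pre, add, delete) in ACTIONS.items():
            if pre <= state:  # action applicable in this state
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None  # goal unreachable with this action space
```

With the three-action space above, `plan({"hero_at_village"}, {"dragon_dead"})` yields the three-step arc travel, find sword, slay dragon; every added action multiplies the states the search has to consider, which is exactly the explosion that makes relevancy realization necessary.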

GPT-2 is the opposite of this: it is a pure Lorem Ipsum generator. Lorem Ipsum feels pleasantly aesthetic because it is based on non-language, which doesn’t really exist but looks like a language. GPT-2 is a kind of second-generation nonsense generator.

Why would I say such a thing?

I ran batches of GPT-2 output through an NLP pipeline to extract sets of common transformative actions and characters from the World of Warcraft stories. The character extraction works very well (named entity recognition), but when I tried to extract the actions I noticed something interesting: GPT-2 only makes meaningful statements accidentally, and it tends to avoid making any meaningful statements at all.

Why might this be?

My hypothesis is that it is easier to create aesthetically pleasant language when it makes no logically incoherent claims about reality. For this reason GPT-2 has learned (or, more probably, was originally designed) to avoid meaningful logical claims altogether: if a logical claim broke coherence, the narrative aesthetics of the sentence would also cognitively break apart in the human mind. What GPT-2 seems to do instead is optimize for aesthetic association, meaning it looks like it stays within a topic, but it doesn’t progress anywhere in a narrative sense.

While you can narrate a story in a non-linear chronological order, that is not what we mean by non-linear story elements. The crisis points here are the non-linear story points, which are represented in a linear sequence that fills the gaps in an aesthetically pleasant manner; I will later write about how fractals relate to all this. image source

Sentences in a paragraph have both linear and non-linear aspects in a narrative. The linear narrative demands that the sentences form an ordered sequence through association from one sentence to the next (or rather from the next sentence back to the previous one; sentences that fail this test do not get to become the next sentence). The non-linear narrative contains the dramatic elements whose state changed in order to progress the logical goals of the story. This relationship between linear and non-linear structure is what makes some fractal- and network-theory-based methods of narrative analysis so effective (for example this).
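The “next sentence associates back to the previous one” test can be sketched crudely in a few lines. This is only a stand-in: I use Jaccard word overlap as a proxy for semantic association, whereas a real system would use embeddings or a language model score; the threshold and example sentences are invented.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences, from 0.0 to 1.0."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def pick_next(prev, candidates, threshold=0.2):
    """Rank candidate next sentences by how strongly they associate back
    to the previous sentence; candidates below the threshold fail the
    test and do not get to become the next sentence."""
    scored = sorted(((jaccard(prev, c), c) for c in candidates), reverse=True)
    return [c for score, c in scored if score >= threshold]
```

Note that this only checks the linear aspect; nothing in it rewards a state change, which is why a generator optimized purely for this kind of association can stay “on topic” forever without the story progressing.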

In machine learning it is very common to combine methods and models from different problem fields into an ensemble that discriminates against the worst errors of each model while combining their strongest aspects.

My idea is to add aesthetic enhancements around the POCL, hopefully in a way that also partly mitigates the combinatorial-explosion problem. Originally I planned to use GPT-2 alone, but I think I need to add BERT as well, because of its question-answering capability: I could make it “hallucinate” candidate actions for the action space I feed to the POCL, and then feed the resulting logical claims back to my Lorem Ipsum generator (GPT-2).

If someone knows how I can force GPT-2 prefix prompts to “answer questions”, I am all ears (adding BERT adds complexity, and I want to keep things simple). What I need now is a system that can give me sensible logical answers to a question like “Why did Garrosh lose his weapon?”. The answers do not need to be perfect, but I need to be able to generate a batch of such answers and develop some simple (but generalizable, via grammar or narrative theory) heuristics to pick the best candidate actions to add to the POCL action space.
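I don’t have a working recipe yet, but the shape I’m experimenting with is a few-shot prefix prompt plus a cheap filter over the batch of sampled answers. The sketch below shows only those two pieces; the GPT-2 sampling call itself is left out, and the example Q/A pairs and filter rules are invented for illustration.

```python
def build_prompt(question):
    """Few-shot prefix prompt nudging GPT-2 toward Q/A-shaped completions.
    The example Q/A pairs are invented for illustration."""
    examples = [
        ("Why did Thrall leave the Horde?",
         "Thrall left the Horde because he chose to heal Azeroth as a shaman."),
        ("Why did Arthas take Frostmourne?",
         "Arthas took Frostmourne because he believed it would save his people."),
    ]
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\nQ: {question}\nA:"

def looks_like_answer(answer, agents):
    """Cheap heuristic filter over sampled answers: keep those that name a
    known agent, commit to a causal 'because' clause, and end cleanly."""
    return (any(agent in answer for agent in agents)
            and "because" in answer.lower()
            and answer.rstrip().endswith("."))
```

The filter is where the generalizable heuristics would go; “names an agent and commits to a because-clause” is just the simplest version of “the answer must make a logical claim”.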

I believe the next leap in natural narrative generation depends on how we deal with this problem of non-linear logical claims about the progress of story points. Contemporary philosophy gives us a lot of tools for understanding this. One might say that this non-linear progress is what Gilles Deleuze calls Actualization within a Repetition, a kind of Bergsonian idea of time as a non-linear interval between changes rather than a measurable unit of words or seconds.

In all machine learning, associative meaning discovery can lead to premature excitement about results, and it takes experience to remain critical about what was actually achieved. Perhaps the most important skill in this profession is not to hallucinate success when the data contains nothing that scientifically supports significance; the first thing you want to develop is a criterion for significant success.

For me, in this project, that criterion is the ability to produce a natural narrative generator that adheres to the structure of Gilles Deleuze and Bergson in such a way that I can represent the story as an HTML5 document with distinct headlines denoting the non-linear progress of story elements and the linear progress of sensible paragraphs (then I can hide the headlines and present the story as a novel, but edit it by remixing the headline-delimited sections of the text).
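As a sketch of the document format I have in mind (structure and names invented, standard library only): each non-linear story element becomes a headline-delimited section, and the “novel view” simply hides the headlines with CSS while keeping the sections editable.

```python
from html import escape

def story_to_html(sections, hide_headlines=False):
    """Render (headline, paragraphs) pairs as an HTML5 document.
    Headlines mark non-linear story elements; paragraphs carry the
    linear narrative. Hiding the headlines yields the 'novel view'."""
    style = "h2 { display: none; }" if hide_headlines else ""
    body = []
    for headline, paragraphs in sections:
        body.append(f"<section>\n<h2>{escape(headline)}</h2>")
        body.extend(f"<p>{escape(p)}</p>" for p in paragraphs)
        body.append("</section>")
    return ("<!DOCTYPE html>\n<html>\n<head><style>" + style
            + "</style></head>\n<body>\n" + "\n".join(body)
            + "\n</body>\n</html>")
```

Because the sections survive in the markup either way, remixing the story is just reordering `<section>` elements, while readers of the novel view never see the scaffolding.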


Ahti Ahde

Passionate writer of fiction and philosophy disrupting the modern mental model of the socio-capitalistic system. Make People Feel Less Worthless.