In part II of this series on poiesthetic writing assistance, we will showcase Poiesis Studio, our testbed for exploring AI-driven writing assistance. In part III, we will demonstrate interactive text generation using our system and reflect on what we have learned from trying it out so far.
If we take the view that masking text can be thought of as a sort of query, and the answers as probable sentences, what sort of questions might a writer want to ask? Each question is a particular writing situation, where a user wants to expand upon their text in some way. This user-driven interactive approach to content creation is not really considered in the general research on language models. Our work on using masked language models for content creation is among the first of its kind. However, even since we started looking into this problem a couple of years ago, the potential for new applications has become ever more evident. Expect this area to explode over the next 5-10 years!
To begin our exploration into AI-driven writing assistance, we developed a prototype for interfacing with language models in a writing context. Working hands-on with the writing assistant allows us to understand the utility of language models for writing assistance. It’s difficult to guess what this utility would be without having a tangible system to interact with. There is an enormous difference between simply imagining what a system can do and constructing it as a tangible object that can be interrogated through play.
What is Poiesis?
The system we created is called “Poiesis Studio”. It serves as a playground for interacting with language models and text generation tools in general. Why the word ‘poiesis’? In terms of a creative endeavour, poiesis is the creative spark from which new ideas materialise. Every creator has their own process for entering poiesis. For many, poiesis corresponds to the moments where we feel ‘inspired’. Unfortunately, this is often a mysterious state of being that we can only hope to stumble across. We might attempt to nudge ourselves into contexts where we find inspiration, by going on walks, visiting a cultural place, listening to music etc. But inspiration is often unattainable when we want it to be there. This is the famous problem of ‘writer’s block’, where inspiration is found lacking.
One of the purposes of Poiesis Studio is to provide a sort of ‘poiesis engine’. This engine can serve as a writer’s spark of poiesis, so that needing to feel inspired is no longer a prerequisite for creation. However, although Poiesis Studio might step in when you are feeling uninspired, that doesn’t at all mean that you won’t be inspired when using our tool! The user always directs the system, choosing the writing contexts in which they want to activate it. Moreover, the user always serves as aesthetic judge, evaluating the intrinsic value of whatever the system generates. In other words, they choose what they like from what the system suggests and decide how to incorporate it back into their composition. This cycle of generate, judge, incorporate is the fundamental lifecycle of using the tool. The moment of poiesis is only present at the first stage, generation, and only when the user wants it to be there.
So, let’s check out Poiesis Studio itself. When you start Poiesis Studio, you first have to choose which genre of document you will be writing. Right now, we only consider general-purpose ‘writing’ and ‘lyrics’. These are much the same at the moment, except for the underlying language models used. In future versions, we plan to customise the writing experience for various types of documents.
Once you click on a particular document type, you will be presented with a good old-fashioned, familiar text editor. We’ve already typed in “This is some example text” to demonstrate some features of the editor. This editor has the usual functionality one would expect from a text editor, such as text justification and font styles like italics and bold. The one thing that is different in this editor is the ‘poiesis’ mode button in the top right-hand corner of the editor.
Clicking on this button brings up the menu corresponding to ‘poiesis’ mode. This mode consists of the ‘plugin selector’, the ‘focus selector’ and the ‘poiesis toolbar’. Let’s explain a little more about the poiesis mode interface.
The plugin selector is used to choose which plugin to use in poiesis mode. This selects both the underlying AI language model and the UI elements that are available for interacting with it. You can see in the screenshot that we selected the ‘WRITE’ plugin. The plugin is accompanied by labels that inform the user about the functionality it provides. In particular, we can see that:
- It supports English, shown by the UK flag.
- It supports restrictions on words based on word categories, shown by the ‘RESTRICT TAGS’ label. (We will explain what this means later.)
- It is a masked language model, shown by the ‘MASK’ label.
In a text editor context, user interface elements are normally presented in the toolbar. That is what we’ve gone for with our approach. However, it’s not just the toolbar elements that change. The text itself needs to be interacted with differently.
When in poiesis mode, the way that a user interacts with the text in the editor changes. In the case of our masked language model plugin, the text content is frozen and each paragraph now becomes a clickable region that will become the focus of poiesis when double-clicked. This focus is the part of the document which the user is regenerating. The surrounding paragraphs are also provided to the underlying language model as context when it regenerates text.
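To make the idea of a focus and its context concrete, here is a minimal Python sketch. It is not the Poiesis Studio implementation: the function name `focus_with_context` and the `window` parameter (how many neighbouring paragraphs count as context) are assumptions made purely for illustration.

```python
from typing import Dict, List, Union


def focus_with_context(
    paragraphs: List[str], focus_index: int, window: int = 1
) -> Dict[str, Union[str, List[str]]]:
    """Split a document into the focused paragraph and its neighbours.

    `window` is a hypothetical setting: the number of paragraphs on each
    side of the focus that are passed along as context.
    """
    start = max(0, focus_index - window)
    return {
        "before": paragraphs[start:focus_index],
        "focus": paragraphs[focus_index],
        "after": paragraphs[focus_index + 1 : focus_index + 1 + window],
    }


document = ["First paragraph.", "This is some example text.", "Last paragraph."]
request = focus_with_context(document, focus_index=1)
print(request["focus"])  # This is some example text.
```

Only the focus would be regenerated. The `before` and `after` paragraphs are read-only context.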
Once the focus has been selected, the user is presented with a masked representation of the text in that focus.
This representation consists of the focus text segmented into its constituent parts. This segmentation might not fall in line with your intuition about how a sentence should be split up. For example, the full stop is considered to be in its own segment in the example above. We won’t go into details of why the sentence is segmented in this way, but essentially, the AI model has its own idea of what these segments should be, and we need to conform to that using our current implementation.
To relate this representation back to its original surface form, we keep track of any splitting that differs from the white space segmentation we are used to. This is indicated to the user by a little ‘tie’ that joins the words. So you can see in the example that the word ‘text’ and the full stop following it are attached in the underlying text, i.e. it is actually ‘text.’ and not ‘text .’.
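The ‘tie’ bookkeeping can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the segmenter used by the model: the `Segment` class, the `tied_to_previous` flag and the punctuation-peeling rule are all invented here to show how a surface form can be recovered from segments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    text: str
    # True when this segment was NOT separated from the previous one by
    # white space in the original text (shown as a 'tie' in the interface).
    tied_to_previous: bool = False


def segment(text: str) -> List[Segment]:
    """Toy segmenter: split on white space, then peel trailing punctuation."""
    segments: List[Segment] = []
    for word in text.split():
        core = word.rstrip(".,!?;:")
        trailing = word[len(core):]
        if core:
            segments.append(Segment(core))
        for i, char in enumerate(trailing):
            segments.append(Segment(char, tied_to_previous=bool(core) or i > 0))
    return segments


def surface_form(segments: List[Segment]) -> str:
    """Rebuild the original text, inserting spaces only where there is no tie."""
    out = ""
    for i, seg in enumerate(segments):
        if i > 0 and not seg.tied_to_previous:
            out += " "
        out += seg.text
    return out


segments = segment("This is some example text.")
print([s.text for s in segments])  # ['This', 'is', 'some', 'example', 'text', '.']
print(surface_form(segments))      # This is some example text.
```

Because the tie is stored on the segment, the text can be segmented however the model needs and still be put back together exactly as the user typed it.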
Each segment may be associated with a tag, which is identified automatically by the AI model. In this example, the tags are part-of-speech tags corresponding to:
- PROPN: Proper Nouns
- NOUN: Nouns
- ADJ: Adjectives
- VERB: Verbs
- ADV: Adverbs
The segmented sentence, along with its tags, is used to ‘mask’ words. This is done either by clicking on individual words or by clicking on a word category, which masks all words belonging to that category. Finally, there is a button to the right of this representation that generates the missing words.
Now that we have this masking representation, let’s look at the different ways we can mask text:
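The two masking gestures, clicking a word and clicking a category, can be sketched as follows. This is an illustrative Python sketch only: the `[MASK]` placeholder, the function names and the hand-assigned tags are assumptions, and the real mask token and tags depend on the underlying model.

```python
from typing import List, Optional, Set, Tuple

MASK = "[MASK]"  # placeholder; the actual mask token depends on the model

# (segment, tag) pairs. The tags are assigned by hand for illustration.
Tagged = Tuple[str, Optional[str]]
tokens: List[Tagged] = [
    ("This", None),
    ("is", "VERB"),
    ("some", None),
    ("example", "NOUN"),
    ("text", "NOUN"),
    (".", None),
]


def mask_by_index(tagged: List[Tagged], indices: Set[int]) -> List[str]:
    """Mask the segments the user clicked on."""
    return [MASK if i in indices else seg for i, (seg, _) in enumerate(tagged)]


def mask_by_tag(tagged: List[Tagged], tag: str) -> List[str]:
    """Mask every segment that carries the given tag."""
    return [MASK if seg_tag == tag else seg for seg, seg_tag in tagged]


print(mask_by_index(tokens, {3}))
# ['This', 'is', 'some', '[MASK]', 'text', '.']
print(mask_by_tag(tokens, "NOUN"))
# ['This', 'is', 'some', '[MASK]', '[MASK]', '.']
```

Masking several words at once is just `mask_by_index` with more than one index in the set.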
Single Word Mask
Multiple Word Mask
Single Tag Mask
Add and remove segments
Using the masking toolbar, you can also add or remove segments. When adding a new segment, the segment will be shown as an underscore where the word should be, and it will be masked automatically. This is helpful for playing with a sentence and generating different variations of it. In the following example, we added a mask before ‘example’ and a mask after ‘text’.
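As a rough sketch of what adding and removing segments does to the underlying sequence, consider the Python below. The function names are hypothetical, and the underscore stands in for the empty masked slot shown in the interface.

```python
from typing import List

EMPTY_MASK = "_"  # how a newly added, automatically masked segment is displayed


def add_mask(segments: List[str], position: int) -> List[str]:
    """Return a copy with a new masked segment inserted at `position`."""
    return segments[:position] + [EMPTY_MASK] + segments[position:]


def remove_segment(segments: List[str], position: int) -> List[str]:
    """Return a copy with the segment at `position` removed."""
    return segments[:position] + segments[position + 1:]


segments = ["This", "is", "some", "example", "text", "."]

# Add a mask before 'example' ...
segments = add_mask(segments, segments.index("example"))
# ... and another one after 'text'.
segments = add_mask(segments, segments.index("text") + 1)

print(segments)
# ['This', 'is', 'some', '_', 'example', 'text', '_', '.']
```

Both helpers return new lists rather than editing in place, which makes it easy to keep earlier variations of a sentence around while experimenting.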
Finally, each segment can be associated with a tag restriction. This indicates to the system that a regenerated segment should match the tag the user has selected. We’ve done that here, by suggesting that the mask following the word ‘some’ should be an adjective and the mask following the word ‘example’ should be a noun.
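One simple way to picture a tag restriction is as a filter over the model’s candidate words. The Python sketch below is an assumption about how such a filter could work, not a description of our implementation. The candidate words, tags and scores are made up for illustration and are not real model output.

```python
from typing import List, Optional, Tuple

# (word, tag, score) triples. These values are invented for illustration.
Candidate = Tuple[str, str, float]
candidates: List[Candidate] = [
    ("code", "NOUN", 0.45),
    ("simple", "ADJ", 0.30),
    ("quickly", "ADV", 0.10),
]


def best_candidate(
    options: List[Candidate], required_tag: Optional[str] = None
) -> Optional[str]:
    """Pick the highest-scoring word, optionally restricted to one tag."""
    allowed = [c for c in options if required_tag is None or c[1] == required_tag]
    if not allowed:
        return None  # nothing satisfies the restriction
    return max(allowed, key=lambda c: c[2])[0]


print(best_candidate(candidates))         # code
print(best_candidate(candidates, "ADJ"))  # simple
```

Without a restriction the top-scoring word wins regardless of its category. With the restriction, only words of the requested category are considered.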
In part II of our series on poiesthetic writing assistance, we explained what poiesis is and showcased the Poiesis Studio interface itself. In part III, we will demonstrate using Poiesis Studio in an interactive text generation context and even write some lyrics!
Our take-home messages are as follows:
- Poiesis is the point in a creative endeavour where new ideas are formed
- Poiesthetic writing assistance is very different to other forms of writing assistance
- The aim isn’t to automatically compose a piece of text, but to compose text interactively in a user-driven way
We are excited to make more progress on improving our work both technically and conceptually by better understanding and supporting poiesthetic writing experiences. Feel free to contact me at [email protected] if you want to get in touch. We are very happy to collaborate with writers, language researchers and techies!
Question: Why use the word ‘poiesis’? Isn’t that just branding and fluff, unnecessary high-level padding?
I take the view that it is important to clearly differentiate our vision of the future of writing from what is currently known by the public and other researchers. To do that, we need to erect temporary conceptual scaffolding that makes it clear what we are trying to achieve. The notion of poiesis perfectly captures that. In the future, we might just talk about text ‘plugins’ for ‘text generation’. If you want to think in those terms, then please go ahead.