#233: Talking to the monsters

Kicking the tyres of Xbox's new machine-learning NPCs.

One soggy English afternoon when my youngest was about two years old, I asked him to do some colouring. In some magazine he’d acquired from a kindly grandparent I found a black-and-white picture of that scourge of parents everywhere, the evil Paw Patrol, and their human leader Ryder; I handed him the box of pens and left him to it, hoping it would buy me ten minutes of peace and quiet. Reader, needless to say, it did not. He returned to me two minutes later, saying he was finished.

He had coloured in the eyes. Just the eyes. They were red. And of course he’d gone over the lines, in his manic two-year-old way. Ryder and the pups looked terrifying, their rictus smiles, while untouched by my offspring’s hand, rendered somehow mad and angry. It was like something out of the end of Akira. And of course my kid looked up at me, eyes sparkling in the way that they do, waiting for some praise. Perhaps a sticker; he was toilet training at the time, and stickers were very much the thing. For a moment I just stood there, at a complete loss for what to say, though I was certainly thinking about a lot. It was like, mate. This isn’t what I asked for. It is, quite manifestly, not good. And it raises a whole host of uncomfortable issues that I really don’t want to think about. I gave him a sticker and did my best to forget about it.

That popped into my head this morning on the dog walk, while I was mulling over the subject of today’s edition. Not sure why.

“Midjourney, DALL•E 3 and GPT-4 have opened a world of endless possibilities,” some blue-tick schmo named Javi Lopez wrote on Twitter last week, before revealing that he had leveraged this ‘world of endless possibilities’ to make an Angry Birds clone. It’s called Angry Pumpkins.

“I'm genuinely blown away,” Lopez sobbed. “Honestly, I never thought this would be possible. I truly believe we're living in a historic moment that we've only seen in sci-fi movies up until now.”

Good heavens. At least my kid did a bit of colouring.

On Monday Microsoft announced a partnership with a machine-learning firm, Inworld AI, to empower developers of Xbox games to “create detailed scripts, dialogue trees, quest lines and more,” according to The Verge. The multi-year hookup will birth an Xbox equivalent of the Copilot system Microsoft has incorporated into Microsoft 365 and Windows; essentially a machine-learning version of Clippy that promises to do the boring bits of your work for you — or, on a corporate scale, to do the work of all the people you’ve laid off this year to cut costs.

“Inworld has been working on [ML] NPCs that react to questions from a player, much like how ChatGPT or Bing Chat responds to natural language queries,” The Verge reports. “These [ML] NPCs can respond in unique voices and can include complex dialogue trees or personalised dynamic storylines within a game. Inworld’s technology can also be used for narration, so companions in top-down RPGs can warn of groups of enemies or players up ahead.”

Look, it is easy to be snarky about this. And quite tempting, actually, so let us indulge ourselves for a moment. I am not surprised, given the Xbox operation’s well-documented struggle to maintain a regular cadence of first-party releases, to learn that Microsoft is interested in technology that automates the game-development process. Deep in the bowels of Xbox HQ I am sure they are building machine-learning agents to power the future creation of all sorts of things. Battle passes, and associated cosmetic doodads, to keep Halo Infinite on life support. Poochie-esque, Gen-Z-baiting radio chatter for the next Forza Horizon. Scripts for Phil Spencer’s future apologetic podcast appearances. Perhaps this is the answer to all of Xbox’s problems! You can certainly understand the motivation, eh.

I will admit I am a little bit conflicted about generative machine learning — or, at least, a little more conflicted than most industry onlookers appear to be. For one thing, I recognise that game developers have been using more primitive forms of this sort of technology for decades now, albeit much more quietly than the likes of Mr Lopez and his pumpkins. I absolutely see a use case for machine-powered asset creation, because for all that I fret about the danger this technology poses to human labour, I also recognise that there isn’t much of a career in spending 40 hours a week modelling rocks for open-world games that most players will never see.

And of course I am of vintage Edge stock, honour-bound to be curious about new technology; to give it the benefit of the doubt, at least at first. I managed to give blockchain games a fair shake, even working — very, very briefly — for a company operating in the space. I entered the Metaverse with a reasonably open mind, though that only lasted about 30 seconds. I should probably do the same for generative ML, even though it seems like just more VC-inflated nonsense. I should postpone judgement until I am able to find out for myself whether this malarkey is worth all the noise around it.

Happily, I am able to do so today, and so are you, since Inworld, the mob with which Microsoft is partnering, has a tech demo on Steam. Inworld Origins is a ‘playable short’ helmed by John Gaeta, the VFX whiz behind The Matrix’s bullet time and these days Inworld’s CCO. It’s a procedural (ha!) detective thing in which you pitch up to the scene of a laboratory explosion and question witnesses, all of whom are powered by Inworld’s technology. You mooch around, ask questions to which the responses are generated on the fly, and try to unravel what happened, and who’s behind it.

It is… not brilliant, this thing. There are awkward pauses while the technology parses your questions and generates responses to them (only a few seconds, but more than long enough to break the spell), and the NPC dialogue is delivered by the sort of placeholder text-to-speech software I know far too well from my consulting work. (For this reason, one assumes, several of the characters you speak to are robots.) The technology struggled at times with my accent (unremarkable, and you’d think unchallenging, RP English): when I said the phrase ‘human rights activist’ to one shady character, he responded as if I’d said ‘humans with dyed hair’. And several conversations came to an abrupt end when an NPC just stopped responding to me, presumably because of a glitch in the system. Maybe they were just annoyed with me? Entirely possible. Happens all the time in real life, too.

And yet I found it quite fascinating. At first I was effectively paralysed, standing there with no idea what to say, pining for the list of dialogue options this medium has trained me to expect. But I quickly slipped into a rhythm. I got quite deep into a chat with a police officer about the city in which the demo is set. I asked her about her work and she briefly alluded to people with superpowers existing in this world, and like a good journalist I thought that worth following up on. We spent a good few minutes talking about this stuff — about the kinds of superheroes there are in Metropolis, about the super-villains the city has known over the years, and the times she was witness to, and once even caught up in, their antics. Was this intended? Is any of this intended? I can’t say for sure, but I certainly doubt it. I just led the machine down a conversational side-road, and it happily followed me. There’s magic to that for sure.

Magic, perhaps, but very little point. Yes, this is a tech demo, here only to hint at what this technology could do in the hands of experienced, talented game makers. But if it is supposed to get me excited about this machine-generated gaming future, to get my synapses firing at the possibilities, it fails. The dialogue may be improvised on the fly, and spoken by characters willing to follow you along any tangent. But the writing is low-quality even in the context of a medium that has historically struggled with good storytelling. Even imagining a future iteration using a more human-sounding speech program, the staccato flow of conversations makes it all too apparent you are talking to a machine. And throughout there is a sense of doubt, both of myself and the technology — is my mic working? Did it hear me correctly? Am I asking the right questions? What am I supposed to do? — that stops it ever feeling, well. You know. Fun.

I will keep my mind open for a little while longer, but after my time with Inworld Origins I am even less convinced, even more sceptical about our machine-learning future, than I was yesterday. I do not believe that, if you gave this technology to one of our industry’s narrative heavyweights (an Inkle, Failbetter or Half Mermaid), they would come up with anything better than they have already achieved through their traditional, mostly manual working practices. I doubt they’d even want to touch it, frankly, but that’s not the point.

The point is: is any of this actually an improvement? Will it result in richer worlds, and better stories, than we have today? And if not, why are we even bothering? For all the freewheeling jibber-jabber of which Origins’ NPCs are capable, the clearest message they send is left unspoken. This isn’t about making things better, just making them faster and cheaper, and while I am all for small teams being able to dream bigger with the help of machines — the way a dozen people made the universe of No Man’s Sky, say — I feel very different indeed about big companies using them to work smaller. Yet as Microsoft has made clear this week, there’s probably not that much we can do to stop it. Because it has already begun.

Oh god, that’s an Akira quote, isn’t it. Guess I’ve got another couple of weeks of Paw Patrol-infused nightmares to look forward to.

That’s your lot! MORE and the Hit Points MAILBAG are dipping behind the paywall for a spell, as I continue the search for a way to make the production of this newsletter a bit more sustainable. (I’ll be talking about this some more in a future edition, btw. Please look forward to it.) If you'd like an extra Hit Points, including a rundown of the week’s news, in your inbox every Friday, you can join us on the happier, more fragrant side of the fence for just £4 a month. Cheerio!