Roottrees, AI, and Perception

Outside of IFComp season, it’s rare that I explicitly assign a game a status of “will not finish” — this blog owes its very existence to my tendency to go for “will finish later” instead. It’s even rarer that I post about it here when I do. But here we are. My experience with The Roottrees are Dead has some peculiarities worth recording.

The Roottrees are Dead is a free online deduction game along the lines of Return of the Obra Dinn and The Case of the Golden Idol. The plot involves identifying heirs to a fortune by filling out an extensive family tree, using information largely obtained from in-game web searches. (The UI for this feels weird: rather than show you the individual search results, it narrates a summary of them, but it displays this narration on the in-game CRT as if it were the search engine output.) There’s a tutorial where you have to identify three sisters in a photograph before you get access to the full family tree, although even in the tutorial you have full access to the search engine, as well as a long list of names you can search for long before they become relevant.

And that’s basically as far as I got. One of the three sisters was easy to identify, but the other two? I could not tell which was which from the information given, and even though there were only two possibilities, I was too proud to simply take a random guess so early in the game. Or perhaps not just proud; my experience with games of this sort is that guessing degrades the experience. It’s not just that it deprives you of the satisfaction of feeling clever at that moment, it’s also that each thing you puzzle out teaches you a little bit about how the author thinks and what kind of information is important. Skipping stuff without even an after-the-fact explanation of how you were supposed to have figured it out makes it harder to figure out more things later. And this was the tutorial, the part of the game that’s specifically meant to teach you, so skipping through it without learning what I was supposed to learn seemed potentially disastrous for what followed.

And so, when I finally gave up after an hour or two of diving down in-game web-search rabbit-holes and finding no new information about the sisters, rather than just guess, I sought hints. I was flabbergasted to learn that the intended solution involved noticing that the sister in the back is wearing a plaid shirt. I had seen the mention of plaid in the web-search text about the sisters, but had rejected it as relevant to the photo because obviously no one in it was wearing plaid; the shirt that I was now being told is plaid was clearly a shirt with red-and-black horizontal stripes. I took a closer look at the picture — maybe it was really plaid, and just looked like stripes to me? No, scrutiny just confirmed my initial impression.

Now, here’s the weird part. I’ve talked about this with several other people online, on the game’s official Discord channel and elsewhere. Every single person who weighed in on the matter had simply perceived the shirt as plaid. Some conceded that it’s actually striped if you look more closely, but they didn’t notice this at first glance. I didn’t perceive it as plaid at first glance — I still can’t perceive it as plaid no matter how hard I try. But apparently I’m the only one like this! This game has been out for three months, and this is the very first photo in it — heck, it’s on the game’s title screen! — and you have to inspect it closely for details to figure out who the third sister is, and yet I can find no trace of anyone else online running into the same problem. I suppose some people might have had problems, and guessed, and got on with it. But beyond that, the game has more than 20 credited beta testers, all of whom apparently signed off on the photo. The author himself appears to have seen nothing wrong with it. Wait, how? I can understand, at least in theory, how you could look at a picture of a striped shirt and not notice that it isn’t plaid, but how do you make a picture without noticing what you’re making?

The answer is that you use AI. I think we all understand by now that if you tell a neural net to give you a picture of a woman in a plaid shirt, it’s not at all surprising if it gives you a striped shirt instead. All the author had to do was not notice that it isn’t plaid, just like nearly all the players. This, it seems to me, is a big problem with AI generation: that it lets you produce art without any human mind paying a whole lot of attention to it. That might be acceptable in some situations (although it sometimes results in people with more arms than intended, and there are very few situations where that’s acceptable), but in a sleuthing game, it’s potentially deadly. I’m told that AI generation was just the first step in this game, that a lot of the pictures were retouched or had details added by hand for the sake of the puzzles. So the generation of these pictures did involve a human element beyond a casual glance, at least some of the time. Surely the very first picture in the game, the one that’s making a first impression, would be worth such care? But again, I have to remind myself that this is apparently a Just Me problem.

But speaking as Just Me, that first impression has, I think, already done irreparable damage. A game of this kind needs the player to trust it to deliver the necessary information clearly and accurately, and that trust has been broken. If I play further, I’ll always be wondering if I’m really seeing what the author intended. If it had happened later in the game, I might have decided to keep going anyway, just to get to the end. But since it happened before I even got into the game proper, I feel free to just let it go.

ParserComp 2023 (belated): The Fortuna

Admittedly, ParserComp 2023 is old news by now, but I’m not quite done with it. I declared my intention to play all sixteen of its games, and I got through only seven before the judging period’s deadline. My mistake: I got a very late start, but thought I could still get through them all, because I was really thinking in terms of IFComp, which has the famous two-hour rule to encourage smaller games. ParserComp has no such rule, and this year’s entries included two more games by the author of Finn’s Big Adventure. I did consider continuing to play and write up the entries here after the deadline passed, as blogs have no deadline, but I didn’t do that.

Still, there’s one entry that I really want to comment on here, because it’s so of-the-moment. The Fortuna, a thriller set on a cruise ship, uses an LLM to generate text. Now, from where I sit, the judgment I’ve seen passed on LLMs and generative AI in general has mostly been negative: that their lack of any mooring in reality makes them unsuitable for dealing with facts, while applying them to fiction and creative works threatens the livelihoods of human artists (whose livelihoods are precarious enough already) even as it ultimately rests on their labor without fair compensation. But the author of this piece is a true believer in the liberatory potential of the AI revolution. (I thought for a while that they might be pretending, mainly because of the illustrations: these too are AI-generated, and, while the first few shown are reasonable enough that I mistook them for stock photographs, they start turning toward subtle monstrosity before long. I’ve now played enough of the game to be reassured of its sincerity, but it’s conceivable that there’s a twist that I simply haven’t had the patience to reach.)

Now, you may be thinking “LLM-generated IF? Like AI Dungeon, then?” Not quite: the only generated text in the game is NPC dialogue provoked by freeform input. It’s in some ways a natural choice, supplementing one of IF’s traditional weak spots while limiting the damage that the LLM can do to the narrative, but with the current state of the tech, the effect is deeply peculiar. It makes the dialogue unnatural in a characteristically ChatGPT-ish way. Characters don’t really converse so much as use your words as prompts for paragraph-long essays with lots of “I believe” statements. Here’s a representative sample, followed by a rough sketch of how this sort of setup might plausibly be wired up:

Journalism is an incredibly important field that allows us to gather, analyze, and share information with the world. It is the perfect blend of storytelling and investigation, allowing journalists like myself to shine a light on important issues and hold those in power accountable. Journalism has the power to shape public opinion, influence policy, and promote social change. As a woman in journalism, I am particularly passionate about using my platform to advocate for women’s rights and to amplify the voices of marginalized communities. It is crucial that we have diverse perspectives and stories represented in the media, and I am determined to be a part of that movement.
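Mechanically, there isn’t much mystery in how this kind of thing gets wired together. Here’s a minimal sketch in Python (purely my own guess, not the game’s actual code), assuming a generic chat-completion-style interface; the call_llm wrapper and Nancy’s persona text are invented for illustration:

NANCY_PERSONA = (
    "You are Nancy, a New York Times reporter aboard a cruise ship, "
    "passionate about women's rights. Reply to the player in character, "
    "in one short paragraph."
)

def npc_reply(persona: str, player_line: str) -> str:
    """Combine a fixed persona with the player's quoted speech and hand the
    result to some chat-completion service (stubbed out below)."""
    messages = [
        {"role": "system", "content": persona},
        {"role": "user", "content": player_line},
    ]
    return call_llm(messages)

def call_llm(messages) -> str:
    # Hypothetical wrapper: a real game would call an actual LLM API here.
    raise NotImplementedError

If the game does anything along these lines, the player’s line is simply appended to a canned persona, which would go some way toward explaining why every answer reads like a prepared personal statement.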

The game tells you to befriend these characters to advance the plot — I could comment on how pathologically utilitarian this view of “friendship” is, but that’s really a problem endemic to IF as a whole, not this game in particular. Befriending a character basically means asking them about their pet topics repeatedly. Which they tend to circle back to regardless, often with suspiciously similar verbiage:

> ask nancy “What are you doing here?”

I’m here as a reporter for the New York Times, covering the recent women’s rights rally in the city. It’s important to give a voice to the issues that affect women and make sure their stories are heard. This rally is a powerful platform for women to come together, raise awareness, and fight for their rights. As a journalist, it’s my responsibility to report on these events and shed light on the challenges women face in our society.

> ask nancy “I mean, what are you doing on this deck in particular?”

I’m conducting research for a new article I’m working on. I’m investigating the gender wage gap in the corporate world and the impact it has on women’s rights. I believe it’s important to shed light on these issues and advocate for equality and fair treatment for women in the workplace.

> ask nancy “Why are you on the pool deck?”

I’m on the pool deck because I’m investigating the recent increase in drownings in the area. As a reporter, it’s my job to shed light on important issues and bring attention to them. It’s not enough for me to just report the facts; I want to use my platform to make a difference. I believe that by raising awareness about the importance of water safety and advocating for women’s rights, we can start to address this problem and prevent future tragedies.

For this player, at least, the end result is that I gave up on trying to treat this as anything other than Infocom-style “ASK/TELL” dialogue with unusually elaborate “I don’t understand” messages. The LLM simply doesn’t have enough knowledge about its context within the game to be able to converse about that context meaningfully. It’s been said that the ability to figure out what you do and don’t need to pay attention to, and to filter out the unimportant, is the quintessential skill in all videogames. It’s possibly even more crucial when the text is being produced in a void of authorial intent, where failing to realize that the nonsense is nonsense can send you down infinitely-deep but ultimately empty rabbit-holes. So there’s the paradox of this game: the whole thing is a showcase for a feature that it structurally encourages you to ignore! (But I suppose the same could be said of a lot of graphical games…)

It’s got me wondering about other possibilities, though. The dialogue feels meaningless because it’s disconnected from the rest of the game — even the “befriending” mechanic seems like it’s just scanning your input for keywords and ignoring the response, as the transcript below (and the sketch after it) suggests:

> tell david “Physics sucks and physicists are bad people. People like you are responsible for Hiroshima. I hate you. I hate you. I hate you.”

I’m sorry if you feel that way, but as a research scientist in the field of physics, my goal is to make a significant contribution to my field and to help others through my work. I understand that there have been events in history, like Hiroshima, that have used scientific discoveries in destructive ways. However, it is important to remember that science itself is not inherently good or bad – it is how we choose to use and apply it that determines the outcome. I am driven by a desire to contribute positively to the world and leave a lasting legacy, and I strive to do so with compassion and responsibility.

David Millman is now your friend!
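If that’s what’s going on, the whole mechanic could be a keyword check on the player’s half of the exchange. A minimal sketch of that guess (the character names are the game’s; the keyword lists and the threshold are invented):

import re

FRIENDSHIP_KEYWORDS = {
    # Hypothetical pet-topic lists; not taken from the game's data.
    "david": {"physics", "science", "research", "hiroshima", "legacy"},
    "nancy": {"journalism", "reporter", "women", "rights", "article"},
}

def update_friendship(npc: str, player_line: str, scores: dict) -> bool:
    """Bump the NPC's friendship score whenever the player's line mentions one
    of their pet topics, regardless of tone, and report whether they now count
    as a friend. The generated reply is never consulted."""
    words = set(re.findall(r"[a-z]+", player_line.lower()))
    if words & FRIENDSHIP_KEYWORDS[npc]:
        scores[npc] = scores.get(npc, 0) + 1
    return scores.get(npc, 0) >= 1  # invented threshold; the game's is unknown

Run against the tirade above, this happily counts “physics” and “Hiroshima” as bonding moments, which is more or less what the transcript shows.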

But what if it didn’t ignore the output? What if it tried to determine whether you had befriended a character on the basis of what they said?
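Concretely, and purely as a thought experiment (reusing the hypothetical call_llm wrapper from the earlier sketch), that might mean scoring the NPC’s own reply for warmth:

def reply_sounds_friendly(generated_reply: str) -> bool:
    """Ask a model to rate, 1 to 5, how much warmth the NPC's reply expresses
    toward the player, and count 4 or above as friendship."""
    judgment = call_llm([
        {"role": "system",
         "content": "Rate from 1 to 5 how much warmth or camaraderie the "
                    "following reply expresses toward the person it addresses. "
                    "Answer with a single digit."},
        {"role": "user", "content": generated_reply},
    ])
    digits = [c for c in judgment if c.isdigit()]
    return bool(digits) and int(digits[0]) >= 4  # no parsable score: not a friend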

Well, for one thing, you’d have to retune the output considerably to make that work. In this game, I don’t think I’ve seen any character actually express camaraderie or affection (or indeed any emotion at all). But assume that’s a solved problem. Would it make for a better game? I don’t know that it would! It might be more responsive, but it would be less intentional, less about the story the author wants to tell and more about figuring out how to manipulate the LLM.

But on the other hand, game devs routinely let general-purpose libraries handle things like physics and 3D rendering, sometimes with unanticipated effects. Will it someday be possible for developers to simply give an AI actor parameters and direction and motivation, and trust it to ad-lib its lines? I suppose that’s the dream this game represents. And I’m willing to keep an open mind about it. But we’re not there yet, and it’s not at all clear that LLMs are anything more than a dead end in pursuit of it.