Machine Learning and Creativity

It’s an interesting time to be an artist. As machine learning becomes part of the toolkit, in different ways for different people, new ideas are shaking loose, and I feel compelled to write about them as a way of wrapping my head around the whole thing.

The most recent headquake hit me by way of the ML-assisted album Chain Tripping by post-punk-pop band YACHT. Here’s a great Google I/O talk by bandleader Claire Evans that describes just how they made it. (Tl;dr: no, the machines are not coming for your jobs, songwriters! Using ML actually made the process slower: it took them three years to finish the album.) This case is interesting for what it tells us about not just the limitations of current AI techniques, but also the creative process, and what makes people enjoy music.

In music there’s this idea that enjoyment comes from a combination of the familiar and the unexpected. For example, a familiar arrangement of instruments, a familiar playing style, with a surprising melody or bass line. Maybe it works like visual indeterminacy: it keeps you interested by keeping you guessing.

As genres go, pop music is particularly information-sparse. What I take from YACHT’s example is that low-level noise (nearly random arrangements of words and notes) can produce occasional bursts of semi-intelligible stuff. By manually curating the best of that stuff and arranging it, they pushed the surprise factor well above the threshold of enjoyability for a pop song. And then they provided the familiarity piece by playing their instruments and singing in their own recognizable style. The result: it’s pretty damn catchy.

So if you like the album, what is it exactly that you like? It sounds to me like what you’re enjoying is not so much the ML algorithm’s copious output of melodies and lyrics, but YACHT’s taste in selecting the gems from within it. So far, so good. But there’s another piece of this puzzle that makes me question whether this analysis is going deep enough.

Screenshot of the video for SCATTERHEAD, showing the lyrics.
Lyrics from SCATTERHEAD:
Time flies and I feel / but I can’t hear / palm of your eye / is it empty, memory?

The first time I watched the video for SCATTERHEAD, one lyric fragment jumped out at me: “palm of your eye”. I’m not alone: NPR Music’s review calls it out specifically as a “lovely phrase … which pins the lilting chorus into place”. But it jumped out at me for a rather different reason: I’d heard those exact words before. I immediately recognized them from Joanna Newsom’s 2004 song Peach, Plum, Pear.

I have read the right books / to interpret your look / you were knocking me down / with the palm of your eye

At the time, not knowing anything about YACHT’s process, I assumed they were making an overt, knowing reference to Newsom’s song. But then I learned how they generated their lyrics: they trained the ML model on the lyrics of their own back catalog plus the entire discography of all of the artists that influenced them. This opens up another plausible explanation: it could be that Newsom was among those influencers, the model lifted her lyric whole cloth, and YACHT simply failed to recognize it. If that’s the case, it would mean the ML model performed a sort of money-laundering operation on authorship. YACHT gets plausible deniability. Everyone wins.

This sounds like a scathing indictment of YACHT or of ML, but I honestly don’t mean it that way. It really isn’t that different from what happens in the creative process normally. Humans are notoriously bad at remembering where their own ideas come from. It’s all too common for two people to walk away from a shared conversation, each thinking he came up with a key idea. For example: witness the recent kerfuffle about the Ganbreeder images, created by one artist using software developed by another artist, unknowingly appropriated by a third artist who thought he had “discovered” it in latent space, and then exhibited and sold in a gallery. So, great, now we have yet one more way that ML can cloud questions of authorship in art.

But maybe authorship isn’t actually as important as we think it is. Growing up in our modern capitalist society, we’ve been trained to value the idea of intellectual property. It’s baked into how working artists earn their living, and it permeates all kinds of conversations around art and technology. We assume that coming up with an original idea means you own that idea (dot dot dot, profit!) But capitalism is a pretty recent invention, and for most of human history this is not how culture worked. Good ideas take hold in a culture by being shared, repeated, modded and remixed. Maybe there’s a way forward from here, to a world where culture can be culture, and artists can survive and even thrive, without the need to cordon off and commodify every little thing they do. It’s a nice dream, at any rate.

At some level this is just me, sticking a toe in the water, as I get ready to add ML to my own toolkit. (It’s taken me this long to get over my initial discomfort at the very thought of it…) When I do jump in, we’ll see how long I can keep my eyes open.

On the rules for VR

SIGGRAPH attendees are a sophisticated audience, so demoing Pearl in the VR Village last week led to some really interesting conversations.  One thing I heard more than once was this idea that to do storytelling in VR, we have to throw out all the rules of traditional cinema. While I appreciate the swashbuckling spirit of that sentiment, I don’t think it’s actually true.

drawing by Elizabeth Floyd

I had a life drawing instructor in college who used to teach us rules like “highlights are circular, and shadows are triangular.” As a math major, this really bothered me at the time, because taken literally it was provably false– just give me a flashlight and a grapefruit and I’ll show you! But that was missing his point. The human body is made of smooth, convex masses, and the highlights on them do indeed tend to be round. And when one limb casts a shadow on another, the contour of the shadow’s edge wraps around and hits the silhouette at an angle, forming a sharp point. In other words, “triangular”. So my teacher’s rule, within the context of human figure drawing, was totally valid and actually pretty insightful. But it wasn’t a law of nature, it was something he invented. And to construct it, he had to synthesize knowledge from human anatomy, physics, geometry, and visual perception.

The rules of filmmaking seem atomic and universal to us, but they’re not. Like the “triangular/circular” rule, they’re chimaeras, hybrid creatures assembled from bits of wisdom from different disciplines. They’re not real the way math and biology are real, we’re just so used to them that we mistake them for reality.

illustration of the 180-degree rule

For example, take film’s 180° rule. That’s the rule that says if you’re shooting a conversation between two characters, there’s an imaginary line connecting them, and when you cut from shot to shot, you always have to keep the camera on the same side of it. Cross that line, and you risk confusing your audience. This rule has elements of geometry (projecting 3D space to a plane), perception (how humans construct mental models of 3D space) and psychology (how we organize those models based on relationships between people). That’s a lot of moving parts! Now imagine trying to apply this rule to a VR experience where you can walk around the scene. Some of those elements change (the flat screen becomes a volume) but the perception and psychology parts are still there. So the question is not whether to keep the 180° rule or throw it away. The question to ask is which parts do we keep, and what else do we add into the mix, to construct a new rule that works for VR?
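To make the geometry piece concrete, here is a minimal sketch of just that one component: a test for whether a cut keeps the camera on the same side of the line between two characters. It assumes a top-down, 2D floor plan, and the function names and coordinates are hypothetical, made up for illustration. It says nothing about the perception and psychology parts, which are the parts that carry over to VR.

```python
def side_of_line(a, b, camera):
    """Sign of the 2D cross product of (b - a) and (camera - a).

    Positive: camera is to the left of the line from a to b.
    Negative: camera is to the right. Zero: camera is on the line.
    All points are (x, y) positions on a top-down floor plan (an assumption).
    """
    (ax, ay), (bx, by), (cx, cy) = a, b, camera
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)


def crosses_line(a, b, cam_from, cam_to):
    """True if cutting from cam_from to cam_to moves the camera across the line."""
    return side_of_line(a, b, cam_from) * side_of_line(a, b, cam_to) < 0


# Hypothetical staging: two characters two units apart.
alice, bob = (0.0, 0.0), (2.0, 0.0)

# Both cameras sit on the same side of the line: the cut respects the rule.
print(crosses_line(alice, bob, (0.5, -3.0), (1.5, -2.0)))

# The second camera is on the far side: the cut crosses the line.
print(crosses_line(alice, bob, (0.5, -3.0), (1.0, 2.5)))
```

In a walkable VR scene the viewer's own position would take the place of `cam_to`, which is one way to see why the geometric part of the rule stops being something the filmmaker fully controls.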

For VR storytelling, we shouldn’t have to throw out the rules of the mediums we know and love. But we can unpack them, dismantle them into their component parts, and analyze them at a deeper level than we’re used to doing. And that’s going to be a fun way to spend the next few years, for all of us.

Time and its impact on fun

I like to play word games. Scrabble and Boggle are two of my childhood favorites, and nowadays I play Zynga’s “With Friends” versions of both games on my phone. (If you want to play me, look for “cassidyjcurtis” or “otherthings”!)

Both games are about scrambling letters up into words, and both make heavy use of the anagram-loving part of me. But I’ve noticed that the two games produce very different mental states. The reason has to do with how they make use of time.

There’s no Y in “otolith”. And besides, there’s no place to play it.

In Scrabble, there’s no time limit. You’re free to take as long as you want to play a word, but you can’t take it back once you’ve played it. The effect that has, on me anyway, is to make me an optimizer. I try to find the best possible word for the given moment, taking everything into account: the score, the state of the board, the consonant-to-vowel balance of my rack, how many letters are left, and so on. It’s a complex mix of concerns, and sometimes I just can’t see any option that’s clearly the best. But because I know my vocabulary is limited, I always suspect that a better word is out there that I’m just not seeing. When this happens, I get stuck, unable to play, effectively paralyzed. So Scrabble as a game makes me happy when I’m doing well, and miserable when I’m not. It’s not so much about the score of the game, as whether I’m measuring up to some abstract ideal of the perfect player. What a headache!

Is “squarey” a word? I dunno, let’s try it and find out!

In Boggle, there’s a hard time limit, and the goal is to find as many words as you can in that time. Some words are worth more than others, of course, but it’s usually better to find lots of small words than a handful of huge ones. So when the clock starts ticking, I just start finding words as fast as I can, with no time wasted on judging good from better. And what I find tends to happen is that small words lead to bigger words, in a stream-of-consciousness kind of way that’s energetic but not stressful, and just a lot of fun. I only pop up to look at the big picture when the vein I’m mining runs dry. And before I know it, time is up, and I’ve finished my turn exhilarated by the effort. Sometimes I win, and sometimes I lose, but I always enjoy the game. And enjoying the game, feeling that state of flow and fun, directly impacts my ability to play it well.

What this has to do with animation, or any complex creative work, should be pretty clear. You can approach a new shot in either way: give yourself all the time in the world to find the best possible idea, or give yourself a hard time limit (to accomplish some part of the job) and just start exploring, and then see what you’ve got when your time runs out.

I’ve experimented with the size of the task and the length of the time limit. And what I’ve noticed surprised me: the shorter the time limit, the more fun I have. And more fun leads to better quality work. I do still feel the urge to optimize sometimes. But on my best days, I’m too busy playing to notice.

Animating with a wrecking ball

Any professional animator can tell you that animating well is only half of the job. The other half is being able to work well with others: directors, supervisors, your fellow animators, other departments that depend on you, etc.

One of the biggest struggles I see animators face is how to handle changes. Because animation is so time-consuming, it’s easy to think of your work like it’s a kind of architecture: first you must lay down a strong foundation, and then you can start building walls, etc., and finally put on that sweet paint job that makes it look awesome.

Sisyphus, by Marcell Jankovics. Not just a metaphor for your worst day at work, it's also a great short film!

This view is certainly true at a technical level: once the idea of the shot is clear in your mind, the process of blocking, breaking down, and polishing does have a kind of one-directional feel to it. It can be hard to go back and adjust your blocking after you’re well along in the polishing process. So, if for any reason you get notes from your director that change your blocking significantly, it can feel pretty bad.

But if you think this technical process is what your work is about, you’re completely missing the big picture.

Your real job as an animator is to find and execute the best possible performance. The performance is not made of keyframes and curves, any more than it’s made of bricks and concrete. It’s made of ideas. That is what you’re here to find. The part of animation that’s like building a house? That’s just the execution of the ideas. If you’re executing the wrong ideas, it’s like building your house in the middle of the road. No matter how good it looks, it’s not going to be a nice place to live.

Does this look like "work" to you?
Flickr photo courtesy of AlphaTangoBravo (Adam Baker)

So here’s a trick to help you deal with changes: learn to love destroying your own work. Genuinely enjoy it. Relish it. Specifically: you have to enjoy the process of destroying as much as you enjoy creating. Make it fun. Make it something you’ll look forward to, if you’re given the chance to do it.

Remember when you were a little kid? Did you ever make a huge tower of blocks, just so you could knock it down and make a huge crash? Remember how you wanted to do it over and over again? Destruction can be delicious fun.

So before you bring your shot in for review, take a moment to contemplate its utter demolition. Step back and take a hard look at your shot, and ask yourself: if I had to smash this to bits, how would I do it? Which parts would I smash at first? If I had to start again, what would I do differently? Savor that idea for a moment. And bring it with you to your review.

This is your wrecking ball. If the director asks you to make a major change, it just means you’ve got permission to use it. And when you do, you can do it with gusto.

Animation Hypnosis

Sometimes, when you’ve worked on one shot for too long, you can go a bit blind. It’s a very specific kind of blindness, one that prevents you from seeing mistakes you’ve made and opportunities you may have missed. It seems to happen to every animator at some point, and it is deadly to the creative process.

"Day Nine", photo by Jerry Cooke

There are tricks you can use to get around it. You can hold a mirror up to the screen to see the shot from a fresh point of view. You can step away for a few minutes, or a few days, or longer. (One time I was able to step away from a shot for a full year: boy did I see it with fresh eyes at that point!) And of course you can show other people. But once the effect has set in, your own perceptive powers are severely diminished.

I have a hunch that the main cause of this is the simple act of watching your shot over and over again. In psychology there’s a concept called habituation: any stimulus repeated long enough reduces your sensitivity to that stimulus. It’s the reason why you notice the sound of the dishwasher when you first walk into the kitchen, but eventually it fades into the background.

In the early days of animation, watching a shot in progress play back at full speed was a luxury animators didn’t have. To do that required shooting key drawings onto film, developing the film, threading it into a projector, and so on: an expensive and time consuming process. And yet, great animation still got made: animators planned very carefully and learned to do most of the work in their heads. In the early days of computer animation, it was much the same, though for different reasons: we’d have to wait for an overnight render to see our work at speed.

"Hypnotizing", photo by Patrick Breen.

Nowadays, animators have digital tools that allow for instant, real-time feedback, which for the most part is a tremendous aid. But it also makes it very easy to hypnotize yourself with all those looping stimuli.

If you want to stay sharp, it’s critical that you delay the onset of animation hypnosis as long as you possibly can. So what I try to do is avoid watching my shot while I’m working on it. I’ll watch it once, twice, maybe three times, and then jot down my thoughts about what needs doing. Then I go to work, keeping narrowly focused on each detail as I go. If I have to play some part of the shot at speed to judge some nuance of weight or gesture, I’ll hide everything but the body part I’m working on, so as not to get distracted. Once I’ve addressed all of my notes, only then will I watch the shot as a whole again. It takes a kind of discipline that I can’t always muster. But when I succeed, it feels great. And as a side benefit, I find I get more real work done in less time: after all, time spent looking at your shot is not time spent working on it.

"Sissi ipnotizzata", photo by screanzatopo.

I’d love to have some way of counting how many times I’ve looped my shot, to see if there’s a certain magic number where hypnosis sets in. What would that number be? 500? 5,000? You could make a game of it, a la “Name That Tune”: challenge yourself to finish a shot with the absolute minimum of viewings. How low could you go? Ten loops? Three loops? Zero?

What would 120fps mean for animation?

Following up on yesterday’s post about higher frame rates in movies, there’s another question looming. If high frame rates catch on industry-wide, what will it mean for animators?

We won’t really know for sure until the movies start coming out. But we can guess. There are televisions on the market that will play movies at 120fps regardless of how the movie was shot. They do this by creating the inbetween frames automatically, in real time. (How exactly it’s done, I’m not sure, but it’s probably some sort of optical flow technique, like Twixtor.) When you see this done to a live action movie shot at 24fps, the effect is impressive: movement really does feel incredibly smooth, and the strobing/juddering problem is minimal. But if you watch an animated movie on one of these TVs, the results are not good. Timing that felt snappy at 24fps feels mushy at 120. Eyes look bizarre during blinks. And don’t get me started on smear frames.

Of course, this is just a machine trying its best to interpolate frames according to some fixed set of rules. Animators will be able to make more intelligent choices, which of course it’s our job to do. But that’s where it gets interesting. How many frames should a blink take at 120fps? What’s the upper limit on how snappy a move can be, if you can potentially get from one pose to another in a mere 8 milliseconds? It could open up new creative possibilities too. Take staggers for example: at 24fps, if you want something to vibrate really quickly, your only option is to do it “on ones” (that is, alternating single frames). But at 120fps, you could potentially have staggers vibrating on anything from ones to fives. How will those different speeds feel to the audience?
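The arithmetic behind those numbers is easy to check. The sketch below assumes a stagger is a strict alternation between two poses, each held for a fixed number of frames, so that "on ones" means one frame per pose; the function names are hypothetical.

```python
def frame_ms(fps):
    """Duration of a single frame, in milliseconds."""
    return 1000.0 / fps


def stagger_hz(fps, hold):
    """Back-and-forth cycles per second when alternating two poses,
    each held for `hold` frames (one cycle = 2 * hold frames)."""
    return fps / (2 * hold)


for fps in (24, 48, 120):
    print(f"{fps} fps: {frame_ms(fps):.1f} ms per frame")

print(f"24 fps on ones: {stagger_hz(24, 1):.0f} cycles per second")
for hold in range(1, 6):
    print(f"120 fps, held {hold} frame(s): {stagger_hz(120, hold):.0f} cycles per second")
```

Under that assumption, a stagger on fives at 120fps vibrates at the same rate as one on ones at 24fps, and everything from ones to fours is faster than anything 24fps can show.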

One thing seems pretty certain: animating at 120fps would be a lot more work. For animators who agonize over every frame, it will mean five times the agony. It will certainly mean more reliance on computer assistance: more spline interpolation, fewer hand-crafted inbetweens, and forget about hand-drawing every frame! I look forward to hearing animators’ stories from the trenches on Hobbit. Will they find 48fps twice as hard, or more, or less? What tricks will they have to invent to make their job manageable?

Frames per second

Flickr photo courtesy of purplemattfish

There’s been some discussion brewing among certain filmmakers about the impact of making movies that play faster than the current standard of 24 frames per second. Peter Jackson is shooting The Hobbit at 48fps, and others are reportedly experimenting with rates like 60 or even 120.

Mixed into the discussion are some really deep misconceptions about how vision and perception actually work. People are saying things like “the human eye perceives at 60fps”. This is simply not true. You can’t quantify the “frame rate” of the human eye, because perception doesn’t work that way. It’s not discrete, it’s continuous. There is, literally, no frame rate that is fast enough to match our experience of reality. All we can do, in frame-by-frame media, is to try to get closer.

The problem is that our eyes, unlike cameras, don’t stay put. They’re active, not passive. They move around in direct response to what they are seeing. If you watch an object moving past you, your eyes will rotate smoothly to match the speed of the thing you’re looking at. If what you’re looking at is real, you will perceive a sharp, clear image of that thing. But if it’s a movie made of a series of discrete frames, you will perceive a stuttering, ghosted mess. This is because, while your eyes move smoothly, each frame of what you’re watching is standing still, leaving a blurry streak across your retina as your eyes move past it, which is then replaced by another blurry streak in a slightly different spot, and so on. This vibrating effect is known as “strobing” or “judder”.

Applying camera-based effects like motion blur only makes the mess look worse. Now, your stuttering, ghosted multiple image becomes a stuttering, ghosted, blurry multiple image. (The emphasis on motion blur is particularly bad in VFX-heavy action movies, which is why I try to sit near the back.)


Click the image to see a demonstration of the "judder" effect. This is what your eyes actually see when you watch an object moving back and forth on a movie screen. Even with motion blur, you can see that there's a distracting sawtooth vibration to the ball that can be reduced, but not eliminated, by increasing the frame rate.

Filmmakers tend to work around this problem by using the camera itself as a surrogate for our wandering eye: tracking what’s important so that it effectively stays put (and therefore sharp) in screen space. But you can’t track everything at once, and a movie where nothing ever moves would be very dull indeed.

I am pretty sure there is no frame rate fast enough to completely solve this problem. However, the faster the frame rate, the less blurring and strobing you’ll experience as your eyes track moving objects across the screen. So I am extremely curious to see what Jackson’s Hobbit looks like at 48fps.
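A back-of-the-envelope way to see this: if your eye tracks an object smoothly while each frame sits still on screen, the image slides across your retina by roughly the distance the object would have travelled during that frame. The sketch below is only a rough model. It assumes every frame stays lit for its full duration (real projectors and displays vary), and the speed of 960 pixels per second is a made-up number for illustration.

```python
def smear_per_frame(speed, fps):
    """Distance the tracked image slides across the retina during one frame,
    in the same units as `speed` (per second), assuming the frame is held
    on screen for its entire duration."""
    return speed / fps


speed = 960.0  # hypothetical: the object crosses 960 pixels of screen per second
for fps in (24, 48, 60, 120):
    print(f"{fps:>3} fps: about {smear_per_frame(speed, fps):.0f} px of smear per frame")
```

Doubling the frame rate halves the smear, but in this model it only reaches zero at an infinite frame rate.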

There’s a second problem here, which is cultural. My entire generation was raised on high quality feature films at 24 frames, and poorer-quality television (soap operas, local news) at 60 fields per second. As a result, we tend to associate the slower frame rate with quality. Commercial and TV directors caught on to this decades ago, and started shooting at 24fps to try to get the “film look”. How will we perceive a movie that’s shot at 48fps? Will it still feel “cheap” to us? And what about the next generation, raised on video games that play at much higher frame rates? What cultural baggage will they bring into the theater with them?

Forgotten, and remembered, by Google.

So, while going through this whole website restoration process, I discovered that Google’s search engine (funny how you have to be specific about that, now that Google is no longer just a search engine…) seemed to have completely forgotten this site ever existed. If you searched for “Cassidy Curtis” or “how to make a baby” or even “otherthings.com”, you’d find no results whatsoever on this domain. Zero. Considering that a few weeks ago this site was the top search result for all of those phrases, that seemed pretty weird. But I figured it was just because our server had crashed, and it was taking Google’s spiders some time to crawl back over to my little corner of the web.

The truth turns out to be a bit creepier.

I only found out the true nature of the problem by visiting Google’s webmaster tools, where I found an anonymous message dated October 20th, explaining what had happened. The message was sent to nobody, or maybe it was sent to my old email address, the one that died with my old server. At any rate, I never received it. But Google being the ultimate data hoarder, it archived it, and it was waiting for me when I identified myself as the owner of this domain.

Here’s what I learned: Remember a few years ago, when my blog got hacked? Well, the hacker in question used this blog’s machinery (Movable Type, at the time) to plant a nasty little trove of fake web pages advertising all the usual types of internet snake oil, the kind of stuff that usually gets caught in your spam filter. Well, when I switched over to WordPress, I never bothered to delete the old files, I just moved them to a different location, figuring that would break any incoming links and neutralize the problem. (I know, bad idea, right? This is why you should never let me be your sysadmin.) It didn’t work. Somehow, said hacker managed to find the files, and keep using them for their nefarious purposes.

The files were full of sleazy code that did things like: showing one thing to human visitors and an entirely different thing to search engines. Google doesn’t like that. So it reacted, in its anonymous, machine-like way, the only way it knows to respond: it removed otherthings.com entirely from its search engine. Harsh! Luckily, Google lets webmasters appeal that decision once they’ve fixed the problem. They said “it may take several weeks for your site to show up in search again”, but in actuality it only took a day.

Why was this creepy? Because it revealed just how much power this one corporation has over the shape of how we communicate. If you displease Google, it can make you disappear.

On the Value of Drawing for CG Animators

Last weekend I got to see a live interview with one of my all-time heroes, Hayao Miyazaki. He said, through a translator, all kinds of interesting things. When asked about computer animation, he had this to say: “One time we hired an expert to animate some scenes on the computer, but in the end we found that we could draw the scenes faster with a pencil.”

Now maybe this was his choice of words, and maybe it was the translator’s, but I found this response pretty revealing. It reflects a thought process about what animation is, in which drawing is central to everything. If what you’re trying to do is draw a scene, a pencil really is faster than a computer. But is that necessarily true? Is animation, at its heart, about drawing?

Neither Graffiti Nor Animation

Tonight I learned that one of my favorite graffiti spots, London’s “Undercroft” skatepark, has temporarily been taken over–or rather been given over by the authorities–to an artist named Robin Rhode. If you search Google or YouTube you can see some of the guy’s work… some of it is actually kinda interesting, but nowhere near cool enough to justify shutting down the Undercroft for a whole week, in my opinion.

His schtick seems to be taking a series of photos where he draws some stuff in chalk on the wall or floor, and then photographs himself a bunch of times miming some kind of simple action, like waving a flag made of bricks, or slam-dunking a basketball into an imaginary basket. He shows the work as a series of photos in a gallery, or a videotaped slide show of same.

The London-based graffiti writers in my Flickr circles have been putting him down because he’s an “artist” (as opposed to a writer), and in this case, they have a point: everyone else who paints at the Undercroft does so on their own time and resources, and at their own risk. Then, here comes this guy with a team of contractors to do the heavy work of priming the walls for him, so he can stroll in and do his drawings. And Southbank Centre actually shuts down the skatepark for a whole week just to give him room to do his thing. That’s got to bother a lot of people, given the strong anti-authority thread in the subculture: “real” graffiti writers don’t ask permission, and they certainly don’t hire assistants to do their dirty work. Graffiti writers, in a way, created the Undercroft. It’s hard not to feel like this action is taking their baby away from them.

For my part, I get irritated when art critics credit Rhode with doing “stop-frame animation”, when all he’s ever produced is a mere storyboard of an action. What he’s doing is almost animation, but it’s not really, because he never seems to shoot enough photos (nor have a strong enough grasp of the principles of motion) to make it work–it certainly wouldn’t play well at 24 frames per second–so instead he shows it as a series of stills. Pretty weak sauce, especially when there are artists like Blu out there doing full-on hand-painted animation on the street. And so I find myself wanting to hate on the guy, even though we’ve never met.

What’s interesting to me is what these two responses have in common: both animation and graffiti require a lot of skill, effort, dedication and time. So when someone steps into either territory who doesn’t seem to be willing to put in that effort, it’s going to bother the people who live there.

Anyway, Rhode’s intervention is scheduled to end on October 4th, so it’ll all be over shortly. I look forward to seeing the writers and skaters take back their Undercroft with extreme prejudice.