Artistic (im)perfection, android dreams of auto-tune, and the app cold war
July 17, 2020 Edition
Hi everyone, we’re back, so it must be Friday. I’ve been in the midst of some research around AI media creation tools, so forgive me if the next couple of editions gesture in that direction.
“Phase II”

Source: New Yorker, Sketchpad (June 29, 2020) by Jason Adam Katzenstein
Do androids dream of Auto-Tuned Zoom meetings?
Last weekend, I was engaged in something that I’ve become quite good at in 2020, despite doing it zero times prior to March: sanitizing packages while listening to podcasts. One perplexing technical challenge that I’ve been unable to overcome is how to skip podcast ads while wearing latex gloves and without touching my phone. Thus, it was a perfect storm of circumstances that led to my hearing the entirety of a truly painful ad on one of Vox’s podcasts.
In this ad, sponsored by Samsung Galaxy 5G, a breathless interviewer asks Jen Golbeck, CS professor at the University of Maryland, how our pandemic-era reliance on technology is impacting us psychologically. Golbeck’s response is as follows (emphasis mine): “If things get in the way of us doing what we’re trying to do, it can be so frustrating, because it builds up over the course of the day. And so, if we have seamless video, if we don’t have those little delays, it’s going to make us feel better, it’s going to reduce that frustration. And that’s going to make us nicer to everybody else; it’s going to make us happier.”
While the (admittedly over-simplified) notion that 5G=seamless video=spiritual contentment might fall uncomfortably close to Silicon Valley’s pitch-perfect “making the world a better place through constructing elegant hierarchies for maximum code reuse and extensibility,” I was struck by how many times in recent weeks I’ve seen or heard someone mention video-call lag as a culprit for our collective exhaustion and anxiety. Seriously, there are articles about this everywhere (Fast Company), citing the life-draining impact of glitchiness on video conference calls. [Personally, I am tired because we are in the midst of a pandemic, combined with justified social unrest and inept governance.] The good news is that introspection about our weariness appears to have peaked in late April, so we should be back to normal any day now!

I found myself thinking about this ad while on a run recently, for reasons that remain unclear but reveal that I’m especially suggestible while sanitizing. Specifically, I was thinking about what our sensitivity to visual and audio processing means in a world where many of our interactions happen via technology and much of the content we consume is produced, in whole or in part, by machines.
If this pop research is accurate, we know subconsciously that video-calls are not the same as real human interaction—and respond negatively to them—because of imperfections in content delivery. When processing speech, we seem to prefer filler words and pauses—the sort of verbal buffering that high-school speech teachers try valiantly to train out of us—rather than the perfect but robotic virtual speech that machines are increasingly capable of producing. As I mentioned in a previous edition of Trillium, the famous Google Duplex demo at Google I/O in 2018 received an especially positive response due to the “life-like” nature of the AI caller. If you watch that video, you can hear the emphasis placed on the prosody (the pattern of rhythm/sound in the voice) and inclusion of filler words like “um” and “mm-hmm.” For whatever reason, we receive that audio signal differently than the standard Siri speech.
Applying these anecdotes about how we process sight and sound is intriguing, to me at least, in the context of content creation and consumption. We’ve talked in previous weeks about a trend towards machine-produced content, and a dystopian future where we’re all doing VR Peloton workouts led by a virtual fitness influencer’s avatar and soundtracked by algorithmically-generated music. Facing this world, one might gain confidence from the idea that human-sounding audio registers differently from robo-audio. Perhaps it’s not so easy to replace a human artist, after all.
I’ve written previously about the work done at Google’s Magenta research lab and the OpenAI project to create public, algorithmic music creation tools. Even mainstream music creation software like Ableton offers plug-ins that, like magic, allow anyone to play a single key on a MIDI keyboard and hear a perfectly formed chord, or a fluid arpeggio. I would hardly call myself a keyboardist, but yesterday I found myself playing what sounded like complicated keyboard parts, all without using more than one finger and a couple of software settings. To fast forward that movie is, to some degree, to envision a world in which a human creator is a decreasingly important component in the creation process.
I never thought I would say this, but I’ve found love in a hopeless place: Auto-Tune. My guess is that all of you recognize Auto-Tune when you hear it: a sort of warbling, pitch-correcting vocal effect that has turned many mediocre singers into passable pop stars. Auto-Tune debuted in 1997 and, for its first decade, was rarely more than a trivial audio processing tool, despite appearances in the pop canon (Cher’s “Believe” in 1998). In the 2008-2010 range, I remember feeling like I couldn’t escape it. T-Pain, in particular, went Full Auto-Tune on many of his songs in that period, thus catalyzing a run of hit songs where the effect featured prominently (remember “Lollipop”?). It was easy to parody and to diminish; nonetheless, artists like Radiohead (the Amnesiac album and some of Thom Yorke’s work thereafter), various country singers, and many rap/R&B/hip-hop artists (Future, for example) have turned to Auto-Tune to improve vocal performances or, at least, to add a new element to their songs.
There was an inevitable backlash, with many pundits (and some artists) claiming that using Auto-Tune was effectively “cheating”: the singing equivalent of using performance-enhancing drugs. Jay-Z even wasted a perfectly good beat from No I.D. to say things like, “This is anti Auto-Tune, death of the ring tone…this is death of Auto-Tune, moment of silence.” Meanwhile, his old partner (Kanye) used Auto-Tune to create 2008’s 808s & Heartbreak: Kanye’s first “different” album, leaning further into R&B and vocal performances (heavily produced with the effect) than his previous work.
I never felt especially spiritually attuned (…) to Auto-Tune, but I never hated it either. And, the more that I’ve thought about it recently, I do think it presents an interesting example of a technology that has an enabling (rather than disrupting) effect on human creativity. Simply put, it lets people do things that they otherwise could not. To illustrate this deeply scientific thesis, allow me to present a Trillium exclusive, Kanye-themed visual that I call “The Kan-tinuum of Creation.”

The beauty of the Auto-Tune zone—not to be confused with significantly less fulfilling automotive parts retailer AutoZone—is that it gives us a tangible, recent instance of creation technology that extends human capabilities. Auto-Tune is the only thing that can make Kanye’s otherwise excruciating singing voice palatable. In the abstract, Auto-Tune represents a technology layer that exists only to enhance the underlying organic material. Without the human voice, Auto-Tune is useless: technology that cannot create without its user. So, no matter how much you may roll your eyes at T-Pain, just remember that he is a valiant ambassador of a human-computer creation style that reinforces our species’ ongoing relevance. All technological roads need not lead to a replacement of artist with algorithm.
Having spilled a lot of ink on this idea, I suppose all of this could simply be reduced to: in song, Hybrid Kanye beats Human Kanye and Robo Kanye, but all forms of Kanye trump (!) President Kanye.
Boomerang: stories from weeks past
👀👀👀: Google (disclosure: my employer) has signed an agreement with Jio (Official Google Blog) to invest $4.5bn into the company, in exchange for a 7.73% stake. This is the first investment done under the umbrella of Google’s recently announced “Google for India Digitization Fund,” the goal of which is to “accelerate India’s digital economy over the next five to seven years through a mix of equity investments, partnerships, and operational, infrastructure and ecosystem investments.”
In other Jio investment-related news (the only kind there is), Qualcomm has thrown $97M into the company (Yahoo). This is a strategic investment for Qualcomm, similar to Intel’s investment into Jio from last week. Specifically, Qualcomm could become a key component in the construction and rollout of 5G infrastructure in India, as part of Jio’s venture in this direction.
That escalated quickly: Bit of a wild week for the ‘Tok. By way of non-exhaustive recap: Amazon asked employees to remove the app from their phones…and then decided to reverse course a mere five hours later (NYT). The removal was intended to extend to any employee device that could access Amazon email or other corporate tools, similar to Google’s restriction of employee usage of Zoom on corporate devices. Meanwhile, Wells Fargo asked its employees to remove TikTok from their phones, and stuck with this guidance. I would bet that US government employees will be asked to do the same in the near future. This comes, of course, in the context of broader US-China trade actions and the US tech industry’s reaction to China’s recent Hong Kong national security law, which I and every other person on the planet wrote about last week. Further, as Mike Isaac and Karen Weise note in that NYT article, some concerns were catalyzed by the work of a security researcher (Digital Information World) who revealed that TikTok had the ability to pull a significant amount of information that a user copied to a clipboard on their smartphone, along with numerous other datapoints about the user and their device. Just this week, the South Koreans fined TikTok for “mishandling child data.” (ZDNet) The fine was for a head-scratching $155,000; how they arrived at that majestic sum, I do not know. For a longer look at the “TikTok Wars,” I would recommend checking out Ben Thompson’s latest article on the subject.
Highly diplomatic White House trade advisor Peter Navarro accused Kevin Mayer—the former Disney executive who recently became the CEO of TikTok—of being “an American puppet.” (CNBC) I don’t think it was intended as a compliment.
One of my favorite subplots to this otherwise layered and concerning story, catalogued by the NYT’s Taylor Lorenz, is the incredibly dramatic response of TikTokNation to any potential ban of the app. As Lorenz writes, TikTokers are saying things like, “If you’re going to mess with us, we will mess with you,” and engaging in stunningly coordinated campaigns to flame Donald Trump’s campaign app with negative reviews. 2020!
My way or the Huawei: Yeah, really bad. Anyway, first it seemed that Britain was comfortable using Huawei as its partner for next-gen 5G infrastructure. Then, it definitely wasn’t. Now, Huawei is appealing to the British government (Times UK) to delay its decision to remove the company’s components (and prevent any 5G-related work from occurring) until after the country’s next election in 2025. That seems like a long stay of execution. It’s worth following this story as a potential precursor to similar and subsequent Huawei/5G partnership decisions across Europe. One example is what’s unfolding in Italy, where extensive new security protocols were announced for non-European suppliers of 5G components (Formiche). Huawei also reported its revenue for H1 2020 this week (TechCrunch): $64.9bn, +13.1% YoY, with net margins of 9.2%.
This week in corporate fraud: Wirecard is attracting significant interest (Reuters) from potential buyers who are interested in acquiring its core business and holdings. Can’t keep them down! Meanwhile, Luckin Coffee’s intransigent co-founder has been replaced (TechCrunch), making for a total of four directors to have left the board since news of Luckin’s fraudulent revenue reporting came to light. This probably means that millions of retail investors on Robinhood will start trying to trade the stock (Forbes). My financial advice, which is worth roughly as much as what you paid for it, is “don’t!”
A rising tide: In the wake of India’s ban of Chinese apps, India’s local TikTok competitor (clone?), Roposo, announced that it’s adding 500k new users per hour (South China Morning Post). The company projects to have around 100M users by the end of July, up from 55M prior to the app ban. This is an intriguing case study to watch, as it casts light on how local startups might fare in countries that enact bans on foreign competitors, whether Chinese or otherwise. That said, there aren’t many countries where 500k new users per hour is a demographic possibility.
For your ears only
I’ve always loved Jack Rose’s guitar playing and, contrary to all previous writings bemoaning the impact of AI on the creative process, would immediately sign myself up for a piece of technology that could make me sound like him. Rose, who died tragically young in 2009, was an innovative guitarist who made extensive use of open-chord guitar tunings to create the atmospheric drone sound that you hear in this video. He’s often linked to the American Primitive school of guitar catalyzed by John Fahey (not to be confused with the significantly less enjoyable but seemingly ascendant American Primitive school of politics).
I discovered South African drummer and bandleader Asher Gamedze through the excellent Aquarium Drunkard blog. Gamedze just released a beautiful spiritual/jazz track called “Siyabulela,” inspired by a gospel song that is apparently popular in South African churches. I can neither confirm nor deny this, but the song is wonderful. Enjoy!
See you all next week.
N