"Like iron filings drawn to a magnet"
Do we lose anything when we let AI do all the transcription?
This week I gave a guest lecture about long-form writing to a group of undergraduate journalism students. They had lots of great questions - how to move from subject to story (something I still sometimes struggle with), how to think about structure, and how to manage research and notes. There was even a question about whether or not to record interviews, and how best to manage going through audio afterwards. I love nothing more than talking about the really nerdy aspects of the reporting process, so here we go.
I record pretty much everything when I’m reporting a piece, as well as making notes. This has been very much on my mind recently because I’m deep within the process of going through audio transcripts and notes for a couple of different pieces. Earlier in my career, transcribing audio was probably the single thing I spent most time on. I couldn’t afford to pay someone else to do it, and it’s an essential, if time-consuming, job. That has radically changed with the advent of AI transcription services. Otter.ai is the best-known provider, and is the service I’ve used the most. The company changed its pricing structure a couple of years ago, making it much more expensive and its terms worse. I’ve also found it increasingly glitchy. (Case in point: this week I recorded a Zoom call straight into Otter.ai. It showed that it had recorded the full 29 minutes, but for reasons I still don’t understand it transcribed only the first 35 seconds.) It is also absolutely terrible at transcribing most accents that aren’t American or English Received Pronunciation. I’ve increasingly been moving to Transkriptor, which is cheaper than Otter.ai, transcribes in hundreds of languages, and is pretty good with a wide range of accents.
Amid the rapid gain in convenience, it can be easy to forget that there are serious risks in using AI transcription services. A memorable piece published by Politico in 2022 explored this, after a journalist who interviewed a Uyghur activist got a weird request from Otter. As he writes:
We make privacy versus utility tradeoffs all the time with our tech. We know Facebook sells our data, but we still post baby pictures. We allow Google maps access to our location, even though we know it leaves an indelible digital trail. And even savvy, skeptical journalists who take robust efforts to protect sources have found themselves in the thrall of Otter, a transcription app powered by artificial intelligence, and which has virtually eliminated the once-painstaking task of writing up interview notes. That’s an overlooked vulnerability that puts data and sources at risk, say experts.
The story is a sobering reminder that it’s important not to give everything over to tech companies, and that despite the allure of convenience, journalists need to remember their responsibilities to vulnerable sources.
On a more abstract note, I’ve also been thinking about what else is lost when we give this boring but important aspect of reporting work over to the machines. Here’s the great nonfiction writer John McPhee talking about transcription in a rare interview he gave to the Paris Review in 2010:
First thing I do is transcribe my notes. This is not an altogether mindless process. You’re copying your notes, and you get ideas. You get ideas for structure. You get ideas for wording, phraseologies. As I’m typing, if something crosses my mind I flip it in there. When I’m done, certain ideas have accrued and have been added to it, like iron filings drawn to a magnet.
And so now you’ve got piles of stuff on the table, unlike a fiction writer. A fiction writer doesn’t have this at all. A fiction writer is feeling her way, feeling her way — it’s much more of a trial-and-error, exploratory thing. With nonfiction, you’ve got your material, and what you’re trying to do is tell it as a story in a way that doesn’t violate fact, but at the same time is structured and presented in a way that makes it interesting to read.
I always say to my classes that it’s analogous to cooking a dinner. You go to the store and you buy a lot of things. You bring them home and you put them on the kitchen counter, and that’s what you’re going to make your dinner out of. If you’ve got a red pepper over here — it’s not a tomato. You’ve got to deal with what you’ve got. You don’t have an ideal collection of material every time out.
I think I still get ideas for structure and how to put a story together when I’m reading through transcripts, even if they’ve been automatically generated, but it’s true that I probably lose something by not hearing where the pauses are or the tone of voice. I’m generally happy not to be transcribing everything myself any more, but McPhee’s words are a good reminder to think about how else to accrue those ideas.
Reading/listening
I am enjoying the latest Serial podcast, The Good Whale, about efforts to rehabilitate the star of the hit 1990s movie Free Willy. It’s a good antidote for anyone feeling overwhelmed by bleak political news.
Lovely New Yorker story about India’s polyglot linguistic culture and Hindu nationalist efforts to push Hindi.
Really excellent Guardian Long Read piece about the ethical complexity of trying to save extremely premature babies.
As always, thanks for reading, and I’ll be back soon. (Hopefully with some work to share, if I can ever finish going through the reams of audio transcripts AI has helped me to accumulate.)