With nothing less than the future of our digitized history at stake, the final episode of Season 2 of Traceroute explores the threats and challenges the Internet Archive faces in the wake of its copyright infringement case. We are joined by Rebecca Tushnet, the Harvard Law professor who defended the Archive in the case, to discuss the potential fallout of the court’s ruling: are we moving towards a society where information is owned by an elite few and 'rented out' at a price? If so, do we risk manipulation of that information for the sake of profit? Or will we find among our archivists, preservationists, librarians, and even activists a person who can be responsible enough to be dubbed “The Arbiter of History?” Don’t miss the thought-provoking finale to Traceroute!
John Taylor:
Before we begin, a brief trigger warning. This week's episode contains a mention of suicide. We thought you should be aware before listening.
Majel Barrett:
Last week on Traceroute.
John Taylor:
This is the story of Alexis Rossi, she's the Director of Media and Access at the Internet Archive in San Francisco.
Alexis Rossi:
In that sense, yes, I'm a weird sort of librarian.
John Taylor:
What Alexis is trying to do is back up everything.
Alexis Rossi:
As a library, one of the things we're really concerned about is provenance.
John Taylor:
How do we define history when its analog component is dust?
Speaker 4:
Perhaps we could use various optical tools to create an image of the record.
Bryce Roe:
But all preservation work involves subjective decisions on the part of the operator, that's just the nature of the work.
Amy Tobey:
Han shot first.
Fen Aldrich:
Is this the one use of the blockchain I can come up with?
John Taylor:
If you could prove ownership of a piece of information, then accessing that information without permission would be a crime.
Aaron Swartz:
Because everything is copyrighted, the speech, the thing I'm giving right now, these words are copyrighted.
Alexis Rossi:
If they decide they don't want something out there anymore, they just take it off of your device.
John Taylor:
Who will become the arbiter of history?
Majel Barrett:
Now the conclusion of our two-part season finale.
John Taylor:
If you knew that that was a deepfake of Majel Barrett's voice, your geek cred is legit. Now, if you knew that we used Majel Barrett's voice because she did the voiceovers for all the Star Trek: The Next Generation season finales, then not only do you win the Girl Scout nerd badge, but I will personally come over and sew it on your sash. However, there is a whole other reason we opened with a deepfake AI narrator. Our season finale is all about the intersection between open internet access, information, and ownership. In part one, we introduced you to Alexis Rossi of The Internet Archive. Alexis, along with other dedicated preservationists and even activists, is attempting to digitize and store the sum total of the world's knowledge for future generations to access and study. However, The Internet Archive faces an existential crisis stemming from a copyright lawsuit. This leaves us to ask the question, "If information about our past can be bought and sold, who becomes the arbiter of history?" which I personally think is just a really cool title.
Bartell LaRue:
I am the guardian of forever.
John Taylor:
All right, maybe it's not guardian of forever cool, but cool nonetheless. Now that we're all caught up, let's dive right into our crisis. The Internet Archive was sued by some of the major publishing houses, and in March of this year, a verdict was handed down; the Archive lost the case. Among the many supporters of the Internet Archive’s position was Attorney Rebecca Tushnet, who wrote a brief on behalf of herself and thirteen other prominent law professors in favor of the Archive.
Rebecca Tushnet:
I teach intellectual property law at Harvard Law School, so I've been teaching law for about 20 years.
John Taylor:
On the surface, the case seemed to center around a very non-controversial idea called controlled digital lending.
Rebecca Tushnet:
Controlled digital lending is the idea that when you have a copy of a book, you should be able to lend that copy, including in ways that are now facilitated by digital technologies. You should be able to take your physical copy and lend a digital copy, as long as you ensure that you're only lending one copy at a time, and that you're not lending the physical copy at the same time. The idea is one-to-one matching, where you're just getting the same use as you would out of the physical copy, but you're actually just using a digital copy instead. The Internet Archive case is the first big test of whether this is compatible with copyright law.
John Taylor:
Traditionally, a library buys a book from a publisher at a much higher price than the public does, by the way, and then lends those books out to customers. According to the publishers in the suit, controlled digital lending can be considered making copies of a book, and publishers have always controlled whether a person can make a copy of a book or not. Even though The Internet Archive argued that the digital "copy" appearing on your tablet directly correlates one-to-one with a physical version of the book in the library, and they aren't lending out more than one digital copy at a time, publishers said it's still copying, and that's a bozo no-no.
Rebecca Tushnet:
Does anyone actually care, if you're not Dr. McCoy, that they take it from one place and they sort of reassemble it in another? From the practical standpoint, it's not actually like there are dozens of copies being made, but from the technical standpoint, there are copies being made. That provides the legal hook for the publishers to say, "We get to control that, even though we don't control physical lending."
John Taylor:
This is where we reach the critical point. You see, The Internet Archive had been doing controlled digital lending for years before this suit, and though publishers had been wringing their hands over this, they hadn't really taken any action. Then the pandemic happened, suddenly school kids were on lockdown and couldn't get access to the physical books they needed, so The Internet Archive stepped in and offered to replace those physical books with loaned digital copies. In essence, this took the "controlled" out of controlled digital lending, and the publishers pounced on it. The Internet Archive argued that this limited deviation from their normal lending practices fell under the Fair Use Doctrine, a copyright law that permits the unlicensed use of protected works in certain circumstances.
Rebecca Tushnet:
The court ruled that this was not fair use, and that it did infringe the publishers' copyrights, and then it left to be determined the amount of damages, which could be very large depending on what happens next. The district court's opinion suggests that both what The Internet Archive did during the pandemic and CDL itself is all unfair.
Fen Aldrich:
We're actually at a really interesting moment, because I think we're at a really interesting turning point in internet archival right now that's over the past six months to a year even, between The Internet Archive potentially, we've got a number of streaming services consolidating and dropping streaming content that like, where do I find this now? The only place it existed was on these streaming services-
Amy Tobey:
The Pirate Bay.
Fen Aldrich:
... And now it's disappeared into the ether, except for extra economical ways of accessing it. Reddit was the big one, between Twitter and Reddit it's been huge, like it's largely become a joke that people use Google just to search Reddit because Google's search algorithms are better, but Reddit actually provides you the answers, and now that's going away. What do we do in the event of potentially these people that just wanted to make money destroying a thing that became the unofficial record of social knowledge of when you wanted to ask people a question, it's gone, it's not there, you can't find it right now?
John Taylor:
Interesting.
Amy Tobey:
That comes down to stewardship, like the steward of Twitter doesn't care about the history of Twitter, but the rest of us from a historical perspective are like, "That's where a good quarter of the shared historical record of COVID exists," is Twitter, Reddit, Facebook, these commercial platforms that we all entrusted our data to, fully knowing that they were for-profit enterprises that have different goals from the community's goals.
John Taylor:
That's really interesting, because that goes right back to this idea that we were talking about with The Internet Archive, of who chooses what gets saved, who chooses what goes on to become history. For Alexis Rossi, that decision is inextricably intertwined with the idea of ownership.
Alexis Rossi:
What this feels like to me is that publishers are trying to make it so that you can't own a book anymore. If we continue down the road that we're on right now and eBooks continue to become more and more popular, just as streaming movies online or streaming music online is now the default, if an electronic book becomes the default, you don't own books anymore, and as a library, that's a big problem.
John Taylor:
What Alexis is trying to say here is that whoever controls our access to content could become the arbiter of history.
Alexis Rossi:
If you let a corporation control what you have access to, you might not like the results, whether that's a publisher or it's Amazon or it's Spotify or it's whoever. We've already seen the impermanence of data in these kind of streaming, you don't actually own the thing that you think you just bought, it's just a license kind of realm. Anyone who subscribes to HBO Max, for example, just lost hundreds of Sesame Street programs. I don't think it takes a whole lot of extrapolation to see that if a library can't buy a book anymore, we have a problem, we have a real problem.
John Taylor:
In a statement released on March 25th, 2023, Internet Archive founder Brewster Kahle declared, "Libraries are more than the customer service departments for corporate database products. For democracy to thrive at a global scale, libraries must be able to sustain their historic role in society: owning, preserving, and lending books. This ruling is a blow for libraries, readers and authors, and we plan to appeal it." The fight continues. A very important fight with huge ramifications... the implications of which we'll get to, right after a word from our sponsor. When it comes to movie nights, you deserve the best, which is why you simply must experience the unparalleled might of Imperial Cinemas. This ain't your average movie chain, folks, this is like the star destroyer of theaters, with seats so comfortable they're downright regal. These babies have it all, visuals that rival a state-of-the-art holoprojector, sound that'll make you think you're literally tearing through hyperspace, and rations that are light years ahead of even the fanciest of Corellian banquets. Set your coordinates to bit.ly/imperialcinemas, and save a few credits while you're there with the promo code Han Shot First, that's bit.ly/imperialcinemas, promo code Han Shot First. Remember, at Imperial Cinemas, your satisfaction isn't just a guarantee, it's an order. So the Internet Archive has pledged to appeal its case, a fight which has implications well beyond their ability to loan out digital copies of books. Rebecca Tushnet puts it this way:
Rebecca Tushnet:
Wealthy institutions in our society would like to turn us all into renters of everything, so there is this overall problem of a decline in ownership, some of which is fine. Renting a movie is fine, but the fact that when you click the buy button on Amazon, you're not actually buying the digital content according to Amazon, even though it says buy, they actually say in the fine print, "Oh yeah, but we can take it back anytime," and they do. I think there are strong connections to lots of other things that are going on, so this is a larger problem about what it takes for a human being to flourish and whether owning stuff should matter. I certainly have nothing against something like Rent the Runway, but not all your clothes should be rented, you should have some, and that clearly books and movies are at the forefront of this attempted shift, and music as well.
John Taylor:
Think of it this way. Let's say I rent a copy of Moby Dick, and I'm sitting there on the beach drinking a pina colada, thumbing through my Kindle and enjoying my classic tale of fate and free will, when I'm surprised to read in the end that Captain Ahab not only kills the great whale, but does so using his Mobiomatic-4000 spear gun, which just happens to be 30% off with next-day Prime shipping if I order within the next eight hours. I say to myself, "Whoa, I don't remember it ending this way when I read it in high school," so I go to my bookcase to find my hard copy, but I don't have one. I call my friends and see if I can look at their copy, but they don't have one either, so there's no way of knowing that this isn't the way Moby Dick is supposed to end, there's no provenance. Greedo shot first. Without a verifiable original version of Moby Dick to reference, the copyright holder could even deem the title too controversial and arbitrarily change it to The Big, Big Mean Whale. As the Internet Archive case weaves its way through the justice system, ostensibly all the way to the Supreme Court, the responsibility of what gets saved and what doesn't falls solemnly on the shoulders of librarians like Alexis Rossi.
Alexis Rossi:
The books that we have in our libraries are incredibly important to us understanding the historical record and what has really happened, and being able to digitize those books and put them online and make them available to everyone in the world to do their own research is, I think, incredibly important to our ability to survive as a democracy. This is one of the reasons I became a librarian in the first place, is to make sure people have access to information. You've got to make up your own mind about things and you need to do your research, and you need to understand how to do that research and how to figure out what is true and what is not true.
John Taylor:
To help ensure that people have access to all information, all information is welcome at the Archive. Alexis doesn't get in the middle of that decision. As she mentioned earlier, anyone can upload anything to The Internet Archive, and there are no fees to do so. In fact, there are no fees to access the information either, most of their revenue is generated by donations only, and that money is used for overhead, not the least of which includes some very special data centers. Can you tell us a little bit about your server space, like what kind of resources do you need, how much space are you using, that sort of thing?
Alexis Rossi:
At the moment, we have about 99 petabytes of unique data, that's 99 million gigabytes-ish, it's pretty gigantic. Everything is stored at least twice, we have two physical data centers in different places here in the United States, those both have complete copies of that unique data, we have partial backups in other countries as well, in Amsterdam, in Egypt, and in Canada.
John Taylor:
So, does that mean you don't use cloud servers?
Alexis Rossi:
Yeah, so we run our own data centers, nothing is stored on the cloud. It is a constant source of work, but we think it's really important to host those things ourselves, because it allows us to respect reader privacy to the highest degree that we possibly can. Nobody knows what you're reading, and we as much as possible don't have real IP addresses in our logs, so it's a really good way for us to do that traditional library thing of respecting reader privacy, and making sure that you feel free to read whatever you want in our library.
John Taylor:
Very interesting, so you've had to build your own data center for this project?
Alexis Rossi:
Yeah. When we started doing this, there was no cloud, so we started with building our own data centers. We have looked at what it would take to store things in the cloud, how much it costs, et cetera, and it turns out we can do it for less ourselves. We are at a scale now where that is a constant, "Oh no, a disk went down, somebody has to go recover the disk and replace it," and it is a constant maintenance, again, with our Sisyphean task, but we think it's an important thing to do.
Amy Tobey:
It comes down to how do we replicate and how much do we replicate, and where do we stop? Typically in technology, we do N plus one, we have so many copies plus one more just in case. The minimum number of copies usually if you want to guarantee a single fault is two, you can handle a single fault, but usually people want to handle a double fault, which means your minimum is three. We come back to, how do we manage all these resources? How much do we invest in remembering?
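The rule of thumb Amy describes (one more full copy than the number of faults you want to survive) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: every copy is a complete replica, a "fault" means losing one whole copy, and the function names are hypothetical rather than anything the Archive uses. The 99 petabyte figure is the one Alexis gives above, converted with decimal (SI) units.

```python
# Decimal (SI) conversion: 1 petabyte = 1,000,000 gigabytes.
PETABYTE_IN_GIGABYTES = 1_000_000


def copies_needed(faults_to_tolerate: int) -> int:
    """Minimum number of full copies that survives the given number of lost copies."""
    if faults_to_tolerate < 0:
        raise ValueError("faults_to_tolerate must be non-negative")
    # N + 1: one copy must remain after every tolerated fault has happened.
    return faults_to_tolerate + 1


def raw_storage_pb(unique_pb: int, faults_to_tolerate: int) -> int:
    """Total petabytes on disk when every unique byte is fully replicated."""
    return unique_pb * copies_needed(faults_to_tolerate)


if __name__ == "__main__":
    unique_pb = 99  # figure quoted in the interview
    print(f"{unique_pb} PB is about {unique_pb * PETABYTE_IN_GIGABYTES:,} GB")
    for faults in (1, 2):
        copies = copies_needed(faults)
        total = raw_storage_pb(unique_pb, faults)
        print(f"tolerate {faults} fault(s): {copies} copies, {total} PB raw")
```

Under these assumptions, surviving a single fault means two copies (198 PB of raw storage for 99 PB of unique data), and surviving a double fault means three copies (297 PB), which is the cost side of the question "How much do we invest in remembering?"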
Fen Aldrich:
"How much do we invest in remembering?" is such a good way to frame all of that. The challenge I think that comes up with it under capitalism is that everything is incentivized to generate income, like the incentive for anything, whether we preserve it or get rid of it, or do we care for it or do we create it in the first place, eventually gets tied to like, "Does it help me exist under a world that requires me to pay for everything?" All of this is coming back around, all the things that we have deemed worthy of saving and have the resources for have had some commercial value and success. If we don't have organizations that are interested in just the anthropology of it, just what humanity are we saving, what interesting corner of the world are we saving by preserving this, or what people are we giving immortality to by preserving this?
Amy Tobey:
So we're back to moderation.
Fen Aldrich:
Yeah. Well, moderation and also is the resource cost compared to what value a thing has an actual useful measurement for data preservation?
John Taylor:
That's such a great question, because our history itself is born of that system, that even great ideas are preserved because they initially made money in some way, right?
Amy Tobey:
I'm thinking about Bach, which isn't unique, but the thing I think a lot about Bach is the vast majority of the music that he created was lost immediately the moment he created it, 'cause the vast majority of it was he would show up into church in the morning, sit down at the organ, crack his knuckles and start jamming. That's what he did, and this is Amy's take, but a lot of what we have left is when he had a good idea, took it home and wrote it down, and had a relationship with a publicist who would then take that and reproduce it and preserve it and all of those things, but the reality is we lost most of that music the moment it was created.
John Taylor:
Financial resources aside, when it comes to who decides what information gets saved or duplicated, The Internet Archive has what might be described as a very personal approach, or anarchy might be a good description as well. If a member of the public wants something saved, they can upload it to the Archive, and if they want to see more projects flourish, they can make a donation, but putting financial resources aside is easier said than done, even for the folks over at the Northeast Document Conservation Center, where director Bryce Roe faces the issue of financing on almost every project.
Bryce Roe:
Money is a practical consideration for what gets preserved, and so we're involved in that part of discussion at least, "This is what it's going to cost to do this object," and you might be able to do multiple discs with what you would spend to do this really damaged object with IRENE, and it's on them to prioritize their collections for preservation. We certainly encounter that and think about it in a couple different ways in our day-to-day work.
John Taylor:
Excellent.
Bryce Roe:
But it's endlessly fascinating to me, it's what makes my work feel really engaging on a day when it might be more spreadsheets and emails. I think the fact that there are these larger philosophical questions certainly makes our day-to-day work very interesting.
John Taylor:
Fortunately, money isn't always a factor in determining what gets preserved and archived and what doesn't. In fact, the Stanley Brothers restoration project at the NEDCC was chosen in a really unique way.
Bryce Roe:
For me, one of the things that stands out about that project is not on the technical side at all, but it was funded by the Virginia Museum Associations, like you can vote on Virginia's top 10 most endangered objects and then the winner gets funding. I just thought it was cool that the Virginia, I mean you could vote from anywhere, that it got the most votes that it was decided as the thing that people were most excited to have preserved that year.
John Taylor:
This is actually really cool, 'cause what she's saying is that they literally hold a People's Choice Award for the top 10 most endangered artifacts. If the artifact wins the vote, and that vote is open to anyone in the public, then the Virginia Association of Museums grants money for the item's restoration. This is probably the closest thing we've found to a democratized decision about what gets preserved and what doesn't. It sounds to me like professionals like Bryce Roe and Julia Hawkins should be the arbiters of history, but they say no, and they say it adamantly. If you are not the arbiters of history, how do you see yourselves?
Bryce Roe:
I think our goal is to carry it into the future as best as we can and as much of it as possible, to carry it into the future because we don't know what kinds of questions people will be asking. That's how I see our role, we're trying to ensure that as much of it as possible remains after us.
Julia Hawkins:
You're talking to a couple of people that went to archive school, so this is definitely-
John Taylor:
That's why it's the best question to ask.
Julia Hawkins:
Yes, it is the best question. I don't know, it feels very much like just trying to chip away at entropy, working with formats that are so new and yet are so vulnerable. Audio as recorded history is so new in the grand scheme of things, we have people here working on books that are from the time of illuminated manuscripts, and sometimes they feel like they're in better shape than the audio discs that we see that were recorded in the '50s. It feels very much like this race against time a lot of the time, but the more that we can collaboratively preserve and keep and share and make accessible, the better view we get of our own history, and maybe the better decisions that we can make in the future.
John Taylor:
Which is essentially what Alexis was getting at in part one, when she said it means the librarians are not the arbiters of what gets saved. We don't want the big, bad corporations to be the arbiters of history, but the literal professionals who have dedicated their lives to preserving history can't or don't want the job either, so who should it be? Should it be activists like Aaron Swartz, the young tech genius and internet hacktivist who led a grassroots effort that put an end to the Stop Online Piracy Act? He was smart, understood the issues, and was ready to fight for a free and open internet, as he demonstrated in this speech from 2012.
Aaron Swartz:
There's a battle going on right now, a battle to define everything that happens on the internet in terms of traditional things that the law understands. Is sharing a video on BitTorrent like shoplifting from a movie store, or is it like loaning a videotape to a friend? Is reloading a webpage over and over again like a peaceful virtual sit-in or a violent smashing of shop windows? Is the freedom to connect like freedom of speech or like the freedom to murder? This bill would be a huge potentially permanent loss. If we lost the ability to communicate with each other over the internet, it would be a change to the Bill of Rights, the freedoms guaranteed in our constitution, the freedoms our country had been built on would be suddenly deleted.
John Taylor:
When I listen to this, I'm struck with the feeling that Aaron knew that he couldn't be the arbiter of history either. He understood that that much responsibility could not be placed in the hands of a single individual, no matter how good or pure their intentions were to begin with, call it absolute power corrupting absolutely. Let me give you an example. If I wanted to, I could go back and change some minor detail in part one of this finale just to satisfy some sponsor, and that seems innocuous enough, but is it a slippery slope? I should note here that I'm extrapolating a lot of what I think Aaron would do or believe or say, because I never got a chance to meet him or talk with him. Following his arrest in 2011 for downloading academic journal articles through the MIT Computer Network, which the United States said constituted multiple violations of the Computer Fraud and Abuse Act of '86, Aaron endured two years of prosecution and plea negotiations related to his indictment. His case never made it to trial because on January 11th, 2013, Aaron took his own life. I believe that Aaron Swartz knew exactly who the arbiter of history is supposed to be: it's you, it's the guy at the grocery store who stocks the frozen food, it's the woman who drives the transit bus, it's your non-binary neighbor with the toy poodle who barks too much, it's you, it's me, it's every single one of us, and the more the merrier. Who decides what happened in history? We'll all put our heads together and decide, armed with the most accurately preserved information available to us. It's peer review of subjective interpretation, an insanely contradictory but beautiful uniquely human process. Ancient Rome? I wasn't there and neither were you, but tell you what, you come to the table with your reference materials, I'll come with mine, we'll show each other their provenance and decide on the facts the best we can. Every one of us is the arbiter of history.
Amy Tobey:
Throughout my career and my life, I've created a number of artistic and technical things that I'm proud of, and I have a kid who eventually will inherit everything that I have. I'm curious sometimes to think about, what will he keep and what will he throw away. The papers from my desk, I sure hope he just burns those, but maybe the recordings of me playing in an orchestra, maybe he'll keep those. It's something I wonder about, because nobody else has any responsibility at all to preserve the things that I've done in my life, he has no real responsibility, but maybe he'll feel some. I wonder about that sometimes.
Fen Aldrich:
To throw a quote from a tattoo a friend is trying to get, "We are all of us moving forward, none of us are going backward." I think that's an interesting thing to keep in mind in the concept of archival, like no matter what this says about the past, we are never going back there, we are only ever moving forward and making something new.
John Taylor:
That's what technologists do, we make something new, we move forward, and as we do, we sometimes peel back the layers and look for the traces of humanity. I've come to believe that humanity is every layer, in fact it's the only layer, call it the stack, call it hardware or software or infrastructure, everything we create we weave together with dreams and ideas and visions of something better, something wonderful. At the top of that stack is everything new we've learned, and at the bottom of that stack is the map of where we started, our history, but without each layer from the very first to the very last, the thing won't work. Every layer depends on the other, without modern improvements, our stack won't function, without history, our stack is meaningless. There's a famous quote usually attributed to Willy Wonka that goes, "We are the music makers and we are the dreamers of dreams." The quote is actually taken from the poem Ode by Arthur O'Shaughnessy, and as relevant as the quote may be, I think it's the lesser known fourth stanza of the poem that really sums things up: "A breath of our inspiration is the life of each generation, a wondrous thing of our dreaming, unearthly, impossible seeming, the soldier, the king, and the peasant all working together in one 'til our dreams shall become their present, and their work in the world be done."
Fen Aldrich:
Traceroute is a podcast from Equinix and Stories Bureau. This episode was produced by John Taylor with help from Tim Balint and Cat Bagsic. It was edited by Joshua Ramsey and mixed by Jeremy Tuttle, with additional editing and sound design by Mathr de Leon. Our theme song was composed by Ty Gibbons. You can check us out on Twitter at Origins_dev, that's D-E-V, and type origins.dev into your browser for even more stories about the human layer of the stack. We'll leave these links and more including an episode transcript down in the show notes. If you enjoyed the show, please share it wherever you hang out online, and consider leaving a five-star rating on Apple and Spotify, because it really helps other people find the show. I'm Fen Aldrich, thanks for listening.