We are at a technological crossroads: we're still used to the idea that audio and video are decent proof that something happened, in a way we don't generally extend to written descriptions of an event. Generative AI will be a significant problem for a while, but this assumption that audio/video is inherently trustworthy will, relatively soon in the grand scheme of things, go away, and we'll return to the historical situation.
We've basically been living in a privileged and brief time in human history for the last 100-200 years, where you could mostly trust your eyes and ears to learn about events that you didn't directly witness. This didn't exist before photographs and phonographs: if you didn't witness an event personally, you could only rely on trust in other human beings who told you about it to know if it actually happened. The same will soon start to be true again, if it isn't already: a million videos from random anonymous strangers showing something happening will mean nothing, just like a million comments describing it mean nothing today.
This is not a brave new world of post-truth such as the world has never seen before. It is going back to basically the world we had before photo, video, and sound recordings.
I think I would not like to live in a world in which democracy isn't the predominant form of government. The ability of the typical person to understand and form their own opinions about the world is quite important to democracy, and journalism does help with that. But I guess the modern version of image- and video-heavy journalism wasn't the only thing we had the whole time; even as recently as the 90's (I'm pretty sure; I was just a kid), newspapers were a major source. And somehow America was invented before photojournalism, though of course that form of democracy would be hard for us to recognize nowadays…
It is only when we got these portable video screens that stuff like YouTube and TikTok became really important news sources (for better or worse; worse I would say). And anyway, people already manage to take misleading or out of context videos, so it isn’t like the situation is very good.
Maybe AI video will be a blessing in disguise. At some point we'll have to give up on believing something just because we saw it. I guess we'll have to rely on people attesting to information, that sort of thing. With modern cryptography I guess we could do that fairly well.
Edit: Another way of looking at it: basically no modern journalist or politician has a reputation better than an inanimate object, a photo or video. That's a really bizarre situation! We're used to consulting people on hard decisions, right? Not figuring out everything by direct observation.
I'd argue it's a step or two more manipulative. Not only do bad actors have the ability to generate moving images which are default believed by many, they also have the ability to measure the response over large populations, which lets them tune for the effect they want. One step more is building response models for target groups so that each can receive tailored distraction/outrage materials targeted to them. Further, the ability to replicate speech patterns and voice for each of your trusted humans with fabricated material is already commonplace.
True endstage adtech will require attention modeling of individuals so that you can predict target response before presenting optimized material.
It's not just a step back, it's a step into black. Each person has to maintain an encrypted web of trust and hope nobody in their trust ring is compromised. Once they are, it's not clear that even in-person conversations aren't contaminated.
> Further, the ability to replicate speech patterns and voice for each of your trusted humans with fabricated material is already commonplace.
Just like the ability to emulate the writing style of your trusted humans was (somewhat) commonplace in the time in which you'd only talk to distant friends over letters.
> Once they are, it's not clear even in person conversations aren't contaminated.
How exactly could any current or even somewhat close technology alter my perception of what someone I'm talking to in-person is saying?
Otherwise, the points about targeting are fair - PR/propaganda has already advanced considerably compared to even 50 years ago, and more personalized propaganda will be a considerable problem, regardless of medium.
I feel as though I am honor-bound to say that this isn't new, and we haven't really been living in a place where we can trust in the way you claim. It's simply that every year it rapidly becomes more and more clear that there is no "original". You're not wrong; I just think it's important for people who care about such things to realize this is the result of a historical process which has been going on longer than we've all been alive. In fact, it likely started at the beginning of the 100-200 year period you're talking about, but its origins are much, much older than that.
Which was the era of insular beliefs, rank superstition and dramatically less use of human potential.
I feel it's not appreciated that we are (were) part of an information ecosystem/market, and this looks like the dawn of industrial-scale information pollution. Like firms dumping fertilizer into the waterways with no care for the downstream impacts, just a concern for the bottom line.
It's not all the way back as long as solid encryption exists: Tim Cook could digitally sign his announcements, and assuming we can establish his signature (we had signatures and stamps 200 years ago) video proof still works.
So we're not going all the way back, but the era of believing strangers because they have photographic or video proof is drawing to a close.
Cryptography is nice here, but the base idea remains the same: you need to trust the person publishing the video to believe the video. Cryptography doesn't help for most interesting cases here, though it can help with another level, that of impersonation.
Sure, Tim Cook can sign a video so I know he is the one who published it - though watching it on https://apple.com does more or less the same thing. But if the video is showing some rockets hitting an air base, the cryptography doesn't do anything to tell you whether these were real rockets or it's an AI-generated video. It's your trust in Tim Cook (or lack thereof) that determines whether you believe the video or not.
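The split described here - a signature proves *who* published the bytes and that they weren't altered, but says nothing about whether the scene is real - can be sketched with a toy RSA-style scheme. This is a hypothetical illustration with small, wildly insecure numbers and no padding, not a real signing protocol:

```python
import hashlib

# Toy keypair (assumed example primes; real keys use ~2048-bit moduli and padding).
p, q = 104729, 1299709          # two well-known primes, insecure at this size
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)
e = 65537                       # public exponent
d = pow(e, -1, phi)             # private exponent (Python 3.8+ modular inverse)

def sign(message: bytes, d: int, n: int) -> int:
    # Hash the message, reduce into the modulus, "encrypt" with the private key.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, sig: int, e: int, n: int) -> bool:
    # Recompute the hash and check it against the signature raised to e.
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(sig, e, n) == h

video = b"raw bytes of the announcement video"
sig = sign(video, d, n)

assert verify(video, sig, e, n)             # authentic: published by the key holder
assert not verify(b"tampered bytes", sig, e, n)  # any edit breaks the signature
```

Note what the `verify` call checks: that the bytes match what the key holder signed. Whether the rockets in those bytes ever existed is exactly the part cryptography cannot answer - that's still trust in the signer.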
All this talk of trust speaks to the larger issue here too - that we've lost so much trust in governments and other important institutions. I'm not saying it was undeserved, but it's still an issue we need to fix.
That only really matters if it's hard to feed generated data into a camera/microphone that does this signing. It's not that hard already (you can just film a screen showing the generated video for a very basic version of this), and if there was significant interest, I'm sure it would become commoditized very quickly. Not to mention that any signing scheme is quickly captured by powerful states.