Surfacing Data Integrity
How can we trust where data is coming from in the age of LLMs?
I’ve been thinking about this a lot, mostly because I was on one of my Discord channels and someone posted a graphic meant to inform people about various hot-button issues regarding music streaming services.
The graphic below did what it was supposed to do - make you really think about switching to a platform that has fewer issues, whether that’s AI, funding or taking money from certain political factions, or artist treatment.
Like anyone else seeing this, I immediately looked for my current streaming service (Tidal) and found that it had issues I was concerned about. But instead of canceling my subscription post-haste, I took a beat. Where did they get the data for this graphic?
Looking at it, I saw this bit: “Sources: ‘No Thanks’ app, opensecrets.org, streaming services websites, countless articles/blogs/videos”
Data from Articles/Blogs/Videos
Immediately, that raised some red flags for me, especially the “countless articles/blogs/videos.” Since AI can now generate articles, blogs, and videos that can contain literally any talking point, I tend not to rely on them unless they come from a trusted source, and even then, I’m checking the sources.
Just the other day my mom sent me a music clip she really liked, and I had to tell her it was from an AI artist. She was shocked. She had no idea that it was AI. What told me it was AI were the features of the video, the glitches around logos, and the fact that the artist could only be found in a few places and didn’t have any internet footprint outside of streaming apps like Spotify and YouTube. Plus, the generated video was supposedly from a performance on America’s Got Talent, and it’s easy to verify via several sources that the artist not only never performed on the show, but doesn’t actually exist. (This wasn’t the first time this happened either. The LLMs are getting better every week.)
Data from the Government
The next bit of info that I checked out was OpenSecrets.org. The site’s data is curated from government sources; even its FAQ says so. This gives me a little more reassurance that the data is accurate. GRANTED, the current state of the US government makes me question a lot of things, but the site is showing historical data from 2024 along with previous years. Since that predates the current administration, I can pretty much trust that the data there is at least likely to be what was reported TO and ABOUT the government.
So, I looked up Spotify and Tidal. Given the amounts and who the donations were made to, the data indicates that these were likely donations made to political parties through a matching contributions program provided to employees. There’s even a section that distinguishes between donations from individuals working at a company and donations from the company itself. Even more interesting is the difference between the amounts donated to political parties or individuals vs. the amounts spent lobbying on specific bills.
Spotify is the example I’ll use here:
You can see that the organization itself didn’t donate anything, while individuals donated less than 200k to politically affiliated persons, causes, and/or parties.
Which makes sense. Why would a company donate to an individual campaign when it can work with lobbyists to further its business goals with whoever is in office?
Spotify spent millions on lobbying around various bills. One of them I looked up was related to AI fraud. It doesn’t say WHAT their stance was or WHY the company was interested in this bill (Bill H.R. 6943), only that they paid a lobbyist to represent them to House Representatives involved with the bill. That could be anything from swaying opinion and therefore votes to helping representatives craft bill language.
Data from Apps
Circling back to the original graphic, what other sources does it have? An app called “No Thanks,” built to help folks make smarter choices around BDS (Boycott, Divestment, Sanctions).
So, I followed up on that app. It seems to have some issues with how user-submitted data is vetted, and possibly with data integrity, since it appears the app maintains its own data. It’s not clear what sources it’s deriving data from other than users. If that’s the case, then the data is the equivalent of a rating or a review on Yelp, GoodReads, or any other user-driven application. It’s subjective, and prone to inaccuracies.
Graphic Accuracy
Given that only one source actually has accessible data, I’m inclined to very much question the accuracy of the graphic. Could it still have valuable information? Or be accurate? Sure, but I can’t verify it.
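For the data-minded, the check I ran above can be written down as a tiny sketch. This is only an illustration of my own reasoning, not a real tool: the Source class, the two yes/no questions, and the answers I filled in are all assumptions based on what I could find (for example, I marked “streaming services websites” as unverifiable only because the graphic doesn’t say which pages it used).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One source cited on the graphic (hypothetical structure for illustration)."""
    name: str
    names_specific_origin: bool  # Can I tell exactly which page or dataset was used?
    raw_data_accessible: bool    # Can I open the underlying data myself?


def is_verifiable(source: Source) -> bool:
    # A source only counts if I can both locate it and inspect its data.
    return source.names_specific_origin and source.raw_data_accessible


# The answers below are my own judgment calls from the post, not measured facts.
SOURCES = [
    Source('"No Thanks" app', True, False),
    Source("opensecrets.org", True, True),
    Source("streaming services websites", False, False),
    Source("countless articles/blogs/videos", False, False),
]

verifiable = [s.name for s in SOURCES if is_verifiable(s)]
print(f"{len(verifiable)} of {len(SOURCES)} cited sources can be checked: {verifiable}")
```

Someone else could reasonably answer those two questions differently for a given source. The point is just to ask them before acting on the graphic.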
Anyone playing around on the internet (honestly, we should have more education on trusted sources and accuracy in general) should take a moment to question what they are seeing/reading/hearing and check for other sources. That goes especially for data presented in a format that is meant to drive an action, or for data or information presented by a particular state or public actor that can’t be referenced or double-checked via other sources.
Anything that causes you to immediately react - stop - breathe - look for more information. Look for raw data. Look for multiple trusted sources. Whether that’s related to something personal, or even business data, take a moment and verify what you’re seeing is accurate, verifiable, and based in reality. Your brain and your databases will definitely benefit from your cautious pause before proceeding.





This is such an eye-opening post, thank you. I don't use any music streaming app, but your guidelines apply to all kinds of consumer stuff that I might purchase.