Metadata isn’t like the regular data we’re used to, but the differences between the two can be hard to pin down. Put simply, metadata is data that describes other data and helps sort it.
Metadata comes in handy across a range of industries, but its widespread use and the sheer amount of detail it contains can impact your digital privacy. There’s data we don’t want shared with the world, after all, and sometimes metadata can reveal more than we’d like without us knowing.
I’ll take a deep dive into metadata, explaining what it is, why it’s so important in today’s digital world, and how it can impact our privacy on the internet.
What is metadata?
Metadata is data that describes other data. By summarizing basic information about other data, which is usually kept inside a big database, metadata makes it easier to understand, find, and work with large datasets in scenarios where manually sorting each piece of stored data would be impossible.
Okay, so let’s say you’re given the task of categorizing 20 grocery items – sorting them into fruits, veggies, and canned goods. It’d be easy, right? Well, what if it was 2,000 items… or 2 million? This is where metadata can help. In this example, metadata is the labels on the grocery goods that allow you to know, at a glance, what you’re looking at.
We use metadata every day in our phone galleries, too. If you’re looking for that iconic photo you took last Christmas Eve, would you scroll through your entire library of photos to find it? Or would you filter by date? The date of the photo in question is part of its metadata.
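To make the gallery example concrete, here is a minimal Python sketch of filtering by date. The `Photo` class, the file names, and the timestamps are all made up for illustration; a real gallery app would read the capture time from each image file rather than from a hand-written list.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Photo:
    filename: str        # hypothetical file name
    taken_at: datetime   # capture time, i.e. a piece of the photo's metadata


def photos_taken_on(photos, day):
    """Return the photos whose capture date matches `day`."""
    return [p for p in photos if p.taken_at.date() == day]


# A tiny made-up library standing in for a phone gallery.
library = [
    Photo("beach.jpg", datetime(2023, 7, 14, 16, 5)),
    Photo("tree.jpg", datetime(2023, 12, 24, 19, 30)),
    Photo("dinner.jpg", datetime(2023, 12, 24, 21, 10)),
    Photo("snow.jpg", datetime(2024, 1, 2, 9, 45)),
]

christmas_eve = photos_taken_on(library, date(2023, 12, 24))
print([p.filename for p in christmas_eve])  # ['tree.jpg', 'dinner.jpg']
```

The filter never looks at the pixels themselves. It only reads the date attached to each photo, which is why it stays fast no matter how large the library gets.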
Types of metadata
Since metadata is so widely used across different industries, it applies to all sorts of things, including pictures, audio files, documents, spreadsheets, web pages, and more.
Whatever the format, metadata usually contains the following basic information:
- The data’s title
- When the data was created
- When (and whether) the data was modified
- The data’s author
- The data’s source
- The data’s file size
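Several of these fields can be read straight from the file system. The sketch below is a rough illustration using Python's standard library; the function name `basic_metadata` and the sample file `notes.txt` are my own inventions. It leaves out the author, which the file system call used here does not report, and the creation time, which is exposed differently on each operating system.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def basic_metadata(path):
    """Collect a few metadata fields the file system keeps for a file."""
    p = Path(path)
    info = p.stat()
    return {
        "title": p.name,                  # file name, used here as the title
        "size_bytes": info.st_size,       # file size in bytes
        # Last modification time, converted to a timezone-aware datetime.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_bytes(b"hello")
    meta = basic_metadata(sample)

print(meta["title"], meta["size_bytes"])  # notes.txt 5
```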
Why is metadata important?
As I mentioned earlier, metadata simplifies working with large datasets on the web, which are present in all sorts of different formats. Without metadata, it would be a near-impossible task to work with (or search for) specific types of data in such astronomically large databases.
With metadata, however, users can easily find the data they need, understand what the data includes, who made it and when, and more. This standardization and categorization is particularly useful for large corporations that share datasets between teams who might otherwise misinterpret the content.
Metadata also keeps the internet running smoothly. It tells search engines exactly what is on a particular page, which helps them return more relevant sites and services for a user’s search queries.
This is also why it’s crucial for website owners to optimize the metadata (such as meta titles and meta descriptions) on their web pages.
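As a rough illustration of what a crawler sees, the sketch below pulls the meta title and meta description out of a page with Python's built-in `html.parser`. The `MetaExtractor` class and the sample page are made up for this example, and real search engines do far more than this.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and the description <meta> tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self._in_title:
            self.title += data


# A made-up page used only for this example.
page = """<html><head>
<title>What is metadata?</title>
<meta name="description" content="A short guide to metadata and privacy.">
</head><body><p>Hello</p></body></html>"""

parser = MetaExtractor()
parser.feed(page)
print(parser.title)        # What is metadata?
print(parser.description)  # A short guide to metadata and privacy.
```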
It’s also worth noting that most digital files carry their own metadata. A document or spreadsheet you create typically keeps track of who authored it (name and/or email address), when, and sometimes on what device. Likewise, any song you listen to has metadata listing the artist, album, year of release, genre, and so on.
Keeping your digital life organized without these details would be a headache, to say the least, as you’d have to remember a lot of information yourself.
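To see where a document keeps this information, here is a minimal sketch for Office files. A .docx or .xlsx file is a ZIP archive, and its author and timestamps conventionally sit in a part named `docProps/core.xml`. The helper `read_core_properties` is my own name, and the sample archive is built in memory with a made-up author rather than taken from a real document; the sketch assumes the conventional part path and returns `None` for any field that is absent.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx):
    """Return author and timestamp fields from docProps/core.xml."""
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text(tag):
        node = root.find(tag, NS)
        return node.text if node is not None else None

    return {
        "creator": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
    }


# Build a tiny stand-in archive in memory (made-up author, no real document).
sample_xml = (
    f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}" '
    f'xmlns:dcterms="{NS["dcterms"]}">'
    "<dc:creator>Jane Doe</dc:creator>"
    "<cp:lastModifiedBy>Jane Doe</cp:lastModifiedBy>"
    "</cp:coreProperties>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", sample_xml)
buffer.seek(0)

props = read_core_properties(buffer)
print(props["creator"])  # Jane Doe
```

Passing the path of a real .docx file instead of `buffer` should work the same way, as long as the file stores its core properties at the conventional location.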
Metadata and digital privacy
Metadata is everywhere – it’s an indispensable tool in the modern digital world. However, the specificity of metadata poses significant security risks, leaving a lot of room for the exploitation of a user’s private data.
For example, when you take a picture on your smartphone or another internet-enabled device, details like the time the picture was taken, its GPS location (if location services are enabled), and even the camera settings are typically embedded into the image as metadata.
Now, if you were to post any of these pictures online without stripping that metadata, someone may be able to inspect it to find out where you were and when the picture was taken. So, say you post a status update while you’re away from home; the image’s metadata can let someone know that your house is unattended.
Metadata can also cause havoc in the workplace. For instance, if someone wrote or edited an article intending to stay anonymous (as is often the case with politically sensitive pieces in countries where the freedom of the press is limited), accidentally leaked metadata can reveal their identity.
The more unfortunate news is that it looks like it’ll only get worse. Metadata is getting more and more specific, which might be useful for files, but quickly becomes eerie when we apply it to people. We’re talking about department stores, law enforcement, and snoopy intelligence agencies knowing things about you that you didn’t share in the first place.
Stores, for example, can use metadata to understand buying patterns and go as far as sending discount coupons for items their data suggests you’re likely to buy. How much of that is anticipation and how much is invasive influence? I’ll let you make that call for yourself.
On a larger scale, though, metadata is collected by national intelligence agencies, such as America’s NSA, ostensibly on grounds of national security. However, the fact that they hold huge swathes of incredibly personal metadata (like a person’s IP address, gender, sexual orientation, religious background, and ethnicity, all of which can be gathered from a person’s social media account) in databases poses a significant security risk.
It has also raised an unanswered ethical question: does this mass-scale collection and retention of metadata violate our digital privacy?
On a more positive note, users do have some control over their metadata-related digital safety. For example, Microsoft Office allows you to inspect a document’s metadata and remove all personal information from it. Another way to make your metadata harder to exploit is metadata shredding, which mixes genuine metadata with randomly generated information. Furthermore, both individuals and companies should use end-to-end encryption to protect the content of the messages and files they send.