Challenges in archiving social media
Understanding what could possibly be considered social media content on the one hand, and also "archiv...
Instead of focusing on the "look and feel" of the material, its visual form and its multimedia affordances, the "structured data" approach focuses on informational qualities and the raw data that derive from the captured social media content.
The output of this approach is structured textual data, usually in tabular form. Social media platforms make structured data derived from their websites available via API services, i.e., interfaces created so that applications and tools can connect to and interact with the back-end of the platform. By connecting to an API, an interested party can access information that is not normally available to the end user of the social media website, such as aggregated numbers of likes and reposts, location metadata, and unique identifiers for each post.
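As a rough illustration of what "structured data in tabular form" can look like, the following minimal Python sketch flattens a JSON response into CSV rows. The response body and all field names (`id`, `text`, `metrics`, `place`) are hypothetical and do not reproduce the schema of any real platform; an actual API would have its own structure and would require authentication.

```python
import csv
import io
import json

# Hypothetical API response: the field names are assumptions made for
# illustration and do not reproduce any real platform's schema.
SAMPLE_RESPONSE = """
[
  {"id": "1001", "text": "First post",
   "metrics": {"likes": 12, "reposts": 3},
   "place": {"name": "Ghent"}},
  {"id": "1002", "text": "Second post",
   "metrics": {"likes": 0, "reposts": 1}}
]
"""

COLUMNS = ["id", "text", "likes", "reposts", "location"]


def flatten_post(post):
    """Turn one nested post record into a flat row (one value per column)."""
    metrics = post.get("metrics", {})
    place = post.get("place") or {}  # location metadata may be absent
    return {
        "id": post.get("id", ""),
        "text": post.get("text", ""),
        "likes": metrics.get("likes", 0),
        "reposts": metrics.get("reposts", 0),
        "location": place.get("name", ""),
    }


def posts_to_csv(raw_json):
    """Parse a JSON list of posts and return it as CSV text."""
    posts = json.loads(raw_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for post in posts:
        writer.writerow(flatten_post(post))
    return buffer.getvalue()


if __name__ == "__main__":
    print(posts_to_csv(SAMPLE_RESPONSE))
```

Each post becomes one row, and nested values such as the like count become columns. The result can be opened in a spreadsheet or loaded into an analysis tool, but the "look and feel" of the original post is not preserved.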
Structured data are easier to analyze and process with computational tools, making them highly valuable for research. While primarily targeted towards commercial users such as developers, web designers, market analysts, etc., social media platform APIs have been heralded as a valuable source of social media data by social scientists, policy makers, journalists, and others.
Even so, social media platforms still restrict research use and archiving of their data to a great extent. Twitter, one of the most popular data sources for social media research, has taken non-commercial users of its API into consideration and allows researchers to create accounts for academic purposes; the restrictions on sharing and publication still hold, however, and limit what can be done with the data. Unless one is willing to pay for premium access, Twitter's public API only gives access to data from up to 7 days in the past, which means that content published even a month before the time of capture is inaccessible.
This means that it can be challenging for an organization to create complete and reliable collections based solely on access to social media platform APIs.
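The effect of a 7-day access window on the completeness of a collection can be made concrete with a short Python sketch. The capture time and publication dates below are invented for illustration, and the function name `split_by_window` is hypothetical; only the 7-day figure comes from the text above.

```python
from datetime import datetime, timedelta, timezone

ACCESS_WINDOW = timedelta(days=7)  # the public API limit described above


def split_by_window(published_times, capture_time, window=ACCESS_WINDOW):
    """Separate posts still reachable at capture time from those already missed."""
    cutoff = capture_time - window
    reachable = [t for t in published_times if cutoff <= t <= capture_time]
    missed = [t for t in published_times if t < cutoff]
    return reachable, missed


# Invented example data: one capture run and three publication times.
capture = datetime(2020, 3, 31, 12, 0, tzinfo=timezone.utc)
published = [
    datetime(2020, 3, 30, 9, 0, tzinfo=timezone.utc),  # one day before capture
    datetime(2020, 3, 25, 9, 0, tzinfo=timezone.utc),  # six days before capture
    datetime(2020, 3, 1, 9, 0, tzinfo=timezone.utc),   # a month before capture
]

reachable, missed = split_by_window(published, capture)
print(f"reachable: {len(reachable)}, missed: {len(missed)}")
```

Under these assumptions, the post from a month earlier falls outside the window. An archive that relies on such an API therefore needs to capture at least once per window to avoid gaps, and it cannot retroactively collect older material.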