It is safe to say that most tools that output structured data are not the easiest or most intuitive to use. One notable exception is TAGS (Twitter Archiving Google Sheet). TAGS is in essence an app built on Google Sheets that uses the Twitter API to fetch structured data based on queries the user enters in the spreadsheet.
TAGS uses an already authenticated Twitter API app for its operation, but you can use your own Twitter API app if you prefer. The strongest point of the tool is definitely its user-friendliness: you only need to log in to Google Drive, open the TAGS spreadsheet, fill in your search query, and wait for the captured data to be downloaded.
The usual restrictions of the public Twitter API apply, e.g., rate limiting and a search window that reaches only 7 days into the past for capturing older tweets, but all in all, the tool works very smoothly. In testing, it captured an individual user’s tweets from their timeline, as well as tweets based on keyword searches, e.g., “Amsterdam,” “#coronavirus” and others. The tool also allows you to harvest all of a user’s favourited tweets.
TAGS can be configured to capture tweets for extended periods of time, and, unlike most of the tools mentioned above, it does not require monitoring or even that your machine be switched on. A neat extra is the pair of Summary and Dashboard tabs, which let you inspect the harvested content in graphs and numbers, e.g., the number of unique tweets vs. tweets in total, the number of links, the popularity of a particular term over the harvesting period, etc. These features can come in handy for performing an initial appraisal of the harvested content and determining whether it is suitable for preservation, or whether additions and/or filtering are needed.
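For readers who want to reproduce this kind of appraisal outside the spreadsheet, the figures mentioned above can be approximated from a CSV export of the archive sheet. The following is a minimal sketch, not how TAGS itself computes its Summary or Dashboard tabs. The column names (`id_str`, `text`, `created_at`), the ISO-style timestamp format, the `summarize` helper, and the sample rows are all assumptions made for illustration; a real export may need different names or date parsing.

```python
import csv
import io
import re
from collections import Counter

# Rough pattern for links in tweet text (an approximation, not a full URL parser).
URL_PATTERN = re.compile(r"https?://\S+")


def summarize(rows, term):
    """Compute simple appraisal figures from exported tweet rows.

    Assumes each row is a dict with the hypothetical columns
    'id_str', 'text' and 'created_at' (ISO-style, e.g. 2020-03-16T10:00:00).
    """
    total = 0
    unique = {}  # tweet ID -> first row seen with that ID
    for row in rows:
        total += 1
        unique.setdefault(row["id_str"], row)

    link_count = 0
    term_per_day = Counter()
    for row in unique.values():
        link_count += len(URL_PATTERN.findall(row["text"]))
        if term.lower() in row["text"].lower():
            # The first 10 characters of an ISO timestamp are the date.
            term_per_day[row["created_at"][:10]] += 1

    return {
        "total_tweets": total,
        "unique_tweets": len(unique),
        "links": link_count,
        "term_per_day": dict(term_per_day),
    }


# Made-up sample data standing in for a real export.
SAMPLE_EXPORT = """\
id_str,from_user,text,created_at
1,alice,Amsterdam in lockdown https://example.org/a,2020-03-16T10:00:00
2,bob,RT #coronavirus update https://example.org/b https://example.org/c,2020-03-16T12:00:00
2,bob,RT #coronavirus update https://example.org/b https://example.org/c,2020-03-16T12:00:00
3,carol,Quiet day in Amsterdam,2020-03-17T09:00:00
"""

summary = summarize(csv.DictReader(io.StringIO(SAMPLE_EXPORT)), "Amsterdam")
print(summary)
```

Links and term counts are taken over unique tweets only, so a tweet captured twice does not inflate the figures; whether that matches the numbers shown in the spreadsheet has not been verified here.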
However, ease of use comes at the expense of flexibility, as TAGS is not as granular and configurable in its search and crawling capabilities as SFM or twarc. Nevertheless, it is definitely recommended for beginners, and possibly for use in educational projects involving web and social media archiving trainees, donors, and collaborators who are not (yet) comfortable with more technically demanding tools.
Another important note about this tool is that it does not seem to be actively developed and/or maintained; as such, there is no telling how long it will remain viable. Nevertheless, similar tools could be custom-made using apps in Google Cloud or other providers, or on an organization’s own infrastructure.