Search results

  • Start of working group: processing/archiving downloads of Twitter accounts. Does your organisation ever receive downloaded Twitter/X accounts? And would you like to help develop a method to process and store them properly? Sign up for the working group now! Chairs: KB, Sophie Ham and Daniel Steinmeier. Working group starts: 15 February 2024, from 15.00 to 16.30. PLEASE NOTE: changed date. Three longer meetings with the working group, one per theme: analysing social media content,...

  • Summary: Beeld & Geluid is working with the artist duo Eline Jongsma and Kel O'Neill (Jongsma + O'Neill) on a research project to safeguard one of their latest projects. The research centres on the acquisition of Jongsma and O'Neill's 'His Name is My Name' (HNiMN), an interactive Instagram documentary.

  • Contribute to the special web collection "War in Ukraine". An international collaboration of web archivists, coordinated by Vladimir Tybin of the French national library, is building a collection of websites about the war in Ukraine. Anyone can still suggest websites for inclusion until 12 March via this form. You can read more about the collection here.

  • The 20th edition of the largest international conference on web archiving, the Web Archiving Conference of the International Internet Preservation Consortium (IIPC), takes place in the Netherlands this year. The KB is organising this anniversary edition together with Beeld & Geluid in Hilversum. Who is the conference intended for? The web archiving conference is for everyone professionally involved in web archiving: archivists, preservation experts, software developers and researchers...

  • The moment has arrived: yesterday Twitter announced that its API will become paid from 9 February: https://twitter.com/TwitterDev/status/1621026986784337922 There is no further information yet about the consequences for users. Will the Twitter API Academic Track remain available, and what are the consequences for Twarc users? Once the changes are known, they will be incorporated into the guides on tools for archiving Twitter on CEST, e.g. https://www.projectcest.be/wiki/Publicatie:Twitte...

  • Web Archiving Conference 2023 in the Netherlands: call for proposals. The KB and the Instituut voor Beeld & Geluid are jointly organising the international web archiving conference of the IIPC (International Internet Preservation Consortium) on 11 and 12 May 2023 in Hilversum. It is the 20th edition of the conference, which attracts around 200 colleagues from all over the world every year. The last time the WAC (Web Archiving Conference) took place in the Netherlands was in 2011. Veerkracht en Vernieuwing (Resilience and Renewal). The t...

  • My name is Naomi Verlaan. I am a student at the Reinwardt Academie, and for my graduation project I am researching how to make a selection for archiving social media. To start my research, I first need to know how far archival and/or heritage institutions currently are when it comes to archiving social media. I have created a survey for this and would like to ask you to fill it in. It is a short survey of 3 or 7 open questions...

  • In January 2007 the KB archived its first website. To mark the celebration of 'Fifteen years of web archiving' at the KB, the library had three explainer videos made. They can be found here: The past of web archiving, Video 1: https://youtu.be/WGZltMnSQgE The present of web archiving, Video 2: https://youtu.be/lTK7eW_M14E The future of web archiving, Video 3: https://youtu.be/Seq44U-jXJE

  • Update on tools for social media archiving. In 2020, Zefi Kavvadia of the IISG investigated tools for archiving social media, as part of an NDE project. She reviewed no fewer than nine tools, including Webrecorder, Social Feed Manager and TAGS. Some tools are very easy to use, others require quite a bit of technical expertise. Documentation is also often lacking. To make the results of this research even more accessible and to, in the futu...

  • Social media archiving is one of the latest hot topics in digital preservation and in information and records management. It is now widely recognized that the contemporary record, whether it is meant for evidence, research, memory or any combination of the above, can and very often will contain material originating from various social media platforms. Background of the project: in the Netherlands, too, there has been mounting interest recently in developing and establishing social media arch...

  • Social networking sites, e.g. Facebook, LinkedIn, WhatsApp, Telegram, Viber, Signal, iMessage, Facebook Messenger, WeChat; dating apps such as Tinder, Bumble, Inner Circle, Grindr – sites that connect friends, or groups based on interests or professions. Blogs, e.g. Medium, WordPress, Blogger, and Tumblr; Twitter can be considered micro-blogging. Image and video sharing platforms, e.g. YouTube, Vimeo, Tumblr, Instagram, Snapchat, Imgur, TikTok, and Pinterest. Discussion sites, e.g. Reddit, Quora ...

  • The turn to more data-intensive access methods for web and social media archives, as indicated by the use of big data and digital humanities methods to analyze social media content, calls for capturing social media in formats appropriate for these activities. Collections made up of structured data, usually in formats like JSON, CSV, and XLSX, are more amenable to computational methods such as network analysis, topic modelling, and many other visualization and analysis methods. Indeed, critics of...
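As a minimal illustration of why structured exports lend themselves to such methods, the sketch below builds a small mention network from a handful of hypothetical JSON-style post records and computes degree centrality with networkx; the field names (`author`, `mentions`) are assumptions for the example, not a fixed platform schema.

```python
# Minimal sketch: network analysis on structured social media data (pip install networkx).
# The records and field names (author, mentions) are hypothetical examples.
import networkx as nx

posts = [
    {"author": "alice", "mentions": ["bob", "carol"]},
    {"author": "bob",   "mentions": ["alice"]},
    {"author": "carol", "mentions": ["alice", "bob"]},
]

G = nx.DiGraph()
for post in posts:
    for mentioned in post["mentions"]:
        # An edge from the author to each account they mention.
        G.add_edge(post["author"], mentioned)

# Degree centrality gives a quick sense of who sits at the centre of the conversation.
print(nx.degree_centrality(G))
```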

  • The WARC format is widely accepted enough to be considered one of the default formats for storing captured content from the web. It followed its predecessor, the ARC, as the main file format in use by the Internet Archive, and is maintained by the International Internet Preservation Consortium (IIPC). The rationale behind the WARC format is that one file format for web archiving should preferably be able to hold not only the archived resources themselves, but also metadata about the resources...
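For readers who want to see what such a container looks like in practice, the sketch below iterates over a WARC with the warcio library and prints each record's type and target URI; `example.warc.gz` is a placeholder filename.

```python
# Minimal sketch: inspecting WARC records with warcio (pip install warcio).
# 'example.warc.gz' is a placeholder for an existing WARC file.
from warcio.archiveiterator import ArchiveIterator

with open("example.warc.gz", "rb") as stream:
    for record in ArchiveIterator(stream):
        # Each record carries its own headers: request, response, metadata, etc.
        print(record.rec_type,
              record.rec_headers.get_header("WARC-Target-URI"))
```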

  • The two general approaches to social media archiving presented here ("look and feel" and "structured data") also have implications for file format selection, which in turn has implications for preservation and collection quality. The choices made when capturing and preserving, part of which is selecting appropriate formats according to one's purpose, will affect the possible uses to which the collection can be put, and by extension, the types of users for whom the collection will be useful.

  • Summary: This is a list of sources referenced in this wiki. Care has been taken to include every source; however, additions and corrections for anything that might have been accidentally overlooked are of course welcome!

  • It is safe to say that most of the tools that output structured data are not the easiest or most intuitive to use. One notable exception is TAGS (Twitter Archiving Google Sheet). TAGS is in essence an app built on Google Sheets that uses the Twitter API to fetch structured data based on queries the user enters in the spreadsheet. TAGS relies on an already authenticated Twitter API app for its operation, but you can use your own Twitter API app if you prefer. The strongest poi...
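Because TAGS stores its output in a spreadsheet, working with the collected data afterwards is straightforward. The sketch below assumes the sheet has been exported to CSV; the filename and column names (`from_user`, `text`) are assumptions and should be checked against your own export.

```python
# Minimal sketch: summarising a TAGS export after downloading the sheet as CSV.
# The filename and column names (from_user, text) are assumptions; verify them
# against your own export before use.
import pandas as pd

tweets = pd.read_csv("tags_export.csv")

# Most active accounts in the collection.
print(tweets["from_user"].value_counts().head(10))

# A rough keyword filter over the tweet text.
matches = tweets[tweets["text"].str.contains("archief", case=False, na=False)]
print(len(matches), "tweets mention the keyword")
```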

  • Munin (Munin-Indexer) uses Docker to wrap different scraping and archiving tools together and offers a scraping solution for Facebook, Instagram, and VKontakte. It indexes and scrapes posts, then crawls and captures them, and finally uses pywb to display them. Suitable for public social media content: the important thing to note about Munin is that it can only archive public posts, i.e. only posts that do not sit behind a log-in. This means that it is useful for archiving...
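pywb can also be used on its own to replay captured WARCs. A minimal sketch, assuming pywb is installed and a WARC file is at hand, is to create a collection and add the file via pywb's wb-manager command, wrapped here in Python's subprocess for illustration; the collection and file names are placeholders.

```python
# Minimal sketch: preparing a WARC for replay with pywb (pip install pywb).
# 'my-collection' and 'capture.warc.gz' are placeholder names.
import subprocess

subprocess.run(["wb-manager", "init", "my-collection"], check=True)
subprocess.run(["wb-manager", "add", "my-collection", "capture.warc.gz"], check=True)

# Afterwards, running the 'wayback' command starts a local replay server
# (by default on port 8080) where the captured pages can be browsed.
```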

  • Note: according to its GitHub page, this tool is no longer in active development at the time of this writing (January 2022). However, it is still available for download and still functions as expected. In a way, crocoite is a good example of a tool arising from the open-source community that could prove problematic to use in a professional setting because of a lack of ongoing support. As browser-based crawling seems to be becoming central to the practice of archiving the dynamic web including ...

  • Initially known as Browsertrix, the Browsertrix Crawler is the latest and revamped version of what used to be a system of multiple browser-based crawlers that worked together to capture complex web content, such as social media. Browsertrix Crawler is built by the team behind the online web recording service Conifer and the desktop app ArchiveWeb.page (formerly known as Webrecorder and Webrecorder Player respectively) and uses the Chrome and Chromium browsers to interact with web pages and re...

  • For those looking for large-scale harvesting solutions, Brozzler, like Browsertrix, is an interesting choice. Brozzler was developed and is still being maintained by the Internet Archive, and it is already used by organizations such as the Portuguese Web Archive. It is a browser-based crawler which uses Chrome or Chromium to access web content and harvest it in a WARC file. Brozzler is one of the newer-generation capturing tools which leverages browser technologies to interact with pages and...

  • Part of the research into social media archiving tools performed by NDE/IISG had to do with finding out what the requirements would be to consider these tools suitable for different kinds of use and users. In these pages, more general quality attributes for social media archiving software are presented that could be relevant for any type of tool, free and open-source or not. Additionally, the functional requirements that were first defined by the project team to select which open-s...

  • This wiki page includes the tool surveys that NDE/IISG performed between 2020 and 2021. The aim is to keep these pages as up to date as possible, and to encourage other organizations involved with web archiving in the Netherlands to contribute to them with content they consider useful.

  • Instead of focusing on the "look and feel" of the material, its visual form and its multimedia affordances, the "structured data" approach focuses on informational qualities and the raw data that derive from the captured social media content. The output of this is structured textual data, usually in tabular form. Social media platforms make structured data derived from their websites available via API services, i.e., specific interfaces created for applications and tools to connect and intera...
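To make the contrast with "look and feel" concrete, the sketch below flattens a few hypothetical JSON post records, roughly as an API might return them, into a tabular CSV file; the field names are illustrative only, since every platform uses its own schema.

```python
# Minimal sketch: flattening API-style JSON posts into tabular CSV.
# The field names (id, author, text, created_at) are illustrative, not a real schema.
import csv

posts = [
    {"id": "1", "author": "alice", "text": "First post", "created_at": "2022-01-01T10:00:00Z"},
    {"id": "2", "author": "bob",   "text": "A reply",    "created_at": "2022-01-01T10:05:00Z"},
]

with open("posts.csv", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=["id", "author", "text", "created_at"])
    writer.writeheader()
    writer.writerows(posts)
```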

  • This method is based on common web archiving practices, which makes sense as social media archiving can be seen as an offshoot of web archiving. The most common method of web archiving, namely web crawling or web harvesting, attempts to preserve the so-called “look and feel” of online content, meaning the layout, structure, and style of a website, as well as its navigational features, like buttons and menus. However, this has proven to be relatively limited in what it can accomplish with soci...
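As a minimal sketch of the "look and feel" approach, the example below uses warcio's capture_http helper together with requests to record a single page fetch, including its HTTP headers, into a WARC file; a real crawl would of course also follow links and fetch embedded resources such as images, stylesheets and scripts.

```python
# Minimal sketch: capturing a single page fetch into a WARC with warcio + requests.
# 'capture.warc.gz' and the URL are placeholders.
from warcio.capture_http import capture_http
import requests  # import after capture_http so its traffic can be recorded

with capture_http("capture.warc.gz"):
    requests.get("https://example.com/")
# The resulting WARC holds both the request and the response records.
```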

  • Social media appear in many shapes and forms, are aimed at many different audiences and serve many different purposes. This variety also affects how we can think about what a social media archival collection is, and how it is created. In this section, two broad approaches to social media archiving will be presented, i.e. "look and feel" and "structured data". These are not mutually exclusive, and in many cases it could be preferable or advisable to use both (if time and resources allow it). B...

  • Understanding what could possibly be considered social media content on the one hand, and "archive-worthy" social media content on the other, is important for tool selection and assessment, but also for acknowledging that at times the solution might not lie with tools per se. The nature of the content is significant for tool selection: for example, capturing Instagram content without capturing its media will result in a collection that is less rich than it could be, considering how...

  • Social media are identified with the rise of Web 2.0 and the era of increasing online interactivity, personalization, the use of mobile devices and cloud computing. A broad definition like the one proposed by Treem et al. (2016, p. 770), who see social media as technologies that “create a way for individuals to maintain current relationships, to create new connections, to create and share their own content, and, in some degree, to make their own social networks observable to others” could b...

  • The quality attributes mentioned below refer to the features of a tool as software itself, i.e. the ways in which it achieves what it has been designed to do. These attributes affect the quality of the user experience the tool offers and its sustainability (Chung et al. 2000). The requirements listed below were taken from the practice of software testing and software selection. While most of the academic and professional literature on software testing and software selection seems to be geared towar...

  • The list of functional requirements below is based on an earlier project carried out within IISG which focused on workflows for acquiring and preserving born-digital materials in the broad sense, and another project that looked into web archiving tools and workflows specifically. For this NDE/IISG research on social media archiving tools, we were particularly interested in testing tools and their outcomes with a focus on how they can be used primarily in cultural heritage and secondarily in l...

  • The Nationaal Register Webarchieven has been refreshed! Here you can find all websites archived by heritage institutions in the Netherlands. In recent months, with the help of NDE, a lot of work has gone into making the website better comply with the legal requirements for accessibility. The result is now online. Does your organisation also archive websites, and would you like to have them included in the Register? Then get in touch with us!

  • Guidance for web harvesting by heritage institutions. Are you simply allowed to preserve websites or web content? Besides practical and technical challenges, legal and policy questions also come into play. The Institute for Information Law (IVIR) of the University of Amsterdam, commissioned by the WODC, investigated what guidance exists to safeguard legal certainty around web harvesting. Read the full report here.

  • Published on the SIDN website on Thursday 23 September 2021. In the Netherlands the internet is not yet 40 years old, but it has long been impossible to imagine daily life without it. A great deal of information of major historical value can be found online. That is why it is important that the websites containing this information are archived, so that researchers and other interested parties can use these sources in the future. And if that does not happen? Then there is a chance that the homepages will disappear for...