Skill level: Advanced
Language: English, Spanish
Format: Tutorial
Media type: Images, Text
Published: 3 March 2014
Modified: 4 February 2021
ID: 10.46430/phen0035
Data Mining the Internet Archive Collection
Caleb McDaniel
The collections of the Internet Archive include many digitized historical sources. Many contain rich bibliographic data in a format called MARC. In this lesson, you’ll learn how to use Python to automate the downloading of large numbers of MARC files from the Internet Archive and the parsing of MARC records for specific information such as authors, places of publication, and dates. The lesson can be applied more generally to other Internet Archive files and to MARC records found elsewhere.
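As a rough illustration of the parsing step described above, the following sketch pulls an author, a place of publication, and a date out of a single MARC record. It makes several assumptions that do not come from the lesson itself: the record is in MARCXML form (the lesson may work with binary MARC and a dedicated library instead), the sample record is invented, and the helper names `first_subfield` and `summarize_record` are hypothetical. The download step is left out because it needs network access and real item identifiers.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only (not a real catalog entry).
SAMPLE_MARCXML = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane,</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Boston :</subfield>
    <subfield code="b">Example Press,</subfield>
    <subfield code="c">1850.</subfield>
  </datafield>
</record>
"""


def first_subfield(record, tag, code):
    """Return the first matching subfield's text, or None if it is absent.

    Trailing cataloging punctuation (comma, colon, semicolon, slash) is
    stripped; periods are kept so that initials are not damaged.
    """
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    if node is None or node.text is None:
        return None
    return node.text.strip(" ,:;/")


def summarize_record(xml_text):
    """Extract author (100$a), place (260$a), and date (260$c) from one record."""
    record = ET.fromstring(xml_text.strip())
    return {
        "author": first_subfield(record, "100", "a"),
        "place": first_subfield(record, "260", "a"),
        "date": first_subfield(record, "260", "c"),
    }


summary = summarize_record(SAMPLE_MARCXML)
print(summary)
```

Fields 100 (main entry, personal name) and 260 (publication information) are standard MARC 21 tags, but real records vary: some lack these fields entirely or record publication details in field 264, which is why the helper returns None rather than raising an error.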
This resource is available under the following license: