Skill level: Beginner
Language: English, Spanish, French, Portuguese
Format: Tutorial
Media type: Image media, Text media
Published: 2013-08-05
Modified: 2025-04-11
ID: 10.46430/phen0023
Cleaning Data with OpenRefine
Seth van Hooland, Ruben Verborgh, Max De Wilde
Don’t take your data at face value. That is the key message of this tutorial, which focuses on how scholars can diagnose and act upon the accuracy of data. In this lesson, you will learn the principles and practice of data cleaning, as well as how OpenRefine can be used to perform four essential tasks that will help you to clean your data: (1) remove duplicate records, (2) separate multiple values contained in the same field, (3) analyse the distribution of values throughout a data set, and (4) group together different representations of the same reality. These steps are illustrated with a series of exercises based on a collection of metadata from the Powerhouse Museum, demonstrating how (semi-)automated methods can help you correct the errors in your data.
This resource is available under the following license: