3.3 Renaming, recoding, and sorting data

3.3.1 Renaming

To rename variables, we will combine the function names() with the indexing approach (i.e., square brackets). The assignment operator then assigns the new name to the selected variable of our data set. For example, to give y (the number of friends) a more descriptive name (i.e., friends), we will assign the new name to the 9th column of the data set, which is where y is stored.


names(affective.dis)[9] <- 'friends'
head(affective.dis)
##   id treatment year    sex age depression life.satis alone friends
## 1  1         1 2020   male  23         22   53.50930     0       0
## 2  2         1 2020 female  19         19   42.10586     0       4
## 3  3         1 2020   male  24         14   22.73856     1       2
## 4  4         1 2020 female  21         23   18.42421     0      11
## 5  5         1 2020   male  20         16   68.70935     1       2
## 6  6         1 2020 female  21         11   40.22030     1       0
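
Renaming by position depends on the column order staying the same. As a minimal sketch of an alternative, we can select the column by its current name instead. The data frame toy below is hypothetical and only stands in for affective.dis.

```r
# Hypothetical toy data frame (not the affective.dis data)
toy <- data.frame(id = 1:3, alone = c(0, 0, 1), y = c(0, 4, 2))

# names(toy) == 'y' is TRUE only for the column currently called y,
# so the new name is assigned to that column wherever it is located
names(toy)[names(toy) == 'y'] <- 'friends'
names(toy)
```

This version keeps working if columns are added or reordered before the renaming step.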

3.3.2 Recoding

We can recode variables in a different column (i.e., creating a new variable with the new codes and keeping the old variable) or in the same column (i.e., overwriting the existing variable). In the following example, we create a new variable called satis. This new ordinal variable splits the Satisfaction with life scale into three ordered bands (i.e., Low, Medium, High) with cut-off points set at 26 and 47: scores below 26 are Low, scores from 26 up to (but not including) 47 are Medium, and scores of 47 or above are High. We recode the values of life.satis into the new variable satis using conditional subsetting with logical operators.

  
affective.dis$satis[affective.dis$life.satis < 26] <- 'Low'
affective.dis$satis[affective.dis$life.satis >= 26 &
                      affective.dis$life.satis < 47] <- 'Medium'
affective.dis$satis[affective.dis$life.satis >= 47] <- 'High'

# Move satis (column 10) next to life.satis; friends and alone also swap places
affective.dis <- affective.dis[c(1:7, 10, 9, 8)]
head(affective.dis)
##   id treatment year    sex age depression life.satis  satis friends alone
## 1  1         1 2020   male  23         22   53.50930   High       0     0
## 2  2         1 2020 female  19         19   42.10586 Medium       4     0
## 3  3         1 2020   male  24         14   22.73856    Low       2     1
## 4  4         1 2020 female  21         23   18.42421    Low      11     0
## 5  5         1 2020   male  20         16   68.70935   High       2     1
## 6  6         1 2020 female  21         11   40.22030 Medium       0     1
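
The conditional assignments above store satis as text, so R does not know that Low < Medium < High. As a hedged alternative (not the approach used in this section), the base function cut() can produce the same three bands as an ordered factor. The vector life.satis below is hypothetical; its last two values sit exactly on the cut-off points to show which band they fall into.

```r
# Hypothetical scores; 26 and 47 are included to check the boundaries
life.satis <- c(53.5, 42.1, 22.7, 26, 47)

# right = FALSE closes each band on the left:
# [-Inf, 26) is Low, [26, 47) is Medium, [47, Inf) is High
satis <- cut(life.satis,
             breaks = c(-Inf, 26, 47, Inf),
             labels = c('Low', 'Medium', 'High'),
             right = FALSE,
             ordered_result = TRUE)
as.character(satis)
## [1] "High"   "Medium" "Low"    "Medium" "High"
```

With right = FALSE, the bands match the logical conditions used above (< 26, >= 26 & < 47, >= 47).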

3.3.3 Sorting

If we are interested in sorting our data set for a visual inspection, we will use the function order(). Setting the argument decreasing to FALSE (the default) shows the lowest values first, whereas setting it to TRUE displays the highest values first.

  
affective.dis <- affective.dis[order(affective.dis$depression,
                                   decreasing = TRUE), ]
head(affective.dis)
##    id treatment year    sex age depression life.satis  satis friends alone
## 7   7         1 2020   male  30         29   32.00596 Medium       9     0
## 4   4         1 2020 female  21         23   18.42421    Low      11     0
## 1   1         1 2020   male  23         22   53.50930   High       0     0
## 12 12         1 2020 female  26         22   48.86415   High      13     0
## 11 11         1 2020   male  27         20   40.08320 Medium      18     1
## 2   2         1 2020 female  19         19   42.10586 Medium       4     0

We might also be interested in inspecting depression scores as a function of two or more variables. For instance, to inspect depression by age, we will pass two sorting variables to the function order(). First, the observations are ordered by age. Then, within each age, the observations are ordered by depression.


affective.dis <- affective.dis[order(affective.dis$age,
                                   affective.dis$depression), ]
head(affective.dis)
##    id treatment year    sex age depression life.satis  satis friends alone
## 36 36         3 2020 female  18          4   41.11086 Medium       4     0
## 34 34         3 2020 female  18          9   62.02641   High      10     0
## 29 29         3 2020   male  19          9   58.75117   High       2     0
## 17 17         2 2020   male  19         13   38.76411 Medium      10     1
## 10 10         1 2020 female  19         14   54.88683   High      18     1
## 2   2         1 2020 female  19         19   42.10586 Medium       4     0
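
The argument decreasing, when given as a single value, applies to every sorting variable. As a minimal sketch of mixing directions, we can place a minus sign in front of a numeric variable to reverse its order. The data frame toy below is hypothetical.

```r
# Hypothetical toy data frame (not the affective.dis data)
toy <- data.frame(id = 1:5,
                  age = c(19, 18, 19, 18, 19),
                  depression = c(14, 9, 19, 4, 9))

# Ascending age; within each age, descending depression
# (the minus sign only works for numeric variables)
toy <- toy[order(toy$age, -toy$depression), ]
toy$id
## [1] 2 4 3 1 5
```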

When we use the function order() and assign the result back to the data set, we change the original order of its rows. It is usually convenient to restore the original order afterwards by using the variable id (1:n). Remember to keep the comma within the square brackets and to leave the column position empty (i.e., [rows, ]), so that the rows are sorted by id and all the columns are kept.


affective.dis <- affective.dis[order(affective.dis$id), ]
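
As a small check of this step, the sketch below sorts a hypothetical data frame by depression and then restores the original row order with id. The names toy, sorted, and restored are illustrative only.

```r
# Hypothetical toy data frame (not the affective.dis data)
toy <- data.frame(id = 1:4, depression = c(22, 19, 14, 23))

# Sort by depression, highest scores first
sorted <- toy[order(toy$depression, decreasing = TRUE), ]

# The comma with an empty column position keeps every column;
# without it, the indices would be read as a column selection
restored <- sorted[order(sorted$id), ]
restored$id
## [1] 1 2 3 4
```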