Renaming variables in Bulk
I describe an extended option of rename to easily rename your variables
Stata
Tips
Author
Fernando Rios-Avila and Fahad Mirza
Published
June 2, 2023
Acknowledgements
This tip was brought to you by Fahad Mirza. It's one of those little things I have found useful, but usually forget, and then have to look up all over again.
Luckily, I now have my own site, where I can save and store this information! I give, however, total credit to Fahad.
The problem
The problem is simple. Sometimes you have a series of variables with somewhat unappealing names. I particularly dislike names that are too long. While some people like long, descriptive variable names, I find them distracting.
My preference is to have variables with good variable labels and, whenever necessary, good value labels. For the variables themselves, I like short, descriptive names, and I also like to have them numbered sequentially.
How do we do that?
Obviously, the first approach is to go one by one. In fact, not too many Stata versions ago, that was the only option. That task could be done with a loop, as follows:
sysuse auto, clear
* This loop iterates over all variable names in the dataset
foreach i of varlist * {
    local j = `j'+1
    ren `i' var_`j'
}
describe *
(1978 automobile data)
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
var_1           str18   %-18s                 Make and model
var_2           int     %8.0gc                Price
var_3           int     %8.0g                 Mileage (mpg)
var_4           int     %8.0g                 Repair record 1978
var_5           float   %6.1f                 Headroom (in.)
var_6           int     %8.0g                 Trunk space (cu. ft.)
var_7           int     %8.0gc                Weight (lbs.)
var_8           int     %8.0g                 Length (in.)
var_9           int     %8.0g                 Turn circle (ft.)
var_10          int     %8.0g                 Displacement (cu. in.)
var_11          float   %6.2f                 Gear ratio
var_12          byte    %8.0g      origin     Car origin
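The loop above works because an undefined local macro expands to nothing, so the first pass simply evaluates +1. If a local macro named j already holds a value earlier in the same do-file, the numbering would start from that value instead. A slightly more defensive variation (a sketch, not part of Fahad's tip) sets the counter explicitly before the loop:

```stata
sysuse auto, clear

* Start the counter explicitly so the numbering always begins at 1
local j = 0

* The varlist is expanded once, before the loop starts renaming
foreach v of varlist * {
    * Increment the counter, then use it in the new name
    local ++j
    rename `v' var_`j'
}

describe
```

The result should match the listing above; the only difference is that it no longer depends on j being undefined beforehand.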
There is a better way
While the process above is rather simple, there is a better way of doing this, as Fahad suggests: using some of the extended options of rename.
Let's first replicate the result above, using the code that Fahad suggested.
sysuse auto, clear
ren (*) (var_#), addnumber
describe
(1978 automobile data)

Contains data from C:\Program Files\Stata17/ado\base/a/auto.dta
 Observations:            74                  1978 automobile data
    Variables:            12                  13 Apr 2020 17:45
                                              (_dta has notes)
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
var_1           str18   %-18s                 Make and model
var_2           int     %8.0gc                Price
var_3           int     %8.0g                 Mileage (mpg)
var_4           int     %8.0g                 Repair record 1978
var_5           float   %6.1f                 Headroom (in.)
var_6           int     %8.0g                 Trunk space (cu. ft.)
var_7           int     %8.0gc                Weight (lbs.)
var_8           int     %8.0g                 Length (in.)
var_9           int     %8.0g                 Turn circle (ft.)
var_10          int     %8.0g                 Displacement (cu. in.)
var_11          float   %6.2f                 Gear ratio
var_12          byte    %8.0g      origin     Car origin
-------------------------------------------------------------------------------
Sorted by: var_12
     Note: Dataset has changed since last saved.
This code is much shorter and cleaner. rename takes all the variables listed in the first set of parentheses and renames them following the pattern in the second set; with the addnumber option, the # in var_# is replaced by a sequence number (1, 2, 3, ...). Of course, rename has quite a few other options that you may find useful. Just type help rename group to see all the other extended options.
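To give a flavor of those other forms, here is a minimal sketch on the auto data. It is an illustration rather than part of Fahad's tip: the new names (cost, miles, v_#) and the starting number 101 are arbitrary choices, and help rename group remains the reference for the exact rules.

```stata
sysuse auto, clear

* Rename several variables at once: price -> cost, mpg -> miles
rename (price mpg) (cost miles)

* Change the case of every variable name (here, to uppercase)
rename *, upper

* Number all variables sequentially, starting the sequence at 101
rename (*) (v_#), addnumber(101)

describe
```

The three commands are applied in order, so the last one renames the already uppercased variables to v_101 through v_112.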
Before ending this tip, here is something else you may find useful: the dryrun option. With it, none of the variable names change; instead, you see a report of how the variable names would change if the command were executed.
sysuse auto, clear
ren (*) (var_#), addnumber dryrun
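One way to use dryrun in practice is as a preview step before committing. The sketch below assumes the auto data and uses confirm variable only to show that the names are untouched after the dry run; it is an illustrative workflow, not something prescribed by the tip.

```stata
sysuse auto, clear

* Preview only: report the planned renames without applying them
rename (*) (var_#), addnumber dryrun

* The original names are still there; confirm would stop with an error otherwise
confirm variable make price foreign

* Once the report looks right, run the same command without dryrun
rename (*) (var_#), addnumber
confirm variable var_1 var_12
```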