# Make Scatters more Colorful: `mscatter`

Scatters across groups

## Introduction

The program `mscatter` came as an answer to two questions, often asked on `statalist`.

• How to produce Scatter plots `by` groups? to each group is labeled?
• How can I change the colors, if I have more 15 groups in my graph?

Doing plots that produce this is simple. One just needs to create multiple scatterplots, and combine them with `twoway`. This will produce you this kind of plot. For example, if I had to produce and join 4 scatter plots (across different groups) I would do the following:

``````twoway (scatter y x if z ==1, options) ///
(scatter y x if z ==2, options) ///
(scatter y x if z ==3, options) ///
(scatter y x if z ==4, options), ///
overall options``````

Doing this for multiple groups, however, can be a pain. Thus, I decided to write a do-file that address this problem.

## `mscatter`: scatter with Multiple groups

First some minimal setup. The latest version of `mscatter` is available from `SSC`. You can install it using the following:

``ssc install mscatter``

In the examples below, I will also use the scheme `white2` within [`color_style`]](https://friosavila.github.io/stataviz/stataviz1.html). See instructions there if you want to install all its components.

Now, how do you produce a scatter with multiple groups. The task is, in fact, simple. If I were to use `auto.dta` dataset, it would be something like this:

``````sysuse auto, clear
two scatter price mpg if foreign == 1 || ///
scatter price mpg if foreign == 0
graph export ms1.png, width(1200) replace   ``````

In fact, I have been pointted out to one command by Nick Cox, named linkplot, which may also help you produce similar plots.

``````ssc install linkplot
graph export ms2.png, width(1200) replace   ``````

And produces pretty much the same. The limitation: only works with up to 15 groups. But who needs more right? Well, if you need more, you can use `mscatter`!

``````set scheme white2                   // Lets use white scheme
mscatter price mpg , over(rep78)    /// Upto here normal. I use over instead of By
alegend  legend(cols(5)) msize(3) /// add a legend with large dots
by(foreign)                       // and groups by foreign
graph export ms3.png, width(1200) replace   ``````

But those are easy to do by hand. What if you have many groups. Lets see with some different data:

``````webuse nlswork
mscatter ln_wage ttl_exp ,     /// normal
over(age)            /// but over many groups!
colorpalette(magma ) /// all color coded
alegend               // with a legend to match
graph export ms4.png, width(1200) replace           ``````

And as I show you before, mscatter can be combined with by()

``````mscatter ln_wage ttl_exp , over(grade) ///
colorpalette(magma ) by(race, legend(off)) // legend(off) should go here
graph export ms5.png, width(1200) replace       ``````

Only current limitation, you can use weights (for size of markers) but it will fail, if you have groups without observations based on over and by (difficult to explain and reproduce).

## Other features

`mscatter` has other features that might put it apart from other commands. While the original version was written to take advantage of frames. The current version will also work with earlier Stata versions (at least V14), using `preserve/restore` commands.

In addition to this, `mscatter` may also allow you to add fitted plots to your scatter!

Lets see how this works:

``````webuse nlswork, clear
set seed 10
keep if runiform()<.2
color_style tableau
mscatter ln_wage ttl_exp , over(race)  ///
mfcolor(%5) mlcolor(%5) alegend ytitle("Log Wages")

graph export ms6.png, width(1200) replace         ``````

As you can see, this figure plots log of wages agains total experience across race groups. In addition to this, however, it also adds a fitted plot using quadratic fit using the option `fit(qfitci)`. You can add here other options that are command specific. Of course, with the addition of this, you may want to drop the scatter plot all together

``````mscatter ln_wage ttl_exp , over(race)  ///
fit(qfitci) mfcolor(%5) mlcolor(%5) ///
noscatter  /// Asks not to show the scatter points
alegend ytitle("Log Wages")

graph export ms7.png, width(1200) replace         ``````

## Conclusions

`mscatter` is a data-visualization utility that may enable you to create informative graphs capturing correlations between 2 variables.

While its use may be limited, I believe that those who need it, may find this a good tool to have for their arsenal.

Lastly, try checking the helpfile, where I add other options, not described here.

Til next time!