Simulating data from a correlation matrix in SPSS

Like most psychologists, I was trained in statistics using SPSS.  It was only after I had completed my Honours research that I realised there were other programs out there that might be better.  Some of those programs might even be able to do things that SPSS could not!  So I worked through AMOS, Mplus, RUMM2030, IRTPRO, and others, most of which I used only for specific tasks rather than as a true replacement.  But I always ended up back at SPSS when I could.

One of the things that I often wanted to do in SPSS was to create a simulated dataset from a correlation matrix.  Simulation studies are fascinating, and I wanted to be able to start playing with them myself.  Digging around online revealed a couple of interesting options, including simulating the active dataset.  Then I stumbled on this code, and managed to get it working!  I can’t credit where it came from, but if you are looking to simulate a dataset based on a published correlation matrix, this could help you out.

********************************************************

* Encoding: UTF-8.
*Input your desired correlation matrix and save it to a file.
*The example provided is based on four variables.
matrix data variables=v1 to v4
/contents=corr.
begin data.
1
.700 1
.490 .490 1
.408 .408 .408 1
end data.
save outfile='C:\Users\shane\Downloads\corrmat.sav'
/keep=v1 to v4.

*Generate raw data. Change the upper limit of the #i loop from 100 to your desired N.
*Change x(4) and the upper limit of the #j loop from 4 to the size of your correlation matrix, if different.
new file.
input program.
loop #i=1 to 100.
vector x(4).
loop #j=1 to 4.
compute x(#j)=rv.normal(0,1).
end loop.
end case.
end loop.
end file.
end input program.
execute.

*Use the FACTOR procedure to generate principal components, which
*will be uncorrelated and have mean 0 and standard deviation 1 for each
*variable. Change "x1 to x4" to reflect a different number of variables if
*necessary, and if doing this then also change the number of factors
*(components) to generate and save.
factor var=x1 to x4
/criteria=factors(4)
/save=reg(all z).

*Use the MATRIX procedure to transform the uncorrelated variables to
*a set of correlated variables, and save these to a new file. These variables
*will be named COL1 to COL4 (or up to whatever is the chosen number of
*variables).
matrix.
get z /var=z1 to z4.
get r /file='C:\Users\shane\Downloads\corrmat.sav'.
compute out=z*chol(r).
save out /outfile='C:\Users\shane\Downloads\generated_data.sav'.
end matrix.

*Get the generated data and test correlations.
get file='C:\Users\shane\Downloads\generated_data.sav'.

*Rename variables if desired. Replace "var1 to var4" with appropriate
*variable names.
rename variables (col1 to col4=var1 to var4).

*Test correlations.
correlations var1 to var4.

********************************************************
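
If you are curious why the MATRIX step works: CHOL returns an upper triangular matrix U with T(U)*U equal to the original matrix. Because the saved component scores are uncorrelated with unit variance, multiplying them by U gives variables whose correlation matrix is T(U)*U, which is the target matrix. That is also why the final correlations should match the target to rounding rather than only approximately. The short check below is a minimal sketch, assuming corrmat.sav has already been saved by the syntax above; the matrix names u and check are my own and are only there for illustration.

```spss
*Sketch only: confirm that the Cholesky factor reproduces the target matrix.
*Assumes corrmat.sav was created by the MATRIX DATA step above.
*The names u and check are illustrative, not part of the original syntax.
matrix.
get r /file='C:\Users\shane\Downloads\corrmat.sav'.
compute u=chol(r).
compute check=t(u)*u.
print r /title='Target correlation matrix'.
print check /title='T(U)*U rebuilt from the Cholesky factor'.
end matrix.
```

The two printed matrices should be identical. If CHOL reports an error instead, the matrix you typed in is probably not positive definite, which can happen when published correlations have been rounded or mistyped.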
