## Ordinal variables in semopy

We consider a variable to be ordinal if it is categorical rather than continuous (for instance, if it can be encoded as an integer) and if its values can be meaningfully sorted, that is, a total order can be imposed on them. A simple example of an ordinal variable is size encoded as "Tiny", "Small", "Average", "Big".
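As a minimal sketch of this definition, the size example can be encoded as integers that preserve the order of the levels. The names `SIZE_LEVELS`, `SIZE_CODES` and `encode_sizes` are hypothetical helpers introduced here for illustration and are not part of semopy.

```python
# Levels listed from smallest to largest; the position defines the order.
SIZE_LEVELS = ["Tiny", "Small", "Average", "Big"]
SIZE_CODES = {level: code for code, level in enumerate(SIZE_LEVELS)}


def encode_sizes(values):
    """Map size labels to integer codes that preserve their order."""
    return [SIZE_CODES[value] for value in values]


observed = ["Big", "Tiny", "Average", "Small", "Tiny"]
codes = encode_sizes(observed)

# Because the codes respect the total order, sorting by code sorts by size.
ordered = sorted(observed, key=SIZE_CODES.__getitem__)
print(codes)
print(ordered)
```

The integer codes carry only the order of the levels; the distances between them are an artifact of the chosen encoding.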

There are two ways to treat ordinal variables in semopy.

## Fixed effects

First, if the ordinal variables in your SEM model are also exogenous, it should be all right to treat them as fixed effects. This is done automatically if you use `ModelMeans` or `ModelEffects`. However, the results depend on how the ordinal variables are encoded in the data, so a different encoding can change them.
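To see why the encoding matters, the sketch below regresses a response on the same four ordered levels coded in two different ways. It uses a plain least-squares slope rather than semopy, and the data, the helper `ols_slope` and both encodings are made up for illustration.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: one observation per size level, from "Tiny" to "Big".
response = [1.0, 2.0, 3.0, 4.0]
equally_spaced = [0, 1, 2, 3]  # "Tiny"=0, "Small"=1, "Average"=2, "Big"=3
doubling = [1, 2, 4, 8]        # same order, different spacing

slope_equal = ols_slope(equally_spaced, response)
slope_doubling = ols_slope(doubling, response)
print(slope_equal, slope_doubling)
```

Both encodings respect the order of the levels, yet the estimated effect differs, because a fixed-effect treatment reads the codes as numbers.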

## Heterogeneous correlation matrix

The second way is to fit the model not to the covariance matrix but to a so-called heterogeneous correlation matrix. A heterogeneous correlation matrix is a correlation matrix in which correlations between ordinal variables are calculated as polychoric correlations, and correlations between ordinal and continuous variables are calculated as polyserial correlations. For details, see the semopy paper. To use it, specify the ordinal variables with the **DEFINE** command:

**DEFINE(ordinal)**

Makes semopy treat variables as ordinal, i.e. their polychoric and/or polyserial correlations will be estimated.

    y ~ x1 + cat1 + cat2
    DEFINE(ordinal) cat1 cat2

Here, the Pearson correlation between `cat1` and `cat2` will be replaced with a polychoric correlation, and the correlations between `cat1`, `cat2` and `x1`, `y` with polyserial correlations.
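The rule for which kind of correlation fills each cell of the matrix can be written down as a small bookkeeping sketch. It only labels the pairs and does not estimate anything; `correlation_kind` is a hypothetical helper, not a semopy function, and it assumes that pairs of continuous variables keep their ordinary Pearson correlation.

```python
from itertools import combinations


def correlation_kind(var_a, var_b, ordinal):
    """Name the correlation used for a pair, given the set of ordinal variables."""
    n_ordinal = (var_a in ordinal) + (var_b in ordinal)
    # 0 ordinal variables: Pearson; 1: polyserial; 2: polychoric.
    return ("Pearson", "polyserial", "polychoric")[n_ordinal]


variables = ["y", "x1", "cat1", "cat2"]
ordinal = {"cat1", "cat2"}  # the variables listed after DEFINE(ordinal)

kinds = {
    pair: correlation_kind(pair[0], pair[1], ordinal)
    for pair in combinations(variables, 2)
}
for (var_a, var_b), kind in kinds.items():
    print(f"{var_a} - {var_b}: {kind}")
```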

This has some drawbacks:

- Computing a heterogeneous correlation matrix takes a long time, and the time grows drastically with the number of observations and the number of ordinal variables.
- Sometimes the heterogeneous correlation matrix is not positive-definite, in which case semopy finds the closest positive-definite matrix. This may distort some of the original information.
- It works only for `Model`.
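The positive-definiteness issue can be illustrated with a small sketch. A matrix assembled from separately estimated pairwise correlations may be inconsistent as a whole, and any repair changes the original values. The code below is not semopy's algorithm for finding the closest positive-definite matrix. It swaps in a simpler technique, shrinkage toward the identity matrix, purely to show the effect. The input matrix and the function names are made up for illustration.

```python
import math


def is_positive_definite(matrix, tol=1e-12):
    """Check positive-definiteness by attempting a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False  # a non-positive pivot means the matrix is not PD
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


def shrink_to_positive_definite(corr, step=0.05):
    """Blend corr with the identity matrix until the result is positive-definite."""
    n = len(corr)
    steps = 0
    shrunk = [row[:] for row in corr]
    while not is_positive_definite(shrunk):
        steps += 1
        weight = min(1.0, steps * step)
        shrunk = [
            [(1.0 - weight) * corr[i][j] + (weight if i == j else 0.0)
             for j in range(n)]
            for i in range(n)
        ]
    return shrunk, min(1.0, steps * step)


# Hypothetical pairwise correlations that cannot all hold at once.
inconsistent = [
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
]
repaired, weight = shrink_to_positive_definite(inconsistent)
print(weight)
print(repaired)
```

The diagonal stays at 1, but every off-diagonal correlation is pulled toward zero. This is the kind of deformation of the original information that the second drawback refers to.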