#Needed packages
import pandas as pd
import pyequate as eq
#Read in data
#Form X and Y in same file
ACT = pd.read_csv(r'ACTmath.csv')
#Splitting ACT into formx and formy
formx = ACT[["scale", "xcount"]]
formy = ACT[["scale", "ycount"]]
#Renaming both counts to x
#If different forms are in different datasets, the count column may be named the same
formx = formx.rename(columns = {"xcount": "x"})
formy = formy.rename(columns = {"ycount": "x"})

Random Groups Equating
To illustrate the functions for random groups equating, we use the ACTmath file, a publicly available dataset that can be found on the GitHub page for this package.
We first walk through an example where form X and form Y are in a single dataset, followed by an example where they are in two separate datasets. The functions are the same in both cases; the only difference is which columns are passed. Importantly, the functions expect frequency data in the form shown in Table 1: one column holds the total score (Score) and another holds the number of students earning that score (Count).
| Score | Count |
|---|---|
| 0 | 0 |
| 1 | 2 |
| 2 | 4 |
| 3 | 5 |
| 4 | 5 |
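If your data are raw scores (one row per examinee) rather than frequencies, they need to be tabulated first. The following is a minimal sketch of one way to do that with pandas; the raw scores and the 0 to 4 score range are made up for illustration and chosen so that the result matches Table 1.

```python
import pandas as pd

# Hypothetical raw total scores, one entry per examinee (illustrative data only)
raw_scores = pd.Series([1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4])

# Assumed score range for this toy example
score_min, score_max = 0, 4

# Count examinees at each score, then reindex so that every possible score
# appears, with a count of 0 for scores that nobody earned
counts = (
    raw_scores.value_counts()
    .reindex(range(score_min, score_max + 1), fill_value=0)
)

# Assemble the two-column frequency table (Score, Count)
freq = pd.DataFrame({"Score": counts.index, "Count": counts.to_numpy()})
print(freq)
```

Reindexing over the full score range matters because a score with zero examinees would otherwise be missing from the table, and the counts would no longer line up with the possible scores.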
The setup code at the top (the block that imports pandas and pyequate) loads the packages and reads in ACTmath.csv. To illustrate referencing separate datasets, it also splits the data into formx and formy, so that form X and form Y are in two separate datasets.
There are three options for equating with random groups: mean, linear, and equipercentile. Mean and linear equating use the same base function, eq.linear(), with different type = specifications: type = "linear" for linear equating and type = "mean" for mean equating.
The eq.linear() function has four required arguments: x, y, score_min, and score_max. x and y are the columns containing the frequencies of observed scores for form X and form Y, respectively. score_min and score_max are the minimum and maximum possible scores for the measure.
Optional arguments are type = (defaults to "linear"), rescale = (defaults to False), and group = (defaults to "Random"). type = accepts "linear" (linear equating), "mean" (mean equating), and "zscore" (z-score equating). To rescale scores to a 0-100 range, set rescale = True. For single group equating, set group = "Single".
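For intuition about what mean and linear equating compute from frequency data, here is a small sketch of the standard textbook formulas: mean equating shifts scores by the difference in form means, and linear equating additionally scales by the ratio of standard deviations. This is not pyequate's implementation; the helper names (moments, linear_sketch), the method argument, the toy frequencies, and the use of population standard deviations are all assumptions made for illustration.

```python
import numpy as np

def moments(freq, score_min, score_max):
    """Mean and (population) SD of a score distribution given as frequencies."""
    scores = np.arange(score_min, score_max + 1)
    freq = np.asarray(freq, dtype=float)
    n = freq.sum()
    mean = np.sum(scores * freq) / n
    sd = np.sqrt(np.sum(freq * (scores - mean) ** 2) / n)
    return mean, sd

def linear_sketch(x, y, score_min, score_max, method="linear"):
    """Form Y equivalents of each form X score (hypothetical helper)."""
    scores = np.arange(score_min, score_max + 1)
    mu_x, sd_x = moments(x, score_min, score_max)
    mu_y, sd_y = moments(y, score_min, score_max)
    # Mean equating fixes the slope at 1; linear equating uses sd_y / sd_x
    slope = sd_y / sd_x if method == "linear" else 1.0
    return slope * (scores - mu_x) + mu_y

# Toy frequencies for scores 0..4 (made up, not the ACTmath data)
x_freq = [1, 2, 4, 2, 1]
y_freq = [2, 4, 2, 1, 1]

scores = np.arange(0, 5)
mean_eq = linear_sketch(x_freq, y_freq, 0, 4, method="mean")
lin_eq = linear_sketch(x_freq, y_freq, 0, 4, method="linear")
print(np.column_stack([scores, mean_eq, lin_eq]))
```

In both methods the form X mean maps onto the form Y mean; the methods differ only in whether the spread of the two forms is also matched.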
Below are mean and linear equating for the ACTmath dataset, first with both forms in one dataset and then with each form in a separate dataset.
#Both forms in one dataset
#Mean equating
meq = eq.linear(ACT['xcount'], ACT['ycount'], #Specify form x and y
                0, 40, #Score min and max
                type = "mean", #Type of equating
                rescale = False, #No rescaling
                group = "Random") #Random groups design
#Linear equating
leq = eq.linear(ACT['xcount'], ACT['ycount'], #Specify form x and y
                0, 40, #Score min and max
                type = "linear", #Type of equating
                rescale = False, #No rescaling
                group = "Random") #Random groups design

#Each form in a separate dataset
#Mean equating
meq_2 = eq.linear(formx['x'], formy['x'],
                  0, 40,
                  type = "mean",
                  rescale = False,
                  group = "Random")
#Linear equating
leq_2 = eq.linear(formx['x'], formy['x'],
                  0, 40,
                  type = "linear",
                  rescale = False,
                  group = "Random")

The resulting objects (meq, leq, etc.) can be examined, exported, or used to create a visualization as described in a later section. For now, we will take a quick glance at the first few rows:
#Mean equating results
meq.head()
#Linear equating results
leq.head()

First rows of the mean equating results (meq):

| | Score | ex |
|---|---|---|
| 0 | 0 | -4.317073 |
| 1 | 1 | -3.317073 |
| 2 | 2 | -2.317073 |
| 3 | 3 | -1.317073 |
| 4 | 4 | -0.317073 |
Equipercentile equating expects data in the same format as mean and linear equating but uses a different function, eq.equipercen(). As with eq.linear(), it takes x and y along with score_min and score_max.
Presmoothing is a planned addition to this function.
#Equipercentile equating
eqeq = eq.equipercen(x = formx['x'], y = formy['x'], #Specify form x and form y
                     score_min = 0, score_max = 40) #Min and max possible scores
#Examine results
eqeq.head()

| | score | equated |
|---|---|---|
| 0 | 0 | -0.15 |
| 1 | 1 | 0.90 |
| 2 | 2 | 2.60 |
| 3 | 3 | 2.95 |
| 4 | 4 | 3.30 |
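To make the idea behind equipercentile equating concrete, the sketch below maps each form X score to the form Y score with the same percentile rank, treating each integer score as spread uniformly over plus or minus 0.5 (the usual textbook continuization). It is only an illustration and not pyequate's code: the function name equipercentile_sketch and the toy frequencies are invented, and the sketch assumes every form Y frequency is positive so that the cumulative distribution can be inverted by simple interpolation.

```python
import numpy as np

def equipercentile_sketch(x, y, score_min, score_max):
    """Form Y equivalents of each form X score (hypothetical helper)."""
    scores = np.arange(score_min, score_max + 1)
    fx = np.asarray(x, dtype=float) / np.sum(x)  # form X relative frequencies
    gy = np.asarray(y, dtype=float) / np.sum(y)  # form Y relative frequencies

    # Percentile rank (as a proportion) of each integer form X score:
    # everyone below the score plus half of those exactly at the score
    p = np.cumsum(fx) - 0.5 * fx

    # Continuized form Y distribution: the cumulative proportion is 0 at
    # score_min - 0.5 and reaches G(k) at k + 0.5, linear in between
    knots_y = np.concatenate(([score_min - 0.5], scores + 0.5))
    knots_g = np.concatenate(([0.0], np.cumsum(gy)))

    # Invert the cumulative distribution by linear interpolation
    # (valid here because knots_g is strictly increasing)
    return np.interp(p, knots_g, knots_y)

# Toy frequencies for scores 0..4 (made up, not the ACTmath data)
x_freq = [1, 2, 4, 2, 1]
y_freq = [2, 4, 2, 1, 1]

equated = equipercentile_sketch(x_freq, y_freq, 0, 4)
same_form = equipercentile_sketch(x_freq, x_freq, 0, 4)
print(equated)
```

Equating a form to itself returns the original scores, which is a quick sanity check on the logic. Because of the continuization, equated scores can fall as low as score_min - 0.5, which is why slightly negative values can appear at the bottom of the scale.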
Finally, to compare the results across the three equating methods, the outputs can be combined into a single dataset.
#Gather score column and equated columns from each method
rcombined = pd.concat([meq['Score'], meq['ex'], leq['ex'], eqeq['equated']], axis=1)
#Rename columns to something sensible
rcombined.columns = ["score", "mean", "linear", "equip"]
#Examine results
rcombined.head()

| | score | mean | linear | equip |
|---|---|---|---|---|
| 0 | 0 | -4.317073 | 13.914711 | -0.15 |
| 1 | 1 | -3.317073 | 14.742038 | 0.90 |
| 2 | 2 | -2.317073 | 15.569364 | 2.60 |
| 3 | 3 | -1.317073 | 16.396691 | 2.95 |
| 4 | 4 | -0.317073 | 17.224018 | 3.30 |
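One simple way to read a combined table like this is to subtract the form X score from each equated score, which shows how far each method moves a score away from identity. The sketch below does this with plain pandas on the five rows printed above; the names rcombined_head and diffs are introduced here for illustration, and with the real rcombined object the same two lines of arithmetic should apply.

```python
import pandas as pd

# First five rows of the combined results, copied from the table above
rcombined_head = pd.DataFrame({
    "score": [0, 1, 2, 3, 4],
    "mean": [-4.317073, -3.317073, -2.317073, -1.317073, -0.317073],
    "linear": [13.914711, 14.742038, 15.569364, 16.396691, 17.224018],
    "equip": [-0.15, 0.90, 2.60, 2.95, 3.30],
})

methods = ["mean", "linear", "equip"]

# Subtract the form X score from each equated score, row by row
diffs = rcombined_head[methods].sub(rcombined_head["score"], axis=0)

# Put the score column back in front for readability
diffs.insert(0, "score", rcombined_head["score"])
print(diffs.round(3))
```

For mean equating the difference is the same at every score, since the method applies a constant shift; for the other two methods it varies across the score scale.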