Appendix A: Regression Estimates and Predicted College-Going Rates

By Michael T. Childress

From Listening to Kentucky High Schools
pp. 69-72, published 2002


We used multiple regression analysis to generate the predicted college-going rates for the 233 Kentucky high schools in our sample. The model is based on several factors that studies have shown are linked to student performance and/or college attendance. The variables in the model include:

College-Going Rate (COL). The dependent variable is the percentage of the high school graduating class going to college. The data are from the Kentucky Department of Education’s report, "Transition to Adult Life." We collected our data from the 1999-2000 school report cards. The state average reported for that year is 53 percent.

Education (BA). Using data from the 1990 Census, we included the percentage of the adult population (25 years old and older) with a B.A. or higher for the zip code in which the high school is located. If the high school was the only one in the county or if it included students from around the county, then we used the county B.A. percentage.

Socioeconomic Status (LUNCH). We used Kentucky Department of Education data on the percentage of students who receive free or reduced-price lunch as a proxy for socioeconomic status.

School Size (ENROLL99). Some literature indicates that the size of the school is linked to student performance, with smaller schools associated with better student outcomes. We used the enrollment numbers included on the 1999-2000 school report cards to gauge the impact of school size on the college-going rate.

Spending per Student (SCHSPND). Education researchers generally agree that spending influences student performance. More money can mean smaller class sizes and more qualified teachers. This variable comes from the 1999-2000 school report cards.

Teacher Experience (TEXPAVG). A significant amount of research has shown that teacher qualifications affect student achievement. One indicator of teacher qualifications is the number of years of teaching experience, which is reported on the school report cards.

Proximity to an Institution of Higher Education (COPSE). We included this variable to test the relationship between the college-going rate and the proximity of an institution of postsecondary education to a high school. The variable is dichotomous: it is coded "1" if there is an institution of postsecondary education in the same county as the high school and "0" otherwise.

Unemployment Rate (NOV99UMP). We included the county unemployment rate for the fall following graduation to determine how employment opportunities relate to the college-going rate. The data are from the Kentucky Department for Employment Services.

Parental Involvement (NPT_CONF). This variable is a measurement of the number of students whose parent/guardian had at least one teacher conference during the school year and comes from the school report cards. We have transformed the variable so that it is the percentage of students who satisfy this criterion. (Note: We truncated the variable so the maximum value is 100 percent.)
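The transformation described for NPT_CONF can be sketched as follows. This is a minimal illustration, not the authors' procedure: the source does not say which denominator was used, so dividing by total enrollment is an assumption, and the function name and the sample counts are hypothetical.

```python
def parent_conference_pct(students_with_conference: int, enrollment: int) -> float:
    """Percentage of students whose parent/guardian had at least one
    teacher conference, truncated at 100 percent.

    Assumption: the denominator is the school's total enrollment.
    """
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    pct = 100.0 * students_with_conference / enrollment
    # Reported counts can exceed enrollment, so cap the value at 100.
    return min(pct, 100.0)


# Hypothetical counts, for illustration only.
print(parent_conference_pct(450, 600))
print(parent_conference_pct(650, 600))
```

The second call shows why truncation is needed: a reported count larger than the enrollment figure would otherwise yield a percentage above 100.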

Performance on Standardized Tests (ZTESTS). Student achievement is linked to postsecondary education attendance. We converted three Kentucky Core Content test scores and one national basic skills test score into z-scores and then averaged them to get a single z-score. The four tests are 9th-grade language, 10th-grade reading, 11th-grade math, and 12th-grade writing. The data are from the school report cards.
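One way to build a composite like ZTESTS is sketched below. It assumes each test is standardized across the schools in the sample using the population standard deviation; the source does not state either detail. The function names and the scores are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of school scores for one test (assumes the
    scores are not all identical, so the standard deviation is nonzero)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def composite_z(test_scores):
    """Average the per-test z-scores for each school.

    test_scores maps a test name to a list of school scores, with the
    schools in the same order in every list.
    """
    per_test = [z_scores(scores) for scores in test_scores.values()]
    return [mean(school) for school in zip(*per_test)]


# Hypothetical scores for three schools on the four tests.
scores = {
    "language_9": [50, 60, 70],
    "reading_10": [40, 50, 60],
    "math_11": [30, 40, 50],
    "writing_12": [20, 30, 40],
}
print(composite_z(scores))
```

Converting to z-scores before averaging puts the four tests on a common scale, so a test with a wider raw score range does not dominate the composite.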

Location of the school. We used a series of dichotomous variables (i.e., coded "0" or "1") based on the USDA BEALE codes to reflect the relative urban-to-rural nature of the county. Counties with a BEALE code of "0" through "7" each have their own variable in the model, while the completely rural counties (BEALE codes of "8" or "9") have none and serve as the reference category.
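The location coding can be illustrated with a short sketch. The function name is hypothetical; the variable names BEALE0 through BEALE7 follow the formula terms used in this appendix.

```python
def beale_dummies(beale_code: int) -> dict:
    """Return the eight dichotomous location variables for a county.

    Codes 0 through 7 set exactly one variable to 1. Codes 8 and 9
    (completely rural) set none, so those counties form the baseline.
    """
    if beale_code not in range(10):
        raise ValueError("BEALE code must be an integer from 0 to 9")
    return {f"BEALE{k}": int(beale_code == k) for k in range(8)}


print(beale_dummies(3))
print(beale_dummies(9))
```

Because the completely rural counties have no variable of their own, each BEALE coefficient is read as a difference from those counties.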

We use the parameter estimates in Table A.1 to calculate a predicted (or expected) college-going rate for a high school. The formula for calculating these rates is:

Ŷ = α + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8 + β9X9
    + β10X10 + β11X11 + β12X12 + β13X13 + β14X14 + β15X15 + β16X16 + β17X17

where

Ŷ = the predicted college-going rate

α = intercept

β1X1 = BA coefficient times the percentage of the adult population 25 years old and older with a BA degree or higher.

β2X2 = LUNCH coefficient times the percentage of students in the school receiving free or reduced-price lunches.

β3X3 = ENROLL99 coefficient times the school’s total enrollment.

β4X4 = SCHSPND coefficient times the spending per student at the school.

β5X5 = TEXPAVG coefficient times the average years of experience among the school’s teachers.

β6X6 = COPSE coefficient times whether there is an institution of postsecondary education in the county (coded "1" if yes and "0" if no).

β7X7 = NOV99UMP coefficient times the county’s unemployment rate in November 1999.

β8X8 = NPT_CONF coefficient times the percentage of the students whose parent/guardian had at least one teacher conference.

β9X9 = ZTESTS coefficient times the school’s average z-score across the four standardized tests (three Kentucky Core Content tests and one national basic skills test).

β10X10 = BEALE0 coefficient times whether the county has a BEALE code of "0" (coded "1" if yes and "0" if no).

β11X11 = BEALE1 coefficient times whether the county has a BEALE code of "1" (coded "1" if yes and "0" if no).

β12X12 = BEALE2 coefficient times whether the county has a BEALE code of "2" (coded "1" if yes and "0" if no).

β13X13 = BEALE3 coefficient times whether the county has a BEALE code of "3" (coded "1" if yes and "0" if no).

β14X14 = BEALE4 coefficient times whether the county has a BEALE code of "4" (coded "1" if yes and "0" if no).

β15X15 = BEALE5 coefficient times whether the county has a BEALE code of "5" (coded "1" if yes and "0" if no).

β16X16 = BEALE6 coefficient times whether the county has a BEALE code of "6" (coded "1" if yes and "0" if no).

β17X17 = BEALE7 coefficient times whether the county has a BEALE code of "7" (coded "1" if yes and "0" if no).
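The calculation defined by these terms can be sketched in a few lines. The function name and every number in the example are hypothetical placeholders chosen to make the arithmetic easy to follow; they are not the estimates in Table A.1 and do not describe any real school.

```python
# The 17 explanatory variables, in the order used in the formula.
VARIABLES = [
    "BA", "LUNCH", "ENROLL99", "SCHSPND", "TEXPAVG",
    "COPSE", "NOV99UMP", "NPT_CONF", "ZTESTS",
] + [f"BEALE{k}" for k in range(8)]


def predicted_rate(intercept, coefficients, school):
    """Intercept plus the sum of each coefficient times the school's value."""
    return intercept + sum(coefficients[name] * school[name] for name in VARIABLES)


# Placeholder coefficients and school values (NOT from Table A.1).
coefficients = {name: 0.0 for name in VARIABLES}
coefficients["BA"] = 0.5
coefficients["LUNCH"] = -0.1

school = {name: 0.0 for name in VARIABLES}
school["BA"] = 20.0
school["LUNCH"] = 30.0

# 10 + 0.5 * 20 - 0.1 * 30, with every other term equal to zero.
print(predicted_rate(10.0, coefficients, school))
```

Comparing a school's actual college-going rate with the value returned by such a calculation shows whether the school sends more or fewer graduates to college than the model expects given its characteristics.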

The model coefficients are presented in Table A.1. This model explains 59 percent of the variation in the dependent variable (adjusted R-squared equals 0.59).
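For readers unfamiliar with the adjusted statistic, the standard adjustment is sketched below. The sample size of 233 schools and the 17 explanatory variables come from this appendix; that all 233 schools entered the regression is an assumption, and the unadjusted value of 0.62 is purely illustrative, since the source reports only the adjusted figure.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjusted R-squared: penalizes R-squared for the number
    of explanatory variables relative to the number of observations."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Illustrative only: an unadjusted R-squared of 0.62 with 233 schools
# and 17 predictors adjusts down to roughly 0.59.
print(round(adjusted_r_squared(0.62, 233, 17), 2))
```

The adjustment matters here because the model has many explanatory variables relative to the number of schools, so the unadjusted statistic would overstate the fit somewhat.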

Table A.1: Model Estimates
