5.1 Population Methods

Interpolating Single Years of Age

Method

F(x) is the cumulative age schedule of fertility.

f(x) = F'(x) is the fertility rate at exact age x.

ASFR(x) = F(x+1) - F(x) is the single-year age-specific fertility rate.
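As a small illustration of these definitions, the sketch below builds the cumulative schedule F at exact ages 15, 20, ..., 50 from five-year ASFRs. The rates are made-up numbers chosen only for illustration (they are not taken from the spreadsheet), and the function name is hypothetical. It assumes each five-year ASFR is an average annual rate, so it adds five times its value to F.

```python
def cumulative_schedule(asfr5, start_age=15, width=5):
    """Return (ages, F), where F[i] is cumulative fertility up to exact age ages[i]."""
    ages = [start_age]
    F = [0.0]
    for rate in asfr5:
        ages.append(ages[-1] + width)
        F.append(F[-1] + width * rate)  # each age group contributes width * annual rate
    return ages, F

# Hypothetical five-year ASFRs (births per woman per year) for 15-19, 20-24, ..., 45-49.
asfr5 = [0.05, 0.20, 0.25, 0.18, 0.10, 0.04, 0.01]
ages, F = cumulative_schedule(asfr5)
tfr = F[-1]  # F at the highest age is the total fertility rate
```

The values of F at the five-year ages are the observed points that the rest of the method works from.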

F(x) for x>20 is assumed to be a logistic, i.e.,

logit(F(x)) = ln((F(x) - min)/(max - F(x))) is linear in x.

min and max are parameters to be estimated. max must be greater than the maximum observed value of F(x); min must be smaller than the minimum observed value of F(x).

min and max are estimated by selecting the values that maximize the correlation between age and logit(F(x)). The estimated F(x) is adjusted proportionately to ensure that estimated F(max age) = observed F(max age). This ensures that the TFR is the same for the estimated and the observed schedules.
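The estimation step can be sketched in Python as follows. This is only one way to carry it out, under several assumptions that are not in the notes above: min and max (called lo and hi in the code) are found by a coarse grid search rather than a spreadsheet solver, the logistic is recovered from a least-squares line through the logits, and the input values of F are made-up numbers. All function and variable names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_logistic(ages, F, steps=60):
    """Grid-search min (lo) and max (hi), then return the rescaled logistic."""
    span = max(F) - min(F)
    best = None
    for i in range(1, steps + 1):
        lo = min(F) - span * i / steps        # always below the observed minimum
        for j in range(1, steps + 1):
            hi = max(F) + span * j / steps    # always above the observed maximum
            ys = [math.log((v - lo) / (hi - v)) for v in F]
            r = pearson(ages, ys)
            if best is None or r > best[0]:
                best = (r, lo, hi, ys)
    r, lo, hi, ys = best
    # Assumption: fit a least-squares line logit = a + b * age at the best (lo, hi).
    n = len(ages)
    mx, my = sum(ages) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(ages, ys)) / sum((x - mx) ** 2 for x in ages)
    a = my - b * mx

    def raw(x):
        return lo + (hi - lo) / (1.0 + math.exp(-(a + b * x)))

    scale = F[-1] / raw(ages[-1])             # proportional adjustment to match the TFR
    return (lambda x: scale * raw(x)), lo, hi, r

# Hypothetical cumulative fertility at exact ages 20, 25, ..., 50.
ages = [20, 25, 30, 35, 40, 45, 50]
F = [0.25, 1.25, 2.5, 3.4, 3.9, 4.1, 4.15]
F_hat, lo, hi, r = fit_logistic(ages, F)
asfr1 = [F_hat(x + 1) - F_hat(x) for x in range(20, 50)]  # single-year ASFRs, ages 20-49
```

The grid is deliberately coarse; a finer grid or a numerical optimizer would refine lo and hi, and the best pair may sit on the edge of the search range.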

F(x) for 15<x<20: f(x) is assumed to be linear in age; F(15)=f(15)=0 and F(20) = estimated F(20) from the logistic.
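The conditions above pin down the young-age segment completely. If f(x) = c(x - 15), then F(x) = c(x - 15)^2 / 2, and F(20) = 12.5c gives c = 2 F(20) / 25. The sketch below turns this into single-year ASFRs for ages 15 to 19. The value passed for F(20) is a made-up number standing in for the logistic estimate, and the function name is hypothetical.

```python
def young_age_asfr(F20):
    """Single-year ASFRs for ages 15..19 when f is linear and F(15) = f(15) = 0."""
    slope = 2.0 * F20 / 25.0  # c in f(x) = c * (x - 15), from F(20) = 12.5 * c

    def F(x):
        return 0.5 * slope * (x - 15) ** 2

    return [F(x + 1) - F(x) for x in range(15, 20)]

# Hypothetical F(20); in the method this comes from the fitted logistic.
asfr_15_19 = young_age_asfr(0.25)
```

The five rates are in the proportions 1:3:5:7:9 and sum to F(20), so fertility below age 20 is fully accounted for.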

Advantages of the method

1. The total fertility rate is preserved.

2. ASFRs are always positive.

3. ASFRs rise and then decline, i.e., the schedule is single-peaked.

Disadvantages of the method

1. Five-year ASFRs are not preserved.

2. Discontinuity in f at x=20.

General Discussion

Using this general approach, it might be possible to resolve disadvantage 2 by using a constrained polynomial for ages 15 to 20 (F(15)=0; F(20) = estimated value; f(20) = estimated value).
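One possible reading of this suggestion is a quadratic, since three constraints determine three coefficients and F(15)=0 removes the constant term. Writing u = x - 15 and F = a u + b u^2, the two remaining constraints give a = 0.4 F(20) - f(20) and b = (5 f(20) - F(20)) / 25. The sketch below is an assumption about how the idea could be carried out, not a tested part of the method; the input values are made up and the names are hypothetical.

```python
def constrained_quadratic(F20, f20):
    """Quadratic F on [15, 20] with F(15) = 0, F(20) = F20 and F'(20) = f20."""
    a = 0.4 * F20 - f20            # this is also f(15); it is no longer forced to 0
    b = (5.0 * f20 - F20) / 25.0

    def F(x):
        u = x - 15
        return a * u + b * u * u

    def f(x):
        return a + 2.0 * b * (x - 15)

    return F, f

# Hypothetical logistic estimates of F(20) and f(20).
F_q, f_q = constrained_quadratic(F20=0.25, f20=0.09)
```

Note that f(15) = 0.4 F(20) - f(20) in this form, so f turns negative at age 15 whenever f(20) exceeds 0.4 F(20). Any use of it would need a check that the rates stay positive.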

A better fit might be possible if the upper age group were assumed to be linear or some other function rather than using a single logistic to represent x>20.

The spreadsheet contains two examples for Taiwan. The fit is better for a high-fertility schedule.

An alternative approach would be to fit a higher-order polynomial or a cubic spline to F(x). An exact fit is possible, so the five-year ASFRs would be preserved. The difficulty with this approach is that f(x) can have local minima and maxima and can even turn negative at low and high ages. My experience is that this is likely to happen with ASFRs.
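The exact-fit alternative can be tried with a short sketch. The version below uses a single interpolating polynomial in Lagrange form (a cubic spline would be handled the same way) through made-up values of F, then derives single-year ASFRs and lists any ages where they come out negative. The data and names are hypothetical, and no particular result is claimed for real schedules.

```python
def lagrange(xs, ys):
    """Return the polynomial that passes exactly through the points (xs[i], ys[i])."""
    def P(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return P

# Hypothetical cumulative fertility at exact ages 15, 20, ..., 50.
ages = [15, 20, 25, 30, 35, 40, 45, 50]
F = [0.0, 0.25, 1.25, 2.5, 3.4, 3.9, 4.1, 4.15]
P = lagrange(ages, F)

asfr1 = [P(x + 1) - P(x) for x in range(15, 50)]                # single-year ASFRs
five_year = [sum(asfr1[k:k + 5]) / 5 for k in range(0, 35, 5)]  # implied five-year ASFRs
negative_ages = [15 + k for k, rate in enumerate(asfr1) if rate < 0]
```

Because the polynomial reproduces F at every five-year age, five_year matches the input rates. Inspecting negative_ages and the shape of asfr1 shows whether the fitted schedule oscillates or dips below zero.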

Comment

From: Gretchen Stockmayer

Date: March 15, 2006

I just used this method to convert UN 5-year rates to 1-year rates for the US from 1950-2050 and found that the problem of the discontinuity at f(20) is lessened if you include age 15 in the correlation calculation.

Specifically, in the spreadsheet calculation, the correlation that is maximized to choose the min and max values is for age 20,25,...,50 and logit(F(20,25,...,50)). Start those series at 15 instead and you don't get as big of a break between f(19) and f(20) in the final estimated series. (I still get the f(15)-f(19) values from the linear interpolation.)


