Transferability, Empiricism and the Elusive Science of Travel Forecasting

It seems that empiricism in the absence of theory is the enemy of transferability. Physicists have postulated that the speed of light in a vacuum is a constant; no matter (not a pun) where or when it is measured, it is always the same. The speed of light is perfectly transferable. However, in travel forecasting the parameters of travel models are not constant and can vary from place to place and from time to time in the same location. We could argue that travel environments vary and people are inconsistent, just as atmospheric conditions slightly affect the speed of light. Or we could argue that our empirical methods are failing us.

I recently went to a conference where a speaker presented calibrations of an activity-based model (ABM) for two different locales. Although the model structures were the same, the coefficients were markedly different. The speaker did not seem to think that this was odd. In both cases, the models could produce a forecast with plausible results. What else is needed?

Personally, I am not comfortable with models that cannot be replicated. Are the people of (picking two random cities) San Francisco and Atlanta so different in their travel decision making that they require entirely different models? Significant differences in coefficients across models suggest that those models are not transferable from place to place, and further suggest that they are not transferable between now and a future date, either.

Forgive me for digressing. In cleaning out my UWM office a few weeks ago, I found some old unpublished research results from the original Subjective Value of Time study, from 1978. The major outputs of this study were many ratios of values of time, free of economic considerations. These ratios were not determined through model calibrations; they were determined through a well-crafted psychophysical scaling experiment with 84 subjects selected at random from Chicago. The subjective values of time were about as constant as you can get, as evidenced by exceptionally strong goodness-of-fit statistics. Not irrelevantly, these subjective values of time have stood up very well over the more than 36 years since the experiment was performed.

This additional article about the experiment was never written because the very large stack of statistical analyses found nothing significant. What was I looking for, you might ask? I was looking for variations in subjective values of time across socioeconomic characteristics and gender. There were no variations. It is difficult to prove a hypothesis with insignificant statistics, and it is even more difficult to convince referees that a paper is worthy of publication when there is nothing striking to show.

However, insignificance is critical to the concept of transferability. We can assert that two sets of coefficients that are insignificantly different from each other are likely to be transferable, or at least we have no evidence to say that they are not transferable. If subjective values of time do not vary between men and women and between rich and poor, there is a pretty good chance they can be treated as constants within our models. But they often are not treated as constants.
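One simple way to check whether two coefficients are "insignificantly different" is a two-sided z-test on their difference. The sketch below is only an illustration, not a method taken from the study described above. It assumes the two calibrations were estimated on independent samples, that each reports a standard error, and that the samples are large enough for a normal approximation. The function name and the coefficient values are hypothetical.

```python
import math

def coefficient_difference_test(b1, se1, b2, se2):
    """Two-sided z-test for the difference between two coefficients.

    Assumes the two estimates come from independent samples, so the
    variance of the difference is the sum of the two variances.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical in-vehicle time coefficients (per minute) and standard
# errors from two separately calibrated models; not real data.
z, p_value = coefficient_difference_test(-0.025, 0.004, -0.031, 0.005)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

A large p-value here means only that there is no evidence against transferability, which is the weaker of the two claims made above; it does not prove that the coefficients are equal.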

The empirical method of travel model calibration follows an elementary three-step estimation technique: assemble household travel data; estimate coefficients of a choice model; select for inclusion any coefficient that has a “significant” t-score. What is there to criticize? Lots, actually.
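The third step can be sketched in a few lines. The example below assumes that the first two steps have already produced a coefficient and a standard error for each candidate variable. The variable names, the values, and the 1.96 cutoff (the conventional two-sided 5% level) are illustrative assumptions, not taken from any actual calibration.

```python
def select_by_t_score(estimates, threshold=1.96):
    """Keep every coefficient whose absolute t-score meets the threshold.

    `estimates` maps a variable name to a (coefficient, standard_error)
    pair. Nothing other than the t-score is consulted.
    """
    kept = {}
    for name, (coef, std_err) in estimates.items():
        t_score = coef / std_err
        if abs(t_score) >= threshold:
            kept[name] = (coef, t_score)
    return kept

# Hypothetical estimation output; not real data.
estimates = {
    "in_vehicle_time": (-0.025, 0.004),
    "out_of_vehicle_time": (-0.060, 0.012),
    "transfers": (-0.150, 0.074),
    "rain_dummy": (0.080, 0.070),
}

kept = select_by_t_score(estimates)
for name, (coef, t_score) in kept.items():
    print(f"{name}: coefficient = {coef}, t = {t_score:.2f}")
```

Note that the rule uses no prior knowledge: a coefficient that barely clears the cutoff is retained on the same footing as one that clears it easily, and nothing in the procedure asks whether a value agrees with earlier studies.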

First, the theoretical foundation of this three-step estimation technique is exceptionally weak. Yes, random utility theory has taken us in a positive direction, but we have precious little theory to tell us what a utility expression should contain. Thus, we fall back on empiricism.

Second, the three-step estimation technique ignores history, for the most part. Things that we know for sure are assumed to be unknown prior to the estimation. Many wheels are being reinvented. In fairness, I have seen models rejected, once they had been estimated, because they disagreed severely with history. And, unfortunately, I have also seen models that were accepted even though they disagreed with history.

Third, we often accept coefficients that are only slightly better than garbage. Our criteria for determining statistical significance are taken from social science research, without any regard for the needs of forecasting.

I don’t think we can elevate the art of travel forecasting to a science until our models are reasonably transferable.

Alan Horowitz, April 14, 2015