# «Spatial, Temporal and Size Distribution of Freight Train Delays: Evidence from Sweden Niclas A. Krüger* a,b,c Inge Vierth a,b Farzad Fakhraei ...»

Scheduling methods and delays: The optimization of capacity utilization and timetable design requires a certain degree of reliability of train operations. D’Ariano and Pranzo (2009) categorize the problems in rail traffic management into two major groups with respect to time horizon: a) long-term timetable design, which should provide a robust schedule for a large network and a long time horizon, and b) real-time dispatching, which takes the current timetable as given and deals with delay management. D'Ariano et al (2008) study the implementation of the traffic management system ROMA (Railway traffic Optimization by Means of Alternative graphs), which optimizes rail traffic even when the timetable is not conflict-free. Branch-and-bound algorithms are used to sequence train movements and local search algorithms are applied to optimize rerouting. Corman et al (2010) evaluate the performance of the ROMA system by inspecting the quality of the dispatching solutions when the input data are perturbed by small stochastic variations.

The authors find that the first-in-first-out strategy in most cases cannot resolve conflicts without causing delay propagation. Optimized dispatching solutions computed by the ROMA system, on the other hand, reduce delays more than straightforward dispatching rules. Mu and Dessouky (2011) compare different methods for solving the freight train scheduling problem and show that heuristic algorithms provide better solutions than the existing procedures.

In summary, the distribution and propagation of train delays, as well as the impact of infrastructure and capacity utilization on delays, have been analyzed during the last two decades using a wide range of methods. Many of the latest papers develop scheduling and simulation methods. This paper contributes mainly to the first category (distribution of delays), although we focus on freight trains rather than passenger trains. Our data cover a complete national rail network, whereas earlier studies use much more limited data. In addition, we analyze spatial and temporal distributions not previously studied. Moreover, we contribute empirically to the topics of delay propagation and capacity utilization. Our approach is at a rather aggregate level, in contrast to many previous studies. However, the results from disaggregate models have implications for aggregate data and vice versa. Hence, the empirical evidence presented in section 4 provides new insights, inputs and validation cues for other research methods and research areas related to rail delays.

**3 Data**

According to official statistics, ton-kilometers transported by rail in Sweden decreased from 23.1 billion in 2008 to 19.4 billion in 2009, a decrease of 17 percent due to the economic downturn following the financial crisis in late 2008. Sweden has a relatively high share of rail freight compared to other countries; the share is about twice as high as the average share for the European Union. Most of the rail infrastructure in Sweden is used by both passenger and freight trains.

The Swedish Transport Administration is responsible for the overall planning of the traffic on rail tracks and for the maintenance of the national rail network. This paper utilizes a database from the Swedish Transport Administration for 2008 and 2009 comprising all trains. The timetables for freight trains differ from the timetables for passenger trains, as the number of freight trains has to be adjusted to fluctuating demand. Regular routes have to be booked several months in advance and can be cancelled at short notice (to date, paths can be cancelled without cost). Ad hoc routes can be booked a short time before the departure of the train. Routes are allocated to trains using unique train numbers for trains that run between specified origins and destinations (OD-pairs). A destination can also be a marshaling yard, where wagons are recombined (and a new train number is used for the new train). Hence, it is not always possible to follow goods or wagons to the final destination within the rail system.

The database contains more than six million freight train observations per year. For each train, event times along the route are recorded, including arrival and departure times at the origin, at the destination and at all sections in between. Codes for causes of delays and of cancellations (at the OD level or section level) are also recorded in the database, but they are incomplete. Based on information in the database, OD-distance, scheduled and actual travel time, speed, and arrival and departure deviations can be calculated.

In total we have 6,766,331 observations in the raw data for 2008 and 6,140,445 for 2009.

We disregard observations that are obviously wrong (departure time after arrival time, trips with an average speed of more than 120 kilometers per hour), observations for cancelled trains and observations for very short distances. The purpose of this paper necessitates the inclusion of extreme values, so we do not exclude so-called outliers from the data. We focus on the arrival delay at the destinations and therefore disregard the data for segments between origin and destination; in total our data contain 157,537 OD-pairs with a unique train number for 2008 and 125,219 OD-pairs for 2009. As mentioned above, the train numbers do not necessarily represent the total route for the cargo from sender to receiver. The main variable of interest in this paper is the arrival delay, computed as the difference between actual arrival time and scheduled arrival time at the destinations within the rail network. Hence, a positive arrival deviation corresponds to a late arrival and a negative value to an early arrival.
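The filtering rules and the delay definition above can be sketched as follows. This is a minimal illustration, not the procedure actually run on the database: the record layout, the field names and the 10 km minimum distance are assumptions (the text does not state the cutoff for "very short distances"); only the 120 km/h speed limit is taken from the text.

```python
from datetime import datetime

# Hypothetical records; field names are illustrative, not the database schema.
records = [
    {"train": "4001", "dep": "2009-03-01 08:00", "arr": "2009-03-01 12:10",
     "sched_arr": "2009-03-01 12:00", "km": 300.0, "cancelled": False},
    {"train": "4002", "dep": "2009-03-01 09:00", "arr": "2009-03-01 08:30",
     "sched_arr": "2009-03-01 08:30", "km": 150.0, "cancelled": False},
    {"train": "4003", "dep": "2009-03-01 10:00", "arr": "2009-03-01 11:00",
     "sched_arr": "2009-03-01 11:00", "km": 200.0, "cancelled": False},
    {"train": "4004", "dep": "2009-03-01 07:00", "arr": "2009-03-01 11:00",
     "sched_arr": "2009-03-01 11:00", "km": 250.0, "cancelled": True},
    {"train": "4005", "dep": "2009-03-01 06:00", "arr": "2009-03-01 09:40",
     "sched_arr": "2009-03-01 10:00", "km": 250.0, "cancelled": False},
    {"train": "4006", "dep": "2009-03-01 06:00", "arr": "2009-03-01 06:30",
     "sched_arr": "2009-03-01 06:30", "km": 2.0, "cancelled": False},
]

MAX_SPEED_KMH = 120.0    # plausibility limit stated in the text
MIN_DISTANCE_KM = 10.0   # assumed cutoff; not given in the text


def parse(timestamp):
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M")


def arrival_delay_minutes(rec):
    """Actual minus scheduled arrival: positive = late, negative = early."""
    return (parse(rec["arr"]) - parse(rec["sched_arr"])).total_seconds() / 60.0


def is_valid(rec):
    """Apply the exclusion rules: cancelled, too short, or implausible timing."""
    if rec["cancelled"] or rec["km"] < MIN_DISTANCE_KM:
        return False
    hours = (parse(rec["arr"]) - parse(rec["dep"])).total_seconds() / 3600.0
    if hours <= 0:  # departure at or after arrival
        return False
    return rec["km"] / hours <= MAX_SPEED_KMH


# Extreme delays are deliberately kept: only invalid records are dropped.
delays = {r["train"]: arrival_delay_minutes(r) for r in records if is_valid(r)}
print(delays)
```

Note that no upper bound is placed on the delay itself, in line with the decision to keep outliers.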

According to the earlier definition of the Swedish Transport Administration, trains are considered to be on time if they arrive less than five minutes behind schedule at the terminal station; recently the threshold was changed to 15 minutes. Based on the earlier definition (5 minutes), 16 (15) percent of the freight trains in 2008 (2009) arrived on time at the terminal station (OD-pairs), 27 (25) percent arrived too late and 56 (60) percent arrived too early. The high share of trains arriving before the scheduled time illustrates the large amount of slack in the freight timetable. In 2008, 5.8 percent of the freight trains were cancelled by the Swedish Transport Administration; in 2009 the figure was 2.6 percent. Cancelled trains can cause severe extra costs for the operators and/or shippers, similar to delayed trains.

In Figure 1 we show the frequency of different categories of causes of delays for the year 2009 (the results for 2008 are almost identical). Causes related to traffic management and to operators are mentioned most often. Another way of looking at causes of delay is to look at their share of total delay minutes (see Figure 2). Interestingly, although operator errors are more frequent, their share of total delay minutes is smaller than that of delays caused by traffic management, which account for 55% of all delay minutes but only 37% of all delay events. The five most frequently mentioned delay causes due to operator error are, in descending order, late departure from freight terminal, train linkage, circulation/train turning, late from abroad and extra wagon service. The five most frequently mentioned delay causes due to traffic management error are, in descending order, meeting train, train ahead, bypass, crossing the train route and scarcity of track.
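The difference between the two views (share of delay events versus share of delay minutes) can be illustrated with a small sketch. The event list below is invented for illustration and does not reproduce the figures in the database; the category names are likewise only examples.

```python
from collections import defaultdict

# Hypothetical delay events: (cause category, delay in minutes).
events = [
    ("operator", 5), ("operator", 5), ("operator", 10),
    ("traffic management", 30), ("traffic management", 40),
    ("infrastructure", 10),
]


def shares_by_cause(events):
    """Return {cause: (share of events in %, share of delay minutes in %)}."""
    counts = defaultdict(int)
    minutes = defaultdict(float)
    for cause, delay in events:
        counts[cause] += 1
        minutes[cause] += delay
    total_events = len(events)
    total_minutes = sum(minutes.values())
    return {
        cause: (100.0 * counts[cause] / total_events,
                100.0 * minutes[cause] / total_minutes)
        for cause in counts
    }


shares = shares_by_cause(events)
for cause, (event_share, minute_share) in shares.items():
    print(f"{cause}: {event_share:.0f}% of events, {minute_share:.0f}% of minutes")
```

In this toy data the most frequent category is not the one with the largest share of delay minutes, which is the same qualitative pattern as reported for operator and traffic management causes.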

Figure 1: Total number of delays and causes (share in %)

Figure 2: Total delay minutes and causes (share in %)

**4 Analysis**

**4.1 Delay distribution on size-frequency scale**

**4.1.1 Analysis of percentiles**

The histogram of freight train arrival deviations at final destinations exhibits non-normal properties: it has a sharp peak around zero and fat tails (see Figure ).

To check whether many small deviations from the timetable or a few extremely large deviations matter most for total freight train delay time, we examine different percentiles of arrival delays at the final destinations and compute their share of total delay minutes. For example, by examining the 90th percentile we can see how much the 10% largest delays account for as a percentage of total delay. Table 1 shows the share of each percentile as a percentage of total delay minutes in 2009 (the results for 2008 resemble those for 2009).

For each percentile in the first column of Table 1, the second column shows the total number of observations that fall within that percentile. The third column shows the average arrival delay within each percentile and the fourth column reports the share of each percentile in total delay minutes. For example, 74% of total delay minutes are caused by only 20% of all observations. These calculations show that extreme delays are important, since their contribution to total delay is larger than that of the many small delays. This result is stable across the two years included in our database.
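The share calculation described above can be sketched as follows. This is an illustration under assumptions: the delay values are invented, only positive delay minutes are assumed to enter the total, and the function name `top_share` is hypothetical.

```python
def top_share(delays, percentile):
    """Percentage of total delay minutes contributed by observations
    above the given percentile (e.g. 90 -> the 10% largest delays)."""
    ordered = sorted(delays)
    cutoff = (len(ordered) * percentile) // 100  # index where the top group starts
    return 100.0 * sum(ordered[cutoff:]) / sum(ordered)


# Hypothetical delay minutes for ten trains (total: 100 minutes).
delays = [1, 1, 1, 2, 2, 3, 4, 6, 30, 50]

for p in (50, 80, 90):
    print(f"largest {100 - p}% of delays: {top_share(delays, p):.0f}% of delay minutes")
```

A share far above the corresponding fraction of observations, as in this toy sample, is the signature of a distribution in which a few extreme delays dominate the total.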

For a quantitative measure, however, it is useful to compute the kurtosis. Kurtosis measures the degree to which values much larger or smaller than the average occur more frequently (high kurtosis) or less frequently (low kurtosis) than in a normal (bell-shaped) distribution; that is, kurtosis indicates whether the distribution has thin tails (few extreme events) or fat tails (many extreme events). For our data the kurtosis of arrival delays at destination is 24; in comparison, the normal distribution has a kurtosis of 3. This means that large delays are much more common in our data than under a normal distribution.
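As a minimal sketch, the kurtosis in the sense used above (the fourth central moment divided by the squared variance, so that the normal distribution has a value of 3) can be computed as follows. The sample data are synthetic and serve only to show the contrast between a normal sample and a sample with rare extreme values.

```python
import random


def kurtosis(values):
    """Pearson kurtosis: fourth central moment over squared variance.
    Equals 3 for a normal distribution (no excess-kurtosis correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2


rng = random.Random(0)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Mostly zero deviations with two rare extreme values: a fat-tailed toy sample.
fat_tailed_sample = [0.0] * 98 + [-10.0, 10.0]

print(round(kurtosis(normal_sample), 2))  # close to 3 for a large normal sample
print(kurtosis(fat_tailed_sample))
```

Some statistical packages report excess kurtosis (this value minus 3) by default, so it is worth checking which convention is in use before comparing against the benchmark of 3.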

**4.1.2 Identification of distribution**

Since we ruled out the normal distribution in the previous section, we explore which candidate distribution fits our data better. Distribution fitting is useful for several reasons. First, the distribution gives a hint about the mechanism behind the causes of delays or, to put it the other way around, any mechanism explaining delays should lead to a distribution similar to the empirical delay distribution. Relatedly, distribution fitting makes it possible to examine whether delays of different sizes are caused by the same mechanism.

Second, distribution fitting allows us to predict probabilities for very low-probability events (e.g. large delays). Third, examining the distribution helps us find a general pattern that can potentially be applied to and compared with regional and international data on delays.

There are different candidate distributions that exhibit fat tails. The lognormal distribution is the simplest one to check, since it implies that the natural logarithm of delays is normally distributed. Another candidate distribution is the so-called power-law distribution.

For a continuous variable such as train delays, the power-law distribution is defined as follows (Clauset et al., 2009):

$$f(x)\,dx = \Pr(x \le X < x + dx) = c\,x^{-\alpha}\,dx \qquad (1)$$

where X is the observed value, f(x) is the density function and c is a normalization constant. The normalization constant is needed since the density function diverges as x approaches zero. Therefore we must have a lower limit for the power law process. The complementary cumulative distribution function is defined as:

$$1 - F(x) = \Pr(X \ge x) = \int_x^{\infty} f(x')\,dx' = C\,x^{1-\alpha} \qquad (2)$$

Whereas we can identify a power law by a straight line in a double-logarithmic plot, an exponential distribution will exhibit a straight line in a semi-logarithmic plot instead:

$$\frac{d \ln\bigl(1 - F(x)\bigr)}{dx} = -\beta \qquad (6)$$

There are related distributions such as the power law with exponential cutoff and the stretched exponential. Hence, it is difficult to definitively rule out the power law or to distinguish it from the exponential distribution. The standard way to identify power-law behavior is to look for a straight line when plotting the logarithm of the complementary cumulative distribution against the logarithm of size (see Figure ). We pool all observations in order to detect very large delays. As mentioned before, a power law distribution is necessarily bounded by a minimum value and hence applicable only to the tail of the distribution. Additionally, empirical distributions often exhibit signs of boundary effects, so that power laws can only be confirmed for certain intervals.
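The two straight-line criteria can be made explicit with a short derivation sketch. It starts from equation (2) for the power law; for the exponential case it assumes the standard shifted exponential tail above a lower limit, written here as $x_{\min}$, a symbol introduced only for this illustration.

```latex
% Power-law tail, from equation (2): linear in ln x with slope 1 - alpha
\ln\bigl(1 - F(x)\bigr) = \ln C + (1 - \alpha)\,\ln x
\quad\Longrightarrow\quad
\frac{d \ln\bigl(1 - F(x)\bigr)}{d \ln x} = 1 - \alpha

% Exponential tail above an assumed lower limit x_min: linear in x with slope -beta
1 - F(x) = e^{-\beta\,(x - x_{\min})}, \quad x \ge x_{\min}
\quad\Longrightarrow\quad
\frac{d \ln\bigl(1 - F(x)\bigr)}{dx} = -\beta
```

The first slope is constant only when the horizontal axis is logarithmic, the second only when it is linear, which is why the two plots discriminate between the candidates.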

Based on the method outlined in Clauset et al (2009) and on maximum-likelihood estimation, different lower boundaries are selected and the power law coefficient is estimated; goodness-of-fit tests are used to select the best-fitting lower boundary. Figure 4 shows the result of fitting the data to the power law distribution. It seems that the power law hypothesis cannot be confirmed. For very large delays the power law overestimates the probability of a delay of that magnitude. However, we cannot rule out that the tail is power law distributed with an exponential cutoff, that is, a combination of both distributions. One reason for this differing behavior for extremely large values might be a boundary effect, caused by the fact that very large departure delays are registered as cancellations of the train trip instead.
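The estimation procedure can be sketched as follows. This is a simplified illustration in the spirit of Clauset et al (2009), not the code used for the results above: it assumes the Kolmogorov-Smirnov distance as the goodness-of-fit criterion for choosing the lower boundary, the function names are hypothetical, and the data are synthetic draws from a known power law.

```python
import math
import random


def fit_alpha(tail, x_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / x_min))."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(tail, x_min, alpha):
    """Largest gap between the empirical CDF of the tail and the fitted CDF
    F(x) = 1 - (x / x_min) ** (1 - alpha)."""
    ordered = sorted(tail)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered):
        model = 1.0 - (x / x_min) ** (1.0 - alpha)
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


def fit_power_law(delays, min_tail=50):
    """Try each observed value as lower boundary and keep the one whose
    fitted power law has the smallest KS distance to the tail data."""
    ordered = sorted(d for d in delays if d > 0)
    best = None
    for start in range(len(ordered) - min_tail + 1):
        x_min = ordered[start]
        if start > 0 and ordered[start - 1] == x_min:
            continue  # this lower boundary was already tried
        tail = ordered[start:]
        if tail[-1] == x_min:
            break  # degenerate tail: all remaining values are equal
        alpha = fit_alpha(tail, x_min)
        distance = ks_distance(tail, x_min, alpha)
        if best is None or distance < best[2]:
            best = (x_min, alpha, distance)
    return best  # (x_min, alpha, KS distance) or None if too few data


# Synthetic delays from a power law with alpha = 3 and lower limit 5 (inverse transform).
rng = random.Random(42)
sample = [5.0 * (1.0 - rng.random()) ** (-1.0 / (3.0 - 1.0)) for _ in range(1000)]

x_min_hat, alpha_hat, ks = fit_power_law(sample)
print(x_min_hat, alpha_hat, ks)
```

The `min_tail` guard is a design choice of this sketch: with very few observations in the tail, both the exponent estimate and the KS distance become too noisy to compare across candidate boundaries.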

The exponential distribution gives a better fit for the extreme values in the tail (the power law coefficient is 3.212 and the coefficient of the exponential distribution is 0.584). The goodness-of-fit test confirms the visual inspection, so we conclude that the exponential distribution can be used to describe the tail of arrival delays of freight trains in the Swedish rail network.
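A comparison of the two tail models can be sketched as follows. The sketch fits both candidates by maximum likelihood above a common lower limit and compares their log-likelihoods; this is an illustrative stand-in, not the goodness-of-fit test reported above. The lower limit of 10, the rate of 0.05 and the synthetic data are assumptions chosen for the example.

```python
import math
import random


def fit_exponential_rate(tail, x_min):
    """MLE of beta for the shifted exponential density beta * exp(-beta * (x - x_min))."""
    return len(tail) / sum(x - x_min for x in tail)


def exponential_loglik(tail, x_min, beta):
    """Log-likelihood of the shifted exponential on the tail."""
    return sum(math.log(beta) - beta * (x - x_min) for x in tail)


def power_law_loglik(tail, x_min):
    """Log-likelihood of the continuous power law, evaluated at its MLE exponent.
    Density: (alpha - 1) / x_min * (x / x_min) ** (-alpha)."""
    alpha = 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)
    return sum(math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min)
               for x in tail)


# Synthetic tail: exponential excess over an assumed lower limit of 10.
X_MIN = 10.0
rng = random.Random(1)
tail = [X_MIN + rng.expovariate(0.05) for _ in range(2000)]

beta_hat = fit_exponential_rate(tail, X_MIN)
ll_exp = exponential_loglik(tail, X_MIN, beta_hat)
ll_pl = power_law_loglik(tail, X_MIN)
print(beta_hat, ll_exp, ll_pl)
```

On data with an exponential tail, the power law places too much probability on very large values and therefore attains the lower log-likelihood, which mirrors the overestimation of extreme delays noted for the power law fit.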

**4.2 Delay distribution on spatial scale**