TRB in 2003 explained how to develop a performance-measurement program, noting that implementing and updating such a program is an iterative process.
Setting performance goals
When performance measures are linked to agency goals, performance standards should be established for each measure. These standards are used to determine whether or not each goal is being accomplished.
The standards chosen should be neither unrealistic, in which case the usefulness of the entire program will be called into question, nor too easy to achieve, in which case agency performance is unlikely to improve. Brown in 1996 stated that standards should be “challenging, worthwhile, and achievable.” Standards should require work to achieve, but the benefit derived should outweigh the cost of achieving the increased performance, and the goal should not be set so high that it can never be reached.
Some standards can be implemented as design standards. If the design standard is being met, the agency can be reasonably confident that the goal related to that standard is being met. This saves the agency the need to regularly track the measure related to that goal, as long as it takes care to ensure that the design standards are followed.
Comparison to the annual average
Under this system, the average value for each measure is determined annually, and the routes that fall into the lowest (and sometimes highest) groups for each measure (e.g., the lowest 10% or lowest 25% of routes) are identified for further action.
The drawbacks of this method are that there is no connection between the standards and customer satisfaction, nor is there any identification of how well the system as a whole is operating.
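The comparison to the annual average described above can be illustrated with a minimal Python sketch. The function name `flag_lowest_routes`, the route names, and the boardings-per-hour figures are hypothetical and serve only as an example; the sketch also assumes a measure for which higher values are better.

```python
from statistics import mean

def flag_lowest_routes(values, fraction=0.10):
    """Return the annual average and the routes in the lowest `fraction`.

    `values` maps a route name to its value for one measure, where a
    higher value is assumed to mean better performance.
    """
    average = mean(values.values())
    # Rank routes from worst to best for this measure.
    ranked = sorted(values, key=values.get)
    # Flag at least one route so that small systems still get a review list.
    count = max(1, int(len(ranked) * fraction))
    return average, ranked[:count]

# Hypothetical boardings per revenue hour for eight routes.
boardings_per_hour = {
    "Route 1": 42.0, "Route 2": 18.5, "Route 3": 31.0, "Route 4": 27.5,
    "Route 5": 12.0, "Route 6": 35.5, "Route 7": 22.0, "Route 8": 29.5,
}

annual_average, flagged_routes = flag_lowest_routes(boardings_per_hour, fraction=0.25)
print(f"Annual average: {annual_average:.2f}")
print("Flagged for further action:", flagged_routes)
```

Because the cutoff is relative, some routes are always flagged, however well the system as a whole performs. This is the drawback noted above.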
A variation on comparison to the annual average is comparison to a baseline. In this case, the value for each measure is compared to the average value for the measure in the first year that the performance-measurement system was implemented. Measures that fall below a certain percentage of the baseline value are targeted for further action. This system is an improvement on comparison to the annual average, as it allows current performance to be easily compared to the baseline and focuses attention only on those areas that are truly under-performing.
As with comparison to the annual average, there is no connection between the standards and customer satisfaction. There is also no incentive to improve beyond the baseline, and the method requires that the baseline condition be adequate; otherwise, the performance standard could be met but not the goal that the standard relates to.
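A baseline comparison can be sketched in a few lines of Python. The 90% threshold, the measure names, and the values below are illustrative assumptions, not figures from the source, and all measures are assumed to be "higher is better".

```python
def below_baseline(current, baseline, threshold=0.90):
    """Return measures whose current value is under `threshold` x baseline.

    Both arguments map a measure name to a value where higher is better.
    The result maps each flagged measure to its current/baseline ratio.
    """
    flagged = {}
    for name, base_value in baseline.items():
        ratio = current[name] / base_value
        if ratio < threshold:
            flagged[name] = round(ratio, 3)
    return flagged

# Hypothetical first-year (baseline) and current-year values.
baseline_year = {
    "on_time_pct": 88.0,
    "boardings_per_hour": 30.0,
    "farebox_recovery_pct": 25.0,
}
current_year = {
    "on_time_pct": 85.0,
    "boardings_per_hour": 25.5,
    "farebox_recovery_pct": 24.0,
}

under_performing = below_baseline(current_year, baseline_year)
print(under_performing)
```

Only measures that have slipped well below the first-year value are flagged. A measure that stays just above the threshold every year is never flagged, which reflects the missing incentive to improve.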
Another option is to set the standard based on the previous year’s performance measure value. In this case, the standard would be expressed as “improvement from the previous year” or “x% improvement over the previous year.” (If performance dropped the previous year, the previous year’s standard would be retained and not lowered.) Measures that show worsening performance, compared to the previous year, would be targeted for further action.
The advantage of this method is that incentives are built into the method to achieve continually improving performance and to track performance trends over time. Disadvantages include no direct relationship between the standards and customer satisfaction and a potential to greatly increase the number of measures that require follow-up attention, if performance slips system-wide from one year to the next. Also, it must be recognized that at some point it becomes cost-ineffective to try to continue to improve performance in a particular area; in these cases, the standard should be to maintain the existing high level of performance.
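The trend-based rule can be sketched as follows. The function names, the 2% improvement rate, and the sample values are hypothetical. Keeping the larger of the old standard and the new candidate is one possible reading of the rule that a standard is retained rather than lowered after a drop in performance.

```python
def next_standard(previous_standard, previous_actual, improvement=0.02):
    """Set the coming year's standard as an improvement on last year's result.

    If last year's result would produce a lower standard than the one
    already in force, the existing standard is retained and not lowered.
    Values are assumed to be "higher is better".
    """
    candidate = previous_actual * (1 + improvement)
    return max(previous_standard, candidate)

def needs_follow_up(previous_actual, current_actual):
    """A measure is targeted for action when it worsened year over year."""
    return current_actual < previous_actual

# Year 1: the result (31.0) beat the standard (30.0), so the standard rises.
standard_year2 = next_standard(previous_standard=30.0, previous_actual=31.0)

# Year 2: the result dropped to 29.0, so the standard is retained.
standard_year3 = next_standard(previous_standard=standard_year2, previous_actual=29.0)

print(round(standard_year2, 2), round(standard_year3, 2))
print(needs_follow_up(previous_actual=31.0, current_actual=29.0))
```

To maintain an existing high level of performance instead of raising it, `improvement` can be set to zero for that measure.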
Standards can also be set internally: transit agency management, often in consultation with the agency’s governing body, sets targets based on a combination of current agency performance, professional judgment, and agency goals. This method allows customer and community issues to be considered and, if the standards are updated on a regular basis, allows for continual performance improvement. It also allows standards to be directly tied to customer satisfaction, particularly when the results of a customer satisfaction survey are available to determine the level at which customers are satisfied or very satisfied. One potential flaw with this method is that the experience of other agencies is not taken into consideration.
Comparison to typical industry standards
This method builds on the work done by other agencies, under the principle that “if it’s good enough for the other guy, it should be good enough for us.”
This method has the advantage of being at least somewhat defensible—the standards were not pulled out of thin air, but are comparable to what others are doing—but it fails to consider either other agencies’ special circumstances that caused them to adopt a particular standard or the agency’s own circumstances.
The method can be useful for identifying if existing standards, or ones being considered, are considerably higher or lower than those of other agencies. A considerably higher standard may indicate that it is being set unrealistically high, while a standard that is considerably lower than others may indicate that it has not been set high enough.
When comparing other agencies’ standards, it is important not only to identify the standard itself but also any definitions used to develop the performance measure.
Another method is comparison to peer agencies: an agency identifies other agencies with similar conditions (e.g., city sizes, level of government support, fare levels, goals and objectives, cost of living index values, or other similar criteria), and determines how well those agencies are performing in the measurement categories. Standards are based on the average values of the peer agencies for given measures, or alternatively, some percentile value.
This method has the advantage of providing a realistic assessment of where an agency may have room for improvement and the ranges of performance that are being achieved by its peers. However, it requires up-front work to identify peer agencies, and both up-front and ongoing work to track performance measure results from the selected peer group. Also, not every selected peer agency may track performance in the areas that the agency setting standards is interested in.
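A peer-based standard, set either at the peer average or at a chosen percentile, can be sketched with Python's standard `statistics` module. The peer names and values are hypothetical, and the "inclusive" interpolation method is an arbitrary choice made for this illustration.

```python
from statistics import mean, quantiles

def peer_standard(peer_values, percentile=None):
    """Derive a standard from peer agencies' values for one measure.

    With no `percentile`, the standard is the peer average. Otherwise it is
    the requested percentile (1 to 99) of the peer values.
    """
    values = sorted(peer_values.values())
    if percentile is None:
        return mean(values)
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cut_points = quantiles(values, n=100, method="inclusive")
    return cut_points[percentile - 1]

# Hypothetical boardings per revenue hour reported by five peer agencies.
peers = {
    "Agency A": 20.0,
    "Agency B": 24.0,
    "Agency C": 26.0,
    "Agency D": 30.0,
    "Agency E": 35.0,
}

average_standard = peer_standard(peers)
upper_quartile_standard = peer_standard(peers, percentile=75)
print(average_standard, upper_quartile_standard)
```

A peer that does not track the measure is simply left out of `peer_values`, which shrinks the comparison group for that measure.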
A combination of the methods described above is ideal. Developing a baseline and tracking performance each year provides useful information on whether changes in a measure represent trends or 1-year statistical blips. Comparing performance to peer agencies will indicate areas of excellence or deficiency. Internal review of standards allows local conditions and objectives to be considered and should be done annually to encourage continued improvement in areas where improvement is still feasible.
Stakeholder acceptance
Several key groups of stakeholders must accept the performance measurement program for the program to have long-term viability and usefulness. Experience shows that a program initiated without broad input and support of stakeholders is likely to fail or, at a minimum, operate substantially below expectations.
Key stakeholders include agency staff, transit agency customers, the agency governing body, and service contractors.
Linkage to goals
A transit agency’s goals should reflect the most important aspects of what it wishes to accomplish. Performance measures are the primary means of assessing how successful an agency is in accomplishing its goals.
Clarity
The program’s intended audience should understand the performance measures used in the program. Acceptance of measures by stakeholders at all levels will be facilitated if the measures are easy to understand and the links between measures and goals are evident.
Reliability and credibility
The reliability of performance-measure results depends directly on the quality of the data used to calculate the measures. Some kinds of data are normally more accurate than others, and some contain errors. The methodology used to calculate a performance measure should be consistent between reporting periods. Objectivity is another aspect of reliability: performance measures should not be selected because they make the agency look good, nor avoided because they make it look bad.
Variety of measures
The performance measures used by a given transit agency should reflect a broad range of relevant issues.
Number of measures
The need for a variety of measures must be balanced against the risk of overwhelming the end user with superfluous data to sift through to find the key drivers of service quality.
Level of detail
Measures used within a performance-measurement program should be sufficiently detailed to allow accurate identification of areas where goals are not being achieved, but should not be more complex than needed to accomplish this task.
Flexibility
Goals change over time, as do external factors. A performance-measurement program should provide the flexibility needed to permit change in the future, while retaining links to necessary historical measures.
Realism of goals and targets
Targets should be realistic, but slightly out of reach, to encourage managers and employees to find ways to continually improve performance.
Timeliness
Timely reporting allows all stakeholders to understand the benefits that resulted from actions to improve service and allows agencies to quickly identify and react to problem areas.
Integration into agency decision-making
In order for the effort put into developing and monitoring a performance measurement program to be worthwhile, agencies must carefully consider what the performance results are indicating, and use the results both to evaluate the success of past efforts and to help develop ideas for improving future performance.
References
Brown, M., 1996. Keeping Score: Using the Right Metrics to Drive World-Class Performance. Quality Resources, New York, NY.
Transportation Research Board (TRB), 2003. TCRP Report 88: A Guidebook for Developing a Transit Performance-Measurement System.