Selecting appropriate measures

There are three main types of performance measures: natural measures, constructed measures, and proxy measures.

Natural Measures

Natural measures are those that follow from the nature of the objective itself. The most obvious examples are dollars (for financial or economic impacts), abundance (for species populations), probability of occurrence (for discrete events) and so on. It is best to use natural measures wherever possible. They are the most readily understood measures, as they directly describe the objective they represent. Unfortunately, natural measures are not always practical, either because of limits on our ability to model them or because of the complexity of the objective. For example, we might like to know the total number of moose in a region, but it might only be possible to estimate with any certainty the number of hectares of moose habitat. In some cases natural measures simply don’t exist. There is no natural unit, for example, for hunter satisfaction. In such cases we might instead use a constructed scale.

Constructed Scales

Constructed scales report an impact directly, but use a scale that is constructed for the decision at hand rather than one already in wide usage. Well-known examples include:

Dow Jones Industrial Average
Richter Scale for earthquakes
Apgar scale for newborns
Grade Point Average for students
Michelin Rating Systems for restaurants

Over time these have become so widely used and commonly interpreted that they function almost like natural measures. Constructed scales are a practical solution to handling difficult or complex indicators. Constructed scales can range in quality from simple survey-type scales to sophisticated and highly specific impact descriptors.

In an SDM context, a common approach is to ask experts to assign a score to each alternative using a context-specific constructed scale, called a defined impact scale. A common problem with scales generally is lack of specificity. For example, an expert may be asked simply to select a number (often between 1 and 7 or 1 and 10) that best represents the expected consequence of an alternative. While simple to design and administer, these kinds of scales are of limited value. The main problem is ambiguity about exactly what is meant by a score of two relative to a score of five or seven. If one alternative scores five and another scores seven, how much better is the second alternative than the first? Remember, at some point the decision maker may have to trade off this difference against some other measure, such as dollars. The more precise we can be in defining the difference between two alternatives, the better.

Some best practices have emerged to guide the development of defined impact scales to avoid this problem. In general, it is good practice to:

Start by defining the boundary conditions (e.g., the worst and best plausible outcomes, either across the range of alternatives or in a global context)
Choose how many categories or levels you need for an appropriate level of precision to inform the decision (you may need to iterate)
Describe each level with clear, unambiguous descriptions of the consequence or impact (so that every expert will interpret them the same way when assigning scores, and every decision maker will interpret them the same way when making value trade-offs)
Assign relative preference (this part is optional – only go here if the value function is very non-linear and you know what to do with that information!)

Here is a scale that was used to describe the expected impacts of different flow management options in a regulated river on whooping crane habitat:

Value	Description
-3	Reduction in suitable habitat of > 90 acres
-2	Reduction in suitable habitat of < 90 acres and > 45 acres
-1	Reduction in suitable habitat of < 45 acres
0	No net change in habitat suitability
1	Increase in suitable habitat of < 45 acres
2	Increase in suitable habitat of < 90 acres and > 45 acres
3	Increase in suitable habitat of > 90 acres
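
As a minimal sketch of how a defined impact scale like this one could be applied consistently, the Python function below maps an estimated net change in suitable habitat (in acres) to a score from -3 to 3. The function name habitat_score is hypothetical. The scale as written does not say how to score a change of exactly 45 or exactly 90 acres, so the sketch assumes those boundary values fall in the middle level; a real application would need to settle that with the experts involved.

```python
def habitat_score(change_acres: float) -> int:
    """Map a net change in suitable habitat (acres) to the -3..3 scale above."""
    if change_acres == 0:
        return 0  # no net change in habitat suitability

    magnitude = abs(change_acres)
    if magnitude < 45:
        level = 1
    elif magnitude <= 90:
        # Assumption: exactly 45 and exactly 90 acres are scored at the
        # middle level, since the original scale leaves them undefined.
        level = 2
    else:
        level = 3

    # Increases score positive, reductions score negative.
    return level if change_acres > 0 else -level
```

Under these assumptions, an estimated loss of 60 acres scores -2 and an estimated gain of 100 acres scores 3. Because each level is tied to a stated range of acres, two experts working from the same habitat estimate should arrive at the same score.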

Proxy Measures

Natural and constructed measures report impacts on an objective directly. However, when neither can be found or developed in a practical way, proxy measures are often used to report impacts indirectly. A common example in natural resource management is the use of habitat area as a proxy for the degree of welfare of a species.

When a proxy measure is used, the decision maker must implicitly consider the relationship between the proxy and the endpoint it represents. As long as there is a one-to-one relationship between a fundamental objective and a proxy measure, the situation is reasonably manageable for decision makers (Keeney, 1992). However, in ecological systems, the prevalence of non-linear phenomena means there are potential pitfalls with the use of proxies. This results in several interrelated problems.

Proxies hide non-linear relationships between proxy and endpoint. For example, the area of available spawning habitat serves as a good proxy indicator for the productive potential of wild salmon. However, this proxy only holds for as long as spawning habitat limits the productive capacity of salmon; after a point, adding more and more spawning habitat does not necessarily provide any benefit at all. Any such threshold effects can result in decision makers placing inappropriately high or low weights on a proxy.
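
The threshold effect described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that productive capacity rises in proportion to spawning habitat until some other factor (for example, rearing habitat or food supply) caps it. The function name, the parameter names and all of the numbers are hypothetical and are not drawn from any real salmon population.

```python
def productive_capacity(spawning_habitat_ha: float,
                        fish_per_ha: float = 100.0,
                        other_limit: float = 5000.0) -> float:
    """Hypothetical endpoint: fish supported, capped by a non-spawning limit."""
    # Below the threshold, spawning habitat is the limiting factor.
    # Above it, another factor limits production and extra habitat adds nothing.
    return min(spawning_habitat_ha * fish_per_ha, other_limit)


# The same 20 ha gain in the proxy has very different effects on the endpoint.
gain_when_limiting = productive_capacity(40) - productive_capacity(20)
gain_past_threshold = productive_capacity(80) - productive_capacity(60)
print(gain_when_limiting, gain_past_threshold)
```

With these made-up numbers, adding 20 ha supports 2,000 more fish while spawning habitat is limiting and none once the cap is reached, even though the proxy reports the same improvement in both cases. A decision maker who sees only the proxy cannot tell which situation applies.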

Proxies mask uncertainty in the relationship between proxy and endpoint. Proxy indicators can hide the uncertainty in the relationship between the proxy and the true endpoint, even while the proxy itself can be estimated with confidence. For example, although we might be quite certain of being able to produce 25% more bugs to serve as food for fish, we may have no idea whether this is what is limiting fish production. If what we really care about is fish rather than bugs, which seems likely, it is important that decision makers know that there is great uncertainty about whether an action that produces more bugs will in turn produce more fish. By choosing the proxy as the performance measure, subsequent analytical effort is focused on developing accurate estimates of the proxy, leaving potentially gaping holes in understanding about how the endpoint will respond.

Proxies obscure important value judgments. When one proxy measure serves as a proxy for multiple endpoints and the relationship with one endpoint is different than with another, important trade-offs between the endpoints may not be exposed. For example, when dust control, visual quality and wildlife are all fundamental objectives in managing a reservoir drawdown zone, it seems easy and intuitive to use the area of vegetation as a proxy for them all. However, the type of vegetation that maximizes dust control (achieved by policies that produce large areas sparsely vegetated with simple non-native grass communities) is not the same as the type of vegetation that maximizes wildlife value (achieved by policies that produce relatively smaller areas vegetated with complex shrub and cottonwood communities and a diversity of native grasses). Thus overemphasis on the proxy can start to reify the proxy and obscure dialogue about what really matters.

Proxies reinforce reliance on technical experts. In environmental risk decisions, the ability to assess the relationship between a proxy and its endpoint usually requires detailed technical knowledge. As a result, non-technical stakeholders will be forced to rely on technical specialists to provide insight into this relationship. It can become difficult to separate the technical judgment about the relationship between the measure and the endpoint from the value judgment about how much weight (importance) to give the endpoint. This tends to increase the power or perceived power of technical stakeholders, who should have no more legitimacy (and sometimes less) than non-technical participants for making value judgments.

Key Ideas

Natural measures are the most readily understandable; use them whenever possible.
Consider constructed scales if natural measures are not available. Design them carefully to avoid ambiguity.
Use caution with proxy measures.