Futarchy. In his proposal for Futarchy, Robin Hanson suggested using “hard” metrics, like GDP+, as the basis for evaluating proposals. What I find problematic is that there are often “shortcuts” for improving a metric that are not aligned with the reason the metric was chosen in the first place.
For example, a hospital can reduce its death rate by not performing difficult operations. While this would certainly improve the metric (probably more than anything else), it’s not necessarily what you’d want as a patient[1]. Let’s consider the difference between standards and rules.
EDIT: Discovered Goodhart’s law.
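To make the hospital example concrete, here is a minimal sketch of how refusing difficult operations improves the death rate. The caseload and mortality figures are invented purely for illustration, and `CaseMix` and `death_rate` are hypothetical names, not anything from Hanson's proposal.

```python
from dataclasses import dataclass


@dataclass
class CaseMix:
    """Hypothetical yearly caseload; all numbers below are made up."""
    routine_ops: int
    difficult_ops: int
    routine_mortality: float    # probability of death per routine operation
    difficult_mortality: float  # probability of death per difficult operation


def death_rate(mix: CaseMix) -> float:
    """Expected deaths divided by operations performed (the 'hard' metric)."""
    total_ops = mix.routine_ops + mix.difficult_ops
    expected_deaths = (mix.routine_ops * mix.routine_mortality
                       + mix.difficult_ops * mix.difficult_mortality)
    return expected_deaths / total_ops


honest = CaseMix(routine_ops=900, difficult_ops=100,
                 routine_mortality=0.01, difficult_mortality=0.20)
# Same hospital, but it turns away every difficult case.
gaming = CaseMix(routine_ops=900, difficult_ops=0,
                 routine_mortality=0.01, difficult_mortality=0.20)

print(f"honest: {death_rate(honest):.3f}")
print(f"gaming: {death_rate(gaming):.3f}")
```

The metric improves for the second hospital even though nothing about the quality of its care has changed; it simply stopped treating the patients who needed it most.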
Standards vs. Rules. A standard is something like “Don’t drive recklessly”. A rule is “Don’t drive faster than 54 mph”. They both try to achieve the same thing, but the rule is “hard” in the sense that it is difficult to dispute whether somebody was in fact driving faster than 54 mph or not. However, somebody could still drive recklessly while going slower than 54 mph (maybe it was raining, maybe there were pedestrians, etc.). On the other hand, if the street is empty and visibility is good, driving faster than 54 mph could be reasonable.
Vote Values. The Futarchy proposal has another component: elections over values. If the populace felt that a particular metric was being optimized too naively, they could use the election to vote for a different, more nuanced metric[2]. This could keep the problem in check, but considering the inertia of such a process, I can’t be too optimistic about it.
Also, since I’m interested in Futarchy as a management tool, I’m not sure how practical elections would be. It seems strange to constantly redefine what the success of an organization means. Successful organizations are successful. What if we asked that question more directly?
Predictocracy. Another approach, Predictocracy, would randomly select a “decision maker” at some point in the future to approve or disapprove of a decision reached by a market. The idea is that it is easier to evaluate a decision after the fact. In the organizational context, the question asked could be “Was this decision a good idea?” While certainly not perfect, the approach could be “good enough”[3].
Basically, traders would predict the approval rating of a decision, not the effects of the decision itself. This adds a layer of human judgement between traders and reality, albeit a fragile one. It certainly matters whom you ask and how many people you ask.
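As a rough sketch of what “predicting the approval rating” could mean for a trader, assume that one share pays out the fraction of evaluators who later approve. That payout rule and the `settle` function are my own assumptions for illustration; the Predictocracy proposal may settle markets differently.

```python
def settle(price_paid: float, approvals: int, sample_size: int) -> float:
    """Profit per share for a trader who bought an 'approve' share.

    Assumption: one share pays out the approval fraction (0.0 to 1.0)
    observed in the later evaluation, so a fair price equals the
    expected approval rating.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= approvals <= sample_size:
        raise ValueError("approvals must be between 0 and sample_size")
    approval_rating = approvals / sample_size
    return approval_rating - price_paid


# A trader pays 0.60 per share; 15 of 20 evaluators later approve.
profit = settle(price_paid=0.60, approvals=15, sample_size=20)
print(f"profit per share: {profit:+.2f}")
```

Under this rule the trader is rewarded for anticipating how people will judge the decision, not for anticipating what the decision actually causes.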
Who. You could ask anybody who cares about the organization, basically anybody in its address book. Their motivation to be correct, and their information about what happened, might be very limited, though. It would be better to ask shareholders only. But should you select them uniformly at random, or at random in proportion to the number of shares they own? In any case, a mechanism similar to Augur’s lie detector might be necessary to prevent collusion.
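The two selection schemes for shareholders can be sketched as follows. The cap table is hypothetical, and drawing with replacement in the stake-weighted case is an assumption on my part; it means a large holder can be picked more than once.

```python
import random

# Hypothetical cap table: shareholder name -> number of shares.
shares = {"alice": 700, "bob": 200, "carol": 90, "dave": 10}


def pick_uniform(rng: random.Random, k: int) -> list[str]:
    """Every shareholder is equally likely, regardless of stake."""
    return rng.sample(sorted(shares), k)


def pick_by_stake(rng: random.Random, k: int) -> list[str]:
    """Selection probability is proportional to shares owned.

    Draws with replacement, so a large holder can be picked more than once,
    which effectively gives them several votes.
    """
    names = sorted(shares)
    weights = [shares[name] for name in names]
    return rng.choices(names, weights=weights, k=k)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
print(pick_uniform(rng, 2))
print(pick_by_stake(rng, 2))

# Over many single draws, alice is picked about 25% of the time under
# uniform selection but about 70% of the time when weighted by stake.
draws = 10_000
uniform_alice = sum(pick_uniform(rng, 1)[0] == "alice" for _ in range(draws)) / draws
stake_alice = sum(pick_by_stake(rng, 1)[0] == "alice" for _ in range(draws)) / draws
print(f"uniform: {uniform_alice:.2f}, by stake: {stake_alice:.2f}")
```

Weighting by stake favors the people with the most to lose, but it also makes it much cheaper to predict (and possibly influence) who will end up judging the decision.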
How many. It also matters how many people are going to be asked. A smaller sample size means more extreme outcomes become more likely, but a bigger sample size takes up more resources. The market depth is the best indicator for the importance of a decision, so it should be the basis for the sample size.
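To see how strongly sample size affects the chance of an unrepresentative verdict, here is a small binomial calculation. It assumes each evaluator approves independently with the same probability, which is a simplification, and the 60% figure is made up.

```python
from math import comb


def prob_sample_rejects(true_approval: float, n: int) -> float:
    """Probability that at most half of n random evaluators approve,
    when each approves independently with probability true_approval.
    (Ties count as a rejection here, which is an arbitrary choice.)
    """
    return sum(
        comb(n, k) * true_approval**k * (1 - true_approval) ** (n - k)
        for k in range(0, n // 2 + 1)
    )


# Assume 60% of the whole organization would approve the decision.
for n in (1, 5, 25, 101):
    print(n, round(prob_sample_rejects(0.60, n), 3))
```

With a single evaluator the “wrong” verdict comes up 40% of the time; the probability shrinks as the sample grows, which is the trade-off against the cost of asking more people.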
However, should it be between 1 and 100 people (i.e. more like a jury) or between 1 and the size of the organization (i.e. more like an election)? A jury is more vulnerable to corruption, while an election is more vulnerable to manipulation. Assuming that manipulating an election is more expensive than corrupting a jury, the election is probably the safer alternative.
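One way to tie the sample size to market depth is a simple capped linear scaling, sketched below. The linear rule, the `reference_depth` parameter, and all numbers are assumptions for illustration only; the only difference between the “jury” and the “election” variant is the cap.

```python
def sample_size(market_depth: float, reference_depth: float, cap: int) -> int:
    """Scale the number of evaluators linearly with market depth.

    market_depth and reference_depth are in the same (unspecified) unit,
    e.g. money at stake; reference_depth is the depth at which the cap is
    reached. The result is always between 1 and cap.
    """
    if reference_depth <= 0 or cap < 1:
        raise ValueError("reference_depth must be > 0 and cap >= 1")
    fraction = max(0.0, min(1.0, market_depth / reference_depth))
    return max(1, round(fraction * cap))


org_size = 2_000  # hypothetical headcount
for depth in (50, 5_000, 50_000):
    jury = sample_size(depth, reference_depth=50_000, cap=100)
    election = sample_size(depth, reference_depth=50_000, cap=org_size)
    print(depth, jury, election)
```

For unimportant decisions both variants ask only a handful of people; they diverge only once the market signals that a lot is at stake.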
Coordination is hard. These considerations remind me once more that all human coordination is hard. The goal can’t be to find the “perfect” organizational structure, but one that is reasonably better than existing ones. Whether some variation of Futarchy, such as the one I’ve dreamed up in this post, can eventually achieve this goal remains to be seen.
[1] You could make the argument that the metric is just too simple. Using death rate for routine procedures plus success rate for non-routine procedures would solve the problem. Maybe ever more complex metrics are the solution (similar to how the law becomes ever more complex).
[2] Expanding on the previous footnote, elections would be the process by which the metrics become ever more complex.