"Breakthrough Ideas" - The Acceptability Envelope

Tuesday @ OOPSLA

Of all the short talks in the Breakthrough Ideas session, Martin Rinard's presentation prompted the greatest number of questions and comments from the audience. He gave a very quick and intense talk on Exploring the Acceptability Envelope. The point of the talk was first to propose that we can consider software "correctness" by using statistical analysis and probability to arrive at a level of acceptability in software, and then to present some examples supporting this idea. In short, he favored probabilistic reasoning over logical certainty. After thinking it over for a while, I admit the idea started to appeal to me.

My way into the question was to ask why we look for (and expect) certainty in the answers provided to us by our software, while at the same time tolerating a certain level of non-critical errors that keep us from getting the answers we want.

To paraphrase one way of putting the point (my words only-- hopefully I'm not butchering the intent of the presentation too badly): given a process made up of "x" tasks, with "n" of them failing at any given time, it is possible to reason about the outcome of the process in circumstances where "n" is either low, or where the "n" failures represent non-critical parts of the entire process.

For example, if 100 tasks must complete in order to determine a given outcome, and any one of those tasks is susceptible to failure, then rather than returning an error (or no result) to the user, the system should be able to estimate the likelihood of the outcome. I take the term "outcome" here to be contextual: in mission-critical systems, for example, we may need to know that all required tasks completed successfully before we proceed.
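To make that concrete, here's a minimal sketch of how such reasoning might look. The toy model is entirely my own-- the success rates and the critical/non-critical split are invented for illustration, not taken from the talk-- and it treats tasks as independent, with only critical failures sinking the outcome:

```python
# A toy model of reasoning about outcomes under partial failure. The
# numbers and the critical/non-critical split are my own illustrative
# assumptions, not anything from Rinard's talk.

def outcome_probability(tasks):
    """Probability that every *critical* task succeeds, treating
    tasks as independent; non-critical failures are tolerated."""
    p = 1.0
    for success_rate, critical in tasks:
        if critical:
            p *= success_rate
    return p

# 100 tasks, each with a 2% chance of failing. Suppose only the
# first 5 actually determine the outcome; the other 95 merely
# refine it.
tasks = [(0.98, i < 5) for i in range(100)]

print(f"Chance the outcome holds anyway: {outcome_probability(tasks):.3f}")  # ~0.904
# A strict all-or-nothing reading of the same process:
print(f"Chance all 100 tasks succeed:    {0.98 ** 100:.3f}")                 # ~0.133
```

The gap between those two numbers is the whole argument: demanding that all 100 steps succeed makes the process fail most of the time, even though the answer it would have produced was almost certainly fine.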

On the other hand, when booking a trip to Memphis for the weekend and requesting an aisle seat, I could see probabilistic reasoning being pretty helpful. If the steps involved include writing to the database and interacting with two third-party systems to make the request, failure to receive a response from one of those systems will probably produce either an error message back to the user or a long wait. As it is, errors are propagated back, and long waits are often mitigated by putting the user's request in a "wait" state, setting a placeholder, and resorting to asynchronous communication with the third-party systems. This solves the technical problem, but not the user's problem-- users often don't want to know that certain parts of their request are in a "wait" state. And as software designers, we shouldn't presume anything about the user's workflow apart from the software! Maybe they need an answer right away in order to complete another request-- if so, the development effort put into implementing the asynchronous communication doesn't help them at all.

That said, why not avoid the complexity of dealing with the asynchronous messages in the first place, give up on the third-party response, and present the user with a semi-intelligent conclusion based on the probable outcome? If the probability of success is high even with a given failure rate, better to inform the user of success-- and reduce the overall complexity of the system at the same time-- than to either leave the user in limbo or (worse) show them some meaningless error message like "Unable to communicate with third-party system".
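Here's roughly how that fallback might look. This is a hedged sketch of my own: the function names, the 0.9 historical success rate, and the acceptability threshold are all assumptions for illustration, not anything presented in the talk.

```python
import concurrent.futures
import time

# Assumed numbers (mine, not the talk's): this partner historically
# honors seat requests ~90% of the time, and we treat anything above
# 80% as acceptable to report optimistically.
SEAT_REQUEST_SUCCESS_RATE = 0.9
ACCEPTABILITY_THRESHOLD = 0.8

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)

def request_aisle_seat(partner_api, timeout_seconds=2.0):
    """Call the (hypothetical) third-party system; if it doesn't
    answer in time, fall back to probabilistic reasoning instead of
    parking the user in a "wait" state or surfacing a raw error."""
    future = pool.submit(partner_api)
    try:
        confirmed = future.result(timeout=timeout_seconds)
        return "aisle seat confirmed" if confirmed else "aisle seat unavailable"
    except concurrent.futures.TimeoutError:
        # Abandon the slow call and answer from the historical rate.
        if SEAT_REQUEST_SUCCESS_RATE >= ACCEPTABILITY_THRESHOLD:
            return "aisle seat requested-- almost certainly confirmed"
        return "couldn't confirm the aisle seat; we'll follow up"

# Example: a partner call that won't return in time. The abandoned
# call still finishes quietly in the background.
print(request_aisle_seat(lambda: time.sleep(10) or True, timeout_seconds=0.5))
```

The point isn't the threading details; it's that the user gets a useful, honest-enough answer immediately, and the whole placeholder-plus-asynchronous-messaging machinery never has to be built.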

This all may sound weird, I know. I thought so at first as well. But then I started thinking about it in human terms-- were I a person standing in the room giving you directions around an unfamiliar part of town, I might be wrong about certain landmarks, the distance between two streets, the number of stoplights between here and there, and so on. Yet somehow, while I reason about the directions and guide you through them, I don't withhold all help just because I completely forgot one step in the process (like how to get out of the parking lot, for example). Likewise, person-to-person, you're more likely to find my directions useful in some way, shape or form, even if I am unsure about a few steps. Why couldn't we approach software like that? Take the worst-case scenario in each instance: in the person-to-person example, if my directions are completely off, you can probably stop and ask for directions. In the software example, you will probably call a help desk anyway.

Software is just going to keep gaining in complexity, and systems will need more and more work to integrate with each other. It's not a bad idea to ask ourselves where we want to invest our time developing these systems. Hire a staff of thirty to chase bugs everywhere and sanitize a limited number of "happy paths" through the system? Or use the same size or smaller team to focus on the critical elements of the system, applying probabilistic reasoning to the other parts, so the user gets some sort of feedback and can get back to more important things?

In any case, I found Rinard fairly engaging as a speaker. Searching MIT's site, I found some interesting material he's published that you might want to check out as well.
