Mistakes By TjM: "Applied Rationality Training Regime" #20: OODA Loop

January 20 brings us to the Training Regime Day 20: OODA Loop. The OODA loop idea is adapted by lots of people in lots of ways; I'm going to adapt Mark Xu's adaptation of the CFAR adaptation of Colonel Boyd's original (or whatever intermediate form CFAR was working from). Let's try a textual diagram:

1.Observe [update model of world-region]

=> O-O transition => [*end updating world model]

2.Orient [update model of available operations]

=> O-D transition => [*end updating ops/self model]

3.Decide [update model of choices -- What I Do Next]

=> D-A transition => [*end updating choices model]

4.ACT [update world]

<= A-O transition <= [*end action, back to 1.Observe]

It's easier to type it this way, but actually I visualize these on a clock-face with Observe at 12, Orient at 3, and so on. Each one might take a fraction of a second, or years. I think of each of the stages as having its own heuristics for making pretty-good guesses, and each transition as having heuristics for ending the corresponding stage, saying "never mind perfection, we've got to move on." If you use bad heuristics, that's your failure in an OODA stage or transition; if you use good heuristics and fail anyway, that's.... well, that's too bad. Note that 4.ACT can make changes in the outside world or in your internal structures; the other stages and transitions are internal.
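For readers who think in code, here is one rough way to sketch the diagram above. It is only a sketch under my own assumptions: the names `Stage`, `run_ooda`, and `demo_stages` are made up for illustration, as are the toy thresholds. Each stage gets a `step` (one round of pretty-good guessing) and a `done` test (the transition heuristic that says "never mind perfection, we've got to move on"), plus a hard cap so that a stage cannot stall forever.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    step: Callable[[dict], None]   # one round of guessing within the stage
    done: Callable[[dict], bool]   # transition heuristic: good enough, move on
    max_steps: int = 10            # hypothetical hard cap on time spent in a stage


def run_ooda(stages: List[Stage], state: dict, laps: int = 1) -> List[str]:
    """Walk the stages in order, leaving each one when its transition test fires."""
    trace = []
    for _ in range(laps):
        for stage in stages:
            for _ in range(stage.max_steps):
                stage.step(state)
                trace.append(stage.name)
                if stage.done(state):
                    break
    return trace


def demo_stages() -> List[Stage]:
    """Toy stages with invented thresholds, purely for illustration."""
    return [
        Stage("Observe",
              lambda s: s.update(facts=s["facts"] + 1),
              lambda s: s["facts"] >= 3),
        Stage("Orient",
              lambda s: s.update(options=s["options"] + 1),
              lambda s: s["options"] >= 2),
        Stage("Decide",
              lambda s: s.update(choice="write post"),
              lambda s: s["choice"] is not None),
        Stage("Act",
              lambda s: s.update(world=s["world"] + 1),
              lambda s: True),
    ]
```

Starting from `{"facts": 0, "options": 0, "choice": None, "world": 0}`, one lap spends three steps in Observe, two in Orient, and one each in Decide and Act. A bad `done` test (one that fires too early or never) is the code version of a bad transition heuristic.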

Consider an example: I'm going through this "Training Regime" as an OODA loop in which each day I try to update some part of my world-model to understand the day's exercise, update my self-model of what actions I can take, update my model of my actual choices (i.e., make one), and then write something. Do my heuristics make sense? A sub-example: one peculiar heuristic that I don't see other people using is that I often throw in some random alternate-point-of-view boggling by looking up whose birthday it is. Today is the birthday of Buzz Aldrin, the second man to be able to say (as he does, dunno if Armstrong ever did) "Shoot for the moon; you might get there." That's an interesting heuristic about the Orient phase of a loop. Come to think of it, there's a meta-heuristic about that, something connecting identity with abilities, implicit in a quote from DeForest Kelley: "Dammit, Jim, I'm an actor, not a doctor." Something like that. It's his birthday too; he'd be 101. I'm not going anywhere in particular with these, but they do make me think about the stages in a different light, forced by random input.

I suppose that every sequence of conscious actions (or actions which might be raised to consciousness by Noticing) can be usefully viewed as an OODA loop, even if one or more of the stages/transitions may have been made automatic, i.e. made into TAP lines.

Indeed, consider a "TAP loop" -- a phrase I may have just now invented, but surely the idea is obvious: the pattern you detect (in yourself or in your surroundings) triggers an action which changes the pattern to something new which triggers an action which changes the pattern.... I hereby claim that a TAP loop is an OODA loop, stripped to the bone by pre-computing as much as possible with a set of simplifying assumptions which, if true, imply the appropriateness of the Action associated with any given Trigger. Right? Observe, Orient, Decide all get folded together into a single pattern-recognition, and of course under some circumstances that'll work. Then again, this morning as I was going through the TAP-loop that gets my breakfast ready, I happened to have a side-thought about cleanup while I was still on preparation, which led to my putting the granola away before I'd added a bit on top of my chopped apple with yogurt. TAP loops are fragile, fully-worked-out OODA loops less so, but working out the stages is a lot more work than just following a TAP.
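A TAP loop in this sense can be sketched as a plain lookup table, with Observe, Orient, and Decide folded into a single dictionary lookup. Again this is a sketch under assumptions: the pattern and action labels in `TAPS` and the function `run_tap_loop` are invented for illustration, loosely following the breakfast example.

```python
# Each observed pattern maps to (action, pattern expected afterwards).
# The labels are hypothetical; only the ingredients come from the breakfast story.
TAPS = {
    "empty bowl": ("chop apple", "apple in bowl"),
    "apple in bowl": ("add yogurt", "yogurt on apple"),
    "yogurt on apple": ("add granola", "granola on top"),
    "granola on top": ("put granola away", "breakfast ready"),
}


def run_tap_loop(taps, pattern, max_steps=20):
    """Follow trigger -> action -> new trigger until no trigger matches."""
    actions = []
    for _ in range(max_steps):
        if pattern not in taps:
            break  # unrecognized pattern: the loop has nothing pre-computed for it
        action, pattern = taps[pattern]
        actions.append(action)
    return actions, pattern
```

Started from "empty bowl", the table runs straight through to "breakfast ready". Started from any pattern it has no entry for, it does nothing at all, which is the fragility: the table only works while its simplifying assumptions hold, and it has no Observe or Orient stage to notice when they don't.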

OODA loops can obviously be nested: any of the four parts can and likely will be a composite development, as when we start a loop of building up world-model information for a decision. OODA loops can obviously overlap, as when a sequence of discussions happens to overlap with a sequence of meals, each with its own observations, orientations, etc. I think that I'm gonna change the way I diagram events in my head: I'm gonna try to add the clockwise-circle-arrow symbol, U+2941: ⥁. Maybe that'll remind me to look for the OODA loops within and surrounding each human (or AI) action I see.

(And as yet another example, this particular January 20th has seen a presidential transition, and all of us have to Observe and Orient and Decide and Act once more.... I did vote for this outcome, but I can't claim to be happy about it. It's less bad than the alternative, at least in the short and medium runs, but my 2016 hope that the Blue Tribe would take up the banner of constitutional limits seems to have been bad modelling on my part. So it goes.)

Labels: rationality

Wednesday, January 20, 2021