Thursday, January 28, 2021

"Applied Rationality Training Regime" Review#3

 

   < ^  |  >

 

January 28, Review Day #3, and this morning, while walking on an icy road with my daughter, I explained that the imaginary cartoon spider that was TAP-dancing on her head had just reminded me of what she'd said earlier on the walk. Through the day it happened again, a couple more times -- Checklist Charlie has actually become somewhat useful. Can I "systemize" on that basis? Maybe, at least partly. Marian also spoke up when I was about to say the wrong thing -- well, I think it would have been the wrong thing. It would be nice to believe that I can become less wrong than I was. Charlie did fail me once, and I think it was because my cartoon-spider-TAP-dance visualization wasn't sufficiently real. I need a proper mix of fact and fancy... it's not quite enough to say with Kathleen Lonsdale (the crystallographer who flattened benzene, whose birthday it is) that "in science as in the arts, there is very little worth having that does not require the exercise of intuition as well as of intelligence, the use of imagination as well as of information." I've gotta imagine the right things, making the right use of the information.

And it's also Arthur Rubinstein's birthday: "Love life, and life will love you back." This, too, is applied rationality.

I keep thinking that "Applied rationality" is such an awkward name. I like "selves-control" or "selves-training" better, but I can't imagine anyone else would.

I was just looking at "Mental subagent implications for AI Safety":

Assume, for the sake of argument, that breaking a human agent down into distinct mental subagents is a useful and meaningful abstraction. The appearance of human agentic choice arises from the under-the-hood consensus among mental subagents. Each subagent is a product of some simple, specific past experience or human drive. Human behavior arises from the gestalt of subagent behavior. Pretend that subagents are ontologically real. ...Some agents are going to be dis-endorsed by almost every other relevant agent. I think inside you there's probably something like a 2-year-old toddler who just wants everyone to immediately do what you say.

I thought at first "A-ha! That's exactly how I think of it, it's what I've been doing throughout." But now I don't think it's quite right. It's not that your stories, your subagents, are ontologically real so much as it is that they will have been ontologically "real", as real as your overall self. They are, and you are, stories that you tell your selves. Tentatively, I see no reason to believe that more than a small fraction of them are products of simple, specific past experiences or human drives, either. And I don't think this has the same implications for AI safety, but it might. 

