Bayes Applet Proposal

From The Transhumanist Wiki

This proposal has now been implemented by Christian Rovner.


This is a proposal for a graphical Java applet that would demonstrate the application of Bayes' Theorem.

Suppose that we're administering a cancer screening test, that the prior probability a patient has cancer is 25%, that the conditional probability of a 'positive' result on the test given cancer is 98%, and that the conditional probability of a 'positive' result on the test given health is 20%. (False negative rate of 2%, false positive rate of 20%.) If you receive a positive result on the test, Bayes' Theorem says that your updated probability is:

p(cancer|positive) =
p(cancer & positive) / p(positive) =

                p(positive|cancer) * p(cancer)
___________________________________________________________________
[p(positive|cancer) * p(cancer)] + [p(positive|health) * p(health)]

= (.98 * .25) / [(.98 * .25) + (.20 * .75)]
= 62%
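
As a rough sketch, the arithmetic above can be written as a few lines of Java. The class and method names are illustrative only and are not part of the proposal; the method assumes p(positive) is greater than zero.

```java
/** Illustrative only: names are assumptions, not part of the proposal. */
final class BayesExample {
    /** Returns p(A|X) given p(A), p(X|A), and p(X|~A). Assumes p(X) > 0. */
    static double posterior(double prior, double condGivenA, double condGivenNotA) {
        double jointA = condGivenA * prior;                 // p(X&A)
        double jointNotA = condGivenNotA * (1.0 - prior);   // p(X&~A)
        return jointA / (jointA + jointNotA);               // p(X&A) / p(X)
    }

    public static void main(String[] args) {
        // Cancer example: prior 25%, p(positive|cancer) 98%, p(positive|health) 20%.
        double p = posterior(0.25, 0.98, 0.20);
        System.out.println("p(cancer|positive) is about " + Math.round(100 * p) + "%");
    }
}
```

For the cancer example, the intermediate values are .245 / (.245 + .15) = .245 / .395, which is roughly 0.62.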

For this case, the applet would produce a graphic that looked like this:

 ______________________________ __________
[______________________________|__________]
 \                            /          /
  \                          /          /
   \                        /          /
    \                      /          /
     \                    /          /
      \                  /          /
       \                /          /
        \              /          /
         \            /          /
          \          /          /
           \        /          /
            \______/__________/
            [______|__________]

Note that the bottom bar is centered relative to the top bar and that the proportions are correct in both cases.

If you label the parts of this diagram, they are:


 _____________________all patients________
[________healthy patients______|__cancer__]
 \                            /          /
  \                          /          /
   \    positive|health     /          /
    \                      /positive| /
     \                    /  cancer  /
      \                  /          /
       \                /          /
        \              /          /
         \            /          /
          \          /          /
           \        /          /
            \______/__________/
            [_p&h__|___p&c____]
     all patients with positive results

Bayes' Theorem is:

p(A|X) = p(X&A)/p(X) =
p(X|A)*p(A) / [p(X|A)*p(A) + p(X|~A)*p(~A)]

The labeled diagram for Bayes' Theorem is:

 ______________________________ __________
[______________p(~A)___________|___p(A)___] = p(1)
 \                            /          /
  \                          /          /
   \      p(X|~A)           /  p(X|A)  /
    \                      /          /
     \                    /          /
      \                  /          /
       \                /          /
        \              /          /
         \            /          /
          \          /          /
           \        /          /
            \______/__________/
            [_X&~A_|__p(X&A)__] = p(X)
                   /
                  /
                p(A|X) = p(X&A)/p(X)

At minimum, I'd like to see an applet that produces the shape itself, so that a screen capture would give me a diagram to label separately in an image editor. The long bar at the top should be 3D (yes, I've read The Visual Display of Quantitative Information, I've seen "the duck", and I still want the top bar to be 3D, okay?), divided into two sections with customizable colors (applet parameters), with (by default) the left section a light shade of gray and the right section a slightly darker (but easily distinguishable) shade of gray.

Suppose the applet width is set to 320 pixels and the top bar is set to take up a total of 303 pixels. The front of the 3D bar, the surface that faces the reader, would then have (in this example) 225 light gray pixels and 75 dark gray pixels, with the remaining 3 pixels allocated to the black lines of the wireframe: one pixel for the left edge, one for the right edge, and one for the separator line (which does not count toward either side). The lower bar would have 122 pixels: 45 light gray pixels, 74 dark gray pixels, and 1 pixel each for the left edge, right edge, and separator. The lines from the lower bar to the upper bar should be continuous, and 3D (this is the reason for the 3D-ness of the graph: to show projection). I.e.:

   _______________
  /______________/|
 |_______________|/
 \\  \         ///
  \\  \_______///
   \\ /______/|/
    \|_______|/
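
The pixel counts in the example above can be reproduced with a small calculation. This is only a sketch: the class name, the method names, and the choice to round each fill width independently are assumptions, not requirements of the proposal.

```java
/** Illustrative only: reproduces the 225/75 and 45/74 pixel split from the example. */
final class BarPixels {
    /** Pixels taken by the wireframe: left edge, right edge, and separator. */
    static final int FRAME_PIXELS = 3;

    /** Rounds a fraction of the drawable width to whole pixels. */
    static int pixels(int drawable, double fraction) {
        return (int) Math.round(drawable * fraction);
    }

    /** Returns {topLeft, topRight, bottomLeft, bottomRight} fill widths in pixels. */
    static int[] fillWidths(int topBarWidth, double prior,
                            double condGivenA, double condGivenNotA) {
        int drawable = topBarWidth - FRAME_PIXELS;
        return new int[] {
            pixels(drawable, 1.0 - prior),                    // p(~A)
            pixels(drawable, prior),                          // p(A)
            pixels(drawable, (1.0 - prior) * condGivenNotA),  // p(X&~A)
            pixels(drawable, prior * condGivenA)              // p(X&A)
        };
    }
}
```

Calling fillWidths(303, 0.25, 0.98, 0.20) gives 225, 75, 45, and 74. The last value is 73.5 before rounding, so it sits on a rounding boundary. Rounding each width independently can also make the two sections of a bar sum to one pixel more or less than the drawable width for other inputs.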

I would suggest using floating-point numbers to calculate all coordinates from the probabilities, converting to integer pixel positions only at the point of drawing.
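
One way this might look, as a sketch: the edges of both bars are kept as doubles, the bottom bar is centered under the top bar, and rounding happens in a single helper that is called only when drawing. All names here are hypothetical, and the wireframe pixels are ignored for simplicity.

```java
/** Illustrative only: bar geometry in fractional pixels, rounded at draw time. */
final class BayesGeometry {
    /** Returns {leftEdge, separator, rightEdge} for a bar centered on centerX. */
    static double[] barEdges(double centerX, double leftWidth, double rightWidth) {
        double left = centerX - (leftWidth + rightWidth) / 2.0;
        return new double[] { left, left + leftWidth, left + leftWidth + rightWidth };
    }

    /** Top bar: p(~A) on the left, p(A) on the right. */
    static double[] topBar(double width, double prior) {
        return barEdges(width / 2.0, width * (1.0 - prior), width * prior);
    }

    /** Bottom bar: p(X&~A) on the left, p(X&A) on the right, centered under the top bar. */
    static double[] bottomBar(double width, double prior,
                              double condGivenA, double condGivenNotA) {
        return barEdges(width / 2.0,
                        width * (1.0 - prior) * condGivenNotA,
                        width * prior * condGivenA);
    }

    /** The only place a coordinate becomes an integer pixel position. */
    static int toPixel(double coordinate) {
        return (int) Math.round(coordinate);
    }
}
```

With a width of 300 and the cancer example, the top bar edges are 0, 225, and 300, and the bottom bar edges are 90.75, 135.75, and 209.25. The trapezoids are then the polygons joining corresponding top and bottom edges.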

The p(X|A) and p(X|~A) areas of the projection (the trapezoidal areas) should also have customizable colors, but by default they should be lighter shades of the p(A) and p(~A) colors, respectively. It would also be nice to have customizable colors for the p(X&A) and p(X&~A) areas. If side polygons of the 3D shapes, etc., need colors, then offer a default way of calculating them but, what the heck, make them customizable as well, again via applet parameters.
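
A default "lighter shade" could be derived by blending a color toward white, with an applet parameter overriding it when present. The sketch below works on packed 0xRRGGBB integers so it does not depend on any drawing toolkit. The "#RRGGBB" parameter format, the blend formula, and all names are assumptions for illustration.

```java
/** Illustrative only: default shade calculation with an optional override. */
final class DefaultShades {
    /** Blends a packed 0xRRGGBB color toward white; amount 0 = unchanged, 1 = white. */
    static int lighten(int rgb, double amount) {
        int r = blend((rgb >> 16) & 0xFF, amount);
        int g = blend((rgb >> 8) & 0xFF, amount);
        int b = blend(rgb & 0xFF, amount);
        return (r << 16) | (g << 8) | b;
    }

    private static int blend(int channel, double amount) {
        return (int) Math.round(channel + (255 - channel) * amount);
    }

    /** Parses a hypothetical "#RRGGBB" parameter value, or returns the fallback if absent. */
    static int colorOrDefault(String param, int fallback) {
        if (param == null || param.trim().isEmpty()) {
            return fallback;
        }
        String hex = param.trim();
        if (hex.startsWith("#")) {
            hex = hex.substring(1);
        }
        return Integer.parseInt(hex, 16) & 0xFFFFFF;
    }
}
```

For example, colorOrDefault(null, lighten(0x808080, 0.5)) falls back to 0xC0C0C0, a lighter gray derived from the p(A) color.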

The applet must gracefully display cases where prior, conditional, or final probabilities end up equal to 0 or 1.
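
The awkward case is p(X) = 0, where the final probability is undefined because the lower bar has zero width. A minimal sketch of one way to handle it follows; returning an empty OptionalDouble is an assumption about what "graceful" should mean here, and the names are hypothetical.

```java
import java.util.OptionalDouble;

/** Illustrative only: posterior calculation that tolerates probabilities of 0 and 1. */
final class SafeBayes {
    /** Returns p(A|X), or empty when p(X) is zero and the posterior is undefined. */
    static OptionalDouble posterior(double prior, double condGivenA, double condGivenNotA) {
        checkProbability("prior", prior);
        checkProbability("condGivenA", condGivenA);
        checkProbability("condGivenNotA", condGivenNotA);
        double jointA = prior * condGivenA;                        // p(X&A)
        double evidence = jointA + (1.0 - prior) * condGivenNotA;  // p(X)
        if (evidence == 0.0) {
            return OptionalDouble.empty();  // nothing to draw in the lower bar
        }
        return OptionalDouble.of(jointA / evidence);
    }

    /** Rejects NaN and anything outside [0, 1]. */
    private static void checkProbability(String name, double p) {
        if (!(p >= 0.0 && p <= 1.0)) {
            throw new IllegalArgumentException(name + " must be in [0, 1]: " + p);
        }
    }
}
```

The drawing code could then show a collapsed lower bar and a placeholder instead of a number whenever the result is empty.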

This is a minimum Bayes' Applet, suitable for producing editable screenshots. A medium-level Bayes' Applet would also have the following functions:

User-controllable prior probabilities and conditional probabilities, via text input boxes, e.g.:

 ______________________________ __________
[______________________________|__________]
 \                            /          /
  \                          /          /
   \                        /          /  Prior probability:
    \                      /          /              p(cancer): [25.0%]
     \                    /          /
      \                  /          /     Conditional probabilities:
       \                /          /        p(positive|cancer): [98.0%]
        \              /          /         p(positive|health): [20.0%]
         \            /          /
          \          /          /         Final probability:
           \        /          /            p(cancer|positive):  62.0%
            \______/__________/
            [______|__________]

This would enable users to change the probabilities and see how the differential between the two conditional probabilities slides the prior probability to the final probability. The labels for the probabilities would be applet parameters.
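
The text boxes need to turn input such as "25.0%" into a probability and to format the final probability for display. A sketch under the assumption that values are entered as percentages with an optional percent sign (names are hypothetical):

```java
import java.util.Locale;

/** Illustrative only: converts between percentage text and probabilities. */
final class PercentField {
    /** Parses text such as "25.0%" or "25" into a probability in [0, 1]. */
    static double parse(String text) {
        String t = text.trim();
        if (t.endsWith("%")) {
            t = t.substring(0, t.length() - 1).trim();
        }
        // Double.parseDouble throws NumberFormatException on malformed input.
        double value = Double.parseDouble(t) / 100.0;
        if (!(value >= 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException("percentage out of range: " + text);
        }
        return value;
    }

    /** Formats a probability with one decimal place, e.g. 0.25 becomes "25.0%". */
    static String format(double probability) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * probability);
    }
}
```

An applet could re-run the calculation and repaint whenever a box parses successfully, and leave the diagram unchanged when it does not.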

Finally, a deluxe Bayes' Applet would have the ability to actually apply user-specifiable captions in the following areas:

 _____________________all patients________
[________healthy patients______|__cancer__]
 \                            /          /
  \                          /          /
   \    positive|health     /          /  Prior probability:
    \                      /positive| /              p(cancer): [25.0%]
     \                    /  cancer  /
      \                  /          /     Conditional probabilities:
       \                /          /        p(positive|cancer): [98.0%]
        \              /          /         p(positive|health): [20.0%]
         \            /          /
          \          /          /         Final probability:
           \        /          /            p(cancer|positive):  62.0%
            \______/__________/
            [_p&h__|___p&c____]
     all patients with positive results

Labeling these by area for the sake of applet parameters, we might have:

 ______________TopBar1______TopSep________
[______________TopLeft_________|_TopRight_] TopBar2
 \                            /          /
  \                          /          /
   \      CondLeft          /CondRight /
    \                      /          /
     \                    /          /
      \                  /          /
       \                /          /
        \              /          /
         \            /          /
          \          /          /
           \        /          /
            \______/__________/
            BotLeft|_BotRight_] BotBar
                   /
              BotSep

It would be helpful if the applet behaved intelligently when the labeled areas shrink, for example by offsetting the caption to a nearby point. Note that if the entire final area shrinks to zero (both conditional probabilities are zero), it will still be necessary to avoid collisions between captions.

Also useful would be the ability to draw lines from these points to controls set at the side. In addition to controlling the prior and conditional probabilities, we might also want to display groups of patients, for example:

  • Group 1: 750 healthy
  • Group 2: 250 cancer
  • Group A: 600 healthy & negative
  • Group B: 150 healthy & positive
  • Group C: 5 cancer & negative
  • Group D: 245 cancer & positive

Then draw lines from these groups to the appropriate sections of the graph. This would require the ability to specify an (x, y) position at the right side of the applet for the various quantities that one is given the option of calculating, along with the label. For example, the last line above might appear as "400, 100;BotRight;Group D:  %NNN cancer & positive" in an applet parameter.
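
A parameter in that "x, y;Area;Label" shape could be parsed roughly as follows. This is a sketch only: the class, its fields, and the rule that %NNN is replaced by a head count out of a fixed total are assumptions based on the single example given for Group D (245 cancer & positive).

```java
/** Illustrative only: one parsed "x, y;Area;Label" applet parameter. */
final class GroupLabel {
    final int x;
    final int y;
    final String area;      // e.g. "BotRight"
    final String template;  // label text containing the %NNN placeholder

    private GroupLabel(int x, int y, String area, String template) {
        this.x = x;
        this.y = y;
        this.area = area;
        this.template = template;
    }

    /** Splits on the first two semicolons, so the label itself may contain more. */
    static GroupLabel parse(String param) {
        String[] parts = param.split(";", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected \"x, y;Area;Label\": " + param);
        }
        String[] xy = parts[0].split(",");
        if (xy.length != 2) {
            throw new IllegalArgumentException("expected \"x, y\": " + parts[0]);
        }
        return new GroupLabel(Integer.parseInt(xy[0].trim()),
                              Integer.parseInt(xy[1].trim()),
                              parts[1].trim(), parts[2]);
    }

    /** Number of patients in a group, given the total and the group's probability. */
    static long count(int total, double fraction) {
        return Math.round(total * fraction);
    }

    /** Substitutes the count for the %NNN placeholder. */
    String render(long count) {
        return template.replace("%NNN", Long.toString(count));
    }
}
```

With 1000 patients, count(1000, 0.25 * 0.98) is 245, and rendering the Group D parameter with that count gives "Group D:  245 cancer & positive" anchored at (400, 100) with a line to the BotRight area.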

Other exotic capabilities might include the ability to grab the prior probability and move it, watching the change this creates in the final probability (and also the size of the lower bar, incidentally); the ability to grab the conditional probabilities by the left side and right side of their respective trapezoids; and the ability to set the applet to cycle temporally between two sets of values for [prior, c1, c2] and [prior, c1, c2], which as a special case would allow you to show what happens as the prior goes from 10% to 20% or 0% to 100%, for example. I would suggest lingering on the first extreme for 2 seconds, going to the other extreme over 10 seconds, lingering there for 2 seconds, going back to the first extreme, etc. All other quantities including displayed group sizes, etc. should update accordingly. I would suggest using buffered drawing and actually checking the amount of elapsed time before a redraw, rather than a simple loop. A temporally varying graph where one of the properties is constant (for example, constant prior probability) might allow that constant to be edited by the user using the standard text input boxes.
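
The suggested timing (2 seconds at the first extreme, 10 seconds across, 2 seconds at the other extreme, 10 seconds back) can be expressed as a pure function of elapsed time, which makes it independent of how often the applet actually redraws. In this sketch the linear motion between extremes and all names are assumptions; the proposal specifies only the durations.

```java
/** Illustrative only: maps elapsed time to a position between two sets of values. */
final class CycleTimer {
    static final long LINGER_MS = 2_000;
    static final long TRAVEL_MS = 10_000;
    static final long PERIOD_MS = 2 * (LINGER_MS + TRAVEL_MS);

    /** Returns 0.0 at the first extreme and 1.0 at the second. */
    static double fraction(long elapsedMs) {
        long t = Math.floorMod(elapsedMs, PERIOD_MS);
        if (t < LINGER_MS) {
            return 0.0;                               // lingering at the first extreme
        }
        t -= LINGER_MS;
        if (t < TRAVEL_MS) {
            return (double) t / TRAVEL_MS;            // moving toward the second extreme
        }
        t -= TRAVEL_MS;
        if (t < LINGER_MS) {
            return 1.0;                               // lingering at the second extreme
        }
        t -= LINGER_MS;
        return 1.0 - (double) t / TRAVEL_MS;          // moving back
    }

    /** Linear interpolation, applied to each of prior, c1, and c2. */
    static double interpolate(double from, double to, double fraction) {
        return from + (to - from) * fraction;
    }
}
```

On each redraw the applet would pass in the current time minus the start time (for example from System.currentTimeMillis()), interpolate each of the three values, and repaint from an off-screen buffer.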


After some bad previous experiences, I'm not offering a mutex lock on this task to any one volunteer. If you want to work on it, feel free to leave a note saying you're doing so, but whoever delivers the applet first gets the credit and the glory.

-- Eliezer Yudkowsky



Christian Rovner is working on it.

Christian Rovner is almost done.