Commit 1086d26

author kirankarkera committed: initial checkin
0 parents  commit 1086d26


59 files changed: +53659 -0 lines changed

Lines changed: 302 additions & 0 deletions
@@ -0,0 +1,302 @@
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this section we shall look at the different kinds of reasoning used in a Bayes net. We shall use the libpgm library to create a Bayes net. libpgm uses a JSON-formatted file with a specific format in which the vertices, the edges, and the CPD at each vertex are annotated. This JSON file is read into NodeData and GraphSkeleton objects to create a DiscreteBayesianNetwork (which, as the name suggests, is a Bayes net whose CPDs take on discrete values). The TableCPDFactorization is an object that wraps the DiscreteBayesianNetwork and allows us to query the CPDs in the network. Please copy the JSON file for this example, 'job_interview.txt', to a local folder and change the relevant line in the getTableCPD() function to refer to your local path to job_interview.txt. \n",
"\n",
"The following discussion uses the integers 0, 1 and 2 for the discrete outcomes of each random variable, where 0 is the worst outcome. For example, Interview=0 indicates the worst outcome of the interview and Interview=2 the best. \n",
"\n",
"The first kind of reasoning we shall explore is called 'Causal Reasoning'. Initially we observe the prior probability of an event unconditioned by any evidence (for this example, we shall focus on the 'Offer' random variable). We then introduce observations of one of the parent variables. Consistent with our logical reasoning, we note that when one of the parents (equivalent to causes) of an event is observed, our beliefs about the event are strengthened accordingly.\n",
"\n",
"We start off by defining a function that reads the JSON data file and creates an object we can use to run probability queries on."
]
},
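{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, a libpgm network file lists the vertices under \"V\", the edges under \"E\", and the per-vertex CPDs under \"Vdata\". The snippet below is only a minimal sketch of the shape job_interview.txt is expected to have, not the actual file: the \"cprob\" entries shown are illustrative placeholders, except that the prior for Experience, [0.6, 0.4], is implied by the query results later in this section.\n",
"\n",
"    {\n",
"        \"V\": [\"Grades\", \"Experience\", \"Interview\", \"Offer\"],\n",
"        \"E\": [[\"Grades\", \"Interview\"], [\"Experience\", \"Interview\"], [\"Interview\", \"Offer\"]],\n",
"        \"Vdata\": {\n",
"            \"Experience\": {\n",
"                \"numoutcomes\": 2,\n",
"                \"vals\": [\"0\", \"1\"],\n",
"                \"parents\": null,\n",
"                \"children\": [\"Interview\"],\n",
"                \"cprob\": [0.6, 0.4]\n",
"            },\n",
"            \"Interview\": {\n",
"                \"numoutcomes\": 3,\n",
"                \"vals\": [\"0\", \"1\", \"2\"],\n",
"                \"parents\": [\"Grades\", \"Experience\"],\n",
"                \"children\": [\"Offer\"],\n",
"                \"cprob\": {\n",
"                    \"['0', '0']\": [0.8, 0.18, 0.02]\n",
"                }\n",
"            }\n",
"        }\n",
"    }\n",
"\n",
"(The real file also annotates the remaining vertices in the same way, with one \"cprob\" row per combination of parent values.)"
]
},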
{
"cell_type": "code",
"collapsed": false,
"input": [
"from libpgm.graphskeleton import GraphSkeleton\n",
"from libpgm.nodedata import NodeData\n",
"from libpgm.discretebayesiannetwork import DiscreteBayesianNetwork\n",
"from libpgm.tablecpdfactorization import TableCPDFactorization\n",
"\n",
"def getTableCPD():\n",
"    # load the vertices/edges and the CPDs from the same JSON file\n",
"    nd = NodeData()\n",
"    skel = GraphSkeleton()\n",
"    jsonpath = \"job_interview.txt\"\n",
"    nd.load(jsonpath)\n",
"    skel.load(jsonpath)\n",
"    # assemble the discrete Bayes net and wrap it for running queries\n",
"    bn = DiscreteBayesianNetwork(skel, nd)\n",
"    tablecpd = TableCPDFactorization(bn)\n",
"    return tablecpd\n"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
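{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before running queries, we can sanity-check what was loaded. This is a minimal sketch which assumes, as in libpgm's source, that TableCPDFactorization keeps the wrapped network on its bn attribute, whose V and E attributes hold the vertex and edge lists read from the file:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd = getTableCPD()\n",
"# vertex names and [parent, child] edges as annotated in job_interview.txt\n",
"tcpd.bn.V, tcpd.bn.E"
],
"language": "python",
"metadata": {},
"outputs": []
},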
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What is the prior probability of getting a job offer, $ P(Offer=1) $? Note that the probability query takes two dictionary arguments, the first being the query and the second being the evidence set, which is empty for now."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Offer='1'),{})"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"0.432816"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is about 43%. If we now introduce evidence that the candidate has poor grades, how does that change the probability of getting an offer? We evaluate $ P(Offer=1 \\mid Grades=0) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Offer='1'),dict(Grades='0'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 11,
"text": [
"0.35148"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, it decreases the probability of getting an offer. Adding further evidence that the candidate's experience is low as well, we evaluate $ P(Offer=1 \\mid Grades=0, Experience=0) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Offer='1'),dict(Grades='0',Experience='0'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 12,
"text": [
"0.2078"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As expected, it drops even lower, to about 21%.\n",
"\n",
"What we have seen is that introducing observations of parent random variables (the causes) strengthens our beliefs about their child, which leads to the name 'Causal Reasoning'. \n",
"\n",
"## Evidential Reasoning\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The second kind of reasoning is used when we observe the value of a child variable and wish to reason about how it strengthens our beliefs about its parents. We first evaluate the prior probability of high experience, $ P(Experience=1) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Experience='1'),{})"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0.4\n"
]
}
],
"prompt_number": 14
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now introduce evidence that the candidate's interview went well and evaluate $ P(Experience=1 \\mid Interview=2) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Experience='1'),dict(Interview='2'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 15,
"text": [
"0.8641975308641975"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that the probability that the candidate is highly experienced increases, which follows from the reasoning that a candidate who interviews well is likely to have good experience or good grades, or both. In evidential reasoning, we reason from effect to cause."
]
},
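{
"cell_type": "markdown",
"metadata": {},
"source": [
"This jump from 0.4 to 0.86 is Bayes' theorem at work, reading the network from effect back to cause. As a sketch (where the likelihood term marginalizes over the other parent, Grades, which is independent of Experience when Interview is not observed):\n",
"\n",
"$$ P(Experience=1 \\mid Interview=2) = \\frac{P(Interview=2 \\mid Experience=1)\\,P(Experience=1)}{P(Interview=2)} $$\n",
"\n",
"with $ P(Interview=2 \\mid Experience=1) = \\sum_{g} P(Interview=2 \\mid Grades=g, Experience=1)\\,P(Grades=g) $."
]
},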
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Intercausal Reasoning\n",
"\n",
"Intercausal reasoning, as the name suggests, is a type of reasoning where multiple causes of a single effect interact. We first determine the prior probability of having highly relevant experience by evaluating $ P(Experience=1) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Experience='1'),{})"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
"0.4000000000000001"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On introducing evidence that the interview went extremely well, we expect the candidate to be quite experienced. We evaluate $ P(Experience=1 \\mid Interview=2) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Experience='1'),dict(Interview='2'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
"0.8641975308641975"
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Bayes net confirms what we think is true, and the probability of high experience goes up from 0.4 to 0.86. Now, if we introduce evidence that the candidate didn't have good grades and still managed to get a good score in the interview, we may conclude that the candidate must be so experienced that his grades didn't matter at all. We evaluate $ P(Experience=1 \\mid Interview=2, Grades=0) $."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tcpd=getTableCPD()\n",
"tcpd.specificquery(dict(Experience='1'),dict(Interview='2',Grades='0'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
"0.9090909090909091"
]
}
],
"prompt_number": 18
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This confirms our thinking: even though the probability of high experience goes up only a little (from 0.86 to 0.91), it strengthens our belief in the candidate's high experience. This example shows the interplay between the two parents of Interview, namely Experience and Grades, and shows that if we know one of the causes behind an effect, it reduces the importance of the other cause. In other words, we have explained away the poor grades by observing the candidate's experience. This phenomenon is commonly called 'Explaining Away'."
]
},
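{
"cell_type": "markdown",
"metadata": {},
"source": [
"To recap, the cell below re-runs the three Experience queries from this section side by side. It is a minimal sketch that assumes, as the pyout cells above suggest, that specificquery returns the computed probability; a fresh factorization is created for each query, following the pattern used throughout this section."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"evidence_sets = [{}, dict(Interview='2'), dict(Interview='2', Grades='0')]\n",
"results = []\n",
"for ev in evidence_sets:\n",
"    tcpd = getTableCPD()  # fresh factorization per query\n",
"    results.append(tcpd.specificquery(dict(Experience='1'), ev))\n",
"# expected to reproduce roughly 0.4, 0.86 and 0.91\n",
"results"
],
"language": "python",
"metadata": {},
"outputs": []
}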
],
"metadata": {}
}
]
}
