Full explanations of Bayes Probability and Naive Bayes Probability are in the Classification by Bayes Probability Explained Page. This panel presents help and advice on how to use the program in the previous panel on this page.
Definitions and Conventions
Attributes are the predictors. In this program (Naive Bayes), attributes are binomial, and each is a single character, either positive (+) or negative (-). Any other character will be interpreted as -. Multiple attributes are presented as a line of +s and -s with no gaps in between, called a pattern. For example, ++--+ means positive for attributes 1 and 2, negative for attributes 3 and 4, and positive for attribute 5. Please note that all rows in the same data set must have the same number of attributes in the same order. Greater detail on the representation of patterns for this program is available in the Discussion panel of the Classification by Bayes Probability Explained Page.
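As a minimal sketch of the pattern convention described above, the following Javascript reads a pattern string into an array of true/false values. The function names parsePattern and sameLength are hypothetical and are not part of the program on this page.

```javascript
// Convert a pattern string such as "++--+" into an array of booleans.
// Following the convention above, any character other than "+" counts as negative.
function parsePattern(pattern) {
  return Array.from(pattern, (ch) => ch === "+");
}

// Check that every pattern in a data set has the same number of attributes.
function sameLength(patterns) {
  return patterns.every((p) => p.length === patterns[0].length);
}

console.log(parsePattern("++--+")); // [ true, true, false, false, true ]
```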
Outcome Group Names identify the outcome group a case is from. Names are single words, combinations of letters and numbers with no gaps, and are treated by the program as text. There can be two or more outcomes in a Bayesian analysis.
Input Data are placed in the first text area labeled "Data" in Program 1, and can be in one of two formats
- For Program 1 in the previous panel, for building and testing a model, both the attributes and the outcome names are required. These are in two columns separated by spaces or tabs. The first column contains the attributes in rows of +s and -s. The second column contains the outcome names. Each row consists of the combination of attributes and outcome in a case.
- For all other programs in the previous panel, data is optional, and when present is interpreted by the model. Only the first column of attributes is read and interpreted.
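The two input formats above can be read with the same routine. The sketch below is an illustration only: the function name parseData is hypothetical, and the two data rows are made up for the example rather than taken from the page's default data.

```javascript
// Parse the "Data" text: one case per line, columns separated by spaces or tabs.
// Column 1 is the attribute pattern; column 2 (optional) is the outcome name.
function parseData(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [pattern, outcome] = line.split(/\s+/);
      // outcome is null when only the attribute column is present
      return { pattern, outcome: outcome ?? null };
    });
}

const cases = parseData("+-+--\tItalian\n-+--+ German\n");
console.log(cases.length); // 2
```

When only the attribute column is present, as allowed for Programs 2 and 3, the outcome field is simply left as null.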
Sample Size is an array of numbers representing the number of cases in each outcome in the reference data. This is used for calculating the probability of an attribute in each outcome (P(attribute|outcome) or P(a|o)).
a priori Probability is a belief in the probability of belonging to an outcome prior to executing the Bayesian model. In data entry, any set of numbers representing relative proportions (sample sizes, ratios, percentages) can be used, and the program will normalize them by dividing each by the total. The default setting is to have the same a priori probability for all outcomes. Users should change this when calculating the Bayesian probability P(o|p,π).
Cost Coefficients are an array of numbers representing the relative loss to the model if an outcome is misidentified, and so represent the importance of each outcome. Any set of numbers representing relative values can be used (dollars, quality of life, death rate), as the program normalizes the numbers to proportions of the total. The default setting is for all outcomes to have the same cost, and in most cases no change to this is required. Changes are only required if the user wishes to insert a calibrated bias into the model.
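The normalization applied to both the a priori probabilities and the cost coefficients can be sketched as follows. The function name normalize is hypothetical, and the error check for a non-positive total is an assumption added for safety, not a documented behavior of the program.

```javascript
// Convert any set of relative values (sample sizes, ratios, percents, costs)
// into proportions of their total.
function normalize(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  if (total <= 0) throw new Error("values must sum to a positive number");
  return values.map((v) => v / total);
}

console.log(normalize([10, 10, 20])); // [ 0.25, 0.25, 0.5 ]
```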
Probabilities
The program produces 3 types of probabilities during model development
- P(+|o) is the probability of an attribute being positive (+) in an outcome (o). This is calculated by dividing the number of +s by the sample size of the outcome group, and represents the coefficients for calculating Bayesian Probability.
- P(pattern|outcome) or P(p|o) is the product of P(+|o) from all attributes for each outcome in a case. This is presented for error checking only, and is not used for any other interpretation.
- P(outcome|pattern) or P(o|p) is the probability of having an outcome when the attribute pattern is observed, without any other considerations. This is called Maximum Likelihood, representing how the coefficients function.
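The last two probabilities in the list above can be sketched as below. This is an illustration under stated assumptions, not the program's actual code: the function names are hypothetical, the table is laid out with one row per attribute and one column per outcome, and a negative attribute is assumed to contribute 1 - P(+|o) to the product, which is the usual convention for two-valued attributes but is not spelled out on this page.

```javascript
// table[a][o] = P(+|o) for attribute a in outcome o (assumed layout).
// Returns P(p|o) for every outcome.
function patternLikelihoods(pattern, table) {
  const nOutcomes = table[0].length;
  const result = new Array(nOutcomes).fill(1);
  for (let a = 0; a < table.length; a++) {
    for (let o = 0; o < nOutcomes; o++) {
      const p = table[a][o];
      // Assumption: "+" contributes P(+|o), anything else contributes 1 - P(+|o)
      result[o] *= pattern[a] === "+" ? p : 1 - p;
    }
  }
  return result;
}

// Convert P(p|o) to P(o|p), treating all outcomes as equally likely beforehand.
function posteriorEqualPriors(likelihoods) {
  const total = likelihoods.reduce((sum, v) => sum + v, 0);
  return likelihoods.map((v) => (total > 0 ? v / total : 0));
}

// Made-up table: 2 attributes (rows) by 2 outcomes (columns)
const likelihoods = patternLikelihoods("+-", [[0.8, 0.2], [0.5, 0.5]]);
console.log(posteriorEqualPriors(likelihoods));
```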
The model produces P(outcome|pattern,π,c) or P(o|p,π,c), the final Bayes Probability, taking differences in a priori probabilities and cost coefficients into consideration. Using the default settings, the results are the same as P(o|p), as the sample sizes of all outcomes are the same, and the cost coefficients are the same by default.
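A minimal sketch of this final adjustment, assuming it is a multiplication by the two arrays followed by rescaling to a total of 1, is shown below. The function name applyPriorsAndCosts is hypothetical.

```javascript
// posterior[o] = P(o|p); priors and costs are arrays of relative values.
// Returns P(o|p,π,c), rescaled so that the results sum to 1.
function applyPriorsAndCosts(posterior, priors, costs) {
  const weighted = posterior.map((p, o) => p * priors[o] * costs[o]);
  const total = weighted.reduce((sum, v) => sum + v, 0);
  return weighted.map((v) => (total > 0 ? v / total : 0));
}

// With equal priors and equal costs the result equals P(o|p)
console.log(applyPriorsAndCosts([0.8, 0.2], [1, 1], [1, 1]));
```

Because the results are rescaled at the end, the priors and costs give the same answer whether or not they were normalized to proportions beforehand.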
Default Example Used on This Page
The default example on this page consists of a model to identify the ethnicity of individuals using hair and eye color. The data are computer generated to demonstrate the process, and do not represent collected facts.
- There are 3 outcomes, Italian, French, and German. There are 10 individuals in each outcome
- There are two predicting variables, creating 5 input attributes
- The first variable is hair color, represented by the first two attributes, +- for dark color hair, -+ for light color hair
- The second variable is eye color, represented by the last 3 attributes, +-- for brown eyes, --+ for blue eyes, and -+- for eyes of any other color
- The attributes from the two predictors are then concatenated to form a single text string, +-+-- for dark hair and brown eyes, -+--+ for light hair and blue eyes, and so on
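The coding of the two variables into a five character pattern can be sketched as follows, using the hair and eye codes listed above. The names HAIR, EYES and encode are hypothetical and serve only to illustrate the conversion.

```javascript
// Codes for each variable, as listed in the example description
const HAIR = new Map([["dark", "+-"], ["light", "-+"]]);
const EYES = new Map([["brown", "+--"], ["blue", "--+"], ["other", "-+-"]]);

// Concatenate the hair code (2 attributes) and the eye code (3 attributes)
function encode(hair, eyes) {
  const h = HAIR.get(hair);
  const e = EYES.get(eyes);
  if (h === undefined || e === undefined) {
    throw new Error("unknown hair or eye color");
  }
  return h + e;
}

console.log(encode("dark", "brown")); // +-+--
```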
Programs
The programs on this page consist of a single program with 3 points of entry (Programs 1-3), and a supplementary program (Program 4) to create a Javascript interpreter. The Example buttons trigger the loading of the default example data for each entry, and run the program using that data. Users can enter their own data and run the program with the program buttons.
Program 1 models and interprets the same data
- The data is in two columns separated by tabs or spaces. The first column contains the attributes, the second column the outcome names
- The program first counts the number in each outcome, and the number of positives for each outcome, forming the table of counts. It deposits the two counts as entry data for Program 2
- The program then calculates the probability of being positive for each attribute column in each outcome (P(a|o)), and deposits this as entry data for Program 3
- The program also deposits the sample sizes for the 3 outcomes as a priori probabilities for Program 3. By default, the program produces the same cost coefficient (1) for every outcome, and deposits these as costs for Program 3
- The program then converts the a priori and cost coefficients to proportion of the total, by dividing each value by the total of all outcomes
- Using the array of attributes, the program calculates the probability of the attribute pattern for each case (P(p|o)). This is presented for error checking only, and is not interpreted.
- The program then converts P(p|o) to the probability of each outcome from the pattern (P(o|p)). This is the first a posteriori Probability, assuming the a priori probabilities and cost coefficients of all outcomes to be the same. These results provide an estimate of how well the model separates the outcomes using the reference data that created the model.
- The program then modifies the initial Bayesian Probability using the a priori and cost arrays to produce the final Bayesian Probability (P(o|p,π,c)), by multiplying each probability by the two coefficients, then recalibrating the results as fractions of their total.
- All results are presented to the maximum precision of 4 decimal places in this program
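The counting step at the start of Program 1 can be sketched as below. This is an illustration only: the function name countTable and the shape of its input and output are assumptions, and the three cases are made up for the example.

```javascript
// cases: array of { pattern, outcome } objects, one per row of data.
// Returns the outcome names, the sample size of each outcome, and the
// number of positives for each attribute (rows) in each outcome (columns).
function countTable(cases) {
  const outcomes = [...new Set(cases.map((c) => c.outcome))];
  const nAttributes = cases[0].pattern.length;
  const sampleSize = outcomes.map(() => 0);
  const positives = Array.from({ length: nAttributes }, () => outcomes.map(() => 0));
  for (const { pattern, outcome } of cases) {
    const o = outcomes.indexOf(outcome);
    sampleSize[o] += 1;
    for (let a = 0; a < nAttributes; a++) {
      if (pattern[a] === "+") positives[a][o] += 1;
    }
  }
  return { outcomes, sampleSize, positives };
}

const table = countTable([
  { pattern: "+-", outcome: "A" },
  { pattern: "++", outcome: "A" },
  { pattern: "-+", outcome: "B" },
]);
console.log(table.sampleSize); // [ 2, 1 ]
```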
Program 2 begins with the table of counts, and is used to avoid the entry of a large volume of data when the counting has already been done.
- The number of columns in the table of counts is the number of outcomes
- The first row contains the outcome names
- All subsequent rows are the counts in each attribute for the outcomes
- The sample sizes for the outcomes are entered separately as an array separated by spaces
- The program reads in the data, and executes steps 3-5 of program 1
- If the data text area is empty, the program terminates
- If there is data, the program reads the first column as attributes, and executes steps 6-9 of Program 1
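A sketch of how a table of counts in the format above might be read and converted to P(a|o) follows. The function names parseCountTable and countsToProbabilities are hypothetical, and the counts shown are made up for the illustration.

```javascript
// Read a table of counts: first row holds the outcome names,
// each later row holds the positive counts of one attribute for every outcome.
function parseCountTable(text) {
  const rows = text
    .trim()
    .split(/\r?\n/)
    .map((line) => line.trim().split(/\s+/));
  const outcomes = rows[0];
  const counts = rows.slice(1).map((row) => row.map(Number));
  return { outcomes, counts };
}

// Divide each count by the sample size of its outcome to obtain P(a|o).
function countsToProbabilities(counts, sampleSize) {
  return counts.map((row) => row.map((n, o) => n / sampleSize[o]));
}

const parsed = parseCountTable("Italian French German\n8 5 2\n2 5 8");
const sampleSizes = "10 10 10".trim().split(/\s+/).map(Number);
console.log(countsToProbabilities(parsed.counts, sampleSizes));
```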
Program 3 begins with the table of probabilities of each attribute for each group (P(a|o)), the two arrays of a priori probability and the cost coefficients, and the attributes. It can be used to interpret any set of attributes when the table of P(a|o) is already available, and allows the a priori probabilities and cost coefficients to be adjusted.
- The table of probabilities has the same format as the table of counts, except that the numbers are probabilities
- The arrays of a priori and costs can be numbers in any format, as the program will normalize each value as a fraction of the total before they are used.
- The attributes must be in the first or only column in the data text area.
- The program reads in the data, and executes steps 6-9 of program 1
Program 4 is supplementary, and can be executed when the table of probabilities of each attribute for each outcome (P(a|o)), and the two arrays of a priori probabilities and cost coefficients, are available and adjusted to the user's requirements
- It produces a Javascript subroutine which accepts an attribute text string, and returns an array of probabilities
- It is intended for the user to edit, change, or append to this program, and incorporate it into his/her own program to interpret future data.
- The user will need to convert variables to the attribute string appropriate for this program, and handle the array of numbers returned.
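The code produced by Program 4 is not reproduced here. As a rough sketch of what such a subroutine and its use might look like, the example below builds an interpreter from a table of P(a|o), the a priori array and the cost array, then picks the most probable outcome from the returned array. All names, the table layout (rows are attributes, columns are outcomes), the handling of negative attributes as 1 - P(+|o), and the numbers are assumptions made for the illustration.

```javascript
// Build a function that accepts an attribute string and returns
// an array of probabilities, one per outcome.
function makeInterpreter(table, priors, costs) {
  return function interpret(pattern) {
    const weighted = priors.map((prior, o) => {
      let likelihood = 1;
      for (let a = 0; a < table.length; a++) {
        const p = table[a][o];
        // Assumption: "+" contributes P(+|o), anything else 1 - P(+|o)
        likelihood *= pattern[a] === "+" ? p : 1 - p;
      }
      return likelihood * prior * costs[o];
    });
    const total = weighted.reduce((sum, v) => sum + v, 0);
    return weighted.map((v) => (total > 0 ? v / total : 0));
  };
}

// Handling the returned array: report the outcome with the highest probability
const outcomes = ["A", "B"];
const interpret = makeInterpreter([[0.8, 0.2], [0.5, 0.5]], [1, 1], [1, 1]);
const probs = interpret("+-");
const best = outcomes[probs.indexOf(Math.max(...probs))];
console.log(best); // A
```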