Content Disclaimer Copyright @2020. All Rights Reserved. |
MacroPlot Resources
Orientation
MacroPlot plotting is controlled by the macros entered in the text area provided.
Macros
Each macro must occupy its own line. If the first character of a line is not A-Z, the line is considered a comment and ignored. The first macro, which is obligatory, initializes the plot. The macro is Bitmap Initialize w h r g b t
Example: Bitmap Initialize 700 500 255 255 255 255 provides a landscape area 700 pixels wide and 500 pixels high, with a white background. The following are the default settings when the bitmap is initialized.
A central plotting area is also defined
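The line rule described above (a line is a macro only if its first character is A-Z, otherwise it is a comment) can be sketched in Javascript as follows. This is a minimal illustration and not the program's actual parser; the function name parseMacros and the shape of the returned objects are assumptions made for the example.

```javascript
// Hypothetical helper (not part of MacroPlot): separate macro lines from
// comment lines using the "first character must be A-Z" rule.
function parseMacros(text) {
  const macros = [];
  for (const line of text.split(/\r?\n/)) {
    // Any line whose first character is not A-Z is a comment and is skipped.
    if (!/^[A-Z]/.test(line)) continue;
    const tokens = line.trim().split(/\s+/);
    macros.push({ keyword: tokens[0], args: tokens.slice(1) });
  }
  return macros;
}

const macros = parseMacros(
  [
    "# comment: first character is not A-Z",
    "Bitmap Initialize 700 500 255 255 255 255",
    "line thick 5 (lower-case first letter, so this line is ignored too)",
    "Line Thick 3",
  ].join("\n")
);

// The first macro is obligatory and must initialize the bitmap.
const initialized =
  macros.length > 0 &&
  macros[0].keyword === "Bitmap" &&
  macros[0].args[0] === "Initialize";

console.log(macros.length, initialized);
```

Read literally, the rule also rejects lines that start with a space or a lower-case letter; whether the real program is that strict is not stated on this page.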
This panel lists and describes all macros used in this version of MacroPlot by Javascript. They are divided into the following sub-panels
Color Palettes
This sub-panel lists those macros that initialize the bitmap and set the parameters for drawing
Axis & Coordinates
Initialize Plotting
Bitmap Initialize w h r g b t is the first and obligatory macro, which initializes the bitmap
Settings for lines
The settings provide parameters for all subsequent plotting until the parameter is reset.
Line Color r g b t sets the line color as red, green, blue and transparency values; each value ranges from 0 for non-existence to 255 for maximum intensity. On initialization of the bitmap, the line color is set by default to black (0 0 0 255).
Line Thick p sets the thickness of lines to p pixels. On initialization, the default line thickness is 3 pixels.
Settings for fills
When bars, dots, arcs and wedges are plotted, the interiors of these symbols are called fills, and they are set as follows.
Fill Color r g b t sets the fill color as red, green, blue and transparency values; each value ranges from 0 for non-existence to 255 for maximum intensity. On initialization of the bitmap, the fill color is set by default to black (0 0 0 255).
Fill Type t sets how the fills are to be used; t can be one of the following
Settings for fonts
These set the font characteristics for text output. Please note: the line and fill settings for fonts are separate from, and independent of, those for general line and shape plotting.
Font Face name sets the font face. The program will accept all fonts supported by the user's browser. The 3 fonts accepted by all browsers are serif, sans-serif, and monospace. On initialization, sans-serif is set by default.
Font Style s where s can be either normal or bold. On initialization, the default setting is bold.
Font Size h where h is the height of the text in pixels. On initialization, the default font size is set to 16.
Font Thick p where p is the thickness of the outline of the font. On initialization, this is set to p=1.
Font LColor r g b t sets the color of the outline of the font. On initialization, this is set to black (0 0 0 255).
Font FColor r g b t sets the fill color of the font. On initialization, this is set to black (0 0 0 255).
Font Color r g b t sets both LColor and FColor to the same color. On initialization, this is set to black (0 0 0 255).
Font Type t where t determines which part of the font is drawn, and can be one of the following
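The r g b t values used by the color macros above can be illustrated with a small Javascript sketch. It assumes, purely for illustration, that the drawing surface accepts CSS rgba() strings (as an HTML canvas does); the page does not say how the program applies colors internally, and the name toCssColor is hypothetical. Since the default black is written as 0 0 0 255, the sketch treats t = 255 as fully opaque.

```javascript
// Hypothetical helper: convert MacroPlot-style "r g b t" values (each 0-255)
// into a CSS rgba() string. Assumes t = 255 means fully opaque.
function toCssColor(r, g, b, t) {
  // Round and clamp each value into the documented 0-255 range.
  const clamp = (v) => Math.min(255, Math.max(0, Math.round(v)));
  const alpha = clamp(t) / 255; // CSS alpha runs from 0 to 1
  return `rgba(${clamp(r)}, ${clamp(g)}, ${clamp(b)}, ${alpha})`;
}

console.log(toCssColor(0, 0, 0, 255)); // default line, fill and font color
console.log(toCssColor(255, 255, 255, 255)); // white background of the example bitmap
```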
This sub-panel presents macros that define the plotting area and create the x and y axes for plotting
Drawings
Drawing on the bitmap
When plotting on the initialized bitmap
Drawing on the plotting area
In most cases, there is a need to draw and label the x and y axes, and the drawing coordinates used are the actual values of the data. The macros used for this all begin with the keyword Plot, and their purposes are as follows.
Plot Pixels lp tp rp bp defines an area for plotting
Plot Logy 1 sets the vertical y axis to the log scale. Normal scale is set on initialization, or reset by Plot Logy 0.
Plot XLabel label distance places the label for the horizontal x axis, below the bottom of the plotting area
Plot YLabel label distance places the label for the vertical y axis, on the left of plotting area
The quickest and easiest way to draw axes
The following 4 macros are sufficient to draw the x and y axes under most circumstances.
Plot XAxis y nsIntv nbIntv len gap line will mark out and number the horizontal x axis
Other methods of drawing axes
Users may wish to draw individual parts of the axes, and the following macros can be used.
Plot XLine y draws the horizontal x axis line at the y value y.
Plot YLine x draws the vertical y axis line at the x value x.
Plot XMark y begin interval len marks the horizontal x axis with a series of vertical marks
Plot YMark x start interval len marks the vertical y axis with a series of horizontal marks
Plot XScale y start interval gap writes the numerical scales for the horizontal x axis
Plot YScale x start interval gap writes the numerical scales for the vertical y axis
This sub-panel describes those macros that draw the plotting objects. Drawing is performed in two environments
Drawing lines
The thickness and color of any line drawn are set by the Line macros (see settings sub-panel). The default setting is a black line 3 pixels in width.
Bitmap Line x1 y1 x2 y2 draws the line from x1y1 to x2y2
Plot Line x1 y1 x2 y2 draws the line from x1y1 to x2y2
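The difference between the Bitmap and Plot versions of a macro is the coordinate system: pixels on the bitmap versus data values in the plotting area. The sketch below shows one way such a conversion could work. It assumes that Plot Pixels lp tp rp bp and Plot Values lv tv rv bv give the left, top, right and bottom edges of the same rectangle in pixels and in data values respectively, and that Plot Logy 1 simply takes logarithms of y before interpolating. These readings, and the name makeMapper, are assumptions; the page does not describe the program's internal calculation.

```javascript
// Hypothetical sketch: map data values to bitmap pixels by linear interpolation.
// pix = { lp, tp, rp, bp } pixel edges; val = { lv, tv, rv, bv } data edges.
function makeMapper(pix, val, logY = false) {
  // With a log y axis, interpolate on log10(y) instead of y.
  const ty = (v) => (logY ? Math.log10(v) : v);
  return {
    x: (x) => pix.lp + ((x - val.lv) / (val.rv - val.lv)) * (pix.rp - pix.lp),
    y: (y) =>
      pix.tp +
      ((ty(y) - ty(val.tv)) / (ty(val.bv) - ty(val.tv))) * (pix.bp - pix.tp),
  };
}

// Plotting area 100..600 pixels across and 50..450 pixels down,
// holding x values 0..5 and y values 0 (bottom) to 10 (top).
const pix = { lp: 100, tp: 50, rp: 600, bp: 450 };
const map = makeMapper(pix, { lv: 0, tv: 10, rv: 5, bv: 0 });
console.log(map.x(2.5), map.y(5)); // centre of the plotting area

// The same area with a log scale from 1 (bottom) to 1000 (top).
const logMap = makeMapper(pix, { lv: 0, tv: 1000, rv: 5, bv: 1 }, true);
console.log(logMap.y(10));
```

Because the top pixel row has the smaller pixel number, larger y values map to smaller pixel values without any special handling.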
Drawing bars
The color and thickness of the outline are defined in the Line macros. The color of the fill is defined in the Fill Color and Fill Type macros. The default setting is black (0 0 0 255) for both line and fill color, and the Fill Type is set to 1, only the fill and no outlines. These settings are suitable for most circumstances, but users can change them if so required.
Bitmap Bar x1 y1 x2 y2 draws a bar, the corners of which are x1y1 and x2y2. X and y are the number of pixels from the left and top borders of the bitmap.
Plot Bar x1 y1 x2 y2 draws a bar, the corners of which are x1y1 and x2y2. X and y are data values as defined in Plot Values lv tv rv bv.
Bar Wide w sets the width / height of bars for Plot VBar and Plot HBar
Plot VBar x y1 y2 hshift draws a vertical bar
Plot HBar x1 x2 y vshift draws a horizontal bar
Drawing dots
There are only 2 dot types, circle and square. If more than 2 types of dots are required, they can be distinguished by the colours of the outline and fill, and by their sizes. Settings for dot parameters are in the settings sub-panel.
Bitmap Circle x y radius and Bitmap Square x y radius draw a circle or a square dot
Plot Circle x y radius hshift vshift and Plot Square x y radius hshift vshift draws a circle or a square dot
Dot Type t where t is either circle or square. The default setting is circle.
Plot Dot x y hshift vshift draws the dot, with its parameters (shape, size, color, outline, fill) already pre-set
Drawing text
The color, outline, fill, font, and weight of text are preset (see settings). The default settings are sans-serif, black fill only, and 16 pixels high.
Bitmap HText x y ha va txt draws text horizontally on the bitmap
Plot HText x y ha va txt hshift vshift draws text horizontally on the bitmap
Bitmap VText x y ha va txt draws text vertically (90 degrees anticlockwise from horizontal) on the bitmap
Plot VText x y ha va txt hshift vshift draws text vertically (90 degrees anticlockwise from horizontal) on the bitmap
Other miscellaneous drawings
Bitmap Arc x y radius startDeg endDeg rotate draws an arc.
Plain Colors
Table of colors used on this web site
Explanations
Introduction
Javascript Program
This page presents a number of algorithms for numerical transformation, in Javascript programs and in R codes.
Linear Transformation
Transformations are often carried out in data analysis, either to rescale the values, or to change the data to normal distribution so that the powerful tools of parametric statistics can be used for analysis. The following transformations are available on this page
Javascript programs
Data entry uses a single column of numbers to be transformed.
Results are presented in 3 forms
R codes
All algorithms are also presented as R code. In most cases, native R code is used, as the primary purpose of the R code is to check for errors in the Javascript code. Users are reminded that powerful packages that automatically perform transformations are available in R resource centers, but they are not presented on this page, as most transformation algorithms are quite simple.
Users are also reminded that some functions are used repeatedly. These are placed ahead of the code that calls them. Should users wish to extract the code for their own use, care must be exercised to include these supporting functions.
Linear transformation is essentially a rescaling of the measurements. An example is to harmonize exam results from multiple groups of students scored by multiple examiners. The marks from each group are transformed to the same minimum/maximum or mean/Standard Deviation, so they can be compared.
Curvilinear Transformation
Three algorithms for linear transformation are presented
Transform to new minimum and maximum
Transform to new mean and Standard Deviation
Transform to ranks. The values in the input array are ranked from the lowest to the highest values. The ranks of repeated values are averaged. Please note: Javascript gives the lowest value the rank of 0, while R gives it the rank of 1. If R is used for ranking and the lowest rank is to be 0, then 1 must be subtracted from all the ranks
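The three linear transformations listed above can be sketched in Javascript as follows. The formulas follow the R code presented on this page, with the sample Standard Deviation using n - 1; the function names are illustrative assumptions and this is not the page's own Javascript program.

```javascript
// Rescale so the smallest value becomes newMin and the largest becomes newMax.
function toMinMax(data, newMin, newMax) {
  const oldMin = Math.min(...data);
  const oldMax = Math.max(...data);
  return data.map(
    (v) => ((v - oldMin) / (oldMax - oldMin)) * (newMax - newMin) + newMin
  );
}

// Rescale so the results have the nominated mean and sample SD (n - 1 divisor).
function toMeanSD(data, newMean, newSD) {
  const n = data.length;
  const oldMean = data.reduce((s, v) => s + v, 0) / n;
  const oldSD = Math.sqrt(
    data.reduce((s, v) => s + (v - oldMean) ** 2, 0) / (n - 1)
  );
  return data.map((v) => ((v - oldMean) / oldSD) * newSD + newMean);
}

// Rank from 0 (lowest value); tied values share the average of their ranks.
function toRanks(data) {
  const order = data.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks = new Array(data.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const averageRank = (start + end) / 2;
    for (let k = start; k <= end; k++) ranks[order[k].i] = averageRank;
    start = end + 1;
  }
  return ranks;
}

console.log(toMinMax([2, 4, 6], 0, 1));
console.log(toMeanSD([2, 4, 6], 100, 10));
console.log(toRanks([3, 1, 2, 2])); // the tied 2s share the average of ranks 1 and 2
```

The first two functions divide by the old range or the old SD, so they are undefined when all input values are identical.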
Curvilinear transformations are used to transform data whose distributions are difficult to analyse into another distribution, usually the normal distribution, which is easy to analyse statistically.
Poisson Transformation
Logarithm and anti-logarithm transformations
Most primary measurements (length, weight, time) are ratios to a standard, and as such they are non-zero values with a positive skew (short tail on the lower side and long tail on the high side). These measurements only approach normal distribution when the data set contains high values with low variance, so that the confidence intervals are far from the zero (0) value. Logarithmic transformation reduces this skewness towards normality, thus allowing the use of the powerful statistical tools associated with normal distributions to analyse the data. Values are first transformed logarithmically, analysed, and the results reverse transformed (anti-logarithmic transformation) back to the original measurements for presentation.
The most common method is to use the logarithm with base e (y = log_e(x)), and the e based antilog is the exponential function (x = antilog_e(y) = exp(y)). Biologists and clinicians often prefer logarithms based on 10, and engineers prefer logarithms based on 2. The log and antilog with a nominated base are calculated as
y = log_base(x) = log_e(x) / log_e(base)
x = antilog_base(y) = exp(y * log_e(base))
Arcsine and reverse arcsine transformations
Proportions, or probabilities, have a finite range, from 0 to 1. The distribution in the center, around p=0.5, is close to normal. When the measurements are near 0 or 1, however, the distribution becomes increasingly skewed, with the side towards 0.5 having a longer tail than the side towards the extremes of 0 and 1. Arcsine transformation stretches the intervals near the extremes, providing a wider range of values that are normally distributed. Please note: arcsine transformation transforms probability values between 0 and 1 to values between 0 and π/2 (1.5708)
y = arcsine(x) = asin(sqrt(x))
x = reverse arcsine(y) = sin(y)^2
Logit and Logistic transformation
Logit transformation converts a proportion or probability measurement (p) with a range from 0 to 1 to a linear measurement from -∞ to +∞ in a sigmoidal manner. This transformation therefore removes the bias imposed on the distribution of probability because of its finite range (0 to 1), allowing the transformed values to have an infinite range, with uniform distribution throughout. It is therefore an important method of converting probability values into a normally distributed value.
Logistic transformation is the reverse of logit transformation, so that, after statistical analysis of the logit transformed values, the results can be converted back to probabilities or proportions for presentation
x = logit(p) = log_e(p / (1 - p))
p = logistic(x) = 1 / (1 + exp(-x))
Bimodal transformation
This is an adaptation of the logistic transformation that is useful for handling fuzzy logic.
In neural networks, measurements are sometimes framed in fuzzy logic, a value between 0 and 1 representing the degree of confidence that something is true. We will use the default example in the Javascript program to demonstrate the process.
In the care of women in labour, the pH of the fetal blood is sometimes used to decide whether the baby is abnormal (unwell) or normal. Commonly, a pH<7.25 is considered abnormal and requires some form of intervention, and a pH>7.38 is considered reassuringly normal. The process of bimodal transformation is to convert the normally distributed pH measurements into the bimodal fuzzy measurement of the level of confidence (0-1) that it is abnormal or normal for decision making. A number of methods to do this are available
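The forward and reverse pairs in this section can be sketched in Javascript as follows. The expressions mirror the R code presented on this page (for example asin(sqrt(x)) for arcsine, and the 5% / 95% anchoring of the bimodal transformation); the function names are illustrative assumptions, and the bimodal function shows only the one method used in that R code.

```javascript
// Log and antilog with a nominated base.
const logBase = (x, base) => Math.log(x) / Math.log(base);
const antilogBase = (y, base) => Math.exp(y * Math.log(base));

// Arcsine and reverse arcsine for proportions between 0 and 1.
const arcsine = (p) => Math.asin(Math.sqrt(p));
const reverseArcsine = (y) => Math.sin(y) ** 2;

// Logit and its reverse, the logistic transformation.
const logit = (p) => Math.log(p / (1 - p));
const logistic = (x) => 1 / (1 + Math.exp(-x));

// Bimodal (modified logistic): lowMode maps to 0.05 and highMode to 0.95.
function bimodal(value, lowMode, highMode) {
  const midValue = (lowMode + highMode) / 2;
  const d = (highMode - midValue) / Math.log(0.95 / (1 - 0.95));
  return 1 / (1 + Math.exp(-(value - midValue) / d));
}

console.log(logBase(1000, 10), antilogBase(3, 10));
console.log(arcsine(0.5), reverseArcsine(arcsine(0.5)));
console.log(logit(0.9), logistic(logit(0.9)));
// Fetal blood pH example: 7.25 and 7.38 are the nominated low and high modes.
console.log(bimodal(7.25, 7.25, 7.38), bimodal(7.38, 7.25, 7.38));
```

With the pH example, a value of 7.25 maps to a confidence of about 0.05 and 7.38 to about 0.95, with the midpoint 7.315 mapping to 0.5.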
The Poisson distribution concerns counts of events occurring under a set of circumstances. Examples are traffic accidents in a particular location, the number of fish found in a pond, the number of cells seen in a microscopic field, the number of asthmatic attacks in a group of children over a period of time (e.g. per 100 child years), and pregnancies in women using a particular contraceptive (e.g. pregnancies per 100 women years).
Box-Cox Transformation
Data with Poisson distributions are integers >=0, although averaged counts (λ) can be decimal values. Poisson data have a positive skew at low values, with a short tail towards 0 and a long tail towards higher values. When Poisson values are large (>30), the distribution becomes increasingly similar to the Binomial distribution and then the normal distribution. The Negative Binomial distribution, counting the number of cases between consecutive positive cases, is very similar to Poisson, and is often treated as such. Count data containing high values can therefore be treated as normally distributed, or be made approximately normal by a logarithmic transformation. Count data of low values, however, require specific transformations; two commonly used ones are presented on this page.
Anscombe Transformation and different methods of reverse transformation
Anscombe transformation assumes the input data to have a Poisson distribution, and transforms them to a set with normal distribution.
More recently, two additional reverse formulae have been proposed, to be used when the counts are very low, in order to reduce the random variance.
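The Anscombe transformation and two of its reverse forms can be sketched in Javascript as follows, using the same expressions as the R code on this page (sections 3.1a to 3.1c). The function names are illustrative assumptions. The "exact" reverse from section 3.1d is left out of this sketch and can be found in the R section.

```javascript
// Forward Anscombe transformation of a count x.
const anscombe = (x) => 2 * Math.sqrt(x + 3 / 8);

// Algebraic reverse: the direct inverse of the forward formula.
const reverseAnscombeAlgebraic = (y) => (y * y) / 4 - 3 / 8;

// "Unbiased" reverse as used in section 3.1c of the R code:
// the same curve shifted upward by 1/4.
const reverseAnscombeUnbiased = (y) => (y * y) / 4 - 1 / 8;

const y = anscombe(0.9);
console.log(y);
console.log(reverseAnscombeAlgebraic(y)); // recovers the original value
console.log(reverseAnscombeUnbiased(y)); // 0.25 higher than the algebraic reverse
```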
Freeman-Tukey Transformation
The Freeman-Tukey transformation is an adaptation of the arcsine transformation, specifically to convert data with a Poisson distribution to values with normal distribution. The formula is y = sqrt(x + 1) + sqrt(x)
No specific formula is used for the reverse Freeman-Tukey transformation. The reversed value is found by iterative approximation, until a Poisson distributed value is found that reproduces the normally distributed value to within a set tolerance. Please note: in estimating the reverse, the normally distributed values must be >=1, or the iterative approximation will not converge and the program hangs.
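The forward formula and an iterative reverse can be sketched in Javascript as follows. This is a bisection search in the spirit of the R code on this page, but it is a variant rather than a copy: it rejects inputs below 1 instead of looping forever, widens the search bracket as needed, and caps the number of iterations. The function names and the tolerance are assumptions. Because (sqrt(x + 1) + sqrt(x)) * (sqrt(x + 1) - sqrt(x)) = 1, the result can be cross-checked against ((y - 1/y) / 2)^2.

```javascript
// Forward Freeman-Tukey transformation.
const freemanTukey = (x) => Math.sqrt(x + 1) + Math.sqrt(x);

// Reverse by bisection: find x >= 0 such that freemanTukey(x) is close to y.
function reverseFreemanTukey(y, tolerance = 1e-9) {
  // freemanTukey(0) = 1 is the smallest possible value, so y < 1 has no solution.
  if (!Number.isFinite(y) || y < 1) {
    throw new RangeError("y must be a finite number >= 1");
  }
  let lo = 0;
  let hi = 1;
  while (freemanTukey(hi) < y) hi *= 2; // widen until the bracket contains the answer
  for (let i = 0; i < 200 && hi - lo > tolerance; i++) {
    const mid = (lo + hi) / 2;
    if (freemanTukey(mid) > y) hi = mid;
    else lo = mid;
  }
  return (lo + hi) / 2;
}

const x = reverseFreemanTukey(2.33);
const crossCheck = ((2.33 - 1 / 2.33) / 2) ** 2;
console.log(x, crossCheck);
```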
Box and Cox, in 1964, devised the method to transform data with an exponential distribution to that of normal distribution. This means the transformation is usable across a wide variety of distributions, including Poisson, Negative Binomial, and inverse Gaussian. The transformation uses a parameter lambda (λ), which controls the extent of the transformation
This page provides the following alternative algorithms to optimize λ
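The forward Box-Cox transformation, and one simple way of choosing λ, can be sketched in Javascript as follows. The forward formula and the skewness definition follow the R code on this page (ForwardBoxCox and GetParameters). The search itself is an assumption made for illustration: a plain grid scan for the λ giving the smallest absolute skewness, rather than the interval-narrowing search used in the R code. The function names are hypothetical.

```javascript
// Forward Box-Cox: log(x) when lambda is 0, otherwise (x^lambda - 1) / lambda.
function boxCox(data, lambda) {
  return data.map((x) =>
    lambda === 0 ? Math.log(x) : (x ** lambda - 1) / lambda
  );
}

// Skewness as defined in GetParameters: sum((x - mean)^3) / ((n - 1) * sd^3).
function skewness(data) {
  const n = data.length;
  const mean = data.reduce((s, v) => s + v, 0) / n;
  const sd = Math.sqrt(data.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1));
  return data.reduce((s, v) => s + (v - mean) ** 3, 0) / ((n - 1) * sd ** 3);
}

// Grid scan (an illustrative choice): try lambda from -maxAbsLambda to
// +maxAbsLambda and keep the value with the smallest absolute skewness.
function lambdaForSmallestSkew(data, maxAbsLambda = 5, stepsPerUnit = 100) {
  let best = { lambda: 1, absSkew: Infinity };
  const last = maxAbsLambda * stepsPerUnit;
  for (let k = -last; k <= last; k++) {
    const lambda = k / stepsPerUnit; // integer steps keep lambda = 0 exact
    const absSkew = Math.abs(skewness(boxCox(data, lambda)));
    if (absSkew < best.absSkew) best = { lambda, absSkew };
  }
  return best;
}

// Example data taken from the R section of this page (all values positive).
const counts = [0.9, 2.1, 3.4, 0.5, 2.4, 3.4, 2.4, 1.1, 1.6, 0.2,
                5.9, 0.2, 0.6, 0.2, 1.6, 3.4, 1.7, 1.7, 0.5, 1.9];
const best = lambdaForSmallestSkew(counts);
console.log(skewness(counts), best.lambda, best.absSkew);
```

All input values must be positive, since both the logarithm and fractional powers are undefined for zero or negative values.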
This panel presents the R codes for all transformations. Detailed discussions for each transformation are in the Introduction panel
GetParameters is a function that calculates sample size (n), mean, SD, skewness, Kurtosis, and Chi Sq of a vector. This function is repeatedly used in all sections so are placed at the beginning. Users may need to copy this closer to individual transformations for use. # Get parameters: get n, Mean, SD, Skewness, Kurtosis, and Chi Sq from vector # Needed for all procedures GetParameters <- function(ar) { n = length(ar) mean = mean(ar) sd = sd(ar) skew = sum((ar - mean)^3) / ((n - 1) * sd^3) kurtosis = sum((ar - mean)^4) / ((n - 1) * sd^4) - 3 chiSq = n * skew^2 / 6 + n * kurtosis^2 /24 c(n,mean,sd,skew,kurtosis,chiSq) } Section 1. Linear transformations1.1. Transform to new minimum and maximunInput data is vector dat, to be determined by the user newMin (minimum) and newMax (maximum) are nominated by the user # new minmax dat = c(-0.7,0.3,1.1,-1.3,0.5,1.1,0.5,-0.5,-0.1,-1.9,2.3-2.0-1.1-1.9,-0.11,10.0,0.0,-1.3,0.2) oldMin = min(dat) oldMax = max(dat) oldDiff = oldMax - oldMin newMin = 0 newMax = 1 newDiff = newMax - newMin res <- (dat-oldMin) / oldDiff * newDiff + newMin res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqResults are [1] 0.15748031 0.23622047 0.29921260 0.11023622 0.25196850 0.29921260 0.25196850 0.17322835 0.20472441 [10] 0.06299213 0.00000000 0.20393701 1.00000000 0.21259843 0.11023622 0.22834646 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.000000 0.318125 2.786549 2.649353 6.852096 50.018330 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.0000000 0.2376476 0.2194133 2.6493527 6.8520957 50.01833001.2. 
Transform to new new mean and Standard Deviations Input data is vector dat, to be determined by the user newMean and newSD are nominated by the user # new mean / SD dat = c(-0.7,0.3,1.1,-1.3,0.5,1.1,0.5,-0.5,-0.1,-1.9,2.3-2.0-1.1-1.9,-0.11,10.0,0.0,-1.3,0.2) oldMean = mean(dat) oldSD = sd(dat) newMean = 100 newSD = 10 res <- (dat-oldMean) / oldSD * newSD + newMean res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 96.34629 99.93496 102.80589 94.19309 100.65269 102.80589 100.65269 97.06402 98.49949 92.03989 [11] 89.16895 98.46360 134.74504 98.85835 94.19309 99.57609 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.000000 0.318125 2.786549 2.649353 6.852096 50.018330 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.000000 100.000000 10.000000 2.649353 6.852096 50.0183301.3. Transform to ranks Input data is vector dat, to be determined by the user # ranking dat = c(-0.7,0.3,1.1,-1.3,0.5,1.1,0.5,-0.5,-0.1,-1.9,2.3-2.0-1.1-1.9,-0.11,10.0,0.0,-1.3,0.2) res <- rank(dat, ties.method = "average") # min rank = 1 res <- res - 1 # min rank = 0 res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 4.0 10.0 13.5 2.5 11.5 13.5 11.5 5.0 7.0 1.0 0.0 6.0 15.0 8.0 2.5 9.0 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.000000 0.318125 2.786549 2.649353 6.852096 50.018330 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 16.000000000 7.500000000 4.750438576 -0.004664111 -1.336702456 1.191240315 Section 2. Curved transform2.1a. 
log (base e) TransformationInput data is vector dat, to be determined by the user # log transformation dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9) res <- log(dat) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] -0.10536052 0.74193734 1.22377543 -0.69314718 0.87546874 1.22377543 0.87546874 0.09531018 0.47000363 [10] -1.60943791 1.77495235 -1.60943791 -0.51082562 -1.60943791 0.47000363 1.22377543 0.53062825 0.53062825 [19] -0.69314718 0.64185389 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 0.1923394 1.0109045 -0.5624999 -0.8140841 1.60696462.1b. Exponential (antilog base e) Transformation Input data is vector dat, to be determined by the user # Exponential transformation dat = c(-0.11,0.74,1.22,-0.69,0.88,1.22,0.88,0.10,0.47,-1.61,1.78,-1.61,-0.51,-1.61, 0.47,1.22,0.53,0.53,-0.69,0.64) res <- exp(dat) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.8958341 2.0959355 3.3871877 0.5015761 2.4108997 3.3871877 2.4108997 1.1051709 1.5999942 0.1998876 [11] 5.9298564 0.1998876 0.6004956 0.1998876 1.5999942 3.3871877 1.6989323 1.6989323 0.5015761 1.8964809 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 0.1925000 1.0108041 -0.5626384 -0.8097363 1.6016006 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 1.785390 1.439018 1.150753 1.201415 5.6169402.2a. 
log (nominated base) transformation Input data is vector dat, to be determined by the user Base is nominated by user, commonly used base are 2 and 10 # Log base transformation dat = c(9,21,34,5,24,34,24,11,16,2,59,2,6,2,16,34,17,17,5,19) base = 10 # user determined base res <- log(dat,base) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.9542425 1.3222193 1.5314789 0.6989700 1.3802112 1.5314789 1.3802112 1.0413927 1.2041200 0.3010300 [11] 1.7708520 0.3010300 0.7781513 0.3010300 1.2041200 1.5314789 1.2304489 1.2304489 0.6989700 1.2787536 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 17.850000 14.364705 1.135229 1.136463 5.372107 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 1.0835319 0.4390302 -0.5624999 -0.8140841 1.60696462.2b. antilog (nominated base) Transformation Input data is vector dat, to be determined by the user Base is nominated by user, commonly used base are 2 and 10 # Antilog bass transformation dat = c(0.95,1.32,1.53,0.70,1.38,1.53,1.38,1.04,1.20,0.30,1.77,0.30,0.78,0.30,1.20,1.53,1.23,1.23,0.70,1.28) base = 10 # user determined base res <- exp(dat * log(base)) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 8.912509 20.892961 33.884416 5.011872 23.988329 33.884416 23.988329 10.964782 15.848932 1.995262 [11] 58.884366 1.995262 6.025596 1.995262 15.848932 33.884416 16.982437 16.982437 5.011872 19.054607 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 1.0825000 0.4387407 -0.5622308 -0.8102032 1.6007025 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 17.801850 14.329795 1.137921 1.145022 5.4087802.3a. 
Arcsine transformation Input data is vector dat, to be determined by the user # Arcsine transformation dat = c(0.001,0.01,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.99,0.999) res <- asin(sqrt(dat)) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.03162805 0.10016742 0.32175055 0.46364761 0.57963974 0.68471920 0.78539816 0.88607712 0.99115659 [10] 1.10714872 1.24904577 1.47062891 1.53916828 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 1.300000e+01 5.000000e-01 3.626525e-01 1.818563e-17 -1.505980e+00 1.228487e+00 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 1.300000e+01 7.853982e-01 4.845403e-01 1.212293e-15 -1.179442e+00 7.535039e-012.3b. Reverse arcsine transformation Input data is vector dat, to be determined by the user # Reverse arcsine transformation dat = c(0.03,0.10,0.32,0.46,0.58,0.68,0.79,0.89,0.99,1.11,1.25,1.47,1.54) res <- sin(dat)^2 res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.000899730 0.009966711 0.098952121 0.197089922 0.300330235 0.395380667 0.504601772 0.603840501 0.698939437 [10] 0.802276136 0.900571808 0.989874462 0.999051886 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 13.000000000 0.785384615 0.485482842 -0.002376608 -1.185168283 0.760850161 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 13.000000000 0.500136568 0.363304272 -0.002285859 -1.511849683 1.2380931142.4a. 
logit transformation Input data is vector dat, to be determined by the user # Logit transformation dat = c(0.001,0.01,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.99,0.999) res <- log(dat/(1-dat)) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] -6.9067548 -4.5951199 -2.1972246 -1.3862944 -0.8472979 -0.4054651 0.0000000 0.4054651 0.8472979 [10] 1.3862944 2.1972246 4.5951199 6.9067548 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 1.300000e+01 5.000000e-01 3.626525e-01 1.818563e-17 -1.505980e+00 1.228487e+00 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 1.300000e+01 -9.834548e-17 3.569554e+00 -2.963784e-16 -1.779391e-01 1.715042e-022.4b. Logistic (reverse logit) transformation Input data is vector dat, to be determined by the user # Logistic (reverse logit) transformation dat = c(-10,-8,-6,-4,-2,-1,0,1,2,4,6,10) res <- 1 / (1 + exp(-dat)) res > res [1] 4.539787e-05 3.353501e-04 2.472623e-03 1.798621e-02 1.192029e-01 2.689414e-01 5.000000e-01 7.310586e-01 [9] 8.807971e-01 9.820138e-01 9.975274e-01 9.999546e-01 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 12.0000000 -0.6666667 5.8205488 0.1092094 -0.8718018 0.4038726 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 12.0000000 0.4583613 0.4353994 0.1504565 -1.8197489 1.70101742.4c. 
Bimodal (modified logistic) transformation Input data is vector dat, to be determined by the user lowMode and highMode are 5% and 95% confidence values for true and is nominated by user #Bimodal (modified logistic) transformation dat = c(7.32,7.33,7.36,7.33,7.33,7.24,7.40,7.27,7.30,7.25,7.21,7.34,7.39,7.30,7.24,7.32,7.32,7.48,7.26,7.25) lowMode = 7.25 # value at 0.5% confidence level highMode = 7.38 # value at 95% confidence level midValue = (lowMode + highMode) / 2; d = (highMode - midValue) / log((0.95 / (1 - 0.95))) res <- 1 / (1 + exp(-(dat - midValue) / d)) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.556382994 0.663623950 0.884776133 0.663623950 0.663623950 0.032375893 0.979172227 0.115223867 0.336376050 [10] 0.050000000 0.008523219 0.756295644 0.967624107 0.336376050 0.032375893 0.556382994 0.556382994 0.999432865 [19] 0.076459860 0.050000000 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.00000000 7.31200000 0.06469361 0.68942836 0.31068898 1.66481127 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.00000000 0.46425163 0.35840486 0.04551001 -1.50198348 1.88686585 > Section 3. Poisson to Normal Transformation3.1a. 
Anscombe Transformation # Anscombe Transformation dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9) res = 2.0 * sqrt(dat + 3.0 / 8.0) res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 2.258318 3.146427 3.885872 1.870829 3.331666 3.885872 3.331666 2.428992 2.810694 1.516575 5.009990 1.516575 [13] 1.974842 1.516575 2.810694 3.885872 2.880972 2.880972 1.870829 3.016621 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 2.7915426 0.9443963 0.4186452 -0.4447115 0.7490195Reverse Anscombe (algebraic) Transformation Input data is vector dat, to be determined by the user # Reverse Anscombe (algebraic) Transformation dat = c(2.26,3.15,3.89,1.87,3.33,3.89,3.33,2.43,2.81,1.52,5.01,1.52,1.97,1.52,2.81, 3.89,2.88,2.88,1.87,3.02) res = dat^2 / 4.0 - 3.0 / 8.0 res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 0.901900 2.105625 3.408025 0.499225 2.397225 3.408025 2.397225 1.101225 1.599025 0.202600 5.900025 0.202600 [13] 0.595225 0.202600 1.599025 3.408025 1.698600 1.698600 0.499225 1.905100 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 2.7925000 0.9446630 0.4203429 -0.4496880 0.7574767 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 1.786456 1.437622 1.133415 1.122643 5.3323693.1c. 
Reverse Anscombe (unbiased) Transformation Input data is vector dat, to be determined by the user # Reverse Anscombe (unbiased) Transformation dat = c(2.26,3.15,3.89,1.87,3.33,3.89,3.33,2.43,2.81,1.52,5.01,1.52,1.97,1.52,2.81, 3.89,2.88,2.88,1.87,3.02) res = dat^2 / 4.0 - 1.0 / 8.0 res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 1.151900 2.355625 3.658025 0.749225 2.647225 3.658025 2.647225 1.351225 1.849025 0.452600 6.150025 0.452600 [13] 0.845225 0.452600 1.849025 3.658025 1.948600 1.948600 0.749225 2.155100 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 2.7925000 0.9446630 0.4203429 -0.4496880 0.7574767 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.000000 2.036456 1.437622 1.133415 1.122643 5.3323693.1d. Reverse Anscombe (exact) Transformation Input data is vector dat, to be determined by the user Reverse Anscombe (exact) Transformation dat = c(2.26,3.15,3.89,1.87,3.33,3.89,3.33,2.43,2.81,1.52,5.01,1.52,1.97,1.52,2.81, 3.89,2.88,2.88,1.87,3.02) res <- dat^2 / 4 + sqrt(1/dat * 3 / 2) / 4 - 1/dat^2 * 11 / 8 + sqrt(1/dat^3 *3 / 2) * 5 / 8 - 1 / 8 res GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSqThe results are > res [1] 1.3116663 2.5264852 3.8221714 0.8792641 2.8169839 3.8221714 2.8169839 1.5168628 2.0200486 0.5142849 [11] 6.3002987 0.5142849 0.9859128 0.5142849 2.0200486 3.8221714 2.1198637 2.1198637 0.8792641 2.3263824 > GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 2.7925000 0.9446630 0.4203429 -0.4496880 0.7574767 > GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq [1] 20.0000000 2.1824649 1.4582893 1.0619616 0.9728408 4.54789093.2. 
Freeman Tukey Transformation (common function)
This function is used by both the forward and reverse transformations.
Input data is vector dat, to be determined by the user
# Find a Freeman Tukey Transformation value
# function FindFTT is needed for both forward and reverse Freeman-Tukey Transformation
FindFTT <- function(v) # v=input value
{
  return (sqrt(v + 1) + sqrt(v))
}

3.2a. Freeman Tukey Transformation (forward)
Input data is vector dat, to be determined by the user
# Freeman Tukey Transformation (forward)
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
res = FindFTT(dat)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] 2.327088 3.209819 3.941527 1.931852 3.393102 3.941527 3.393102 2.497947 2.877363 1.542659 5.055777 1.542659
[13] 2.039508 1.542659 2.877363 3.941527 2.947008 2.947008 1.931852 3.081344
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 2.8481344 0.9486997 0.3784766 -0.4740659 0.6647640

3.2b. Reverse Freeman Tukey Transformation
Input data is vector dat, to be determined by the user
Please note: all values must be >= 1, or the program will hang. FindFTT(0) = 1 is the smallest value the forward transformation can produce, so the search cannot home in on a smaller target and may loop forever. Likewise, with the upper search limit rx = 100, values above FindFTT(100) (about 20.05) are out of range.
# Reverse Freeman Tukey Transformation
# all values must be >=1
dat = c(2.33,3.21,3.94,1.93,3.39,3.94,3.39,2.50,2.88,1.54,5.06,1.54,2.04,1.54,2.88,3.94,2.95,2.95,1.93,3.08)
res = vector()
for (y in dat) # iterative approximation (interval halving) for each value in vector
{
  lx = 1e-10 # lower limit of search
  rx = 100 # upper limit of search
  mx = (lx+rx) / 2
  my = FindFTT(mx)
  k = 0 # iteration counter
  while(abs(my-y)>0.01)
  {
    if(my>y) rx = mx else lx = mx
    mx = (rx + lx) / 2
    my = FindFTT(mx)
    k = k + 1
  }
  res <- append(res,mx)
}
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] 0.9033203 2.0996094 3.3935547 0.5004883 2.3925781 3.3935547 2.3925781 1.0986328 1.6113281 0.1953125
[11] 5.9082031 0.1953125 0.5981445 0.1953125 1.6113281 3.3935547 1.7089844 1.7089844 0.5004883 1.9042969
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 2.8480000 0.9494020 0.3776856 -0.4661588 0.6565749
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785278 1.436826 1.136457 1.158377 5.423311

Section 4. Box Cox Transformation
Functions to be used by all forward Box-Cox transformations
# Functions to be used by all forward procedures
ForwardBoxCox <- function (ar,lambda) #calculate result vector from input vector and lambda
{
  if (lambda==0) return(log(ar))
  return ((ar^lambda - 1 ) / lambda)
}
# Find the best lambda for vector ar depending on the key
BestLambdaByKey <- function (ar,key) # key 1=n,2=mean,3=sd,4=skewness,5=kurtosis,6=chiSq
{
  minL = -50 # range of iteration may need to be changed
  maxL = 50
  while (abs(maxL-minL)>0.0001)
  {
    minv = 1e10
    j = 1
    step = (maxL - minL) / 4
    arL = rep(0,5) # 5 trial lambdas evenly spaced from minL to maxL
    for(i in 1:5)
    {
      arL[i] = (i-1) * step + minL # lambda
      bcAr <- ForwardBoxCox(ar,arL[i])
      tmpAr <- GetParameters(bcAr)
      k = abs(tmpAr[key])
      if(k<minv)
      {
        minv = k
        j = i # index of the best lambda so far
      }
    }
    # narrow the range to the neighbours of the best lambda
    if(j>1) minL = arL[j-1] else minL = arL[1]
    if(j<5) maxL = arL[j+1] else maxL = arL[5]
  }
  return (arL[j]) # best lambda, within the 0.0001 tolerance
}

4.1.a to 4.1.e are 5 different options for forward Box-Cox transformations

4.1.a. Forward Box Cox by lambda
# Forward Box Cox by lambda
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
lambda = 0 # user to change lambda
#lambda = 0.5 # user to change lambda
res <- ForwardBoxCox(dat,lambda)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
 [1] -0.10536052 0.74193734 1.22377543 -0.69314718 0.87546874 1.22377543 0.87546874 0.09531018 0.47000363
[10] -1.60943791 1.77495235 -1.60943791 -0.51082562 -1.60943791 0.47000363 1.22377543 0.53062825 0.53062825
[19] -0.69314718 0.64185389
> # Parameters = [SD, Skewness, Kurtosis, ChiSq]
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 0.1923394 1.0109045 -0.5624999 -0.8140841 1.6069646

4.1.b. Forward Box Cox for smallest SD
Input data is vector dat, to be determined by the user
# Forward Box Cox for smallest SD
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
lambda = BestLambdaByKey(dat,3) #3 is SD
res <- ForwardBoxCox(dat,lambda)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
 [1] -0.10479905 0.77059381 1.30304225 -0.66932182 0.91555161 1.30304225 0.91555161 0.09577277 0.48139702
[10] -1.48483417 1.94492116 -1.48483417 -0.49780637 -1.48483417 0.48139702 1.30304225 0.54518049 0.54518049
[19] -0.66932182 0.66322722
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 0.2436074 1.0061865 -0.4167196 -0.8855300 1.2323203

4.1.c. Forward Box Cox for skewness closest to 0
Input data is vector dat, to be determined by the user
# Forward Box Cox for skewness closest to 0
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
lambda = BestLambdaByKey(dat,4) #4 is skewness
res <- ForwardBoxCox(dat,lambda)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] -0.10335585 0.85238967 1.54361113 -0.61224544 1.03190910 1.54361113 1.03190910 0.09699126 0.51282904
[10] -1.21640261 2.49913603 -1.21640261 -0.46593467 -1.21640261 0.51282904 1.54361113 0.58563165 0.58563165
[19] -0.61224544 0.72347173
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 2.000000e+01 3.810286e-01 1.038916e+00 8.387984e-05 -8.273129e-01 5.703723e-01

4.1.d. Forward Box Cox for Kurtosis closest to 0
Input data is vector dat, to be determined by the user
# Forward Box Cox for Kurtosis closest to 0
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
lambda = BestLambdaByKey(dat,5) #5 is kurtosis
res <- ForwardBoxCox(dat,lambda)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] -0.1013823 0.9867730 1.9846252 -0.5430396 1.2288948 1.9846252 1.2288948 0.0987297 0.5614531
[10] -0.9434720 3.6561498 -0.9434720 -0.4258453 -0.9434720 0.5614531 1.9846252 0.6490465 0.6490465
[19] -0.5430396 0.8202663
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 2.000000e+01 5.975430e-01 1.206211e+00 6.510760e-01 1.186852e-04 1.413000e+00

4.1.e. Forward Box Cox for smallest Chi Sq
Input data is vector dat, to be determined by the user
# Forward Box Cox for smallest Chi Sq
dat = c(0.9,2.1,3.4,0.5,2.4,3.4,2.4,1.1,1.6,0.2,5.9,0.2,0.6,0.2,1.6,3.4,1.7,1.7,0.5,1.9)
lambda = BestLambdaByKey(dat,6) #6 is chi sq
res <- ForwardBoxCox(dat,lambda)
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] -0.10282858 0.88552606 1.64708468 -0.59280189 1.07983207 1.64708468 1.07983207 0.09744735 0.52514358
[10] -1.13415389 2.75465444 -1.13415389 -0.45484084 -1.13415389 0.52514358 1.64708468 0.60159801 0.60159801
[19] -0.59280189 0.74757806
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785000 1.436470 1.135229 1.136463 5.372107
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 0.4346936 1.0683682 0.1662375 -0.6983293 0.4985029

4.1.f. Reverse Box Cox transformation based on lambda
Input data is vector dat, to be determined by the user
The value lambda controls the curvature, and is nominated by the user
# Reverse Box Cox transformation based on lambda
dat = c(-0.11,0.74,1.22,-0.69,0.88,1.22,0.88,0.10,0.47,-1.61,1.78,-1.61,-0.51,-1.61,0.47,1.22,0.53,0.53,-0.69,0.64)
lambda = 0 # user to change lambda
#lambda = 0.5 # user to change lambda
if(lambda==0) {
  res <- exp(dat)
} else {
  res <- (dat * lambda + 1.0)^(1/lambda)
}
res
GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
The results are
> res
 [1] 0.8958341 2.0959355 3.3871877 0.5015761 2.4108997 3.3871877 2.4108997 1.1051709 1.5999942 0.1998876
[11] 5.9298564 0.1998876 0.6004956 0.1998876 1.5999942 3.3871877 1.6989323 1.6989323 0.5015761 1.8964809
> GetParameters(dat) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.0000000 0.1925000 1.0108041 -0.5626384 -0.8097363 1.6016006
> GetParameters(res) #n,mean,sd,skewness,kurtosis,chiSq
[1] 20.000000 1.785390 1.439018 1.150753 1.201415 5.616940
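A quick way to check that the reverse formula really undoes the forward one is a round trip: transform some values forward, reverse them with the same lambda, and compare with the original. The sketch below is only an illustration under stated assumptions. ReverseBoxCox is a hypothetical helper name that does not appear in the code above; it wraps the reverse formula from 4.1.f in a function and adds a guard for values that cannot be reversed. ForwardBoxCox is repeated so the block runs on its own, and the short data vector is the first five values of the example data.

```r
# Forward Box-Cox, same formula as ForwardBoxCox in Section 4
ForwardBoxCox <- function(ar, lambda) {
  if (lambda == 0) return(log(ar))
  (ar^lambda - 1) / lambda
}

# Hypothetical helper (name assumed for this sketch): reverse Box-Cox,
# same formula as in 4.1.f
ReverseBoxCox <- function(ar, lambda) {
  if (lambda == 0) return(exp(ar))
  base <- ar * lambda + 1
  # a non-positive base has no real-valued positive original
  if (any(base <= 0)) stop("ar * lambda + 1 must be positive for every value")
  base^(1 / lambda)
}

# Round trip: forward then reverse should reproduce the input
dat <- c(0.9, 2.1, 3.4, 0.5, 2.4)
for (lambda in c(0, 0.5, -0.5)) {
  back <- ReverseBoxCox(ForwardBoxCox(dat, lambda), lambda)
  print(all.equal(back, dat)) # TRUE when equal within numerical tolerance
}
```

The reversed values in 4.1.f differ slightly from the original data only because the input there was rounded to two decimal places; with unrounded input the round trip is exact up to floating point error.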