Content Disclaimer
Copyright © 2020.
All Rights Reserved.
StatsToDo : Home Page



This is the home page of the Statistics Toolkit (StatsToDo) website.

The site began in the 1990s as a facility to support clinical data analysis at the Department of Obstetrics and Gynaecology, the Chinese University of Hong Kong. Since the author's retirement, the site has been transferred to a server away from the university, and additional procedures have gradually been added. Currently it is a stand-alone site, free to use by all who wish to do so.

Since 2020, attempts have been made to replace all php programs with Javascript programs, for the following reasons:

  • This reduces processing load on the server, providing easier access to other users
  • It eliminates limits on processing time, other than that on the user's computer and browser
  • It provides greater privacy for the users, as no data is transmitted out of the user's computer
  • It provides greater security for the user, as all the processing code is in the source of the page and available to the user
The pages are still served as php files, as php is used to assemble the components of the page. However, the pages are essentially html files, and can be saved as such for repeated use on the user's computer.

The original php programs are served with _Exp, _Pgm, or _Tab descriptions. The new pages, using Javascript, have no such description attachments.

The menu bar on top of every page provides links to the index of resources, and to contact the author.

Comments, feedback, corrections, suggestions, and questions are all very welcome. These can be sent via the contact page (linked from the menu at the top of every page).


The author has many years of experience in data analysis as a clinician, in research, auditing, and quality control. However, other than a one-semester course in Multiple Regression at the Master's level, all information provided on this site is based on knowledge from informal studies and personal experience.

Comments, advice, and suggestions, as well as the algorithms provided, are therefore offered "to the best of the author's current understanding", and cannot be considered as formal knowledge or authoritative instructions. Users in doubt are strongly advised to consult a professional statistician.

The author reserves the right to amend, update, and delete any information on the website without prior notice, and accepts no liability for any loss, change, or damage howsoever arising from any use or misuse of or reliance on any content from this website.

The resources of this site are free to all who wish to use them. However, a condition of use is that the user accepts his/her own responsibility for accepting or reproducing any information from the site or communications with the author.

Numerical precision

Non-integer numerical output from StatsToDo is by default given to 4 decimal places, unless the situation warrants otherwise. Experience indicates that this is more than sufficient in most circumstances. Percentages (%) are usually presented to 1 decimal place, and probabilities to 2 or at most 3 decimal places, while t, z, F, and Chi Square have little meaning after the second decimal place. Users should therefore be aware that 4 decimal places of precision is often redundant, and should edit numerical results before publication.
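
As an illustration of the editing suggested above, the following is a minimal Python sketch that rounds a result according to the kind of statistic it represents. The function name format_result, the DECIMALS table, and the sample values are all hypothetical and are not part of any StatsToDo program; the decimal places simply follow the rule of thumb in the paragraph above.

```python
# Hypothetical lookup table (not from StatsToDo): decimal places suggested
# for each kind of result, following the rule of thumb described above.
DECIMALS = {
    "percent": 1,      # percentages are usually presented to 1 decimal place
    "probability": 3,  # probabilities to 2 or at most 3 decimal places
    "statistic": 2,    # t, z, F, and Chi Square to 2 decimal places
    "default": 4,      # the default precision of StatsToDo output
}


def format_result(value, kind="default"):
    """Return value as text, rounded to the decimal places suggested for kind."""
    places = DECIMALS.get(kind, DECIMALS["default"])
    return f"{value:.{places}f}"


# Made-up sample values, for illustration only
print(format_result(2.3457))                 # 2.3457 (default output)
print(format_result(2.3457, "statistic"))    # 2.35
print(format_result(0.0213, "probability"))  # 0.021
print(format_result(45.6789, "percent"))     # 45.7
```

The table keeps the rounding rules in one place, so a user who prefers 2 decimal places for probabilities only needs to change one number.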

Many mathematical procedures are iterative approximations. Depending on the power of the processor, and the limits of approximation set during computation, the results may differ. In addition, as servers are continuously upgraded and become more powerful, results produced from the same program may change over time. Experience indicates that numerical outputs from different sources may differ by as much as 0.1% to 0.3%, and similar differences may occasionally be seen between results of current computation and static tables produced previously. Users should understand the cause of these differences, and not be confused or alarmed. If in doubt, users should send questions via the contact page or consult their own statistical advisor.
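
The effect of the limit of approximation can be sketched with a simple example. The Python code below is an illustration only, and is not taken from any StatsToDo program: it approximates a square root by Newton's method, stopping when two successive estimates differ by less than a chosen tolerance. The function name newton_sqrt and the two tolerance values are assumptions made for this example.

```python
def newton_sqrt(x, tolerance, max_iter=100):
    """Approximate the square root of x (x > 0) by Newton's method.

    Iteration stops when two successive estimates differ by less than
    tolerance, or after max_iter iterations.
    """
    estimate = x if x > 1 else 1.0
    for _ in range(max_iter):
        new_estimate = 0.5 * (estimate + x / estimate)
        if abs(new_estimate - estimate) < tolerance:
            return new_estimate
        estimate = new_estimate
    return estimate


# The same calculation with a loose and a tight limit of approximation
loose = newton_sqrt(2.0, tolerance=0.1)
tight = newton_sqrt(2.0, tolerance=1e-12)

relative_difference = abs(loose - tight) / tight
print(f"loose: {loose:.4f}  tight: {tight:.4f}")       # loose: 1.4167  tight: 1.4142
print(f"relative difference: {relative_difference:.2%}")  # relative difference: 0.17%
```

Both results come from the same algorithm and the same input; only the stopping rule differs. This is one way in which two correct programs can produce slightly different numbers.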

Default Example Data

In nearly all programs offered from StatsToDo, default example data are provided, to show the user the format of data input, and to provide an example of the results that are produced.

The default examples are all artificially generated to demonstrate the procedures involved, and must not be interpreted as anything reflecting reality. The research model in each example is also deliberately simplistic, so that it does not distract the user from focussing on the statistical and computational aspects.

The sizes of the example data are also very small, so that the user can visualize them easily, but they do not reflect the required sample size for meaningful interpretation of results.

After testing and understanding the program with the default example data, the user can then replace them with his/her own data to produce the required results.


All programs in php, Javascript, R, and Python are written by the author personally, although text books, journal articles, and the www were widely consulted and copied during development. As far as the author is aware, all sources used are in the public domain, fully referenced, and the copyrights of others have not been infringed.

The author reserves the copyright of the contents of this site, mainly to avoid possible disputes. Users are free to copy any contents of the site (explanations, tables, algorithms) for their own use, on condition that they accept responsibility for the results produced. Acknowledgements are not obligatory, but would be greatly appreciated.

Changes Since 2020

The site is reviewed and improved continuously, but a major redevelopment began in 2020. The reasons for the redevelopment are:
  • Having something to do during the pandemic lockdowns
  • To reduce pressure on the server (traffic and server side processing)
  • To simplify the structure of information by combining all contents related to a subject on the same page
  • To confine data processing to the user's computer, to avoid security risks
  • To validate the calculations by cross checking against available R codes
  • To allow users to view the source code, so that the user can
    • Check the accuracy of the source code and take responsibility for the results
    • Save the page as html on his/her own computer and use the page repeatedly, independently of the server
    • Copy the codes to develop his/her own applications
  • To provide programs in R or Python codes where the computation is beyond the author's capability
The approaches to development are:
  • To use php on the server side only to assemble the program. All calculations are to be performed on the client side using Javascript
  • Whenever possible, the programs are duplicated in R codes. In some pages, such as encryption, Python codes are presented
  • To present each subject, with its explanations, programs, tables, and examples, on the same page
Redevelopment is ongoing, and currently only some of the pages have been redeveloped.