Content Disclaimer
Copyright © 2020.
All Rights Reserved.

StatsToDo: Home Page


This is the home page of the StatsToDo website.

The site began in the 1990s as a facility to support clinical data analysis at the Department of Obstetrics and Gynaecology, the Chinese University of Hong Kong. Since the author's retirement, the contents have been enlarged and improved as a personal hobby, and are hosted on a personal server. It is currently a stand-alone site, free to use by all who wish to do so.

Page formats have changed since 2020, as follows:

  • Two formats of presentation are used for complex pages: file-like tabbed panels and collapsible panels. The former displays all the options more clearly, but only one panel can be viewed at any time. The latter allows multiple panels to be viewed together, but doing so distorts the arrangement of the indicator tabs.
  • The replacement of server-side calculations (php) by client-side calculations (Javascript). The pages are still assembled and served as php files, but data processing is carried out on the client side using Javascript, for the following reasons:
    • It reduces processing load on the server, providing easier access to other users
    • It eliminates limits on processing time, other than that set by the user's own computer and browser
    • It provides greater privacy for the users, as no data leaves the user's computer, and Javascript has no access to the files of the user's computer
    • It allows users to view and verify the program code used, as it is all listed on the back of the page
  • The inclusion of R and Python codes for some of the procedures.

    R codes were initially used to check the correctness of the Javascript programs, but are now included to encourage the use of R, which provides a vast number of statistical programs that are generally accepted as validated.

    Python is used for encryption programs as its algorithms are validated and more secure.

All pages of the site are listed and linked in the index of resources page, accessible via the menu bar at the top of every page.

Comments, feedback, corrections, suggestions, and questions are all very welcome. These can be sent via the contact page (linked from the menu at the top of each page). Please be aware that a reply depends on having a return email address.

About StatsToDo

This site is not intended, and is insufficient, as a reference or teaching site for statistics.

Over the years, programs were added according to the needs of ongoing projects of a clinical academic department, in response to questions and suggestions, and as interests dictated. There is therefore no systematic organisation of the information available, and otherwise important algorithms that have not been needed or noticed by the author may be missing.

Many old programs, superseded by better ones, have been deleted, but some outdated programs may still be present.

Many, but not all, of the programs have been used to produce results that were accepted by the major clinical journals.

Users should therefore consider this site as a data analysis tool box that may contain algorithms that are useful to their current needs, but not as a teaching resource or reference authority. Users should exercise their own judgements, and seek advice from experts when in doubt.


The author has many years of experience in data analysis as a clinician, in research, auditing, and quality control. However, other than a one-semester course in Multiple Regression at the Master's level, all information provided on this site is based on knowledge from informal studies and personal experience.

Comments, advice, and suggestions, as well as the algorithms provided, are therefore offered "to the best of the author's current understanding", and should not be considered as accepted knowledge or authoritative instructions. Users in doubt are strongly advised to consult a qualified statistician.

The author reserves the right to amend, change and delete any contents of the website without prior notice, and accepts no liability for any loss, change or damage howsoever arising from any use or misuse of or reliance on any content from this website.

The resources of this site are free to all who wish to use them. However, a condition of use is that the user accepts his/her own responsibility for using or reproducing any information obtained from the site or from communications with the author.


All contents of the pages from this site, including programs in php, Javascript, R and Python codes, are created by the author personally, although text books, journal articles, and the www were widely consulted and copied during development. As far as the author is aware, all sources used are in the public domain, fully referenced, and the copyrights of others have not been infringed.

The author reserves the copyright of contents of this site, mainly to avoid possible disputes. Users are free to copy any contents of the site (explanations, tables, algorithms) for their own use, on condition that they accept responsibility for the results produced.

References and Acknowledgement

Numerous enquiries have been received on how to reference the algorithms from StatsToDo. The usual accepted method is to place the URL in angle brackets, e.g. < >.

As web pages tend to be edited, it would be better to reference the textbook or journal article that describes the algorithm in the first place. All algorithms from StatsToDo have references listed on the same page, so the user can access and read the references, and accept that the algorithm from StatsToDo correctly reflects the original description, before referencing the original work.

Technical Issues

Numerical precision

Non-integer numerical output from StatsToDo is by default presented to 4 decimal places, unless the situation warrants otherwise. Experience indicates that this is more than sufficient in most circumstances: percentages (%) are usually presented to 1 decimal place, probabilities to 2 or at most 3 decimal places, and t, z, F, and Chi Square have little meaning after the second decimal place. Users should be aware that 4 decimal places of precision is often redundant, and should edit numerical results before publication.
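
As an illustration only, the editing of results before publication could be sketched in Python as follows. This is not code from StatsToDo: the function name format_result and the category labels are hypothetical, and the number of decimal places for each category simply follows the conventions described above (probability is taken here at the maximum of 3).

```python
# Decimal places per kind of result, following the conventions described above.
# The category labels are hypothetical names chosen for this sketch.
DECIMAL_PLACES = {
    "percent": 1,      # percentages: 1 decimal place
    "probability": 3,  # probabilities: at most 3 decimal places
    "statistic": 2,    # t, z, F, Chi Square: 2 decimal places
}
DEFAULT_PLACES = 4     # the default precision of the site's output


def format_result(value: float, kind: str = "other") -> str:
    """Return value as text, rounded to the places conventional for its kind."""
    places = DECIMAL_PLACES.get(kind, DEFAULT_PLACES)
    return f"{value:.{places}f}"


if __name__ == "__main__":
    print(format_result(12.3456, "percent"))
    print(format_result(0.04567, "probability"))
    print(format_result(2.34567, "statistic"))
```

Any result whose kind is not listed keeps the default of 4 decimal places.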

Many mathematical procedures are iterative approximations. Depending on the power of the processor, and the limits of approximation set during computation, the results may differ. In addition, as processors are continuously upgraded and become more powerful, results produced from the same program may change over time. Experience indicates that numerical outputs from different sources may differ by as much as 0.1% to 0.3%, and similar differences may occasionally be seen between the results of current computation and static tables produced previously. Users should understand the cause of these differences, and not be confused or alarmed. If in doubt, the user should enquire via the contact page or consult his/her own statistical advisor.
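
The effect of the limit of approximation can be illustrated with a minimal Python sketch. It uses Newton's method for a square root purely as a familiar example of an iterative approximation; it is not one of the site's programs, and the function name newton_sqrt and the two tolerance values are assumptions made for the illustration.

```python
import math


def newton_sqrt(a: float, tolerance: float, max_iterations: int = 100) -> float:
    """Approximate the square root of a by Newton's method.

    Iteration stops when two successive estimates differ by less than
    tolerance, or after max_iterations steps.
    """
    if a < 0:
        raise ValueError("a must be non-negative")
    if a == 0:
        return 0.0
    x = a  # starting estimate
    for _ in range(max_iterations):
        x_new = 0.5 * (x + a / x)
        if abs(x_new - x) < tolerance:
            return x_new
        x = x_new
    return x


# The same calculation with a loose and a tight stopping limit.
loose = newton_sqrt(2.0, tolerance=1e-2)
tight = newton_sqrt(2.0, tolerance=1e-12)
relative_difference = abs(loose - tight) / tight

if __name__ == "__main__":
    print(loose, tight, relative_difference, math.sqrt(2.0))
```

The two results agree in the first few decimal places but not beyond, which is the kind of small discrepancy that can appear when the same procedure is run with different stopping limits.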

Default Example Data

In nearly all programs from this site, default example data are provided, to show the user the format of data input, and to provide an example of the results that are produced.

The default examples are all artificially generated to demonstrate the procedures involved, and must not be interpreted as anything reflecting reality. The research model in each example is also deliberately simplistic, so that it does not distract the user from focussing on the statistical and computational aspects.

The sizes of the example data are also very small, so that the user can visualize the data and results easily; they do not reflect the sample size required for meaningful interpretation of results.

After testing and understanding the program with the default example data, the user can then replace them with his/her own data to produce the required results.