> From <@uconnvm.uconn.edu:kent@darwin.eeb.uconn.edu> Wed Jan 7 12:51 GMT 1998
> To: ripley@stats.ox.ac.uk (Prof Brian Ripley)
> Cc: s-news@utstat.toronto.edu
> Subject: Re: Summary of Robust Regression Algorithms
> From: kent@darwin.eeb.uconn.edu (Kent E. Holsinger)
>
> >>>>> "Brian" == Prof Brian Ripley writes:
>
> Brian> My best example of this not knowing the literature is the
> Brian> Hauck-Donner (1977) phenomenon: a small t-value in a
> Brian> logistic regression indicates either an insignificant OR a
> Brian> very significant effect, but step.glm assumes the first,
> Brian> and I bet few users of glm() stop to think.
>
> All right I confess. This is a new one for me. Could some one explain
> the Hauck-Donner effect to me? I understand that the t-values from
> glm() are a Wald approximation and may not be terribly reliable, but I
> don't understand how a small t-value could indicate "either an
> insignificant OR a very significant effect."
>
> Thanks for the help. It's finding gems like these that make this group
> so extraordinarily valuable.
The first approach, being a kind of compendium model presently exercised by all UNESCO world reports with the exceptions of the World Education Report and the World Culture Report, is a major activity of the Organization and should be maintained through a reporting mechanism at longer intervals (e.g. every four to six years), whereas the second approach demands an appropriate timing as well as addressee, such as the General Conference of UNESCO, at two-year intervals. (unesdoc.unesco.org)

This is, in particular, what one uses when Y is qualitative. In that case we can look for P(y=0); but since a probability always lies between 0 and 1, we cannot express it as a linear combination of quantitative variables plus noise. We therefore apply to this probability a bijection g between the open interval (0, 1) and the real line (g is called a link). We then try to express g(P(y=0)) as a linear combination of the predictor variables.
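
As an illustration, here is a minimal Python sketch of one common choice of g, the logit link. The text does not say which link is meant, so the logit is an assumption, and the coefficients b0, b1 and the predictor value x are made up for the example.

```python
import math

def logit(p):
    """Link g: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and predictor value, for illustration only.
b0, b1 = -1.0, 0.5
x = 3.0

eta = b0 + b1 * x     # linear predictor: can be any real number
p = inv_logit(eta)    # fitted probability: always strictly between 0 and 1
print(eta, p)
```

Whatever values the linear predictor takes, the fitted probability stays inside (0, 1), which is the point of modelling g(P(y=0)) rather than P(y=0) itself.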

The aim is to predict the value of a qualitative variable, i.e., to assign individuals to classes. (For example: medical diagnosis support, identification of bad payers by a bank, etc.) We look for "linear discriminant functions" (linear combinations of the variables that maximize the between-class variance and minimize the within-class variance).

There is one fairly common circumstance in which both convergence problems and the Hauck-Donner phenomenon (and trouble with \sfn{step}) can occur. This is when the fitted probabilities are extremely close to zero or one. Consider a medical diagnosis problem with thousands of cases and around fifty binary explanatory variables (which may arise from coding fewer categorical factors); one of these indicators is rarely true but always indicates that the disease is present. Then the fitted probabilities of cases with that indicator should be one, which can only be achieved by taking \hat\beta_i = \infty. The result from \sfn{glm} will be warnings and an estimated coefficient of around +/- 10 [and an insignificant t value].
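
The behaviour described above can be reproduced in the simplest possible setting. The following is a rough Python sketch (not S code, and not taken from V&R2), assuming a one-parameter model: y successes in n binomial trials, with beta = logit(p) and the null hypothesis beta = 0. The function name wald_and_lr and the counts used are illustrative choices only.

```python
import math

def wald_and_lr(y, n):
    """Wald z and likelihood ratio statistic for H0: beta = 0,
    where beta = logit(p) and y successes are seen in n trials.
    Requires 0 < y < n so that the estimate is finite."""
    p_hat = y / n
    beta_hat = math.log(p_hat / (1.0 - p_hat))
    # Standard error from the Fisher information n * p * (1 - p) for beta.
    se = 1.0 / math.sqrt(n * p_hat * (1.0 - p_hat))
    wald_z = beta_hat / se
    # Twice the log-likelihood gain of the fitted p_hat over p = 0.5.
    lr = 2.0 * (y * math.log(p_hat / 0.5)
                + (n - y) * math.log((1.0 - p_hat) / 0.5))
    return wald_z, lr

counts = (900, 990, 999)
results = [wald_and_lr(y, 1000) for y in counts]
for y, (z, lr) in zip(counts, results):
    print(f"y={y}: Wald z = {z:.2f}, LR = {lr:.1f}")
```

As the fitted probability moves from 0.9 towards 1, the likelihood ratio statistic keeps growing while the Wald z shrinks, because the standard error grows faster than the estimate.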

We are given two samples of size n and want to know whether their means differ significantly. We start by computing the two means and their difference. Then we repeat the computation, but with two samples of size n drawn at random from our 2n values. We continue until we have a good estimate of the distribution of these differences. Finally, we look at where our initial difference falls in this distribution: we reject the hypothesis of equality if it looks too extreme.
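
The procedure above can be sketched in a few lines of Python. This is only an illustration under assumptions: the data are invented, the number of reshuffles (10000) and the two-sided comparison are choices made here, and the name permutation_test is hypothetical.

```python
import random

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in means.
    Returns the observed difference and an estimated p-value."""
    rng = random.Random(seed)
    n = len(a)
    observed = sum(a) / n - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                      # random split of the pooled values
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        # Small tolerance so rounding error does not hide exact ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Count the observed split itself so the estimate is never exactly 0.
    return observed, (extreme + 1) / (n_perm + 1)

# Made-up measurements, for illustration only.
a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
b = [4.1, 4.3, 3.9, 4.6, 4.4, 4.0]
observed, p_value = permutation_test(a, b)
print(observed, p_value)
```

A small p-value means the original difference sits far in the tail of the reshuffled differences, which is the "too extreme" case in which equality of the means is rejected.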

knnTree Construct or predict with k-nearest-neighbor classifiers, using cross-validation to select k, choose variables (by forward or backwards selection), and choose scaling (from among no scaling, scaling each column by its SD, or scaling each column by its MAD). The finished classifier will consist of a classification tree with one such k-nn classifier in each leaf.
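
To make one ingredient of that description concrete, here is a minimal Python sketch of choosing k for a k-nearest-neighbor classifier by leave-one-out cross-validation. It is not the knnTree package's code or interface: the function names, the toy data, and the use of leave-one-out rather than some other cross-validation scheme are all assumptions, and variable selection, scaling, and the tree itself are left out.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to query.
    train is a list of (features, label) pairs."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_error(data, k):
    """Leave-one-out misclassification rate for a given k."""
    errors = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]           # hold out point i
        if knn_predict(rest, x, k) != label:
            errors += 1
    return errors / len(data)

def choose_k(data, candidates):
    """Candidate k with the lowest leave-one-out error (smallest k on ties)."""
    return min(candidates, key=lambda k: (loo_error(data, k), k))

# Made-up, well-separated two-class data, for illustration only.
data = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"), ((0.3, 0.2), "a"),
        ((2.0, 2.1), "b"), ((2.2, 1.9), "b"), ((1.9, 2.3), "b"), ((2.1, 2.0), "b")]
best_k = choose_k(data, (1, 3, 5))
print(best_k, knn_predict(data, (0.1, 0.1), best_k))
```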


There is a description in V&R2, pp. 237-8, given below. I guess I was teasing people to look up the Hauck-Donner phenomenon in our index. (I seem to remember this was new to my co-author too, so you were in good company. This is why it is such a good example of a fact which would be useful to know but hardly anyone does. Don't ask me how I knew: I only know that I first saw this in about 1980.)

We study the associations between qualitative variables, with no variable playing a special role (there is no variable to predict). The data are represented by a contingency table. We start by formulating a hypothesis about this table, which yields a certain model and certain values for the frequencies. For example, if there are two variables and we assume they are independent, the model implies that

To expand a little, if |t| is small it can EITHER mean that the Taylor expansion works and hence the likelihood ratio statistic is small OR that |\hat\beta_i| is very large, the approximation is poor and the likelihood ratio statistic is large. (I was using `significant' as meaning practically important.) But we can only tell if |\hat\beta_i| is large by looking at the curvature at \beta_i=0, not at |\hat\beta_i|. This really does happen: from later on in V&R2:

