Two sample rank tests with adaptive score functions using kernel density estimation

Greene, Brandon


Files

GreeneBrandon_2017_10_06.pdf (1.19 MB)

Date

2017

Authors

Greene, Brandon

License

http://rightsstatements.org/page/InC/1.0/

Citation link

http://dx.doi.org/10.22029/jlupub-9759

Abstract

In the basic two-sample testing problem we are interested in comparing two distributions F and G by testing the null hypothesis that they are equal against the alternative that they differ in some way, based on independent and identically distributed samples from each of them. When F and G are fixed and known under both the null hypothesis and the alternative, the well-known Neyman-Pearson lemma provides the most powerful test; in practical applications, however, it is usually not feasible to make such strong assumptions about the specification of F and G.

Non-parametric tests, and rank tests in particular, make no assumptions about the form of the distributions other than perhaps some degree of smoothness. Since the vector of ranks is known to be uniformly distributed under the null hypothesis, independently of the underlying distribution, the exact distribution of rank test statistics is available in this case. This leads to a class of tests which are valid under any null hypothesis distribution, but whose power can vary greatly with F and G under the alternative. In this work we look at rank tests of the form proposed by K. Behnen and G. Neuhaus (1983), which use an adaptive score function derived from densities of the transformed data. The proposed score function is adaptive in the sense that it provides a locally optimal test under any alternative; however, the densities involved are theoretical and need to be estimated from the data. To do this, we use simple rank-based kernel density estimators to construct a rank test statistic.

Hájek projections are used to prove a linearization of the test statistic as a sum of i.i.d. random variables plus negligible remainder terms, and this result is used to show asymptotic normality under the null hypothesis. A series of simulations, however, indicates that there are problems with the centering and scaling of the test statistic for finite sample sizes, and that the proven distributional convergence is very slow in practice. Further investigations show the reasons for each of these problems. Centering and scaling can be remedied with modifications to the score function and to the variance estimate of the test statistic, but the slow convergence is shown to be a result of the choice to use kernel density estimators. In a further series of simulations we compare the power of the derived tests, using their exact or Monte Carlo distributions, with that of the non-adaptive Wilcoxon rank-sum test under a selection of generalized shift alternatives.
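The distribution-free property mentioned in the abstract (under the null hypothesis every assignment of ranks to the first sample is equally likely) can be illustrated with a minimal sketch. It uses a fixed, non-adaptive Wilcoxon-type score rather than the kernel-based adaptive score studied in the thesis, and all function names here are hypothetical and chosen for illustration only. The sketch assumes continuous data without ties and a one-sided alternative in which the first sample tends to be larger.

```python
import itertools
import random

TOL = 1e-12  # tolerance for floating-point comparisons of statistic values


def ranks(values):
    """Return ranks 1..N of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_statistic(x, y, score=lambda u: u):
    """Linear rank statistic: sum of score(R_i / (N + 1)) over the x sample."""
    n_total = len(x) + len(y)
    pooled_ranks = ranks(list(x) + list(y))
    return sum(score(r / (n_total + 1)) for r in pooled_ranks[:len(x)])


def exact_null_distribution(m, n, score=lambda u: u):
    """All C(m+n, m) equally likely values of the statistic under H0."""
    n_total = m + n
    return sorted(
        sum(score(r / (n_total + 1)) for r in subset)
        for subset in itertools.combinations(range(1, n_total + 1), m)
    )


def exact_p_value(x, y, score=lambda u: u):
    """One-sided (upper-tail) p-value from the exact null distribution."""
    observed = rank_statistic(x, y, score)
    null = exact_null_distribution(len(x), len(y), score)
    return sum(1 for t in null if t >= observed - TOL) / len(null)


def monte_carlo_p_value(x, y, score=lambda u: u, draws=2000, seed=0):
    """Approximate the same p-value by random relabelling of the pooled data."""
    rng = random.Random(seed)
    observed = rank_statistic(x, y, score)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(draws):
        rng.shuffle(pooled)
        stat = rank_statistic(pooled[:len(x)], pooled[len(x):], score)
        if stat >= observed - TOL:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (draws + 1)
```

Exact enumeration is only practical for small samples, since the number of rank assignments grows combinatorially; for larger samples the Monte Carlo variant is the usual substitute. Passing a different `score` function changes the test without changing its validity under the null hypothesis.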

Collections

Dissertations/Habilitation theses
