Bug 14359 - Wishlist: enhance formula interface in t.test()
Summary: Wishlist: enhance formula interface in t.test()
Status: ASSIGNED
Alias: None
Product: R
Classification: Unclassified
Component: Analyses (show other bugs)
Version: R-devel (trunk)
Hardware: All All
: P3 enhancement
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2010-08-13 23:12 UTC by jwdietrich
Modified: 2017-04-26 07:18 UTC (History)
1 user (show)

See Also:


Attachments
r script with different uses of t.test (1.09 KB, application/octet-stream)
2010-08-13 23:12 UTC, jwdietrich
Details

Note You need to log in before you can comment on or make changes to this bug.
Description jwdietrich 2010-08-13 23:12:23 UTC
Created attachment 1121 [details]
r script with different uses of t.test

t.test seems to deliver incorrect results if the data vectors contain missing values.

Suppose, we have two vectors with numbers and some missing values that refer to the same individuals and that should therefore be evaluated with a paired t-test:

> testdata.A <- c(1.15, -0.2, NA, 1, -2, -0.5, 0.1, 1.2, -1.4, 0.01);
> testdata.B <- c(1.2, 1.1, 3, -0.1, 3, 1.1, 0, 1.3, 4, NA);

Then

> print(t.test(testdata.A, testdata.B, paired=TRUE, alternative="two.sided", na.action="na.pass"))

and

> print(t.test(testdata.A, testdata.B, paired=TRUE, alternative="two.sided", na.action="na.exclude"))

deliver the same p value (0.1162).

However, after combining the two vectors with

> testdata <- c(testdata.A, testdata.B);

and defining a criterion vector with

> criterion <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1);

we get a different p-value (0.01453) with

> print(t.test(testdata ~ criterion, paired=TRUE, alternative="two.sided", na.action="na.exclude")) .

The statement

> print(t.test(testdata ~ criterion, paired=TRUE, alternative="two.sided", na.action="na.pass"))

however, delivers a p-value of 0.1162 again.

With

> print(t.test(testdata[criterion==0], testdata[criterion==1], paired=TRUE, alternative="two.sided", na.action="na.exclude"))

that imitates the first form, we get again a p value of 0.1162.

Excel, StatView and KaleidaGraph all display a p-value of 0.1162.
Comment 1 jwdietrich 2010-08-13 23:51:20 UTC
I have just learned from Thomas Lumley that "if there is a bug it is only that t.test.formula doesn't throw an error when given the paired=TRUE option.

The documentation says "The formula interface is only applicable for the 2-sample tests.",  but there probably should be an explicit check -- I didn't see that in the documentation until I realized that t.test.formula couldn't work for paired tests because you don't tell it which observations are paired."

Therefore, it would be advisable to implement the check for a paired test in the formula notation. The t.test function itself seems to be correct.
Comment 2 Peter Dalgaard 2010-08-14 04:44:19 UTC
r-bugs@r-project.org wrote:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
> 
> 
> 
> 
> 
> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
> I have just learned from Thomas Lumley that "if there is a bug it is only that
> t.test.formula doesn't throw an error when given the paired=TRUE option.


Before anyone gets carried away, let me remind you of example(sleep).
I.e., we're actually using the current feature, even though we document
it not to exist...

(I se that example(t.test) is still performing the incorrect test. I
thought Martin was on the warpath agaist it at some point?)

The other option is to document what currently happens.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes@cbs.dk  Priv: PDalgd@gmail.com


Comment 3 Martin Maechler 2010-08-14 12:37:38 UTC
On Sat, Aug 14, 2010 at 01:44, Peter Dalgaard <pd.mes@cbs.dk> wrote:
> r-bugs@r-project.org wrote:
>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
>>
>>
>>
>>
>>
>> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
>> I have just learned from Thomas Lumley that "if there is a bug it is only that
>> t.test.formula doesn't throw an error when given the paired=TRUE option.

>
> Before anyone gets carried away, let me remind you of example(sleep).
> I.e., we're actually using the current feature, even though we document
> it not to exist...

Yes. {note that I did not understand what you wrote before thinking
about the 'sleep' example}.

I think there are two issues here:
1) "sleep":  A data frame of dim.  20 x 2  (variable 'extra' (=y) and 'group').
     As the help file explains these are measurements on 10 (!)
individuals, once with and once without
    "sleeping pill".
   *BUT* the data itself do not contain this information:  You just
have to know that obs. 1 & 11, 2 & 12,...
   each or of the same patient.
  I propose to add a third variable, an "ID" (factor), and also change
the example, using that ID.

  *As* the sleep data is really about a paired experiment, I would
feel very bad as a statistics professor,
  not to perform a paired test.  But there's no need to do it using
the formula interface.

2) t.test: documentation / implementation:
  a) NA handling:  We have this on the help page

Arguments:
 ..........
na.action: a function which indicates what should happen when the data
          contain ‘NA’s.  Defaults to ‘getOption("na.action")’.
.....
Details:
 .........
    Missing values are removed (in pairs if ‘paired’ is ‘TRUE’).

and I'd say that 'Details' should be a bit more specific: I currently
describes what happens in t.test.default(),
and that is called from t.test.formula *after* it has applied
na.action in building the model.frame.


> (I se that example(t.test) is still performing the incorrect test. I
> thought Martin was on the warpath agaist it at some point?)

Yes!  The sleep data is about a paired experiment, and we are bad
teachers of statistics,
if we advertize to do an unpaired test
(we could do it *after* doing it paired, and *with* a comment saying
that this is sub-optimal,
 losing power, ...,)

> The other option is to document what currently happens.

I would not want that... unless "we must" because of dozens of cases
people use this unfortunately.

The sleep data is a bad example for non-paired t-tests, see above,
and, as long it's missing the 'ID' code, actually a bad example of a
data set...

and I still agree with whoever had decided that "paired=TRUE" and
'formula'  should not be allowed to go toegether...

Martin

> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd.mes@cbs.dk  Priv: PDalgd@gmail.com
>
> _______________________________________________
> R-core list: https://stat.ethz.ch/mailman/listinfo/r-core
>
>


Comment 4 Peter Dalgaard 2010-08-14 13:34:42 UTC
Martin Maechler wrote:
> On Sat, Aug 14, 2010 at 01:44, Peter Dalgaard <pd.mes@cbs.dk> wrote:
>> r-bugs@r-project.org wrote:
>>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
>>>
>>>
>>>
>>>
>>>
>>> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
>>> I have just learned from Thomas Lumley that "if there is a bug it is only that
>>> t.test.formula doesn't throw an error when given the paired=TRUE option.
> 
>> Before anyone gets carried away, let me remind you of example(sleep).
>> I.e., we're actually using the current feature, even though we document
>> it not to exist...
> 
> Yes. {note that I did not understand what you wrote before thinking
> about the 'sleep' example}.
> 
> I think there are two issues here:
> 1) "sleep":  A data frame of dim.  20 x 2  (variable 'extra' (=y) and 'group').
>      As the help file explains these are measurements on 10 (!)
> individuals, once with and once without
>     "sleeping pill".
>    *BUT* the data itself do not contain this information:  You just
> have to know that obs. 1 & 11, 2 & 12,...
>    each or of the same patient.
>   I propose to add a third variable, an "ID" (factor), and also change
> the example, using that ID.


It still remains that there is no formula interface for paired tests
which is a bit of a nuisance, at least paedagogically.

"y ~ g + strata(ID)" is a definite possibility for "long"-format data,
with the paired test being the extreme case, but it doesn't carry too
well to the wilcox.test, where the natural large-strata stratified
analysism does not reduce to the signed-rank test.

I don't really think it is that harmful to have to assume that the data
set is sorted by pair (or pair within "group") and say so in the
documentation. The NA handling mess-up is kind of serious, though. There
must be a similar issue with subset=, but it is easier in that case to
require user discretion.

> 
>   *As* the sleep data is really about a paired experiment, I would
> feel very bad as a statistics professor,
>   not to perform a paired test.  But there's no need to do it using
> the formula interface.
> 
> 2) t.test: documentation / implementation:
>   a) NA handling:  We have this on the help page
> 
> Arguments:
>  ..........
> na.action: a function which indicates what should happen when the data
>           contain ‘NA’s.  Defaults to ‘getOption("na.action")’.



Yes. I would suggest that a simple workaround for the observed behaviour
 is to zap this option and internally hardcode it to na.pass.
t.test.default is doing NA removal anyway, so the only possible point of
using getOption would be if it were set na.fail globally (does anyone
ever do that?).

(BTW, model.frame has na.action formally defaulting to na.fail, but
internally circumvents it. Terribly confusing.)

> .....
> Details:
>  .........
>     Missing values are removed (in pairs if ‘paired’ is ‘TRUE’).
> 
> and I'd say that 'Details' should be a bit more specific: I currently
> describes what happens in t.test.default(),
> and that is called from t.test.formula *after* it has applied
> na.action in building the model.frame.
> 
> 
>> (I se that example(t.test) is still performing the incorrect test. I
>> thought Martin was on the warpath agaist it at some point?)
> 
> Yes!  The sleep data is about a paired experiment, and we are bad
> teachers of statistics,
> if we advertize to do an unpaired test
> (we could do it *after* doing it paired, and *with* a comment saying
> that this is sub-optimal,
>  losing power, ...,)
> 
>> The other option is to document what currently happens.
> 
> I would not want that... unless "we must" because of dozens of cases
> people use this unfortunately.
> 
> The sleep data is a bad example for non-paired t-tests, see above,
> and, as long it's missing the 'ID' code, actually a bad example of a
> data set...
> 
> and I still agree with whoever had decided that "paired=TRUE" and
> 'formula'  should not be allowed to go toegether...
> 
> Martin
> 
>> --
>> Peter Dalgaard
>> Center for Statistics, Copenhagen Business School
>> Phone: (+45)38153501
>> Email: pd.mes@cbs.dk  Priv: PDalgd@gmail.com
>>
>> _______________________________________________
>> R-core list: https://stat.ethz.ch/mailman/listinfo/r-core
>>
>>
> 



-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes@cbs.dk  Priv: PDalgd@gmail.com


Comment 5 Martin Maechler 2010-08-14 14:15:43 UTC
On Sat, Aug 14, 2010 at 10:34, Peter Dalgaard <pd.mes@cbs.dk> wrote:
> Martin Maechler wrote:
>> On Sat, Aug 14, 2010 at 01:44, Peter Dalgaard <pd.mes@cbs.dk> wrote:
>>> r-bugs@r-project.org wrote:
>>>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
>>>> I have just learned from Thomas Lumley that "if there is a bug it is only that
>>>> t.test.formula doesn't throw an error when given the paired=TRUE option.
>>
>>> Before anyone gets carried away, let me remind you of example(sleep).
>>> I.e., we're actually using the current feature, even though we document
>>> it not to exist...
>>
>> Yes. {note that I did not understand what you wrote before thinking
>> about the 'sleep' example}.
>>
>> I think there are two issues here:
>> 1) "sleep":  A data frame of dim.  20 x 2  (variable 'extra' (=y) and 'group').
>>      As the help file explains these are measurements on 10 (!)
>> individuals, once with and once without
>>     "sleeping pill".
>>    *BUT* the data itself do not contain this information:  You just
>> have to know that obs. 1 & 11, 2 & 12,...
>>    each or of the same patient.
>>   I propose to add a third variable, an "ID" (factor), and also change
>> the example, using that ID.

.. I changed the sleep.Rd example .. (but not using the ID), using
with(.., and t.test(x,y,..))
the example also in R-patched, so  at least we do *not* work against
the docs there.
...
if needed you can always change that more
>
> It still remains that there is no formula interface for paired tests
> which is a bit of a nuisance, at least paedagogically.
>
> "y ~ g + strata(ID)" is a definite possibility for "long"-format data,
> with the paired test being the extreme case, but it doesn't carry too
> well to the wilcox.test, where the natural large-strata stratified
> analysism does not reduce to the signed-rank test.
>
> I don't really think it is that harmful to have to assume that the data
> set is sorted by pair (or pair within "group") and say so in the
> documentation.

For the t-test (or other "paired" situations) I could -- just barely -- agree.
But in a more general case, I think just assuming observations ordered
in a certain way,
and giving complete non-sense answers when they are ordered differently,
is something very dangerous,
and I would not want to support such unsafe practice by a formula notation.

More natural (to me) than   y ~ g + strata("ID"),
would be to use   y ~ g | ID in this case.
I have to run ... for the rest of the day ...
and I didn't understand the "problem" with wilcox.test.

> The NA handling mess-up is kind of serious, though. There
> must be a similar issue with subset=, but it is easier in that case to
> require user discretion.
>
>>
>>   *As* the sleep data is really about a paired experiment, I would
>> feel very bad as a statistics professor,
>>   not to perform a paired test.  But there's no need to do it using
>> the formula interface.
>>
>> 2) t.test: documentation / implementation:
>>   a) NA handling:  We have this on the help page
>>
>> Arguments:
>>  ..........
>> na.action: a function which indicates what should happen when the data
>>           contain ‘NA’s.  Defaults to ‘getOption("na.action")’.
>
>
> Yes. I would suggest that a simple workaround for the observed behaviour
>  is to zap this option and internally hardcode it to na.pass.
> t.test.default is doing NA removal anyway, so the only possible point of
> using getOption would be if it were set na.fail globally (does anyone
> ever do that?).

Some people may..  As they have been allowed to use t.test that way
for a very long time,
it seems dangerous to change now

> (BTW, model.frame has na.action formally defaulting to na.fail, but
> internally circumvents it. Terribly confusing.)
>
>> .....
>> Details:
>>  .........
>>     Missing values are removed (in pairs if ‘paired’ is ‘TRUE’).
>>
>> and I'd say that 'Details' should be a bit more specific: I currently
>> describes what happens in t.test.default(),
>> and that is called from t.test.formula *after* it has applied
>> na.action in building the model.frame.
>>
>>
>>> (I se that example(t.test) is still performing the incorrect test. I
>>> thought Martin was on the warpath agaist it at some point?)
>>
>> Yes!  The sleep data is about a paired experiment, and we are bad
>> teachers of statistics,
>> if we advertize to do an unpaired test
>> (we could do it *after* doing it paired, and *with* a comment saying
>> that this is sub-optimal,
>>  losing power, ...,)
>>
>>> The other option is to document what currently happens.
>>
>> I would not want that... unless "we must" because of dozens of cases
>> people use this unfortunately.
>>
>> The sleep data is a bad example for non-paired t-tests, see above,
>> and, as long it's missing the 'ID' code, actually a bad example of a
>> data set...
>>
>> and I still agree with whoever had decided that "paired=TRUE" and
>> 'formula'  should not be allowed to go toegether...
>>
>> Martin


Comment 6 Peter Dalgaard 2010-08-14 22:35:43 UTC
Thomas Lumley wrote:
> On Sat, 14 Aug 2010, Peter Dalgaard wrote:
> 
>> r-bugs@r-project.org wrote:
>>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
>>>
>>>
>>>
>>>
>>>
>>> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
>>> I have just learned from Thomas Lumley that "if there is a bug it is only that
>>> t.test.formula doesn't throw an error when given the paired=TRUE option.
>> Before anyone gets carried away, let me remind you of example(sleep).
>> I.e., we're actually using the current feature, even though we document
>> it not to exist...
> 
> Um, no.

= sleep)
> asks for a two-sample t-test. It's conf> t.test(extra ~ group, data

using, because the previous line in the example asks for a paired t-test
on the same data, but it is not inconsistent with the documentation.

Um, yes. In example(sleep), we do (well, did, now that Martin changed it.)


>> (I se that example(t.test) is still performing the incorrect test. I
>> thought Martin was on the warpath agaist it at some point?)
> 
> The example is statistically incorrect but computationally correct: it is performing a two-sample t-test, and does not use paired=TRUE
> 
>> The other option is to document what currently happens.
> 
> I really don't think that's an option:
> 
> "The formula interface with paired=TRUE will delete missing data and

then assume that the non-missing data for the lhs and rhs still form
pairs in the order in which they appear in the file"

I was thinking more like "Beware that this is unreliable if there are
missing values in data."

> 
> 
> It would be an option to fix the missing-data handling so that

paired=TRUE works when the data are correctly ordered, but I really
don't like assuming a data frame just happens to be in the right order.
Vince Carey's gee package assumes this and in my experience it does
fairly often lead to errors.

That is a point. On the other hand, it is hard to guard against users
assuming R to have extra-sensory perception: If you give only y and g,
how else would R know what the pairs are? (Like assuming R to infer that
if g goes 1 2 1 2 1 1 2 2, then there are really 6 pairs, one with a
missing 2 and one with a missing 1. I suppose that is the sort of thing
happening with gee()?)

> 
> A more helpful option would be to allow paired=id (or paired=~id) rather

than paired=TRUE in the formula interface, where id is a variable
identifying the pairs. Or perhaps outcome~group+cluster(id).

or outcome ~ group | id, as Martin suggested. The paired = ~id has the
perspective that we could actually fudge in paired = TRUE as equivalent
to ~foo with foo set to unsplit(list(1:N,1:N), g). With a stern warning,
of course. Even if we don't want that, it gives an opportunity to make
paired=TRUE defunct ("paired must be a formula").


> 
>      -thomas
> 
> Thomas Lumley
> Professor of Biostatistics
> University of Washington, Seattle
> 
> 
> 



-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes@cbs.dk  Priv: PDalgd@gmail.com


Comment 7 Martin Maechler 2010-08-14 23:59:03 UTC
On Sat, Aug 14, 2010 at 19:35, Peter Dalgaard <pd.mes@cbs.dk> wrote:
> Thomas Lumley wrote:
>> On Sat, 14 Aug 2010, Peter Dalgaard wrote:
>>
>>> r-bugs@r-project.org wrote:
>>>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14359
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --- Comment #1 from jwdietrich <j.w.dietrich.bz@medizinische-kybernetik.de>  2010-08-13 18:51:20 ---
>>>> I have just learned from Thomas Lumley that "if there is a bug it is only that
>>>> t.test.formula doesn't throw an error when given the paired=TRUE option.
>>> Before anyone gets carried away, let me remind you of example(sleep).
>>> I.e., we're actually using the current feature, even though we document
>>> it not to exist...
>>
>> Um, no.
> = sleep)
>> asks for a two-sample t-test. It's conf> t.test(extra ~ group, data
> using, because the previous line in the example asks for a paired t-test
> on the same data, but it is not inconsistent with the documentation.
>
> Um, yes. In example(sleep), we do (well, did, now that Martin changed it.)
>
>
>>> (I se that example(t.test) is still performing the incorrect test. I
>>> thought Martin was on the warpath agaist it at some point?)
>>
>> The example is statistically incorrect but computationally correct: it is performing a two-sample t-test, and does not use paired=TRUE

Sure.  But statistically incorrect in an example for the t.test (for
many people *the* thing they learn from (inferential) statistics)) is
what brought me "on the warpath" (PD).
As I also said in a different place in the thread,
we can leave it there, if we also do the paired test, and use comments
explaining why/ that "paired" is
what really makes sense.

>>
>>> The other option is to document what currently happens.
>>
>> I really don't think that's an option:
>>
>> "The formula interface with paired=TRUE will delete missing data and
> then assume that the non-missing data for the lhs and rhs still form
> pairs in the order in which they appear in the file"
>
> I was thinking more like "Beware that this is unreliable if there are
> missing values in data."
>
>>
>>
>> It would be an option to fix the missing-data handling so that
> paired=TRUE works when the data are correctly ordered, but I really
> don't like assuming a data frame just happens to be in the right order.

Exactly, that's the point.  I agree fully.

> Vince Carey's gee package assumes this and in my experience it does
> fairly often lead to errors.
>
> That is a point. On the other hand, it is hard to guard against users
> assuming R to have extra-sensory perception: If you give only y and g,
> how else would R know what the pairs are? (Like assuming R to infer that
> if g goes 1 2 1 2 1 1 2 2, then there are really 6 pairs, one with a
> missing 2 and one with a missing 1. I suppose that is the sort of thing
> happening with gee()?)
>
>>
>> A more helpful option would be to allow paired=id (or paired=~id) rather
> than paired=TRUE in the formula interface, where id is a variable
> identifying the pairs. Or perhaps outcome~group+cluster(id).
>
> or outcome ~ group | id, as Martin suggested. The paired = ~id has the
> perspective that we could actually fudge in paired = TRUE as equivalent
> to ~foo with foo set to unsplit(list(1:N,1:N), g). With a stern warning,
> of course. Even if we don't want that, it gives an opportunity to make
> paired=TRUE defunct ("paired must be a formula").

I'm liking the idea ("paired = ~id ") quite a bit
(noting that  paired = TRUE/FALSE would still be there for t.test.default()).
You'd get my vote for switching there.

Martin


Comment 8 Brian Ripley 2010-10-07 15:28:14 UTC
This seems to have become a wish to enhance the formula interface