Bug 17384 - boxplot(..., las=2) outputs too many x labels
Summary: boxplot(..., las=2) outputs too many x labels
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Graphics
Version: R 3.4.3
Hardware: All All
Importance: P5 normal
Assignee: R-core
Reported: 2018-02-01 12:45 UTC by Ulrich Windl
Modified: 2018-04-18 10:11 UTC

Attachments
Image of a normal boxplot() (3.78 KB, image/png)
2018-02-01 12:45 UTC, Ulrich Windl
Image of boxplot(..., las=2) (6.72 KB, image/png)
2018-02-01 12:45 UTC, Ulrich Windl
dput(IRQ) as ZIP file (15.51 KB, application/zip)
2018-02-01 14:43 UTC, Ulrich Windl

Description Ulrich Windl 2018-02-01 12:45:05 UTC
Created attachment 2317
Image of a normal boxplot()

I have some data that is grouped by a factor with 256 levels. The values are originally stored as character, so I use as.numeric() to convert them. For example:
> str(with(IRQ, boxplot(rIntr ~ as.numeric(intr))))
List of 6
 $ stats: num [1:5, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
 $ n    : num [1:256] 143 143 143 143 143 143 143 143 143 143 ...
 $ conf : num [1:2, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
 $ out  : num(0) 
 $ group: num(0) 
 $ names: chr [1:256] "0" "1" "2" "3" ...

This produces 14 x-axis labels such as 0, 14, 30, 46, 62, ..., 217, 238.
However, if I add "las=2" like this, I get virtually all 256 labels, overlapping each other:
> str(with(IRQ, boxplot(rIntr ~ as.numeric(intr), las=2)))
List of 6
 $ stats: num [1:5, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
 $ n    : num [1:256] 143 143 143 143 143 143 143 143 143 143 ...
 $ conf : num [1:2, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
 $ out  : num(0) 
 $ group: num(0) 
 $ names: chr [1:256] "0" "1" "2" "3" ...

So it seems boxplot() assumes an incorrect width for the X axis labels, and I think it is a bug that should be fixed.
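
A reproduction that does not depend on the attached data might look like the sketch below. The synthetic IRQ data frame (Poisson counts, 10 rows per group) is an assumption made only for illustration; it is not the reporter's data. Only the column names rIntr and intr and the 256 groups are taken from the report.

```r
# Synthetic stand-in for the IRQ data (hypothetical values, not the attachment):
# 256 groups whose labels are stored as character, as in the report.
set.seed(1)
IRQ <- data.frame(
  intr  = as.character(rep(0:255, each = 10)),
  rIntr = rpois(256 * 10, lambda = 2),
  stringsAsFactors = FALSE
)

pdf(NULL)  # null device, so the sketch also runs non-interactively
b1 <- with(IRQ, boxplot(rIntr ~ as.numeric(intr)))           # default label orientation
b2 <- with(IRQ, boxplot(rIntr ~ as.numeric(intr), las = 2))  # perpendicular labels
invisible(dev.off())

# The returned statistics do not depend on las; only the axis labelling differs.
length(b1$names)
identical(b1$stats, b2$stats)
```

To see the labels themselves, drop the pdf(NULL) and dev.off() lines and run the two boxplot() calls on an interactive device.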
Comment 1 Ulrich Windl 2018-02-01 12:45:47 UTC
Created attachment 2318
Image of boxplot(..., las=2)
Comment 2 Martin Maechler 2018-02-01 13:44:54 UTC
This is not reproducible: we don't have 'rIntr' or 'intr'.
Please provide the data as an attachment.
Comment 3 Ulrich Windl 2018-02-01 14:43:30 UTC
Created attachment 2319
dput(IRQ) as ZIP file

I thought the data set was so trivial that you wouldn't need it, but anyway, here it is!
Comment 4 Ulrich Windl 2018-03-19 09:15:08 UTC
Any news on this bug?
Comment 5 Martin Maechler 2018-04-09 12:12:42 UTC
I have spent an unreasonable amount of time on this now.

It is a bug at least insofar as it is undocumented behavior.
"Of course", the problem is not in  boxplot() -> plot.boxplot() -> bxp(),
but rather in the underlying  axis() function.

Here is some simple R code to play with:

par(lab = c(20,20, 7))
x <- y <- 0:100
plot(x,y, sub=R.version.string)# now resize the window:
                               # the labels that are drawn will change

plot(x,y, sub=R.version.string, las=2)# las=2 : all labels are *perpendicular*
## now resize the window:
## ==> the labels that are drawn remain *fixed* and can overlap !!
##

-------------
This is contrary to what the 'Details' section of the  help(axis)  page says:
'
     The code tries hard not to draw overlapping tick labels, and so
     will omit labels where they would abut or overlap previously drawn
     labels.  This can result in, for example, every other tick being
     labelled.  (The ticks are drawn left to right or bottom to top,
     and space at least the size of an ‘m’ is left between labels.)
'

Indeed, that is only true when the labels are *NOT* perpendicular to the axis.
I've committed changes to  src/library/graphics/src/plot.c , notably to the C_axis() routine, which contains all the relevant code. The new code, with its boolean variable  `perpendicular`,
now makes it clear that all the  "tries hard ..."  behaviour above only happens when the axis labels are drawn in a non-perpendicular, i.e. "parallel", way.
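
The rule quoted from help(axis) can be illustrated with a small sketch. This is a simplified illustration of "omit a label if it would abut or overlap the previously drawn one", not the actual C_axis() code; the helper keep_labels() and its arguments are hypothetical and exist only for this example.

```r
# Hypothetical helper (not part of R): given tick positions 'at', label
# widths 'width' and a required 'gap' (all in the same units), decide
# left to right which labels can be drawn without abutting or overlapping
# the previously drawn label.
keep_labels <- function(at, width, gap) {
  keep <- logical(length(at))
  last_end <- -Inf                       # right edge of the last drawn label
  for (i in seq_along(at)) {
    start <- at[i] - width[i] / 2        # left edge of the centred label
    if (start >= last_end + gap) {
      keep[i] <- TRUE
      last_end <- at[i] + width[i] / 2
    }
  }
  keep
}

# Labels 1.2 units wide on ticks 1 unit apart: every other label is dropped.
keep_labels(at = 0:5, width = rep(1.2, 6), gap = 0.3)
```

If the width used for a perpendicular label were effectively ignored, no label would ever be dropped by a check like this, which matches the overlapping labels seen with las=2.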


## I think this is *NOT* documented.
Comment 6 Ulrich Windl 2018-04-11 06:41:20 UTC
I don't know why the algorithm does not work for rotated text, but (for example) in PostScript the stringwidth operator, which returns the width and height of a text string, always returns a zero height. So if you used that as the width of text rotated by 90°, you'd have a problem...
Comment 7 Martin Maechler 2018-04-18 10:11:50 UTC
I have committed changes to R-devel now (svn 74613) which

1) introduce a new optional argument  'gap.axis' to axis().
  The argument is a multiplication factor to the "gap" which is required between labels. 
  The R <= 3.5.0 behavior corresponds to  
    gap.axis =  1 for parallel axis labels
    gap.axis = -1 for perpendicular axis labels.

2) The default in R-devel is  
     gap.axis =  1  for parallel,  and
     gap.axis = 1/4 for perpendicular axis labels.


This fixes the bug *and* provides neat new tweaking possibilities.

I currently plan to also introduce two new optional arguments to  plot.default(),
namely  xgap.axis and ygap.axis  which will simply be passed as 'gap.axis' to the respective
x- and y- axis() calls inside plot.default().

Feedback is welcome, and the naming of these new arguments can be discussed:  I've chosen
  "*.axis"  because there are already  three *.axis par() settings, also usable as arguments to axis() ... and all three are related to the tick labels.
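
The gap.axis argument described in this comment could be tried out roughly as follows. This is a sketch that assumes an R build containing the change; the has_gap check is a guard added here for illustration so that the code also runs on builds without the argument. The values -1 and 1/4 are the ones named above.

```r
pdf(NULL)  # null device, so the sketch also runs non-interactively
x <- 0:100

# Guard (added for illustration): does this build's axis() know gap.axis?
has_gap <- "gap.axis" %in% names(formals(graphics::axis))

plot(x, x, xaxt = "n")   # suppress the default x axis, then draw it by hand
if (has_gap) {
  # gap.axis = -1: described above as the old behaviour for perpendicular labels
  axis(1, at = x, las = 2, gap.axis = -1)
} else {
  axis(1, at = x, las = 2)
}

plot(x, x, xaxt = "n")
if (has_gap) {
  # gap.axis = 1/4: described above as the new default for perpendicular labels
  axis(1, at = x, las = 2, gap.axis = 1/4)
}
invisible(dev.off())
```

On an interactive device (without the pdf(NULL) and dev.off() lines), the first plot should show the crowded labels from the report and the second should drop labels that would come too close to each other.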