Name | Modified | Size | Downloads / Week |
---|---|---|---|
dplyr 1.1.1 source code.tar.gz | 2023-03-21 | 802.5 kB | |
dplyr 1.1.1 source code.zip | 2023-03-21 | 1.1 MB | |
README.md | 2023-03-21 | 4.2 kB | |
Totals: 3 Items | | 1.9 MB | 0 |
- Mutating joins now warn about multiple matches much less often. At a high
  level, a warning was previously being thrown when a one-to-many or
  many-to-many relationship was detected between the keys of `x` and `y`, but is
  now only thrown for a many-to-many relationship, which is much rarer and much
  more dangerous than one-to-many because it can result in a Cartesian explosion
  in the number of rows returned from the join (#6731, #6717).

  We've accomplished this in two steps:

  - `multiple` now defaults to `"all"`, and the options of `"error"` and
    `"warning"` are now deprecated in favor of using `relationship` (see below).
    We are using an accelerated deprecation process for these two options
    because they've only been available for a few weeks, and `relationship` is a
    clearly superior alternative.

  - The mutating joins gain a new `relationship` argument, allowing you to
    optionally enforce one of the following relationship constraints between the
    keys of `x` and `y`: `"one-to-one"`, `"one-to-many"`, `"many-to-one"`, or
    `"many-to-many"`.

    For example, `"many-to-one"` enforces that each row in `x` can match at most
    1 row in `y`. If a row in `x` matches >1 rows in `y`, an error is thrown.
    This option serves as the replacement for `multiple = "error"`.

    The default behavior of `relationship` doesn't assume that there is any
    relationship between `x` and `y`. However, for equality joins it will check
    for the presence of a many-to-many relationship, and will warn if it detects
    one.

  This change unfortunately does mean that if you have set `multiple = "all"` to
  avoid a warning and you happened to be doing a many-to-many style join, then
  you will need to replace `multiple = "all"` with
  `relationship = "many-to-many"` to silence the new warning, but we believe
  this should be rare since many-to-many relationships are fairly uncommon.
- Fixed a major performance regression in `case_when()`. It is still a little slower than in dplyr 1.0.10, but we plan to improve this further in the future (#6674).
- Fixed a performance regression related to `nth()`, `first()`, and `last()` (#6682).
- Fixed an issue where expressions involving infix operators had an abnormally large amount of overhead (#6681).
- `group_data()` on ungrouped data frames is faster (#6736).
- `n()` is a little faster when there are many groups (#6727).
- `pick()` now returns a 1 row, 0 column tibble when `...` evaluates to an empty selection. This makes it more compatible with tidyverse recycling rules in some edge cases (#6685).
- `if_else()` and `case_when()` again accept logical conditions that have attributes (#6678).
- `arrange()` can once again sort the `numeric_version` type from base R (#6680).
- `slice_sample()` now works when the input has a column named `replace`. `slice_min()` and `slice_max()` now work when the input has columns named `na_rm` or `with_ties` (#6725).
- `nth()` now errors informatively if `n` is `NA` (#6682).
- Joins now throw a more informative error when `y` doesn't have the same source as `x` (#6798).
- All major dplyr verbs now throw an informative error message if the input data frame contains a column named `NA` or `""` (#6758).
- Deprecation warnings thrown by `filter()` now mention the correct package where the problem originated (#6679).
- Fixed an issue where using `<-` within a grouped `mutate()` or `summarise()` could cross-contaminate other groups (#6666).
- The compatibility vignette has been replaced with a more general vignette on using dplyr in packages, `vignette("in-packages")` (#6702).
- The developer documentation in `?dplyr_extending` has been refreshed and brought up to date with all changes made in 1.1.0 (#6695).
- `rename_with()` now includes an example of using `paste0(recycle0 = TRUE)` to correctly handle empty selections (#6688).
- R >= 3.5.0 is now explicitly required. This is in line with the tidyverse policy of supporting the 5 most recent versions of R.
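
The `relationship` argument described in the first item can be illustrated with a small sketch. The data frames `orders`, `customers`, and `customers_dup` are hypothetical and exist only for this example; the sketch assumes dplyr 1.1.1 or later is installed.

```r
library(dplyr)

# Hypothetical data: several orders can belong to the same customer.
orders <- tibble(
  order_id = 1:4,
  customer_id = c(1L, 1L, 2L, 3L)
)
customers <- tibble(
  customer_id = 1:3,
  name = c("Ada", "Grace", "Linus")
)

# "many-to-one": each row of `orders` may match at most one row of `customers`.
joined <- left_join(
  orders, customers,
  by = "customer_id",
  relationship = "many-to-one"
)

# A duplicated key in `y` violates "many-to-one", so the join throws an error.
customers_dup <- bind_rows(
  customers,
  tibble(customer_id = 1L, name = "Ada (duplicate)")
)
result <- tryCatch(
  left_join(
    orders, customers_dup,
    by = "customer_id",
    relationship = "many-to-one"
  ),
  error = function(e) "error"
)

# Keys are now repeated on both sides. Declaring the many-to-many
# relationship explicitly silences the warning raised by default.
many <- inner_join(
  orders, customers_dup,
  by = "customer_id",
  relationship = "many-to-many"
)
```

In this sketch, `joined` keeps the four rows of `orders`, `result` holds the string `"error"` because the constraint was violated, and `many` contains the expanded set of matches that the default warning is meant to flag.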