collapse - Browse /v2.1.0 at SourceForge.net

Name	Modified	Size	Downloads / Week
collapse version 2.1.0 source code.tar.gz	2025-03-10	12.7 MB	0
collapse version 2.1.0 source code.zip	2025-03-10	13.0 MB	0
README.md	2025-03-10	4.0 kB	0
Totals: 3 Items		25.7 MB	0

collapse 2.1.0, released in March 2025, introduces a fast slicing function, an improved weighted quantile algorithm, and a few convenience features, and removes some legacy functions from the package.

Potentially breaking changes

Functions pwNobs, as.factor_GRP, as.factor_qG, is.GRP, is.qG, is.unlistable, is.categorical, is.Date, as.numeric_factor, as.character_factor, and Date_vars, which were renamed in v1.6.0 by either replacing '.' with '_' or using all lower-case letters, and have been deprecated since then, are now finally removed from the package.
num_vars() (and thus also cat_vars() and collap()) now uses a simpler C-level definition of numeric data types that is more in line with is.numeric(): is_numeric_C <- function(x) typeof(x) %in% c("integer", "double") && !inherits(x, c("factor", "Date", "POSIXct", "yearmon", "yearqtr")). The previous definition was: is_numeric_C_old <- function(x) typeof(x) %in% c("integer", "double") && (!is.object(x) || inherits(x, c("ts", "units", "integer64"))). Thus, the definition changed from admitting only plain vectors and a few listed classes to excluding only a fixed set of common classes. Thanks @maouw for flagging this (#727).
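
To make the effect of the new definition concrete, the two R-level definitions quoted above can be run side by side in base R. This is only a sketch: the test objects are illustrative choices, and the actual check in the package is implemented in C.

```r
# R-level equivalents of the new and old checks, copied from the notes above
is_numeric_C <- function(x) {
  typeof(x) %in% c("integer", "double") &&
    !inherits(x, c("factor", "Date", "POSIXct", "yearmon", "yearqtr"))
}
is_numeric_C_old <- function(x) {
  typeof(x) %in% c("integer", "double") &&
    (!is.object(x) || inherits(x, c("ts", "units", "integer64")))
}

# Illustrative inputs (assumed examples, not taken from the package tests)
x <- list(
  plain    = c(1.5, 2.5),
  fct      = factor(c("a", "b")),
  date     = as.Date("2025-03-10"),
  difftime = as.difftime(c(1, 2), units = "hours"),
  text     = c("a", "b")
)

# One row per input, one column per definition
comparison <- data.frame(
  old = vapply(x, is_numeric_C_old, logical(1)),
  new = vapply(x, is_numeric_C, logical(1))
)
print(comparison)
```

Classed objects stored as integer or double that are not on the exclusion list, such as the difftime vector here, are the ones whose classification changes between the two definitions.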

Bug Fixes

Fixed some issues using collapse and the tidyverse together, particularly regarding tidyverse methods for 'grouped_df' - thanks @NicChr (#645).
More consistent handling of zero-length inputs - fmean() and fmedian()/fnth() now also return zero-length inputs as such instead of returning NA (#628).
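
A quick way to check the zero-length behaviour described above, assuming collapse >= 2.1.0 is installed:

```r
library(collapse)

x <- numeric(0)    # zero-length numeric input

m   <- fmean(x)    # per the notes above, this is now zero-length rather than NA
med <- fmedian(x)

length(m)
length(med)
```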

Additions

Added function fslice(): a fast alternative to dplyr::slice_[head|tail|min|max] that also works with matrices. Thanks @alinacherkas for the proposal and initial implementation (#725).
Added function groupv() as programmer's version of group(), or rather, groupv() is now identical to the former group(), and group() now supports multiple vectors as input, e.g. group(v1, v2). This is done for convenience and consistency with radixorder[v](). For backwards compatibility, group() also supports a single list as input.
join() has a new argument require allowing the user to generate messages or errors if the join operation is not successful enough:

    join(df1, df2, require = list(x = 0.8, fail = "warning"))
    #> Warning: Matched 75.0% of records in table df1 (x), but 80.0% is required
    #> left join: df1[id1, id2] 3/4 (75%) <1:1st> df2[id1, id2] 3/4 (75%)
    #>   id1 id2 name age salary      dept
    #> 1   1   a John  35  60000        IT
    #> 2   1   b Jane  28     NA      <NA>
    #> 3   2   b  Bob  42  55000 Marketing
    #> 4   3   c Carl  50  70000     Sales

psmat() now has a fill argument to fill empty slots in matrix/array with other elements (default NULL/NA).
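
As a rough illustration of fslice() and the new group()/groupv() split described above, the following sketch uses the built-in mtcars data. The argument names n, how, and order.by are taken from the fslice() documentation as I understand it and should be checked against ?fslice; the choice of columns is arbitrary.

```r
library(collapse)

# fslice(): first two rows within each cyl group
first2 <- fslice(mtcars, cyl, n = 2)

# fslice(): the row with the largest mpg within each cyl group
top_mpg <- fslice(mtcars, cyl, n = 1, how = "max", order.by = mpg)

# group() now accepts several vectors directly ...
g_multi <- group(mtcars$cyl, mtcars$vs)
# ... and still accepts a single list for backwards compatibility
g_list <- group(list(mtcars$cyl, mtcars$vs))
# groupv() is the programmer's version taking a list, like the former group()
g_v <- groupv(list(mtcars$cyl, mtcars$vs))

# All three calls should assign the same group ids
identical(as.integer(g_multi), as.integer(g_v))
```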

Improvements

The weighted quantile algorithm in fquantile()/fnth() was improved to a more theoretically sound method following excellent notes by Matthew Kay. It now also supports quantile type 4, but it does not skip zero weights anymore, as the new algorithm makes it difficult to skip them 'on the fly'. Note that the existing collapse algorithm already had very good properties after a bug fix in v2.0.17, but the new algorithm is more exact and also faster.
The collapse arXiv article has been updated and significantly enhanced. It is an excellent resource to get an overview of the package.
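
For the weighted quantile change, a minimal usage sketch with made-up data and strictly positive weights could look as follows. It only shows how to call fquantile() with weights and the newly supported type 4, and makes no claim about the internals of the algorithm.

```r
library(collapse)

x <- c(10, 20, 30, 40, 50)   # illustrative values
w <- c(1, 2, 3, 2, 1)        # illustrative weights, all non-zero
p <- c(0.25, 0.5, 0.75)

q7 <- fquantile(x, probs = p, w = w)            # default quantile type 7
q4 <- fquantile(x, probs = p, w = w, type = 4)  # type 4, per the note above

print(q7)
print(q4)
```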

Notes

On CRAN, the R dependency of collapse was changed to >= 4.1.0 so that the base pipe can be used in examples without generating a NOTE in R CMD check (another absolutely unnecessary restriction). The package itself still depends only on R >= 3.5.0, and the DESCRIPTION file on GitHub/R-universe will continue to reflect this.

Source: README.md, updated 2025-03-10

collapse Files

Advanced and Fast Data Transformation in R
