[R-gregmisc-users] SF.net SVN: r-gregmisc:[1311] trunk/gdata
From: <gg...@us...> - 2008-12-31 13:30:10
Revision: 1311
          http://r-gregmisc.svn.sourceforge.net/r-gregmisc/?rev=1311&view=rev
Author:   ggorjan
Date:     2008-12-31 13:30:07 +0000 (Wed, 31 Dec 2008)

Log Message:
-----------
Enhanced function object.size that returns the size of multiple objects.
There is also a handy print method that can print size of an object in
"human readable" format when options(humanReadable=TRUE) or
print(object.size(x), humanReadable=TRUE).

Added Paths:
-----------
    trunk/gdata/R/object.size.R
    trunk/gdata/man/humanReadable.Rd
    trunk/gdata/man/object.size.Rd

Added: trunk/gdata/R/object.size.R
===================================================================
--- trunk/gdata/R/object.size.R (rev 0)
+++ trunk/gdata/R/object.size.R 2008-12-31 13:30:07 UTC (rev 1311)
@@ -0,0 +1,92 @@
+### object.size.R
+###------------------------------------------------------------------------
+### What: Print object size in human readable format - code
+### $Id$
+### Time-stamp: <2008-12-30 08:05:43 ggorjan>
+###------------------------------------------------------------------------
+
+object.size <- function(...)
+{
+  structure(sapply(list(...), function(x) .Internal(object.size(x))),
+            class=c("object_size", "numeric"))
+}
+
+print.object_size <- function(x, quote=FALSE, humanReadable, ...)
+{
+  xOrig <- x
+  if(missing(humanReadable)) {
+    opt <- getOption("humanReadable")
+    humanReadable <- ifelse(!is.null(opt), opt, FALSE)
+  }
+  if(humanReadable) {
+    print(humanReadable(x), quote=quote, ...)
+  } else {
+    class(x) <- "numeric"
+    NextMethod()
+  }
+  invisible(xOrig)
+}
+
+is.object_size <- function(x) inherits(x, what="object_size")
+
+as.object_size <- function(x)
+{
+  if(!is.numeric(x)) stop("'x' must be numeric/integer")
+  class(x) <- c("object_size", "numeric")
+  x
+}
+
+c.object_size <- function(..., recursive=FALSE)
+{
+  x <- NextMethod()
+  if(is.numeric(x)) class(x) <- c("object_size", "numeric")
+  x
+}
+
+humanReadable <- function(x, standard="SI", digits=1, width=3, sep=" ")
+{
+  ## --- Setup ---
+
+  if(any(x < 0)) stop("'x' must be positive")
+  if(standard == "SI") {
+    suffix <- c("B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
+    base <- 1000
+  } else {
+    suffix <- c("B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB")
+    base <- 1024
+  }
+
+  ## --- Apply ---
+
+  .applyHuman <- function(x, base, suffix, digits, width, sep)
+  {
+    ## Which suffix should we use?
+    n <- length(suffix)
+    for(i in 1:n) {
+      if(x >= base) {
+        if(i < n) x <- x / base
+      } else {
+        break
+      }
+    }
+    ## Formatting
+    if(is.null(width)) { ## the same formatting for all
+      x <- format(round(x=x, digits=digits), nsmall=digits)
+    } else { ## similar to ls, du, and df
+      lenX <- nchar(x)
+      if(lenX > width) {
+        digitsMy <- width - (lenX - (lenX - (nchar(round(x)) + 1)))
+        digits <- ifelse(digitsMy > digits, digits, digitsMy)
+      }
+      if(i == 1) digits <- 0
+      x <- round(x, digits=digits)
+    }
+    paste(x, suffix[i], sep=sep)
+  }
+
+  sapply(X=x, FUN=".applyHuman", base=base, suffix=suffix, digits=digits,
+         width=width, sep=sep)
+}
+
+###------------------------------------------------------------------------
+### object.size.R ends here

Property changes on: trunk/gdata/R/object.size.R
___________________________________________________________________
Added: svn:keywords
   +

Added: trunk/gdata/man/humanReadable.Rd
===================================================================
--- trunk/gdata/man/humanReadable.Rd (rev 0)
+++ trunk/gdata/man/humanReadable.Rd 2008-12-31 13:30:07 UTC (rev 1311)
@@ -0,0 +1,131 @@
+% humanReadable.Rd
+%--------------------------------------------------------------------------
+% What: Print byte size in human readable format man page
+% $Id$
+% Time-stamp: <2008-12-30 13:26:35 ggorjan>
+%--------------------------------------------------------------------------
+
+\name{humanReadable}
+
+\alias{humanReadable}
+
+\title{Print byte size in human readable format}
+
+\description{
+
+\code{humanReadable} converts byte size in human readable format such as
+kB, MB, GB, etc.
+
+}
+
+\usage{
+
+humanReadable(x, standard="SI", digits=1, width=3, sep=" ")
+
+}
+
+\arguments{
+  \item{x}{integer, byte size}
+  \item{standard}{character, "SI" for powers of 1000 or anything else for
+    powers of 1024, see details}
+  \item{digits}{integer, number of digits after decimal point}
+  \item{width}{integer, width of number string}
+  \item{sep}{character, separator between number and unit}
+}
+
+\details{
+
+Basic unit used to store information in computers is a bit. Bits are
+represented as zeroes and ones - binary number system. Although, the binary
+number system is not the same as the decimal number system, decimal prefixes
+for binary multiples such as kilo and mega are often used. In the decimal system
+kilo represent 1000, which is close to \eqn{1024 = 2^{10}} in the binary system.
+This sometimes causes problems as it is not clear which powers (2 or 10) are used
+in a notation like 1 kB. To overcome this problem International Electrotechnical
+Commission (IEC) has provided the following solution to this problem:
+
+\tabular{lrcll}{
+Name \tab System \tab Symbol \tab Size \tab Conversion \cr
+byte \tab binary \tab B \tab \eqn{2^3} \tab 8 bits \cr
+kilobyte \tab decimal \tab kB \tab \eqn{10^3} \tab 1000 bytes \cr
+kibibyte \tab binary \tab KiB \tab \eqn{2^{10}} \tab 1024 bytes \cr
+megabyte \tab decimal \tab MB \tab \eqn{(10^3)^2} \tab 1000 kilobytes\cr
+mebibyte \tab binary \tab MiB \tab \eqn{(2^{10})^2} \tab 1024 kibibytes\cr
+gigabyte \tab decimal \tab GB \tab \eqn{(10^3)^3} \tab 1000 megabytes\cr
+gibibyte \tab binary \tab GiB \tab \eqn{(2^{10})^3} \tab 1024 mebibytes\cr
+terabyte \tab decimal \tab TB \tab \eqn{(10^3)^4} \tab 1000 gigabytes\cr
+tebibyte \tab binary \tab TiB \tab \eqn{(2^{10})^4} \tab 1024 gibibytes\cr
+petabyte \tab decimal \tab PB \tab \eqn{(10^3)^5} \tab 1000 terabytes\cr
+pebibyte \tab binary \tab PiB \tab \eqn{(2^{10})^5} \tab 1024 tebibytes\cr
+exabyte \tab decimal \tab EB \tab \eqn{(10^3)^6} \tab 1000 petabytes\cr
+exbibyte \tab binary \tab EiB \tab \eqn{(2^{10})^6} \tab 1024 pebibytes\cr
+zettabyte \tab decimal \tab ZB \tab \eqn{(10^3)^7} \tab 1000 exabytes\cr
+zebibyte \tab binary \tab ZiB \tab \eqn{(2^{10})^7} \tab 1024 exbibytes\cr
+yottabyte \tab decimal \tab YB \tab \eqn{(10^3)^8} \tab 1000 zettabytes\cr
+yebibyte \tab binary \tab YiB \tab \eqn{(2^{10})^8} \tab 1024 zebibytes\cr
+}
+
+where Zi and Yi are GNU extensions to IEC. To get the output in the decimal
+system (powers of 1000) use \code{standard="SI"}. Otherwise IEC standard
+(powers of 1024) is used.
+
+For printout both \code{digits} and \code{width} can be specified. If
+\code{width} is \code{NULL}, all values have given number of digits. If
+\code{width} is not \code{NULL}, output is rounded to a given width and
+formated similar to human readable format of \code{ls}, \code{df} or
+\code{du} shell commands.
+
+}
+
+\references{
+
+Wikipedia:
+\url{http://en.wikipedia.org/wiki/Byte}
+\url{http://en.wikipedia.org/wiki/SI_prefix}
+\url{http://en.wikipedia.org/wiki/Binary_prefix}
+
+GNU manual for coreutils:
+\url{http://www.gnu.org/software/coreutils/manual/html_node/Block-size.html#Block-size}
+
+}
+
+\value{
+
+Byte size in human readable format as character with proper unit symbols
+added at the end of the string.
+
+}
+
+\author{Ales Korosec and Gregor Gorjanc}
+
+\seealso{
+  \code{\link{object.size}}, \code{\link[gdata]{ll}}
+}
+
+\examples{
+
+baseSI <- 10
+powerSI <- seq(from=3, to=27, by=3)
+SI0 <- (baseSI)^powerSI
+k <- length(SI0) - 1
+SI1 <- SI0 - SI0 / c(2, runif(n=k, min=1.01, max=5.99))
+SI2 <- SI0 + SI0 / c(2, runif(n=k, min=1.01, max=5.99))
+
+baseIEC <- 2
+powerIEC <- seq(from=10, to=90, by=10)
+IEC0 <- (baseIEC)^powerIEC
+IEC1 <- IEC0 - IEC0 / c(2, runif(n=k, min=1.01, max=5.99))
+IEC2 <- IEC0 + IEC0 / c(2, runif(n=k, min=1.01, max=5.99))
+
+cbind(humanReadable(x=SI1, width=NULL, digits=3),
+      humanReadable(x=SI0, width=NULL, digits=2),
+      humanReadable(x=SI2, width=NULL, digits=1),
+      humanReadable(x=IEC1, standard="IEC", width=7, digits=3),
+      humanReadable(x=IEC0, standard="IEC", width=7, digits=2),
+      humanReadable(x=IEC2, standard="IEC", width=7, digits=1))
+}
+
+\keyword{misc}
+
+%--------------------------------------------------------------------------
+% humanReadable.Rd ends here

Added: trunk/gdata/man/object.size.Rd
===================================================================
--- trunk/gdata/man/object.size.Rd (rev 0)
+++ trunk/gdata/man/object.size.Rd 2008-12-31 13:30:07 UTC (rev 1311)
@@ -0,0 +1,85 @@
+% File src/library/utils/man/object.size.Rd
+% Part of the R package, http://www.R-project.org
+% Copyright 1995-2007 R Core Development Team
+% Distributed under GPL 2 or later
+
+\name{object.size}
+\alias{object.size}
+\alias{print.object_size}
+\alias{c.object_size}
+\alias{as.object_size}
+\alias{is.object_size}
+
+\title{Report the Space Allocated for an Object}
+\description{
+  Provides an estimate of the memory that is being used to store an \R object.
+}
+\usage{
+object.size(\dots)
+
+\method{print}{object_size}(x, quote=FALSE, humanReadable, \dots)
+}
+\arguments{
+  \item{\dots}{\code{object.size}: \R objects; \code{print}; arguments
+    to be passed to or from other methods.}
+  \item{x}{output from \code{object.size}}
+  \item{quote}{logical, indicating whether or not the result should be
+    printed with surrounding quotes.}
+  \item{humanReadable}{logical, use the \dQuote{human readable} format.}
+}
+\details{
+
+  This is a modified copy from the utils package in R as fo 2008-12-15.
+
+  Exactly which parts of the memory allocation should be attributed to
+  which object is not clear-cut. This function merely provides a rough
+  indication: it should be reasonably accurate for atomic vectors, but
+  does not detect if elements of a list are shared, for example.
+  (Sharing amongst elements of a character vector is taken into account,
+  but not that between character vectors in a single object.)
+
+  The calculation is of the size of the object, and excludes the space
+  needed to store its name in the symbol table.
+
+  Associated space (e.g. the environment of a function and what the
+  pointer in a \code{EXTPTRSXP} points to) is not included in the
+  calculation.
+
+  Object sizes are larger on 64-bit platforms than 32-bit ones, but will
+  very likely be the same on different platforms with the same word
+  length and pointer size.
+
+  % Modificitaion start
+  Class of returned object is \code{c("byte", "numeric")} with
+  appropriate \code{print} and \code{c} methods.
+
+  By default \code{object.size} outputs size in bytes, but human
+  readable format similar to \code{ls}, \code{df} or \code{du} shell
+  commands can be invoked with \code{options(humanReadable=TRUE)}.
+  % Modificitaion end
+
+}
+\value{
+  An object of class \code{"object.size"} with a length-one double value,
+  an estimate of the memory allocation attributable to the object in bytes.
+}
+\seealso{
+  \code{\link{Memory-limits}} for the design limitations on object size.
+  \code{\link{humanReadable}} for human readable format.
+}
+
+\examples{
+object.size(letters)
+object.size(ls)
+## find the 10 largest objects in the base package
+z <- sapply(ls("package:base"), function(x)
+            object.size(get(x, envir = baseenv())))
+(tmp <- as.matrix(rev(sort(z))[1:10]))
+
+as.object_size(14567567)
+options(humanReadable=TRUE)
+(z <- object.size(letters, c(letters, letters), rep(letters, 100), rep(letters, 10000)))
+is.object_size(z)
+as.object_size(14567567)
+}
+\keyword{utilities}
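
For readers who want to try the unit-selection idea from this commit without installing gdata, here is a minimal base-R sketch. It is not the committed code: the name fmtBytes is hypothetical, the width-based rounding of humanReadable is left out, and only the SI (powers of 1000) versus IEC (powers of 1024) choice and the repeated division are reproduced.

```r
## Hypothetical helper, NOT part of gdata: a reduced sketch of the
## suffix-selection step in humanReadable (no 'width' handling).
fmtBytes <- function(x, standard = "SI", digits = 1, sep = " ") {
  stopifnot(is.numeric(x), all(x >= 0))
  if (standard == "SI") {
    suffix <- c("B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    base <- 1000
  } else {
    suffix <- c("B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB")
    base <- 1024
  }
  one <- function(v) {
    i <- 1
    ## Divide by the base while a larger unit is still available.
    while (v >= base && i < length(suffix)) {
      v <- v / base
      i <- i + 1
    }
    ## Plain byte counts are printed without decimals.
    d <- if (i == 1) 0 else digits
    paste(format(round(v, d), nsmall = d), suffix[i], sep = sep)
  }
  vapply(x, one, character(1))
}

fmtBytes(c(512, 1000, 1024, 14567567))
## "512 B"  "1.0 kB"  "1.0 kB"  "14.6 MB"
fmtBytes(c(1000, 1024, 14567567), standard = "IEC")
## "1000 B"  "1.0 KiB"  "13.9 MiB"
```

The same byte count lands in different units depending on the standard, which is the ambiguity the IEC prefixes in the man page are meant to resolve.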