| Title: | Collection of Utility Functions for Data Analysis and Computing |
|---|---|
| Description: | Provides utility functions for data analysis and computing. Includes functions for logging, parallel processing, and other computational tasks to streamline workflows. |
| Authors: | Meng Xu [aut, cre] (ORCID: <https://orcid.org/0000-0002-8300-1054>), Haoliang Zhu [aut] (ORCID: <https://orcid.org/0009-0004-9766-5179>) |
| Maintainer: | Meng Xu <[email protected]> |
| License: | MIT+ file LICENSE |
| Version: | 0.4.7 |
| Built: | 2026-06-02 10:25:42 UTC |
| Source: | https://github.com/mengxu98/thisutils |
Provides utility functions for data analysis and computing. Includes functions for logging, parallel processing, and other computational tasks to streamline workflows.
Meng Xu (Maintainer), [email protected]
https://mengxu98.github.io/thisutils/
Useful links:
This operator returns the left side if it's not NULL,
otherwise it returns the right side.
a %ss% ba %ss% b
a |
The left side value to check. |
b |
The right side value to use if |
a if it is not NULL, otherwise b.
NULL %ss% 10 5 %ss% 10NULL %ss% 10 5 %ss% 10
Add a package file and print package information
add_pkg_file( use_figlet = TRUE, figlet_font = "Slant", colors = c("red", "yellow", "green", "magenta", "cyan", "yellow", "green", "white", "magenta", "cyan"), verbose = TRUE )add_pkg_file( use_figlet = TRUE, figlet_font = "Slant", colors = c("red", "yellow", "green", "magenta", "cyan", "yellow", "green", "white", "magenta", "cyan"), verbose = TRUE )
use_figlet |
Whether to use figlet for ASCII art generation.
Default is |
figlet_font |
Character string, figlet font to use.
Default is |
colors |
Character vector, colors to use for the logo elements. |
verbose |
Whether to print the message.
Default is |
Creates a file named R/<pkg_name>-package.R.
Convert matrix into dense/sparse matrix
as_matrix(x, return_sparse = FALSE)as_matrix(x, return_sparse = FALSE)
x |
A matrix. |
return_sparse |
Whether to output a sparse matrix.
Default is |
A dense or sparse matrix.
m <- simulate_sparse_matrix( 1000, 1000, decimal = 3 ) a <- as_matrix(m) a[1:5, 1:5] b <- as_matrix(m, return_sparse = TRUE) b[1:5, 1:5]m <- simulate_sparse_matrix( 1000, 1000, decimal = 3 ) a <- as_matrix(m) a[1:5, 1:5] b <- as_matrix(m, return_sparse = TRUE) b[1:5, 1:5]
Capitalize the first letter of each word
capitalize(x, force_tolower = FALSE)capitalize(x, force_tolower = FALSE)
x |
A vector of character strings to be capitalized. |
force_tolower |
Whether to force the remaining letters to be lowercase. |
x <- c( "hello world", "hello World" ) capitalize(x)x <- c( "hello world", "hello World" ) capitalize(x)
Check CI environment
check_ci_env()check_ci_env()
A logical value.
Check if a package is installed with the specified version
check_pkg_status(pkg, version = NULL, lib = .libPaths()[1])check_pkg_status(pkg, version = NULL, lib = .libPaths()[1])
pkg |
Package name. |
version |
Package version to check. If |
lib |
The location of the library directories where to install the packages. |
TRUE if the package is installed with the specified version, FALSE otherwise.
Check and install R packages
check_r( packages, lib = .libPaths()[1], dependencies = NA, force = FALSE, verbose = TRUE )check_r( packages, lib = .libPaths()[1], dependencies = NA, force = FALSE, verbose = TRUE )
packages |
Package to be installed.
Package source can be CRAN, Bioconductor or Github.
By default, the package name is extracted according to the |
lib |
The location of the library directories where to install the packages. |
dependencies |
Which dependencies to install.
Passed to pak::pkg_install.
Default is |
force |
Whether to force the installation of packages.
Default is |
verbose |
Whether to print the message.
Default is |
Package installation status.
Check sparsity of matrix
check_sparsity(x)check_sparsity(x)
x |
A matrix. |
Sparsity of matrix.
Collapse sparse matrix rows by group
collapse_sparse_rows(matrix, group)collapse_sparse_rows(matrix, group)
matrix |
A sparse matrix. |
group |
A vector defining the output row groups. |
A sparse matrix with rows collapsed by 'group'.
mat <- Matrix::Matrix( matrix(c(1, 0, 2, 0, 3, 4), nrow = 3, byrow = TRUE), sparse = TRUE ) collapse_sparse_rows(mat, c("g1", "g1", "g2"))mat <- Matrix::Matrix( matrix(c(1, 0, 2, 0, 3, 4), nrow = 3, byrow = TRUE), sparse = TRUE ) collapse_sparse_rows(mat, c("g1", "g1", "g2"))
Compute per-cell Local Inverse Simpson's Index (LISI) scores for one or more categorical variables.
This is a clean-room reimplementation of the immunogenomics/LISI.
compute_lisi( X, meta_data, label_colnames, perplexity = 30, tol = 1e-05, max_iter = 50 )compute_lisi( X, meta_data, label_colnames, perplexity = 30, tol = 1e-05, max_iter = 50 )
X |
A matrix-like object with cells in rows and embedding/features in columns. |
meta_data |
A data frame with one row per cell. |
label_colnames |
Character vector of column names in |
perplexity |
Effective neighborhood size. Defaults to |
tol |
Tolerance used in the binary search for the target perplexity.
Defaults to |
max_iter |
Maximum number of binary-search iterations.
Defaults to |
A data frame with one row per cell and one column per label.
Korsunsky I, Millard N, Fan J, et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nature Methods (2019). https://www.nature.com/articles/s41592-019-0619-0
LISI reference implementation: https://github.com/immunogenomics/LISI
set.seed(1) X <- rbind( matrix(stats::rnorm(100, mean = -1), ncol = 2), matrix(stats::rnorm(100, mean = 1), ncol = 2) ) meta_data <- data.frame( batch = rep(c("A", "B"), each = 50), group = sample(c("g1", "g2"), 100, replace = TRUE) ) res <- compute_lisi( X, meta_data, c("batch", "group"), perplexity = 10 ) head(res) boxplot(res)set.seed(1) X <- rbind( matrix(stats::rnorm(100, mean = -1), ncol = 2), matrix(stats::rnorm(100, mean = 1), ncol = 2) ) meta_data <- data.frame( batch = rep(c("A", "B"), each = 50), group = sample(c("g1", "g2"), 100, replace = TRUE) ) res <- compute_lisi( X, meta_data, c("batch", "group"), perplexity = 10 ) head(res) boxplot(res)
Download file from the Internet
download( url, destfile, methods = c("auto", "wget", "libcurl", "curl", "wininet", "internal"), quiet = FALSE, ..., max_tries = 2 )download( url, destfile, methods = c("auto", "wget", "libcurl", "curl", "wininet", "internal"), quiet = FALSE, ..., max_tries = 2 )
url |
a |
destfile |
a character string (or vector, see the |
methods |
Methods to be used for downloading files.
Can be |
quiet |
If |
... |
Other arguments passed to utils::download.file. |
max_tries |
Number of tries for each download method.
Default is |
Create ASCII art text using figlet.
figlet( text, font = "Slant", width = getOption("width", 80), justify = "left", absolute = FALSE, strip = TRUE )figlet( text, font = "Slant", width = getOption("width", 80), justify = "left", absolute = FALSE, strip = TRUE )
text |
Text to make bigger. |
font |
Name of font, path to font, or |
width |
Width to use when justifying and breaking lines. |
justify |
Text justification to use in rendering ("left", "centre", "right"). |
absolute |
Logical, indicating if alignment is absolute. |
strip |
Logical, indicating if whitespace should be removed. |
An object of class figlet_text which is a character vector with a handy print method.
https://github.com/richfitz/rfiglet, https://github.com/jbkunst/figletr, https://www.figlet.org/
figlet("thisutils")figlet("thisutils")
Get a figlet font
figlet_font(font)figlet_font(font)
font |
Path or name of the font to load |
A 'figlet_font' object for use with [figlet]
Get a function from a namespace
get_namespace_fun(pkg, fun)get_namespace_fun(pkg, fun)
pkg |
The name of the package. |
fun |
The name of the function. |
Function.
Get the verbose option from the global options or the local argument.
get_verbose(verbose = NULL)get_verbose(verbose = NULL)
verbose |
The verbose option. Default is 'NULL', which means to get the verbose option from the global options. |
The verbose option.
get_verbose() get_verbose(verbose = FALSE) get_verbose(verbose = TRUE) options(log_message.verbose = FALSE) get_verbose() get_verbose(verbose = TRUE) options(log_message.verbose = TRUE) get_verbose() options(log_message.verbose = NULL)get_verbose() get_verbose(verbose = FALSE) get_verbose(verbose = TRUE) options(log_message.verbose = FALSE) get_verbose() get_verbose(verbose = TRUE) options(log_message.verbose = TRUE) get_verbose() options(log_message.verbose = NULL)
Invoke a function with a list of arguments
invoke_fun(.fn, .args = list(), ..., .env = rlang::caller_env())invoke_fun(.fn, .args = list(), ..., .env = rlang::caller_env())
.fn |
A function, or function name as a string. |
.args |
A list of arguments. |
... |
Other arguments passed to the function. |
.env |
Environment in which to evaluate the call.
This will be most useful if |
f <- function(x, y) { x + y } invoke_fun(f, list(x = 1, y = 2)) invoke_fun("f", list(x = 1, y = 2)) invoke_fun("f", x = 1, y = 2)f <- function(x, y) { x + y } invoke_fun(f, list(x = 1, y = 2)) invoke_fun("f", list(x = 1, y = 2)) invoke_fun("f", x = 1, y = 2)
Check if the system is running on Apple Silicon
is_apple_silicon()is_apple_silicon()
A logical value.
Check if the operating system is Linux
is_linux()is_linux()
A logical value.
Check if the operating system is macOS
is_osx()is_osx()
A logical value.
Detect outliers using MAD (Median Absolute Deviation)
is_outlier( x, nmads = 2.5, constant = 1.4826, type = c("both", "lower", "higher") )is_outlier( x, nmads = 2.5, constant = 1.4826, type = c("both", "lower", "higher") )
x |
Numeric vector. |
nmads |
Number of MADs from the median to define the boundaries for outliers.
Default is |
constant |
Constant factor to convert the MAD to a standard deviation.
Default is |
type |
Type of outliers to detect.
Available options are |
Numeric vector of indices indicating the positions of outliers in x.
x <- c(1, 2, 3, 4, 5, 100) is_outlier(x) # returns 6 x <- c(3, 4, 5, NA, 6, 7) is_outlier(x, nmads = 1.5, type = "lower") # returns 4 x <- c(10, 20, NA, 15, 35) is_outlier(x, nmads = 2, type = "higher") # returns 3, 5x <- c(1, 2, 3, 4, 5, 100) is_outlier(x) # returns 6 x <- c(3, 4, 5, NA, 6, 7) is_outlier(x, nmads = 1.5, type = "lower") # returns 4 x <- c(10, 20, NA, 15, 35) is_outlier(x, nmads = 2, type = "higher") # returns 3, 5
Check if the operating system is Windows
is_windows()is_windows()
A logical value.
Integrate the message printing function with the cli package, and the base::message function. The message could be suppressed by base::suppressMessages.
log_message( ..., expr = NULL, verbose = NULL, message_type = c("info", "success", "warning", "error", "running", "ask"), cli_model = TRUE, level = 1, symbol = " ", text_color = NULL, back_color = NULL, text_style = NULL, multiline_indent = FALSE, timestamp = TRUE, timestamp_format = paste0("[", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), "] "), timestamp_style = FALSE, plain_text = FALSE, .envir = parent.frame(), .frame = .envir )log_message( ..., expr = NULL, verbose = NULL, message_type = c("info", "success", "warning", "error", "running", "ask"), cli_model = TRUE, level = 1, symbol = " ", text_color = NULL, back_color = NULL, text_style = NULL, multiline_indent = FALSE, timestamp = TRUE, timestamp_format = paste0("[", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), "] "), timestamp_style = FALSE, plain_text = FALSE, .envir = parent.frame(), .frame = .envir )
... |
The message to print. |
expr |
An optional expression to evaluate while capturing its standard output,
messages, and warnings, then re-printing them with |
verbose |
Whether to print the message.
Default is |
message_type |
Type of message.
Could be choose one of |
cli_model |
Whether to use the |
level |
The level of the message, which affects the indentation.
Level |
symbol |
The symbol used for indentation.
When specified, it ignores the level parameter and uses the symbol directly.
Default is |
text_color |
Color for the message text.
Supports R color names (e.g., |
back_color |
Background color for the message text.
Details see parameter |
text_style |
Text styles to apply.
Can be one or more of:
|
multiline_indent |
Whether to apply consistent formatting (timestamp and indentation) to each line in multiline messages.
When |
timestamp |
Whether to show the current time in the message.
Default is |
timestamp_format |
Format string for timestamp display.
Default is |
timestamp_style |
Whether to apply the same text styling to the timestamp as the message text.
When |
plain_text |
Whether to print only the text content.
When |
.envir |
The environment to evaluate calls in. Default is parent.frame. |
.frame |
The frame to use for error reporting.
Default is |
Formated message, a logical value (TRUE/FALSE/NA) if message_type = "ask",
or the evaluated result of expr if expr is supplied.
https://cli.r-lib.org/articles/index.html
# basic usage log_message("Hello, ", "world!") log_message("hello, world!") log_message("Hello, world!", timestamp = FALSE) log_message( "Hello, ", "world!", message_type = "success" ) log_message( "Hello, world!", message_type = "warning" ) log_message( "Processing data...", message_type = "running" ) log_message( "Hello, ", "world!", cli_model = FALSE ) # suppress messages suppressMessages(log_message("Hello, world!")) log_message("Hello, world!", verbose = FALSE) options(log_message.verbose = FALSE) log_message("Hello, world!") # for global verbose option options(log_message.verbose = TRUE) log_message("Hello, world!", verbose = FALSE) options(log_message.verbose = NULL) # cli inline markup log_message("{.arg abc} is a argument") ## 'message' can not deal with cli inline markup message("hello, {.code world}!") log_message("{.val list('abc')} is a {.cls {class(list('abc'))}}") log_message("{.code lm(y ~ x)} is a code example") log_message("{.dt List}list('abc')") log_message("address: {.email [email protected]}") log_message("{.emph R} is a programming language") log_message("{.envvar R_HOME}") log_message("{.file log_message.R} is a file") log_message("{.fn lm} is a function") log_message("{.fun lm} is a function") log_message("{.help lm} to get help") log_message("... see {.help [{.fun lm}](stats::lm)} to learn more") log_message( "See the {.href [cli homepage](https://cli.r-lib.org)} for details" ) log_message("press {.kbd ENTER}") log_message("press {.key ENTER}") log_message("URL: {.url https://cli.r-lib.org}") log_message("Some {.field field}") log_message("{.path /usr/bin/R} is a path") log_message("{.pkg cli} is a package") log_message("{.val object} is a variable") log_message("{.run Rscript log_message.R} is a runnable file") log_message("{.str object} is a string") log_message("{.strong abc} is a strong string") log_message("{.topic stats::lm} is a topic") log_message("{.vignette cli} is a vignette") # set indentation log_message("Hello, world!", level = 2) log_message("Hello, world!", symbol = "->") log_message( "Hello, world!", symbol = "#####", level = 3 ) # color formatting log_message( "This is a red message", text_color = "#ff9900" ) log_message( "This is a message with background", back_color = "#EE4000" ) log_message( "This is a message with both text and background", text_color = "white", back_color = "cyan" ) log_message( "This is a message with background", back_color = "#EE4000", cli_model = FALSE ) log_message( "This is a message with both text and background", text_color = "red", back_color = "cyan", cli_model = FALSE ) log_message( "Hex color with {.arg cli_model = FALSE}", text_color = "#FF5733", cli_model = FALSE ) log_message( "Bright red text", text_color = "br_red" ) log_message( "Bright background", back_color = "br_yellow" ) log_message( "Combined grey and style", text_color = "grey", text_style = "bold" ) # text style formatting log_message( "Bold message", text_style = "bold" ) log_message( "Italic message", text_style = "italic" ) log_message( "Underlined message", text_style = "underline" ) log_message( "Combined styles", text_style = c("bold", "underline") ) log_message( "Color and style", text_color = "blue", text_style = c("bold", "italic") ) log_message( "Hex color and style", text_color = "#FF5733", text_style = c("bold", "underline") ) # multiline message log_message( "Line 1\nLine 2\nLine 3", multiline_indent = TRUE, text_style = "italic" ) log_message( "Multi-line\ncolored\nmessage", text_color = "blue", text_style = "italic" ) log_message( "Multi-line\ncolored\nmessage", text_color = "blue", timestamp = FALSE ) # timestamp styling log_message( "Multi-line message\nwith timestamp styling", text_color = "red", text_style = "bold", timestamp_style = TRUE ) log_message( "Multi-line message\nwithout timestamp styling", text_color = "#669999", text_style = c("bold", "italic") ) # combine cli package and log_message log_message( cli::col_green( "I am a green line ", cli::col_blue( cli::style_underline( cli::style_bold("with a blue substring") ) ), " that becomes green again!" ) ) # cli variables fun <- function(x = 1) { log_message("{.val x}") log_message("{.val {x}}") log_message("{.val {x + 1}}") } fun() # print objects directly df <- data.frame( x = 1:3, y = letters[1:3], z = c(" a", "b ", "c") ) log_message("Content:\n", df) # interactive prompt if (interactive()) { log_message( "Do you want to continue?", message_type = "ask" ) } # capture output from another expression fun <- function() { cat("This is standard output\n") message("This is a message") return(1 + 1) } fun() log_message( expr = fun(), message_type = "running" )# basic usage log_message("Hello, ", "world!") log_message("hello, world!") log_message("Hello, world!", timestamp = FALSE) log_message( "Hello, ", "world!", message_type = "success" ) log_message( "Hello, world!", message_type = "warning" ) log_message( "Processing data...", message_type = "running" ) log_message( "Hello, ", "world!", cli_model = FALSE ) # suppress messages suppressMessages(log_message("Hello, world!")) log_message("Hello, world!", verbose = FALSE) options(log_message.verbose = FALSE) log_message("Hello, world!") # for global verbose option options(log_message.verbose = TRUE) log_message("Hello, world!", verbose = FALSE) options(log_message.verbose = NULL) # cli inline markup log_message("{.arg abc} is a argument") ## 'message' can not deal with cli inline markup message("hello, {.code world}!") log_message("{.val list('abc')} is a {.cls {class(list('abc'))}}") log_message("{.code lm(y ~ x)} is a code example") log_message("{.dt List}list('abc')") log_message("address: {.email [email protected]}") log_message("{.emph R} is a programming language") log_message("{.envvar R_HOME}") log_message("{.file log_message.R} is a file") log_message("{.fn lm} is a function") log_message("{.fun lm} is a function") log_message("{.help lm} to get help") log_message("... see {.help [{.fun lm}](stats::lm)} to learn more") log_message( "See the {.href [cli homepage](https://cli.r-lib.org)} for details" ) log_message("press {.kbd ENTER}") log_message("press {.key ENTER}") log_message("URL: {.url https://cli.r-lib.org}") log_message("Some {.field field}") log_message("{.path /usr/bin/R} is a path") log_message("{.pkg cli} is a package") log_message("{.val object} is a variable") log_message("{.run Rscript log_message.R} is a runnable file") log_message("{.str object} is a string") log_message("{.strong abc} is a strong string") log_message("{.topic stats::lm} is a topic") log_message("{.vignette cli} is a vignette") # set indentation log_message("Hello, world!", level = 2) log_message("Hello, world!", symbol = "->") log_message( "Hello, world!", symbol = "#####", level = 3 ) # color formatting log_message( "This is a red message", text_color = "#ff9900" ) log_message( "This is a message with background", back_color = "#EE4000" ) log_message( "This is a message with both text and background", text_color = "white", back_color = "cyan" ) log_message( "This is a message with background", back_color = "#EE4000", cli_model = FALSE ) log_message( "This is a message with both text and background", text_color = "red", back_color = "cyan", cli_model = FALSE ) log_message( "Hex color with {.arg cli_model = FALSE}", text_color = "#FF5733", cli_model = FALSE ) log_message( "Bright red text", text_color = "br_red" ) log_message( "Bright background", back_color = "br_yellow" ) log_message( "Combined grey and style", text_color = "grey", text_style = "bold" ) # text style formatting log_message( "Bold message", text_style = "bold" ) log_message( "Italic message", text_style = "italic" ) log_message( "Underlined message", text_style = "underline" ) log_message( "Combined styles", text_style = c("bold", "underline") ) log_message( "Color and style", text_color = "blue", text_style = c("bold", "italic") ) log_message( "Hex color and style", text_color = "#FF5733", text_style = c("bold", "underline") ) # multiline message log_message( "Line 1\nLine 2\nLine 3", multiline_indent = TRUE, text_style = "italic" ) log_message( "Multi-line\ncolored\nmessage", text_color = "blue", text_style = "italic" ) log_message( "Multi-line\ncolored\nmessage", text_color = "blue", timestamp = FALSE ) # timestamp styling log_message( "Multi-line message\nwith timestamp styling", text_color = "red", text_style = "bold", timestamp_style = TRUE ) log_message( "Multi-line message\nwithout timestamp styling", text_color = "#669999", text_style = c("bold", "italic") ) # combine cli package and log_message log_message( cli::col_green( "I am a green line ", cli::col_blue( cli::style_underline( cli::style_bold("with a blue substring") ) ), " that becomes green again!" ) ) # cli variables fun <- function(x = 1) { log_message("{.val x}") log_message("{.val {x}}") log_message("{.val {x + 1}}") } fun() # print objects directly df <- data.frame( x = 1:3, y = letters[1:3], z = c(" a", "b ", "c") ) log_message("Content:\n", df) # interactive prompt if (interactive()) { log_message( "Do you want to continue?", message_type = "ask" ) } # capture output from another expression fun <- function() { cat("This is standard output\n") message("This is a message") return(1 + 1) } fun() log_message( expr = fun(), message_type = "running" )
Process matrix
matrix_process( matrix, method = c("raw", "zscore", "fc", "log2fc", "log1p"), ... )matrix_process( matrix, method = c("raw", "zscore", "fc", "log2fc", "log1p"), ... )
matrix |
A matrix. |
method |
Method to use for processing the matrix. |
... |
Other arguments passed to the method. |
A processed matrix.
m <- simulate_sparse_matrix(10, 10) matrix_process(m, method = "raw") matrix_process(m, method = "zscore") matrix_process(m, method = "fc") matrix_process(m, method = "log2fc") matrix_process(m, method = "log1p") m <- as_matrix(m) matrix_process(m, method = function(x) x / rowMeans(x))m <- simulate_sparse_matrix(10, 10) matrix_process(m, method = "raw") matrix_process(m, method = "zscore") matrix_process(m, method = "fc") matrix_process(m, method = "log2fc") matrix_process(m, method = "log1p") m <- as_matrix(m) matrix_process(m, method = function(x) x / rowMeans(x))
Switch matrix to table
matrix_to_table( matrix, row_names = NULL, col_names = NULL, threshold = 0, keep_zero = TRUE )matrix_to_table( matrix, row_names = NULL, col_names = NULL, threshold = 0, keep_zero = TRUE )
matrix |
A matrix. |
row_names |
Character vector of row names to filter by. |
col_names |
Character vector of column names to filter by. |
threshold |
The threshold for filtering values based on absolute values.
Defaults to |
keep_zero |
Whether to keep zero values in the table. Defaults to |
A table with three columns: row, col, and value.
test_matrix <- simulate_sparse_matrix(10, 10) colnames(test_matrix) <- paste0("c", 1:10) rownames(test_matrix) <- paste0("r", 1:10) table <- matrix_to_table(test_matrix) matrix_new <- table_to_matrix(table) test_matrix <- test_matrix[rownames(matrix_new), colnames(matrix_new)] |> as_matrix() identical(test_matrix, matrix_new) matrix_to_table( test_matrix, threshold = 2 ) matrix_to_table( test_matrix, row_names = c("r1", "r2"), col_names = c("c1", "c2") )test_matrix <- simulate_sparse_matrix(10, 10) colnames(test_matrix) <- paste0("c", 1:10) rownames(test_matrix) <- paste0("r", 1:10) table <- matrix_to_table(test_matrix) matrix_new <- table_to_matrix(table) test_matrix <- test_matrix[rownames(matrix_new), colnames(matrix_new)] |> as_matrix() identical(test_matrix, matrix_new) matrix_to_table( test_matrix, threshold = 2 ) matrix_to_table( test_matrix, row_names = c("r1", "r2"), col_names = c("c1", "c2") )
Maximum depth of a list
max_depth(x, depth = 0)max_depth(x, depth = 0)
x |
A list. |
depth |
The depth of the list. |
x <- list( a = list(b = list(c = 1)), d = list(e = list(f = 2)) ) max_depth(x)x <- list( a = list(b = list(c = 1)), d = list(e = list(f = 2)) ) max_depth(x)
Maximum P-value
maximump(p, alpha = 0.05, log.p = FALSE)maximump(p, alpha = 0.05, log.p = FALSE)
p |
A vector of P-values. |
alpha |
The significance level. |
log.p |
Whether to return the log of the P-value. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) maximump(p) maximump(p, alpha = 0.01) maximump(p, log.p = TRUE)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) maximump(p) maximump(p, alpha = 0.01) maximump(p, log.p = TRUE)
Mean P-value
meanp(p)meanp(p)
p |
A vector of P-values. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) meanp(p)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) meanp(p)
Minimum P-value
minimump(p, alpha = 0.05, log.p = FALSE)minimump(p, alpha = 0.05, log.p = FALSE)
p |
A vector of P-values. |
alpha |
The significance level. |
log.p |
Whether to return the log of the P-value. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) minimump(p) minimump(p, alpha = 0.01) minimump(p, log.p = TRUE)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) minimump(p) minimump(p, alpha = 0.01) minimump(p, log.p = TRUE)
Normalize numeric vector
normalization(x, method = "max_min", na_rm = TRUE, ...)normalization(x, method = "max_min", na_rm = TRUE, ...)
x |
Input numeric vector. |
method |
Method used for normalization. |
na_rm |
Whether to remove |
... |
Parameters for other methods. |
Normalized numeric vector.
x <- c(runif(2), NA, -runif(2)) x normalization(x, method = "max_min") normalization(x, method = "maximum") normalization(x, method = "sum") normalization(x, method = "softmax") normalization(x, method = "z_score") normalization(x, method = "mad") normalization(x, method = "unit_vector") normalization(x, method = "unit_vector", na_rm = FALSE)x <- c(runif(2), NA, -runif(2)) x normalization(x, method = "max_min") normalization(x, method = "maximum") normalization(x, method = "sum") normalization(x, method = "softmax") normalization(x, method = "z_score") normalization(x, method = "mad") normalization(x, method = "unit_vector") normalization(x, method = "unit_vector", na_rm = FALSE)
Parallelize a function
parallelize_fun( x, fun, cores = 1, export_fun = NULL, clean_result = FALSE, throw_error = TRUE, progress_bar_width = 10L, timestamp_format = paste0("[", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), "] "), verbose = TRUE )parallelize_fun( x, fun, cores = 1, export_fun = NULL, clean_result = FALSE, throw_error = TRUE, progress_bar_width = 10L, timestamp_format = paste0("[", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), "] "), verbose = TRUE )
x |
A vector or list to apply over. |
fun |
The function to be applied to each element. |
cores |
The number of cores to use for parallelization with foreach::foreach.
Default is |
export_fun |
The functions to export the function to workers. |
clean_result |
Whether to remove failed results from output.
If |
throw_error |
Whether to print detailed error information for failed results.
Default is |
progress_bar_width |
Width of the verbose progress bar in characters.
Default is |
timestamp_format |
Format string for timestamp display.
Default is |
verbose |
Whether to print the message.
Default is |
A list of computed results.
If clean_result = FALSE, failed results are included as error objects.
If clean_result = TRUE, only successful results are returned.
parallelize_fun(1:3, function(x) { Sys.sleep(0.2) x^2 }) parallelize_fun(list(1, 2, 3), function(x) { Sys.sleep(0.2) x^2 }, cores = 2) # Examples with error handling parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, clean_result = FALSE) parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, clean_result = TRUE) # Control error printing parallelize_fun(1:5, function(x) { if (x == 2) stop("Error on element 3") if (x == 4) stop("Error on element 4") x^2 }) parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, throw_error = FALSE)parallelize_fun(1:3, function(x) { Sys.sleep(0.2) x^2 }) parallelize_fun(list(1, 2, 3), function(x) { Sys.sleep(0.2) x^2 }, cores = 2) # Examples with error handling parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, clean_result = FALSE) parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, clean_result = TRUE) # Control error printing parallelize_fun(1:5, function(x) { if (x == 2) stop("Error on element 3") if (x == 4) stop("Error on element 4") x^2 }) parallelize_fun(1:5, function(x) { if (x == 3) stop("Error on element 3") x^2 }, throw_error = FALSE)
Parse '' inline expressions and evaluate them in the current environment, while preserving outer formatting markers like '{.val ...}'.
parse_inline_expressions(text, env = parent.frame())parse_inline_expressions(text, env = parent.frame())
text |
A character string containing inline expressions to parse. |
env |
Environment in which to evaluate expressions. Defaults to the calling environment. |
A character string with expressions evaluated but formatting preserved.
i <- 1 parse_inline_expressions( "{.val {i}}" ) x <- 5 y <- 10 parse_inline_expressions( "{.pkg {x + y}}" ) name <- "testing" name <- parse_inline_expressions( "{.pkg {name}}" ) name log_message(name)i <- 1 parse_inline_expressions( "{.val {i}}" ) x <- 5 y <- 10 parse_inline_expressions( "{.pkg {x + y}}" ) name <- "testing" name <- parse_inline_expressions( "{.pkg {name}}" ) name log_message(name)
Correlation and covariance calculation for sparse matrix
pearson_correlation(x, y = NULL)pearson_correlation(x, y = NULL)
x |
Sparse matrix or character vector. |
y |
Sparse matrix or character vector. |
A list with covariance and correlation matrices.
m1 <- simulate_sparse_matrix( 100, 100 ) m2 <- simulate_sparse_matrix( 100, 100, sparsity = 0.05 ) a <- pearson_correlation(m1, m2) a$cov[1:5, 1:5] a$cor[1:5, 1:5]m1 <- simulate_sparse_matrix( 100, 100 ) m2 <- simulate_sparse_matrix( 100, 100, sparsity = 0.05 ) a <- pearson_correlation(m1, m2) a$cov[1:5, 1:5] a$cor[1:5, 1:5]
Print logo
## S3 method for class 'thisutils_logo' print(x, ...)## S3 method for class 'thisutils_logo' print(x, ...)
x |
Input information. |
... |
Other parameters. |
Print the ASCII logo
)Coefficient of determination ()
r_square(y_true, y_pred)r_square(y_true, y_pred)
y_true |
A numeric vector with ground truth values. |
y_pred |
A numeric vector with predicted values. |
The value.
y <- rnorm(100) y_pred <- y + rnorm(100, sd = 0.5) r_square(y, y_pred)y <- rnorm(100) y_pred <- y + rnorm(100, sd = 0.5) r_square(y, y_pred)
Check and remove R packages
remove_r(packages, lib = .libPaths()[1], verbose = TRUE)remove_r(packages, lib = .libPaths()[1], verbose = TRUE)
packages |
Package to be removed. |
lib |
The location of the library directories where to remove the packages. |
verbose |
Whether to print the message.
Default is |
Remove and normalize spaces
remove_space( x, trim_start = TRUE, trim_end = FALSE, collapse_multiple = TRUE, preserve_newlines = TRUE )remove_space( x, trim_start = TRUE, trim_end = FALSE, collapse_multiple = TRUE, preserve_newlines = TRUE )
x |
A vector of character strings. |
trim_start |
Whether to remove leading spaces before the first word.
Default is |
trim_end |
Whether to remove trailing spaces after the last word.
Default is |
collapse_multiple |
Whether to collapse multiple consecutive spaces between words into a single space.
Default is |
preserve_newlines |
Whether to preserve newline characters when collapsing spaces.
Default is |
A character vector with spaces normalized according to the specified parameters.
x <- c( " hello world ", " test case ", "no space", " multiple spaces " ) remove_space(x) remove_space(x, trim_start = FALSE) remove_space(x, trim_end = TRUE) remove_space(x, collapse_multiple = FALSE) remove_space( x, trim_start = FALSE, trim_end = FALSE, collapse_multiple = FALSE ) # with newlines multiline <- c( "hello\n\n world ", " first \n second " ) remove_space(multiline) remove_space(multiline, preserve_newlines = FALSE)x <- c( " hello world ", " test case ", "no space", " multiple spaces " ) remove_space(x) remove_space(x, trim_start = FALSE) remove_space(x, trim_end = TRUE) remove_space(x, collapse_multiple = FALSE) remove_space( x, trim_start = FALSE, trim_end = FALSE, collapse_multiple = FALSE ) # with newlines multiline <- c( "hello\n\n world ", " first \n second " ) remove_space(multiline) remove_space(multiline, preserve_newlines = FALSE)
This function generates a sparse matrix with a specified number of rows and columns, a given sparsity level, and a distribution function for the non-zero values.
simulate_sparse_matrix( nrow, ncol, sparsity = 0.95, distribution_fun = function(n) stats::rpois(n, lambda = 0.5) + 1, decimal = 0, seed = 1 )simulate_sparse_matrix( nrow, ncol, sparsity = 0.95, distribution_fun = function(n) stats::rpois(n, lambda = 0.5) + 1, decimal = 0, seed = 1 )
nrow |
Number of rows in the matrix. |
ncol |
Number of columns in the matrix. |
sparsity |
Proportion of zero elements (sparsity level).
Default is |
distribution_fun |
Function to generate non-zero values. |
decimal |
Controls the number of decimal places in the generated values.
If set to |
seed |
Random seed for reproducibility. |
A sparse matrix of class "dgCMatrix".
simulate_sparse_matrix(1000, 500) |> check_sparsity() simulate_sparse_matrix(10, 10, decimal = 1) simulate_sparse_matrix(10, 10, decimal = 5)simulate_sparse_matrix(1000, 500) |> check_sparsity() simulate_sparse_matrix(10, 10, decimal = 1) simulate_sparse_matrix(10, 10, decimal = 5)
Safe correlation function which returns a sparse matrix.
sparse_cor( x, y = NULL, method = "pearson", allow_neg = TRUE, remove_na = TRUE, remove_inf = TRUE, ... )sparse_cor( x, y = NULL, method = "pearson", allow_neg = TRUE, remove_na = TRUE, remove_inf = TRUE, ... )
x |
Sparse matrix or character vector. |
y |
Sparse matrix or character vector. |
method |
Method to use for calculating the correlation coefficient. |
allow_neg |
Logical. Whether to allow negative values or set them to 0. |
remove_na |
Logical. Whether to replace NA values with 0. |
remove_inf |
Logical. Whether to replace infinite values with 1. |
... |
Other arguments passed to stats::cor function. |
A correlation matrix.
m1 <- simulate_sparse_matrix( 500, 100 ) m2 <- simulate_sparse_matrix( 500, 100, seed = 2025 ) a <- sparse_cor(m1) b <- sparse_cor(m1, m2) c <- as_matrix( cor(as_matrix(m1)), return_sparse = TRUE ) d <- as_matrix( cor(as_matrix(m1), as_matrix(m2)), return_sparse = TRUE ) a[1:5, 1:5] c[1:5, 1:5] all.equal(a, c) b[1:5, 1:5] d[1:5, 1:5] all.equal(b, d) m1[sample(1:500, 10)] <- NA m2[sample(1:500, 10)] <- NA sparse_cor(m1, m2)[1:5, 1:5] system.time( sparse_cor(m1) ) system.time( cor(as_matrix(m1)) ) system.time( sparse_cor(m1, m2) ) system.time( cor(as_matrix(m1), as_matrix(m2)) )m1 <- simulate_sparse_matrix( 500, 100 ) m2 <- simulate_sparse_matrix( 500, 100, seed = 2025 ) a <- sparse_cor(m1) b <- sparse_cor(m1, m2) c <- as_matrix( cor(as_matrix(m1)), return_sparse = TRUE ) d <- as_matrix( cor(as_matrix(m1), as_matrix(m2)), return_sparse = TRUE ) a[1:5, 1:5] c[1:5, 1:5] all.equal(a, c) b[1:5, 1:5] d[1:5, 1:5] all.equal(b, d) m1[sample(1:500, 10)] <- NA m2[sample(1:500, 10)] <- NA sparse_cor(m1, m2)[1:5, 1:5] system.time( sparse_cor(m1) ) system.time( cor(as_matrix(m1)) ) system.time( sparse_cor(m1, m2) ) system.time( cor(as_matrix(m1), as_matrix(m2)) )
An optimised version of split for the special case of splitting row indices into groups.
split_indices(group, n = 0L)split_indices(group, n = 0L)
group |
Integer indices |
n |
The largest integer (may not appear in index).
This is hint: if the largest value of |
A list of vectors of indices.
split_indices(sample(10, 100, rep = TRUE)) split_indices(sample(10, 100, rep = TRUE), 10)split_indices(sample(10, 100, rep = TRUE)) split_indices(sample(10, 100, rep = TRUE), 10)
Sum P-value
sump(p)sump(p)
p |
A vector of P-values. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) sump(p)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) sump(p)
Switch table to matrix
table_to_matrix( table, row_names = NULL, col_names = NULL, threshold = 0, return_sparse = FALSE )table_to_matrix( table, row_names = NULL, col_names = NULL, threshold = 0, return_sparse = FALSE )
table |
A table with three columns: |
row_names |
Character vector of row names to filter by. |
col_names |
Character vector of column names to filter by. |
threshold |
The threshold for filtering values based on absolute values.
Defaults to |
return_sparse |
Whether to return a sparse matrix. Defaults to |
A matrix.
table <- data.frame( row = c("r1", "r2", "r3", "r4", "r5", "r6"), col = c("c4", "c5", "c6", "c1", "c2", "c3"), value = c(0.6, -0.5, -0.4, 0.3, 0.2, 0.1) ) matrix <- table_to_matrix(table) table_new <- matrix_to_table(matrix) identical(table, table_new) table_to_matrix(table, threshold = 0.3) table_to_matrix( table, row_names = c("r1", "r2"), col_names = c("c4", "c5") ) sparse_matrix <- simulate_sparse_matrix(10, 10) table_sparse <- matrix_to_table( sparse_matrix ) sparse_matrix_new <- table_to_matrix( table_sparse, return_sparse = TRUE ) identical(sparse_matrix, sparse_matrix_new)table <- data.frame( row = c("r1", "r2", "r3", "r4", "r5", "r6"), col = c("c4", "c5", "c6", "c1", "c2", "c3"), value = c(0.6, -0.5, -0.4, 0.3, 0.2, 0.1) ) matrix <- table_to_matrix(table) table_new <- matrix_to_table(matrix) identical(table, table_new) table_to_matrix(table, threshold = 0.3) table_to_matrix( table, row_names = c("r1", "r2"), col_names = c("c4", "c5") ) sparse_matrix <- simulate_sparse_matrix(10, 10) table_sparse <- matrix_to_table( sparse_matrix ) sparse_matrix_new <- table_to_matrix( table_sparse, return_sparse = TRUE ) identical(sparse_matrix, sparse_matrix_new)
The thisutils logo, using ASCII or Unicode characters Use cli::ansi_strip to get rid of the colors.
thisutils_logo(unicode = cli::is_utf8_output())thisutils_logo(unicode = cli::is_utf8_output())
unicode |
Unicode symbols on UTF-8 platforms. Default is cli::is_utf8_output. |
A character vector with class thisutils_logo.
https://github.com/tidyverse/tidyverse/blob/main/R/logo.R
thisutils_logo()thisutils_logo()
The function is used as a fail-safe if code sometimes works and sometimes doesn't,
usually because it depends on a resource that may be temporarily unavailable.
It tries to evaluate the expression max_tries times.
If all the attempts fail, it throws an error;
if not, the evaluated expression is returned.
try_get(expr, max_tries = 5, error_message = "", retry_message = "Retrying...")try_get(expr, max_tries = 5, error_message = "", retry_message = "Retrying...")
expr |
The expression to be evaluated. |
max_tries |
The maximum number of attempts to evaluate the expression before giving up.
Default is |
error_message |
Additional custom error message to be displayed when an error occurs. |
retry_message |
Message displayed when a new try to evaluate the expression would be attempted. |
The evaluated expression if successful, otherwise it throws an error if all attempts are unsuccessful.
f <- function() { value <- runif(1, min = 0, max = 1) if (value > 0.5) { log_message("value is larger than 0.5") return(value) } else { log_message( "value is smaller than 0.5", message_type = "error" ) } } f_evaluated <- try_get(expr = f()) print(f_evaluated)f <- function() { value <- runif(1, min = 0, max = 1) if (value > 0.5) { log_message("value is larger than 0.5") return(value) } else { log_message( "value is smaller than 0.5", message_type = "error" ) } } f_evaluated <- try_get(expr = f()) print(f_evaluated)
Implement similar functions to the tidyr::unnest function.
unnest_fun(data, cols, keep_empty = FALSE)unnest_fun(data, cols, keep_empty = FALSE)
data |
A data frame. |
cols |
Columns to unnest. |
keep_empty |
By default, you get one row of output for each element of the list your unchopping/unnesting.
This means that if there's a size-0 element (like |
data <- data.frame( id = 1:3, x = c("a", "b", "c"), stringsAsFactors = FALSE ) data$data <- list( c(1, 2), c(3, 4, 5), c(6) ) unnest_fun(data, cols = "data") data2 <- data.frame( id = 1:3, x = c("a", "b", "c"), stringsAsFactors = FALSE ) data2$data <- list( c(1, 2), numeric(0), c(6) ) unnest_fun(data2, cols = "data") unnest_fun(data2, cols = "data", keep_empty = TRUE)data <- data.frame( id = 1:3, x = c("a", "b", "c"), stringsAsFactors = FALSE ) data$data <- list( c(1, 2), c(3, 4, 5), c(6) ) unnest_fun(data, cols = "data") data2 <- data.frame( id = 1:3, x = c("a", "b", "c"), stringsAsFactors = FALSE ) data2$data <- list( c(1, 2), numeric(0), c(6) ) unnest_fun(data2, cols = "data") unnest_fun(data2, cols = "data", keep_empty = TRUE)
Vote P-value
votep(p, alpha = 0.5)votep(p, alpha = 0.5)
p |
A vector of P-values. |
alpha |
The significance level. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) votep(p) votep(p, alpha = 0.01)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) votep(p) votep(p, alpha = 0.01)
Wilkinson's P-value
wilkinsonp(p, r = 1, alpha = 0.05, log.p = FALSE)wilkinsonp(p, r = 1, alpha = 0.05, log.p = FALSE)
p |
A vector of P-values. |
r |
The number of studies to include in the P-value calculation. |
alpha |
The significance level. |
log.p |
Whether to return the log of the P-value. |
p <- c(0.01, 0.02, 0.03, 0.04, 0.05) wilkinsonp(p) wilkinsonp(p, r = 2) wilkinsonp(p, alpha = 0.01) wilkinsonp(p, log.p = TRUE)p <- c(0.01, 0.02, 0.03, 0.04, 0.05) wilkinsonp(p) wilkinsonp(p, r = 2) wilkinsonp(p, alpha = 0.01) wilkinsonp(p, log.p = TRUE)