Package 'currr' reference manual

Title:	Apply Mapping Functions in Frequent Saving
Description:	Implementations of the family of map() functions with frequent saving of the intermediate results. The contained functions let you start the evaluation of the iterations where you stopped (reading the already evaluated ones from cache), and work with the currently evaluated iterations while remaining ones are running in a background job. Parallel computing is also easier with the workers parameter.
Authors:	Marcell Granat [aut, cre]
Maintainer:	Marcell Granat <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.2
Built:	2025-03-01 05:21:53 UTC
Source:	https://github.com/marcellgranat/currr

Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. The cp_map functions work the same way, but they also create a hidden folder (`.currr.data`) in your current working directory and save the results each time a checkpoint is reached. If you rerun the code, it reads the already evaluated results from the cache folder and resumes the evaluation where it stopped.
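
The checkpoint-and-resume idea described above can be sketched in base R. This is only an illustration under stated assumptions, not currr's implementation: `checkpoint_map`, its `cache_file` and `every` arguments, and the single `.rds` file are hypothetical, and the package's actual storage layout may differ.

```r
# Hypothetical helper, NOT part of currr: a base-R sketch of checkpointed mapping.
checkpoint_map <- function(x, f, cache_file, every = 10) {
  # Resume from the cache if an earlier run left results behind.
  done <- if (file.exists(cache_file)) readRDS(cache_file) else list()
  start <- length(done) + 1L

  for (i in seq_along(x)) {
    if (i < start) next  # already evaluated in an earlier run
    done[[i]] <- f(x[[i]])
    # Save at every checkpoint and once more after the last element.
    if (i %% every == 0 || i == length(x)) saveRDS(done, cache_file)
  }
  done
}

cache <- tempfile(fileext = ".rds")
calls <- 0
f <- function(v) {
  calls <<- calls + 1  # count how often the function is really evaluated
  v^2
}

first  <- checkpoint_map(1:25, f, cache, every = 10)
second <- checkpoint_map(1:25, f, cache, every = 10)  # served from the cache

identical(first, second)
calls
```

The second call evaluates nothing, because every element is already present in the cache file. The sketch assumes that `f` never returns `NULL`.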

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.
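
As a rough illustration of how the options listed above might be passed, the sketch below sets them per call through `cp_options` and then session-wide through `options()`. The function `slow_square` and the cache name `"slow_square_demo"` are made up for this example. It assumes the currr package is installed and that `wait = Inf` keeps the evaluation in the current session, as the argument description suggests.

```r
library(currr)

# Hypothetical slow function, used only for illustration.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# Per-call options: return only after full evaluation, save at 5 checkpoints,
# and stay on a single core.
res <- cp_map(
  .x = 1:20,
  .f = slow_square,
  name = "slow_square_demo",
  cp_options = list(wait = Inf, n_checkpoint = 5, workers = 1)
)

length(res)

# Session-wide default for the number of checkpoints, restored afterwards.
old <- options(currr.n_checkpoint = 200)
options(old)

# Remove only the cache created by this example.
remove_currr_cache("slow_square_demo")
```

Giving an explicit `name` avoids the collision risk of automatically generated cache names that the `name` argument warns about.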

Value

A list.

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = 2, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = 2, name = "iris_mean")

remove_currr_cache()


Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map_chr(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.

Value

A character vector.

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

remove_currr_cache()


Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map_dbl(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.

Value

A numeric vector.
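
To make the return type concrete, here is a small sketch that calls `cp_map_dbl()` directly. The function `halve` and the cache name `"halve_demo"` are hypothetical. The sketch assumes that currr is installed and that `wait = Inf` returns the full result in the current session.

```r
library(currr)

# Hypothetical function that returns a single double per element.
halve <- function(x) x / 2

res <- cp_map_dbl(
  .x = 1:10,
  .f = halve,
  name = "halve_demo",
  cp_options = list(wait = Inf)
)

is.numeric(res)
length(res)

# Remove only the cache created by this example.
remove_currr_cache("halve_demo")
```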

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

remove_currr_cache()


Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map_dfc(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.

Value

A tibble.

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

remove_currr_cache()


Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map_dfr(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.

Value

A tibble.

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

remove_currr_cache()


Wrapper function of `purrr::map`. Apply a function to each element of a vector, but save the intermediate data after a given number of iterations.

Description

cp_map() always returns a list.
cp_map_lgl(), cp_map_dbl() and cp_map_chr() return an atomic vector of the indicated type (or die trying). For these functions, `.f` must return a length-1 vector of the appropriate type.

Usage

cp_map_lgl(.x, .f, ..., name = NULL, cp_options = list())

Arguments

`.x`	A list or atomic vector.
`.f`	A function, specified in one of the following ways: A named function, e.g. `mean`. An anonymous function, e.g. `\(x) x + 1` or `function(x) x + 1`. A formula, e.g. `~ .x + 1`. You must use `.x` to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.
`...`	Additional arguments passed on to the mapped function.
`name`	Name of the subfolder in the cache folder. If you do not specify it, `cp_map` uses the name of the function combined with the name of `.x`. This is risky, since the generated name can occur more than once in your code. Changing `.x` will also trigger a rerun of the code, which you may want to avoid. (If a subset of `.x` matches the cached one and the function is the same, the elements of that subset are not re-evaluated but read from the cache.)
`cp_options`	Options for the evaluation: `wait`, `n_checkpoint`, `workers`, `fill`. `wait`: The number of iterations after which the console shows the intermediate results (default `1`). If its value is between 0 and 1, it is taken as the proportion of iterations to wait for (for example, if the length of `.x` is 100 and `wait` is 0.5, you get back the result after 50 iterations). Set it to `Inf` to get back the results only after full evaluation. If its value is not `Inf`, the evaluation runs in a background job. `n_checkpoint`: Number of checkpoints at which intermediate results are saved (default `100`). `workers`: Number of CPU cores to use (the parallel package is called in the background). Set it to 1 (default) to avoid parallel computing. `fill`: When you get back a result that is not fully evaluated, should its length be the same as that of `.x`? (default `TRUE`). You can also set these options with `options(currr.n_checkpoint = 200)`. Additional options: `currr.unchanged_message` (TRUE/FALSE), `currr.progress_length`.

Value

A logical vector.

Examples

# Run them on console!
# (functions need writing and reading access to your working directory and they also print)

avg_n <- function(.data, .col, x) {
  Sys.sleep(.01)

  .data |>
    dplyr::pull({{ .col }}) |>
    (\(m) mean(m) * x) ()
}


cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

 # same function, read from cache
cp_map(.x = 1:10, .f = avg_n, .data = iris, .col = Sepal.Length, name = "iris_mean")

remove_currr_cache()


Remove currr's intermediate data from the folder.

Description

Remove currr's intermediate data from the folder.

Usage

remove_currr_cache(list = NULL)

Arguments

`list`	A character vector specifying the names of the caches you want to remove (files in the `.currr.data` folder). If empty (default), all caches will be removed.

Value

No return value, called for side effects

Runs a map with the function, but saves after a given number of executions. This is an internal function that you are not supposed to call manually; it is exported only because a background job can call it only if it is exported.

Description

Runs a map with the function, but saves after a given number of executions. This is an internal function that you are not supposed to call manually; it is exported only because a background job can call it only if it is exported.

Usage

saving_map(.ids, .f, name, n_checkpoint = 100, currr_folder, ...)

Arguments

`.ids`	Positions of the elements of `.x` to work with.
`.f`	Called function.
`name`	Name for saving.
`n_checkpoint`	Number of checkpoints.
`currr_folder`	Folder where cache files are stored.
`...`	Additional arguments.

Value

No return value, called for side effects

Runs a map with the function, but saves after a given number of executions. This is an internal function that you are not supposed to call manually; it is exported only because a background job can call it only if it is exported. It differs from `saving_map` in that it has no `...` argument. This is necessary because `job::job` fails if `...` is not provided in the `cp_map` call.

Description

Runs a map with the function, but saves after a given number of executions. This is an internal function that you are not supposed to call manually; it is exported only because a background job can call it only if it is exported. It differs from `saving_map` in that it has no `...` argument. This is necessary because `job::job` fails if `...` is not provided in the `cp_map` call.

Usage

saving_map_nodot(.ids, .f, name, n_checkpoint = 100, currr_folder)

Arguments

`.ids`	Positions of the elements of `.x` to work with.
`.f`	Called function.
`name`	Name for saving.
`n_checkpoint`	Number of checkpoints.
`currr_folder`	Folder where cache files are stored.

Value

No return value, called for side effects
