Title: | A Thoughtful Saver of Results |
---|---|
Description: | Helps with the thoughtful saving, reading, and management of result files (using 'rds' files). The core functions take a list of parameters that are used to generate a unique hash to save results under. Then, the same parameter list can be used to read those results back in. This is helpful to avoid clunky file naming when running a large number of simulations. Additionally, helper functions are available for compiling a flat file of parameters of saved results, monitoring result usage, and cleaning up unwanted or unused results. For more information, visit the 'indexr' homepage <https://lharris421.github.io/indexr/>. |
Authors: | Logan Harris [aut, cre] |
Maintainer: | Logan Harris <[email protected]> |
License: | GPL-3 |
Version: | 0.2.2 |
Built: | 2025-02-18 07:04:44 UTC |
Source: | https://github.com/lharris421/indexr |
This function checks for the existence of results saved under a specified parameter list in RDS files (saved with indexr) within a given folder.

    check_hash_existence(
      folder,
      parameters_list,
      halt = FALSE,
      hash_includes_timestamp = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64",
      ignore_script_name = FALSE
    )

folder | A string specifying the directory containing the RDS files. |
parameters_list | A list of parameters for which a corresponding hash named file is checked. |
halt | Logical; if TRUE, the function stops execution if an existing file is found. This may be useful as a check before running a simulation. |
hash_includes_timestamp | Logical; if TRUE, timestamps are included in the hash generation process. |
ignore_na | Logical; if TRUE, NA values are ignored during hash generation. |
alphabetical_order | Logical; if TRUE, parameters are sorted alphabetically before hash generation. |
algo | Character string specifying the hashing algorithm to use. Default is |
ignore_script_name | Logical. If |
A logical indicating whether a matching file exists. If halt = TRUE and a file is found, an error is thrown instead.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save an object
    parameters_list <- list(example = "check_hash_existence")
    save_objects(folder = tmp_dir, results = 1, parameters_list = parameters_list)

    ## Check that an object under specified parameters is saved
    check_hash_existence(folder = tmp_dir, parameters_list)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

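The logical return value of check_hash_existence lends itself to a skip-if-present guard before an expensive run. The following is a minimal sketch, assuming indexr is installed; `run_if_missing` and the `guard_example` folder name are hypothetical and introduced only for illustration.

```r
library(indexr)

## Hypothetical helper: run and save only when no result exists yet
run_if_missing <- function(folder, params) {
  if (check_hash_existence(folder = folder, parameters_list = params)) {
    return(invisible(FALSE))  # a result is already saved, so skip the work
  }
  res <- rnorm(params$n, mean = params$mean)  # stand-in for a real simulation
  save_objects(folder = folder, results = res, parameters_list = params)
  invisible(TRUE)
}

tmp_dir <- file.path(tempdir(), "guard_example")
dir.create(tmp_dir, showWarnings = FALSE)
params <- list(n = 50, mean = 0)

first_run <- run_if_missing(tmp_dir, params)   # folder is empty, so this should run
second_run <- run_if_missing(tmp_dir, params)  # should skip if the first save succeeded

## Cleanup
unlink(tmp_dir, recursive = TRUE)
```

Setting halt = TRUE is the stricter alternative: instead of returning a logical to branch on, the call stops with an error when a result already exists.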
Allows the user to leverage the generate_hash function to generate a table that is subsequently used to remove indicated results.

    cleanup_from_hash_table(
      folder,
      hash_table,
      mode = c("manual", "all"),
      column = NULL,
      request_confirmation = TRUE
    )

folder | A string specifying the directory containing the RDS files. |
hash_table | A |
mode | A character string. When |
column | A character string indicating the logical column in |
request_confirmation | Logical, if TRUE will request user input before proceeding to delete files. |
There are a few ways to use this. When mode = "manual" (the default), the function expects that the user has added a logical column to the hash table that indicates which files to delete. When mode = "all", every result listed in the hash table is removed. The latter is generally only used when a filter_list is passed to create_hash_table.
Nothing. This function is called for its side effects.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save example objects
    parameters_list1 <- list(example = "tagging1")
    parameters_list2 <- list(example = "tagging2")
    save_objects(folder = tmp_dir, results = 1, parameters_list = parameters_list1)
    save_objects(folder = tmp_dir, results = 2, parameters_list = parameters_list2)

    ## See the files saved
    list.files(tmp_dir)

    ## Create hash table (flat file of result parameters)
    hash_table <- create_hash_table(folder = tmp_dir)

    ## Delete "all" files based on hash table, without confirmation
    cleanup_from_hash_table(
      folder = tmp_dir,
      hash_table = hash_table,
      mode = "all",
      request_confirmation = FALSE
    )

    ## See the files have been deleted
    list.files(tmp_dir)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

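The example above covers mode = "all". For mode = "manual", a sketch along the following lines may help, assuming indexr is installed. The column name `to_delete` and the folder name are hypothetical, and the code assumes that a flat parameter named `example` shows up as a column of the same name in the hash table.

```r
library(indexr)

tmp_dir <- file.path(tempdir(), "manual_cleanup_example")
dir.create(tmp_dir, showWarnings = FALSE)

## Save two results under different parameters
save_objects(folder = tmp_dir, results = 1, parameters_list = list(example = "keep_me"))
save_objects(folder = tmp_dir, results = 2, parameters_list = list(example = "drop_me"))
files_before <- list.files(tmp_dir)

## Build the hash table and mark rows for deletion in a logical column
hash_table <- create_hash_table(folder = tmp_dir)
hash_table$to_delete <- hash_table$example == "drop_me"  # hypothetical column name

## Remove only the marked results, without asking for confirmation
cleanup_from_hash_table(
  folder = tmp_dir,
  hash_table = hash_table,
  mode = "manual",
  column = "to_delete",
  request_confirmation = FALSE
)
files_after <- list.files(tmp_dir)

## Cleanup
unlink(tmp_dir, recursive = TRUE)
```

Comparing `files_before` with `files_after` shows which files were removed. Keeping request_confirmation = TRUE is the safer choice outside of scripted examples.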
save_objects with incremental=TRUE
This function is only intended to be used after save_objects with incremental=TRUE. In that case, save_objects saves each result under a temporary hash inside a folder named with the hash corresponding to the parameters. compress_incremental then combines the results, saves them under the corresponding hash, and deletes the old directory containing the temporary results.

    compress_incremental(
      folder,
      parameters_list,
      hash_includes_timestamp = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64",
      ignore_script_name = FALSE,
      remove_folder = TRUE
    )

folder | Character string specifying the path to the directory where the temporary folder was saved (should be the same as supplied to |
parameters_list | The named list of arguments used with |
hash_includes_timestamp | Logical. If |
ignore_na | Logical. If |
alphabetical_order | Logical. If |
algo | Character string specifying the hashing algorithm to use. Default is |
ignore_script_name | Logical. If |
remove_folder | Logical. If |
If the individual results can be combined into a data.frame, they will be; otherwise they are stored as a list.
No return value. This function is called for its side effects.

    ## Save results incrementally
    params <- list(a = "1", b = "2")
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)
    for (i in 1:10) {
      save_objects(tmp_dir, data.frame(idx = i, val = rnorm(1)), params, incremental = TRUE)
    }

    ## See contents of tmp directory for incremental file
    list.files(file.path(tmp_dir, generate_hash(params)))

    ## Compress results into a single file
    compress_incremental(tmp_dir, params)
    list.files(tmp_dir)

    ## Read in compressed file and view results
    read_objects(tmp_dir, params)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

Reads in all the parameter files for a given folder, flattens nested lists, and then combines the parameters into a data frame. Each row in the resulting data frame represents the arguments used for one RDS file, identified by its hash. Optionally, the function can filter the data frame based on specified criteria and save it to a file.
create_hash_table(folder, save_path = NULL, filter_list = NULL)
folder | A string specifying the directory containing the RDS files. |
save_path | An optional string specifying the path to save the resulting hash table as a CSV file. If |
filter_list | An optional list of filters to apply to the hash table. Each element of the list should be named according to a column in the hash table and contain the value to filter for in that column. |
Saving the hash table can be helpful for the manipulation of parameters (see ?update_from_hash_table) or for removal of unwanted results (see ?cleanup_from_hash_table).
A data frame where each row corresponds to a parameters_list from an RDS file, with an additional column for the hash of each set of arguments.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save objects
    obj1 <- rnorm(1000)
    obj2 <- data.frame(
      x = runif(100),
      y = "something",
      z = rep(c(TRUE, FALSE), 50)
    )
    obj3 <- list(obj1, obj2)

    params1 <- list(
      distribution = "normal",
      other_params = list(param1 = TRUE, param2 = 1, param3 = NA)
    )
    params2 <- list(
      distribution = "uniform",
      other_params = list(param1 = FALSE, param2 = 2, param3 = "1", param4 = 4)
    )
    params3 <- list(
      distribution = "composite",
      other_params = list(param1 = TRUE, param2 = 3, param3 = 1)
    )

    save_objects(tmp_dir, obj1, params1)
    save_objects(tmp_dir, obj2, params2)
    save_objects(tmp_dir, obj3, params3)

    ## Create hash table (and save it)
    create_hash_table(tmp_dir, save_path = file.path(tmp_dir, "hash_table.csv"))

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

This function generates a hash value for a given list of arguments. It is designed to produce a consistent hash by optionally removing NA values, ordering arguments alphabetically, handling timestamp inclusion, etc.

    generate_hash(
      parameters_list,
      hash_includes_timestamp = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64",
      ignore_script_name = FALSE
    )

parameters_list | A named list of arguments for which the hash will be generated. Each element in the list should correspond to a parameter. |
hash_includes_timestamp | Logical; if FALSE, any timestamp included in parameters_list will be removed before hash generation. If TRUE, the timestamp will be included in the hash calculation. |
ignore_na | Logical; if TRUE, any NA values in parameters_list will be removed before hash generation. |
alphabetical_order | Logical; if TRUE, the arguments in parameters_list will be sorted alphabetically by their names before hash generation. |
algo | The hash algorithm to use (See |
ignore_script_name | Logical. If |
A character string representing the hash value of the provided argument list.

    args <- list(param1 = "value1", param2 = 100, param3 = NA)
    generate_hash(args)

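To illustrate why dropping NA values and sorting by name gives a consistent key, here is a base R sketch of that kind of normalization. It is not indexr's internal implementation: `normalize_params` is a hypothetical helper, it only handles top-level scalar NA values, and it stops short of computing a hash.

```r
## Hypothetical helper mimicking the ignore_na and alphabetical_order options
normalize_params <- function(params, ignore_na = TRUE, alphabetical_order = TRUE) {
  if (ignore_na) {
    ## Flag top-level elements that are a single NA value
    is_na_scalar <- vapply(
      params,
      function(x) is.atomic(x) && length(x) == 1 && is.na(x),
      logical(1)
    )
    params <- params[!is_na_scalar]
  }
  if (alphabetical_order) {
    params <- params[order(names(params))]
  }
  params
}

## Same information, different order, plus an NA entry
a <- list(param2 = 100, param1 = "value1", param3 = NA)
b <- list(param1 = "value1", param2 = 100)

## After normalization the two lists are identical, so any hash of them would match
same_key <- identical(normalize_params(a), normalize_params(b))
same_key
```

With either option switched off, the two lists above would no longer normalize to the same object, which is why the same settings need to be used when saving and when reading.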
Reads R objects from specified folders based on a generated hash of the provided parameters_list.

    read_objects(
      folders,
      parameters_list,
      hash_includes_timestamp = FALSE,
      ignore_script_name = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64",
      print_hash = FALSE,
      tagging_file_name = "indexr_tagging.txt",
      silent = FALSE
    )

folders | Character vector specifying the paths to directories containing the saved objects. The function will check each folder in order to find the file. |
parameters_list | A named list of arguments used to generate a unique hash for the file. |
hash_includes_timestamp | Logical. If |
ignore_script_name | Logical. If |
ignore_na | Logical. If |
alphabetical_order | Logical. If |
algo | Character string specifying the hashing algorithm to use. Default is |
print_hash | Logical. If |
tagging_file_name | Character string of a txt file that is being used for tagging results. See |
silent | Logical. If |
This function attempts to read an R object from files located in one of the specified folders. The file name is based on the hash of the provided arguments. If the object is successfully read and a tagging file exists and is specified, the function appends the hash and the current timestamp to the tagging file in the folder where the file was found.
The data stored in the file retrieved, typically the results. Returns NULL if the file is not found in any of the specified folders.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Example using parameter list to run simulation and save results
    parameters_list <- list(
      iterations = 1000,
      x_dist = "rnorm",
      x_dist_options = list(n = 10, mean = 1, sd = 2),
      error_dist = "rnorm",
      error_dist_options = list(n = 10, mean = 0, sd = 1),
      beta0 = 1,
      beta1 = 1
    )

    betas <- numeric(parameters_list$iterations)
    for (i in 1:parameters_list$iterations) {
      x <- do.call(parameters_list$x_dist, parameters_list$x_dist_options)
      err <- do.call(parameters_list$error_dist, parameters_list$error_dist_options)
      y <- parameters_list$beta0 + parameters_list$beta1 * x + err
      betas[i] <- coef(lm(y ~ x))["x"]
    }

    save_objects(folder = tmp_dir, results = betas, parameters_list = parameters_list)

    ## Read back in (consider clearing environment before running)
    ## Re-setup
    tmp_dir <- file.path(tempdir(), "example")
    parameters_list <- list(
      iterations = 1000,
      x_dist = "rnorm",
      x_dist_options = list(n = 10, mean = 1, sd = 2),
      error_dist = "rnorm",
      error_dist_options = list(n = 10, mean = 0, sd = 1),
      beta0 = 1,
      beta1 = 1
    )

    betas <- read_objects(folders = tmp_dir, parameters_list = parameters_list)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

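Because read_objects accepts several folders and returns NULL when nothing matches, calling code can search multiple locations and branch on a missing result. The following is a sketch under the assumption that indexr is installed; the folder names and the `never_saved` parameter value are hypothetical.

```r
library(indexr)

dir_a <- file.path(tempdir(), "read_example_a")
dir_b <- file.path(tempdir(), "read_example_b")
dir.create(dir_a, showWarnings = FALSE)
dir.create(dir_b, showWarnings = FALSE)

## Save a result in the second folder only
params <- list(example = "multi_folder")
save_objects(folder = dir_b, results = 1:3, parameters_list = params)

## Folders are checked in the order given
found <- read_objects(folders = c(dir_a, dir_b), parameters_list = params)

## Parameters that were never saved: the documented return value is NULL
not_found <- read_objects(
  folders = c(dir_a, dir_b),
  parameters_list = list(example = "never_saved")
)
if (is.null(not_found)) {
  message("No saved result for these parameters; run the simulation first.")
}

## Cleanup
unlink(c(dir_a, dir_b), recursive = TRUE)
```

Checking for NULL explicitly avoids passing a missing result on to later analysis code.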
This function processes all RDS files in a specified directory, generating new hashes for each file's args_list and renaming the files accordingly. It's useful when changing the hash generation algorithm or parameters (if the parameters are manually changed for some reason).

    rehash(
      folder,
      hash_includes_timestamp = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64"
    )

folder | A string specifying the directory containing the RDS files to be rehashed. |
hash_includes_timestamp | Logical; if TRUE, includes timestamps in the hash generation. |
ignore_na | Logical; if TRUE, NA values are ignored during hash generation. |
alphabetical_order | Logical; if TRUE, parameters are sorted alphabetically before hash generation. |
algo | The (potentially new) hash algorithm to use (see |
The function does not return a value but renames the RDS files in the specified directory based on new hashes.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save example objects
    obj1 <- rnorm(1000)
    obj2 <- data.frame(
      x = runif(100),
      y = "something",
      z = rep(c(TRUE, FALSE), 50)
    )
    obj3 <- list(obj1, obj2)

    params1 <- list(
      distribution = "normal",
      other_params = list(param1 = TRUE, param2 = 1, param3 = NA)
    )
    params2 <- list(
      distribution = "uniform",
      other_params = list(param1 = FALSE, param2 = 2, param3 = "1", param4 = 4)
    )
    params3 <- list(
      distribution = "composite",
      other_params = list(param1 = TRUE, param2 = 3, param3 = 1)
    )

    save_objects(tmp_dir, obj1, params1)
    save_objects(tmp_dir, obj2, params2)
    save_objects(tmp_dir, obj3, params3)

    ## See current file names
    list.files(tmp_dir)

    ## Rehash with new algo
    rehash(tmp_dir, algo = "xxhash32")

    ## Observe new file names
    list.files(tmp_dir)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

Saves RDS files to a specified folder with a name that is a hash generated from a list of parameters used for the simulation. A number of options control the behavior, but the default functionality likely covers 99% of use cases.

    save_objects(
      folder,
      results,
      parameters_list = NULL,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      overwrite = FALSE,
      include_timestamp = TRUE,
      hash_includes_timestamp = FALSE,
      algo = "xxhash64",
      get_script_name = TRUE,
      ignore_script_name = FALSE,
      incremental = FALSE,
      silent = FALSE
    )

folder | Character string specifying the path to the directory where the objects will be saved. |
results | The R object or list of objects to be saved. |
parameters_list | A named list of arguments used to generate a unique hash for the file. |
ignore_na | Logical. If |
alphabetical_order | Logical. If |
overwrite | Logical. If |
include_timestamp | Logical. If |
hash_includes_timestamp | Logical. If |
algo | Character string specifying the hashing algorithm to use. Default is |
get_script_name | Logical. If |
ignore_script_name | Logical. If |
incremental | Logical. If |
silent | Logical. If |
This function saves R objects to disk with a file name based on a generated hash of the provided arguments. It supports incremental saving, where multiple results can be saved under the same hash in a subdirectory and later collected. This can be helpful for a simulation that runs and saves results in parallel for the SAME set of simulation parameters.
No return value. This function is called for its side effects.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Example using parameter list to run simulation and save results
    parameters_list <- list(
      iterations = 1000,
      x_dist = "rnorm",
      x_dist_options = list(n = 10, mean = 1, sd = 2),
      error_dist = "rnorm",
      error_dist_options = list(n = 10, mean = 0, sd = 1),
      beta0 = 1,
      beta1 = 1
    )

    betas <- numeric(parameters_list$iterations)
    for (i in 1:parameters_list$iterations) {
      x <- do.call(parameters_list$x_dist, parameters_list$x_dist_options)
      err <- do.call(parameters_list$error_dist, parameters_list$error_dist_options)
      y <- parameters_list$beta0 + parameters_list$beta1 * x + err
      betas[i] <- coef(lm(y ~ x))["x"]
    }

    save_objects(folder = tmp_dir, results = betas, parameters_list = parameters_list)

    ## Read back in (consider clearing environment before running)
    ## Re-setup
    tmp_dir <- file.path(tempdir(), "example")
    parameters_list <- list(
      iterations = 1000,
      x_dist = "rnorm",
      x_dist_options = list(n = 10, mean = 1, sd = 2),
      error_dist = "rnorm",
      error_dist_options = list(n = 10, mean = 0, sd = 1),
      beta0 = 1,
      beta1 = 1
    )

    betas <- read_objects(folders = tmp_dir, parameters_list = parameters_list)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

Tagging is mainly helpful for removing unused results.
start_tagging() initializes the tagging process by creating a txt file in the results directory which will keep a record of which results are being read by read_objects().
cleanup() removes any .rds files in the specified folder that are not listed in the tagging file.
close_tagging() deletes the tagging file, ending the tagging session.

    start_tagging(folder, tagging_file_name = "indexr_tagging.txt")

    cleanup(
      folder,
      tagging_file_name = "indexr_tagging.txt",
      cutoff_date = NULL,
      request_confirmation = TRUE
    )

    close_tagging(folder, tagging_file_name = "indexr_tagging.txt")

folder | A character string specifying the path to the directory where the result files are saved and where the tagging file will be created. |
tagging_file_name | A character string for a txt file the tagging information is to be saved under. |
cutoff_date | A character string in "%Y-%m-%d %H:%M:%S" format used to specify that any tagged files before the date should also be removed. |
request_confirmation | Logical, if TRUE will request user input before proceeding to delete files. |
No return value. This function is called for its side effects.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save example objects
    parameters_list1 <- list(example = "tagging1")
    parameters_list2 <- list(example = "tagging2")
    save_objects(folder = tmp_dir, results = 1, parameters_list = parameters_list1)
    save_objects(folder = tmp_dir, results = 2, parameters_list = parameters_list2)

    ## See the files have been saved
    list.files(tmp_dir)

    ## Start tagging
    start_tagging(tmp_dir)

    ## Read the first file back in; this causes it to be tagged
    res1 <- read_objects(folders = tmp_dir, parameters_list = parameters_list1)

    ## Remove the untagged file (the one for parameters_list2) without confirmation
    cleanup(tmp_dir, request_confirmation = FALSE)

    ## See that one file was removed
    list.files(tmp_dir)

    ## Close tagging (just removes tagging file)
    close_tagging(tmp_dir)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)

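The cutoff_date argument of cleanup() expects a character string in "%Y-%m-%d %H:%M:%S" format. One way to build such a string in base R is sketched below; the one-week window is an arbitrary choice for illustration, and the time zone handling of cleanup() is not specified here, so the local time zone is assumed.

```r
## Build a cutoff one week before now, formatted as "%Y-%m-%d %H:%M:%S"
one_week <- 7 * 24 * 60 * 60  # seconds in a week
cutoff_date <- format(Sys.time() - one_week, "%Y-%m-%d %H:%M:%S")
cutoff_date

## The string could then be passed to cleanup(), for example:
## cleanup(folder, cutoff_date = cutoff_date)
```

Generating the string with format() rather than typing it by hand avoids mistakes such as missing zero padding in the month, day, or time fields.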
This function updates the names of existing results by re-hashing each set of parameters with potentially updated values, based on adjustments made by the user to a hash table (see ?create_hash_table). It loads RDS files based on their existing hashes, compares each to the corresponding entry in the hash table, generates new hashes where needed, and saves the files under the new hashes. The old files are deleted if their hashes differ from the new ones.

    update_from_hash_table(
      hash_table,
      rds_folder,
      hash_includes_timestamp = FALSE,
      ignore_na = TRUE,
      alphabetical_order = TRUE,
      algo = "xxhash64"
    )

hash_table | A file path to a modified hash table generated by |
rds_folder | A string specifying the directory containing the RDS files associated with the hash table. |
hash_includes_timestamp | Logical; if TRUE, timestamps are included in the hash generation. |
ignore_na | Logical; if TRUE, NA values are ignored during hash generation. |
alphabetical_order | Logical; if TRUE, parameters are sorted alphabetically before hash generation. |
algo | Character string specifying the hashing algorithm to use. Default is |
The function does not return a value but saves updated RDS files and deletes old files as needed.

    ## Setup
    tmp_dir <- file.path(tempdir(), "example")
    dir.create(tmp_dir)

    ## Save objects
    obj1 <- rnorm(1000)
    obj2 <- data.frame(
      x = runif(100),
      y = "something",
      z = rep(c(TRUE, FALSE), 50)
    )
    obj3 <- list(obj1, obj2)

    params1 <- list(
      distribution = "normal",
      other_params = list(param1 = TRUE, param2 = 1, param3 = NA)
    )
    params2 <- list(
      distribution = "uniform",
      other_params = list(param1 = FALSE, param2 = 2, param3 = "1", param4 = 4)
    )
    params3 <- list(
      distribution = "composite",
      other_params = list(param1 = TRUE, param2 = 3, param3 = 1)
    )

    save_objects(tmp_dir, obj1, params1)
    save_objects(tmp_dir, obj2, params2)
    save_objects(tmp_dir, obj3, params3)

    ## Create hash table
    create_hash_table(tmp_dir, save_path = file.path(tmp_dir, "hash_table.csv"))

    ## Read in hash table, make a change, and save
    hash_table <- read.csv(file.path(tmp_dir, "hash_table.csv"))
    hash_table$distribution <- "something different"
    write.csv(hash_table, file.path(tmp_dir, "hash_table.csv"))

    ## See file names before change
    list.files(tmp_dir)

    update_from_hash_table(
      hash_table = file.path(tmp_dir, "hash_table.csv"),
      rds_folder = tmp_dir
    )

    ## See the difference from before running update_from_hash_table()
    list.files(tmp_dir)

    ## Cleanup
    unlink(tmp_dir, recursive = TRUE)
