Data exploration and modelling is a process in which many data artifacts are produced: subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work on, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. Archivist allows you to store selected artifacts as binary files together with their metadata and relations. Archivist allows you to share artifacts with others, either through a shared folder or GitHub. Archivist allows you to look for already created artifacts by their class, name, date of creation or other properties, and makes it easy to restore such artifacts. Archivist allows you to check whether a new artifact is an exact copy of one that was produced some time ago, which might be useful for either testing or caching.
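The store, search and restore workflow described above can be sketched as follows. This is a minimal illustration, assuming a recent archivist 2.x installation in which `createLocalRepo()`, `saveToLocalRepo()`, `searchInLocalRepo()`, `loadFromLocalRepo()` and `deleteLocalRepo()` are available; the repository path and the example artifact are made up for the demonstration.

```r
library(archivist)

# Hypothetical throwaway repository in a fresh temporary path.
repo <- tempfile("archivist-demo-")
createLocalRepo(repoDir = repo)

# Archive a small artifact. The call returns the artifact's MD5 hash;
# as.character() drops any attributes attached to it.
# Session info is skipped here only to keep the sketch light.
artifact <- head(iris, 10)
hash <- as.character(
  saveToLocalRepo(artifact, repoDir = repo, archiveSessionInfo = FALSE)
)

# Look the artifact up by one of its automatically generated tags.
found <- searchInLocalRepo("class:data.frame", repoDir = repo)

# Restore the artifact from the repository by its hash.
restored <- loadFromLocalRepo(hash, repoDir = repo, value = TRUE)

# Clean up the demo repository, including its root directory.
deleteLocalRepo(repoDir = repo, deleteRoot = TRUE)
```

Because the hash is computed from the artifact itself, comparing hashes is one way to check whether a newly created object is an exact copy of an archived one.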
https://raw.githubusercontent.com service is closed, this is why we need to update hooks to
`cache()` function due to (#327). Effect of the bytecompiler needs further research.
`saveToRepo()` now has an additional parameter `use_flocks`. If set up to
`flock` package is used to synchronize access to the database (#322).
`areadLocal()` that works as `aread()` for selected local repositories (#298).
`force` now has a different meaning in `createLocalRepo()`. As suggested in (#319) and (#318), it forces overriding an existing backpack.db file.
`adigest()` allows using different hash functions (#323).
`dbExecute()` is used instead of `dbGetQuery()`, as is recommended in the latest version of
`removeTagsRepo()` is added. It allows removing specific tags from specific objects. Note that you cannot specify tags via regular expressions (to avoid accidental deletes). You can always use `getTagsLocal()` to get the list of available tags.
`setPostgresRepo()`. Note that it's still an experimental feature.
`asearch` examples due to new version of ggplot2 - 2.2.0 [#296]
`asave` examples due to new version of ggplot2 - 2.2.0 [#300]
`loadFromLocalRepo()` are now handling URL addresses as well. This may be useful to access artifacts generated by the shiny app.
`%a%` archives proper names of first object so does
`ahistory` prints proper name of archived artifact instead of
`latex` format as it has new
`atrace()` function is added. It calls the `trace()` function to store a selected object in the repository after each call to the specified FUN (for example 'lm').
`restoreLibs()` can now restore libraries in a custom directory. [#251]
`maxTags` parameter so that gallery's summaries in the `README.md` files now have limited chunk length. [#249]
`restoreLibs()` function is added. It recovers previous versions of R packages. Needed due to rapid changes in the structure of `ggplot2` objects. Now one can restore a version of the `ggplot2` package consistent with the archived object.
`RemoteRepoCheck` is used to verify if parameters for remote repo are correct.
`asession` returns session info for a given artifact (similar to aread).
`aformat` returns a vector of formats in which the artifact is saved (similar to aread).
`saveToRepo` by default saves session info.
`repoDirGit` has changed name to `subdir` and the default value is now '/'.
`alink` is now working with github and bitbucket repositories.
`asearch` returns a named list of artifacts. MD5 hashes are used as names.
`silent=TRUE` by default in `saveToRepo`. Fewer warnings.
`saveToRepo` now has two copies, consistent with other names: `saveToLocalRepo` and a short one
`pullGitHubRepo` have been moved to a separate `archivist.github` package to maintain Local/Remote consistency. [#198].
`deleteRepo` was deprecated. Use
`createEmptyRepo` were deprecated. Use
`rmFromRepo` was deprecated. Use
`multiSearchInLocalRepo` and its remote version were deprecated. Now multiple patterns are available in
`alink` function: Returns a Link To Download an Artifact Stored on GitHub Repository. Ideal combination with
`pushRepo` function, which adds files, commits them and pushes from the Local `Repository` to the synchronized GitHub one. [#146].
git pull) changes from remote GitHub `Repository` to the corresponding Local one. [#146].
`createGithubMDGallery` that gives the markdown summary for each artifact in the repository. Ideal for a README.md file. Example [#144]
`asearch` function enables a user to read artifacts from the default GitHub repository. In the previous version this was possible only for the default local repository.
`aoptions('repo/repoDir', NULL, unset = TRUE)` [#176].
`asearch` completely new example section divided into 3 subsections: default local repository, default GitHub repository and GitHub repository.
`htest` object's data is now saved to the repository as a list.
`devtools::session_info()` with an artifact during the execution of
`format:` is now added to every artifact/miniature. Artifacts can be saved in different (and more than one) formats (rda/json/csv), which makes them easier to access from other languages.
New and renamed parameters:
`createEmptyGithubRepo` were changed into
`createEmptyGithubRepo` can now use `repoDir` to specify in which directory the synchronized Local Repository should be created [#142].
`archive` no longer cats hook to the artifact during the execution. Hook cat can be set with new `alink` parameter that uses `alink()` function, where parameters can be passed with
`deleteRepo` now has a new `unset` parameter that allows unsetting global `aoptions('repoDir', NULL, unset = TRUE)` when the deleted `repoDir` was a globally specified Repository [#157].
`repoDir` to maintain consistency within package documentation and naming convention.
`cloneGithubRepo` now reacts to the new `default` parameter, which sets newly created/cloned repositories (the GitHub one and the Local one synchronized with it) as default [#171, #142].
`ahistory()` to maintain consistency with `alink`. Now the
`createEmptyGithubRepo` function. We also added `createEmptyLocalRepo` to maintain consistency with other sister functions.
`createEmptyRepo` is now a wrapper around
`createEmptyGithubRepo` functions.
2. One can now clone a GitHub-archivist repo with the new `cloneGithubRepo` function.
3. One can automatically archive artifacts to Local and synchronized GitHub archivist-like Repositories with the new `archive` function. Example: https://github.com/MarcinKosinski/archive-test4/commits/master
4. Added a manual page to enable easier usage of this integration: `?archivist-github-integration` (or shorter `?agithub`).
`splitTagsGithub` enabling to split `tag` column in database into two separate columns:
1. `checkDirectory` function is now immune to directories that don't exist. This made `showLocalRepo` function work properly when passed a directory that does not exist.
2. Changed `dbDisconnect( conn )` call to `on.exit(dbDisconnect( conn ))` in `executeSingleQuery` function to prevent a situation in which, during an error inside the function (which might be produced), the connection stays open when it shouldn't.
operator does react on `default = TRUE`
function.
4. `deleteRoot = TRUE` argument of the `deleteRepo` function works properly and enables removing the root directory of the Repository.
5. Some changes in `rmFromRepo`'s body:
   1. Function will give a warning when a user uses a wrong md5hash (that does not exist in the `Repository`). In case of a wrong md5hash abbreviation a user will receive an error message.
   2. Artifacts' data is now removed from the tag table in `backpack.db` file when `many = TRUE`. They were not removed before.
   3. Artifacts' data files are now removed from `gallery` folder. They were not removed before.
   4. `invisible(NULL)` is the result of the function evaluation.
6. Some changes in `copyRepo`'s body:
   1. `invisible(NULL)` is the result of the function evaluation.
   2. `repoFrom` is set to `NULL` as default.
7. `copyFromLocalRepo` copies only distinct records for table `tag`
   file, that can be seen with `showRepo` and copies all mentioned artifacts for local version.
8. `downloadDB` function gives a user-friendly error.
9. In `zipGithubRepo` the unzipped file has the same name as the zip file. Earlier it had the name of the temporary file, which was difficult to notice.
10. In `setGithubRepo` it is now possible to use the repoDirGit parameter. Before there was wrong `stopifnot`
was replaced by `file.path()` in appropriate places of function bodies in the following R scripts: `archive.R`.
12. Two crucial parts of `checkDirectory`'s function body were removed due to changes in point 11. `checkDirectory2` was completely removed as it is unnecessary now.
13. Small change in `test_base_functionalities.R` due to changes in point 11 and 12.
14. `aoptions` will work properly with `showGithubRepo` and `summaryGithubRepo` when set. It might not have been noticed in version 1.7; it might have been a bug that occurred in development between versions 1.7 and 1.8.
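The `on.exit(dbDisconnect( conn ))` change in the list above relies on a general R idiom: a cleanup expression registered with `on.exit()` runs when the function exits, whether it returns normally or fails with an error. The sketch below illustrates the idiom with a plain file connection from base R instead of a database connection; `read_then_fail()` and the `log` environment are hypothetical names used only for this demonstration and are not part of archivist.

```r
# Hypothetical helper: opens a connection, registers cleanup, then fails.
read_then_fail <- function(path, log) {
  conn <- file(path, open = "r")
  # Registered right after opening, so it runs on normal return and on error.
  on.exit({
    close(conn)
    log$closed <- TRUE
  })
  readLines(conn, n = 1)
  stop("simulated failure after the connection was opened")
}

# Prepare a small input file and an environment that records the cleanup.
path <- tempfile(fileext = ".txt")
writeLines("hello", path)
log <- new.env()
log$closed <- FALSE

# The error is caught here; the on.exit() handler has already run by then.
result <- tryCatch(
  read_then_fail(path, log),
  error = function(e) conditionMessage(e)
)
unlink(path)
```

Registering the cleanup immediately after the resource is opened means no later code path can skip it, which is the situation the changelog entry describes for open database connections.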
1. `print.ahistory` function can now print outputs of the artifact's history as the `knitr::kable` would.
2. Examples for `searchInGithubRepo` now work for `repo='archivist'` parameters as we added a new backpack.db file. The previous one was almost empty (for 7 months).
3. Additional examples to better understand usage of archivist package functions:
   1. in `loadFromRepo` function - loading artifacts from the repository which is built into the archivist package and saving them in the example repository.
   2. in `createEmptyRepo` function - creating a default local Repository in a non-existing directory.
   3. in `rmFromRepo` function - removing artifacts with `many = TRUE` argument.
   4. in `deleteRepo` function - using `deleteRoot = TRUE` argument.
   5. in `copy*Repo` function - using graphGallery local repository in `copyLocalRepo` function.
   6. in `get*Tags` function - additional example using `getTagsLocal` function.
   7. in `aoptions` function - added two new examples concerning usage of `repoDir` parameters in this function.
4. Alterations in the text of: `?aread` documentation pages.
5. Adding missing functions which are now used in the archivist package to `?Repository` documentation page.
6. `tempdir()` was replaced by `tempfile()` in examples sections of:
   `tempdir` is an existing directory in which R works, so calling `deleteRepo( exampleRepoDir, deleteRoot=TRUE)` removed important R files.
7. New tests for the following functions: `zip*Repo`.
8. In order to obtain cohesion with `Tags` in all functions, the following convention has been established:
   1. If we use `Tags` in the text of a function's documentation or in examples' comments, then `Tags` are considered a proper name and begin with a capital letter.
   2. If we use `tags` in a function's body, as parameters, or as R objects' attributes, then they begin with a small letter.
9. Added checking if parameters have appropriate lengths in the following functions' bodies:
The order of parameters in asearch has changed!
Added graphGallery for self-contained examples
aread allows for a single MD5 hash (which will be read from the default repo)
asearch allows for patterns only (they will be searched in the local repo)
ahistory now has an 'artifact' argument instead of 'obj'
Removed unnecessary dependencies - now archivist is free of dependencies.
The shiny package is in Suggests, so you should load that package before running the shinySearchInLocalRepo function.
saveSetToRepo with a new function
loadSetFromRepo to the
...should be updated...
...should be updated...
`setGithubRepo` functions. ...should be updated...