vignettes/datashield-admin.Rmd
datashield-admin.Rmd
Opal is the
reference implementation of the DataSHIELD infrastructure. All the
DataSHIELD administration tasks can be performed programmatically using
functions starting with dsadmin.*
:
See also the Opal DataSHIELD Administration documentation to learn how to administrate DataSHIELD using the graphical user interface.
Setup the connection with Opal:
library(opalr)
o <- opal.login("administrator", "password", "https://opal-demo.obiba.org")
Opal can handle clusters of R servers. R packages are handled at the cluster level (because all the R servers in a cluster are expected to be identical). See Opal and R server documentation.
List installed DataSHIELD R packages:
dsadmin.package_descriptions(o, profile = "default")
Install a DataSHIELD R package from the configured CRAN repositories (most likely the DataSHIELD repo):
dsadmin.install_package(o, pkg = "dsBase", profile = "default")
R packages which source code is one GitHub can be installed directly:
dsadmin.install_github_package(o, pkg = "dsSurvival", username = "neelsoumya", ref = "v1.0.0", profile = "default")
When developing a new DataSHIELD R package, it can be built and installed as follow (from the root of the R package source directory):
dsadmin.install_local_package(o, devtools::build(), profile = "default")
To remove a DataSHIELD R package:
dsadmin.remove_package(o, pkg = "dsSurvival", profile = "default")
Note that removing a package does not update the DataSHIELD settings of the associated profiles. See the following sections to administrate the profiles and their settings.
A DataSHIELD profile is based on a R servers cluster. In the most
simple setup, there is only one cluster of one R server and this cluster
is called default
, and the profile is named the same.
It is possible to define several profiles based on the same cluster. The benefit of doing this is to:
To list the DataSHIELD profiles:
To create a new DataSHIELD profile, to initialize it the DataSHIELD settings as declared by the installed packages and to enable it, use the following.
# ensure the profile does not exist
if (dsadmin.profile_exists(o, "demo"))
dsadmin.profile_delete(o, "demo")
# create a profile, disabled
dsadmin.profile_create(o, "demo", cluster = "default")
# make only dsBase and resourcer packages visible
dsadmin.profile_init(o, "demo", packages = c("dsBase", "resourcer"))
# ready to be used
dsadmin.profile_enable(o, "demo")
When a DataSHIELD R package is installed but should be used only by a restricted group of users, proceed as follow:
dsadmin.profile_perm_add(o, "demo", subject = "testers", type = "group")
# verify permissions
dsadmin.profile_perm(o, "demo")
The DataSHIELD settings are defined per profile (DataSHIELD Profiles section). The settings can be minimally initialized by reading the declared settings from the installed DataSHIELD R packages. They can also be amended afterwards.
The DataSHIELD methods define the allowed function calls and their mapping to a server side function call.
To list the aggregation functions:
dsadmin.get_methods(o, type = "aggregate", profile = "demo")
Fully custom settings can be defined (useful for developers).
dsadmin.set_method(o, "hello", func = function(x) { paste0("Hello ", x, "!") }, type = "aggregate", profile = "demo")
# verfiy custom method
dsadmin.get_method(o, "hello", type = "aggregate", profile = "demo")
A simple test of our custom hello()
function would
be:
library(DSOpal)
builder <- DSI::newDSLoginBuilder()
builder$append(server = "study1", url = "https://opal-demo.obiba.org",
user = "administrator", password = "password",
profile = "demo")
logindata <- builder$build()
conns <- DSI::datashield.login(logins = logindata)
# call the hello() function on the R server
datashield.aggregate(conns, expr = quote(hello('friends')))
datashield.logout(conns)
The DataSHIELD R options affects the behaviour of some methods.
To modify an R option:
dsadmin.set_option(o, "datashield.privacyLevel", "10", profile = "demo")
# verify options
dsadmin.get_options(o, profile = "demo")
An advanced setting, mostly for backward compatibility issues, is the possibility to chose which R parser should be used by Opal when validating the submitted R code (remember that only a subset of the R language is allowed). See datashield4j library documentation for possible values.
To set the legacy R parser:
dsadmin.profile_rparser(o, "demo", rParser = "v1")
The DataSHIELD requires permissions: permissions to access the data (whether these are in a table or a resource) and permission to use the DataSHIELD service.
To facilitate permissions maintenance, create users in appropriate group(s). Groups can represent data access and DataSHIELD service access.
if (oadmin.user_exists(o, "userx"))
oadmin.user_delete(o, "userx")
# generated password
password <- oadmin.user_add(o, "userx", groups = c("demo", "datashield"))
# verify user
subset(oadmin.users(o), name == "userx")
To set some DataSHIELD-compatible permissions (view without accessing individual-level data) to each tables of a project, use the following:
lapply(opal.tables(o, "CNSIM")$name, function(table) {
opal.table_perm_add(o, "CNSIM", table, subject = "demo", type = "group", permission = "view")
})
# verify table permissions
opal.table_perm(o, "CNSIM", "CNSIM1")
Similarly, permissions to use all the resources of a project in a DataSHIELD context is even simpler:
opal.resources_perm_add(o, "RSRC", subject = "demo", type = "group", permission = "view")
# verify permissions
opal.resources_perm(o, "RSRC")
Then grant permission to use the DataSHIELD service to a group of users:
dsadmin.perm_add(o, subject = "datashield", type = "group", permission = "use")
# verify permissions
dsadmin.perm(o)
Note that it is also possible to grant permission to access a specific DataSHIELD profile (see Profiles section).