Title: | Retrieve and Analyze Clinical Trials in Public Registers |
---|---|
Description: | A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/>, also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Documents in registers associated with trials can also be downloaded. Other functions identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for meta-analysis and trend-analysis of the design and conduct as well as of the results of clinical trials. |
Authors: | Ralf Herold [aut, cre] |
Maintainer: | Ralf Herold <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.19.5.9000 |
Built: | 2024-11-10 18:17:41 UTC |
Source: | https://github.com/rfhb/ctrdata |
A package for aggregating and analysing information on clinical studies, and for obtaining documents, from public registers
Package ctrdata retrieves trial information and stores it in a database collection, which has to be given as a connection object to parameter con for several ctrdata functions; this connection object is created in almost identical ways for these supported backends:
Database | Connection object |
---|---|
MongoDB | dbc <- nodbi::src_mongo(db = "my_db", collection = "my_coll") |
SQLite | dbc <- nodbi::src_sqlite(dbname = "my_db", collection = "my_coll") |
PostgreSQL | dbc <- nodbi::src_postgres(dbname = "my_db"); dbc[["collection"]] <- "my_coll" |
DuckDB | dbc <- nodbi::src_duckdb(dbname = "my_db", collection = "my_coll") |
Use a connection object with a ctrdata function, for example dbQueryHistory, or other packages, for example mongolite::mongo or nodbi::docdb_query.
Use a demo database:
dbc <- nodbi::src_sqlite(dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"), collection = "my_trials")
ctrOpenSearchPagesInBrowser, ctrLoadQueryIntoDb (load trial records into database collection); see ctrdata-registers for details on registers and how to search.
dbFindFields (find names of fields of interest in trial records in a collection), dbGetFieldsIntoDf (create a data frame with fields of interest from collection), dbFindIdsUniqueTrials (get de-duplicated identifiers of clinical trials' records that can be used to subset a data frame).
dfTrials2Long (convert fields with nested elements into long format), dfName2Value (get values for variable(s) of interest).
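The functions listed above are typically chained. The following is a minimal sketch of such a chain, assuming that packages ctrdata, nodbi and RSQLite are installed; it uses only the demo collection and the function signatures documented in this manual, and it does nothing if a package is missing. The variable names (have_pkgs, status_fields, df_unique) are chosen for illustration only.

```r
# Sketch only: the workflow runs if the required packages are installed
have_pkgs <- all(vapply(
  c("ctrdata", "nodbi", "RSQLite"),
  requireNamespace, logical(1), quietly = TRUE
))

if (have_pkgs) {
  # Connect to the demo collection shipped with ctrdata
  dbc <- nodbi::src_sqlite(
    dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
    collection = "my_trials"
  )

  # 1. Find names of fields of interest
  status_fields <- ctrdata::dbFindFields(namepart = "status", con = dbc)

  # 2. Create a data frame with fields of interest
  df <- ctrdata::dbGetFieldsIntoDf(
    fields = c("overall_status", "x5_trial_status"),
    con = dbc
  )

  # 3. Keep only one record per trial
  ids <- ctrdata::dbFindIdsUniqueTrials(con = dbc)
  df_unique <- df[df[["_id"]] %in% ids, , drop = FALSE]
}
```

Loading new records with ctrLoadQueryIntoDb is left out of the sketch because it requires network access to a register.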
Ralf Herold [email protected]
Useful links:
Report bugs at https://github.com/rfhb/ctrdata/issues
Registers of clinical trials from which protocol- and result-related information can be retrieved and analysed with package ctrdata, last updated 2024-11-10.
EUCTR: The EU Clinical Trials Register contains more than 44,200 clinical trials (at least one investigational medicinal product, IMP; in the European Union and beyond; no new trials, but results for contained trials continue to be added)
CTIS: The EU Clinical Trials Information System started in January 2023 for new clinical trials. It includes more than 7,300 publicly accessible trials. How to automatically get the CTIS search query URL: here
CTGOV2: ClinicalTrials.gov includes more than 515,000 interventional and observational studies
ISRCTN: The ISRCTN Registry includes more than 25,500 interventional and observational health studies
CTGOV was retired on 2024-06-25; ctrdata subsequently translates CTGOV queries to CTGOV2 queries. The new website (CTGOV2) can be used with ctrdata since 2023-08-27. CTIS was relaunched on 2024-06-17, changing the data structure and search syntax, to which ctrdata was updated. CTIS can be used with ctrdata since 2023-03-25. More information on changes: here.
Material | EUCTR | CTGOV2 | ISRCTN | CTIS |
---|---|---|---|---|
Home page | link | link | link | link |
About | link | link | link | link |
Terms & conditions, disclaimer | link | link | link | link |
How to search | link | link | link | link |
Search interface | link | link | link | link |
Expert / advanced search | link | link | link | link |
Glossary | link | link | link | |
FAQ, caveats, issues | link | link | link | link |
Definitions | link | link | link | link |
Example* | link | link | link | link |
*The example is an expert search for interventional trials primarily with neonates, investigating infectious conditions. It shows that searches in registers may not be sufficient to identify the sought trials:
The CTGOV2 search retrieves trials conducted exclusively with neonates.
EUCTR retrieves trials with neonates, but not only those exclusively in neonates.
ISRCTN retrieves studies with interventions other than medicines.
CTIS retrieves trials that mention the words neonates and infection.
To address this, trials can be retrieved with ctrLoadQueryIntoDb into a database collection and in a second step can be selected, based on values of relevant fields of all retrieved trial information, for example:
EUCTR field f115_children_211years for age criteria
ISRCTN field interventions.intervention.interventionType for type of study
CTIS fields ageGroup and authorizedApplication.authorizedPartI.medicalConditions.medicalCondition
ctrdata helps to identify fields with function dbGetFieldsIntoDf.
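The second selection step can be sketched in base R as follows. The data frame stands in for the output of dbGetFieldsIntoDf; its trial identifiers and values are made up for illustration, and only the field name f115_children_211years is taken from the text above.

```r
# Hypothetical data frame, as might be obtained for an EUCTR age field;
# identifiers and values are invented for this sketch
trials <- data.frame(
  `_id` = c("2010-000000-00-DE", "2011-000000-00-FR", "2012-000000-00-IT"),
  f115_children_211years = c(TRUE, FALSE, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep trials that explicitly do not include children aged 2 to 11 years;
# which() drops records in which the field is missing (NA)
selected <- trials[which(!trials$f115_children_211years), , drop = FALSE]
selected[["_id"]]
```

Whether records with missing values should be kept or dropped depends on the analysis; the sketch drops them.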
Ralf Herold [email protected]
An active substance can be identified by a recommended international nonproprietary name (INN), a trade or product name, or one or more company codes. To find likely synonyms, the function retrieves from CTGOV2 the field protocolSection.armsInterventionsModule.interventions.otherNames. Note that the result is not free of error and should be checked manually.
ctrFindActiveSubstanceSynonyms(activesubstance = "", verbose = FALSE)
activesubstance |
An active substance, in an atomic character vector |
verbose |
Print number of studies found in CTGOV2 for 'activesubstance' |
A character vector of the active substance (input parameter) and synonyms, or NULL if active substance was not found and may be invalid
## Not run:

ctrFindActiveSubstanceSynonyms(activesubstance = "imatinib")
# [1] "imatinib" "Carcemia" "Cemivil"
# [4] "CGP 57148" "CGP-57148B" "CGP57148B"
# [7] "Gleevac" "gleevec" "Gleevec (Imatinib Mesylate)"
# [10] "Glevec" "glivec" "Imatinib"
# [13] "imatinib mesylate" "Imatinib-AFT" "IND # 55666"
# [16] "NSC #716051" "NSC-716051" "QTI571"
# [19] "ST1571" "STI 571" "STI-571"
# [22] "STI571" "tyrosine kinase inhibitors"

## End(Not run)
Extracts query parameters and register name from parameter 'url' or from the clipboard, into which the URL of a register search was copied.
ctrGetQueryUrl(url = "", register = "")
url |
URL such as from the browser address bar. If not specified, clipboard contents will be checked for a suitable URL. For automatically copying the user's query of a register in a web browser to the clipboard, see here. Can also contain a query term such as from dbQueryHistory()["query-term"]. |
register |
Optional name of register (one of "EUCTR", "CTGOV2", "ISRCTN" or "CTIS") in case 'url' is a query term but not a full URL |
A data frame (or tibble, if tibble is loaded) with column names 'query-term' and 'query-register'. The data frame (or tibble) can be passed as such as parameter 'queryterm' to ctrLoadQueryIntoDb and as parameter 'url' to ctrOpenSearchPagesInBrowser.
# user copied into the clipboard the URL from
# the address bar of the browser that shows results
# from a query in one of the trial registers
if (interactive()) try(ctrGetQueryUrl(), silent = TRUE)

# extract query parameters from search result URL
# (URL was cut for the purpose of formatting only)
ctrGetQueryUrl(
  url = paste0(
    "https://classic.clinicaltrials.gov/ct2/results?",
    "cond=&term=AREA%5BMaximumAge%5D+RANGE%5B0+days%2C+28+days%5D",
    "&type=Intr&rslt=&age_v=&gndr=&intr=Drugs%2C+Investigational",
    "&titles=&outc=&spons=&lead=&id=&cntry=&state=&city=&dist=",
    "&locn=&phase=2&rsub=&strd_s=01%2F01%2F2015&strd_e=01%2F01%2F2016",
    "&prcd_s=&prcd_e=&sfpd_s=&sfpd_e=&rfpd_s=&rfpd_e=&lupd_s=&lupd_e=&sort="
  )
)

ctrGetQueryUrl("https://www.clinicaltrialsregister.eu/ctr-search/trial/2007-000371-42/results")
ctrGetQueryUrl("https://euclinicaltrials.eu/ctis-public/view/2022-500041-24-00")
ctrGetQueryUrl("https://classic.clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/study/NCT01467986?aggFilters=ages:child")
ctrGetQueryUrl("https://www.isrctn.com/ISRCTN70039829")
Retrieves information on clinical trials from registers and stores it in a collection in a database. Main function of ctrdata for accessing registers. A collection can store trial information from different queries or different registers. Query details are stored in the collection and can be accessed using dbQueryHistory. A previous query can be re-run, which replaces or adds trial records while keeping any user annotations of trial records.
ctrLoadQueryIntoDb(
  queryterm = NULL,
  register = "",
  querytoupdate = NULL,
  forcetoupdate = FALSE,
  euctrresults = FALSE,
  euctrresultshistory = FALSE,
  ctgov2history = FALSE,
  documents.path = NULL,
  documents.regexp = "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ ",
  annotation.text = "",
  annotation.mode = "append",
  only.count = FALSE,
  con = NULL,
  verbose = FALSE,
  ...
)
queryterm |
Either a string with the full URL of a search
query in a register, or the data frame returned by the
ctrGetQueryUrl or the
dbQueryHistory functions, or, together with parameter
|
register |
String with abbreviation of register to query,
either "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Not needed
if |
querytoupdate |
Either the word "last", or the row number of a query in the data frame returned by dbQueryHistory that should be run to retrieve any new or updated trial records since this query was last run.
This parameter takes precedence over |
forcetoupdate |
If |
euctrresults |
If |
euctrresultshistory |
If |
ctgov2history |
For trials from CTGOV2, retrieve historic
versions of the record. Default is |
documents.path |
If this is a relative or absolute
path to a directory that exists or can be created,
save any documents into it that are directly available from
the register ("EUCTR", "CTGOV2", "ISRCTN", "CTIS")
such as PDFs on results, analysis plans, spreadsheets,
patient information sheets, assessments or product information.
Default is |
documents.regexp |
Regular expression, case insensitive,
to select documents by filename, if saving documents is requested
(see |
annotation.text |
Text to be included in the field "annotation" in the records retrieved with the query that is to be loaded into the collection. The contents of the field "annotation" for a trial record are preserved, e.g., when running this function again and loading a record of a trial with an annotation, see parameter
|
annotation.mode |
One of "append" (default), "prepend" or "replace" for new annotation.text with respect to any existing annotation for the records retrieved with the query that is to be loaded into the collection. |
only.count |
Set to |
con |
A connection object, see section 'Databases' in ctrdata. |
verbose |
Printing additional information if set to
|
... |
Do not use (capture deprecated parameters). |
A list with elements 'n' (number of trial records newly imported or updated), 'success' (a vector of _id's of successfully loaded records), 'failed' (a vector of identifiers of records that failed to load) and 'queryterm' (the query term used). The returned list has several attributes (including database and collection name, as well as the query history of this database collection) to facilitate documentation.
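To preview which documents a given documents.regexp would select, the expression can be tried on file names with base R. This is a minimal sketch: the regular expression is the default shown in the usage above, the file names are hypothetical, and the use of grepl here is an assumption about the matching, not a description of how ctrdata applies the expression internally.

```r
# Default value of documents.regexp, copied from the usage of ctrLoadQueryIntoDb
documents_regexp <- "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ "

# Hypothetical file names, for illustration only
filenames <- c(
  "Study_Protocol.pdf",
  "SAP_001.pdf",
  "Invoice.pdf",
  "12 Summary of results.pdf",
  "Lay summary.pdf"
)

# Case-insensitive matching, as described for parameter documents.regexp
selected <- grepl(documents_regexp, filenames, ignore.case = TRUE)
filenames[selected]
```

In this sketch, only "Invoice.pdf" is not selected, because none of the alternatives of the expression occurs in it.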
## Not run:

dbc <- nodbi::src_sqlite(collection = "my_collection")

# Retrieve protocol- and results-related information
# on two specific trials identified by their EU number
ctrLoadQueryIntoDb(
  queryterm = "2005-001267-63+OR+2008-003606-33",
  register = "EUCTR",
  euctrresults = TRUE,
  con = dbc
)

# Count ongoing interventional cancer trials involving children
# Note this query is a classical CTGOV query and is translated
# to a corresponding query for the current CTGOV2 webinterface
ctrLoadQueryIntoDb(
  queryterm = "cond=cancer&recr=Open&type=Intr&age=0",
  register = "CTGOV",
  only.count = TRUE,
  con = dbc
)

# Retrieve all information on more than 40 trials
# that are labelled as phase 3 and that mention
# either neuroblastoma or lymphoma from ISRCTN,
# into the same collection as used before
ctrLoadQueryIntoDb(
  queryterm = paste0(
    "https://www.isrctn.com/search?",
    "q=neuroblastoma+OR+lymphoma&filters=phase%3APhase+III"),
  con = dbc
)

# Retrieve information on trials in CTIS mentioning neonates
ctrLoadQueryIntoDb(
  queryterm = paste0("https://euclinicaltrials.eu/ctis-public/",
    "search#searchCriteria={%22containAll%22:%22%22,",
    "%22containAny%22:%22neonates%22,%22containNot%22:%22%22}"),
  con = dbc
)

## End(Not run)
Open advanced search pages of register(s), or execute search in browser
ctrOpenSearchPagesInBrowser(url = "", register = "", copyright = FALSE)
url |
of search results page to show in the browser. To open the
browser with a previous search, the output of ctrGetQueryUrl
or dbQueryHistory can be used. Can be left as empty string
(default) to open the advanced search page of |
register |
Register(s) to open, "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Default is empty string, and this opens the advanced search page of the register(s). |
copyright |
(Optional) If set to |
(String) Full URL corresponding to the shortened url in conjunction with register if any, or invisibly TRUE if no url is specified.
# Open all and check copyrights before using registers
ctrOpenSearchPagesInBrowser(copyright = TRUE)

# Open specific register advanced search page
ctrOpenSearchPagesInBrowser(register = "CTGOV2")
ctrOpenSearchPagesInBrowser(register = "CTIS")
ctrOpenSearchPagesInBrowser(register = "EUCTR")
ctrOpenSearchPagesInBrowser(register = "ISRCTN")

# Open all queries that were loaded into demo collection
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

dbh <- dbQueryHistory(
  con = dbc
)

for (r in seq_len(nrow(dbh))) {
  ctrOpenSearchPagesInBrowser(dbh[r, ])
}
Given part of the name of a field of interest to the user, this function returns the full field names used in records that were previously loaded into a collection (using ctrLoadQueryIntoDb). Only names of fields that have a value in the collection can be returned. Set sample = FALSE to force screening all records in the collection for field names, see below.
dbFindFields(namepart = ".*", con, sample = TRUE, verbose = FALSE)
namepart |
A character string (can be a regular expression, including Perl-style) to be searched among all field names (keys) in the collection, case-insensitive. The default '".*"' lists all fields. |
con |
A connection object, see section 'Databases' in ctrdata. |
sample |
If |
verbose |
If |
The full names of child fields are returned in dot notation (e.g., clinical_results.outcome_list.outcome.measure.class_list.class.title). In addition, names of parent fields (e.g., clinical_results) are returned. Data in parent fields is typically complex (nested), see dfTrials2Long for easily handling it. For field definitions of the registers, see "Definition" in ctrdata-registers.
Note: When dbFindFields is first called after ctrLoadQueryIntoDb, it will take a moment.
Vector of strings with full names of field(s) found, ordered by register and alphabet, see examples. Names of the vector are the names of the register holding the respective fields. The field names can be fed into dbGetFieldsIntoDf to extract the data for the field(s) from the collection into a data frame.
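The matching of namepart against field names can be illustrated with base R alone. The sketch below assumes a small named vector shaped like the return value described above; the field names are taken from elsewhere in this manual, their assignment to registers is for illustration, and find_fields is a hypothetical helper, not a ctrdata function.

```r
# Field names as values and register names as vector names,
# mirroring the shape of the value returned by dbFindFields
fields <- c(
  EUCTR  = "f115_children_211years",
  EUCTR  = "x5_trial_status",
  CTGOV  = "overall_status",
  ISRCTN = "interventions.intervention.interventionType",
  CTIS   = "ageGroup"
)

# Hypothetical helper: case-insensitive, Perl-style regular expression match
find_fields <- function(namepart, fields) {
  fields[grepl(namepart, fields, ignore.case = TRUE, perl = TRUE)]
}

find_fields("STATUS", fields)
find_fields("^age|children", fields)
```

The first call returns the two status fields regardless of letter case; the second shows how alternatives and anchors in a regular expression narrow the search to age-related fields.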
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

dbFindFields(namepart = "date", con = dbc)[1:5]

# view all 3350+ fields from all registers:
allFields <- dbFindFields(con = dbc, sample = FALSE)

if (interactive()) View(data.frame(
  register = names(allFields),
  field = allFields))
Records for a clinical trial can be loaded from more than one register into a collection. This function returns deduplicated identifiers for all trials in the collection, respecting the register(s) preferred by the user. All registers also record identifiers from other registers, which this function uses to provide a vector of identifiers of deduplicated trials.
dbFindIdsUniqueTrials(
  preferregister = c("EUCTR", "CTGOV", "CTGOV2", "ISRCTN", "CTIS"),
  prefermemberstate = "DE",
  include3rdcountrytrials = TRUE,
  con,
  verbose = FALSE
)
preferregister |
A vector of the order of preference for
registers from which to generate unique _id's, default
|
prefermemberstate |
Code of single EU Member State for which records should be returned. If not available, a record for DE or, lacking this, any random Member State's record for the trial will be returned.
For a list of codes of EU Member States, please see vector
|
include3rdcountrytrials |
A logical value if trials should be retained
that are conducted exclusively in third countries, that is, outside
the European Union. Ignored if |
con |
A connection object, see section 'Databases' in ctrdata. |
verbose |
If |
Note that the content of records may differ between registers (and, for "EUCTR", between records for different Member States). Such differences are not considered by this function.
A named vector with strings of keys (field "_id") of records in the collection that represent unique trials, where names correspond to the register of the record.
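The returned vector is typically used to subset a data frame of trial records. A minimal base R sketch follows; both the data frame and the vector of identifiers are made up for illustration and only mimic the shapes described in this manual.

```r
# Hypothetical data frame with one row per record; the first two rows
# are assumed to be records of the same trial in two registers
df <- data.frame(
  `_id` = c("2010-000000-00-DE", "NCT00000000", "ISRCTN00000000"),
  status = c("Completed", "Completed", "Ongoing"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical result of dbFindIdsUniqueTrials(): a named vector of "_id"
# values, where names are the registers of the retained records
ids <- c(EUCTR = "2010-000000-00-DE", ISRCTN = "ISRCTN00000000")

# Keep only records that represent unique trials
df_unique <- df[df[["_id"]] %in% ids, , drop = FALSE]
df_unique
```

Because the data frame keeps the records of the preferred register, the order given in preferregister decides which register's field values end up in the analysis.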
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

dbFindIdsUniqueTrials(con = dbc)
Fields in the collection are retrieved from all records into a data frame (or tibble). Within a given trial record, a field can be hierarchical and structured, that is, nested. The function uses the field names to appropriately type the values that it returns, harmonising original values (e.g. "Information not present in EudraCT" to 'NA', "Yes" to 'TRUE', "false" to 'FALSE', date strings to dates or time differences, number strings to numbers). The function simplifies the structure of nested data and may concatenate multiple strings in a field using " / " (see example) and may widen the returned data frame with additional columns that were recursively expanded from simply nested data (e.g., "externalRefs" to columns "externalRefs.doi", "externalRefs.eudraCTNumber" etc.). For an alternative way of handling the complex nested data, see dfTrials2Long followed by dfName2Value for extracting the sought variable(s).
dbGetFieldsIntoDf(fields = "", con, verbose = FALSE, ...)
fields |
Vector of one or more strings, with names of sought fields. See function dbFindFields for how to find names of fields. Dot path notation ("field.subfield") without indices is supported. If compatibility with 'nodbi::src_postgres()' is needed, specify fewer than 50 fields; consider also using parent fields, e.g., '"a.b"' instead of 'c("a.b.c.d", "a.b.c.e")', and then accessing sought fields with dfTrials2Long followed by dfName2Value or other R functions. |
con |
A connection object, see section 'Databases' in ctrdata. |
verbose |
Printing additional information if set to |
... |
Do not use (captures deprecated parameter |
A data frame (or tibble, if tibble is loaded) with columns corresponding to the sought fields. A column for the records' '_id' will always be included. The returned data frame has at most as many rows as there are trial records in the database collection.
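Two of the simplifications described for this function, concatenating multiple strings with " / " and converting "Yes" to TRUE, can be sketched in base R. The records, the field is_rct and the conversion rules below are assumptions made for illustration; the sketch does not reproduce the type conversions that ctrdata applies based on field names.

```r
# Two hypothetical trial records with a list-valued field (keyword)
# and a yes/no field (is_rct, an invented name)
raw <- list(
  list(`_id` = "A", keyword = list("asthma", "children"), is_rct = "Yes"),
  list(`_id` = "B", keyword = list("sepsis"), is_rct = "No")
)

flat <- data.frame(
  `_id` = vapply(raw, function(r) r[["_id"]], character(1)),
  # multiple strings in a field are concatenated with " / "
  keyword = vapply(
    raw,
    function(r) paste(unlist(r[["keyword"]]), collapse = " / "),
    character(1)
  ),
  # "Yes" becomes TRUE, any other value FALSE in this simplified sketch
  is_rct = vapply(raw, function(r) r[["is_rct"]] == "Yes", logical(1)),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
flat
```

The result has one row per record, which is why the returned data frame cannot have more rows than there are records in the collection.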
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials")

# get fields that are nested within another field
# and can have multiple values with the nested field
dbGetFieldsIntoDf(
  fields = "b1_sponsor.b31_and_b32_status_of_the_sponsor",
  con = dbc)

# fields that are lists of string values are
# returned by concatenating values with a slash
dbGetFieldsIntoDf(
  fields = "keyword",
  con = dbc)
Show history of queries loaded into a database collection
dbQueryHistory(con, verbose = FALSE)
con |
A connection object, see section 'Databases' in ctrdata. |
verbose |
If |
A data frame (or tibble, if tibble is loaded) with columns: 'query-timestamp', 'query-register', 'query-records' (note: this is the number of records loaded when last executing ctrLoadQueryIntoDb, not the total record number) and 'query-term', with one row for each time that ctrLoadQueryIntoDb loaded trial records into this collection.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

dbQueryHistory(con = dbc)
Merge variables in a data frame such as returned by dbGetFieldsIntoDf into a new variable, and optionally also map its values to new levels.
dfMergeVariablesRelevel(df = NULL, colnames = "", levelslist = NULL)
df |
A data.frame with the variables (columns) to be merged into one vector. |
colnames |
A vector of names of columns in 'df' that hold the variables
to be merged, or a selection of columns as per |
levelslist |
A named list with one slice each for a new value to be used for a vector of old values (optional). |
A vector, with the type of the columns to be merged
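The idea of merging and relevelling can be sketched in base R. The sketch assumes that each row has a value in at most one of the columns to be merged; the data are made up, and the code illustrates the concept rather than the implementation of dfMergeVariablesRelevel.

```r
# Hypothetical data: each record has a status in only one of two columns
df <- data.frame(
  overall_status = c("Recruiting", NA, "Completed", NA),
  x5_trial_status = c(NA, "Ongoing", NA, "Prematurely Ended"),
  stringsAsFactors = FALSE
)

# Merge: take the value from whichever column is not missing
merged <- ifelse(
  is.na(df$overall_status), df$x5_trial_status, df$overall_status
)

# Named list: each name is a new value, each element the old values it replaces
levelslist <- list(
  ongoing = c("Recruiting", "Ongoing"),
  completed = c("Completed", "Prematurely Ended")
)

# Invert the list into a lookup from old value to new value
lookup <- setNames(
  rep(names(levelslist), lengths(levelslist)),
  unlist(levelslist, use.names = FALSE)
)
releveled <- unname(lookup[merged])
releveled
```

Old values that are absent from levelslist would become NA in this sketch, so it is worth checking the merged values against the list before relevelling.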
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

df <- dbGetFieldsIntoDf(
  fields = c("overall_status", "x5_trial_status"),
  con = dbc
)

statusvalues <- list(
  "ongoing" = c("Recruiting", "Active", "Ongoing"),
  "completed" = c("Completed", "Prematurely Ended", "Terminated"),
  "other" = c("Withdrawn", "Suspended", "No longer available")
)

dfMergeVariablesRelevel(
  df = df,
  colnames = 'contains("status")',
  levelslist = statusvalues
)
Get information for variable of interest (e.g., clinical endpoints) from long data frame of protocol- or result-related trial information as returned by dfTrials2Long. Parameters 'valuename', 'wherename' and 'wherevalue' are matched using Perl regular expressions and ignoring case.
dfName2Value(df, valuename = "", wherename = "", wherevalue = "")
df |
A data frame (or tibble) with four columns ('_id', 'identifier', 'name', 'value') as returned by dfTrials2Long |
valuename |
A character string for the name of the field that holds the value of the variable of interest (e.g., a summary measure such as "endPoints.*tendencyValue.value") |
wherename |
(optional) A character string to identify the variable of interest among those that repeatedly occur in a trial record (e.g., "endPoints.endPoint.title") |
wherevalue |
(optional) A character string with the value of the variable identified by 'wherename' (e.g., "response") |
A data frame (or tibble, if tibble is loaded) that includes the values of interest, with columns '_id', 'identifier', 'name', 'value' and 'where' (with the contents of 'wherevalue' found at 'wherename'). Contents of 'value' are strings unless all its elements are numbers. The 'identifier' is generated by function dfTrials2Long to identify matching elements, e.g., endpoint descriptions and measurements.
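How 'valuename', 'wherename' and 'wherevalue' interact can be sketched in base R on a small long data frame. The data are hypothetical and contain a single trial with simple identifiers; the identifiers generated by dfTrials2Long may be structured differently, so this is an illustration of the selection logic only.

```r
# Hypothetical long data frame for one trial with two endpoints
dflong <- data.frame(
  `_id` = "T1",
  identifier = c("1", "1", "2", "2"),
  name = c(
    "endPoints.endPoint.title",
    "endPoints.endPoint.tendencyValue.value",
    "endPoints.endPoint.title",
    "endPoints.endPoint.tendencyValue.value"
  ),
  value = c("Overall response rate", "45", "Adverse events", "12"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Step 1: identifiers of elements where 'wherename' has 'wherevalue'
where_rows <-
  grepl("endPoints.endPoint.title", dflong$name,
        perl = TRUE, ignore.case = TRUE) &
  grepl("response", dflong$value, perl = TRUE, ignore.case = TRUE)
ids <- dflong$identifier[where_rows]

# Step 2: rows that hold 'valuename' and share those identifiers
value_rows <-
  grepl("endPoints.*tendencyValue.value", dflong$name,
        perl = TRUE, ignore.case = TRUE) &
  dflong$identifier %in% ids
result <- dflong[value_rows, , drop = FALSE]
result
```

With several trials in the data frame, the match in step 2 would also have to be made within each '_id'.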
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials"
)

dfwide <- dbGetFieldsIntoDf(
  fields = c(
    ## ctgov - typical results fields
    # "clinical_results.baseline.analyzed_list.analyzed.count_list.count",
    # "clinical_results.baseline.group_list.group",
    # "clinical_results.baseline.analyzed_list.analyzed.units",
    "clinical_results.outcome_list.outcome",
    "study_design_info.allocation",
    ## euctr - typical results fields
    # "trialInformation.fullTitle",
    # "baselineCharacteristics.baselineReportingGroups.baselineReportingGroup",
    # "trialChanges.hasGlobalInterruptions",
    # "subjectAnalysisSets",
    # "adverseEvents.seriousAdverseEvents.seriousAdverseEvent",
    "endPoints.endPoint",
    "subjectDisposition.recruitmentDetails"
  ),
  con = dbc
)

dflong <- dfTrials2Long(df = dfwide)

## get values for the endpoint 'response'
dfName2Value(
  df = dflong,
  valuename = paste0(
    "clinical_results.*measurement.value|",
    "clinical_results.*outcome.measure.units|",
    "endPoints.endPoint.*tendencyValue.value|",
    "endPoints.endPoint.unit"
  ),
  wherename = paste0(
    "clinical_results.*outcome.measure.title|",
    "endPoints.endPoint.title"
  ),
  wherevalue = "response"
)
The function works with protocol- and results-related information. It converts lists and other values that are in a data frame returned by dbGetFieldsIntoDf into individual rows of a long data frame. From the resulting long data frame, values of interest can be selected using dfName2Value. The function is particularly useful for fields with complex content, such as node field "clinical_results" from CTGOV, which dbGetFieldsIntoDf returns as a multiply nested list and for which this function then converts every observation of every (leaf) field into a row of its own.
dfTrials2Long(df)
df |
Data frame (or tibble) with columns including
the trial identifier ( |
A data frame (or tibble, if tibble is loaded) with the four columns: '_id', 'identifier', 'name', 'value'
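The conversion of a nested field into one row per leaf value can be approximated in base R with unlist, which builds dot-separated names from the nesting. The record below is hypothetical, including the names nested under participant_flow, and the sketch omits the 'identifier' column that dfTrials2Long generates.

```r
# Hypothetical trial record with a multiply nested field
record <- list(
  `_id` = "T1",
  clinical_results = list(
    participant_flow = list(
      recruitment_details = "Two sites",
      group = list(title = "Arm A", description = "Drug")
    )
  )
)

# unlist() flattens the nesting; single brackets keep the top-level name,
# so that leaf names start with "clinical_results."
leaves <- unlist(record["clinical_results"])

long <- data.frame(
  `_id` = record[["_id"]],
  name = names(leaves),
  value = unname(leaves),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
long
```

Each leaf becomes a row of its own, which is the shape that dfName2Value expects for selecting values by field name.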
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials")

dfwide <- dbGetFieldsIntoDf(
  fields = "clinical_results.participant_flow",
  con = dbc)

dfTrials2Long(df = dfwide)