Title: | Retrieve and Analyze Clinical Trials Data from Public Registers |
---|---|
Description: | A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', <https://www.clinicaltrialsregister.eu/>), 'ClinicalTrials.gov' (<https://clinicaltrials.gov/> and also translating queries from the retired classic interface), the 'ISRCTN' (<http://www.isrctn.com/>) and the 'European Union Clinical Trials Information System' ('CTIS', <https://euclinicaltrials.eu/>). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Protocols, statistical analysis plans, informed consent sheets and other documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for monitoring, meta- and trend-analysis of the design and conduct as well as of the results of clinical trials across registers. |
Authors: | Ralf Herold [aut, cre] |
Maintainer: | Ralf Herold <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.21.1.9000 |
Built: | 2025-04-02 19:23:56 UTC |
Source: | https://github.com/rfhb/ctrdata |
ctrdata is a package for aggregating and analysing information on clinical studies, and for obtaining documents, from public trial registers.
Package ctrdata retrieves trial information and stores it in a database collection. Therefore, a database connection object has to be given to parameter con for several ctrdata functions. The connection object is built using nodbi, which allows the use of different database backends. Specifying a collection = "<my collection's name>" is necessary for package ctrdata.
A connection object (here called dbc) is created in almost identical ways for these supported backends:
Database | Connection object |
MongoDB | dbc <- nodbi::src_mongo(db = "my_db", collection = "my_coll") |
DuckDB | dbc <- nodbi::src_duckdb(dbname = "my_db", collection = "my_coll") |
SQLite | dbc <- nodbi::src_sqlite(dbname = "my_db", collection = "my_coll") |
PostgreSQL | dbc <- nodbi::src_postgres(dbname = "my_db"); dbc[["collection"]] <- "my_coll" |
ctrOpenSearchPagesInBrowser, ctrLoadQueryIntoDb (load trial records into database collection); see ctrdata-registers for details on registers and how to search.
ctrShowOneTrial (show widget to explore structure, fields and data of a trial), dbFindFields (find names of fields of interest in trial records in a collection), dbGetFieldsIntoDf (create a data frame with fields of interest and calculated trial concepts from collection), dbFindIdsUniqueTrials (get de-duplicated identifiers of clinical trials' records to subset a data frame).
dfTrials2Long (convert fields with nested elements into long format), dfName2Value (get values for variable(s) of interest).
Ralf Herold [email protected]
Useful links:
Report bugs at https://github.com/rfhb/ctrdata/issues
Information on the four clinical trial registers from which package ctrdata can retrieve, aggregate and analyse protocol- and result-related information as well as documents, last updated 2025-03-09.
EUCTR: The EU Clinical Trials Register holds more than 44,300 clinical trials (at least one investigational medicinal product, IMP; in the European Union and beyond), including almost 25,000 trials with results, which continue to be added (can be loaded by ctrdata).
CTIS: The EU Clinical Trials Information System, launched in 2023, holds more than 8,700 publicly accessible clinical trials, including around 100 with results or a report (only as PDF files). No results in a structured electronic format are foreseeably available, thus ctrdata cannot load any CTIS results. (To automatically get CTIS search query URLs, see here)
CTGOV2: ClinicalTrials.gov holds more than 529,000 interventional and observational studies, including almost 66,000 interventional studies with results (can be loaded by ctrdata).
ISRCTN: The ISRCTN Registry holds more than 26,000 interventional and observational health studies, including almost 14,200 studies with results (only as references). No results in a structured electronic format are foreseeably available, thus ctrdata cannot load any ISRCTN results.
CTGOV "classic" was retired on 2024-06-25; ctrdata subsequently translates CTGOV queries to CTGOV2 queries. The new website ("CTGOV2") can be used with ctrdata since 2023-08-27. Database collections created with CTGOV queries can still be used since functions in ctrdata continue to support them.
CTIS was relaunched on 2024-06-17, changing the data structure and search syntax, to which ctrdata was updated. CTIS can be used with ctrdata since 2023-03-25.
EUCTR removed search parameter status= as of February 2025.
More information on changes: here.
Material | EUCTR | CTGOV2 | ISRCTN | CTIS |
About | link | link | link | link |
Terms & conditions, disclaimer | link | link | link | link |
How to search | link | link | link | link |
Search interface | link | link | link | link |
Expert / advanced search | link | link | link | link |
Glossary / related information | link | link | link | link |
FAQ, caveats, issues | link | link, link | link | link |
Data dictionaries / definitions / structure reference | link | link, link, link | link | link |
Example* | link | link | link | link |
Some registers are expanding entered search terms using dictionaries.
*The example is an expert search for interventional trials primarily with neonates, investigating treatments for infectious conditions. It shows that searches in the web interface of most registers are not sufficient to identify the trials of interest:
EUCTR retrieves trials with neonates, but not only those exclusively in neonates.
ISRCTN retrieves studies with interventions other than medicines.
CTIS retrieves trials that mention the words neonates and infection. (To show CTIS search results, see here)
To address this issue, trials can be retrieved with ctrLoadQueryIntoDb into a database collection and in a second step trials of interest can be selected based on values of relevant fields, for example:
EUCTR field f115_children_211years and other age group criteria
ISRCTN field interventions.intervention.interventionType for type of study
CTIS fields ageGroup and authorizedApplication.authorizedPartI.medicalConditions.medicalCondition
ctrdata supports users with pre-defined ctrdata-trial-concepts, which cover the example above, and with functions dbFindFields and ctrShowOneTrial for finding fields of interest and reviewing data structure, respectively.
ctrdata includes (since version 1.21.0) functions that implement selected trial concepts. Concepts of clinical trials, such as their start or status of recruitment, require analysing several fields against various pre-defined values. The structure and value sets of fields differ between all ctrdata-registers. In this situation, the implemented trial concepts simplify and accelerate a user's analysis workflow and also increase analysis consistency.
The implementation of trial concepts in ctrdata has not been validated with any formal approach, but has been checked for plausibility and against expectations. The implementation is based on current understanding, on public data models and on scientific papers, as relevant.
As with other R functions, call help("f.startDate") for documentation, or print its implementation code by entering the name of the function as a command, e.g. f.startDate.
Please raise an issue here to ask about or improve a trial concept.
The following trial concepts can be used by referencing their name when calling dbGetFieldsIntoDf (parameter calculate). Concepts will continue to be refined and added; last updated 2025-03-15.
f.controlType (factor) which type of internal or concurrent control is used in the trial? ("none", "no-treatment", "placebo", "active", "placebo+active" or "other")
f.isMedIntervTrial (logical) is the trial interventional and does it have one or more medicines (drugs or biological) as investigational (experimental) intervention? (irrespective of status of authorisation and of study design)
f.isUniqueTrial (logical) is the trial record unique in the data frame of trials, based on default parameters of dbFindIdsUniqueTrials?
f.likelyPlatformTrial (logical, list of likely related trials, and a list of possibly related trials) is the trial possibly a (research) platform trial, and what are related trials? (based on trial title, f.numTestArmsSubstances, number of periods; similarity of terms in parts of trial titles)
f.numSites (integer) how many sites does the trial have?
f.numTestArmsSubstances (integer) how many arms or groups have medicines that are investigational? (cannot be calculated for ISRCTN or for phase 1 trials)
f.primaryEndpointDescription (list of character) string containing protocol definition, details and time frames, concatenated with " == "
f.primaryEndpointResults (columns of number, character, integer) returning the statistical testing p value and method as well as the number of subjects included in the test, each in one new column, for the first primary endpoint only
f.resultsDate (date) the planned or achieved date of results availability
f.startDate (date) the planned, authorised or documented date of start of recruitment
f.sampleSize (integer) the planned or achieved number of subjects or participants recruited
f.sponsorType (factor) a type or class of main or lead sponsor that is simplified to "not for profit", "for profit" or "other"
f.statusRecruitment (factor) a status that is simplified to "ongoing" (includes temporarily halted), "completed", "ended early" (includes terminated or ended prematurely) and "other" (includes planned, stopped, withdrawn)
f.trialObjectives (string) identifies with letters those objectives that could be identified by text fragments, e.g. "E S PD D", with "E" (efficacy), "S" (safety), "D" (dose-finding)
f.trialPhase (ordered factor) the phase(s) of medicine development with which a trial is associated
f.trialPopulation (columns of factor, string and string) age groups (e.g., "P" for paediatric participants, "A" for adults, "E" for older than 65 years, or "P+A"), inclusion and exclusion criteria texts
f.trialTitle (string) full or scientific title of the study
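A minimal sketch of requesting trial concepts by name, assuming that packages ctrdata and RSQLite are installed and using the demo collection that ships with ctrdata (as in the examples elsewhere in this manual). The field "keyword" and the column ".isUniqueTrial" are taken from the example for dbFindIdsUniqueTrials; the variable name trialsDf is only illustrative, and the names of other calculated columns are best checked with names() rather than assumed.

```r
# guard so that the sketch is skipped where the packages are not installed
if (requireNamespace("ctrdata", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

  # read-only connection to the demo collection shipped with ctrdata
  dbc <- nodbi::src_sqlite(
    dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
    collection = "my_trials",
    flags = RSQLite::SQLITE_RO)

  # request trial concepts by name via parameter 'calculate'
  trialsDf <- ctrdata::dbGetFieldsIntoDf(
    fields = "keyword",
    calculate = c("f.isUniqueTrial", "f.startDate", "f.statusRecruitment"),
    con = dbc)

  # keep one record per trial; ".isUniqueTrial" is created by f.isUniqueTrial
  trialsDf <- trialsDf[trialsDf[[".isUniqueTrial"]], ]

  # list the columns that were created before relying on their names
  print(names(trialsDf))
}
```

Requesting concepts rather than register-specific fields keeps the call independent of which registers contributed records to the collection.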
An active substance can be identified by a recommended international nonproprietary name (INN), a trade or product name, or company code(s). To find likely synonyms, the function retrieves from CTGOV2 the field protocolSection.armsInterventionsModule.interventions.otherNames. Note that this field does not seem to be based on choices from a dictionary but may be filled in manually; it is thus not free of error and needs to be checked.
ctrFindActiveSubstanceSynonyms(activesubstance = "", verbose = FALSE)
activesubstance |
An active substance, in an atomic character vector |
verbose |
Print number of studies found in CTGOV2 for 'activesubstance' |
A character vector of the active substance (input parameter) and synonyms, or NULL if active substance was not found and may be invalid
## Not run:
ctrFindActiveSubstanceSynonyms(activesubstance = "imatinib")
# [1] "imatinib" "CGP 57148" "CGP 57148B"
# [4] "CGP57148B" "Gleevec" "GLIVEC"
# [7] "Imatinib" "Imatinib Mesylate" "NSC 716051"
# [10] "ST1571" "STI 571" "STI571"
## End(Not run)
From high-level search terms provided by the user, generate specific queries for each of the registers with which ctrdata works, see ctrdata-registers. Search terms that the search implementations in registers expand to concepts, such as from MeSH and MedDRA, include 'intervention' and 'condition'. Logical operators only work with 'searchPhrase'.
ctrGenerateQueries(
  searchPhrase = NULL,
  condition = NULL,
  intervention = NULL,
  phase = NULL,
  population = NULL,
  recruitment = NULL,
  startBefore = NULL,
  startAfter = NULL,
  completedBefore = NULL,
  completedAfter = NULL,
  onlyWithResults = FALSE,
  registers = c("EUCTR", "ISRCTN", "CTIS", "CTGOV2")
)
searchPhrase |
String with optional logical operators ("AND", "OR") that will be searched in selected fields of registers that can handle logical operators (general or title fields), should not include quotation marks |
condition |
String with condition / disease |
intervention |
String with intervention |
phase |
String, e.g. "phase 2" (note that "phase 2+3" is a specific category, not the union set of "phase 2" and "phase 3") |
population |
String, e.g. "P" (paediatric), "A" (adult), "P+A" (adult and paediatric), "E" (elderly), "P+A+E" participants can be recruited |
recruitment |
String, one of "ongoing", "completed", "other" (which includes "ended early" but this cannot be searched; use trial concept f.statusRecruitment to identify this status) |
startBefore |
String that can be interpreted as date, see example |
startAfter |
String that can be interpreted as date |
completedBefore |
String that can be interpreted as date (does not work with EUCTR) |
completedAfter |
String that can be interpreted as date (does not work with EUCTR) |
onlyWithResults |
Logical |
registers |
Vector of register names, default all registers |
Named vector of URLs for finding trials in the registers and as input to functions ctrLoadQueryIntoDb and ctrOpenSearchPagesInBrowser
urls <- ctrGenerateQueries(
  intervention = "antibody",
  phase = "phase 3",
  startAfter = "2000-01-01")

# open queries in register web interface
sapply(urls, ctrOpenSearchPagesInBrowser)

urls <- ctrGenerateQueries(
  searchPhrase = "antibody AND covid",
  recruitment = "completed")

# count trials found
sapply(urls, ctrLoadQueryIntoDb, only.count = TRUE)

# load queries into database collection
# sapply(urls, ctrLoadQueryIntoDb, con = dbc)

# find research platform and platform trials
urls <- ctrGenerateQueries(
  searchPhrase = paste0(
    "basket OR platform OR umbrella OR master protocol OR ",
    "multiarm OR multistage OR subprotocol OR substudy OR ",
    "multi-arm OR multi-stage OR sub-protocol OR sub-study"),
  startAfter = "2010-01-01")

# open queries in register web interface
sapply(urls, ctrOpenSearchPagesInBrowser)
Extracts query parameters and register name from parameter 'url' or from the clipboard, into which the URL of a register search was copied.
ctrGetQueryUrl(url = "", register = "")
url |
URL such as from the browser address bar. If not specified, clipboard contents will be checked for a suitable URL. For automatically copying the user's query of a register in a web browser to the clipboard, see here. Can also contain a query term such as from dbQueryHistory()["query-term"]. Can also be an identifier of a trial, which based on its format will indicate to which register it relates. |
register |
Optional name of register (one of "EUCTR", "CTGOV2", "ISRCTN" or "CTIS") in case 'url' is a query term but not a full URL |
A data frame (or tibble, if tibble is loaded) with column names 'query-term' and 'query-register'. The data frame (or tibble) can be passed as such as parameter 'queryterm' to ctrLoadQueryIntoDb and as parameter 'url' to ctrOpenSearchPagesInBrowser.
# user copied into the clipboard the URL from
# the address bar of the browser that shows results
# from a query in one of the trial registers
if (interactive()) try(ctrGetQueryUrl(), silent = TRUE)

# extract query parameters from search result URL
# (URL was cut for the purpose of formatting only)
ctrGetQueryUrl(
  url = paste0(
    "https://classic.clinicaltrials.gov/ct2/results?",
    "cond=&term=AREA%5BMaximumAge%5D+RANGE%5B0+days%2C+28+days%5D",
    "&type=Intr&rslt=&age_v=&gndr=&intr=Drugs%2C+Investigational",
    "&titles=&outc=&spons=&lead=&id=&cntry=&state=&city=&dist=",
    "&locn=&phase=2&rsub=&strd_s=01%2F01%2F2015&strd_e=01%2F01%2F2016",
    "&prcd_s=&prcd_e=&sfpd_s=&sfpd_e=&rfpd_s=&rfpd_e=&lupd_s=&lupd_e=&sort="
  )
)

# other examples
ctrGetQueryUrl("https://www.clinicaltrialsregister.eu/ctr-search/trial/2007-000371-42/results")
ctrGetQueryUrl("https://euclinicaltrials.eu/ctis-public/view/2022-500041-24-00")
ctrGetQueryUrl("https://classic.clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/study/NCT01467986?aggFilters=ages:child")
ctrGetQueryUrl("https://www.isrctn.com/ISRCTN70039829")

# using identifiers of single trials
ctrGetQueryUrl("70039829")
ctrGetQueryUrl("ISRCTN70039829")
ctrGetQueryUrl("NCT00617929")
ctrGetQueryUrl("2022-501142-30-00")
ctrGetQueryUrl("2012-003632-23")
Retrieves information on clinical trials from registers and stores it in a collection in a database. Main function of ctrdata for accessing registers. A collection can store trial information from different queries or different registers. Query details are stored in the collection and can be accessed using dbQueryHistory. A previous query can be re-run, which replaces or adds trial records while keeping any user annotations of trial records.
ctrLoadQueryIntoDb(
  queryterm = NULL,
  register = "",
  querytoupdate = NULL,
  forcetoupdate = FALSE,
  euctrresults = FALSE,
  euctrresultshistory = FALSE,
  ctgov2history = FALSE,
  documents.path = NULL,
  documents.regexp = "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ ",
  annotation.text = "",
  annotation.mode = "append",
  only.count = FALSE,
  con = NULL,
  verbose = FALSE,
  ...
)
queryterm |
Either a string with the full URL of a search
query in a register, or the data frame returned by
ctrGetQueryUrl or dbQueryHistory,
or an '_id' in the format of one of the trial registers,
or, together with |
register |
String with abbreviation of register to query,
either "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Not needed
if |
querytoupdate |
Either the word "last", or the row number of
a query in the data frame returned by dbQueryHistory that
should be run to retrieve any new or updated trial records since
this query was run the last time.
This parameter takes precedence over |
forcetoupdate |
If |
euctrresults |
If |
euctrresultshistory |
If |
ctgov2history |
For trials from CTGOV2, retrieve historic
versions of the record. Default is |
documents.path |
If this is a relative or absolute
path to a directory that exists or can be created,
save any documents into it that are directly available from
the register ("EUCTR", "CTGOV2", "ISRCTN", "CTIS")
such as PDFs on results, analysis plans, spreadsheets,
patient information sheets, assessments or product information.
Default is |
documents.regexp |
Regular expression, case insensitive,
to select documents by filename, if saving documents is requested
(see |
annotation.text |
Text to be included in the field
"annotation" in the records retrieved with the query
that is to be loaded into the collection.
The contents of the field "annotation" for a trial record
are preserved e.g. when running this function again and
loading a record of a trial with an annotation, see parameter
annotation.mode. |
annotation.mode |
One of "append" (default), "prepend" or "replace" for new annotation.text with respect to any existing annotation for the records retrieved with the query that is to be loaded into the collection. |
only.count |
Set to |
con |
A database connection object, created with
|
verbose |
Printing additional information if set to
|
... |
Do not use (capture deprecated parameters). |
A list with elements 'n' (number of trial records newly imported or updated), 'success' (a vector of _id's of successfully loaded records), 'failed' (a vector of identifiers of records that failed to load) and 'queryterm' (the query term used). The returned list has several attributes (including database and collection name, as well as the query history of this database collection) to facilitate documentation.
## Not run:
dbc <- nodbi::src_sqlite(collection = "my_collection")

# Retrieve protocol- and results-related information
# on two specific trials identified by their EU number
ctrLoadQueryIntoDb(
  queryterm = "2005-001267-63+OR+2008-003606-33",
  register = "EUCTR",
  euctrresults = TRUE,
  con = dbc
)

# Count ongoing interventional cancer trials involving children
# Note this query is a classical CTGOV query and is translated
# to a corresponding query for the current CTGOV2 web interface
ctrLoadQueryIntoDb(
  queryterm = "cond=cancer&recr=Open&type=Intr&age=0",
  register = "CTGOV",
  only.count = TRUE,
  con = dbc
)

# Retrieve all information on more than 40 trials
# that are labelled as phase 3 and that mention
# either neuroblastoma or lymphoma from ISRCTN,
# into the same collection as used before
ctrLoadQueryIntoDb(
  queryterm = paste0(
    "https://www.isrctn.com/search?",
    "q=neuroblastoma+OR+lymphoma&filters=phase%3APhase+III"),
  con = dbc
)

# Retrieve information on trials in CTIS mentioning neonates
ctrLoadQueryIntoDb(
  queryterm = paste0("https://euclinicaltrials.eu/ctis-public/",
    "search#searchCriteria={%22containAll%22:%22%22,",
    "%22containAny%22:%22neonates%22,%22containNot%22:%22%22}"),
  con = dbc
)
## End(Not run)
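The list returned by ctrLoadQueryIntoDb can be checked after a load, before continuing with an analysis. The sketch below uses a hand-made stand-in (res, with placeholder identifiers) so that it runs without access to a register; only the element names 'n', 'success', 'failed' and 'queryterm' come from the description of the return value, and everything else is an assumption for illustration.

```r
# hypothetical stand-in for the value of ctrLoadQueryIntoDb();
# identifiers and query term are placeholders, not real records
res <- list(
  n = 2L,
  success = c("id-1", "id-2"),
  failed = "id-3",
  queryterm = "placeholder-query")

# summarise the outcome of the load
loadSummary <- sprintf(
  "%i record(s) imported or updated, %i failed",
  res$n, length(res$failed))
message(loadSummary)

# collect identifiers that may need to be inspected or loaded again
toCheck <- res$failed
if (length(toCheck) > 0L) {
  message("Failed to load: ", paste(toCheck, collapse = ", "))
}
```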
Open advanced search pages of register(s), or execute search in browser
ctrOpenSearchPagesInBrowser(url = "", register = "", copyright = FALSE)
url |
of search results page to show in the browser. To open the
browser with a previous search, the output of ctrGetQueryUrl
or dbQueryHistory can be used. Can be left as empty string
(default) to open the advanced search page of |
register |
Register(s) to open, "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Default is empty string, and this opens the advanced search page of the register(s). |
copyright |
(Optional) If set to |
(String) Full URL corresponding to the shortened url in conjunction with register if any, or invisibly TRUE if no url is specified.
# Open all and check copyrights before using registers
ctrOpenSearchPagesInBrowser(copyright = TRUE)

# Open specific register advanced search page
ctrOpenSearchPagesInBrowser(register = "CTGOV2")
ctrOpenSearchPagesInBrowser(register = "CTIS")
ctrOpenSearchPagesInBrowser(register = "EUCTR")
ctrOpenSearchPagesInBrowser(register = "ISRCTN")

# Open all queries that were loaded into demo collection
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbh <- dbQueryHistory(con = dbc)

for (r in seq_len(nrow(dbh))) {
  ctrOpenSearchPagesInBrowser(dbh[r, ])
}
If used interactively, the function shows a widget of all data in the trial as a tree of field names and values. The widget opens in the default browser. Field names and values can be searched and selected. Selected fields can be copied to the clipboard for use with function dbGetFieldsIntoDf. The trial is retrieved with ctrLoadQueryIntoDb if no database con is provided or if the trial is not in database con.
ctrShowOneTrial(identifier = NULL, con = NULL)
identifier |
A trial identifier string |
con |
A database connection object, created with
|
This is the widget for CTIS trial 2022-501142-30-00:
Invisibly, the trial data for constructing an HTML widget.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# all such identifiers work
id <- "2014-003556-31"
id <- "2014-003556-31-SE"
id <- "76463425"
id <- "ISRCTN76463425"
id <- "NCT03431558"
id <- "2022-501142-30-00"

# the id also works with
# ctrGetQueryUrl(url = id) and
# ctrLoadQueryIntoDb(queryterm = id, ...)

# show widget for user to explore and search content as well as to
# select fields of interest and to click on "Copy names of selected
# fields to clipboard..." to use them with dbGetFieldsIntoDf()
ctrShowOneTrial(identifier = id, con = dbc)

# get sample of identifiers of trials in database
sample(dbFindIdsUniqueTrials(con = dbc), 5L)
Given part of the name of a field of interest to the user, this function returns the full field names used in records that were previously loaded into a collection (using ctrLoadQueryIntoDb). Only names of fields that have a value in the collection can be returned. Set sample = FALSE to force screening all records in the collection for field names, see below. See ctrShowOneTrial to interactively find fields.
dbFindFields(namepart = ".*", con, sample = TRUE, verbose = FALSE)
namepart |
A character string (can be a regular expression, including Perl-style) to be searched among all field names (keys) in the collection, case-insensitive. The default '".*"' lists all fields. |
con |
A database connection object, created with
|
sample |
If |
verbose |
If |
The full names of child fields are returned in dot notation (e.g., clinical_results.outcome_list.outcome.measure.class_list.class.title). In addition, names of parent fields (e.g., clinical_results) are returned.
Data in parent fields is typically complex (nested), see dfTrials2Long for easily handling it. For field definitions of the registers, see "Definition" in ctrdata-registers.
Note: When dbFindFields is first called after ctrLoadQueryIntoDb, it will take a moment.
Vector of strings with full names of field(s) found, ordered by register and alphabet, see examples. Names of the vector are the names of the register holding the respective fields. The field names can be fed into dbGetFieldsIntoDf to extract the data for the field(s) from the collection into a data frame.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbFindFields(namepart = "date", con = dbc)[1:5]

# view all 1880+ fields from all registers:
allFields <- dbFindFields(con = dbc, sample = FALSE)

if (interactive()) View(data.frame(
  register = names(allFields),
  field = allFields))
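Because the vector returned by dbFindFields is named by register, field names can be filtered or grouped by register with base R. The sketch below uses a small hand-made vector (fields) in place of a real call so that it runs without a database; the three field names are those mentioned in ctrdata-registers, and the variable names are illustrative only.

```r
# hypothetical stand-in for the value of dbFindFields():
# names are registers, values are full field names
fields <- c(
  EUCTR  = "f115_children_211years",
  ISRCTN = "interventions.intervention.interventionType",
  CTIS   = "ageGroup")

# field names of a single register, without the register names
euctrFields <- unname(fields[names(fields) == "EUCTR"])

# all field names, grouped into a list with one element per register
byRegister <- split(unname(fields), names(fields))
str(byRegister)
```

A character vector such as euctrFields can then be given to parameter fields of dbGetFieldsIntoDf.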
Records for a clinical trial can be loaded from more than one register into a collection. This function returns deduplicated identifiers for all trials in the collection, respecting the register(s) preferred by the user. All registers also record identifiers from other registers, and this function uses these to provide a vector of identifiers of deduplicated trials.
dbFindIdsUniqueTrials( preferregister = c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS"), prefermemberstate = "BE", include3rdcountrytrials = TRUE, con, verbose = FALSE )
preferregister |
A vector of the order of preference for
registers from which to generate unique _id's, default
|
prefermemberstate |
Code of single EU Member State for which records
should be returned. If not available, a record for BE or, lacking this, any
random Member State's record for the trial will be returned.
For a list of codes of EU Member States, please see vector
|
include3rdcountrytrials |
A logical value indicating whether trials should be retained
that are conducted exclusively in third countries, that is, outside
the European Union. Ignored if |
con |
A database connection object, created with
|
verbose |
If |
Note that the content of records may differ between registers (and, for "EUCTR", between records for different Member States). Such differences are not considered by this function.
Note that the trial concept ".isUniqueTrial" (which uses this function) can be calculated at the time of creating a data frame with dbGetFieldsIntoDf, which often may be the preferred approach.
A named vector with strings of keys (field "_id") of records in the collection that represent unique trials, where names correspond to the register of the record.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbFindIdsUniqueTrials(con = dbc)[1:10]

# alternative as of ctrdata version 1.21.0,
# using defaults of dbFindIdsUniqueTrials()
df <- dbGetFieldsIntoDf(
  fields = "keyword",
  calculate = "f.isUniqueTrial",
  con = dbc)

# using base R
df[df[[".isUniqueTrial"]], ]

## Not run: 
library(dplyr)
df %>% filter(.isUniqueTrial)

## End(Not run)
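The idea of preferring one register over another when a trial has several records can be sketched in base R. This is an illustration under simplifying assumptions, not the algorithm of dbFindIdsUniqueTrials: the data frame `records` and its column `trial` (a shared key per trial) are hypothetical, whereas the package derives links from the identifiers of other registers found in each record.

```r
# Hypothetical records: the same trial may appear in several registers.
# Column "trial" is an assumed shared key, used only for this illustration.
records <- data.frame(
  `_id` = c("NCT00000001", "2010-000001-11-DE", "ISRCTN00000001", "NCT00000002"),
  register = c("CTGOV2", "EUCTR", "ISRCTN", "CTGOV2"),
  trial = c("A", "A", "B", "C"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

preferregister <- c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS")

# Order records by register preference, then keep the first record per trial
ranked <- records[order(match(records$register, preferregister)), ]
unique_records <- ranked[!duplicated(ranked$trial), ]

# Named vector: values are record keys, names are registers
ids <- setNames(unique_records[["_id"]], unique_records$register)
ids
```

Changing the order of `preferregister` changes which record represents trial "A", which mirrors the role of parameter preferregister.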
Fields in the collection are retrieved from all records into a data frame (or tibble). Within a given trial record, a field can be hierarchical and structured, that is, nested. The function uses the field names to appropriately type the values that it returns, harmonising original values (e.g. "Information not present in EudraCT" to 'NA', "Yes" to 'TRUE', "false" to 'FALSE', date strings to dates or time differences, number strings to numbers). The function simplifies the structure of nested data, may concatenate multiple strings in a field using " / " (see example) and may widen the returned data frame with additional columns that were recursively expanded from simply nested data (e.g., "externalRefs" to columns "externalRefs.doi", "externalRefs.eudraCTNumber" etc.). For an alternative way of handling the complex nested data, see dfTrials2Long followed by dfName2Value for extracting the sought variable(s).
dbGetFieldsIntoDf(fields = "", calculate = "", con, verbose = FALSE, ...)
fields |
Vector of one or more strings, with names of sought fields. See function dbFindFields for how to find names of fields and ctrShowOneTrial for interactively selecting field names. Dot path notation ("field.subfield") without indices is supported. If compatibility with 'nodbi::src_postgres()' is needed, specify fewer than 50 fields, consider also using parent fields e.g., '"a.b"' instead of 'c("a.b.c.d", "a.b.c.e")', accessing sought fields with dfTrials2Long followed by dfName2Value or other R functions. |
calculate |
Vector of one or more strings, which are names of functions to calculate certain trial concepts from fields in the collection across different registers. |
con |
A database connection object, created with
|
verbose |
Printing additional information if set to |
... |
Do not use (captures deprecated parameter |
A data frame (or tibble, if tibble is loaded) with columns corresponding to the sought fields. A column for the records' '_id' will always be included. The number of rows of the returned data frame is equal to or less than the number of trial records in the database collection.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# get fields that are nested within another field
# and can have multiple values with the nested field
dbGetFieldsIntoDf(
  fields = "b1_sponsor.b31_and_b32_status_of_the_sponsor",
  con = dbc)

# fields that are lists of string values are
# returned by concatenating values with a slash
dbGetFieldsIntoDf(
  fields = "keyword",
  con = dbc)

# calculate new field(s) from data across trials
df <- dbGetFieldsIntoDf(
  fields = "keyword",
  calculate = c("f.statusRecruitment", "f.isUniqueTrial", "f.startDate"),
  con = dbc)

table(df$.statusRecruitment, exclude = NULL)

## Not run: 
library(dplyr)
library(ggplot2)

df %>%
  filter(.isUniqueTrial) %>%
  count(.statusRecruitment)

df %>%
  filter(.isUniqueTrial) %>%
  ggplot() +
  stat_ecdf(aes(
    x = .startDate,
    colour = .statusRecruitment))

## End(Not run)
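The kind of harmonisation described for dbGetFieldsIntoDf can be illustrated with a few lines of base R. This is only a sketch: the helper `to_logical` is hypothetical, the mapping of "No" to FALSE is an assumption beyond the examples given in the description, and the package's own typing rules depend on the field names.

```r
# Hypothetical raw values as they might be stored in register records
raw <- c("Yes", "No", "false", "Information not present in EudraCT")

# Map register-specific strings to logical values; anything else becomes NA
to_logical <- function(x) {
  out <- rep(NA, length(x))
  out[tolower(x) %in% c("yes", "true")] <- TRUE
  out[tolower(x) %in% c("no", "false")] <- FALSE
  out
}
to_logical(raw)

# Multiple strings in one field are concatenated with " / "
keywords <- list(c("asthma", "children"), "oncology")
vapply(keywords, paste, character(1), collapse = " / ")
```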
Show history of queries loaded into a database collection
dbQueryHistory(con, verbose = FALSE)
con |
A database connection object, created with
|
verbose |
If |
A data frame (or tibble, if tibble is loaded) with columns: 'query-timestamp', 'query-register', 'query-records' (note: this is the number of records loaded when last executing ctrLoadQueryIntoDb, not the total record number) and 'query-term', with one row for each time that ctrLoadQueryIntoDb loaded trial records into this collection.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbQueryHistory(con = dbc)
Merge variables in a data frame such as returned by dbGetFieldsIntoDf into a new variable, and optionally also map its values to new levels. See ctrdata-trial-concepts for pre-defined cross-register concepts that are already implemented based on merging fields from different registers and calculating a new field.
dfMergeVariablesRelevel(df = NULL, colnames = "", levelslist = NULL)
df |
A data.frame with the variables (columns) to be merged into one vector. |
colnames |
A vector of names of columns in 'df' that hold the variables
to be merged, or a selection of columns as per |
levelslist |
A named list with one slice each for a new value to be used for a vector of old values (optional). |
A vector, with the type of the columns to be merged
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

df <- dbGetFieldsIntoDf(
  fields = c(
    "protocolSection.eligibilityModule.healthyVolunteers",
    "f31_healthy_volunteers",
    "eligibility.healthy_volunteers"
  ),
  con = dbc
)

table(
  dfMergeVariablesRelevel(
    df = df,
    colnames = 'matches("healthy")'
  ))
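The shape expected for levelslist, a named list in which each name is a new value and each element holds the old values, may be easier to see in a base R sketch of merging and relevelling. The column names, status strings and lookup approach are hypothetical and do not reproduce the internals of dfMergeVariablesRelevel.

```r
# Hypothetical data frame with one status column per register;
# each row has a value in at most one of the columns
df <- data.frame(
  status_a = c("Recruiting", NA, NA),
  status_b = c(NA, "Completed", NA),
  status_c = c(NA, NA, "Withdrawn"),
  stringsAsFactors = FALSE
)

# Merge: take the first non-missing value across the columns, row by row
merged <- apply(df, 1, function(r) r[!is.na(r)][1])

# Named list: each name is a new level, each element holds its old values
levelslist <- list(
  ongoing = "Recruiting",
  completed = "Completed",
  other = c("Withdrawn", "Suspended")
)

# Relevel: build a lookup from old value to new level and apply it
lookup <- setNames(
  rep(names(levelslist), lengths(levelslist)),
  unlist(levelslist, use.names = FALSE))
releveled <- unname(lookup[merged])
releveled
```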
Get information for variable of interest (e.g., clinical endpoints) from long data frame of protocol- or result-related trial information as returned by dfTrials2Long. Parameters 'valuename', 'wherename' and 'wherevalue' are matched using Perl regular expressions and ignoring case.
dfName2Value(df, valuename = "", wherename = "", wherevalue = "")
df |
A data frame (or tibble) with four columns ('_id', 'identifier', 'name', 'value') as returned by dfTrials2Long |
valuename |
A character string for the name of the field that holds the value of the variable of interest (e.g., a summary measure such as "endPoints.*tendencyValue.value") |
wherename |
(optional) A character string to identify the variable of interest among those that repeatedly occur in a trial record (e.g., "endPoints.endPoint.title") |
wherevalue |
(optional) A character string with the value of the variable identified by 'wherename' (e.g., "response") |
A data frame (or tibble, if tibble is loaded) that includes the values of interest, with columns '_id', 'identifier', 'name', 'value' and 'where' (with the contents of 'wherevalue' found at 'wherename'). Contents of 'value' are strings unless all its elements are numbers. The 'identifier' is generated by function dfTrials2Long to identify matching elements, e.g., endpoint descriptions and measurements.
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dfwide <- dbGetFieldsIntoDf(
  fields = c(
    ## ctgov - typical results fields
    # "clinical_results.baseline.analyzed_list.analyzed.count_list.count",
    # "clinical_results.baseline.group_list.group",
    # "clinical_results.baseline.analyzed_list.analyzed.units",
    "clinical_results.outcome_list.outcome",
    "study_design_info.allocation",
    ## euctr - typical results fields
    # "trialInformation.fullTitle",
    # "baselineCharacteristics.baselineReportingGroups.baselineReportingGroup",
    # "trialChanges.hasGlobalInterruptions",
    # "subjectAnalysisSets",
    # "adverseEvents.seriousAdverseEvents.seriousAdverseEvent",
    "endPoints.endPoint",
    "subjectDisposition.recruitmentDetails"
  ),
  con = dbc
)

dflong <- dfTrials2Long(df = dfwide)

## get values for the endpoint 'response'
dfName2Value(
  df = dflong,
  valuename = paste0(
    "clinical_results.*measurement.value|",
    "clinical_results.*outcome.measure.units|",
    "endPoints.endPoint.*tendencyValue.value|",
    "endPoints.endPoint.unit"
  ),
  wherename = paste0(
    "clinical_results.*outcome.measure.title|",
    "endPoints.endPoint.title"
  ),
  wherevalue = "response"
)
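How 'valuename', 'wherename' and 'wherevalue' interact can be sketched with base R on a small, hypothetical long data frame. This is a simplified illustration and not the code of dfName2Value; in particular, the identifiers here are plain strings chosen for the example.

```r
# Hypothetical long data frame, shaped like the return value of dfTrials2Long()
dflong <- data.frame(
  `_id` = "NCT00000001",
  identifier = c("1", "1", "2", "2"),
  name = c(
    "endPoints.endPoint.title", "endPoints.endPoint.tendencyValue.value",
    "endPoints.endPoint.title", "endPoints.endPoint.tendencyValue.value"),
  value = c("Response rate", "0.42", "Overall survival", "12.5"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Step 1: identifiers of elements where 'wherename' has a value
# matching 'wherevalue' (Perl regular expressions, ignoring case)
is_where <- grepl("endPoint.title", dflong$name, ignore.case = TRUE, perl = TRUE) &
  grepl("response", dflong$value, ignore.case = TRUE, perl = TRUE)
ids <- dflong$identifier[is_where]

# Step 2: rows for 'valuename' that share these identifiers
is_value <- grepl("tendencyValue.value", dflong$name, ignore.case = TRUE, perl = TRUE) &
  dflong$identifier %in% ids
dflong[is_value, ]
```

The identifier is what ties a measurement to the title of the endpoint it belongs to, so that only the value for "Response rate" is returned.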
The function works with protocol- and results-related information. It converts lists and other values that are in a data frame returned by dbGetFieldsIntoDf into individual rows of a long data frame. From the resulting long data frame, values of interest can be selected using dfName2Value. The function is particularly useful for fields with complex content, such as node field "clinical_results" from CTGOV, which dbGetFieldsIntoDf returns as a multiply nested list and for which this function then converts every observation of every (leaf) field into a row of its own.
dfTrials2Long(df)
df |
Data frame (or tibble) with columns including
the trial identifier ( |
A data frame (or tibble, if tibble is loaded) with the four columns: '_id', 'identifier', 'name', 'value'
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dfwide <- dbGetFieldsIntoDf(
  fields = "clinical_results.participant_flow",
  con = dbc)

dfTrials2Long(df = dfwide)
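The conversion of nested content into one row per leaf value can be approximated in base R with unlist(). This is a rough sketch under stated assumptions: the list `record` is hypothetical, and the integer `identifier` is a simplification of the identifier that dfTrials2Long generates.

```r
# Hypothetical nested content of one trial record
record <- list(
  `_id` = "NCT00000001",
  endpoint = list(
    list(title = "Response rate", value = "0.42"),
    list(title = "Overall survival", value = "12.5")
  )
)

# unlist() flattens nested lists and keeps the names of the leaf fields
flat <- unlist(record$endpoint)

long <- data.frame(
  `_id` = record[["_id"]],
  # simplified identifier: position of the endpoint that a leaf belongs to
  identifier = rep(seq_along(record$endpoint), lengths(record$endpoint)),
  name = paste0("endpoint.", names(flat)),
  value = unname(flat),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
long
```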
Trial concept calculated: type of internal control. ICH E10 lists as types of control: placebo concurrent control, no-treatment concurrent control, dose-response concurrent control, active (positive) concurrent control, external (including historical) control, multiple control groups. Dose-controlled trials are currently not identified. External (including historical) controls are so far not identified in specific register fields. Cross-over designs, where identifiable, have active controls.
f.controlType(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.controlType', which is a factor with levels 'none', 'no-treatment', 'placebo', 'active', 'placebo+active' and 'other'.
# fields needed
f.controlType()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  fields = "ctrname",
  calculate = "f.controlType",
  con = dbc)

trialsDf
Trial concept calculated: Calculates whether a record is a medicine-interventional trial, investigating one or more medicines, whether biological or not. For EUCTR and CTIS, this corresponds to all records as per the definition of the EU Clinical Trial Regulation. For CTGOV and CTGOV2, this is based on drug or biological as type of intervention, and interventional as type of study. For ISRCTN, this is based on drug or biological as type of intervention, and interventional as type of study.
f.isMedIntervTrial(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.isMedIntervTrial', a logical.
# fields needed
f.isMedIntervTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.isMedIntervTrial",
  con = dbc)

trialsDf
Trial concept calculated: Applies function dbFindIdsUniqueTrials() with its defaults.
f.isUniqueTrial(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.isUniqueTrial', a logical.
# fields needed
f.isUniqueTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.isUniqueTrial",
  con = dbc)

trialsDf
Trial concept calculated: platform trial, research platform. As operational definition, at least one of these criteria is true: a. trial has "platform", "basket", "umbrella", "multi.?arm", "multi.?stage" or "master protocol" in its title or description (for ISRCTN, this is the only criterion; some trials in EUCTR lack data in English), b. trial has more than 2 active arms with different investigational medicines, after excluding comparator, auxiliary and placebo medicines (calculated with f.numTestArmsSubstances; not used for ISRCTN because it cannot be calculated precisely), c. trial has more than 2 periods, after excluding safety run-in, screening, enrolling, extension and follow-up periods (for CTGOV and CTGOV2, this criterion requires results-related data). Requires that EUCTR results have been included in the collection, using ctrLoadQueryIntoDb(queryterm = ..., euctrresults = TRUE, con = ...). Requires packages dplyr and stringdist to be installed; stringdist is used for evaluating terms in brackets in the trial title, where trials may be related if the term similarity is 0.7 or higher.
f.likelyPlatformTrial(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
Publication references considered: E-PEARL WP2 2020 https://tinyurl.com/eupearld21terminology (which did not include all basket trials in the definition, as done here) Williams RJ et al. 2022 https://doi.org/10.1136/bmj-2021-067745
data frame with columns '_id' and '.likelyPlatformTrial', a logical, '.likelyRelatedTrials', a list (e.g., from CTIS' 'associatedClinicalTrials') and '.maybeRelatedTrials', a list (based on similar short terms within a first set of brackets or before a colon in the title).
# fields needed
f.likelyPlatformTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.likelyPlatformTrial",
  con = dbc)

trialsDf
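Criterion a. of the operational definition is a text search, which can be tried out in base R. The sketch below combines the listed terms into one regular expression and applies it to hypothetical titles; matching case-insensitively is an assumption, and the other criteria and the similarity check with stringdist are not covered.

```r
# Hypothetical trial titles
titles <- c(
  "A multi-arm, multi-stage trial in prostate cancer",
  "Randomised study of drug X versus placebo",
  "Master protocol for a BASKET study in solid tumours"
)

# Terms listed under criterion a., combined into one regular expression
pattern <- "platform|basket|umbrella|multi.?arm|multi.?stage|master protocol"

# Assumption for this sketch: matching ignores case
is_candidate <- grepl(pattern, titles, ignore.case = TRUE)
is_candidate
```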
Trial concept calculated: number of the sites where the trial is conducted. EUCTR lacks information on number of sites outside of the EEA; for each non-EEA country mentioned, at least one site is assumed.
f.numSites(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.numSites', an integer.
# fields needed
f.numSites()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.numSites",
  con = dbc)

trialsDf
Trial concept calculated: number of active arms with different investigational medicines, after excluding comparator, auxiliary and placebo arms / medicines. For ISRCTN, this is imprecise because arms are not identified in a field. Most registers provide no or only limited information on phase 1 trials, so that this number typically cannot be calculated for these trials. Requires package stringdist to be installed; stringdist is used for evaluating names of active substances, which are considered similar when the similarity is 0.8 or higher.
f.numTestArmsSubstances(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.numTestArmsSubstances', an integer
# fields needed
f.numTestArmsSubstances()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.numTestArmsSubstances",
  con = dbc)

trialsDf
Trial concept calculated: full description of the primary endpoint, concatenating with " == " its title, description, time frame of assessment. The details vary by register. The text description can be used for identifying trials of interest or for analysing trends in primary endpoints, which among the set of all endpoints are most often used for determining the number of participants sought for the study.
f.primaryEndpointDescription(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.primaryEndpointDescription', which is a list (that is, one or more items in one vector per row; the background is that some trials have several endpoints as primary).
# fields needed
f.primaryEndpointDescription()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO
)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.primaryEndpointDescription",
  con = dbc
)

trialsDf
Trial concept calculated: Calculates several results-related elements of the primary analysis of the primary endpoint. Requires loading results-related information. For CTIS and ISRCTN, such information is not available in structured format. Recommended to be combined with .controlType, .sampleSize etc. for analyses.
f.primaryEndpointResults(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and new columns: '.primaryEndpointFirstPvalue' (discarding any inequality indicator, e.g. <=), '.primaryEndpointFirstPmethod' (normalised string, e.g. chisquared), '.primaryEndpointFirstPsize' (number included in test, across assignment groups).
# fields needed
f.primaryEndpointResults()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.primaryEndpointResults",
  con = dbc)

trialsDf
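Discarding an inequality indicator from a reported p-value, as mentioned for '.primaryEndpointFirstPvalue', can be pictured with a short base R sketch. The input strings and the regular expression are hypothetical and may differ from what the package does.

```r
# Hypothetical p-values as they might be reported in register results
p_raw <- c("<0.001", "= 0.04", "0.2", "<=0.05", "not reported")

# Discard any leading inequality or equality indicator, then convert;
# strings that are not numbers become NA
p_clean <- suppressWarnings(as.numeric(sub("^[<>= ]+", "", p_raw)))
p_clean
```

Note that dropping the indicator loses information: "<0.001" and "0.001" become indistinguishable, which matters when values are compared against a threshold.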
Trial concept calculated: earliest date of results as recorded in the register. At that date, results may have been incomplete and may have been changed later. For EUCTR, requires that results and preferably also their history of publication have been included in the collection, using ctrLoadQueryIntoDb(queryterm = ..., euctrresultshistory = TRUE, con = ...). Cannot be calculated for ISRCTN, which does not have a corresponding field.
f.resultsDate(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.resultsDate', a date.
# fields needed
f.resultsDate()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.resultsDate",
  con = dbc)

trialsDf
Trial concept calculated: sample size of the trial, preferring results-related over protocol-related information.
f.sampleSize(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.sampleSize', an integer.
# fields needed
f.sampleSize()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.sampleSize",
  con = dbc)

trialsDf
Trial concept calculated: type or class of the lead or main sponsor of the trial. Some information is not yet mapped (e.g., "NETWORK" in CTGOV2). No specific field is available in ISRCTN.
f.sponsorType(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.sponsorType', which is a factor with levels 'For profit', 'Not for profit' or 'Other'.
# fields needed
f.sponsorType()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.sponsorType",
  con = dbc)

trialsDf
Trial concept calculated: start of the trial, based on the documented or planned start of recruitment, or on the date of opinion of the competent authority.
f.startDate(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.startDate', a date.
# fields needed
f.startDate()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

trialsDf <- dbGetFieldsIntoDf(
  fields = "ctrname",
  calculate = "f.startDate",
  con = dbc)

trialsDf
Trial concept calculated: status of recruitment at the time of loading the trial records. Maps the categories that are in fields which specify the state of recruitment. Simplifies the status into four categories.
f.statusRecruitment(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.statusRecruitment', which is a factor with levels 'ongoing' (includes active, not yet recruiting; temporarily halted; suspended; authorised, not started and similar), 'completed' (includes ended; ongoing, recruitment ended), 'ended early' (includes prematurely ended, terminated early) and 'other' (includes revoked, withdrawn, planned, stopped).
# fields needed f.statusRecruitment() # apply trial concept when creating data frame dbc <- nodbi::src_sqlite( dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"), collection = "my_trials", flags = RSQLite::SQLITE_RO) trialsDf <- dbGetFieldsIntoDf( calculate = "f.statusRecruitment", con = dbc) trialsDf
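Because '.statusRecruitment' is a factor with fixed levels, it can be tabulated directly. The sketch below uses a hypothetical data frame with invented identifiers that only mimics the documented columns; the share of early-ended trials is an illustrative summary, not something the package computes.

```r
# Hypothetical stand-in for the result of dbGetFieldsIntoDf() with
# calculate = "f.statusRecruitment"; values are invented
statusLevels <- c("ongoing", "completed", "ended early", "other")
trialsDf <- data.frame(
  `_id` = paste0("trial-", 1:6),
  .statusRecruitment = factor(
    c("ongoing", "completed", "ended early", "completed", "other", NA),
    levels = statusLevels),
  check.names = FALSE
)

# Count trials per status, showing records without a mapped status
statusCounts <- table(trialsDf[[".statusRecruitment"]], useNA = "ifany")
print(statusCounts)

# Among trials that have finished (completed or ended early),
# compute the share that ended early; NA status is excluded by %in%
finished <- trialsDf[[".statusRecruitment"]] %in% c("completed", "ended early")
shareEndedEarly <- mean(
  trialsDf[[".statusRecruitment"]][finished] == "ended early")
print(shareEndedEarly)
```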
Trial concept calculated: objectives of the trial, by searching for text fragments found in fields describing its purpose, objective, background or hypothesis, after applying .isMedIntervTrial, because the text fragments are tailored to medicinal product interventional trials. This is a simplification, and it is expected that the criteria will be further refined. The text fragments only apply to English.
f.trialObjectives(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.trialObjectives', which is a string with letters separated by a space, such as E (efficacy, including cure, survival, effectiveness); A (activity, including response, remission, seroconversion); S (safety); PK; PD (including biomarker); D (dose-finding, determining recommended dose); LT (long-term); and FU (follow-up).
# fields needed
f.trialObjectives()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialObjectives",
  con = dbc)
trialsDf
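Since '.trialObjectives' packs several codes into one space-separated string, it is often convenient to expand it into one logical column per code. The following sketch assumes only the format described above; the data frame and its identifiers are hypothetical. Matching whole tokens rather than substrings avoids counting 'PD' as 'D'.

```r
# Hypothetical stand-in for the result of dbGetFieldsIntoDf() with
# calculate = "f.trialObjectives"; values are invented
trialsDf <- data.frame(
  `_id` = c("trial-A", "trial-B", "trial-C"),
  .trialObjectives = c("E S", "S PK PD", "E A S D"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Codes as listed in the description of '.trialObjectives'
codes <- c("E", "A", "S", "PK", "PD", "D", "LT", "FU")

# Split each string into whole tokens, then test membership per code
tokens <- strsplit(trialsDf[[".trialObjectives"]], " ", fixed = TRUE)
objectiveFlags <- t(vapply(
  tokens,
  function(x) codes %in% x,
  logical(length(codes))))
dimnames(objectiveFlags) <- list(trialsDf[["_id"]], codes)

# One row per trial, one logical column per objective code
print(objectiveFlags)

# Number of trials per objective
print(colSums(objectiveFlags))
```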
Trial concept calculated: phase of a clinical trial as per ICH E8(R1).
f.trialPhase(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.trialPhase', which is an ordered factor with levels 'phase 1', 'phase 1+2', 'phase 2', 'phase 2+3', 'phase 2+4', 'phase 3', 'phase 3+4', 'phase 1+2+3', 'phase 4', 'phase 1+2+3+4'.
# fields needed
f.trialPhase()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialPhase",
  con = dbc)
trialsDf
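An ordered factor supports comparison operators, which allows filtering by phase. The sketch below rebuilds the documented level order on a hypothetical data frame (identifiers and values are invented). Note that under this ordering a combined level such as 'phase 2+3' ranks below 'phase 3', so a text match may be preferable when any phase 3 component should count.

```r
# Level order as documented for '.trialPhase'
phaseLevels <- c(
  "phase 1", "phase 1+2", "phase 2", "phase 2+3", "phase 2+4",
  "phase 3", "phase 3+4", "phase 1+2+3", "phase 4", "phase 1+2+3+4")

# Hypothetical stand-in for the result of dbGetFieldsIntoDf() with
# calculate = "f.trialPhase"; values are invented
trialsDf <- data.frame(
  `_id` = c("trial-A", "trial-B", "trial-C", "trial-D", "trial-E"),
  .trialPhase = factor(
    c("phase 1", "phase 2+3", "phase 3", "phase 4", NA),
    levels = phaseLevels, ordered = TRUE),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Comparison by factor order: levels at or above 'phase 3'
isLate <- !is.na(trialsDf[[".trialPhase"]]) &
  trialsDf[[".trialPhase"]] >= "phase 3"
lateTrials <- trialsDf[isLate, ]
print(lateTrials)

# Alternative: any level that has a phase 3 component
hasPhase3 <- grepl("3", as.character(trialsDf[[".trialPhase"]]), fixed = TRUE)
print(trialsDf[["_id"]][hasPhase3])
```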
Trial concept calculated: inclusion and exclusion criteria as well as age groups that can participate in a trial, based on protocol-related information. (See dfMergeVariablesRelevel example for healthy volunteers.)
f.trialPopulation(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and new columns: '.trialPopulationAgeGroup' (factor, "P", "A", "P+A", "E", "A+E", "P+A+E"), '.trialPopulationInclusion' (string), '.trialPopulationExclusion' (string).
# fields needed
f.trialPopulation()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialPopulation",
  con = dbc)
trialsDf
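The combined levels of '.trialPopulationAgeGroup' can be split on '+' to select trials that include a given age group. This is a sketch on a hypothetical data frame with invented identifiers; reading 'P' as a paediatric age group is an assumption, since this section lists the levels without defining the letters.

```r
# Levels as documented for '.trialPopulationAgeGroup'
ageLevels <- c("P", "A", "P+A", "E", "A+E", "P+A+E")

# Hypothetical stand-in for the result of dbGetFieldsIntoDf() with
# calculate = "f.trialPopulation"; values are invented
trialsDf <- data.frame(
  `_id` = c("trial-A", "trial-B", "trial-C", "trial-D", "trial-E"),
  .trialPopulationAgeGroup = factor(
    c("P", "A", "P+A", "A+E", "P+A+E"), levels = ageLevels),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Split combined levels into their components
ageGroup <- as.character(trialsDf[[".trialPopulationAgeGroup"]])
components <- strsplit(ageGroup, "+", fixed = TRUE)

# Trials open to group 'P' (assumed here to mean paediatric),
# alone or in combination with other groups
includesP <- vapply(components, function(x) "P" %in% x, logical(1))
print(trialsDf[["_id"]][includesP])

# Trials restricted to group 'P' only
onlyP <- ageGroup == "P"
print(trialsDf[["_id"]][onlyP])
```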
Trial concept calculated: scientific or full title of the study.
f.trialTitle(df = NULL)
df |
data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf. |
data frame with columns '_id' and '.trialTitle', a string.
# fields needed
f.trialTitle()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialTitle",
  con = dbc)
trialsDf