Collection Methods
The simplest way to get a list of collections is to use the nbiatoolkit.NBIAClient.getCollections() method.
This method returns a list of all collections available in the NBIA database.
The method has the following signature:
- NBIAClient.getCollections(prefix: str = '', return_type: ReturnType | str | None = None) → List[dict[Any, Any]] | DataFrame
Retrieves the collections from the NBIA server.
- Parameters:
prefix (str, optional) – Prefix to filter the collections by. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None which uses the default return type.
- Returns:
List of collections or DataFrame containing the collections.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
Passing no parameters to the method will return a list of all collections available in the NBIA database. Passing a prefix parameter will return a list of collections that match the prefix.
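from nbiatoolkit import NBIAClient

# Return pandas DataFrames by default for every call made through this client
client = NBIAClient(return_type = "dataframe")
collections_df = client.getCollections(prefix='TCGA')
print(f"The number of available collections is {len(collections_df)}")
print(collections_df)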
The number of available collections is 18
    Collection
0    TCGA-BLCA
1    TCGA-BRCA
2    TCGA-CESC
3    TCGA-COAD
4    TCGA-ESCA
5    TCGA-KICH
6    TCGA-KIRC
7    TCGA-KIRP
8    TCGA-LIHC
9    TCGA-LUAD
10   TCGA-LUSC
11     TCGA-OV
12   TCGA-PRAD
13   TCGA-READ
14   TCGA-SARC
15   TCGA-STAD
16   TCGA-THCA
17   TCGA-UCEC
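To make the prefix behaviour concrete, the sketch below filters a hard-coded list of records offline, without contacting the server. It assumes that the list return type yields dictionaries keyed by `Collection` (mirroring the DataFrame column above) and that matching is a plain, case-sensitive "starts with" test. The helper `filter_by_prefix` and the placeholder name `EXAMPLE-A` are hypothetical, and the server's actual matching rules may differ.

```python
from typing import Any, Dict, List

# Hypothetical sample shaped like the list-of-dicts return type
# (assumed key: "Collection", taken from the DataFrame column name).
sample_collections: List[Dict[str, Any]] = [
    {"Collection": "TCGA-BLCA"},
    {"Collection": "TCGA-BRCA"},
    {"Collection": "EXAMPLE-A"},  # placeholder, not a real collection name
]


def filter_by_prefix(records: List[Dict[str, Any]], prefix: str = "") -> List[Dict[str, Any]]:
    """Keep records whose collection name starts with `prefix`.

    An empty prefix keeps every record, which matches the documented default.
    """
    return [r for r in records if str(r["Collection"]).startswith(prefix)]


tcga = filter_by_prefix(sample_collections, "TCGA")
print([r["Collection"] for r in tcga])
```

Calling `filter_by_prefix(sample_collections)` with no prefix returns all three sample records.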
- NBIAClient.getCollectionDescriptions(collectionName: str, return_type: ReturnType | str | None = None) → List[dict[Any, Any]] | DataFrame
Retrieves the description of a collection from the NBIA server.
- Parameters:
collectionName (str) – The name of the collection.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None.
- Returns:
List of collection descriptions or DataFrame containing the collection descriptions.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
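from pprint import pprint

with NBIAClient() as client:
    # Take the first description record returned for the collection
    desc = client.getCollectionDescriptions(collectionName = "TCGA-BLCA")[0]
    # pprint wraps the long description across lines, as in the output below
    pprint(desc['Description'])
    pprint(desc['DescriptionURI'])
    pprint(desc['LastUpdated'])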
('The Cancer Genome Atlas-Bladder Endothelial Carcinoma (TCGA-BLCA) data '
'collection is part of a larger effort to enhance the TCGA '
'http://cancergenome.nih.gov/ data set with characterized radiological '
'images. The Cancer Imaging Program (CIP), with the cooperation of several of '
'the TCGA tissue-contributing institutions, has archived a large portion of '
'the radiological images of the genetically-analyzed BLCA cases. Please see '
'the TCGA-BLCA page to learn more about the images and to obtain any '
'supporting metadata for this collection.')
'https://doi.org/10.7937/K9/TCIA.2016.8LNG8XDR'
'2023-03-16'
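The fields of a description record can be post-processed with the standard library. The sketch below works on a record copied from the output above, so it runs offline. It assumes that `DescriptionURI` is a `https://doi.org/` link and that `LastUpdated` is an ISO `YYYY-MM-DD` string, as in this one example; other collections may not follow that pattern. `parse_description` is a hypothetical helper, not part of nbiatoolkit.

```python
from datetime import date
from urllib.parse import urlparse

# Record copied from the example output; in practice it would come from
# client.getCollectionDescriptions(collectionName=...)[0].
desc = {
    "DescriptionURI": "https://doi.org/10.7937/K9/TCIA.2016.8LNG8XDR",
    "LastUpdated": "2023-03-16",
}


def parse_description(record):
    """Hypothetical helper: return (doi, last_updated) from a description record."""
    # Assumption: the URI is a doi.org link, so its path is the DOI itself.
    doi = urlparse(record["DescriptionURI"]).path.lstrip("/")
    # Assumption: the date is an ISO-formatted YYYY-MM-DD string.
    last_updated = date.fromisoformat(record["LastUpdated"])
    return doi, last_updated


doi, last_updated = parse_description(desc)
print(doi)
print(last_updated.year)
```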
- NBIAClient.getCollectionPatientCount(prefix: str = '', return_type: ReturnType | str | None = None) → List[dict[Any, Any]] | DataFrame
Retrieves the patient count for collections.
- Parameters:
prefix (str, optional) – Prefix to filter the collections by. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None which uses the default return type.
- Returns:
List of collections and their patient counts or DataFrame containing the collections and their patient counts.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
with NBIAClient() as client:
    counts_df = client.getCollectionPatientCount(
        prefix = "TCGA",
        return_type="dataframe"
    )
    print(counts_df)
    Collection  PatientCount
0    TCGA-BLCA           120
1    TCGA-BRCA           139
2    TCGA-CESC            54
3    TCGA-COAD            25
4    TCGA-ESCA            16
5    TCGA-KICH            15
6    TCGA-KIRC           267
7    TCGA-KIRP            33
8    TCGA-LIHC            97
9    TCGA-LUAD            69
10   TCGA-LUSC            37
11     TCGA-OV           143
12   TCGA-PRAD            14
13   TCGA-READ             3
14   TCGA-SARC             5
15   TCGA-STAD            46
16   TCGA-THCA             6
17   TCGA-UCEC            65
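When the list return type is used instead of a DataFrame, the counts can be summarised with plain Python. The sketch below uses four rows copied from the table above and assumes that each record is a dictionary with `Collection` and `PatientCount` keys, mirroring the DataFrame columns; the actual key names in the list form are not confirmed here. The `int()` conversion is a precaution in case counts arrive as strings.

```python
# Four rows copied from the table above, shaped as the assumed list-of-dicts form.
records = [
    {"Collection": "TCGA-BLCA", "PatientCount": 120},
    {"Collection": "TCGA-KIRC", "PatientCount": 267},
    {"Collection": "TCGA-OV", "PatientCount": 143},
    {"Collection": "TCGA-READ", "PatientCount": 3},
]

# Total number of patients across the selected collections.
total_patients = sum(int(r["PatientCount"]) for r in records)

# Collections ordered from most to fewest patients.
ranked = sorted(records, key=lambda r: int(r["PatientCount"]), reverse=True)
largest = ranked[0]["Collection"]

print(f"Total patients: {total_patients}")
print(f"Largest collection: {largest}")
```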