nbiatoolkit.nbia
Attributes
Classes
OAuth2 | OAuth2 class for handling authentication and access token retrieval.
NBIA_ENDPOINTS | This enum class defines the NBIA endpoints used in the NBIA toolkit.
NBIA_BASE_URLS | This enum class defines the NBIA base URLs used in the NBIA toolkit.
ReturnType | Generic enumeration.
NBIAClient | A client for interacting with the NBIA API.
Functions
generateFilePathFromDICOMAttributes | Generate a file path for the DICOM file by formatting DICOM attributes.
setup_logger | Set up a logger object that can be used to log messages to a file and/or console with daily log file rotation.
clean_html | Cleans the given HTML string by removing HTML tags and replacing special characters.
convertMillis | Convert milliseconds to a formatted date string.
convertDateFormat | Converts the input date to the desired format.
parse_response | Response will be either JSON or bytes.
getReferencedSeriesUIDS | Given a DataFrame containing DICOM series tags, retrieves the SeriesInstanceUIDs of the referenced series.
extract_ROI_info | Extracts ROI information from the StructureSetROISequence.
getSequenceElement | Given a DataFrame containing DICOM sequence tags, retrieves the search space based on the element keyword.
generateFileDatasetFromTags | Generate a pydicom Dataset object from a DataFrame of DICOM tags.
downloadSingleSeries | Downloads a single series from the NBIA server.
Module Contents
- class nbiatoolkit.nbia.DICOMSorter(sourceDir: str, destinationDir: str, targetPattern: str = '%PatientName/%SeriesNumber-%SeriesInstanceUID/%InstanceNumber.dcm', truncateUID: bool = True, sanitizeFilename: bool = True)[source]
- nbiatoolkit.nbia.generateFilePathFromDICOMAttributes(dataset: pydicom.Dataset, targetPattern: str, truncateUID: bool, sanitizeFilename: bool) str[source]
Generate a file path for the DICOM file by formatting DICOM attributes.
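The placeholder substitution implied by targetPattern can be sketched as below. This is a minimal illustration, not the toolkit's implementation: format_path, the plain dict of attributes, and the "Unknown" fallback are all hypothetical, and the truncateUID and sanitizeFilename options are not modelled.

```python
import re


def format_path(pattern: str, attributes: dict) -> str:
    """Replace each %Keyword placeholder with the matching attribute value."""

    def substitute(match: re.Match) -> str:
        keyword = match.group(1)
        # Assumption: a missing keyword is replaced by a visible marker.
        return str(attributes.get(keyword, f"Unknown{keyword}"))

    # A placeholder is '%' followed by letters, e.g. %PatientName.
    return re.sub(r"%([A-Za-z]+)", substitute, pattern)


# Made-up attribute values standing in for a pydicom Dataset.
attributes = {
    "PatientName": "LUNG1-001",
    "SeriesNumber": "2",
    "SeriesInstanceUID": "1.2.3.4.5",
    "InstanceNumber": "7",
}
path = format_path(
    "%PatientName/%SeriesNumber-%SeriesInstanceUID/%InstanceNumber.dcm",
    attributes,
)
print(path)  # LUNG1-001/2-1.2.3.4.5/7.dcm
```

The pattern shown is the default targetPattern of DICOMSorter; each directory level of the result comes from one slash-separated segment of the pattern.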
- class nbiatoolkit.nbia.OAuth2(username: str = 'nbia_guest', password: str = '', client_id: str = 'NBIA', base_url: str | nbiatoolkit.utils.NBIA_BASE_URLS = NBIA_BASE_URLS.NBIA)[source]
OAuth2 class for handling authentication and access token retrieval.
This class provides methods to authenticate with the NBIA API using OAuth2 and retrieve the access token required for accessing the API.
Defaults to the NBIA Guest account, which can access public collections. If you have a username and password that have been granted access to collections tagged “limited access”, you can use those credentials to access them.
- client_id
The client ID for authentication.
- Type:
str
- username
The username for authentication.
- Type:
str
- password
The password for authentication.
- Type:
str
- access_token
The access token retrieved from the API.
- Type:
str or None
- api_headers
The authentication headers containing the access token.
- Type:
dict or None
- expiry_time
The expiry time of the access token.
- Type:
str or None
- refresh_token
The refresh token for obtaining a new access token.
- Type:
str or None
- refresh_expiry
The expiry time of the refresh token.
- Type:
int or None
- scope
The scope of the access token.
- Type:
str or None
- getToken()
Authenticates with the API. Returns API headers containing the access token.
Example Usage
>>> from nbiatoolkit.auth import OAuth2
To use the NBIA Guest account:
>>> oauth = OAuth2()
To use a custom account:
>>> oauth = OAuth2(username="my_username", password="my_password")
Notes
This class is mainly for developers looking to add functionality to the nbiatoolkit package. If you are a user looking to access the NBIA API, you can use the NBIAClient class without knowledge of this class.
Many packages already handle OAuth2 authentication; I wrote this class to learn how OAuth2 works and to provide a simple way to authenticate with the NBIA API. If you have any suggestions for improving this class, please open an issue on the GitHub repository.
- property fernet_key: bytes
- property access_token: str | None
- property api_headers: dict[str, str]
- property token_expiration_time
- property refresh_expiration_time
- property token_scope
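How an access token typically turns into request headers, and how an expiry check might work, can be sketched as follows. This assumes a standard OAuth2 token response with access_token and expires_in fields and a bearer Authorization header; this page does not confirm either detail for the NBIA API, and build_api_headers and token_is_expired are hypothetical helpers, not part of nbiatoolkit.

```python
def build_api_headers(token_response: dict) -> dict:
    """Build request headers carrying the access token (bearer scheme assumed)."""
    return {"Authorization": f"Bearer {token_response['access_token']}"}


def token_is_expired(issued_at: float, expires_in: int, now: float) -> bool:
    """Return True once 'expires_in' seconds have elapsed since 'issued_at'."""
    return now >= issued_at + expires_in


# Made-up token response for illustration only.
token_response = {"access_token": "abc123", "expires_in": 7200}
headers = build_api_headers(token_response)
print(headers)
```

A client built this way would request a fresh token, or use the refresh token, once token_is_expired returns True.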
- nbiatoolkit.nbia.setup_logger(name: str, log_level: str = 'INFO', console_logging: bool = False, log_file: str | None = None, log_dir: str | None = None, log_format: str = '%(asctime)s | %(name)s | %(levelname)s | %(message)s', datefmt: str = '%y-%m-%d %H:%M') logging.Logger[source]
Set up a logger object that can be used to log messages to a file and/or console with daily log file rotation. If passing a log_file, the log file will be created in the current working directory unless a log_dir is provided. The log_file is created with a TimedRotatingFileHandler to rotate the log file daily.
- Parameters:
name (str) – The name of the logger.
log_level (str, optional) – The log level. Defaults to ‘INFO’.
console_logging (bool, optional) – Whether to log to console. Defaults to False.
log_file (str, optional) – The log file name. Defaults to None.
log_dir (str, optional) – The log directory. Defaults to None.
log_format (str, optional) – The log format. Defaults to ‘%(asctime)s | %(name)s | %(levelname)s | %(message)s’.
datefmt (str, optional) – The date format. Defaults to ‘%y-%m-%d %H:%M’.
- Returns:
The logger object.
- Return type:
logger (logging.Logger)
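The daily rotation described above can be approximated with the standard library alone. The sketch below is not the body of setup_logger: make_rotating_logger is a hypothetical name, and the when="midnight" and backupCount=7 settings are assumptions. Only the format strings are taken from the documented defaults.

```python
import logging
import os
import tempfile
from logging.handlers import TimedRotatingFileHandler


def make_rotating_logger(name, log_file, log_dir=None, level="INFO"):
    """Create a logger that writes to a file rotated once per day."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Fall back to the current working directory when no log_dir is given.
    directory = log_dir or os.getcwd()
    os.makedirs(directory, exist_ok=True)
    handler = TimedRotatingFileHandler(
        os.path.join(directory, log_file),
        when="midnight",  # assumption: rotate at midnight
        backupCount=7,    # assumption: keep one week of old files
    )
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s | %(name)s | %(levelname)s | %(message)s",
            datefmt="%y-%m-%d %H:%M",
        )
    )
    logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
logger = make_rotating_logger("demo", "demo.log", log_dir=log_dir)
logger.info("hello")
```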
- class nbiatoolkit.nbia.NBIA_ENDPOINTS[source]
Bases: enum.Enum
This enum class defines the NBIA endpoints used in the NBIA toolkit.
- GET_COLLECTIONS = 'v2/getCollectionValues'
- GET_COLLECTION_PATIENT_COUNT = 'getCollectionValuesAndCounts'
- GET_COLLECTION_DESCRIPTIONS = 'getCollectionDescriptions'
- GET_MODALITY_VALUES = 'v2/getModalityValues'
- GET_MODALITY_PATIENT_COUNT = 'getModalityValuesAndCounts'
- GET_PATIENTS = 'v2/getPatient'
- GET_NEW_PATIENTS_IN_COLLECTION = 'v2/NewPatientsInCollection'
- GET_PATIENT_BY_COLLECTION_AND_MODALITY = 'v2/getPatientByCollectionAndModality'
- GET_BODY_PART_PATIENT_COUNT = 'getBodyPartValuesAndCounts'
- GET_STUDIES = 'v2/getPatientStudy'
- GET_SERIES = 'v2/getSeries'
- GET_UPDATED_SERIES = 'v2/getUpdatedSeries'
- GET_SERIES_METADATA = 'v1/getSeriesMetaData'
- DOWNLOAD_SERIES = 'v2/getImageWithMD5Hash'
- GET_DICOM_TAGS = 'getDicomTags'
- class nbiatoolkit.nbia.NBIA_BASE_URLS[source]
Bases: enum.Enum
This enum class defines the NBIA base URLs used in the NBIA toolkit.
- NBIA = 'https://services.cancerimagingarchive.net/nbia-api/services/'
- NLST = 'https://nlst.cancerimagingarchive.net/nbia-api/services/'
- LOGOUT_URL = 'https://services.cancerimagingarchive.net/nbia-api/logout'
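A request URL is presumably formed by joining a base URL with an endpoint path. The sketch below uses local stand-in enums (BaseURL, Endpoint) that copy two values from the listings above; the plain string concatenation in build_url is an assumption about how the toolkit combines them.

```python
from enum import Enum


class BaseURL(Enum):
    """Stand-in for NBIA_BASE_URLS with one value copied from this page."""
    NBIA = "https://services.cancerimagingarchive.net/nbia-api/services/"


class Endpoint(Enum):
    """Stand-in for NBIA_ENDPOINTS with one value copied from this page."""
    GET_COLLECTIONS = "v2/getCollectionValues"


def build_url(base: BaseURL, endpoint: Endpoint) -> str:
    # The base URL already ends with '/', so simple concatenation suffices.
    return base.value + endpoint.value


url = build_url(BaseURL.NBIA, Endpoint.GET_COLLECTIONS)
print(url)
```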
- nbiatoolkit.nbia.clean_html(html_string: str) str[source]
Cleans the given HTML string by removing HTML tags and replacing special characters.
- Parameters:
html_string (str) – The input HTML string to be cleaned.
- Returns:
The cleaned text content without HTML tags and special characters.
- Return type:
str
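One way to get the behaviour described for clean_html uses only the standard library. This is an illustrative sketch under the name strip_html; the real function may use a different approach (for example an HTML parser) and may treat whitespace differently.

```python
import html
import re


def strip_html(html_string: str) -> str:
    """Remove tags, decode HTML entities, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", "", html_string)  # drop anything that looks like a tag
    text = html.unescape(text)                  # e.g. '&amp;' becomes '&'
    text = text.replace("\xa0", " ")            # non-breaking space to plain space
    return re.sub(r"\s+", " ", text).strip()


cleaned = strip_html("<p>Lung &amp; <b>chest</b>&nbsp;CT</p>")
print(cleaned)  # Lung & chest CT
```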
- nbiatoolkit.nbia.convertMillis(millis: int) str[source]
Convert milliseconds to a formatted date string.
- Parameters:
millis (int) – The number of milliseconds to convert.
- Returns:
The formatted date string in the format ‘YYYY-MM-DD’.
- Return type:
str
- Raises:
AssertionError – If the input is not an integer.
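The conversion can be sketched as below, assuming the input is milliseconds since the Unix epoch and that the date is rendered in UTC. The page states neither, so treat both as assumptions; millis_to_date is a hypothetical stand-in for convertMillis.

```python
from datetime import datetime, timezone


def millis_to_date(millis: int) -> str:
    """Convert epoch milliseconds to a 'YYYY-MM-DD' string (UTC assumed)."""
    assert isinstance(millis, int), "millis must be an integer"
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).strftime("%Y-%m-%d")


print(millis_to_date(0))           # 1970-01-01
print(millis_to_date(86_400_000))  # 1970-01-02
```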
- nbiatoolkit.nbia.convertDateFormat(input_date: str | datetime.datetime, format: str = '%Y/%m/%d') str[source]
Converts the input date to the desired format.
- Parameters:
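input_date (str | datetime.datetime) – The date to be converted.
format (str, optional) – The desired output format. Defaults to ‘%Y/%m/%d’.
- Returns:
The converted date in the desired format (“YYYY/MM/DD” by default).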
- Return type:
str
- Raises:
ValueError – If the input date has an invalid format.
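A converter with this contract might look like the sketch below. The list of accepted input formats is hypothetical, since this page does not say which string formats convertDateFormat recognises; to_date_format is an illustrative name.

```python
from datetime import datetime


def to_date_format(input_date, fmt="%Y/%m/%d"):
    """Return input_date rendered with fmt; accepts a datetime or a date string."""
    if isinstance(input_date, datetime):
        return input_date.strftime(fmt)
    # Assumption: these are the string formats worth trying.
    for candidate in ("%Y/%m/%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(input_date, candidate).strftime(fmt)
        except ValueError:
            continue
    raise ValueError(f"Invalid date format: {input_date!r}")


print(to_date_format("2021-03-04"))          # 2021/03/04
print(to_date_format(datetime(2020, 1, 2)))  # 2020/01/02
```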
- nbiatoolkit.nbia.parse_response(response: requests.Response) List[dict[Any, Any]][source]
The response will be either JSON or bytes.
- class nbiatoolkit.nbia.ReturnType[source]
Bases: enum.Enum
Generic enumeration.
Derive from this class to define new enumerations.
- LIST = 'list'
- DATAFRAME = 'dataframe'
- nbiatoolkit.nbia.getReferencedSeriesUIDS(series_tags_df: pandas.DataFrame) List[str][source]
Given a DataFrame containing DICOM series tags, retrieves the SeriesInstanceUIDs of the referenced series. Useful for RTSTRUCT DICOM files to find the series that the RTSTRUCT references.
TODO: implement SEG and RTDOSE.
- Parameters:
series_tags_df (pd.DataFrame) – A DataFrame containing DICOM series tags.
- Returns:
A list of SeriesInstanceUIDs of the referenced series.
- Return type:
List[str]
- Raises:
ValueError – If the series is not an RTSTRUCT.
- nbiatoolkit.nbia.extract_ROI_info(StructureSetROISequence) dict[str, dict[str, str]][source]
Extracts ROI information from the StructureSetROISequence.
- Parameters:
StructureSetROISequence (pandas.DataFrame) – A pandas DataFrame representing the StructureSetROISequence.
- Returns:
A dictionary containing ROI information, where the key is the ROI number and the value is the ROI information.
- Return type:
dict[str, dict[str, str]]
- Raises:
ValueError – If ROI Number is not found in the StructureSetROISequence.
- nbiatoolkit.nbia.getSequenceElement(sequence_tags_df: pandas.DataFrame, element_keyword: str) pandas.DataFrame[source]
Given a DataFrame containing DICOM sequence tags, retrieves the search space based on the element keyword.
- Parameters:
sequence_tags_df (pd.DataFrame) – A DataFrame containing DICOM sequence tags.
element_keyword (str) – The keyword of the element to search for.
- Returns:
A DataFrame containing the search space based on the element keyword.
- Return type:
pd.DataFrame
- Raises:
ValueError – If the element is not found in the sequence tags.
ValueError – If more than two elements are found in the sequence tags.
- nbiatoolkit.nbia.generateFileDatasetFromTags(tags_df: pandas.DataFrame) pydicom.Dataset[source]
Generate a pydicom Dataset object from a DataFrame of DICOM tags.
- Parameters:
tags_df (pd.DataFrame) – DataFrame containing DICOM tags.
- Returns:
A pydicom Dataset object containing the DICOM tags.
- Return type:
pydicom.Dataset
- nbiatoolkit.nbia.__version__ = '1.3.1'
- nbiatoolkit.nbia.downloadSingleSeries(SeriesInstanceUID: str, downloadDir: str, filePattern: str, overwrite: bool, api_headers: dict[str, str], base_url: nbiatoolkit.utils.NBIA_BASE_URLS, log: logging.Logger, Progressbar: bool = False)[source]
Downloads a single series from the NBIA server.
- Parameters:
SeriesInstanceUID (str) – The unique identifier of the series.
downloadDir (str) – The directory where the series will be downloaded.
filePattern (str) – The desired pattern for the downloaded files.
overwrite (bool) – Flag indicating whether to overwrite existing files.
api_headers (dict[str, str]) – The headers to be included in the API request.
base_url (NBIA_BASE_URLS) – The base URL of the NBIA server.
log (Logger) – The logger object for logging messages.
Progressbar (bool, optional) – Flag indicating whether to display a progress bar. Defaults to False.
- Returns:
True if the series is downloaded and sorted successfully, False otherwise.
- Return type:
bool
- class nbiatoolkit.nbia.NBIAClient(username: str = 'nbia_guest', password: str = '', log_level: str = 'INFO', logger: logging.Logger | None = None, return_type: nbiatoolkit.utils.ReturnType | str = ReturnType.LIST)[source]
A client for interacting with the NBIA API.
The NBIAClient class provides a high-level interface for querying the NBIA API and downloading DICOM series.
- Parameters:
username (str, optional) – The username for authentication. Defaults to “nbia_guest”.
password (str, optional) – The password for authentication. Defaults to an empty string.
log_level (str, optional) – The log level for the logger. Defaults to “INFO”.
return_type (Union[ReturnType, str], optional) – The return type for API responses. Defaults to ReturnType.LIST.
- headers
The API headers.
- Type:
dict[str, str]
- base_url
The base URL for API requests.
- Type:
nbiatoolkit.utils.NBIA_BASE_URLS
- logger
The logger for logging client events.
- Type:
Logger
- return_type
The current return type for API responses.
- Type:
str
- property OAuth_client: nbiatoolkit.auth.OAuth2
- property headers
- property base_url: nbiatoolkit.utils.NBIA_BASE_URLS
- property logger: logging.Logger
- property return_type: str
- _get_return(return_type: nbiatoolkit.utils.ReturnType | str | None) nbiatoolkit.utils.ReturnType[source]
Helper that resolves the effective return type: returns ReturnType(return_type) when return_type is not None, and the client's default (self._return_type) otherwise.
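The fallback rule used by _get_return can be tried in isolation. The sketch below redefines a local ReturnType with the two values listed on this page and uses a free function, resolve_return_type, in place of the method; the function name and the default argument are illustrative.

```python
from enum import Enum
from typing import Optional, Union


class ReturnType(Enum):
    """Local copy of the two documented values."""
    LIST = "list"
    DATAFRAME = "dataframe"


def resolve_return_type(
    return_type: Optional[Union[ReturnType, str]],
    default: ReturnType,
) -> ReturnType:
    """Coerce a string or enum member to ReturnType, or fall back to the default."""
    return ReturnType(return_type) if return_type is not None else default


print(resolve_return_type(None, ReturnType.LIST))
print(resolve_return_type("dataframe", ReturnType.LIST))
print(resolve_return_type(ReturnType.LIST, ReturnType.DATAFRAME))
```

Calling ReturnType with an unknown string such as "csv" raises ValueError, which is standard Enum behaviour, so invalid values fail early.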
- query_api(endpoint: nbiatoolkit.utils.NBIA_ENDPOINTS, params: dict = {}) List[dict[Any, Any]][source]
- getCollections(prefix: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves the collections from the NBIA server.
- Parameters:
prefix (str, optional) – Prefix to filter the collections by. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None which uses the default return type.
- Returns:
List of collections or DataFrame containing the collections.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getCollectionDescriptions(collectionName: str, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves the description of a collection from the NBIA server.
- Parameters:
collectionName (str) – The name of the collection.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None.
- Returns:
List of collection descriptions or DataFrame containing the collection descriptions.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getCollectionPatientCount(prefix: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves the patient count for collections.
- Parameters:
prefix (str, optional) – Prefix to filter the collections by. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None which uses the default return type.
- Returns:
List of collections and their patient counts or DataFrame containing the collections and their patient counts.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getModalityValues(Collection: str = '', BodyPartExamined: str = '', Counts: bool = False, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves possible modality values from the NBIA database.
- Parameters:
Collection (str, optional) – Collection name to filter by. Defaults to “”.
BodyPartExamined (str, optional) – BodyPart name to filter by. Defaults to “”.
Counts (bool, optional) – Flag to indicate whether to return patient counts. Defaults to False.
return_type (Optional[Union[ReturnType, str]], optional) – Return type of the response. Defaults to None which uses the default return type.
- Returns:
List of modality values or DataFrame containing the modality values.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getPatients(Collection: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves a list of patients from the NBIA API.
- Parameters:
Collection (str, optional) – The name of the collection to filter the patients. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – The desired return type. Defaults to None.
- Returns:
A list of patient dictionaries or a pandas DataFrame, depending on the return type.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getNewPatients(Collection: str, Date: str | datetime.datetime, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves new patients from the NBIA API based on the specified collection and date.
- Parameters:
Collection (str) – The name of the collection to retrieve new patients from.
Date (Union[str, datetime]) – The date to filter the new patients. Can be a string in the format “YYYY/MM/DD” or a datetime object.
return_type (Optional[Union[ReturnType, str]]) – The desired return type. Defaults to None.
- Returns:
A list of dictionaries or a pandas DataFrame containing the new patients.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- Raises:
AssertionError – If the Date argument is None.
- getPatientsByCollectionAndModality(Collection: str, Modality: str, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves patients by collection and modality.
- Parameters:
Collection (str) – The collection name.
Modality (str) – The modality name.
return_type (Optional[Union[ReturnType, str]], optional) – The desired return type. Defaults to None.
- Returns:
The list of patients or a pandas DataFrame, depending on the return type.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- Raises:
AssertionError – If Collection or Modality is None.
- getBodyPartCounts(Collection: str = '', Modality: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
- getStudies(Collection: str, PatientID: str = '', StudyInstanceUID: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
Retrieves studies from the NBIA API based on the specified parameters.
- Parameters:
Collection (str) – The name of the collection to retrieve studies from.
PatientID (str, optional) – The patient ID to filter the studies by. Defaults to “”.
StudyInstanceUID (str, optional) – The study instance UID to filter the studies by. Defaults to “”.
return_type (Optional[Union[ReturnType, str]], optional) – The desired return type. Defaults to None.
- Returns:
A list of dictionaries or a pandas DataFrame containing the retrieved studies.
- Return type:
List[dict[Any, Any]] | pd.DataFrame
- getSeries(Collection: str = '', PatientID: str = '', StudyInstanceUID: str = '', Modality: str = '', SeriesInstanceUID: str = '', BodyPartExamined: str = '', ManufacturerModelName: str = '', Manufacturer: str = '', return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
- getSeriesMetadata(SeriesInstanceUID: str | list[str], return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
- getNewSeries(Date: str | datetime.datetime, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
- getDICOMTags(SeriesInstanceUID: str, return_type: nbiatoolkit.utils.ReturnType | str | None = None) List[dict[Any, Any]] | pandas.DataFrame[source]
- generateFilePathFromDICOMTags(SeriesInstanceUID: str, filePattern: str = '%PatientName/%Modality-%SeriesNumber-%SeriesInstanceUID/%InstanceNumber.dcm') str[source]
Generates a file path from DICOM tags.
- Parameters:
SeriesInstanceUID (str) – The Series Instance UID of the DICOM series.
filePattern (str, optional) – The file pattern to use for generating the file path. Defaults to “%PatientName/%Modality-%SeriesNumber-%SeriesInstanceUID/%InstanceNumber.dcm”.
- Returns:
The generated file path.
- Return type:
str
Note
This only considers the first instance of the series. It is meant to be used to determine the directory name of the series files.