ContainerSystem
ContainerSystem(self, cDir='./containers', mDir='./modulefiles', forceImage=False, prereqs='', threads=8, cache_dir='/Users/gzynda/rgc_cache', force_cache=False, verbose=False)
Class for managing the rgc image cache
Parameters
- cDir (str): Path to output container directory
- mDir (str): Path to output module directory
- forceImage (bool): Option to force the creation of singularity images
- prereqs (str): string of prerequisite modules separated by ":"
- threads (int): Number of threads to use for concurrent operations
- cache_dir (str): Path to rgc cache
- force_cache (bool): Whether to force overwrite the cache
- verbose (bool): Whether to enable verbose logging
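The prereqs argument is a single ":"-separated string, while the lmod_prereqs attribute is a list. A minimal sketch of that conversion, assuming empty entries are dropped; the helper name split_prereqs is hypothetical and not part of rgc:

```python
def split_prereqs(prereqs):
    """Split a ':'-separated module string into a list (hypothetical helper).

    Assumption: blank entries and surrounding whitespace are ignored.
    """
    return [name.strip() for name in prereqs.split(":") if name.strip()]


# The default prereqs='' yields an empty list.
no_prereqs = split_prereqs("")
two_prereqs = split_prereqs("singularity:gcc")
```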
Attributes
system (str)
: Container system
containerDir (str)
: Path to use for containers
moduleDir (str)
: Path to use for module files
forceImage (bool)
: Force singularity image creation
invalid (set)
: Set of invalid urls
valid (set)
: Set of valid urls
images (dict)
: Path of singularity image or docker url after pulling
registry (dict)
: Registry of origin
progs (dict)
: Set of programs in a container
name_tag (dict)
: (name, tag) tuple of a URL
keywords (dict)
: List of keywords for a container
categories (dict)
: List of categories for a container
homepage (dict)
: Original homepage of software in container
description (dict)
: Description of software in container
full_url (dict)
: Full URL to container in registry
blocklist (set)
: Set of programs to be blocked from being output
prog_count (Counter)
: Occurrence count of each program seen
lmod_prereqs (list)
: List of prerequisite modules
n_threads (int)
: Number of threads to use for concurrent operations
logger (logging)
: Class level logger
cache_dir (str)
: Location for metadata cache
force_cache (str)
: Force the regeneration of the metadata cache
validateURL
ContainerSystem.validateURL(self, url, include_libs=False)
Adds url to the self.invalid set when the URL is invalid and to self.valid when the URL works.
By default, containers designated as libraries on bio.tools are excluded.
Parameters
- url (str): Image url used to pull
- include_libs (bool): Include containers of libraries
Attributes
self.valid (set)
: Where valid URLs are stored
self.invalid (set)
: Where invalid URLs are stored
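The valid/invalid bookkeeping described above can be sketched as follows. This only illustrates the two sets; the class UrlBook, its is_reachable callback, and the boolean return value are assumptions made for the example, and the real method also consults bio.tools to exclude libraries, which is not modeled here.

```python
class UrlBook:
    """Hypothetical stand-in that mirrors only the valid/invalid sets."""

    def __init__(self, is_reachable):
        self.valid = set()
        self.invalid = set()
        # is_reachable is an assumed callback: url -> bool
        self._is_reachable = is_reachable

    def validate_url(self, url):
        """Record url in self.valid or self.invalid and report which one."""
        ok = self._is_reachable(url)
        if ok:
            self.valid.add(url)
        else:
            self.invalid.add(url)
        return ok


# Toy check for the example only: accept URLs under one made-up prefix.
book = UrlBook(lambda url: url.startswith("quay.io/example/"))
book.validate_url("quay.io/example/tool_a:1.0")
book.validate_url("not-a-registry/tool_b:2.1")
```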
validateURLs
ContainerSystem.validateURLs(self, url_list, include_libs=False)
Adds url to the self.invalid set and returns False when a URL is invalid
Parameters
- url_list (list): List of URLs to validate
- include_libs (bool): Include containers of libraries
pullAll
ContainerSystem.pullAll(self, url_list, delete_old=False)
Uses worker threads to concurrently pull
- image
- metadata
- repository info
for a list of urls.
Parameters
- url_list (list): List of urls to pull
- delete_old (bool): Delete old images that are no longer used
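One common way to pull a list of urls with worker threads is a thread pool. The sketch below assumes a stand-in fake_pull function that returns placeholder values; it is not rgc's implementation, and the delete_old behavior is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_pull(url):
    """Hypothetical stand-in for a single pull: returns placeholder results."""
    return {
        "url": url,
        "image": f"<image for {url}>",
        "metadata": f"<metadata for {url}>",
        "repository": f"<repository info for {url}>",
    }


def pull_all(url_list, n_threads=8, pull=fake_pull):
    """Run `pull` on every url using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map yields results in the same order as url_list
        results = list(pool.map(pull, url_list))
    return {result["url"]: result for result in results}


pulled = pull_all(
    ["quay.io/example/tool_a:1.0", "quay.io/example/tool_b:2.1"],
    n_threads=2,
)
```

Threads suit this kind of work because pulling is dominated by network and disk waits rather than Python computation.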
pull
ContainerSystem.pull(self, url)
Pulls the following
- image
- metadata
- repository info
Parameters
- url (str): Image url used to pull
deleteImage
ContainerSystem.deleteImage(self, url)
Deletes a cached image
Parameters
- url (str): Image url used to pull
scanAll
ContainerSystem.scanAll(self)
Runs self.cacheProgs on all containers concurrently with threads
cacheProgs
ContainerSystem.cacheProgs(self, url, force=False)
Crawls all directories on a container's PATH, caches a list of all executable files in self.progs[url], and counts the global occurrence of each program in self.prog_count[prog].
Parameters
- url (str): Image url used to pull
- force (bool): Force a re-scan and print results (for debugging only)
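The crawl-and-count idea can be sketched on an ordinary filesystem. This example scans a PATH-style string on the host rather than inside a container, and the function names list_executables and count_programs are hypothetical.

```python
import os
from collections import Counter


def list_executables(path_var):
    """Return the names of executable files in each directory of a PATH string."""
    progs = set()
    for directory in path_var.split(os.pathsep):
        if not os.path.isdir(directory):
            continue  # skip empty or missing PATH entries
        for entry in os.scandir(directory):
            if entry.is_file() and os.access(entry.path, os.X_OK):
                progs.add(entry.name)
    return progs


def count_programs(progs_by_url):
    """Count in how many images each program name appears."""
    counts = Counter()
    for progs in progs_by_url.values():
        counts.update(set(progs))  # each image contributes at most once
    return counts


# Toy data standing in for self.progs; the urls and program names are made up.
example_counts = count_programs({
    "quay.io/example/tool_a:1.0": {"ls", "tool_a"},
    "quay.io/example/tool_b:2.1": {"ls", "tool_b"},
})
```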
getProgs
ContainerSystem.getProgs(self, url, blocklist=True)
Returns a list of all programs on the path of a url that are not blocked
Parameters
- url (str): Image url used to pull
- blocklist (bool): Filter out blocked programs
Returns
list
: programs on PATH in container
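The filtering step amounts to a set difference. A minimal sketch, assuming the programs and the blocklist are plain sets and that the result is sorted for stable output; get_progs here is a free function written for illustration, not the method itself:

```python
def get_progs(progs, blocklist, use_blocklist=True):
    """Return program names, optionally dropping those in the blocklist."""
    if not use_blocklist:
        return sorted(progs)
    return sorted(prog for prog in progs if prog not in blocklist)


# Made-up program names for the example.
filtered = get_progs({"ls", "cat", "tool_a"}, blocklist={"ls", "cat"})
unfiltered = get_progs({"ls", "cat", "tool_a"}, blocklist={"ls", "cat"}, use_blocklist=False)
```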
getAllProgs
ContainerSystem.getAllProgs(self, url)
Returns a list of all programs on the path of url.
This is a shortcut for self.getProgs(url, blocklist=False)
Parameters
- url (str): Image url used to pull
findCommon
ContainerSystem.findCommon(self, p=25, baseline=[])
Creates a blocklist containing all programs that are in at least p% of the images
self.blocklist[url] = set([prog, prog, ...])
Parameters
- p (int): Percentile of images
- baseline (list): Exclude all programs from this list of urls
Attributes
permitlist (set)
: Set of programs that are always included when present
blocklist (set)
: Set of programs to be excluded
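The "at least p% of the images" rule can be sketched with a Counter. This is an illustration under stated assumptions: it returns one global set, treats the permitlist as names that are never blocked, and does not model the baseline argument. The function name find_common and the toy data are hypothetical.

```python
from collections import Counter


def find_common(progs_by_url, p=25, permitlist=frozenset()):
    """Return programs found in at least p% of images, minus the permitlist."""
    n_images = len(progs_by_url)
    counts = Counter()
    for progs in progs_by_url.values():
        counts.update(set(progs))  # count each program once per image
    cutoff = n_images * p / 100.0
    return {
        prog
        for prog, seen in counts.items()
        if seen >= cutoff and prog not in permitlist
    }


# Four made-up images: "ls" is in all, "samtools" in two, "bwa" in one.
images = {
    "img1": {"ls", "samtools", "bwa"},
    "img2": {"ls", "samtools"},
    "img3": {"ls"},
    "img4": {"ls"},
}
blocked_half = find_common(images, p=50)
blocked_permit = find_common(images, p=50, permitlist={"samtools"})
```

Programs shared by many images tend to be base-system utilities, which is why a frequency cutoff is a plausible way to keep them out of the generated modules.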
genModFiles
ContainerSystem.genModFiles(self, pathPrefix, contact_url, modprefix, delete_old)
Generates an Lmod modulefile for every valid image
Parameters
- url (str): Image url used to pull
- pathPrefix (str): Prefix to prepend to containerDir (think environment variables)
- contact_url (list): List of contact urls for reporting issues
- modprefix (str): Container module files can be tagged with modprefix-tag for easy stratification from native modules
- delete_old (bool): Delete outdated module files
genLMOD
ContainerSystem.genLMOD(self, url, pathPrefix, contact_url, modprefix='')
Generates an Lmod modulefile based on the cached container.
Parameters
- url (str): Image url used to pull
- pathPrefix (str): Prefix to prepend to containerDir (think environment variables)
- contact_url (list): List of contact urls for reporting issues
- modprefix (str): Container module files can be identified with modprefix-tag for easy stratification from native modules
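To give a sense of what generating a modulefile involves, the sketch below renders a small Lua modulefile from cached values. It is not rgc's template: the function render_modulefile, its arguments, and the choice to expose each program through a shell function that calls singularity exec are all assumptions. It uses only standard Lmod modulefile functions (help, whatis, prereq, set_shell_function).

```python
def render_modulefile(name, tag, description, image_path, progs, prereqs=()):
    """Render a minimal Lmod (Lua) modulefile as a string (hypothetical layout)."""
    lines = [
        f"help([[{description}]])",
        f'whatis("Name: {name}")',
        f'whatis("Version: {tag}")',
    ]
    for module in prereqs:
        lines.append(f'prereq("{module}")')
    for prog in sorted(progs):
        # Assumed wrapper: run the program inside the image with singularity
        run = f"singularity exec {image_path} {prog}"
        bash_cmd = f'{run} "$@"'
        csh_cmd = f"{run} $*"
        lines.append(
            f"set_shell_function(\"{prog}\", '{bash_cmd}', '{csh_cmd}')"
        )
    return "\n".join(lines) + "\n"


# Made-up values for the example.
module_text = render_modulefile(
    name="example_tool",
    tag="1.0",
    description="Example tool packaged in a container",
    image_path="/containers/example_tool.sif",
    progs=["run_tool"],
    prereqs=["singularity"],
)
```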