Computation is crucial for applications of NMR that provide insights into biomolecular structure, dynamics, interactions, and stability, and that have translational uses including diagnostics and drug discovery. The amount of NMR software is growing rapidly, making it more difficult to discover new software, to support a full complement of software on a diverse set of computer platforms, and to maintain legacy packages. Communication between software packages also becomes more challenging.
The broad aim of this Center is to simplify and integrate dissemination, maintenance, support, and application of NMR data processing and analysis software packages.
The Center will advance the use of biomolecular NMR for challenging applications in biomedicine, including structural biology, drug discovery, and metabolomics, through the following:
The NMRbox virtual machine provides a comprehensive environment for bio-NMR data processing and analysis that is pre-configured with NMR software, cloud-based for easy access, and captured in a version-controlled archive to allow persistence and foster reproducibility.
Data translation and workflow tools (CONNJUR) will be developed to facilitate inter-operation and simplify NMR processing workflows. The NMR-STAR data model and file format will be extended to support data exchange between packages in NMRbox, and to capture metadata essential for reproducibility of bio-NMR studies. The NMR-STAR data model will also be used to build tools for construction of complete BMRB data depositions.
Robust and extensible tools for Bayesian inference applied to spectral assignment of proteins, nucleic acids, and small-molecule metabolites will be developed. An API will enable Bayesian inference to be easily incorporated into external software packages. All components will use the NMR-STAR/CONNJUR unified data model/interface.
Projects with important biomedical implications were chosen that require a broad range of software, extensive interoperation among applications, and robust and reproducible analysis, and that produce metadata not currently captured. As exemplars of bio-NMR data processing workflows, they will determine the software to be embedded in NMRbox, the metadata to be captured, and the analysis tools to be implemented, and they will serve as test beds for the technology emanating from the TRDs. The criteria used to select projects will be reviewed annually, both to determine whether to continue each project and to solicit new projects. Close interaction with the External Advisory Board and the broader community will establish priorities for future projects. In addition to canonical solution-state protein structure determination workflows, the projects include solid-state NMR, relaxation, studies on RNA, interactions of small molecules with proteins, and metabolomics.
An initial complement of collaborative projects involves working directly with developers to increase the utility of their programs by implementing them in NMRbox (facilitating both broader distribution and higher performance through parallelization), developing completely novel software applications, or providing novel tools for discovering software. A “new lab setup” service will be implemented. As with the DBPs, collaborative projects will be reviewed annually for renewal and to recruit new projects.
NMRbox staff will run annual workshops and provide on-site training sessions at conferences to teach users how to process and analyze biomolecular NMR data using NMRbox, and to show developers how to incorporate their software into NMRbox. Mechanisms for remote consultation and technical support will be developed, employing web-based video “office hours”, a robust web site, an email “listserv”, and FAQ and video tutorial archives.