
CUBiMed.RUB - Core Unit Bioinformatics

The Core Unit Bioinformatics "CUBiMed.RUB" at the Faculty of Medicine offers a wide range of bioinformatics resources for academics in the life sciences. We offer consulting and analyses, training and tutorials, software and access to high-performance hardware. The focus is on proteomics, genomics, transcriptomics and their combination ("multi-omics"), as well as on other related or emerging omics technologies and clinical research data.

CUBiMed.RUB builds on the de.NBI-BioInfra.Prot project, which was funded by the BMBF from 2014 to 2021. The Core Unit allows the range of resources established in that project to be continued.

We have established a close cooperation with the European infrastructure initiative ELIXIR, which will continue in the future.

If you have any questions or inquiries, feel free to contact us: cubimed@rub.de

Resources

We offer consulting and analyses, training and tutorials, software and access to high-performance hardware for academics in the life sciences. These are provided as scientific cooperations and are usually free of charge. For more details, see our FAQs.

Consulting and analyses

CUBiMed.RUB offers support for the analysis of proteomics, genomics, transcriptomics and related clinical data. Consulting and analysis are free of charge for academics in the life sciences, which is especially helpful for groups without an associated bioinformatics or biostatistics department.

We offer a bioinformatics consulting service on the application of our own as well as third-party proteomics software. We advise on data analysis and the selection of suitable software tools. We also develop user-friendly workflows for frequently required analyses and make them available. Beyond the expert advice provided by our bioinformaticians, we can perform the corresponding analyses ourselves, supported by our high-performance hardware.

We also provide support in the planning of omics studies, in the selection of suitable statistical analysis methods and in the interpretation and visual presentation of the results obtained. This also includes up-to-date machine learning methods.

Additionally, we help researchers run their analyses on high-performance hardware in the cloud.

Training and Tutorials

We offer training courses for scientists on programming languages, data analysis and software tools. These courses are held regularly, either in person (in Bochum or at various scientific conferences) or online in the context of the de.NBI project. On request, we also offer additional training courses for small groups. Here are examples of such training courses:

"Differential analysis of proteomics data using R": R is a programming language especially suited for statistical analysis. We offer basic training courses that introduce life science researchers to R and help them start their first analysis. The courses teach not only the programming language but also the methodology and ways of interpreting the results. Follow-up courses cover more advanced topics and data visualization.

"Introduction to Python": Our course covers the basic programming paradigms of Python, handling of research data with the library "pandas", the visualization of research data with the libraries "plotly" and "sweetviz", up to the integration of Python into the high-performance programming language Rust.

Our training courses are announced via the de.NBI training homepage:
https://www.denbi.de/training

Software

MaCPepDB

MaCPepDB is a tryptic digest of the complete UniProtKB (Swiss-Prot & TrEMBL). It supports queries by peptide sequence, which return information about the proteins containing the peptide and thus whether the peptide is unique, as well as queries by mass, for example the mass of a peptide or of the precursor of an MS/MS spectrum. Queries can take posttranslational modifications and different mass deviations into account. The database can therefore be used, for example, to check in which proteins of the UniProt database a tryptic peptide occurs, or to find possibly interfering peptides in PRM/SRM experiments.

Web: https://macpepdb.mpc.rub.de

API-Documentation: https://macpepdb.mpc.rub.de/docs/api

Source Code
Version 1.x & 2.x

Version 3.x - Under development
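
To illustrate the idea behind such a database, the following is a minimal sketch in Python, not MaCPepDB's actual implementation. It assumes a simple trypsin rule (cleave after K or R unless followed by P), uses standard monoisotopic residue masses, and ignores modifications. The two protein sequences and accessions are made up for the example, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Standard monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence, missed_cleavages=0):
    """Return the set of tryptic peptides (assumed rule: after K/R, not before P)."""
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = set()
    for start in range(len(fragments)):
        stop = min(start + missed_cleavages + 1, len(fragments))
        for end in range(start, stop):
            peptides.add("".join(fragments[start:end + 1]))
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def build_index(proteins, missed_cleavages=0):
    """Map each peptide to the set of protein accessions that contain it."""
    index = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence, missed_cleavages):
            index[peptide].add(accession)
    return index


def query_by_mass(index, target_mass, tolerance_ppm):
    """Return all indexed peptides within a ppm window around target_mass."""
    tolerance = target_mass * tolerance_ppm / 1e6
    return sorted(p for p in index
                  if abs(peptide_mass(p) - target_mass) <= tolerance)


# Hypothetical toy "proteome" for illustration only.
PROTEINS = {"P1": "MAGKTESTRPEPK", "P2": "GGRMAGK"}
index = build_index(PROTEINS)

print(sorted(index["MAGK"]))             # ['P1', 'P2'], so MAGK is not unique
print(sorted(index["GGR"]))              # ['P2'], so GGR is unique
print(query_by_mass(index, 405.2046, 5)) # ['MAGK']
```

A real peptide database at UniProt scale needs persistent, partitioned storage and support for modifications; the in-memory dictionary here only shows the two query types described above.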


PIA - Protein Inference Algorithms

PIA is a toolbox for MS-based protein inference and identification analysis.

PIA allows you to inspect the results of common proteomics spectrum identification search engines, combine them seamlessly and conduct statistical analyses. The main focus of PIA lies on the integrated inference algorithms, i.e. inferring the proteins from a set of identified spectra. It also allows you to inspect your peptide spectrum matches (PSMs), calculate FDR values across different search engine results and visualize the correspondence between PSMs, peptides and proteins.

https://github.com/medbioinf/pia
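
As a rough illustration of what an FDR calculation on PSMs involves, here is a minimal Python sketch of the common target-decoy estimate. It is not PIA's code or API: the function name, the input format (score, is_decoy) and the choice of decoys/targets as the estimator are assumptions made for this example.

```python
def fdr_and_qvalues(psms):
    """Estimate FDR and q-values for PSMs given as (score, is_decoy) tuples.

    Assumes a higher score is better. FDR at each rank is estimated as
    (number of decoys) / (number of targets) among all PSMs at or above
    that rank.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)

    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or any lower-scoring rank,
    # which makes the values monotonic along the ranked list.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, is_decoy, fdr, q)
            for (score, is_decoy), fdr, q in zip(ranked, fdrs, qvalues)]


# Hypothetical scores for illustration only.
example = [(90, False), (80, False), (70, True), (60, False), (50, True)]
for score, is_decoy, fdr, q in fdr_and_qvalues(example):
    print(score, "decoy" if is_decoy else "target", round(fdr, 3), round(q, 3))
```

Filtering at a q-value threshold (for example 0.01) then keeps the largest set of PSMs whose estimated FDR does not exceed that threshold.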


CalibraCurve

CalibraCurve is a tool for generating and visualizing calibration curves for targeted proteomics data. CalibraCurve enables automated batch mode determination of dynamic linear ranges and quantification limits for targeted proteomics and similar assays. The software uses a variety of measurements to assess the accuracy of the calibration and provides intuitive visualizations.

Link to GitHub:
https://github.com/mpc-bioinformatics/CalibraCurve
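
The basic idea of a calibration curve can be sketched in a few lines of Python. This is a simplified illustration and not CalibraCurve's algorithm: it assumes an unweighted least-squares line and uses the percent bias of the back-calculated concentration as the only accuracy measure. The function names, the 20 % threshold and the measurement values are hypothetical.

```python
def fit_line(concentrations, responses):
    """Ordinary least-squares fit: response = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_bias(concentrations, responses, slope, intercept):
    """Percent deviation of each back-calculated concentration from the nominal one."""
    return [100.0 * ((y - intercept) / slope - x) / x
            for x, y in zip(concentrations, responses)]


# Hypothetical calibration data: nominal concentrations and measured responses.
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0]
responses = [2.3, 4.1, 10.2, 19.8, 40.1]

slope, intercept = fit_line(concentrations, responses)
biases = percent_bias(concentrations, responses, slope, intercept)

MAX_BIAS = 20.0  # assumed acceptance threshold in percent
for conc, bias in zip(concentrations, biases):
    status = "ok" if abs(bias) <= MAX_BIAS else "outside threshold"
    print(f"{conc:>5} {bias:7.2f} % {status}")
```

In practice, calibration levels that fail such a criterion at the low or high end would be excluded and the fit repeated, which is how a linear range and quantification limits can be narrowed down.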


Workflows

To produce reproducible results in bioinformatics analyses, we use specialized workflow systems such as Nextflow and Snakemake. We offer well-defined, general workflows and also create, or help you create, custom-tailored workflows for various omics analyses. By running containerized software, these workflows ensure reproducibility and thus help to provide FAIR (findable, accessible, interoperable, reusable) results.


Support for SDRF generation

SDRF (Sample and Data Relationship Format) for proteomics provides sample metadata in a structured and consistent way. This information is often missing. Providing it not only greatly improves the reproducibility of proteomics studies, but also enables periodic, semi-automatic reanalysis of large datasets with new tools. Because of these advantages, an SDRF file is now recommended for submission of any dataset to the PRIDE database.

Depending on your dataset, generating the SDRF file can be complex. We can therefore assist you in the process.
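
As a small illustration of the format, the Python sketch below reads a tab-separated SDRF table and checks for a few columns. The sample rows, file names and the reduced list of checked columns are assumptions for this example and do not cover the full SDRF-Proteomics specification, which requires more columns and defines the allowed values.

```python
import csv
import io

# Illustrative subset of columns only; the specification requires more.
CHECKED_COLUMNS = [
    "source name",
    "characteristics[organism]",
    "assay name",
    "comment[data file]",
]

# Hypothetical two-sample SDRF content (tab-separated).
EXAMPLE_SDRF = (
    "source name\tcharacteristics[organism]\tassay name\tcomment[data file]\n"
    "sample 1\thomo sapiens\trun 1\tsample1.raw\n"
    "sample 2\thomo sapiens\trun 2\tsample2.raw\n"
)


def read_sdrf(handle):
    """Read SDRF rows as dictionaries and check that the listed columns exist.

    Note: csv.DictReader keeps only the last of several identically named
    columns, so a complete reader would need to handle repeated headers.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [column for column in CHECKED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"missing SDRF columns: {missing}")
    return list(reader)


rows = read_sdrf(io.StringIO(EXAMPLE_SDRF))
for row in rows:
    print(row["source name"], "->", row["comment[data file]"])
```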


Support for repository submissions

To preserve the foundation of scientific analyses and their results, most communities and journals now require that the raw instrument data of any omics method be stored together with the basic findings in specialized repositories. In proteomics, these are the ProteomeXchange repositories. In sequencing-based omics, examples are the European Genome-phenome Archive (EGA), the European Nucleotide Archive (ENA) and ArrayExpress.

As the uploads can be cumbersome, we offer support in handling them.


Additional software not listed here and projects under development can be found on our GitHub pages (https://github.com/cubimedrub, https://github.com/mpc-bioinformatics and https://github.com/medbioinf).

Hardware and compute cluster

CUBiMed is building a compute cluster with several hundred CPU cores and gigabytes of memory. Paired with fast storage devices and a fast internet connection, this cluster will improve our ability to help researchers answer scientific questions.

Our computing resources will be free of charge for scientific use with workflow engines like Nextflow or Snakemake.

Heads of core unit

Jun.-Prof. Dr. Julian Uszkoreit

Tel.: +49 234 32 18175

E-Mail

Prof. Dr. Martin Eisenacher

Tel.: +49 234 32 18104

E-Mail