Prof. Dr. Hans-Ulrich Prokosch (University of Erlangen-Nuremberg) and Dr. Martin Lablans (German Cancer Research Center, DKFZ) head the IT department of the German Biobank Alliance (GBA). Together with an IT core team and the members of the IT basic teams at the different GBA sites, they are working to bring the IT infrastructure for biobanks together in a network. We spoke with Prokosch and Lablans about how the system works and the current state of development.
Mr. Prokosch, Mr. Lablans, your IT infrastructure will allow local sample data to be made available centrally. How exactly will the finished system work?
Prokosch: Our system will allow scientists to use a web interface to search for biosamples with specific attributes. They will be shown a list of which biobanks in our network have how many suitable samples. This will allow them to gather more samples for a research project than would normally be possible at one single location. Our IT system will make biomaterial samples genuinely available.
Lablans: Behind this simple search function is a complex IT structure: partial information is extracted from local databases and made available for queries in so-called “bridgeheads”. This will remain invisible to the researcher performing a search though, of course.
It sounds a bit like a literature search in a university library’s online catalogue.
Prokosch: The principle is certainly similar, as there is a catalogue of attributes to choose from. If a researcher is looking for frozen tissue samples from patients with lung cancer at a certain stage, for example, these attributes can be selected and combined into a logical query. The same principle applies to a query in the PubMed database, though a list of literature sources is provided there rather than specific samples.
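The attribute query described above can be sketched in a few lines of Python. This is only an illustration, not the GBA implementation: the record fields, site names, and the functions `matches` and `count_per_biobank` are all hypothetical, and the selected attributes are assumed to be combined with a logical AND.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
SAMPLES = [
    {"biobank": "Site A", "material": "tissue", "preservation": "frozen",
     "diagnosis": "lung cancer", "stage": "II"},
    {"biobank": "Site A", "material": "tissue", "preservation": "frozen",
     "diagnosis": "lung cancer", "stage": "II"},
    {"biobank": "Site A", "material": "tissue", "preservation": "frozen",
     "diagnosis": "lung cancer", "stage": "III"},
    {"biobank": "Site B", "material": "tissue", "preservation": "frozen",
     "diagnosis": "lung cancer", "stage": "II"},
    {"biobank": "Site B", "material": "blood", "preservation": "frozen",
     "diagnosis": "lung cancer", "stage": "II"},
    {"biobank": "Site C", "material": "tissue", "preservation": "paraffin",
     "diagnosis": "lung cancer", "stage": "II"},
]


def matches(sample, criteria):
    """Return True if the sample has every requested attribute value (logical AND)."""
    return all(sample.get(key) == value for key, value in criteria.items())


def count_per_biobank(samples, criteria):
    """Return how many matching samples each biobank holds."""
    counts = Counter(s["biobank"] for s in samples if matches(s, criteria))
    return dict(counts)


# Frozen tissue samples from patients with stage II lung cancer.
query = {
    "material": "tissue",
    "preservation": "frozen",
    "diagnosis": "lung cancer",
    "stage": "II",
}
result = count_per_biobank(SAMPLES, query)
print(result)
```

The result lists only biobanks with at least one suitable sample, together with the number of matches, which mirrors the list a researcher would be shown.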
Lablans: It’s important to embed the GBA search in a larger sample and project mediation process that also includes the possibility of personal advice. If a researcher is notified of a certain number of samples, they can enter into a public dialogue with the biobanks in question. The German Biobank Alliance does not just provide IT services here, but also access to biobank staff who assist in the search and offer advice.
What do you mean by a “public dialogue”?
Prokosch: “Public” doesn’t necessarily mean that the whole world is involved. Only the biobanks able to provide the researcher with samples are included. We refer to them as “candidate biobanks”.
Lablans: If a biobank responds to a researcher with the question “Is a cryo-sample essential for your study?”, for example, then all of the candidate biobanks can see both this question and the answer. This avoids the scenario in which several biobanks ask the same question and the researcher has to respond several times.
Prokosch: It is of course up to the staff of each individual biobank and the researchers themselves when they wish to switch to a private dialogue.
A central concept of the IT infrastructure is the so-called “bridgehead” that you already mentioned. What exactly is meant by this?
Lablans: Imagine you are on an island to which there is a bridge: you can control who crosses this bridge. Applied to our IT infrastructure, this means that you keep the data on your island despite having joined a network.
Prokosch: The “bridgehead” is the local installation through which a GBA site can transfer its data into a network across a bridge, or choose not to.
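The island-and-bridge idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual bridgehead software: the class `Bridgehead`, its method `answer`, and the notion of an approved-network list are hypothetical, and the sketch assumes that a site answers a query only with a match count while the records themselves stay local.

```python
class Bridgehead:
    """Hypothetical local installation that decides what crosses the bridge."""

    def __init__(self, site, records, allowed_networks):
        self.site = site
        self._records = records  # the data stays on the "island"
        self.allowed_networks = set(allowed_networks)

    def answer(self, network, criteria):
        """Return a match count to an approved network, or None if the bridge is closed."""
        if network not in self.allowed_networks:
            return None  # the site chooses not to share with this network
        return sum(
            all(record.get(key) == value for key, value in criteria.items())
            for record in self._records
        )


# Illustrative records; field names are assumptions.
records = [
    {"material": "tissue", "diagnosis": "lung cancer"},
    {"material": "blood", "diagnosis": "lung cancer"},
]

open_site = Bridgehead("Site A", records, allowed_networks=["GBA"])
closed_site = Bridgehead("Site B", records, allowed_networks=[])

criteria = {"material": "tissue", "diagnosis": "lung cancer"}
print(open_site.answer("GBA", criteria))    # count only, no records leave the site
print(closed_site.answer("GBA", criteria))  # None: this site keeps its bridge closed
```

Returning only a count keeps the decision about releasing any record-level data with the site, in line with the point that joining the network does not mean giving up control of the data.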
Lablans: We have already published a number of scientific articles on the “bridgehead” and also used it in other networks. A kind of brand has essentially been created, as it is open source technology that can be reused. The word “bridgehead” has therefore become synonymous with local data warehouses with a network structure.
Assuming my biospecimens and associated data are added to the database, where exactly in the IT structure are they “located”? Who has access to them?
Prokosch: It depends. If you want to donate a sample, you first need to sign a consent form. This document determines where your data is located and where it can be moved to. As a rule, samples are located where they were collected. There are of course different types of biobanks: in clinical biobanks, the data is usually stored in the documentation system at the hospital where the patients are being treated. In a population-based or study-specific biobank, the data is stored in a study database. Whether the data is actually ever allowed to leave the place where it is stored – in a pseudonymised form, of course – depends on whether you gave your consent for this.
What does “pseudonymised” mean here?
Prokosch: Under no circumstances will basic identifying information about your person, such as your name, date of birth, address, and contact details, be disclosed. Names are replaced with a pseudonym – we use sequences of numbers and letters for this.
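A minimal sketch of what such pseudonymisation could look like is shown below. It is not the GBA procedure: the class `Pseudonymiser`, the field names, the pseudonym length of ten characters, and the use of name plus date of birth to recognise the same person are all assumptions made for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
# Identifying fields that must never be disclosed (names are illustrative).
IDENTIFYING_FIELDS = {"name", "date_of_birth", "address", "contact"}


class Pseudonymiser:
    """Hypothetical helper that replaces identifying data with a random pseudonym."""

    def __init__(self):
        # Mapping from identity to pseudonym; in this sketch it never leaves the site.
        self._pseudonyms = {}

    def pseudonym_for(self, identity):
        """Return a stable random sequence of letters and digits for one person."""
        if identity not in self._pseudonyms:
            self._pseudonyms[identity] = "".join(
                secrets.choice(ALPHABET) for _ in range(10)
            )
        return self._pseudonyms[identity]

    def pseudonymise(self, record):
        """Return a copy of the record without identifying fields, plus a pseudonym."""
        identity = (record["name"], record["date_of_birth"])
        released = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        released["pseudonym"] = self.pseudonym_for(identity)
        return released


pseudonymiser = Pseudonymiser()
first = pseudonymiser.pseudonymise({
    "name": "Erika Mustermann", "date_of_birth": "1970-01-01",
    "address": "Musterstrasse 1", "contact": "erika@example.org",
    "material": "tissue",
})
second = pseudonymiser.pseudonymise({
    "name": "Erika Mustermann", "date_of_birth": "1970-01-01",
    "address": "Musterstrasse 1", "contact": "erika@example.org",
    "material": "blood",
})
print(first)
print(second)
```

Both released records carry the same pseudonym, so samples from one donor can still be linked, while the name, date of birth, address, and contact details are absent.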
BBMRI-ERIC is also working on an IT system to enable centralised searches for biosamples. How is the cooperation organised?
Prokosch: We cooperate closely and try to use the same components wherever possible. On the European level, the different national languages naturally play a role. But some basic components can be used in both the GBA and the European concepts. After all, it is in the biobanks’ best interests not to have to install and operate entirely different IT components for the German and European networks. And it is also advantageous for researchers searching for samples in Germany and Europe.
The Medical Informatics Initiative (MII) has now begun its work. As head of the MIRACUM consortium, where do you see common ground between MII and GBA?
Prokosch: Both face similar problems and challenges in many respects. It is important for us to coordinate our basic data sets, for example. It would make absolutely no sense to opt for different data descriptions. GBA has already developed cardiology and oncology data sets based on preliminary international work. We are a little ahead of MII in this respect. It will be important to exchange views on these data sets. A joint meeting with representatives from MII is planned for the second quarter of 2018.
I have just one final question: what is the current state of development?
Lablans: All GBA sites have now received a preliminary software version of the “bridgehead” and are currently working to fill this with data. None of the biobanks are ready yet, but searches can already be performed based on a small data set.
Prokosch: At present, this data set serves more as “proof of concept” to demonstrate the technical possibilities. It would not yet satisfy a scientist looking for biosamples. Clinical parameters must first be added and further data sets stored to achieve this.
Lablans: We will gradually add medical disciplines to the data sets and increase the level of detail. The German Center for Lung Research (DZL) has already defined its own comprehensive core data set, for instance. We therefore now wish to enter into a dialogue with DZL to negotiate the development of a jointly coordinated data set.
Prokosch: In autumn 2018, a user group will test the functionality of the prototype system. We will then decide whether to go live with the system or to adjust it first.
The interview was conducted by Verena Huth.