Efficient exploitation of the massive amount of modern-day life science data
The ODEX4all project focuses on the challenges associated with the ever-growing amount of research data in the life sciences. In Next Generation Sequencing alone, the volume of data doubles every 6-8 months, and high-throughput datasets contain up to millions of new associations. Traditional ways of publishing, retrieving and using these massive data sources are inadequate: they do not give researchers and computers access to information in the manner that the scientific reasoning process requires. ODEX4all sets out to build the infrastructure for the comprehensive exploitation of available datasets in a continuous machine-mind interaction.
Deriving new biological insights from in silico analytics is one of the novelties the project aims to deliver. The project will address research questions from different disciplines, driven by the private partners, and will answer them progressively. What these research questions have in common is that they all require the advanced knowledge discovery capabilities provided by ODEX4all.
ODEX4all will realize semantic interoperability on key datasets, creating an infrastructure that enables advanced levels of Computer Assisted Analytics and Discovery. The datasets will include open access publications, closed access publications, abstracts, relevant legacy data sources, and descriptions of published and current experimental datasets with links to the actual data. The associations contained in these sources will be ‘super-published’ as Nanopublications: small RDF graphs containing a single assertion, its provenance and its context. The project will compare various approaches to accessing and analyzing this interoperable dataset, and will review the impact of these approaches on the human scientific reasoning and confirmation process, as it iterates with computer analytics, in a context-specific and user-tailored manner.
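To make the idea of a Nanopublication more concrete, the following is a minimal Python sketch of one assertion packaged with its provenance and publication context. It is an illustration under stated assumptions, not the project's implementation: the class, the `http://example.org/` identifiers, the gene-disease association and the chosen predicates are all hypothetical, and every term is treated as an IRI for simplicity. The `np:` terms come from the public nanopublication schema and `prov:` from the W3C PROV ontology.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Real vocabularies: the nanopublication schema, W3C PROV and rdf:type.
NP = "http://www.nanopub.org/nschema#"
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/"

# A triple of (subject, predicate, object); all terms are IRIs in this sketch.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    """One assertion plus the graphs describing its provenance and context."""

    uri: str
    assertion: Triple          # exactly one scientific statement
    provenance: List[Triple]   # where the assertion came from
    pubinfo: List[Triple]      # who published the nanopublication itself

    def graphs(self) -> Dict[str, List[Triple]]:
        """Return the four named graphs: head, assertion, provenance, pubinfo."""
        a = self.uri + "#assertion"
        p = self.uri + "#provenance"
        i = self.uri + "#pubinfo"
        head = [
            (self.uri, RDF_TYPE, NP + "Nanopublication"),
            (self.uri, NP + "hasAssertion", a),
            (self.uri, NP + "hasProvenance", p),
            (self.uri, NP + "hasPublicationInfo", i),
        ]
        return {
            self.uri + "#Head": head,
            a: [self.assertion],
            p: list(self.provenance),
            i: list(self.pubinfo),
        }

    def to_trig(self) -> str:
        """Serialize the named graphs in a simple TriG-style layout."""
        blocks = []
        for name, triples in self.graphs().items():
            body = "\n".join(f"  <{s}> <{pr}> <{o}> ." for s, pr, o in triples)
            blocks.append(f"<{name}> {{\n{body}\n}}")
        return "\n\n".join(blocks)


# Hypothetical example: a gene-disease association derived from a dataset.
np_uri = EX + "np/1"
example = Nanopublication(
    uri=np_uri,
    assertion=(EX + "gene/G1", EX + "isAssociatedWith", EX + "disease/D1"),
    provenance=[(np_uri + "#assertion", PROV + "wasDerivedFrom", EX + "dataset/42")],
    pubinfo=[(np_uri, PROV + "wasAttributedTo", EX + "curator/alice")],
)

print(example.to_trig())
```

Keeping the assertion in its own named graph is what allows the provenance graph to make statements about the assertion as a whole, so each association can be cited, attributed and filtered independently of the publication or dataset it was extracted from.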
From a bioinformatics and semantics point of view, the project will bring together a completely novel combination of technologies and approaches. The private participants in the project, ranging from established information and hardware providers to start-ups focusing on advanced pattern recognition in big data, contribute different approaches to data publication, storage, processing, user interaction and hardware use. The academic partners will use the collective research questions and the provided infrastructure to augment their cutting-edge approaches to computer-assisted scientific discovery and to evaluate which systems are best suited to a given class of problems.
At an eScience level, ODEX4all will deliver a completely new way of publishing, using, searching and reasoning with massive data output, something that is rapidly becoming mandatory in proposals to (e)Science funders. ODEX4all thus provides an assessment not only of the impact on fundamental research but also of the ability to publish and share reusable data more effectively, which is a key challenge of data science.