Abstract
The public deliverable D4.4 describes the software components and processes (here called
pipelines as the processes mostly consist of Big Data volumes streaming through successive
processing steps) to be utilized by the DataBio Platform and pilots. The pilot services were
tested in two phases of the project, Trial 1 and Trial 2. Most of the components were
used in both Trials with some updates in their features for Trial 2. In addition, this deliverable
reports which components were deployed in each pilot and the development platform that
the pilots tested their Big Data solutions on. The document aggregates information dispersed
among various deliverables (namely [REF-01] - [REF-06]). The aim of this deliverable is to
create a comprehensive overview of DataBio technical results.
The objective of WP4 “DataBio Platform with Pilot Support” was to configure and adopt Big
Data technologies for agriculture, forestry, and fishery. Together with WP5
“Earth Observation and Geospatial Data and Services”, the work package established a platform for the
development of bioeconomy applications. The software and dataset repository DataBio Hub
is a central resource of the platform. In doing so, WP4 supported the DataBio pilots in their
needs for Big Data technologies.
This deliverable starts with an overview of DataBio building blocks, such as the platform
architecture, software components, datasets, and models, that offer functionalities primarily for
services in the domains of agriculture, forestry, and fishery. It then examines how these building
blocks are exploited to identify reusable (sub)pipelines (“design patterns”) that can be used across
the pilots of the project and can be applied to other domains. The pipelines are one of the
major exploitable assets of DataBio.
The generic sections of the deliverable conclude with Chapter 4, which explains the
integration of different components into a pipeline and the services that are provided per
pilot. The main results for the pilot services and the component updates for both Trial 1 and
Trial 2 are presented from a technological perspective. The concluding chapter outlines the main
findings, lessons learned and emerging examples of best practices.
The deliverable comprises contributions from the following tasks:
• T4.1: DataBio Architecture Requirements
• T4.2: Advanced Visualization Services
• T4.3: Predictive Analytics and Machine Learning
• T4.4: Real-time Analytics and Stream Processing
• T4.5: Big Data Variety Management, Storage, Linked Data and Queries
• T4.6: Big Data Acquisition and Curation with Security/Privacy Support
• T5.1: EO Subsystem and Components
• T5.2: EO Data Discovery and Data Management & Acquisition Services
• T5.3: EO Data Processing, Extraction, Conversion and Fusion Services
• T5.5: Meteo Data Management