Autonomous Provenance to Drive Reproducibility in Computational Hydrology

The Kepler-driven provenance framework provides an autonomous provenance collection capability for hydrologic research. The framework scales to capture model parameters, user actions, and hardware specifications, and it facilitates quick retrieval for actionable insights, whether the scientist is handling a small watershed simulation or a large continental-scale problem. The framework will support four critical aspects of hydrologic research: (a) enabling reproducibility, (b) detecting system faults so that execution can be readjusted, (c) capturing performance data, and (d) predicting resource needs and optimizing resource utilization by exploiting historical patterns. Provenance collection is designed to be declarative and autonomous. The framework builds on a non-relational database, which allows the provenance schema to change over time. Scientists can perform pattern recognition and data mining on hydrology provenance data using NoSQL queries. In this work, we present the integrated Kepler provenance framework and explain how autonomous provenance collection accelerates reproducible hydrologic research.
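As an illustration of why a schema-flexible store suits provenance records, the following minimal Python sketch keeps each run as a free-form document and filters documents by field values. It is not the framework's API: the in-memory list `provenance_store`, the helpers `record_run` and `find_runs`, and every field name and value are hypothetical stand-ins for a real non-relational database and its query language.

```python
import platform
import time
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for a document store (assumption, not Kepler's API).
provenance_store: List[Dict[str, Any]] = []


def record_run(model: str, parameters: Dict[str, Any], runtime_s: float,
               **extra: Any) -> Dict[str, Any]:
    """Append one schema-free provenance document describing a model run."""
    doc: Dict[str, Any] = {
        "model": model,
        "parameters": parameters,
        # Hardware details are gathered automatically rather than typed in by the user.
        "hardware": {"machine": platform.machine(), "system": platform.system()},
        "runtime_s": runtime_s,
        "recorded_at": time.time(),
    }
    # Extra fields are stored as given; no schema migration is needed.
    doc.update(extra)
    provenance_store.append(doc)
    return doc


def find_runs(**criteria: Any) -> List[Dict[str, Any]]:
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in provenance_store
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Illustrative records only; the model names and numbers are made up.
record_run("watershed_small", {"grid_km": 1.0}, runtime_s=12.5)
record_run("continental", {"grid_km": 10.0}, runtime_s=840.0,
           status="failed", fault="node_timeout")

failed = find_runs(status="failed")
```

In this sketch the second record carries `status` and `fault` fields that the first one lacks, which is the kind of schema change a relational table would need a migration for. A fault-detection or resource-prediction step could then be built on queries such as `find_runs(status="failed")` over the accumulated history.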
