Abstract
This article proposes a technique for identifying and recommending scientific workflows for reuse and repurposing. A scientific workflow is represented as a layer hierarchy that captures the hierarchical relations among the workflow, its sub-workflows, and its activities. Semantic similarity is computed between the layer hierarchies of workflows, and a graph-skeleton-based clustering technique groups the layer hierarchies into clusters. The barycenters of each cluster are identified and serve as its core workflows, which facilitates cluster identification as well as workflow ranking and recommendation with respect to scientists' requirements.
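Summary
Highlights
This article aims to enable the reuse and repurposing of scientific workflows, and to that end proposes effective techniques for identifying and recommending them. First, scientific workflows are represented with a hierarchical model that clearly describes the layered relations between a workflow and its internal sub-workflows and activities. On this basis, a strategy is proposed for assessing the similarity between two hierarchical models, and existing hierarchical models are clustered with a graph-skeleton-based clustering algorithm. Finally, the barycenters of each cluster are identified to represent its core workflows, which improves the speed and quality of cluster identification and of workflow ranking and recommendation, thereby meeting users' needs.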
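The pipeline outlined in the abstract (layer hierarchy, pairwise similarity, barycenter per cluster, recommendation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the `Node` class, the function names, and in particular the similarity measure, which is a plain Jaccard overlap of (depth, label) pairs standing in for the article's semantic similarity. The clusters are taken as given, so the graph-skeleton-based clustering step is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of a layer hierarchy: a workflow, sub-workflow, or activity."""
    label: str
    children: list[Node] = field(default_factory=list)


def layer_items(node: Node, depth: int = 0) -> set[tuple[int, str]]:
    """Collect a (depth, label) pair for every element of the hierarchy."""
    items = {(depth, node.label)}
    for child in node.children:
        items |= layer_items(child, depth + 1)
    return items


def similarity(a: Node, b: Node) -> float:
    """Jaccard overlap of (depth, label) pairs.

    Assumption: a simple stand-in for semantic similarity, not the
    measure defined in the article.
    """
    items_a, items_b = layer_items(a), layer_items(b)
    return len(items_a & items_b) / len(items_a | items_b)


def barycenter(cluster: list[Node]) -> Node:
    """Return the member with the highest total similarity to its cluster."""
    return max(cluster, key=lambda w: sum(similarity(w, o) for o in cluster))


def recommend(query: Node, clusters: list[list[Node]]) -> list[Node]:
    """Pick the cluster whose barycenter best matches the query,
    then rank that cluster's members by similarity to the query."""
    best = max(clusters, key=lambda c: similarity(query, barycenter(c)))
    return sorted(best, key=lambda w: similarity(query, w), reverse=True)
```

Comparing the query against one barycenter per cluster, rather than against every stored workflow, is what keeps cluster identification cheap in this sketch; only the members of the selected cluster are then ranked individually.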
Cite this article
Zhou, Z., Cheng, Z. & Zhu, Y. Similarity assessment for scientific workflow clustering and recommendation. Sci. China Inf. Sci. 59, 113101 (2016). https://doi.org/10.1007/s11432-015-0934-9
DOI: https://doi.org/10.1007/s11432-015-0934-9