The HPCI Shared Storage is the data-sharing infrastructure of HPCI. In April 2015, we found 150,000 corrupted files on the TokyoTech data servers. Although most of these files were recovered from their replicas, a total of 923 files were lost in this failure.
Our analysis showed that this was a typical case of silent data corruption. To cope with silent data corruption, we introduced a periodic data integrity check in the HPCI Shared Storage, using the file digest feature of Gfarm. This periodic check makes it possible to detect corrupted files within one week.
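
The idea behind a digest-based integrity check can be sketched as follows. This is only an illustration under stated assumptions, not the Gfarm implementation: it uses Python's standard hashlib on local files, the SHA-256 algorithm is an arbitrary choice, and the names `file_digest` and `find_corrupted` are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_corrupted(recorded):
    """Return the paths whose current digest differs from the recorded one.

    `recorded` is a hypothetical mapping from file path to the digest that
    was stored when the file was written.
    """
    corrupted = []
    for path, expected in recorded.items():
        if file_digest(path) != expected:
            corrupted.append(path)
    return corrupted
```

Silent corruption changes the file contents without any I/O error, so only a recomputed digest reveals it. If a scan like this is run over all files on a weekly schedule, a corrupted file would be noticed at most about one week after the damage occurs, while replicas may still be available for repair.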

Date and time: Tuesday, February 9, 2016, 12:00 – 13:15
12:00 - 12:15 Light refreshments & coffee time
12:15 - 13:15 AICS Cafe
Venue: AICS 6th floor auditorium
Title: Periodical data integrity check for silent data corruption in HPCI Shared Storage
Speaker: Hiroshi Harada (原田 浩), HPCI System Technology Team