SciPy 2012 Postview: The following is a section taken from my SciPy 2012 proceedings paper from the conference last week. You can see the paper on GitHub. This post is a follow-up to the “Why Reproducibility is Important” post. I hope to do a recap of the conference itself next week! (NOTE: flmake is a CLI utility for workflow management in the FLASH code.)
A weaker form of reproducibility is known as replication [SCHMIDT]. Replication is the process of recreating a result when “you take all the same data and all the same tools” [GRAHAM] that were used in the original determination. Replication is a weaker criterion than reproduction because, at a minimum, the original scientist should be able to replicate their own work. Without replication, the same code executed twice will produce distinct results, and no trust whatsoever may be placed in the conclusions.
Much as version control has given developers greater control over reproducibility, other modern tools are powerful instruments of replicability. Foremost among these are hypervisors. The ease of use and ubiquity of virtual machines (VMs) in the software ecosystem allow for the total capture and persistence of the environment in which any computation was performed. Such environments may be hosted and shared with collaborators, editors, reviewers, or the public at large. If the original analysis was performed in a VM context, shared, and rerun by other scientists, then this constitutes replicability. Such a strategy has been proposed by C. Titus Brown as a stop-gap measure until diacomputational science is realized [BROWN].
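As a rough illustration of the strategy (not Brown’s actual workflow), the Python sketch below shows how a VirtualBox appliance might be exported and re-imported to share an analysis environment. The VM name and file names are hypothetical, and it assumes VirtualBox’s VBoxManage tool is on the PATH.

```python
# A minimal sketch, assuming VirtualBox is installed and a VM named
# "analysis-vm" (hypothetical) already contains the original analysis.
import subprocess

def export_vm(vm_name="analysis-vm", ova_path="analysis-vm.ova"):
    """Package the whole computational environment as a shareable appliance."""
    subprocess.check_call(["VBoxManage", "export", vm_name, "--output", ova_path])

def import_vm(ova_path="analysis-vm.ova"):
    """Recreate the captured environment on a collaborator's machine."""
    subprocess.check_call(["VBoxManage", "import", ova_path])

if __name__ == "__main__":
    export_vm()  # the resulting .ova file can be hosted and shared
```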
However, as Brown admits (see comments), the delineation between replication and reproduction is fuzzy. Consider these questions which have no clear answers:
- Are bit-identical results needed for replication?
- How much of the environment must be reinstated for replication versus reproduction?
- How much of the hardware and software stack must be recreated?
- What precisely is meant by ‘the environment’ and how large is it?
- For codes depending on stochastic processes, is reusing the same random seed replication or reproduction? (A short sketch follows this list.)
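The random-seed question is easy to make concrete. This minimal Python sketch (the seed values are arbitrary) shows that rerunning a stochastic calculation with the same seed yields bit-identical output, which is precisely why it is unclear whether such a rerun counts as replication or reproduction.

```python
import random

def draw(seed, n=5):
    """Draw n pseudo-random samples from a generator initialized with seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Reusing the seed gives bit-identical results; a different seed does not.
assert draw(42) == draw(42)
assert draw(42) != draw(43)
```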
Without justifiable answers to the above, ad hoc definitions have governed the use of replicability and reproducibility. Yet to the quantitatively minded, an I-know-reproducibility-when-I-see-it approach falls short. Thus the science of science, at least in the computational sphere, has much work remaining.
Even given the reproduction/replication dilemma, the flmake reproduce command is a reproducibility tool. This is because it takes the opposite approach to Brown’s VM-based replication. Though the environment is captured within the description file, flmake reproduce does not attempt to recreate this original environment at all. The previous environment information is simply there for posterity, helping to uncover any discrepancies which may arise. User-specific settings on the reproducing machine are maintained, including, but not limited to, which compiler is used.
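To make the idea concrete, here is a hypothetical sketch of recording environment information alongside a run and later diffing it against the reproducing machine. The field names and layout are illustrative only and are not flmake’s actual description-file format.

```python
# Hypothetical sketch only: these field names do not come from flmake.
import json
import os
import platform
import sys

def capture_environment():
    """Snapshot a few facts about the machine a run is performed on."""
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "compiler": os.environ.get("CC", "unknown"),
    }

def diff_environments(original, current):
    """Report fields that differ between the recorded and present environments."""
    keys = set(original) | set(current)
    return {k: (original.get(k), current.get(k))
            for k in keys if original.get(k) != current.get(k)}

if __name__ == "__main__":
    recorded = capture_environment()  # would be stored with the run description
    print(json.dumps(diff_environments(recorded, capture_environment()), indent=2))
```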
The claim that Brown’s work and flmake reproduce represent paragons of replicability and reproducibility, respectively, may easily be challenged. The author, like Brown himself, does not presume to have all of the answers, or even partially satisfactory ones. What is presented here is an attempt to frame the discussion and bound the space of possible meanings for these terms. Doing so with concrete code examples is preferable to debating the issue in the abstract.
References
[BROWN] C. Titus Brown, “Our approach to replication in computational science,” Living in an Ivory Basement, April 2012, http://ivory.idyll.org/blog/replication-i.html.
[GRAHAM] Jim Graham, “What is ‘Reproducibility,’ Anyway?”, Scimatic, April 2010, http://www.scimatic.com/node/361.
[SCHMIDT] Gavin A. Schmidt, “On replication,” RealClimate, Feb 2009, http://www.realclimate.org/index.php/archives/2009/02/on-replication/langswitch_lang/in/.