To architect or engineer? Lessons from DataPool on building RDM repositories


To architect or engineer? Lessons from DataPool on building RDM repositories. Steve Hitchcock, JISC DataPool Project. 9th DCC Research Data Management Forum (RDMF9), Cambridge, 14-15 November 2012

Why architecting? http://datapool.soton.ac.uk

DataPool architecture (SharePoint). Peter Hancock, iSolutions, University of Southampton

DataPool: Building Capacity, Developing Skills, Supporting Researchers
JISCMRD Progress Workshop, 24-25 October 2012, Nottingham
Timeline: October 2011 to March 2013
Diagram headings: Progress; Informed by; Developing/working with; Policy and guidance; Training; Data repository
Diagram items: Doctoral Training Centres; Graduate & staff training services; Case studies (Imaging, 3D Geodata); IDMB; Surveys of data practices among academics; Support for Data Management Plans; e.g. SharePoint, EPrints 3.3; University Strategic Research Groups; EPrints data apps; 3-layer metadata; Capture/share with external sources, e.g. SWORD-ARM; Large-scale data storage; Assign DataCite DOIs
Byatt, D., Hitchcock, S., White, W. http://datapool.soton.ac.uk/

Data repository platforms, from a data repository perspective. Architected / Engineered: DataFlow, MS SharePoint, EPrints, Other. Other platforms available: DSpace, CKAN, data.bris, etc.

Implementations of the DataFlow model. Two-stage architecture: DataStage (addresses the Dropbox effect for data producers), then SWORD deposit to a curated repository/archive (DataBank, EPrints, DSpace). DataFlow: two data deposit motivations for creators: want to (practice), need to (policy). QMUL
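
The two-stage idea on this slide can be sketched roughly in Python. This is an illustration under stated assumptions, not DataFlow code: a dictionary of staged files stands in for DataStage, an in-memory dictionary stands in for the curated repository, and the function names `package_dataset` and `deposit`, the manifest layout and the metadata keys are all hypothetical. A real deposit would travel over SWORD rather than a function call.

```python
import hashlib
import io
import json
import zipfile


def package_dataset(files, metadata):
    """Stage 1 (hypothetical): bundle staged files plus a manifest into one zip package."""
    manifest = {
        "metadata": metadata,
        # Record a checksum per file so the receiving side can verify the package.
        "files": {name: hashlib.sha256(data).hexdigest() for name, data in files.items()},
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, data in files.items():
            package.writestr(name, data)
        package.writestr("manifest.json", json.dumps(manifest, sort_keys=True))
    return buffer.getvalue()


def deposit(repository, package_bytes):
    """Stage 2 (hypothetical): verify the package, then store it in the curated repository."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        manifest = json.loads(package.read("manifest.json"))
        for name, expected in manifest["files"].items():
            actual = hashlib.sha256(package.read(name)).hexdigest()
            if actual != expected:
                raise ValueError(f"checksum mismatch for {name}")
    record_id = len(repository) + 1
    repository[record_id] = {"metadata": manifest["metadata"], "package": package_bytes}
    return record_id


# Usage: a producer works in the staging area, then submits a package for curation.
staged = {"readings.csv": b"t,value\n0,1.5\n"}
curated = {}
new_id = deposit(curated, package_dataset(staged, {"title": "Example dataset"}))
```

Keeping packaging and deposit as separate steps mirrors the split between a working area for data producers and a curated archive.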

DataStage: Upload file. DataStage was developed at the University of Oxford. DataStage screenshots courtesy JISC Kaptur project, http://www.vads.ac.uk/kaptur/. Thanks to Carlos Silva

DataStage: Submit as data package

3-layer metadata model. Takeda et al., 6th IDCC, Dec. 2010, available from http://eprints.soton.ac.uk/169533/. JISC Institutional Data Management Blueprint (IDMB) Project, University of Southampton

SharePoint user interface 1: project

SharePoint user interface 2: data fields for format, keywords

Prof. Simon Cox (Engineering) on SharePoint: “The concept that formed part of SP thinking (at Southampton) from the very inception that ability to use SP as a way to manage or at least collaborate as part of a 5-10 year programme of work. “The other side is what we’re doing with intellectual property and what we’re offering for students. I chair a group design project, and every single student has said ‘I just do it all on Dropbox’. The same is happening with our research. So I think we have at least to provide a level of service and a level of integration between our research experience and our teaching experience. Would these people go to Southampton rather than University of Nowhereshire on the Web or the University of Google or the University of Dropbox? These are deep questions for us.”

ePrints Soton: Item type: Dataset. Currently EPrints v3.2, customised to ePrints Soton. Dataset Item Type from 2007

ePrints Soton: start to deposit Dataset

EPrints data apps. Apps available from EPrints Bazaar, http://bazaar.eprints.org/. Apps work with EPrints v3.3 or later

EPrints (test repo): DataShare enabled. App by Tim Brody, EPrints DataPool

EPrints (test repo): Data Core enabled. Data Core “adds a few fields and doesn’t remove any fields from the eprint object. It creates an alternate workflow for datasets which is much smaller than a normal eprints workflow.” App by Patrick McSweeney

EPrints (test repo): Data Core enabled 2. App by Patrick McSweeney

Essex Research Data metadata profile aims. “Using metadata schema relevant to UK HE and research data (DataCite, INSPIRE and DDI 2.1), we have developed a basic metadata profile suited to describing research data generated at institutions with disciplinary diversity. The inclusion of fields like Funder and Grant number will ensure future harvesting and linking opportunities (like RCUK Research Outcome Systems). The metadata also suits the EPSRC data registry requirements.” http://researchdataessex.posterous.com/repository-beta-metadata-profile-released

EPrints: Essex Research Data repository. Screenshots courtesy JISC Research Data @Essex project. Thanks to Louise Corti, Tom Ensom, Alexis Wolton. EPrints v3.3.10, customised to Essex Research Data

Essex Research Data record

Essex Research Data: observations. Assumes data deposit, so no selection of EPrints Item Type. No selection of e.g. Creative Commons licence, just copyright. Requirement for Time Period suggests a particular type of data is expected. Fields for Geographic info (not required) suggest a particular type of data is expected.
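
The distinction between required and optional fields in a profile like this can be illustrated with a small Python check. This is a sketch only: the field names are hypothetical identifiers chosen for the example, not the names used in the Essex profile, and treating Funder and Grant number as optional is an assumption, since the slides only say that Time Period is required and Geographic info is not.

```python
# Hypothetical field identifiers; only the required/optional status of
# Time Period and Geographic info comes from the slides.
REQUIRED_FIELDS = {"time_period"}
OPTIONAL_FIELDS = {"geographic_info", "funder", "grant_number"}  # funder/grant status assumed


def check_record(record):
    """Return (missing required fields, fields outside the profile), each sorted."""
    # A field counts as present only if it has a non-empty value.
    present = {key for key, value in record.items() if value not in (None, "")}
    missing = sorted(REQUIRED_FIELDS - present)
    unknown = sorted(set(record) - REQUIRED_FIELDS - OPTIONAL_FIELDS)
    return missing, unknown


# Usage: a record with no time period and a field the profile does not define.
example = {"funder": "EPSRC", "grant_number": "", "licence": "CC BY"}
missing, unknown = check_record(example)
```

A check of this kind makes the profile's assumptions visible: a dataset with no meaningful time period would fail validation even if everything else were complete.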

Architects and surroundings. Nine Elms, London (usembassylondon). R. Moore, Utopia on Thames, Observer, 11 Nov 2012: “On one plot aggressively crystalline blocks by Rogers Stirk Harbour are going up, their diamond shapes having nothing in particular to do with anything around them. On another Foster and Partners have designed a series of curving, stepped, blobby things, of the kind usually designed to take advantage of views on the Med or the Gulf, but are here facing each other like rows of daleks. Again, it shows little interest in anything around

Open access repository interoperability. Confederation of Open Access Repositories (COAR). Dublin Core, CRIS-CERIF, OpenAIRE, RepositoryNet, Rioxx. RCUK: Research Outcomes System, Gateway to Research, REF. Is there the same current debate about interoperability of data repositories?

COAR on OA interoperability. Specific initiatives designed to support interoperability: AuthorClaim, CRIS-OAR, DataCite, DINI Certificate for Document and Publication Services, DOI, DRIVER, Handle System, KE Usage Statistics Guidelines, OAI-ORE, OAI-PMH, OA-Statistik, OA Repository Junction, OpenAIRE, ORCID, PersID, PIRUS, SURE, SWORD, and UK RepositoryNet. COAR, The Current State of Open Access Repository Interoperability (2012), 26 Oct. 2012, v.02. MT @gknight2000 (Gareth Knight) Lincoln's
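
As one concrete example from this list, OAI-PMH exposes repository records through plain HTTP requests. The sketch below only builds a `ListRecords` request URL; the base URL `https://repository.example.org/oai` and the helper name `list_records_url` are hypothetical, while the `verb`, `metadataPrefix` and `set` parameters and the `oai_dc` prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (helper name is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting by set, if the repository defines sets.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Usage: the base URL is a placeholder, not a real repository endpoint.
url = list_records_url("https://repository.example.org/oai")
```

Whether a data repository can be harvested this simply depends on whether its metadata maps onto a prefix that harvesters understand, which is part of the question the previous slide raises.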

What next for DataPool repositories?
SharePoint: user test and feedback sessions scheduled, will direct further development
EPrints apps (1 or 2 of the following, initially):
Develop app based on Essex data repository, providing other repositories with a 1-click install of this profile
Build interoperability (I/O) apps: e.g. Data Management Plans, Dropbox
Automate record capture for producers of large-scale, regular data outputs
