Friday 16 November 2012

Metadata for Research Data


White Rose meeting 06/11/12 at the University of Leeds

Aims
  • To identify metadata fields common to all datasets
  • To review the suitability of the DataCite standard and identify additional fields needed
  • To agree required fields for a data catalogue entry (regardless of format and location), including mandatory and optional fields
  • To review Graham Blyth’s 9 layers of metadata (for RoaDMaP Work Package 5)

Context

One outcome of the WR 'Perspectives on Research Data Management' meeting of 24th May 2012 [1] was an interest amongst some participants in discussing the requirements for a research data catalogue at the WR level. Earlier blog posts discuss the establishment of a 'WRRDC' [2] and projects involving Research Data Catalogues [3] & [4]. The RoaDMaP [5] project at Leeds is developing a data catalogue component for its RDM infrastructure under Work Package 5, 'repositories and metadata'. Although there are no overwhelming arguments for a data catalogue at the WR level (in addition to the institutional and national levels), it was thought that a meeting including WR people would be useful. Metadata, rather than software systems, was to be discussed.

The EPSRC requires full compliance with its expectations [6] by May 2015. These require that 'appropriately structured metadata describing the research data they hold is published (normally within 12 months of the data being generated) and made freely accessible on the internet; in each case the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it. Where the research data referred to in the metadata is a digital object it is expected that the metadata will include use of a robust digital object identifier'. In addition, EPSRC required fields include funding agency, grant number, last access date, and privileged access period.

Metadata schemata / models in development

The IDMB three-tier model [7], developed to enable discoverability, was examined and thought adequate for data discovery:
  • Core metadata for findability
  • Discipline metadata for classification
  • Project metadata for object detail  
This model has been developed by subsequent projects - DAMARO [8], RD@Essex [9], and Iridium [10]:
  • Core metadata based on Dublin Core / DataCite [11]
  • Context/Admin metadata mapped to CERIF [12]
  • Discipline metadata based on subject schema
Compare this with the proposed RoaDMaP model (WP5 lead Graham Blyth), which includes metadata for preservation, validation and re-purposing:
  • Core metadata - DC and DataCite
  • Discipline metadata - discipline ontologies are emerging
  • Project / Institution metadata - reflecting context of research?
  • Instrument metadata - data about instrument settings and specifications
  • Management metadata - Access control and lifecycle information
  • Preservation metadata - data formats and file structures
  • Community metadata - 'live' metadata that can be added to after dataset is put in the repository. Repository users can make observations about and reinterpret data. Cross-discipline keywords.  
Unfortunately we did not have time to discuss the details of the RoaDMaP metadata scheme.

Minimum mandatory metadata

DataCite proposes five mandatory properties [11] {containing child properties and attributes} as a minimum to be supplied on metadata submission:
  • Identifier {Identifier type} a registered DOI.
  • Creator {creatorName; nameIdentifier; nameIdentifierScheme} the main researchers involved, or authors of the publication in priority order.
  • Title {titleType} open format. Type includes alternative title, subtitle, translated title.
  • Publisher - in the case of datasets, the entity that makes the data available.
  • Publication year - the year that the data is made publicly available; for embargoed data, the year the embargo ends.
The Data.Bris [13] project at Bristol also requires these five mandatory fields for submission of data.
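
As a rough illustration of what "minimum to be supplied on metadata submission" could mean in practice, the sketch below checks a record for the five mandatory properties. The property names follow the DataCite list above; the dictionary layout, the function name `missing_mandatory` and the example DOI are assumptions made for this example, not part of any project's actual system.

```python
# Names follow the five mandatory DataCite properties listed above.
# The flat-dictionary record layout is an assumption for illustration only.
MANDATORY_PROPERTIES = (
    "identifier",
    "creator",
    "title",
    "publisher",
    "publicationYear",
)


def missing_mandatory(record):
    """Return the mandatory properties that are absent or empty in a record."""
    missing = []
    for name in MANDATORY_PROPERTIES:
        value = record.get(name)
        if value is None or value == "" or value == []:
            missing.append(name)
    return missing


example_record = {
    "identifier": "10.1234/example-doi",  # hypothetical DOI
    "creator": ["Researcher, A."],
    "title": "Example dataset",
    "publisher": "",                       # deliberately left empty
}

# Report which mandatory properties still need to be supplied.
print(missing_mandatory(example_record))
```

A check like this only tests presence; it says nothing about whether the DOI is registered or the year is plausible.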
The DAMARO project at Oxford [8] is extending this set to include further core metadata:
  • Location of dataset - DOI or URL for digital data
  • Medium - Digital or non-digital or both
  • Creator affiliation
  • Access terms and conditions
  • Access date - expiry of embargo
  • Data owner - Institution / Faculty / Department
  • Metadata rights - CC0 by default
  • Subject - possibly based on FAST (Faceted Application of Subject Terminology) [14]
A contextual metadata set is also mandatory:
  • Funding agency
  • Grant number
  • Project information
  • Last access request date
  • Source & Source URL - if imported record
  • Data generation process
  • Why data was generated
  • Date range of data collection
  • Reason for embargo

Proposed RDMI architectures

At Oxford, DAMARO is building on past projects to implement a three-stage Dataflow architecture [15] by May 2013:
  • DataStage (data and metadata creation and local management), fed by DMP metadata. Data and metadata are transferred via SWORD to
  • DataBank (institutional data storage on Fedora), whose records are harvested by OAI-PMH to
  • DataFinder (catalogue of research data), which will exchange metadata with the CRIS (a Symplectic RIS is in development) and with ORA, Oxford's institutional repository.
At Leeds, a similar architecture is being considered as part of RoaDMaP. EPrints is also being considered as an alternative to the Fedora-based DataBank, hence the interest in RD@Essex [16], where EPrints is being developed as a data repository. It is possible to link DataStage to EPrints via SWORD. Either a DataBank or an EPrints data repository could be a subset of an established institutional repository. DepositMO [17] is being considered for harvesting files from personal storage to the repository.
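
To make the harvesting step a little more concrete, here is a minimal sketch of how a catalogue might build an OAI-PMH ListRecords request against a data store. The `verb`, `metadataPrefix`, `from` and `set` parameters are standard OAI-PMH request arguments; the base URL and the function name `list_records_url` are hypothetical and do not describe the actual DataBank or DataFinder endpoints.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed since this date.
        params["from"] = from_date
    if set_spec:
        # Restrict the harvest to one set, if the repository defines sets.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only to show the shape of the request.
print(list_records_url("https://databank.example.org/oai", from_date="2012-11-01"))
```

The catalogue would then fetch this URL and parse the returned XML; that part is omitted here because it depends on the metadata format each repository exposes.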

But the discussion was about metadata standards not infrastructure – was there any consensus?

Discussion
  • A WR catalogue would only need discovery-layer metadata if records were pulled from institutional data catalogues; the institutional catalogues themselves would require a much larger set of mandatory and optional metadata elements.
  • Why a WR catalogue? Is there any benefit in creating a data catalogue at the WR level? Each institution will probably need to develop its own data catalogue anyway. A WR-scale research data catalogue may make it easier to link to the resulting WRRO publications. If based on an EPrints platform, it may be easy to implement as part of, or a subset of, WRRO.
  • Why not a national-level catalogue? We should broaden our definition of a catalogue, e.g. Google is a large distributed catalogue. Is DataCite a catalogue? Well yes, DataCite have established a beta metadata search facility [18] to search their records associated with the DOIs ascribed [19].
  • Regarding DataCite: only five fields are mandatory to mint a DOI and the rest are optional, presumably to encourage uptake by keeping the entry barrier low.
  • DataCite is a good starting point as a basis for our requirements. We need to know where the main sources of data fields are, and which fields are populated automatically.
  • User tags are needed for developing discipline ontologies / subject taxonomies, through crowdsourced tagging and keywords.
  • But who owns the data? It was decided the department / school / faculty probably owns the data - as they own the facilities that captured the data. Who is the contact? The head of department or equivalent. 
  • Who is the 'Creator'? Everyone involved in the data capture process could be named; or all the people named as authors of published research output; or only the Principal Investigator. Other people involved (co-creators, technicians, students) may be named, with their role specified, in the optional fields 'Contributor' and 'Role'.
  • Metadata mapping: attribute of an element – option of qualified or not.
  • Considering elements for discoverability; what terms will be searched for? – Require more than the 5 mandatory fields of DataCite; including a mandatory subject field.
  • Rights should be mandatory.
  • What metadata needs adding manually? Some preservation metadata will. There cannot be a common schema across institutions for object-level metadata.
  • Problem of research not funded by research councils, solely university funded. Where does metadata come from if not the Grant management system? Core project level metadata for funded research is available. Would we need a data catalogue for unfunded research? It won’t be joined up to other institutional systems. Repository can provide unique identifiers. External range of IDs would be available, but would need to be mapped to internal identifier. 

Discussion of detailed mandatory metadata fields

Taking the DAMARO project metadata scheme [8] as a starting point, we looked at each of the mandatory fields to discuss:
a. Would this be a mandatory field for a White Rose Institution data catalogue?
b. Would this field be required for discoverability?
c. How should we specify the exact definition for this field?


Element | Flags | Notes
Record ID | M | Unique internal repository record ID
Location of dataset | | URL / DOI – DOI is best since URL is not persistent. Is DOI location of record or dataset? For non-digital dataset, contact details given.
Medium | | Digital or non-digital or both (container for data, rather than format of data)
Creator | M R D | Drawn from institutional CRIS. Drawn from Names Project [20]
- Creator ID | | Unique ID for person
- PID scheme | | Scheme for unique personal ID
Creator affiliation | R | Drawn from CRIS – What level, institution or department?
- Affiliation dates | | Dates of affiliation
Title of dataset | M D |
Publisher of data | D | Data centre, repository, institution where dataset is accessed from.
Publication year | M D | Year for citation purpose. Date when dataset is openly accessible – end of embargo (DataCite). Alternatively, date submitted to publisher, or date DOI minted? (See note 1 below.)
Access terms | | Administrative metadata
Data owner | | PI, HOD, Head of Faculty, school, institute named as representative of body.
Access date | | Embargo expiry date. (See note 2 below.)
Rights for metadata | | CC0, ODC. Administrative to ensure open metadata
Subject | R D | Mandatory subject description based on a controlled vocabulary. FAST [21]. Automatic base-level = discipline (creator affiliation?). Other subject vocabularies (LCSH).
Keywords | R O | User-devised subject keywords.

      Key: M = mandatory for DataCite, O = optional, R = repeatable element,
               D = terms for discovery in catalogue search.
      Text colours: DAMARO notes, meeting suggestions, my suggestions
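
The flags in the key above lend themselves to a simple lookup. The sketch below is only an illustration of how a catalogue might record them and pull out, say, the elements intended for discovery; the dictionary `ELEMENT_FLAGS` and the function `elements_with` are names invented for this example, and only elements that carry a flag in the table are included.

```python
# Flags per element, transcribed from the table above:
# M = mandatory for DataCite, O = optional, R = repeatable, D = discovery.
ELEMENT_FLAGS = {
    "Record ID": "M",
    "Creator": "MRD",
    "Creator affiliation": "R",
    "Title of dataset": "MD",
    "Publisher of data": "D",
    "Publication year": "MD",
    "Subject": "RD",
    "Keywords": "RO",
}


def elements_with(flag):
    """Return the element names that carry the given flag letter."""
    return [name for name, flags in ELEMENT_FLAGS.items() if flag in flags]


# Elements a catalogue search would index for discovery.
print(elements_with("D"))
```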



  1. How do people refer to datasets catalogued using DataCite scheme metadata if the dataset is embargoed, but the person is not subject to the restriction – how does a person refer to their own embargoed dataset?
  2. Publication year could be considered the first date of access, and may differ from the access date if the dataset is re-embargoed after being previously accessible (new embargo).
  3. Problem of management of embargoed elements within a dataset? Best to remove these elements first and publish separately after embargo? Or include embargoed material after embargo period passed and mint new DOI.
  4. Problem of management of embargoed elements within a dataset? Different publication / access dates - publication date for non-embargoed data; access date for embargoed data.
  5. Problem of management of embargoed elements within a dataset? Multiple sublevel DOIs may be minted for different parts within a dataset. 
  6. Other Mandatory fields - Related Identifier DC.12 - mandatory for re-purposed data.

Things for the group to do
Contact DAMARO  - how's progress?
Keep up to speed with RD@Essex
Reflect on this meeting and continue with the process of identifying required fields.

References

[1] White Rose Perspectives on Research Data Management http://library.leeds.ac.uk/info/377/roadmap/123/roadmap_events/2
[2] Metadatatron Blog - A White Rose Research Data Catalogue http://metadatatron.blogspot.co.uk/2012/09/white-rose-research-data-catalogue.html
[3] Metadatatron Blog - Metadata for a WR Data Catalogue (part 1) http://metadatatron.blogspot.co.uk/2012/10/metadata-for-wr-data-catalogue.html
[4] Metadatatron Blog - Metadata for a WR Data Catalogue (part 2) http://metadatatron.blogspot.co.uk/2012/10/metadata-for-wr-data-catalogue-part-2.html
[5] RoaDMap - Work packages http://blog.library.leeds.ac.uk/downloads/file/260/roadmap_work_packages 
[6] EPSRC Expectations http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
[7] IDMB Initial findings report (p84 & 89) http://eprints.soton.ac.uk/195155/1/idmbinitialfindingsreportv4.pdf  
[8] Just enough metadata: Metadata for research datasets in institutional data repositories. Rumsey, S (2012) DAMARO http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf 
[9] Research Data @ Essex Blog http://researchdataessex.posterous.com/metadata
[10] IRIDIUM Blogpost http://iridiummrd.wordpress.com/2011/12/09/195/
[11] DataCite - Mandatory core metadata http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf#page=8
[12] CERIF 1.5 Common European Research Information Format http://www.eurocris.org/Uploads/Web%20pages/CERIF-1.5/CERIF1.5_Semantics.xhtml 
[13] Data.Bris - Minimum set of mandatory metadata http://data.blogs.ilrt.org/2012/05/18/minimal-set-of-mandatory-metadata/
[14] FAST (Faceted Application of Subject Terminology) http://www.oclc.org/research/activities/fast.html
[15] Infrastructure for Research Data Management at the University of Oxford. Wilson, J (2012) DAMARO  http://www.ands.org.au/events/webinars/james-wilson-jisc-webinar-slides.pdf
[16] Opening up research data at Essex: Experiments with EPrints. Ensom, T & Wolton, A (2012)  Research Data @Essex http://www.data-archive.ac.uk/media/368772/rde_or2012_notes.pdf
[17] DepositMO and DepositMOre: Modus Operandi for Repository Deposits http://blog.soton.ac.uk/depositmo/tag/depositmo/
[18] DataCite - Beta search facility at http://search.datacite.org/ui
[19] DataCite - Blog http://datacite.wordpress.com/2012/01/26/datacite-search/
[20] Names Project http://names.mimas.ac.uk/
[21] FAST http://www.oclc.org/research/activities/fast.html


Friday 26 October 2012

Metadata for a WR Data Catalogue - part 2


Data Catalogue aspects of RDM infrastructure projects 
The JISC Digital infrastructure: Research management programme (2011-13) Managing Research Data strand is supporting 17 Research Data Management Infrastructure projects [1]. Four of these, RoaDMaP, SwordARM, Open Exeter and Managing Research Data (at UWE), are referred to in part one of this post [2].
Overall there seems to be a consensus in accepting the three-tier metadata model put forward by the IDMB [3] project at Southampton (and Takeda et al. 2010 [4]).



This model has been developed through the IRIDIUM (O’loughlin 2011) [5] and Data.Bris (Boyd 2012) [6] projects, with the three tiers given slightly different attributes:
1. A minimum mandatory metadata set providing core information. This could be based around a standard metadata element set (the 15 Dublin Core elements, the DataCite kernel [45] or CKAN [47]) but includes other fields such as location, access terms and conditions, and any embargo information. This top level relates to the discoverability of the resource.
2. A second mandatory layer of contextual and administrative metadata, covered by elements within the CERIF model [48]: base entities (project, person, organisation unit, collaborators); funding information (funder, grant number); and result entities (publication, patent, research product). Ideally, much of this will be automatically harvested or fed from administrative systems.
3. A final level of optional metadata providing the rich, more granular, detailed information. This layer provides the discipline-related information required for reuse.
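
One way to picture the three tiers is as nested record types, with only the first tier required at creation time. This is a minimal sketch under that assumption; the class and field names (`CoreMetadata`, `ContextMetadata`, `DatasetRecord`, `discovery_view`) are hypothetical and are not taken from IDMB, IRIDIUM or Data.Bris.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class CoreMetadata:
    """Tier 1: minimum mandatory set, used for discovery."""
    identifier: str
    title: str
    creators: list
    publisher: str
    publication_year: int


@dataclass
class ContextMetadata:
    """Tier 2: contextual/administrative fields, ideally fed from other systems."""
    funder: str = ""
    grant_number: str = ""
    project: str = ""


@dataclass
class DatasetRecord:
    core: CoreMetadata
    context: ContextMetadata = field(default_factory=ContextMetadata)
    # Tier 3: optional discipline-specific detail, left as free-form here.
    discipline: dict = field(default_factory=dict)

    def discovery_view(self):
        """Return only the tier 1 fields, e.g. for a higher-level catalogue."""
        return asdict(self.core)


record = DatasetRecord(
    core=CoreMetadata(
        identifier="10.1234/example-doi",  # hypothetical DOI
        title="Example dataset",
        creators=["Researcher, A."],
        publisher="Example University",
        publication_year=2012,
    )
)
print(record.discovery_view())
```

Keeping the third tier as a plain dictionary reflects the point that discipline metadata varies too much to share one schema.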

Oxford University has hosted a number of JISC-funded RDM projects since 2008, including EIDCSR [7], Sudamih [8], Admiral [9], Vidaas [10], and Dataflow [11]. These were concerned with research workflows, embedding preservation, developing core metadata, sharing research data in collaborative workspaces, cloud storage of data and research data repositories; all towards development of an integrated Research Data Service. The current project, DAMARO [12], is implementing this Research Data Service [13]; key aspects are the development of DataFinder, the Oxford University data catalogue, and DataBank [14], the data repository. The DataFinder architecture (Wilson 2012) [15] will have the following characteristics:
  • OAI-PMH harvesting of data stores
  • SWORD2 compliant
  • CERIF compatible
  • Metadata schema based on DataCite
  • Interfaces directly with DataBank & ORDS (Oxford's Online Research Database Service, based on the DataStage [16] system)
  • Users can register non-electronic data.
  

For DataFinder, a three-tier metadata approach is envisaged, comprising:

Minimum core elements
  • Record/digital object ID
  • Location of dataset
  • Medium
  • Creator (if not depositor)
  • Creator affiliation (if not depositor)
  • Title
  • Publisher of data
  • Publication year
  • Access terms & conditions
  • Data owner
  • Access date to data
  • Rights for metadata
  • Subject

Contextual mandatory elements
  • Funding agency
  • Grant number
  • Project information
  • Last access request date
  • Source
  • Source URL
  • Data generation process
  • Why the data was generated/Abstract/Brief description
  • Date
  • Reason for embargo

Optional metadata (selection)
  • Co-creators/contributor
  • Role
  • Affiliation
  • Sub-title
  • Subject
  • Keywords
  • Date (other)
  • Language
  • ResourceType
  • AlternateIdentifier: e.g. DOI
  • RelatedIdentifier: e.g. DOI of publication
  • Size
  • Format
  • Version
  • Data generation process
  • Abstract/Brief description
  • Documentation 1: descriptive or contextual information about the dataset (e.g. machine settings and experimental conditions under which the data were gathered)
  • Documentation 2
  • Subject specific m.d.
  • Subject specific m.d.
  • Subject specific classification
  • Subject specific classification scheme
  • Data complying with known standards e.g. DDI
(Rumsey 2012) [17]
The metadata will have three sources: Manual entry - generally disliked, can be inaccurate but can produce rich metadata; Imported - from data capture instruments, from institutional systems (RIMS, DMP), from a data repository; Autogenerated by the RDMI.  (Rumsey 2012) [17] 
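
The three sources could be combined in many ways. The sketch below assumes, purely for illustration, that manually entered values override imported ones, which in turn override autogenerated ones, and that empty values never overwrite anything. That precedence is an assumption of this example, not something stated by DAMARO, and the function name and sample values are hypothetical.

```python
def merge_metadata(autogenerated, imported, manual):
    """Merge three metadata dictionaries, later sources taking precedence.

    Assumed precedence (lowest to highest): autogenerated, imported, manual.
    Empty or missing values never overwrite an existing value.
    """
    merged = {}
    for source in (autogenerated, imported, manual):
        for key, value in source.items():
            if value not in (None, ""):
                merged[key] = value
    return merged


combined = merge_metadata(
    autogenerated={"size": "2 MB", "title": "untitled"},
    imported={"grant_number": "GRANT-0001"},      # hypothetical grant number
    manual={"title": "Survey data", "size": ""},  # blank entry is ignored
)
print(combined)
```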



The research data catalogue is central to the infrastructure being developed by the IRIDIUM project [18] at Newcastle, recording what data they have and making it discoverable. This will be integrated with MyProject (the Research Management System), MyImpact (the institutional publications system, equivalent to our Symplectic system) and the EPrints-based IR. The catalogue will not be a repository but rather a straightforward web-based searchable catalogue of data, and it will only collect information on data that supports publication. “We have opted for this measure as we know that data supporting publication should have already been prepared (i.e. confidentiality respected through the scrubbing of data, fields marked sensibly etc) plus we feel that data is normally available at this point for peer review and as a matter of good scientific practice, so (hopefully) we’re not asking too much more from our academics to fill in data information at the same point they fill in their new publication info in our output system.” (Wood, L. 2012) [19]. A list of twelve key fields and seven further fields has been drawn up, which will be publicly or privately viewable through the catalogue interface. Again the three-tier model has been adhered to: “This is quite appealing as we already collect much of the information in the first two levels through our current systems (MyProjects, e-prints and MyImpact) so the main additional input we’d be requiring from the academic would be at the third level.” (O’loughlin 2011) [20].
Interconnectivity of RDC with other elements of the RDMI is important because researchers definitely do not want to enter research project metadata more than once in multiple systems, within or outside the institution.
"This requires us to understand some of the systems the RDC may need to exchange metadata with that have existing information already entered. These could be local research group metadata catalogues, local/national repositories and other online systems" (Wood 2012) [21].

Bristol University’s Data.Bris project [22] is developing an RDMI which will integrate a new CRIS (PURE), which also provides an institutional repository, with the existing University Research Data Storage Facility (RDSF). This extends the storage facility into a Research Data Repository and allows data to be published from the storage facility. The proposed architecture [23] involves the creation of a metadata store (a SPARQL 1.1 service) and will adhere to the OAI-PMH [24], OAI-ORE [25], and SWORD [26] protocols. Data.Bris has defined a minimal set of mandatory metadata to be used when depositing or publishing data (Identifier; Creators; Title; Publisher; Publication year) and is investigating which metadata elements may be created automatically and which need adding manually. Again the three-tier metadata model is thought useful, especially since metadata can be pulled in from the CRIS (Boyd 2012) [27].

The Datapool project [28] follows on from the Institutional Data Management Blueprint (IDMB) project. The project will launch and populate an EPrints institutional data repository to collect and store all research data produced across disciplines within the institution, as part of the research data management infrastructure. The repository will have access to storage sufficient for local data assets and will also provide links to data held elsewhere, both externally in subject repositories and internally using other systems. The project is investigating mechanisms for transferring data and metadata into the data repository from other local data stores, and exporting data from the repository using the SWORD2 protocol. They will also use the three tier metadata model developed by IDMB  [29].
"The lesson for data repositories is clear: to capture content from data creators you must provide useful services that will become an integral part of the workflow of creating the data. It will not work to isolate particular requirements, such as records creation, from other needs such as storage services. Data does not appear with the same mode and frequency as published papers, so workflow must accommodate many different patterns. Research data is often produced by machines, so deposit workflow must allow scope for non-manual intervention" (Hitchcock 2012) [30].


For the Research Data @Essex project [31], EPrints is being used as the repository. This project is also adhering to the IDMB-inspired three-tier metadata model; they considered that EPrints metadata provided for levels 1 and 2, whilst level 3 ‘minutiae’ are derived by drawing from the DataCite, INSPIRE, DDI and DataShare schemas. A multi-schema crosswalk was produced [32] and the metadata schema worked out based on DataCite, INSPIRE and DDI 2.1 [33].
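
For readers unfamiliar with crosswalks, the sketch below shows the general idea using the five mandatory DataCite properties mapped to simple Dublin Core elements. It is not the RD@Essex crosswalk, which covers more schemas and far more fields; the function name `crosswalk` and the sample record are assumptions made for this example.

```python
# Illustrative mapping from the five mandatory DataCite properties
# to unqualified Dublin Core element names.
DATACITE_TO_DC = {
    "identifier": "identifier",
    "creator": "creator",
    "title": "title",
    "publisher": "publisher",
    "publicationYear": "date",
}


def crosswalk(record, mapping):
    """Rename a record's fields using a mapping; report fields with no target."""
    converted = {}
    unmapped = []
    for key, value in record.items():
        target = mapping.get(key)
        if target is None:
            unmapped.append(key)
        else:
            converted[target] = value
    return converted, unmapped


dc_record, leftovers = crosswalk(
    {"title": "Example dataset", "publicationYear": 2012, "instrument": "XRD"},
    DATACITE_TO_DC,
)
print(dc_record)
print(leftovers)
```

Reporting the unmapped fields matters because the third-tier, discipline-specific elements are exactly the ones that tend to have no equivalent in a general schema.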



The University of the West of England's MRD project uses a schema based on DataCite in a two-tier model: (1) basic metadata; (2) detailed domain-level metadata. The Hydra project uses the Fedora object model and MODS schema; both are described in part 1 [2].

The C4D project [34] aims to integrate research data metadata with CERIF CRIS metadata, developing mappings between multiple metadata standards and aiming at maximum interoperability.

The ADMIRe project [35] seems to be developing a system based on the DataCite minimum mandatory metadata, with additional subject-specific metadata including DDI.

KAPTUR [36] involves work integrating DataStage with EPrints, providing a structured metadata collection interface, and Figshare with EPrints, with the intention of creating an API to link Figshare with an EPrints repository using the SWORD 2 protocol. The project is specifically involved with visual arts data management, so the relevant metadata schemas referred to [37] include the Categories for the Description of Works of Art (CDWA) [38], the VRA Core Categories [39] and the Data Dictionary – Technical Metadata for Digital Still Images (ANSI/NISO Z39.87-2006) [40].

MiSS [41] is working towards an RDMI at the University of Manchester. They are developing a system of metadata templates specific to different research domains, for use during data capture. The MiSS Baseline Requirements Report indicates the advantages of implementing an RDMI that automates data capture and metadata ingest from instruments, reducing the need for manual metadata annotation by researchers; this benefit needs promoting to researchers. With the multitude of data sizes, different instruments and specific proprietary data and metadata formats, community input is needed to achieve integration of metadata schemas in the RDMI.

Open Exeter [42] is developing a prototype DSpace research data repository. They have surveyed post-graduates about their experiences testing the interface and metadata webform (Evans 2012) [43].

Orbital [44] is using the CKAN repository system for its data repository, integrating this with its EPrints repository, its 'Awards Management System' (RIMS) and 'ownCloud' networked storage (an ‘academic dropbox’). The project has accepted the minimum metadata requirements of DataCite [45], including its mandatory and optional attributes.

PIMMS [46] (Portable Infrastructure for the Metafor Metadata System) will refactor the Metafor metadata management tool for use in university departments. The project deals with metadata schema in the climatology domain.

Part 3 will describe the work of Australian projects in the research data catalogue / metadata stores area.

References

[1] JISC Digital infrastructure: Managing Research Data Programme 2011-13 - Research Data Management Infrastructure Projects http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata/infrastructure.aspx
[2] http://metadatatron.blogspot.co.uk/2012/10/metadata-for-wr-data-catalogue.html
[4] Data Management for All - The Institutional Data Management Blueprint project (IDMB at the 6th IDCC)  http://eprints.soton.ac.uk/169533/1/6th_international_digital_curation_conference__idmb_final_paper_revised.pdf
[7] EIDCSR http://eidcsr.oucs.ox.ac.uk/
[8] Sudamih http://sudamih.oucs.ox.ac.uk/
[9] Admiral http://imageweb.zoo.ox.ac.uk/wiki/index.php/ADMIRAL
[10] Vidaas http://vidaas.oucs.ox.ac.uk/
[11] Dataflow http://www.dataflow.ox.ac.uk/
[12] DAMARO http://damaro.oucs.ox.ac.uk/index.xml
[13] University of Oxford Bodleian Libraries - Research data services http://www.bodleian.ox.ac.uk/bdlss/research-data-services 
[14] Databank Oxford University research data repository https://databank.ora.ox.ac.uk/
[15] Wilson (2012) Infrastructure for Research Data Management at the University of Oxford. ANDS Webinar http://www.ands.org.au/events/webinars/james-wilson-jisc-webinar-slides.pdf
[16] DataStage http://www.dataflow.ox.ac.uk/index.php/datastage/ds-about
[17] Sally Rumsey (2012) Building an institutional research data management infrastructure. OR2012  http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf
[18] IRIDIUM http://research.ncl.ac.uk/iridium/
[19] Wood, L., IRIDIUM Blogpost http://iridiummrd.wordpress.com/2012/10/02/iridium-requirements-for-a-research-data-catalogue-and-proof-of-concept-development/
[20] O'Laughlan N., IRIDIUM Blogpost http://iridiummrd.wordpress.com/2011/12/09/195/
[21] Wood, L., IRIDIUM Blogpost http://iridiummrd.wordpress.com/2012/10/03/iridium-rdm-systemstools-connectivity-busy-researchers-dont-like-duplication-of-metadata-entry/
[22] Data.Bris project http://data.bris.ac.uk/
[23] Steer, D. - Data.Bris architecture http://data.blogs.ilrt.org/2012/02/03/data-bris-architecture/
[24]  OAI-PMH http://www.openarchives.org/pmh/
[25]  OAI-ORE http://www.openarchives.org/ore/
[26]  SWORD http://swordapp.org/
[27] Boyd, D. - Data.Bris Blog http://data.blogs.ilrt.org/category/metadata/
[28] DataPool project http://datapool.soton.ac.uk/
[29] DataPool project proposal http://datapool.soton.ac.uk/files/2011/12/University-of-Southampton-Proposal-public.pdf
[30] Hitchcock, S., DataPool Blog http://datapool.soton.ac.uk/tag/repositories/
[31] Research Data @Essex Blog http://researchdataessex.posterous.com/metadata
[32] Research Data @Essex Metadata schema crosswalk http://researchdataessex.posterous.com/metadata#
[33] RDE Metadata Profile for EPrints https://docs.google.com/open?id=0B7VJTfTg7nrrcU1WMWVEMW9tY3M
[34] C4D http://cerif4datasets.files.wordpress.com/2012/04/cris2012_35_full_paper.pdf
[35] ADMIRe http://admire.jiscinvolve.org/wp/2012/08/16/notes-from-the-2nd-datacite-workshop/
[36] KAPTUR http://www.vads.ac.uk/kaptur/outputs/Kaptur_technical_analysis.pdf
[37] KAPTUR Blogpost https://kaptur.wordpress.com/2012/06/12/raising-your-redman/
[38] CDWA http://www.getty.edu/research/publications/electronic_publications/cdwa/index.html
[39] VRA Core http://www.vraweb.org/projects/vracore4/
[40] NISO Technical Metadata for Digital Still Images http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=b897b0cf3e2ee526252d9f830207b3cc9f3b6c2c
[41] MiSS - BaselineRequirementsReport http://www.miss.manchester.ac.uk/wp-content/uploads/2012/09/MiSS-BaselineRequirementsReport-RevisedVersion-Aug2012.pdf
[42] Open Exeter http://blogs.exeter.ac.uk/openexeterrdm/
[43] Evans, J., Open Exeter Blog http://blogs.exeter.ac.uk/openexeterrdm/blog/2012/05/31/pgr-feedback-on-data-upload/
[44] Orbital Blog http://orbital.blogs.lincoln.ac.uk/
[45] DataCite Mandatory Properties http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf#page=8
[46] PIMMS http://proj.badc.rl.ac.uk/pimms
[47] CKAN http://ckan.org/
[48] CERIF 1.5 - Common European Research Information Format http://www.eurocris.org/Uploads/Web%20pages/CERIF-1.5/CERIF1.5_Semantics.xhtml