Digital Project Planning & Management Basics Optional Unit: Specific

Digital Project Planning & Management Basics. Optional Unit: Specific metadata standards and applications overviews. Addendum to session 4. 1

Session Objectives: Understand standards for metadata elements, data value standards, and data content standards. Learn about metadata standards developed by specific communities. Evaluate the efficacy of each standard for a specific community, including its strengths and weaknesses. Explore the adoption of non-traditional standards by libraries. 2

Session Outline Introduction to basic concepts Description of community specific metadata schemes Description of specific structural metadata and syntaxes 3

Questions to Ask When Selecting a Metadata Standard: What type of material will be digitized? How much information is available? Is there a community of practice for this resource type (or types)? What is the purpose of the digital project? Did your “Needs Assessment” identify who the audience will be and how they would use the content? Are there pre-existing digital projects with which this one needs to function? What systems options are available? 4

Metadata Standards in a Resource Grid (axes: stewardship, low to high; uniqueness, low to high)
High stewardship, low uniqueness (books, journals): Books, Journals, Newspapers, Government docs, Audiovisual, Maps, Scores. Standards: MARC, DC, ONIX, MPEG
High stewardship, high uniqueness (special collections): Special collections, Rare books, Local/Historical newspapers, Local history materials, Archives & manuscripts, Theses & dissertations. Standards: MARC, METS, EAD, DC, TEI
Low stewardship, low uniqueness (freely accessible web resources): Freely accessible web resources, Open source software, Newsgroup archives. Standards: DC
Low stewardship, high uniqueness (institutional assets): Institutional repositories, ePrints, Learning objects/materials, Research data. Standards: DC, DDI, IEEE/LOM, FGDC, EAD, TEI, SCORM
Source: Stuart Weibel, presentation "State of the Dublin Core Metadata Initiative," Göttingen, August 11, 2003 (based on a Lorcan Dempsey presentation)
5

Metadata Standards. Schemas (a.k.a. ‘element sets’): a set of semantic properties, in this context used to describe resources; not the same as “XML schemas” (a term with a very precise meaning). Syntaxes: the structural wrapping around the semantics; essential for moving information around. 6

Content Standards: AACR2 functions as the content standard for traditional cataloging. RDA (the successor to AACR2) aspires to be the content standard for non-MARC metadata. DACS (Describing Archives: A Content Standard). CCO (Cataloging Cultural Objects), a new standard developed by the visual arts and cultural heritage community. Best practices, guidelines, and data dictionaries serve as less formal content standards. 7

Value Standards: Library of Congress Subject Headings (LCSH), Art and Architecture Thesaurus (AAT), Thesaurus of Geographic Names (TGN). 8

Some Example Schemas: Dublin Core, Simple and Qualified (http://dublincore.org); MODS (www.loc.gov/standards/mods/); VRA 4.0 (http://www.vraweb.org/projects/vracore4/index.html); IEEE-LOM (http://ltsc.ieee.org/wg12/); ONIX (http://www.editeur.org/onix.html); EAD (http://www.loc.gov/ead/); TEI (http://www.tei-c.org/) 9

Dublin Core: Simple. Fifteen elements; one namespace. Controlled vocabulary values may be expressed, but not the sources of the values. Minimal standard for OAI-PMH. Also used as: a core element set in some other schemas; a switching vocabulary for more complex schemas. 10

Dublin Core Metadata Initiative (DCMI) Origins: 2nd International World Wide Web Conference, Chicago (October 1994). Conversations at this conference led to the first meeting at OCLC in Dublin, Ohio, hence the name. A combination of IT specialists and librarians. Workshops began in 1995: the March 1995 NCSA/OCLC workshop in Dublin, Ohio identified the need for author-generated metadata, a “core” of common elements to describe Web content and aid discovery. 11

Mission of the DCMI (Original) “The mission of the Dublin Core Metadata Initiative (DCMI) is to make it easier to find resources using the Internet through the following activities: Developing metadata standards for resource discovery across domains Defining frameworks for the interoperation of metadata sets Facilitating the development of community- or domain-specific metadata sets that work within these frameworks ” Weibel http://purl.org/dc/workshops/dc8conference/plenary/sld018.htm 12

DCMES Characteristics Simplicity Supports resource discovery All elements are optional/repeatable No order of elements prescribed Extensible* / Refined* Interdisciplinary/International Semantic interoperability 13

Value International and cross-domain Increase efficiency of the discovery/retrieval of digital objects Provide a framework of elements which will aid the management of information Promote collaboration of cultural/educational information as shared “social capital” 14

DCMES Principles: the One-to-One (1:1) Principle; the Dumb-Down Principle; Appropriate Values. http://dublincore.org/documents/usageguide/glossary.shtml 15
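
The Dumb-Down Principle says that a client should be able to ignore any qualifier and still use the value as if it were unqualified. The following is a minimal Python sketch of that idea, not DCMI code. The `REFINES` map is a hand-written, partial list of refinements and the elements they refine, and the function name `dumb_down` and the sample statements are assumptions made for illustration.

```python
# Partial, hand-written map of Qualified DC refinements to the
# simple element each one refines (an illustrative subset only).
REFINES = {
    "alternative": "title",
    "abstract": "description",
    "tableOfContents": "description",
    "created": "date",
    "issued": "date",
    "modified": "date",
    "extent": "format",
    "medium": "format",
    "spatial": "coverage",
    "temporal": "coverage",
    "isPartOf": "relation",
    "hasPart": "relation",
}

# The fifteen elements of Simple Dublin Core.
SIMPLE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(statements):
    """Reduce (property, scheme, value) triples to (element, value) pairs.

    Refinements fall back to the element they refine, encoding schemes
    are dropped, and properties with no simple equivalent are skipped.
    """
    simple = []
    for prop, _scheme, value in statements:
        element = REFINES.get(prop, prop)
        if element in SIMPLE_ELEMENTS:
            simple.append((element, value))
    return simple


# Hypothetical qualified statements, loosely based on the deck's example book.
qualified = [
    ("title", None, "Cataloging cultural objects"),
    ("issued", "W3CDTF", "2006"),
    ("extent", None, "396 p."),
    ("subject", "LCSH", "Metadata"),
    ("audience", None, "Catalogers"),
]

for element, value in dumb_down(qualified):
    print(f"dc:{element} = {value}")
```

In this sketch `audience` is dropped because it is not one of the fifteen simple elements; a real application would need to decide whether to discard such values or keep them elsewhere.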

Dublin Core Metadata Element Set (DCMES) 1996
The 15 Dublin Core elements can be divided into three categories:
Content: Title, Description, Subject, Relation, Source, Coverage, Type
Intellectual property: Creator, Contributor, Publisher, Rights
Instantiation: Date, Language, Identifier, Format
16

Ex.: Simple Dublin Core
<metadata>
  <dc:title>Cataloging cultural objects,</dc:title>
  <dc:contributor>Baca, Murtha.</dc:contributor>
  <dc:contributor>Harpring, Patricia.</dc:contributor>
  <dc:subject>Information organization</dc:subject>
  <dc:subject>Metadata</dc:subject>
  <dc:subject>Cultural property-Documentation</dc:subject>
  <dc:subject>CC135.C37 2006</dc:subject>
  <dc:subject>363.6</dc:subject>
  <dc:date>2006</dc:date>
  <dc:format>396 p.</dc:format>
  <dc:type>Text</dc:type>
  <dc:identifier>ISBN:0838935648</dc:identifier>
  <dc:language>en</dc:language>
  <dc:publisher>ALA Editions</dc:publisher>
</metadata>
17
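
A record like the one on this slide can be generated programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` module and the Dublin Core elements namespace `http://purl.org/dc/elements/1.1/`. The helper name `build_simple_dc` and the bare `<metadata>` wrapper (taken from the slide) are assumptions for illustration, and only some of the slide's fields are included.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Ask ElementTree to serialize this namespace with the usual "dc" prefix.
ET.register_namespace("dc", DC_NS)


def build_simple_dc(fields):
    """Return a <metadata> element with one dc:* child per (name, value) pair.

    Elements are optional and repeatable, so the input is a list of pairs
    rather than a dictionary.
    """
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = build_simple_dc([
    ("title", "Cataloging cultural objects"),
    ("contributor", "Baca, Murtha."),
    ("contributor", "Harpring, Patricia."),
    ("subject", "Metadata"),
    ("date", "2006"),
    ("type", "Text"),
    ("language", "en"),
    ("publisher", "ALA Editions"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Passing the fields as a list of pairs keeps repeated elements (two contributors here) and preserves their order, which matches the "optional/repeatable, no prescribed order" characteristics of DCMES.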

Extensible: Lego Blocks Extensible architecture Spectrum of simple to more complex DCMES may be used with other metadata element sets Lego Metaphor: Modular building blocks used to develop application profiles of mixed metadata Leverage existing thesauri, classification systems, ontologies, local vocabularies Stuart Weibel. Presentation State of the Dublin Core Metadata Initiative Göttingen August 11, 2003 18

Dublin Core: Qualified ‘Qualified’ includes element refinements and encoding schemes More specific properties Two namespaces Explicit vocabularies Additional elements, including ‘Audience,’ ‘InstructionalMethod,’ ‘RightsHolder’ and ‘Provenance’ 19

Qualified Dublin Core Elements: 1. Identifier; 2. Title; 3. Creator; 4. Contributor; 5. Publisher; 6. Subject; 7. Description; 8. Coverage; 9. Format; 10. Type; 11. Date; 12. Relation; 13. Source; 14. Rights; 15. Language. Element Refinements: Abstract, Access rights, Alternative, Audience, Available, Bibliographic citation, Conforms to, Created, Date accepted, Date copyrighted, Date submitted, Education level, Extent, Has format, Has part, Has version, Is format of, Is part of, Is referenced by, Is replaced by, Is required by, Issued, Is version of, License, Mediator, Medium, Modified, Provenance, References, Replaces, Requires, Rights holder, Spatial, Table of contents, Temporal, Valid. 20

More Dublin Core “Refinements”. Encodings: Box, DCMIType, DDC, IMT, ISO3166, ISO639-2, LCC, LCSH, MESH, Period, Point, RFC1766, RFC3066, TGN, UDC, URI, W3CDTF. Types: Collection, Dataset, Event, Image, Interactive Resource, Moving Image, Physical Object, Service, Software, Sound, Still Image, Text. 21

Ex.: Qualified Dublin Core
<metadata>
  <dc:title xml:lang="en">Cataloging cultural objects.</dc:title>
  <dc:contributor>Baca, Murtha.</dc:contributor>
  <dc:contributor>Harpring, Patricia.</dc:contributor>
  <dc:subject xsi:type="LCSH">Information organization</dc:subject>
  <dc:subject xsi:type="LCSH">Metadata</dc:subject>
  <dc:subject xsi:type="LCSH">Cultural property-Documentation</dc:subject>
  <dc:subject xsi:type="LCC">CC135.C37 2006</dc:subject>
  <dc:subject xsi:type="DDC">363.3</dc:subject>
  <dc:date xsi:type="W3CDTF">2006</dc:date>
  <dcterms:extent>396 p.</dcterms:extent>
  <dc:type xsi:type="DCMIType">Text</dc:type>
  <dc:identifier xsi:type="URI">ISBN: 0838935648</dc:identifier>
  <dc:language xsi:type="RFC3066">en</dc:language>
  <dc:publisher>ALA Editions</dc:publisher>
  <dcterms:audience>Catalogers</dcterms:audience>
</metadata>
22
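
What Qualified Dublin Core adds over Simple is that the source of a value is explicit, so software can tell an LCSH heading from an LCC class number. The Python sketch below groups values by element and encoding scheme. The sample is a shortened version of the slide's record with namespace declarations added so that it parses; those declarations, the `SAMPLE` constant, and the function name `values_by_scheme` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_TYPE = f"{{{XSI_NS}}}type"

# Shortened sample; the xmlns declarations are added here so the XML parses.
SAMPLE = """
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:subject xsi:type="LCSH">Metadata</dc:subject>
  <dc:subject xsi:type="LCC">CC135.C37 2006</dc:subject>
  <dc:date xsi:type="W3CDTF">2006</dc:date>
  <dcterms:extent>396 p.</dcterms:extent>
</metadata>
"""


def values_by_scheme(xml_text):
    """Group element values by (local element name, encoding scheme)."""
    grouped = {}
    for el in ET.fromstring(xml_text):
        local_name = el.tag.split("}", 1)[-1]  # strip the {namespace} part
        scheme = el.get(XSI_TYPE)              # None when no scheme is given
        grouped.setdefault((local_name, scheme), []).append(el.text)
    return grouped


schemes = values_by_scheme(SAMPLE)
for (element, scheme), values in schemes.items():
    print(element, scheme, values)
```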

Lego Model Replaced by RDF: Combining element sets using the Resource Description Framework (RDF) and the Semantic Web. [Diagram: a Container holding four Packages: Dublin Core, MARC record, Terms and Conditions, and an Indirect Reference (URI)] 23

Advantages of Dublin Core: Less rigorous content rules. Easier to train staff and implement. Allows OAI harvesting of metadata. Supported by digital library products: CONTENTdm, Encompass, MetaSource. 24

Disadvantages of Dublin Core: Lack of granularity may not support specific community needs. Lack of granularity also limits its role as a switching language between standards. No fields are required, and a lack of consistent training can hamper interoperability. 25

What is MODS? Descriptive metadata standard. Initiative of the Network Development and MARC Standards Office at LC. A derivative of MARC 21; documentation refers to MARC definitions for most properties. Descriptive metadata encoded in an XML schema. Uses textual rather than numeric tags. Originally designed for library applications, but may be used for others. Uses XML Schema (METS). 26

Why MODS? XML (Extensible Markup Language) is the markup for the Web. The library community needs an element set that is simpler than, but compatible with, MARC and that can be transmitted in XML. A standardized framework for holding and exchanging metadata, analogous to the MARC record, for re-use of pre-existing information. Designed for complex digital library objects. Dublin Core is not sufficient; e.g., there is a need to express the role of a creator. Provides a more explicit means of expressing different categories of dates in machine-readable form. 27

MODS elements: Title Info, Name, Type of resource, Genre, Origin Info, Language, Physical description, Abstract, Table of contents, Target audience, Note, Subject, Classification, Related item, Identifier, Location, Access conditions, Extension, Record Info. Root elements: mods (a single MODS record) and modsCollection (a collection of MODS records). 28

Fields used in Minerva project: Title; Alternative title; Name (structured form); Abstract; Date captured; Genre (value always “Web site”); Physical description (file formats); Identifier (base URL); Language; Access conditions/rights management; Subject (keyword or LCSH if possible). 30

Advantages of MODS: Uses language-based tags and fully supports the Unicode character set. Allows the aggregation of multilingual records. Elements generally inherit the semantics of MARC but do not assume the use of any specific rules for description. The element set is more compatible with existing descriptions than ONIX or Dublin Core. Elements are particularly applicable to digital resources. The XML schema allows for flexibility and the use of freely available software tools. 31

Disadvantages of MODS: Library-centric. Not widely adopted by other libraries or other communities. 32

Ex.: MODS
<titleInfo>
  <title>Cataloging cultural objects.</title>
</titleInfo>
<name type="personal">
  <namePart type="family">Baca,</namePart>
  <namePart type="given">Murtha,</namePart>
  <namePart type="date">1951-</namePart>
  <role>
    <roleTerm type="text">editor</roleTerm>
  </role>
</name>
<name type="personal">
  <namePart type="family">Harpring,</namePart>
  <namePart type="given">Patricia.</namePart>
  <role>
    <roleTerm type="text">editor</roleTerm>
  </role>
</name>
33

More MODS
<typeOfResource>text</typeOfResource>
<genre authority="marc">book</genre>
<originInfo>
  <place>
    <placeTerm authority="marccountry" type="code">ilu</placeTerm>
  </place>
  <place>
    <placeTerm type="text">Chicago</placeTerm>
  </place>
  <publisher>ALA Editions</publisher>
  <dateIssued>2006</dateIssued>
  <issuance>monographic</issuance>
</originInfo>
<language>
  <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
</language>
34
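
One reason given earlier for MODS is that Dublin Core cannot express the role of a creator. The Python sketch below reads names and roles from a small MODS record and then flattens them to Dublin Core style contributor strings, at which point the role is lost. The sample record is simplified from the slides (punctuation and dates removed), and the function names `names_with_roles` and `to_dc_contributors` are hypothetical; this is not an official MODS-to-DC crosswalk.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace, bound to the prefix "m" for path lookups.
M = {"m": "http://www.loc.gov/mods/v3"}

# Simplified sample based on the slides; a <mods> root is added so it parses.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Cataloging cultural objects</title></titleInfo>
  <name type="personal">
    <namePart type="family">Baca</namePart>
    <namePart type="given">Murtha</namePart>
    <role><roleTerm type="text">editor</roleTerm></role>
  </name>
  <name type="personal">
    <namePart type="family">Harpring</namePart>
    <namePart type="given">Patricia</namePart>
    <role><roleTerm type="text">editor</roleTerm></role>
  </name>
</mods>
"""


def names_with_roles(mods_xml):
    """Return a list of ("Family, Given", role) pairs from a MODS record."""
    root = ET.fromstring(mods_xml)
    pairs = []
    for name in root.findall("m:name", M):
        family = name.findtext("m:namePart[@type='family']", default="", namespaces=M)
        given = name.findtext("m:namePart[@type='given']", default="", namespaces=M)
        role = name.findtext("m:role/m:roleTerm", default=None, namespaces=M)
        pairs.append((f"{family}, {given}", role))
    return pairs


def to_dc_contributors(mods_xml):
    """Flatten MODS names to plain contributor strings; the role is dropped."""
    return [display for display, _role in names_with_roles(mods_xml)]


print(names_with_roles(SAMPLE))
print(to_dc_contributors(SAMPLE))
```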

VRA Core Categories for Visual Resources Developed by the Visual Resources Association, the VRA Standards Committee Designed specifically for visual resources Viewed as a means to share cataloging of visual materials Provides access to digitized images and their description 35

VRA Metadata Elements: Based on CDWA for category definitions and recommendations for controlled vocabulary. Two types of elements: Work and Image. Like DC, all fields are repeatable. Unlike DC, all are mandatory if applicable. 36

VRA 4.0 Elements: Work, Collection or Image; Work Type; Title; Measurements; Material; Technique; Agent; Date; Subject; Relation; Location; REFID; Text REF; Style/Period; Agent.Culture / Cultural Context; Description; Source; Rights; Inscription; State Edition. 37

VRA Data Values: LCSH, AAT, TGN, ULAN. 38

Online Information Exchange (ONIX): Designed by the publishing industry (Association of American Publishers) to exchange information about “books” with wholesalers, retailers, and e-tail booksellers. A standard for data exchange. Richer information for online bookstores. 39

ONIX Integrated with MARC Records? The CC:DA Task Force on ONIX International was charged with reviewing the standard and assessing the impact if integrated. http://www.ala.org/alcts/organization/ccs/ccda/tf-onix1.html 40

Comparison of ONIX & MARC: ONIX has finer granularity than MARC. Fields can be mapped from ONIX into UNIMARC, but cannot be converted back. Each format contains fields that are relevant only to itself. ONIX records provide enriching information: reviews, abstracts, TOC, prizes won, credentials of authors. 41
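
One common reason a crosswalk works in only one direction is that several finer-grained source fields collapse into a single target field. The Python sketch below illustrates that general effect. It is not the Library of Congress crosswalk or any published mapping: the `CROSSWALK` table, the target labels `title`, `name`, and `note`, and the source labels `MainDescription` and `ReviewQuote` are hypothetical.

```python
# Hypothetical mapping: three source fields share the single target "note".
CROSSWALK = {
    "TitleText": "title",
    "PersonNameInverted": "name",
    "MainDescription": "note",
    "ReviewQuote": "note",
    "BiographicalNote": "note",
}


def crosswalk(record):
    """Convert a source record to the target format, merging shared targets."""
    target = {}
    for field, value in record.items():
        target.setdefault(CROSSWALK[field], []).append(value)
    return target


def is_reversible(mapping):
    """A mapping can be inverted only if no two source fields share a target."""
    targets = list(mapping.values())
    return len(targets) == len(set(targets))


source = {
    "TitleText": "British English, A to Zed",
    "PersonNameInverted": "Schur, Norman W",
    "MainDescription": "A transatlantic dictionary for English speakers.",
    "ReviewQuote": "The outstanding authority on British and American English.",
}

converted = crosswalk(source)
print(converted["note"])         # two values, with no record of where each came from
print(is_reversible(CROSSWALK))  # False
```

Once the description and the review both sit under `note`, nothing in the converted record says which was which, so the original fields cannot be rebuilt from it.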

ONIX/MARC Crosswalks ONIX (1.0) to UNIMARC Crosswalk developed by Library of Congress http://lcweb.loc.gov/marc/onix2marc.html Mapping by Bob Pearson (OCLC) http://222.editeur.org/ ONIX MARC Mapping External.doc Report by Alan Danskin http://bic.org.uk/reporton.doc 42

ONIX Metadata Standard: Allows two levels of description. Level 2: 235 elements of information in 24 categories; requires XML DTD. Level 1: not all the categories, 82 elements; does not require XML DTD. 43

ONIX for Books: Originally devised to simplify the provision of book product information to online retailers (the name stood for ONline Information eXchange). The first version was flat XML; the second version included hierarchy and elements repeated within ‘composites’. Maintained by Editeur, with the Book Industry Study Group (New York) and Book Industry Communication (London). Includes marketing- and shipping-oriented information: book jacket blurb and photos, full size and weight info, etc. 44

Ex.: ONIX
<Title>
  <TitleType>01</TitleType>
  <TitleText textcase="02">British English, A to Zed</TitleText>
</Title>
<Contributor>
  <SequenceNumber>1</SequenceNumber>
  <ContributorRole>A01</ContributorRole>
  <PersonNameInverted>Schur, Norman W</PersonNameInverted>
  <BiographicalNote>A Harvard graduate in Latin and Italian literature, Norman Schur attended the University of Rome and the Sorbonne before returning to the United States to study law at Harvard and Columbia Law Schools. Now retired from legal practise, Mr Schur is a fluent speaker and writer of both British and American English.</BiographicalNote>
</Contributor>
45

Ex.: ONIX
<!-- Main description -->
<othertext>
  <d102>01</d102>
  <d104>BRITISH ENGLISH, A TO ZED is the thoroughly updated, revised, and expanded third edition of Norman Schur’s highly acclaimed transatlantic dictionary for English speakers. First published as BRITISH SELF-TAUGHT and then as ENGLISH ENGLISH, this collection of Briticisms for Americans, and Americanisms for the British, is a scholarly yet witty lexicon, combining definitions with commentary on the most frequently used and some lesser known words and phrases. Highly readable, it’s a snip of a book, and one that sorts out – through comments in American – the “Queen’s English” – confounding as it may seem.</d104>
</othertext>
<!-- Review -->
<othertext>
  <d102>08</d102>
  <d104>Norman Schur is without doubt the outstanding authority on the similarities and differences between British and American English. BRITISH ENGLISH, A TO ZED attests not only to his expertise, but also to his undiminished powers to inform, amuse and entertain. – Laurence Urdang, Editor, VERBATIM, The Language Quarterly, Spring 1988</d104>
</othertext>
47
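
In this example the code in `d102` says what kind of text `d104` holds: the slide labels `01` as the main description and `08` as a review. The Python sketch below reads such blocks and attaches those labels. The `<product>` wrapper, the shortened sample texts, and the function name `other_texts` are assumptions for illustration, and the `TEXT_TYPES` table covers only the two codes shown on the slide.

```python
import xml.etree.ElementTree as ET

# Only the two text type codes that appear on the slide.
TEXT_TYPES = {"01": "Main description", "08": "Review"}

# Shortened sample; the <product> wrapper is added so the fragment parses.
SAMPLE = """
<product>
  <othertext>
    <d102>01</d102>
    <d104>A scholarly yet witty lexicon.</d104>
  </othertext>
  <othertext>
    <d102>08</d102>
    <d104>Norman Schur is without doubt the outstanding authority.</d104>
  </othertext>
</product>
"""


def other_texts(onix_xml):
    """Return (label, text) pairs for every <othertext> block in the record."""
    root = ET.fromstring(onix_xml)
    results = []
    for block in root.iter("othertext"):
        code = block.findtext("d102", default="").strip()
        text = block.findtext("d104", default="").strip()
        label = TEXT_TYPES.get(code, f"Unknown ({code})")
        results.append((label, text))
    return results


for label, text in other_texts(SAMPLE):
    print(f"{label}: {text}")
```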

EAD -- Encoded Archival Description http://www.loc.gov/ead/ 48

Learning Object Metadata: An array of related standards for the description of ‘learning objects’ or ‘learning resources’. Most are based on the efforts of the IEEE LTSC (Institute of Electrical and Electronics Engineers Learning Technology Standards Committee) and the IMS Global Learning Consortium, Inc. Tends to be very complex, with few implementations outside of government and industry. One well-documented implementation is CanCore. 49

MIX (Metadata for Images in XML): XML schema for a set of technical data elements required to manage digital image collections. http://www.loc.gov/standards/mix/ 50

TEI -- Text Encoding Initiative http://www.tei-c.org/ 51
