mirror of
https://github.com/futurepress/epub.js.git
synced 2025-10-05 15:32:55 +02:00
1494 lines
227 KiB
HTML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:pls="http://www.w3.org/2005/01/pronunciation-lexicon" xmlns:ssml="http://www.w3.org/2001/10/synthesis" xmlns:svg="http://www.w3.org/2000/svg"><head><title>Chapter 3. Resources in Organizing Systems</title><link rel="stylesheet" type="text/css" href="core.css"/><meta name="generator" content="DocBook XSL Stylesheets V1.76.1"/><meta name="keywords" content="resource, domain, format, agency, focus, active resource, passive resource, identity, identifier, vocabulary problem, controlled vocabulary, digitization, born digital, metadata, work, expression, manifestation, item, effectivity, authenticity, provenance, precision, recall"/><meta name="keywords" content="organizing, information, resources, metadata, resource description"/><link rel="up" href="index.html" title="The Discipline of Organizing"/><link rel="prev" href="ch02.html" title="Chapter 2. Activities in Organizing Systems"/><link rel="next" href="ch04.html" title="Chapter 4. Resource Description and Metadata"/></head><body><section class="chapter" title="Chapter 3. Resources in Organizing Systems" epub:type="chapter" id="chapter-3"><div class="titlepage"><div><div><h2 class="title">Chapter 3. 
Resources in Organizing Systems</h2></div><div><div class="author"><h3 class="author"><span class="firstname">Robert</span> <span class="othername">J.</span> <span class="surname">Glushko</span></h3></div></div><div><div class="author"><h3 class="author"><span class="firstname">Daniel</span> <span class="othername">D.</span> <span class="surname">Turner</span></h3></div></div><div><div class="author"><h3 class="author"><span class="firstname">Kimra</span> <span class="surname">McPherson</span></h3></div></div><div><div class="author"><h3 class="author"><span class="firstname">Jess</span> <span class="surname">Hemerly</span></h3></div></div></div></div><div class="sect1" title="Introduction"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.1">Introduction</h2></div></div></div><p><a id="id601179" class="indexterm"/>This chapter builds upon the foundational concepts introduced in <a class="xref" href="ch01.html" title="Chapter 1. Foundations for Organizing Systems">Chapter 1</a> to explain more carefully what we mean by
<span class="bold"><strong><a class="glossterm" href="go01.html#gloss_resource"><em class="glossterm">resource</em></a></strong></span>.
In particular, we focus on the issue of identity<span class="symbol">—</span>what will be treated as
a separate resource<span class="symbol">—</span>and discuss the issues and principles we need to consider
when we give each resource a name or identifier.</p><p>In <a class="xref" href="ch03.html#section-3.2" title="Four Distinctions about Resources">Four Distinctions about Resources</a> we introduce four distinctions we can make when we
discuss resources: <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_domain"><em class="glossterm">domain</em></a></strong></span>, <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_format"><em class="glossterm">format</em></a></strong></span>, <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_agency"><em class="glossterm">agency</em></a></strong></span>, and
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_focus"><em class="glossterm">focus</em></a></strong></span>. In <a class="xref" href="ch03.html#section-3.3" title="Resource Identity">Resource Identity</a> we apply these
distinctions as we discuss how resource identity is determined for <span class="hardware">physical
resources</span>, bibliographic resources, resources in information systems, as
well as for <a class="glossterm" href="go01.html#gloss_active_resources"><em class="glossterm">active resources</em></a> and
<span class="quote">“<span class="quote"><span class="hardware">smart things</span>.</span>”</span>
<a class="xref" href="ch03.html#section-3.4" title="Naming Resources">Naming Resources</a> then tackles the problems and principles for naming: once
we have identified resources, how do we name and distinguish them? Finally, <a class="xref" href="ch03.html#section-3.5" title="Resources over Time">Resources over Time</a> considers issues that emerge with respect to resources over
time.</p><div class="sect2" title="What Is a Resource?"><div class="titlepage"><div><div><h3 class="title" id="section-3.1.1">What Is a Resource?</h3></div></div></div><p>Resources are what we organize.</p><p>We introduced the concept of <a class="glossterm" href="go01.html#gloss_resource"><em class="glossterm">resource</em></a> in <a class="xref" href="ch01.html#section-1.2.1" title="The Concept of “Resource”">The Concept of <span class="quote">“<span class="quote">Resource</span>”</span></a> with its ordinary sense
of <span class="quote">“<span class="quote"><span>anything of value that can support goal-oriented
activity</span></span>”</span> and emphasized that a group of resources can be
treated as a <a class="glossterm" href="go01.html#gloss_collection"><em class="glossterm">collection</em></a> in an organizing system. And what do we mean by
<span class="quote">“<span class="quote">anything of value,</span>”</span> exactly? It might seem that the question of
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_identity"><em class="glossterm">identity</em></a></strong></span>, of what a single resource is, should not
be hard to answer. After all, we live in a world of resources, and
<span class="action">finding</span>, <span class="action">selecting</span>,
<span class="action">describing</span>, <span class="action">arranging</span>, and
<span class="action">referring</span> to them are everyday activities.</p><p>Nevertheless, even when the resources we are dealing with are <span class="hardware">tangible
things</span>, how we go about organizing them is not always obvious, or at
least not obvious to each of us in the same way at all times. Not everyone thinks of
them in the same way. Recognizing something in the sense of perceiving it as a
<span class="hardware">tangible thing</span> is only the first step toward being able to
organize it and other resources like it. Which properties garner our attention, and
which we use in organizing, depend on our experiences, purposes, and context.</p><p><span>We <span class="action">add information</span> to a resource when we
<span class="action">name</span> or <span class="action">describe</span> it; it then becomes more
than <span class="quote">“<span class="quote">it.</span>”</span></span>
<span>We can describe the same resource in many different ways.</span>
<span>At various times we can consider any given resource to be one of many
members of a broad category, as one of the few members of a narrow category, or
as a unique instance of a category with only one member.</span> For example,
we might recognize something as a piece of clothing, as a sock, or as the specific
dirty sock with the hole worn in the heel from yesterday’s long hike. However, even
after we categorize something, we might not be careful how we talk about it; we
often refer to two objects as <span class="quote">“<span class="quote">the same thing</span>”</span> when what we mean is
that they are <span class="quote">“<span class="quote">the same type of thing.</span>”</span> Indeed, we could debate whether
a category with only one possible member is really a category, because it blurs an
important distinction between particular items or instances and the class or type to
which they belong.</p><p>The issues that matter and the decisions we need to make about resource instances
and resource classes and types are not completely separable. Nevertheless, we will
strive to focus on the former ones in this chapter and the latter ones in <a class="xref" href="ch06.html" title="Chapter 6. Categorization: Describing Resource Classes and Types">Chapter 6</a>.</p><div class="sect3" title="Resources with Parts"><div class="titlepage"><div><div><h4 class="title" id="section-3.1.1.1">Resources with Parts</h4></div></div></div><p>As tricky as it can be to decide what a resource is when you are dealing with
single objects, it is even more challenging when the resources are objects or
systems composed of other parts. In these cases, we must focus on the entirety
of the object or system and treat it as a resource, treat its constituent parts
as resources, and deal with the relationships between the parts and the whole,
as we do with engineering drawings and assembly procedures.</p><p><span>How many <span class="hardware">things</span> is a car?</span> If you are
imagining the car being assembled you might think of several dozen large parts
like the frame, suspension, drive train, gas tank, brakes, engine, exhaust
system, passenger compartment, doors, and other pre-assembled components. Of
course, each of those components is itself
made up of many parts<span class="symbol">—</span>think of the engine, or even just the radio. Some
sources have counted ten or fifteen thousand parts in the average car, but
even at that fine granularity many of those parts are still complex things. There
are screws and wires and fasteners and on and on; really too many to
count.</p><p>Ambiguity about the number of parts in the whole holds for information
resources too; a newspaper can be considered a single resource but it might also
consist of multiple sections, each of which contains separate stories, each of
which has many paragraphs, and so on. <span>From the typesetter’s point of
view, each character in a sentence can be taken as a distinct resource,
selected from a font of similar resources.</span></p></div><div class="sect3" title="Bibliographic Resources, Information Components, and “Smart Things” as Resources"><div class="titlepage"><div><div><h4 class="title" id="section-3.1.1.2">Bibliographic Resources, Information Components, and <span class="quote">“<span class="quote">Smart
Things</span>”</span> as Resources</h4></div></div></div><p>Information resources generally pose additional challenges in their
<span class="action">identification</span> and <span class="action">description</span> because
their most important property is usually their content, which is not easily and
consistently recognizable. <span>Organizing systems for information resources
in physical form, like those for libraries, have to juggle the duality of
their tangible embodiment with what is inherently an abstract information
resource; that is, the printed book versus the knowledge the book
contains.</span> Here, the organizing system emphasizes description
resources or surrogates, like bibliographic records that describe the
information content, rather than their physical properties.</p><p>Another important question in libraries is: <span>What set of resources
should be treated as the same work because they contain essentially similar
intellectual or artistic content?</span> We may talk about Shakespeare’s
play <em class="citetitle">Macbeth</em>, but what is this thing we call
<span class="quote">“<span class="quote">Macbeth</span>”</span>? Is it a particular string of words, saved in a
computer file or handwritten upon a folio? Is it the collection of words printed
with some predetermined font and pagination? Are all the editions and printings
of these words the same <em class="citetitle">Macbeth</em>? How should we organize
the numerous live and recorded performances of plays and movies that share the
<em class="citetitle">Macbeth</em> name? What about creations based on or
inspired by <em class="citetitle">Macbeth</em> that do not share the title
<span class="quote">“<span class="quote">Macbeth,</span>”</span> like the Kurosawa film <span class="quote">“<span class="quote">
<em class="citetitle"><span xml:lang="ja" class="foreignphrase"><em xml:lang="ja" class="foreignphrase">Kumonosu-jo</em></span></em>
</span>”</span> (<em class="citetitle">Throne of Blood</em>) that transposes the plot to
feudal Japan? <a id="id601619" class="indexterm"/><span class="personname"><span class="firstname">Patrick</span> <span class="surname">Wilson</span></span> proposed a genealogical analogy, characterizing a
<span class="quote">“<span class="quote">work</span>”</span> as <span class="quote">“<span class="quote">a group or family of texts,</span>”</span> with the
idea that a creation like Shakespeare’s <em class="citetitle">Macbeth</em> is the
<span class="quote">“<span class="quote">ancestor of later members of the family.</span>”</span><sup>[<a id="chapter-3-endnote-01" href="#ftn.chapter-3-endnote-01" epub:type="noteref" class="footnote">124</a>]</sup></p><p><span>Information system designers and architects</span> face analogous
design challenges when they describe the <span class="quote">“<span class="quote">information components</span>”</span>
in <span class="application">business or scientific organizing systems</span>.
<span>Information content is intrinsically merged or confounded with
structure and presentation whenever it is used in a specific instance and
context.</span> From a logical perspective, an order form contains
information components for ITEM, CUSTOMER NAME, ADDRESS, and PAYMENT
INFORMATION, but the arrangement of these components, their type font and size,
and other non-semantic properties can vary a great deal in different order forms
and even across a single information system that <span class="action">re-purpose</span>s
these components for letters, delivery notices, mailing labels, and database
entries.<sup>[<a id="chapter-3-endnote-02" href="#ftn.chapter-3-endnote-02" epub:type="noteref" class="footnote">125</a>]</sup></p><p><a id="id601715" class="indexterm"/><a id="id601738" class="indexterm"/><a id="id601722" class="indexterm"/><a id="id601727" class="indexterm"/>Similar questions about resource identity are posed by the emergence
of ubiquitous or <span class="application">pervasive computing</span>, in which
<span class="action">information processing</span> capability and
<span class="action">connectivity</span> are embedded into <span class="hardware">physical
objects</span>, in devices like <span class="hardware">smart phones</span>, and
in the surrounding <span class="hardware">environment</span>. <a id="id601792" class="indexterm"/>Equipped with sensors, <span class="application">radio-frequency identification
(<abbr class="abbrev">RFID</abbr>) tags</span>, <abbr class="abbrev">GPS</abbr> data,
and user-contributed metadata, these <span class="quote">“<span class="quote"><span class="hardware">smart
things</span></span>”</span> create a jumbled torrent of information about
location and other properties that must be sorted into identified streams and
then matched or associated with the original resource.</p><p><a class="xref" href="ch03.html#section-3.3" title="Resource Identity">Resource Identity</a> discusses the issues and methods for determining
<span class="quote">“<span class="quote">what is a resource?</span>”</span> for physical resources as well as for the
bibliographic resources, information components and <span class="quote">“<span class="quote">smart things</span>”</span>
discussed here in <a class="xref" href="ch03.html#section-3.1.1.2" title="Bibliographic Resources, Information Components, and “Smart Things” as Resources">Bibliographic Resources, Information Components, and <span class="quote">“<span class="quote">Smart Things</span>”</span> as Resources</a>.</p></div></div><div class="sect2" title="Identity, Identifiers, and Names"><div class="titlepage"><div><div><h3 class="title" id="section-3.1.2">Identity, Identifiers, and Names</h3></div></div></div><p><a id="id601914" class="indexterm"/>The answer to the question posed in <a class="xref" href="ch03.html#section-3.1.1" title="What Is a Resource?">What Is a Resource?</a> has two parts.
The first part is <a class="glossterm" href="go01.html#gloss_identity"><em class="glossterm">identity</em></a>: what thing are we treating as the resource? The second
part is <span class="action">identification</span>: <span class="action">differentiating</span> between
this single resource and other resources like it. These problems are closely
related. Once you have decided what to treat as a resource, you create a name or an
identifier so that you can refer to it reliably. <a id="id601947" class="indexterm"/><a id="id601957" class="indexterm"/><span><a id="def_name"/>A <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_name"><em class="glossterm">name</em></a></strong></span> is a
label for a resource that is used to distinguish one from another.</span>
<a id="id601989" class="indexterm"/><a id="id601997" class="indexterm"/><span><a id="def_identifier"/>An <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_identifier"><em class="glossterm">identifier</em></a></strong></span> is a special kind of name assigned in a
controlled manner and governed by rules that define possible values and naming
conventions.</span>
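</p><p>To make the contrast between a name and an identifier concrete, the following minimal Python sketch treats the ISBN-13 scheme as one example of rules that <span class="quote">“<span class="quote">define possible values.</span>”</span> The choice of ISBN-13, the function names <code>normalize</code> and <code>is_valid_isbn13</code>, and the sample labels are assumptions made purely for illustration; they do not describe any particular organizing system.</p>

```python
import re

# Assumed format rule for illustration: exactly thirteen ASCII digits
# once separators have been removed.
ISBN13_PATTERN = re.compile(r"^[0-9]{13}$")


def normalize(candidate: str) -> str:
    """Strip hyphens and spaces, which are presentation rather than identity."""
    return candidate.replace("-", "").replace(" ", "")


def is_valid_isbn13(candidate: str) -> bool:
    """Return True if the candidate obeys the ISBN-13 format and check-digit rules."""
    digits = normalize(candidate)
    if not ISBN13_PATTERN.match(digits):
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    check_digit = (10 - total % 10) % 10
    return check_digit == int(digits[12])


# Any label can act as a name; only rule-conforming values act as identifiers.
labels = ["the blue paperback", "978-0-306-40615-7", "978-0-306-40615-8"]
results = {label: is_valid_isbn13(label) for label in labels}
```

<p>Any string can serve as a name, but only a string that satisfies the scheme’s rules can serve as an identifier within that scheme: here the first label fails the format rule and the third fails the check-digit rule.</p><p>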
<span><a id="def_resolution"/>For a digital resource, its identifier serves as the
input to the system or function that determines its location so it can be
retrieved, a process called <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_resolution"><em class="glossterm">resolving</em></a></strong></span> the
identifier or <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_resolution"><em class="glossterm">resolution</em></a></strong></span>.</span></p><p><span class="action">Choosing names and identifiers</span><span class="symbol">—</span>be it for a
person, a service, a place, a trend, a work, a document, a concept,
etc.<span class="symbol">—</span>is hardly straightforward. In fact, naming is often
challenging and can be highly contentious. <span><span>Naming is made
difficult by countless factors, including the audience that will need to
access, share, and use the names, the limitations of language, institutional
politics, and personal and cultural biases.</span></span></p><p>A common complication arises when a resource has more than one name or identifier.
<span>When something has more than one name, each of the multiple names is a
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_synonym"><em class="glossterm">synonym</em></a></strong></span> or <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_alias"><em class="glossterm">alias</em></a></strong></span>.</span> A particular physical instance
of a book might be called a hardcover or paperback or simply a text. <a id="id602070" class="indexterm"/>
<span><span class="personname"><span class="firstname">George</span> <span class="surname">Furnas</span></span> and his research collaborators called this issue of multiple names
for the same resource or concept the <span class="quote">“<span class="quote"><span class="strong"><strong><a class="glossterm" href="go01.html#gloss_vocabulary_problem"><em class="glossterm">vocabulary
problem</em></a>.</strong></span></span>”</span></span><sup>[<a id="chapter-3-endnote-03" href="#ftn.chapter-3-endnote-03" epub:type="noteref" class="footnote">126</a>]</sup></p><p><a id="id602159" class="indexterm"/><a id="id602166" class="indexterm"/>Whether we call it a book or a text, the resource will usually have a
Library of Congress catalog number as well as an <abbr class="abbrev">ISBN</abbr> as an
<a class="glossterm" href="go01.html#gloss_identifier"><em class="glossterm">identifier</em></a>. When the book is in a carton of books being shipped
from the publisher to a bookstore or library, that carton will have a bar-coded
tracking number assigned by the delivery service, and a manifest or receipt document
created by the publisher whose identifier associates the shipment with the customer.
Each of these identifiers is unique with respect to some established scope or
context.</p><p><a id="id602184" class="indexterm"/>A partial solution to the <a class="glossterm" href="go01.html#gloss_vocabulary_problem"><em class="glossterm">vocabulary
problem</em></a> is to use a <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled vocabulary</em></a></strong></span>. We can impose rules that
standardize the way in which names and labels for resources are assigned in the
first place. Alternatively, we can define mappings from terms used in our natural
language to the authoritative or controlled terms. However, vocabulary control
cannot remove all ambiguity. Even if a passport or national identity system requires
authoritative full names rather than nicknames, there could easily be more than one <span class="personname"><span class="firstname">Robert</span> <span class="othername">John</span> <span class="surname">Smith</span></span> in the system.</p><p>Controlling the language used for a particular purpose raises other questions: Who
writes and enforces these rules? What happens when organizing systems that follow
different rules get compared, combined, or otherwise brought together in contexts
different from those for which they were originally intended?</p></div></div><div class="sect1" title="Four Distinctions about Resources"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.2">Four Distinctions about Resources</h2></div></div></div><p>The nature of the resource is critical for the creation and maintenance of quality
organizing systems. There are four distinctions we make in discussing resources:
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_domain"><em class="glossterm">domain</em></a></strong></span>, <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_format"><em class="glossterm">format</em></a></strong></span>, <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_agency"><em class="glossterm">agency</em></a></strong></span>, and <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_focus"><em class="glossterm">focus</em></a></strong></span>. <a class="xref" href="ch03.html#chapter-3-figure-3.1" title="Figure 3-1. Resource Domain, Format, Focus and Agency.">Figure 3-1</a> depicts these four distinctions, perspectives or
points of view on resources; because they are not independent, we cannot portray these
distinctions as categories of resources.</p><div class="figure-float"><div class="figure"><a id="chapter-3-figure-3.1"/><div class="figure-contents"><div class="mediaobject"><a id="chapter-3-figure-3.1a"/><img src="figs/print/ch3.1-350dpi.png" alt="Four distinctions we can make when discussing resources concern their domain (their type of matter or content), format (physical or digital), agency (active or passive), and focus (primary or description)."/></div></div><div class="figure-title">Figure 3-1. Resource Domain, Format, Focus and Agency.</div></div></div><div class="sect2" title="Resource Domain"><div class="titlepage"><div><div><h3 class="title" id="section-3.2.1">Resource Domain</h3></div></div></div><p><a id="id602452" class="indexterm"/><span><a id="def_domain"/><span>Every resource has some essence or type
that distinguishes it from other resources, which we call the resource
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_domain"><em class="glossterm">domain</em></a></strong></span>.</span> Domain is
an intuitive notion that we can help define by contrasting it with the
alternative of <span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">ad hoc</em></span> or arbitrary
groupings of resources that just happen to be in the same place at some moment,
rather than being based on natural or intrinsic characteristics.</span></p><p>For <span class="hardware">physical resources</span>, domains can be coarsely distinguished
according to the <span class="hardware">type of matter</span> of which they are made using
properties that can be readily perceived. <span>
<a id="id602485" class="indexterm"/> The top-level classification of all things into the animal,
vegetable, and mineral kingdoms by Carl Linnaeus in 1735 is today deeply
embedded in most languages and cultures as a hierarchical system of
domain categories.<sup>[<a id="chapter-3-endnote-03a" href="#ftn.chapter-3-endnote-03a" epub:type="noteref" class="footnote">127</a>]</sup></span>
<span><a id="id602619" class="indexterm"/>Many aspects of this system of domain categories are determined by natural
constraints on category membership that are manifested in patterns of shared
properties; once a resource is identified as a member of one category it must
also be a member of another with which it shares some but not all
properties.</span> For example, a marble statue in a museum must also be a
kind of material resource, and a fish in an aquarium must also be a kind of animal
resource.</p><p>For information resources, easily perceived properties are less reliable and less
correlated, so we more often distinguish domains based on semantic properties; the
definitions of the <span class="quote">“<span class="quote">encyclopedia,</span>”</span>
<span class="quote">“<span class="quote">novel,</span>”</span> and <span class="quote">“<span class="quote">invoice</span>”</span> resource types distinguish them
according to their typical subject matter, or the type of content, rather than
according to the great variety of physical forms in which we might encounter them.
Arranging <span class="hardware">books</span> by color or size might be sensible for very
small collections, or in a photo studio, but organizing according to physical
properties would make it extremely impractical to find books in a large
library.</p><p>
<a id="id602666" class="indexterm"/>We can arrange types of information resources in a hierarchy but because
the category boundaries are not sharp it is more useful to view domains of
information resources on a continuum from weakly-structured narrative content to
highly structured transactional content. This <a class="glossterm" href="go01.html#gloss_framework"><em class="glossterm">framework</em></a>,
called the <em class="citetitle">Document Type Spectrum</em>
by Glushko and McGrath, captures the idea that the boundaries between resource
domains, like those between colors in the rainbow, are easy to see for colors far
apart in the spectrum but hard to see for adjacent ones.<sup>[<a id="chapter-3-endnote-04" href="#ftn.chapter-3-endnote-04" epub:type="noteref" class="footnote">128</a>]</sup> (See the Sidebar, <a class="xref" href="ch03.html#chapter-3-sidebar-1" title="The Document Type Spectrum">The Document Type Spectrum</a>, and its
corresponding depiction as <a class="xref" href="ch03.html#chapter-3-figure-3.2" title="Figure 3-2. Document Type Spectrum.">Figure 3-2</a>).</p><div class="sidebar"><a id="chapter-3-sidebar-1"/><div class="sidebar-title">The Document Type Spectrum</div><p>Different domains or types of documents can be distinguished according to the
extent to which their content is semantically prescribed, by the amount of
internal structure, and by the correlations of their presentation and formatting
to their content and structure. These three characteristics of content,
structure, and presentation vary systematically from narrative document types
like novels to transactional document types like invoices.</p><p>Narrative types are authored by people and are heterogeneous in structure and
content, and their content is usually just prose and graphic elements. Their
presentational characteristics carefully reinforce their structure and
semantics; for example, the text of titles or major headings is large because
the content is important, in contrast to the small text of footnotes.
Transactional document types are usually created mechanically and, as a result,
are homogeneous in structure and content; their content is largely
<span class="quote">“<span class="quote">data</span>”</span><span class="symbol">—</span>strongly typed content with precise semantics that can
be processed by computers.</p><p>In the middle of the spectrum are hybrid document types like textbooks,
encyclopedias, and technical manuals that contain a mixture of narrative text
and structured content in figures, data tables, code examples, and so on.</p></div><div class="figure"><a id="chapter-3-figure-3.2"/><div class="figure-contents"><div class="mediaobject"><a id="chapter-3-figure-3.2a"/><img src="figs/print/Figure3-2-Replacement.png" alt="The Document Type Spectrum represents the idea that document types vary on a continuum from narrative ones that are mostly text, like novels, to transactional ones with highly-structured information, like invoices. In between are hybrid types that contain both narrative and transactional content, like dictionaries and encyclopedias."/></div></div><div class="figure-title">Figure 3-2. Document Type Spectrum.</div></div></div><div class="sect2" title="Resource Format"><div class="titlepage"><div><div><h3 class="title" id="section-3.2.2">Resource Format</h3></div></div></div><p><span><a id="id602757" class="indexterm"/>Information resources can exist in numerous formats with the most basic
format distinction being whether the resource is physical or digital.</span>
This distinction is most important when it comes to the implementation of a resource
<span class="action">storage</span> or <span class="action">preservation</span> system because that is
where physical properties are usually considerations, and very possibly constraints.
This distinction is less important at the logical level when we <span class="action">design
interactions</span> with resources because it is often possible to use digital
surrogates for the <span class="hardware">physical resources</span> to overcome the
constraints posed by their physical properties. When we <span class="action">search</span> for
<span class="hardware">cars or appliances</span> in an online store it does not matter
where the actual cars or appliances are located or how they are <span class="hardware">physically
organized</span>. (See the Sidebar, <a class="xref" href="ch01.html#chapter-1-sidebar-4" title="The Three Tiers of Organizing Systems">The Three Tiers of Organizing Systems</a>).</p><p>Many digital representations can be associated with either physical or digital
resources, but it is important to know which one is the original or primary
resource, especially for unique or valuable ones.</p><p><span><a id="id602880" class="indexterm"/><a id="id602883" class="indexterm"/><a id="id602888" class="indexterm"/>Today a great many resources in organizing systems are <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_born_digital"><em class="glossterm">born digital</em></a></strong></span>.</span>
They are created in word processors and <span class="hardware">digital cameras</span>, or by
<span class="hardware">audio and video recorders</span>. Other resources are produced in
digital form by the many types of <span class="hardware">sensors</span> in
<span class="quote">“<span class="quote"><span class="hardware">smart things</span></span>”</span> and by the systems that
<span class="action">create digital resources</span> when they interact with
<span class="hardware">barcodes</span>, <abbr class="abbrev">QR</abbr> (<span class="quote">“<span class="quote">quick
response</span>”</span>) codes, <abbr class="abbrev">RFID</abbr> tags, or other mechanisms for
<span class="action">tracking</span> identity and location.<sup>[<a id="chapter-3-endnote-05" href="#ftn.chapter-3-endnote-05" epub:type="noteref" class="footnote">129</a>]</sup></p><p>Other digital resources are digitized ones created by <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_digitization"><em class="glossterm">digitization</em></a></strong></span>, the process for
<span class="action">transforming</span> an artifact whose original format is physical so
that it can be stored and manipulated by a computer. We can digitize the printed
word, photographs, blueprints and record albums. Printed text, for example, can be
digitized by scanning the pages and employing character recognition software or
simply by re-typing it.<sup>[<a id="chapter-3-endnote-06" href="#ftn.chapter-3-endnote-06" epub:type="noteref" class="footnote">130</a>]</sup></p><p>There are a vast number of digital formats that differ in many ways, but we can
coarsely compare them on two dimensions: the degree to which they distinguish
information content from presentation or rendering, and the explicitness with which
content distinctions are represented. Taken together, these two dimensions allow us
to compare formats on their overall <span class="quote">“<span class="quote">Information IQ</span>”</span> <span class="symbol">—</span>with
the overarching principle being that <span class="quote">“<span class="quote">smarter</span>”</span> formats contain more
computer-processable information, as illustrated in <a class="xref" href="ch03.html#chapter-3-figure-3.2.2-IQ" title="Figure 3-3. Information IQ.">Figure 3-3</a>.</p><p><a id="id603104" class="indexterm"/><a id="id603108" class="indexterm"/><a id="id603139" class="indexterm"/>Simple digital formats for <span class="quote">“<span class="quote">plain text</span>”</span> documents contain only the
characters that you see on your computer keyboard. <em class="firstterm"><abbr class="abbrev">ASCII</abbr></em> is the most commonly used
simple format, but <abbr class="abbrev">ASCII</abbr> is inadequate for most languages, which
have larger character sets, and it also cannot handle mathematical characters.<sup>[<a id="chapter-3-endnote-07" href="#ftn.chapter-3-endnote-07" class="footnote">131</a>]</sup> The Unicode standard was designed to overcome these
limitations.<sup>[<a id="chapter-3-endnote-08" href="#ftn.chapter-3-endnote-08" class="footnote">132</a>]</sup> (ASCII and Unicode are discussed in great detail in <a class="xref" href="ch08.html#section-8.3.1" title="Notations">Notations</a>.)</p><div class="figure-float"><div class="figure"><a id="chapter-3-figure-3.2.2-IQ"/><div class="figure-contents"><div class="mediaobject"><a id="chapter-3-figure-3.2.2"/><img src="figs/print/chapter-3-figure-3.2.2.png.jpg" alt="The notion of Information IQ captures the idea that document formats differ on two dimensions. One is the explicitness of content representation, and the other is the separation of content and presentation. A scanned document is just a picture of a document that doesn’t reveal any of these distinctions, so it is low on both dimensions. A database or XML document distinguishes explicitly between types of content and any presentation is separately assigned to them, so they are high on both dimensions and hence have the highest Information IQ. An HTML document has explicit distinctions between types of content, but the distinctions are usually about how the content is presented, so it has a lower IQ than an XML document or a database. Formats with high Information IQ are easily and usefully processed by computers."/></div></div><div class="figure-title">Figure 3-3. Information IQ.</div></div></div><p>Most document formats also explicitly <span class="action">encode</span> a hierarchy of
structural components, such as chapters and sections, or semantic components like
descriptions or <span class="action">procedural steps</span>, and sometimes the appearance of
the rendered or printed form.<sup>[<a id="chapter-3-endnote-09" href="#ftn.chapter-3-endnote-09" epub:type="noteref" class="footnote">133</a>]</sup> Another important distinction to note is whether the information is
encoded as a sequence of text characters so that it is human readable as well as
computer readable. Encoding character content with <abbr class="abbrev">XML</abbr>, for
example, allows for layering of intentional coding or <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_markup"><em class="glossterm">markup</em></a></strong></span> interwoven with the <span class="quote">“<span class="quote">plain text</span>”</span>
content. Because <abbr class="abbrev">XML</abbr> processors are required to support Unicode,
any character can appear in an <abbr class="abbrev">XML</abbr> document. The most complex
digital formats are those for multimedia resources and multidimensional data, where
the data format is highly optimized for specialized analysis or
applications.<sup>[<a id="chapter-3-endnote-10" href="#ftn.chapter-3-endnote-10" class="footnote">134</a>]</sup></p><p><span class="action">Digitization</span> of <span class="hardware">non-text resources</span> such as
<span class="hardware">film photography</span>, <span class="hardware">drawings</span>, and
<span class="hardware">analog audio</span> and <span class="hardware">visual recordings</span>
raises a complicated set of <span class="action">choices</span> about pixel density, color
depth, sampling rate, frequency filtering, compression, and numerous other technical
issues that determine the digital representation.<sup>[<a id="chapter-3-endnote-11" href="#ftn.chapter-3-endnote-11" epub:type="noteref" class="footnote">135</a>]</sup> There may be multiple intended uses and devices for the digitized
resource that might require different digitization approaches and formats. Moreover,
downstream users of digitized resources often need to know the format in which the
digital artifact has been created so they can <span class="action">reuse</span> it as is or
<span class="action">process</span> it in other ways.</p><p>Some digital formats support <span class="action">interactions</span> that are qualitatively
different and more powerful than those possible with physical resources. Museums are
using virtual world technology to <span class="action">create</span> interactive exhibits in
which visitors can fly through the solar system, scan their own bodies, and change
gravity so they can bounce off walls. Sophisticated digital document formats can
enable interactions with annotated digital images or video, 3-D graphics or embedded
data sets. <span><a id="id603537" class="indexterm"/>The <span class="application">Google Art Project</span> contains extremely
high resolution photographs of famous paintings that make it possible to see
details that are undetectable under the normal viewing conditions in
museums.</span><sup>[<a id="chapter-3-endnote-12" href="#ftn.chapter-3-endnote-12" class="footnote">136</a>]</sup></p><p>Nevertheless, digital representations of physical resources can also lose
important information and capabilities. The distinctive sounds of hip hop music
produced by <span class="quote">“<span class="quote">scratching</span>”</span> vinyl records on turntables cannot be produced
from digital MP3 music files.<sup>[<a id="chapter-3-endnote-13" href="#ftn.chapter-3-endnote-13" class="footnote">137</a>]</sup></p><p>Copyright often presents a <span class="action">barrier to digitization</span>, both as a
matter of law and because digitization itself enables <span class="action">copyright
enforcement</span> to a degree not possible prior to the advent of
digitization, by eliminating common forms of access and interactions that are
inherently possible with physical printed books, like the ability to give or sell
them to someone else.<sup>[<a id="chapter-3-endnote-14" href="#ftn.chapter-3-endnote-14" class="footnote">138</a>]</sup></p></div><div class="sect2" title="Resource Agency"><div class="titlepage"><div><div><h3 class="title" id="section-3.2.3">Resource Agency</h3></div></div></div><p><a id="id603779" class="indexterm"/><a id="id603786" class="indexterm"/><a id="id603827" class="indexterm"/><a id="id603847" class="indexterm"/><span><a id="def_agency"/><span class="bold"><strong><a class="glossterm" href="go01.html#gloss_agency"><em class="glossterm">Agency</em></a></strong></span>, the extent to
which a resource can initiate actions on its own, is the third distinction we
make about a resource. Another way to express this contrast is between passive
resources that are acted upon and <span class="action">active resources</span> that can
<span class="action">initiate actions</span>.</span>
<span class="application">Telephone answering machines</span> and <span class="application">fax
machines</span> are agents because they are capable of independently
responding to an outside stimulus, <span class="action">accepting and managing messages</span>.
An ordinary <span class="hardware">mercury thermometer</span> is not capable of
<span class="action">communicating its own reading</span>, but a
<span class="hardware"><span class="application">digital wireless thermometer</span></span>
or <span class="quote">“<span class="quote"><span class="hardware"><span class="application">weather station</span></span></span>”</span>
can. Passive resources serve as nouns or operands, while active resources serve as
verbs or operants.<sup>[<a id="chapter-3-endnote-15" href="#ftn.chapter-3-endnote-15" epub:type="noteref" class="footnote">139</a>]</sup></p><div class="sect3" title="Passive or Operand Resources"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.3.1">Passive or Operand Resources</h4></div></div></div><p><a id="id603982" class="indexterm"/>Organizing systems that contain passive or operand resources are
ubiquitous for the simple reason that we live in a world of physical resources
that we identify and name in order to interact with them. <span><a id="def_passive_resources"/><span class="bold"><strong><a class="glossterm" href="go01.html#gloss_passive_resources"><em class="glossterm">Passive resources</em></a></strong></span> are usually tangible and
static and thus they become valuable only as a result of some action or
interaction with them.</span></p><p>Most organizing systems with physical resources or those that contain
resources that are digitized equivalents treat those resources as passive. A
printed book on a library shelf, a digital book in an e-book reader, a statue in
a museum gallery, or a case of beer in a supermarket refrigerator only create
value when they are checked out, viewed, or consumed. None of these resources
exhibits any agency, so none can initiate any actions to create value on their
own.</p></div><div class="sect3" title="Active or Operant Resources"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.3.2">Active or Operant Resources</h4></div></div></div><p><a id="id604026" class="indexterm"/><a id="id604039" class="indexterm"/><a id="id604047" class="indexterm"/><a id="id604051" class="indexterm"/><span><a id="def_active_resources"/><span class="bold"><strong><a class="glossterm" href="go01.html#gloss_active_resources"><em class="glossterm"><span class="application">Active
resources</span></em></a></strong></span> create effects or
value on their own, sometimes when they initiate interactions with passive
resources. Active resources can be <span class="hardware">people</span>,
<span class="hardware">other living resources</span>, <span class="application">computational
agents</span>, <span class="application">active information
sources</span>, or <span class="application">web-based
services</span></span>. We can exploit computing capability,
storage capacity and communication bandwidth to create active resources that can
do things and support interactions that are impossible for ordinary
<span class="hardware">physical passive resources</span>.</p><p><span>Objects become <span class="hardware"><span class="application">active
resources</span></span> when they contain sensing or
communication capabilities.</span>
<abbr class="abbrev">RFID</abbr> chips, which are essentially <span class="hardware">barcodes</span> with built-in <span class="hardware"><span class="application">radio
transponders</span></span>, enable <span class="action">automated location
tracking</span> and <span class="action">context sensing</span>. <a id="id604148" class="indexterm"/><abbr class="abbrev">RFID</abbr> receivers are built into store shelves,
loading docks, parking lots, and toll booths to detect when some
<abbr class="abbrev">RFID</abbr>-tagged resource is at some meaningful location.
<abbr class="abbrev">RFID</abbr> tags can be made <span class="quote">“<span class="quote">smarter</span>”</span> by having them
record and transmit information from sensors that detect temperature, humidity,
acceleration, and even biological contamination.<sup>[<a id="chapter-3-endnote-16" href="#ftn.chapter-3-endnote-16" epub:type="noteref" class="footnote">140</a>]</sup></p><p><span><span class="hardware">Smart phones</span> are <a class="glossterm" href="go01.html#gloss_active_resources"><em class="glossterm">active resources</em></a> that can
identify and share their own location, orientation, acceleration and a
growing number of other <a class="glossterm" href="go01.html#gloss_contextual_properties"><em class="glossterm">contextual parameters</em></a> to enable personalization of
information services.</span>
<span><a id="id604224" class="indexterm"/><a id="id604211" class="indexterm"/><span class="hardware"><span class="application">Self-regulating
appliances</span></span> are active resources when they
communicate with each other in a <span class="quote">“<span class="quote"><span class="hardware"><span class="application">smart
building</span></span></span>”</span> to minimize energy
consumption.</span></p><p><span>Many organizing systems on the web consist of collections or
configurations of <span class="application">active digital resources</span>.
Interactions among these active resources often implement
information-intensive business models where value is created by
<span class="action">exchanging</span>, <span class="action">manipulating</span>,
<span class="action">transforming</span>, or otherwise <span class="action">processing</span>
information, rather than by manipulating, transforming, or otherwise
processing physical resources.</span></p><p><a id="id604286" class="indexterm"/><a id="id604269" class="indexterm"/><span><a id="def_SOA"/><em class="firstterm"><a id="first_SOA"/><span class="citerefentry"><span class="refentrytitle">Service Oriented Architecture</span> (SOA)</span></em> is an emerging design discipline for
organizing <span class="application">active resources</span> as
<span class="application">functional business components</span> that can be
combined in different ways. <abbr class="abbrev">SOA</abbr> is generally implemented
using web services that exchange <abbr class="abbrev">XML</abbr> documents in real-time
information flows to interconnect the business service
components.</span></p><p><a id="id604312" class="indexterm"/>A familiar design pattern for an organizing system composed from
<span class="application">active digital resources</span> is the
<span class="quote">“<span class="quote"><span class="application">online store</span>.</span>”</span> The store can be
analyzed as a composition or <span class="action">choreography</span> in which some web
pages display catalog items, others serve as <span class="quote">“<span class="quote">shopping carts</span>”</span> to
<span class="action">assemble the order</span>, and then a
<span class="quote">“<span class="quote"><span class="action">checkout</span></span>”</span> page collects the buyer’s
payment and delivery information that gets passed on to other
<span class="application">service providers</span> who <span class="action">process
payments</span> and <span class="action">deliver the goods</span>.</p><p><a id="id604371" class="indexterm"/>The web has enabled the novel application of
<span class="hardware"><span class="application">human resources as active
resources</span></span> to carry out tasks of short duration that
can be precisely described but which cannot be done reliably by computers. These
tasks include <span class="hardware"><span class="action">image classification or
annotation</span></span>, <span class="hardware"><span class="action">spoken language
transcription</span></span>, and <span class="hardware"><span class="action">sentiment
analysis</span></span>. <a id="id604393" class="indexterm"/> The people doing these tasks over the web are sometimes called
<span class="quote">“<span class="quote"><span class="hardware"><span class="application">Mechanical
Turks</span></span></span>”</span> by analogy to a fake
<span class="hardware"><span class="application">chess playing machine</span></span>
from the 18<sup>th</sup> century that had a human hidden inside
who was secretly moving the pieces.<sup>[<a id="chapter-3-endnote-17" href="#ftn.chapter-3-endnote-17" class="footnote">141</a>]</sup></p></div></div><div class="sect2" title="Resource Focus"><div class="titlepage"><div><div><h3 class="title" id="section-3.2.4">Resource Focus</h3></div></div></div><p>A fourth contrast between types of resources distinguishes primary or
<span class="hardware">original resources</span> from resources that describe them.
<span><a id="def_description_resource"/>Any primary resource can have one or more
description resources associated with it to facilitate <span class="action">finding</span>,
<span class="action">interacting</span> with, or <span class="action">interpreting</span> the
primary one. Description resources are essential in organizing systems where the
primary resources are not under their control and can only be accessed or
interacted with through the description. Description resources are often called
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_metadata"><em class="glossterm">metadata</em></a></strong></span>.</span></p><p><span>The distinction between primary resources and description resources, or
metadata, is deeply embedded in library science and traditional organizing
systems whose collections are predominantly text resources like books, articles,
or other documents. In these contexts description resources are commonly called
bibliographic resources or catalogs, and each primary resource is typically
associated with one or more description resources.</span></p><p>In business enterprises, the organizing systems for digital information resources,
such as business documents, or data records created by transactions or automated
processes, almost always employ resources that describe, or are associated with,
large sets or classes of primary resources.<sup>[<a id="chapter-3-endnote-18" href="#ftn.chapter-3-endnote-18" epub:type="noteref" class="footnote">142</a>]</sup></p><p><a id="id604648" class="indexterm"/><span><a id="def_focus"/>The contrast between primary resources and
description resources is very useful in many contexts, but when we look more
broadly at organizing systems, it is often difficult to distinguish them, and
determining which resources are primary and which are metadata is often just a
decision about which resource is currently the <span class="strong"><strong>focus</strong></span> of our attention.</span></p><p><a id="id604578" class="indexterm"/>For example, many people who use <span class="application">Twitter</span>
focus on the 140-character message body as the primary resource, while the
associated metadata about the sender and the message (is it a forward, reply, link,
and so on?) is less important to them. However, for firms in the growing ecosystem
of services that use <span class="application">Twitter</span> metadata to measure sender
and brand impact, identify social networks, and assess trends, the focus is on the
metadata, not the message content.<sup>[<a id="chapter-3-endnote-19" href="#ftn.chapter-3-endnote-19" class="footnote">143</a>]</sup></p><p><a id="id604692" class="indexterm"/>As another example, <a id="id604695" class="indexterm"/>the <span class="hardware">players on professional sports teams</span> are
<span class="hardware">human resources</span> that we enjoy watching as they compete, but
millions of people participate in <span class="hardware"><span class="application">fantasy sports
leagues</span></span> where teams consist of fantasy players that
are simulated resources based on the statistics generated by the <span class="hardware">actual
human players</span>. Put another way, the associated resources in the
actual sports are treated as the primary ones in the fantasy leagues.<sup>[<a id="chapter-3-endnote-20" href="#ftn.chapter-3-endnote-20" epub:type="noteref" class="footnote">144</a>]</sup></p></div><div class="sect2" title="Resource Format x Focus"><div class="titlepage"><div><div><h3 class="title" id="section-3.2.5">Resource Format x Focus</h3></div></div></div><p>Applying the format contrast between physical and digital resources to the focus
distinction between primary and descriptive resources yields a useful <a class="glossterm" href="go01.html#gloss_framework"><em class="glossterm">framework</em></a> with four categories of
resources (see <a class="xref" href="ch03.html#chapter-3-figure-3.3" title="Figure 3-4. Resource Format x Focus.">Figure 3-4</a>).</p><div class="figure-float"><div class="figure"><a id="chapter-3-figure-3.3"/><div class="figure-contents"><div class="mediaobject"><a id="chapter-3-figure-3.3a"/><img src="figs/print/ch3.4-350dpi.png" alt="The distinctions of resource format and resource focus are orthogonal so they can be combined to distinguish four categories of resources: primary physical ones, primary digital ones, physical description ones, and digital description ones."/></div></div><div class="figure-title">Figure 3-4. Resource Format x Focus.</div></div></div><div class="sect3" title="Physical Description of a Primary Physical Resource"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.5.1">Physical Description of a Primary Physical Resource</h4></div></div></div><p><span>The oldest relationship between descriptive resources and
<span class="hardware">physical resources</span> is when descriptions or other
information about physical resources are themselves encoded in a
<span class="hardware">physical form</span>. <span class="history">Nearly ten
thousand years ago in Mesopotamia, small <span class="hardware">clay tokens</span>
kept in clay containers served as inventory information to count units
of <span class="hardware">goods or livestock</span>. It took 5000 years for the
idea of stored tokens to evolve into cuneiform writing, in which marks in
clay stood for the tokens and made both the tokens and containers
unnecessary.</span></span><sup>[<a id="chapter-3-endnote-21" href="#ftn.chapter-3-endnote-21" epub:type="noteref" class="footnote">145</a>]</sup>
<span class="LIS">Printed cards served as physical description resources for
books in libraries for nearly two centuries.</span><sup>[<a id="chapter-3-endnote-22" href="#ftn.chapter-3-endnote-22" class="footnote">146</a>]</sup></p></div><div class="sect3" title="Digital Description of a Primary Physical Resource"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.5.2">Digital Description of a Primary Physical Resource</h4></div></div></div><p><a id="id604999" class="indexterm"/>Here, the digital resource describes a <span class="hardware">physical
resource</span>. The most familiar example of this relationship is the
online library catalog used to find the shelf location of <span class="hardware">physical
library resources</span>, which, beginning in the 1960s, replaced the
physical cards with database records. The online catalogs for museums usually
contain a digital photograph of the painting, item of sculpture, or other museum
object that each catalog entry describes.</p><p><a id="id605026" class="indexterm"/><a id="id605011" class="indexterm"/><a id="id605033" class="indexterm"/><a id="id605041" class="indexterm"/><a id="id605046" class="indexterm"/>Digital description resources for <span class="hardware">primary physical
resources</span> are essential in <span class="application">supply chain
management</span>, <span class="application">logistics</span>,
<span class="application">retailing</span>, <span class="application">transportation</span>,
and every business model that depends on having timely and accurate information
about where things are or about their current states. This digital description
resource is created as a result of an interaction with a primary physical
resource like a <span class="hardware"><span class="application">temperature
sensor</span></span> or with some <span class="hardware">secondary physical
resource</span> that is already associated with the <span class="hardware">primary
physical resource</span> like an <span class="hardware">RFID tag</span> or
<span class="hardware">barcode</span>.</p><p><span class="application">Augmented reality systems</span> combine a layer of
real-time digital information about some <span class="hardware">physical object</span> with
a digital view or representation of it. The yellow <span class="quote">“<span class="quote">first down</span>”</span>
lines superimposed in broadcasts of football games are a familiar example.
Augmented reality techniques that <span class="hardware">superimpose identifying or
descriptive metadata</span> have been used in displays to support the <a id="id605114" class="indexterm"/><span class="application">operation or maintenance of complex
equipment</span>, in <span class="hardware"><span class="application">smart phone
navigation</span></span> and tourist guides, in advertising,
and in other domains where users might otherwise need to consult a separate
information source. Advanced airplane cockpit technology includes
<span class="hardware">heads-up displays</span> that <span class="action">present critical
data</span> based on available instrumentation, including
<span>augmented reality runway lights when visibility is poor because of
clouds or fog</span>.</p></div><div class="sect3" title="Digital Description of a Primary Digital Resource"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.5.3">Digital Description of a Primary Digital Resource</h4></div></div></div><p><a id="id605168" class="indexterm"/>Here, the digital resource describes a digital resource. This is the
relationship in a digital library or any web-based organizing system, and it
makes it possible to access the primary digital resource directly from the
digital secondary resource.</p></div><div class="sect3" title="Physical Description of a Primary Digital Resource"><div class="titlepage"><div><div><h4 class="title" id="section-3.2.5.4">Physical Description of a Primary Digital Resource</h4></div></div></div><p><a id="id605179" class="indexterm"/>This is the relationship implemented when we encounter an embedded
<span class="hardware">QR barcode</span> in newspaper or magazine advertisements, on
billboards, sidewalks, t-shirts, or on store shelves. Scanning the
<abbr class="abbrev">QR</abbr> code with a <span class="hardware">mobile phone camera</span> can
launch a website that contains information about a product or service,
<span class="action">place an order</span> for one unit of the pointed-to item in a
web catalog, <span class="action">dial a phone number</span>, or <span class="action">initiate another
application or service</span> identified by the <abbr class="abbrev">QR</abbr>
code.<sup>[<a id="chapter-3-endnote-23" href="#ftn.chapter-3-endnote-23" epub:type="noteref" class="footnote">147</a>]</sup></p></div></div></div><div class="sect1" title="Resource Identity"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.3">Resource Identity</h2></div></div></div><p>Determining the identity of resources that belong in a domain, deciding which
properties are important or relevant to the people or systems operating in that domain,
and then specifying the principles by which those properties encapsulate or define the
relationships among the resources are the essential tasks when building any organizing
system. In organizing systems used by individuals or with small scope, the methods for
doing these tasks are often <span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">ad hoc</em></span> and
unsystematic, and the organizing systems are therefore idiosyncratic and do not scale
well. At the other extreme, organizing systems designed for institutional or
industry-wide use, especially in information-intensive domains, require systematic
design methods to determine which resources will have separate identities and how they
are related to each other. These resources and their relationships are then described in
conceptual models which guide the implementation of the systems that manage the
resources and support interactions with them.<sup>[<a id="chapter-3-endnote-24" href="#ftn.chapter-3-endnote-24" epub:type="noteref" class="footnote">148</a>]</sup></p><div class="sect2" title="Identity and Physical Resources"><div class="titlepage"><div><div><h3 class="title" id="section-3.3.1">Identity and Physical Resources</h3></div></div></div><p><span>Our human visual and cognitive systems do a remarkable job at picking out
objects from their backgrounds and distinguishing them from each other.</span>
In fact, we have little difficulty recognizing an object or a person even if we are
seeing them from a novel distance and viewing angle or with different lighting,
shading, and so on. When we watch a football game, we do not have any trouble
perceiving the players moving around the field, and their contrasting uniform colors
allow us to see that there are two different teams.</p><p><a id="id605455" class="indexterm"/>The perceptual mechanisms that make us see things as permanent objects with
|
||
contrasting visible properties are just the prerequisite for the organizing tasks of
|
||
identifying the specific object, determining the categories of objects to which it
|
||
belongs, and deciding which of those categories is appropriate to emphasize. Most of
|
||
the time we carry out these tasks in an automatic, unconscious way; at other times
|
||
we make conscious decisions about them. For some purposes we consider a sports team
|
||
as a single resource; for others, as a collection of separate players, as offense and
|
||
defense, as starters and reserves, and so on.<sup>[<a id="chapter-3-endnote-25" href="#ftn.chapter-3-endnote-25" epub:type="noteref" class="footnote">149</a>]</sup></p><p><a id="id605487" class="indexterm"/>Although we have many choices about how we can organize football players, all of
|
||
them will include the concept of a single player as the smallest identifiable
|
||
resource. We are never going to think of a football player as an intentional
|
||
collection of separately identified leg, arm, head, and body resources because there
|
||
are no other ways to <span class="quote">“<span class="quote">assemble</span>”</span> a human from body parts. Put more
|
||
generally, there are some natural constraints on the organization of matter into
|
||
parts or collections based on sizes, shapes, materials, and other properties that
|
||
make us identify some things as indivisible resources in some domain.</p></div><div class="sect2" title="Identity and Bibliographic Resources"><div class="titlepage"><div><div><h3 class="title" id="section-3.3.2">Identity and Bibliographic Resources</h3></div></div></div><p><a id="id605493" class="indexterm"/><span>Pondering the question of <a class="glossterm" href="go01.html#gloss_identity"><em class="glossterm">identity</em></a> is something relatively recent in
|
||
the world of librarians and catalogers. Libraries have been around for about
|
||
4000 years, but until the last few hundred years librarians created
|
||
<span class="quote">“<span class="quote">bins</span>”</span> of headings and topics to organize resources without
|
||
bothering to give each individual item a separate identifier or name. This meant
|
||
searchers first had to make an educated guess as to which bin might house their
|
||
desired information<span class="symbol">—</span><span class="quote">“<span class="quote">Histories</span>”</span>?
|
||
<span class="quote">“<span class="quote">Medical and Chemical Philosophy</span>”</span>?<span class="symbol">—</span>then
|
||
scour everything in the category in a quest for their desired item. The choices
|
||
were <span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">ad hoc</em></span> and always
|
||
local<span class="symbol">—</span>that is, each cataloger decided the bins and
|
||
groupings for each catalog.</span><sup>[<a id="chapter-3-endnote-26" href="#ftn.chapter-3-endnote-26" epub:type="noteref" class="footnote">150</a>]</sup></p><p><a id="id605688" class="indexterm"/><a id="id605692" class="indexterm"/>The first systematic approach to dealing with the concept of <a class="glossterm" href="go01.html#gloss_identity"><em class="glossterm">identity</em></a> for
|
||
bibliographic resources was developed by <span class="personname"><span class="firstname">Antonio</span> <span class="surname">Panizzi</span></span> at the <span class="orgname">British Museum</span> in the mid-19th century.
|
||
Panizzi wondered: How do we differentiate similar objects in a library catalog? His
|
||
solution was a catalog organized by author name with an index of subjects, along
|
||
with his newly concocted <em class="citetitle">Rules for the Compilation of the
|
||
Catalogue</em>. This contained 91 rules about how to identify and arrange
|
||
author names and titles and what to do with anonymous works. The rules were meant to
|
||
codify how to differentiate and describe each singular resource in his library.
|
||
Taken together, the rules serve to group all the different editions and versions of
|
||
a work together under a single identity.<sup>[<a id="chapter-3-endnote-27" href="#ftn.chapter-3-endnote-27" epub:type="noteref" class="footnote">151</a>]</sup></p><p><a id="id605779" class="indexterm"/><a id="id605783" class="indexterm"/>The concept of identity for bibliographic resources was refined in the
|
||
1950s by Lubetzky, who enlarged the concept of <span class="quote">“<span class="quote">the work</span>”</span> to make it a
|
||
more abstract idea of an author’s intellectual or artistic creation. According to
|
||
Lubetzky’s principle, an audio book, a video recording of a play, and an electronic
|
||
book should each be listed as distinct items, yet still linked to the original
|
||
because of their overlapping intellectual origin.<sup>[<a id="chapter-3-endnote-28" href="#ftn.chapter-3-endnote-28" epub:type="noteref" class="footnote">152</a>]</sup></p><p><a id="id605862" class="indexterm"/><a id="id605876" class="indexterm"/><a id="id605886" class="indexterm"/><a id="id605896" class="indexterm"/><a id="id605907" class="indexterm"/><a id="id605917" class="indexterm"/><a id="id605931" class="indexterm"/><span><a id="def_four-step_abstraction"/>The distinctions put forth by
|
||
Lubetzky, Svenonius and other library science theorists have evolved today into
|
||
a four-step abstraction hierarchy (illustrated in <a class="xref" href="ch03.html#chapter-3-figure-3.3.2" title="Figure 3-5. The Abstraction Hierarchy for Identifying Resources.">Figure 3-5</a>) between the abstract <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_work"><em class="glossterm">work</em></a></strong></span>, an <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_expression"><em class="glossterm">expression</em></a></strong></span> in multiple formats or genres, a
|
||
particular <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_manifestation"><em class="glossterm">manifestation</em></a></strong></span> in one of
|
||
those formats or genres, and a specific physical <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_item"><em class="glossterm">item</em></a></strong></span>.</span> The broad scope from the abstract
|
||
work to the specific item is essential because organizing systems in libraries must
|
||
organize tangible artifacts while expressing the conceptual structure of the domains
|
||
of knowledge represented in their collections.</p><p>If we revisit the question <span class="quote">“<span class="quote">What is this thing we call
|
||
<em class="citetitle">Macbeth</em>?</span>”</span> we can see how different ways of
|
||
answering fit into this abstraction hierarchy. The most specific answer is that
|
||
<span class="quote">“<span class="quote"><em class="citetitle">Macbeth</em></span>”</span> is a specific <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_item"><em class="glossterm">item</em></a></strong></span>, a very particular and individual resource, like
|
||
that dog-eared paperback with yellow marked pages that you owned when you read
|
||
<em class="citetitle">Macbeth</em> in high school. A more abstract answer is that
|
||
<em class="citetitle">Macbeth</em> is an idealization called a <span class="strong"><strong>work</strong></span>, a category that includes all the plays, movies,
|
||
ballets, or other intellectual creations that share a recognizable amount of the
|
||
plot and meaning from the original Shakespeare play.</p><div class="figure-float"><div class="figure"><a id="chapter-3-figure-3.3.2"/><div class="figure-contents"><div class="mediaobject"><img src="figs/print/ucb-tdo-chapter3.3.2-color.png" alt="The abstraction hierarchy for identifying resources yields four different answers about the identity of an information resource. The most specific answer is that every individual resource can be treated as an ITEM because two resources can always be distinguished on the basis of their physical properties. A particular printed copy of Macbeth is an item. By ignoring some of their differentiating properties, a set of items can be treated as a single MANIFESTATION, such as all of the copies of Macbeth from a particular publisher and edition. If we ignore still more differences we can consider a set of resources as an EXPRESSION, like all versions of Macbeth in book form, regardless of publisher or edition. The most general answer to identifying resources would treat as the Macbeth WORK everything that shares the intellectual or conceptual idea of Macbeth, which puts books, movies, plays, and other manifestations in the same equivalence class."/></div></div><div class="figure-title">Figure 3-5. The Abstraction Hierarchy for Identifying Resources.</div></div></div><p><a id="id606092" class="indexterm"/><a id="id606096" class="indexterm"/>This hierarchy is defined in the <em class="firstterm"><a id="ref_FRBR"/><span class="citerefentry"><span class="refentrytitle">Functional Requirements for Bibliographic
|
||
Records</span>(FRBR)</span></em>, published as a standard by the <span class="citerefentry"><span class="refentrytitle"><span class="orgname">International Federation of Library Associations and
|
||
Institutions</span></span>(IFLA)</span>.<sup>[<a id="chapter-3-endnote-29" href="#ftn.chapter-3-endnote-29" epub:type="noteref" class="footnote">153</a>]</sup></p></div><div class="sect2" title="Identity and Information Components"><div class="titlepage"><div><div><h3 class="title" id="section-3.3.3">Identity and Information Components</h3></div></div></div><p><a id="id606253" class="indexterm"/>In information-intensive domains, documents, databases, software
|
||
applications, or other explicit repositories or sources of information are
|
||
ubiquitous and essential to the creation of value for the user, reader, consumer, or
|
||
customer. Value is created through the <span class="action">comparison</span>,
|
||
<span class="action">compilation</span>, <span class="action">coordination</span> or
|
||
<span>transformation of information in some chain or choreography of processes
|
||
operating on information flowing from one information source or process to
|
||
another</span>. These processes are employed in
|
||
<span class="application">accounting</span>, <span class="application">financial
|
||
services</span>, <span class="application">procurement, logistics</span>,
|
||
<span class="application">supply chain management</span>, <span class="application">insurance
|
||
underwriting and claims processing</span>, <span class="application">legal and
|
||
professional services</span>, <span class="application">customer
|
||
support</span>, <span class="application">computer programming</span>, and
|
||
<span class="application">energy management</span>.</p><p>The processes that create value in information-intensive domains are <span class="quote">“<span class="quote">glued
|
||
together</span>”</span> by <span>shared information components</span> that are
|
||
exchanged in documents, records, messages, or resource descriptions of some kind.
|
||
Information components are the primitive and abstract resources in
|
||
information-intensive domains. They are the units of meaning that serve as building
|
||
blocks of composite descriptions and other information artifacts.</p><p>The value creation processes in information-intensive domains work best when their
|
||
component parts come from a common <a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled vocabulary</em></a> for components, or when each uses a
|
||
vocabulary with a granularity and semantic precision compatible with the others. For
|
||
example, the value created by a personal health record emerges when information from
|
||
doctors, clinics, hospitals, and insurance companies can be combined because they
|
||
all share the same <span class="quote">“<span class="quote">patient</span>”</span> component as a logical piece of
|
||
information.</p><p>This abstract definition of information components does not help identify them, so
|
||
we will introduce some heuristic criteria: An <span class="quote">“<span class="quote">information component</span>”</span>
|
||
can be (1) any piece of information that has a unique label or identifier or (2) any
|
||
piece of information that is self-contained and comprehensible on its own.<sup>[<a id="chapter-3-endnote-30" href="#ftn.chapter-3-endnote-30" epub:type="noteref" class="footnote">154</a>]</sup></p><p>These two criteria for determining the identity of information components are
|
||
often easy to satisfy through observations, interviews, and task analysis because
|
||
people naturally use many different types of information and talk easily about
|
||
specific components and the documents that contain them. Some common components
|
||
(e.g., person, location, date, item) and familiar document types (e.g., report,
|
||
catalog, calendar, receipt) can be identified in almost any domain. Other components
|
||
need to be more precisely defined to meet the more specific semantic requirements of
|
||
narrower domains. These smaller or more fine-grained components might be viewed as
|
||
refined or qualified versions of the generic components and document types, like
|
||
course grade and semester components in academic transcripts, airport codes and
|
||
flight numbers in travel itineraries and tickets, and drug names and dosages in
|
||
prescriptions.</p><p>Decades of practical and theoretical effort in conceptual modeling, relational
|
||
theory, and database design have resulted in rigorous methods for identifying
|
||
information components when requirements and business rules for information can be
|
||
precisely specified. For example, in the domain of business transactions, required
|
||
information like item numbers, quantities, prices, payment information, and so on
|
||
must be encoded as a particular type of data<span class="symbol">—</span>integer,
|
||
decimal, Unicode string, etc.<span class="symbol">—</span> with clearly defined possible
|
||
values and must follow clear occurrence rules.<sup>[<a id="chapter-3-endnote-31" href="#ftn.chapter-3-endnote-31" epub:type="noteref" class="footnote">155</a>]</sup></p><p><a id="id606491" class="indexterm"/>Identifying components can seem superficially easy at the transactional
|
||
end of the Document Type Spectrum (see the Sidebar in <a class="xref" href="ch03.html#section-3.2.1" title="Resource Domain">Resource Domain</a>), with orders or invoices, forms requiring data entry, or other highly structured
|
||
document types like product catalogs, where pieces of information are typically
|
||
labeled and delimited by boxes, lines, white space or other presentation features
|
||
that encode the distinctions between types of content. For example, the presence of
|
||
ITEM, CUSTOMER NAME, ADDRESS, and PAYMENT INFORMATION labels on the fields of an
|
||
online order form suggests these pieces of information are semantically distinct
|
||
components in a retail application. They follow the <span class="quote">“<span class="quote">self-contained and
|
||
comprehensible</span>”</span> heuristic well enough to interconnect the order management,
|
||
payment, and delivery services that work together to carry out the transaction. In
|
||
addition, these labels might have analogues in variable names in the source code
|
||
that implements the order form, or as tags in an <abbr class="abbrev">XML</abbr> document
|
||
created by the ordering application; <code class="sgmltag-starttag">&lt;CustName&gt;</code>John Smith<code class="sgmltag-endtag">&lt;/CustName&gt;</code> and <code class="sgmltag-starttag">&lt;Item&gt;</code>A-19<code class="sgmltag-endtag">&lt;/Item&gt;</code> in the
|
||
order document can be easily identified when it is sent to the other services by the
|
||
order management application.</p><p>But the theoretically grounded methods for identifying components like those of
|
||
relational theory and <a id="id606631" class="indexterm"/>normalization that work for structured data do not strictly apply when
|
||
information requirements are more qualitative and less precise at the narrative end
|
||
of the Document Type Spectrum. These information requirements are typical of
|
||
narrative, unstructured and semi-structured types of documents, and information
|
||
sources like those often found in law, education, and professional services.
|
||
Narrative documents include technical publications, reports, policies, procedures
|
||
and other less structured information, where semantic components are rarely labeled
|
||
explicitly and are often surrounded by text that is more generic. Unlike
|
||
transactional documents that depend on precise semantics because they are used by
|
||
computers, narrative documents are used by people, who can ask if they aren’t sure
|
||
what something means, so there is less need to explicitly define the meaning of the
|
||
information components. Occasional exceptions, such as where components in narrative
|
||
documents are identified with explicit labels like NOTE and WARNING, only prove the
|
||
rule.</p></div><div class="sect2" title="Identity and Active Resources"><div class="titlepage"><div><div><h3 class="title" id="section-3.3.4">Identity and Active Resources</h3></div></div></div><p><a id="id606595" class="indexterm"/><a class="glossterm" href="go01.html#gloss_active_resources"><em class="glossterm">Active resources</em></a> (<a class="xref" href="ch03.html#section-3.4.3.2" title="Use Controlled Vocabularies">Use Controlled Vocabularies</a>) initiate effects or create value
|
||
on their own. In many cases an <span class="hardware">inherently passive physical
|
||
resource</span> like a <span class="hardware">product package</span> or
|
||
<span class="hardware">shipping pallet</span> is transformed into an active one when it
|
||
is associated with an <span class="hardware">RFID tag or bar code</span>. <span class="hardware">Mobile
|
||
phones</span> contain device or subscriber IDs so that any information they
|
||
communicate can be associated both with the phone and often, through indirect
|
||
reference, with a particular person. If the resource has an IP address, it is said
|
||
to be part of the <span class="quote">“<span class="quote"><span><span class="hardware">Internet of
|
||
Things</span></span>.</span>”</span><sup>[<a id="chapter-3-endnote-32" href="#ftn.chapter-3-endnote-32" epub:type="noteref" class="footnote">156</a>]</sup></p><p>Organizing systems that create value from <span class="hardware">active resources</span>
|
||
often co-exist with or complement organizing systems that treat their resources as
|
||
passive. In a traditional library, books sat passively on shelves and required users
|
||
to read their spines to identify them. Today, some library books contain active
|
||
<abbr class="abbrev">RFID</abbr> tags that make them dynamic information sources that
|
||
self-identify by publishing their own locations. Similarly, a supermarket or
|
||
department store might organize its goods as physical resources on shelves, treating
|
||
them as passive resources; superimposed on that traditional organizing system is one
|
||
that uses point-of-sale transaction information created when items are scanned at
|
||
<span class="hardware">checkout counters</span> to automatically re-order goods and
|
||
<span class="action">replenish the inventory</span> at the store where they were sold. In
|
||
some stores the shelves contain <span class="hardware">sensors</span> that continually
|
||
<span class="quote">“<span class="quote">talk to the goods</span>”</span> and the information they gather can maintain
|
||
inventory levels and even help prevent theft of valuable merchandise by tracking
|
||
goods through a store or warehouse. The inventory becomes a collection of active
|
||
resources, each item eager to announce its own location and ready to conduct its own
|
||
sale.</p><p><span class="hardware">Blogjects</span><span class="symbol">—</span><span class="hardware">objects that
|
||
blog</span><span class="symbol">—</span>and
|
||
<span class="hardware">Tweetjects</span><span class="symbol">—</span>objects that post
|
||
messages to <span class="application">Twitter</span><span class="symbol">—</span>are neologisms
|
||
for active resources that are plugged into the social web. Blogjects don’t write
|
||
editorial commentary about their experiences, but they use <abbr class="abbrev">API</abbr>s and
|
||
customized programs to harness the information captured by sensors and
|
||
<abbr class="abbrev">RFID</abbr> that then appears on blogs in the form of human-readable
|
||
maps, charts, and text.<sup>[<a id="chapter-3-endnote-33" href="#ftn.chapter-3-endnote-33" epub:type="noteref" class="footnote">157</a>]</sup></p><p><span class="hardware">Tweetjects</span> are <span class="hardware">sensors</span> that send
|
||
information about measurements or events to a <span class="application">Twitter</span>
|
||
account. For example, <span class="orgname">SparkFun Electronics</span> sells a kit consisting
|
||
of a soil sensor that sends information about the water level in the soil through an
|
||
<span class="hardware">Arduino circuit board</span>, converting thresholds to
|
||
<span class="application">Twitter</span> messages like, <span class="quote">“<span class="quote">Please water me, I’m
|
||
thirsty!</span>”</span><sup>[<a id="chapter-3-endnote-34" href="#ftn.chapter-3-endnote-34" class="footnote">158</a>]</sup></p><p>The extent to which an <span class="hardware">active resource</span> is
|
||
<span class="quote">“<span class="quote">smart</span>”</span> depends on how much computing capability it has available
|
||
to refine the data it collects and communicates. <span>A large collection of
|
||
sensors can transmit a torrent of captured data that requires substantial
|
||
processing to distinguish significant events from those that reflect normal
|
||
operation, and also from those that are statistical outliers with strange values
|
||
caused by random noise.</span>
|
||
<a id="id607025" class="indexterm"/> This challenge gets qualitatively more difficult as the amount of data
|
||
grows to <a class="glossterm" href="go01.html#gloss_big_data"><em class="glossterm">big data</em></a> size, because a
|
||
one-in-a-million event might be a statistical outlier that can be ignored, but if
|
||
there are a thousand similar outliers in a billion sensor readings, this cluster of
|
||
data probably reveals something important. On the other hand, giving every sensor
|
||
the computing capability to refine its data so that it only communicates significant
|
||
information might make the sensors too expensive to deploy.<sup>[<a id="chapter-3-endnote-35" href="#ftn.chapter-3-endnote-35" epub:type="noteref" class="footnote">159</a>]</sup></p></div></div><div class="sect1" title="Naming Resources"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.4">Naming Resources</h2></div></div></div><p><a id="id607151" class="indexterm"/>Determining the identity of the thing, document, information component, or data item
|
||
we need isn’t always enough. We often need to give that resource a name, a label that
|
||
will help us understand and talk about what it is. But naming isn’t just the simple task
|
||
of assigning a sequence of characters. In this section, we’ll discuss why we name, some
|
||
of the problems with naming, and the principles that help us name things in useful
|
||
ways.</p><div class="sect2" title="What’s in a Name?"><div class="titlepage"><div><div><h3 class="title" id="section-3.4.1">What’s in a Name?</h3></div></div></div><p>When a <span class="hardware">child</span> is born, its parents give it a name, often a
|
||
very stressful and contentious decision. Names serve to distinguish one
|
||
<span class="hardware">person</span> from another. Names also, intentionally or
|
||
unintentionally, suggest characteristics or aspirations. The name given to us at
|
||
birth is just one of the names we will be identified with during our lifetimes. We
|
||
have nicknames, names we use professionally, names we use with friends, and names we
|
||
use online. Our banks, our schools, and our governments will know who we are because
|
||
of numbers they associate with our names. As long as it serves to
|
||
identify you, your name could be anything.<sup>[<a id="chapter-3-endnote-36" href="#ftn.chapter-3-endnote-36" class="footnote">160</a>]</sup></p><p>Resources other than <span class="hardware">people</span> need names so we can find them,
|
||
describe them, reuse them, refer or link to them, record who owns them, and
|
||
otherwise interact with them. In many domains the names assigned to resources are
|
||
also influenced or constrained by rules, industry practice, or technology
|
||
considerations.</p></div><div class="sect2" title="The Problems of Naming"><div class="titlepage"><div><div><h3 class="title" id="section-3.4.2">The Problems of Naming</h3></div></div></div><p>Giving names to anything, from a business to a concept to an action, can be a
|
||
difficult process, and it can be done well or poorly. The following
|
||
section details some of the major challenges in assigning a name to a
|
||
resource.</p><div class="sect3" title="The Vocabulary Problem"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.2.1">The Vocabulary Problem</h4></div></div></div><p><a id="id607381" class="indexterm"/><a id="id607345" class="indexterm"/><span><a id="def_vocabulary_problem"/>Every natural language offers
|
||
more than one way to express any thought, and in particular there are
|
||
usually many words that can be used to refer to the same thing or concept.
|
||
The words people choose to name or describe things are embodied in their
|
||
experiences and context, so people will often disagree in the words they
|
||
use. Moreover, people are often a bit surprised when this happens, because
|
||
what seems like the natural or obvious name to one person isn’t natural or
|
||
obvious to another.</span><sup>[<a id="chapter-3-endnote-37" href="#ftn.chapter-3-endnote-37" epub:type="noteref" class="footnote">161</a>]</sup></p><p><a id="id607423" class="indexterm"/>Back in the 1980s in the early days of computer user interface
|
||
design, <span class="personname"><span class="firstname">George</span> <span class="surname">Furnas</span></span> and his colleagues at <span class="orgname">Bell Labs</span> conducted a set
|
||
of experiments to measure how much people would agree when they named some
|
||
resource or function. The short answer: very little. Left to our own devices, we
|
||
come up with a shockingly large number of names for a single common
|
||
thing.</p><p>In one experiment, a thousand pairs of people were asked to <span class="quote">“<span class="quote">write the
|
||
name you would give to a program that tells about interesting activities
|
||
occurring in some major metropolitan area.</span>”</span> Fewer than 12 pairs of
|
||
people agreed on a name. Furnas called this phenomenon <span class="quote">“<span class="quote">the vocabulary
|
||
problem,</span>”</span> concluding that no single word could ever be considered the
|
||
<span class="quote">“<span class="quote">best</span>”</span> name.<sup>[<a id="chapter-3-endnote-38" href="#ftn.chapter-3-endnote-38" epub:type="noteref" class="footnote">162</a>]</sup></p></div><div class="sect3" title="Homonymy, Polysemy, and False Cognates"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.2.2">Homonymy, Polysemy, and False Cognates</h4></div></div></div><p><a id="id607511" class="indexterm"/><a id="id607534" class="indexterm"/>Sometimes the same word can refer to different
|
||
resources<span class="symbol">—</span>a <span class="quote">“<span class="quote">bank</span>”</span> can be a financial
|
||
institution or the side of a river. <span><a id="def_homographs"/>When two words
|
||
are spelled the same but have different meanings they are <span class="strong"><strong><a class="glossterm" id="term_homographs" href="go01.html#gloss_homographs"><em class="glossterm">homographs</em></a>;</strong></span> if they are also pronounced the same
|
||
they are <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_homonyms"><em class="glossterm">homonyms</em></a></strong></span>. If the
|
||
different meanings of the homographs are related, they are called <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_polysemes"><em class="glossterm">polysemes</em></a></strong></span>.</span></p><p>Resources with homonymous and polysemous names are sometimes incorrectly
|
||
identified, especially by an automated process that can’t use common sense or
|
||
context to determine the correct referent. Polysemy can cause more trouble than
|
||
simple homography because the overlapping meaning might obscure the
|
||
misinterpretation. If one person thinks of a <span class="quote">“<span class="quote">shipping container</span>”</span>
|
||
as being a cardboard box and orders some of them, while another person thinks of
|
||
a <span class="quote">“<span class="quote">shipping container</span>”</span> as the large box carried by semi-trailers
|
||
and stacked on cargo ships, their disagreement might not be discovered until the
|
||
wrong kinds of containers arrive.<sup>[<a id="chapter-3-endnote-39" href="#ftn.chapter-3-endnote-39" epub:type="noteref" class="footnote">163</a>]</sup></p><p>Many words in different languages have common roots, and as a result are often
|
||
spelled the same or nearly the same. This is especially true for technology
|
||
words; for example, <span class="quote">“<span class="quote">computer</span>”</span> has been borrowed by many languages.
|
||
The existence of these cognates and borrowed words makes us vulnerable to false
|
||
cognates. When a word in one language has a different meaning and refers to
|
||
different resources in another, the results can be embarrassing or disastrous.
|
||
<span class="quote">“<span class="quote"><span xml:lang="de" class="foreignphrase"><em xml:lang="de" class="foreignphrase">Gift</em></span></span>”</span> is poison
|
||
in German; <span class="quote">“<span class="quote"><span xml:lang="fr" class="foreignphrase"><em xml:lang="fr" class="foreignphrase">pain</em></span></span>”</span> is bread
|
||
in French.</p></div><div class="sect3" title="Names with Undesirable Associations"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.2.3">Names with Undesirable Associations</h4></div></div></div><p>False cognates are a special category of words that make poor names, and there
|
||
are many stories relating product marketing mistakes, where a product name or
|
||
description translates poorly into other languages or cultures, with
|
||
undesirable associations.<sup>[<a id="chapter-3-endnote-40" href="#ftn.chapter-3-endnote-40" epub:type="noteref" class="footnote">164</a>]</sup> Furthermore, these undesirable associations differ across cultures.
|
||
For example, even though the straightforward purpose of floor numbers is to
|
||
identify floors from lowest to highest levels, most buildings in Western
|
||
cultures skip the 13<sup>th</sup> floor because many people
|
||
think 13 is an unlucky number. In many East and Southeast Asian buildings, the
|
||
4<sup>th</sup> floor is skipped. In China the number 4 is
|
||
dreaded because it sounds like the word for <span class="quote">“<span class="quote">death,</span>”</span> while 8 is
|
||
prized because it sounds like the word for <span class="quote">“<span class="quote">wealth.</span>”</span></p><p><a id="id607795" class="indexterm"/>While it can be tempting to dismiss unfamiliar biases and beliefs about names
|
||
and identifiers as harmless superstitions and practices, their implications are
|
||
ubiquitous and far from benign.
|
||
<a class="glossterm" href="go01.html#gloss_alphabetical"><em class="glossterm">Alphabetical ordering</em></a> might seem like a fair and
|
||
non-discriminatory arrangement of resources, but because it is easy to choose
|
||
the name at the top of an alphabetical list, many firms in service businesses
|
||
select names that begin with <span class="quote">“<span class="quote">A,</span>”</span>
|
||
<span class="quote">“<span class="quote">AA,</span>”</span> or even <span class="quote">“<span class="quote">AAA</span>”</span> (look in any printed service
|
||
directory). A consequence of this bias is that people or resources with names
|
||
that begin with letters late in the alphabet are systematically discriminated
|
||
against because they are often not considered, or because they are evaluated in
|
||
the context created by resources earlier in the alphabet rather than on their
|
||
own merit.<sup>[<a id="chapter-3-endnote-41" href="#ftn.chapter-3-endnote-41" class="footnote">165</a>]</sup></p></div><div class="sect3" title="Names that Assume Impermanent Attributes"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.2.4">Names that Assume Impermanent Attributes</h4></div></div></div><p><span>Many resources are given names based on attributes that can be
|
||
problematic later if the attribute changes in value or
|
||
interpretation.</span></p><p>Web resources are often referred to using URLs that contain the domain name of
|
||
the server on which the resource is located, followed by the directory path and
|
||
file name on the computer running the server. This treats the current location
|
||
of the resource as its name, so the name will change if the resource is moved.
|
||
It also means that resources that are identical in content, like those at an
|
||
archive or mirror website, will have different names than the original even
|
||
though they are exact copies. An analogous problem is faced by restaurants or
|
||
businesses with street names or numbers in their names if they lose their leases
|
||
or want to expand.<sup>[<a id="chapter-3-endnote-42" href="#ftn.chapter-3-endnote-42" epub:type="noteref" class="footnote">166</a>]</sup></p><p>Some dynamic web resources that are generated by programs have
|
||
<abbr class="abbrev">URI</abbr>s that contain information about the server technology
|
||
used to create them. When the technology changes, the <abbr class="abbrev">URI</abbr>s will
|
||
no longer work.<sup>[<a id="chapter-3-endnote-43" href="#ftn.chapter-3-endnote-43" epub:type="noteref" class="footnote">167</a>]</sup></p><p>Other resources have names that include page numbers, which disappear or
|
||
change when the resource is accessed in a digital form. For example, the
|
||
standard citation format for legal opinions uses the page number from the
|
||
printed volume issued by <span class="orgname">West Publishing</span>, which has <span>a
|
||
virtual monopoly on the publishing of court opinions and other types of
|
||
legal documents</span>.<sup>[<a id="chapter-3-endnote-44" href="#ftn.chapter-3-endnote-44" epub:type="noteref" class="footnote">168</a>]</sup></p><p><a id="id608145" class="indexterm"/>Some resources have names that contain dates, years or other time indicators,
|
||
most often to point to the future. The film studio named
|
||
<span class="quote">“<span class="quote"><span class="orgname">20<sup>th</sup> Century
|
||
Fox</span></span>”</span> took on that name in the 1930s to give it a
|
||
progressive, forward-looking identity, but today a name with
|
||
<span class="quote">“<span class="quote">20<sup>th</sup> Century</span>”</span> in it does the
|
||
opposite.<sup>[<a id="chapter-3-endnote-45" href="#ftn.chapter-3-endnote-45" epub:type="noteref" class="footnote">169</a>]</sup></p></div><div class="sect3" title="The Semantic Gap"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.2.5">The Semantic Gap</h4></div></div></div><p><a id="id608220" class="indexterm"/><a id="id608230" class="indexterm"/><span><a id="def_semantic_gap"/>The <a class="glossterm" href="go01.html#gloss_semantic_gap"><em class="glossterm">semantic gap
|
||
</em></a> is the difference in perspective in naming and description when
|
||
resources are described by automated processes rather than by
|
||
people.</span><sup>[<a id="chapter-3-endnote-46" href="#ftn.chapter-3-endnote-46" epub:type="noteref" class="footnote">170</a>]</sup></p><p>The semantic gap is largest when computer programs or
|
||
<span class="hardware">sensors</span> obtain and name some information in a format
|
||
optimized for efficient <span class="action">capture</span>, <span class="action">storage</span>,
|
||
<span class="action">decoding</span>, or other technical criteria.
|
||
The names<span class="symbol">—</span>like <em class="filename">IMG20268.jpg</em> on a digital
|
||
photo<span class="symbol">—</span>might make sense for the camera as it stores
|
||
consecutively taken photos but they are not good names for people. We may prefer
|
||
names that describe the content of the picture, like
|
||
<em class="filename">goldengatebridge.jpg</em>.</p><p><a id="id608314" class="indexterm"/>And if we try to examine the content of computer-created or
|
||
<span class="hardware"><span class="application">sensor-captured
|
||
resources</span></span>, like a clip of music or a compiled
|
||
software program, a human-language text rendering of the content simply looks
|
||
like nonsense. It was designed to be interpreted by a computer program, not by a
|
||
person.</p></div></div><div class="sect2" title="Choosing Good Names and Identifiers"><div class="titlepage"><div><div><h3 class="title" id="section-3.4.3">Choosing Good Names and Identifiers</h3></div></div></div><p>If someone tells you they are having dinner with their best friend, a cousin,
|
||
someone with whom they play basketball, and their professional mentor from work, how
|
||
many places at the table will be set? Anywhere from two to five; it’s possible all
|
||
those relational descriptions refer to a single person, or to four different people,
|
||
and because <span class="quote">“<span class="quote">friend,</span>”</span>
|
||
<span class="quote">“<span class="quote">cousin,</span>”</span>
|
||
<span class="quote">“<span class="quote">basketball teammate</span>”</span> and <span class="quote">“<span class="quote">mentor</span>”</span> don’t name specific
|
||
people, you’ll have to guess who is coming to dinner.</p><p>If instead of descriptions you’re told that the dinner guests are Bob, Carol, Ted,
|
||
and Alice, you can count four names and you know how many people are having dinner.
|
||
But you still can’t be sure exactly which four people are involved because there are
|
||
many people with those names.</p><p>The uncertainty is completely eliminated only if we use identifiers for the people
|
||
rather than names. <a class="glossterm" href="go01.html#gloss_identifier"><em class="glossterm">Identifiers</em></a> are names that refer unambiguously to a specific
|
||
person, place, or resource because they are assigned in a controlled way.
|
||
Identifiers are often created as strings of numbers or letters rather than words to
|
||
avoid the biases and associations that words can convey. For example, in some
|
||
universities professors grade final exams that are identified with student numbers
|
||
rather than names so that grades are assigned without the bias that could arise if
|
||
the professor knows the student.</p><p>The distinction between <a class="glossterm" href="go01.html#gloss_name"><em class="glossterm">names</em></a> and
|
||
<a class="glossterm" href="go01.html#gloss_identifier"><em class="glossterm">identifiers</em></a> for people is
|
||
often not appreciated. (See the Sidebar, <a class="xref" href="ch03.html#chapter-3-sidebar-2" title="Names {and, or, vs} Identifiers">Names {and, or, vs} Identifiers</a>).</p><p><a id="id608397" class="indexterm"/></p><div class="sidebar"><a id="chapter-3-sidebar-2"/><div class="sidebar-title">Names {and, or, vs} Identifiers</div><p>People change their names for many reasons: when they get married or divorced,
|
||
because their name is often mispronounced or misspelled, to make a political or
|
||
ethnic statement, or because they want to stand out. <a id="id608455" class="indexterm"/> A few years ago a football player with a large ego named <span class="personname"><span class="firstname">Chad</span> <span class="surname">Johnson</span></span>, which is the second most common surname in the US, decided to
|
||
change his name to his player number of 85, becoming Chad
|
||
<span class="quote">“<span class="quote">Ochocinco.</span>”</span> He has an <span class="application">ochocinco.com</span>
|
||
website and uses the ochocinco name on <span class="application">Facebook</span> and
|
||
<span class="application">Twitter</span>. In a bit of irony, when Ochocinco wanted
|
||
to put Ocho Cinco on the back of his football jersey, the football league would
|
||
not let him because his legal name doesn’t have a space in it. That surely
|
||
contributed to his decision to change his name back to Chad Johnson in
|
||
2012.</p><p><a id="id608479" class="indexterm"/>
|
||
<a id="id608507" class="indexterm"/>
|
||
<a id="id608518" class="indexterm"/>A similar name change with an unintended consequence was that of the
|
||
American singer and musician <span class="personname"><span class="firstname">Prince</span> <span class="othername">Rogers</span> <span class="surname">Nelson</span></span>, who adopted the stage name of <span class="personname"><span class="othername">Prince</span></span> and released numerous highly successful record albums under that
|
||
name. But because Prince wasn’t a legal name change, the record label
|
||
trademarked it for marketing purposes, which led to disputes about control. In
|
||
response, <a id="id608558" class="indexterm"/><span class="quote">“<span class="quote"><span class="personname"><span class="othername">The Artist Formerly Known as Prince</span></span></span>”</span> invented <span>a graphical symbol that merged the
|
||
symbols for male and female</span>. Unfortunately, even though it is a
|
||
unique identifier, this symbol isn’t represented in any standard character set,
|
||
so it can’t be printed here and can’t be searched for on the web.</p><p>
|
||
<a id="id608602" class="indexterm"/>Some minor league sports teams have replaced the player names on
|
||
jerseys with <span class="application">Twitter</span> handles, which might be a good
|
||
thing if their fans are into social media, but it must be strange for the
|
||
announcers at the games when they say <span class="quote">“<span class="quote">@ifuentes4 just scored a
|
||
goal.</span>”</span></p><p>When you go to coffee shops, you are often asked your name, which the cashier
|
||
writes on the empty cup so that your drink can be identified after the
|
||
<span xml:lang="it" class="foreignphrase"><em xml:lang="it" class="foreignphrase">barista</em></span> makes it. They don’t
|
||
actually need your name; just as some establishments use a receipt number to
|
||
distinguish orders, what they need is an identifier. So even if your name is
|
||
Joe, you can tell them it is Thor, Wotan, Mercurio, El Greco, Clark Kent, or any
|
||
other name that is likely to be a unique identifier for the minute it takes to
|
||
make your beverage.<sup>[<a id="chapter-3-endnote-47" href="#ftn.chapter-3-endnote-47" class="footnote">171</a>]</sup></p></div><div class="sect3" title="Make Names Informative"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.3.1">Make Names Informative</h4></div></div></div><p>The most basic principle of naming is to choose names that are informative,
|
||
which makes them easier to understand and remember. It is easier to tell what a
|
||
computer program or <abbr class="abbrev">XML</abbr> document is doing if it uses names like
|
||
<span class="quote">“<span class="quote">ItemCost</span>”</span> and <span class="quote">“<span class="quote">TotalCost</span>”</span> rather than just
|
||
<span class="quote">“<span class="quote">I</span>”</span> or <span class="quote">“<span class="quote">T.</span>”</span> People will enter more consistent and
|
||
reusable address information if a form asks explicitly for
|
||
<span class="quote">“<span class="quote">Street,</span>”</span>
|
||
<span class="quote">“<span class="quote">City,</span>”</span> and <span class="quote">“<span class="quote">PostalCode</span>”</span> instead of
|
||
<span class="quote">“<span class="quote">Line1</span>”</span> and <span class="quote">“<span class="quote">Line2.</span>”</span></p><p><a id="id608724" class="indexterm"/> Identifiers can be designed with internal structure and semantics
|
||
that convey information beyond the basic aspect of pointing to a specific
|
||
resource. An <span class="citerefentry"><span class="refentrytitle">International Standard Book Number</span>(ISBN)</span> like <span class="quote">“<span class="quote">ISBN 978-0-262-07261-8</span>”</span> identifies a resource
|
||
(07261=<span class="quote">“<span class="quote">Document Engineering</span>”</span>) and also reveals that the
|
||
resource is a book (978), in English (0), and published by The MIT Press
|
||
(262).<sup>[<a id="chapter-3-endnote-48" href="#ftn.chapter-3-endnote-48" epub:type="noteref" class="footnote">172</a>]</sup></p><p>The <span class="hardware">navigation points</span> that mark intersections of
|
||
<span class="hardware">radial signals</span> from <span class="hardware">ground beacons</span>
|
||
or <span class="hardware">satellites</span> that are crucial to <span class="hardware">aircraft
|
||
pilots</span> used to be meaningless five-letter codes. These
|
||
identifiers were changed to make them suggest their locations; semantic landmark
|
||
names made pilots less likely to enter the wrong names into navigation systems.
|
||
For example, some of the <span class="hardware">navigation points</span> near Orlando,
|
||
Florida<span class="symbol">—</span>the home of Disney World<span class="symbol">—</span>are MICKI, MINEE,
|
||
and GOOFY.<sup>[<a id="chapter-3-endnote-49" href="#ftn.chapter-3-endnote-49" epub:type="noteref" class="footnote">173</a>]</sup></p></div><div class="sect3" title="Use Controlled Vocabularies"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.3.2">Use Controlled Vocabularies</h4></div></div></div><p><span><a id="def_controlled_vocabulary"/>One way to encourage good names for a
|
||
given resource domain or task is to establish a <span class="strong"><strong>controlled vocabulary</strong></span>. A <a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled vocabulary</em></a>
|
||
can be thought of as a fixed or closed dictionary that includes all the
|
||
terms that can be used in a particular domain. A controlled vocabulary
|
||
shrinks the number of words used, reducing synonymy and homonymy and
|
||
eliminating undesirable associations, leaving behind a set of words with
|
||
precisely defined meanings and rules governing their use.</span>
|
||
Controlled vocabularies are applied in many organizing systems, from
|
||
bibliographic languages that determine the ways books are catalogued in a
|
||
library to business languages that define the set of information components that
|
||
can be used in transactional documents.</p><p>A <a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled
|
||
vocabulary</em></a> isn’t simply a set of allowed words; it also includes
|
||
their definitions and often specifies rules by which the vocabulary terms can be
|
||
used and combined. Different domains can create specific controlled vocabularies
|
||
for their own purposes, but the important thing is that the vocabulary be used
|
||
consistently throughout that domain.<sup>[<a id="chapter-3-endnote-50" href="#ftn.chapter-3-endnote-50" epub:type="noteref" class="footnote">174</a>]</sup></p><p><a id="id608956" class="indexterm"/><a id="id608976" class="indexterm"/><span><a id="def_authority_control"/>For bibliographic resources,
|
||
important aspects of vocabulary control include determining the
|
||
authoritative forms for author names, uniform titles of works, and the set
|
||
of terms by which a particular subject will be known. In library science,
|
||
the process of creating and maintaining these standard names and terms is
|
||
known as <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_authority_control"><em class="glossterm">authority control</em></a></strong></span>.</span> When evaluating
|
||
what name to use for an author, librarians typically look for the name form
|
||
that’s used most commonly across that author’s body of work while conforming to
|
||
rules for handling prefixes, suffixes and other name parts that often cause name
|
||
variations. For example, a name like that of <span class="personname"><span class="firstname">Johann</span> <span class="othername">Wolfgang</span> <span class="surname">von Goethe</span></span> might be alphabetized as either a <span class="quote">“<span class="quote">G</span>”</span> name or a
|
||
<span class="quote">“<span class="quote">V</span>”</span> name, but using <span class="quote">“<span class="quote">G</span>”</span> is the authoritative way.
|
||
<span class="quote">“<span class="quote">See</span>”</span> and <span class="quote">“<span class="quote">see also</span>”</span> references then map the
|
||
variations to the authoritative name. Similar rules are followed for identifying
|
||
the authoritative form of titles when multiple translations and editions
|
||
exist.<sup>[<a id="chapter-3-endnote-51" href="#ftn.chapter-3-endnote-51" class="footnote">175</a>]</sup></p><p>Official authority files are maintained for many resource domains: a gazetteer
|
||
associates names and locations and tells us whether we should be referring to
|
||
Bombay or Mumbai; <a id="id609180" class="indexterm"/>the <span class="citerefentry"><span class="refentrytitle">Domain Name System</span>(DNS)</span> maps human-oriented domain and host names to their
|
||
<abbr class="abbrev">IP</abbr> addresses; the Chemical Abstracts Service Registry
|
||
assigns unique identifiers to every chemical described in the open scientific
|
||
literature; numerous institutions assign unique identifiers to different
|
||
categories of animal species.<sup>[<a id="chapter-3-endnote-52" href="#ftn.chapter-3-endnote-52" class="footnote">176</a>]</sup></p><p><a id="id609227" class="indexterm"/>In some cases, authority files are created or maintained by a
|
||
community, as in the case of <span class="application">MusicBrainz</span>, an
|
||
<span class="quote">“<span class="quote">open music encyclopedia</span>”</span> to which users contribute information
|
||
about artists, releases, tracks, and other aspects of music. Music metadata is
|
||
notoriously unreliable; one study found over 100 variations in the description
|
||
of the <em class="citetitle">Knockin’ on Heaven’s Door</em> song (written by <span class="personname"><span class="firstname">Bob</span> <span class="surname">Dylan</span></span>) as recorded by <span class="orgname">Guns N’ Roses</span>.<sup>[<a id="chapter-3-endnote-53" href="#ftn.chapter-3-endnote-53" epub:type="noteref" class="footnote">177</a>]</sup></p></div><div class="sect3" title="Allow Aliasing"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.3.3">Allow Aliasing</h4></div></div></div><p>A <a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled
|
||
vocabulary</em></a> is extremely useful to people who use it, but if you
|
||
are designing an organizing system for other people who do not or cannot use it,
|
||
you need to accommodate the variety of words they will actually use when they
|
||
seek or describe resources. <a id="id609324" class="indexterm"/> The authoritative name of a certain fish species is <span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">Amphiprion ocellaris</em></span>, but most people would
|
||
search for it as <span class="quote">“<span class="quote">clownfish,</span>”</span>
|
||
<span class="quote">“<span class="quote">anemone fish,</span>”</span> or even by its more familiar film name of
|
||
<em class="citetitle">Nemo</em>.</p><p>Furnas suggests <span class="quote">“<span class="quote">unlimited aliasing</span>”</span> to connect the uncontrolled
|
||
or natural vocabularies that people use with the controlled one employed by the
|
||
organizing system. By this he means that there must be many alternate access
|
||
routes to each word or function that a user is trying to find. For example, the
|
||
birth name of <span class="personname"><span class="othername">the 42nd President of the United States of America</span></span> is <span class="quote">“<span class="quote"><span class="personname"><span class="firstname">William</span> <span class="othername">Jefferson</span> <span class="surname">Clinton</span></span>,</span>”</span> but web pages that refer to him as <span class="quote">“<span class="quote"><span class="personname"><span class="firstname">Bill</span> <span class="surname">Clinton</span></span></span>”</span> are vastly more common, and searches for the former
|
||
are redirected to the latter. A related mechanism used by search engines is
|
||
spelling correction, essentially treating all the incorrect spellings as aliases
|
||
of the correct one (<span class="quote">“<span class="quote">did you mean California?</span>”</span> when you typed
|
||
<span class="quote">“<span class="quote">Claifornia</span>”</span>).</p></div><div class="sect3" title="Make Identifiers Unique or Qualified"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.3.4">Make Identifiers Unique or Qualified</h4></div></div></div><p><a id="id609432" class="indexterm"/><a id="id609428" class="indexterm"/>Even though an identifier refers to a single resource, this doesn’t
|
||
mean that no two identifiers are identical. One <span class="application">military inventory
|
||
system</span> might use stock number <span class="productnumber">99 000
|
||
1111</span> to identify <span class="hardware">a 24-hour, cold-climate ration
|
||
pack</span>, while in another <span class="application">inventory system</span>,
|
||
the same number could be used to identify <span class="hardware">an electronic radio
|
||
valve</span>. Each identifier is unique in its <span class="application">inventory
|
||
system</span>, but if a supply request gets sent to the wrong
|
||
warehouse, hungry soldiers could be sent <span class="hardware">radio valves</span>
|
||
instead of <span class="hardware">rations</span>.<sup>[<a id="chapter-3-endnote-54" href="#ftn.chapter-3-endnote-54" epub:type="noteref" class="footnote">178</a>]</sup>
|
||
<sup>[<a id="chapter-3-endnote-55" href="#ftn.chapter-3-endnote-55" epub:type="noteref" class="footnote">179</a>]</sup></p><p><a id="id609565" class="indexterm"/><a id="id609572" class="indexterm"/><span><a id="def_namespace"/>We can prevent or reduce identifier
|
||
collisions by adding information about the <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_namespace"><em class="glossterm">namespace</em></a></strong></span>, the domain from which the names
|
||
or identifiers are selected, thus creating what are often called <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_qualified_names"><em class="glossterm">qualified
|
||
names</em></a></strong></span>.</span> There are several dozen US cities
|
||
named <span class="quote">“<span class="quote">Springfield</span>”</span> and <span class="quote">“<span class="quote">Washington,</span>”</span> but adding state
|
||
codes to mail addresses distinguishes them. Likewise, we can add prefixes to
|
||
<abbr class="abbrev">XML</abbr> element names when we create documents that reuse
|
||
components from multiple document types, distinguishing <code class="sgmltag-starttag">&lt;book:Title&gt;</code> from <code class="sgmltag-starttag">&lt;legal:Title&gt;</code>.</p><p><a id="id609640" class="indexterm"/>We can fix problems like these by qualifying or extending the
|
||
identifier, or by creating a <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_guid"><em class="glossterm">globally unique identifier</em></a></strong></span>
|
||
(or <abbr class="abbrev">GUID</abbr>), one that will never be the same as another
|
||
identifier in any organizing system anywhere else. One easy method to create a
|
||
<abbr class="abbrev">GUID</abbr> is to use a <abbr class="abbrev">URL</abbr> you control and
|
||
append a string to it, the same approach that gives every web page a unique
|
||
address. <abbr class="abbrev">GUID</abbr>s are often used to identify software objects, the
|
||
resources in distributed systems, or data collections.<sup>[<a id="chapter-3-endnote-56" href="#ftn.chapter-3-endnote-56" epub:type="noteref" class="footnote">180</a>]</sup></p><p>Because they aren’t created by an algorithm whose results are provably unique,
|
||
we do not consider fingerprints, or other biometric information, to be globally
|
||
unique identifiers for people, but for all practical purposes they are.<sup>[<a id="chapter-3-endnote-57" href="#ftn.chapter-3-endnote-57" class="footnote">181</a>]</sup></p></div><div class="sect3" title="Distinguish Identifying and Resolving"><div class="titlepage"><div><div><h4 class="title" id="section-3.4.3.5">Distinguish Identifying and Resolving</h4></div></div></div><p><span><a id="id609791" class="indexterm"/><a id="id609815" class="indexterm"/>Library call numbers are identifiers that do not contain any
|
||
information about where the resource can be found in the library stacks or
|
||
in a digital repository. This separation enables this identification system
|
||
to work when there are multiple copies in different locations, in contrast
|
||
to <abbr class="abbrev">URI</abbr>s that serve as both identifiers and locations much
|
||
of the time. When the identifier does not contain information about resource
|
||
location, we need a way to interpret or <span class="quote">“<span class="quote">resolve</span>”</span>
|
||
it to determine the location. With <span class="hardware">physical resources</span>,
|
||
<span class="strong"><strong><a class="glossterm" href="go01.html#gloss_resolution"><em class="glossterm">resolution</em></a></strong></span> takes place with the aid of
|
||
<span class="hardware">signs</span>, <span class="hardware">maps</span>, or other
|
||
<span class="hardware">associated resources</span> that describe the arrangement
|
||
of resources in some physical environment; for example, <span class="hardware">you are
|
||
here maps</span> list the buildings in an area and associate each
|
||
with a coordinate or other means of finding it on the map. With digital
|
||
resources, the resolver is a directory system or service that interprets an
|
||
identifier and looks up its location or directly initiates the retrieval of
|
||
the resource.</span></p></div></div></div><div class="sect1" title="Resources over Time"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.5">Resources over Time</h2></div></div></div><p>Problems of <span class="quote">“<span class="quote">what is the resource?</span>”</span> and <span class="quote">“<span class="quote">how do we identify
|
||
it?</span>”</span> are complex and often require ongoing work to ensure they are properly
|
||
answered as the content and context of an organizing system evolve. As a result, we
|
||
might need to know how a resource does or does not change over time (its <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_persistence"><em class="glossterm">persistence</em></a></strong></span>), whether its state and content come into
|
||
play at a specified point in time (its <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_effectivity"><em class="glossterm">effectivity</em></a></strong></span>), whether the resource is what it is said to be
|
||
(its <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_authenticity"><em class="glossterm">authenticity</em></a></strong></span>), and sometimes who has certified its
|
||
authenticity over time (its <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_provenance"><em class="glossterm">provenance</em></a></strong></span>).</p><div class="sect2" title="Persistence"><div class="titlepage"><div><div><h3 class="title" id="section-3.5.1">Persistence</h3></div></div></div><p><a id="id610060" class="indexterm"/>Even if you have reached an agreement as to the meaning of <span class="quote">“<span class="quote"><span class="hardware">a
|
||
thing</span></span>”</span> in your organizing system, you still face the
|
||
question of the identity of the resource over time, or its <span class="strong"><strong>persistence</strong></span>.</p><div class="sect3" title="Persistent Identifiers"><div class="titlepage"><div><div><h4 class="title" id="section-3.5.1.1">Persistent Identifiers</h4></div></div></div><p>How long must an identifier last? Coyle gives the conventional, if
|
||
unsatisfying, answer: <span class="quote">“<span class="quote">As long as it’s needed.</span>”</span><sup>[<a id="chapter-3-endnote-58" href="#ftn.chapter-3-endnote-58" epub:type="noteref" class="footnote">182</a>]</sup> In some cases, the time frame is relatively short. When you order a
|
||
specialty coffee and the <span xml:lang="it" class="foreignphrase"><em xml:lang="it" class="foreignphrase">barista</em></span> asks
|
||
for your name, this identifier only needs to last until you pick up your order
|
||
at the end of the counter. But other time frames are much longer. For libraries
|
||
and repositories of scientific, economic, census, or other data the time frame
|
||
might be <span class="quote">“<span class="quote">forever.</span>”</span></p><p>The design of a scheme for persistent identifiers must consider both the
|
||
required time frame and the number of resources to be identified. When the <span class="citerefentry"><span class="refentrytitle">Internet Protocol</span></span> (IP) was designed in 1980, it contained a 32-bit address scheme,
|
||
sufficient for over 4 billion unique addresses. But the enormous growth of the
|
||
Internet and the application of <abbr class="abbrev">IP</abbr> addresses to resources of
|
||
unexpected types have required a new addressing scheme with 128 bits.<sup>[<a id="chapter-3-endnote-59" href="#ftn.chapter-3-endnote-59" epub:type="noteref" class="footnote">183</a>]</sup></p><p><a id="id610202" class="indexterm"/><a id="id610205" class="indexterm"/>Recognition that <abbr class="abbrev">URI</abbr>s are often not persistent as
|
||
identifiers for web-based resources led the <span class="citerefentry"><span class="refentrytitle"><span class="orgname">Association of American
|
||
Publishers</span></span>(AAP)</span> to develop the <span class="citerefentry"><span class="refentrytitle">Digital Object Identifier</span>(DOI)</span> system. The location and owner of a digital resource can change,
|
||
but its <abbr class="abbrev">DOI</abbr> is permanent.<sup>[<a id="chapter-3-endnote-60" href="#ftn.chapter-3-endnote-60" class="footnote">184</a>]</sup></p></div><div class="sect3" title="Persistent Resources"><div class="titlepage"><div><div><h4 class="title" id="section-3.5.1.2">Persistent Resources</h4></div></div></div><p>Even though persistence often has a technology dimension, it is more important
|
||
to view it as a commitment by an institution or organization to perform
|
||
activities over time to ensure that a resource is available when it is needed.
|
||
Put another way, preservation (<a class="xref" href="ch02.html#section-2.5.2" title="Preservation">Preservation</a>) and governance (<a class="xref" href="ch02.html#section-2.5.4" title="Governance">Governance</a>) are
|
||
activities carried out to ensure the outcome of persistence.</p><p>The subtle relationship between preservation and persistence raises some
|
||
interesting questions about what it means for a resource to stay the same over
|
||
time. One way to think of persistence is that a persistent resource is never
|
||
changed. However, physical resources often require maintenance, repair, or
|
||
restoration to keep them accessible and usable, and we might question whether at
|
||
some point these activities have transformed them into different
|
||
resources.<sup>[<a id="chapter-3-endnote-61" href="#ftn.chapter-3-endnote-61" epub:type="noteref" class="footnote">185</a>]</sup> Likewise, digital resources require regular backup and migration to
|
||
keep them available and this might include changing their digital format.</p><p><a id="id610381" class="indexterm"/>We might instead think of persistence more abstractly, and expect that
|
||
persistent resources need only to remain functionally the same to support the
|
||
same interactions at any point in their lifetimes even if their physical
|
||
properties change. Active resources implemented as computational agents or web
|
||
services might be re-implemented numerous times, but as long as they don’t
|
||
change their interfaces they can be deemed to be persistent from the perspective
|
||
of any other resource that uses them. Similarly, many resources like online
|
||
newspapers or blog feeds continually change their content but still could have
|
||
persistent identifiers.</p><p>Some organizing systems closely monitor their resources and every interaction
|
||
with them to prevent or detect tampering with them or other unauthorized
|
||
changes. Some organizing systems, like those for software or legal documents,
|
||
explicitly maintain every changed version to satisfy expectations of persistence
|
||
because different users might not be relying on the same version. With digital
|
||
resources, determining whether two resources are the same or determining how they
|
||
are related or derived from one another are very challenging problems.<sup>[<a id="chapter-3-endnote-62" href="#ftn.chapter-3-endnote-62" epub:type="noteref" class="footnote">186</a>]</sup></p></div></div><div class="sect2" title="Effectivity"><div class="titlepage"><div><div><h3 class="title" id="section-3.5.2">Effectivity</h3></div></div></div><p><a id="id610500" class="indexterm"/><span><a id="def_effectivity"/>Many resources, or their properties, also
|
||
have <span class="strong"><strong><a class="glossterm" href="go01.html#gloss_effectivity"><em class="glossterm">effectivity</em></a></strong></span>, meaning that
|
||
they come into effect at a particular time and will almost certainly cease to be
|
||
effective at some future date. Effectivity is sometimes known as time-to-live
|
||
and it is generally expressed as a range of two dates. It consists of a date on
|
||
which the resource is effective, and optionally a date on which the resource
|
||
ceases to be effective, or becomes stale.</span> For some types of resources,
|
||
the effective date is the moment they are created, but for others, the effective
|
||
date can be a time different from the moment of creation. For example, a law can be
|
||
passed in November but not take effect until January 1 of the following year. An
|
||
<span class="quote">“<span class="quote">effective date</span>”</span> is the counterpart of the <span class="quote">“<span class="quote">Best
|
||
Before</span>”</span> date on perishable goods. That date indicates when a product goes
|
||
bad, whereas an item’s effective date is when it <span class="quote">“<span class="quote">goes good</span>”</span> and the
|
||
resource that it supersedes needs to be disposed of or archived.</p><p><a id="id610567" class="indexterm"/>Effectivity concerns sometimes intersect with name authority control, because name
|
||
changes for resources often are tied to particular dates and events. Some places
|
||
that have been the site of civil unrest, foreign occupation, and other political
|
||
disruptions have had many different names over time. Even if you always live in the
|
||
same place, the answer to <span class="quote">“<span class="quote">what country do you live in?</span>”</span> can depend on
|
||
when it is asked.<sup>[<a id="chapter-3-endnote-63" href="#ftn.chapter-3-endnote-63" class="footnote">187</a>]</sup></p><p>In most cases effectivity implies persistence requirements because it is important
|
||
to be able to determine and reconstruct the configuration of resources that was in
|
||
effect at some prior time. A new tax might go into effect on January 1, but if the
|
||
government audits your tax returns what matters is whether you followed the law that
|
||
was in effect when you filed your returns.<sup>[<a id="chapter-3-endnote-64" href="#ftn.chapter-3-endnote-64" epub:type="noteref" class="footnote">188</a>]</sup></p></div><div class="sect2" title="Authenticity"><div class="titlepage"><div><div><h3 class="title" id="section-3.5.3">Authenticity</h3></div></div></div><p><span><a id="def_authenticity"/>In ordinary use we say that something is <span class="bold"><strong><a class="glossterm" href="go01.html#gloss_authenticity"><em class="glossterm">authentic</em></a></strong></span> if it can be shown to be, or has come to
|
||
be accepted as, what it claims to be. The importance and nuance of questions
|
||
about authenticity can be seen in the many words we have to describe the
|
||
relationship between <span class="quote">“<span class="quote"><span class="hardware">the real thing</span></span>”</span> (the
|
||
<span class="quote">“<span class="quote">original</span>”</span>) and something else: copy, reproduction, replica,
|
||
fake, phony, forgery, counterfeit, pretender, imposter, ringer, and so
|
||
on.</span></p><p> It is easy to think of examples where authenticity of a resource matters:
|
||
<span class="hardware">a signed legal contract</span>, <span class="hardware">a work of
|
||
art</span>, <span class="hardware">a historical artifact</span>, even <span class="hardware">a
|
||
person’s signature</span>.</p><p>The creator or operator of an organizing system, whether human or machine, can
|
||
<span class="action">authenticate</span> a newly created resource. A third party can also
|
||
vouch for its authenticity. <span>Many professional careers are based on
|
||
figuring out if a resource is authentic.</span><sup>[<a id="chapter-3-endnote-65" href="#ftn.chapter-3-endnote-65" epub:type="noteref" class="footnote">189</a>]</sup></p><p><a id="id610925" class="indexterm"/>There is a large body of techniques for establishing the identity of a
|
||
<span class="hardware">person</span> or <span class="hardware">physical resource</span>. We often
|
||
use judgments about the <span class="hardware">physical integrity</span> of recorded
|
||
information when we consider the integrity of its contents.</p><p><a id="id610939" class="indexterm"/><a id="id610964" class="indexterm"/>Digital authenticity is more difficult to establish. Digital resources can be
|
||
reproduced at almost no cost, exist in multiple locations, carry different names on
|
||
identical documents or identical names on different documents, and bring about other
|
||
complications that do not arise with physical items. Technological solutions for
|
||
ensuring digital authenticity include time stamps, watermarking, encryption, and
|
||
digital signatures. However, while scholars generally trust technological methods,
|
||
technologists are more skeptical of them because they can imagine ways for them to
|
||
be circumvented or counterfeited. Even when a technologically sophisticated system
|
||
for establishing authenticity is in place, we can still only assume the constancy of
|
||
identity as far back as this system reaches in the <span class="quote">“<span class="quote"><span class="hardware">chain of
|
||
custody</span></span>”</span> of the document.</p></div><div class="sect2" title="Provenance"><div class="titlepage"><div><div><h3 class="title" id="section-3.5.4">Provenance</h3></div></div></div><p>The idea that important documents must be created in an authenticatable manner and
|
||
then preserved with an unbroken <span class="hardware">chain of custody</span> goes back to
|
||
ancient Rome. <span class="application">Notaries</span> witnessed the creation of
|
||
important documents, which were then protected to maintain their integrity or value
|
||
as evidence. In organizing systems like museums and archives that preserve rare or
|
||
culturally important objects or documents this concern is expressed as the principle
|
||
of <span class="strong"><strong>provenance</strong></span>. This is the history of the
|
||
ownership of a collection or the resources in it, where they have been and who has
|
||
had access to the resources.</p><p><span>A uniquely Chinese technique in organizing systems is the
|
||
<span class="action">imprinting</span> of elaborate red seals on documents, books, and
|
||
paintings that collectively record the provenance of ownership and the review
|
||
and approval of the artifact by emperors or important officials.</span></p><p><a id="id611047" class="indexterm"/>
|
||
<a class="xref" href="ch03.html#chapter-3-figure-3.4" title="Figure 3-6. Resources over Time.">Figure 3-6</a> portrays the relationships among the
|
||
concepts of Persistence, Provenance, Effectivity, and Authenticity. A resource might
|
||
have persistence over some time line, but an unbroken chain of custody that captures
|
||
changes in possession enables questions about authenticity to be answered with
|
||
authority. Effectivity emphasizes a particular segment on the time line or a
|
||
starting point after which the resource is effective.</p><div class="figure-float"><div class="figure"><a id="chapter-3-figure-3.4"/><div class="figure-contents"><div class="mediaobject"><a id="chapter-3-figure-3.4a"/><img src="figs/print/ch3.6-350dpi.png" alt="Four considerations that arise with respect to the maintenance of resources over time are their persistence, provenance, authenticity, and effectivity."/></div></div><div class="figure-title">Figure 3-6. Resources over Time.</div></div></div></div></div><div class="sect1" title="Key Points in Chapter Three"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="section-3.6">Key Points in Chapter Three</h2></div></div></div><div class="highlights"><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p>We can consider a resource to be one of many members of a very broad
|
||
category, as the unique instance of a category with only one member, or
|
||
anywhere in between.</p></li><li class="listitem"><p>The size of the category<span class="symbol">—</span>the number of resources that are treated as
|
||
equivalent<span class="symbol">—</span>is determined by the properties or characteristics we consider
|
||
when we examine the resource.</p></li><li class="listitem"><p>More fine-grained organization reduces <a class="glossterm" href="go01.html#gloss_recall"><em class="glossterm">recall</em></a>, the number of resources you find
|
||
or retrieve in response to a query, but increases the <a class="glossterm" href="go01.html#gloss_precision"><em class="glossterm">precision</em></a> of
|
||
the recalled set, the proportion of recalled items that are relevant.</p></li><li class="listitem"><p>Organizing systems for physical information resources emphasize
|
||
description resources or surrogates like bibliographic records that describe
|
||
the information content rather than their physical properties.</p></li><li class="listitem"><p>Which resources are primary and which are metadata is often just a
|
||
decision about which resource is the <a class="glossterm" href="go01.html#gloss_focus"><em class="glossterm">focus</em></a> of our attention.</p></li><li class="listitem"><p>It can be useful to view domains of information resources on the Document
|
||
Type Spectrum from weakly-structured narrative content to highly structured
|
||
transactional content.</p></li><li class="listitem"><p>Organizing systems designed for institutional or industry-wide use require
|
||
systematic design methods to determine which resources will have separate
|
||
identities and how they are related to each other.</p></li><li class="listitem"><p><a id="id611219" class="indexterm"/>The concept of identity for bibliographic resources has evolved into a
|
||
four-step abstraction hierarchy between the abstract <a class="glossterm" href="go01.html#gloss_work"><em class="glossterm">work</em></a>, an <a class="glossterm" href="go01.html#gloss_expression"><em class="glossterm">expression</em></a>
|
||
in multiple formats or genres, a particular <a class="glossterm" href="go01.html#gloss_manifestation"><em class="glossterm">manifestation</em></a> in one of those
|
||
formats or genres, and a specific physical <a class="glossterm" href="go01.html#gloss_item"><em class="glossterm">item</em></a>.</p></li><li class="listitem"><p>Resources become active resources when they contain sensing and
|
||
communication capabilities.</p></li><li class="listitem"><p>Organizing systems that create value from active resources often co-exist
|
||
with or complement organizing systems that treat their resources as
|
||
passive.</p></li><li class="listitem"><p>If the resource has an IP address, it is said to be part of the
|
||
<span class="quote">“<span class="quote">Internet of Things.</span>”</span></p></li><li class="listitem"><p>The most basic principle of naming is to choose names that are
|
||
informative.</p></li><li class="listitem"><p>A <a class="glossterm" href="go01.html#gloss_controlled_vocabulary"><em class="glossterm">controlled
|
||
vocabulary</em></a> can be thought of as a fixed or closed dictionary
|
||
that includes all the terms that can be used unambiguously in a particular
|
||
domain.</p></li><li class="listitem"><p>Many resources are given names based on attributes that can be problematic
|
||
later if the attribute changes in value or interpretation.</p></li><li class="listitem"><p>An identifier is a special kind of name assigned in a controlled manner
|
||
and governed by rules that define possible values and naming conventions.
|
||
The design of a scheme for persistent identifiers must consider both the
|
||
required time frame and the number of resources to be identified.</p></li><li class="listitem"><p>Preservation and governance are activities carried out to ensure the
|
||
outcome of persistence.</p></li></ul></div></div></div><div class="footnotes" epub:type="footnotes"><br/><hr style="width: 100; align: left;"/><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-01"><p><sup>[<a href="#chapter-3-endnote-01" class="para">124</a>] </sup>[Citation] <a class="link" href="bi01.html#Wilson1968" title="Two Kinds of Power: An Essay on Bibliographical Control"><a id="cite_Wilson1968"/>(Wilson 1968, p. 9)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-02"><p><sup>[<a href="#chapter-3-endnote-02" class="para">125</a>] </sup>[Business] <span><a id="id601703" class="indexterm"/><a id="id601689" class="indexterm"/>Separating information content from its structure and
|
||
presentation is essential to re-purposing it for different
|
||
scenarios, applications, devices, or users.</span>
|
||
<span>The global information economy is increasingly driven by
|
||
automated information exchange between business processes.</span>
|
||
When information flows efficiently from one type of document to another
|
||
in this chain of related documents, the overlapping content components
|
||
act as the <span class="quote">“<span class="quote">glue</span>”</span> that connects the information systems or
|
||
web services that produce and consume the documents. <a class="link" href="bi01.html#Glushko2005" title="Document Engineering: Analyzing and Designing Documents for Business Informatics and Web Services"><a id="cite_Glushko2005-3.1"/>(Glushko and McGrath
|
||
2005)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-03"><p><sup>[<a href="#chapter-3-endnote-03" class="para">126</a>] </sup>[Citation] <a class="link" href="bi01.html#Furnas1987" title="“The Vocabulary Problem in Human-System Communication: an Analysis and a Solution”"><a id="cite_Furnas1987-3.1"/>(Furnas,
|
||
Landauer, Gomez, and Dumais 1987)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-03a"><p><sup>[<a href="#chapter-3-endnote-03a" class="para">127</a>] </sup>[Citation] <a id="id602521" class="indexterm"/><a id="id602541" class="indexterm"/><a id="id602548" class="indexterm"/><a id="id602554" class="indexterm"/><a id="id602559" class="indexterm"/><a id="id602567" class="indexterm"/><a id="id602570" class="indexterm"/><a id="id602576" class="indexterm"/>Linnaeus is sometimes called the father of modern taxonomy
|
||
(which is unfair to Aristotle) but he certainly deserves enormous credit
|
||
for the systematic approach to biological classification that he
|
||
proposed in <a class="link" href="bi01.html#Linnaeus1735" title="Systema Naturae 1"><em class="citetitle">Systema Naturae</em></a>, published in 1735.
|
||
This seminal work contains the familiar kingdom, class, order, family,
|
||
genus, species hierarchy.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-04"><p><sup>[<a href="#chapter-3-endnote-04" class="para">128</a>] </sup>[Citation] <a class="link" href="bi01.html#Glushko2005" title="Document Engineering: Analyzing and Designing Documents for Business Informatics and Web Services"><a id="cite_Glushko2005-3.2"/>(Glushko
|
||
and McGrath 2005)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-05"><p><sup>[<a href="#chapter-3-endnote-05" class="para">129</a>] </sup>[Citation] <a class="link" href="bi01.html#Kuniavsky2010" title="Smart Things: Ubiquitous Computing User Experience Design"><a id="cite_Kuniavsky2010"/>(Kuniavsky 2010)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-06"><p><sup>[<a href="#chapter-3-endnote-06" class="para">130</a>] </sup>[LIS] <a id="id602993" class="indexterm"/><a id="id603002" class="indexterm"/><a id="id603007" class="indexterm"/><a id="id603012" class="indexterm"/><span class="orgname"><a id="id603018" class="indexterm"/>Project Gutenberg</span>, begun in 1971, was the first
|
||
large-scale effort to digitize books; its thousands of volunteers have
|
||
created about 40,000 digital versions of classic printed works. Systematic
|
||
research in digital libraries began in the 1990s when the <span class="citerefentry"><span class="refentrytitle"><span class="orgname">US National Science
|
||
Foundation</span></span>(NSF)</span>, the <span class="citerefentry"><span class="refentrytitle"><span class="orgname">Advanced Research Projects
|
||
Agency</span></span>(ARPA)</span>, and <abbr class="abbrev">NASA</abbr> launched a <span class="orgname">Digital
|
||
Library Initiative</span> that emphasized the enabling technologies
|
||
and infrastructure. At about the same time numerous pragmatic efforts to
|
||
digitize library collections began, characterized by some as a race against
|
||
time as old books in libraries were literally disintegrating and turning
|
||
into dust. <a id="id603076" class="indexterm"/><a id="id603064" class="indexterm"/> The <span class="application">Internet Archive</span>, started in
|
||
1996, now has a collection of over 3 million texts and has estimated the
|
||
cost of digitizing to be about $30 for the average book. <a id="id603086" class="indexterm"/>Multiply this by the scores of millions of books held in the
|
||
world’s research libraries and it is easy to see why many libraries endorsed
|
||
<span class="orgname">Google</span>’s offer to digitize their collections.</p></div><div class="footnote" id="ftn.chapter-3-endnote-07"><p><sup>[<a href="#chapter-3-endnote-07" class="para">131</a>] </sup>[Computing] <a id="id603149" class="indexterm"/><a id="id603156" class="indexterm"/>The <abbr class="abbrev">ASCII</abbr> scheme was standardized in the 1960s
|
||
when computer memory was expensive and most computing was in
|
||
English-speaking countries, so it is minimal and distinguishes only 128
|
||
characters. <em class="firstterm"><span class="citerefentry"><span class="refentrytitle">American Standard Code for Information
|
||
Interchange</span>(ASCII)</span></em> is an <span class="orgname">ANSI</span> specification.
|
||
(See <a class="ulink" href="http://en.wikipedia.org/wiki/ASCII" target="_top"><code class="uri">http://en.wikipedia.org/wiki/ASCII</code></a>.)</p></div><div class="footnote" id="ftn.chapter-3-endnote-08"><p><sup>[<a href="#chapter-3-endnote-08" class="para">132</a>] </sup>[Computing] <a id="id603212" class="indexterm"/><a id="id603239" class="indexterm"/><a id="id603245" class="indexterm"/>Unicode 6.0 (<a class="ulink" href="http://www.unicode.org/" target="_top"><code class="uri">http://www.unicode.org/</code></a>) encodes
|
||
109,449 characters for all the <a class="glossterm" href="go01.html#gloss_writing_system"><em class="glossterm">writing systems</em></a> in the world,
|
||
so a single standard can represent the characters of every existing
|
||
language, even <span class="quote">“<span class="quote">dead</span>”</span> ones like Sumerian and Hittite. Unicode
|
||
encodes the scripts used in languages, rather than languages per se, so
|
||
there only needs to be one representation of the Latin, Cyrillic, Arabic, etc.
|
||
scripts that are used for writing multiple languages. Unicode also
|
||
distinguishes <a class="glossterm" href="go01.html#gloss_character"><em class="glossterm">characters</em></a>
|
||
from <a class="glossterm" href="go01.html#gloss_glyph"><em class="glossterm">glyphs</em></a>, the different
|
||
forms for the same character<span class="symbol">—</span>enabling different <a class="glossterm" href="go01.html#gloss_font"><em class="glossterm">fonts</em></a> to be identified as the same
|
||
character.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-09"><p><sup>[<a href="#chapter-3-endnote-09" class="para">133</a>] </sup>[CogSci] <a id="id603481" class="indexterm"/><a id="id603495" class="indexterm"/><a id="id603348" class="indexterm"/>Encoding of structure in documents is valuable because titles,
|
||
sections, links and other structural elements can be leveraged to enhance
|
||
the user interface and navigational interactions with the digital document
|
||
and enable more precise information retrieval. Some uses of documents
|
||
require formats that preserve their printed appearance.
|
||
<span class="quote">“<span class="quote"><span>Presentational fidelity</span></span>”</span> is essential
|
||
if we imagine a banker or customs inspector carefully comparing a printed
|
||
document with a computer-generated one to ensure they are identical.</p></div><div class="footnote" id="ftn.chapter-3-endnote-10"><p><sup>[<a href="#chapter-3-endnote-10" class="para">134</a>] </sup>[Computing] <a id="id603458" class="indexterm"/><a id="id603375" class="indexterm"/>Text encoding specs are well-documented; see
|
||
(<a class="ulink" href="http://www.wotsit.org/list.asp?fc=10" target="_top"><code class="uri">http://www.wotsit.org/list.asp?fc=10</code></a>).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-11"><p><sup>[<a href="#chapter-3-endnote-11" class="para">135</a>] </sup>[Citation] <a class="link" href="bi01.html#Chapman2009" title="Digital Multimedia"><a id="cite_Chapman2009"/>(Chapman and
|
||
Chapman 2009)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-12"><p><sup>[<a href="#chapter-3-endnote-12" class="para">136</a>] </sup>[LIS] <a id="id603551" class="indexterm"/>Numerous museums have created web collections, but a great many
|
||
of them seem to have focused on the quantity of information they could put
|
||
online rather than on the user experience they were creating. Perhaps not
|
||
surprisingly, the ambitious use of virtual world technology to create novel
|
||
forms of interaction described by <a class="link" href="bi01.html#Rothfarb2007" title="“Creating Museum Content and Community in Second Life”"><a id="cite_Rothfarb2007"/>(Rothfarb and Doherty 2007)</a> reflects the
|
||
highly interactive character of its host museum, the
|
||
<span class="orgname">Exploratorium</span> in San Francisco
|
||
(<a class="ulink" href="http://www.exploratorium.edu/" target="_top"><code class="uri">http://www.exploratorium.edu/</code></a>). <a id="id603608" class="indexterm"/>Similarly, the <span class="application">Google Art Project</span>
|
||
(<a class="ulink" href="http://googleartproject.com" target="_top"><code class="uri">http://googleartproject.com</code></a>) is notable for its goal of
|
||
complementing and extending, rather than merely imitating, the museum
|
||
visitor’s encounter with artwork <a class="link" href="bi01.html#Proctor2011" title="“The Google Art Project: A New Generation of Museums on the Web?”"><a id="cite_Proctor2011-3.1"/>(Proctor 2011)</a>. A feature that lets people
|
||
create a <span class="quote">“<span class="quote">personal art collection</span>”</span> is very popular, enabling a
|
||
fan of <a id="id603653" class="indexterm"/><span class="personname"><span class="firstname">Vincent</span> <span class="surname">Van Gogh</span></span> to bring together paintings that hang in different
|
||
museums.</p></div><div class="footnote" id="ftn.chapter-3-endnote-13"><p><sup>[<a href="#chapter-3-endnote-13" class="para">137</a>] </sup>[Computing] However, scratching can be simulated using a smart phone or
|
||
tablet app called <span class="application">djay</span>. See
|
||
<a class="ulink" href="http://www.algoriddim.com/djay" target="_top"><code class="uri">http://www.algoriddim.com/djay</code></a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-14"><p><sup>[<a href="#chapter-3-endnote-14" class="para">138</a>] </sup>[Law] <a id="id603714" class="indexterm"/><a id="id603724" class="indexterm"/><a id="id603732" class="indexterm"/>As a result, digital books are somewhat controversial and
|
||
problematic for libraries, whose access models were created based on the
|
||
economics of print publication and the social contract of the copyright
|
||
first sale doctrine that allowed libraries to lend printed books. Digital
|
||
books change the economics and first sale is not as well-established for
|
||
digital works, which are licensed rather than sold <a class="link" href="bi01.html#Aufderheide2011" title="Reclaiming Fair Use: How to Put Balance Back in Copyright"><a id="cite_Aufderheide2011-3.1"/>(Aufderheide and
|
||
Jaszi 2011)</a>. To protect their business models, many publishers
|
||
are limiting the number of times e-books can be lent before they
|
||
<span class="quote">“<span class="quote">self-destruct.</span>”</span> Some librarians have called for boycotts
|
||
of publishers in response
|
||
(<a class="ulink" href="http://boycottharpercollins.com" target="_top"><code class="uri">http://boycottharpercollins.com</code></a>).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-15"><p><sup>[<a href="#chapter-3-endnote-15" class="para">139</a>] </sup>[Business] <a id="id603901" class="indexterm"/><a id="id603910" class="indexterm"/><a id="id603916" class="indexterm"/><a id="id603932" class="indexterm"/><a id="id603939" class="indexterm"/>The opposing categories of operands and operants have their
|
||
roots in debates in political economics about the nature of work and the
|
||
creation of value <a class="link" href="bi01.html#Vargo2004" title="“Evolving to a new dominant logic for marketing.”"><a id="cite_Vargo2004"/>(Vargo and
|
||
Lusch 2004)</a> and have more recently played a central role in the
|
||
development of modern thinking about service design <a class="link" href="bi01.html#Constantin1994" title="Understanding Resource Management: How to deploy your people, products, and processes for maximum productivity"><a id="cite_Constantin1994"/>(Constantin and Lusch
|
||
1994)</a>, <a class="link" href="bi01.html#Maglio2009" title="“The service system is the basic abstraction of service science.”"><a id="cite_Maglio2009"/>(Maglio et
|
||
al. 2009)</a>. The concept of agency or operant resources is needed
|
||
to bring resources that are active information sources, or computational in
|
||
character, into the organizing system framework. This concept also lets us
|
||
include living resources, or more specifically, humans, into discussions
|
||
about organizing systems in a more general way that emphasizes their agency
|
||
and de-emphasizes other characteristics that could otherwise be
|
||
distracting.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-16"><p><sup>[<a href="#chapter-3-endnote-16" class="para">140</a>] </sup>[Citation] See <a class="link" href="bi01.html#Allmendinger2005" title="“Four Strategies for the Age of Smart Services”"><a id="cite_Allmendinger2005"/>(Allmendinger and Lombreglia
|
||
2005)</a>, <a class="link" href="bi01.html#Want2006" title="“An Introduction to RFID Technology”"><a id="cite_Want2006"/>(Want
|
||
2006)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-17"><p><sup>[<a href="#chapter-3-endnote-17" class="para">141</a>] </sup>[CogSci] <a id="id604426" class="indexterm"/><a id="id604434" class="indexterm"/><a id="id604443" class="indexterm"/><span class="personname"><span class="firstname">Luis</span> <span class="surname">von Ahn</span></span>
|
||
<a class="link" href="bi01.html#vonAhn2004" title="“Labeling images with a computer game”"><a id="cite_vonAhn2004"/>(von Ahn 2004)</a>
|
||
was the first to use the web to get people to perform
|
||
<span class="quote">“<span class="quote">microwork</span>”</span> or <span class="quote">“<span class="quote">human computation</span>”</span> tasks
|
||
when he released what he called <span class="quote">“<span class="quote">the ESP game</span>”</span> that
|
||
randomly paired people trying to agree on labeling an image. Not long
|
||
afterwards Amazon created the MTurk platform
|
||
(<a class="ulink" href="http://www.mturk.com" target="_top"><code class="uri">http://www.mturk.com</code></a>) that lets people propose microwork
|
||
and others sign up to do it, and today there are both hundreds of
|
||
thousands of tasks offered and hundreds of thousands of people offering
|
||
to be paid to do them.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-18"><p><sup>[<a href="#chapter-3-endnote-18" class="para">142</a>] </sup>[Computing] <a id="id604609" class="indexterm"/>For semi-structured or more narrative documents these
|
||
descriptions might be authoring templates used in word processors or other
|
||
office applications, document schemas in <abbr class="abbrev">XML</abbr> applications,
|
||
style sheets, or other kinds of transformations that change one resource
|
||
representation into another one. Primary resources that are highly and
|
||
regularly structured are invariably organized in databases or enterprise
|
||
information management systems in which a <a class="glossterm" href="go01.html#gloss_data_schema"><em class="glossterm">data schema</em></a> specifies the
|
||
arrangement and type of data contained in each field or component of the
|
||
resource.</p></div><div class="footnote" id="ftn.chapter-3-endnote-19"><p><sup>[<a href="#chapter-3-endnote-19" class="para">143</a>] </sup>[Computing] There are a large number of third-party Twitter apps. See
|
||
<a class="ulink" href="http://twitter.pbworks.com/w/page/1779726/Apps" target="_top"><code class="uri">http://twitter.pbworks.com/w/page/1779726/Apps</code></a>. For a
|
||
scholarly analysis see <a class="link" href="bi01.html#Efron2011" title="“Information Search and Retrieval in Microblogs”"><a id="cite_Efron2011"/>(Efron
|
||
2011)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-20"><p><sup>[<a href="#chapter-3-endnote-20" class="para">144</a>] </sup>[Business] <a id="id604729" class="indexterm"/><a id="id604734" class="indexterm"/>The basic idea behind fantasy sports is quite simple. You select
|
||
a team of existing players in any sport, and then compare their statistical
|
||
performance against other teams similarly selected by other people. Fantasy
|
||
sports appeal mostly to die-hard fans who study player statistics carefully
|
||
before <span class="quote">“<span class="quote">drafting</span>”</span> their players. The global fantasy sports
|
||
business for companies that organize and operate fantasy leagues is estimated
|
||
as between 1 and 2 billion US dollars annually <a class="link" href="bi01.html#Montague2010" title="The rise and fall of fantasy sports"><a id="cite_Montague2010"/>(Montague 2010)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-21"><p><sup>[<a href="#chapter-3-endnote-21" class="para">145</a>] </sup>[Citation] <a class="link" href="bi01.html#Schmandt-Besserat1997" title="How Writing Came About"><a id="cite_Schmandt-Besserat1997"/>(Schmandt-Besserat
|
||
1997)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-22"><p><sup>[<a href="#chapter-3-endnote-22" class="para">146</a>] </sup>[LIS] <a id="id604911" class="indexterm"/><a id="id604919" class="indexterm"/><a id="id604938" class="indexterm"/><a id="id604945" class="indexterm"/>The oldest known lists of books were created about 4000
|
||
years ago in Sumeria. The first use of cards in library catalogs was
|
||
literal; when the revolutionary government of France seized private book
|
||
collections, an inventory was created starting in 1791 using the blank
|
||
backs of playing cards. 110 years later the <span class="orgname">US Library of
|
||
Congress</span> began selling pre-printed catalog cards to
|
||
libraries, but in the mid-1960s the creation of the <em class="firstterm"><a id="first_MARC"/><span class="citerefentry"><span class="refentrytitle">Machine-Readable Cataloging</span>(MARC)</span></em> format marked the beginning of the end
|
||
of printed cards. See <a class="link" href="bi01.html#Strout1956" title="“The development of the catalog and cataloging codes”"><a id="cite_Strout1956"/>(Strout 1956)</a>. The MARC standards are at
|
||
<a class="ulink" href="http://www.loc.gov/marc/" target="_top"><code class="uri">http://www.loc.gov/marc/</code></a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-23"><p><sup>[<a href="#chapter-3-endnote-23" class="para">147</a>] </sup>[LIS] <a id="id605225" class="indexterm"/><a id="id605233" class="indexterm"/>We treat resource format and resource focus as distinct
|
||
dimensions, so there are four categories here. This contrasts with <span class="personname"><span class="firstname">David</span> <span class="surname">Weinberger</span></span>’s three <span class="quote">“<span class="quote">orders of order</span>”</span> that he proposes in
|
||
the first chapter of a book called <a class="link" href="bi01.html#Weinberger2007" title="Everything Is Miscellaneous: The Power of the New Digital Disorder"><a id="cite_Weinberger2007-3.1"/><em class="citetitle">Everything is
|
||
Miscellaneous</em> (Weinberger 2007)</a>.
|
||
Weinberger starts with the assumption that physical
|
||
resources are inherently the primary ones, so the first <span class="quote">“<span class="quote">order of
|
||
order</span>”</span> emerges when physical resources are arranged. The
|
||
second <span class="quote">“<span class="quote">order of order</span>”</span> emerges when physical description
|
||
resources are arranged, and the third <span class="quote">“<span class="quote">order of order</span>”</span>
|
||
emerges when digital description resources for physical resources are
|
||
arranged. Later in the book Weinberger mentions the use of bar codes
|
||
associated with websites, a physical description of a digital resource,
|
||
but because he started with the assumption that physical resources
|
||
define the <span class="quote">“<span class="quote">first order</span>”</span> this example does not fit into his
|
||
orders of order.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-24"><p><sup>[<a href="#chapter-3-endnote-24" class="para">148</a>] </sup>[Computing] <a id="id605340" class="indexterm"/><a id="id605413" class="indexterm"/><a id="id605418" class="indexterm"/><a id="id605423" class="indexterm"/>These methods go by different names in different disciplines,
|
||
including <span class="quote">“<span class="quote">data modeling,</span>”</span>
|
||
<span class="quote">“<span class="quote">systems analysis,</span>”</span> and <span class="quote">“<span class="quote">document engineering</span>”</span> (e.g.,
|
||
<a class="link" href="bi01.html#Kent2012" title="Data and Reality: A Timeless Perspective on Perceiving and Managing Information in Our Imprecise World"><a id="cite_Kent2012-3.1"/>(Kent 2012)</a>, <a class="link" href="bi01.html#Silverston2000" title="The Data Model Resource Book, Vol. 2: A Library of Data Models for Specific Industries"><a id="cite_Silverston2000-3.1"/>(Silverston
|
||
2000)</a>, <a class="link" href="bi01.html#Glushko2005" title="Document Engineering: Analyzing and Designing Documents for Business Informatics and Web Services"><a id="cite_Glushko2005-3.3"/>(Glushko
|
||
and McGrath 2005)</a>). What they have in common is that they produce
|
||
conceptual models of a domain that specify its components or parts and the
|
||
relationships among these components or parts. These conceptual models are
|
||
called <span class="quote">“<span class="quote">schemas</span>”</span> or <span class="quote">“<span class="quote">domain ontologies</span>”</span> in some
|
||
modeling approaches, and are typically implemented in models that are optimized
|
||
for particular technologies or applications.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-25"><p><sup>[<a href="#chapter-3-endnote-25" class="para">149</a>] </sup>[CogSci] <a id="id605475" class="indexterm"/><a id="id605535" class="indexterm"/>Specifically, an <abbr class="abbrev">NFL</abbr> football team needs to be
|
||
considered a single resource for games through the season and in playoffs,
|
||
and as 53 individual players in other situations, like the
|
||
<abbr class="abbrev">NFL</abbr> draft or play-calling. The team and the team’s
|
||
roster can be thought of as resources, and the team’s individual players are
|
||
also resources that make up the whole team.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-26"><p><sup>[<a href="#chapter-3-endnote-26" class="para">150</a>] </sup>[LIS] <a id="id605594" class="indexterm"/><a id="id605603" class="indexterm"/><a id="id605642" class="indexterm"/><a class="link" href="bi01.html#Denton2007" title="“FRBR and the History of Cataloging”"><a id="cite_Denton2007-3.1"/>(Denton
|
||
2007)</a> is a highly readable retelling of the history of cataloging
|
||
that follows four themes<span class="symbol">—</span>the use of axioms, user
|
||
requirements, the <span class="quote">“<span class="quote">work,</span>”</span> and standardization and
|
||
internationalization<span class="symbol">—</span>culminating with their
|
||
synthesis in the <span class="citerefentry"><span class="refentrytitle">Functional Requirements for Bibliographic
|
||
Records</span>(FRBR)</span>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-27"><p><sup>[<a href="#chapter-3-endnote-27" class="para">151</a>] </sup>[LIS] <a id="id605726" class="indexterm"/>This was a surprisingly controversial activity. Many opposed
|
||
Panizzi’s efforts as a waste of time and effort because they assumed that
|
||
<span class="quote">“<span class="quote">building a catalog was a simple matter of writing down a list of
|
||
titles</span>”</span> <a class="link" href="bi01.html#Denton2007" title="“FRBR and the History of Cataloging”"><a id="cite_Denton2007-3.2"/>(Denton 2007</a>, p. 38).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-28"><p><sup>[<a href="#chapter-3-endnote-28" class="para">152</a>] </sup>[LIS] <a id="id605744" class="indexterm"/><a id="id605800" class="indexterm"/><a id="id605805" class="indexterm"/><a id="id605810" class="indexterm"/><a id="id605817" class="indexterm"/>Seymour Lubetzky worked for the US <span class="orgname">Library of
|
||
Congress</span> from 1943 to 1960, where he tirelessly sought to simplify
|
||
the proliferating mass of special case cataloging rules proposed by the
|
||
<span class="orgname">American Library Association</span>, because at the time the
|
||
<abbr class="abbrev">LOC</abbr> had the task of applying those rules and making the
|
||
catalog cards other libraries used. Lubetzky’s book on <em class="citetitle">Cataloguing Rules
|
||
and Principles</em>
|
||
<a class="link" href="bi01.html#Lubetzky2001" title="Seymour Lubetzky: Writings on the Classical Art of Cataloging"><a id="cite_Lubetzky1953"/>(Lubetzky 1953)</a>
|
||
bluntly asks <span class="quote">“<span class="quote">Is this rule necessary?</span>”</span> and was a turning point
|
||
in cataloging.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-29"><p><sup>[<a href="#chapter-3-endnote-29" class="para">153</a>] </sup>[LIS] <a id="id606142" class="indexterm"/>In between the abstraction of the <span class="bold"><strong>work</strong></span> and the specific single <span class="bold"><strong>item</strong></span> are two additional levels in the <abbr class="abbrev">FRBR</abbr> abstraction hierarchy. An <span class="bold"><strong>expression</strong></span> denotes the multiple
|
||
realizations of a work in some particular medium or notation, where it can
|
||
actually be perceived. There are many editions and translations of
|
||
<em class="citetitle">Macbeth</em>, but they are all the same expression,
|
||
and they are a different expression from all of the film adaptations of
|
||
<em class="citetitle">Macbeth</em>. A <span class="bold"><strong>manifestation</strong></span> is the set of physical artifacts with the same
|
||
expression. All of the copies of the <span class="orgname">Folger Library</span> print
|
||
edition of <em class="citetitle">Macbeth</em> are the same manifestation.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-30"><p><sup>[<a href="#chapter-3-endnote-30" class="para">154</a>] </sup>[Computing] This kind of advice can be found in many data or conceptual
|
||
modeling texts, but this particular statement comes from <a class="link" href="bi01.html#Glushko1988" title="“Hypertext engineering: practical methods for creating a compact disk encyclopedia”"><a id="cite_Glushko1988"/>(Glushko, Weaver, Coonan,
|
||
and Lincoln 1988)</a>. Similar advice can also be found in the
|
||
information science literature: <span class="quote">“<span class="quote">A unit of information...would have to
|
||
be....correctly interpretable outside any context</span>”</span>
|
||
<a class="link" href="bi01.html#Wilson1968" title="Two Kinds of Power: An Essay on Bibliographical Control"><a id="cite_Wilson1968-3.1"/>(Wilson 1968</a>, p.
|
||
18).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-31"><p><sup>[<a href="#chapter-3-endnote-31" class="para">155</a>] </sup>[Computing] <a id="id606417" class="indexterm"/>A group of techniques collectively called <a id="id606439" class="indexterm"/><span class="quote">“<span class="quote">normalization</span>”</span> produces a set of tightly defined
|
||
information components that have minimal redundancy and ambiguity. Imagine
|
||
that a business keeps information about customer orders using a
|
||
<span class="quote">“<span class="quote">spreadsheet</span>”</span> style of organization in which a row contains
|
||
cells that record the date, order number, customer name, customer address,
|
||
item ID, item description, quantity, unit price, and total price. If an
|
||
order contains multiple products, these would be recorded on additional
|
||
rows, as would subsequent orders from the same customer. All of this
|
||
information is important to the business, but this way of organizing it has
|
||
a great deal of redundancy and inefficiency. For example, the customer
|
||
address recurs in every order, and the customer address field merges street,
|
||
city, state and zip code into a large unstructured field rather than
|
||
separating them as atomic components of different types of information with
|
||
potentially varying uses. Similar redundancy exists for the products and
|
||
prices. Canceling an order might result in the business deleting all the
|
||
information it has about a particular customer or product.</p><p>Normalization divides this large body of information into four separate
|
||
tables, one for customers, one for customer orders, one for the items
|
||
contained in each order, and one for item information. This normalized
|
||
information model encodes all of the information in the <span class="quote">“<span class="quote">spreadsheet
|
||
style</span>”</span> model, but eliminates the redundancy and avoids the data
|
||
integrity problems that are inherent in it.</p><p>Normalization is taught in every database design course. The concept and
|
||
methods were proposed by <a class="link" href="bi01.html#Codd1970" title="“A relational model of data for large shared data banks”"><a id="cite_Codd1970"/>(Codd
|
||
1970)</a>, who invented the relational data model, and have been
|
||
taught to students in numerous database design textbooks like <a class="link" href="bi01.html#Date2003" title="An Introduction to Database Systems"><a id="cite_Date2003"/>(Date 2003)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-32"><p><sup>[<a href="#chapter-3-endnote-32" class="para">156</a>] </sup>[Computing] <a id="id606674" class="indexterm"/><a id="id606683" class="indexterm"/>The <span class="quote">“<span class="quote">Internet of Things</span>”</span> concept spread very
|
||
quickly after it was proposed in 1999 by <span class="personname"><span class="firstname">Kevin</span> <span class="surname">Ashton</span></span>, who co-founded the Auto-ID center at <span class="orgname">MIT</span>
|
||
that year to standardize <abbr class="abbrev">RFID</abbr> and sensor information. For a
|
||
popular introduction, see <a class="link" href="bi01.html#Gershenfeld2004" title="“The Internet of Things”"><a id="cite_Gershenfeld2004"/>(Gershenfeld, Krikorian, and Cohen
|
||
2004)</a>. For a recent technical survey and a taxonomy of
|
||
application domains and scenarios see <a class="link" href="bi01.html#Atzori2010" title="“The Internet of Things: A survey”"><a id="cite_Atzori2010"/>(Atzori, Iera, and Morabito 2010)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-33"><p><sup>[<a href="#chapter-3-endnote-33" class="para">157</a>] </sup>[Computing] <a id="id606758" class="indexterm"/><span class="orgname"><a id="id606771" class="indexterm"/>University of Southern California</span> professor <span class="personname"><span class="firstname">Julian</span> <span class="surname">Bleecker</span></span> coined the term <span class="quote">“<span class="quote">Blogjects</span>”</span> to describe objects
|
||
that blog <a class="link" href="bi01.html#Bleecker2006" title="A Manifesto for Networked Objects — Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things"><a id="cite_Bleecker2006-3.1"/>(Bleecker 2006, p. 2)</a>.
|
||
Bleecker’s early example of a Blogject is <span class="personname"><span class="firstname">Beatriz</span> <span class="surname">da Costa</span></span>’s <span class="application">Pigeon Blog</span>. Da Costa, a Los
|
||
Angeles-based artist working at the intersection
|
||
of life sciences, politics, and technology, armed urban pigeons with
|
||
pollution sensors and locative tracking devices, released them, and created
|
||
a web interface<span class="symbol">—</span>in this case Pigeon
|
||
Blog<span class="symbol">—</span>to display their flight patterns on <a id="id606895" class="indexterm"/>
|
||
<span class="application">Google Maps</span> alongside the pollution levels in the
|
||
air as they flew. <span class="quote">“<span class="quote">Whereas once the pigeon was an urban varmint whose
|
||
value as a participant in the larger social collective was practically
|
||
nil or worse, the Pigeon that Blogs now attains first-class citizen
|
||
status</span>”</span> <a class="link" href="bi01.html#Bleecker2006" title="A Manifesto for Networked Objects — Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things"><a id="cite_Bleecker2006-3.2"/>(Bleecker 2006, p. 5)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-34"><p><sup>[<a href="#chapter-3-endnote-34" class="para">158</a>] </sup>[Computing] <span class="orgname"><a id="id606972" class="indexterm"/>IBM</span>’s <span class="personname"><span class="firstname">Andy</span> <span class="surname">Stanford-Clark</span></span> has been credited with coining the term when he wired his
|
||
house with sensors, enabling appliances to send information to the house’s
|
||
<span class="application">Twitter</span> account, @andy_house (MacManus, 2009,
|
||
para. 4). The house plant kit:
|
||
<a class="ulink" href="http://www.sparkfun.com/products/10334" target="_top"><code class="uri">http://www.sparkfun.com/products/10334</code></a>. See also
|
||
<a class="ulink" href="http://supermechanical.com/twine" target="_top"><code class="uri">http://supermechanical.com/twine</code></a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-35"><p><sup>[<a href="#chapter-3-endnote-35" class="para">159</a>] </sup>[Computing] <a id="id607036" class="indexterm"/><a id="id607059" class="indexterm"/><a id="id607064" class="indexterm"/>Pattern analysis can help escape this dilemma by enabling
|
||
predictive modeling to make optimal use of the data. In designing smart
|
||
things and devices for people, it is helpful to create a smart model in
|
||
order to predict the kinds of patterns and locations relevant to the data
|
||
collected or monitored. These allow designers to develop a set of dimensions
|
||
and principles that will act as smart guides for the development of smart
|
||
things. Modeling helps to enable automation, security, or energy efficiency,
|
||
and baseline models can be used to detect anomalies. As for location, exact
|
||
locations are unnecessary; use of a <span class="quote">“<span class="quote">symbolic space</span>”</span> to
|
||
represent each <span class="quote">“<span class="quote">sensing zone</span>”</span><span class="symbol">—</span>e.g.,
|
||
rooms in a house<span class="symbol">—</span>and an individual’s movement history
|
||
as a string of symbols<span class="symbol">—</span>e.g.,
|
||
abcdegia<span class="symbol">—</span>works sufficiently as a model of
|
||
prediction. See <a class="link" href="bi01.html#Das2002" title="“The role of prediction algorithms in the MavHome smart home architecture”"><a id="cite_Das2002"/>(Das et al.
|
||
2002)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-36"><p><sup>[<a href="#chapter-3-endnote-36" class="para">160</a>] </sup>[Law] <a id="id607193" class="indexterm"/>Well, maybe not anything. Books list traditional meanings of
|
||
various names, charts rank names by popularity in different eras, and dozens
|
||
of websites tout themselves as the place to find a special and unique name.
|
||
See <a class="ulink" href="http://www.ssa.gov/oact/babynames/" target="_top"><code class="uri">http://www.ssa.gov/oact/babynames/</code></a> for historical trends
|
||
about baby names in the US with an interactive visualization at
|
||
<a class="ulink" href="http://www.babynamewizard.com/voyager#" target="_top"><code class="uri">http://www.babynamewizard.com/voyager#</code></a>.</p><p><a id="id607250" class="indexterm"/>Different countries have rules about characters or words that
|
||
may be used in names. In Germany, for example, the government regulates the
|
||
names parents can give to their children; there’s even a book, the
|
||
<em class="citetitle">International Handbook of Forenames</em>, to guide
|
||
them <a class="link" href="bi01.html#Kulish2009" title="“High Court in Germany Pops Names That Balloon”"><a id="cite_Kulish2009"/>(Kulish 2009)</a>.
|
||
In Portugal, the <span class="orgname">Ministry of Justice</span> publishes lists of
|
||
prohibited names (BBC News, 2007a). Meanwhile, in 2007, Swedish tax
|
||
officials rejected a family’s attempt to name their daughter <span class="personname"><span class="firstname">Metallica</span></span> (<a class="ulink" href="http://news.bbc.co.uk/2/hi/6525475.stm" target="_top"><code class="uri">http://news.bbc.co.uk/2/hi/6525475.stm</code></a>).</p><p>We can also change our names, whether a woman takes on her husband’s
|
||
surname after marriage or, like the California man who changed his name to <span class="quote">“<span class="quote"><span class="personname"><span class="firstname">Trout Fishing</span></span>,</span>”</span> we just find something that better suits us than
|
||
the name given by our parents.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-37"><p><sup>[<a href="#chapter-3-endnote-37" class="para">161</a>] </sup>[CogSci] <a id="id607363" class="indexterm"/>While you may think that certain terms are more
|
||
obviously <span class="quote">“<span class="quote">good</span>”</span> than others, studies show that
|
||
<span class="quote">“<span class="quote">there is no one good access term for most objects. The idea
|
||
of an <span class="quote">‘<span class="quote">obvious,</span>’</span>
|
||
<span class="quote">‘<span class="quote">self-evident,</span>’</span> or <span class="quote">‘<span class="quote">natural</span>’</span> term is a
|
||
myth!</span>”</span>
|
||
<a class="link" href="bi01.html#Furnas1987" title="“The Vocabulary Problem in Human-System Communication: an Analysis and a Solution”"><a id="cite_Furnas1987-3.2"/>(Furnas et al.
|
||
1987</a>, p. 967).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-38"><p><sup>[<a href="#chapter-3-endnote-38" class="para">162</a>] </sup>[CogSci] <a id="id607491" class="indexterm"/>The most common names for this service were activities,
|
||
calendar and events, but in all, over a hundred different names were
|
||
suggested, including cityevents, whatup, sparetime, funtime, weekender,
|
||
nightout, and many more. <span class="quote">“<span class="quote">People use a surprisingly great variety
|
||
of words to refer to the same thing,</span>”</span> Furnas wrote. <span class="quote">“<span class="quote">If
|
||
everyone always agreed on what to call things, the user’s word would
|
||
be the designer’s word would be the system’s word. . . .
|
||
Unfortunately, people often disagree on the words they use for
|
||
things</span>”</span>
|
||
<a class="link" href="bi01.html#Furnas1987" title="“The Vocabulary Problem in Human-System Communication: an Analysis and a Solution”"><a id="cite_Furnas1987-3.3"/>(Furnas et al.
|
||
1987</a>, p. 964).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-39"><p><sup>[<a href="#chapter-3-endnote-39" class="para">163</a>] </sup>[CogSci] <a id="id607679" class="indexterm"/>This example comes from <a class="link" href="bi01.html#Farish2002" title="What’s in a Name?"><a id="cite_Farish2002"/>(Farish 2002)</a>, who analyzes
|
||
<span class="quote">“<span class="quote">What’s in a Name?</span>”</span> and suggests that multiple names
|
||
for the same thing might be a good idea because non-technical business
|
||
users, data analysts, and system implementers need to see things
|
||
differently and no one standard for assigning names will work for all
|
||
three audiences.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-40"><p><sup>[<a href="#chapter-3-endnote-40" class="para">164</a>] </sup>[CogSci] <a id="id607665" class="indexterm"/>See, for example, <span class="emphasis"><em>Handbook of Cross-Cultural
|
||
Marketing</em></span>, <a class="link" href="bi01.html#Kaynak1997" title="Handbook of Cross-Cultural Marketing"><a id="cite_Kaynak1997"/>(Kaynak 1997)</a>. The
|
||
<span class="orgname">Starbucks</span> coffee chain seemingly goes out of its
|
||
way to confuse its customers by calling the smallest of its three coffee
|
||
sizes (12 ounces) the <span class="quote">“<span class="quote">tall</span>”</span> size, calling its 16-ounce
|
||
size a <span class="quote">“<span class="quote"><span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">grande</em></span>,</span>”</span> and calling its largest a
|
||
<span class="quote">“<span class="quote"><span xml:lang="it" class="foreignphrase"><em xml:lang="it" class="foreignphrase">venti</em></span>,</span>”</span>
|
||
which is Italian for 20 (ounces). Outside of
|
||
<span class="orgname">Starbucks</span>, something that is <span class="quote">“<span class="quote">tall</span>”</span>
|
||
is never also considered <span class="quote">“<span class="quote">small.</span>”</span> Ironically, despite
|
||
having about 20,000 stores in about 60 countries, Starbucks has none in
|
||
Italy where <span xml:lang="it" class="foreignphrase"><em xml:lang="it" class="foreignphrase">venti</em></span> would be in
|
||
the local language.</p></div><div class="footnote" id="ftn.chapter-3-endnote-41"><p><sup>[<a href="#chapter-3-endnote-41" class="para">165</a>] </sup>[Business] <a id="id607857" class="indexterm"/><a id="id607830" class="indexterm"/>See “<span class="citetitle">As easy as ZYX,</span>”
|
||
<a class="ulink" href="http://www.economist.com/node/760345" target="_top"><code class="uri">http://www.economist.com/node/760345</code></a>. In addition, the
|
||
convention to list the co-authors of scientific publications in
|
||
alphabetic order has been shown to affect reputation and employment by
|
||
giving undeserved advantages to people whose names start with letters
|
||
that come early in the alphabet. This bias might also affect admission
|
||
to selective schools <a class="link" href="bi01.html#Efthyvoulou2008" title="“Alphabet Economics: The link between names and reputation”"><a id="cite_Efthyvoulou2008"/>(Efthyvoulou 2008)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-42"><p><sup>[<a href="#chapter-3-endnote-42" class="para">166</a>] </sup>[Business] <a id="id607949" class="indexterm"/><a id="id607956" class="indexterm"/>The <span class="orgname">Kentucky Fried Chicken</span> franchise
|
||
solved this problem by changing its name to <abbr class="abbrev">KFC</abbr>, which
|
||
you can now find in Beijing, Moscow, London and other locations not
|
||
anywhere near Kentucky and where many people have probably never heard
|
||
of the place.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-43"><p><sup>[<a href="#chapter-3-endnote-43" class="para">167</a>] </sup>[Computing] <a id="id607932" class="indexterm"/><a id="id607970" class="indexterm"/><span class="personname"><span class="firstname"><a id="id607996" class="indexterm"/>Tim</span> <span class="surname">Berners-Lee</span></span>, the inventor of the web, famously argued that
|
||
<em class="citetitle">Cool URIs Don’t Change</em>
|
||
<a class="link" href="bi01.html#Berners-Lee1998" title="Cool URIs don’t change"><a id="cite_Berners-Lee1998-3.1"/>(Berners-Lee 1998)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-44"><p><sup>[<a href="#chapter-3-endnote-44" class="para">168</a>] </sup>[Law] <a id="id608072" class="indexterm"/><a id="id608047" class="indexterm"/>Any online citation to one of the <span class="orgname">West</span>
|
||
printed court reports will use the <span class="orgname">West</span> format.
|
||
However, when <span class="orgname">Mead Data</span> wanted to use the West page
|
||
numbers in its <span class="application">LEXIS online service</span> to link
|
||
to specific pages, West sued for copyright infringement. The citation
|
||
for the <span class="orgname">West Publishing</span> vs. <span class="orgname">Mead Data
|
||
Central</span> case is 799 F.2d 1219 (8th Cir 1986), which means
|
||
that the case begins on page 1219 of volume 799 in the set of opinions
|
||
from the <span class="orgname">8th Circuit Court of Appeals</span> that
|
||
<span class="orgname">West</span> published in print form.
|
||
<span class="orgname">West</span> won the case and <span class="orgname">Mead
|
||
Data</span> had to pay substantial royalties. Fortunately, the
|
||
logic behind this decision was repudiated by the <span class="orgname">US Supreme
|
||
Court</span> a few years later in a case that
|
||
<span class="orgname">West</span> published as <em class="citetitle"><span class="orgname">Feist
|
||
Publications, Inc.</span>, v. <span class="orgname">Rural Telephone
|
||
Service Co.</span>, 499 U.S. 340 (1991)</em>, and
|
||
<span class="orgname">West</span> can no longer claim copyright on page
|
||
numbers.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-45"><p><sup>[<a href="#chapter-3-endnote-45" class="para">169</a>] </sup>[CogSci] <a id="id608176" class="indexterm"/><a id="id608189" class="indexterm"/>When <span class="personname"><span class="firstname">George</span> <span class="surname">Orwell</span></span> gave the title <span class="quote">“<span class="quote">1984</span>”</span> to a novel he wrote in
|
||
1949 he intended it as a warning about a totalitarian future as the Cold
|
||
War took hold in a divided Europe, but today 1984 is decades in the past
|
||
and the title does not have the same impact.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-46"><p><sup>[<a href="#chapter-3-endnote-46" class="para">170</a>] </sup>[Citation] <a class="link" href="bi01.html#Dorai2002" title="“Bridging the Semantic Gap in Content Management Systems: Computational Media Aesthetics”"><a id="cite_Dorai2002"/>(Dorai and
|
||
Venkatesh 2002)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-47"><p><sup>[<a href="#chapter-3-endnote-47" class="para">171</a>] </sup>[Citation] Most common US surnames;
|
||
<a class="ulink" href="http://names.mongabay.com/most_common_surnames.htm" target="_top"><code class="uri">http://names.mongabay.com/most_common_surnames.htm</code></a>.</p><p>Chad Ochocinco story:
|
||
<a class="ulink" href="http://en.wikipedia.org/wiki/Chad_Ochocinco" target="_top"><code class="uri">http://en.wikipedia.org/wiki/Chad_Ochocinco</code></a>.</p><p>The Artist Formerly Known as Prince:
|
||
<a class="ulink" href="http://en.wikipedia.org/wiki/Prince_%28musician%29" target="_top"><code class="uri">http://en.wikipedia.org/wiki/Prince_%28musician%29</code></a>.</p><p>Fake names at Starbucks:
|
||
<a class="ulink" href="http://online.wsj.com/article/SB10001424053111904106704576582834147448392.html" target="_top"><code class="uri">http://online.wsj.com/article/SB10001424053111904106704576582834147448392.html</code></a>.</p><p>Twitter on sports jerseys:
|
||
<a class="ulink" href="http://www.forbes.com/sites/alexknapp/2011/12/30/pro-lacrosse-team-replaces-names-with-twitter-handles-on-jerseys/?partner=technology_newsletter" target="_top"><code class="uri">http://www.forbes.com/sites/alexknapp/2011/12/30/pro-lacrosse-team-replaces-names-with-twitter-handles-on-jerseys/?partner=technology_newsletter</code></a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-48"><p><sup>[<a href="#chapter-3-endnote-48" class="para">172</a>] </sup>[Computing] <a id="id608776" class="indexterm"/><a id="id608783" class="indexterm"/>Identifiers with meaningful internal structure are said to
be structured or intelligent. Those that contain no additional
information are sometimes said to be unstructured, opaque, or dumb. The
8 in the <abbr class="abbrev">ISBN</abbr> example is a check digit, not technically
part of the identifier, that is algorithmically derived from the other
digits to detect errors in entering the <abbr class="abbrev">ISBN</abbr>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-49"><p><sup>[<a href="#chapter-3-endnote-49" class="para">173</a>] </sup>[Citation] <a class="link" href="bi01.html#McCartney2006" title="“When Pilots Pass the BRBON, They Must Be in Kentucky”"><a id="cite_McCartney2006"/>(McCartney 2006)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-50"><p><sup>[<a href="#chapter-3-endnote-50" class="para">174</a>] </sup>[LIS] <a id="id608993" class="indexterm"/><a class="link" href="bi01.html#Svenonius2000" title="The Intellectual Foundation of Information Organization"><a id="cite_Svenonius2000-3.1"/>(Svenonius 2000)</a> calls vocabulary control <span class="quote">“<span class="quote">the
<span xml:lang="latin" class="foreignphrase"><em xml:lang="latin" class="foreignphrase">sine qua non</em></span> of
information organization</span>”</span> (p. 89). <span class="quote">“<span class="quote">The imposition of
vocabulary control creates an artificial language out of a natural
language</span>”</span> (p. 89), leaving behind an official, normalized set
of terms and their uses.</p></div><div class="footnote" id="ftn.chapter-3-endnote-51"><p><sup>[<a href="#chapter-3-endnote-51" class="para">175</a>] </sup>[LIS] <a id="id609078" class="indexterm"/><a id="id609088" class="indexterm"/><a id="id609112" class="indexterm"/><a id="id609121" class="indexterm"/> This mapping is <span class="quote">“<span class="quote">the means by which the language of
the user and that of a retrieval system are brought into
sync</span>”</span>
<a class="link" href="bi01.html#Svenonius2000" title="The Intellectual Foundation of Information Organization"><a id="cite_Svenonius2000-3.2"/>(Svenonius 2000, p. 93)</a>
and allows an
information-seeker to understand the relationship between, say, <span class="personname"><span class="firstname">Samuel</span> <span class="surname">Clemens</span></span> and <span class="personname"><span class="firstname">Mark</span> <span class="surname">Twain</span></span>. The <span class="orgname">Library of Congress</span> maintains a
list of standard, accepted names for authors, subjects, and titles
called the <em class="citetitle">Name Authority File</em>.
<a class="ulink" href="http://id.loc.gov/authorities/names.html" target="_top"><code class="uri">http://id.loc.gov/authorities/names.html</code></a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-52"><p><sup>[<a href="#chapter-3-endnote-52" class="para">176</a>] </sup>[Citation] Pan-European Species Directory Infrastructure (PESI):
<a class="ulink" href="http://www.eu-nomen.eu/pesi" target="_top"><code class="uri">http://www.eu-nomen.eu/pesi</code></a>; Consortium for the Barcode
of Life (CBOL): <a class="ulink" href="http://www.barcoding.si.edu/" target="_top"><code class="uri">http://www.barcoding.si.edu/</code></a>; NatureServe:
<a class="ulink" href="http://services.natureserve.org/BrowseServices/getSpeciesData/getSpeciesListREST.jsp" target="_top"><code class="uri">http://services.natureserve.org/BrowseServices/getSpeciesData/getSpeciesListREST.jsp</code></a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-53"><p><sup>[<a href="#chapter-3-endnote-53" class="para">177</a>] </sup>[Citation] <a class="link" href="bi01.html#Hemerly2011" title="“Making Metadata: The Case of MusicBrainz”"><a id="cite_Hemerly2011"/>(Hemerly
2011)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-54"><p><sup>[<a href="#chapter-3-endnote-54" class="para">178</a>] </sup>[Law] <a id="id609483" class="indexterm"/>This <span class="hardware">rations / radio confusion</span> is
described in <a class="link" href="bi01.html#Wheatley2004" title="“Operation Clean Data”"><a id="cite_Wheatley2004"/>(Wheatley 2004)</a>. In 2008 a similar mistake in
<span class="action">managing inventory</span> at a US <span class="hardware">military
warehouse</span> led to <span class="hardware">missile launch
fuses</span> being sent to Taiwan instead of
<span class="hardware">helicopter batteries</span>, causing a high-level
diplomatic furor when the <span class="orgname">Chinese government</span> objected
to this as a treaty violation <a class="link" href="bi01.html#Hoffman2008" title="“Details emerging on how fuses got to Taiwan”"><a id="cite_Hoffman2008"/>(Hoffman 2008)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-55"><p><sup>[<a href="#chapter-3-endnote-55" class="para">179</a>] </sup>[LIS] <a id="id609546" class="indexterm"/><a id="id609554" class="indexterm"/>Organizing systems in libraries, museums, and businesses
often give sequential accession numbers to resources when they are added
to a collection, but these identifiers are of no use outside of the
context in which they are assigned, as when a union catalog or merged
database is created.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-56"><p><sup>[<a href="#chapter-3-endnote-56" class="para">180</a>] </sup>[Computing] <a id="id609676" class="indexterm"/><a id="id609684" class="indexterm"/>A more general technique is to use the <abbr class="abbrev">UUID</abbr>
standard, which standardizes some algorithms that generate 128-bit
tokens that, for all practical purposes, will be unique for hundreds, if
not thousands, of years.</p></div><div class="footnote" id="ftn.chapter-3-endnote-57"><p><sup>[<a href="#chapter-3-endnote-57" class="para">181</a>] </sup>[Computing] <a id="id609714" class="indexterm"/><a id="id609726" class="indexterm"/><a id="id609731" class="indexterm"/><a id="id609736" class="indexterm"/>The <em class="firstterm"><a id="first_OASIS"/><span class="citerefentry"><span class="refentrytitle">Organization for the Advancement of Structured
Information Systems</span>(OASIS)</span></em>
<span class="citerefentry"><span class="refentrytitle">XML Common Biometric Format</span>(XCBF)</span> was developed to standardize the use of biometric data
like <abbr class="abbrev">DNA</abbr>, fingerprints, iris scans, and hand geometry
to verify identity
(<a class="ulink" href="https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xcbf" target="_top"><code class="uri">https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xcbf</code></a>).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-58"><p><sup>[<a href="#chapter-3-endnote-58" class="para">182</a>] </sup>[Citation] <a class="link" href="bi01.html#Coyle2006" title="“Identifiers: Unique, Persistent, Global”"><a id="cite_Coyle2006"/>(Coyle
2006</a>, p. 429).</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-59"><p><sup>[<a href="#chapter-3-endnote-59" class="para">183</a>] </sup>[Computing] <a id="id610166" class="indexterm"/>IPv6 for internet addresses. The threat of IPv4 address exhaustion was
the motivation for remedial technologies, such as classful networks, <span class="citerefentry"><span class="refentrytitle">Classless Inter-Domain Routing</span>(CIDR)</span> methods, and <span class="citerefentry"><span class="refentrytitle">Network Address Translation</span>(<abbr class="acronym">NAT</abbr>)</span> that extend the usable address space.</p></div><div class="footnote" id="ftn.chapter-3-endnote-60"><p><sup>[<a href="#chapter-3-endnote-60" class="para">184</a>] </sup>[Computing] <a id="id610251" class="indexterm"/><a id="id610261" class="indexterm"/><span class="citerefentry"><span class="refentrytitle">Digital Object Identifier</span>(DOI)</span> system (<a class="ulink" href="http://www.doi.org" target="_top"><code class="uri">http://www.doi.org</code></a>). However,
<abbr class="abbrev">DOI</abbr> has its issues too. It’s a highly political,
publisher-controlled system, not a universal solution to
persistence.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-61"><p><sup>[<a href="#chapter-3-endnote-61" class="para">185</a>] </sup>[CogSci] <a id="id610335" class="indexterm"/><a id="id610340" class="indexterm"/><a id="id610345" class="indexterm"/>This is called the <em class="citetitle">Paradox of
Theseus</em>, a philosophical debate since ancient times. Every
day that <span class="personname"><span class="surname">Theseus</span></span>’s ship is in the harbor, a single plank gets replaced,
until after a few years the ship is completely rebuilt: not a single
original plank remains. Is it still the ship of <span class="personname"><span class="surname">Theseus</span></span>? And suppose, meanwhile, the shipbuilders have been
building a new ship out of the replaced planks? Is that the ship of <span class="personname"><span class="surname">Theseus</span></span>?
<a class="link" href="bi01.html#Furner2008" title="“Interrogating ‘identity’: A philosophical approach to an enduring issue in knowledge organization”"><a id="cite_Furner2008"/>(Furner 2008, p. 6)</a>.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-62"><p><sup>[<a href="#chapter-3-endnote-62" class="para">186</a>] </sup>[Citation] See <a class="link" href="bi01.html#Renear2003" title="“Towards Identity Conditions for Digital Documents”"><a id="cite_Renear2003"/>(Renear
and Dubin 2003)</a>, <a class="link" href="bi01.html#Wynholds2011" title="“Linking to Scientific Data: Identity Problems of Unruly and Poorly Bounded Digital Objects”"><a id="cite_Wynholds2011"/>(Wynholds 2011)</a>.</p></div><div class="footnote" id="ftn.chapter-3-endnote-63"><p><sup>[<a href="#chapter-3-endnote-63" class="para">187</a>] </sup>[Law] <a id="id610617" class="indexterm"/><a id="id610595" class="indexterm"/><a id="id610606" class="indexterm"/><a id="id610636" class="indexterm"/><a id="id610647" class="indexterm"/>Consider the case of an elderly woman born in 1929 in Zemun, a
district in the eastern European city of Belgrade, who has never moved. The
place she lives has been part of seven different countries during her
lifetime: Kingdom of Yugoslavia (1929-1941); Independent State of Croatia
(1941-1945); Federal People’s Republic of Yugoslavia (1945-1963); Socialist
Federal Republic of Yugoslavia (1963-1992); Federal Republic of Yugoslavia
(1992-2003); State Union of Serbia and Montenegro (2003-2006); Republic of
Serbia (2006-present). See <a class="ulink" href="http://www.nationsonline.org/oneworld/hist_country_names.htm" target="_top"><code class="uri">http://www.nationsonline.org/oneworld/hist_country_names.htm</code></a>
for a list of formerly used country names and their effectivities.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-64"><p><sup>[<a href="#chapter-3-endnote-64" class="para">188</a>] </sup>[Business] <a id="id610668" class="indexterm"/><a id="id610756" class="indexterm"/><a id="id610687" class="indexterm"/>Effectivity in the tax code is simple compared to that relating
to documents in complex systems, like commercial aircraft. Because of their
long lifetimes<span class="symbol">—</span>the <span class="hardware">Boeing 737</span>
has been flying since the 1960s<span class="symbol">—</span>and continual
upgrading of parts like <span class="hardware">engines</span> and
<span class="hardware">computers</span>, each <span class="hardware">airplane</span> has
its own <span class="hardware">operating and maintenance manual</span> that reflects
changes made to the plane over time. Every change to the plane requires an
update to that manual, making the old version obsolete. And while an
aircraft mechanic might refer to <span class="quote">“<span class="quote">the 737 maintenance manual,</span>”</span>
each 737 aircraft actually has its own unique manual.</p></div><div class="footnote" epub:type="footnote" id="ftn.chapter-3-endnote-65"><p><sup>[<a href="#chapter-3-endnote-65" class="para">189</a>] </sup>[Law] <a id="id610873" class="indexterm"/><a id="id610918" class="indexterm"/>A notary public is used to verify that a signature on an
important document, such as a mortgage or other contract, is authentic, much
as signet rings and sealing wax once proved that no one had tampered with a
document since it was sealed.</p></div></div></section></body></html>