12.26.07

METADATA

Posted in Gestión de Fondos Digitales at 8:19 pm by Ana Carrera

INFORMATION TAKEN FROM:

http://www.webopedia.com/TERM/m/metadata.html

http://www.library.uq.edu.au/iad/ctmeta4.html

http://en.wikipedia.org/wiki/Metadata

Metadata is structured data that describes the characteristics of a resource. It is often defined as data about data (an item of metadata is itself data and may therefore have its own metadata). It shares many characteristics with the cataloguing that takes place in libraries, museums and archives. Metadata is essential for understanding information stored in data warehouses and has become increasingly important in XML-based Web applications. David Marco, a metadata theorist, defines metadata as “all physical data and knowledge from inside and outside an organization, including information about the physical data, technical and business processes, rules and constraints of the data, and structures of the data used by a corporation.”

Metadata is data associated with objects that relieves their potential users of the need to have full advance knowledge of their existence or characteristics.

The prefix “meta” derives from Greek and denotes a nature of a higher order or a more fundamental kind. A metadata record consists of a number of pre-defined elements representing specific attributes of a resource, and each element can have one or more values.
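
As a rough illustration of such a record, here is a minimal Python sketch. The element names and values are hypothetical, chosen only to show that each element maps to one or more values; they are not taken from any particular schema.

```python
# A hypothetical metadata record: each element name maps to a list,
# because an element can carry one or more values.
record = {
    "title": ["Metadata"],
    "creator": ["Ana Carrera"],
    "subject": ["metadata", "digital libraries"],  # two values for one element
}


def values_of(record, element):
    """Return every value recorded for an element (empty list if absent)."""
    return record.get(element, [])


print(values_of(record, "subject"))
```

Keeping every element as a list, even when it holds a single value, avoids special cases when a second value is added later.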

Each metadata schema will usually have the following characteristics:

  • A limited number of elements
  • The name of each element
  • The meaning of each element
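
These three characteristics can be sketched in a few lines of Python. The three-element schema below is an assumption made for illustration, not a real standard: it simply pairs each element name with its meaning and checks a record against the limited set of names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str     # the name of the element
    meaning: str  # the agreed meaning of the element


# Hypothetical schema with a deliberately limited number of elements.
SCHEMA = (
    Element("title", "The name given to the resource"),
    Element("creator", "The person or body responsible for the content"),
    Element("date", "When the resource was published"),
)


def unknown_elements(record, schema=SCHEMA):
    """List element names used in the record that the schema does not define."""
    allowed = {element.name for element in schema}
    return sorted(name for name in record if name not in allowed)


print(unknown_elements({"title": ["Metadata"], "colour": ["blue"]}))
```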

Typically, the semantics is descriptive of the contents, location, physical attributes, type and form. Key metadata elements of documents include the originator of a work, its title, when and where it was published and the subject areas it covers. The resource community may also define some logical grouping of the elements or leave it to the encoding scheme. For example, Dublin Core may provide the core to which extensions may be added. When structured into a hierarchical arrangement, metadata is more properly called an ontology or schema.

Some of the most popular metadata schemas include:

  • Dublin Core
  • AACR2 (Anglo-American Cataloguing Rules, 2nd edition)
  • GILS (Government Information Locator Service)
  • EAD (Encoded Archival Description)
  • IMS (IMS Global Learning Consortium)
  • AGLS (Australian Government Locator Service)

There are hundreds of metadata schemas to choose from, as different communities seek to meet the specific needs of their members. While the syntax is not strictly part of the metadata schema, the data will be unusable unless the encoding scheme conveys the semantics of the metadata schema. The encoding allows the metadata to be processed by a computer program. Important encoding schemes include:

  • HTML (HyperText Markup Language)
  • SGML (Standard Generalized Markup Language)
  • XML (eXtensible Markup Language)
  • RDF (Resource Description Framework)
  • MARC (MAchine Readable Cataloging)
  • MIME (Multipurpose Internet Mail Extensions)
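
To make the role of the encoding concrete, the following sketch serializes a small record as XML with Python's standard xml.etree.ElementTree module. The record layout and the root element name "record" are assumptions for the example; a real schema would prescribe its own element names.

```python
import xml.etree.ElementTree as ET


def encode_as_xml(record):
    """Serialize a record (element name -> list of values) as an XML string."""
    root = ET.Element("record")  # hypothetical root element name
    for name, values in record.items():
        for value in values:
            # One child element per value, so repeated elements are preserved.
            child = ET.SubElement(root, name)
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = encode_as_xml({"title": ["Metadata"], "subject": ["metadata", "XML"]})
print(xml_text)
```

Once the record is encoded this way, any XML-aware program can read it back without knowing how it was produced.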

Metadata may be deployed in a number of ways:

  • In the Web page by the creator or their agent using META tags in the HTML coding of the page.
  • As a separate HTML document linked to the resource it describes.
  • In a database linked to the resource. The records may either have been directly created within the database or extracted from another source, such as Web pages.

The simplest method is for Web page creators to add the metadata as part of creating the page. Creating metadata directly in a database and linking it to the resource is growing in popularity as an activity independent of the creation of the resources themselves. Increasingly, metadata is being created by an agent or third party, particularly to develop subject-based gateways.
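
For the first option, embedding metadata in the page itself, a page creator could generate the META tags from a record. This is only a sketch: the function name meta_tags is made up for the example, and the "DC.title" style of name follows the common convention for Dublin Core in HTML rather than anything stated above.

```python
from html import escape


def meta_tags(record):
    """Render a record (element name -> list of values) as HTML META tags."""
    lines = []
    for name, values in record.items():
        for value in values:
            # Escape quotes and angle brackets so the attribute stays valid HTML.
            lines.append(
                '<meta name="{}" content="{}">'.format(
                    escape(name, quote=True), escape(value, quote=True)
                )
            )
    return "\n".join(lines)


print(meta_tags({"DC.title": ["Metadata"], "DC.creator": ["Ana Carrera"]}))
```

The resulting lines would be pasted inside the HEAD section of the page.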

Metadata has many different applications:

  • It provides the essential link between the information creator and the information user.
  • It improves resource discovery by speeding up and enriching searches for resources.
  • It provides additional information, which may be descriptive, to users of the data it describes.
  • It helps to bridge the semantic gap: by telling a computer how data items are related and how these relations can be evaluated automatically, it makes more complex filter and search operations possible.
  • Certain metadata is designed to optimize lossy compression algorithms.
  • Some metadata is intended to enable variable content presentation.
  • Other metadata can be used to automate workflows.
  • It has become important because of the need to find useful information in the mass of information available.
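
Resource discovery is the easiest of these applications to sketch. The tiny catalogue below is invented for the example; the point is only that a search can run over the metadata without opening the resources themselves.

```python
# Hypothetical catalogue: one metadata record per resource.
CATALOGUE = [
    {"title": "Intro to Dublin Core", "subject": ["metadata", "standards"]},
    {"title": "Scanning old maps", "subject": ["digitisation"]},
    {"title": "RDF in practice", "subject": ["metadata", "RDF"]},
]


def find_by_subject(catalogue, subject):
    """Return the titles of all records tagged with the subject (case-insensitive)."""
    wanted = subject.lower()
    return [
        record["title"]
        for record in catalogue
        if wanted in (value.lower() for value in record["subject"])
    ]


print(find_by_subject(CATALOGUE, "Metadata"))
```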

The metadata elements fall into three groups which roughly indicate the class or scope of information stored in them: (1) elements related mainly to the content of the resource, (2) elements related mainly to the resource when viewed as intellectual property, and (3) elements related mainly to the physical manifestation of the resource. For more detail, see the Dublin Core page (Dublin Core has become the de facto Internet metadata standard) or the previous article.
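
For the fifteen elements of the Dublin Core element set, the three groups are usually presented as follows in introductory Dublin Core material. The group labels and the helper group_of are my own naming for this sketch.

```python
# The fifteen Dublin Core elements, arranged in the three groups described above.
DUBLIN_CORE_GROUPS = {
    "content": [
        "Title", "Subject", "Description", "Type",
        "Source", "Relation", "Coverage",
    ],
    "intellectual property": ["Creator", "Publisher", "Contributor", "Rights"],
    "physical manifestation": ["Date", "Format", "Identifier", "Language"],
}


def group_of(element):
    """Return the group an element belongs to, or None if it is not listed."""
    for group, elements in DUBLIN_CORE_GROUPS.items():
        if element in elements:
            return group
    return None


print(group_of("Creator"))
```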

The <META> tag is not normally displayed by Web browsers, but can be viewed by selecting “Page Source”.
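
The same hidden tags can be read by a program. The sketch below uses Python's standard html.parser module on a made-up page; the class name MetaTagCollector and the sample HTML are assumptions for the example.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from the META tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.found.setdefault(name, []).append(content)


# Hypothetical page used only for this example.
PAGE = """<html><head>
<title>Metadata</title>
<META NAME="keywords" CONTENT="metadata, Dublin Core">
<meta name="author" content="Ana Carrera">
</head><body><p>Visible text.</p></body></html>"""

collector = MetaTagCollector()
collector.feed(PAGE)
print(collector.found)
```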

 

Which elements, sub-elements and schemes should be used?

The choice is normally based on:

  • The specific needs of the local community to maximise information retrieval and management.
  • The need to guard against making the creation of metadata and its maintenance more trouble than it is worth and therefore defeating its purpose.
  • Sustainability of the metadata schema in terms of keeping the records up to date: It is not economical to start attaching metadata only after the production process has been completed.

The level of specificity in resource description is also important. The resources can be described individually or at a collection or aggregate level.

Consistent use of language within metadata descriptions can aid the consistent discovery of resources.

 

Where will the metadata be stored?

Metadata can be stored either internally, in the same file as the data, or externally, in a separate file. The deployment options described earlier apply here: META tags embedded in the HTML of the Web page, a separate HTML document linked to the resource, or a database linked to the resource.

For metadata attached to Web pages, the standard encoding scheme is HTML (HyperText Markup Language). RDF (Resource Description Framework) supports multiple metadata schemes and uses XML (eXtensible Markup Language) to express the structure.
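
As a hedged sketch of what RDF expressed in XML can look like, the code below builds a small RDF/XML description with Dublin Core properties using Python's standard xml.etree.ElementTree. The two namespace URIs are the published RDF and Dublin Core ones; the function name describe and the example.org address are placeholders.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, properties):
    """Build an RDF/XML description of one resource with Dublin Core properties."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: resource_url},
    )
    for name, value in properties.items():
        # Each property becomes a dc: element, for example dc:title.
        ET.SubElement(description, "{%s}%s" % (DC_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")


rdf_text = describe(
    "http://example.org/page.html",  # placeholder address
    {"title": "Metadata", "creator": "Ana Carrera"},
)
print(rdf_text)
```

Because the schema is identified by its namespace, elements from another metadata scheme could sit in the same description under a different prefix.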

How does one create metadata?

The more easily the metadata can be created and collected at the point of creation of a resource or at the point of publication, the more efficient the process and the more likely it is to take place. There are many tools available for creating metadata, and the number continues to grow.

Types of metadata

There are two distinct classes of metadata: structural or control metadata and guide metadata. Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes. Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language.

Metadata can be classified by:

  • Content: Metadata can either describe the resource itself or the content of the resource.
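  • Mutability: With respect to the whole resource, metadata can be either immutable (for example, the title of a video does not change while the video plays) or mutable (the “Scene description” does change).
  • Logical function: There are three layers of logical function: the subsymbolic layer, the symbolic layer and the logical layer.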

Digital library metadata

There are three categories of metadata that are frequently used to describe objects in a digital library [3][4]:

  1. descriptive – Information describing the intellectual content of the object, finding aids or similar schemes. It is typically used for bibliographic purposes and for search and retrieval.
  2. structural – Information that ties each object to others to make up logical units (e.g., information that relates individual images of pages from a book to the others that make up the book).
  3. administrative – Information used to manage the object or control access to it. This may include information on how it was scanned, its storage format, copyright and licensing information, and information necessary for the long-term preservation of the digital objects.
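
The three categories can be pictured as three sections of one record. The digital object below is invented for illustration (a scanned two-page booklet), and the field names inside each section are assumptions, not part of any standard.

```python
# Hypothetical digital object: a scanned two-page booklet.
digital_object = {
    # descriptive: what the object is about, used for search and retrieval
    "descriptive": {"title": "Sample booklet", "subject": ["example"]},
    # structural: how the page images fit together to form the book
    "structural": {"page_images": {2: "page-002.tif", 1: "page-001.tif"}},
    # administrative: how the object is stored and who may use it
    "administrative": {"format": "image/tiff", "rights": "All rights reserved"},
}


def pages_in_reading_order(obj):
    """Use the structural metadata to put the page images in book order."""
    page_images = obj["structural"]["page_images"]
    return [page_images[number] for number in sorted(page_images)]


print(pages_in_reading_order(digital_object))
```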
