Work in a Lab frequently involves a number of data formats. The list below is by no means complete, but it provides an overview of formats you are likely to encounter.
Analyzed Layout and Text Object (ALTO) is an XML format describing the recognised text and layout of an image. It is often used in combination with METS (see below).
Hypertext Optical Character Recognition (hOCR) is an HTML/XHTML-based format describing recognised text and its location on an image. It is used by open source OCR engines such as Tesseract.
Text Encoding Initiative (TEI) is an XML format used to encode text in detail. It is often used for digital editions.
Comma-Separated Values (CSV) is a plain-text format for tabular data in which each line is a record and the fields within a record are separated by commas.
eXtensible Markup Language (XML) is a markup language much like HTML, except that its tags are not predefined: each XML-based format defines its own elements and attributes.
MPEG-21 is a standard from the Moving Picture Experts Group (MPEG). Its XML-based Digital Item Declaration Language (DIDL) is often used to describe the structure of a digital object.
Metadata Encoding and Transmission Standard (METS) is an XML format which describes the structure of a digital object. It is often used in combination with ALTO (see above).
Functional Requirements for Bibliographic Records (FRBR) is a conceptual model developed by the International Federation of Library Associations and Institutions (IFLA). It takes a user-centred perspective, focusing on the tasks of retrieval and access in online library catalogues.
BIBFRAME (Bibliographic Framework) was initiated by the Library of Congress to replace the MARC standards and to adopt linked data principles.
Resource Description and Access (RDA) is a package of data elements, guidelines, and instructions for creating library and cultural heritage resource metadata that are well-formed according to international models for user-focused linked data applications.
Bibliographic Ontology (BIBO) provides the main concepts and properties for describing citations and bibliographic references (e.g. quotes, books, articles) on the Semantic Web.
Lightweight Information Describing Objects (LIDO) is an XML harvesting schema which supports a full range of descriptive information about museum objects.
Encoded Archival Description (EAD) is an XML standard for encoding archival finding aids.
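To make two of the entries above more concrete, the following minimal sketch reads CSV records and extracts words with their positions from an ALTO-like document, using only the Python standard library. The sample data, the function names `read_csv_records` and `alto_words`, and the choice of the ALTO version 4 namespace are assumptions made for illustration; the XML fragment is hand-written and heavily simplified, not a complete or validated ALTO file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV data: a header row followed by one record per line.
CSV_TEXT = "title,year\nMoby-Dick,1851\nDracula,1897\n"


def read_csv_records(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Assumed namespace (ALTO version 4); real files may use another version.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v4#"}

# Hand-written, simplified ALTO-like sample for illustration only.
ALTO_TEXT = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Hello" HPOS="10" VPOS="20" WIDTH="50" HEIGHT="12"/>
            <String CONTENT="world" HPOS="65" VPOS="20" WIDTH="48" HEIGHT="12"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def alto_words(xml_text):
    """Return (text, horizontal position, vertical position) for each String element."""
    root = ET.fromstring(xml_text)
    return [
        (s.get("CONTENT"), float(s.get("HPOS")), float(s.get("VPOS")))
        for s in root.iterfind(".//alto:String", ALTO_NS)
    ]


if __name__ == "__main__":
    for record in read_csv_records(CSV_TEXT):
        print(record["title"], record["year"])
    for word, x, y in alto_words(ALTO_TEXT):
        print(word, x, y)
```

Note that CSV carries no type information, so every field arrives as a string, whereas the XML attributes give each recognised word an explicit position on the page image.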
Cultural Heritage Metadata
Europeana Data Model (EDM) is the formal specification of the classes and properties that can be used in Europeana, the EU digital platform for cultural heritage.
CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.