Technical WG - Ontologies
======= The official draft is of the ‘Modeling Epigraphy with an Ontology’ document is stored in Zenodo at doi:10.5281/zenodo.4639507
Modeling Epigraphy with an Ontology
Contributors (alphabetically ordered by surname):1
Gabriel Bodard, Hugh Cayless, Chiara Cenati2, Alison Cooley, Tom Elliott, Silvia Evangelisti, Achille Felicetti, Paula Granados, Frank Grieshaber, Ethan Gruber, Aaron Hershkowitz3, Tim Hill, Harri Kiiskinen4, Thomas Kollatz, Adeline Levivier, Pietro Liuzzo5, Franco Luciani, Andrea Mannocci, Emilia Mataix, Francesca Murano, Orla Murphy, Elli Mylonas, Jonathan Prag, Vincent Razanajao, Simona Stoyanova, Georgios Tsolakis6,Charlotte Tupman, Irene Vagionakis, Valeria Vitale, Franziska Weise.
Table of Contents
This document describes how to use existing ontologies as well as a few new properties and classes, to model epigraphic objects. Inscriptions that are described using this model can be shared, ingested into an ontological database, and used in graph databases or to share linked open data. Describing an inscription in this way complements existing practices as described by EpiDoc. 2
This document describes how to use existing ontologies as well as a few new properties and classes, to model epigraphic objects. Inscriptions that are described using this model can be shared, ingested into an ontological database, and used in graph databases or to share linked open data. Describing an inscription in this way complements existing practices as described by EpiDoc.7
Prefixes used in the following examples are all listed here for convenience. The empty namespace is the area of definition of local entities.
@prefix : <> .
@prefix tm: <https://www.trismegistos.org/text/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix nmo: <http://nomisma.org/ontology#> .
@prefix lawd: <http://lawd.info/ontology/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix gn: <http://www.geonames.org/ontology#> .
@prefix gndo: <http://d-nb.info/standards/elementset/gnd#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix edh: <http://edh-www.adw.uni-heidelberg.de/edh/ontology#> .
@prefix epnet: <http://www.semanticweb.org/ontologies/2015/1/EPNet-ONTOP_Ontology#> .
@prefix edm: <http://www.europeana.eu/schemas/edm/> .
@prefix svcs: <http://rdfs.org/sioc/services#> .
@prefix doap: <http://usefulinc.com/ns/doap#> .
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix crmtex: <http://www.cidoc-crm.org/crmtex/> .
@prefix crmsci: <http://www.cidoc-crm.org/crmsci/> .
@prefix extech: <https://w3id.org/executionTechnique/ontology#> .
@prefix eagle: <https://www.eaglenetwork.eu/voc/writing/lod/> .
@prefix eagleExTech: <https://www.eagle-network.eu/voc/writing/lod/> .
@prefix eagleObjTyp: <https://www.eagle-network.eu/voc/objtyp/lod/> .
@prefix eagleInsTyp: <https://www.eagle-network.eu/voc/typeins/lod/> .
@prefix eagleMat: <https://www.eagle-network.eu/voc/material/lod/> .
@prefix epont: <https://w3id.org/epont/ontology/> .
Added features of the Epigraphic Ontology
The namespace of the ontology is https://w3id.org/epont/ontology/ . The OWL file is available here https://github.com/PietroLiuzzo/epiont/blob/master/OWL/epont.owl ; it imports the following implementation of the CRMtex https://github.com/PietroLiuzzo/epiont/blob/master/OWL/crmtex.owl
Zenodo Storage of these two OWL files: https://zenodo.org/record/4444132#.YAISa8VKjRY
This subclass of lawd:Edition is defined because the edition of an inscription is often an item in a corpus of inscriptions with a numeric identifier, or is published within a specific article which may or may not solely be devoted to that edition. This is a different concept from that of a critical edition or an edition of a book (for example).
The Property epOnt:carriesText, is defined as the key Property to link objects and written texts, and is a subclass of crm:P56_bears_feature (is found on) instead of the crm:P128_carries as proposed in EPNet and EAGLE.
This Property is a subclass of crm:P94_has_created and is used to connect an Activity of TX6 Transcription to the Edition of that transcription published somewhere.
The following is a summary depiction of the main entities involved in the model using generically understandable terms. In a nutshell, it takes the Inscription as studied by epigraphers as a pair of closely related but distinct entities, an Object and a Written Text. It is neither necessary nor desirable to prioritize either Text or Object, but the two must be distinguished in order to be accurately described. The Written Text and its fragments have been written in a series of execution phases to reproduce a Text. The scholar who observes the written text and the object may produce a transcription which may become part of an Edition and which may then be translated. In this visual model no specific, formal, concept name is used; instead simplified labels are used to make the model clearer. The formal class and Property names appear in the examples and in the description.
On finding a new inscription, its existence could be defined by the simple three instances and two relations seen above. However, because we are pretty sure it was written, the minimal set should actually have one more Property and one more entity.
The Monument / Object
The Object is a E22 Human-Made Object which is a subclass of E19 Physical Object which is proposed in CRMtex. Not listed in the above model are all the more standard properties which can be attached to a E22 Human-Made Object, like crm:P48_has_preferred_identifier or crm:P104_is_subject_to (for the rights) and which will also be omitted in the rest of the model examples. Similarly, the subproperties of P4 has time span can always be assigned to each of these objects. In this example, dcterms:temporal is used to link to a period entity, as in the Pelagios data examples. This serves as an additional piece of information, distinct from P4 has time span properties.
The Object carries a Written Text (see The inscription below). The Property epOnt:carriesText is defined as the key Property to link objects and written texts, and is a subclass of crm:P56_bears_feature (is found on). It is defined as a subProperty in order to distinguish it from TXP2_is_included_within (included), which is also a subProperty of crm:P56_bears_feature (is found on). While epOnt:carriesText relates a E22 Human-Made Object to a TX1 Written Text, TXP2_is_included_within (included) relates a TX1 Written Text to a TX4 Writing Field.
The Property for the classification of monument or object type can also be attached to the object, and here the one proposed by EDH is reused and has been further specified as a subtype of nmo:hasObjectType. The range of the Property is a concept in the EAGLE Vocabulary. See below The Concepts in the EAGLE Vocabularies.
nmo:hasMaterial, which is declared in the ontology as a subProperty of P45 consist of is used in the same way as edh:representsTypeOfMonument to link to a concept defining a material (E57 Material).
Dimensions of the object can be given using P43 has dimension and features of the decoration can also be added in pure CIDOC, as in the EAGLE or EPNet models.
Several places can be specified by using the relevant properties defined by LAWD and linking to place entities defined locally or directly aligned to an available gazetteer, e.g. Pleiades or Trismegistos or any other relevant and open gazetteer (See below Places). While in this model lawd:foundAt and lawd:origin are used, the super-Property lawd:where could also be used if neither of the previous two can be specified. Only lawd:foundAt and not the super-Property lawd:where is equated to similar properties in other ontologies (nmo:hasFindspot and epnet:hasFindingPlace). crm:P53 has former or current location is also added to the range of subproperties of lawd:where, and includes the more specific subProperty used in the model, P55 has current location.
The Property proposed by the epnet ontology to relate bibliographic information to an entity in a generic way (epnet:isDocumentedBy) is also used here as an example,8 to link the graph of this object to any list of identifiable bibliographic resources, distinct from any list of the editions of the text (see below The Edition), although potentially overlapping. The modeling of these bibliographic resources, if not directly pointing to standard RDF representations, should follow the FABIO9 or FRBR models.
The Object inherits from the CRM implementation the core feature of modeling parts with P46 is composed of. To the example model above two further objects could be added as parts of the original object and modelled in the same way as that object, with or without a link to a Written Text.
Let us trace such an example now. Below we have two theorized fragments, separately described, transcribed and published in different places . Each is modelled independently.
Upon discovery that the two or more fragments belong to the same inscription, they could be joined by simply adding this statement to a further object entity (denoting the united object) or to one of the two.
The Object is the only entity which can be possibly represented by an image or drawing (an abstraction of a text cannot be), and remains true also for objects which are drawings and sketches from field notes, for example. The images are linked with a standard CRM Property and defined as in the following section (Images).
Squeezes, Casts, 3D replicas and other types of reproductions are Objects as much as the original object or Text of which they are a copy.. It is sometimes the case that the only extant witness of a text is such a copy. A printed or digital image, if it is the only witness to an Object, is as such also an Object, and can be related to the abstract entity identifying the lost original. It is thus possible to clearly model whether a reading is based on the original object or on a copy. It is also possible to document the existence of such reproductions independently of whether they have been used for transcription or editing.
At the same time, there must be a model for the relationships between the Objects. CIDOC E-55_type can be used to provide a list of types for this purpose.
In the above example two Reproduction activities of the creation of a Squeeze and a Cast are defined and linked to an Object via P12 occurred in the presence of.10 The creation of a reproduction is a Production (E12 Production) which can be dated and can have an Actor, like any other activity, while the Object it produces is an Object like any other and can have the same properties as the Object on which it is based. Many of the properties listed under E12 Production in the CIDOC CRM documentation could also be used here, to document the facsimile production. The Cast and Squeeze defined in the example above are also objects which carry a written text. In the case of a lost inscription, for which only a Squeeze survives we could have the following modeling, although nothing would prevent the epOnt:carriesText Property from remaining attached to the ‘original’ object. A squeeze or photo of a plaster cast is also definable this way, and each object can have its own properties regarding material, dating, etc. The special status of the object used for reproductions is thus not defining.
Additionally, as for images, P138 represents (has representation) can be used to link digital representations of any of these E22_Human-Made_Object, so that one can have, for example, a 3D model of a squeeze as well as a 3D model of the object on which that squeeze was made.
Similarly, E55 type can be used to mark the best among many available objects related in this way. This allows us to precisely say if, for example, a reading has been made on a source which is not the best available. (Note that this kind of classification will necessarily be subjective.)
These types are not defined in this ontology and depend on Vocabularies locally defined by data providers; they are important here only for their relation to one another.
This model is entirely taken from Nomisma documentation, which was also expanded to include the way the EDM11 model represents links to IIIF images.
The modeling of the place concepts associated with an object or that appear in attestations is not within the scope of an epigraphic ontology. The example below is a light modification of the current model used by EDH, where the most important things are the class lawd:Place (equivalent to E27_Site) and the owl:sameAs. As in Pelagios, SKOS properties can be used for matching and hierarchy.
The Property dcterms:isPartOf is used instead of skos:broader as in the EDH data. The same place model can be used to model any place concept used in the information exposed.
The Inscription Text
In CRMtex, the class TX1 Written Text is the central point of definition of the inscription and is here equivalent to a lawd:ConceptualWork because it shares the same logic of detachment of the conceptual level from the text as reading, transcription or edition. The class epnet:Inscription is also aligned to this concept. Note that the edh:Inscription class is aligned to TX6 Transcription instead.
For completeness, P56 is found on is repeated here although redundant. This is also to show that a provider of inscriptions as archived text could provide this information without offering any information about the object itself, leaving that part to others.
Although the date of the text depends on its interpretation and thus on the Reading (see The reading), here an example is offered of how the nomisma properties nmo:hasStartDate and nmo:hasEndDate could be used to specify a date for the abstraction of the written text different from that of any writing / production or execution phase. These properties are both declared as subclasses of P4 has time-span in this ontology.
The TX1 Written Text is the domain of the Property crmtex:TXP2 is included within, which allows for the separate description of the TX4 Writing Field (The writing field), as in the documentation of CRMtex. In CRMtex a TX1 Written Text is linked to a class TX2 Writing by a specific Property TXP5 was written by, subclass of P108 was produced by, introduced to specify the particular relationship between the written text and the event that produced it.TX1 Written Text is also linked to a TX5 Reading with a Property introduced to specify the relationship between the written text and its reading: the TXP9 was read by, subclass of O6 observed by from the CRMsci module (see The Reading). This is crucial to distinguish the text production from the investigation which leads to a transcription. It is necessary to abstract the text separately from its production aspect and transcription aspect to be able to associate the information precisely. This is an extremely important contribution from this model, which inherently allows the association of distinct and identifiable autopsies to a single text. It is thus possible to model the fact that a specific transcription is based on a specific autopsy event or instead is based on a previous edition, or on another Observation event. This is a major improvement possibility for the study of epigraphy, opening the path for attempts to disentangle the genesis of editions and publications.
To the TX4 Writing events in this example a series of other Events are also linked using the extech:hasExecutionPhase Property, equivalent to P31 has modified. This allows the use of an event-based description structure for any TX1 Written Text.
The writing field
The CRMtex class TX4 Writing field has already been discussed, and this model inherits that description as is. Like TX1 Written Text this is always a subclass of E25 Man-made Feature and shares all its features. An updated example based on the Arc of Constantine is in Felicetti and Murano 2020.13
Another important feature of CRMtex is the class TX7 Written Text Fragment, which provides a way to be able to model fragments of text separately from parts of objects.
The Property TXP4 composes provides the ability to have a separate model for each independent fragment and easily join the history of observation and publication of multiple fragments into one. Being able to identify a part of text would enable (for example) associating it with other parts of the same written text and easily modeling inscriptions in multiple languages or in multiple scripts (or both) by having a conceptualization for each part. Each Written Text or Written Text Fragment will have a crm:P108has_produced(was_produced_by) linked to a TX2 Writing event.
TX2 Writing is a subclass of E12 Production and can have identifiers and time related declarations associated with it. It can also make use of crm:P120_occurs_before to link to any other E2 Temporal entity. This Property scope note says that
a temporal gap exists between the end of A and the start of B. This Property is only necessary if the relevant time spans are unknown (otherwise the relationship can be calculated).
This is extremely important for chronologically placing one entity relative to another without making vague assertions about time-spans if such are not possible with a decent level of accuracy.
The example for this model focuses on two features which are relevant to this entity’s class: the association of a Script and of a Linguistic Object. The latter is an abstract written work (a lawd:WrittenWork or any of its subclasses) which embodies the TX1 or TX7 to which it is associated.
The E33 Linguistic Object can then be used to join different entities such as any E33, using for example properties of the SAWS ontology as in the example.14 This linguistic object is also usefully different from any other linguistic object resulting from reading (observation) so that the distinction between the text of the inscription and the transcription of the text of the inscription (diplomatic, editorial, etc.) is guaranteed. This linguistic Object does not have a literal text associated with it in this model, because the only possible literal text is a result of an observation and a transcription, as exemplified in the CRMtex document.
Phases of Execution
While writing is a specific sort of Production activity for a Written Text, many execution phases may be additionally involved, which are here classified using a draft ontology proposal for execution technique, which defines a extech:Phase as equivalent to a E11 Modification (so that TX2 Writing is also an extech:Phase in the ontology)
This relies on several specific types of execution techniques and tools used to accomplish them which have been defined and organized in the Eagle Vocabulary for execution technique (see Concepts in the EAGLE Vocabularies) of inscriptions, in order to be able to specify better and avoid confusions between tool and technique or result and tool, for example.
The reading of an inscription is a scientific observation, and is defined in CRMTex as a subclass of S4 Observation. It can have all properties of other Temporal Entities. Here only the ones most salient to epigraphic research are used in the example. Note also that the person possibly related to the autopsy by P14 carried out by (performed) need not be the actual author of the transcription or of the edition, although this will be often the case. To this event further annotations, for example related to the phenomenology of reading, could also be added.15
These entities are those which are more likely to be used in a DTS API presentation.
This is a different meaning of “reading” from what is intended as a reading e.g. in the TEI and EpiDoc guidelines for the element <rdg>.16 The scientific observation which is subclassed in TX5 Reading is one action of reading in its entirety,17 which may or may not result in a known transcription or an edition, which in turn will contain “readings” of specific pieces of the text.
The TX6 Transcription, product of a Reading (here called Autopsy could also in theory be another Activity such as reading an edition). This is the entity which most closely resembles the contents of a record in a digital epigraphy archive when viewed specifically from the landing page of an inscription. In this entity we have a series of standard dc statements including the link to a landing page, an identifier, and temporal properties. This entity is, however, still distinct from the Epigraphic Edition (edition) which it may produce and which is related to it with a special Property epOnt:hasEdition, defined as a subclass of P94 created. This allows for a text transcription to be available even before it is published, and to be easily distinguished from published transcriptions, whatever the format of publication. The transcription can take different forms and the EpNet ontology has defined a range of properties to associate to the results of this activity, including a Property to associate an EpiDoc transcription in XML. Only to this transcribed (and thus interpreted) text can a typology for a text (or more) be associated with the edh:representsTypeOfInscription, one of the concepts defined in the EAGLE Vocabularies (see The Concepts in the EAGLE Vocabularies).
Other properties such as edh:hasWorkstatus could be related to this entity, and in particular attestations (Attestations), which will be related to a specific Transcription or Edition if available.
The Bibliographical Reference associated with an Epigraphic Edition need not be specified in full, as bibliographic information modeling is beyond the scope of this model, but an edited text of the inscription could be modelled here, instead of the transcription text, to distinguish the two.
The edition of an inscription is often a part, with a numeric identifier, of a corpus of inscriptions, or is published within a specific article which may or may not solely be devoted to that edition. This is a different concept from that of a critical edition or of an edition of a book and has been specified as a subclass of lawd:Edition.
One autopsy and consequent transcription can create several editions, which is an additional added potential of the model. Each Edition can be associated with a translation.
Any attestation is modelled as required for NE attestations by Pelagios/SNAP/others attestation model in EPNet, etc. and can be referred exactly to an edition, a transcription or a translation.
This section is work in progress, it will be included in future releases. It will include modeling of different kind of sameness (same edition, same inscription, etc.).
Concepts in the EAGLE Vocabularies
The EAGLE Vocabularies are used and linked to, but are far from being an established resource even for epigraphic projects which have started digitally. IIP for example uses Getty ATT for Object typologies. The Getty ATT is however very wide, and the EAGLE Vocabularies may still be a better and simpler proxy for epigraphers. However, there is no limitation or boundary here, and other vocabularies can be used or aligned to the EAGLE vocabularies.
This approach, with separate vocabularies would have the advantage of being done once for everyone for non problematic cases, while one could still jump to another value or point to a different authority if unsatisfied.
To complete the above examples, here are some concepts and execution phases using them.
Persons involved in any stage and at any defined level can be modelled on the example of what EDH already does, using mainly FOAF properties. The following is a result of a DESCRIBE Query to the SPARQL endpoint, as an example.
Reification (about that triple…)
The issue of attribution of each RDF statement is solved with a process of adding triples which speak about triples, called reification. However, this risks to be never-ending, as one may start to wish for triples which speak about triples about other triples. Nomisma.org (example below) does not use reification for every single triple, but only where needed, and the same approach is used here with Contributors, Editors, etc.18 Nothing prevents anyone from implementing this however there is no need for prescription in this document.
The Many Missing Parts
Many other aspects of information are not covered here and are either
- Already covered by one of the imported ontologies
- Still in need of modeling of any type or vocabularies/definitions
- Covered by other ontologies not used here. For example, the Reading concept may be expanded upon using Classes and properties of the READ-IT Ontology.
The modularity of RDF allows this to happen as needed.
Summary of Classes and Property
Summary of Classes and Property and their “common denomination” which may eventually become a class name mapped to the reference entity. This is only a list of what is contained in this modeling document, the many ways in which this data can be expanded are not listed or prescribed, but are listed where known.
|Class or Property||Current mapped name||Epont proposed name (EN)|
|Property||crmtex:TXP9_was_read_by||Was read by|
|Property||crmtex:TXP5_was_written_by||Was written by|
|Property||dcterms:temporal||Time range (period)|
|Property||crm:P46_is_composed_of||Is composed of Object|
|Property||crmtex:TXP2_is_included_within||Is included within|
|Property||crmtex:TXP5_was_written_by||Was written by|
|Property||extech:hasExecutionPhase||Has execution phase|
|Property||crmtex:TXP1_used_writing_system||Used writing system|
|Property||crm:P92_brought_into_existence||Brought into existence|
|Class||extech:Phase||Phase of execution|
|Property||crm:P20_had_specific_purpose||Had specific purpose|
|Property||crmtex:TXP3_is_rendered_by||Is rendered by|
|Property||crm:P14_carried_out_by||Carried out by|
|Property||dct:isPartOf||Is part of|
|Property||edh:representsTypeOfInscription||Type of inscription|
Dept. of Cultural History, University of Turku. ↩
Universität Hamburg, Beta maṣāḥǝft: Manuscripts of Ethiopia and Eritrea (Schriftkultur des christlichen Äthiopiens und Eritreas: eine multimediale Forschungsumgebung). Contributor of the initial complete draft and structure. ↩
Institute for the Study of the Ancient World, New York University. ↩
CITO properties could also be used for this, to differentiate the different sources. This is the praxis for example in the Pleiades dataset. ↩
(Peroni and Shotton 2012) ↩
See CIDOC-CRM v.7 page xxix ↩
Although Pleiades does not serve a SPARQL Endpoint, the data is all freely available and can be loaded locally to a triplestore for querying as RDF. ↩
See the READ Ontology https://readit-project.eu/. ↩
Single reading within a textual variation. ↩
See CRMtex, “refers to the semiotic procedure of decoding (and therefore understanding) a written text. This procedure can be carried out for scientific purposes, in order to analyse and study the text according to different disciplinary perspectives. The reading activity is thus intended as a specific observation (S4) in which the decoding of the signs is performed, i.e., the linguistic value is recognised and the message is understood.” ↩
https://numishare.blogspot.com/2020/12/roman-republican-die-project-nomisma.htmlaspointed out by Ethan Gruber. Another approach is that of KNORA and Gravsearch ↩