About Semantic CorA
The following paragraphs outline the research context and give an overview of the project as a whole.
The Project
The project “A Virtual Research Environment for the History of Education based on a Semantic Wiki Technology (Semantic MediaWiki for Collaborative Corpora Analysis: Semantic CorA)” aims to develop a virtual research environment (VRE) based on Semantic MediaWiki (SMW) for the collaborative analysis of large digitised text corpora, and to embed it, as an exemplary case, sustainably in the professional community of researchers in the history of education. The project also aims to make the researchers' enrichment and analysis work available for further use and, in the long term, to distribute the VRE (Semantic CorA) as infrastructure to other disciplines, accompanied by community building.
Owing to its concrete need for collaborative instruments for analysing pedagogical reference books, the history of education is a good starting point for an exemplary implementation of a virtual research environment. Furthermore, sound cooperation among researchers, librarians and technicians is well established in this community and has resulted, for instance, in several digitisation projects. The VRE allows digitised documents to be integrated along with their bibliographic metadata and analysed collaboratively, both quantitatively and qualitatively, thus connecting the idea of Linked Data with practical research in the humanities. This eHumanities project will enable libraries to feed the results of their digitisation projects into professional discourse as primary data and to generate added scientific value in the research value chain through a direct semantic connection between digitised records and analytic results, as well as enabling integrated archiving.
The exemplary implementation of the VRE ties in with a concrete research project in the history of education, aimed at discourse and field analyses of pedagogical reference works dating from 1774 to 1942. The project will focus on three levels of analysis (dictionaries, lemmata, texts), investigating their content, scope of discourse and argument. The VRE will therefore integrate the dictionaries from Scripta Paedagogica Online (SPO), hosted by the Library for the History of Education at the German Institute for International Educational Research (DIPF), which already indexes relevant pedagogical reference works at the level of articles and makes them accessible online as image files. The corpus comprises nearly 22,000 articles.
The VRE is to be developed iteratively and in step with the research process, using agile methods, step-by-step open-source publishing, requirement and demand analyses, and participative design, in which the participants are consulted and empowered to take an active part in designing the VRE.
Motivation
The main focus of the project is on establishing a VRE for the domain of historical research in education. Collaborative maintenance and analysis of research data should be possible, with special attention paid to the re-use of research data at the beginning and the end of the research process. Semantic CorA therefore relies on Semantic MediaWiki, which ensures a certain degree of interoperability for the newly gained data through its RDF export feature (and various other export formats such as CSV and JSON). Our project aims to connect three communities:
- researchers in the humanities (in our case from the domain of historical research in education, University of Göttingen)
- developers of VREs (Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology (KIT), and Information Center for Education at the German Institute for International Educational Research (DIPF))
- digital libraries (Research Library for the History of Education (BBF/DIPF) with its online archive "Scripta Paedagogica Online (SPO)")
Semantic MediaWiki
Semantic MediaWiki was chosen because it is a lightweight system with a broad community. As its development is open source, the results of our project, in the form of extensions and forms, can easily be reused and adapted by others. As Semantic CorA does not aim to develop a (technically) new VRE, the modular system architecture of MediaWiki (or rather Semantic MediaWiki) offers a basis that can be adapted to the actual needs. Semantic CorA aims at the management and analysis of large corpora but clearly does not claim to be a large-scale solution for all research fields in the humanities, as TextGrid, for example, does. For this reason the RDF support of Semantic MediaWiki, with its promise of interoperability, is a fundamental criterion: it makes it possible to reuse the new data in other RDF-based systems. This interoperability at the data level makes it possible to think in "smaller scales" in the context of VREs.
Furthermore, wikis are well known as tools for collaborative work on the web. Although there are some usability problems, for example with the wiki syntax, choosing a wiki system means building on software that users already know. Even though editorial tasks in a wiki are far less trivial for non-technical users than is often assumed, a basic familiarity with wiki systems can be taken for granted.
As another positive effect in Semantic CorA, we observed that after some time the users were able to construct their own queries and templates in order to gain more information from their data and interact with it more flexibly. This development is clearly due to the openness of the wiki system, which offers great flexibility, especially compared with most out-of-the-box desktop environments, which offer a fixed range of possibilities for analysing data. It also made possible a more qualitative style of development, in which requirements for the system were defined step by step through observation and collaboration.
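To give an impression of what such a user-defined query looks like from outside the wiki, here is a minimal sketch that builds a request URL for the `action=ask` API module of Semantic MediaWiki. The wiki address, the category `Lemma` and the property `Has dictionary` are hypothetical placeholders and are not taken from the actual Semantic CorA data model; inside the wiki, users would typically write the same conditions and printouts in an inline `{{#ask: ...}}` query.

```python
from urllib.parse import urlencode


def build_ask_url(api_base, conditions, printouts, limit=50):
    """Build a URL for Semantic MediaWiki's `action=ask` API module.

    conditions: ask conditions such as "[[Category:Lemma]]" (hypothetical);
                several conditions are concatenated and combined with AND.
    printouts:  property names to return, such as "Has dictionary" (hypothetical).
    """
    # Conditions come first, then "|"-separated printouts ("?Property")
    # and query parameters such as "limit=...".
    query_parts = ["".join(conditions)]
    query_parts += [f"?{name}" for name in printouts]
    query_parts.append(f"limit={limit}")

    params = {"action": "ask", "query": "|".join(query_parts), "format": "json"}
    # urlencode percent-encodes brackets, pipes and spaces for use in a URL.
    return f"{api_base}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical wiki address; replace with the api.php of a real SMW installation.
    print(build_ask_url(
        "https://wiki.example.org/api.php",
        ["[[Category:Lemma]]"],
        ["Has dictionary"],
        limit=10,
    ))
```

The sketch only assembles the URL and does not send a request, so it runs without network access; fetching the URL from a real Semantic MediaWiki installation would return the matching pages and their property values as JSON.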
Funding
This project is funded by the German Research Foundation (DFG) under the title "Entwicklung einer Virtuellen Forschungsumgebung für die Historische Bildungsforschung mit Semantischer Wiki-Technologie - Semantic MediaWiki for Collaborative Corpora Analysis (INST 367/5-1, INST 5580/1-1)" in the domain of Scientific Library Services and Information Systems (LIS). It is carried out in cooperation between the German Institute for International Educational Research (DIPF), the Karlsruhe Institute of Technology (KIT), the Library for Research on Educational History (BBF), and researchers in the history of education, mainly from the Georg-August-University Göttingen. We are grateful to Rudi Studer, Denny Vrandecic, Elena Simperl, Cornelia Veja, Klaus-Peter Horn, Anne Hild, Anna Stisser, Benedikt Kämpgen, Martin Wünsch, Sabine Liebmann, Stefan Cramme, and Gwen Schulte for actively supporting our endeavour.