Subprojects

Additional Information

Title Metadataproject
Project coordinator ir. Daan Broeder (Max Planck Instituut)
Budget To be determined
Abstract In the CLARIN EU project a design was made for the so-called CLARIN Metadata Infrastructure (CMDI), which should create a single interoperable domain of metadata descriptions for all resources housed at CLARIN centres. The design is based on the concept of component metadata: bundles of metadata elements that describe related aspects of a resource. For semantic interoperability, every metadata element must be linked to a concept in the ISOcat Data Category Registry. Resource providers may reuse existing components or create their own, and gather these into a metadata schema that they think best describes a specific resource type.
Using a preliminary, rough implementation of an XML schema and stylesheet toolkit, the CLARIN NL metadata project will (1) test the viability of this approach by describing the resources currently housed at two prospective CLARIN centres, INL and the Meertens Institute; (2) provide a ready set of metadata components that can be used to seed the future CLARIN EU metadata component registry; and (3) provide best-practice guidance for future users of the CMDI.

Title Infrastructuurimplementatieplan
Project coordinator ir. Daan Broeder (Max Planck Instituut)
Budget To be determined
Abstract According to John Taylor, an eScience scenario in which researchers make use of advanced features of the Internet can only be realized with a new type of infrastructure. This new infrastructure should offer highly available and robust services, so that researchers can step away from the currently dominant download-first attitude. We can achieve this by offering a network of centres that provide these services and take care of data persistence and curation; by developing a common and flexible framework of high-quality metadata that allows easy discovery; by setting up a federation based on single sign-on and single identity principles, so that researchers can easily navigate this domain and create virtual collections; and by providing a joint framework for web services that allows researchers to easily include their smart algorithms and orchestrate them. All of this needs to be based on standards and best practices to achieve a high degree of integration of, and interoperability between, all types of language resources and tools. In this subproject we intend to implement (a first version of) such an infrastructure with MPI, INL, Meertens and DANS as initial centres.

Title Geleerdenbrievenproject (CKCC)
Project coordinator prof. dr. W.W. Mijnhardt (Universiteit Utrecht, Descartes Institute)
Budget To be determined
Abstract A consortium of Dutch universities and cultural heritage institutions is building a web-based collaboratory (an online space for asynchronous collaboration) around a corpus of 20,000 letters by scholars who lived in the 17th-century Dutch Republic, in order to answer the research question: how did knowledge circulate in the 17th century? To this end, this large body of correspondence must be analyzed systematically. Based on this (extendable) corpus, we will implement a content processing workflow that consists of iterative cycles of conceptual analysis, enrichment with several layers of annotation, and visualization.
In the first stage of the project, with advice from CLARIN-EU, a demonstrator will be developed that implements keyword extraction techniques (deadline: 1 October 2010). The second stage consists of evaluating existing, more complex tools and techniques that can tackle one or more aspects of the targeted grammatical, content-related, and network complexity analysis, annotation, and visualization. This stage shall identify a set of tools that can be readily used in CKCC, as well as tools that need to be adapted or extended to the needs of CKCC; in short, by the end of this stage the resources, requirements and risks shall be clear (deadline: December 2010).
In the third stage the collaboratory will be further developed according to the description in the CKCC project goals, centering on the technique of concept extraction (deadline: 1 November 2012).
These three stages constitute the Work Package Analysis Tools, the core of the CKCC project, for which the support of CLARIN-NL is requested. Other Work Packages provide data and software tools needed to create a complete system: the digital corpus of letters (WP6), the editing collaboratory that will contain the letters (WP1), and the archiving environment for data and software (WP2).

Title TST Tools voor het Nederlands als Webservices in een Workflow (TTNWW)
Project coordinators Marc Kemps-Snijders (Meertens Instituut)
Ineke Schuurman (coordinator, CLARIN-Vlaanderen)
Budget To be determined
Abstract The aim of the project described here is to integrate a range of existing components developed in CGN and STEVIN, among others, into a workflow system for web services that is being developed within CLARIN (with substantial Dutch input, e.g. from MPI), and to run the whole on servers of recognized CLARIN centres, in order to offer facilities to HSS researchers with little or no technical background. These facilities should (1) enable them to address their research questions better or more easily and (2) make it possible to formulate new types of research questions, i.e. questions that could not be asked before CLARIN or could not be answered efficiently. The web services cover two modalities: text and speech.

Title Search and Develop (S&D)
Project coordinator Hans Bennis (Meertens Instituut)
Budget To be determined
Abstract This proposal describes a three-year program to develop the functionality needed for generic search over metadata and content. Such a project is also ideal for achieving the required national cooperation between CLARIN-NL centres, since it involves almost all the relevant infrastructural aspects. Moreover, it is in line with a proposal for a European project to develop a CLARIN European Demonstrator. In this project we intend to achieve three goals: (1) a generic search engine, (2) a national centre structure, and (3) a leading position in Europe.

Title User Survey
Project coordinator Arjan van Hessen (Universiteit Twente)
Budget To be determined
Abstract A baseline (zero) measurement.
