Opening up MOOCs for OER management on the Web of linked data

Dr Gilbert Paquette

Research Chair for Instructional and Cognitive Engineering (CICE)

  1. OER repositories first generation

The idea that educational content could be treated as “objects” to be reused in multiple contexts dates back to the late 1960s, but it only started to become a reality in the mid-1990s with the generalization of the Internet [1]. In 1995, an international consensus arose around the need for e-learning standards to promote tool interoperability and learning object reusability. The aim was to ensure the reuse of educational objects, which was jeopardized by the diversity of referencing metadata schemas around the world. This goal soon took concrete form in the Dublin Core (DC) metadata initiative [2], which proposed a first set of standardized metadata, expressed in XML. In June 2002, on the basis of a joint IMS Global and ARIADNE proposal, IEEE approved the LOM standard [3], which was widely accepted internationally. From then on, major resource repository initiatives bloomed rapidly: ARIADNE in Europe, MERLOT in the USA, EdNA in Australia. In Canada, the eduSource initiative networked the first LOM repositories from coast to coast. It was followed by our own LORNET research network, which joined the GLOBE [4] international consortium. GLOBE currently operates a large repository of nearly one million resources, mostly OERs.

  2. Second generation repositories

After a decade of research and practice in this field, a number of limitations still hinder a wider use of OER repositories. One of them is the heavy resource indexing load required by LOM application profiles. These profiles try to reduce the load to a limited set of metadata useful locally, at the expense of resource reusability across repositories.

The ISO/IEC 19788 standard [5], in short ISO-MLR, is intended to provide optimal compatibility with both DC and the LOM. It presents the following advantages.

  • Ensuring the coherence of metadata concepts by proposing an RDF-based data model.
  • Preventing the proliferation of non-interoperable application profiles.
  • Supporting the extension of description vocabularies while preserving interoperability.
  • Supporting multilingual and cultural adaptability requirements from a global perspective.
  • Integrating referencing and search with other data sets within the Web of linked data.

The fundamental point here is that ISO-MLR proceeds from a different vision than previous standards like IEEE-LOM, where resources are seen only as documents. ISO-MLR uses technologies like RDF and RDF Schema to integrate well into a Web of linked data, instead of simply a Web of documents. In the Web of linked data, the URLs that provide locations on the Web are generalized to URIs that can also represent people, real-world objects, or abstract concepts and properties. These entities and the values of their properties are linked together by declaring RDF triples. It then becomes possible to describe the meaning, the semantics, of Web pages beyond the syntax of natural languages and their inherent ambiguity. A Web of linked data enables computer agents to follow the links and perform more intelligent operations using the knowledge behind the words. To this end, the SPARQL RDF Query Language [6] enables queries within the huge graph of RDF triples [7] that constitutes the Web of linked data.
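To make the notion of linked triples concrete, here is a minimal sketch in plain Python. It is not an RDF library and not SPARQL itself: it only imitates triple-pattern matching, the basic operation a SPARQL engine performs. All URIs under `http://example.org/` and the property names are invented for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical URIs, invented for this illustration only.
EX = "http://example.org/"
TRIPLES: List[Triple] = [
    (EX + "resource/42", EX + "prop/title", "Introduction to RDF"),
    (EX + "resource/42", EX + "prop/creator", EX + "person/7"),
    (EX + "resource/43", EX + "prop/creator", EX + "person/7"),
    (EX + "person/7", EX + "prop/name", "A. Author"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) \
                and (o is None or t[2] == o):
            yield t


def resources_by_author_name(triples: List[Triple], name: str) -> List[str]:
    """Follow two links: name -> person URI -> resources created by that person."""
    found = []
    for person, _, _ in match(triples, p=EX + "prop/name", o=name):
        for resource, _, _ in match(triples, p=EX + "prop/creator", o=person):
            found.append(resource)
    return sorted(found)
```

Because the person is a node identified by a URI rather than a string repeated in each record, an agent can start from a name and reach every resource linked to that person, which is the kind of link-following described above.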

  3. COMÈTE, an RDF-based OER Manager

COMÈTE is a second generation learning resource repository manager based on the RDF approach. It allows locating, aggregating and retrieving the educational resources that constitute the heritage of an organization. Basically, it runs a triple store containing metadata triples about learning resources, on which users can perform queries to find and discover educational material that they can reuse for their various needs. Resources are integrated into a COMÈTE repository by processing metadata records from repositories that use Dublin Core, IEEE LOM, ISO-MLR and other application profiles. The result of this process is a homogeneous graph of data within COMÈTE’s internal metamodel, based on ISO-MLR-like schemas.
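The harvesting step described above can be sketched as a mapping from schema-specific fields to one common set of properties. The following is only an illustration under strong assumptions: the field names, the `http://example.org/common/` property URIs and the `record_to_triples` helper are hypothetical, and real DC or LOM records are far richer than these flat dictionaries.

```python
from typing import Dict, List, Set, Tuple, Union

Triple = Tuple[str, str, str]

# Hypothetical common properties and field mappings (not COMÈTE's metamodel).
COMMON = "http://example.org/common/"
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "dc": {
        "title": COMMON + "title",
        "creator": COMMON + "author",
        "language": COMMON + "language",
    },
    "lom": {
        "general.title": COMMON + "title",
        "lifecycle.contribute.entity": COMMON + "author",
        "general.language": COMMON + "language",
    },
}


def record_to_triples(resource_uri: str, schema: str,
                      record: Dict[str, Union[str, List[str]]]) -> Set[Triple]:
    """Convert one flat metadata record into triples using common properties."""
    mapping = FIELD_MAPS[schema]
    triples: Set[Triple] = set()
    for field, value in record.items():
        prop = mapping.get(field)
        if prop is None:
            continue  # unmapped fields are simply ignored in this sketch
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.add((resource_uri, prop, v))
    return triples
```

Two records that describe their title under different field names end up with the same predicate, so a single query can cover both sources.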

As a semantic network, the resulting RDF graph represents the entities as nodes. The main nodes are learning resources, persons and organizations, and SKOS vocabulary elements: concepts and properties. Using various techniques, the system maximizes the inner coherence of the graph. The Identity module implements the management of metadata about persons and organizations. This includes importing identities, resolving triples that represent the same person or organization into a single identity, making sure that identity stays unique, and completing it as new details become known. Manual merging of identities is also provided within a set of administrative tools for better control of data integrity.
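Identity resolution can be illustrated with a deliberately simple rule. The sketch below assumes that two records denote the same person when they share a normalized e-mail address; this matching key, the field names and the `resolve_identities` helper are assumptions made for the example, since the criteria actually used by the Identity module are not detailed here.

```python
from typing import Dict, List


def resolve_identities(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Merge records that share a normalized e-mail into a single identity.

    The first non-empty value seen for a field is kept; later records only
    complete the identity with fields that were still missing.
    """
    merged: Dict[str, Dict[str, str]] = {}
    for rec in records:
        key = rec["email"].strip().lower()
        identity = merged.setdefault(key, {"email": key})
        for field, value in rec.items():
            if field != "email" and value and field not in identity:
                identity[field] = value
    return list(merged.values())
```

For instance, a record carrying only a name and another carrying only an affiliation, both with the same e-mail, collapse into one identity holding both details.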

The Vocabulary module implements the management of vocabularies, thesauri and ontologies. This involves importing from VDEX or SKOS formats, unambiguously identifying the vocabulary that a term comes from, finding a computer-readable representation of the whole vocabulary, transparently converting from one format to another, replacing a vocabulary when updates are available, publishing vocabularies automatically, and providing user interface elements reusable by other modules, including for queries to the repository. This module also manages correspondences between taxonomies. Indeed, SKOS concept alignment between different ontologies (or vocabularies) can be taken into account by the query engine. A useful example of alignment is the mapping between the school-level taxonomies of different countries, which promotes the interoperability of resources between national repositories. For instance, we can search for resources whose target audience is Junior High School in the United States, and the results may contain pertinent Secondary School I-III tutorials produced in Québec.
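The school-level example can be sketched as a query expansion step: before searching, the requested concept is widened with the concepts aligned to it. The vocabulary identifiers (`us-levels`, `qc-levels`), the sample resources and the one-directional alignment table below are hypothetical; an actual implementation would read SKOS mapping relations from the triple store instead.

```python
from typing import Dict, List, Set, Tuple

Concept = Tuple[str, str]  # (vocabulary identifier, term)

# Hypothetical alignment table standing in for SKOS mapping relations.
ALIGNMENTS: Dict[Concept, Set[Concept]] = {
    ("us-levels", "Junior High School"): {
        ("qc-levels", "Secondary I"),
        ("qc-levels", "Secondary II"),
        ("qc-levels", "Secondary III"),
    },
}

# Invented sample resources, each tagged with one target audience.
RESOURCES = [
    {"title": "Fractions tutorial", "audience": ("qc-levels", "Secondary II")},
    {"title": "Algebra warm-up", "audience": ("us-levels", "Junior High School")},
    {"title": "Calculus primer", "audience": ("qc-levels", "College")},
]


def expand(concept: Concept) -> Set[Concept]:
    """Return the concept together with every concept aligned to it."""
    return {concept} | ALIGNMENTS.get(concept, set())


def search_by_audience(resources: List[dict], concept: Concept) -> List[str]:
    """Titles of resources whose audience is the concept or an aligned one."""
    accepted = expand(concept)
    return [r["title"] for r in resources if r["audience"] in accepted]
```

Searching for the United States concept returns both the resource tagged with it and the Québec resource tagged Secondary II, while the college-level resource is left out.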

COMÈTE constructs rich graphs of data that support sophisticated searches based on authors, organizations, and concepts or properties describing knowledge, using various kinds of search interfaces. All of the queries expressed in forms and menus are translated into SPARQL by the QueryEngine module and then run on the triple store. By combining different conditions, mixing in keyword-based search, and using negative prefixes, more complex queries can be performed than in traditional OER managers.
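The translation from a search form to SPARQL can be sketched as string generation. This is not the QueryEngine's actual code: the `build_sparql` helper, the property and class URIs, and the convention that a term starting with `-` is a negative condition are assumptions made for this example. The generated text uses only standard SPARQL 1.1 constructs (`FILTER`, `CONTAINS`, `LCASE`, `STR`, `FILTER NOT EXISTS`).

```python
from typing import List, Tuple

# Hypothetical class URI used to restrict results to learning resources.
RESOURCE_CLASS = "http://example.org/common/LearningResource"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_sparql(conditions: List[Tuple[str, str]]) -> str:
    """Build a SELECT query from (property URI, term) form conditions.

    A term prefixed with '-' is treated as a negative condition and is
    wrapped in FILTER NOT EXISTS; all other terms must match.
    """
    lines = [
        "SELECT DISTINCT ?resource WHERE {",
        f"  ?resource a <{RESOURCE_CLASS}> .",
    ]
    for i, (prop, term) in enumerate(conditions):
        negative = term.startswith("-")
        needle = escape_literal((term[1:] if negative else term).lower())
        pattern = (f"?resource <{prop}> ?v{i} . "
                   f'FILTER(CONTAINS(LCASE(STR(?v{i})), "{needle}"))')
        if negative:
            lines.append(f"  FILTER NOT EXISTS {{ {pattern} }}")
        else:
            lines.append(f"  {pattern}")
    lines.append("}")
    return "\n".join(lines)
```

Calling `build_sparql` with an author condition and a `-draft` keyword condition yields a query that keeps resources by that author and excludes those whose keywords contain "draft". The resulting string would then be sent to the triple store's SPARQL endpoint.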

  4. Using COMÈTE within a MOOC Platform


In this final section, we present two use cases where COMÈTE is used to interoperate with a MOOC platform like OpenEdX. Within such a platform, the role of COMÈTE is twofold: enabling designers to search for and reference OERs within a MOOC, and referencing the MOOCs themselves to produce a searchable, standardized MOOC portal.

At Télé-université, we have adapted OpenEdX to our needs. OpenEdX is the open-source release of the edX platform, developed by a non-profit organization founded by Harvard and MIT in the USA. It essentially provides two server-based applications. The first one, edX-STUDIO, is the application where designers build courses. Resources and activities are grouped in course modules and stored in MongoDB (NoSQL) and MySQL databases. Students interact at runtime with the second application, the Learning Management System (LMS), which performs learner authentication and supports the learning scenario [8].

Designing a MOOC using the COMÈTE OER Manager. A typical OpenEdX course is subdivided into sections (e.g. modules), and each module into sub-sections (e.g. lessons). For each lesson, an upper menu provides access to the lesson’s sequential components: discussion components, HTML content components, problem/quiz components, and video components. In principle, all these components should be open educational resources (OER), and they are found mainly on the Web. Currently, most designers use search engines like Google or Bing to find open resources to reuse or adapt for their course. As explained previously, there are many advantages in using a learning resource repository manager like COMÈTE to find suitable resources. Using REST web services, a call to COMÈTE from OpenEdX STUDIO could start efficient search operations and facilitate the selection of resources in the four categories proposed in STUDIO. Conversely, STUDIO could be upgraded to provide forms to edit metadata for the resources in a standardized DC, LOM or ISO-MLR application profile suitable for STUDIO. This would enable designers to automatically create an RDF resource repository for a course, for a whole program or for all its edX users. The creation of this local repository would produce a URI where the edX resources can be harvested by COMÈTE or other OER managers and integrated into larger repositories for future use.
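As an illustration of what such a REST call might look like, the sketch below only builds the request URL and performs no network access. The endpoint path `/search`, the parameter names `q`, `type` and `limit`, and the host `comete.example.org` are all hypothetical; they do not describe an existing COMÈTE or OpenEdX API.

```python
from typing import List
from urllib.parse import urlencode

# The four component categories proposed in STUDIO, as listed above.
COMPONENT_TYPES = {"discussion", "html", "problem", "video"}


def build_search_url(base_url: str, keywords: List[str],
                     component_type: str, page_size: int = 10) -> str:
    """Build the URL of a hypothetical repository search request."""
    if component_type not in COMPONENT_TYPES:
        raise ValueError(f"unknown component type: {component_type!r}")
    params = {
        "q": " ".join(keywords),   # free-text keywords
        "type": component_type,    # restrict results to one category
        "limit": page_size,        # maximum number of results
    }
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"
```

A course editor could issue a GET request to the resulting URL and present the returned resources in a picker, one per component category.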

Referencing a MOOC using COMÈTE. When a new MOOC is created in OpenEdX, a course registration screen is offered to the designer. Currently, only four metadata fields are requested: the course name, the organization that supports it, the course number and the periods when it will be offered. This form could easily be extended with fields from a DC, LOM or ISO-MLR application profile that would take into account the differences between small resources within a MOOC and large OERs such as MOOC courses or modules. Then, each time a new MOOC is created, it would automatically have a URI on the Web of data, together with its component resources. COMÈTE could then provide a search facility in MOOC repositories like Class Central [9], which is a free online course aggregator for courses from universities like Stanford, MIT, Harvard, etc., offered via Coursera, Udacity, edX, NovoED, and others.
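The four registration fields could be turned into triples along the following lines. This is a sketch only: the base URI, the property names and the `course_to_triples` helper are invented for the example, and a real profile would use properties drawn from DC, LOM or ISO-MLR.

```python
from typing import List, Tuple
from urllib.parse import quote

Triple = Tuple[str, str, str]

# Hypothetical namespaces, invented for this illustration.
COURSE_BASE = "http://example.org/mooc/"
PROP = "http://example.org/common/"


def course_to_triples(name: str, organization: str, number: str,
                      periods: List[str]) -> Tuple[str, List[Triple]]:
    """Mint a course URI from organization and number, then describe it."""
    path = "/".join(quote(part, safe="") for part in (organization, number))
    uri = COURSE_BASE + path
    triples: List[Triple] = [
        (uri, PROP + "title", name),
        (uri, PROP + "organization", organization),
        (uri, PROP + "courseNumber", number),
    ]
    # One triple per offering period.
    triples += [(uri, PROP + "offeredDuring", period) for period in periods]
    return uri, triples
```

Minting the URI from the organization and course number keeps it stable across offerings, so each new period adds a triple to the same course node instead of creating a new one.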

Currently, in a MOOC portal, courses are classified by subject, university, level and provider, which are the only metadata entries available for browsing courses. Most of the time, one must open each course (or register) to know what is in it. With standardized metadata, COMÈTE could power a MOOC portal with various kinds of search and navigation capabilities, combining metadata queries and knowledge navigation on the Web of data.


We have presented a solution to one of the main problems in Open Educational Resources repositories, which is the multiplicity of norms, standards and application profiles that preclude efficient search for resources across multiple repositories. We have built a first linked data OER repository manager, COMÈTE, which relies on semantic web techniques and largely complies with the new ISO-MLR standard. Its use for referencing MOOCs and MOOC components with RDF triples would become an asset as the number of massive open online courses grows rapidly in most countries. Our next work will be to investigate various integrations of COMÈTE tools with MOOC platforms, as outlined in the present contribution.


[1]    Duval, E. and Robson, R. (2001) Guest Editorial on Metadata. Interactive Learning Environments, Special issue: Metadata, Volume 9-3, December 2001, pp. 201-206.

[2]    DC – Dublin Core Metadata Initiative. http://dublincore.org

[3]    IEEE-LOM – Learning Object Metadata. http://fr.wikipedia.org/wiki/Learning_Object_Metadata

[4]    GLOBE – Global Learning Object Brokered Exchange. http://globe-info.org

[5]    ISO-MLR (2013) ISO/IEC 19788 Information technology – Learning, education and training – Metadata for learning resources, multipart standard. http://en.wikipedia.org/wiki/ISO/IEC_19788

[6]    SPARQL 1.1 Query Language, W3C Recommendation, 21 March 2013. http://www.w3.org/TR/sparql11-query/

[7]    LOD Cloud – The Linking Open Data cloud diagram. http://lod-cloud.net

[8]    Coulombe C. (2014) Expérimentation de la plateforme OpenEdX, rapport technique LICEF, Télé-université du Québec.

[9]    Class Central (2014) Free Online Education Portal, https://www.class-central.com