Last edited by Kristoffer Moe Apr 24, 2017

Catalinker

Introduction

Purpose

Catalinker is the main tool for creating and maintaining bibliographic catalogue data. It is a single-page app running in the browser that stores data mainly in an RDF triple store, but also in Koha.

History

Catalinker started as a simple dynamic one-page app that could render all input fields relevant to a particular resource type, e.g. Work, Publication or Person, derived from the evolving ontology. The need for a more structured workflow then demanded a solution that could handle more than one resource at a time, variations of resource types and complex object structures. Catalinker therefore has two faces: the main one and an old version, kept around mostly for testing purposes.

Extensibility

Catalinker is designed to be as flexible as possible in terms of supported ontology and workflow.

Architecture

The application consists of a Node.js back end serving a JavaScript/Ractive.js front end.

Node backend

The Node.js back end serves static files (HTML, JavaScript, images etc.). It also serves the main configuration, caches responses and proxies calls to the services container.

Browserify bundling

All node modules and the main JavaScript source files are bundled with browserify on demand. Therefore, no build stage is necessary other than npm install.

Caching

The back end enforces aggressive caching, to minimize loading time.

Proxying service calls

All requests that actually process data (HTTP POST, PATCH etc.) are forwarded to the services container. GETs are cached, but the cached responses are invalidated by POSTs and PATCHes to the same resources.
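The caching rule above can be illustrated with a minimal sketch. This is not Catalinker's actual back end code: the function name createCachingProxy and the forward callback (standing in for the call to the services container) are assumptions made for illustration only.

```javascript
// Hypothetical sketch: cache GET responses per resource path and drop the
// cached copy whenever a write (POST, PATCH etc.) hits the same path.
// `forward` stands in for the real call to the services container.
function createCachingProxy(forward) {
  const cache = new Map(); // resource path -> cached response

  return function handle(method, path, body) {
    if (method === 'GET') {
      if (!cache.has(path)) {
        cache.set(path, forward(method, path));
      }
      return cache.get(path);
    }
    // Writes are always forwarded and invalidate the cached GET response.
    cache.delete(path);
    return forward(method, path, body);
  };
}
```

With this shape, two consecutive GETs for the same resource reach the services container only once, while a PATCH in between forces the next GET to be forwarded again.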

Application structure

To allow flexibility with respect to the supported ontology and workflow, the application conceptually consists of a front end that knows how to render any number of input fields, grouped by tabs and connected to specific RDF targets, such that each tab roughly concerns some aspects of one resource. A resource in this case can be a Work or a Publication; these correspond to resource types from the ontology. Each tab also has a button to navigate to the next tab, supporting a particular workflow, e.g. search for a work, then create a publication of it and then link it to the work. In addition, tabs may be selected explicitly once enabled by the workflow state.

The application is designed to handle a somewhat shallow hierarchy of resources, more specifically Works as top nodes with one or more Publications as children. Depending on context, the editor at any time targets either a Publication and its parent Work, or only a Work instance. In the former case, the URI of the Publication implies the URI of the Work, so only the URI of the Publication is present in the address bar.

A guiding principle during development has been to separate the general from the specific, so that the front end in principle should be able to handle any type of resource in a structure: not only Work and Publication, but also e.g. Buildings and Floors. "Publication" and "Work" are therefore mentioned as rarely as possible in the code, although this has not been avoided entirely. One could argue that the mindset has been kept, however.

To support this principle, all specific behavior is configured in the file workflow_config.js, which is served by the back end. The configuration file describes which predicates from the ontology go to which tab in the workflow, the order of the predicates, and other necessary configuration such as how to search for relevant values when linking to authorities, which fields should be shown when creating new authorities etc.

Following from the above, all sub-solutions are designed to be as generic as possible, even if used only once. The separation of code and configuration means that it should be fairly easy to adapt Catalinker to another workflow or ontology, simply by replacing or parameterizing the configuration file.

The user interface is constructed from one of a handful of templates. The most important is workflow.html, while the entry point to the application is the menu.html template. Other templates are edit_authority.html, which is used to edit all fields relevant to a resource type as well as to compare and merge two resources, and report.html, which presents the data from all tabs as one sectioned page, giving an overview of the catalogue data describing a Publication.

Ractive.js

Catalinker uses the Ractive.js web framework, developed by the Guardian newspaper. Its most prominent feature is two-way binding between input fields and values in JavaScript object structures, which makes it suitable for rendering a complex user interface backed by an application state.

main.js

Most of the functionality in Catalinker is in main.js. It has grown from its humble beginnings in the old Catalinker mentioned earlier to a quite large chunk of JavaScript code. Some parts and functions are more complex than others, and cater for more than one concern. The file should no doubt be divided into smaller modules, and could perhaps utilize Ractive's component support, but because of the incremental nature of the development process, and perhaps laziness, this has not happened yet. However, most functions are kept small and have descriptive names, and the structure is roughly the same as in the beginning, only with more features added.

The core principle when editing RDF data with Catalinker is that "input" objects at some level in the application data tree structure are two-way bound to HTML input fields, laid out according to the configuration. More precisely, it is the "input" object's "current" value that is bound and shown in the input field, whereas the previous value is kept invisibly as "old". Whenever a field loses focus after it has been edited by the user, an event handler kicks in and compares the old value with the current one, and if a change has taken place, a patch is created. The patch contains a delete operation for the old value and an add operation for the current one. This is conveyed to the back end as an HTTP PATCH, which removes the outdated RDF triple and inserts a new one for the updated value.
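The old/current comparison described above can be sketched roughly as follows. The function name createPatch and the exact shape of the patch operations (op, s, p, o) are assumptions for illustration; the real patch format is defined elsewhere in the project. Skipping empty values is also an assumption of this sketch.

```javascript
// Hypothetical sketch of turning one edited input value into a patch.
// `value` follows the { old: { value }, current: { value } } structure
// described in the text; the operation shape is assumed, not authoritative.
function isBlank(v) {
  return v === undefined || v === null || v === '';
}

function createPatch(subjectUri, predicateUri, value) {
  const oldValue = value.old.value;
  const currentValue = value.current.value;
  if (oldValue === currentValue) {
    return []; // nothing changed, so nothing to send
  }
  const patch = [];
  if (!isBlank(oldValue)) {
    // Remove the outdated triple.
    patch.push({ op: 'del', s: subjectUri, p: predicateUri, o: { value: oldValue } });
  }
  if (!isBlank(currentValue)) {
    // Insert a triple for the updated value.
    patch.push({ op: 'add', s: subjectUri, p: predicateUri, o: { value: currentValue } });
  }
  return patch;
}
```

The resulting array would then be serialized as the body of the HTTP PATCH request.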

A number of variations of this theme are used for different types of inputs e.g.:

  • Select value from set of predefined values
  • Search for other resources, where the input's value is a URI, but a displayValue field of the input object holds the name or label of the referred resource
  • Compound inputs; groups of any types of inputs that are saved as one
  • Inputs not saved as data, but used for looking up internal and external data.

Outline

The outline of main.js is as follows:

  • Require directives, e.g. requiring jquery and jquery-ui artifacts.
  • Various functions
  • Dialog handlers using jquery-dialog
  • Functions handling input values. Most important is updateInputsForResource which does all the heavy lifting when it comes to filling data into inputs from saved resources, external data as suggestions, resources to be compared with and paging of multi-valued compound input values.
  • Functions for converting configuration and ontology into input structure
  • Functions for handling patching and saving of new data
  • Ractive decorators. These are used to handle different ui requirements, such as Select2 style select components, sliding elements, accordion, input field ornamentation, special formatting, maintaining support panel positioning
  • The ractive instance, with
    • Event handlers. These are called on any on-click, on-enter etc. events from user interface markup.
    • Observers. These are called when particular values in the state model change.

Exposed objects

main.js exposes some handy channels into the belly of Catalinker with which one can peek and poke around. Among these are the main Ractive instance, Underscore and jQuery. These are accessible in the browser console as

  • document.main.getRactive()
  • document.main._
  • document.main.$

These are handy tools for debugging and development since the application relies heavily on them.

workflow_config.js

The configuration file mainly consists of these groups:

  • inputForms
  • tabs
  • authorityMaintenance
  • search
  • prefillValuesFromExternalSources

inputForms

This is an array of objects that describe the forms used to create and maintain authorities, e.g. Person or Place. Each has an array of inputs specifying which RDF properties should be made available for entry.

tabs

This is an array of objects, each describing a tab in the workflow, with attributes like the tab label and, most importantly, which type of resource the fields in the tab address. The application can at any time handle one resource of each type, so all tabs with the same rdfType, e.g. Work, target the same resource.

Inside each tab there is an array of inputs, each consisting of an object describing how a particular predicate from the ontology is handled. Note that not all inputs are connected to a predicate; some special cases are used for search, e.g. for retrieving an existing work to be edited, or for external data lookup by e.g. ISBN. Even if these are used only once each, they are configured and not hard coded, again following the principle from earlier. An input field may fetch its label from the ontology, or the label can be overridden in the config file. Inputs may also be shown only when other inputs have certain values, e.g. only show the EAN field when mediaType is film or musical recording.
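As a rough illustration of a tab entry with a conditionally shown input, consider the sketch below. Only rdfType, inputs and the tab label are taken from the description above; the property names rdfProperty and showOnlyWhen, the value names, and the helper isInputVisible are hypothetical and do not reflect the actual keys used in workflow_config.js.

```javascript
// Hypothetical tab configuration; key names other than `rdfType`, `label`
// and `inputs` are invented for this sketch.
const exampleTab = {
  label: 'Describe publication',
  rdfType: 'Publication',
  inputs: [
    { rdfProperty: 'mediaType' }, // label fetched from the ontology
    {
      rdfProperty: 'ean',
      label: 'EAN', // label overridden in the configuration
      showOnlyWhen: { input: 'mediaType', hasOneOf: ['film', 'musicalRecording'] },
    },
  ],
};

// Decide whether an input should be rendered, given the current values of
// the other inputs keyed by their (assumed) rdfProperty name.
function isInputVisible(input, currentValues) {
  const condition = input.showOnlyWhen;
  if (!condition) {
    return true; // unconditional inputs are always shown
  }
  return condition.hasOneOf.includes(currentValues[condition.input]);
}
```

Keeping the condition in the configuration rather than in code follows the same separation of general and specific behavior described earlier.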

Input types

Usually, the type of an input is inferred from the type of the corresponding predicate in the ontology, e.g. string, number etc. However, for inputs linking to other resources, one has to add type: 'searchable-with-result-in-side-panel', indicating that the field will either show the name of a linked resource (e.g. an author's name) or be used as a search field to look up linkable resources. This distinguishes this kind of input from inputs used for predefined values (see below), where the value is also a URI.

When searching, a popup list with search results appears next to the field, where one can select one of the existing resources or create a new one. A searchable input refers to one or more elasticsearch index types via its indexTypes property. If more than one index type is searchable as a source, a select box will appear. This is used when e.g. searching for candidate subjects for works, which may link to different types of authorities, such as Subject, Person and even other Works.

In the referred search specification, the property resultItemLabelProperties lists which properties from the resulting search documents are to be used as the source for search result labels. On the other hand, when the value of an already entered link to another resource is to be shown, the input property nameProperties is an array listing the predicates, in short form (i.e. only the fragment part of the predicate's URI), from which the displayed value is composed. Possible values for both search result and triple store labels are prefLabel, name etc., depending on the search index document structure.
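To make the role of nameProperties concrete, here is a minimal sketch of composing a display value from a linked resource. The helper names fragmentOf and buildDisplayValue, the flat predicate-to-value resource shape, and joining the parts with a space are all assumptions for illustration, not Catalinker's actual implementation.

```javascript
// Short form of a predicate: the fragment part of its URI.
function fragmentOf(predicateUri) {
  return predicateUri.slice(predicateUri.lastIndexOf('#') + 1);
}

// Hypothetical sketch: `resource` maps full predicate URIs to literal values,
// `nameProperties` lists predicates in short form, in display order.
function buildDisplayValue(resource, nameProperties) {
  const byFragment = {};
  for (const [predicateUri, value] of Object.entries(resource)) {
    byFragment[fragmentOf(predicateUri)] = value;
  }
  return nameProperties
    .map((property) => byFragment[property])
    .filter((part) => part !== undefined && part !== null && part !== '')
    .join(' ');
}
```

For example, a Person resource with name and birthYear predicates and nameProperties set to ['name', 'birthYear'] would, in this sketch, give a display value consisting of the name followed by the birth year.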

Another important input type is the dropdown list with predefined values, such as language. The actual values are extracted from the ontology. Some inputs allow multiple values, others don't. Since some inputs allow multiple values from the sets of predefined values, it is handy to realize these as HTML select elements. Due to the way two-way binding works in Ractive for these kinds of input controls, this is modeled as an array of values, rather than as an array of objects with current.value properties as is the case for other types of input. Some of the complexity when it comes to filling values into the input structure is related to this difference.

An important property of inputs is the widgetOptions structure, which holds various configurations for an input, such as which form (from inputForms above) to show when opening a related authority for in situ editing, if the input field is of a type that links to another resource. widgetOptions is also used for other specialized control of input fields.

Some inputs are nested, and therefore have both their own predicate, linked to by the parent resource, and an array of subInputs. These target blank nodes, which are handled as groups of data that are either all saved or not saved at all. An example of such objects are Contributions, which have both a person and a role, e.g. author, linking a Work to a Person. To allow contributions to be linked to from either Work or Publication, compound inputs may have multiple types as domain, e.g. both Work and Publication. Radio buttons for selecting the source of the link to the Contribution are therefore rendered when a compound input has more than one subject.

authorityMaintenance

Configures the search input fields to be displayed on the authority maintenance tab of the menu page. For each editable authority resource type, one can configure search parameters in terms of which elasticsearch index type to use and, if present, a widgetOptions structure describing how the resource should be handled, that is, which form or which template to use. Simple resource types, such as Person or Place, may be edited with the most suitable popup input form (from inputForms above), while the more complex ones, such as Work and Publication, should be opened with the workflow template.

search

Configures the elasticsearch search routes for each resource type, along with which fields of the indexed documents to query and which fields to show in the result list.

prefillValuesFromExternalSources

Configures how suggestion data from external sources are mapped into inputs. Catalinker receives data from external sources through services in RDF-format, some of which may be turned into new authorities such as Person or Subject.

State model

The state of the application is held in a set of object structures under the control of the Ractive.js instance. The most important part of the structure is inputGroups, which is derived from the tabs array in the configuration, enriched by the ontology. The model is accessed via the ractive.get() function with a path as argument, e.g. ractive.get('inputGroups.1.inputs.3'), which retrieves the fourth input of the second tab. Each input in turn has a values array, with each element having a current.value and an old.value. When values are changed, a delta is generated based on the difference between a value's old and current value. The delta is then sent to the back end as an HTTP PATCH request. Value structs representing a link to another resource also have a displayValue field, containing the name or a more user friendly representation of the linked resource.
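The shape of the state tree and the dotted keypath lookup can be illustrated without Ractive using a plain object. The sample data and the helper getByKeypath are hypothetical; the helper only mimics the dotted-path lookup that ractive.get() performs and is not part of Ractive or Catalinker.

```javascript
// Hypothetical, heavily reduced state tree: two tabs, the second one with
// four inputs, of which the fourth links to another resource.
const state = {
  inputGroups: [
    { inputs: [] },
    {
      inputs: [
        {}, {}, {},
        {
          values: [
            {
              old: { value: 'http://example.org/person/p1' },
              current: { value: 'http://example.org/person/p2' },
              displayValue: 'Example Person',
            },
          ],
        },
      ],
    },
  ],
};

// Resolve a dotted keypath such as 'inputGroups.1.inputs.3' against a plain
// object tree; an empty keypath returns the root, like the top level in Ractive.
function getByKeypath(root, keypath) {
  if (!keypath) {
    return root;
  }
  return keypath.split('.').reduce(
    (node, key) => (node === undefined || node === null ? undefined : node[key]),
    root
  );
}
```

Here getByKeypath(state, 'inputGroups.1.inputs.3') returns the fourth input of the second tab, matching the example in the text.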

Below are screenshots showing input fields for main contributor and corresponding data structure at the deepest level in the state tree, exposing elements holding data for the http://data.deichman.no/ontology#agent predicate of the blank node representing a contribution to a work, in this case the main contributor or main entry.

image

image

Note that the same construction and predicates are used for both main entry and additional entries, in the latter case typically with e.g. illustrator or translator roles, whereas the main contributor often has an author role. They both handle blank nodes of type http://data.deichman.no/ontology#Contribution, which may be related to either Work or Publication. They appear on both the first and the last workflow tab, but are distinguished by the fact that the main entry case also has the additional type http://data.deichman.no/ontology#MainEntry. Some of the complexity in the function updateInputsForResource is due to this fact, since it needs to be able to handle the same predicate in two different ways when loading a Work and/or Publication.

Partials

Even if the application's main engine is a massive block of code, the user interface is divided into a number of parts, called partials in Ractive.js lingo. Partials feed on the context in which they are invoked, such as an input node like the one in the example above.

Internationalisation

Internationalisation (i18n) is provided by partials named after the message keys, e.g. {{>pleaseWait}}, which renders "Please wait..." when running in English and "Vent litt..." when running in Norwegian, which is the default language. The translations reside in the resource bundle files client/src/i18n/en.js and client/src/i18n/no_nb.js respectively. To select a particular language, e.g. English, add language=en as a URL parameter when opening the cataloguing interface.

Since i18n texts are in fact Ractive partials, they may include any markup and use other partials just like other parts of the front end, and may even refer to other language keys.
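The language selection described above can be sketched as follows. The bundle keys ('no', 'en'), the in-memory bundles object and the helper names selectLanguage and message are assumptions for illustration; the real bundles live in the files named above and are rendered as Ractive partials rather than looked up like this.

```javascript
// Hypothetical in-memory stand-ins for the two resource bundles.
const bundles = {
  no: { pleaseWait: 'Vent litt...' },
  en: { pleaseWait: 'Please wait...' },
};

// Pick the language from a query string such as '?language=en', falling
// back to the default (Norwegian) when the parameter is missing or unknown.
function selectLanguage(queryString, defaultLanguage = 'no') {
  const requested = new URLSearchParams(queryString).get('language');
  const known = requested !== null &&
    Object.prototype.hasOwnProperty.call(bundles, requested);
  return known ? requested : defaultLanguage;
}

// Look up the text for a message key in the selected language.
function message(key, language) {
  return bundles[language][key];
}
```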

Development tips

Inspecting state

To inspect the state of the model, execute document.main.getRactive().get(<keypath>) from the development console in e.g. Chrome. An empty argument returns the Ractive top level, from where one can drill down to specific points of interest.

Print keypath in user interface

When in doubt whether keypaths in the HTML files are correct, insert {{@keypath}} to print out the path to the point in the structure where your markup is working.

Observe changes

Sometimes it can be hard to see where a value in the structure has been changed from. By adding a temporary observer, one can catch where changes originate. Execute a statement like this in the console:

document.main.getRactive().observe('inputGroups.3.inputs.4.values.2.current.value', function (newVal, oldVal, keypath) { debugger; }, { init: false })

Perform whatever UI interaction triggers the unexpected change, and when the debugger stops, examine the stack, looking especially for lines within main.js.

Random external data

To test and develop handling of external data, the back end can serve random suggestions for new data. To achieve this, start Catalinker with an additional URL parameter like this: externalSource=random_bibbi. Entering any ISBN will then populate the suggested data with nonsensical work, publication and person data. Adding externalSource=random_loc as a URL parameter also fetches data as if from a secondary source, in this case Library of Congress.
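A small sketch of building such a URL is shown below. The base URL and the helper name withExternalSources are placeholders, and repeating the externalSource parameter once per source is an assumption based on the description above, not a confirmed detail of how Catalinker parses its parameters.

```javascript
// Hypothetical helper: append one externalSource parameter per source.
// The base URL passed in is a placeholder, not Catalinker's actual address.
function withExternalSources(baseUrl, sources) {
  const url = new URL(baseUrl);
  for (const source of sources) {
    url.searchParams.append('externalSource', source);
  }
  return url.toString();
}

const testUrl = withExternalSources(
  'http://localhost:8010/cataloguing',
  ['random_bibbi', 'random_loc']
);
```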

Testing

Module tests

There is a simple module test that tests whether the workflow can be instantiated, but it has not been extended for a long time.

Cucumber tests

Each new feature has been accompanied by new feature tests or by extensions to the existing ones. The tests generally walk through the cataloguing workflow, creating new authorities and linking them together. The old parts of the interface are used to check values after test scenarios, but are limited when it comes to showing e.g. compound inputs.
