PRISM Glossary

Comprehensive definitions and guidance for using the Platform for Research Infrastructure Synergy Mapping

Interaction Types

PRISM categorizes tool interactions into 11 distinct types. Understanding these helps you accurately describe how research tools connect and communicate.

API Integration

0 interactions

Definition:

Direct programmatic connection between tools using Application Programming Interfaces

When to use:

When tools communicate programmatically with structured data exchange

Example:

DMPTool connects to RSpace via REST API to sync data management plans

Technical Indicators:

REST API, GraphQL, SOAP, JSON, XML, OAuth

Common Technologies:

  • HTTP/HTTPS
  • REST
  • SOAP
  • gRPC
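
As a rough illustration of the structured exchange described above, the sketch below prepares (but does not send) a JSON request with Python's standard library. The host, endpoint path, token, and payload fields are hypothetical placeholders and do not describe the actual DMPTool or RSpace APIs.

```python
import json
import urllib.request


def build_sync_request(base_url: str, token: str, plan: dict) -> urllib.request.Request:
    """Prepare a JSON POST request; nothing is sent until urlopen() is called."""
    body = json.dumps(plan).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/plans",  # hypothetical endpoint path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # e.g. an OAuth access token
        },
    )


request = build_sync_request(
    "https://api.example.org/v1",  # placeholder host, not a real service
    "EXAMPLE_TOKEN",
    {"title": "Example data management plan", "funder": "Example Funder"},
)
print(request.get_method(), request.full_url)
```

Structured payloads, an authorization header, and an HTTP method are the kind of technical indicators that suggest the API Integration type.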

Data Exchange

0 interactions

Definition:

Transfer of research data files or datasets between tools

When to use:

When the primary function is moving data content between systems

Example:

Zenodo receives data files exported from GitHub repositories

Technical Indicators:

file transfer, bulk data, datasets, repository sync

Common Technologies:

  • FTP
  • SFTP
  • rsync
  • cloud storage APIs

Metadata Exchange

0 interactions

Definition:

Transfer of descriptive information about data without moving the data itself

When to use:

When exchanging descriptions, citations, or contextual information

Example:

ORCID profile information linked to publications in Zenodo

Technical Indicators:

metadata, schema, descriptive info, catalog

Common Technologies:

  • OAI-PMH
  • SWORD
  • Dublin Core
  • DataCite
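
To make "descriptive information without the data itself" concrete, here is a minimal sketch that serializes a few Dublin Core fields as XML. The wrapper element name, the field values, and the identifier are illustrative assumptions; real exchanges would follow the schema required by the receiving repository.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialize simple Dublin Core fields (title, creator, ...) as XML text."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dublin_core_record({
    "title": "Example dataset",
    "creator": "Researcher, Example",
    "identifier": "https://doi.org/10.0000/example",  # placeholder, not a real DOI
})
print(xml_text)
```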

File Format Conversion

0 interactions

Definition:

Transformation of data from one file format to another

When to use:

When format transformation is the primary interaction purpose

Example:

Converting CSV data to Parquet format for analysis

Technical Indicators:

format change, conversion, transformation, encoding

Common Technologies:

  • CSV
  • JSON
  • XML
  • Parquet
  • HDF5
  • NetCDF
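
A format conversion can be sketched with the standard library alone. The example below converts CSV to JSON rather than to Parquet, because writing Parquet needs a third-party library; the column names and values are made up for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


sample = "sample_id,temperature_c\nA1,21.5\nA2,22.0\n"
print(csv_to_json(sample))
```

Note that every value stays a string after conversion; assigning numeric types would be a separate, explicit step.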

Workflow Integration

0 interactions

Definition:

Tools combined into multi-step research workflows or pipelines

When to use:

When tools are orchestrated together in a sequence

Example:

Jupyter Notebook packaged with Docker for reproducible analysis

Technical Indicators:

pipeline, workflow, orchestration, automation

Common Technologies:

  • Airflow
  • Nextflow
  • Snakemake
  • Galaxy
  • Taverna

Plugin/Extension

0 interactions

Definition:

One tool extends functionality of another through add-ons or plugins

When to use:

When one tool adds features directly into another tool's interface

Example:

Zotero plugin installed in Microsoft Word for citation management

Technical Indicators:

plugin, extension, add-on, module

Common Technologies:

  • Browser extensions
  • IDE plugins
  • Office add-ins

Direct Database Connection

0 interactions

Definition:

Tools query or write to shared database infrastructure

When to use:

When tools share an underlying data storage layer

Example:

Analysis tool connects directly to PostgreSQL research database

Technical Indicators:

database, SQL, NoSQL, direct connection

Common Technologies:

  • PostgreSQL
  • MySQL
  • MongoDB
  • Redis
  • Elasticsearch
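
The sketch below shows the general shape of a direct database connection: one tool writes to a table and another queries it with SQL. It uses an in-memory SQLite database as a stand-in for a shared PostgreSQL server, and the table and column names are hypothetical.

```python
import sqlite3


def mean_by_site(conn: sqlite3.Connection) -> dict:
    """Query the shared table directly and return the mean value per site."""
    cursor = conn.execute(
        "SELECT site, AVG(value) FROM measurements GROUP BY site ORDER BY site"
    )
    return dict(cursor.fetchall())


# Stand-in for a shared research database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("north", 1.0), ("north", 3.0), ("south", 5.0)],
)
print(mean_by_site(conn))
```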

Web Service

0 interactions

Definition:

Tools interact via web-based service endpoints (may include APIs)

When to use:

For interactions based on web protocols such as HTTP, SOAP, or OAI-PMH

Example:

Data repository accessed via OAI-PMH harvesting protocol

Technical Indicators:

web service, endpoint, WSDL, service oriented

Common Technologies:

  • HTTP
  • SOAP
  • XML-RPC
  • OAI-PMH
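
An OAI-PMH harvest is driven by plain HTTP requests whose query string names a verb and its arguments. The sketch below only builds such a URL; the repository address is a placeholder, while `ListRecords` and `oai_dc` are the verb and metadata prefix defined by the protocol.

```python
from urllib.parse import urlencode


def oai_pmh_url(base_url: str, verb: str = "ListRecords", **arguments: str) -> str:
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


url = oai_pmh_url(
    "https://repository.example.org/oai",  # placeholder endpoint
    metadataPrefix="oai_dc",
)
print(url)
```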

Command Line Interface

0 interactions

Definition:

Tools invoked or controlled via terminal commands or scripts

When to use:

When interaction happens through shell commands or scripts

Example:

Python script calls FFmpeg via command line to process video data

Technical Indicators:

CLI, bash, shell, script, command line

Common Technologies:

  • Batch processing
  • Automation scripts
  • HPC jobs
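
The FFmpeg example above might look roughly like the following sketch, which builds the command as an argument list and runs it only if FFmpeg is installed. The file names are placeholders, and the conversion shown is the simplest possible invocation rather than a recommended set of options.

```python
import shutil
import subprocess


def ffmpeg_command(source: str, target: str) -> list[str]:
    """Build the argument list for a basic FFmpeg conversion."""
    # -y overwrites the output file if it exists; -i names the input file.
    return ["ffmpeg", "-y", "-i", source, target]


def convert(source: str, target: str) -> bool:
    """Run FFmpeg if it is installed; return True on success."""
    if shutil.which("ffmpeg") is None:
        return False  # FFmpeg is not on PATH
    result = subprocess.run(ffmpeg_command(source, target), capture_output=True)
    return result.returncode == 0


print(ffmpeg_command("trial_01.avi", "trial_01.mp4"))
```

Passing the command as a list, rather than as a single shell string, avoids quoting problems with file names that contain spaces.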

Import/Export

0 interactions

Definition:

Manual or semi-automated file-based data transfer between tools

When to use:

When users manually transfer files between systems

Example:

Export CSV from REDCap, import into R for analysis

Technical Indicators:

export, import, download, upload, manual transfer

Common Technologies:

  • CSV
  • Excel
  • JSON
  • XML
  • text files

Other

0 interactions

Definition:

Interaction types not covered by standard categories

When to use:

When no other category fits; please describe in Technical Details

Example:

Custom or novel integration approaches

Technical Indicators:

custom, proprietary, novel, unique

Please provide a detailed description to help us improve categorization.

Research Data Lifecycle Stages

The MaLDReTH model defines 12 stages in the research data lifecycle, representing the complete journey of research data from conception to reuse.

About the 12-Stage Model

This lifecycle model was developed by the MaLDReTH II RDA Working Group to provide a comprehensive framework for understanding research data workflows. The stages are sequential but can also be iterative and overlapping in practice.

1 CONCEPTUALISE

Duration: Weeks to months

Definition: To formulate the initial research idea or hypothesis, and define the scope of the research project and the data component/requirements of that project.

Key Activities:

  • Literature review
  • Hypothesis formulation
  • Research question development
  • Defining data requirements
  • Scope definition

Typical Outputs:

  • Research questions
  • Hypotheses
  • Initial concepts
  • Data requirements

Typical Tools:

Reference managers, Mind mapping tools, Literature databases, Ideation platforms

2 PLAN

Duration: Weeks to months

Definition: To establish a structured strategic framework for management of the research project, outlining aims, objectives, methodologies, and resources required for data collection, management and analysis. Data management plans (DMPs) should be established during this phase of the lifecycle.

Key Activities:

  • Study design
  • Protocol development
  • Resource planning
  • DMP creation
  • Defining methodologies
  • Resource identification

Typical Outputs:

  • Data Management Plans
  • Protocols
  • Study designs
  • Resource allocation plans

Typical Tools:

DMP tools, Project management, Protocol repositories, DMPTool, DMPonline

3 FUND

Duration: Months to years

Definition: To identify and acquire financial resources to support the research project, including data collection, management, analysis, sharing, publishing and preservation.

Key Activities:

  • Grant writing
  • Budget planning
  • Proposal submission
  • Identifying funding sources
  • Financial planning

Typical Outputs:

  • Grant proposals
  • Budgets
  • Funding awards
  • Financial plans

Typical Tools:

Grant management systems, Budget calculators, Proposal tools, Funding databases

4 COLLECT

Duration: Days to years

Definition: To use predefined procedures, methodologies and instruments to acquire and store data that is reliable, fit for purpose and of sufficient quality to test the research hypothesis.

Key Activities:

  • Experiments
  • Surveys
  • Observations
  • Measurements
  • Sampling
  • Data acquisition

Typical Outputs:

  • Raw data
  • Observations
  • Measurements
  • Samples
  • Experimental data

Typical Tools:

Lab instruments, Survey platforms, Sensors, Data loggers, Electronic lab notebooks

5 PROCESS

Duration: Days to months

Definition: To make new and existing data analysis-ready. This may involve standardised pre-processing, cleaning, reformatting, structuring, filtering, and performing quality control checks on data. It may also involve the creation and definition of metadata for use during analysis, such as acquiring provenance from instruments and tools used during data collection.

Key Activities:

  • Data cleaning
  • Quality assurance
  • Normalization
  • Format conversion
  • Metadata creation
  • Filtering
  • Structuring

Typical Outputs:

  • Cleaned datasets
  • Quality reports
  • Processed data
  • Metadata
  • Analysis-ready data

Typical Tools:

Data cleaning tools, ETL platforms, Quality control software, OpenRefine, Data wrangling tools

6 ANALYSE

Duration: Weeks to months

Definition: To derive insights, knowledge, and understanding from processed data. Data analysis involves iterative exploration and interpretation of experimental or computational results, often utilising mathematical models and formulae to investigate relationships between experimental variables. Distinct data analysis techniques and methodologies are applied according to the data type (quantitative vs qualitative).

Key Activities:

  • Statistical tests
  • Modeling
  • Visualization
  • Pattern discovery
  • Iterative exploration
  • Interpretation

Typical Outputs:

  • Analysis results
  • Statistical models
  • Visualizations
  • Insights
  • Interpretations

Typical Tools:

R, Python, SPSS, MATLAB, Jupyter, Statistical software, Analysis platforms

7 STORE

Duration: Duration of project

Definition: To record data using technological media appropriate for processing and analysis whilst maintaining data integrity and security.

Key Activities:

  • Active storage
  • Backup
  • Version control
  • Collaboration
  • Integrity maintenance
  • Security management

Typical Outputs:

  • Backed up data
  • Version history
  • Shared datasets
  • Secure storage

Typical Tools:

Cloud storage, Version control, Lab servers, Collaborative platforms, Git, Institutional storage

8 PUBLISH

Duration: Months to years

Definition: To release research data in published form for use by others with appropriate metadata for citation (including a unique persistent identifier) based on FAIR principles.

Key Activities:

  • Paper writing
  • Peer review
  • Conference presentations
  • Preprints
  • Data publication
  • Metadata creation
  • DOI assignment

Typical Outputs:

  • Publications
  • Presentations
  • Preprints
  • Published datasets
  • DOIs

Typical Tools:

Journal systems, Preprint servers, Writing tools, LaTeX, Data journals, Repository platforms

9 PRESERVE

Duration: Permanent

Definition: To ensure the safety, integrity, and accessibility of data for as long as necessary so that data is as FAIR as possible. Data preservation is more than data storage and backup, since data can be stored and backed up without being preserved. Preservation should include curation activities such as data cleaning, validation, assigning preservation metadata, assigning representation information, and ensuring acceptable data structures and file formats. At a minimum, data and associated metadata should be published in a trustworthy digital repository and clearly cited in the accompanying journal article unless this is not possible (e.g. due to privacy or safety concerns).

Key Activities:

  • Archiving
  • Format migration
  • Metadata enrichment
  • Curation
  • Data cleaning
  • Validation
  • Format standardization

Typical Outputs:

  • Archived datasets
  • DOIs
  • Preserved research outputs
  • Preservation metadata
  • Curated collections

Typical Tools:

Repositories, Archives, Preservation systems, Digital curation tools, Trustworthy repositories

10 SHARE

Duration: Ongoing

Definition: To make data available and accessible to humans and/or machines. Data may be shared with project collaborators or published to share it with the wider research community and society at large. Data sharing is not limited to open data or public data, and can be done during various stages of the research data lifecycle. At a minimum, data and associated metadata should be published in a trustworthy digital repository and clearly cited in the accompanying journal article.

Key Activities:

  • Publishing datasets
  • Access control
  • License assignment
  • Documentation
  • Collaboration
  • Community sharing

Typical Outputs:

  • Shared datasets
  • Data publications
  • Access portals
  • Collaborative workspaces

Typical Tools:

Data repositories, Institutional repositories, Figshare, Zenodo, Dryad, Sharing platforms

11 ACCESS

Duration: Ongoing

Definition: To control and manage data access by designated users and reusers. This may be in the form of publicly available published information. Necessary access control and authentication methods are applied.

Key Activities:

  • Data discovery
  • Search
  • Download
  • API access
  • Access control
  • Authentication management

Typical Outputs:

  • Downloaded data
  • Retrieved datasets
  • Access logs
  • Usage statistics

Typical Tools:

Data catalogs, Search engines, Repository interfaces, APIs, Access management systems

12 TRANSFORM

Duration: Varies

Definition: To create new data from the original, for example: (i) by migration into a different format; (ii) by creating a subset, by selection or query, to create newly derived results, perhaps for publication; or (iii) by combining or appending it with other data.

Key Activities:

  • Format conversion
  • Subset creation
  • Data integration
  • Reanalysis
  • Data migration
  • Query and selection

Typical Outputs:

  • Transformed data
  • Subsets
  • Integrated datasets
  • New research
  • Derived datasets

Typical Tools:

Conversion tools, Query systems, Integration platforms, Analysis tools, Data transformation pipelines

MaLDReTH Terminology

MaLDReTH
Mapping the Landscape of Digital Research Tools Harmonised. An RDA Working Group initiative focused on creating a comprehensive categorization schema for digital research tools.
PRISM
Platform for Research Infrastructure Synergy Mapping. This web application is a key output of the MaLDReTH II initiative.
Exemplar Tool
A representative tool within a category, demonstrating typical characteristics and capabilities. Currently PRISM contains 72 exemplar tools.
Tool Category
A classification group for similar tools within a lifecycle stage. Categories help organize tools by function and purpose.
Tool Interaction
A connection or integration between two research tools, describing how they communicate or work together. PRISM currently contains 0 documented interactions.
Research Data Lifecycle (RDL)
The 12-stage model describing the complete journey of research data from initial concept through to reuse and transformation.
GORC
Global Open Research Commons. An RDA initiative that PRISM contributes to, focused on improving interoperability and FAIR data practices.

Technical Terms

API
Application Programming Interface. A defined set of operations and data formats that lets software components, including research tools, communicate with each other.
REST
Representational State Transfer. An architectural style for web APIs using HTTP methods.
OAuth
Open Authorization. A standard for delegated authorization between applications; authentication is usually handled by a layer built on top of it, such as OpenID Connect.
DOI
Digital Object Identifier. A persistent identifier for digital objects like datasets and publications.
ORCID
Open Researcher and Contributor ID. A unique identifier for researchers and scholars.
FAIR
Findable, Accessible, Interoperable, Reusable. Principles for scientific data management and stewardship.
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting. A protocol for sharing metadata between repositories.
JSON
JavaScript Object Notation. A lightweight, text-based data format commonly used for API communication.
CSV
Comma-Separated Values. A simple file format for tabular data exchange.
CLI
Command Line Interface. A text-based interface for interacting with software via commands.

Contributing to PRISM

How to Add an Interaction

  1. Identify the tools: Determine the source and target tools involved
  2. Select interaction type: Review the definitions above to choose the most appropriate type
  3. Choose lifecycle stage: Identify which research stage this interaction supports
  4. Describe the interaction: Write 1-3 sentences explaining what happens and why it's useful
  5. Add technical details: Include protocols, APIs, or technologies used (optional but recommended)
  6. Provide examples: Share real-world use cases (optional but valuable)
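
The steps above can be read as filling in one record per interaction. The sketch below models such a record and checks it against the interaction types and lifecycle stages defined in this glossary. The class and field names are hypothetical and are not PRISM's actual data model.

```python
from dataclasses import dataclass

INTERACTION_TYPES = {
    "API Integration", "Data Exchange", "Metadata Exchange",
    "File Format Conversion", "Workflow Integration", "Plugin/Extension",
    "Direct Database Connection", "Web Service", "Command Line Interface",
    "Import/Export", "Other",
}
LIFECYCLE_STAGES = {
    "CONCEPTUALISE", "PLAN", "FUND", "COLLECT", "PROCESS", "ANALYSE",
    "STORE", "PUBLISH", "PRESERVE", "SHARE", "ACCESS", "TRANSFORM",
}


@dataclass
class Interaction:
    source_tool: str
    target_tool: str
    interaction_type: str
    lifecycle_stage: str
    description: str
    technical_details: str = ""  # optional but recommended
    examples: str = ""           # optional

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means the draft looks complete."""
        found = []
        if not self.source_tool or not self.target_tool:
            found.append("source and target tools are required")
        if self.interaction_type not in INTERACTION_TYPES:
            found.append(f"unknown interaction type: {self.interaction_type}")
        if self.lifecycle_stage not in LIFECYCLE_STAGES:
            found.append(f"unknown lifecycle stage: {self.lifecycle_stage}")
        if not self.description.strip():
            found.append("description is required")
        return found


draft = Interaction(
    source_tool="DMPTool",
    target_tool="RSpace",
    interaction_type="API Integration",
    lifecycle_stage="PLAN",
    description="DMPTool connects to RSpace via REST API to sync data management plans.",
)
print(draft.problems())
```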

What Makes a Good Interaction Description

  • Clear and specific: Explain exactly what the interaction does
  • Accurate categorization: Use the correct interaction type and lifecycle stage
  • Technical depth: Include implementation details when known
  • Real examples: Reference actual use cases or institutions
  • Benefits and challenges: Help others understand trade-offs

Bulk Upload via CSV

For adding multiple interactions:

  1. Download the CSV template to see the format
  2. Prepare your data following the same structure
  3. Ensure tool names match existing tools in PRISM; otherwise, new tools will be created
  4. Use the CSV upload page to submit your file
  5. Review the results and fix any errors reported
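
Preparing the file for step 2 could be scripted along these lines. The column names below are assumptions made for illustration; the headers in the downloaded template are authoritative and should replace them.

```python
import csv
import io

# Hypothetical column names; use the headers from the downloaded template instead.
COLUMNS = ["source_tool", "target_tool", "interaction_type", "lifecycle_stage", "description"]


def rows_to_csv(rows: list[dict]) -> str:
    """Write rows to CSV text, rejecting rows with missing or unexpected columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for number, row in enumerate(rows, start=1):
        if set(row) != set(COLUMNS):
            raise ValueError(f"row {number} does not match the template columns")
        writer.writerow(row)
    return buffer.getvalue()


csv_text = rows_to_csv([{
    "source_tool": "REDCap",
    "target_tool": "R",
    "interaction_type": "Import/Export",
    "lifecycle_stage": "ANALYSE",  # illustrative choice of stage
    "description": "Export CSV from REDCap, import into R for analysis",
}])
print(csv_text)
```

Using the `csv` module rather than joining strings by hand ensures that descriptions containing commas are quoted correctly.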

Frequently Asked Questions

What is the difference between API Integration and Web Service?

API Integration refers to modern RESTful or GraphQL APIs with programmatic access, typically using JSON. Web Service is broader and includes older protocols like SOAP, XML-RPC, or domain-specific protocols like OAI-PMH. If in doubt, "API Integration" is usually the better choice for contemporary tools.

Can an interaction be assigned to more than one lifecycle stage?

Currently, each interaction is assigned to one primary lifecycle stage. If an interaction genuinely supports multiple stages, choose the stage where it's most commonly used, and mention the other stages in the description or examples field.

What if a tool I want to reference is not yet in PRISM?

When you add an interaction via CSV upload, PRISM will automatically create any missing tools. For manual entry through the web form, please contact the MaLDReTH II working group to request tool additions, or use the CSV bulk upload feature.

How is PRISM different from other research tool catalogs?

PRISM focuses specifically on interactions between tools rather than just cataloging individual tools. While many catalogs list research tools, PRISM maps how they connect, integrate, and work together across the research data lifecycle. This focus makes it valuable for understanding research infrastructure interoperability.

Can I edit an interaction after it has been added?

Yes! Every interaction has an "Edit" button on its detail page. You can update any field to improve accuracy or add additional information. All edits help improve the quality of PRISM's knowledge base.

Who maintains PRISM, and how can I get involved?

PRISM is maintained by the MaLDReTH II RDA Working Group. You can get involved by:
  • Contributing interaction data through PRISM
  • Joining the MaLDReTH II working group
  • Participating in RDA plenary sessions
  • Providing feedback and suggestions

Ready to Contribute?

Use your new knowledge to help map the research infrastructure landscape