Version 2 Now Live!


Welcome To LD Connect,
Our Linked Data Portal

What's New

IOS Press is pleased to welcome you to LD Connect 2.0 (Beta), the second release of our linked data portal. The portal contains linked metadata for all IOS Press books and journals, as well as embeddings of all full text.

This second version offers an improved user interface and twice the number of data points: linked data for more than 132,000 journal articles and book chapters and over 330,000 authors, resulting in more than 16 million triples.

Our linked data is openly available to the public, so it can enrich third-party datasets, further unsilo research data, and incentivize new discoveries.

We invite you to explore the data through our data browser, download our entire dataset or just subsets, download our word embeddings trained on the full text of all IOS Press publications or use the SPARQL endpoint search box on this platform.

Would you like to see our data connected to yours or use our dataset to power one of your applications? We are very open to collaborations and would love to hear about your project. Please email us or fill in the form below and we will get back to you at our earliest convenience.

Note: This project is in beta. If you have any feedback, please don't hesitate to get in touch.

What Is Linked Data?

Linked data is a method of publishing structured data on the Web in a human- and machine-readable way, thereby breaking apart data silos and fostering the interlinking of data.

It builds upon standardized Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read and understood automatically by computers.

Tim Berners-Lee coined the term in 2006 and described the following principles:

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
  4. Include links to other URIs so that they can discover more things.

Semantic Web technologies and ontologies, i.e., formal vocabularies, provide the means to query that data, draw inferences using vocabularies, and combine data across different sources. To make the Web of Data a reality, it is important to have the data available in a standard format that Semantic Web tools can reach and manage, and to link it to other data. This collection of interrelated datasets on the Web can also be referred to as linked data.

For example, metadata about research articles can be published as linked data together with data about authors, their affiliations, and so forth. Links between these data enable queries such as finding all authors, across different journals, who work on Alzheimer's disease. Establishing links between different datasets then makes it possible to integrate these bibliographic data with a wide variety of other data, such as demographic datasets, e.g., to establish a relationship between the affiliation of a research team, the research topic, and the region studied in their work.
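The kind of query described above can be sketched in plain Python, assuming a handful of hand-made triples. The `ex:` identifiers and property names below are hypothetical and are not taken from the actual LD Connect vocabulary; a real query would be written in SPARQL against the portal's endpoint.

```python
# Hypothetical, hand-made triples (subject, predicate, object).
# The ex: names are illustrative only, not the portal's real vocabulary.
TRIPLES = [
    ("ex:paper1", "ex:publishedIn", "ex:journalA"),
    ("ex:paper1", "ex:hasKeyword", "Alzheimer's disease"),
    ("ex:paper1", "ex:hasAuthor", "ex:alice"),
    ("ex:paper2", "ex:publishedIn", "ex:journalB"),
    ("ex:paper2", "ex:hasKeyword", "Alzheimer's disease"),
    ("ex:paper2", "ex:hasAuthor", "ex:bob"),
    ("ex:paper3", "ex:publishedIn", "ex:journalA"),
    ("ex:paper3", "ex:hasKeyword", "Semantic Web"),
    ("ex:paper3", "ex:hasAuthor", "ex:carol"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def authors_by_keyword(triples, keyword):
    """Follow keyword -> paper -> author links and collect the authors."""
    papers = {s for s, _, _ in match(triples, p="ex:hasKeyword", o=keyword)}
    return sorted(
        o
        for paper in papers
        for _, _, o in match(triples, s=paper, p="ex:hasAuthor")
    )


# Authors across both journals who wrote about the keyword.
print(authors_by_keyword(TRIPLES, "Alzheimer's disease"))
```

Because the keyword, paper, and author statements are linked through shared identifiers, the lookup works across journals without any journal-specific logic.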

Why Linked Data?

As a science publishing house operating in an era of digital transformation, we felt it was imperative for us to apply best practices to all aspects of our workflow. The potential of linked data is not lost on us. By offering our datasets in machine-readable form to third parties and semantic tools, we hope to contribute in a meaningful way to scientific progress.

Dr. Einar Fredriksson, founder and director of IOS Press

The un-siloing of data leads to improved retrieval, accessibility, reusability, and interoperability. Structured data can be searched, shared, reused, data mined and linked to other data sources. Contextual relations between authors, institutions and research areas can be made visible. Downstream applications such as abstracting and indexing databases can use the data portal to ensure their own datastores are always up to date with the latest research published by IOS Press. Furthermore, authors who publish their work with IOS Press can do so with the assurance that their work is disseminated through both human- and machine-accessible channels, following web-friendly standards.

The portal currently contains millions of triples, i.e., individual statements, and maps connections between metadata of journal articles, book chapters, authors, affiliations, keywords and other bibliographic metadata to provide a complete ecosystem of the IOS Press scholarly relationships.

More statistics on the current volume of data can be found below. New data is continuously added, and new data points will also be added to further enrich the portal. Tools that visualize the data for human consumption, as well as tools for knowledge mining, include a visual linked data browser, a SPARQL query endpoint, and a visual query interface. Further tools, e.g., for semantic search, are in development.

The linked data portal was developed in collaboration with STKO Lab in Santa Barbara, CA, USA.

Information About Our Data

We spent a considerable amount of time cleaning up our data and constructing a conversion pipeline to transform all IOS Press article and book metadata to RDF-based linked data.

We use a custom vocabulary and web standards extensively when describing our data, in order to make it more discoverable, accessible, linkable and interoperable with other datasets. The affiliations are geocoded, and authors as well as affiliations are disambiguated using our co-reference resolution script. With the help of machine learning techniques, the data conversion pipeline keeps improving as more data are added. Co-reference resolution was developed in collaboration with the DaSe Lab at Wright State University.

Our datasets contain, among other things, metadata of journal articles, volumes, issues, book chapters, published dates, ISSNs, DOIs, authors, affiliations, keywords, pages and abstracts.

Pre-trained Doc2Vec Models

The two files linked below contain pre-trained Doc2Vec models of all English-language journal articles and book chapters published by IOS Press over the years. The models are based on the full-text content, not just the abstracts. In total, the training dataset comprised more than 132,000 papers, all of which are also matched to entities in the IOS Knowledge Graph. The corresponding word embedding model has a vocabulary size of 105,839. The embedding dimension of both models is 200. The Doc2Vec model was trained using the Python gensim library (version 3.3.0).

  1. "" can be loaded directly into the gensim library.
  2. "" contains the Doc2Vec model and its corresponding Word2Vec model as plain-text files (“doc2vec.txt”, “w2v.txt”). The “doc2vec_voc.txt” file contains a list of all the paper entity URLs of the Doc2Vec model. The “w2v_voc.txt” file contains a list of the word vocabulary of the corresponding Word2Vec model. This version can therefore be used for work that requires direct integration with the IOS Knowledge Graph.
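One common use of document embeddings like these is finding similar papers by cosine similarity. The following is a minimal sketch with toy data. It assumes, without confirmation from the files themselves, that the i-th entry in the URL list corresponds to the i-th vector; the `ex:` URLs and the 3-dimensional vectors are placeholders for the real 200-dimensional data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(urls, vectors, query_url, top_n=2):
    """Rank all other papers by cosine similarity to the query paper."""
    query_vec = vectors[urls.index(query_url)]
    scored = [
        (url, cosine(query_vec, vec))
        for url, vec in zip(urls, vectors)
        if url != query_url
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy stand-ins: hypothetical paper URLs and 3-dimensional vectors.
urls = ["ex:paperA", "ex:paperB", "ex:paperC"]
vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]

for url, score in most_similar(urls, vectors, "ex:paperA"):
    print(url, round(score, 3))
```

Since the results are keyed by paper entity URL, each hit can be looked up in the knowledge graph to retrieve its authors, keywords, and other metadata.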

For questions, please contact - Krzysztof Janowicz at

IOS Knowledge Graph Embedding

The IOS Knowledge Graph (KG) Embedding files are trained on the IOS Knowledge Graph using the TransE algorithm. The algorithm uses each triple with an object property to train an embedding for each entity and each predicate in the KG. For a triple <s, p, o>, TransE learns k-dimensional embeddings for the entities s and o as well as the relation p such that s + p is approximately equal to o.

Note that TransE_ent.txt and TransE_relation.txt follow the word embedding format defined by Python's gensim package.
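In the standard TransE formulation, a triple is considered plausible when the translated subject s + p lands close to the object o. The sketch below scores triples with the Euclidean norm of s + p - o (the L1 norm is also commonly used). The 2-dimensional vectors and entity names are made up for illustration and do not come from the published embedding files.

```python
import math


def transe_distance(s, p, o):
    """Euclidean norm of (s + p - o); smaller means a more plausible triple."""
    return math.sqrt(sum((si + pi - oi) ** 2 for si, pi, oi in zip(s, p, o)))


# Toy 2-dimensional embeddings (hypothetical values, not from the real files).
author = [0.1, 0.2]
affiliated_with = [0.4, 0.1]
university = [0.5, 0.3]
journal = [-0.6, 0.9]

# author + affiliated_with lands on university, so this distance is near zero.
good = transe_distance(author, affiliated_with, university)
# The same translation lands far from journal, so this distance is large.
bad = transe_distance(author, affiliated_with, journal)

print(good < bad)
```

Ranking candidate objects by this distance is the usual way such embeddings are used for link prediction.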
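As a rough guide to that format, the sketch below parses the word2vec-style text layout that gensim reads and writes: a header line with the number of rows and the dimension, followed by one line per token and its values. The sample content is invented. With gensim installed, `KeyedVectors.load_word2vec_format(path, binary=False)` would normally be used instead of a hand-written parser.

```python
import io


def read_embeddings(handle):
    """Parse the word2vec text format: a '<count> <dim>' header, then one
    '<token> <v1> ... <vdim>' line per embedding."""
    count, dim = map(int, handle.readline().split())
    vectors = {}
    for line in handle:
        if not line.strip():
            continue  # tolerate blank lines, e.g. at the end of the file
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            raise ValueError(f"expected {dim} values, got {len(parts) - 1}")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != count:
        raise ValueError(f"header announced {count} rows, found {len(vectors)}")
    return vectors


# Invented sample in the same layout; real files would be opened with open().
sample = io.StringIO("2 3\nex:entity1 0.1 0.2 0.3\nex:entity2 -0.4 0.5 0.6\n")
embeddings = read_embeddings(sample)
print(embeddings["ex:entity2"])
```

The parser checks each row against the header so that a truncated or mismatched download fails loudly rather than producing partial vectors.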

For questions, please contact - Krzysztof Janowicz at

Unleashing The Potential

Providing machine-readable, interlinked metadata that is publicly available opens up a wide range of opportunities.

On the one hand, we offer our linked data to the public, so it can enrich third-party datasets, further unsilo research data, and incentivize new discoveries.

On the other hand, we are currently working on services and tools built on top of our linked data. Our tools can be used to:

The possibilities are endless and we are only at the start of it. Connect your dataset to other datasets out there and new potential is unleashed, time and again.

LD Connect currently contains


Articles & Chapters





Feedback form

Please let us know what you think, so we can further improve our linked data platform!

Stay informed about LD Connect