InterMine 4.3.0 release

We are pleased to announce the new InterMine release 4.3.0.
It includes a few improvements and bug fixes.
This is a non-disruptive release.

Improvements and bug fixes

  1. Added a new generic obo source which can be configured in project.xml (Sam Hokin)
  2. Removed the gretty plugin from testmine which now uses gradle 4.9
  3. Removed duplication of so.obo file
  4. Fixed the Google Charts API which broke various things, including the humanmine expression viewers
  5. Updated Google auth and userinfo endpoints
  6. Fixed the NullPointerException in the report page caused by an empty value in the field key used to generate the permanent URI (Share button)
  7. Fixed the home page loading when Bioschemas markup is enabled (markup.webpages.enable=true) and Java 11 is used
  8. Fixed the webservice query/results when the format is jsonobjects and the root class is a simple object – with no id (Sam Hokin)

See release notes for detailed information.

Upcoming releases

For more information about the upcoming releases, please visit the InterMine roadmap here.

Outreachy Interview: Mardhiyah Sanni on ‘Review, update, and integrate InterMine developer documentation’ project

This is our blog series interviewing our 2020 Outreachy interns, who are working remotely for InterMine for 3 months (from December to February) on the user interface and developer documentation projects. We’ve interviewed Mardhiyah Sanni, who will be working on “Review, update, and integrate InterMine developer documentation”.

Hi Mardhiyah! We’re really excited to have you on board as part of the team this winter. Can you introduce yourself? 

Hi InterMine team! I am also very excited to be working with you guys during this winter. I am a recent Electronic and Electrical Engineering graduate. I am a Python developer and an aspiring machine learning engineer with particular interests in computer vision and natural language processing. I enjoy being in intellectually stimulating environments as this satisfies my hunger for solving problems of varying nature.

I also enjoy watching movies, videos of pandas and playing (probably an unhealthy amount 🙈 of) Scrabble.

What interested you about Outreachy with InterMine?

Applying to Outreachy was my first experience with open source, so I wanted to start by contributing to a documentation-related project. However, after starting with the InterMine project and interacting with the very accommodating community, I was instantly attached to this project and team. I must say that the community made my first experience of contributing to open source a very memorable and easy one, and I am therefore looking forward to spending the next three months working with the team.

Tell us about the project you’re planning to do for InterMine this winter.

The project is “Review, update, and integrate InterMine developer documentation”. I intend to evaluate different technical documentation frameworks that support Markdown and versioning as an alternative to the current documentation that uses Sphinx and Read the Docs. I also plan on reviewing the current documentation for issues and implementing solutions, converting files written in reStructuredText to Markdown format, as well as integrating the two documentation sources.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

The main challenges I’m anticipating are time and organization. However, I have come up with a detailed timeline, highlighting when tasks should be started and completed and a spreadsheet to keep track of converted and reviewed files. 

Share a meme or gif that represents your project

InterMine 4.2.0 release

We are pleased to announce the new InterMine release 4.2.0.
It includes new functionality to support the upcoming BlueGenes release 0.10.0, some improvements on the FAIR side and a few bug fixes.
This is a non-disruptive release.
Thank you so much to our contributors: Ahmed Hafez, Asher Pasha and Sam Hokin!

BlueGenes related improvements

  1. Added /login web service that merges the anonymous session with the user logged in.
  2. Added /logout web service.
  3. Added a new webservice to change the user’s password.
  4. Updated the existing /lists webservice which allows modifying the list description.
  5. Improvements on the Date type (to support CovidMine).

BlueGenes 0.10.0 will be released soon and announced in a separate blog.

FAIR related improvements

  1. Simplified the webservice that generates Bioschemas markup for the report page.
  2. Adopted DataRecord in the report page.
  3. Added Gene, Protein markup in the report page.
  4. Added BioChemEntity markup in the report page (only if configured).
  5. Added the ontology licences to the obo converters.

Bug Fixes / Improvements

  1. Added a new bio source to load ISA files in json format
  2. Fixed organism short name generation (Ahmed Hafez)
  3. Fixed a bug related to long fields in the report page (Asher Pasha)
  4. Removed BioEntity.ontologyAnnotations because it was redundant (Sam Hokin)
  5. Fixed src.data.dir.include (gff3 and xml) and src.data.dir (intermine-items-xml-file)
  6. UniProtFastaLoader works with organism names longer than 2 words (for example Severe acute respiratory syndrome coronavirus 2)

See release notes for detailed information.

Upcoming releases

For more information about the upcoming releases, please visit the InterMine Development Roadmap. More details on the roadmap here.

Google Season of Docs 2020

We’re pleased to announce that, after participating in Google Summer of Code (GSoC) for three fantastic years, and in the Outreachy mentoring program, which is running right now, we will be participating, for the first time, in Google Season of Docs 2020 as a mentor organization.
InterMine will be under the umbrella of the INCF organization; here you can find the full ideas list for INCF projects including InterMine projects (numbers 3 and 4).

InterMine Projects

  1. InterMine user training docs. For more details, please see here.
  2. Review, update, and integrate InterMine developer documentation. For more details, please see here.

If you’re interested in applying for one of our two projects, please drop an email to the people named in the project document to introduce yourself, and explain which project(s) you’re interested in.

Deadline for technical writer applications is the 9th of July.

If you have any ideas or questions, please don’t hesitate to email us.

InterMine 4.1.3 – patch release

We’ve released a small batch of bug fixes.

Fixes

  • Deleting a template doesn’t leave the template in the Templates page anymore
  • Public template creation has been fixed
  • vcf bio source fixed
  • The service/web-properties returns a valid response
  • In the report page, links to HumanMine/FlyMine are displayed

See release notes for detailed information.

This is a non-disruptive release.

InterMine 4.1.2 – patch release

We’ve released a small batch of bug fixes and improvements.

If you host your own CDN please update it with the latest version of imjs (v.3.18.1) and im-tables (v 2.1.0).

Thank you to our contributors Joe Carlson, Paulo Nuin and Asher Pasha!

Fixes

  • DataSet URLs appear in tables
  • LOOP query in webapp has been fixed
  • Complex Displayer fixed
  • Updated chebiWS-client and jami-interactionviewer-json versions
  • Licence dataset doesn’t display null
  • Runtime exception in BagManager.getBags is now caught
  • Fixed bug in the report page which allowed JavaScript to be executed (Asher’s contribution)
  • Cast conversion corrected when updating serial (Joe’s contribution)

Enhancements

  • Update to Java11 (Asher’s contribution)
  • WebservicePythonCodeGenerator updated according to Python’s code styling PEP8 (Paulo’s contribution)
  • In the Export section, option “Upload to GenomeSpace” removed
  • ThaleMine updated to psi BioSource and BioGrid (Asher’s contribution)
  • On the FAIR side: json-ld home page updated + use of the registry to set provider/support in the home page markup, ‘Shared link’ configuration improved
  • Libraries such as im.js, imtables.js, imtables-dep.js removed from intermine-webapp
  • gff source added to the bio/source multi gradle project
  • Improved the logs when post processes related to solr fail

This is a non-disruptive release.

See release notes for detailed information.

InterMine 4.1.1 – patch release

We’ve released a small batch of bug fixes and added the Code of Conduct.

Thank you to our contributor Asher Pasha (ThaleMine).

Fixes

  • ncbi-gff bio source updated due to data change
  • intermine plugin updated to allow you to build and deploy your InterMine instance using Gradle 4.9. To update the Gradle version on your mine, please read the upgrade instructions
  • merged PRs from Asher Pasha (ThaleMine) aimed at streamlining ThaleMine production.

This is a non-disruptive release.

See release notes for detailed information.

InterMine 4.1.0

InterMine team has just released InterMine 4.1.0.

The new release includes a better integration with Galaxy: we can import data into Galaxy from any InterMine of our choice (either starting from InterMine or Galaxy), and we can export a list of identifiers from Galaxy to any InterMine of our choice through the InterMine registry. No need to configure anything any more: all the Galaxy properties have been moved to InterMine core. No need to create a mine-specific Galaxy tool anymore, use the NEW intermine tool instead. Please read here for more details. A simple InterMine tutorial will be published soon in the Galaxy Training Material, under the Data Manipulation topic.

This release offers integration with ELIXIR AAI (Authentication and Authorisation Infrastructure), allowing researchers to log in to InterMine instances using their ELIXIR profile. You will need:

  1. an ELIXIR identity
  2. a registered InterMine client, which provides the client-id and the client-secret that must be set in the mine properties file.

More details here in the OpenAuth2 Settings section of the documentation.

Also new in this version is the gradle wrapper 4.9, which is compatible with Java 11. This only affects users who compile or install InterMine code.

Thank you so much to our contributor Joe Carlson for improving the generateUpdateTriggers task.

The release also contains a few bug fixes.

Bug Fixes

  • Solved the error caused by obsolete terms in the gene ontology
  • Fasta query result: CDS translation option + extra view parameter
  • The ONE OF constraint works properly when editing a template
  • The default queries configuration has been migrated to json
  • The task generateUpdateTriggers has been improved

See the release notes for the complete list and detailed information.

This is a non-disruptive release. To update your mine with these new changes, see the upgrade instructions.

Persistent identifiers (URI) and navigable URLs in InterMine

Local unique identifiers (LUIs)

A LUI (Local Unique Identifier) is an identifier guaranteed to be unique in a given local context (e.g. a single data collection) [Ref. https://doi.org/10.1371/journal.pbio.2001414]. InterMine’s existing local identifiers are based on an internal database ID; they are unique but they are not preserved across database releases. For example, the ID 1007854, which currently identifies the gene zen in FlyMine, is not persistent; after the next build the link http://www.flymine.org/flymine/report.do?id=1007854 will no longer be valid.

In InterMine, we have implemented new persistent local unique identifiers which are preserved across releases; they are based on the class types defined in the InterMine core model and on the external IDs assigned by the main data source provider.

Some examples are:
protein:P31946 (protein identifier)
publication:8829651 (PubMed identifier)
gene:MGI:1924206 (gene identifier)
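
As a rough illustration of how such identifiers fit together, the sketch below builds a LUI from a class name and an external ID and splits it back apart. The function names make_lui and split_lui are made up for this post and are not part of the InterMine API; lowercasing the class name is an assumption based on the examples above.

```python
from typing import Tuple


def make_lui(class_name: str, external_id: str) -> str:
    """Join a core-model class name and an external ID into a LUI."""
    if not class_name or not external_id:
        raise ValueError("class name and external ID must both be non-empty")
    # Assumption: the class type is written in lower case, as in the examples.
    return f"{class_name.lower()}:{external_id}"


def split_lui(lui: str) -> Tuple[str, str]:
    """Split a LUI into (class type, external ID).

    Only the first colon separates the two parts, because the external ID
    may itself contain colons (for example MGI:1924206).
    """
    class_type, sep, external_id = lui.partition(":")
    if not sep or not class_type or not external_id:
        raise ValueError(f"not a valid LUI: {lui!r}")
    return class_type, external_id


print(make_lui("Protein", "P31946"))
print(split_lui("gene:MGI:1924206"))
```

Splitting on the first colon only is what keeps identifiers such as MGI:1924206 intact.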

Persistent URIs

A URI (Uniform Resource Identifier) is an identifier which is unique on the web, not only within the local context as the LUI is, and actionable: if you paste it into the browser address bar you are redirected to the source. URIs need to be persistent in order to provide reliable sources that are always findable and accessible.

Some examples of persistent URIs are:
http://purl.uniprot.org/uniprot/P05455 where P05455 is the LUI for UniProt
http://identifiers.org/biosample/SAMEA104559033 where SAMEA104559033 is the LUI for biosample.

Where are Permanent URIs going to be used by InterMine?
1. To mark up the web pages for search engines with Bioschemas.org or Schema.org types: set the identifier attribute with the persistent URI in DataCatalog, DataSet and BioChemEntity types.
2. To generate RDF: we need persistent URI to set the subject in the triples generated.

We need to generate persistent URIs only if we create new entities. If a mine instance DOES NOT create new entities, it needs to re-use the existing URIs provided by the main source provider.
In FlyMine, for example, the RDF generated for the protein P05455 integrated from UniProt, which is the main resource provider for that data type, should be:

<http://purl.uniprot.org/uniprot/P05455> rdf:type <http://semanticscience.org/resource/SIO_010043> .
<http://purl.uniprot.org/uniprot/P05455> rdfs:label "Protein P05455" .
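
A minimal sketch of how the two statements above could be emitted while reusing the provider's URI as the subject. It writes plain N-Triples with full IRIs instead of the prefixed form, and uses only the Python standard library. The helper name protein_triples is hypothetical, and this is not how InterMine itself generates RDF.

```python
from typing import List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Type IRI and PURL base taken from the example above.
SIO_TYPE = "http://semanticscience.org/resource/SIO_010043"
UNIPROT_PURL = "http://purl.uniprot.org/uniprot/"


def protein_triples(accession: str) -> List[str]:
    """Return N-Triples lines describing a protein integrated from UniProt."""
    # Reuse the existing UniProt URI rather than minting a mine-specific one.
    subject = f"<{UNIPROT_PURL}{accession}>"
    # Escape backslashes and double quotes so the literal stays valid.
    label = f"Protein {accession}".replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"{subject} <{RDF_TYPE}> <{SIO_TYPE}> .",
        f'{subject} <{RDFS_LABEL}> "{label}" .',
    ]


for line in protein_triples("P05455"):
    print(line)
```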

But how to generate persistent URIs?

There are different options to generate persistent URIs, and the mine administrator will choose the option which is most suitable for the mine instance.

Option 1: Generate Persistent URIs using third party resolvers

In order to provide permanent URIs, we can configure the mine instance to use Identifiers.org as PURL (permanent URI) provider. These are the steps to follow:

1. register the mine instance in Identifiers.org as data collection

Namespace/prefix LegumeMine
URI (assigned by Identifiers.org) http://identifiers.org/legumemine
Primary Resource https://mines.legumeinfo.org/legumemine

2. set, in the mine instance, the property identifier.uri.base with the URI assigned by Identifiers.org (e.g. http://identifiers.org/legumemine).

The URI generated by LegumeMine for the entity GeneticMarker with primary identifier 118M3 will be: http://identifiers.org/legumemine:geneticmarker:118M3. This is persistent, unique and actionable: if you paste it into the web browser address bar, identifiers.org will redirect you to the navigable URL https://mines.legumeinfo.org/legumemine/geneticmarker:118M3.
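
The LegumeMine example can be sketched in a few lines. The code below only mimics locally what the mine and Identifiers.org do between them; the function names are hypothetical, and the two constants simply repeat the registration values listed above.

```python
# Value of the identifier.uri.base property (assigned by Identifiers.org).
URI_BASE = "http://identifiers.org/legumemine"
# Primary resource registered for the data collection.
PRIMARY_RESOURCE = "https://mines.legumeinfo.org/legumemine"


def persistent_uri(class_name: str, primary_id: str,
                   uri_base: str = URI_BASE) -> str:
    """Build the persistent URI: base, lower-cased class type, primary ID."""
    return f"{uri_base}:{class_name.lower()}:{primary_id}"


def navigable_url(uri: str, uri_base: str = URI_BASE,
                  resource: str = PRIMARY_RESOURCE) -> str:
    """Map a persistent URI to the navigable URL the resolver redirects to."""
    prefix = uri_base + ":"
    if not uri.startswith(prefix):
        raise ValueError(f"URI does not belong to {uri_base}: {uri!r}")
    # Everything after the namespace prefix is the local identifier.
    return f"{resource}/{uri[len(prefix):]}"


uri = persistent_uri("GeneticMarker", "118M3")
print(uri)
print(navigable_url(uri))
```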

Option 2 – Generate Persistent URIs setting a redirection system

A mine administrator might prefer to implement an in-house redirection system (a couple of lines in the Apache or nginx configuration files), setting up a PURL system similar to purl.uniprot.org.

In LegumeMine, for example, the permanent URI, for the entity GeneticMarker with identifier 118M3 might be: http://purl.legumemine.org/legumemine/geneticmarker:118M3.
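
In practice the redirection would live in the web server configuration, but the rule itself is simple enough to sketch in Python. The target base URL below is an assumption (it reuses the LegumeMine address from Option 1), and redirect_target is a hypothetical helper, not an InterMine function.

```python
from typing import Optional
from urllib.parse import urlsplit

PURL_HOST = "purl.legumemine.org"
# Assumption: the mine web application is served from this base URL.
TARGET_BASE = "https://mines.legumeinfo.org"


def redirect_target(purl: str) -> Optional[str]:
    """Return the URL a permanent URI should redirect to, or None.

    The path (mine name plus local identifier) is kept unchanged; only
    the scheme and host are swapped for those of the mine.
    """
    parts = urlsplit(purl)
    if parts.hostname != PURL_HOST:
        return None  # not one of our permanent URIs
    return f"{TARGET_BASE}{parts.path}"


print(redirect_target("http://purl.legumemine.org/legumemine/geneticmarker:118M3"))
```

A web server would return this value in the Location header of a permanent redirect response.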

Option 3 – Use navigable URLs

A mine administrator might decide to use the navigable URLs (see the next section) as permanent URIs.
An example is given by ZFIN where the URI http://zfin.org/ZDB-GENE-040718-423 coincides with the navigable URL.

Navigable persistent URLs

The navigable or access URLs are the URLs of the web pages where the users are redirected.
Examples of the new navigable URLs in FlyMine:
http://flymine.org/flymine/protein:P31946
http://flymine.org/flymine/publication:781465
http://flymine.org/flymine/gene:MGI:1924206

The new navigable URLs will not change at every build! We can guarantee they will be persistent by setting up a redirection system which resolves old URLs.

Navigable URLs usage

1. Permanent link button in the current report page
2. Permanent link button in BlueGenes, InterMine’s new user interface.
3. To mark up the web pages with Bioschemas.org types: the url attribute will be set with the navigable URL
4. To generate RDF: the field schema:mainEntityOfPage will be set with the persistent URL. For example:

<http://identifiers.org/MGI:97490> schema:mainEntityOfPage  <https://mousemine.org/gene:MGI:97490>

The new permanent URIs and URLs are scheduled for release in InterMine 4.0.

Researchers connected in Berlin

I really enjoyed attending the Neo4j Life & Health Sciences Workshop, organized in Berlin this week by Michael and Petra: a day rich with great presentations about the application and utility of graph technology in several research areas. Here are just a few examples:

  • The Ontology Lookup Service, a repository for biomedical ontologies, implemented with the support of graph databases and Apache Solr for indexing, different technologies for different purposes.
  • In the Lamond lab (University of Dundee), they model proteomics data with graph databases in order to understand protein behaviour under different conditions and dimensions of analysis.
  • MetaProteomeAnalyzer (MPA), a tool for analyzing & visualizing metaproteomics, uses Neo4j as the backend for metaproteomics data analysis software.
  • Tabloid Proteome is a database of associated protein pairs, derived from mass-spectrometry based proteomics experiments, implemented using a graphdb, which can also help to discover proteins that are connected indirectly, or may have information that you are not looking for!
  • Reactome is a pathway database which has recently migrated from MySQL to Neo4j, with relevant performance improvement. You can access data via the GraphCore open source Java library, developed  with Spring Data Neo4j, or via Neo4j browser.

I’ve lost count of how many times I heard sentences like: “Biology systems are complex and growing and graphs are the native data model” or “Graph database technology is an effective tool for modelling highly connected data as we have in biology systems”. We already knew it, but it’s been very encouraging and promising to hear it again from so many researchers and practitioners with more experience than us in graph technologies.

In the afternoon, I attended the workshop “Data modelling with Neo4j”; starting from the data sources we usually work with, we tried to model the entities and the relationships in order to answer some relevant questions. Modelling can be very challenging and, in some cases, it might depend on the questions you have to answer!

Before the end, I had the chance to give a short presentation about our experience with Neo4j.

Thanks again Michael and Petra for organizing such a great event!