InterMine, Oracle and the Future of Java

There have been a few questions about Oracle’s announcements on the future of Java, so this post covers what has actually changed and how it affects InterMine as a software package.

In short, these changes do not impact InterMine negatively, but we should be aware of these issues.

Oracle JDK 11 is not free for use in production; Use OpenJDK instead

Oracle changed its licensing a bit. Starting with Java 11, Oracle now releases its two JDKs under different licences:

  1. OpenJDK (open source under GPL)
  2. Oracle JDK (commercial licence)

(Previously, Oracle released the Oracle JDK under the BCL, which allowed a mix of free and commercial use, so you only had to pay “sometimes”.)

To use the Oracle JDK 11 in a production environment, you now need to purchase a commercial licence. You are still allowed to use this JDK in development, for demos, etc., but the Oracle JDK 11 is NOT free to use in production.

We develop InterMine against (and recommend people use) OpenJDK instead of the commercial JDK Oracle provides. As of Java 11, these two JDKs are now virtually identical so this is safe.

Oracle JDK 8 — “End of Public Updates”; Use OpenJDK instead

Oracle will provide public updates of Oracle JDK 8 through at least December 2020 for personal desktop use and until January 2019 for commercial use. You can continue to use Oracle’s JDK indefinitely without updates, but that’s a bad idea for security and functionality reasons. If you want updates to Java 8, switch to OpenJDK: there are free OpenJDK builds from other providers such as AdoptOpenJDK, Azul, IBM, Red Hat and various Linux distros.

OpenJDK binaries from Oracle will only be provided until the next JDK release; Use OpenJDK from a non-Oracle provider

Oracle changed their release schedule to twice a year, and they will not provide an LTS release for OpenJDK. Oracle will not provide updates to older OpenJDK versions, i.e. versions older than six months. This includes security fixes!

This is troubling as the InterMine release schedule is such that it’s not feasible to update Java versions every six months. But we can’t ignore needed security fixes.

However, Red Hat announced in September that they would take a leadership role in this area. Some providers, e.g. AdoptOpenJDK (https://adoptopenjdk.net), plan to offer OpenJDK LTS releases for free. So there will be OpenJDK LTSs available, just not from Oracle.

What does this all mean for InterMine? Not Much!

We’ll keep monitoring the situation but this seems like the usual way that companies manage open source projects — providing open software and additional paid support. So nothing to be alarmed about. OpenJDK is open source, so we are safe.

People are (rightly?) concerned about Oracle’s true commitment to Java and open source going forward. What if they change their mind and don’t release updates to OpenJDK? For InterMine this isn’t too scary because, in the worst-case scenario, we could use an older stable version of Java. However, in this nightmare scenario it’s likely that Java would be forked and we could carry on.

Future InterMine plans

We have no plans to migrate away from Java and will continue to develop using the OpenJDK as normal. We develop against the Java specification not the version so we aren’t tied to a specific Java version. For now, we’re recommending staying with OpenJDK 8 but plan to start testing with Java 11 soon.

Although some are suspicious of Oracle due to past experiences, we are optimistic about the future of Java, as the community really seems to be responding to the need for a secure and open Java.


InterMine Releases – Winter 2018 Update (Solr, Strains and being more FAIR)

Here’s a list of recent and upcoming InterMine releases.

InterMine 3.0 – Solr

Just released! This is the Solr project we discussed over the summer that was done as part of Google Summer of Code (Thanks again Arunan!). See our blog post for details.

InterMine 3.1 – Strains

This will be released next week. The release will include the data model changes we discussed on the last community call. We’ve added Strain to the core data model, with references to Organism and Sequence Feature.

Sam’s built a test mine you can query to preview the updates.

This will not be a disruptive release, except you may want to update your strains to match the core InterMine data model.

InterMine 3.1.1 – Bug fixes

3.1.1 is a small release made up of a few very small but useful bug fixes and features. If you have something specific you need done, please ask!

This will not be a disruptive release.

InterMine 4.0 – FAIR

We’ve been making InterMine more FAIR! This release will include things like adding licence information to data sets, adding ontologies to describe the data model etc. More details soon! We’re hoping this release is ready late January 2019.

This will not be a disruptive release.

Thanks for reading! As always, if you have any questions, please hop onto our Discord server (chat.intermine.org) or drop us an email.

Helpful Links:

Release Notes

Upgrade Instructions

 

InterMine 3.0 – Solr search

InterMine 3.0 is now available and features a brand new search powered by Solr.

The default search configuration will work well, but Solr allows for endless configuration for your specific needs.

Now the first search after deployment is instant, you can inspect the search index directly (via http://localhost:8983/solr/) and there’s a facet web service (via /service/facet-list and /service/facets?q=gene). Certain bugs, e.g. searching for the gene “OR”, are also now fixed.
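As a rough illustration of the facet web service paths mentioned above, the sketch below builds the two URLs for a given mine. The function name `facetUrls` and the example base URL are assumptions made for illustration; only the `/service/facet-list` and `/service/facets?q=` paths come from this post.

```javascript
// Sketch: build the facet web service URLs for a mine.
// ASSUMPTION: `facetUrls` and the base URL below are illustrative only.
function facetUrls(mineBaseUrl, term) {
  const base = mineBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const query = new URLSearchParams({ q: term }).toString(); // URL-encode the term
  return {
    facetList: `${base}/service/facet-list`,
    facets: `${base}/service/facets?${query}`,
  };
}

const urls = facetUrls("https://example.org/examplemine/", "gene");
console.log(urls.facetList); // https://example.org/examplemine/service/facet-list
console.log(urls.facets);    // https://example.org/examplemine/service/facets?q=gene
```

The returned URLs can then be requested with whatever HTTP client you prefer.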

New Configuration Option – optimize

There is a new keyword search configuration setting: index.optimize. If set to `true`, it reorganises the index so that chunks are placed together in storage, which might improve search time (similar to defragmenting a hard disk). See the configuration docs for more details.

Docs

Installing Solr

Configuring the keyword search

InterMine 3.0 upgrade instructions | release notes

A big thank you to our clever and hard-working 2018 Google Summer of Code student Arunan Sugunakumar — who did the bulk of the work as part of his summer project. Great job!

STORM + InterMine: A partnership in the fight against cancer

In July 2018 Innovate UK awarded InterMine at the University of Cambridge and STORM Therapeutics a Knowledge Transfer Partnership (KTP). A KTP is a government program that helps businesses in the UK by linking them with an academic organisation — enabling them to bring in new skills and the latest academic thinking to deliver a specific, strategic innovation project.

The key objective of this particular project is to develop an analysis platform using the data warehouse InterMine to help STORM advance their cancer research.

Here we talk with Hendrik Weisser, Senior Bioinformatician at STORM, about this collaboration.

Can you tell me about this project?

Sure, my company (STORM) is partnering with InterMine in this project. We are going to develop a computational knowledge base for cancer drug discovery and RNA epigenetics, based on InterMine’s HumanMine database. We will extend InterMine by adding analysis tools, more biomedical data etc. to make it a bespoke platform to help us identify and validate drug targets.

Can you tell me more about STORM?

STORM Therapeutics is a drug discovery company focused on RNA epigenetics, developing small-molecule inhibitors of RNA-modifying enzymes for the treatment of cancer. We are a spin-out of Cambridge University, founded in 2015 by professors Eric Miska and Tony Kouzarides from the Gurdon Institute. You can find more information – and a cool animated video about RNA epigenetics – on our website, www.stormtherapeutics.com.

What do you hope to achieve?

For STORM, convenient access to available data on RNA-modifying enzymes, their roles in RNA epigenetics, and their associations with different cancers – both direct and via interaction partners – is vital for our efforts in target validation, indication prioritisation and patient stratification. A large amount of relevant data is publicly available but is scattered over many sources and not integrated, and thus difficult and time-consuming to fully utilise. STORM’s vision is to develop an integrated database of relevant human biomedical data, which should enable our scientists to quickly view and interrogate the most pertinent data on target genes/proteins, but also allow us to easily perform bioinformatic analyses on these data.

What attracted you to InterMine? What makes InterMine a useful tool for drug discovery?

I found out about InterMine’s existence by chance and then quickly signed up to an InterMine training course at Cambridge University to learn more. I was impressed by the wealth of functionality offered by InterMine and by its sophisticated architecture that enables huge flexibility in dealing with different kinds of biological data. InterMine really represents the state of the art in terms of large-scale complex biomedical data integration. By focusing on extensibility and customisation and on enabling local installations, InterMine is able to serve a variety of research communities. These capabilities also make it an ideal fit for STORM’s requirements for an internal data management system that integrates diverse public data. The fact that InterMine is open-source, i.e. the code is and will stay available, is also important for us because it helps to ensure long-term maintainability.

—-

For more information see STORM’s website.

 

 

InterMine 2.1.0 release

We pushed out a few bug fixes and improvements:

  • FIX – “Update publications” data source failed when too many PubMed IDs were sent to the Entrez web service (Thank you to Norbert Auer!)
  • FIX – small bug for generating Python code
  • FIX – FASTA query web service times out when extensions are used (Thanks to Joel Richardson!)
  • FIX – Discontiguous CDS sequences and lengths not set properly
  • FIX – Some SO terms were not updated in 2.0 release
  • FIX – Region search: trying to leave “extended region” field blank results in error

See GitHub for details:

https://github.com/intermine/intermine/releases

How to Upgrade

Throughout the InterMine code, the InterMine version number is set via a global variable. Here’s an example:

// Gradle will download the bio-core JAR with the correct version from the Maven repository
compile group: 'org.intermine', name: 'bio-core', version: System.getProperty("bioVersion")

To change which InterMine version you are using, increment the values of the system properties “imVersion” and “bioVersion”. These are located in the “gradle.properties” file for your mine:

# gradle.properties in your mine
systemProp.imVersion=2.1.+
systemProp.bioVersion=2.1.+

Gradle will now download, for example, the latest matching version of the bio-core JAR from the Maven repository, e.g. “bio-core-2.1.0.jar”.

If you set the property to “2.1.+” you will get any small point releases that are published in the future. You can set the property to be 2.1.0 if you ONLY want to use version “2.1.0” and do not want to receive updates:

# gradle.properties to only get specific version
systemProp.imVersion=2.1.0
systemProp.bioVersion=2.1.0

Here is an example:

HumanMine upgrade to use the latest version.
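To illustrate the difference between “2.1.+” and a pinned “2.1.0” described above, here is a simplified sketch of how such a rule picks a release from a list of published versions. It is not Gradle’s actual resolver: `resolveVersion` is a hypothetical helper, the version list is made up, and only plain numeric versions are handled.

```javascript
// Simplified sketch of "2.1.+" style version selection (not Gradle's real resolver).
// ASSUMPTION: versions are plain dotted numbers such as "2.1.0".
function resolveVersion(rule, available) {
  const parse = (v) => v.split(".").map(Number);
  const compare = (a, b) => {
    const pa = parse(a);
    const pb = parse(b);
    for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
      const diff = (pa[i] || 0) - (pb[i] || 0);
      if (diff !== 0) return diff;
    }
    return 0;
  };
  // A trailing "+" matches every version sharing the prefix; otherwise match exactly.
  const matches = rule.endsWith("+")
    ? available.filter((v) => v.startsWith(rule.slice(0, -1)))
    : available.filter((v) => v === rule);
  // Pick the highest matching version, or null if nothing matches.
  return matches.sort(compare).pop() || null;
}

const published = ["2.0.0", "2.1.0", "2.1.2", "2.1.10", "3.0.0"]; // made-up list
console.log(resolveVersion("2.1.+", published)); // 2.1.10
console.log(resolveVersion("2.1.0", published)); // 2.1.0
```

With “2.1.+” a newly published point release is picked up automatically, while the pinned rule always returns the same version.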

InterMine 3.0 – Solr

The next InterMine release will be InterMine 3.0, which will include Solr. See our Solr blog post for details.

We are currently testing Solr with InterMine and should have a version ready for public beta testing early next month.

 

InterMine 2.0

We are excited to announce the official release of InterMine 2.0!

InterMine 2.0 includes some model updates, a big change in how InterMine itself is built, lots of new features, like a new UI, and a long list of bug fixes. See the full list of updates here.

This release represents a large milestone for the InterMine team! Not only because we made big fundamental changes to the core InterMine data model and build system, but also because this release represents a major shift in philosophy for us. Previously InterMine was a big, monolithic, single piece of software. You downloaded the whole InterMine, you compiled the whole InterMine, you got the whole of InterMine. Instead, we are moving towards this idea of modularity and responsiveness. Smaller, independent libraries that are interconnected but can be used for tools and features separately or linked together.

Smaller, decoupled InterMine packages will allow us to develop more features faster with fewer errors. InterMine maintainers might then have the flexibility to include (or not) the features in their mine, plug in their own tools, etc.

Version 2.0 represents a big step towards this goal!

A New Interface

A new feature in InterMine 2.0 is the ability to run our new UI, nicknamed “Blue Genes”. This app is in addition to the current webapp and offers a new and responsive search environment for your InterMine data.

Blue Genes is built in Clojure and provides a modern user experience:

  • Super fast response times
  • Interactive list upload
  • Redesigned “My account” section
  • Search autocomplete
  • Template and query builder result previews
  • ... and lots more!

Once you have your InterMine updated to InterMine 2.0, there is a single command that will launch Blue Genes for your mine.

We are actively seeking feedback on Blue Genes. It’s still very much in the beta phase, so please get in touch once you have some opinions!

Special Thanks

Thanks to everyone who helped test this release! Thanks Howie Motenko at MGI for your alpha testing and model insights. And a BIG thank you goes to Sam Hokin from the NCGR who spent a lot of time and effort helping improve InterMine! Thanks Sam and Howie! You are much appreciated.

Helpful Links

What exactly we changed (blog post)

Full list of GitHub tickets included in this release

Docs on how to upgrade to 2.0

 

As always, please contact us if you have any questions or comments! We have an active Twitter account, a Discord server at chat.intermine.org, and a low-traffic mailing list.

FlyMine 46.0 Released!

FlyMine has been updated to the latest version of FlyBase. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

 

Data model changes!

Our data model has changed slightly to make querying easier between mines.

  • Protein molecular weight is a float instead of an integer
  • We’ve added URLs for GO term evidence codes
  • Sequence Ontology (the basis for the InterMine data model) has been updated, so lots of new data types have been added.

See our previous blog post for complete details of updates.

No more Anopheles

After listening to community feedback, we have decided to stop loading Anopheles data into FlyMine. However, as always, if there is a specific data set you are interested in, please contact us!

 

We have docs and videos, and for a full list of data sources available in FlyMine see the data sources list.

However, please do not hesitate to contact us should you require any further assistance. For all types of help and feedback email info@intermine.org.

HumanMine 5.0 released!

HumanMine has been updated to the latest version of NCBI Entrez Gene. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

 

New Data Source: GTEx

We added a new expression data set, GTEx. Here’s an example search:

Gene --> Tissue expression

Tissue --> Gene expression

Data model changes!

Our data model has changed slightly to make querying easier between mines.

  • Protein molecular weight is a float instead of an integer
  • We’ve added URLs for GO term evidence codes
  • Sequence Ontology (the basis for the InterMine data model) has been updated, so lots of new data types have been added.

See our previous blog post for complete details of updates.

 

We have docs and videos, and for a full list of data sources available in HumanMine see the data sources list.

However, please do not hesitate to contact us should you require any further assistance. For all types of help and feedback email info@intermine.org.

Coming up soon: InterMine 2.0 release webinar, community calls, and GSoC presentations

What’s coming up soon in InterMineLand? Here are a few of the highlights:

Upgrading to 2.0 – Thursday 2nd August

With the release of InterMine 2.0 RC1, we’ll be dedicating the InterMine Developer call to an InterMine 2.0 Upgrade Webinar, spending around 20 minutes discussing how one upgrades an InterMine 1.x installation to use the newer (and much more easygoing) Gradle dependency management system. Q&A afterwards so you can learn everything you’ve been burning to know. [Call in information]

This call will be recorded so anyone who couldn’t make it can catch up.

GSoC Student project presentations – Thursday 16th August

Six students, six awesome projects. Our students have been blogging prolifically while working over the last three months, and they’ll be presenting their work on the developer call, with five-minute slots per student plus time for Q&A afterwards. [Agenda here]

This call will be recorded so anyone who couldn’t make it can catch up.

Community Outreach Call – 6th September 

Once a quarter we host non-techie calls where we focus on interesting things the community has been doing as well as community engagement in general. This time we’ll be featuring Kevin Macpherson, who runs some fantastic community outreach at SGD, including amazing webinar video use-cases.  [Agenda, still work in progress]

Previous featured speakers include Jacqueline Campbell talking about her approach to community engagement, Wayne Decateur demonstrating InterMine code in Jupyter notebooks, and Abby Cabunoc Mayes, Mozilla’s Working Open Practice Lead.

We’re still looking for speakers for this call and the next one, in December. If you have a topic you’d like to share about InterMine, open science/source, or bioinformatics in general, ping yo@intermine.org to pitch the idea.

GSoC’18- Cross InterMine Search Tool (Progress so far)


It has been three weeks since the start of the GSoC’18 coding phase, and a lot of progress has been made so far. The project has been deployed here.

This week has been very productive. I worked on several parts of the application. My main focus was adding filters, but I ended up with some more cool stuff too 😉
Let’s have a look at each of them in turn.

1. Search Rating/Relevance Score

The application knows what you are searching for 😉

The relevance score is a value calculated by the InterMine QuickSearch API endpoint for every keyword search. It indicates how relevant a result item is to the search keyword. The QuickSearch API response provides it as a floating-point ‘relevance’ parameter. InterMine uses a formula to determine the search score shown in the individual InterMine search portals. I have used the same formula to convert the floating-point relevance into a score out of 5: the relevance is clamped to the range 0.1 to 1, multiplied by 5 and rounded, so every result scores between 1 and 5. The formula is:
=> Math.round(Math.max(0.1, Math.min(1, relevance)) * 5)
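Wrapped in a function, the formula above might look like the following sketch. The helper name `relevanceToScore` is made up for illustration and is not part of InterMine or the search tool.

```javascript
// Sketch: convert a QuickSearch relevance value into a score out of 5.
// ASSUMPTION: `relevanceToScore` is a hypothetical helper name.
function relevanceToScore(relevance) {
  // Clamp into [0.1, 1] so that even weak matches get a score of at least 1.
  const clamped = Math.max(0.1, Math.min(1, relevance));
  // Scale to the 5-point range and round to the nearest whole score.
  return Math.round(clamped * 5);
}

console.log(relevanceToScore(0.05)); // clamped to 0.1, so 0.5 rounds to 1
console.log(relevanceToScore(0.5));  // 2.5 rounds to 3
console.log(relevanceToScore(3.2));  // clamped to 1, so the score is 5
```

Note that JavaScript’s `Math.round` rounds halves upwards, which is why a relevance of 0.5 gives 3 rather than 2.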

2. Different colors for different Categories in Results

The world is colorful, and so is our app 😉

This came from feedback received from the InterMine community. Shading the results of each category in a different color makes them easier to explore. At times, the application may return a long list of result items, and separate colors for separate categories help us scan it faster. Our eyes are meant to perceive colors more quickly than text.

3. External links to reports

Every result item has a link which opens the result report on its particular InterMine portal. This report page contains more detailed information about the result item. On clicking the icon, a new tab with the given result report opens in the browser. The result link is generated dynamically.
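A minimal sketch of how such a link could be generated is shown below. The function name `buildReportUrl`, the example base URL and the result’s `id` field are assumptions for illustration, and the `report.do?id=` path should be checked against the mine you are linking to.

```javascript
// Sketch: build a report-page link for one search result.
// ASSUMPTIONS: `buildReportUrl`, the base URL and the `id` field are illustrative,
// and the `report.do?id=` path is assumed rather than taken from this post.
function buildReportUrl(mineBaseUrl, result) {
  const base = mineBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/report.do?id=${encodeURIComponent(result.id)}`;
}

const url = buildReportUrl("https://example.org/examplemine/", { id: 1007664 });
console.log(url); // https://example.org/examplemine/report.do?id=1007664
```

In the browser, the resulting URL would be used as the `href` of a link that opens in a new tab.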

4. Metadata about search results

Metadata for BMAP mine (search: ‘brca1’)

Having metadata about the returned results is always handy. Every tab in the application loads a set of metadata like the one shown above. This helps in understanding, in a unified way, how a search term is represented in the mine.

5. Search/Relevance Score Filter

Score filter on sidebar of the application

With the addition of the relevance score to the application, an option to filter the results by it became necessary. This section provides radio buttons to filter the results by the relevance score of the result item. I hope this feature will be really useful to community members. 🙂

6. Category Filter

Every mine contains data from many different categories, so a category filter is a necessary part of a full-fledged search tool. This section is loaded dynamically based on the categories returned by the API: the application uses the search result metadata to generate the category checkboxes, so nothing needs to be hard-coded.
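The idea can be sketched as follows. The shape of the metadata (a `Category` object mapping category names to result counts) and the helper name `categoryOptions` are assumptions for illustration, not the tool’s actual code.

```javascript
// Sketch: derive category filter options from search-result metadata.
// ASSUMPTION: `facets.Category` maps a category name to its result count.
function categoryOptions(facets) {
  const counts = (facets && facets.Category) || {};
  return Object.entries(counts)
    .filter(([, count]) => count > 0)   // skip categories with no results
    .sort((a, b) => b[1] - a[1])        // most results first
    .map(([name, count]) => ({ name, count, checked: true }));
}

const options = categoryOptions({
  Category: { Protein: 3, Gene: 12, Publication: 0 }, // made-up counts
});
console.log(options.map((o) => `${o.name} (${o.count})`).join(", "));
// Gene (12), Protein (3)
```

Each returned option can be rendered as one checkbox, so a mine with different categories needs no code changes.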


So, that is all the progress so far. Almost everything in the application is mobile-optimized and ready to use. Most probably, I will also extend the scope of this project and add a REST API service for searching multiple mines. The project will then be a full-fledged Cross InterMine Search Tool, i.e. a package consisting of a client-driven search interface and a back-end REST API service. You can find the project repository here. Please have a look at the application here. I would love to have more feedback from the community. (To send feedback or suggestions, kindly email me at dwivedi.aman96@gmail.com.) Thanks for reading. Happy Coding! 🙂