JavaScript everywhere – the BlueGenes Tool API version 1 is released!

If you attended the 2017 InterMine developer workshop, you may recall the discussion we had about embedding tools in InterMine’s new UI, BlueGenes. One of the biggest priorities was to make sure that it was easy and fun to create visualisations for your mine.

It’s taken a lot of sweat, toil, testing, and iteration, but I’m incredibly excited to announce that Version 1.0 of the Tool API is released today. This means that you’ll be able to view all of your favourite client-side tools in BlueGenes, hopefully with just a few quick tweaks. It’ll also be relatively straightforward to install tools that other people have created.

Preview of a protein report page for A0A0B4KEJ0_DROME, with the InterMine Cytoscape interaction viewer and ProtVista protein feature viewer both embedded.

 

You can try it out yourself, at bluegenes.apps.intermine.org – just search for a gene or protein report page.

How does it work?

We didn’t want to re-invent the wheel, and JavaScript already has well-established package managers. We use npm (node package manager) to package and install all BlueGenes tools. You can find all BlueGenes-compatible tools by browsing the tag bluegenes-intermine-tool, and once you’ve configured your InterMine to work with the Tool API, installing a new tool is often as simple as typing npm install @intermine/my-new-tool-name --save into a terminal. Equally, updating tools is as simple as running npm update from your tools folder. BlueGenes then looks inside a file in your tools folder called package.json, an npm configuration file that contains a manifest of all the installed packages, and picks up all the installed tools listed there.
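As a rough illustration of the idea (not BlueGenes’ actual implementation), the sketch below reads a package.json manifest and lists the packages in its dependencies field. The helper name list_installed_tools and the sample manifest, including its version number, are hypothetical.

```python
import json

# Hypothetical manifest: the package name is the example used above,
# while the manifest name and version number are made up for illustration.
SAMPLE_PACKAGE_JSON = """
{
  "name": "my-mine-tools",
  "dependencies": {
    "@intermine/my-new-tool-name": "^1.0.0"
  }
}
"""


def list_installed_tools(manifest_text):
    """Return the sorted package names listed under "dependencies"."""
    manifest = json.loads(manifest_text)
    # A manifest with no installed packages may lack the key entirely.
    return sorted(manifest.get("dependencies", {}))


if __name__ == "__main__":
    for tool in list_installed_tools(SAMPLE_PACKAGE_JSON):
        print(tool)
```

Reading the manifest rather than scanning the folder means only packages that were deliberately installed with --save are treated as tools.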

Getting started

Running tools in BlueGenes

If you have an InterMine that’s at least at version 2.1, you can start BlueGenes with the ./gradlew blueGenesStart task. See our docs for details.

It’s also possible to run tools on standalone BlueGenes.

Once tools are installed, you can see the entire list of them under the developer menu in BlueGenes (top right > click the cog > developer > tool “app store”).

Developing tools and converting your existing tools

So, you’ve gotten some tools installed and you’d like to add some more of your own? We have the full tool API specs, a tutorial to walk you through creating your first tool, and a nice tool-scaffolder yeoman generator that will create most of the boilerplate files you need automatically, so you can spend time on more important things like eating cake 🎂 and feeding the cat.

Some credits and thanks

A huge thanks to Vivek Krishnakumar for the very first draft of the Tool API specs, and to Josh Heimbach for further work on the spec. I’d also like to thank Julie, who patiently tested the Tool API installation process and helped me iron out a lot of the bugs.

Future plans

What’s next for BlueGenes and the Tool API? Well, we have some updates planned specifically for the Tool API, including: 

  • extending the tool support to list result pages
  • working with legacy tools that aren’t packaged as JavaScript modules
  • better integration with BioJS.

You can see a roadmap here for the Tool API. First, though, we’ll probably be thinking about some of the final bits of polish BlueGenes needs before it can be officially launched as a non-beta UI, including: 

  • Authentication. When the InterMine web services were initially implemented, it was with an eye to enabling data scientists and bioinformaticians to access data from InterMine. Some authentication-related services, such as token-based authentication, have been implemented, but because the web services weren’t designed with a full application layer in mind, we need to add some more, such as user registration.
  • A MyMine section. We started with some prototypes late last year / early this year, but ended up rolling them back due to unexpected complications.
  • Speedier and more configurable report pages. Is there anything you’ve always wished for in a report page that’s not a tool? Feel free to ping us with ideas.

 

Questions, concerns, confusion, ideas?

Drop by the InterMine chat, email the group at info@intermine.org, or drop me a mail directly – yo@intermine.org.


InterMine 3.1 – Extending the Core InterMine Data Model with Multiple Genome Versions, Strains

Advances in sequencing technologies mean that genome sequence and annotation data for multiple strains of a species are now often available. It was therefore decided to update the InterMine core data model so that Strain data can be added where it is available, without affecting InterMines that do not have this data.

It was decided that adding a new class, Strain, which is referenced by Organism and SequenceFeature and vice versa, would provide the flexibility required and leave room for further data and expansion.

 


The Strain class has the following features/advantages:

  • SequenceFeature entities, such as Genes, would continue to reference Organism, but would also reference the new Strain class, allowing for queries returning SequenceFeatures for a specific strain.
  • Providing strain information as a separate class allows individual InterMines to reference other information as required, such as Genotype and Stocks.
  • The Strain class extends BioEntity so will include strain-relevant attributes such as PrimaryIdentifier and Name and will reference other collections such as synonym.
  • Minimal changes to the user interface will be required as, to our knowledge, SequenceFeatures in individual strains always have a unique identifier. With the help of templates if necessary, users will be able to identify particular SequenceFeatures and which strain they originate from.
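The relationships described above can be sketched in a few lines of Python. This is only an illustration of the shape of the model, not InterMine’s actual (Java/XML) data model definition: the field names, the Optional strain reference and the helper features_for_strain are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Organism:
    name: str


@dataclass
class Strain:
    # Identifier and name mirror the attributes Strain is said to gain by
    # extending BioEntity; the exact field names here are assumptions.
    primary_identifier: str
    name: str
    organism: Organism


@dataclass
class SequenceFeature:
    primary_identifier: str
    # Features keep referencing Organism, as before.
    organism: Organism
    # Optional, so mines without strain data are unaffected.
    strain: Optional[Strain] = None


def features_for_strain(
    features: List[SequenceFeature], strain_id: str
) -> List[SequenceFeature]:
    """Return the features whose strain has the given identifier."""
    return [
        f for f in features
        if f.strain is not None and f.strain.primary_identifier == strain_id
    ]
```

Because the strain reference is optional, a mine that never loads strain data simply leaves it empty and existing queries by organism keep working.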

 

To update your mine with these new changes, see upgrade instructions. This is a non-disruptive release.

See release notes and the notes from the community call for more details. Please join our community calls if you’d like to be part of future data model decisions! (Details of upcoming calls are available via our developer mailing list).

Google Summer of Code 2019 – it’s time to get your thinking caps on!

TL;DR: Send us your awesome project ideas and/or volunteer to be a mentor!

Longer version:

We need your ideas!

GSoC 2019 has been announced, and as in 2017 and 2018, InterMine will be applying again to become a mentor organisation.  This means we’re back at the “we want your project ideas!” phase – and we do! If you work with or use an InterMine and have ideas for its improvement – might it be something big enough for a student to work on for three months? Any of these types of ideas would be great:

  • An interesting exploratory project that answers a question – “is x likely to be possible or practical with InterMine?”
  • Fixing something that’s always bothered you – in 2018, we managed this with the Solr Search update!
  • A well-scoped application like the InterMine iOS app
  • A set of JavaScript visualisations for your InterMine’s data.

Even if you don’t have time to mentor the project yourself, any ideas like this would be greatly appreciated! You can add them directly to the ideas doc, or feel free to contact us to chat about it more.

Could you mentor for an InterMine project?

In 2017 and 2018 we’ve had mentors from the community, including mentors who were previously GSoC students. Ideally, interested mentors should be known to us, perhaps because you are an InterMine user, developer/maintainer/administrator, or a previous student. If you don’t have any project ideas of your own, you may be able to pick one from the project list that suits your skills and interests.

What is mentoring like, you might ask? We have the basics set out in our mentor terms and conditions (which isn’t as dry as the title suggests). Some things to note:

  • The busiest period is the application phase, when multiple students will be interacting with you to learn about and contribute to the community.
  • After this, things calm down a lot. You’re expected to meet (virtually) with your student at least once a week for the three months of the coding phase. Proactive students often only need an hour or two here or there, but other students may need more hands-on attention.
  • Students’ wages are paid by Google. Mentors also get a small stipend and thank-you gift after GSoC is complete.
  • You’ll be paired with a Cambridge-based mentor for support, guidance, and cover while on vacation.

A great opportunity all around

GSoC as a program ends up being an incredibly valuable two-way exchange. Students get three months of paid work experience at an open source organisation, and on the other side InterMine and InterMine mentors end up with the chance to guide projects and see some truly fantastic work implemented. Promising students might even end up applying for vacancies when they come up – it’s a great way to broaden your community!


InterMine, Oracle and the Future of Java

There have been a few questions about Oracle’s announcements on the future of Java, so this post hopes to cover what actually has changed and how this impacts InterMine as a software package.

In short, these changes do not impact InterMine negatively, but we should be aware of these issues.

Oracle JDK 11 is not free for use in production; Use OpenJDK instead

Oracle changed its licencing a bit. Starting with Java 11, Oracle now releases its two JDKs under different licences:

  1. OpenJDK (open source under GPL)
  2. Oracle JDK (commercial licence)

(Previously, Oracle had released these both under the BCL licence which allows a mix of free and commercial use, so you only had to pay “sometimes”).

To use the Oracle JDK 11 in a production environment, you now need to purchase a commercial licence. You are still allowed to use this JDK in development, for demos etc but the Oracle JDK 11 is NOT free to use in production.

We develop InterMine against (and recommend people use) OpenJDK instead of the commercial JDK Oracle provides. As of Java 11, these two JDKs are now virtually identical so this is safe.

Oracle JDK 8 — “End of Public Updates”; Use OpenJDK instead

Oracle will provide public updates of Oracle JDK 8 through at least December 2020 for personal desktop use and January 2019 for commercial use. You can continue to use Oracle’s JDK indefinitely without updates, but that’s a bad idea for security and functionality reasons. If you want updates to Java 8, switch to OpenJDK: there are free OpenJDK builds from other providers such as AdoptOpenJDK, Azul, IBM, Red Hat, other Linux distros etc.

OpenJDK binaries from Oracle will only be provided until the next JDK release; Use OpenJDK from a non-Oracle provider

Oracle changed its release schedule to twice a year, and it will not provide an LTS release for OpenJDK. Oracle will not provide updates to older OpenJDK versions, i.e. versions older than six months. This includes security fixes!

This is troubling as the InterMine release schedule is such that it’s not feasible to update Java versions every six months. But we can’t ignore needed security fixes.

However, Red Hat announced in September that they would take a leadership role in this area. Some providers, e.g. https://adoptopenjdk.net, plan to offer OpenJDK LTS releases for free. So there will be OpenJDK LTSs available, just not from Oracle.

What does this all mean for InterMine? Not Much!

We’ll keep monitoring the situation but this seems like the usual way that companies manage open source projects — providing open software and additional paid support. So nothing to be alarmed about. OpenJDK is open source, so we are safe.

People are (rightly?) concerned about Oracle’s true commitment to Java and open source going forward. What if they change their mind and don’t release updates to OpenJDK? For InterMine this isn’t too scary because worst case scenario we could use an older stable version of Java. However in this nightmare scenario it’s likely that Java would be forked and we could carry on.

Future InterMine plans

We have no plans to migrate away from Java and will continue to develop using the OpenJDK as normal. We develop against the Java specification not the version so we aren’t tied to a specific Java version. For now, we’re recommending staying with OpenJDK 8 but plan to start testing with Java 11 soon.

Although some are suspicious of Oracle due to past experiences, we are optimistic about the future of Java, as the community really seems to be responding to the need for a secure and open Java.


InterMine Releases – Winter 2018 Update (Solr, Strains and being more FAIR)

Here’s a list of recent and upcoming InterMine releases.

InterMine 3.0 – Solr

Just released! This is the Solr project we discussed over the summer that was done as part of Google Summer of Code (Thanks again Arunan!). See our blog post for details.

InterMine 3.1 – Strains

This will be released next week. The release will include the data model changes we discussed on the last community call. We’ve added Strain to the core data model, with references to Organism and Sequence Feature.

Sam’s built a test mine you can query to preview the updates.

This will not be a disruptive release, except you may want to update your strains to match the core InterMine data model.

InterMine 3.1.1 – Bug fixes

3.1.1 is a small release comprising a few very small but useful bug fixes and features. If you have something specific you need done, please ask!

This will not be a disruptive release.

InterMine 4.0 – FAIR

We’ve been making InterMine more FAIR! This release will include things like adding licence information to data sets, adding ontologies to describe the data model etc. More details soon! We’re hoping this release is ready late January 2019.

This will not be a disruptive release.

Thanks for reading! As always, if you have any questions, please hop onto our Discord server (chat.intermine.org) or drop us an email.

Helpful Links:

Release Notes

Upgrade Instructions

 

InterMine 3.0 – Solr search

InterMine 3.0 is now available and features a brand new search powered by Solr.

The default search configuration will work well, but Solr allows for endless configuration to suit your specific needs.

The first search after deployment is now instant, you can inspect the search index directly (via http://localhost:8983/solr/), and there’s a facet web service (via /service/facet-list and /service/facets?q=gene). Certain bugs, e.g. searching for the gene “OR”, are also now fixed.
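For anyone scripting against the facet web service, here is a minimal sketch that builds the two URLs from the paths given above. The base URL is a hypothetical local mine, and the helper names are made up for this example. The response format isn’t described here, so the sketch stops at constructing the URLs.

```python
from urllib.parse import urlencode

# Hypothetical base URL of a mine's web application.
BASE_URL = "http://localhost:8080/mymine"


def facet_list_url(base_url):
    """Build the URL of the facet-list service mentioned above."""
    return base_url.rstrip("/") + "/service/facet-list"


def facets_url(base_url, query):
    """Build the URL of the facets service for a search term."""
    # urlencode escapes the search term so spaces and symbols are safe.
    return base_url.rstrip("/") + "/service/facets?" + urlencode({"q": query})


if __name__ == "__main__":
    print(facet_list_url(BASE_URL))
    print(facets_url(BASE_URL, "gene"))
```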

New Configuration Option – optimize

There is a new keyword search configuration setting: index.optimize. If set to `true`, the index is reorganised so that chunks are placed together in storage, which might improve the search time (similar to defragmenting a hard disk). See the configuration docs for more details.

Docs

Installing Solr

Configuring the keyword search

InterMine 3.0 upgrade instructions | release notes

A big thank you to our clever and hard-working 2018 Google Summer of Code student Arunan Sugunakumar — who did the bulk of the work as part of his summer project. Great job!

STORM + InterMine: A partnership in the fight against cancer

In July 2018 Innovate UK awarded InterMine at the University of Cambridge and STORM Therapeutics a Knowledge Transfer Partnership (KTP). A KTP is a government program that helps businesses in the UK by linking them with an academic organisation — enabling them to bring in new skills and the latest academic thinking to deliver a specific, strategic innovation project.

The key objective of this particular project is to develop an analysis platform using the data warehouse InterMine to help STORM advance their cancer research.

Here we talk with Hendrik Weisser, Senior Bioinformatician at STORM, about this collaboration.

Can you tell me about this project?

Sure, my company (STORM) is partnering with InterMine in this project. We are going to develop a computational knowledge base for cancer drug discovery and RNA epigenetics, based on InterMine’s HumanMine database. We will extend InterMine by adding analysis tools, more biomedical data etc. to make it a bespoke platform to help us identify and validate drug targets.

Can you tell me more about STORM?

STORM Therapeutics is a drug discovery company focused on RNA epigenetics, developing small-molecule inhibitors of RNA-modifying enzymes for the treatment of cancer. We are a spin-out of Cambridge University, founded in 2015 by professors Eric Miska and Tony Kouzarides from the Gurdon Institute. You can find more information – and a cool animated video about RNA epigenetics – on our website, www.stormtherapeutics.com.

What do you hope to achieve?

For STORM, convenient access to available data on RNA-modifying enzymes, their roles in RNA epigenetics, and their associations to different cancers – both direct and via interaction partners – is vital for our efforts in target validation, indication prioritisation and patient stratification. A large amount of relevant data is publicly available but is scattered over many sources and not integrated, thus difficult and time-consuming to fully utilise. STORM’s vision is to develop an integrated database of relevant human biomedical data, that should enable our scientists to quickly view and interrogate the most pertinent data on target genes/proteins, but also allow us to easily perform bioinformatic analyses on these data.

What attracted you to InterMine? What makes InterMine a useful tool for drug discovery?

I found out about InterMine’s existence by chance and then quickly signed up to an InterMine training course at Cambridge University to learn more. I was impressed by the wealth of functionality offered by InterMine and by its sophisticated architecture that enables huge flexibility in dealing with different kinds of biological data. InterMine really represents the state of the art in terms of large-scale complex biomedical data integration. By focusing on extensibility and customisation and on enabling local installations, InterMine is able to serve a variety of research communities. These capabilities also make it an ideal fit for STORM’s requirements for an internal data management system that integrates diverse public data. The fact that InterMine is open-source, i.e. the code is and will stay available, is also important for us because it helps to ensure long-term maintainability.

—-

For more information see STORM’s website.