STORM + InterMine: A partnership in the fight against cancer

In July 2018 Innovate UK awarded InterMine at the University of Cambridge and STORM Therapeutics a Knowledge Transfer Partnership (KTP). A KTP is a government program that helps businesses in the UK by linking them with an academic organisation — enabling them to bring in new skills and the latest academic thinking to deliver a specific, strategic innovation project.

The key objective of this particular project is to develop an analysis platform using the data warehouse InterMine to help STORM advance their cancer research.

Here we talk with Hendrik Weisser, Senior Bioinformatician at STORM, about this collaboration.

Can you tell me about this project?

Sure, my company (STORM) is partnering with InterMine in this project. We are going to develop a computational knowledge base for cancer drug discovery and RNA epigenetics, based on InterMine’s HumanMine database. We will extend InterMine by adding analysis tools, more biomedical data etc. to make it a bespoke platform to help us identify and validate drug targets.

Can you tell me more about STORM?

STORM Therapeutics is a drug discovery company focused on RNA epigenetics, developing small-molecule inhibitors of RNA-modifying enzymes for the treatment of cancer. We are a spin-out of Cambridge University, founded in 2015 by professors Eric Miska and Tony Kouzarides from the Gurdon Institute. You can find more information – and a cool animated video about RNA epigenetics – on our website, www.stormtherapeutics.com.

What do you hope to achieve?

For STORM, convenient access to available data on RNA-modifying enzymes, their roles in RNA epigenetics, and their associations to different cancers – both direct and via interaction partners – is vital for our efforts in target validation, indication prioritisation and patient stratification. A large amount of relevant data is publicly available but is scattered over many sources and not integrated, thus difficult and time-consuming to fully utilise. STORM’s vision is to develop an integrated database of relevant human biomedical data, that should enable our scientists to quickly view and interrogate the most pertinent data on target genes/proteins, but also allow us to easily perform bioinformatic analyses on these data.

What attracted you to InterMine? What makes InterMine a useful tool for drug discovery?

I found out about InterMine’s existence by chance and then quickly signed up to an InterMine training course at Cambridge University to learn more. I was impressed by the wealth of functionality offered by InterMine and by its sophisticated architecture that enables huge flexibility in dealing with different kinds of biological data. InterMine really represents the state of the art in terms of large-scale complex biomedical data integration. By focusing on extensibility and customisation and on enabling local installations, InterMine is able to serve a variety of research communities. These capabilities also make it an ideal fit for STORM’s requirements for an internal data management system that integrates diverse public data. The fact that InterMine is open-source, i.e. the code is and will stay available, is also important for us because it helps to ensure long-term maintainability.

—-

For more information see STORM’s website.

 

 

Advertisements

InterMine 2.1.0 release

We pushed out a few bug fixes and improvements:

  • FIX – “Update publications” data source failed when too many PubMed IDs were sent to the Entrez web service (Thank you to Norbert Auer!)
  • FIX – small bug for generating Python code
  • FIX – FASTA query web service times out when extensions are used (Thanks to Joel Richardson!)
  • FIX – Discontiguous CDS sequences and lengths not set properly
  • FIX – Some SO terms were not updated in 2.0 release
  • FIX – Region search: trying to leave “extended region” field blank results in error

See GitHub for details:

https://github.com/intermine/intermine/releases

How to Upgrade

Throughout the InterMine code, the InterMine version number is set via a global variable. Here’s an example:

# Maven will download the bio-core JAR with the correct version
compile group: 'org.intermine', name: 'bio-core', version: System.getProperty("bioVersion")

To change which InterMine version you are using , you will want to increment the value of the system property “imVersion” and “bioVersion“. These are located in  the “gradle.properties” file for your mine:

# gradle.properties in your mine
systemProp.imVersion=2.1.+
systemProp.bioVersion=2.1.+

Maven will now download, for example, the bio-core JAR of the latest version, e.g. “bio-core-2.1.0.jar”.

If you set the property to “2.1.+” you will get any small point releases that are published in the future. You can set the property to be 2.1.0 if you ONLY want to use version “2.1.0” and do not want to receive updates:

# gradle.properties to only get specific version
systemProp.imVersion=2.1.0
systemProp.bioVersion=2.1.0

Here is an example:

HumanMine upgrade to use the latest version.

InterMine 3.0 – SOLR

The next InterMine release will be InterMine 3.0 which will include SOLR. See our SOLR blog post for details.

We are currently testing SOLR with InterMine and should have a version ready for public beta testing early next month.

 

InterMine 2.0

We are excited to announce the official release of InterMine 2.0!

InterMine 2.0 includes some model updates, a big change in how InterMine itself is built, lots of new features, like a new UI, and a long list bug fixes. See the full list of updates here.

This release represents a large milestone for the InterMine team! Not only because we made big fundamental changes to the core InterMine data model and build system, but also because this release represents a major shift in philosophy for us. Previously InterMine was a big, monolithic, single piece of software. You downloaded the whole InterMine, you compiled the whole InterMine, you got the whole of InterMine. Instead, we are moving towards this idea of modularity and responsiveness. Smaller, independent libraries that are interconnected but can be used for tools and features separately or linked together.

Smaller decoupled InterMine packages will allow us to develop more features faster with less errors. InterMine maintainers might then have the flexibility to include (or not) the features in their mine, plug in their own tools, etc.

Version 2.0 represents a big step towards this goal!

A New Interface

A new feature in InterMine 2.0 is the ability to run our new UI, nicknamed “Blue Genes”. This app is in addition to the current webapp and offers a new and responsive search environment for your InterMine data.

Blue genes is a modern UI built in Clojure and provides a modern user experience.

  • Super fast response times
  • Interactive list upload
  • Redesigned “My account” section
  • Search autocomplete
  • Template and query builder result previews
  • .. lots more!

Once you have your InterMine updated to InterMine 2.0, there is a single command that will launch Blue Genes for your mine.

We are actively seeking feedback on Blue Genes, it’s still very much in the beta phase still, so please get in touch once you have some opinions!

Special Thanks

Thanks to everyone who helped test this release! Thanks Howie Motenko at MGI for your alpha testing and model insights. And a BIG thank you goes to Sam Hokin from the NCGR who spent a lot of time and effort helping improve InterMine! Thanks Sam and Howie! You are much appreciated.

Helpful Links

What exactly we changed (blog post)

Full list of GitHub tickets included this release

Docs on how to upgrade to 2.0

 

As always, please contact us if you have any questions or comments! We have an active twitter account, a discord server at chat.intermine.org, and a low traffic mailing list.

FlyMine 46.0 Released!

FlyMine has been updated to the latest version of FlyBase. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

 

Data model changes!

Our data model has changed slightly to make querying easier between mines.

  • Protein molecular weight is a float instead of an integer
  • We’ve added URLs for GO term evidence codes
  • Sequence Ontology (the basis for the InterMine data model) is updated, so lots of new data types added.

See our previous blog post for complete details of updates.

No more Anopheles

After listening to community feedback, we have decided to stop loading Anopheles data into FlyMine. However, as always, if there is a specific data set you are interested in, please contact us!

 

We have docs and videos, and for a full list of data sources available in FlyMine see the data sources list.

However, please do not hesitate to contact us should you require any further assistance. For all types of help and feedback email info@intermine.org.

HumanMine 5.0 released!

HumanMine has been updated to the latest version of NCBI Entrez Gene. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

 

New Data Source: GTEx

We added a new expression data set, GTEx. Here’s an example search:

Gene –> Tissue Expression

Tissue –> Gene expression

Data model changes!

Our data model has changed slightly to make querying easier between mines.

  • Protein molecular weight is a float instead of an integer
  • We’ve added URLs for GO term evidence codes
  • Sequence Ontology (the basis for the InterMine data model) is updated, so lots of new data types added.

See our previous blog post for complete details of updates.

 

We have docs and videos, and for a full list of data sources available in HumanMine see the data sources list.

However, please do not hesitate to contact us should you require any further assistance. For all types of help and feedback email info@intermine.org.

Coming up soon: InterMine 2.0 release webinar, community calls, and GSoC presentations

What’s coming up soon in InterMineLand? Here are a few of the highlights:

Upgrading to 2.0 – Thursday 2nd August

With the release of InterMine 2.0 RC1, we’ll be dedicating the InterMine Developer call to an InterMine 2.0 Upgrade Webinar, spending around 20 minutes discussing how one upgrades an InterMine 1.x installation to use the newer (and much more easygoing) Gradle dependency management system. Q&A afterwards so you can learn everything you’ve been burning to know. [Call in information]

This call will be recorded so anyone who couldn’t make it can catch up.

GSoC Student project presentations – Thursday 16th August

Six students, six awesome projects. Our students have been blogging prolifically while working over the last three months, and they’ll be presenting their work on the developer call, with five minutes slots per student + time for Q&A afterwards. [Agenda here]

This call will be recorded so anyone who couldn’t make it can catch up.

Community Outreach Call – 6th September 

Once a quarter we host non-techie calls where we focus on interesting things the community has been doing as well as community engagement in general. This time we’ll be featuring Kevin Macpherson, who runs some fantastic community outreach at SGD, including amazing webinar video use-cases.  [Agenda, still work in progress]

Previous featured speakers include Jacqueline Campbell talk about her approach to community engagement, Wayne Decateur demonstrating InterMine code in Jupyter notebooks, and Abby Cabunoc Mayes, Mozilla’s Working Open Practice Lead.

We’re still looking for speakers for this call and the next one, in December – If you have a topic you’d like to share about InterMine, open science/source, or bioinformatics in general, ping yo@intermine.org to pitch the idea.

GSoC’18- Cross InterMine Search Tool (Progress so far)


It has been three weeks since the commencement of the GSoC’18 Coding Phase and a lot of progress has been made till now. The project has been deployed here.

This week has been really very productive. I have worked upon several parts of the application this week. My main focus for this week was to add filters into the application. But I ended up with some more cool stuff here 😉
Let’s have a look at all of them one by one.

  1. Search Rating/Relevance Score
The application knows what you are searching for 😉

Relevance score is a value which is calculated by the InterMine QuickSearch API endpoint for every keyword which is searched. This rating determines how much relevant is the result item with the search keyword. The QuickSearch API response provides a floating value of ‘relevance’ parameter. InterMine uses a formula for determining the Search score in the specific InterMine Search Portals. I have used the same formula to convert the floating relevance score into a score out of 5. The formula is here:
=> Math.round(Math.max(0.1, Math.min(1, relevance)) * 5)

2. Different colors for different Categories in Results

The world is colorful and hence our app too 😉

This was a feedback received from the InterMine community. Having different category results shaded in different colors helps in exploring the results easily. At times, there may be a long list of result items returned by the application. Then having separate colors for separate categories helps us to explore faster. Our eyes are meant to perceive colors more quickly than text.

3. External links to reports

Every result item has a link which opens the result report on its particular InterMine portal. This report page contains more detailed information about the result item. On clicking the icon, a new tab with the given result report opens in the browser. The result link is generated dynamically:

4. Metadata about search results

Metadata for BMAP mine (search: ‘brca1′)

Having metadata about the results returned is always handy at work. Every tab in the application loads a set of metadata as attached above. This can certainly help in understanding the presence of a search term in the mine in a unified way.

5. Search/Relevance Score Filter

Score filter on sidebar of the application

With addition of score/relevance in the application, it was very much necessary to add an option to filter the results based on that. This section provides radio buttons to filter out the results based on the relevance score of the result item. I hope this feature will be extremely beneficial for the community members out there. 🙂

6. Category Filter

Every mine contains data from various diverse categories. So, this feature too was a necessary requirement of a full fledged searching tool. This section is loaded dynamically based on the types of categories returned by the API. The application uses the search result metadata received from the API to generate these category checkboxes dynamically. So, we need not worry about hard-coding any of these.


So, this was all about progress so far. Almost everything in the application is mobile optimized and ready to use. Most probably, I will also be extending the scope of this project and add a REST API service for searching multiple mines. So, the project will be a full fledged Cross InterMine Search Tool in future, i.e. a package of, a client driven search interface & a back-end REST API service. You can find the project repository here. Please have a look at the application here. I would love to have more feedback from the community. (For providing your feedback/suggestions, kindly email me at dwivedi.aman96@gmail.com) Thanks for reading. Happy Coding! 🙂