Notice: Removing “Node” Syntax

Please note that in InterMine 2.0 we will no longer support the “Node” syntax in queries. This is an old way of writing path queries and hasn’t been used for several years.

Here is an example of the old way of doing things:

<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name Employee.department.name Employee.department.company.name">
    <node path="Employee" type="Employee"></node>
    <node path="Employee.department" type="Department"></node>
    <node path="Employee.department.name" type="String">
        <constraint op="=" value="DepartmentA1" description="" identifier="" code="A"></constraint>
    </node>
</query>

This is the current syntax for the same query:

<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name Employee.department.name Employee.department.company.name">
    <constraint path="Employee.department.name" op="=" value="DepartmentA1" identifier="" code="A"/>
</query>

You can see why we changed! If you think this is going to inconvenience you in any way or have any questions, please get in touch.
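
If you still have saved queries in the old syntax, the rewrite can be scripted. The following is a minimal sketch, not an official migration tool: the function name node_to_path_syntax is made up for this example, and it only handles the simple case shown above, where each constraint nested in a node becomes a direct child of the query with a path attribute. It drops the type attributes on the nodes, so queries that rely on them (for example, subclass constraints) would need extra handling.

```python
import xml.etree.ElementTree as ET

# The old-style query from this post.
OLD_QUERY = """
<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name Employee.department.name Employee.department.company.name">
    <node path="Employee" type="Employee"></node>
    <node path="Employee.department" type="Department"></node>
    <node path="Employee.department.name" type="String">
        <constraint op="=" value="DepartmentA1" description="" identifier="" code="A"></constraint>
    </node>
</query>
"""


def node_to_path_syntax(xml_text):
    """Return a new <query> element with the <node> wrappers flattened away.

    Hypothetical helper: every <constraint> nested inside a <node> is copied
    to the top level and given a path attribute taken from its parent node.
    Node type attributes are NOT carried over (a simplification).
    """
    old = ET.fromstring(xml_text)
    new = ET.Element("query", dict(old.attrib))
    for node in old.findall("node"):
        for constraint in node.findall("constraint"):
            attrs = {"path": node.get("path")}
            attrs.update(constraint.attrib)
            # The current-syntax example in this post has no description attribute.
            attrs.pop("description", None)
            ET.SubElement(new, "constraint", attrs)
    return new


converted = node_to_path_syntax(OLD_QUERY)
print(ET.tostring(converted, encoding="unicode"))
```

Check the printed query against the current-syntax example above before relying on it.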

HumanMine moving to HTTPS

What is happening?

To improve security and privacy, HumanMine is moving all of its Web sites and services, including Web APIs, to HTTPS only by 30 September, 2016.

If you use HumanMine only through a Web browser (like Safari, Firefox, Chrome, Internet Explorer, Opera, etc.), this document is not of interest to you. The only changes you should notice after the deadline are that a green lock icon appears in the browser’s address bar, and that the web addresses of the HumanMine pages you visit start with https://.

If you maintain software that uses HumanMine APIs or accesses HumanMine servers through the Web, you should understand and act before the deadline to ensure uninterrupted service.

Applications that access HumanMine web servers using http:// URLs, instead of https:// URLs, may fail partially or completely after HumanMine switches to HTTPS-only.
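
For client code that stores HumanMine URLs, one way to prepare is to rewrite the scheme before making a request. This is a sketch under assumptions: the helper name force_https and the host list are illustrative (only www.humanmine.org appears in this post), and it does not handle URLs that carry an explicit port such as :80.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the hosts to upgrade. Adjust to the hosts your software actually calls.
HTTPS_ONLY_HOSTS = {"www.humanmine.org", "humanmine.org"}


def force_https(url):
    """Rewrite an http:// URL on a listed host to https://; leave other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in HTTPS_ONLY_HOSTS:
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(force_https("http://www.humanmine.org/humanmine/template.do?name=Gene_Alleles_Disease"))
```

Updating the stored URLs themselves is the more durable fix; a helper like this is mainly useful where URLs come from configuration or user input.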

 

Why?

The HTTP protocol does not provide encryption, so anyone who can see web traffic between a client (for example, a web browser) and a server can intercept potentially sensitive information, and/or inject malware into users’ browsers or operating systems. HTTPS solves this problem. It works just like HTTP, except that traffic is encrypted in both directions, so observers between the client and the server can’t intercept or tamper with the requests or responses. It also provides authentication, ensuring that the client is communicating with the intended server given by the hostname, and not some impostor. (Source)

Please contact us with any questions or concerns!

HumanMine 3.0

HumanMine has been updated to the latest version of NCBI Entrez Gene. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

 

New Data Source: ClinVar

We added a new data source linking genes with their alleles and associated diseases. Here’s an example query:

http://www.humanmine.org/humanmine/template.do?name=Gene_Alleles_Disease

Human Data Sources Switch

We switched Entrez and Ensembl gene identifiers around. Please see our blog for details. If you have questions or problems, please contact us.

Complex Interaction Viewer

We’ve added a nice viewer for complexes. Source: http://interactionviewer.org

We have docs and videos to help you get started. For a full list of the data sources available in HumanMine, see the data sources list.

Please do not hesitate to contact us should you require any further assistance. For all types of help and feedback, email support@humanmine.org

FlyMine 43.0

FlyMine has been updated to the latest version of FlyBase. All other data sets have also been updated to the newest versions and we have fixed a few bugs. See the data sources page for a full list of data and their versions. All data can be accessed through our comprehensive library of template searches or by building your own queries using the query builder.

If you have any questions, please see our docs and videos. Please do not hesitate to contact us should you require any further assistance. For all types of help and feedback email support@flymine.org

HumanMine – Identifier switch

Currently genes in the human InterMine (humanmine.org) have the Ensembl gene identifier (e.g. ENSG00000000003) as the “primary” identifier and the NCBI gene identifier (e.g. 7105) as the “secondary” identifier. In the next release of HumanMine, this will be switched.

This is a small change, but it may affect your lists of genes. Please contact us if you are worried. We will keep the current version of HumanMine available for your convenience for the next few months just in case.

Why not just use both identifier schemes?

This is what we have done, and will continue to do. The problem is the two organisations do not agree completely on the genome annotation. This means that what Ensembl says is a gene may not be considered a gene by the NCBI. In fact there is a many-to-many relationship. There are some NCBI IDs that map to zero, one or several Ensembl identifiers. Conversely, there are some Ensembl identifiers that map to zero, one or several NCBI gene identifiers.

Why did you pick Ensembl identifiers?

There are a lot of quality data sets that use Ensembl identifiers. Not using Ensembl identifiers means that we may lose information from these valuable studies.

Why did you switch to using NCBI identifiers?

We are part of a BD2K pilot for the NIH Commons project involving six major model organism databases: fly (FlyBase), mouse (MGI), rat (RGD), worm (WormBase), yeast (SGD) and zebrafish (ZFIN). All of these databases use NCBI identifiers for human genes. For interoperability, we decided to use NCBI identifiers as well.

What were the final numbers? How much data was “lost” or gained?

Total genes

Data Source   Loaded into HumanMine database   Both NCBI and Ensembl identifier   HGNC symbol
Ensembl       61,817                           30,137                             38,124
NCBI          59,613                           23,016                             39,670

  • Only 36 NCBI genes do not have a corresponding HGNC symbol.
  • There were 94 Ensembl identifiers that are assigned to more than one NCBI gene.
  • There were approximately 100 NCBI genes associated with more than one Ensembl identifier. In these cases, we did not assign any of the Ensembl IDs as the gene’s identifier. Instead we stored them as “synonyms” so users can still search and find the relevant genes.
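
The rule in the last bullet can be sketched in a few lines of Python. This is an illustration only, not the HumanMine loader: the identifier pairs, the function names and the dictionary layout are all invented for the example. The sketch only reports Ensembl identifiers shared by several NCBI genes, since this post does not say how those were resolved.

```python
from collections import defaultdict

# Hypothetical (NCBI gene ID, Ensembl gene ID) pairs, invented for illustration.
PAIRS = [
    ("1001", "ENSG_A"),
    ("1002", "ENSG_B"),
    ("1002", "ENSG_C"),  # one NCBI gene, two Ensembl IDs
    ("1003", "ENSG_D"),
    ("1004", "ENSG_D"),  # one Ensembl ID, two NCBI genes
]


def build_genes(pairs):
    """Key genes by NCBI ID; keep an Ensembl ID as identifier only if it is the sole match."""
    ensembl_by_ncbi = defaultdict(list)
    for ncbi_id, ensembl_id in pairs:
        ensembl_by_ncbi[ncbi_id].append(ensembl_id)

    genes = {}
    for ncbi_id, ensembl_ids in ensembl_by_ncbi.items():
        if len(ensembl_ids) == 1:
            genes[ncbi_id] = {"ensembl": ensembl_ids[0], "synonyms": []}
        else:
            # Several candidates: assign none, but keep them all searchable as synonyms.
            genes[ncbi_id] = {"ensembl": None, "synonyms": list(ensembl_ids)}
    return genes


def shared_ensembl_ids(pairs):
    """Return Ensembl IDs that are paired with more than one NCBI gene."""
    ncbi_by_ensembl = defaultdict(set)
    for ncbi_id, ensembl_id in pairs:
        ncbi_by_ensembl[ensembl_id].add(ncbi_id)
    return sorted(e for e, ncbi_ids in ncbi_by_ensembl.items() if len(ncbi_ids) > 1)


genes = build_genes(PAIRS)
shared = shared_ensembl_ids(PAIRS)
print(genes["1002"])
print(shared)
```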

Why do I care?

If you have a saved list using Ensembl IDs, there may be data loss. We will keep the current version of HumanMine available for your convenience for the next few months just in case — so you aren’t in danger of losing any of your saved data.

2016 InterMine RoadMap

We have a brand new blog and so would like to take this opportunity to tell you our grand plans for 2016.

InterMine 2.0

Gradle

Currently InterMine is built with a series of Ant commands, and dependencies are managed manually. This is not ideal, so we plan to replace Ant with Gradle and manage our dependencies automatically. This change will make builds faster, easier and more efficient.

For those of you with InterMines of your own, this means that you will use different commands for building your databases and deploying your webapps. We’ll provide the new commands along with documentation, and aim to make the transition as easy as possible.

Keyword Search

We currently use Lucene for our search index but plan to greatly expand our utilisation of this great library — making search on InterMine more robust, sensitive and powerful.

The Cloud

Some have already deployed their InterMine to the cloud. We intend to make this process much easier, probably by creating a custom InterMine buildpack which pre-configures a Docker container with all of InterMine’s dependencies.

New Data Sources

We are always adding new data sources and would like to hear your suggestions. On our list right now is:

And of course we will continue to update our current data source library as file formats and data change.

New User Interface

We’ve developed a new user interface which should be ready for beta testing in early 2016. It’ll exist alongside the current interface for some time, allowing you to feed back ideas, suggestions, and critiques in the new interface, whilst still being able to rely on the old one.

Here’s a sneak preview (subject to plenty of change, of course!):

Sneak preview: Homepage for the (work-in-progress) InterMine 2.0 UI.

New Tools

To go along with our new interface, we’re going to be adding a lot of new tools for you to use. Our wish list so far (not in order of priority):

  1. Advanced search / Query builder / Guided search
  2. Recommendation engine (which gene is like my gene?)
  3. Complex interaction viewer
  4. More powerful region search
  5. Phenotype viewer
  6. InterMine search tool
  7. R plug-in
  8. Text mining tool
  9. JBrowse / other genome browsers
  10. UniProt protein browser

We’d like to hear which tools are important to you. We will also improve the tools we currently have, making them easier to adapt to your data sets.

2017 and beyond

Genomes are being sequenced every day, technology is moving at an ever more rapid pace and everyone is facing a challenging funding environment. We don’t know quite what the world will look like in the next five years, but we are working hard to be future-proof. We’ve always had a deep commitment to openness, flexibility and collaboration, and feel that this will help us meet any future challenges.

Towards this end, we are running a pilot program to test out various graph databases and to explore the semantic web. We will keep you posted on our progress as always, and would like to hear your thoughts.

Thanks to our great community for all of their support over the years! We look forward to a really exciting year!