Notice: Removing “Node” Syntax

Please note that in InterMine 2.0 we will no longer support the “Node” syntax in queries. This is an old way of writing path queries and hasn’t been used for several years.

Here is an example of the old way of doing things:

<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name Employee.department.name Employee.department.company.name">
    <node path="Employee" type="Employee"></node>
    <node path="Employee.department" type="Department"></node>
    <node path="Employee.department.name" type="String">
        <constraint op="=" value="DepartmentA1" description="" identifier="" code="A"></constraint>
    </node>
</query>

This is the current syntax for the same query:

<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name Employee.department.name Employee.department.company.name">
    <constraint path="Employee.department.name" op="=" value="DepartmentA1" identifier="" code="A"/>
</query>
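
If you still have saved queries in the old form, the migration is mostly mechanical. Here is a minimal Python sketch, assuming (as in the example above) that each constraint simply moves up to the query element and takes its path from the enclosing node. The function name is ours and not part of InterMine, and dropping the "type" and "description" attributes is an assumption based on the example rather than a documented rule, so please check the output before relying on it.

```python
import xml.etree.ElementTree as ET


def node_syntax_to_path_syntax(query_xml: str) -> str:
    """Rewrite a "node"-style query so constraints sit directly under <query>.

    Hypothetical helper for illustration; not part of InterMine.
    """
    query = ET.fromstring(query_xml)
    # findall() returns a list, so removing nodes inside the loop is safe.
    for node in query.findall("node"):
        path = node.get("path")
        for old in node.findall("constraint"):
            # Put the path first, then copy the remaining attributes.
            attrs = {"path": path}
            attrs.update(old.attrib)
            # Assumption: "description" is dropped, as in the example above.
            attrs.pop("description", None)
            ET.SubElement(query, "constraint", attrs)
        # The node itself (including its "type" attribute) is discarded.
        query.remove(node)
    return ET.tostring(query, encoding="unicode")


old_query = """<query name="employeeDepartmentCompany" model="testmodel" view="Employee.name">
    <node path="Employee.department.name" type="String">
        <constraint op="=" value="DepartmentA1" description="" identifier="" code="A"></constraint>
    </node>
</query>"""
print(node_syntax_to_path_syntax(old_query))
```

Nodes that carry no constraint are simply removed, which matches what happens to the Employee and Employee.department nodes in the example.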

You can see why we changed! If you think this is going to inconvenience you in any way or have any questions, please get in touch.

Cool InterMine features roundup

I’ve said this before, but I’ll proudly say it again: one of the greatest things about being open source is the community. People are continually creative and resourceful with the tools we’ve built, and we love seeing all the different things you guys do with InterMine. Here’s a quick roundup of some of the things we’ve seen so far this year:

TargetMine’s Auxiliary Toolkit

TargetMine’s Auxiliary toolkit offers advanced analysis for networks and enrichment

TargetMine links out from report pages to provide external enrichment and interaction tools. Read more about it here, or browse the tutorials: [Enrichment] [Interaction Network].

The Beany Mines:

The beany mines (Soy, Peanut, Legume, and Bean) recently added a shared motif search, as well as a couple of other great visualisations.

 

R and SOLR

Colin of HymenopteraMine and BovineMine did a great blog post about using our R client, InterMineR, and then continued to impress by making efforts to upgrade InterMine to use Solr.

MOLD

Ever wondered what Model Organism Linked Data might look like?  MOLD includes a queryable SPARQL endpoint and draws from multiple different InterMines to create a single dataset.

mold

Tip: Make it generic

Generic tools are ones that aren’t hard-coded to a specific Mine or model. We’re always on the lookout for new and exciting features, whether it’s a visualisation, a web service or a database tweak. If you think it’s good, you can email us to discuss it or simply create a pull request, and bask in glory forever after.

We’d love to see more!

This list is awesome (thanks everyone!!) but by no means exhaustive. If you think we’ve missed something out, or you’re doing something new at the moment, drop us a line and we’ll add you to the next roundup. We’d also love to hear from others who might be interested in guest-blogging an InterMine-related feature.

Where’s Wally?

Last week InterMine attended the first RSE (Research Software Engineer) conference (look at the picture…we are there!)

rsepeople

But what’s an RSE? In the main introductory talk on the first day, Caroline Jay from the University of Manchester defined RSEs as “the coalface of ensuring that computational science is accurate, reliable and reproducible, and their views on making progress in this domain are therefore particularly valuable.” Particularly valuable because, as a slogan that everyone loved says, “Software can exist without a paper but a paper and the results can’t. If the software is wrong, the science is wrong”.

As promised by the organizers, the conference focused exclusively on the issues that affect people who write and use software in research, not people who write papers. Over two days there were a lot of interesting talks and workshops about how research software engineers can grow a project for science, best practices, the software development process, Docker…

As the InterMine team, we contributed to the conference by sharing our story: what “open” really means to us, why we chose open source and how we try to be open. The image below shows our vision of open source.

open-source-more-than-just-a-licence-on-github

We also shared the best practices we’ve learned over the years in designing, writing and maintaining open source software for science, hoping that people embarking on their first open source project could benefit from them. [Slides from our talk]

We had a great time talking and meeting with a lot of very friendly and passionate people, sharing ideas, best practices, issues and doubts.

Thanks to the UK-RSE folks for organizing such a great event!

See you next year!

Save the date: 29-31 March 2017

Remember the big International InterMine meetup we were tentatively discussing a few months back? Thanks again to everyone who responded to the survey, as it helped us a lot. We’re still in the process of nailing down the details, but here’s the rough program we are expecting (with details potentially subject to change but hopefully they won’t…)

Wednesday 29th March 2017: Arrival at Berkeley in preparation for the fun ahead. Hopefully there will be some limited accommodation on site available for early birds.

Thursday 30th and Friday 31st March 2017: The conference itself. Details to be confirmed.

Saturday 1st and Sunday 2nd April: Hackathon! Entirely optional. Put your thinking caps on and start looking for fun ideas.

We’ll post more details as things become concrete, and we’d love to hear from you if you have any ideas or thoughts regarding the conference and its content. We still need to think of a catchy name and hashtag for Twitter!

Finally for now, we’d like to give massive thanks to the folks at JGI for helping us to coordinate this.

 

HumanMine moving to HTTPS

What is happening?

To improve security and privacy, HumanMine is moving all of its Web sites and services, including Web APIs, to HTTPS only by 30 September, 2016.

If you use HumanMine only through a Web browser (like Safari, Firefox, Chrome, Internet Explorer, Opera, etc.), this document is not of interest to you. The only change you should notice after the deadline is that a green lock icon should appear in the address bar, and the web addresses of the HumanMine pages you visit will start with https://.

If you maintain software that uses HumanMine APIs or accesses HumanMine servers through the Web, you should understand the change and act before the deadline to ensure uninterrupted service.

Applications that access HumanMine web servers using http:// URLs, instead of https:// URLs, may fail partially or completely after HumanMine switches to HTTPS-only.
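
If your application stores HumanMine service URLs in a config file or database, one simple way to prepare is to rewrite the scheme before making any request. The snippet below is a minimal Python sketch of that idea using only the standard library. The function name and the example URL are ours and purely illustrative, and the sketch does not deal with URLs that spell out an explicit port.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url: str) -> str:
    """Return the same URL with an http scheme upgraded to https.

    Hypothetical helper for illustration. URLs that already use https
    (or any other scheme) are returned unchanged. An explicit port such
    as ":80" is NOT rewritten, so handle those separately.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# Illustrative URL only; substitute the service URL your code really uses.
print(to_https("http://www.humanmine.org/humanmine/service/version"))
```

Doing the rewrite in one place means the rest of the code never needs to know which scheme was stored originally.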

 

Why?

The HTTP protocol does not provide encryption, so anyone who can see web traffic between a client (for example, a web browser) and a server can intercept potentially sensitive information, and/or inject malware into users’ browsers or operating systems. HTTPS solves this problem. It works just like HTTP, except that traffic is encrypted in both directions, so observers between the client and the server can’t intercept or tamper with the requests or responses. It also provides authentication, ensuring that the client is communicating with the intended server given by the hostname, and not some impostor. (Source)

Please contact us with any questions or concerns!


Exploring Blazegraph

While we’ve been testing Neo4j with all FlyMine data and with PhytoMine to verify how well it performs and scales with big databases, we started exploring another open source implementation for graph databases: Blazegraph.

Blazegraph overview

Blazegraph is an open source, high-performance graph database supporting the RDF data model.

RDF is a model for describing and storing data. In this model you express facts, also known as “statements”, each composed of three parts and therefore known as triples. Each triple consists of a subject (the resource), a predicate (the name of a property of the resource) and an object (the property value). For this reason, Blazegraph is also called a “triple store”.

Subject                                         Predicate     Object
http://flymine.intermine.org/flymine/1007664    :hasSymbol    “zen”
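
To make the idea of a triple store a bit more concrete, here is a toy Python sketch: facts are stored as (subject, predicate, object) tuples, and a lookup is a pattern in which any position may be left open. This is only an illustration of the data model and says nothing about how Blazegraph stores or indexes data. The second triple uses a made-up organism identifier.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

GENE = "http://flymine.intermine.org/flymine/1007664"

TRIPLES = [
    (GENE, ":hasSymbol", "zen"),  # the statement from the table above
    # Hypothetical second statement; the organism id is invented.
    (GENE, ":hasOrganism", "http://flymine.intermine.org/flymine/1000001"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern.

    A position left as None acts as a wildcard.
    """
    pattern = (subject, predicate, obj)
    for triple in triples:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


# Everything we know about the gene:
for statement in match(TRIPLES, subject=GENE):
    print(statement)
```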

Blazegraph supports SPARQL (pronounced “sparkle”), a rich and expressive query language for RDF that is standardized by the W3C. Using query operations like union, sort, filter and aggregation, the user can query the data in a very flexible way. With federated queries, the user can aggregate information by executing queries distributed over different SPARQL endpoints, and so discover more data across the web.

Blazegraph provides a SPARQL endpoint where the user can remotely explore, access and download the stored data using the SPARQL language; the Blazegraph workbench provides a graphical interface for the REST APIs.
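
As a rough idea of what talking to such an endpoint looks like, the Python sketch below builds (but does not send) a query request in the style of the standard SPARQL protocol, where the query text travels in a "query" parameter. The endpoint URL and the vocabulary prefix are assumptions made for the example; use whatever your own Blazegraph installation exposes.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a local Blazegraph instance; adjust to your own endpoint.
ENDPOINT = "http://localhost:9999/blazegraph/sparql"

# Assumption: the ":" prefix maps to a made-up vocabulary namespace.
QUERY = """
PREFIX : <http://example.org/vocab#>
SELECT ?symbol WHERE {
  <http://flymine.intermine.org/flymine/1007664> :hasSymbol ?symbol .
}
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a GET request carrying the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results in the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
# To actually run it: urllib.request.urlopen(request) against a live endpoint.
```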

Blazegraph and Neo4j: different graph modelling

In Neo4j, a node in the graph corresponds to an entity in a domain. A node, but also the relationships between the nodes, can contain properties describing the object that it represents.

By contrast, in Blazegraph, nodes don’t contain properties; property values are themselves nodes holding primitive data such as strings, integers or dates.

In Neo4j we’ve represented the gene entity and its relation with the organism in this way:

node1

neo4jrelation

In Blazegraph the same concept will be represented as:

blazegraph-post

with the following statements:

Only one statement represents the relation between the gene and the organism (the one containing the predicate hasOrganism); the others describe the properties of the two entities.

The resources represented in RDF are identified by unique HTTP URIs (in the example http://flymine.intermine.org/flymine/1007664).
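
The difference between the two models can be sketched in a few lines of Python: each property of a Neo4j-style node becomes its own statement, while the relationship between two nodes becomes exactly one statement. This is a simplified illustration, not the code we used. The organism identifier is invented, and the ":hasX" predicate naming just follows the hasSymbol and hasOrganism examples.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

BASE = "http://flymine.intermine.org/flymine/"


def predicate_for(name: str) -> str:
    """Turn a property or relationship name into a ':hasX' predicate."""
    return ":has" + name[0].upper() + name[1:]


def node_to_triples(node_id: str, properties: Dict[str, str]) -> List[Triple]:
    """One statement per property of a property-graph node."""
    subject = BASE + node_id
    return [(subject, predicate_for(key), value) for key, value in properties.items()]


def relationship_to_triple(source_id: str, name: str, target_id: str) -> Triple:
    """A relationship becomes a single statement linking two resources."""
    return (BASE + source_id, predicate_for(name), BASE + target_id)


GENE_ID = "1007664"
ORGANISM_ID = "1000001"  # hypothetical identifier, for illustration only

triples = (
    node_to_triples(GENE_ID, {"symbol": "zen"})
    + node_to_triples(ORGANISM_ID, {"name": "Drosophila melanogaster"})
    + [relationship_to_triple(GENE_ID, "organism", ORGANISM_ID)]
)
for triple in triples:
    print(triple)
```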

Exporting FlyMine data: Intermine-RDFizer

We have exported all FlyMine data using Intermine-RDFizer.

The Intermine-RDFizer can query any InterMine endpoint via the InterMine API, download the tables as TSV files and transform them into RDF N-Quads based on the XML object model file.

Intermine-RDFizer

The InterMine-RDFizer script converts every row in a table into an RDF resource. The resource type is based on the class name (e.g. Gene, Organism) and the resource URI is built using the column “id”. The script converts the columns into resource properties and builds an RDF literal typed with the column’s name.
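
The row-to-resource step can be pictured with the short Python sketch below. To be clear, this is not the InterMine-RDFizer code: it is a simplified illustration that reads a TSV table and writes plain N-Triples-style lines (no graph label, untyped literals), and the vocabulary namespace is made up for the example.

```python
import csv
import io
from typing import List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://flymine.intermine.org/flymine/"
VOCAB = "http://example.org/intermine/vocab#"  # hypothetical namespace


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(tsv_text: str, class_name: str) -> List[str]:
    """Convert each TSV row into one typed resource plus one line per column."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    lines = []
    for row in reader:
        # The resource URI is built from the "id" column.
        subject = f"<{BASE}{row['id']}>"
        # The resource type comes from the class name.
        lines.append(f"{subject} <{RDF_TYPE}> <{VOCAB}{class_name}> .")
        for column, value in row.items():
            if column == "id" or not value:
                continue  # skip the id itself and empty cells
            lines.append(f'{subject} <{VOCAB}{column}> "{escape_literal(value)}" .')
    return lines


table = "id\tsymbol\n1007664\tzen\n"
for line in rows_to_ntriples(table, "Gene"):
    print(line)
```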

For FlyMine, we have created roughly 365 million triples and imported them into Blazegraph using the REST APIs provided.

Benchmarking

We’ve started testing Blazegraph performance using all FlyMine data imported via InterMine-RDFizer and comparing the results with Neo4j.

As usual, we will keep you updated!

 

InterMine in Orlando: TAGC16

As many of you may know, TAGC (The Allied Genetics Conference) is happening in July,  in Orlando, Florida, USA, covering multiple model organisms.
We’ll be there, with a booth at stand 403: (Link to interactive floorplan)


Come say hi at the booth and maybe grab some swag, like these fabulous stickers.

We’re planning to arrive and set things up late-ish Wednesday the 13th. The booth will always be staffed, but you’ll be able to catch us at the following additional sessions:

Thursday 14th July: 

  • 1:30 – 2:30 PM: Cypress Ballroom – Poster Session

Friday 15th July:

  • 2:50 – 3:30 PM: Cypress Ballroom – Poster Session

Saturday 16th July: