Looking ahead: InterMine+Google Summer of Code 2018. Could you be a mentor?

2017 is coming to an end, and I have to say it’s been a fabulous one! I’ll probably post a “cool things InterMine did this year” round-up in a week or two – but in the meantime, here’s my final Google Summer of Code blog for you all!  We’ll cover the InterMine swag just sent out across the globe, as well as plans for next year – and how you can help out.

Thank-you gifts for mentors and students

Last week, we posted care packages to all our GSoC mentors and summer students, in the form of t-shirts, stickers, and pens. The postal-service-wrinkled shirt shown above is the women’s fit shirt printed on black; unisex shirts are a slightly lighter grey colour. If you filled out the swag survey when it was sent to you, your gift should be with you soon! Tweet us your images of the items in use for extra InterMine Cool Points 😎.

GSoC 2018 – call for project ideas and mentors!

In early 2017, we put together an ideas list for GSoC projects – InterMine’s projects are numbers 3 to 9. If you want a better idea of what it’s like to apply (or to be a mentor), read our application guidance from last year.

Do you have a nifty idea, or an InterMine itch you’d like to scratch?

Please share it with us! Add it to our 2018 Google Summer of Code ideas list or, if you need to sound things out and discuss them a little, comment on the GitHub issue or email the dev list. You can even propose several ideas, if you like! Please add all ideas by the end of the 14th of December (the end of this week).

Would you like to try mentoring?

Fancy a chance to earn some nifty exclusive swag like that pictured above? Add your name as a possible mentor to an existing idea (or your own new idea). You can always drop us a line if you want to discuss things first. We like projects to have more than one mentor if possible.

Maybe you’re a student thinking of GSoC?

Awesome! If you have your own InterMine project idea (whether it’s brand new or you’ve already started it), or if one of the ideas on our ideas list lights your fire, it’s not too early to start talking with potential mentors about it. The application guidance we mentioned above would be a good read, too.

 

 

Talks and Workshops: Sharing our materials for re-use

Would you like to grab some ready-made slides or InterMine training workshop materials? We’ve rounded up some of our recent talks, posters and workshops. Feel free to remix materials for your own talks and outreach efforts. If you do use them, we’d love to see the result!

Slides

You should have permissions to make a copy; if not, please contact us / tweet us / pop by chat to poke us with a stick.

3-min lightning talk at GSoC Mentor Summit: Citable version on Figshare | Google Drive (editable) version

Better Science Through Better Data: Citable version on Figshare | Google Drive (editable) version | The featured image above was live-scribed during the talk. Licence is CC-BY from Springer Nature, and the image is available from https://figshare.com/articles/Better_Science_through_Better_Data_2017_scidata17_scibe_images/5558653

Blank InterMine-branded slides: Get ’em here.

Posters

BlueGenes Poster: This poster was presented at BOSC 2017. Citable version on F1000 | Inkscape editable version (download Inkscape here: https://inkscape.org/en/release/0.92.2/)

InterMine Poster for Elixir UK All Hands 2017: PDF version | Inkscape editable version 

Workshop learning materials

We run an InterMine training workshop every term, covering the basics of using the webapp, as well as discussing how to draw data from the API. If you’re near Cambridge, keep your eyes open on the blog or twitter feed, as we’ll always announce them well in advance.

Workshop training materials in PDF: Workshop Exercises – handouts with answers | Workshop slides – note that these exercises were all correct with data from HumanMine in October 2017. Result counts may change if we add new data sources or update existing ones in the future, but apart from those counts the materials should remain correct.

You can download the original OpenOffice files as well if you’d like to adapt the materials for your own workshops, or feel free to contact us if you’d like to coordinate some training with us.

Side note: We’re also delivering a half-day workshop training session as part of the EBI’s 4-day Introduction to Multiomics Data Integration course – applications are open now until 01 December 2017.

Refs:

Data, Scientific (2017): Better Science through Better Data 2017 (#scidata17) scribe images. figshare.

https://doi.org/10.6084/m9.figshare.5558653.v1

Retrieved: 15:48, Nov 06, 2017 (GMT)

InterMineR package

InterMine data can be accessed via command line programs like cURL and client libraries for five programming languages (Java, JavaScript, Perl, Python and Ruby). To expand the functionality of the InterMine framework, an R package, InterMineR, had already been started, providing basic access to InterMine instances through the R programming environment. (You could run template queries, but not much else!)

However, in order to fully utilize the statistical and graphical capabilities of the R language and make the InterMine framework available to an even greater number of life scientists, the goals were set to:

  1. Further develop and publish the InterMineR package to Bioconductor, a widely used, open source software project based in R, which aims to facilitate the integrative analysis of biological data derived from high-throughput assays.
  2. Add visualisation capabilities, e.g. “What features are close to my feature of interest?”
  3. Add enrichment analysis in InterMineR, a feature that will provide R users with access to the InterMine enrichment analysis widgets and can be effectively combined with the graphical capabilities of R libraries.

InterMineR performs a call to the InterMine Registry to retrieve up-to-date information about the available Mines. The information retrieved is then used to connect the Mines with the R environment using the InterMine web services.
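To make the registry lookup concrete, here is a minimal Python sketch (not InterMineR's own code) of turning a registry response into a name-to-URL table. The endpoint URL and the "instances", "name" and "url" field names are assumptions written from memory of the registry API, and the sample payload is hypothetical, so check the registry API docs before relying on any of it.

```python
import json
from urllib.request import urlopen

# Assumed registry endpoint; verify against the registry API docs.
REGISTRY_URL = "http://registry.intermine.org/service/instances"


def parse_mines(payload):
    """Map each Mine's name to its base URL, skipping incomplete entries."""
    mines = {}
    for instance in payload.get("instances", []):
        name = instance.get("name")
        url = instance.get("url")
        if name and url:
            mines[name] = url.rstrip("/")
    return mines


def fetch_mines(registry_url=REGISTRY_URL, timeout=10):
    """Download the registry listing and parse it (needs network access)."""
    with urlopen(registry_url, timeout=timeout) as response:
        return parse_mines(json.load(response))


# Hypothetical payload, shaped the way we assume the registry responds.
SAMPLE = {
    "instances": [
        {"name": "FlyMine", "url": "https://www.flymine.org/flymine/"},
        {"name": "HumanMine", "url": "https://www.humanmine.org/humanmine"},
        {"name": "Entry without a URL"},
    ]
}

mines = parse_mines(SAMPLE)
print(mines)
```

Keeping the parsing separate from the download means the lookup table can be checked offline, and a Mine with a missing URL is skipped rather than breaking the connection step.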

Queries

The InterMineR package can be used to perform complicated queries on a Mine. The process is facilitated by the retrieval of the data model and the ready-to-use template queries of the respective Mine. The R functions setConstraints and setQuery have been created, along with the formal class InterMineR, to create new queries or modify existing ones, store them as InterMineR-class objects and apply them to the Mine with the runQuery method.
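For readers who have not seen what such a query looks like on the wire, the following Python sketch builds a query from a list of output columns and a list of constraints, then turns it into a web service URL. It is an illustration rather than a port of setConstraints, setQuery or runQuery: the XML attribute names, the "genomic" model name and the /service/query/results path are assumptions based on common InterMine web service conventions, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_query_xml(select, constraints, model="genomic"):
    """Serialise output columns and (path, op, value) constraints as XML."""
    query = ET.Element("query", {"model": model, "view": " ".join(select)})
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {"path": path, "op": op, "value": value})
    return ET.tostring(query, encoding="unicode")


def results_url(mine_url, query_xml, fmt="json"):
    """Build the URL that would run the query (assumed endpoint path)."""
    params = urlencode({"query": query_xml, "format": fmt})
    return mine_url.rstrip("/") + "/service/query/results?" + params


query_xml = build_query_xml(
    select=["Gene.symbol", "Gene.primaryIdentifier"],
    constraints=[("Gene.symbol", "=", "PPARG")],
)
url = results_url("https://www.humanmine.org/humanmine", query_xml)
print(query_xml)
print(url)
```

Modifying an existing query then amounts to changing the select or constraint lists and rebuilding the XML, which is roughly the role the query objects play in the R package.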

Genomic Coordinates

Figure 1: Gene visualisation done via InterMineR and Gviz

InterMineR can retrieve genomic coordinates and gene expression analysis data, which can be converted to GRanges and RangedSummarizedExperiment objects with the R functions convertToGRanges and convertToRangedSummarizedExperiment, respectively. This establishes an interaction layer between InterMineR and other Bioconductor packages (e.g. GenomicRanges and SummarizedExperiment), allowing for rapid analysis of the retrieved InterMine data.
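The idea behind converting query results into range objects can be sketched in a few lines of Python. This is a simplified stand-in for the Bioconductor classes, not their API: the Feature class, the column names and the coordinates below are all made up for illustration. It answers the question "what features are close to my feature of interest?" by testing for overlap with a window around the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def rows_to_features(rows):
    """Convert result rows (dicts with hypothetical column names) to ranges."""
    return [
        Feature(r["symbol"], r["chromosome"], int(r["start"]), int(r["end"]))
        for r in rows
    ]


def nearby(features, target, window):
    """Return features on the same chromosome within `window` bp of target."""
    lo, hi = target.start - window, target.end + window
    return [
        f for f in features
        if f != target and f.chrom == target.chrom and f.start <= hi and f.end >= lo
    ]


# Made-up rows standing in for the output of a coordinates query.
rows = [
    {"symbol": "geneA", "chromosome": "2", "start": "1000", "end": "2000"},
    {"symbol": "geneB", "chromosome": "2", "start": "2500", "end": "3000"},
    {"symbol": "geneC", "chromosome": "2", "start": "10000", "end": "11000"},
    {"symbol": "geneD", "chromosome": "3", "start": "1500", "end": "1800"},
]

features = rows_to_features(rows)
neighbours = nearby(features, features[0], window=1000)
print([f.name for f in neighbours])
```

In R, the GenomicRanges package provides overlap operations of this kind on GRanges objects, which is why converting to those classes is worthwhile.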

Enrichment + GeneAnswers

InterMineR also retrieves InterMine enrichment widgets and facilitates enrichment analysis on an InterMine instance using the R functions getWidgets and doEnrichment, respectively. The R function convertToGeneAnswers converts the results of the enrichment analysis to a GeneAnswers-class object, which allows the visualization of:

  • Pie charts
  • Bar plots
  • Concept-gene networks
  • Annotation category (e.g. GO terms, KEGG pathways) – interaction networks
  • Gene interaction networks

by using R functions from the GeneAnswers R package.
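As background on what an enrichment analysis computes, here is a small self-contained Python sketch of an over-representation test. It assumes a hypergeometric test, which is a common choice for this kind of analysis; the widgets do their statistics on the server, and this is not their implementation. The gene names, annotation sets and population size are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(gene_list, annotations, population_size):
    """Score each annotation term and return (term, p-value), smallest first."""
    genes = set(gene_list)
    scores = {}
    for term, annotated in annotations.items():
        k = len(genes & annotated)
        scores[term] = hypergeom_pvalue(k, len(genes), len(annotated), population_size)
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical data: 10 genes in the population, 4 in the list of interest.
gene_list = ["g1", "g2", "g5", "g6"]
annotations = {
    "term_a": {"g1", "g2", "g3"},
    "term_b": {"g9"},
}

ranked = enrich(gene_list, annotations, population_size=10)
print(ranked)
```

Terms with small p-values are over-represented in the list relative to the background population; plots such as those listed above are ways of presenting such a ranking.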

Figure 2: GeneAnswers GO structure network, generated via InterMineR

Figure 3: GeneAnswers gene network generated using InterMineR

Final steps: Bioconductor & Vignettes

The updated InterMineR package complies with the instructions for submitting new packages to Bioconductor, has passed all automated checks (R CMD build, check and BiocCheck) and is currently undergoing manual review for Bioconductor submission.

Documentation of each function, along with examples of its usage, is available in the GitHub repo and as help files once the package is installed. Furthermore, a detailed vignette and tutorials covering the new functionality of the InterMineR package are currently available in the intermine/InterMineR/vignettes folder of the GitHub dev branch, and will shortly be available on the GitHub master branch as well.

This project is part of Google Summer of Code, still under development by me, Konstantinos Kyritsis, PhD student at the Aristotle University of Thessaloniki, under the mentoring of Julie Sullivan and Rachel Lyne. The GitHub repository of the InterMineR package can be found at https://github.com/intermine/InterMineR.

Commits made by Konstantinos can be found here: https://github.com/intermine/InterMineR/commits/master?author=kostaskyritsis

GSoC final month: testing, wrapping up, and live demos

We’re into the final stretch of the three-month Google Summer of Code period, and results are coming through thick and fast.

On August the 17th at 5PM UK time (you can check when it is in your local timezone) we’ll be doing short presentations for each of the projects as part of our community call – around 5 minutes per project. Come join in and see the great work our students have been doing!

Here’s a quick summary of projects to date:

InterMine Registry: The registry is up and running! You can view all known instances of InterMine in the registry front end, or browse the API docs to learn more about programmatic access. Tip: like the logos you see? Add yours with these handy tips from Julie in an earlier post.

Snapshot of the registry front-end UI.

Leonardo also wrote a great blog post about his work on the registry.

InterMine iOS app: Several members of the InterMine community signed up to provide beta testing while the app was under development. Nadia’s been doing some great work on this – users can now use keyword search across multiple InterMines, browse templates, lists, and create sets of “favourite” InterMine objects – perhaps building up a literature search for future use. It also loads its mine list straight from the registry! Expect it in the app store soon.

Similarity Project: Samyadeep wrote an in-depth technical write-up of the InterMine object similarity engine he’s been working on, using FlyMine sample data in Neo4j.

Neo4j: Yash will be demoing his InterMine Query <-> Cypher work on the call; in the meantime, you can check out his blog posts on the subject.

R: Konstantinos updated our InterMine R client library to include new features such as enrichment visualisation – expect a blog post about it soon! It’s under review in Bioconductor but you can use the library now directly from GitHub.

 

InterMine community roundup: June 2017

Here are some of the exciting things that have been happening in the InterMine community recently:

Thanks to everyone who has contributed including students and their mentors. You guys are awesome!

excited Kermit via GIPHY

Have you done anything exciting with InterMine lately? email info [at] intermine [dot] org, tweet us at @intermineorg, or pop into chat.intermine.org to tell us about it… we’d love to feature you in a future round-up!

Google Summer of Code: Coding period starts!

As of the 30th of May, the community bonding period is over and official coding starts for GSoC. The first evaluation period runs from June 26 to June 30 (full timeline).

Preparing for the evaluation

We don’t have full details of the evaluation questions yet, but the Student Manual and Mentor Manual provide a decent overview – it’s likely to be a few short questions ensuring work and communication are occurring and are on-track.

Students: What you need to do:

Follow your workplan and communicate regularly with your mentor!  Evidence of work can include emails regarding progress, demos if possible, and GitHub commits / PRs. Read the Student Manual entry on evaluations. Remember you’ll need to complete an evaluation on your mentor, too.

Mentors: What you’ll need to do:

Make sure you’re communicating with your student regularly and you’re confident about their progress. If you are on vacation during the evaluation period (or immediately before), make clear plans now, and make sure your student knows what will be happening and who their backup mentor/evaluator is for this time period.

Please also read the Mentor Manual on evaluations, and consider arranging a face-to-face feedback session, since your student can’t see your evaluation details beyond a pass/fail status.

 

 

GSoCers Assemble! Announcing the InterMine GSoC 2017 students

Google Summer of Code is officially open as of 16:00 UTC today! This year InterMine will have five students coding over the summer, with five projects:

  • InterMineR will be getting better docs and hopefully submitted to R repos. Konstantinos Kyritsis will be working on this with the help of InterMine mentors Julie and Rachel.
  • Our Android App will get a younger sibling in the form of an iOS app, thanks to Nadia Yudina. I’ll be the primary mentor for this project.
  • We’ll finally have a proper registry of all the great InterMines out there, brought to you by Leonardo Kuffo with Daniela mentoring the project.
  • Samyadeep Basu will be looking at an ‘InterMine Similarity project’ – given a Gene (or other entity) from InterMine – are there any other interesting entities related to it in some way? Josh is the lead mentor on this project.
  • Yash Sharma will be working on creating Neo4j-InterMine API endpoints under Sam Hokin’s mentorship.

We wish we could have accepted more of you. In total we had more than 40 students interested in GSoC 2017 with InterMine, resulting in around 30 finalised applications. Many of the applications were brilliant – far more than we could possibly have accepted. Deciding who to accept was really tough, and even if you didn’t get a place in GSoC with us you’re still entirely welcome to contribute to any of our projects if you had any ideas.

Suggestions for accepted students

Congratulations on being accepted. We’re really glad to have you on board. Please have a quick read through our GSoC guidelines to get started.

During the community bonding period, here are a few ideas for getting involved.

  • Find out more details that might pertain to your project (obviously) – investigate the API or work on bugs
  • Project management – in your project’s GitHub repo create milestones, tickets, project boards as appropriate.
  • Write an intro blog post about yourself & your planned work (to be posted here and/or a personal blog we could link to).
  • Come hang in the chat (below).

Non-GSoC InterMine community: you can play too!

We’ve created a couple of chat rooms at chat.intermine.org. We’ll be encouraging our GSoC students to hang out in the #general channel, and you’re welcome to, as well. The students are from all around the world – come make them feel at home!

A flurry of deadlines: Grants, GSoC, workshops, and more…

We blogged in February commenting that we had a lot of events over the March / April period. Here’s a re-cap:

  • Attending conferences: Amongst the team we attended Bioschemas, the Elixir all-hands, and the Cambridge Scientific Computation Day.
  • InterMine training: We delivered a training workshop about using InterMine at the EBI, part of their Introduction to Omics data integration week-long course.
    • This went well despite a server-room meltdown which conveniently timed itself for the morning of the same day (the training session was in the afternoon, so we thankfully had time to get the servers back up!).
    • In contrast to previous years, every single hand went up when we asked if the participants wrote code as part of their job. Next time, we will try to allow for a longer session on using InterMine web services, rather than the 15-minute slot we allocated this time!
  • Developer Workshop and Hackathon: 5 days in sunny California, spending time with InterMiners from around the world. Longer blog posts to follow, but in the meantime you can browse the agenda for links to slides from each session, or the storify summary of tweets.
  • Google Summer of Code: We’re participating in Google Summer of Code (GSoC) this year (previously) as a mentoring organisation. We had over 50 interested students and 30 distinct applications, many of which were simply brilliant. The deadline for students applying, naturally, was the day after the hackathon, making finding time to provide student feedback a challenge. Maybe there’s a reason to be grateful for jet-lag induced wakefulness at odd hours!
  • Grants: A tale of two grants… :
    • New application: We had a grant application deadline that was, once again, the day after the hackathon. Uh-oh! Feverish figure fixes, tentative typo tweaks and word-count winnowing were squeezed in at every opportunity.
    • Good news about an old application: Meanwhile, we got the news that we’d been fortunate enough to have our hard work pay off: a grant we’d applied for last year as part of the BBSRC BBR 2016 call was agreed to! Hint: the future of InterMine is looking very FAIR, possibly even SPARQLing. More details in a later post.

Events coming up soon:

Google Summer of Code at InterMine

We’re pleased to announce that we’ll be participating in Google Summer of Code 2017 as a mentor organisation, under the umbrella of Open Genome Informatics. Here’s the full ideas list for Open Genome Informatics Projects – InterMine projects are numbers 3 to 9.

Information for students:

About us:

InterMine is an open source biological data warehouse, based at the University of Cambridge. There are nearly thirty public InterMine instances covering a range of subjects: organisms like mice and rats, plants such as the soybean, insects like the fruit fly, bees and wasps, and even mitochondrial DNA and drug target discovery.

We’re interested in mentoring students from a bioinformatics, computational biology, or computer science background.

You don’t have to be a biologist to work on InterMine related projects – many of the full time developers on the team didn’t come from a biology background – but biological knowledge is an advantage.

We use a range of languages in our projects, but most commonly you’ll see Java, PostgreSQL, Clojure/ClojureScript, and JavaScript. Each instance of InterMine has its own set of web services, and there are client libraries in five different languages, with a sixth in final stages of development.

Browse through our GitHub repos to see more of our projects: https://github.com/intermine

Getting started:

If you’re interested in applying for one of our projects, drop an email to the people named in the project description to introduce yourself, and explain which of the project(s) you’re interested in. There’s already been quite a lot of interest in the Similarity project from multiple students, so you might want to consider one of the other projects as a backup if you think you’d particularly like InterMine.

When you mail us, please make sure to include as many of the following as possible:

  • A CV / Resume. Tell us about yourself!
  • Links to GitHub, BitBucket, LinkedIn or similar.
  • Sample code. If you don’t have GitHub/Bitbucket etc. we’d still like to see what you can do. A class coding assignment or personal project you’re proud of is a great alternative.

A great way to familiarise yourself with the basics of building InterMine is to run through our tutorial: http://intermine.readthedocs.io/en/latest/get-started/tutorial/ – or alternatively you could try familiarising yourself with the web interface for your preferred InterMine. You can find the full list of InterMines at intermine.org, or try our experimental interface here: http://redgenes.apps.intermine.org/

We’ve also set up a few tickets on the core InterMine repo with the tag “Good first bug” if you’d like to get your hands dirty. Pop a note on the ticket and make a pull request when you think you’re ready. We have some guidelines for contributing that you should read before you make the pull request.

Finally, if you have any ideas or questions, please don’t hesitate to email us.

Useful links:

– Our twitter feed: https://twitter.com/intermineorg
– Here’s a blog post about some of the cool things the community has done with InterMine resources: https://intermineorg.wordpress.com/2016/11/22/cool-intermine-features-roundup/
– Our interactive web services docs: http://iodocs.apps.intermine.org/
– Our very in-the-works new ClojureScript UI. Demo: http://redgenes.apps.intermine.org/ repo: github.com/intermine/redgenes
– Developer documentation: http://intermine.readthedocs.io/en/latest/