Levelling up: From GSoC student to mentor

We’re really proud of our ongoing engagement with GSoC students from previous years, and we always encourage our students to stay involved in any way that suits them, from writing papers about their work to summer internships in the office, and even joining the team. Here, we’ve interviewed Aman Dwivedi, Arunan Sugunakumar, and Adrián Rodríguez-Bazaga, all of whom were mentors in 2019 and brought the special perspective of having been InterMine students in 2018. It’s not long at all until we’ll be thinking about GSoC 2020!

Hi all – thanks for volunteering to be interviewed! What motivated you to return as a mentor after having been a student?

Adrián: As a result of being a student under the InterMine umbrella during GSoC 2018, I gained invaluable skills that contributed to my professional career, and eventually to getting a job at the mentor organization itself. One of these skills is the ability to communicate, cooperate, and, in general terms, work with a software development organization in an international setting. This is a highly sought-after skill – both in industry and in academia – that I couldn’t really get anywhere else before GSoC.

Secondly, the opportunity to learn how to contribute back to an open-source project with a (huge) codebase and a decent number of contributors, both with code and ideas, was a unique chance to add this top-tier ability to my skill stack. Given the high impact that GSoC had on my career, I wanted to go back and help other prospective students by mentoring them and sharing my experience, something that my current position at InterMine has made easier.

Aman: As a developer, I think we often use open source software and we don’t really get a chance to give back to the community. It becomes difficult to keep contributing to open source in our day to day professional work. Being a part of GSoC in the past, I have realised the importance of open source projects and the communities running them. Returning as a mentor for GSoC this year gave me a reason and a chance to contribute again. I always wanted to be a part of the GSoC journey again and this gave me an opportunity to welcome new contributors to the community.

Arunan: Being part of an organisation which is on the other side of the planet is always an exciting thing to do. I understood the full meaning of the term ‘Globalization’ when I was a student at InterMine last year, thanks to GSoC. I loved our meetings, the guidance I received, the project outcome and the level of satisfaction I got. I wanted to have the same experience again this year as a mentor with the organisation I am familiar with.

Did you feel like you had any special insights into what students were going through, having been in the same position in previous years?

Adrián: Having been in the same situation as the students during GSoC was indeed very helpful for finding and understanding the needs they might have. As an illustration, one difficulty that is common among already-accepted GSoC students arises when they face issues with how to continue their progress through the program, either fixing obstacles they run into or contributing new features. They often don’t feel “brave” enough to ask the mentor about those problems directly, and instead prefer to find their way through independently, perhaps because some of them feel that asking how to proceed or fix something is a “signal of lack of knowledge”. In my opinion this is totally wrong, as mentors are there precisely to help you get around these situations!

Aman: Having gone from being a GSoC student to stepping into the shoes of a GSoC mentor, I was already aware of the problems faced by a student. Being a first-time contributor in an open source organisation is just like entering a room full of unknown people. Sometimes the student might not know when to ask for help or feedback. Communication becomes the main barrier in such cases.

Arunan: As a student, the hardest part was selecting an organisation and working with them before submitting a proposal. GSoC has gained more and more popularity over the years and the competition is very tough. This might discourage many students and they might postpone their idea of participating in GSoC to the following year. Students should learn to overcome this fear and start trying. Once you have passed a threshold point of getting to know the organisation, the path becomes clear and easy. Once you reach this point, you get all the motivation in the world to start and complete the project because it is an exciting journey.

What advice would you give to a student who is applying for GSoC? Is there something you’d go back and tell yourself when you were a student? 

Adrián: In my view, and re-iterating what I’ve stated in my answer to the previous question, I encourage students to communicate with mentors constantly, and ask about any issue that may arise during the program, while still keeping a high degree of independence.

Aman: GSoC is about open source communities. The student should keep in mind that their code will be used by a lot of people all over the world. Each and every aspect of the student’s work has a great impact on a lot of people and a lot of dependent projects. With this thought comes a great responsibility of ownership. The student should work passionately and should ask for feedback and suggestions from other community members to enhance their work.

What tips would you give to first-time mentors? 

Adrián: For first-time mentors, I strongly advise being proficient enough with the tech stack and having a clear idea of what the desired output from the project is – especially if the project has not been proposed by you – so that you are able to guide the student through the program. In addition to that, make sure you stay in close communication with at least one senior mentor in the organization, so that any matters that arise can be cleared up.

Aman: Mentors should understand the project thoroughly. Understanding the various components of the project is extremely necessary. One should be in sync with the core team of the organisation and should discuss the expectations for the project. Selection of students is the most important part of GSoC. It is always better to discuss the various students with the other team members before coming to the final selection.

Arunan: Mentoring might seem hard, especially if you are not part of the internal InterMine team. But if you are comfortable with the project and the tech stack, then mentoring shouldn’t be a problem. Mentors need to be up to date on the project all the time and should have some patience when the student struggles. If you are a first-time mentor, it is better to co-mentor with a person who is in the internal InterMine team so that decision making is easy and aligns with the future work of the organisation.

Interested in participating as a mentor or student yourself?

Mentoring: If you’re interested in mentoring, please email yo@intermine.org to discuss your project ideas. Generally we expect mentors to be known to us and/or have had some involvement in the InterMine community before participating as a mentor. You can also read through our Guidance for Mentors.

Interested student / intern: Check out our guide for student applicants. In 2020 we may well be participating in Outreachy as well as GSoC – so you don’t have to be a student to apply!

InterMine 4.1.1 – patch release

We’ve released a small batch of bug fixes and added the Code of Conduct.

Thank you to our contributor Asher Pasha (ThaleMine).

Fixes

  • ncbi-gff bio source updated due to data change
  • intermine plugin updated to allow you to build and deploy your InterMine instance using Gradle 4.9. To update the Gradle version on your mine, please read the upgrade instructions
  • merged PRs from Asher Pasha (ThaleMine) aimed at streamlining ThaleMine production.

This is a non-disruptive release.

See release notes for detailed information.

InterMine 4.1.0

The InterMine team has just released InterMine 4.1.0.

The new release includes better integration with Galaxy: we can import data into Galaxy from any InterMine of our choice (starting from either InterMine or Galaxy), and we can export a list of identifiers from Galaxy to any InterMine of our choice through the InterMine registry. There is no need to configure anything any more: all the Galaxy properties have been moved to InterMine core. There is also no need to create a mine-specific Galaxy tool any more; use the NEW intermine tool instead. Please read here for more details. A simple InterMine tutorial will be published soon in the Galaxy Training Material, under the Data Manipulation topic.

This release offers integration with ELIXIR AAI (Authentication and Authorisation Infrastructure), allowing researchers to log in to InterMine instances using their ELIXIR profile. You will need:

  1. an ELIXIR identity
  2. a registered InterMine client, which provides the client-id and the client-secret that must be set in the mine properties file.

More details here in the OpenAuth2 Settings section of the documentation.

Also new in this version is the Gradle wrapper 4.9, which is compatible with Java 11. This only affects users who compile/install InterMine code.

Thank you so much to our contributor Joe Carlson for improving the generateUpdateTriggers task.

The release also contains a few bug fixes.

Bug Fixes

  • Solved the error caused by obsolete terms in the gene ontology
  • Fasta query result: CDS translation option + extra view parameter
  • The ONE OF constraint works properly when editing a template
  • The default queries configuration has been migrated to JSON
  • The task generateUpdateTriggers has been improved

See the release notes for the complete list and detailed information.

This is a non-disruptive release. To update your mine with these new changes, see the upgrade instructions.

InterMine Cloud: Making InterMine cloud-native and easing deployments

GSoC 2019 was fun and I learned a lot from the InterMine Cloud project. In this blog post, I am going to summarise the work that I did on the project. A detailed technical description of all the work done will be published elsewhere.

InterMine is a powerful data warehousing, integration and analysis tool used to store and share genomics data. However, setting up an instance of InterMine is a time-consuming and error-prone process. It also requires technical knowledge and some familiarity with Java, Postgres, Solr, Perl and shell scripts. These issues create a barrier to entry and add friction to the adoption of InterMine by the bioinformatics community.
To solve these issues, we went back to the drawing board and spent two months planning and searching for simple and feasible solutions.

So, the first thing that we did was packaging InterMine into Docker containers.

InterMine on Docker

Repo: https://github.com/intermine/docker-intermine-gradle
Commits: https://github.com/intermine/docker-intermine-gradle/commits?author=leoank


Packaging InterMine into Docker containers helped us to reduce required dependencies to set up an InterMine to just two (Docker and Docker Compose). Previously you had to go through tens of pages of InterMine docs to get everything set up and configured correctly to start a new InterMine.

But, packaging InterMine into Docker containers was not a trivial task. Unlike other applications where we can have a single generic container image that can be used by different users, InterMine needs to be custom built for every user. Also, the build requires coordination with other services like Postgres and Solr.

So, instead of having a single Docker image, we now have a set of Docker images that can be orchestrated together to build custom InterMines. These Docker images can be configured easily using environment variables and config files for easier cloud deployments.
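As a rough illustration of configuration driven by environment variables, a containerised build step might read its settings along these lines. This is a minimal Python sketch and not code from the docker-intermine-gradle repository: the variable names (MINE_NAME, PGHOST, PGPORT, SOLR_HOST), their defaults, and the `BuildSettings` type are all assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildSettings:
    """Settings a hypothetical build container reads at start-up."""
    mine_name: str
    postgres_host: str
    postgres_port: int
    solr_host: str


def load_settings(env=None) -> BuildSettings:
    """Read settings from environment variables, falling back to defaults.

    Every variable name and default below is an illustrative assumption,
    not the actual interface of the InterMine Docker images.
    """
    env = os.environ if env is None else env
    return BuildSettings(
        mine_name=env.get("MINE_NAME", "biotestmine"),
        postgres_host=env.get("PGHOST", "postgres"),
        postgres_port=int(env.get("PGPORT", "5432")),
        solr_host=env.get("SOLR_HOST", "solr"),
    )
```

Passing a plain dictionary instead of the real environment, for example `load_settings({"MINE_NAME": "mymine"})`, makes the behaviour easy to check without touching the shell.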

Usage instructions for these Docker containers are documented here.

After packaging InterMine in Docker containers, the second thing we did was to define, as code, the cloud infrastructure needed for deploying InterMine.

InterMine Cloud Infrastructure as Code

Repo: https://github.com/intermine/intermine-cloud
Commits: https://github.com/intermine/intermine-cloud/commits?author=leoank

To achieve an easy to use and reproducible cloud infrastructure setup and deployments, we used three technologies: Terraform, Kubernetes and Helm.

Terraform is used to define the required infrastructure as code. We now have Terraform scripts that can be used to spin up a Kubernetes cluster on Google Cloud Platform with the correct configs in just minutes.

Kubernetes is a production-grade container orchestration platform. It makes it easier to manage containers in the cloud.

Helm is like a package manager for Kubernetes. We wrote helm charts for deploying single InterMine instances and also entire InterMine Cloud components. Using these charts, users can deploy a custom InterMine in just minutes now.

Doing all this work standardised the cloud deployment process for InterMine. But we didn’t stop there. We took this one step further, which finally brings us to InterMine Cloud.

InterMine Cloud

Repos:
Compose: https://github.com/intermine/intermine_compose
Configurator: https://github.com/intermine/intermine_configurator
Wizard: https://github.com/intermine/wizard

Commits:
Compose: https://github.com/intermine/intermine_compose/commits?author=leoank
Configurator: https://github.com/intermine/intermine_configurator/commits?author=leoank
Wizard: https://github.com/intermine/wizard/commits?author=leoank

InterMine Cloud is a SaaS platform that offers InterMines as a service to its users. It brings a whole new way to use InterMines and makes it accessible to a much larger group of users. We envisioned a completely new user workflow that removes all the technical burden from a user.

InterMine Cloud Workflow

The work we did on InterMine Cloud is completely reusable and we encourage others in the community to host their own InterMine Clouds. The diagram below gives you a brief overview of the architecture.

InterMine Cloud Architecture Overview

InterMine Cloud has four main components:

  • InterMine Compose
  • InterMine Configurator
  • Wizard
  • Kubernetes environment

InterMine Compose

Compose is responsible for authentication, authorisation and building custom InterMines using config files generated by InterMine Configurator. It also acts as a proxy to InterMine Configurator and the underlying kubernetes environment.

InterMine Configurator and Wizard

My mentors wrote Configurator and Wizard. Together they are responsible for generating a mine config that is used by InterMine Compose. Wizard asks the user a series of relevant questions about the data file, and Configurator then processes the answers to generate a config.

Kubernetes environment

The underlying Kubernetes environment is a standard Kubernetes cluster with a few InterMine Cloud-specific components added. These components include a Solr service and a distributed shared filesystem enabled by Rook.

Future Work

InterMine Cloud is functional but still a work in progress. It will take a few more weeks to reach alpha. We plan to add a few more features before a public release and are also actively looking for community feedback and suggestions.

Call recording available: GSoC 2019 Final Presentations

Our Google Summer of Code students presented their work at a special edition of the community call yesterday. You can catch up on the entire recording on YouTube – or scroll down to see individual presentations. The agenda and notes accompanying the call (including code and slides links) are in Google Docs.

Prabodh Kotasthane – Spring Migration

Prabodh’s presentation starts at 3:54: https://youtu.be/ZzV6JmVRQmA?t=234

Slides

Ankur Kumar – InterMine Cloud

Ank’s presentation starts at 13:12: https://youtu.be/ZzV6JmVRQmA?t=792

Laksh Singla – Upgrading imjs & im-tables

Laksh’s presentation starts at 21:08: https://youtu.be/ZzV6JmVRQmA?t=1268

Rahul Yadav – Single Sign-In

Rahul’s presentation starts at 27:39: https://youtu.be/ZzV6JmVRQmA?t=1659

Deepak Kumar – InterMine Schema Validator

Deepak’s presentation starts at 34:11: https://youtu.be/ZzV6JmVRQmA?t=2051

Akshat Bhargava – Data Visualisations

Akshat’s presentation starts at 41:30: https://youtu.be/ZzV6JmVRQmA?t=2490
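The `t=` value at the end of each link is the start time expressed in whole seconds. The conversion in both directions can be sketched in a few lines of Python; the function names here are made up for this example.

```python
def offset_from_timestamp(timestamp: str) -> int:
    """Convert 'mm:ss' or 'hh:mm:ss' into a whole number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each step shifts the running total up by one base-60 place.
        seconds = seconds * 60 + int(part)
    return seconds


def timestamp_from_offset(offset: int) -> str:
    """Convert a number of seconds back into an 'm:ss' timestamp."""
    minutes, seconds = divmod(offset, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, `offset_from_timestamp("3:54")` gives 234 and `timestamp_from_offset(792)` gives "13:12", matching the first two links above.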

14 August: InterMine Community Call and Google Summer of Code Final Presentations

After weeks and weeks of fabulous work, our six Google Summer of Code projects are approaching the finish line. As in previous years (2018, 2017), our students will be sharing their work in a series of 5-minute presentations at an InterMine Community Call. Everyone from the InterMine community is encouraged to come and see what our fantastic students have been up to.

Joining the call

The call will be on the 14th of August 2019. (Note we previously advertised the call as being on the 15th; this was an error – the call is definitely on Wednesday the 14th of August).

Time: 17:00 UK time / 21:30 IST / or check your time zone here: https://arewemeetingyet.com/London/2019-08-14/17:00/Final%20presentations

Agenda and joining instructions: https://docs.google.com/document/d/14KAdYACPowLxcIhOe6yVzeYsHMnSy2X0WzuJ124KZ30/edit#heading=h.x7mc3otkj1bu

Here’s a sneak preview of what our students have been working on:

Status update for BlueGenes

It’s been a while since we posted our last (rather optimistic) update around BlueGenes, so we thought we’d share a quick update, starting with the basics.

As a reminder, the long-term goal of BlueGenes is to replace the existing JSP-based UI with a more modern interface – one that works well with mobiles, one that hopefully responds more quickly and is easier to use, and perhaps most importantly, is easy to update and customise.

Some of the questions we’ve had in the last few months:

Q: Will BlueGenes replace the current JSP UI?

A: Yes, eventually. Once we reach official beta/prod release (we’re currently in alpha), we anticipate running them concurrently for a couple of years, but we probably will only provide small fixes for the JSP UI during this period, focusing most of our development effort on BlueGenes.

Q: Do I have to run my own BlueGenes, or can I use the central one at apps.intermine.org?

A: Since BlueGenes is powered purely by web services, it will probably be possible to run your InterMine as a server/api-only service and use BlueGenes at bluegenes.apps.intermine.org/. You can also run your own BlueGenes on your servers and domains, allowing you to customise it so it’s suitable for your data, and not having to rely on our uptime. Either (or both) should work fine. There will be some version requirements related to what version of InterMine can access all the features of BlueGenes – see the next point.

Q: What version of InterMine do I need to have to run BlueGenes?

A: BlueGenes will require a minimum version of InterMine to run. The original release of InterMine web services focused primarily on giving JSP users programmatic access to their data, and at the time there wasn’t an anticipated need for application-level services such as superuser actions. There are a few web services and authentication-layer services we still need to implement, so it’s likely BlueGenes will need API version 31 or higher in order to be fully featured. InterMines with API version 27 or higher can run a basic version of BlueGenes. You can check out this table to see if your InterMine is configured to work with BlueGenes.
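To make the two thresholds concrete, here is a minimal Python sketch of the check. It is an illustration only, not BlueGenes code: the function and constant names are hypothetical, and the value 31 is the "likely" requirement mentioned above, so it may change before release.

```python
# Thresholds taken from the answer above; 31 is provisional.
FULL_FEATURE_API_VERSION = 31
BASIC_API_VERSION = 27


def bluegenes_support(api_version: int) -> str:
    """Classify an InterMine web service API version for BlueGenes use."""
    if api_version >= FULL_FEATURE_API_VERSION:
        return "full"
    if api_version >= BASIC_API_VERSION:
        return "basic"
    return "unsupported"
```

With these assumptions, a mine reporting API version 29 would fall into the "basic" category.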

Q: Ok, so what’s left to do before BlueGenes is released as a public beta?

A: Mostly authentication, superuser and MyMine features – things like saving and updating personal templates, sorting lists in folders, updating preferences and passwords. Some of these features require updates to InterMine itself in order to work – hence the minimum version noted in the previous question. Once these are ready we’ll move to the public beta stage.

Your input here will be incredibly welcome, too – the more feedback we get early on, the more polished we hope BlueGenes can be.

Q: Will BlueGenes work nicely with HTTPS InterMines?

A: You will be able to run BlueGenes without HTTPS, but in order to avoid inadvertently exposing user passwords, the login button will only be available over HTTPS connections. We’re also working with a student over the next few months, to implement a pilot InterMine Single Sign On service. You can read about it in our interview with Rahul Yadav.
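The rule described here amounts to a check on the URL scheme. A minimal Python sketch of the idea follows; it is not how BlueGenes itself implements the check, and the function name is hypothetical.

```python
from urllib.parse import urlparse


def login_available(page_url: str) -> bool:
    """Offer the login button only when the page is served over HTTPS."""
    # urlparse lower-cases the scheme, so 'HTTPS://...' is accepted too.
    return urlparse(page_url).scheme == "https"
```

Under this sketch, a page served from `https://bluegenes.apps.intermine.org/` would show the login button, while the same page served over plain `http://` would not.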

Q: Will I be able to customise the way BlueGenes looks?

A: Totally! There are two ways you can do this. One is to make sure you have your logo and colour settings configured in your web properties. We have a nice guide for that. This’ll tell us what your preferred highlight colours are – FlyMine is purple, HumanMine green, etc. If you’re really dedicated and would like to write your own CSS, you can do that too, if you’re running your own InterMine/BlueGenes combo.

Q: I have some nice custom visualisation tools in my InterMine. I don’t want to have to re-write them!

A: We don’t want you to re-write them either! It depends how they’re implemented in your mine, but we’ve designed the BlueGenes Tool API with you in mind, and many Javascript-powered tools will require only a few lines of code to become BlueGenes ready.

As an example, the Cytoscape interaction viewer currently used in some InterMines only requires 20 lines of code to import into BlueGenes, plus a few lines of config – all the other files (and most of the config too) are boilerplate that we auto-generated.