InterMine internship plans for 2020: GSoC, Outreachy, other?

TL;DR: InterMine has been participating in Google Summer of Code (GSoC) for three fantastic years now, and we’re hoping to participate in a fourth, with a few improvements. We’ll be applying for GSoC again, but probably taking on fewer students than previous years if we are accepted as an org. This is because we also plan to participate in Outreachy, another mentoring program that runs on the same timeline. We’ll probably take on around six interns in total between both programs. 

What is Outreachy? Why the change? 

In short, it’s a chance to include a more diverse pool of InterMine interns.

It’s no secret that there’s poor diversity in tech generally and particularly in open source: open source software developers self-report to be over 90% male and 15% or less self-report as BME (https://osf.io/preprints/socarxiv/qps53). Similar problems are present for staff and students in higher education, both in gender balance and ethnic balance (https://www.ecu.ac.uk/about-us/he-equality-challenges).

It’s not surprising, then, that while we are pleased to have welcomed interns from India, USA, Ecuador, Sri Lanka, Greece, Spain, and Canada in our last three years of GSoC, only two of our seventeen interns were female.

Outreachy is a program similar to Google Summer of Code, but with a focus on getting more underrepresented folks into tech. The timeline is broadly the same, and our commitment to supporting internships has not changed, so we hope this will help us include interns from a broader range of backgrounds.

Looking to help out? One of the biggest ways you can help us is by spreading the word – share this blog post with anyone you know who is interested in tech and comes from an under-represented community, and/or is looking for experience in tech projects that don’t fall under the umbrella of “code” projects (see the note about Outreachy-only internship topics in the next section). 

Potential student/intern Frequently Asked Questions:

Q: Should I choose Outreachy or GSoC? 

If you were already considering participating in GSoC, we’d suggest staying with GSoC. You can read about eligibility for Outreachy in the Outreachy application guide, too. Outreachy interns don’t have to be students, and could instead be recent graduates, parents returning to work after a break, people who know how to code looking for early experience, or perhaps something else. Another difference is that Outreachy allows tech-related projects that aren’t code – e.g. design, user experience, documentation, or accessibility. Ultimately, your choice should depend on which of the two programs you’re eligible for and which projects you’re interested in, since some projects will be Outreachy-only. 

Q: I’m keen to get started! What do I do?

🎉 Amazing! Take a look at the InterMine GSoC site (we’ll update it to cover Outreachy soon) and InterMine contributing guide and spend a little while learning about InterMine. You can pick up an issue or two if you’d like, but it’s not mandatory for GSoC students. (Outreachy interns need to make at least one small contribution during the contribution period).

It’s also a nice idea to read some successful project proposals from previous years, like this one from Nupur Gunwant.

Q: Do I need to get started right away?

A: It’s great to see your enthusiasm! We won’t know for sure until February 20th 2020 whether or not we’re accepted to GSoC as a mentor organisation, and Outreachy applications won’t open until late January. Feel free to sit back with a nice drink and relax until then – seriously! Our project idea list isn’t out yet, and we don’t grade project proposals based on the number of contributions.

Read through our criteria for proposal grading here: intermine.org/gsoc/guidance/grading-criteria-2019/#experience – some evidence of coding is necessary, and you’ll need to understand InterMine well enough to write a sensible project proposal, but beyond that, we don’t want you to work to exhaustion! Well-rested and happy is much more fun 😉

One thing that is really helpful is if you like to hang out in chat (chat.intermine.org) – you can welcome newbies into the #gsoc channel, and answer some of their basic questions. Teamworking and community skills are highly valuable! 

Levelling up: From GSoC student to mentor

We’re really proud of our ongoing engagement with GSoC students from previous years, and we always encourage our students to stay involved in any way that suits them, from writing papers about their work to summer internships in the office, and even joining the team. Here, we’ve interviewed Aman Dwivedi, Arunan Sugunakumar, and Adrián Rodríguez-Bazaga, all of whom were mentors in 2019, but came from the special perspective of having been InterMine students in 2018. It’s not long at all until we’ll be thinking about GSoC 2020!

Hi all – thanks for volunteering to be interviewed! What motivated you to return as a mentor after having been a student?

Adrián: As a result of being a student under the InterMine umbrella during GSoC 2018, I gained invaluable skills that contributed towards my professional career, and eventually to getting a job at the mentor organization itself. One of these skills is the ability to communicate, cooperate, and in general terms, to work with a software development organization in an international setting. This is a highly demanded skill – both in industry and in academia – that I couldn’t really get anywhere else before GSoC.

Secondly, the opportunity to learn how to contribute back to an open-source project with a (huge) codebase and a decent number of contributors, both with code and ideas, was a unique chance to add this top-tier ability to my skill-stack. For this reason, and given the high impact that GSoC had on my career, I wanted to go back and help other prospective students by mentoring them and sharing my experience, something that my current position at InterMine has helped me to do.

Aman: As a developer, I think we often use open source software and we don’t really get a chance to give back to the community. It becomes difficult to keep contributing to open source in our day to day professional work. Being a part of GSoC in the past, I have realised the importance of open source projects and the communities running them. Returning as a mentor for GSoC this year gave me a reason and a chance to contribute again. I always wanted to be a part of the GSoC journey again and this gave me an opportunity to welcome new contributors to the community.

Arunan: Being part of an organisation which is on the other side of the planet is always an exciting thing to do. I understood the full meaning of the term ‘Globalization’ when I was a student at InterMine last year, thanks to GSoC. I loved our meetings, the guidance I received, the project outcome and the level of satisfaction I got. I wanted to have the same experience again this year as a mentor with the organisation I am familiar with.

Did you feel like you had any special insights into what students were going through, having been in the same position in previous years?

Adrián: Having been in the same situation as the students during GSoC was indeed very helpful for finding and understanding the needs they might have. To illustrate, one difficulty common among already-accepted GSoC students arises when they are unsure how to continue their progress through the program, whether that means getting past obstacles or contributing new features. They often don’t feel “brave” enough to ask the mentor about those problems directly, and instead prefer to find their way through independently, perhaps because some of them feel that asking how to proceed or fix something is a “signal of lack of knowledge”. In my opinion this is totally wrong, as mentors are there precisely to help you get around these situations!

Aman: From being a GSoC student to stepping into the shoes of a GSoC mentor, I already was aware of the problems faced by a student. Being a first time contributor in an open source organisation is just like entering a room full of unknown people. Sometimes the student might not know when to ask for help or feedback. Communication becomes the main barrier in such cases.

Arunan: As a student, the hardest part was selecting an organisation and working with them before submitting a proposal. GSoC has gained more and more popularity over the years and the competition is very tough. This might discourage many students and they might postpone their idea of participating in GSoC to the following year. Students should learn to overcome this fear and start trying. Once you have passed a threshold point of getting to know the organisation, the path becomes clear and easy. Once you reach this point, you get all the motivation in the world to start and complete the project because it is an exciting journey.

What advice would you give to a student who is applying for GSoC? Is there something you’d go back and tell yourself when you were a student? 

Adrián: In my view, and re-iterating what I’ve stated in my answer to the previous question, I encourage students to communicate with mentors constantly, and ask about any issue that may arise during the program, while still keeping a high degree of independence.

Aman: GSoC is about open source communities. The student should keep in mind that his/her code would be used by a lot of people all over the world. Each and every aspect of the student’s work has a great impact on a lot of people and a lot of dependent projects. With this thought, comes a great responsibility of ownership. The student should work passionately and should ask for feedback and suggestions from other community members to enhance his/her work.

What tips would you give to first-time mentors? 

Adrián: For first-time mentors, I strongly advise being proficient enough with the tech stack and having a clear idea of what the desired output from the project is – especially if the project has not been proposed by you, so that you are able to guide the student through the program. In addition to that, make sure to continuously be in close communication with at least one senior mentor in the organization, so that any arising matters can be cleared.

Aman: Mentors should understand the project thoroughly. Understanding the various components of the project is extremely necessary. One should be in sync with the core team of the organisation and should discuss the expectations for the project. Selection of students is the most important part of GSoC. It is always better to discuss the various students with the other team members before coming to the final selection.

Arunan: Mentoring might seem hard, especially if you are not part of the internal InterMine team. But if you are comfortable with the project and the tech stack, then mentoring won’t be a problem. Mentors need to be up-to-date on the project all the time and should have some patience when the student struggles. If you are a first-time mentor, it is better to co-mentor with a person who is in the internal InterMine team so that decision making is easy and aligns with the future work of the organisation.

Interested in participating as a mentor or student yourself?

Mentoring: If you’re interested in mentoring, please email yo@intermine.org to discuss your project ideas. Generally we expect mentors to be known to us and/or have had some involvement in the InterMine community before participating as a mentor. You can also read through our Guidance for Mentors.

Interested student / intern: Check out our guide for student applicants. In 2020 we may well be participating in Outreachy as well as GSoC – so you don’t have to be a student to apply!

Call recording available: GSoC 2019 Final Presentations

Our Google Summer of Code students presented their work at a special edition of the community call yesterday. You can catch up on the entire recording on YouTube – or scroll down to see individual presentations. The agenda and notes accompanying the call (including code and slides links) are in Google Docs.

Prabodh Kotasthane – Spring Migration

Prabodh’s presentation starts at 3:54: https://youtu.be/ZzV6JmVRQmA?t=234

Slides

Ankur Kumar – InterMine Cloud

Ankur’s presentation starts at 13:12: https://youtu.be/ZzV6JmVRQmA?t=792

Laksh Singla – Upgrading imjs & im-tables

Laksh’s presentation starts at 21:08: https://youtu.be/ZzV6JmVRQmA?t=1268

Rahul Yadav – Single Sign-In

Rahul’s presentation starts at 27:39: https://youtu.be/ZzV6JmVRQmA?t=1659

Deepak Kumar – InterMine Schema Validator

Deepak’s presentation starts at 34:11: https://youtu.be/ZzV6JmVRQmA?t=2051

Akshat Bhargava – Data Visualisations

Akshat’s presentation starts at 41:30: https://youtu.be/ZzV6JmVRQmA?t=2490
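For anyone building similar deep links: the `t` parameter in the links above is the start time in seconds. A minimal sketch of the conversion follows; the helper names `timestamp_to_seconds` and `link_at` are made up for illustration.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'mm:ss' or 'hh:mm:ss' into a total number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each step to the right is worth 60 times less than the one before.
        seconds = seconds * 60 + int(part)
    return seconds


def link_at(video_id: str, timestamp: str) -> str:
    """Build a youtu.be link that starts playback at the given timestamp."""
    return f"https://youtu.be/{video_id}?t={timestamp_to_seconds(timestamp)}"


print(link_at("ZzV6JmVRQmA", "3:54"))  # https://youtu.be/ZzV6JmVRQmA?t=234
```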

14 August: InterMine Community Call and Google Summer of Code Final Presentations

After weeks and weeks of fabulous work, our six Google Summer of Code projects are approaching the finish line. As in previous years (2018, 2017), our students will be sharing their work in a series of 5-minute presentations at an InterMine Community Call. Everyone from the InterMine community is encouraged to come and see what our fantastic students have been up to.

Joining the call

The call will be on the 14th of August 2019. (Note we previously advertised the call as being on the 15th; this was an error – the call is definitely on Wednesday the 14th of August).

Time: 17:00 UK time / 21:30 IST / or check your time zone here: https://arewemeetingyet.com/London/2019-08-14/17:00/Final%20presentations

Agenda and joining instructions: https://docs.google.com/document/d/14KAdYACPowLxcIhOe6yVzeYsHMnSy2X0WzuJ124KZ30/edit#heading=h.x7mc3otkj1bu
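If you would rather work out the call time yourself, here is a small sketch of the conversion. It assumes fixed offsets for August 2019: the UK is on British Summer Time (UTC+1) and India Standard Time is UTC+5:30. For other zones you would substitute your own offset.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed for August 2019: the UK is on British Summer Time
# (UTC+1), and India Standard Time is UTC+5:30 all year round.
BST = timezone(timedelta(hours=1), "BST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")

call_start = datetime(2019, 8, 14, 17, 0, tzinfo=BST)

print(call_start.astimezone(IST).strftime("%H:%M %Z"))           # 21:30 IST
print(call_start.astimezone(timezone.utc).strftime("%H:%M %Z"))  # 16:00 UTC
```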

Here’s a sneak preview of what our students have been working on:

GSoC Interview: Akshat Bhargava on new data visualisations for BlueGenes

This is our blog series interviewing our 2019 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Akshat Bhargava, who will be creating data visualisations for BlueGenes.

Hi Akshat! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

Hi InterMine team, I’m very excited for this upcoming summer too! I’m a Computer Science undergraduate going to start my 3rd year this August. I’m primarily a JavaScript developer (Web & Hybrid Mobile) and have been working with it for the last 2.5+ years, but the real me is a person who loves to solve problems in general, whether or not they are related to programming. I’ve been exploring the field of data visualization for the last few months and I am in love with it. Have a look at the IPL (cricket) data viz I created a few months back here.

Apart from coding, I love reading about psychology, history and watching horror movies.

What interested you about GSoC with InterMine?

I find it magical how numbers show their true faces when seen via a meaningful visualization, and this is why I’m most excited for this summer with InterMine.

Real World Bio Data + Data Viz = Something big coming in! ❤

Another reason for my interest in InterMine is that I applied to InterMine last year too, for the Cross InterMine Search Tool, and couldn’t make it, but I came to understand its community and how they work. The mentors are very helpful and supportive to everyone, so I directly jumped here this year. 😀

Tell us about the project you’re planning to do for InterMine this summer.

InterMine has tons of different types of biological data. This summer I’ll mostly be working on discussing and developing visualizations for that data, making it easier for biologists to understand and to draw relevant conclusions from a single look at the graphs.

There is a piece of software called BlueGenes, which is already developed and helps explore different mines. It provides a tool API which allows JavaScript developers to create additional visualization tools on top of it, which can be integrated on any Gene or Protein result page. My goal for this summer is to create a variety of such visualization tools in order to enrich the visualization of different types of data.

As an example of how useful this work is, one of the visualizations I’ll be developing will help us understand how the expression of a particular gene is distributed among different tissues. This information is helpful for cancer biologists who want to assess whether a gene is highly expressed across different tissues of an organism, because that gives a relative picture of the degree to which it’s implicated in diseases.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

Data visualization is something that requires you to understand the data properly first in order to be able to actually create some meaningful visualization out of it, and since I’m not very familiar with InterMine’s data model and the related biological terms, I’ll face some difficulties during my thought process of “what and why” to visualize. To overcome this, I’m already exploring more and more of InterMine’s data model, trying to understand how to deal with different types of data, and how to create the appropriate visualization for them. Mentors are really helping me out with this (overall in terms of tech, viz and everything). 🙂

Share a meme or gif that represents your project

[Image: meme shared by Akshat]

GSoC Interview: Migrating from Struts to Spring with Prabodh Kotasthane

This is our blog series interviewing our 2019 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Prabodh Kotasthane, who will be working on a project to migrate InterMine’s RESTful web services from Struts to Spring.

Hi Prabodh! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

I am a final year Computer Science Engineering (B.Tech.) student from Birla Institute of Technology, Mesra, India. I have been coding in Java and related frameworks, C/C++ and Python for the last 4 years. I have contributed to the open source community, some software development projects and hackathon projects at college level.
I successfully completed GSoC 2018 with OpenMRS. My project was OAuth Module Enhancement and SMART Apps Support. Details about the project can be found here:

https://pkatgithub.github.io/GSoC-2018-Final-Evaluations/

Presently I am doing an internship at Microland Limited, Bengaluru, India, until the end of May 2019. Here I am working with graph databases and technologies like Apache Kafka and Neo4j, with all the coding done in Python.
Apart from coding, I have many other interests and hobbies which include singing, cooking, photography, fine arts, basketball and writing.

What interested you about GSoC with InterMine?

It was around mid Jan this year when I got to know about InterMine.
Previously, I have worked with the Java Spring Framework and hence I am comfortable with it. So, I was searching for GSoC organisations which have something to do with Spring, and then I got to know about InterMine and their project to migrate the web services from Struts to Spring.
I read more about InterMine and also about the project, and I found it interesting. I went through the documentation of the project and joined the discord handle of InterMine so that I could connect with the mentors and the community.
I had a warm welcome into the community. Julie, Daniela and Yo are always excited to chat and exchange thoughts and it’s been such a good time with them till now.
All in all, this community, this project and the people associated with this project made me believe that I can do a GSoC with InterMine!

Tell us about the project you’re planning to do for InterMine this summer.

Presently InterMine uses the Struts framework, which is outdated. InterMine provides RESTful web services which make it possible to execute custom or templated queries, search keywords, manage lists, discover metadata, perform enrichment statistics and manage user profiles.
The main objective of this project is to migrate the web services from Struts to the Spring framework and document the APIs with Swagger in compliance with the OpenAPI Specification.
The Spring framework is evolving all the time and is more robust and flexible than the Struts framework.
OpenAPI specifications are easy to write, and Swagger Codegen, which supports Spring, makes the developer’s job easier by generating code stubs which can be modified to render the services.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

InterMine has a lot of web services, a total of 70, with various different functionalities.
The business logic of the web services is strongly dependent upon the classes in webcore, and in order to migrate a web service, knowledge of the underlying logic layer is a must. This is going to be a real challenge. It will be necessary to give proper time to understanding this business logic layer of the project.
Apart from this, writing tests is also a time-consuming job. I wish I could get some help with that! 😛

Share a meme or gif that represents your project

[Image: meme shared by Prabodh]

GSoC Interview: InterMine Schema Validator with Deepak Kumar

This is our blog series interviewing our 2019 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Deepak Kumar, who will be working on the InterMine Schema Validator.

Hi Deepak! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

Hi, thank you for this opportunity. Let me first talk about myself: my name is Deepak Kumar, and I live in Ahmedabad, India with my family. I started coding when I was 17. I had two great teachers in my school days who introduced me to computer programming, and from that time I got interested in this field.

I completed my graduation in Computer Applications at St. Xavier’s College, Ahmedabad, and I’m currently doing the postgraduate MSc IT (Information Technology) program at DA-IICT, Gandhinagar, India.

Now for the technical details: I love working on challenging projects, and I’ve worked on several. One of my favourite projects, which I created while pursuing my bachelor’s degree, was ‘Smallscript’. It’s a compiled programming language that compiles to bytecode and runs on the JVM, which makes it platform-independent. It’s my favourite project because it was challenging: when I started I didn’t know any technical details about compilers, so I had to start from scratch.

I’ve also worked with a startup company, where I was a backend developer in a team of 8 people, and our team was really fantastic. I worked on two projects there and really enjoyed it; working with a big team was a wonderful experience.

I’ve recently started my open source journey with GSoC 2019. Though I’m new to open source, I’ve started contributing to ‘JabRef’, and as I’ve been selected for GSoC 2019, I’m also going to work with InterMine this summer; I plan to keep contributing to InterMine after GSoC is over. I also regularly participate in coding contests and hackathons. In one AI contest, I built an AI game that ranked 68th among thousands of participants.

Currently, I’m working at OpenXcell Technolabs as an intern, which is part of my MSc IT Master’s program. I love reading, travelling, table-tennis and working with new technologies.

What interested you about GSoC with InterMine?

When GSoC 2019 was about to start, I had already bookmarked a few of the previous year’s organizations I was interested in, and was hoping that InterMine would be part of GSoC 2019 too. When the organization list came out, I was super excited to see InterMine on the list. After going through InterMine’s idea list, I found myself very interested in the ‘InterMine Schema Validator’ project, so it was really the project that made me interested in the community.

Tell us about the project you’re planning to do for InterMine this summer.

I’ll be working on a project named ‘Schema Validator’ for InterMine this summer. The project is quite simple to explain: it’s going to be a library that takes a file as input and outputs whether or not that file follows a particular schema. From the first day, my goal will be to make the project as general as possible, so that it can easily be extended to support other schemas as well.
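Editor’s illustration: to make the idea of a “general” validator concrete, here is a rough Python sketch. It is not Deepak’s actual design or the InterMine library’s API. The schema name `three-column-tsv`, the `column_count` rule and the `SCHEMAS` registry are all hypothetical, chosen only to show how new schemas could be added without touching the validation loop.

```python
import io
from typing import Callable, Dict, List, Optional, TextIO

# A rule inspects one line and returns an error message, or None if the line is fine.
Rule = Callable[[str], Optional[str]]


def column_count(expected: int, sep: str = "\t") -> Rule:
    """Hypothetical rule factory: each line must have `expected` fields."""
    def check(line: str) -> Optional[str]:
        found = len(line.split(sep))
        if found != expected:
            return f"expected {expected} columns, found {found}"
        return None
    return check


# Hypothetical registry: supporting another schema means adding an entry here.
SCHEMAS: Dict[str, List[Rule]] = {
    "three-column-tsv": [column_count(3)],
}


def validate(stream: TextIO, schema: str) -> List[str]:
    """Return a list of error messages; an empty list means the file is valid."""
    rules = SCHEMAS[schema]
    errors = []
    for number, line in enumerate(stream, start=1):
        line = line.rstrip("\n")
        for rule in rules:
            message = rule(line)
            if message is not None:
                errors.append(f"line {number}: {message}")
    return errors


good = io.StringIO("a\tb\tc\nd\te\tf\n")
bad = io.StringIO("a\tb\tc\nd\te\n")
print(validate(good, "three-column-tsv"))  # []
print(validate(bad, "three-column-tsv"))   # ['line 2: expected 3 columns, found 2']
```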

Are there any challenges you anticipate for your project? How do you plan to overcome them?

Yes, there are a few challenges that I will face while working on this project. One of the biggest, which I’m currently trying to solve, is performance. As the purpose of this project is to validate schema files, the problem is how to handle larger files with 10GB or more of content. I need to discuss with my mentors what their expectations are for the performance of the library.

Currently, I’m thinking about the solution to this problem. Maybe I can boost the performance by concurrently running multiple instances of a Schema Validator. However I implement it, though, if the library is validating a 10GB file then it is definitely going to take some time.
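Editor’s illustration: one way the “multiple instances” idea could look, sketched in Python under stated assumptions. The three-column rule in `check_batch` is hypothetical, and a thread pool is used only to keep the example self-contained. For CPU-bound rules in CPython, a `ProcessPoolExecutor` (which shares the same `Executor` interface) would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def batches(lines: Iterable[str], size: int) -> Iterator[Tuple[int, List[str]]]:
    """Yield (first_line_number, lines) batches of at most `size` lines."""
    iterator = iter(lines)
    start = 1
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield start, batch
        start += len(batch)


def check_batch(job: Tuple[int, List[str]]) -> List[str]:
    """Hypothetical rule: every line must have exactly three tab-separated fields."""
    start, lines = job
    return [
        f"line {start + offset}: expected 3 columns"
        for offset, line in enumerate(lines)
        if len(line.rstrip("\n").split("\t")) != 3
    ]


def validate_concurrently(lines: Iterable[str], batch_size: int = 1000,
                          workers: int = 4) -> List[str]:
    """Validate batches of lines in parallel and merge the error lists."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so errors stay sorted by line.
        results = pool.map(check_batch, batches(lines, batch_size))
        return [error for batch_errors in results for error in batch_errors]


sample = ["a\tb\tc\n", "d\te\n", "f\tg\th\n", "i\n"]
print(validate_concurrently(sample, batch_size=2))
# ['line 2: expected 3 columns', 'line 4: expected 3 columns']
```

One caveat: `Executor.map` submits every batch up front, so for a file of 10GB or more a real implementation would also need to limit how many batches are in flight at once.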

Then there are also a few challenges regarding the implementation of the schema rules.