GSoC Student Interview spotlight: ElasticSearch / Solr Project + Arunan Sugunakumar

This post is part of our blog series interviewing our 2018 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Arunan Sugunakumar, who will be working on upgrading InterMine’s search facilities.

arunan-architecture.png

Hi Arunan! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

I am Arunan Sugunakumar, an undergraduate from the Department of Computer Science and Engineering at the University of Moratuwa. I am attracted to the concept of open source because I get to learn a lot by seeing contributions from other people all over the world, and I learn by contributing myself. I did my internship at WSO2, an open source middleware company. I mostly contribute to Java, Python and JavaScript related projects. I am also interested in Internet of Things and Big Data stuff.

I like to read books in my spare time. It helps me to clear my mind. I also like to play Scrabble, a popular word game.

What interested you about GSoC with InterMine?

I came to know about InterMine through a friend, and when I went through the project ideas and the community, I made up my mind to try to become part of this organization. Most of the project ideas were associated with the core InterMine product rather than trial-and-error projects. So I knew that if I became a part of it, my contributions would be there in all InterMine instances. That excited me the most, and the mentors were also very friendly and supportive.

Tell us about the project you’re planning to do for InterMine this summer.

Currently InterMine uses an outdated library to handle bio data search. My project aims to improve the search feature using modern search engines like Apache Solr / ElasticSearch. The existing architecture in InterMine has to be modified to handle the new approach, and it should reduce the complexity for the user.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

The main challenge for me is to understand the existing code base so that I can change it without breaking the workflow. I need to work closely with my mentor and keep them updated on every change I make. I also have to communicate my doubts to the community in a friendly manner so that I can get input from everyone.

Another challenge that I might face is choosing the appropriate search engine. There are many open source search engines out there, and each of them is good in its own way. So I need to discuss with my mentor which search engine would be suitable for the project.

Share a meme or gif that represents your project

apache-solr-spongebob.gif

Cool InterMine features roundup

I’ve said this before, but I’ll proudly say it again: one of the greatest things about being open source is the community. People are continually creative and resourceful with the tools we’ve built, and we love seeing all the different things you guys do with InterMine. Here’s a quick roundup of some of the things we’ve seen so far this year:

TargetMine’s Auxiliary Toolkit

TargetMine’s Auxiliary Toolkit offers advanced analysis for networks and enrichment

TargetMine links out from report pages to provide external enrichment and interaction tools. Read more about it here, or browse the tutorials: [Enrichment] [Interaction Network].

The Beany Mines

The beany mines (Soy, Peanut, Legume, and Bean) recently added a shared motif search, as well as a couple of other great visualisations:

legume-shared-motif-search

R and Solr

Colin of HymenopteraMine and BovineMine did a great blog post about using our R client, InterMineR, and then continued to impress by making efforts to upgrade InterMine to use Solr.

MOLD

Ever wondered what Model Organism Linked Data might look like? MOLD includes a queryable SPARQL endpoint and draws from multiple different InterMines to create a single dataset.

mold

Tip: Make it generic

Generic tools are ones that aren’t hard-coded to a specific Mine or model. We’re always on the lookout for new and exciting features, whether it’s a visualisation or a web service or a database tweak. If you think it’s good, you can email us to discuss it or simply create a pull request, and bask in glory forever after.

We’d love to see more!

This list is awesome (thanks everyone!!) but by no means exhaustive. If you think we’ve missed something out, or you’re doing something new at the moment, drop us a line and we’ll add you to the next roundup. We’d also love to hear from others who might be interested in guest-blogging an InterMine-related feature.