GSoC Student Interview spotlight: ElasticSearch / Solr Project + Arunan Sugunakumar

This is our blog series interviewing our 2018 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Arunan Sugunakumar, who will be working on upgrading InterMine’s search facilities.

arunan-architecture.png

Hi Arunan! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

I am Arunan Sugunakumar, an undergraduate from the Department of Computer Science and Engineering, University of Moratuwa. I am attracted to the concept of open source because I get to learn a lot by seeing contributions from other people all over the world, and I learn by contributing myself. I did my internship at WSO2, an open source middleware company. I mostly contribute to Java, Python and JavaScript related projects. I am also interested in the Internet of Things and Big Data.

I like to read books in my spare time. It helps me to clear my mind. I also like to play Scrabble, a popular word game.

What interested you about GSoC with InterMine?

I came to know about InterMine through a friend, and when I went through the project ideas and the community, I made up my mind to try to be part of this organization. Most of the project ideas were associated with the core InterMine product rather than trial and error projects, so I knew that if I became a part of it, my contributions would be there in all InterMine instances. That gave me most of the excitement, and the mentors were also very friendly and supportive.

Tell us about the project you’re planning to do for InterMine this summer.

Currently, InterMine uses an outdated library to handle searching biological data. My project aims to improve the search feature using a modern search engine like Apache Solr or ElasticSearch. The existing architecture in InterMine has to be modified to handle the new approach, and it should reduce the complexity for the user.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

The main challenge for me is to understand the existing code base so that I can change it without breaking the workflow. I need to work closely with my mentor and need to update them with every change I make. Also I have to communicate my doubts to the community in a friendly manner so that I can get input from everyone.

Another challenge that I might face is choosing the appropriate search engine. There are many open source search engines out there, and each has its own strengths. So I need to discuss the options with my mentor to select a search engine that is suitable for the project.

Share a meme or gif that represents your project

apache-solr-spongebob.gif

GSoC Student Interview Spotlight: InterMine Data Browser + Adrián Rodríguez Bazaga

This is our blog series interviewing our 2018 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Adrián Rodríguez Bazaga, who will be working on the InterMine Data Browser.

Hi Adrian! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

Hi, I am Adrián Rodríguez Bazaga, a student from UPC, Barcelona (Spain). I am a Computer Scientist who is currently pursuing a Master’s degree in Data Science and Machine Learning. One of the things that characterizes me is my passion for science and for solving problems using technology and, when it’s possible, doing so in collaboration with other enthusiasts in the field, which is how the Open Source philosophy works!

Apart from this, I love animals, especially cats and Cavalier King Charles Spaniels. I could spend all day long cuddling them if I had the time. I also love to play chess and any kind of board game in my spare time!

What interested you about GSoC with InterMine?

As a student, I’m still learning about everything that interests me, and although my major is Computer Science (and Artificial Intelligence related topics), I’m very interested in the bioinformatics world. InterMine sits within that landscape, so it gives me the perfect opportunity to learn the “bio-concepts” behind the project I am involved in by applying my Computer Science skills.

Tell us about the project you’re planning to do for InterMine this summer.

Currently, the InterMine services offer a query builder to search for biological data over the different mines. Although this is a very useful tool, the user needs to know how the data is structured (data model) on each mine, in order to create the desired queries. Since knowing the data model is mandatory to use this query builder, it can, indeed, become overwhelming for new users who want to search for some specific information in the data.

To address this, my project is to implement a faceted search tool to display the data from the InterMine database, allowing users to search easily within the different mines available around InterMine. We have already made some progress on that, as you can see in the following picture:

adrian-prototype
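
As a rough illustration of what faceted search means in practice, here is a minimal Python sketch. It is not taken from the Data Browser or the InterMine API: the sample records, the field names (`organism`, `type`) and the helper functions `facet_counts` and `apply_facets` are all hypothetical. The idea is simply that the tool shows how many results fall under each value of a field, and the user narrows the results by picking values rather than by writing a query against the data model.

```python
from collections import Counter

# Hypothetical sample records; a real mine would supply these through its web services.
RECORDS = [
    {"symbol": "zen", "organism": "D. melanogaster", "type": "Gene"},
    {"symbol": "eve", "organism": "D. melanogaster", "type": "Gene"},
    {"symbol": "PPARG", "organism": "H. sapiens", "type": "Gene"},
    {"symbol": "PPARG-201", "organism": "H. sapiens", "type": "Transcript"},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet field."""
    return Counter(record[field] for record in records)


def apply_facets(records, selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


if __name__ == "__main__":
    # Counts shown next to each organism before anything is selected.
    print(facet_counts(RECORDS, "organism"))
    # After the user picks an organism, the remaining facets are recounted.
    human = apply_facets(RECORDS, {"organism": "H. sapiens"})
    print(facet_counts(human, "type"))
```

Each selection only filters and recounts, so a new user can explore the data without knowing how classes and attributes are named in a given mine.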

Are there any challenges you anticipate for your project? How do you plan to overcome them?

The main challenge of my project lies in the fact that the data browser is intended to be used by beginners, without the added difficulty of needing to know the data model in depth. This means that I will need to deploy an application that supports all of the functionality for searching InterMine repositories but has an easy-to-use interface, which is in itself a great challenge.

Share a meme or gif that represents your project

adrian-gif

GSoC Student Interview spotlight: InterMine Python Client + Nupur Gunwant

This is our blog series interviewing our 2018 Google Summer of Code students, who will be working remotely for InterMine for 3 months on a variety of projects. We’ve interviewed Nupur Gunwant, who will be working on the InterMine Python Client.

Hi Nupur! We’re really excited to have you on board as part of the team this summer. Can you introduce yourself?

I am Nupur Gunwant, a student from IIT Kharagpur, India. I am pursuing an Integrated Masters degree in Mathematics and Computing. I am an open source enthusiast and a maths lover. I love to solve problems and talk about ideas. I firmly believe in the power of Python and admire its versatility.

Apart from that, I am a lover of art. I want to further pursue my studies in the intersection of my artistic and technical interests. And most importantly, I always carry a book wherever I am.

What interested you about GSoC with InterMine?

I was deeply intrigued by the work InterMine does, and as a student I wanted to work with an organization that has such a huge impact on society. Another thing that motivated me to prepare hard to work with InterMine as a student developer was the fact that it’s such a healthy and friendly community, where ideas are appreciated and one is always motivated to work on them. I think that made InterMine the most desirable place to work.

Tell us about the project you’re planning to do for InterMine this summer.

I will be working on adding functionality to the Python Client, a very important part of InterMine at present. I will begin by creating a link between the InterMine Registry and the Python Client, so that the user can make use of the Registry features from the terminal.

Next, I will build a Query Manager that will be the main way to perform operations on user queries from the terminal, and lastly, I will add visualisations to the Python Client using matplotlib.

Are there any challenges you anticipate for your project? How do you plan to overcome them?

The biggest challenge is to meet all of the users’ needs for the Client across the three subparts of my project. I am planning to make community interactions and feedback the main source of review for my work; given the community’s experience with user experience, this should help a great deal in overcoming the problem.

Share a meme or gif that represents your project

Webp.net-gifmaker (1)

GSoC 2018 Students Announced! 🌞☀️

After last year’s great success, we’re really excited to welcome six Google Summer of Code students to work with us again this year:

Aman Dwivedi will be working on a Cross-InterMine search tool. This will use the registry to allow users to search multiple InterMines at once, and should be a good way to figure out which mine has the data you’re looking for. Aman will be mentored by Nadia Yudina, herself a graduate of last year’s InterMine+GSoC program.

Adrián Rodríguez Bazaga will be working on something we’ve always wanted: an InterMine data browser – hopefully a tool that will allow users to learn a bit more about data inside an InterMine without having to know the data model. Yay for easier learning curves! Adrian’s mentor will be Yo Yehudi.

Arunan Sugunakumar is going to explore hooking InterMine up to a more modern search package, probably Solr or ElasticSearch. Our current version of Lucene is very old, and we know there are better options out there! Daniela Butano will mentor this project.

Jake Macneal is going to work on a prototype to convert natural language questions into InterMine PathQuery – it would be exciting to have a user type “Show me all the genes associated with diabetes” into an InterMine, and get a sensible set of results back! Aaron Golden will mentor Jake.

Nupur Gunwant will be adding features to our Python client, such as registry communication, a query manager, and visualisations. Julie Sullivan will be Nupur’s mentor for this project.

Ankit Kumar Lohani will be working on Buzzbang – a search engine to crawl multiple biological sources including, but not exclusively, InterMine instances. Justin Clark-Casey will be Ankit’s mentor.

We’re also planning to post a short interview series highlighting each student and their plans for the summer. We can’t wait to get started!!

Google Summer of Code: Let the enquiries commence!

Last month we applied for InterMine to join Google Summer of Code (GSoC) as a mentor organisation, and we’re pleased to report that we have officially been accepted!

Students: Interested in working with us for GSoC?

Our GSoC site has a project ideas list and the student application guidance, which hopefully will answer most of your questions.

Want to learn more?

  • You can also read our GSoC blog posts from last year to learn more about how things went.
  • If you still have questions:
    • If the question is project-specific: email both listed mentors of the given project.
    • If the question is about GSoC in general, see the student manual.
    • We’ll be running a GSoC question and answer video call session where students can learn more about the specific projects. Updates about the exact date and time will go out on this blog, our mailing lists, and twitter.

We look forward to hearing from you!

Looking ahead: InterMine+Google Summer of Code 2018. Could you be a mentor?

2017 is coming to an end, and I have to say it’s been a fabulous one! I’ll probably post a “cool things InterMine did this year” round-up in a week or two – but in the meantime, here’s my final Google Summer of Code blog for you all! We’ll cover the InterMine swag just sent out across the globe, as well as plans for next year – and how you can help out.

Thank-you gifts for mentors and students

Last week, we posted care packages to all our GSoC mentors and summer students, in the form of t-shirts, stickers, and pens. The postal-service-wrinkled shirt shown above is the women’s fit shirt printed on black; unisex shirts are a slightly lighter grey colour. If you filled out the swag survey when it was sent to you, your gift should be with you soon! Tweet us your images of the items in use for extra InterMine Cool Points 😎.

GSoC 2018 – call for project ideas and mentors!

In early 2017, we put together an ideas list for GSoC projects – InterMine’s projects are numbers 3 to 9. If you want a better idea of what it’s like to apply (or be a mentor), read our application guidance from last year.

Do you have a nifty idea, or an InterMine itch you’d like to scratch?

Please share it with us! Add it to our 2018 Google Summer of Code ideas list, or if you need to sound things out and discuss them a little bit, comment on the GitHub issue, or email the dev list. You can even propose several ideas, if you like! Please add all ideas by the end of the 14th of December (the end of this week).

Would you like to try mentoring?

Fancy a chance to earn some nifty exclusive swag like that pictured above? Add your name as a possible mentor to an existing idea (or your own new idea). You can always drop us a line if you want to discuss things first. We like projects to have more than one mentor if possible.

Maybe you’re a student thinking of GSoC?

Awesome! If you have your own InterMine project idea (whether it’s brand new or you’ve already started it), or if one of the ideas on our ideas list lights your fire, it’s not too early to start talking with potential mentors about it. The application guidance we mentioned above would be a good read, too.

Talks and Workshops: Sharing our materials for re-use

Would you like to grab some ready-made slides or InterMine training workshop materials? We’ve rounded up some recent things that have been going on. Feel free to remix materials for your own talks and outreach efforts. If you do use them, we’d love to see the result!

Slides

You should have permissions to make a copy; if not, please contact us / tweet us / pop by chat to poke us with a stick.

3-min lightning talk at GSoC Mentor Summit: Citable version on Figshare | Google Drive (editable) version

Better Science Through Better Data: Citable version on Figshare | Google Drive (editable) version | Featured image above was live-scribed during the talk. Licence is CC-BY from Springer Nature, and the image is available from https://figshare.com/articles/Better_Science_through_Better_Data_2017_scidata17_scibe_images/5558653

Blank InterMine-branded slides: Get ’em here.

Posters

BlueGenes Poster: This poster was presented at BOSC 2017. Citable version on F1000 | Inkscape editable version (download Inkscape here: https://inkscape.org/en/release/0.92.2/)

InterMine Poster for Elixir UK All Hands 2017: PDF version | Inkscape editable version 

Workshop learning materials

We run an InterMine training workshop every term, covering the basics of using the webapp, as well as discussing how to draw data from the API. If you’re near Cambridge, keep your eyes open on the blog or twitter feed, as we’ll always announce them well in advance.

Workshop training materials in PDF: Workshop Exercises – handouts with answers | Workshop slides – note that these exercises were all correct with data from HumanMine in October 2017. Numbers of results may change if we add or update new data sources in the future, but the majority of the materials should still be generally correct apart from the results counts. 

You can download the original OpenOffice files as well if you’d like to adapt the materials for your own workshops, or feel free to contact us if you’d like to coordinate some training with us.

Side note: We’re also delivering a half-day workshop training session as part of the EBI’s 4-day Introduction to Multiomics Data Integration course – applications are open now until 01 December 2017.

Refs:

Data, Scientific (2017): Better Science through Better Data 2017 (#scidata17) scribe images. figshare.

https://doi.org/10.6084/m9.figshare.5558653.v1

Retrieved: 15:48, Nov 06, 2017 (GMT)