In release 8.1 we have made some major changes to the layout of FlyMine. This has been done in response to feedback from users and to make the most exciting parts of FlyMine more obvious. We have also added public lists and a new help system.
A new header provides tabs for easy access to the major areas of the site. The Templates and Lists tabs link to new search pages that show all templates/lists and update as you type keyword search terms. You can also restrict by data category and, if you are logged in, restrict to your favourites or your own saved data. The search bar in the header is a quick way to search identifiers and now also templates and lists.
The updated home page layout introduces the main concepts – Data categories, Templates, Lists and the QueryBuilder. There is also space for us to highlight new or interesting templates and public lists.
In response to user feedback we have renamed some elements of the FlyMine interface to make them more obvious. ‘Bags’ are now called ‘Lists’ and ‘Aspects’ are now ‘Data categories’. We also now refer to Report pages instead of the computer-speak Object details pages, and to a List analysis page instead of a Bag details page. Hopefully it won’t cause confusion for too long.
Another major development is the introduction of public lists. These are lists that we create and make available for everyone to use; typically they will be derived from a publication. In this release we have added public lists based on FlyAtlas expression in the adult fly tissues and FlyTF transcription factor lists. Click on an example on the home page to see a list analysis page with graphs, tables, statistical enrichment and the results of templates run for the list. We can add new public lists at any time, so please contact us if you have a suggestion.
List Analysis Widgets
We have added two new ‘widgets’ to the analysis page for a list of genes. These show statistical enrichment of InterPro protein domains within a list of genes and the publications that reference the most genes from the list (data from PubMed). We will continue to add to the list analysis page in every release, so let us know if you have any suggestions.
A new context-sensitive help system has replaced the FlyMine manual. On every page there is a ‘?’ icon at the top right; click this to get help with the current page and find out what you could do next. The help should now be briefer and more readable.
- Results page
- FIX – column summaries sometimes gave misleading results when counting unique rows.
- FIX – the total number of rows and unique rows in summaries now update correctly in Safari.
- FIX – colons are now allowed in list/template names.
- FIX – occasional issues with creation dates of templates/lists are fixed.
- Report page
- NEW – the bottom of each report page shows any lists (public or saved) that contain the object.
- NEW – template descriptions are now included when inline tables are open.
- FIX – some issues with the view list and sort order on Internet Explorer are fixed.
- FIX – LOOKUP constraints can now convert wildcard matches of an incorrect type – e.g. convert EVE_DROME* (a protein identifier) to a gene.
- FIX – LOOKUP constraints and list upload now accept names that contain spaces – e.g. ‘even skipped’ for the eve gene.
- FIX – sort order is now remembered for saved queries and is no longer reset when removing elements from the view list.
In release 8.0 we have updated the data model for protein interactions, protein structure and orthologues to make them easier and faster to query. We also include three new data sets: Tiffin (predicted regulatory motifs), anoEST (A. gambiae EST clusters) and predicted 3-D protein structures for A. gambiae domains. A new type of constraint is used in template queries which now allows you to enter any identifier or symbol.
- NEW – D. melanogaster predicted regulatory motifs and functional sites (motif instances) from the Tiffin database. Note these have been mapped to the release 5.0 genome sequence.
- NEW – A. gambiae EST and EST clusters from the Imperial College London Centre for Bioinformatics.
- NEW – A. gambiae protein domain 3-D structure predictions from Kenji Mizuguchi. This data provides structure predictions for regions of A. gambiae proteins that correspond to a Pfam domain.
- NEW – H. sapiens and M. musculus protein information included from UniProt; we already include InParanoid orthologues to these organisms.
- User Interface
- NEW – LOOKUP constraints. Most template queries are now updated to use a new type of constraint. Now you don’t need to enter a specific type of identifier – for example for the D. melanogaster gene zen you could enter ‘zen’, ‘CG1046’, ‘FBgn0004053’ or ‘zerknullt’ and get the same result. You could even enter a protein or transcript identifier and it would find the gene.
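The idea behind LOOKUP constraints can be illustrated with a small Python sketch. This is not FlyMine's implementation: the dictionary layout, the function name `lookup_gene` and the case-insensitive matching are assumptions made for illustration. Only the four zen identifiers are taken from the note above.

```python
# Toy synonym index: every known identifier, symbol or name points at one
# canonical gene identifier. The table layout is an assumption; the four
# entries for zen are the ones quoted in the release note.
GENE_INDEX = {
    "zen": "FBgn0004053",
    "cg1046": "FBgn0004053",
    "fbgn0004053": "FBgn0004053",
    "zerknullt": "FBgn0004053",
}


def lookup_gene(term):
    """Return the canonical gene id for any kind of identifier, or None.

    Matching ignores case and surrounding whitespace (an assumption).
    """
    return GENE_INDEX.get(term.strip().lower())


if __name__ == "__main__":
    for term in ["zen", "CG1046", "FBgn0004053", "zerknullt", "not_a_gene"]:
        print(term, "->", lookup_gene(term))
```

Whatever form of the identifier is entered, the same gene is found; an unknown term simply yields no match.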
We are running a half-day workshop on using FlyMine in Cambridge, UK, on June 20th. This is part of a joint workshop with FlyBase; half a day will be devoted to each. The workshop is free of charge. For more information and to sign up see here.
InterMine will be used as a major component of the Data Coordination Centre for the $57m US NIH modENCODE project (www.modencode.org). G. Micklem has been awarded two posts for four years to apply the technology developed in the FlyMine project to the dissemination of data produced by modENCODE. This programme will generate an unprecedented amount of data about functional elements in the genomes of the fruit fly Drosophila melanogaster and the nematode Caenorhabditis elegans.
Release 7.1 includes a number of new user interface features. Results of queries can be ordered by a selected column and results tables have a new summary button which brings up statistics on the values in that column. Many template queries now have more descriptive column titles in results tables.
- User Interface
- NEW – Column summaries in results tables. Each column of a results table now has a summary icon; clicking this brings up a box with more information about data in the column. For numerical data it will show the minimum value, maximum value, mean and standard deviation. For text it will display the number of unique values and the most commonly occurring values with their frequency.
- NEW – Sorting query results. The QueryBuilder allows you to select an element from the output to sort results by. A sort button lets you choose ascending or descending order, for example to display results with the highest confidence score or most recent publication first.
- NEW – Results column titles. Many template queries are now configured to have more descriptive column headings in results tables. The full path can be seen by hovering the mouse pointer over the description.
- NEW – Chromosome distribution viewer. The gene details page for D. melanogaster and A. gambiae now includes a chromosome distribution viewer. This shows how many genes from the bag are found on each chromosome; click on a bar to see a list of the genes. The graph also shows an expected number of genes for each chromosome based on the distribution of all genes between chromosomes and the size of the bag.
- NEW – Accurate counts on the results page. The results page used to show only an approximate number of rows returned from a query (unless the ‘Last’ link was clicked). The estimate is now updated to give an accurate ‘Total rows’ figure once it has been calculated.
- UPDATE – The trail (e.g. Query -> Results -> Gene -> Protein) is now more complete to allow easy navigation back to recently viewed queries, results, object details or bag pages.
- FIX – Performance has been improved when saving and viewing large bags of objects.
- FIX – Renaming bags now works correctly.
- FIX – Missing export options from results pages have been fixed. Genome features can be exported as FASTA or GFF3; protein interactions can be exported in Cytoscape SIF format.
- FIX – Sequences from Translation objects can now be exported as FASTA.
- FIX – Some minor issues with display on Internet Explorer have been fixed.
- FIX – TFModules from REDfly are now shown in GBrowse and have a GBrowse image on their details pages.
- Known issues
- There are currently no known problems with release 7.1.
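As a rough illustration of the column summaries described above, the following Python sketch computes the same statistics for one column of values. The function name `summarise_column`, the `top_n` parameter and the use of the sample standard deviation are assumptions; the notes do not say which form of standard deviation FlyMine reports.

```python
from collections import Counter
from statistics import mean, stdev


def summarise_column(values, top_n=3):
    """Summarise one results column (illustrative sketch, not FlyMine code).

    Numerical columns get min, max, mean and sample standard deviation.
    Any other column gets the number of unique values and the most common
    values with their frequency.
    """
    if values and all(isinstance(v, (int, float)) for v in values):
        return {
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            # stdev needs at least two data points
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
    counts = Counter(values)
    return {"unique": len(counts), "most_common": counts.most_common(top_n)}


if __name__ == "__main__":
    print(summarise_column([0.91, 0.42, 0.77, 0.65]))
    print(summarise_column(["2L", "X", "2L", "3R", "2L"]))
```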
Release 7.0 updates the D. melanogaster genome to 5.1 annotation and other genome annotation sources have been re-mapped. GO enrichment and KEGG pathway viewers have been added to the gene Bag Details page.
- UPDATE – The D. melanogaster genome has been updated to annotation version 5.1. Data from DrosDel, FlyReg, REDfly and the microarray tiling path have been re-mapped using UCSC LiftOver.
- NEW – UniProt keywords (e.g. Acetylation, Sulfate transport) and protein features (e.g. HELIX, DNA_BIND) have been added.
- NEW – KEGG pathway information added for D. melanogaster.
- NEW – FlyAtlas now has data for three more tissues – larval fat body, larval tubule and male accessory gland.
- UPDATE – InParanoid orthologues have been updated to a release from January 2007.
- UPDATE – Four more D. melanogaster RNAi screens added from the DRSC to the RNAi aspect.
- UPDATE – GO annotation, protein-protein interactions and UniProt protein data are all updated to recent releases.
- FIX – Missing protein structure data has been added; protein structures can now be viewed with JMol again.
- FIX – Missing protein interaction detection method has been replaced.
- FIX – Missing INDAC oligo sequences added.
- User Interface
- FIX – Drosophila gene names now work in the quick search box; a number of minor problems with quick search have been fixed.
- NEW – The trail (e.g. Query -> Results -> Gene) has been improved for easy navigation between queries, results, and details pages.
- NEW – GO enrichment widget on the gene Bag Details page. For the genes in the bag this lists the number of genes with a particular GO term and a p-value which is the probability that this number of genes were annotated with the GO term by chance, given the abundance of the GO term in a reference population.
- NEW – KEGG pathway widget on the gene Bag Details page. This shows the number of genes in the bag that are associated with a particular KEGG pathway; links give the list of genes and more information about the pathway.
- UPDATE – The constraint editor pane of the QueryBuilder has been made clearer.
- NEW – import query from XML link added to the FlyMine home page.
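The p-value shown by the GO enrichment widget can be pictured with a short Python sketch. It assumes a hypergeometric model, which is one common way to test enrichment; the notes do not state which test FlyMine uses or whether any multiple-testing correction is applied. The function and parameter names are illustrative only.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Probability of seeing at least k annotated genes in the bag by chance.

    k: genes in the bag annotated with the GO term
    n: number of genes in the bag
    K: genes in the reference population annotated with the GO term
    N: number of genes in the reference population

    Assumes the bag is a random draw without replacement from the
    reference population (hypergeometric upper tail).
    """
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 8 of 50 bag genes carry a term that annotates
    # 200 of 14000 reference genes.
    print(enrichment_p_value(8, 50, 200, 14000))
```

A small p-value means that so many genes with the term would rarely turn up in a random bag of the same size.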
FlyMine 6.1 contains a major overhaul of the way bags are handled. Bags can now only contain actual objects rather than identifiers or symbols. This means that any object in a bag has already been found in FlyMine which should reduce confusion. A sophisticated bag upload system has been added to aid in creating bags from external lists of identifiers.
PLEASE NOTE – most saved user content is automatically upgraded between FlyMine releases. In this case it was not possible to port some types of bags. These are still available in the 6.0 archive; please contact support [at] flymine.org if you have any queries about transferring bags.
Bags also now have a type (class) assigned to them – for example Gene, Protein, GOTerm. This means that when editing a constraint in a template query only bags of the correct type will be listed – so if the template requires you to enter a Gene identifier the bags dropdown will list any Gene bags in your profile. The same is true when creating/editing a query in the QueryBuilder, just add a constraint on the identifier, name, etc of a class and you will see available bags.
The ‘Bags’ page in MyMine (select ‘Bags’ or ‘MyMine’ from the top menu bar) now allows you to paste in a list of identifiers and select a type for the new bag. The input can be a mixture of different identifier types; for example, if you wish to create a bag of Drosophila genes it can be a mixture of CGxx, FBgnxx and symbols. In the case where an object can’t be found that matches a particular input identifier, FlyMine will attempt to help. For example, if the input list contains a UniProt protein identifier but you choose to make a gene bag, the website will attempt to find a related gene. Any matches found in this way will be reported for you to choose which are added to your bag.
As an example, consider creating a Gene bag from these identifiers: zen, CG2328, FBgn0015379, Q8IML9_DROME and unknown_name. FlyMine will find a gene for each of the first three identifiers and find the gene for the Q8IML9_DROME protein. The “unknown_name” will be reported as not found.
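The behaviour in this example can be sketched in Python as a three-way split of the input. The lookup tables and the gene labels on their right-hand side are hypothetical placeholders, not FlyMine data, and `build_gene_bag` is an illustrative name rather than part of FlyMine.

```python
# Hypothetical lookup tables. The values are placeholder labels only; they
# do not claim which gene each identifier really refers to.
GENE_IDS = {
    "zen": "gene-for-zen",
    "CG2328": "gene-for-CG2328",
    "FBgn0015379": "gene-for-FBgn0015379",
}
PROTEIN_TO_GENE = {
    "Q8IML9_DROME": "gene-for-Q8IML9_DROME",
}


def build_gene_bag(identifiers):
    """Sort input identifiers into direct matches, conversions and misses.

    Direct matches are gene identifiers found as given. Conversions are
    identifiers of another type (here, proteins) for which a related gene
    exists; these would be offered to the user to confirm. Everything else
    is reported as not found.
    """
    matched, converted, not_found = {}, {}, []
    for ident in identifiers:
        if ident in GENE_IDS:
            matched[ident] = GENE_IDS[ident]
        elif ident in PROTEIN_TO_GENE:
            converted[ident] = PROTEIN_TO_GENE[ident]
        else:
            not_found.append(ident)
    return matched, converted, not_found


if __name__ == "__main__":
    upload = ["zen", "CG2328", "FBgn0015379", "Q8IML9_DROME", "unknown_name"]
    matched, converted, not_found = build_gene_bag(upload)
    print("matched:", sorted(matched))
    print("converted:", sorted(converted))
    print("not found:", not_found)
```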
Also new are bag details pages. These are accessible for any of your saved bags in the ‘Bags’ tab of MyMine. They have a similar layout to object details pages but run templates for all objects in your bag. On the page for gene bags is the first of many ‘widgets’ we plan to add for viewing and analysis of data in bags. Currently there is a widget that graphs the genes from a bag that are over/under expressed in different tissues according to the FlyAtlas data set (www.flyatlas.org). Note that clicking on any of the bars in this graph allows you to create a new bag of genes in that category. More functionality will be available on these pages in release 7.0.