Advances in sequencing technologies mean that genome sequence and annotation data for multiple strains of a species are now often available. An update to the InterMine core data model was decided that would allow addition of Strain data should it be available without affecting InterMines which do not have this data.
It was decided that the addition of a new class, Strain, which is referenced by Organism and Sequence feature and vice versa, would allow both the flexibility required and allow for addition of further data and expansion if required.
The Strain class has the following features/advantages:
- SequenceFeature entities, such as Genes, would continue to reference Organism, but would also reference the new Strain class, allowing for queries returning SequenceFeatures for a specific strain.
- Providing strain information as a separate class allows individual InterMine’s to reference other information as required, such as Genotype and Stocks.
- The Strain class extends BioEntity so will include strain-relevant attributes such as PrimaryIdentifier and Name and will reference other collections such as synonym.
- Minimal changes to the user interface will be required as, to our knowledge, SequenceFeatures in individual strains always have a unique identifier. With the help of templates if necessary, users will be able to identify particular SequenceFeatures and which strain they originate from.
To update your mine with these new changes, see upgrade instructions. This is a non-disruptive release.
See release notes and the notes from the community call for more details. Please join our community calls if you’d like to be part of future data model decisions! (Details of upcoming calls are available via our developer mailing list).