Oral Presentation 23rd Annual Lorne Proteomics Symposium 2018

Working with non-model organisms: an integrated omics workflow for effective assembly of species-specific protein landscapes (#16)

Eugene A Kapp 1,2, Oliver R Thomas 3, Peng Po 3, Anne Roberts 2, Pascal Bernard 3, Gerry Tonkin-Hill 1, Tony T Papenfuss 1, Andrew Webb 1, Stephen E Swearer 3, Blaine R Roberts 2
  1. WEHI, Parkville, VIC, Australia
  2. Neuroproteomics, The Florey Institute of Neuroscience and Mental Health, Melbourne, VIC, Australia
  3. School of Biosciences, University of Melbourne, Melbourne, VIC, Australia

Proteomics approaches are increasingly being employed in a diverse range of fields, particularly for the identification of key proteins in non-model organisms. Most traditional protein biochemistry has focused on organisms such as Homo sapiens or Mus musculus. Successful protein inference relies on alignment with sequenced genomes and proteins in public-domain databases, which poses a significant challenge for the study of non-model organisms for several reasons. Firstly, because such searches are typically run against large, non-specific datasets, their power is reduced. Secondly, the false discovery rate (FDR) for peptides is increased. Finally, large numbers of mass spectra remain unassigned, causing numerous novel proteins to be overlooked. Although the simplest solution is to sequence the genome of the study organism, the cost of doing so can be prohibitive. We present here a multi-omics workflow that allowed the identification of candidate inner ear proteins from Acanthopagrus butcheri (Black Bream), an unsequenced southern Australian fish. Our approach first involved sequencing the transcriptome and assembling it with Trinity to produce transcript sequences in FASTA format. Proteomic data collected from the RP-fractionated organic phase of ear stones and endolymph from wild, adult Black Bream were then searched against these transcript sequences in six reading frames to create a dataset of transcriptome-matching peptides. Next, we performed BLASTX analyses of the transcriptome against the RefSeq database, yielding a dataset of the best-aligned proteins from a variety of fish species. These two datasets were merged to create a database that best represents the species under investigation.
Finally, mass spectra were searched against this new database and the results were integrated with de novo sequencing, yielding a comprehensive protein landscape and the identification of novel proteins, many of which would not have been identified using a traditional proteomics approach.
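
As an illustration only, the six-frame translation and database-merging steps described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline used in this work: the function names (six_frame_translate, merge_databases) are hypothetical, the standard genetic code is assumed, and merging is assumed to mean removing exact duplicate protein sequences. In practice, search engines and dedicated tools usually handle these steps.

```python
# Illustrative sketch only: all function names here are hypothetical.
# Assumes the standard genetic code; unknown codons are translated as "X".

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence codon by codon, ignoring a trailing partial codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq):
    """Translate a transcript in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def merge_databases(*databases):
    """Merge name -> protein dictionaries, keeping the first copy of each sequence."""
    merged = {}
    seen = set()
    for database in databases:
        for name, protein in database.items():
            if protein not in seen:
                seen.add(protein)
                merged[name] = protein
    return merged


if __name__ == "__main__":
    for frame, protein in six_frame_translate("ATGGCCAAATAA").items():
        print(frame, protein)
```

Here, transcriptome-derived sequences and the best-aligned RefSeq proteins would each be supplied to merge_databases as a dictionary of identifiers to sequences; the identifiers and the exact-match deduplication rule are assumptions made for the example.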