Place | Team Name | Project Name | Members
1 | GenomeLabs | Gene Body Atlas | Dennis Aumiller, Ray Gao, Musa Talluzi
2 | ProteoWizards | Human Uniqueness Map (slides) | Max Frank, Aleksei Shkurin, Ron Blutrich, Paul Frank, Julian Mazzitelli, Hailen Xu, Charlotte Nguyen
3 | SadCricket | Hills With A Halo (slides) | Jenny Yin, Alana Man, Yuqing Zou, Justin Lee, Ian Shi
Top 6 | A TAD Tired | TAD Viewer (slides) | Ashley Wang, Srishti Sehgal, Marco Ly, Frederick Zhang
Top 6 | BioCompBaes | Predicting gene functionality based on location (slides) | Kashish Verma, Christine Bui, Afifa Saleem, Samia Muqeem
Top 6 | Don't Do Shrugs Kids | Gene Interaction...and 👉YOU (slides) | Hammad K., Xi H., Nada E., Tanith J., Alison L., Ramnik C.
- | AAA | Relationship Graph | Amy Li, Alex Chen
- | pizzasquad | Pizza Graph | Bruno, ignaspa, Sid, Ben, John, Santiago
- | Hackthonians | BioHacks Project | Sotaro Hirai, Shisei Naka

March 17-18 2018

Bioinformatics and Computational Biology + CSSU Presents: BCB BioHacks 2018

Returning for its 2nd year at UofT, BioHacks is a two-day hackathon that will feature academic speakers, workshops, and industry guests. We can't wait to welcome a hundred undergraduate participants to learn, build, and share their solutions to current challenges in bioinformatics.

Bioinformatics is a classic interdisciplinary field. At BioHacks 2018, we're looking to bring computer science and life science students together to solve problems and apply domain knowledge. In addition, we want to expose our students to some of the fascinating research and industry opportunities local to Toronto.

We're inspiring each other to excel in bioinformatics learning, and invite you to join us in pioneering the field!

8:30 a.m.

Registration and Team Formation

March 17
Sidney Smith Room 1084 (100 St. George Street Toronto Ontario)

9:00 a.m.


Sidney Smith (SS). If you haven't got a team or are looking for new members, come find them over breakfast!

9:30 a.m.

Opening Ceremonies


Prof. Nicholas Provart on Raising the BAR for Visual Analytics: Building Tools for Hypothesis Generation with Open Big Data

Nicholas Provart is a full professor of Plant Cyberinfrastructure and Systems Biology in the Department of Cell & Systems Biology at the University of Toronto. His Bio-Analytic Resource (BAR), comprising tools for coexpression analysis of publicly available gene expression data, cis-element prediction, identifying molecular markers, generating “electronic fluorescent pictographic” (eFP) representations of gene expression patterns, and exploring protein-protein interactions in Arabidopsis and other plants, is currently used approximately 60,000 times a month by researchers worldwide. He is one of the founding members of the International Arabidopsis Informatics Consortium, is president of the Multinational Arabidopsis Steering Committee, and is teaching two MOOCs on Bioinformatic Methods.

Prof. Boris Steipe to Introduce the Challenge and Starter Code

Boris Steipe - MD PhD Ludwig Maximilian's University, Munich, Germany; Postdoc in protein structure with Robert Huber, Max Planck Institute for Biochemistry; Habilitation in Biochemistry; Junior research fellow (protein engineering), Gene Center of the University, Munich; Associate professor, Department of Biochemistry and Department of Molecular Genetics, University of Toronto. Director, Undergraduate Specialist Program in Bioinformatics and Computational Biology; Linguistics Specialist POSt (part time).

10:30 a.m.

Hacking Begins!

Get comfy and start hacking in rooms SS 2104/2105/2106/2108/2112/2114/2116/2119/1084/1086! Mentors will be circulating to answer questions and help you get started!

12:00 p.m.


Sidney Smith

2:00 p.m.

Guest Speaker: Amit Deshwar

Computational approaches to correction of a splicing defect in a mouse model of congenital muscular dystrophy

Amit Deshwar is a research scientist at Deep Genomics. His doctoral work was at the University of Toronto under Quaid Morris using Bayesian Non-parametrics to study intra-tumour heterogeneity and evolution. He is a Vanier Scholar and former Junior Fellow at Massey College. Previously he worked at Google, started two companies and obtained undergraduate degrees in Software Engineering and Psychology.

4:00 p.m.

Guest Speaker: Erica Acton from CAGEF


Metagenomics Analysis: What's in your Microbial Goody Bag?

Erica Acton entered the field of data science when she needed to 'get stuff done' during her Master's degree in Genome Science and Technology at the University of British Columbia. She has subsequently worked as a bioinformatician in Vancouver and Toronto, analyzing transcriptome data and visualizing large genomics datasets. She sometimes leaves her keyboard to give talks or train others at science education and outreach events that promote data science, or to run half-marathons. She is currently at the Centre for the Analysis of Genome Evolution and Function (CAGEF) where she is creating bioinformatics training lessons and workshops, and drinking copious amounts of coffee.

The Centre for the Analysis of Genome Evolution and Function (CAGEF) promotes interdisciplinary research in comparative, evolutionary, and functional analyses of genomes and proteomes, and to promote training and education in genome biology through the development and support of innovative teaching initiatives, courses, and workshops. CAGEF provides genomics, proteomics, and bioinformatic services, including Illumina-based genome, microbiome, and metagenome sequencing and transcriptome analysis, and proteome and protein modification analysis via LC-MS/MS. CAGEF has genomics expertise in the analysis of non-standard species, systems, and environments, with particular experience in microbial, plant, and environmental genomics and metagenomics.

6:00 p.m.


Sidney Smith

10:00 p.m.

Cup Stacking

Location: TBD

12:00 a.m.

Midnight Games


9:00 a.m.


March 18
Sidney Smith

9:00 a.m.

Submissions due and Presentation Sign-Up

Stay tuned!

9:30 a.m.



11:30 a.m.


Sidney Smith

12:30 p.m.

Closing Ceremony and Prizes

Sidney Smith 2135

About BioHacks

BioHacks 2018 will take place at the University of Toronto St. George campus, in the Bahen Centre for Information Technology. On March 17th-18th, 2018, we're excited to welcome Canadian undergraduate students, professors and companies. This event will feature opening and closing keynote speeches from pioneers in Biology, Computer Science and Engineering disciplines, brainstorming sessions and workshops from industry and academia, followed by an overnight biohacking session addressing real-world problems.

Registered teams composed of undergraduates with backgrounds in Computer Science, Life Sciences and Engineering are invited to attend and learn from leaders in the bioinformatics field.

In this two-day competition, participants will have a chance to attend lectures and workshops hosted by leaders in the field of computational biology and bioinformatics, present in front of our judges, and, most importantly, network with everyone participating and innovate with new friends! We will have mentors circulating to guide participants in project development, design, and implementation.

Prizes will be awarded to the top three projects as selected by our judges.


Rethinking Genome Annotation

Our imagination of the genome has matured tremendously. Your challenge is to create a data-driven visual representation of our current understanding of the human genome, 20 years after its initial publication. Do this using either the data we have curated for you from human chromosome 20 or your choice of human genetic material of similar magnitude.


What does it mean to be human? Nearly 20 years since the conclusion of the human genome project, we know a lot more about gene function than we once did. For tens of thousands of genes, this amounts to a huge volume of information. How can we improve on how we associate the existing information for a gene? Consider a gene's sequence, function, regulation, pathways it participates in, and its associated protein(s).
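One way to think about associating this information is as a single record per gene that links out to each kind of annotation. The Python sketch below is only an illustration under that assumption; the class, its field names, and the example values are hypothetical and not part of the challenge materials.

```python
from dataclasses import dataclass, field

@dataclass
class GeneRecord:
    """Hypothetical container tying one gene to its annotations."""
    symbol: str
    sequence: str = ""
    functions: list = field(default_factory=list)
    regulators: list = field(default_factory=list)
    pathways: list = field(default_factory=list)
    proteins: list = field(default_factory=list)

def genes_by_pathway(records):
    """Invert gene -> pathways into pathway -> sorted gene symbols."""
    index = {}
    for rec in records:
        for pathway in rec.pathways:
            index.setdefault(pathway, []).append(rec.symbol)
    return {name: sorted(symbols) for name, symbols in index.items()}

# Placeholder genes and pathways, for illustration only.
records = [
    GeneRecord("GENE_A", pathways=["pathway_1", "pathway_2"]),
    GeneRecord("GENE_B", pathways=["pathway_2"]),
]
index = genes_by_pathway(records)
print(index["pathway_2"])  # ['GENE_A', 'GENE_B']
```

Inverting the record in this way (pathway to genes, function to genes, and so on) is one simple route from per-gene annotations to something that can be drawn as a network or grouped view.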

Gene annotation and prediction is a process where a gene function or family is predicted from reading a nucleotide or amino acid sequence. This process is one of the most important steps in studying the metabolism, phylogeny, and the overall genomic properties of a sequenced species.

The first two draft sequences of the human genome were published in February of 2001. Three years from now will mark the twentieth anniversary of this accomplishment that like no other has shaped the landscape of bioinformatics, computational biology and molecular medicine. In 2001, Celera - a private company founded three years earlier to commercialize genome information - published an iconic poster summarizing their version of the genome. It is still fascinating today. This poster is significant, not so much for its interpretable content, but for the unique perspective it gives us on the entirety of information that constitutes our molecular identity. The details are rich, in fact, surprisingly "modern", presenting features like CpG islands and SNP density, and exon transcripts with Gene Ontology functional categories colour coded, for forward and reverse strand, accurately plotted on the nucleotide backbone at about 500 kB per centimetre. This was computed from gff records with Josep Abril's gff2ps software.
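To make the poster's scale concrete, the following minimal Python sketch parses one GFF-style record (nine tab-separated fields) and converts its coordinates into a position on paper, reading "500 kB per centimetre" as 500,000 bases per centimetre. This is not how gff2ps works internally; the sample record and the function names are hypothetical.

```python
from dataclasses import dataclass

BASES_PER_CM = 500_000  # poster scale, assuming "500 kB" means 500,000 bases

@dataclass
class Feature:
    seqname: str
    feature: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str

def parse_gff_line(line):
    """Parse one GFF record; the attributes column is ignored in this sketch."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, _source, feature, start, end, _score, strand, _frame, _attrs = fields
    return Feature(seqname, feature, int(start), int(end), strand)

def to_cm(position):
    """Distance in centimetres from the first base at the poster scale."""
    return (position - 1) / BASES_PER_CM

# Hypothetical record, for illustration only.
record = "chr20\texample\texon\t1000001\t1250000\t.\t+\t.\tID=demo"
feat = parse_gff_line(record)
width_cm = (feat.end - feat.start + 1) / BASES_PER_CM
print(to_cm(feat.start), width_cm)  # 2.0 0.5
```

At this scale, a chromosome of roughly 64 million bases, about the size of chromosome 20, would span about 1.3 metres of paper.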

But we know so much more today. While the Celera map showed us the genome of one Caucasian male, the number of sequenced genomes has exploded - we envisioned the 1,000 genomes project (2008, completed 2012); quickly set our sights on 100,000 genomes (2012, almost completed), and as of today more than 500,000 human genomes have been sequenced overall. We have sequenced cancers and genetic diseases. We have sequenced representatives of virtually all ethnicities on the planet. We have even sequenced Neanderthals and Denisovans, and we have sequenced other species far and wide to acquire a sense of where we humans fit into the landscape of evolution. We have annotated the contents of the genome in the ENCODE project. We have built databases that carefully dissect all proteins into their domains, such as InterPro. We have started to outline how things work together in functional networks such as the STRING data, or in modules as published by KEGG, and we are beginning to translate our insights into actionable information for medicine, at the OICR and at Sick Kids' TCAG.

Deliverables: What you'll be judged on

  • Prototype of your visualization
    • Should be clearly data driven - or the path to making it data driven should be clear
  • Code
    • Quality and structure
  • Documentation
    • Should include your sources
    • Architecture - get from data to visualization in a sensible way
  • Presentation


We have cleaned up the "wild" data for chromosome 20 for you to use. Download our txt files and you'll find they contain (tab delimited) the HUGO gene symbol and IDs that allow you to find annotations for each gene via crossRef, InterPro domains, STRING DB, GO annotations, etc. We'll also provide sample scripts showing how the data was prepared, which you can adapt.
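As a rough illustration of working with such a file, here is a minimal Python sketch that reads tab-delimited gene records into dictionaries keyed by gene symbol. The column names (`symbol`, `ensembl_id`, `uniprot_id`) and the sample rows are assumptions for illustration only; check the header of the actual txt files for the real layout.

```python
import csv
import io

# Hypothetical sample in the assumed layout: a header row, then one gene per line.
# Column names and ID values are placeholders, not the real file contents.
SAMPLE = (
    "symbol\tensembl_id\tuniprot_id\n"
    "GENE_A\tENS_PLACEHOLDER_1\tUP_PLACEHOLDER_1\n"
    "GENE_B\tENS_PLACEHOLDER_2\t\n"
)

def read_gene_table(handle):
    """Return a dict mapping gene symbol to its row (column name -> value)."""
    reader = csv.DictReader(handle, delimiter="\t")
    genes = {}
    for row in reader:
        # Treat empty cells as missing IDs rather than empty strings.
        genes[row["symbol"]] = {key: (value or None) for key, value in row.items()}
    return genes

genes = read_gene_table(io.StringIO(SAMPLE))
print(genes["GENE_B"]["uniprot_id"])  # None
```

For a real file, the same function can be given a handle from `open(path, newline="", encoding="utf-8")`, where `path` points at one of the downloaded txt files.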

Starter code

We will provide some starter code to give you some direction. You don't need to use it! Languages supported are R, Python, and JavaScript. Find these at our GitHub.


Useful Reading


If you are interested in becoming a sponsor, check out our sponsorship package.



Where, when, how?

UofT BioHacks 2018 will take place on March 17th-18th, 2018. It will be a two-day event hosted in Sidney Smith. Applications are now open!

Will this event run overnight?


Will there be food?

Yes! Full meals, snacks and plenty of coffee will be provided!

Email us if you have any dietary restrictions.

Are there any prizes awarded?

Yes! Prizes will be awarded for the different challenges based on submissions and presentations so put your best foot forward!

Does registration guarantee a space at UofT BioHacks 2018?

Due to high demand and limited space, registration does not guarantee a spot in the competition. Selected candidates will receive an invitation email. Candidates who accept the invitation will be guaranteed a spot in the competition. Priority will be given to applicants who submit detailed registration forms. If you don't get a spot right away, we'll put you on the waitlist and let you know if something opens up!

How much coding or biology background do I need?

We do not expect an advanced bioinformatics background. However, this hackathon is not for first time coders. Teams are encouraged to have members with different backgrounds who work together. Additionally, there will be workshops provided to improve your bioinformatics skills. If you lack coding experience make sure you've got team members who are proficient coders.

How do I form a team?

In order to join a team, you can

  1. Create a team and invite your friends by using their UserIDs or email addresses.
  2. OR You can contact your friend and ask to be invited to their team.
  3. OR Show up the day of and find others who need a team!!

Note: You cannot send an invitation to a friend who is already on a team.

I don’t have a team, should I still register?

Definitely! We will help you find your team members the morning of.

How many members per team?

Recommended team size is 3-6 members.

What should I bring?

A laptop and charger. Extension cords and power bars are always helpful, along with whatever you need for yourself overnight!

Is there any preparation I can do beforehand?

1) Read up! Google around or check out some of our provided links. Wouldn't want to waste precious hacking time doing background readings!! Get that done before you show up!

2) Brainstorm ideas with your team!

3) Brush up on your coding skills

Are there language constraints?

Code your submission however you're comfortable, but keep in mind that you might encounter large volumes of data. For this reason we often recommend Python or R.

My question is infrequently asked!

If you've got lingering Q's we have A's! Shoot us an email and we'll be happy to help you out.