Personal Genome Analysis

  • Silk: Cluster Computing Platform for Genome Sciences

Personal genome analysis involves a massive amount of data analysis: genome sequence alignment, SNP calling, finding disease-related mutations, etc. These analyses in turn require a great deal of data management (e.g., compression, file I/O, table joins, and filtering). Silk supports running these complex data analysis pipelines on a cluster of machines.
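To make the filtering and table-join steps mentioned above concrete, here is a minimal single-machine sketch in Python. It is not Silk's API: the `Variant` record, the function names, the quality threshold, and the annotation entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A called variant (hypothetical record layout, not a Silk type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float


def filter_by_quality(variants, min_quality):
    """Keep only variants whose call quality is at least min_quality."""
    return [v for v in variants if v.quality >= min_quality]


def join_with_annotations(variants, annotations):
    """Inner hash join of variants with an annotation table.

    `annotations` maps (chrom, pos, alt) to a free-text annotation.
    Variants without a matching key are dropped.
    """
    joined = []
    for v in variants:
        key = (v.chrom, v.pos, v.alt)
        if key in annotations:
            joined.append((v, annotations[key]))
    return joined


# Made-up example data; the annotations are placeholders, not real findings.
variants = [
    Variant("chr1", 1000, "A", "G", 52.0),
    Variant("chr1", 2000, "C", "T", 8.5),
    Variant("chr2", 3000, "G", "A", 40.0),
]
annotations = {
    ("chr1", 1000, "G"): "hypothetical disease association X",
    ("chr1", 2000, "T"): "hypothetical disease association Y",
}

confident = filter_by_quality(variants, min_quality=20.0)
hits = join_with_annotations(confident, annotations)
for variant, note in hits:
    print(variant.chrom, variant.pos, variant.alt, note)
```

On real data these tables are far too large for in-memory lists, which is the situation a cluster platform is meant to address.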

Genome Browser

UTGB Toolkit is a bundle of Java libraries for developing web-based genome browsers. It includes a portable Tomcat server, the SQLite database engine (or connections to other DBMSs through JDBC), and an AJAX-style graphical user interface for data browsing, implemented with Google Web Toolkit (GWT).
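As a rough illustration of the kind of query a genome browser backed by SQLite might issue, the sketch below fetches the features that overlap a viewing window. It uses Python's standard `sqlite3` module rather than UTGB's Java libraries, and the `gene` table, its columns, and the sample rows are assumptions made for this example only.

```python
import sqlite3


def create_track_db(rows):
    """Build an in-memory SQLite table of features (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene ("
        "name TEXT, chrom TEXT, start_pos INTEGER, end_pos INTEGER)"
    )
    conn.executemany("INSERT INTO gene VALUES (?, ?, ?, ?)", rows)
    # Index on chromosome and start position to speed up window lookups.
    conn.execute("CREATE INDEX gene_region ON gene (chrom, start_pos)")
    return conn


def genes_in_window(conn, chrom, win_start, win_end):
    """Return features on `chrom` that overlap [win_start, win_end]."""
    cur = conn.execute(
        "SELECT name, start_pos, end_pos FROM gene "
        "WHERE chrom = ? AND start_pos <= ? AND end_pos >= ? "
        "ORDER BY start_pos",
        (chrom, win_end, win_start),
    )
    return cur.fetchall()


# Made-up features for illustration.
conn = create_track_db([
    ("geneA", "chr1", 100, 500),
    ("geneB", "chr1", 800, 1200),
    ("geneC", "chr2", 100, 500),
])
visible = genes_in_window(conn, "chr1", 400, 900)
print(visible)
```

Two intervals overlap when each starts before the other ends, which is the condition in the `WHERE` clause.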

Genome browsers developed by UTGB

Genome Sequence Alignment

The Primary Transcriptome of C. elegans

Despite the fundamental importance of transcription start sites (TSSs), little is known about TSSs in C. elegans. This is due to the high frequency of trans-splicing of C. elegans mRNAs, a process that post-transcriptionally removes RNA segments of variable length from the 5’ end of mature mRNAs. WormTSS is a comprehensive collection of TSSs for trans-spliced genes, generated by Illumina sequencing of pre-trans-splicing 5’ ends that were captured using a modified 5’-SAGE procedure. Our data provide an enabling resource for both experimental and theoretical analysis of gene structure and function in C. elegans.

The Saccharomyces Cerevisiae Morphological Database (SCMD)

The Saccharomyces Cerevisiae Morphological Database (SCMD) is a collection of micrographs of budding yeast mutants. Micrographs of mutants with altered cell morphology were taken at the Ohya Group, University of Tokyo, from a set of haploid MATa deletion strains obtained from EUROSCARF. Disruptant cells are automatically extracted from the micrographs by our novel cell-image processing software, developed at the Morishita Group, University of Tokyo.