1st Workshop on bioKepler Tools and Its Applications

DATES: September 5th and 6th, 2012
VENUE: SDSC Auditorium

Workshop Overview:

In this workshop, bioinformaticians and computational biologists will meet with the bioKepler team to clarify specific technical requirements for bioKepler tool and workflow development. The two focus areas for the workshop are (i) evaluation of bioinformatics and computational tools for bioActor development; and (ii) generation of bioinformatics workflows based on conceptual workflows presented by workshop attendees. These focus areas were chosen not only to address challenges related to accessing and integrating important bioinformatics and computational biology tools, but also to coordinate the incorporation of scientific use cases and requirements into bioKepler scientific workflows through bioActors that facilitate documentation, replication, and sharing of executable analyses and models.

As a part of this workshop, the organizers will conduct a formal survey of use cases and translate them into functional bioKepler requirements. They will then examine the list of use cases along with other requirements and generalize and supplement them in order to satisfy the needs of the bioinformatics and computational biology community. At least two of the participants will have expertise in systems analysis and design in order to focus the workshop on creating precise specifications that can guide design activities.

Workshop participants are invited based on their diverse expertise and key participation in world-class bioinformatics and computational biology projects.


Participant Information

Pre-Meeting Reading:


The 1st Workshop on bioKepler Tools has 28 participants.

First Name Last Name bioKepler Participation Title Institution Department Research Interests
Shulei Sun bioKepler Bioinformatician University of California, San Diego Center for Research in Biological Systems Bioinformatics, biology, computational biology, and metagenomic and genomic data analysis.
Sitao Wu bioKepler Bioinformatician Programmer Analyst University of California, San Diego Center for Research in Biological Systems Electrical engineering, protein structure prediction, metagenomics and genomics.
Weizhong Li bioKepler Co-PI Associate Research Scientist University of California, San Diego Center for Research in Biological Systems Computational biology, bioinformatics, and developing computational methods for sequence, genomic and metagenomic data analysis.
Daniel Crawl bioKepler Developer and Researcher Workflow Specialist University of California, San Diego San Diego Supercomputer Center Overall integration of distributed data parallel (DDP) execution patterns and the Kepler Scientific Workflow System. Research and development of execution patterns, bioActors, and distributed directors.
Jianwu Wang bioKepler Developer and Researcher Assistant Project Scientist University of California, San Diego San Diego Supercomputer Center Computer software and theory, scientific workflows, distributed computing and data-intensive computing.
Ilkay Altintas bioKepler PI, Lab Director Assistant Research Scientist University of California, San Diego San Diego Supercomputer Center Scientific workflows, provenance, distributed computing, bioinformatics, observatory systems, conceptual data querying, and software modeling
Eric Allen bioKepler Scientific Advisor Associate Professor University of California, San Diego Scripps Institution of Oceanography Use of environmentally-derived genome sequence information to explore the genetic potential, ecology, and evolution of marine microbial populations. Field-based collections, bioinformatics (genome assembly, annotation, and comparative analyses) and the tools of molecular ecology and genetics.
Sheila Podell bioKepler Scientific Advisor Research Programmer University of California, San Diego Scripps Institution of Oceanography Cell and molecular biology, analytical biochemistry, and bioinformatics computer programming, including the development of new tools and algorithms for genomic and metagenomic analysis.
Kate Kaya bioKepler Web Developer Research Programmer University of California, San Diego San Diego Supercomputer Center Web site development, technical writing, software deployment, and programming
Zhuohui Gan bioKepler Workshop Attendee Post-doctoral researcher (UCSD Bioengineering), Visiting investigator (Sanford-Burnham Medical Research Institute) UCSD and Sanford-Burnham Medical Research Institute Oxidative-stress related mechanisms, such as hypoxia, ischemia, and hyperoxia, either using computational-mathematical approaches or experimental approaches.
Adam Godzik bioKepler Workshop Attendee Bioinformatics and Systems Biology Program Director, P.I., Professor Sanford-Burnham Medical Research Institute Proteins, their sequences, structures and functions, and the relationships between the three.
Jeffrey Grethe bioKepler Workshop Attendee Associate Director University of California, San Diego Center for Research in Biological Systems Enabling collaborative research, data sharing and discovery through the application of advanced informatics approaches. Currently involved with two projects, CAMERA (Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis; http://camera.calit2.net) and NIF (Neuroscience Information Framework; http://neuinfo.org), that provide data and knowledge environments to the biomedical research community.
Ananth Kalyanaraman bioKepler Workshop Attendee Associate Professor Washington State University a) Developing and implementing advanced scientific workflows that could enable coupling, distributed parallel executions and provenance for an ongoing USDA/NSF EaSM project called BioEarth for Earth Systems Modeling. b) Design and development of massively parallel graph algorithms (under MPI, MapReduce, OpenMP models) for solving problems originating in the analysis of environmental microbial community (i.e., metagenomics) data.
Konstantinos Krampis bioKepler Workshop Attendee Assistant Professor J. Craig Venter Institute Informatics Algorithmic designs using cloud and high performance computing frameworks for solving computational bottlenecks in data analysis of large-scale genomic datasets.
Zhanwen Li bioKepler Workshop Attendee Bioinformatics Specialist Sanford-Burnham Medical Research Institute Bioinformatics and System Biology Program Design and development of databases, web servers, software tools for protein sequence and structure classification and protein functional annotation. Implement a protein comparative modeling pipeline for large-scale analysis.
Brett Pickett bioKepler Workshop Attendee Bioinformatics Analyst J. Craig Venter Institute Informatics Integration of virus-host interaction information into the public Virus Pathogen Database and Analysis Resource that contains sequence, structural, epitope, annotation, and other data; Improvements on the number and types of analytical tools that aid in interpreting such data.
Julia Ponomarenko bioKepler Workshop Attendee PI/Senior Research Scientist University of California, San Diego San Diego Supercomputer Center (1) Modeling of the stimulus and pathogen-responsive gene regulatory networks from high-throughput, genome-wide transcriptome measurements. (2) Databases and tools for in silico prediction of immune epitopes. (3) Integration of heterogeneous biological data and bioinformatics analysis tools.
Alexander Richter bioKepler Workshop Attendee Sr. Bioinformatics Engineer J. Craig Venter Institute Prokaryotic functional annotation and microbial metagenomic community and functional analysis; pipeline development and systems integration
Indresh Singh bioKepler Workshop Attendee Lead, Informatics Core Services J. Craig Venter Institute Automation and High-Throughput computing for Lab, Informatics and Scientific applications.
Ishwor Thapa bioKepler Workshop Attendee Software Application Developer University of Nebraska at Omaha College of IS&T Analysis of genomics/proteomics data from biomedical researchers, mostly with an interest in the design of workflows or software pipelines for next-generation sequencing.
Juan Ugalde bioKepler Workshop Attendee Graduate Student Researcher University of California, San Diego Scripps Institution of Oceanography Study of microbial communities using environmental genomics.
Hsin-Hui (Elvis) Wu bioKepler Workshop Attendee Postdoctoral Associate University of Florida Florida Museum of Natural History Design and apply informatics and computational approaches such as scientific workflow, cloud computing, and semantic web to address questions and accelerate the process of scientific discovery in biological systematics and biogeography fields.
Wilfred Li bioKepler Workshop Attendee Executive Director University of California, San Diego National Biomedical Computation Resources (NBCR) Development of computational biology workflows and cloud based solutions for computer aided drug discovery. Big data and workflow management in the context of biomedical simulation.
Patrick Liu bioKepler Workshop Attendee Programmer/Analyst University of California, San Diego Bioinformatics, workflow management, data management, high-throughput sequencing data analysis, usage of high-performance computing systems.
Gabriel Pratt bioKepler Workshop Attendee Graduate Student Researcher University of California, San Diego Bioinformatics and System Biology Program Bioinformatics, Genomics, scalable and reproducible data analysis, RNA processing.
Mark Bieda bioKepler Workshop Attendee Assistant Professor University of Calgary Kepler workflows for genomics and bioinformatics to leverage the diverse software and database resources for model organism (especially human and mouse) genomics, with a special emphasis on gene expression and epigenetic resources.
David Coss bioKepler Workshop Attendee High Performance Computing Specialist St. Jude Children's Research Hospital High Performance Computing Facility Developing and maintaining pipelines for various bioinformatics/genomics data analyses.
Antonio Ferreira bioKepler Workshop Attendee Manager, HPC Facility/Staff Scientist St. Jude Children’s Research Hospital Development of tools to support large-scale genomic data analysis and de novo sequence assembly as a manager of HPC Facilities. Independent research interests are in the application of quantum chemistry to computer-aided drug design and the study of biochemical mechanisms.


September 4th, 2012

Participants arrive

September 5th, 2012

8:00am 8:30am Breakfast  
8:30am 9:00am bioKepler and Kepler update Ilkay Altintas, UCSD
9:00am 9:15am Welcome and introductions Ilkay Altintas, UCSD
9:15am 10:00am Introduction to bioActors Weizhong Li, UCSD
10:00am 10:30am Parallelization techniques: Applying Map, Reduce and Cross concepts using bioActors Ilkay Altintas, Daniel Crawl, Jianwu Wang: UCSD
10:30am 10:50am Break  
10:50am 11:15am Kepler Interface and Introductory Examples on Using Kepler Daniel Crawl, UCSD
11:15am 12:00pm Building a Metagenome Annotation Workflow using Kepler and bioKepler Weizhong Li, Sitao Wu and Jianwu Wang, UCSD
12:00pm 1:15pm Lunch  
1:15pm 1:30pm Information on goals for the rest of the workshop and break out groups Ilkay Altintas, UCSD
1:30pm 2:45pm Discussion on evaluation of bioinformatics and computational biology tools Moderators: Eric Allen and Weizhong Li, UCSD
2:45pm 3:00pm Coffee Break  
3:00pm 5:15pm Group Discussion
  • Usage-based prioritized list of bioinformatics tools and packages for bioActor development
  • Conceptual workflow use cases
Moderators: Eric Allen, Ilkay Altintas and Weizhong Li, UCSD
5:15pm 5:30pm Summary of the day and goals for tomorrow Ilkay Altintas, UCSD
5:30pm   Adjourn for the day  

September 6th, 2012

8:00am 8:30am Breakfast  
8:30am 10:00am Break out session
  • Group 1, bioActor: Initial evaluation of the generated tool list for computational requirements
  • Group 2, Workflow: Technical requirements to develop the discussed conceptual use cases
  • Group 1 Moderators: Weizhong Li and Eric Allen, UCSD
  • Group 2 Moderator: Ilkay Altintas, UCSD
10:00am 10:30am Break  
10:30am 11:00am Break out reports  
11:00am 12:00pm Discussion on overlaps of the requirements collected by break out groups Moderator: Ilkay Altintas
12:00pm 1:15pm Lunch  
1:15pm 1:40pm Talk: Using Workflows for Omics Research Mark Bieda, University of Calgary
1:40pm 5:00pm
  • Hands-on session with a focus on bioActor and workflow development.
  • Implementation of parts of conceptual workflows with help from the bioKepler team.
Refreshments will be provided in the room
5:00pm   Workshop adjourns