BioinfoGRID Project
Milanesi Luciano
National Research Council, Institute of Biomedical Technologies, Milan, Italy
luciano.milanesi@itb.cnr.it
EGEE User Forum, Clermont-Ferrand, France, 11-14 February 2008

Networks of resources
• New biological and biomedical technology platforms, combined with HPC and GRID technology, will be particularly useful for dealing with the increasing amount, complexity, and heterogeneity of biological and biomedical data.
• Bioinformatics applications for eHealth have become an ideal research area where computer scientists can apply and further develop new intelligent computation methods, in both experimental and theoretical cases.
Milanesi Luciano, BioinfoGRID Symposium, Milan, 10-13 December 2007

BioinfoGRID Project
BioinfoGRID Project web site: www.bioinfogrid.eu

Consortium

BioinfoGRID Objectives
• Objective of the BioinfoGRID project

Interaction with related projects
At present the BioinfoGRID project has established cooperation with the following projects and initiatives:
• EGEE
• BELIEF
• EMBRACE
• EUCHINAGRID
• EUMEDGRID
• EELA
• DILIGENT
• ICEAGE
• LITBIO
• LIBI
• HEALTHGRID
• WISDOM

BioinfoGRID Work Packages
Work-package No   Work Package title
WP1               Genomics Applications in GRID
WP2               Proteomics Applications in GRID
WP3               Transcriptomics Applications in GRID
WP4               Database and Functional Genomics Applications
WP5               Molecular Dynamics Applications
WP6               Coordination of technical aspects and relation with Grid infrastructure Projects, user training, application support and resources integration
WP7               Dissemination and Outreach
WP8               Project Management Office

WP1 – Genomics Applications
HUSAR Program Package
• GCG (~130 programs)
• EMBOSS (~150 programs)
• In-house developments
  – own programs
  – automated tasks
• Third-party programs
• DATABASES
  – >300
  – Prompt updates (daily, weekly)
• SRS (Sequence Retrieval System)

WP1 – Genomics Applications
• Integrating W3H, SoapLab and the GRID
[Diagram: preliminary setup vs. target setup. Recoverable labels: HTML pages (W2H), WebService (SoapLab), W3H analysis tasks, Solaris (OS), ScLinux (OS), @dkfz-heidelberg.de; ssh, Grid API, Grid Interface, Grid Client toolkit, "any more software ??", @dkfz or anywhere else; commands % submit_formatdb … and % submit_blastall …; Grid CEs running % formatdb … and % blastall …]

WP2 – Proteomics Applications
• Perform functional protein analysis on the GRID, using functional protein domain annotations of large protein families and related databases.
• All 518 human protein kinases and 5129 proteins from the non-redundant chain set of the Protein Data Bank were analyzed with InterProScan.

WP2 – Proteomics Applications
• Protein surface calculation in GRID: the grid was used to compute a volumetric description of the proteins, obtaining a precise representation of the corresponding surface. Protein interactions could then be screened quickly by means of surface analysis.
  – The ProSite domains were analyzed all-against-all
  – ATP-E against its inhibitor
  – Collagen against integrin

WP3 – Transcriptomics applications
• Phylogenetics: reconstructing the evolutionary history of a group of taxa is a major research thrust in computational biology and a standard part of exploratory sequence analysis.
• An evolutionary history not only gives relationships among taxa, but is also an important tool for inferring structural, physiological, and biochemical properties of sequences from other similar sequences, and for reconstructing tissue evolution.

WP4 – Databases & Genomics Applications
• Work Package 4: Databases and Functional Genomics Applications
  – Testing the main biological databases in the Grid environment: optimization of storage space, bandwidth, download time
  – Testing performance and scalability of database-based applications: performance/scalability testing according to various use cases and submission algorithms
  – 1 challenge: Gene Analogous Finder. 55+ years of computation on a single CPU, not feasible in a local environment.

WP4 – Databases handling
• GridDBManager
  – Automatic Updater: timer-based monitoring and update of Grid-ported databases
  – Adaptive replica manager: constantly adapts the number of replicas in relation to the usage of each database in the last 10 days
  – Version Regression: keeps patches on the Grid to allow regression of each database to an earlier version

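The adaptive replica manager described above could be sketched roughly as follows. This is a minimal illustration, not the GridDBManager implementation: the function name, the access-log format, and the `accesses_per_replica`, `min_replicas` and `max_replicas` parameters are all assumptions introduced here. Only the 10-day usage window comes from the slide.

```python
import math
from datetime import date, timedelta


def target_replicas(access_log, today, window_days=10,
                    accesses_per_replica=50, min_replicas=1, max_replicas=20):
    """Suggest a replica count for one database from its recent usage.

    access_log maps a date to the number of accesses on that day
    (hypothetical format). Only the last `window_days` days are counted.
    """
    start = today - timedelta(days=window_days - 1)
    recent = sum(n for day, n in access_log.items() if start <= day <= today)
    # One replica per `accesses_per_replica` recent accesses (assumed policy),
    # clamped so a database never disappears and never floods the Grid.
    wanted = math.ceil(recent / accesses_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


log = {
    date(2007, 12, 10): 120,
    date(2007, 12, 5): 80,
    date(2007, 11, 20): 1000,  # outside the 10-day window, ignored
}
print(target_replicas(log, today=date(2007, 12, 10)))
```

Clamping to a minimum of one replica keeps rarely used databases available, while the upper bound limits storage use; both bounds are illustrative choices.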
WP4 – Methods - GridDBManager

WP4 – Methods - DBApp Perf. Testing
• Testing performance and scalability of Database-Oriented Bioinformatics Applications (DBApp) in the EGEE GRID
  – Testing Performance and Scalability
      Grid: too many variables (queue time, database download time, queue failures, execution failures)
      Submission mode: too many variables (number of jobs, rate-limiting settings, resubmission algorithm)
      Application: too many variables (performance of specific application, location of database)
  – Probing of Grid performance
  – Numeric simulation for all algorithms

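A numeric simulation of a submission algorithm, as mentioned above, might look like the following minimal sketch. It is not the project's simulator: the model (all jobs submitted at once, exponentially distributed queue times, a fixed failure probability per attempt, immediate resubmission up to a retry limit) and every name and parameter are assumptions chosen for illustration.

```python
import random


def simulate_campaign(n_jobs, fail_prob, mean_queue_s, run_s, max_retries, seed=0):
    """Monte Carlo estimate for one resubmission policy.

    Returns (makespan_seconds, lost_jobs). Each attempt waits in the queue
    for an exponentially distributed time, runs for `run_s` seconds, and
    fails with probability `fail_prob`; failed jobs are resubmitted at once.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    finish_times = []
    lost = 0
    for _ in range(n_jobs):
        elapsed = 0.0
        for _attempt in range(max_retries + 1):
            elapsed += rng.expovariate(1.0 / mean_queue_s) + run_s
            if rng.random() >= fail_prob:  # this attempt succeeded
                finish_times.append(elapsed)
                break
        else:
            lost += 1  # every allowed attempt failed
    makespan = max(finish_times) if finish_times else 0.0
    return makespan, lost


# Compare two hypothetical policies: no retries versus up to three retries.
for retries in (0, 3):
    makespan, lost = simulate_campaign(
        n_jobs=150, fail_prob=0.2, mean_queue_s=600.0,
        run_s=1800.0, max_retries=retries)
    print(f"retries={retries}: makespan={makespan / 3600:.1f} h, lost={lost}")
```

Measured queue-time and failure distributions from Grid probing could replace the assumed exponential and fixed-probability models without changing the structure of the loop.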
WP4 – Methods - DBApp Perf. Testing
• Probing Grid performance (Example)
  – Grid queue times and reliability: sent 150 jobs in 3 groups of 50 at different times
[Figure: "Grid queue times (normal load)". Bar chart of % of jobs (y-axis, 0 to 30) against queue time (x-axis); visible bin labels: 1-2min, 4-10min, 30min-1h, 4h-8h, Time-out 8h.]

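A histogram of this kind can be produced from raw probe measurements along the following lines. This is only a sketch: the bin edges below are an assumption chosen to include the labels still visible in the figure (the intermediate bins are guessed), and the sample data is made up for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Bin edges in minutes (assumed); a job waiting 8 h or more counts as timed out.
EDGES_MIN = [1, 2, 4, 10, 30, 60, 240, 480]
LABELS = ["<1min", "1-2min", "2-4min", "4-10min", "10-30min",
          "30min-1h", "1h-4h", "4h-8h", "time-out (8h)"]


def queue_time_histogram(queue_minutes):
    """Return the percentage of jobs that fall into each queue-time bin."""
    counts = Counter(LABELS[bisect_right(EDGES_MIN, t)] for t in queue_minutes)
    total = len(queue_minutes)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Hypothetical queue times (minutes) for four probe jobs.
for label, pct in queue_time_histogram([1.5, 5, 45, 600]).items():
    print(f"{label:>14}: {pct:5.1f} %")
```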
WP5 – Molecular docking
Viral neuraminidase is considered a valid target for antiviral drugs.

WP5 – Molecular docking
[Diagram: Starting compound database + Starting target structure model → DOCKING → Predicted binding models → Post-analysis → Compounds for assay]
• Docking: predict how small molecules bind to a receptor of known 3D structure
• There are successful examples
  – rapid
  – cost effective…
• But there are limitations
  – CPU and storage needed
More specific talk by Ana Lucia Da Costa, Wednesday 13th, 11:15, Room: Bordeaux

WP7 – Dissemination
• The following events were specifically associated with or organized by the BioinfoGRID project:
  – BioinfoGRID Symposium 2007: December 10th-13th 2007, Milan
  – BioinfoGRID Session at EGEE '07: October 4th 2007, Budapest
  – Biomed Grid School, Varenna, Italy, May 14th-19th 2007
  – BioinfoGRID Workshop at Healthgrid 2007 Conference, Geneva, Switzerland, 24th April 2007
  – NETTAB 2006 Workshop: Distributed Applications, Web Services, Tools and GRID Infrastructures for Bioinformatics, Santa Margherita di Pula, Sardinia, Italy, July 10th-13th 2006
  – BioinfoGRID Initial Training Course, Bari, Italy, March 8th-10th 2006
• In addition, the BioinfoGRID project has been represented at 58 national and international conferences and workshops.

WP7 – Dissemination
• 24 journal articles written within the frame of the BioinfoGRID project:
  – 9 - BMC Bioinformatics
  – 4 - IEEE Transactions on Nanobioscience
  – 3 - Studies in Health Technology and Informatics
  – 1 - Journal of Parallel and Distributed Computing
  – 1 - Journal of Chemical Information and Modeling
  – 1 - Parallel Computing
  – 1 - Int. J. of Bioinformatics Research and Applications
  – 1 - IEEE Transactions on Systems Science and Applications
  – 1 - Nucleic Acids Research
  – 1 - BMC Genetics
  – 1 - Bioinformatics

WP7 – Dissemination
• 19 conference proceedings papers produced within BioinfoGRID:
  – 6 - NETTAB '06
  – 2 - EGEE User Forum 06/07
  – 2 - BITS '06
  – 2 - HPDC '07
  – 1 - EGEE 06/07
  – 1 - CAPI 2006
  – 1 - Bioinformatics of African Pathogens and Disease Vectors, Nairobi 2007
  – 1 - MAS-BIOMED '06 Workshop
  – 1 - CCGrid '07 Symposium
  – 1 - EvoBIO '08
  – 1 - CHEP '07

People Acknowledgments
• Cristina Aiftimiei • Roberta Alfieri • Claudio Arlandini • Roberto Barbera • Endre Barta
• Francesco Beltrame • Attila Bende • Chiara Bishop • Christophe Blanchet • Ignacio Blanquer
• Vincent Bloch • Gianpaolo Bottoni • Vincent Breton • Andrea Calabria • Andrea Caprera
• Tiziana Castrignanò • Federica Chiappori • Dario Corrada • Paolo Cozzi • Stefano Cozzini
• Enza D’Alba • Pasqualina D’Ursi • Ana Da Costa • Paride Dagna • Giulia De Sario
• Davide Di Pasquale • Giacinto Donvito • Vihang Dudhalkar • Peter Ernst • David Fergusson
• Geraldine Fettahi • Sandro Fiore • Riccardo Gervasoni • Karl-Heinz Glatting • John Hatton
• Ally Hume • Nicolas Jacq • Atul Jain • Miklos Kozlovszky • Giuseppe La Rocca
• Yannick Legré • Pietro Liò • Charles Loomis • Mario Marchisio • Hajnal Marton
• Rafael Mayo Garcia • Mirco Mazzucato • Giovanni Meloni • Ivan Merelli • Emanuela Merelli
• Luciano Milanesi • Elisa Molinari • Ettore Mosca • Georgina Moulton • Loukas Moutsianas
• Tibor Nagy • Alessandro Negro • Laszlo Oroszi • Alessandro Orro • Giovanni Paolella
• Silvano Paoli • Antonio Pierro • Giorgio Pietro Maggi • Marco Pirola • Raffaele Ponzini
• Ivan Porro • Paolo Ramieri • Paolo Romano • Ermanna Rovida • Erika Salvi
• Jean Salzemann • Diego Sardaci • Salvatore Scifo • Martin Senger • Giuliano Taffoni
• Livia Torterolo • Gabriele Trombetti • Angelica Tulipano • Vania Ugè • Elizabeth van der Wath
• Richard van der Wath • Kasam Vinod • Federica Viti • Guy Warner • Ted Wen
• Pierfrancesco Zuccato

Projects Acknowledgements
• ISSeG
• EU GRID
• Diligent (A DIgital Library Infrastructure on Grid ENabled Technology)