Thursday, September 25, 2008

Comparative genomics

Let us first define what comparative genomics actually means.

It is the practice of analyzing and comparing the genetic material of different species in order to study gene function, evolution, and inherited diseases.

But why do we require comparative genomics? What is its importance?

  • It tells us what is unique and what is common
    between different species at the genome level. E.g. identifying a unique, crucial protein in a pathogen to use as
    a target for products that are both safe and effective.
  • Genome comparison is one of the most reliable ways to identify genes and predict their functions and interactions. E.g. distinguishing between orthologues and paralogues.

    Here we have two new terms: orthologues and paralogues. Genes that share a common ancestor, and therefore usually have similar sequences, are called homologous genes. During evolution these genes may be duplicated or may diverge in function. Homologous genes in different species that arose through speciation, and that usually retain the same function, are called orthologues. Homologous genes that arose through duplication within a genome, and that often take on different functions, are called paralogues. E.g. the genes encoding myoglobin and hemoglobin are paralogues.

  • The functions of human genes and other regions of DNA can be revealed by studying their counterparts in simpler organisms.

Comparison of Complete Genome Sequences

Here we take Helicobacter pylori as an example. We shall compare two strains of H. pylori and study their strain-specific diversity.

First, a note on Helicobacter pylori. It is an organism that colonizes the human gastric mucosa. It induces gastric inflammation, which can progress to ulcers, gastric cancer, or mucosa-associated lymphoid tissue (MALT) lymphoma.

About 60 to 80% of the population in Asia and 30 to 40% of the population in the US are infected. Remember that not all strains of H. pylori cause disease; some are even beneficial to the host. So the question arises: what causes the difference? Is it strain-specific diversity or host diversity? R. A. Alm and colleagues were the first to compare the genomes of two H. pylori strains, J99 and 26695.

What shall we compare?

Statistics of Genome

  • Size of genome i.e. total number of base pairs.
  • Overall G+C content
  • Location of regions with a different G+C content, and whether they lie in corresponding regions of both genomes.

    The two strains had similar genome size and G+C content and there were about 4 regions of different G+C content.
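The genome statistics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the J99 / 26695 comparison: the function names, the window size, the step, and the tolerance are all assumptions chosen for the example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_gc_windows(seq: str, window: int = 1000, step: int = 500,
                        tolerance: float = 0.05):
    """Return (start, gc) for each window whose G+C fraction differs from
    the genome-wide value by more than `tolerance`.

    window, step and tolerance are illustrative defaults (assumptions).
    """
    overall = gc_content(seq)
    hits = []
    # max(..., 1) makes sure a sequence shorter than one window is still scanned once
    for start in range(0, max(len(seq) - window + 1, 1), step):
        gc = gc_content(seq[start:start + window])
        if abs(gc - overall) > tolerance:
            hits.append((start, gc))
    return hits
```

Genome size is simply len(seq). Running atypical_gc_windows on both genomes and comparing the reported start positions gives a rough idea of whether the atypical regions lie in corresponding places.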

Predicted Open Reading Frames

Before deciding what to compare, let us first describe how genes are identified in a genome.

For prokaryotes there are several statistical gene-finding programs, such as GeneMark and Glimmer. Eukaryotes are far more complex because of their large intron regions and alternative splicing, so gene prediction is considerably harder. Statistical programs used to identify genes in eukaryotes include GENSCAN and Genie.

Here are the things that we have to find out.

  • Total no. of predicted ORFs.
  • % of coding regions
  • Average length of ORFs
  • Predicted genes with homologues and an assigned function
  • Predicted genes with homologues but no assigned function
  • Organism-specific genes, i.e. genes that have not yet been found in the genome of any other organism.
  • Strain-specific genes
  • Location of strain-specific genes

In H. pylori, half of the strain-specific genes are clustered in a plasticity zone with a different G+C content, which suggests horizontal DNA transfer (horizontal evolution, rather than vertical evolution, which is the general case).
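To make the first three ORF statistics concrete, here is a deliberately naive sketch. It is not how GeneMark or Glimmer work (those use statistical models); it only scans for ATG...stop stretches in all six reading frames. The names find_orfs and orf_summary and the min_codons cutoff are assumptions made for this example, and overlapping ORFs are counted more than once, so the coding percentage is only a rough figure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, frame, orf_sequence) for every ATG...stop stretch
    of at least `min_codons` codons (start and stop codons included)."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, s[start:i + 3]))
                    start = None
    return orfs


def orf_summary(seq: str, min_codons: int = 30) -> dict:
    """Total number of ORFs, average ORF length (bp) and rough % coding."""
    orfs = find_orfs(seq, min_codons)
    total = sum(len(orf) for _, _, orf in orfs)
    return {
        "count": len(orfs),
        "average_length": total / len(orfs) if orfs else 0.0,
        "percent_coding": 100.0 * total / len(seq) if seq else 0.0,
    }
```

The remaining items in the list (homology, assigned function, strain-specific genes) need a comparison against other genomes or a sequence database and are not covered by this sketch.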

Paralogues and Orthologues

  • Find out which paralogous family, if any, each gene belongs to
  • DNA sequence difference between orthologues
  • Protein sequence difference between orthologues

    In the J99 strain, 337 genes are members of 113 paralogous families.

    DNA-sequence differences between orthologues are mainly found in the third position of coding triplets.


    8 genes had more than 98% nucleotide identity.

    310 proteins had more than 98% amino-acid identity.
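    The sequence differences between a pair of orthologues can be measured as sketched below. This assumes the two coding sequences are already aligned, in frame, of equal length and without gaps; the function names are made up for this example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two aligned sequences
    (DNA or protein) of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mismatches_by_codon_position(cds_a: str, cds_b: str):
    """Count nucleotide differences at codon positions 1, 2 and 3.

    Returns a list [pos1, pos2, pos3]. Assumes both coding sequences
    start in frame at index 0.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for i, (x, y) in enumerate(zip(cds_a, cds_b)):
        if x != y:
            counts[i % 3] += 1
    return counts
```

    For example, mismatches_by_codon_position("ATGAAATTTGGA", "ATGAAGTTCGGA") gives [0, 0, 2]: both changes fall on the third codon position. Such changes often leave the amino acid unchanged (AAA and AAG both encode lysine, TTT and TTC both encode phenylalanine), which fits the observation that far more proteins than genes exceed 98% identity.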

Genomic Organization and gene order

  • Look for duplication, inversion, translocation
  • Check if gene order is conserved between genomes

In J99, 3 single-copy genes have a complete or partial duplication.

10 regions showed translocation and inversion.

In the case of gene order conservation,

  • 84% of genes have the same neighbor on each side in both genomes
  • 13% are flanked by strain-specific genes, so they do not have the same neighbors
  • 1.8% have a different neighbor on one side because of differences in genome organization
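A simplified version of this neighbor check is sketched below. It assumes each genome is given as a list of gene identifiers in chromosomal order, that orthologues share the same identifier in both lists, that each identifier appears only once per genome, and that the chromosome is circular. It only counts how many of a gene's two neighbors are the same in both genomes, so it does not reproduce the exact three categories above. The function names are hypothetical.

```python
def neighbors(order):
    """Map each gene to the set of its two neighbors on a circular chromosome."""
    n = len(order)
    return {gene: {order[i - 1], order[(i + 1) % n]}
            for i, gene in enumerate(order)}


def neighbor_conservation(order_a, order_b):
    """For genes present in both genomes, tally how many neighbors
    (2, 1 or 0) are the same in both. Orientation is ignored, so a
    simple inversion of a whole block does not count as a change."""
    na, nb = neighbors(order_a), neighbors(order_b)
    tally = {2: 0, 1: 0, 0: 0}
    for gene in na.keys() & nb.keys():
        tally[len(na[gene] & nb[gene])] += 1
    return tally
```

With order_a = ["g1", "g2", "g3", "g4", "g5"] and order_b = ["g1", "g2", "x1", "g3", "g4", "g5"], where x1 stands for a strain-specific gene, the tally is {2: 3, 1: 2, 0: 0}: g2 and g3 each lose one shared neighbor to the inserted gene.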

Friday, September 5, 2008

Bioinformatics Companies

  1. Metahelix Life Sciences Pvt. Ltd. ( Biotech )
  2. Invitrogen ( bioinfo + biotech )
  3. Brainwave Biosolutions limited ( bioinfo)

+ Data mining using our proprietary software and manual curation, backed by NLP

+ Design, implementation & integration of biological and chemical databases; relational, xml, various parsers

+ Data analysis support; neural network, HMM

+ Customized tool development to support laboratory experiments

+ Analytical tool development as per the scientific requirements

  1. Connexios Life Sciences Pvt. Ltd. ( biotech+ bioinfo)
  2. Invenio Biosolutions ( biotech )

    Offers custom software development for gene and protein sequence analysis, and drug discovery.

  3. Infosys Technologies Ltd ( bioinfo)
  4. Accelrys Software Solutions Pvt. Ltd. ( bioinfo )
  5. MWG Biotech Pvt. Ltd.
  6. Quintiles Technologies (India) Pvt. Ltd.( clinical trial )
  7. Carl Zeiss India Pvt. Ltd. ( biomedical instruments )
  8. ReaMetrix India Pvt. Ltd. ( biotech )
  9. Infovalley, Bangalore ( bioinfo )

    Bioinformatician

    Responsibilities:

    •Provide training on bioinformatics-related concepts, applications and tools

    •Collaborate and consult with researchers to analyze problems, recommend technology-based solutions, and design computational strategies for a wide range of biological research

    •Contribute to the design, development, implementation, and testing of biocomputing tools

    •Develop computational tools in biology that use genomic data to generate biological hypotheses

    •Create or modify web-based bioinformatics tools, public domain biological databases and software tools for sequence, domain, and structural analysis

    •Ensure completion of deliverables and adherence to timelines

    •Analyze and resolve issues that have the potential to jeopardize performance and/or ability to meet agreed upon deliverables

    •Perform any other related duties incidental to the work described herein

    •Other duties as assigned

    Job Specification:

    •Work requires a BSc. in Bioinformatics or Biotechnology with demonstrable computational skills, or a BSc. in computer science with a strong interest in biology/genomics. Master's or Ph.D. preferred.

    •Work requires at least 1 year of experience in bioinformatics.

    •Experience with web-based bioinformatics tools, public domain biological databases and software tools for sequence, domain and structural analysis.

    •Familiarity with and development of computational tools in biology that use genomic data to generate biological hypotheses.

    •Experience with a procedural language, proficient in Java, Perl, 'C', web design, DNA genome informatics, proteomics informatics, statistics, and computer science.

    •Experience with relational databases and SQL helpful.

    •Fresh graduates are encouraged to apply.

  10. Molecular Connection Pvt Ltd ( bioinfo )
  11. Biocon Ltd. ( biotech )
  12. Mphasis IT Services( clinical data management )
  13. Vivus Health Center, Bangalore ( health care )
  14. AstraZeneca India Pvt Ltd.( drug discovery )
  15. Ocimum Biosolutions Ltd., Hyderabad ( bioinfo )

    Ocimum Biosolutions is a leading integrated genomics company providing comprehensive Genomic Reference Databases, Life-Science Lab Information Management Solutions, GLP compliant Microarray Services and essential research consumables. Over 2/3rd of the top 25 Pharma companies, leading research institutes and emerging biotech companies worldwide have chosen us as their preferred outsourcing partner and utilize our expertise for understanding underlying mechanisms of diseases, discovery and prioritization of gene targets & biomarkers.

    Gene Logic Databases and Software

    BioExpress®

    ToxExpress®

    Genesis and GX Connect Bioinformatics Products / Web Solutions

    ToxShield™

    ASCENTA® and SCIANTIS®

    Genowiz™, Genchek™, iRNAchek™ and OptGene™


    LIMS Solutions

    Biotracker™ for Life Science Research

    Biotracker™ for Manufacturing

    Biotracker™ for Pre-Clinical

    Custom Bioinformatics Services

    Pharmacogenomics


    The BioIT Division consists of reference databases like The BioExpress® and The ToxExpress® System, Enterprise software solutions like Genesis, web based solutions like ASCENTA® and SCIANTIS® . It also includes three 21 CFR Part 11 & GLP compliant, Lab Information Management Systems (LIMS) products - Biotracker™, Biotracker™ for Manufacturing and Biotracker™ for Pre-Clinical and a suite of bioinformatics products, Genchek™ (sequence analysis), OptGene™ (gene design), Genowiz™ (microarray data analysis) and iRNAchek™ (iRNA template design). The division also offers specialized data analysis for different kinds of "omics" data and other custom software services.

  16. Cell Works Group Inc.( bioinfo)
  17. GE India Technology Centre Pvt. Ltd ( biomedical sciences )
  18. Strand Life Sciences ( bioinfo )
  19. Siri Technologies Pvt. Ltd.( bioinfo )
  20. Thyrocare Technologies Limited ( biotech )
  21. Millipore (India) Pvt. Ltd.( biomedical sciences )
  22. Tata Consultancy Services, Hyderabad ( bioinfo )
  23. TechnoConcepts Pvt. Ltd.( biomed instruments )
  24. Gangagen Biotechnology Pvt. Ltd ( biotech)
  25. Lab India Instruments Pvt. Ltd. ( biotech instruments )
  26. Siemens Pvt. Ltd. ( biomedical instruments )

You can search Google to learn more about these companies.