Gene Expression Data Retrieval from GEO (NCBI) – B.Sc. Bioinformatics Practical

Aim of the Experiment

To retrieve and analyze gene expression data from the Gene Expression Omnibus (GEO) database of NCBI.

Principle

The Gene Expression Omnibus (GEO) is a public repository maintained by NCBI that stores high-throughput gene expression data (microarray and RNA-Seq).

  • Each record is assigned an accession number whose prefix shows its type (GSE for a series, GSM for a sample, GPL for a platform)
  • The data represent gene activity measured under different biological conditions
  • GEO allows retrieval, visualization, and comparison of expression profiles
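
The accession prefixes listed above can be checked programmatically. The following is a minimal sketch, assuming only the three prefixes named in this practical; the function name classify_accession is a hypothetical helper and not part of any NCBI tool.

```python
import re

# Prefix-to-record-type mapping for the three accession types named above.
GEO_RECORD_TYPES = {
    "GSE": "Series",
    "GSM": "Sample",
    "GPL": "Platform",
}

# An accession is a known prefix followed by one or more digits.
_ACCESSION_RE = re.compile(r"^(GSE|GSM|GPL)(\d+)$")


def classify_accession(accession):
    """Return the record type for a GEO accession, or None if unrecognized."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return GEO_RECORD_TYPES[match.group(1)]


for acc in ["GSE12345", "gsm678", "GPL570", "TP53"]:
    print(acc, "->", classify_accession(acc))
```

Anything that does not match one of the three prefixes is reported as None, which is a convenient way to catch typing mistakes when recording accession numbers.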

Requirements

  • Computer with internet connection
  • Web browser
  • Access to NCBI
  • Basic knowledge of genes and expression analysis

Step-by-Step Procedure

Step 1: Open GEO Database

  • Visit the NCBI home page (https://www.ncbi.nlm.nih.gov)
  • From the database dropdown, select GEO DataSets

Step 2: Enter Search Query

  • Type keywords such as:
    • Gene name (e.g., TP53)
    • Disease (e.g., cancer)
    • Organism (e.g., Homo sapiens)

Example:
TP53 breast cancer Homo sapiens
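
The same search can also be expressed as a fielded query and sent to the NCBI E-utilities esearch service, where GEO DataSets is the database gds. The sketch below only builds the query string and the URL; it does not contact NCBI. The helper names build_geo_query and build_esearch_url are hypothetical, and the choice to restrict results to series with gse[Entry Type] is an assumption about what the student wants to retrieve.

```python
from urllib.parse import urlencode

# Base address of the E-utilities esearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_query(gene, disease, organism):
    """Combine the three keyword types from Step 2 into one fielded query."""
    return f'{gene} AND {disease} AND "{organism}"[Organism] AND gse[Entry Type]'


def build_esearch_url(term, retmax=20):
    """Build an esearch URL for the GEO DataSets database (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


term = build_geo_query("TP53", "breast cancer", "Homo sapiens")
url = build_esearch_url(term)
print(term)
print(url)
```

The printed query can be pasted directly into the GEO DataSets search box, and the printed URL can be opened in a browser to see the matching record IDs as XML.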

Step 3: Run Search

  • Click Search
  • A list of matching GEO records will appear; series are identified by their GSE accession numbers

Step 4: Select Dataset (GSE)

  • Click on a relevant GSE accession number
  • Review:
    • Study title
    • Organism
    • Experimental design

Step 5: Explore Dataset Information

  • Check:
    • Number of samples
    • Platform used (GPL)
    • Type of experiment (microarray/RNA-Seq)

Step 6: View Sample Data (GSM)

  • Scroll to Samples (GSM)
  • Click any sample to view expression values

Step 7: Analyze Gene Expression

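  • Click the “Analyze with GEO2R” button on the series page
  • Define groups (e.g., control and treated) and assign each sample to one of them
  • Click Top 250 (labelled Analyze in newer versions of GEO2R) to list the most significant differentially expressed genes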

Step 8: Interpret Results

  • Observe:
    • Fold change
    • p-value
    • Upregulated/downregulated genes
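
The interpretation above can be sketched in a few lines of code. This example assumes linear-scale (not log-transformed) mean expression values, and the cutoffs of 1.0 for the absolute log2 fold change and 0.05 for the adjusted p-value are common illustrative choices rather than fixed rules. The function names are hypothetical. The sign of the fold change depends on which group is treated as the reference, so the direction should always be checked against the group order used in GEO2R.

```python
import math


def log2_fold_change(mean_treated, mean_control):
    """log2 ratio of treated to control; positive means higher in treated."""
    return math.log2(mean_treated / mean_control)


def classify_gene(logfc, adj_p, logfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene from its log2 fold change and adjusted p-value."""
    if adj_p >= p_cutoff or abs(logfc) < logfc_cutoff:
        return "not significant"
    return "upregulated" if logfc > 0 else "downregulated"


# Hypothetical values for illustration only.
logfc = log2_fold_change(mean_treated=8.0, mean_control=2.0)
print(logfc, classify_gene(logfc, adj_p=0.001))
print(classify_gene(-1.5, adj_p=0.01))
print(classify_gene(2.0, adj_p=0.2))
```

A gene is called upregulated or downregulated only when both conditions hold: the change is large enough and the adjusted p-value is small enough.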

Step 9: Download Data

  • In the Download family section of the series page, click Series Matrix File(s)
  • Save expression dataset for further analysis
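
Once downloaded, a Series Matrix file can be read with a short script. The file is tab-delimited: metadata lines begin with "!", and the expression table sits between the lines !series_matrix_table_begin and !series_matrix_table_end, with an ID_REF header row followed by one row per probe or gene. The sketch below parses a small made-up sample; the title, sample IDs, probe IDs, and values are invented for illustration, and parse_series_matrix is a hypothetical helper.

```python
import csv
import io


def _to_float(value):
    """Convert a table cell to float; return None for missing values."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_series_matrix(text):
    """Split Series Matrix text into metadata, sample IDs, and expression rows."""
    metadata, samples, expression = {}, [], {}
    in_table = False
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or not row[0]:
            continue  # skip blank lines
        key = row[0]
        if key == "!series_matrix_table_begin":
            in_table = True
        elif key == "!series_matrix_table_end":
            in_table = False
        elif in_table:
            if key == "ID_REF":
                samples = row[1:]  # header row lists the sample accessions
            else:
                expression[key] = [_to_float(v) for v in row[1:]]
        elif key.startswith("!"):
            metadata.setdefault(key[1:], []).extend(row[1:])
    return metadata, samples, expression


# Made-up sample content, not taken from a real GEO series.
SAMPLE = (
    '!Series_title\t"Hypothetical example series"\n'
    '!Sample_title\t"control 1"\t"treated 1"\n'
    '!series_matrix_table_begin\n'
    '"ID_REF"\t"GSM_A"\t"GSM_B"\n'
    '"probe_1"\t7.5\t9.25\n'
    '"probe_2"\t6.0\t\n'
    '!series_matrix_table_end\n'
)

metadata, samples, expression = parse_series_matrix(SAMPLE)
print(metadata["Series_title"], samples, expression)
```

Real files are distributed gzip-compressed, so the text would typically be read with gzip.open(path, "rt") from the standard library before being passed to the parser.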

Step 10: Record Observations

  • Note in practical file:
    • Dataset accession number (GSE)
    • Organism
    • Number of samples
    • Key differentially expressed genes

Result

The gene expression dataset was successfully retrieved from GEO and analyzed with the GEO2R tool, and differentially expressed genes were identified.

Precautions

  • Choose datasets with proper experimental design
  • Ensure correct grouping of samples in GEO2R
  • Verify organism and platform
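  • Interpret p-values carefully; because thousands of genes are tested at once, rely on the adjusted p-value (adj.P.Val) rather than the raw p-value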
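
Careful interpretation of p-values matters because thousands of genes are tested together, so some small raw p-values will arise by chance. A common correction is the Benjamini-Hochberg false discovery rate procedure, which GEO2R offers as its default p-value adjustment. The following is a minimal pure-Python sketch of that procedure; the input p-values are made up for illustration and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each adjusted value is at least as large as the raw one, so fewer genes pass a 0.05 cutoff after correction than before it.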

Applications

  • Disease gene identification
  • Biomarker discovery
  • Drug target analysis
  • Functional genomics
  • Personalized medicine

Viva Voce Questions (with Answers)

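  1. What is GEO?
    The Gene Expression Omnibus, a public repository of high-throughput gene expression data maintained by NCBI.
  2. What does GSE stand for?
    GEO Series, a record that groups all samples of one study.
  3. What is GSM?
    GEO Sample, the record for an individual sample.
  4. What is GPL?
    GEO Platform, the record describing the array or sequencing platform used in the experiment.
  5. What is GEO2R?
    An online tool for comparing groups of samples in a GEO series to identify differentially expressed genes.
  6. What is fold change?
    The ratio of a gene's expression level between two conditions, usually reported as log2 fold change.
  7. What is p-value?
    The probability of observing a difference at least as large as the one seen if there were no real difference; smaller values indicate stronger statistical significance.
  8. What is upregulation?
    An increase in the expression of a gene relative to the control condition.
  9. What is downregulation?
    A decrease in the expression of a gene relative to the control condition.
  10. Which data types are in GEO?
    High-throughput gene expression data, mainly microarray and RNA-Seq.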
