Universitat Internacional de Catalunya

Biological Databases

4
14863
4
First semester
op
Main language of instruction: English

Other languages of instruction: Catalan, Spanish

Teaching staff

Introduction

The Biological Databases course aims to introduce students to the different types of databases available in the health sciences field and to teach them how to use these resources to solve problems in biomedicine. The course builds on the contents of Introduction to Bioinformatics and goes deeper into their application for accessing, managing, and interpreting biological information.

This course contributes to the United Nations Sustainable Development Goals (SDGs) of the 2030 Agenda, particularly SDGs 3, 4, 9, 10 and 17:

- SDG 3 (Good Health and Well-being): by enabling the use of biological data to improve disease diagnosis, treatment, and prevention.

- SDG 4 (Quality Education): by developing students’ skills in scientific information management and bioinformatics tools.

- SDG 9 (Industry, Innovation and Infrastructure): by promoting innovation through the use of databases for biomedical research and technological development.

- SDG 10 (Reduced Inequalities): by supporting open access to data and bioinformatics resources, helping to provide equal opportunities for research worldwide.

- SDG 17 (Partnerships for the Goals): by encouraging the use of open resources and fostering interdisciplinary collaboration in research.

Pre-course requirements

It is recommended to have completed and passed:

 - Introduction to Bioinformatics

Objectives

 - To explain the main biomedical databases and the websites used to access and exploit their information.

 - To promote skills in searching, collecting, processing, and interpreting biomedical data in order to solve problems in the field of life sciences.

 - To develop critical thinking and analytical skills based on scientific results, and assess the quality and limitations of the available data.

 - To foster autonomy and lifelong learning, as well as promote communication and collaborative work in the management of biomedical information.

Competences/Learning outcomes of the degree programme

  • CN14 - Identify the principles of biomedical sciences related to health, as well as the basic concepts and tools that have an impact on Biomedical Sciences and allow them to work in any of its fields (biomedical companies, bioinformatics labs, research laboratories, clinical analysis companies, etc.).
  • CP05 - Apply biological foundations in the search for practical solutions to health problems, following ethical standards and scientific rigour and respecting fundamental equal rights between men and women, and the promotion of human rights and the values inherent in a peaceful society of democratic values that includes inclusive, non-discriminatory language without stereotypes.

Learning outcomes of the subject

Upon completing the course, students should be able to:

 - Analyze biomedical problems and identify aspects that require the use of databases.

 - Retrieve relevant information from major large-scale (Big Data) databases in the field of Biomedical Sciences.

 - Recognize the different types of databases available in the health sciences to address key problems in the field of biomedicine.

 - Identify the fundamentals of data collection, processing, and analysis of large volumes of data from biomedical research.

 - Apply specific tools to correctly extract and interpret information from databases.

Syllabus

1) Introduction to biological databases. Use of biomedical data in the field of life sciences. Types of data available in biomedicine. Main databases. Search for information in a database.

2) Bibliographic databases. MEDLINE bibliographic database. MeSH biomedical vocabulary. PubMed search engine. PubMed Central bibliographic repository. Mendeley bibliographic manager.

3) Nucleotide databases: genomes, genes and transcripts. INSDC: NCBI, EMBL-EBI and DDBJ. Genomes, genes and transcripts at NCBI (Entrez, Nucleotide, GenBank, RefSeq) and at EMBL-EBI (ENA and Ensembl). Visualization of genomes at UCSC. Gene nomenclature in HGNC. Description of genes in GeneCards. Coding sequences in CCDS.

4) Gene expression and regulation databases. Gene expression in GEO, Expression Atlas, Single Cell Expression Atlas and Single Cell Portal. Gene regulation in GTEx and ENCODE.

5) Protein databases. Protein sequence in UniProt, RefSeq and Ensembl. Protein domains in InterPro, PROSITE and Pfam. Protein structures in PDBe, RCSB PDB and AlphaFold. Protein interaction in IntAct, STRING and BioGRID.

6) Network databases. Gene function in Gene Ontology. Metabolic pathways in Reactome and KEGG.

7) Databases of diseases, genetic variability and drugs. Diseases in OMIM and Orphanet. Genetic variability in ClinVar, UniProt, gnomAD and dbSNP. Drugs in DrugBank, PubChem and PharmGKB.

Teaching and learning activities

In person



Fully face-to-face classroom-based modality.

The contents will be delivered using three different teaching methodologies or learning activities:

1. Lectures – 8 hours: the teaching staff delivers knowledge in the classroom to the entire group of students.

2. Case method (CM) – 20 hours: students solve problems provided on the day by the teaching staff. In the classroom, students present their conclusions with the active participation of the teaching staff, who may introduce new concepts whenever necessary.

3. Practical classes – 12 hours: computer-based exercises on concepts studied in the theoretical classes, carried out under the supervision of the teaching staff.

Evaluation systems and criteria

In person



First call students:

The final mark will be:

 - Case methods: 15%

 - Practice reports: 15%

 - Partial exam: 30%

 - Final exam: 40%

Additionally:

1) The exams will be multiple-choice, with four answer options. Each correct answer will earn 1 point, and each incorrect answer will incur a penalty of 0.33 points.

2) To be eligible for averaging, a minimum grade of 5 is required in the final exam. In order to pass the course, students must obtain an overall minimum mark of 5.

3) Attendance at case methods and practical classes is mandatory. Because this assessment is continuous, students cannot pass the subject unless they have attended at least 75% of these hours.

4) The improper use of electronic devices (such as recording students or teachers during lessons and disseminating those recordings, or using the devices for recreational and non-educational purposes) may lead to expulsion from class.

5) The expulsion of a student from the classroom may lead to the failure of the subject.
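The arithmetic behind the first-call mark can be illustrated with a short Python sketch. It is only an illustration, not the official calculation: the function names, the rescaling of the exam score to a 0-10 scale, the floor at 0, and the treatment of unanswered questions as 0 points are all assumptions. The guide does not say what mark is recorded when the final exam is below 5, so the sketch returns `None` in that case. Attendance is not modelled.

```python
from typing import Dict, Optional

# Weights for first-call students, taken from the list above.
WEIGHTS: Dict[str, float] = {
    "case_methods": 0.15,
    "practice_reports": 0.15,
    "partial_exam": 0.30,
    "final_exam": 0.40,
}

MIN_FINAL_EXAM = 5.0  # minimum final-exam grade required for averaging
PENALTY = 0.33        # points deducted per incorrect answer


def exam_mark(correct: int, incorrect: int, total_questions: int) -> float:
    """Score a multiple-choice exam on a 0-10 scale.

    +1 per correct answer and -0.33 per incorrect answer, as stated above.
    Assumptions: unanswered questions score 0, the raw score is rescaled
    linearly to 0-10, and negative results are floored at 0.
    """
    raw = correct - PENALTY * incorrect
    return max(0.0, 10.0 * raw / total_questions)


def final_mark(marks: Dict[str, float]) -> Optional[float]:
    """Weighted average of the four components (each on a 0-10 scale).

    Returns None when the final exam is below the minimum, because the
    components are then not eligible for averaging.
    """
    if marks["final_exam"] < MIN_FINAL_EXAM:
        return None
    return sum(weight * marks[name] for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    # Hypothetical student: 30 correct and 6 incorrect out of 40 questions.
    final_exam = exam_mark(correct=30, incorrect=6, total_questions=40)
    result = final_mark({
        "case_methods": 8.0,
        "practice_reports": 7.0,
        "partial_exam": 6.0,
        "final_exam": final_exam,
    })
    print(f"final exam: {final_exam:.2f}, overall: {result:.2f}")
```

With four answer options, a penalty of 0.33 per wrong answer means that random guessing yields an expected score close to zero (1 × 1/4 − 0.33 × 3/4 ≈ 0).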

Students in second or subsequent calls: the case method mark and the practice report mark will be carried over, and the final exam will represent 70% of the final grade. Repeating students who wish to retake the partial exam in the 3rd or 5th call may do so by notifying the head teacher in advance.

Bibliography and resources

https://www.ncbi.nlm.nih.gov/home/tutorials/

https://www.ensembl.org/info/index.html

http://www.uniprot.org/help/

https://www.ebi.ac.uk/training/online/

Baxevanis, Andreas D.; Ouellette, B. F. Francis (Eds.) (2004). Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (3 ed.). Wiley.

Model, Mitchell L. (2010). Bioinformatics Programming Using Python (1 ed.). O’Reilly.

Evaluation period

E: exam date | R: revision date | 1: first session | 2: second session:
  • E1 21/01/2026 I3 18:00h
  • E2 22/06/2026 I3 18:00h