2 Question 2: Lion King
2.1 A
Adapt the function from question 1 so that you can search any Ensembl dataset for associated filters and attributes. The function takes three arguments:
1. the Ensembl dataset
2. a search pattern (regex) for attributes
3. a search pattern (regex) for filters
For example: functie("dataset", "gene_id", "human")
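library(biomaRt)  # useEnsembl(), searchFilters(), searchAttributes()
library(dplyr)    # provides the %>% pipe

# Search one Ensembl dataset for filters and attributes matching two regex patterns.
find2 <- function(dataset, attributes_pattern, filter_pattern) {
  # Connect to the Ensembl "genes" mart for the requested dataset
  mart <- useEnsembl(biomart = "genes", dataset = dataset)

  filters <- searchFilters(mart, filter_pattern)
  attributes <- searchAttributes(mart, attributes_pattern)

  # Print the first six matches of each search
  head(filters) %>% print()
  head(attributes) %>% print()
}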
find2(dataset = "hsapiens_gene_ensembl", "gene_id", "human")
##                            name                             description
## 149 with_illumina_humanht_12_v3 With ILLUMINA HumanHT 12 V3 probe ID(s)
## 150 with_illumina_humanht_12_v4 With ILLUMINA HumanHT 12 V4 probe ID(s)
## 151 with_illumina_humanref_8_v3 With ILLUMINA HumanRef 8 V3 probe ID(s)
## 152  with_illumina_humanwg_6_v1  With ILLUMINA HumanWG 6 V1 probe ID(s)
## 153  with_illumina_humanwg_6_v2  With ILLUMINA HumanWG 6 V2 probe ID(s)
## 154  with_illumina_humanwg_6_v3  With ILLUMINA HumanWG 6 V3 probe ID(s)
##                        name                        description         page
## 1           ensembl_gene_id                     Gene stable ID feature_page
## 2   ensembl_gene_id_version             Gene stable ID version feature_page
## 82            entrezgene_id NCBI gene (formerly Entrezgene) ID feature_page
## 106             wikigene_id                        WikiGene ID feature_page
## 204         ensembl_gene_id                     Gene stable ID    structure
## 205 ensembl_gene_id_version             Gene stable ID version    structure
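To make the role of the two patterns concrete, here is a minimal offline sketch of a regex search over a filter listing. It does not call biomaRt: `toy_filters` and `search_table()` are hypothetical names, and matching on both `name` and `description` with base R `grepl()` is an assumption about how such a search can behave, not a description of biomaRt's internals.

```r
# Hypothetical toy listing, shaped like the filter output shown above
toy_filters <- data.frame(
  name = c("chromosome_name", "with_illumina_humanht_12_v3", "biotype"),
  description = c("Chromosome/scaffold name",
                  "With ILLUMINA HumanHT 12 V3 probe ID(s)",
                  "Type"),
  stringsAsFactors = FALSE
)

# Keep every row whose name or description matches the regex (case-sensitive)
search_table <- function(tbl, pattern) {
  hit <- grepl(pattern, tbl$name) | grepl(pattern, tbl$description)
  tbl[hit, , drop = FALSE]
}

res <- search_table(toy_filters, "human")
print(res)
```

With a case-sensitive match, the pattern "human" hits the row only through its lowercase `name`, which is why the choice of pattern and of case handling matters.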
2.2 B
For each dataset, search for the attribute and filter given in the table. First look up the names of the Ensembl datasets for the listed organisms.
| Dataset | Attribute | Filter | 
|---|---|---|
| Lion | protein | chromosome | 
| Baboon | protein | chromosome | 
| Elephant | protein | chromosome | 
Note: do not call the function three times by hand with the given arguments. Use an R construct that can iterate.
martALL <- useEnsembl("genes")
searchDatasets(martALL, "(L|l)ion")
##               dataset            description   version
## 147 pleo_gene_ensembl Lion genes (PanLeo1.0) PanLeo1.0
searchDatasets(martALL, "(B|b)aboon")
##                  dataset                   description  version
## 139 panubis_gene_ensembl Olive baboon genes (Panu_3.0) Panu_3.0
searchDatasets(martALL, "(E|e)lephant")
##                   dataset                                      description
## 42    cmilii_gene_ensembl Elephant shark genes (Callorhinchus_milii-6.1.3)
## 85 lafricana_gene_ensembl                       Elephant genes (Loxafr3.0)
##                      version
## 42 Callorhinchus_milii-6.1.3
## 85                 Loxafr3.0
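The elephant search returns two datasets because the pattern also matches "Elephant shark". The following offline sketch shows how a narrower regex separates them. The `descriptions` vector is copied from the outputs above, and base R `grepl()` is used here as a stand-in for the search, which is an assumption rather than the biomaRt implementation.

```r
# Dataset descriptions taken from the search results above
descriptions <- c(
  "Lion genes (PanLeo1.0)",
  "Olive baboon genes (Panu_3.0)",
  "Elephant shark genes (Callorhinchus_milii-6.1.3)",
  "Elephant genes (Loxafr3.0)"
)

# Broad pattern: matches both the elephant and the elephant shark
broad <- grepl("(E|e)lephant", descriptions)

# Narrower pattern: "genes" must follow "elephant" directly
narrow <- grepl("[Ee]lephant genes", descriptions)

print(descriptions[broad])
print(descriptions[narrow])
```

The character class `[Ee]` is equivalent to the alternation `(E|e)` used above and is slightly shorter to write.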
DS <- c("pleo_gene_ensembl", "panubis_gene_ensembl", "lafricana_gene_ensembl")
for (x in DS) {
  print(x)
  find2(x, "protein", "chromosome")
}
## [1] "pleo_gene_ensembl"
##              name              description
## 1 chromosome_name Chromosome/scaffold name
##                                                 name
## 30                                   peptide_version
## 41                                        protein_id
## 120                                  peptide_version
## 160                                  peptide_version
## 173              cabingdonii_homolog_ensembl_peptide
## 177 cabingdonii_homolog_canonical_transcript_protein
##                                                        description         page
## 30                                               Version (protein) feature_page
## 41                                                INSDC protein ID feature_page
## 120                                              Version (protein)    structure
## 160                                              Version (protein)     homologs
## 173 Abingdon island giant tortoise protein or transcript stable ID     homologs
## 177                                 Query protein or transcript ID     homologs
## [1] "panubis_gene_ensembl"
##              name              description
## 1 chromosome_name Chromosome/scaffold name
##                                                 name
## 30                                   peptide_version
## 44                                        protein_id
## 167                                  peptide_version
## 207                                  peptide_version
## 220              cabingdonii_homolog_ensembl_peptide
## 224 cabingdonii_homolog_canonical_transcript_protein
##                                                        description         page
## 30                                               Version (protein) feature_page
## 44                                                INSDC protein ID feature_page
## 167                                              Version (protein)    structure
## 207                                              Version (protein)     homologs
## 220 Abingdon island giant tortoise protein or transcript stable ID     homologs
## 224                                 Query protein or transcript ID     homologs
## [1] "lafricana_gene_ensembl"
##              name              description
## 1 chromosome_name Chromosome/scaffold name
##                                                 name
## 30                                   peptide_version
## 42                                        protein_id
## 132                                  peptide_version
## 172                                  peptide_version
## 185              cabingdonii_homolog_ensembl_peptide
## 189 cabingdonii_homolog_canonical_transcript_protein
##                                                        description         page
## 30                                               Version (protein) feature_page
## 42                                                INSDC protein ID feature_page
## 132                                              Version (protein)    structure
## 172                                              Version (protein)     homologs
## 185 Abingdon island giant tortoise protein or transcript stable ID     homologs
## 189                                 Query protein or transcript ID     homologs
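The iteration above uses a for loop. As an alternative, the same pattern can be sketched with `lapply()`, which also collects one result per dataset in a named list. To keep the sketch runnable without a connection to Ensembl, `find_stub()` is a hypothetical stand-in for `find2()` that only records its arguments, and the names `lion`, `baboon` and `elephant` are illustrative labels.

```r
# Hypothetical stand-in for find2(): returns its arguments instead of querying Ensembl
find_stub <- function(dataset, attributes_pattern, filter_pattern) {
  list(dataset = dataset,
       attributes = attributes_pattern,
       filters = filter_pattern)
}

# Dataset names found with searchDatasets() above, labelled per organism
datasets <- c(lion = "pleo_gene_ensembl",
              baboon = "panubis_gene_ensembl",
              elephant = "lafricana_gene_ensembl")

# lapply() calls the function once per dataset; the two patterns are fixed
results <- lapply(datasets, find_stub,
                  attributes_pattern = "protein",
                  filter_pattern = "chromosome")

print(names(results))
```

Replacing `find_stub` with a version of `find2()` that returns its search results rather than printing them would give one list entry per organism, which is easier to reuse than printed output.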