makeTranscriptDbFromBiomart {GenomicFeatures}    R Documentation

Making a TranscriptDb object from annotations available on a BioMart database

Description

The makeTranscriptDbFromBiomart function makes a TranscriptDb object from transcript annotations available on a BioMart database.

Usage

getChromInfoFromBiomart(biomart="ensembl",
                        dataset="hsapiens_gene_ensembl")

makeTranscriptDbFromBiomart(biomart="ensembl",
                            dataset="hsapiens_gene_ensembl",
                            transcript_ids=NULL,
                            circ_seqs=DEFAULT_CIRC_SEQS)

Arguments

biomart

which BioMart database to use. Get the list of all available BioMart databases with the listMarts function from the biomaRt package. See the details section below for a list of BioMart databases with compatible transcript annotations.

dataset

which dataset to use from the BioMart database. For example: "hsapiens_gene_ensembl", "mmusculus_gene_ensembl", "dmelanogaster_gene_ensembl", "celegans_gene_ensembl", "scerevisiae_gene_ensembl", etc. in the ensembl database. See the examples section below for how to discover which datasets are available in a given BioMart database.

transcript_ids

optionally, only retrieve transcript annotation data for the specified set of transcript IDs. If this argument is used, the meta information displayed for the resulting TranscriptDb object will say 'Full dataset: no'. Otherwise it will say 'Full dataset: yes'.

circ_seqs

a character vector listing the chromosomes that should be marked as circular.

Details

makeTranscriptDbFromBiomart is a convenience function that feeds data from a BioMart database to the lower-level makeTranscriptDb function. See ?makeTranscriptDbFromUCSC for a similar function that feeds data from UCSC.

BioMart databases that are known to have compatible transcript annotations are:

Not all annotations will have CDS information.

Value

A TranscriptDb object.

Author(s)

M. Carlson and H. Pages

See Also

listMarts, useMart, listDatasets, DEFAULT_CIRC_SEQS, makeTranscriptDbFromUCSC, makeTranscriptDb

Examples

## Discover which datasets are available in the "ensembl" BioMart
## database:
library(biomaRt)
listDatasets(useMart("ensembl"))
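
## A minimal sketch, not part of the original examples: list the
## BioMart databases themselves, i.e. the candidate values for the
## 'biomart' argument. The 'marts' variable name is illustrative only.
library(biomaRt)
marts <- listMarts()
head(marts)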

## Retrieving an incomplete transcript dataset for Human from the
## "ensembl" BioMart database:
transcript_ids <- c(
    "ENST00000268655",
    "ENST00000313243",
    "ENST00000341724",
    "ENST00000400839",
    "ENST00000400840",
    "ENST00000435657",
    "ENST00000478783"
)
txdb <- makeTranscriptDbFromBiomart(transcript_ids=transcript_ids)
txdb  # note that these annotations match the GRCh37 genome assembly
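
## A hedged sketch, not part of the original examples: retrieve
## chromosome information for the same dataset, using the signature
## shown in the Usage section. The structure of the returned object is
## not described on this page, so inspect it rather than assume it.
library(GenomicFeatures)
chrominfo <- getChromInfoFromBiomart(biomart="ensembl",
                                     dataset="hsapiens_gene_ensembl")
head(chrominfo)

## Explicitly choosing which sequences are marked as circular instead
## of relying on DEFAULT_CIRC_SEQS. "MT" is assumed here to be the
## name of the mitochondrial chromosome in this dataset; check the
## sequence names retrieved above before relying on it.
txdb_mt <- makeTranscriptDbFromBiomart(
    transcript_ids=c("ENST00000268655", "ENST00000313243"),
    circ_seqs="MT"
)
txdb_mt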

[Package GenomicFeatures version 1.6.8]