
Bioresearch FAQs

Essential Resources for Quantitative PCR

Quantitative polymerase chain reaction (QPCR) is a set of powerful PCR techniques designed to quantify RNA transcripts. However, much up-front work is required for experiments to produce reliable data.

Specific considerations for QPCR include:

  • determining whether your transcript has alternatively spliced forms
  • primer and probe design, particularly to ensure specificity
  • sample preparation
  • normalization controls
  • detecting RNA degradation
  • preventing degradation

Key Resources

How to download full text of a set of search results

A number of software programs can run a search that you specify against selected databases and download the PDF or HTML full text associated with the hits.

  • EndNote
    Literature querying, data analysis and literature management tool
  • Zotero
    an open-source Firefox plug-in (also available for other browsers)

Both EndNote and QUOSA can accept a list of PubMed IDs as input.

None of these tools can download more than roughly 60% of the papers they find because of the complicated links provided by publishers. Contact Lauren Schoenthaler in the Office of General Counsel for advice on copyright.

How to find information about patents

Patent data are a crucial yet frequently overlooked resource. Much of the data in patent applications is never published in publicly accessible journals, whether or not the patent is issued. Such data include synthesis methods, DNA and protein sequences, and functional mutations and genotypes.

Furthermore, more prosaic information, such as the composition of a research kit (e.g., gene transcription analysis kit), can often be found in patent applications.

Stanford Resources

The easiest way to find transcription start site data

To identify the transcription start site (TSS) for many genes at once (e.g., dozens), you have two options:

Parsing method

You can parse NCBI genome annotation files for the information. As part of the genome annotation process, tab-delimited files are created that give the position of key features in both contig (RefSeq accessions of the format NW_ or NT_) and chromosome coordinates, if applicable.

  • Go to Index of genomes to find the genome-specific directories
  • Within a directory, click on "maps", then "mapview", then the folder for the current build
  • In the directory you will find the file "". The first line in the file names the columns.
    (chrStart, chrEnd, and orientation refer to positions on the chromosome; cnt_start, cnt_stop, and cnt_orient refer to positions on the contigs. Note that both sets of positions are 1-based, i.e., they start at 1, not 0.)
  • The "gene" lines in this file give the ranges for the gene on the chromosome (as applicable), as well as contig coordinates.
  • Scan the file using the UNIX commands gzcat and egrep (ex.: gzcat | egrep "GENE.*reference"). This will extract the "GENE" lines for the reference assembly.
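
The parsing steps above can also be sketched in Python. This is only an illustrative sketch, not NCBI's own tooling. The column names feature_name, feature_type and group_label are assumptions, and chrStart, chrEnd and orientation are taken from the description above; check all of them against the header line of the file you actually download. The sample rows are made up. The sketch also assumes that the TSS is the 5' end of the gene range: chrStart for genes on the "+" strand and chrEnd for genes on the "-" strand.

```python
import csv
import gzip
import io


def gene_tss(lines, name_col="feature_name", type_col="feature_type",
             group_col="group_label", start_col="chrStart",
             end_col="chrEnd", strand_col="orientation"):
    """Return {gene_name: (strand, tss)} for GENE rows of the reference assembly.

    `lines` is any iterable of tab-delimited text lines whose first line
    names the columns. All column names are assumptions (see lead-in).
    """
    reader = csv.DictReader(lines, delimiter="\t")
    tss = {}
    for row in reader:
        # Same filter as the egrep pattern: GENE lines, reference assembly only.
        if row[type_col] != "GENE" or "reference" not in row[group_col]:
            continue
        strand = row[strand_col]
        # 1-based coordinates: the 5' end is the start on "+", the end on "-".
        pos = int(row[start_col]) if strand == "+" else int(row[end_col])
        tss[row[name_col]] = (strand, pos)
    return tss


def gene_tss_from_gzip(path):
    """Convenience wrapper for a gzip-compressed annotation file."""
    with gzip.open(path, "rt", newline="") as handle:
        return gene_tss(handle)


# Made-up rows for illustration only (not real coordinates).
SAMPLE = (
    "feature_name\tfeature_type\tgroup_label\tchrStart\tchrEnd\torientation\n"
    "geneA\tGENE\treference\t1000\t5000\t+\n"
    "geneB\tGENE\treference\t8000\t9500\t-\n"
    "geneB\tGENE\talternate\t8100\t9600\t-\n"
    "geneA.1\tRNA\treference\t1000\t5000\t+\n"
)

result = gene_tss(io.StringIO(SAMPLE))
print(result)
```

For a real file, call gene_tss_from_gzip with the path to the downloaded file and pass different column names if its header differs from the ones assumed here.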

Direct SQL querying method

You can query the Ensembl databases directly using SQL to retrieve this kind of data.
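
As a rough sketch of what such a query might look like, the Python example below builds one SQL statement and runs it against a small in-memory SQLite stand-in, so that it can be tried without a network connection. The table and column names (gene, seq_region, xref, seq_region_start, seq_region_end, seq_region_strand, display_xref_id, display_label) are assumed to match the Ensembl core schema; verify them against the schema documentation for the release you use. The rows are made up. The query takes the 5' end of the gene range as the TSS, which is the most upstream start among the gene's transcripts and not necessarily the TSS of the transcript you care about.

```python
import sqlite3

# Strand is assumed to be stored as 1 (forward) or -1 (reverse).
TSS_QUERY = """
SELECT x.display_label      AS gene,
       sr.name              AS seq_region,
       g.seq_region_strand  AS strand,
       CASE WHEN g.seq_region_strand = 1
            THEN g.seq_region_start
            ELSE g.seq_region_end
       END                  AS tss
FROM gene g
JOIN seq_region sr ON sr.seq_region_id = g.seq_region_id
JOIN xref x        ON x.xref_id = g.display_xref_id
WHERE x.display_label IN ({placeholders})
ORDER BY x.display_label
"""


def fetch_tss(conn, symbols):
    """Run the TSS query for a list of gene symbols on a DB-API connection."""
    # "?" is the sqlite3 placeholder; most MySQL drivers use "%s" instead.
    sql = TSS_QUERY.format(placeholders=", ".join("?" for _ in symbols))
    return conn.execute(sql, list(symbols)).fetchall()


# Hypothetical stand-in database with only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE seq_region (seq_region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE xref (xref_id INTEGER PRIMARY KEY, display_label TEXT);
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, seq_region_id INTEGER,
                   seq_region_start INTEGER, seq_region_end INTEGER,
                   seq_region_strand INTEGER, display_xref_id INTEGER);
INSERT INTO seq_region VALUES (1, 'chrA');
INSERT INTO xref VALUES (10, 'geneA'), (11, 'geneB');
INSERT INTO gene VALUES (100, 1, 1000, 5000, 1, 10),
                        (101, 1, 8000, 9500, -1, 11);
""")

rows = fetch_tss(conn, ["geneA", "geneB"])
for row in rows:
    print(row)
```

To run the same statement against a real Ensembl database, you would swap the SQLite connection for a MySQL connection and adjust the placeholder style to that driver.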

For a limited number of genes

  • Access the NCBI Gene database to visualize your gene. This database lets you see the nucleotide number for the TSS of a gene (ex.: TP53).
    There is no way to query either the Gene or the Nucleotide databases programmatically for that datum.
  • Search SOURCE with your gene ID and click on the TRASER tool to retrieve the 5' genome region.
  • The GeneCards database provides an explicit TSS number upfront. Caveat: it is not clear which assembly the coordinates refer to, so you may have to convert them to the assembly you are using. (ex.: TP53).
  • 5'SAGE: SAGE-based TSS identification.
  • Eukaryotic Promoter Database
    does not provide the TSS explicitly, but it can be inferred
    database of transcription factor data
    database of 5'-end sequences from the sequencing of full-length cDNAs
  • TRED
    database of mammalian cis- and trans- transcriptional regulatory elements

Online Bioinformatics Resources Collection

The University of Pittsburgh's Online Bioinformatics Resources Collection (OBRC) is a curated, searchable collection of bioinformatics tools. OBRC was created to help you identify life sciences software or databases to support your research, and it describes more than 1,768 bioinformatics databases and software tools.

OBRC provides:

  • a short review of each tool or database
  • comprehensive indexing of biotools


Although you are likely to find what you need, a search often returns a large number of potentially applicable resources, and evaluating each one takes considerable effort.

Good summaries of bioinformatics, genomics and proteomics topics

The Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics (EGGPB) is a collection of authoritative articles ranging from substantive to very comprehensive in the fields of genetics, genomics, proteomics and bioinformatics.

EGGPB should be one of the first resources you consult when you need to understand an unfamiliar concept quickly. All articles are written by authorities in their field, and the collection has an excellent search engine.

Designing a microarray gene expression experiment

Microarray gene expression experiments are costly, and, like other high-throughput experiments, they require more careful design than traditional experiments to ensure that the results are interpretable and robust.

A frequent problem in microarray gene expression experiments is the lack of a crisp null hypothesis. Without a null hypothesis, it is very easy to delude oneself that a given signal (a group of genes that appear to be regulated as a function of experimental treatment) constitutes a biologically meaningful result. When thousands of signals are measured, some will vary by chance alone, so it is always possible to find variations that are not relevant to the research problem at hand.
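
A small simulation can make this concrete. The sketch below assumes an array of 20,000 genes and a significance threshold of 0.05; both numbers are illustrative and do not come from any particular platform. It draws p-values for genes with no true effect (such p-values are uniformly distributed between 0 and 1) and counts how many look "significant". The Bonferroni correction shown is just one common adjustment, used here as an example rather than as a recommendation.

```python
import random


def count_hits(p_values, threshold):
    """Count p-values that fall below the significance threshold."""
    return sum(p < threshold for p in p_values)


random.seed(0)      # fixed seed so the run is repeatable

N_GENES = 20000     # illustrative array size (assumption)
ALPHA = 0.05        # conventional per-test significance level

# Under the null hypothesis (no gene is truly regulated), each p-value
# is a uniform random number between 0 and 1.
null_p = [random.random() for _ in range(N_GENES)]

naive_hits = count_hits(null_p, ALPHA)
bonferroni_hits = count_hits(null_p, ALPHA / N_GENES)

print("expected false positives at alpha:", N_GENES * ALPHA)
print("observed 'significant' genes, uncorrected:", naive_hits)
print("observed 'significant' genes, Bonferroni:", bonferroni_hits)
```

With no real effect anywhere, roughly N_GENES × ALPHA genes (about 1,000 here) still pass the uncorrected threshold, which is why a pre-specified hypothesis and an appropriate correction matter.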

Once you have an experimental design, the first thing you should do is consult a statistician.



Market data for health care on drug usage, clinical diagnostics, medical devices, surgical patient care, medical imaging

Commercial Sources

  • Frost & Sullivan
    Detailed marketing reports on key sectors in healthcare (drug discovery, clinical diagnostics, medical devices, surgical patient care, medical imaging). Stanford faculty and staff only
  • Mintel
    Market research on consumer behavior, product innovation and competitive marketing strategies

These reports typically provide top-level overviews and can include hard-to-find information on:

  • incidence
  • prevalence
  • economic burden
  • analyses of revenue forecasts
  • pricing trends
  • emerging markets

Get started programming in Perl

Download Perl

Create, debug and run Perl programs

Extending Perl for biologists: BioPerl

BioPerl is an optional add-on to the Perl interpreter. It provides an extensive collection of Perl modules that handle most biology-related data analysis tasks.

BioPerl application examples

  • Accessing sequence data from remote databases
  • Converting between formats
  • Identifying amino acid cleavage sites
  • Running BLAST automatically
  • Analyzing phylogenetic trees
  • Manipulating 3D structure objects (e.g., proteins)

Key references

What is CMGM?

The Computational Services and Bioinformatics Resource (commonly referred to as "CMGM") is a service of the Beckman Center for Molecular and Genetic Medicine that provides a vast array of shared software resources for Stanford groups and laboratories based on an annual membership fee.

These resources range from office applications to scientific software such as image analysis and computational biology tools.


  • Lee Kozar, PhD
  • Director, CSBF
  • Beckman Center room B062D
  • (650) 725-4483

What is Eureka?

Eureka is GeneGo's easy-to-use knowledge mining program that allows patrons to quickly find high quality, relevant answers from manually curated databases.

Simply type in the name of your favorite gene, protein, drug or disease and Eureka will show you detailed information including SNPs, GO localization, molecular function, etc. It even allows you to order compounds from Sigma directly.

It allows the user to ask questions such as:

  • "In which pathways is my favorite gene implicated?"
  • "Find compounds that induce geneX but not geneY"
  • "Find all enzymes involved in inhibiting coagulation"

Access is limited to the Stanford community (SUNet ID holders). Our current license allows for two simultaneous users.

What is the BIOBASE Knowledge Library database?

The BIOBASE Knowledge Library provides highly curated data on the proteome of selected organisms based on data extracted from the primary literature. BKL is one of the best ways to quickly assess a vast set of protein properties for a given protein, or even a set of proteins.

  • Rapid understanding of a protein and all of its properties (androgen receptor example)
  • Understanding what proteins interact with a given protein
  • Determining where a protein is present or absent
  • Understanding how a protein is regulated
  • Comparative genomics: finding homologs, orthologs
  • Species coverage: Homo sapiens, S. cerevisiae, S. pombe, C. elegans and pathogenic fungi.
  • Protein properties such as:
    • Sites of expression (presence and known absence)
    • Interacting proteins
    • Related proteins; homology; orthology
    • Regulation: Upregulated, Downregulated, Affects
    • Protein complex data
    • Protein modification data
    • Association with disease
    • Use of protein as biomarker
  • Orthology: nice synteny maps are provided
  • Expression patterns, including pathological and cell type, not just organ/tissue
  • Highlighting of biomarker potential
  • Protein modifications are listed
  • Proteome's protein functions are likely to be more rigorous than those of NCBI
  • Richer dataset as compared to NCBI Gene (e.g., BIOBASE has information on gene regulation, NCBI does not)
  • Antibodies are listed
  • Provide positive/negative correlation & data for disease and physical interactions, among others

Access is limited to the Stanford community (SUNet ID holders). Our current license allows for two simultaneous users.

What is the MMDB database?

The MMDB database is NCBI's database of biomolecular structures, integrated within Entrez.
  • Finding a structure for a protein for which no structure is available by identifying a related protein (homolog) that has a structure
  • Visualizing sequence/structure alignments
  • Understanding biological function by analyzing the mechanism of action of a protein
  • Understanding the evolutionary history of and relationships between macromolecules
  • Example record
  • Cn3D client
  • 3D structure data of biological macromolecules obtained mostly from X-ray crystallography and NMR-spectroscopy

Key References

What is the TRANSFAC database?


Contact Your Liaison

Arpi Siyahian

To suggest changes to this portal or if you have questions, please contact your liaison Arpi Siyahian.

  • CAP profile
  • 650-723-1233

We provide consultations via Adobe Connect.

Join a MedMeeting