ES Cell Line Sequencing Information from MGC and CGAP
The Office of Cancer Genomics (OCG) has been spearheading the National Cancer Institute’s efforts to generate transcriptome data from embryonic stem (ES) cells. Two different approaches, Expressed Sequence Tags (ESTs) and Serial Analysis of Gene Expression (SAGE), have produced data that are available to the entire scientific community. Below is a progress report as of April 2005, along with a description of specific Web-based resources for analyzing the data. Updates will be provided as new data become available.
The Mammalian Gene Collection (MGC)
The MGC Program is a trans-NIH project whose goal is to obtain a representative full-length cDNA clone for every human and mouse gene [The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004]. As part of the MGC, the following cDNA libraries were prepared from the NIH-registered ES cell lines:
Library name | Stem Cell (Provider’s code/NIH code) | #ESTs | Normalized | Comments |
NIH_MGC_258 | hESBGN.01/BG01 | 4,518 | No | BG01 cells differentiated into an early endodermal cell type. |
NIH_MGC_260 | hESBGN.01/BG01 | 8,183 | No | |
NIH_MGC_262 | hESBGN.01/BG01 | 8,604 | No | BG01 cells differentiated into an early neural progenitor cell type. |
NIH_MGC_172 | H1/WA01 | 10,005 | No | Undifferentiated WA01 cells grown on Matrigel. |
NIH_MGC_173 | H1/WA01 | 8,471 | No | WA01 cells differentiated into trophoblasts by 4 day treatment with BMP and grown on Matrigel. |
NIH_MGC_278* | UC01/HSF-1 | 4,111 | No | |
NIH_MGC_279* | UC01/HSF-1 | 4,112 | Yes | |
NIH_MGC_280* | UC06/HSF-6 | 4,059 | No | |
NIH_MGC_281* | UC06/HSF-6 | 4,372 | Yes | |
NIA Human H1 Embryonic Stem Cell cDNA Library (Long) | H1/WA01 | 6,300 | No | Cells grown on irradiated mouse embryonic fibroblast feeder layer. |
*These libraries were size-selected to remove DNA fragments below 1.2 kb.
Analysis Tools for cDNA ES Cell Libraries
- UniGene Homo sapiens Library Browser (NCBI) provides information on the library construction methods, the exact number of 5’ ESTs from each library, and relative gene expression from each cell line.
- Entrez Nucleotide (NCBI) provides a list of GenBank accession numbers, with links to the individual sequences, for each library (obtained by typing in the name of the library in the search text box on the home page).
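The library-name lookup described above can also be scripted. The following is a minimal sketch, assuming that NCBI's E-utilities ESearch endpoint is used and that the EST records are searchable in the `nuccore` database; the function name `library_search_url` is hypothetical, and the library name is passed as a plain search term, mirroring what one would type into the search box.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def library_search_url(library_name, retmax=20):
    """Build an ESearch URL that looks up a library name as a plain-text term."""
    params = {
        "db": "nuccore",              # assumption: EST records are indexed here
        "term": f'"{library_name}"',  # quoted so the name is searched as a phrase
        "retmax": retmax,             # maximum number of record IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Library name taken from the table above.
url = library_search_url("NIH_MGC_172")
print(url)
```

The sketch only builds the request URL; fetching it and paging through the results are left out, and the number of records returned may differ from the EST counts listed in the table.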
Analysis Tools for ES Cell Transcription Data
- On the CGAP Web site (where genes are based on UniGene classification):
  - Gene Library Summarizer lists all the genes, both known and unknown, found in one library, or group of libraries, and further divides them into those genes that are unique or not unique to that library or pool.
  - cDNA xProfiler compares the gene expression between two individual libraries or pools of libraries.
  - cDNA DGED provides a list of statistically significant differences in gene expression between two libraries, two pools of libraries, or a library and a library pool.
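The unique/not-unique split reported by the Gene Library Summarizer can be pictured with set operations. This is only an illustrative sketch, not the CGAP implementation: the function `summarize_library` is hypothetical, and the gene identifiers are invented placeholders rather than real UniGene clusters.

```python
def summarize_library(target, others):
    """Split the genes of `target` into those seen only there and those shared."""
    # Every gene observed in at least one of the comparison libraries.
    seen_elsewhere = set().union(*others) if others else set()
    unique = target - seen_elsewhere   # found only in the target library
    shared = target & seen_elsewhere   # also found in another library
    return unique, shared


# Invented gene identifiers, for illustration only.
lib_a = {"gene_A", "gene_B", "gene_C"}
lib_b = {"gene_B", "gene_D"}
lib_c = {"gene_C", "gene_E"}

unique, shared = summarize_library(lib_a, [lib_b, lib_c])
print(sorted(unique), sorted(shared))
```

Pooling several libraries before the comparison amounts to taking the union of their gene sets and passing the result as `target`.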
Cancer Genome Anatomy Project/SAGE Genie
One of the components of the NCI’s Cancer Genome Anatomy Project (CGAP) is the analysis of transcription profiles from tissues or cell lines using Serial Analysis of Gene Expression (SAGE). With SAGE it is possible to determine the relative expression of a large number of genes within a biological sample. CGAP uses a novel analytical approach to map tags to genes and provides a unique set of tools, called SAGE Genie, to analyze the data (PNAS 2002). With SAGE Genie, users can interrogate 10 long SAGE ES libraries and one short SAGE ES library that have been prepared from NIH-registered stem cell lines:
Library name | Stem Cell (Provider’s code/NIH code) | # Tags |
LSAGE_Embryonic_stem_cell_H13_normal_p22_CL_SHE15 | H13/WA13 | 221,101 |
LSAGE_Embryonic_stem_cell_H14_normal_p22_CL_SHE14 | H14/WA14 | 212,170 |
LSAGE_Embryonic_stem_cell_H1_normal_p31_CL_SHE17 | H1/WA01 | 276,203 |
LSAGE_Embryonic_stem_cell_H1_normal_p54_CL_SHE16 | H1/WA01 | 218,214 |
LSAGE_Embryonic_stem_cell_H7_normal_p33_CL_SHE13 | H7/WA07 | 272,465 |
SAGE_Embryonic_stem_cell_H9_normal_p38_CL_SHES1 | H9/WA09 | 151,735 |
LSAGE_Embryonic_stem_cell_H9_normal_p38_CL_SHES2 | H9/WA09 | 401,432 |
LSAGE_Embryonic_stem_cell_HES3_normal_p16_CL_SHE10 | HES-3/ES03 | 205,353 |
LSAGE_Embryonic_stem_cell_HES4_normal_p36_CL_SHE11 | HES-4/ES04 | 209,232 |
LSAGE_Embryonic_stem_cell_HSF6_normal_p50_CL_SHES9 | UC06/HSF-6 | 224,488 |
LSAGE_Embryonic_stem_cell_BG01_normal_p20_CL_SHE19 | BG01/hESBGN.01 | 201,668 |
In addition, one SAGE library was prepared from the mouse feeder layer cells used to support the human ES cells from which the SAGE libraries were made. It is called mSAGE_MEF_normal_CL_SHE18 and contains 234,823 tags.
Analysis Tools for SAGE ES Libraries (SAGE Genie)
- Downloads provides information about the libraries and access to the tag data. (This is an FTP site.)
- SAGE Anatomic Viewer (SAV) allows the user to visualize the level of expression of a single gene in ES cell lines. After selecting a gene, the user can compare its expression in the pool of ES cell libraries with that in all other major cell lines derived from tissues in the body, both normal and cancerous, or in a mixture of both cell lines and tissues. Expression is visualized by color: ten shades ranging from pink to blue, each representing a different level of expression. In addition, the “embryonic stem cell” link opens a new page showing an individual tissue culture dish for each stem cell library, each colored to represent that library's expression level of the gene. SAV also provides access to:
  - The Ludwig Transcript Viewer, which depicts the transcript that contains the tag and three other theoretical tags downstream of the 3’ end.
  - The Digital Northern, which provides the expression of the gene in each library in which it is found.
- SAGE Digital Gene Expression Displayer (DGED) provides a list of genes whose expression differs to a statistically significant degree between two individual libraries or two pools of libraries. Thus, gene expression in ES cells, taken individually or as an ES pool, can be compared against any other ES cell line, pool, or tissue.
- SAGE Absolute Level Lister (SALL) lists all the SAGE libraries by tissue type, with “Stem Cell” being one unique heading. The total number of tags per library is given, and the library itself is linked to a separate page in which all the tags in that library are grouped by level of expression. The user can then access the list of genes (based on the best tag for a gene mapping) per expression level.
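Because the libraries differ in total tag count, a raw tag count has to be scaled by library size before two libraries are compared. The sketch below illustrates that idea together with a simple significance check. It rests on several assumptions: the statistical method used by DGED is not described here, so a pooled two-proportion z-test is swapped in purely for illustration; the function names are hypothetical; and the per-gene tag counts (120 and 60) are invented. Only the library totals come from the table above.

```python
import math


def tags_per_million(count, library_total):
    """Scale a raw tag count to tags per million for a library of the given size."""
    return count * 1_000_000 / library_total


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value from a pooled two-proportion z-test (illustrative choice)."""
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 1.0  # no tags at all (or only tags): nothing to distinguish
    z = (count_a / total_a - count_b / total_b) / se
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Library totals from the table above; the per-gene tag counts are invented.
H1_P31_TOTAL = 276_203   # LSAGE_Embryonic_stem_cell_H1_normal_p31_CL_SHE17
H9_LONG_TOTAL = 401_432  # LSAGE_Embryonic_stem_cell_H9_normal_p38_CL_SHES2

tpm_h1 = tags_per_million(120, H1_P31_TOTAL)
tpm_h9 = tags_per_million(60, H9_LONG_TOTAL)
p_value = two_proportion_p_value(120, H1_P31_TOTAL, 60, H9_LONG_TOTAL)
print(f"{tpm_h1:.1f} vs {tpm_h9:.1f} tags per million, p = {p_value:.3g}")
```

A normal approximation of this kind is unreliable for tags seen only a handful of times, and a real comparison across many genes would also need a correction for multiple testing.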
A manuscript reporting the results of analyses of the expression patterns of these cell lines will be submitted soon.
Updated: April 22, 2004