Statistics


The more ESTs that support a specific splicing site, the more reliable the splicing information is. The minimum number of ESTs aligned to a specific splicing site can therefore be used as a parameter: the higher the minimum frequency the user specifies, the fewer alternative splicing events are found. The following tables list the number of alternative splicing genes in human and mouse for different minimum EST frequencies (Table 1, Table 2). If only the splicing sites supported by at least two ESTs are considered, the numbers of alternative splicing sites for the different organisms are as shown in Table 3.
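The thresholding described above can be sketched as follows. This is a minimal illustration, not the pipeline used to build the tables: the function name `count_events_by_min_support`, the event-type labels, and the toy records are all hypothetical, and it assumes each alternative splicing event is available as a pair of event type and number of supporting ESTs.

```python
from collections import Counter

def count_events_by_min_support(events, thresholds):
    """Count events per type for each minimum EST support threshold.

    events: iterable of (event_type, est_support) pairs (assumed input shape).
    thresholds: iterable of minimum EST support values.
    """
    result = {}
    for t in thresholds:
        # Keep only events supported by at least t ESTs.
        counts = Counter(etype for etype, support in events if support >= t)
        counts["Total"] = sum(counts.values())
        result[t] = dict(counts)
    return result

# Toy records, invented for illustration only.
events = [
    ("exon_skipping", 1),
    ("exon_skipping", 3),
    ("3_prime_as", 2),
    ("5_prime_as", 1),
    ("5_prime_as", 5),
    ("me", 2),
]

summary = count_events_by_min_support(events, thresholds=[1, 2, 3])
for threshold, counts in summary.items():
    print(threshold, counts)
```

Because each threshold keeps every event whose support is at least that value, the totals can only stay the same or shrink as the threshold rises, which is the pattern visible in the tables.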

Table 1. Alternative splicing coverage of Homo sapiens by EST frequencies. 

HS EST support   Exon skipping   3’ AS   5’ AS   ME    Total
1                17418           12933   12879   270   43500
2                5800            3227    3213    133   12373
3                3093            1626    1520    87    6326
4                1996            947     882     64    3889
5                1429            653     582     52    2716

Table 2. Alternative splicing coverage of Mus musculus by EST frequencies.

MM EST support   Exon skipping   3’ AS   5’ AS   ME    Total
1                8068            6567    6473    119   21227
2                2772            1504    1488    65    5829
3                1433            676     702     42    2853
4                927             420     423     26    1796
5                638             305     287     18    1248

Table 3. Alternative splicing coverage of six organisms, EST support = 2.

 

Organism   Exon skipping   3’ AS   5’ AS   ME    Total
HS         5800            3227    3213    133   12373
MM         2772            1504    1488    65    5829
RN         158             145     162     0     465
DM         8               100     106     0     214
CE         7               50      63      1     121
AT         2               59      76      0     137

The certainty associated with these observations varies substantially, because the number of ESTs supporting each AS site differs. Table 4 lists the number of AS events and AS genes in human for different minimum EST frequencies.

EST coverage of the genome is highly uneven: a small number of genes are represented by large numbers of ESTs, and this set might be biased towards medically relevant genes. When computing the ratio of AS genes, the number of ESTs mapped to the same gene region should therefore be taken into account. Table 5 shows that genes associated with more ESTs are more likely to show alternative splicing.

Table 4. The number of AS genes with different EST frequencies.

EST support   No. of ESTs   AS events   AS genes
1             858838        115915      26814
2             760737        33261       18490
3             693091        17886       13817
4             642163        11435       11426
5             599929        8780        9463

Table 5. Relation between alternative splicing genes and EST coverage.

Human EST coverage / gene   Num. of genes   Alternative splicing genes   Ratio
2,000                       23              23                           100%
1,000                       87              86                           99%
800                         124             121                          98%
300                         600             577                          96%
100                         3525            2880                         82%
50                          7193            5482                         76%
10                          19303           8956                         46%
5                           23878           9117                         38%
2                           29520           9119                         30%
=0                          26377           0                            0%
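The Ratio column of Table 5 is the number of alternative splicing genes divided by the number of genes in each coverage class. The sketch below recomputes it from the counts in the table. The helper name `as_ratio` is hypothetical, and the "=0" row is left out because its ratio is trivially zero. Computed values may differ slightly from the printed percentages, depending on how those were rounded.

```python
# (EST coverage per gene, number of genes, alternative splicing genes),
# copied from Table 5.
table5 = [
    (2000, 23, 23),
    (1000, 87, 86),
    (800, 124, 121),
    (300, 600, 577),
    (100, 3525, 2880),
    (50, 7193, 5482),
    (10, 19303, 8956),
    (5, 23878, 9117),
    (2, 29520, 9119),
]

def as_ratio(num_genes, as_genes):
    """Fraction of genes in a coverage class with observed alternative splicing."""
    return as_genes / num_genes if num_genes else 0.0

ratios = [(coverage, as_ratio(n, a)) for coverage, n, a in table5]
for coverage, ratio in ratios:
    print(f"{coverage:>5}  {ratio:.1%}")
```

The ratio falls steadily as the coverage class drops, which is the trend the table is meant to show.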

As EST data collection continues, it even seems probable that alternative splicing variants may eventually be observed for all genes. Yet in Avatar, some genes are associated with many ESTs while no alternative splicing event has been observed. For example, NM_000184.1, NM_000559.1, NM_015379.1, NM_002415.1, NM_018955.2 and NM_009199.1 are each associated with more than 500 ESTs, yet no obvious alternative splicing event is detected (Table 6).

Table 6. Genes associated with more than 500 ESTs but with no obvious alternative splicing.

Organism   EST support   mRNA accession   Gene name
HS         1015          NM_000184.1      HBG2
HS         987           NM_000559.1      HBG1
HS         806           NM_015379.1      BRI3
HS         711           NM_002415.1      MIF
HS         678           NM_018955.2      UBB
MM         610           NM_009199.1      Slc1a1
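A selection like the one in Table 6 can be expressed as a simple filter over per-gene summaries. The sketch below is illustrative only: the record layout (`est_count`, `as_events`), the function name `well_sampled_without_as`, and the two placeholder genes are assumptions, while the HBG2 and MIF rows reuse the EST counts from Table 6.

```python
def well_sampled_without_as(genes, min_ests=500):
    """Return genes with more than min_ests ESTs and no detected AS event,
    sorted by EST count in descending order."""
    selected = (
        g for g in genes
        if g["est_count"] > min_ests and g["as_events"] == 0
    )
    return sorted(selected, key=lambda g: g["est_count"], reverse=True)

genes = [
    {"accession": "NM_002415.1", "name": "MIF", "est_count": 711, "as_events": 0},
    {"accession": "NM_000184.1", "name": "HBG2", "est_count": 1015, "as_events": 0},
    # Placeholder records, invented for illustration only.
    {"accession": "hypothetical_1", "name": "GENE_X", "est_count": 900, "as_events": 4},
    {"accession": "hypothetical_2", "name": "GENE_Y", "est_count": 40, "as_events": 0},
]

hits = well_sampled_without_as(genes)
print([g["name"] for g in hits])
```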

  To detect AS sites with a bioinformatic approach, ESTs must first be grouped into clusters. Traditionally, researchers use UniGene clustering information to classify ESTs. This saves computation time but is less informative. Our approach aligns ESTs to the genome and clusters them according to their genomic location. Many ESTs that belong to different UniGene clusters align to the same genomic region. For example, we find that UniGene clusters HS.433680, HS.432883, HS.305916 and HS.396617 all map to the gene TG (thyroglobulin).
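  Clustering by genomic location can be sketched as merging overlapping alignment intervals on the same chromosome and strand. This is one simple way to realise the idea and is not necessarily the procedure used here. The function name `cluster_by_location`, the tuple layout, and the toy coordinates are hypothetical, and the sketch assumes each EST has a single alignment given as inclusive start and end positions.

```python
from collections import defaultdict

def cluster_by_location(alignments):
    """Group ESTs whose genomic alignments overlap.

    alignments: iterable of (est_id, chrom, strand, start, end) tuples
    (assumed input shape). Returns a list of cluster dicts.
    """
    by_key = defaultdict(list)
    for est_id, chrom, strand, start, end in alignments:
        by_key[(chrom, strand)].append((start, end, est_id))

    clusters = []
    for (chrom, strand), hits in sorted(by_key.items()):
        hits.sort()  # process alignments from left to right
        current = None
        for start, end, est_id in hits:
            if current is not None and start <= current["end"]:
                # Overlaps the open cluster: extend it.
                current["end"] = max(current["end"], end)
                current["ests"].append(est_id)
            else:
                # No overlap: open a new cluster.
                current = {"chrom": chrom, "strand": strand,
                           "start": start, "end": end, "ests": [est_id]}
                clusters.append(current)
    return clusters

# Toy alignments, invented for illustration only.
alignments = [
    ("est_a", "chr1", "+", 100, 500),
    ("est_b", "chr1", "+", 450, 900),
    ("est_c", "chr1", "+", 2000, 2400),
    ("est_d", "chr1", "-", 120, 480),
    ("est_e", "chr1", "+", 850, 1200),
]

clusters = cluster_by_location(alignments)
for c in clusters:
    print(c["chrom"], c["strand"], c["start"], c["end"], c["ests"])
```

  Because membership depends only on genomic position, ESTs assigned to different UniGene clusters end up in the same group whenever their alignments overlap.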


  ©2004 BioGrid Lab. 886-4-23323456:1704   FCU  Taichung , TAIWAN 433