app:lfmpro:sample-classification [2006/11/14 09:33] (current)
Ahmet Sacan
Line 1: Line 1:
 +====== Classification Example ======  
 +  
 +This example shows how to set up a classification experiment between the Globins and Ser/Thr Kinases families. Note that this is not a challenging classification problem; it is shown here only to demonstrate the usage of the scripts. Please see the publication for the results of a more challenging classification experiment. The dataset and experiment setup are the same as described in the [[sample-globins|Mining Globins Family page]]. The setup yields the following set of proteins:  
 +  
 +<code>  
 +>> globals_set();  
 +>> [ptns,fams]=data_prepare_experiment('experimentFile','families.globins.kinases.txt');  
 +  1- 10 a.1.1.2 - (Globins)  
 + 11- 28 a.1.1.* - (Globin-like)  
 + 29- 38 d.144.1.7 - (Ser/Thr Kinases)  
 + 39-227 _except_a.1.1.* - (except Globins)  
 +228-424 _except_d.144.1.7 - (except Ser/Thr Kinases)  
 +</code>  
 +  
 +Now we generate representative sets for both the Globins and Kinases families, leaving one protein from each family (proteins 10 and 38) out of the training.  
 +  
 +<code>  
 +>> mine_family_represent('globins_leaveoneout', 'ptns',ptns([1:9]), 'rand',ptns([39:227]),'recache',1);  
 +>> mine_family_represent('kinases_leaveoneout', 'ptns',ptns([29:37]), 'rand',ptns([228:424]),'recache',1);  
 +</code>  
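The leave-one-out index ranges used above can also be built programmatically, which helps when a different protein is to be held out. This is a minimal sketch in plain MATLAB. The variable names (globinIdx, kinaseIdx, heldOutGlobin, heldOutKinase) are hypothetical and not part of the toolbox; only the index ranges come from the listing above.

```matlab
% Index ranges of the two families, taken from the experiment listing.
globinIdx = 1:10;    % a.1.1.2 (Globins)
kinaseIdx = 29:38;   % d.144.1.7 (Ser/Thr Kinases)

% Hypothetical choice of the protein to hold out from each family.
heldOutGlobin = 10;
heldOutKinase = 38;

% Remove the held-out index from each family to get the training indices.
globinTrain = setdiff(globinIdx, heldOutGlobin);
kinaseTrain = setdiff(kinaseIdx, heldOutKinase);

disp(globinTrain);
disp(kinaseTrain);
```

With these choices, globinTrain and kinaseTrain equal the ranges 1:9 and 29:37 passed to mine_family_represent above.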
 +  
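 +Next, we can test the proteins that were left out during the training phase and see which class each one is assigned to. In the output below, the returned value appears to be the position of the matching representative set in the list passed to classify_protein (1 for globins_leaveoneout, 2 for kinases_leaveoneout):  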
 +<code>  
 +>> class=classify_protein(ptns(10),{'globins_leaveoneout','kinases_leaveoneout'})  
 +  
 +class =  
 +     1  
 +  
 +>> class=classify_protein(ptns(38),{'globins_leaveoneout','kinases_leaveoneout'})  
 +  
 +class =  
 +     2  
 +>>  
 +</code>  
 +  
 +Or, we could test all Globins and Kinases family members as follows:  
 +<code>  
 +>> for i=1:38; class(i)=classify_protein(ptns(i),{'globins_leaveoneout','kinases_leaveoneout'}); end; class  
 +  
 +class =  
 +  Columns 1 through 18  
 +     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1  
 +  
 +  Columns 19 through 36  
 +     1     1     1     1     1     1     1     1     1     1     2     2     2     2     2     2     2     2  
 +  
 +  Columns 37 through 38  
 +     2     2  
 +</code>  
 +  
 +Note that the Globin-like proteins in the range 11-28, which were not used in training, are also classified as Globins.
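The result can be summarized as an accuracy figure and a confusion matrix. The sketch below is plain MATLAB and does not call the toolbox. It hard-codes the predictions printed above, and it assumes that Globin-like proteins (11-28) should count as class 1. The names predicted, expected, accuracy and confusion are illustrative; predicted is used instead of class to avoid shadowing the built-in class function.

```matlab
% Predictions copied from the output above: proteins 1-28 -> 1, 29-38 -> 2.
predicted = [ones(1,28), 2*ones(1,10)];

% Expected labels (assumption: Globin-like proteins 11-28 count as class 1).
expected = [ones(1,28), 2*ones(1,10)];

% Fraction of proteins assigned to the expected class.
accuracy = mean(predicted == expected);

% 2x2 confusion matrix: rows are expected classes, columns are predicted classes.
confusion = accumarray([expected(:), predicted(:)], 1, [2 2]);

disp(accuracy);
disp(confusion);
```

For the predictions shown on this page, every protein falls on the diagonal of the confusion matrix.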
app/lfmpro/sample-classification.txt · Last modified: 2006/11/14 09:33 by Ahmet Sacan