— |
app:lfmpro:sample-classification [2006/11/14 09:33] (current) Ahmet Sacan |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Classification Example ====== | ||
+ | |||
+ | This example shows how to set up a classification experiment between the Globins and Ser/Thr Kinases families. Note that this is not a challenging classification problem; it is shown here only to demonstrate the usage of the scripts. Please see the publication for the results of a more challenging classification experiment. The dataset and experiment setup are the same as those described in the [[sample-globins|Mining Globins Family page]]. The setup loads the following sets of proteins: | ||
+ | |||
+ | <code> | ||
+ | >> globals_set(); | ||
+ | >> [ptns,fams]=data_prepare_experiment('experimentFile','families.globins.kinases.txt'); | ||
+ | 1- 10 a.1.1.2 - (Globins) | ||
+ | 11- 28 a.1.1.* - (Globin-like) | ||
+ | 29- 38 d.144.1.7 - (Ser/Thr Kinases) | ||
+ | 39-227 _except_a.1.1.* - (except Globins) | ||
+ | 228-424 _except_d.144.1.7 - (except Ser/Thr Kinases) | ||
+ | </code> | ||
+ | |||
+ | Now we generate representative sets for both the Globins and Kinases families, leaving one protein from each family (''ptns(10)'' and ''ptns(38)'') out of the training. | ||
+ | |||
+ | <code> | ||
+ | >> mine_family_represent('globins_leaveoneout', 'ptns',ptns([1:9]), 'rand',ptns([39:227]),'recache',1); | ||
+ | >> mine_family_represent('kinases_leaveoneout', 'ptns',ptns([29:37]), 'rand',ptns([228:424]),'recache',1); | ||
+ | </code> | ||
+ | |||
+ | Next, we can test the proteins that were left out during the training phase and see which class they belong to: | ||
+ | <code> | ||
+ | >> class=classify_protein(ptns(10),{'globins_leaveoneout','kinases_leaveoneout'}) | ||
+ | |||
+ | class = | ||
+ | 1 | ||
+ | |||
+ | >> class=classify_protein(ptns(38),{'globins_leaveoneout','kinases_leaveoneout'}) | ||
+ | |||
+ | class = | ||
+ | 2 | ||
+ | >> | ||
+ | </code> | ||
+ | |||
+ | Alternatively, we can classify all Globins, Globin-like, and Kinases proteins (indices 1-38) in one loop: | ||
+ | <code> | ||
+ | >> for i=1:38; class(i)=classify_protein(ptns(i),{'globins_leaveoneout','kinases_leaveoneout'}); end; class | ||
+ | |||
+ | class = | ||
+ | Columns 1 through 18 | ||
+ | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | ||
+ | |||
+ | Columns 19 through 36 | ||
+ | 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 | ||
+ | |||
+ | Columns 37 through 38 | ||
+ | 2 2 | ||
+ | </code> | ||
+ | |||
+ | Note that the Globin-like proteins (indices 11-28), which were not part of the training sets built by the commands above, are also classified as Globins.
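
The result of the loop can be summarized with a few lines of plain MATLAB. The following is only a sketch and not part of the LFM-Pro scripts: the variable names ''predicted'', ''expected'', ''accuracy'' and ''confusion'' are made up for illustration, and the expected labels assume that class 1 corresponds to the first representative set (Globins, with the Globin-like proteins counted as Globins) and class 2 to the second (Kinases).

```matlab
% Predicted classes, copied from the loop output above (indices 1-38).
% In a live session this would be the vector filled in by the loop.
predicted = [ones(1,28), 2*ones(1,10)];

% Assumed expected labels (an assumption, not produced by LFM-Pro):
%   1 for Globins and Globin-like proteins (indices 1-28)
%   2 for Ser/Thr Kinases (indices 29-38)
expected = [ones(1,28), 2*ones(1,10)];

% Fraction of proteins whose predicted class matches the expected one.
accuracy = mean(predicted == expected);

% 2x2 confusion matrix: rows are expected classes, columns are predicted classes.
confusion = accumarray([expected(:), predicted(:)], 1, [2 2]);

fprintf('Accuracy: %.2f\n', accuracy);
disp(confusion);
```

The sketch stores the predictions in ''predicted'' rather than ''class'', because a variable named ''class'' shadows MATLAB's built-in ''class'' function for the rest of the session.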