Function Reference
How to get help about function options
Functions implemented in LFMPro accept their required and optional arguments in the following style:
    functionname('requiredvalue1', 'requiredvalue2',…, 'option1','value1', 'option2','value2',…);
If you pass 'help' as one of the options, the function lists its available options and, if one is available, a synopsis. For example, here are the available options and their default values for the draw_backbone function:
    >> draw_backbone('1irda','help')
    default options are:
                simple: 1
             LineWidth: 1
             useSphere: 0
          sphereRadius: 0.4000
         sphereDetails: 6
        cylinderDetail: 8
                 color: 0.2000 0.2000 0.2000
            withLabels: 0
            labelShift: 0.3000
                     R:
                     T:
LFMPro uses a caching mechanism to avoid recalculating some of the steps. One of the options common to many functions is recache; setting it to 1 forces the function to recalculate its value instead of reusing the cached one.
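The recache behavior can be illustrated with a minimal sketch. It assumes an in-memory containers.Map as the cache; LFMPro's own cache is accessed through myCache_load and myCache_save, and its storage details are not shown in this reference. The names cache, key, and nComputed are illustrative only.

```matlab
% Hypothetical illustration of a recache flag; not LFMPro's implementation.
cache = containers.Map();      % in-memory cache keyed by an identifier string
key = 'sum_of_squares';        % illustrative cache identifier
recache = 0;                   % set to 1 to force recalculation
nComputed = 0;                 % counts how often the value is actually computed

for pass = 1:2
    if ~recache && isKey(cache, key)
        value = cache(key);            % reuse the cached result
    else
        value = sum((1:10).^2);        % stands in for an expensive calculation
        nComputed = nComputed + 1;
        cache(key) = value;            % store the result for later calls
    end
end
```

With recache left at 0, the second pass reads the stored value, so the calculation runs only once. Setting recache to 1 would make both passes recompute.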
src/* Matlab scripts
Protein Classification
[ cs,feated ] = classify_proteins ( ptns, feats,scores,borderDs, feats_,scores_,borderDs_, prior,varargin )
binary-classify proteins using representative feature sets of families
Data handling functions
[ pdbs,families,cacheOpts ] = data_astral_family ( varargin )
get the list of pdbs in a given SCOP family, using a representative ASTRAL file.
[ pdbs,families,cacheOpts ] = data_culled_family ( varargin )
get the list of pdbs in a given SCOP family, selected from the similarity-reduced culled-pdb set.
[ pdbs ] = data_culled_parse ( varargin )
reads in a list of pdb/chain identifiers from a culledPDB list.
[ pdbs ] = data_pdbs_cleanup ( varargin )
filters pdbs using the desired criteria. filtering by chain length or by the presence of broken backbones is possible.
data_prepare_batch ( varargin )
automates preprocessing of a list of pdbs. Resume/Continue using an external file is possible.
[ ptns,families ] = data_prepare_experiment ( varargin )
prepare an experiment dataset. reads in an external file of family pdbs and loads families and proteins.
[ s ] = data_rand_family ( varargin )
prints a random permutation of family members, which can be pasted to families.txt file for use in experiments.
[ pdbs,families,cacheOpts ] = data_scop_family ( varargin )
get the list of pdbs in a given SCOP family
[ families ] = data_scop_findFamily ( pdb )
finds the scop family of a given pdb.
3D display functions
signatures_show ( p,feats,scores,varargin )
show the mapping of features onto a protein
draw_backbone ( coords, varargin )
draws the protein backbone in 3D. coords can be a pdb file name, a protein struct, or a list of coords.
[ handles ] = draw_centers ( centers,varargin )
draws critical points. use this together with draw_backbone.
[ X, Y, Z ] = draw_cylinder ( R, N,r1,r2 )
draw an N-sided cylinder based on the generator curve in the vector R.
Experimentation batch scripts
[ stats ] = experiment_G ( ptns, family_inds, varargin )
run experiments for varying family/background size.
[ thestats ] = experiment_Gs ( ptns, families, fams, nonfams, varargin )
run experiments for varying family/background size.
[ feateds ] = experiment_classification ( ptns, family_inds, varargin )
run experiments for classification, and collect results.
Family-specific functions
[ triads ] = family_triad_find ( p,varargin )
find the location of catalytic triad (approximate)
[ minmax ] = features_getMinMax ( ptns, varargin )
finds min,max of features, which can later be used for normalization.
Environment setup
globals_aksu ( )
setup the global paths and variables.
globals_garip ( )
setup the global paths and variables.
globals_set ( )
setup the global paths and variables. determines which machine we are on, and calls the appropriate globals_* configuration file.
Geometric utility functions
[ bins,binSize ] = lib_3D_hashing ( coords, varargin )
uniform hashing of the coordinates, which can be used to speed up nearest-neighbor or range calculations.
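As a rough illustration of uniform hashing, the sketch below assigns each point to a cubic grid cell of side binSize. The bin size of 5 and the variable names are assumptions made for this example; the actual options and return layout of lib_3D_hashing are not documented here.

```matlab
% Hypothetical sketch of uniform 3D hashing; not the lib_3D_hashing code.
coords = [0.2 0.1 0.3;
          4.9 0.5 0.2;
          0.4 0.3 0.1;
          5.1 5.2 0.4];        % one point per row (x, y, z)
binSize = 5;                   % assumed edge length of a grid cell

% Integer cell index of each point along x, y, and z (1-based).
origin = min(coords, [], 1);
idx = floor(bsxfun(@minus, coords, origin) / binSize) + 1;

% Group the points by cell: binOf(i) is the occupied cell that holds point i.
[cells, ~, binOf] = unique(idx, 'rows');
```

A nearest-neighbor or range query then only needs to examine the points in the query's own cell and the adjacent cells, rather than every point.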
[ ret ] = lib_points_within_spheres ( centers, atomCoords, varargin )
finds the points (atomCoords) that lie within the spheres placed at the given centers.
[ center, r ] = lib_tetra_center ( tetra )
finds the center and radius of the sphere that passes through the vertices of a tetrahedron.
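The sphere through four points can be found by solving a small linear system. The sketch below is one standard way to do this and is not taken from lib_tetra_center itself; the 4-by-3 layout of tetra (one vertex per row) is an assumption.

```matlab
% Hypothetical sketch of a circumsphere calculation for a tetrahedron.
tetra = [0 0 0;
         1 0 0;
         0 1 0;
         0 0 1];               % assumed layout: one vertex per row

% The center c is equidistant from all vertices. Subtracting the equation
% for vertex 1 from those of vertices 2..4 gives the linear system
%   2*(p_i - p_1) * c' = |p_i|^2 - |p_1|^2,   i = 2..4
A = 2 * bsxfun(@minus, tetra(2:4, :), tetra(1, :));
b = sum(tetra(2:4, :).^2, 2) - sum(tetra(1, :).^2);

center = (A \ b)';                   % 1-by-3 center of the sphere
r = norm(center - tetra(1, :));      % radius: distance to any vertex
```

For the unit tetrahedron above, the center is (0.5, 0.5, 0.5). A degenerate tetrahedron with coplanar vertices makes A singular, so a practical version would need to check for that case.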
[ r,rind,nextind,firstDist ] = matches_rank_triad ( feats,p,varargin )
find the rank of the triad among the features.
[ sum_writhe ] = writhe_edges ( edgeList, coor, exact )
calculate writhe of the given list of edges.
synopsis:
    moleculeCoor=load('../data/1aus.mol')
    writhe=sum_writhe_volume([1:438;2:439]',moleculeCoor(:,2:4))
    exact=[1|0] [exact|volume]
Mining for significant sites
[ feats,scores,border_ds,varargout ] = mine_Sites ( ptns, family_inds, varargin )
mines representative feature set, ordered by discriminative scores.
pdb reading and features processing
[ PDB_struct ] = my_pdbread ( pdbfile )
read in a pdb file.
[ obje ] = pdbname_parse ( pdb )
parse a pdb name into pdb,chain,range parts.
[ pdb ] = pdbname_unparse ( obje )
combine pdb,chain,range parts into a pdb name.
[ ptn,gotwhat ] = ptn_get ( ptn,varargin )
an interface to read/extract/calculate various properties (e.g., atoms, CA-atoms) of a protein. keeps track of caching, so properties are not calculated each time.
[ centerOfMass ] = ptn_get_atomCenterOfMass ( atoms,atomsWithinSpheres,atomCount,centers,varargin )
subfunction of ptn_get. not intended for direct call. calculates the individual centers of mass within the spheres, for each atom-type. note on code-synchronization: this function is similar to ptn_get_atomCount
[ atomCounts ] = ptn_get_atomCount ( atomsWithinSpheres,atoms,varargin )
subfunction of ptn_get. not intended for direct call. calculates the individual atom counts within the spheres, for each atom-type. note on code-synchronization: this function is similar to ptn_get_atomCenterOfMass
[ cps ] = ptn_get_criticalPoints ( ptn,varargin )
subfunction of ptn_get. not intended for direct call. wrapper for the criticalPoint executable. finds the critical points in pairs.
[ f ] = ptn_get_features ( p,type )
subfunction of ptn_get. not intended for direct call. combines various features into a vector.
[ features ] = ptn_get_features_normalized ( features,which )
subfunction of ptn_get. not intended for direct call. normalizes features to the range [0,1].
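Normalization to [0,1] is commonly done per feature with the minimum and maximum values, such as those reported by features_getMinMax. The sketch below shows that min-max scaling under the assumption that features are stored one protein site per row and one feature per column; it is not the body of ptn_get_features_normalized.

```matlab
% Hypothetical min-max normalization sketch; names and layout are assumed.
feats = [1 10;
         3 20;
         5 40];                % rows: samples, columns: features

mn = min(feats, [], 1);        % per-feature minimum
mx = max(feats, [], 1);        % per-feature maximum
span = mx - mn;
span(span == 0) = 1;           % avoid division by zero for constant features

% Shift each column to start at 0, then scale it to end at 1.
normed = bsxfun(@rdivide, bsxfun(@minus, feats, mn), span);
```

Taking the minimum and maximum from a fixed reference set, rather than from each protein separately, keeps the normalized features comparable across proteins.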
[ numPieces ] = ptn_get_numPieces ( cas )
subfunction of ptn_get. not intended for direct call. calculates the number of backbone pieces that make up the center.
[ ptn ] = ptn_get_protein ( ptn,varargin )
subfunction of ptn_get. not intended for direct call. reads a protein from a pdb file, takes care of chain selection, and examines broken chains (missing residues).
[ ptn ] = ptn_get_tetra ( ptn,varargin )
subfunction of ptn_get. not intended for direct call. wrapper for AD-tetrahedra executable. finds the AD-tetrahedra for a given set of points and thresholds.
[ writhe ] = ptn_get_writhe ( cas, caCoords )
subfunction of ptn_get. not intended for direct call. calculates the writhing number for centers.
[ ret ] = ptn_get_x ( ptn,which )
helper function to unify accessing a protein's property.
Weight setup and optimization
[ bestWeights,bestWeightsHistory ] = weights_optimize ( ptns,families,family_inds,varargin )
optimize the weights of the Euclidean distance using a simulated annealing approach
[ weights,varargout ] = weights_get ( varargin )
get the weights to be used in Euclidean distance
[ W ] = weights_set ( o_weights,FEAT_SIZE )
set the weights used in Euclidean distance
[ feats ] = weights_use ( W, feats )
multiply feats with weights to use in distance function
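The idea behind weights_use can be sketched as scaling each feature column by its weight before taking an ordinary Euclidean distance. The sketch assumes W is a row vector with one weight per feature and feats has one feature vector per row; the actual shapes used by LFMPro are not stated in this reference.

```matlab
% Hypothetical sketch of a weighted Euclidean distance; shapes are assumed.
feats = [1 2 3;
         4 6 3];               % two feature vectors, one per row
W = [1 0.5 2];                 % one weight per feature

% Multiply every feature column by its weight.
weighted = bsxfun(@times, feats, W);

% The plain Euclidean distance on the weighted features is the
% weighted distance between the original feature vectors.
d = norm(weighted(1, :) - weighted(2, :));
```

Applying the weights once to the features means any unweighted distance routine, such as myEuclidean, could be reused without modification.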
Utility functions
[ varargout ] = myFuncs ( func, varargin )
provides a place to store various temporary functions. these are small scripts that don't quite make it into their own function file.
myLibrary/* Matlab scripts
Caching mechanism
[ opts ] = myCache_getOptions ( varargin )
not intended for direct use. is called from myCache_load/save
[ recache, cachevar, varargout ] = myCache_load ( varargin )
synopsis: myCache_load('identifier', struct('extension','branch'))
myCache_save ( variable, varargin )
[ ret ] = myCache_test ( varargin )
Optimized geometric operations
[ ret ] = myEuclidean ( X, Y )
[ mins, inds ] = myEuclidean_closest ( X, Y )
[ hist ] = myHistogram ( V,varargin )
Setting up Function arguments and options
[ varargin ] = myFix_varargin ( varargin )
varargin passed down to function calls gets wrapped in another layer of {} cell nesting with each call. this function fixes the problem by extracting the innermost cell content.
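The nesting problem and its fix can be sketched in a few lines. This is an illustration of the unwrapping idea, not the body of myFix_varargin; the variable args stands in for a varargin that has been passed through two intermediate calls.

```matlab
% Hypothetical sketch of unwrapping a nested varargin.
args = {{{'color', [1 0 0]}}};     % name/value pair buried under two extra cell layers

% Peel off wrappers while the cell holds exactly one element that is itself a cell.
while iscell(args) && numel(args) == 1 && iscell(args{1})
    args = args{1};
end
```

After the loop, args is the flat cell {'color', [1 0 0]}. Note that this simple rule would also unwrap a legitimate single cell-array argument, so a real implementation may need a stricter test.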
[ defaults ] = myOptions_set ( defaults, varargin )
synopsis: opts = myOptions_set(struct('name','ahmet', 'surname','sacan'),varargin); the struct holds the defaults and varargin holds the overrides.
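A minimal sketch of the defaults-plus-overrides pattern follows. It is not the myOptions_set source; the loop, the overrides variable (standing in for varargin), and the error on unknown option names are assumptions made for illustration.

```matlab
% Hypothetical sketch of merging name/value overrides into default options.
defaults = struct('simple', 1, 'LineWidth', 1, 'useSphere', 0);
overrides = {'LineWidth', 2, 'useSphere', 1};   % stands in for varargin

opts = defaults;
for k = 1:2:numel(overrides)
    name = overrides{k};
    if ~isfield(opts, name)
        % Assumed behavior: reject names that have no default.
        error('Unknown option: %s', name);
    end
    opts.(name) = overrides{k + 1};   % override the default value
end
```

Options that are not mentioned in the overrides keep their default values, which is what allows callers to pass only the settings they want to change.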
myOptions_test ( varargin )
Other utility functions
[ fid ] = myFile_open ( filename,mode )
[ s ] = myImplode ( d, a )
[ str ] = myTime_pretty ( secs )