KEDDY: Knowledge-based Evaluation of Dependency DifferentialitY

This page provides downloads and usage instructions for the Java implementation of KEDDY. The KEDDY algorithm is presented in "KEDDY: a knowledge-based statistical gene set test method to detect differential functional protein-protein interactions".

 

 

Introduction

KEDDY is a statistical test method that identifies gene sets whose functional protein-protein interactions differ between two given conditions. KEDDY uses known functional protein-protein interaction information and compares the two conditions by evaluating the probability distributions of functional protein-protein interaction networks built from the known interactions. For gene sets with known functional protein-protein interactions, KEDDY provides better identification performance than the earlier EDDY method while requiring significantly less computation. This Java implementation of KEDDY is freely available to noncommercial users.

 

 

System Requirements

The programs on this page were developed and tested with Java version 1.8.0 and require a Java virtual machine of version 1.8.0 or higher.

 

 

Downloads

KEDDY to test a single gene set: A command line version of KEDDY that tests a single input gene set.

 

 

KEDDY to test MSigDB gene sets: A command line version of KEDDY prepared to test MSigDB gene sets, using BIOGRID functional protein-protein interaction information as the known interactions. 6,911 gene sets were curated from the canonical pathways, GO biological processes and GO molecular functions categories of MSigDB 5.2, and 1,379,810 interactions were curated from BIOGRID 3.4.146. When MSigDB and BIOGRID are updated in the future, updated files will be released accordingly.

 

 

 

Installing KEDDY

Both versions can be installed as follows:

 

1. Download the file for the desired version from its link in the Downloads section.

2. Unzip the downloaded file into a work directory.

 

 

How to Use

KEDDY to test a single gene set

 

To display the command line options, run the .jar file without arguments:

 

$ java -jar RunKEDDY_release.jar

 

Available options are as follows:

 

-d      Tab-delimited input data file. Each row is a variable and each column is a sample.

        The first row contains sample names and the first column contains variable names. Missing values are not allowed, and variable names must be unique.

-t      Tab-delimited text file with target gene names.

-i      List of available interactions in BIOGRID format.

-c      Tab-delimited text file of sample class specification. The number of sample class labels must match the number of samples.

-pm     Number of random permutations.

-wt     Wall time limit in hours. The permutation process terminates when this limit expires.

-o      Name of the result output file.

-no     Name of the output file that records the likelihood of each interaction-specific network for each sample class.
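
The format rules stated for -d and -c can be checked before starting a long permutation run. The following is a minimal sketch, not part of the KEDDY distribution: the class name DataFileCheck and its method are hypothetical, and the code assumes that the header row either begins with a corner cell or starts directly with the sample names, since the description does not say which.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the KEDDY distribution.
class DataFileCheck {

    /**
     * Checks tab-delimited lines against the rules stated for -d and -c.
     * classLabelCount is the number of labels found in the class file.
     * Returns readable problem descriptions; an empty list means none were found.
     */
    static List<String> check(List<String> lines, int classLabelCount) {
        List<String> problems = new ArrayList<>();
        if (lines.size() < 2) {
            problems.add("expected a header row and at least one data row");
            return problems;
        }
        // Limit -1 keeps trailing empty fields, so a missing last value is not dropped.
        int headerFields = lines.get(0).split("\t", -1).length;
        int rowFields = lines.get(1).split("\t", -1).length;
        int samples = rowFields - 1;
        // Assumption: the header has a corner cell or starts directly with sample names.
        if (headerFields != rowFields && headerFields != samples) {
            problems.add("header has " + headerFields + " fields but data rows have " + rowFields);
        }
        if (samples != classLabelCount) {
            problems.add(samples + " samples but " + classLabelCount + " class labels");
        }
        Set<String> names = new HashSet<>();
        for (int i = 1; i < lines.size(); i++) {
            String[] fields = lines.get(i).split("\t", -1);
            if (fields.length != rowFields) {
                problems.add("row " + i + ": " + fields.length + " fields, expected " + rowFields);
            }
            if (!names.add(fields[0])) {
                problems.add("row " + i + ": duplicate variable name " + fields[0]);
            }
            for (int j = 1; j < fields.length; j++) {
                if (fields[j].trim().isEmpty()) {
                    problems.add("row " + i + ", column " + j + ": missing value");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java DataFileCheck <data file> <number of class labels>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        List<String> problems = check(lines, Integer.parseInt(args[1]));
        if (problems.isEmpty()) {
            System.out.println("no problems found");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

The check only covers the layout rules listed above. It does not verify that values are numeric or that gene names match the target gene file.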

 

Example command:

 

$ java -jar RunKEDDY_release.jar -d TCGA_GBM_Z2ND.txt -t MSigDB_targetGene_BIOCARTA_41BB_PATHWAY.txt -i MSigDB_knownInteraction_BIOCARTA_41BB_PATHWAY.txt -c Mesenchymal.txt -pm 1000 -wt 1 -o BIOCARTA_41BB_PATHWAY_result.txt -no BIOCARTA_41BB_PATHWAY_network.txt

 

KEDDY to test MSigDB gene sets

 

To display the command line options, run the .jar file without arguments:

 

$ java -jar RunKEDDYMSigDB_release.jar

 

Available options are as follows:

 

-df     Input data file.

-cf     Sample class specification file.

-bm     BIOGRID_MSigDB pathway summary file (provided).

-mp     Maximum size of a gene set to be tested.

-sc     Skip criterion, either "amount" or "ratio". With "amount", KEDDY skips gene sets with fewer known interactions than the given threshold.

        With "ratio", KEDDY skips gene sets whose interactions-to-genes ratio is lower than the given threshold.

-st     Threshold value for the interaction amount or ratio.

-ke     Number of random permutations.

-wt     Wall time limit in hours. The permutation process terminates when this limit expires.

-no     "true" or "false". If "true", a file is written with the likelihood of each interaction-specific network for each sample class, for every tested gene set.
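
The effect of -sc and -st can be illustrated with a short sketch. It is written only from the option descriptions above and is not taken from the KEDDY source: the class GeneSetFilter and its method are hypothetical, the comparison is assumed to be strict (a gene set exactly at the threshold is kept), and an empty gene set is assumed to be skipped.

```java
// Hypothetical illustration of the -sc / -st rule; not taken from the KEDDY source.
class GeneSetFilter {

    /**
     * Returns true if a gene set would be skipped under the described rule.
     * criterion is the -sc value ("amount" or "ratio"); threshold is the -st value.
     */
    static boolean shouldSkip(String criterion, double threshold, int genes, int interactions) {
        if ("amount".equals(criterion)) {
            // Skip when there are fewer known interactions than the threshold.
            return interactions < threshold;
        }
        if ("ratio".equals(criterion)) {
            // Assumption: a gene set with no genes is always skipped.
            if (genes == 0) {
                return true;
            }
            // Skip when the interactions-to-genes ratio is below the threshold.
            return (double) interactions / genes < threshold;
        }
        throw new IllegalArgumentException("criterion must be \"amount\" or \"ratio\": " + criterion);
    }

    public static void main(String[] args) {
        // With -sc amount -st 10: 9 interactions are too few, 10 are enough.
        System.out.println(shouldSkip("amount", 10, 20, 9));   // true
        System.out.println(shouldSkip("amount", 10, 20, 10));  // false
        // With -sc ratio -st 0.5: 8 interactions over 20 genes give a ratio of 0.4.
        System.out.println(shouldSkip("ratio", 0.5, 20, 8));   // true
    }
}
```

Under these assumptions, "amount" favors large, well-annotated gene sets, while "ratio" normalizes by gene set size so that small but densely connected gene sets are not excluded.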

 

Example command:

 

$ java -jar RunKEDDYMSigDB_release.jar -df TCGA_GBM_Z2ND.txt -cf Mesenchymal.txt -bm BIOGRID3.4_MSigDB5.2_PathwaySummary.txt -mp 50 -sc amount -st 10 -ke 1000 -wt 1 -no true

 

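Note: If the program fails with an out-of-memory error, use the -Xmx option of the Java virtual machine to increase the heap space available to the program. The option is placed before -jar; for example, java -Xmx8g -jar RunKEDDYMSigDB_release.jar followed by the usual options allows a heap of up to 8 GB.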
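
To confirm how much heap the Java virtual machine actually grants for a given -Xmx setting, a small standalone class can print the value reported by Runtime.maxMemory(). This is a sketch for diagnosis only; the class name HeapCheck is hypothetical and the class is not part of KEDDY.

```java
// Hypothetical standalone check, not part of KEDDY.
class HeapCheck {

    /** Maximum amount of heap memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Maximum heap: " + maxHeapMiB() + " MiB");
    }
}
```

After compiling with javac HeapCheck.java, running java -Xmx4g HeapCheck should report a value close to 4096 MiB. The exact figure can be somewhat lower depending on the JVM and garbage collector.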

 

 

Example Input Files

 

Data file: 

 

Sample class specification file: 

 

Target gene set file: 

 

Known interaction file for the target gene set: