3. The commands below will generate two CSV files in the `${KOMP_PATH}/impc_statistical_pipeline/IMPC_DRs/stats_pipeline_input_drXX.y/SP/jobs/Results_IMPC_SP_Windowed` directory for the unidimensional and categorical results. The files can be gzipped and moved to the FTP directory. You can decorate and format the files using one of the formatted files from a previous data release.
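The compress-and-move step above can be sketched as a short shell sequence. This is a minimal sketch: `RESULTS_DIR` and `FTP_DIR` below are temporary placeholder directories, and the CSV file name is hypothetical, so substitute the real `Results_IMPC_SP_Windowed` and FTP paths.

```shell
# Placeholder directories standing in for the real results and FTP paths
RESULTS_DIR=$(mktemp -d)
FTP_DIR=$(mktemp -d)

# Hypothetical result file; the pipeline writes the real CSVs here
echo "id,p_value" > "$RESULTS_DIR/unidimensional_results.csv"

# Compress each CSV in place, then move it to the FTP directory
gzip "$RESULTS_DIR/unidimensional_results.csv"
mv "$RESULTS_DIR/unidimensional_results.csv.gz" "$FTP_DIR/"

ls "$FTP_DIR"   # the gzipped file is now staged for the FTP area
```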
### Step 2. Run the Extraction of Risky Genes Pipeline
This process generates a list of risky genes to check manually.
1. Allocate a machine on the codon cluster: `bsub -M 8000 -Is /bin/bash`
2. Open an R session: `R`
***How to determine if the pipeline is finished?***<br>
When there are no jobs running under the mi_stats user and all jobs have completed successfully.<br><br>
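A minimal way to script that check, assuming Slurm's `squeue` is available on the cluster. The helper reads a job listing from stdin so the logic can be exercised without a cluster connection; on the cluster you would pipe in the live listing for the mi_stats user.

```shell
# Prints "finished" when the job listing on stdin is empty,
# "still running" otherwise.
pipeline_finished() {
  if [ "$(grep -c .)" -eq 0 ]; then
    echo "finished"
  else
    echo "still running"
  fi
}

# On the cluster, feed it the live listing (-h suppresses the header):
#   squeue -u mi_stats -h | pipeline_finished
printf '' | pipeline_finished
printf '123 job1 mi_stats R\n' | pipeline_finished
```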
***Do you expect any errors from the stats pipeline?***<br>
Yes, having a few errors is normal. If you observe more than a few errors, you may want to run the GapFilling pipeline. Refer to the Step_1.2_RunGapFillingPipeline.mp4 [video](https://www.ebi.ac.uk/seqdb/confluence/display/MouseInformatics/How+to+run+the+IMPC+statistical+pipeline). Make sure to log in to Confluence first.<br><br>
## Statistical Pipeline FAQ
***How can you determine if the statistical pipeline is still running?***<br>
The simplest method is to execute the `squeue` command. During the first 4 days of the run, fewer than 20 jobs should be running; after that, there should be 5000+ jobs running on the codon cluster.<br><br>
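The job-count heuristic above can be expressed as a tiny helper. The thresholds are the ones quoted in the answer; on the cluster the count would come from something like `squeue -h | wc -l`, shown here with hard-coded example counts so the logic runs anywhere.

```shell
# Classify the pipeline phase from the number of running jobs
pipeline_phase() {
  if [ "$1" -lt 20 ]; then
    echo "early phase"
  else
    echo "main analysis phase"
  fi
}

pipeline_phase 7      # typical of the first ~4 days
pipeline_phase 5200   # typical of the main run
```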
***When should you run the gap filling pipeline?***<br>
In very rare cases, the stats pipeline may fail for unknown reasons. To resume the pipeline from the point of failure, you can use the GapFilling pipeline. This is equivalent to running the pipeline by navigating to `<stats pipeline directory>/SP/jobs` and executing `AllJobs.bch`. Before doing so, edit `function.R` and set the parameter `onlyFillNonExistingResults` to `TRUE`. After making this change, run the pipeline by executing `./AllJobs.bch` and wait for it to fill in the missing analyses. Please note that this process may take up to 2 days.<br><br>
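The `function.R` edit can be scripted before rerunning `AllJobs.bch`. The exact assignment text inside `function.R` is an assumption here, so verify it in the real file; the sample directory and file below are stand-ins for `<stats pipeline directory>/SP/jobs`.

```shell
# Stand-in for the SP/jobs directory with a sample function.R
# (the real assignment syntax may differ -- check the actual file)
jobs_dir=$(mktemp -d)
printf 'onlyFillNonExistingResults = FALSE\n' > "$jobs_dir/function.R"

# Flip the flag so only missing results are recomputed
sed -i 's/onlyFillNonExistingResults = FALSE/onlyFillNonExistingResults = TRUE/' \
  "$jobs_dir/function.R"

cat "$jobs_dir/function.R"
# On the cluster, you would now run:  ./AllJobs.bch
```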
## Annotation Pipeline FAQ
The `IMPC_HadoopLoad` command uses the power of the cluster to assign the annotations to the StatPackets and transfers the files to the Hadoop cluster (`transfer=TRUE`). The files are transferred to `Hadoop:/hadoop/user/mi_stats/impc/statpackets/DRXX`.
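For a transfer run, the target directory can be derived from the release tag and then checked on the Hadoop side. `DR21` below is a hypothetical tag (substitute the actual `DRXX` value), and `hadoop fs -ls` is the standard HDFS listing command.

```shell
# Hypothetical release tag; substitute the actual DRXX value
dr=DR21
target="/hadoop/user/mi_stats/impc/statpackets/${dr}"
echo "$target"

# On the Hadoop cluster (only meaningful after a transfer=TRUE run):
#   hadoop fs -ls "$target"
```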
**Note**: we currently run the annotation pipeline with `transfer=FALSE`, so the files are not transferred at this stage.