Gene name error scanner webservice

Over the past few weeks, we've had a lot of feedback about our paper describing the sorry state of Excel auto-correct errors in supplementary spreadsheet files.

In our group, we've discussed a number of ways to minimise these errors in future. One suggestion was to publish a webservice that lets reviewers and editors upload spreadsheets and scan them for gene name errors. So that's what I did. I took some basic file upload code in PHP and customised it so that it runs the shell script described in the paper. You can access the webservice here. We've been testing it for a few days and it seems to work fine, except for the auto-generated email, which I presume is being blocked by our IT group.
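
To give a feel for what such a scan looks for, here is a minimal Python sketch. It is not the shell script from the paper; the two patterns and the function name find_suspect_cells are my own assumptions. It flags cell values that look like dates (e.g. SEPT2 turned into "2-Sep") or like numbers in scientific notation (e.g. a RIKEN identifier such as 2310009E13 turned into "2.31E+13").

```python
import re

# Assumed patterns for illustration only; the paper's shell script may differ.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(rf"^\d{{1,2}}-(?:{MONTHS})$", re.IGNORECASE)  # e.g. 2-Sep
SCI_LIKE = re.compile(r"^\d\.\d{2}E\+\d{2}$")                        # e.g. 2.31E+13


def find_suspect_cells(cells):
    """Return the cell values that look like Excel-converted gene names."""
    suspects = []
    for cell in cells:
        value = cell.strip()
        if DATE_LIKE.match(value) or SCI_LIKE.match(value):
            suspects.append(value)
    return suspects


# Hypothetical gene name column from a supplementary table.
column = ["TP53", "2-Sep", "1-Mar", "BRCA1", "2.31E+13", "MARCH1"]
print(find_suspect_cells(column))
```

A real scan would also need to read every sheet of the uploaded workbook and work out which columns hold gene names, which this sketch leaves out.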


The code for the webservice is up on GitHub, so you can modify it and host another instance if you want. The code should run on Ubuntu machines that can run Apache2, PHP and the other dependencies. The GitHub repo contains an install.sh that should install the dependencies and configure the webpage in more or less one command. You will need to modify php.ini to allow upload file sizes larger than the default, as in the following example:

upload_max_filesize = 50M
post_max_size = 50M
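
If you want to check whether a given spreadsheet would fit under those limits before uploading, something like the following Python sketch could help. The helper names php_size_to_bytes and upload_fits are hypothetical, and the check is only approximate because it ignores the small overhead of the POST request around the file. It relies on PHP's shorthand notation, where K, M and G are multiples of 1024.

```python
def php_size_to_bytes(value):
    """Convert a php.ini shorthand size such as '50M' to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = value.strip()
    suffix = value[-1:].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)  # plain number of bytes, no suffix


def upload_fits(file_bytes, upload_max_filesize="50M", post_max_size="50M"):
    """Approximate check: is the file within the smaller of the two limits?"""
    limit = min(php_size_to_bytes(upload_max_filesize),
                php_size_to_bytes(post_max_size))
    return file_bytes <= limit


print(php_size_to_bytes("50M"))       # 52428800
print(upload_fits(60 * 1024 ** 2))    # False: a 60 MiB file exceeds 50M
```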

I would like to thank the other contributors to this tool: Yotam Eren, Antony Kaspi and Assam El-Osta.
