How to upload your data to Plastisipi using FileZilla

  1. Download FileZilla from https://filezilla-project.org/
  2. Enter sftp://lbcd41.snv.jussieu.fr as the host. Be sure to include the sftp:// prefix.
  3. Enter your Galaxy email identifier and your password (you must already be registered on Plastisipi).
  4. Set the port to 2121.
  5. In the “Local site” window, navigate to the file you want to upload and select it.
  6. Drag and drop it into the “Remote site” window.

 

  • Then go to your Galaxy session and click on the “upload file from your computer” tool in the “Get data” section.
  • Your uploaded files should now show up in the FTP checklist. Select them and run the tool to import the files into your current history.
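
For users who prefer a terminal to FileZilla, the same connection settings (host, port 2121, Galaxy email as login) can be expressed as a command line. The sketch below only builds the command and does not connect. It assumes that the OpenSSH sftp client is installed and that the server accepts it, which is not stated above; the function name and the email address are placeholders.

```python
import shlex


def build_sftp_command(email, host="lbcd41.snv.jussieu.fr", port=2121):
    """Return an OpenSSH sftp command line for the settings listed above.

    The login is passed with "-o User=..." because the Galaxy identifier is
    an email address and already contains an "@" character.
    """
    return ["sftp", "-P", str(port), "-o", f"User={email}", host]


# Hypothetical account, for illustration only.
command = build_sftp_command("jane.doe@example.org")
print(shlex.join(command))
```

Once connected, files are sent with the put command of the sftp prompt, after which they should appear in the Galaxy upload tool as described above.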

Launch of the GED bowtie wrapper

The test period of the “GED bowtie small RNA oriented” tool is now over.

The following tools are deprecated and have been removed from the Mississippi server:

  • Squashes Bowtie index and Aligns on fasta reference with -M option
  • match fasta reads on RNA reference index for miRNA or tRNA profiling
  • match fasta reads on DNA using bowtie -k opt Annotation by bowtie cascading
  • match fasta reads on DNA with -M 1 bowtie option for smRNA profiling (DNA target)
  • match fasta unique read mappers with -m 1 bowtie option (DNA target)
  • Bowtie to SAM option -M1 --best --strata

Contact us if you need help updating your workflows.

Storage procedure for RNAseq fastq files in Galaxy data libraries

RNAseq fastq files will now be converted to BAM format before storage in the Galaxy user data libraries.
As a consequence, depending on what you want to do, using these files may require you to

  1. Convert the BAM file to a SAM file, using the “BAM-to-SAM converts BAM format to SAM format” tool.
  2. Convert the resulting SAM file to a fastq file, using the “SAM to FASTQ creates a FASTQ file” Picard tool.
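
To make the second step less of a black box, here is a minimal sketch of what a SAM to fastq conversion does for single-end reads. It is not the Picard implementation: the function name is hypothetical, and the handling of reverse-strand reads (restoring the original read orientation) and the skipping of secondary and supplementary alignments are assumptions made for this illustration.

```python
# Translation table used to complement a nucleotide sequence.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq(sam_lines):
    """Yield one fastq record (4 lines, as a single string) per SAM read."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) alignment
        if flag & 0x10:
            # Read aligned to the reverse strand: SAM stores it
            # reverse-complemented, so restore the original orientation.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        yield f"@{name}\n{seq}\n+\n{qual}\n"


sam = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchr2L\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tchr2L\t200\t255\t4M\t*\t0\t0\tAACG\tABCD\n",
]
print("".join(sam_to_fastq(sam)), end="")
```

Because the BAM file keeps the read name, sequence and quality string of each read, the fastq content can be recovered this way when a tool needs fastq input.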

New Genome Annotation workflow

The former Genome Annotation Cascade workflow has been removed from the shared workflows and a new version is now available in this section.

The new workflow features are

  • A completely refactored and much faster histogram procedure for small RNA size distribution
  • A sample variable that the user can set at workflow runtime and that propagates both to the names of the datasets in the history and to the PDF graphs.

Major Dmel update / Galaxy update

The Galaxy instance has just been upgraded.

Moreover, the Drosophila melanogaster genome (and sub-libraries) was updated to 5.49 (previously 5.43) this past weekend.

Consequently, all workflows based on releases prior to 5.49 have to be updated too. Otherwise, you will get errors.

Sorry for the inconvenience. We are not planning to upgrade the Dmel genome files more than once a year.

New version of the tool “Extract a region”!

Check it out!

The “Extract a region from a bowtie output” tool now filters on genomic/item coordinates AND/OR on read size!

A future version will provide the option of outputting the filtered reads in either bowtie format or fasta format.
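
The combined filter can be pictured with the short sketch below. It is not the code of the tool: the function name and parameters are hypothetical, the input is assumed to be the default tab-separated bowtie output (read name, strand, reference name, 0-based offset, sequence, ...), and coordinates are assumed to be 1-based and inclusive, with a read kept only if it lies entirely inside the region.

```python
def filter_bowtie_hits(lines, item=None, start=None, end=None,
                       min_size=None, max_size=None):
    """Yield bowtie output lines that pass every criterion that is set.

    Criteria left as None are ignored, so coordinates and read size can be
    used together or separately.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a bowtie alignment line
        reference, offset, sequence = fields[2], int(fields[3]), fields[4]
        size = len(sequence)
        first = offset + 1      # 1-based leftmost position of the read
        last = offset + size    # 1-based rightmost position of the read
        if item is not None and reference != item:
            continue
        if start is not None and first < start:
            continue
        if end is not None and last > end:
            continue
        if min_size is not None and size < min_size:
            continue
        if max_size is not None and size > max_size:
            continue
        yield line


hits = [
    "r1\t+\tTE1\t9\t" + "A" * 21 + "\t" + "I" * 21 + "\t0\t",
    "r2\t-\tTE1\t99\t" + "C" * 25 + "\t" + "I" * 25 + "\t0\t",
    "r3\t+\tTE2\t9\t" + "G" * 21 + "\t" + "I" * 21 + "\t0\t",
]
# Reads of 21 nt located in the first 50 nt of the (hypothetical) item TE1.
for hit in filter_bowtie_hits(hits, item="TE1", start=1, end=50,
                              min_size=21, max_size=21):
    print(hit.split("\t")[0])
```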

Updated Ensembl Transposon set

The previous Ensembl Transposon set was missing the TART-A, TART-B and TART-C sequences, whose headers were merged with the Tirant header.

We have fetched the corresponding sequences and added them to the bowtie index available through Galaxy.

Sorry for the inconvenience. If you previously used the Ensembl_transposon_set bowtie index, you should re-run your analyses.

NEW tool “Read count map”

Today, we are releasing the new tool “Read count map”, which replaces both “Compute Plot table for small RNA mapping” AND “Plot Read Number map Using R”.

Indeed, both of these previous tools are now embedded in “Read count map”, which produces both the data frame of small RNA read count mapping and the PDF plot.

IMPORTANT

Note that the Python code of the deprecated “Compute Plot table for small RNA mapping” tool had a minor bug that could affect the read count maps for reads matching the reverse strand. The new version fixes this bug. Therefore, all users who previously mapped reads using “Compute Plot table for small RNA mapping” should check the impact of the bug fix on their maps.

Clarification on “Compute Overlap Signature” tools

There are currently 3 tools to compute overlap signatures in GED Galaxy.

Two of them are extensively tested and suitable for production.
Of note, if the bowtie index used for read matching contains more than one item, both tools will calculate a matrix of paired reads (with overlap on the antisense strand varying from 1 to 28 nt) for each item, and will then sum up the signatures over all items. So, for instance, if the bowtie index contains a big transposon and a miRNA locus, the net signature will be attenuated. Keep this in mind. On the other hand, the ability to sum up the signatures from the various items in the bowtie index is useful: for instance, you can compute the “global” signature of all transposons using the all-transposons set from bowtie.
For both tools, you have to use the tool “Plot overlap signature with z-score” to output the signature graph.
The third tool, “smRNA_diagnostic”, belongs to the new Object-Oriented tool generation. It is useful but still under test/development. In particular, the interface of the output is still cryptic…
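
The per-item computation followed by summing over items can be sketched as follows. This is an illustration under stated assumptions, not the code of any of the three tools: the function names are hypothetical, a pair is counted when a reverse-strand read ends at position p + n - 1 for a forward-strand read starting at position p (overlap of n nt), and the z-score is taken over the 28 overlap values using the population standard deviation.

```python
from collections import Counter
from statistics import mean, pstdev


def item_signature(forward_starts, reverse_ends, min_overlap=1, max_overlap=28):
    """Count forward/reverse read pairs for each overlap size on one item.

    forward_starts: leftmost positions of the forward-strand reads.
    reverse_ends: rightmost positions of the reverse-strand reads.
    """
    reverse_counts = Counter(reverse_ends)
    signature = {n: 0 for n in range(min_overlap, max_overlap + 1)}
    for start in forward_starts:
        for n in signature:
            signature[n] += reverse_counts.get(start + n - 1, 0)
    return signature


def summed_signature(items, min_overlap=1, max_overlap=28):
    """Sum the per-item signatures over all items of the index."""
    total = {n: 0 for n in range(min_overlap, max_overlap + 1)}
    for forward_starts, reverse_ends in items.values():
        signature = item_signature(forward_starts, reverse_ends,
                                   min_overlap, max_overlap)
        for n, count in signature.items():
            total[n] += count
    return total


def z_scores(signature):
    """Convert pair counts to z-scores across overlap sizes."""
    values = list(signature.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {n: 0.0 for n in signature}
    return {n: (count - mu) / sigma for n, count in signature.items()}


# Hypothetical index with two items: a transposon with a 10-nt overlap
# signal and a miRNA locus contributing pairs at another overlap size.
items = {
    "transposon": ([100, 100, 200], [109, 109, 209]),
    "miRNA_locus": ([50], [70]),
}
total = summed_signature(items)
print(total[10], total[21])
print(round(z_scores(total)[10], 2))
```

In this toy example the pairs contributed by the second item land on a different overlap size, which raises the mean and spread of the summed counts and illustrates how mixing unrelated items can attenuate the net signature.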

2 new tools to visualize read size distributions of numerous items in a library

Today we are releasing 2 new tools that help to visualize the read size distribution of families of items (say all transposons, all tRNAs, etc.) in a sequence dataset.

  • Compute read size distributions of + and – reads from a bowtie standard output
    This tool has the same name as the previous version, yet it is computing a dataframe with a completely different structure.
  • Plot Lattice of histograms from item/size dataframe is the xml/R wrapper for lattice representation of histograms.

Both tools can be easily linked into a short pipeline using the workflow editor.
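
As an illustration of the kind of table the first tool computes, the sketch below counts reads per item, strand and size from a bowtie standard output. The actual structure of the dataframe produced by the tool is not documented here, so the (item, strand, size, count) layout and the function names are assumptions.

```python
from collections import Counter


def size_distribution(lines):
    """Count reads per (item, strand, size) from default bowtie output lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a bowtie alignment line
        strand, item, size = fields[1], fields[2], len(fields[4])
        counts[(item, strand, size)] += 1
    return counts


def as_rows(counts):
    """Flatten the counts into sorted (item, strand, size, count) rows."""
    return [(item, strand, size, n)
            for (item, strand, size), n in sorted(counts.items())]


bowtie_output = [
    "r1\t+\tTE1\t9\t" + "A" * 21 + "\t" + "I" * 21 + "\t0\t",
    "r2\t+\tTE1\t40\t" + "T" * 21 + "\t" + "I" * 21 + "\t0\t",
    "r3\t-\tTE1\t99\t" + "C" * 25 + "\t" + "I" * 25 + "\t0\t",
    "r4\t+\ttRNA\t3\t" + "G" * 30 + "\t" + "I" * 30 + "\t0\t",
]
for row in as_rows(size_distribution(bowtie_output)):
    print("\t".join(str(value) for value in row))
```

A table with one row per item, strand and size is a convenient input for a lattice plot, where each item gets its own histogram panel.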

In the future, as previously done for lattice read maps (Parse miRNA bowtie matching for lattice followed by Plot multiple Lattices for comparison), it will be possible to visualize several libraries side by side, with one library per column and one item per row.