Read and Print a Tab Separated File Java
jvarkit
Sam2Tsv
Prints the SAM alignments as a TAB delimited file.
Usage
Usage: sam2tsv [options] Files
  Options:
    -h, --help
      print help and exit
    --helpFormat
      What kind of help. One of [usage,markdown,xml].
    -o, --output
      Output file. Optional. Default: stdout
    -R, --reference
      Indexed fasta Reference file. This file must be indexed with samtools
      faidx and with picard CreateSequenceDictionary
    --regions
      Limit analysis to this interval. A source of intervals. The following
      suffixes are recognized: vcf, vcf.gz, bed, bed.gz, gtf, gff, gff.gz,
      gtf.gz. Otherwise it could be an empty string (no interval) or a list
      of plain intervals separated by '[ \t\n;,]'
    -N, --skip-N
      Skip 'N' operator
      Default: false
    --validation-stringency
      SAM Reader Validation Stringency
      Default: LENIENT
      Possible Values: [STRICT, LENIENT, SILENT]
    --version
      print version and exit
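The --regions option accepts a plain list of intervals split on the separator class '[ \t\n;,]'. The sketch below only illustrates how such a list would be tokenized with that character class. The class name RegionListSketch, the method splitIntervals and the "ref:1-20" interval notation are assumptions made for illustration; this is not jvarkit's own parsing code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jvarkit: splits an interval list on the
// separator class quoted in the usage text, '[ \t\n;,]'.
class RegionListSketch {
    static List<String> splitIntervals(String regions) {
        List<String> out = new ArrayList<>();
        for (String token : regions.split("[ \t\n;,]")) {
            if (!token.isEmpty()) { // consecutive separators yield empty tokens
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The "contig:start-end" notation is an assumed example format.
        System.out.println(splitIntervals("ref:1-20;ref2:5-10, ref:30-40"));
    }
}
```

An empty string yields an empty list, which corresponds to the "no interval" case mentioned in the usage text.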
Keywords
- sam
- bam
- table
- cram
- tsv
See also in Biostars
- https://www.biostars.org/p/157232
- https://www.biostars.org/p/59647
- https://www.biostars.org/p/253828
- https://www.biostars.org/p/264875
- https://www.biostars.org/p/277493
Compilation
Requirements / Dependencies
- java compiler SDK 11. Please check that this java is in the ${PATH}. Setting JAVA_HOME is not enough (e.g: https://github.com/lindenb/jvarkit/issues/23 )
Download and Compile
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ ./gradlew sam2tsv
The java jar file will be installed in the dist directory.
Creation Date
20170712
Source code
https://github.com/lindenb/jvarkit/tree/master/src/main/java/com/github/lindenb/jvarkit/tools/sam2tsv/Sam2Tsv.java
Unit Tests
https://github.com/lindenb/jvarkit/tree/master/src/test/java/com/github/lindenb/jvarkit/tools/sam2tsv/Sam2TsvTest.java
Contribute
- Issue Tracker: http://github.com/lindenb/jvarkit/issues
- Source Code: http://github.com/lindenb/jvarkit
License
The project is licensed under the MIT license.
Citing
Should you cite sam2tsv? https://github.com/mr-c/shouldacite/blob/master/should-I-cite-this-software.md
The current reference is:
http://dx.doi.org/10.6084/m9.figshare.1425030
Lindenbaum, Pierre (2015): JVarkit: java-based utilities for Bioinformatics. figshare. http://dx.doi.org/10.6084/m9.figshare.1425030
Example
$ java -jar dist/sam2tsv.jar -R src/test/resources/toy.fa src/test/resources/toy.bam

#Read-Name  Flag  MAPQ  CHROM  READ-POS0  READ-BASE  READ-QUAL  REF-POS1  REF-BASE  CIGAR-OP
r001        163   30    ref    0          T          .          7         T         M
r001        163   30    ref    1          T          .          8         T         M
r001        163   30    ref    2          A          .          9         A         M
r001        163   30    ref    3          G          .          10        G         M
r001        163   30    ref    4          A          .          11        A         M
r001        163   30    ref    5          T          .          12        T         M
r001        163   30    ref    6          A          .          13        A         M
r001        163   30    ref    7          A          .          14        A         M
r001        163   30    ref    8          A          .          .         .         I
r001        163   30    ref    9          G          .          .         .         I
r001        163   30    ref    10         A          .          .         .         I
r001        163   30    ref    11         G          .          .         .         I
r001        163   30    ref    12         G          .          15        G         M
r001        163   30    ref    13         A          .          16        A         M
r001        163   30    ref    14         T          .          17        T         M
r001        163   30    ref    15         A          .          18        A         M
r001        163   30    ref    .          .          .          19        G         D
r001        163   30    ref    16         C          .          20        C         M
r001        163   30    ref    17         T          .          21        T         M
r001        163   30    ref    18         G          .          22        G         M
r002        0     30    ref    0          A          .          8         T         S
r002        0     30    ref    1          A          .          .         .         I
r002        0     30    ref    2          A          .          .         .         I
r002        0     30    ref    3          A          .          9         A         M
r002        0     30    ref    4          G          .          10        G         M
r002        0     30    ref    5          A          .          11        A         M
r002        0     30    ref    6          T          .          12        T         M
r002        0     30    ref    7          A          .          13        A         M
r002        0     30    ref    8          A          .          14        A         M
r002        0     30    ref    .          .          .          .         .         P
r002        0     30    ref    9          G          .          .         .         I
r002        0     30    ref    .          .          .          .         .         P
r002        0     30    ref    10         G          .          .         .         I
r002        0     30    ref    11         G          .          15        G         M
r002        0     30    ref    12         A          .          16        A         M
(...)
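Because each output row pairs one read base with one reference base, the table is easy to post-process. The following is a minimal sketch, assuming the ten-column, tab-delimited layout given by the header line in the example above. The class Sam2TsvRowSketch, its method mismatches and the row named rX are hypothetical and exist only for this illustration; only the M operator is inspected here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: column order is taken from the "#Read-Name ... CIGAR-OP" header.
class Sam2TsvRowSketch {
    static final int READ_NAME = 0, READ_BASE = 5, REF_POS1 = 7, REF_BASE = 8, CIGAR_OP = 9;

    /** Lists "readName:refPos ref>read" for M rows whose read and reference bases differ. */
    static List<String> mismatches(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) continue; // skip the header
            String[] f = line.split("\t");
            if (f.length < 10) continue; // not a complete sam2tsv row
            boolean aligned = f[CIGAR_OP].equals("M");
            if (aligned && !f[READ_BASE].equalsIgnoreCase(f[REF_BASE])) {
                out.add(f[READ_NAME] + ":" + f[REF_POS1] + " " + f[REF_BASE] + ">" + f[READ_BASE]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = List.of(
            "#Read-Name\tFlag\tMAPQ\tCHROM\tREAD-POS0\tREAD-BASE\tREAD-QUAL\tREF-POS1\tREF-BASE\tCIGAR-OP",
            "r001\t163\t30\tref\t0\tT\t.\t7\tT\tM",  // match, not reported
            "r002\t0\t30\tref\t0\tA\t.\t8\tT\tS",    // soft clip, ignored
            "rX\t0\t30\tref\t3\tC\t.\t9\tA\tM");     // made-up mismatch row
        mismatches(rows).forEach(System.out::println);
    }
}
```

With the three sample rows, only the made-up rX row is reported. Insertions, deletions, padding and soft clips are skipped because they do not carry an aligned read/reference base pair.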
Example 2
sam2tsv can read data from a linux pipe.
samtools view -h input.bam | java -jar dist/sam2tsv.jar
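The output side can be piped as well. As a rough sketch, a small downstream program could read the sam2tsv table from standard input and count rows per CIGAR operator. The class name CigarOpTally and its tally method are hypothetical, and the code assumes the ten-column tab-delimited layout with CIGAR-OP as the last column.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical downstream consumer, not part of jvarkit.
class CigarOpTally {
    /** Counts rows per CIGAR-OP (last column) in tab-delimited sam2tsv output. */
    static Map<String, Integer> tally(Reader in) {
        Map<String, Integer> counts = new TreeMap<>();
        new BufferedReader(in).lines()
            .filter(line -> !line.isEmpty() && !line.startsWith("#")) // drop the header
            .forEach(line -> {
                String[] f = line.split("\t");
                if (f.length >= 10) { // ignore rows that are not complete
                    counts.merge(f[f.length - 1], 1, Integer::sum);
                }
            });
        return counts;
    }

    public static void main(String[] args) {
        // Reads whatever is piped in, e.g. the stdout of sam2tsv.
        tally(new InputStreamReader(System.in, StandardCharsets.UTF_8))
            .forEach((op, n) -> System.out.println(op + "\t" + n));
    }
}
```

Saved as CigarOpTally.java, it could be appended to the pipe above with `| java CigarOpTally.java`, since Java 11 can launch a single source file directly.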
Citations
Sam2tsv was cited in :
- "Illumina TruSeq Synthetic Long-Reads Empower De Novo Assembly and Resolve Complex, Highly-Repetitive Transposable Elements". McCoy RC, Taylor RW, Blauwkamp TA, Kelley JL, Kertesz M, et al. (2014) PLoS ONE 9(9): e106689. doi: 10.1371/journal.pone.0106689 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0106689
- "High-Throughput Identification of Genetic Variation Impact on pre-mRNA Splicing Efficiency". Scott I Adamson, Lijun Zhan, Brenton R Graveley. doi: https://doi.org/10.1101/191122.
- "Linkage of A-to-I RNA editing in metazoans and the impact on genome evolution" Molecular Biology and Evolution, msx274, https://doi.org/10.1093/molbev/msx274
- "Vex-seq: high-throughput identification of the impact of genetic variation on pre-mRNA splicing efficiency" Genome Biology 2018 19:71 https://doi.org/10.1186/s13059-018-1437-x
- "Accurate detection of m6A RNA modifications in native RNA sequences" Huanle Liu, Oguzhan Begik, Morghan C Lucas, Christopher E Mason, Schraga Schwartz, John S Mattick, Martin A Smith, Eva Maria Novoa bioRxiv 525741; doi: https://doi.org/10.1101/525741
- "DART-seq: an antibody-free method for global m6A detection" Nature Methods https://doi.org/10.1038/s41592-019-0570-0
- "Thiouridine-to-Cytidine Conversion Sequencing (TUC-Seq) to Measure mRNA Transcription and Degradation Rates" The Eukaryotic RNA Exosome. Nov 2019. https://doi.org/10.1007/978-1-4939-9822-7_10
- "Evolutionary forces on A-to-I RNA editing revealed by sequencing individual honeybee drones". Yuange Duan, Shengqian Dou, Jiaxing Huang, Eli Eisenberg, Jian Lu. 2020. https://doi.org/10.1101/2020.01.15.907287
- "Sci-fate characterizes the dynamics of gene expression in single cells" Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0480-9
- "Mutations in virus-derived small RNAs" Nigam, Deepti; LaTourrette, Katherine; Garcia-Ruiz, Hernan. Scientific Reports (Nature Publisher Group); London Vol. 10, Iss. 1, (2020). DOI:10.1038/s41598-020-66374-2
- Qiu, Q., Hu, P., Qiu, X. et al. Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat Methods (2020). https://doi.org/10.1038/s41592-020-0935-4
- FUJIKURA, K. et al. Multiregion whole-exome sequencing of intraductal papillary mucinous neoplasms reveals frequent somatic KLF4 mutations predominantly in low-grade regions. Gut, [s. l.], 2020. DOI 10.1136/gutjnl-2020-321217
- Gao, Y., Liu, X., Wu, B. et al. Quantitative profiling of N6-methyladenosine at single-base resolution in stem-differentiating xylem of Populus trichocarpa using Nanopore direct RNA sequencing. Genome Biol 22, 22 (2021). https://doi.org/10.1186/s13059-020-02241-7
- Liu H., Begik O., Novoa E.M. (2021) EpiNano: Detection of m6A RNA Modifications Using Oxford Nanopore Direct RNA Sequencing. In: McMahon M. (eds) RNA Modifications. Methods in Molecular Biology, vol 2298. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1374-0_3
- Yang & al. "Sequencing 5-Formyluracil in Genomic DNA at Single-Base Resolution" (2021) Analytical Chemistry doi: 10.1021/acs.analchem.1c03339
Source: http://lindenb.github.io/jvarkit/Sam2Tsv.html