Scalar Functions
Scalar functions are those that take one or more inputs and return a single value. The following scalar functions are supported.
- Sequence Scalar Functions
- Alignment Scalar Functions
- SAM Flags Scalar Functions
- SAM CIGAR Scalar Functions
- GFF Scalar Functions
Sequence Scalar Functions
These functions work on sequences. They either return the operated upon sequence or some scalar value (e.g. gc_content
).
complement
Calculate the complement of a sequence.
Examples
SELECT complement('ATCG') AS complement;
complement |
---|
TAGC |
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to complement. |
gc_content
Calculate the GC content of a sequence.
Examples
SELECT gc_content('ATCG') AS gc_content;
gc_content |
---|
0.5 |
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to calculate. |
reverse_complement
Calculate the reverse complement of a sequence.
Examples
SELECT reverse_complement('ATCG') AS reverse_complement;
reverse_complement |
---|
CGAT |
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to reverse complement. |
transcribe
Transcribe a DNA sequence to RNA.
Examples
SELECT transcribe('ATCG') AS transcribed;
transcribed |
---|
AUCG |
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to transcribe. |
translate_dna_to_aa
Translate a DNA sequence to protein. This will throw an error if the sequence is not a multiple of 3 or has codons that are not in the standard genetic code.
Examples
Base case to translate a single codon.
SELECT translate_dna_to_aa('ATG') AS aa;
aa |
---|
M |
Error on invalid sequence length.
SELECT translate_dna_to_aa('ATGG');
-- Error: Invalid Input Error: Invalid sequence length: 4
Error on invalid codon.
SELECT translate_dna_to_aa('XXX');
-- Error: Invalid Input Error: Invalid codon: XXX
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to translate. |
reverse_transcribe
Reverse transcribe an RNA sequence to DNA.
Examples
SELECT reverse_transcribe('AUCG') AS reverse_transcribed;
reverse_transcribed |
---|
ATCG |
Arguments
Name | Type | Description |
---|---|---|
sequence | varchar | The sequence to reverse transcribe. |
Alignment Scalar Functions
These functions can be used to generate CIGAR strings describing an alignement between two sequences or operate on that CIGAR string in some way (e.g. extract a subsequence).
alignment_string
and alignment_string_wfa_gap_affine
Calculate the alignment string between two sequences using the WFA algorithm with affine gap penalties.
Examples
SELECT alignment_string_wfa_gap_affine('ATCGG', 'ATCG') AS alignment_string;
alignment_string |
---|
4M1I |
Arguments
Name | Type | Description |
---|---|---|
reference | varchar | The first sequence to align. |
query | varchar | The second sequence to align. |
match | int | The match score. |
mismatch | int | The mismatch score. |
gap_open | int | The gap open penalty. |
gap_extend | int | The gap extend penalty. |
memory_mode | varchar | The memory mode to use. |
alignment_score
and alignment_score_wfa_gap_affine
Calculate the alignment score between two sequences using the WFA algorithm with affine gap penalties.
Examples
SELECT alignment_score_wfa_gap_affine('ATCGG', 'ATCG') AS alignment_score;
alignment_score |
---|
-8.0 |
Arguments
Same as alignment_string
.
extract_from_cigar
Extract a subsequence from a sequence given a CIGAR string. This will trim outside insertions, and return a struct with the subsequence and position in the original sequence.
Examples
SELECT extract_from_cigar('ATCG', '4M') AS subsequence;
subsequence |
---|
{'sequence_start': 0, 'sequence_end': 4, 'sequence': ATCG} |
SELECT extract_from_cigar('BANANA', '3I3M') AS subsequence;
subsequence |
---|
{'sequence_start': 3, 'sequence_end': 6, 'sequence': ANA} |
SAM Flags Scalar Functions
These functions work on the SAM flags field and return true if the flag is set.
is_duplicate
Returns true if the read is a PCR or optical duplicate.
Examples
SELECT is_duplicate(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_first_segment
Returns true if the read is the first segment in a template.
Examples
SELECT is_first_segment(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_last_segment
Returns true if the read is the last segment in a template.
Examples
SELECT is_last_segment(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_mate_reverse_complemented
Returns true if the mate is reverse complemented.
Examples
SELECT is_mate_reverse_complemented(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_mate_unmapped
Returns true if the mate is unmapped.
Examples
SELECT is_mate_unmapped(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_properly_aligned
Returns true if the read is properly aligned.
Examples
SELECT is_properly_aligned(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_quality_control_failed
Returns true if the read failed platform/vendor quality checks.
Examples
SELECT is_quality_control_failed(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_reverse_complemented
Returns true if the read is reverse complemented.
Examples
SELECT is_reverse_complemented(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_secondary
Returns true if the read is not the primary alignment for the segment.
Examples
SELECT is_secondary(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_segmented
Returns true if the read is part of a segmented alignment.
Examples
SELECT is_segmented(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_supplementary
Returns true if the read is part of a supplementary alignment.
Examples
SELECT is_supplementary(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
is_unmapped
Returns true if the read is unmapped.
Examples
SELECT is_unmapped(4095);
-- true
Arguments
Name | Type | Description |
---|---|---|
flags | int | The SAM flags to parse. |
SAM CIGAR Scalar Functions
The functions take in a CIGAR string.
parse_cigar
Parse a CIGAR string into a list of operations.
Examples
Decompose a CIGAR string into its operations.
SELECT parse_cigar('10M2I3D') AS ops;
ops |
---|
[{'op': M, 'len': 10}, {'op': I, 'len': 2}, {'op': D, 'len': 3}] |
Arguments
Name | Type | Description |
---|---|---|
cigar | varchar | The CIGAR string to parse. |