Skip to main content

Scalar Functions

Scalar functions are those that take one or more inputs and return a single value. The following scalar functions are supported.

Sequence Scalar Functions

These functions work on sequences. They either return the operated upon sequence or some scalar value (e.g. gc_content).

complement

Calculate the complement of a sequence.

Examples

SELECT complement('ATCG') AS complement;
complement
TAGC

Arguments

NameTypeDescription
sequencevarcharThe sequence to complement.

gc_content

Calculate the GC content of a sequence.

Examples

SELECT gc_content('ATCG') AS gc_content;
gc_content
0.5

Arguments

NameTypeDescription
sequencevarcharThe sequence to calculate.

reverse_complement

Calculate the reverse complement of a sequence.

Examples

SELECT reverse_complement('ATCG') AS reverse_complement;
reverse_complement
CGAT

Arguments

NameTypeDescription
sequencevarcharThe sequence to reverse complement.

transcribe

Transcribe a DNA sequence to RNA.

Examples

SELECT transcribe('ATCG') AS transcribed;
transcribed
AUCG

Arguments

NameTypeDescription
sequencevarcharThe sequence to transcribe.

translate_dna_to_aa

Translate a DNA sequence to protein. This will throw an error if the sequence is not a multiple of 3 or has codons that are not in the standard genetic code.

Examples

Base case to translate a single codon.

SELECT translate_dna_to_aa('ATG') AS aa;
aa
M

Error on invalid sequence length.

SELECT translate_dna_to_aa('ATGG');
-- Error: Invalid Input Error: Invalid sequence length: 4

Error on invalid codon.

SELECT translate_dna_to_aa('XXX');
-- Error: Invalid Input Error: Invalid codon: XXX

Arguments

NameTypeDescription
sequencevarcharThe sequence to translate.

reverse_transcribe

Reverse transcribe an RNA sequence to DNA.

Examples

SELECT reverse_transcribe('AUCG') AS reverse_transcribed;
reverse_transcribed
ATCG

Arguments

NameTypeDescription
sequencevarcharThe sequence to reverse transcribe.

Alignment Scalar Functions

These functions can be used to generate CIGAR strings describing an alignement between two sequences or operate on that CIGAR string in some way (e.g. extract a subsequence).

alignment_string and alignment_string_wfa_gap_affine

Calculate the alignment string between two sequences using the WFA algorithm with affine gap penalties.

Examples

SELECT alignment_string_wfa_gap_affine('ATCGG', 'ATCG') AS alignment_string;
alignment_string
4M1I

Arguments

NameTypeDescription
referencevarcharThe first sequence to align.
queryvarcharThe second sequence to align.
matchintThe match score.
mismatchintThe mismatch score.
gap_openintThe gap open penalty.
gap_extendintThe gap extend penalty.
memory_modevarcharThe memory mode to use.

alignment_score and alignment_score_wfa_gap_affine

Calculate the alignment score between two sequences using the WFA algorithm with affine gap penalties.

Examples

SELECT alignment_score_wfa_gap_affine('ATCGG', 'ATCG') AS alignment_score;
alignment_score
-8.0

Arguments

Same as alignment_string.

extract_from_cigar

Extract a subsequence from a sequence given a CIGAR string. This will trim outside insertions, and return a struct with the subsequence and position in the original sequence.

Examples

SELECT extract_from_cigar('ATCG', '4M') AS subsequence;
subsequence
{'sequence_start': 0, 'sequence_end': 4, 'sequence': ATCG}
SELECT extract_from_cigar('BANANA', '3I3M') AS subsequence;
subsequence
{'sequence_start': 3, 'sequence_end': 6, 'sequence': ANA}

SAM Flags Scalar Functions

These functions work on the SAM flags field and return true if the flag is set.

is_duplicate

Returns true if the read is a PCR or optical duplicate.

Examples

SELECT is_duplicate(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_first_segment

Returns true if the read is the first segment in a template.

Examples

SELECT is_first_segment(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_last_segment

Returns true if the read is the last segment in a template.

Examples

SELECT is_last_segment(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_mate_reverse_complemented

Returns true if the mate is reverse complemented.

Examples

SELECT is_mate_reverse_complemented(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_mate_unmapped

Returns true if the mate is unmapped.

Examples

SELECT is_mate_unmapped(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_properly_aligned

Returns true if the read is properly aligned.

Examples

SELECT is_properly_aligned(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_quality_control_failed

Returns true if the read failed platform/vendor quality checks.

Examples

SELECT is_quality_control_failed(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_reverse_complemented

Returns true if the read is reverse complemented.

Examples

SELECT is_reverse_complemented(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_secondary

Returns true if the read is not the primary alignment for the segment.

Examples

SELECT is_secondary(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_segmented

Returns true if the read is part of a segmented alignment.

Examples

SELECT is_segmented(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_supplementary

Returns true if the read is part of a supplementary alignment.

Examples

SELECT is_supplementary(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

is_unmapped

Returns true if the read is unmapped.

Examples

SELECT is_unmapped(4095);
-- true

Arguments

NameTypeDescription
flagsintThe SAM flags to parse.

SAM CIGAR Scalar Functions

The functions take in a CIGAR string.

parse_cigar

Parse a CIGAR string into a list of operations.

Examples

Decompose a CIGAR string into its operations.

SELECT parse_cigar('10M2I3D') AS ops;
ops
[{'op': M, 'len': 10}, {'op': I, 'len': 2}, {'op': D, 'len': 3}]

Arguments

NameTypeDescription
cigarvarcharThe CIGAR string to parse.

GFF Scalar Functions

gff_parse_attributes