k-mer
Substrings of length k contained in a biological sequence / From Wikipedia, the free encyclopedia
Dear Wikiwand AI, let's keep it short by simply answering these key questions:
Can you list the top facts and stats about K-mer?
Summarize this article for a 10 year old
In bioinformatics, k-mers are substrings of length contained within a biological sequence. Primarily used within the context of computational genomics and sequence analysis, in which k-mers are composed of nucleotides (i.e. A, T, G, and C), k-mers are capitalized upon to assemble DNA sequences,[1] improve heterologous gene expression,[2][3] identify species in metagenomic samples,[4] and create attenuated vaccines.[5] Usually, the term k-mer refers to all of a sequence's subsequences of length , such that the sequence AGAT would have four monomers (A, G, A, and T), three 2-mers (AG, GA, AT), two 3-mers (AGA and GAT) and one 4-mer (AGAT). More generally, a sequence of length will have k-mers and total possible k-mers, where is number of possible monomers (e.g. four in the case of DNA).