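The guanine-cytosine content, or GC-content, of a DNA sequence is the percentage of its nucleotide base pairs in which guanine (G) is bonded to cytosine (C). DNA with a higher GC-content is harder to separate into single strands, in part because each G-C pair is held together by three hydrogen bonds, compared with two for an A-T pair.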
Steps
Method 1 of 2: By Hand
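By hand, the calculation reduces to the standard GC-content formula: count the G and C bases, divide by the total number of bases, and multiply by 100. A worked example, using a made-up ten-base sequence AGCTTAGCTA (invented here for illustration; it contains 2 G, 2 C, 3 A and 3 T):

```latex
\[
\text{GC\%} = \frac{G + C}{A + T + G + C} \times 100
            = \frac{2 + 2}{3 + 3 + 2 + 2} \times 100
            = 40\%
\]
```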
Method 2 of 2: Programmatically (Python 2)
- Create or accept an input file. This article assumes that the input is in FASTA format, with a single sequence per file.
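If no input file is at hand, a small single-sequence FASTA file can be generated for testing. This is only a sketch: the filename `example.fasta`, the header text, the bases, and the helper name `write_example_fasta` are all made up for illustration.

```python
# Invented sample data: one header line starting with ">", then the sequence
# split across two lines, as FASTA files commonly wrap long sequences.
EXAMPLE_FASTA = (
    ">example_sequence hypothetical sample record\n"
    "AGCTTAGCTA\n"
    "GGCCATATGC\n"
)


def write_example_fasta(path, text=EXAMPLE_FASTA):
    # Write the sample record to the given path, overwriting any existing file.
    with open(path, "w") as handle:
        handle.write(text)


write_example_fasta("example.fasta")
```

The first line is the header that the next step discards; the remaining lines are the sequence itself.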
- Read in the file. For FASTA format:
  - Discard the first line of the file.
  - Remove all remaining newlines and other trailing whitespace.
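from sys import argv

def init(sequence):
    # The incoming value is replaced by the contents of the FASTA file
    # named as the first command-line argument.
    with open(argv[1]) as fasta_file:
        # Skip the header line, strip whitespace, and join the rest.
        sequence = "".join([line.strip() for line in fasta_file.readlines()[1:]])
    return sequence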
- Create a counter. Iterate through the data and increment your counter as you encounter any guanine or cytosine nucleotides.
- Divide the GC count by the total length of the sequence, multiply by 100, and output the result as a percentage.
def GCcontent(sequence):
    GCcount = 0
    for letter in sequence:
        if letter == "G" or letter == "C":
            GCcount += 1
    return GCcount
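The function above returns only the raw count, so the division from the last step still has to happen somewhere. One way it might look, offered as a sketch rather than the article's original code: `gc_percent` is a hypothetical helper name, and uppercasing the input first is an added assumption so that lowercase sequences are counted too.

```python
def gc_percent(sequence):
    """Return GC-content as a percentage (0-100) of the sequence length."""
    sequence = sequence.upper()  # assumption: treat 'g'/'c' like 'G'/'C'
    if not sequence:
        return 0.0  # avoid dividing by zero on an empty sequence
    gc_count = sequence.count("G") + sequence.count("C")
    # 100.0 forces floating-point division in Python 2 as well as Python 3.
    return 100.0 * gc_count / len(sequence)


# Made-up ten-base sequence with 2 G and 2 C.
print("%.2f%%" % gc_percent("AGCTTAGCTA"))
```

Note that the total length here includes every character in the sequence, so ambiguity codes such as N would count toward the denominator.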
Tips
- If you're calculating GC-content by hand, be sure to double check! It can be easy to miscount, especially if you're analyzing a long sequence on paper.