FAQ

Frequently Asked Questions

How do I cite JASPAR?

It depends on what you used it for. If you simply want to acknowledge that you used the latest version, cite:

Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. gkx1126, doi: 10.1093/nar/gkx1126

Otherwise:

  • The original JASPAR paper: Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D91-4.
  • The first extension (JASPAR FAM and PHYLOFACTS collections): Vlieghe D, Sandelin A, De Bleser PJ, Vleminckx K, Wasserman WW, van Roy F, Lenhard B. A new generation of JASPAR, the open-access repository for transcription factor binding site profiles. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D95-7.
  • Second expansion (POLII, SPLICE, CNE, many changes in the web service including matrix permutations): Bryne JC, Valen E, Tang MH, Marstrand T, Winther O, da Piedade I, Krogh A, Lenhard B, Sandelin A. JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res. 2008 Jan;36(Database issue):D102-6.
  • Third expansion (Large expansion of the CORE collection, including yeast and worm matrices. Also includes new PBM collections): Portales-Casamar E, Thongjuea S, Kwon AT, Arenillas D, Zhao X, Valen E, Yusuf D, Lenhard B, Wasserman WW, Sandelin A. JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. Nucleic Acids Res. 2010 Jan;38(Database issue):D105-10.
  • Fourth expansion: Mathelier, A., Zhao, X., Zhang, A. W., Parcy, F., Worsley-Hunt, R., Arenillas, D. J., Buchman, S., Chen, C.-y., Chou, A., Ienasescu, H., Lim, J., Shyr, C., Tan, G., Zhou, M., Lenhard, B., Sandelin, A. and Wasserman, W. W. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Research, 2014 Jan;42(Database issue):D142-7.
  • Fifth expansion: Mathelier, A., Fornes, O., Arenillas, D.J., Chen, C., Denay, G., Lee, J., Shi, W., Shyr, C., Tan, G., Worsley-Hunt, R., et al. (2015). JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2016 44: D110-D115.

Why are certain sequences not downloadable from JASPAR CORE?

This is due to historical reasons. JASPAR CORE was originally built to create familial binding profiles for as many structural classes of transcription factors as possible. In some experimental literature, only the matrices are available, not the underlying sequences. For that project, we had to include some such matrices to gain coverage of certain binding site classes. For recent additions, the sequences are required to be available.

Why is my matrix study not included in JASPAR CORE?

There are two principal explanations. The most likely is that we were not aware of your work: please let us know! The other possible reason is that the publication did not meet the curators' requirements. As all JASPAR CORE matrices are curated by humans, this is to some degree an arbitrary call, and we are happy to discuss it with you.

Linking web services to CPU-intensive services within JASPAR

We appreciate that other services want to link to JASPAR. However, if you are using the CPU-intensive services (matrix comparison, randomization or clustering), please ask the maintainers (see contact information below) before doing so; otherwise your server might be rejected without warning. For such use, we strongly suggest setting up a local JASPAR database, as the database and resources are freely available.

What motif formats does JASPAR support?

The data from JASPAR can be downloaded in four different motif formats:

1. Raw PFM: Each matrix is preceded by a FASTA-like header that starts with the > symbol followed by a matrix ID. The counts for each base are given on their own space-separated line, where each element corresponds to one column of the matrix. The lines are in the order A, C, G and finally T.

13 13 3 1 54 1 1 1 0 3 2 5
13 39 5 53 0 1 50 1 0 37 0 17
17 2 37 0 0 52 3 0 53 8 37 12
11 0 9 0 0 0 0 52 1 6 15 20
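
As an illustration, the raw PFM layout described above could be read with a few lines of Python. This is only a minimal sketch, not an official JASPAR tool: the function name `parse_raw_pfm` and the header line in the sample text are assumptions made for the example, and the counts are the ones shown above.

```python
def parse_raw_pfm(text):
    """Return {matrix_id: {"A": [...], "C": [...], "G": [...], "T": [...]}}."""
    matrices = {}
    matrix_id, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: the matrix ID is the first token after ">".
            matrix_id, rows = line[1:].split()[0], []
            continue
        rows.append([int(value) for value in line.split()])
        if len(rows) == 4:
            # Rows arrive in the fixed order A, C, G, T.
            matrices[matrix_id] = dict(zip("ACGT", rows))
            rows = []
    return matrices


# The header below is a hypothetical example; the counts are those shown above.
raw_text = """>MA0048 NHLH1
13 13 3 1 54 1 1 1 0 3 2 5
13 39 5 53 0 1 50 1 0 37 0 17
17 2 37 0 0 52 3 0 53 8 37 12
11 0 9 0 0 0 0 52 1 6 15 20
"""
pfm = parse_raw_pfm(raw_text)["MA0048"]
print(pfm["A"])
```

Each base maps to one list of counts, so column i of the matrix is the i-th element of each of the four lists.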

2. JASPAR: This is similar to the raw format and has an identical header. The line for each base, however, starts with a label for the nucleotide (A, C, G or T), followed by the column counts enclosed in square brackets: [].

A [13 13 3 1 54 1 1 1 0 3 2 5 ]
C [13 39 5 53 0 1 50 1 0 37 0 17 ]
G [17 2 37 0 0 52 3 0 53 8 37 12 ]
T [11 0 9 0 0 0 0 52 1 6 15 20 ]
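
The bracketed layout can be read in much the same way, using the nucleotide label instead of the line order. The sketch below is illustrative only: `parse_jaspar` is a hypothetical helper name and the header line in the sample text is an assumption; the counts are those shown above.

```python
import re

# One base label followed by whitespace-separated counts inside [ ].
ROW = re.compile(r"^([ACGT])\s*\[([^\]]*)\]")


def parse_jaspar(text):
    """Return {matrix_id: {base: [counts]}} for JASPAR-format text."""
    matrices = {}
    matrix_id = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            matrix_id = line[1:].split()[0]
            matrices[matrix_id] = {}
            continue
        match = ROW.match(line)
        if match and matrix_id is not None:
            base, counts = match.groups()
            matrices[matrix_id][base] = [int(value) for value in counts.split()]
    return matrices


# The header below is a hypothetical example; the counts are those shown above.
jaspar_text = """>MA0048 NHLH1
A [13 13 3 1 54 1 1 1 0 3 2 5 ]
C [13 39 5 53 0 1 50 1 0 37 0 17 ]
G [17 2 37 0 0 52 3 0 53 8 37 12 ]
T [11 0 9 0 0 0 0 52 1 6 15 20 ]
"""
jaspar_pfm = parse_jaspar(jaspar_text)["MA0048"]
print(jaspar_pfm["G"])
```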

3. TRANSFAC: This is a TRANSFAC-like format with a header line that starts with “DE”, followed by the matrix ID, the matrix name and the matrix class. The data itself is transposed compared to the other formats, meaning that each line corresponds to a column in the matrix. Each column line starts with a number denoting the column index (counting from 0), followed by the tab-separated counts for each base in that column in the order A, C, G and T. After the count lines comes a line containing the string “XX” (the example below additionally ends the record with “//”).

DE MA0048    NHLH1    bHLH
P0    A     C     G     T
00    13    13    17    11
01    13    39    2     0
02    3     5     37    9
03    1     53    0     0
04    54    0     0     0
05    1     1     52    0
06    1     50    3     0
07    1     1     0     52
08    0     0     53    1
09    3     37    8     6
10    2     0     37    15
11    5     17    12    20
XX
//
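
Because this format is transposed, reading it means collecting one value per base from every column line. The following is a rough sketch under the assumptions stated in the description above (one “DE” header, a “P0” line naming the base order, numbered column lines); `parse_transfac` is a hypothetical helper name, and the sample is shortened to the first three columns of the example.

```python
def parse_transfac(text):
    """Return (header, counts) where counts maps each base to a list of counts."""
    header = {}
    order = None
    counts = {base: [] for base in "ACGT"}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "DE":
            # Matrix ID, matrix name and matrix class follow the DE tag.
            header = dict(zip(("id", "name", "class"), fields[1:4]))
        elif tag == "P0":
            order = fields[1:]  # base order of the count columns
        elif tag.isdigit() and order:
            # One matrix column per line: append each count to its base.
            for base, value in zip(order, fields[1:]):
                counts[base].append(int(value))
        elif tag in ("XX", "//"):
            break
    return header, counts


# Shortened sample: the first three columns of the example above.
transfac_text = """DE MA0048    NHLH1    bHLH
P0    A     C     G     T
00    13    13    17    11
01    13    39    2     0
02    3     5     37    9
XX
//
"""
transfac_header, transfac_counts = parse_transfac(transfac_text)
print(transfac_header["name"], transfac_counts["A"])
```

The result has the same base-to-row shape as the raw PFM and JASPAR formats, which makes the three layouts straightforward to compare.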

4. MEME: The MEME motif format is a simple text format accepted by the programs in the MEME Suite that take motifs as input. A text file in MEME minimal motif format can contain more than one motif and can optionally specify the motif alphabet, the background frequencies of the letters in the alphabet, and strand information (for motifs over complementable alphabets like DNA), as illustrated in the example below:

MEME version 4

ALPHABET= ACGT

strands: + -

Background letter frequencies
A 0.25 C 0.25 G 0.25 T 0.25

MOTIF MA0048.2 NHLH1
letter-probability matrix: alength= 4 w= 10 nsites= 3246 E= 0
 0.242760  0.667283  0.055761  0.034196
 0.142021  0.055145  0.667283  0.043746
 0.000924  0.667283  0.000000  0.000000
 0.667283  0.000924  0.000308  0.000000
 0.000000  0.092730  0.667283  0.035120
 0.029575  0.667283  0.038201  0.000000
 0.000000  0.001232  0.001232  0.667283
 0.000616  0.001232  0.667283  0.000308
 0.091189  0.667283  0.031423  0.107825
 0.094886  0.226741  0.667283  0.458718
URL http://jaspar.genereg.net/matrix/MA0048.2
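
For readers who want to load such a file without the MEME Suite, a minimal sketch of a reader is given below. It assumes only the layout visible in the example above (a “MOTIF” line, then a “letter-probability matrix” line whose “w=” value gives the number of rows); `parse_meme` is a hypothetical helper name, and optional sections such as the background frequencies are simply skipped.

```python
def parse_meme(text):
    """Return {motif_id: [[pA, pC, pG, pT], ...]} from MEME minimal format."""
    motifs = {}
    motif_id = None
    remaining = 0  # matrix rows still to read for the current motif
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "MOTIF":
            motif_id = fields[1]
            motifs[motif_id] = []
        elif fields[0] == "letter-probability":
            # The token after "w=" is the motif width, i.e. the row count.
            remaining = int(fields[fields.index("w=") + 1])
        elif remaining and motif_id is not None:
            motifs[motif_id].append([float(value) for value in fields])
            remaining -= 1
    return motifs


# Sample text copied from the example above.
meme_text = """MEME version 4

ALPHABET= ACGT

strands: + -

Background letter frequencies
A 0.25 C 0.25 G 0.25 T 0.25

MOTIF MA0048.2 NHLH1
letter-probability matrix: alength= 4 w= 10 nsites= 3246 E= 0
 0.242760  0.667283  0.055761  0.034196
 0.142021  0.055145  0.667283  0.043746
 0.000924  0.667283  0.000000  0.000000
 0.667283  0.000924  0.000308  0.000000
 0.000000  0.092730  0.667283  0.035120
 0.029575  0.667283  0.038201  0.000000
 0.000000  0.001232  0.001232  0.667283
 0.000616  0.001232  0.667283  0.000308
 0.091189  0.667283  0.031423  0.107825
 0.094886  0.226741  0.667283  0.458718
URL http://jaspar.genereg.net/matrix/MA0048.2
"""
meme_motifs = parse_meme(meme_text)
print(len(meme_motifs["MA0048.2"]))
```

Note that, unlike the count-based formats, each row here is one motif position and each value is a probability rather than a count.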

Who is JASPAR anyhow?

JASPAR was originally the name of an algorithm for comparing matrix profiles, developed in a master's student project. The name is an obscure tribute to an even more obscure dialog from the Black Adder episode “The Black Seal” between the Seven Most Evil Men in the Kingdom:

  • …and with all haste, we will meet at Old Jaspar’s tavern

  • How is old Jaspar these days?

  • Dead.

  • How?

  • I killed him.

  • [Loud cheer].