Submission of data
Please carefully follow this protocol step by step.
Preparation of the MLST sequences
Organize the sequences into five files in FASTA format, one for each of the five MLST genes. Use the allele templates provided as a reference to verify the exact range of nucleotides to include in each gene sequence. Ambiguous sites (such as N, ?, etc.) are not allowed. The program does not read the header, so use it for whatever information you need to associate each sequence with the strain it comes from.
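Before querying, it can help to check each gene file for ambiguous sites locally. The following is a minimal sketch, not part of the MLST site: the function names are hypothetical, and it assumes that only the characters A, C, G and T are acceptable (so gaps and IUPAC codes are reported too). It does not check sequence length, which still has to be verified against the allele templates.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def find_ambiguous(records):
    """Return {header: [(position, character), ...]} for every non-ACGT site.

    Positions are 1-based. Assumption: anything other than A, C, G, T
    counts as ambiguous.
    """
    problems = {}
    for header, seq in records:
        bad = [(i + 1, ch) for i, ch in enumerate(seq.upper()) if ch not in "ACGT"]
        if bad:
            problems[header] = bad
    return problems


# Made-up example data; the headers are free text, as noted above.
example = """>strainA gatB
ACGTAC
GTAC
>strainB gatB
ACGTNCGT?A
"""
print(find_ambiguous(read_fasta(example)))
# {'strainB gatB': [(5, 'N'), (9, '?')]}
```

Any sequence reported here would need to be resolved from the trace files before it is queried or submitted.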
Allele assignment of the MLST sequences
This step is necessary for you to verify for each of your sequences whether:
- It is an allele already present in the database. In this case the sequence has already been assigned an allele number.
- It is a new allele. A sequence is considered a novel allele if it differs by at least one base pair from every sequence already in the database. In that case the new sequence has to be inserted into the database as a new allele and assigned a number.
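The rule above amounts to an exact-match lookup: a query is a known allele only if it is identical to a stored sequence, and any single difference makes it new. The sketch below illustrates that logic only; it is not how the database itself is implemented, the function name is hypothetical, and the allele sequences are made up.

```python
def classify_sequence(query, known_alleles):
    """Return ("known", allele_number) on an exact match, else ("new", None).

    known_alleles maps allele number -> sequence for ONE locus.
    Comparison is case-insensitive; no length check is done here.
    """
    query = query.upper()
    for number, seq in known_alleles.items():
        if seq.upper() == query:
            return ("known", number)
    return ("new", None)


# Made-up alleles for illustration; real alleles are far longer.
known_gatB = {1: "ACGTACGT", 2: "ACGTACGA"}

print(classify_sequence("acgtacga", known_gatB))  # ('known', 2)
print(classify_sequence("ACGTTCGA", known_gatB))  # ('new', None)
```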
You can perform a single locus, a multiple locus or a single locus batch query. A single locus batch query is advisable whenever you have multiple sequences.
To do a single locus batch query proceed as follows:
- Go to the Wolbachia MLST home page (http://pubmlst.org/wolbachia/) and click on 'Profiles database'.
- Click on 'Single locus batch query'.
- Select the locus (i.e. the gene) you want to query your sequences for (in the top left box). Copy all the sequences from one of the five gene files you prepared and paste them into the search box.
- Click Submit. The program will search that specific gene database for the best allele match to each of your query sequences.
The Results table will appear indicating for each sequence one of the following:
- 'Wrong length'. This message appears when a query sequence does not have the required standard length (in that case, verify the correct length again using the allele templates provided as reference).
- 'Allele' followed by a number. This is the case when a perfect match exists between your query sequence and an allele already in the database. For each strain that shows a perfect allele match, take note of the given allele number and enter it in the Isolates form in the appropriate allelic profile field (i.e. 'gatB', 'coxA', 'hcpA', 'ftsZ' or 'fbpA').
- 'Nearest allele' followed by the number, 'X% Differences: nt XX...'. In this case the query sequence does not have a perfect match to any of the alleles present in the database. The nearest allele, i.e. the one with the highest sequence similarity to your query sequence, is given for information. This result means that the query sequence represents a NEW ALLELE. Create a FASTA file containing only the new alleles.
- Repeat from the locus selection step until all five gene files have been queried.
Submission of New Alleles
All new alleles need to be inserted into the database and assigned a number.
In order to do that:
- Prepare a FASTA file for each of the five genes, this time including only the new alleles (the header is not important). Check each file for redundant sequences (duplicates). To do that you can use a web tool on the MLST home page (go to http://pubmlst.org/wolbachia/, click on Web tools on the left side of the page, then click on Non-redundant databases (NRDB)). Be sure to eliminate all duplicate sequences so that no file contains redundant alleles.
- Send these five FASTA files, along with the forward and reverse trace files, to the curator at the following e-mail address: laurab@ucr.edu. The curator will check sequence quality and submit the new alleles to the database.
- After receiving the confirmation e-mail, RE-QUERY all the new alleles as described under 'Allele assignment of the MLST sequences' and complete the allele assignment for each of your strains using the Isolates form.
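The duplicate check in the first step above can also be done locally, as an alternative or a cross-check to the NRDB web tool. This is a minimal sketch with hypothetical function names; it assumes the sequences have already been read into (header, sequence) pairs and treats two sequences as duplicates only when they are identical, ignoring case.

```python
def deduplicate(records):
    """Keep the first record for each distinct sequence.

    Returns (kept, dropped), where dropped maps the header that was kept
    to the headers of identical sequences that were removed.
    """
    kept = []
    dropped = {}
    first_seen = {}  # sequence -> header of the record that was kept
    for header, seq in records:
        key = seq.upper()
        if key in first_seen:
            dropped.setdefault(first_seen[key], []).append(header)
        else:
            first_seen[key] = header
            kept.append((header, seq))
    return kept, dropped


def to_fasta(records):
    """Write (header, sequence) pairs back out as FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# Made-up new alleles for one gene; strain2 repeats strain1's sequence.
new_alleles = [("strain1", "ACGTACGT"), ("strain2", "acgtacgt"), ("strain3", "ACGTACGA")]
kept, dropped = deduplicate(new_alleles)
print(to_fasta(kept))
print(dropped)  # {'strain1': ['strain2']}
```

Keeping the `dropped` mapping is deliberate: strains whose sequence was removed as a duplicate still carry that new allele, so they need the same allele number once it has been assigned.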
ST assignment of allelic profiles
This step is necessary for you to verify for each of your allelic profiles whether:
- It is an allelic profile already present in the database. In this case the allelic profile has already been assigned an ST number.
- It is a new allelic profile. An allelic profile is considered novel if it differs by at least one allele from every allelic profile already in the database. In that case the new allelic profile has to be inserted into the database and assigned an ST number.
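In other words, an ST is determined by the exact combination of five allele numbers. The sketch below illustrates this with a dictionary lookup; it is an illustration of the rule, not the database's implementation. The function name and the profiles are made up, and the locus order is simply the one in which the fields are listed earlier in this protocol.

```python
# Locus order as listed for the Isolates form fields in this protocol.
LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def assign_st(profile, known_profiles):
    """Return the ST for an exact five-locus match, otherwise "New".

    profile maps locus name -> allele number.
    known_profiles maps a 5-tuple of allele numbers (in LOCI order) -> ST.
    """
    key = tuple(profile[locus] for locus in LOCI)
    return known_profiles.get(key, "New")


# Made-up profiles for illustration only.
known_profiles = {(1, 1, 1, 1, 1): 1, (1, 2, 1, 1, 3): 2}

query_a = {"gatB": 1, "coxA": 2, "hcpA": 1, "ftsZ": 1, "fbpA": 3}
query_b = {"gatB": 1, "coxA": 2, "hcpA": 1, "ftsZ": 1, "fbpA": 4}
print(assign_st(query_a, known_profiles))  # 2
print(assign_st(query_b, known_profiles))  # New
```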
You can perform either a single or a batch allelic profile query. A batch allelic profile query is advisable whenever you have multiple allelic profiles.
To do a batch allelic profile query proceed as follows:
- Go to the Wolbachia MLST home page (http://pubmlst.org/wolbachia/) and click on 'Profiles database'.
- Click on 'batch profile query'.
- Copy and paste the allelic profiles into the text box. Columns can be separated by any whitespace (spaces or tabs) and the first column should be an identifier (see the example data provided).
- Click Submit. The program will search the Profiles database for perfect matches to each of the query allelic profiles.
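The text to paste into the batch profile query box can be generated from your allele assignments. The sketch below is one way to do it, under two assumptions that should be checked against the example data provided on the site: that the allele columns follow the order gatB, coxA, hcpA, ftsZ, fbpA, and that identifiers must not contain whitespace (since whitespace separates the columns). The function name and the strain data are hypothetical.

```python
# Assumed column order; verify against the example data on the query page.
LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def format_batch_query(profiles):
    """Build tab-separated lines: identifier first, then one allele per locus.

    profiles maps strain identifier -> {locus name: allele number}.
    """
    lines = []
    for strain_id, alleles in profiles.items():
        # Whitespace inside an identifier would be read as a column break.
        if any(ch.isspace() for ch in strain_id):
            raise ValueError(f"identifier contains whitespace: {strain_id!r}")
        columns = [strain_id] + [str(alleles[locus]) for locus in LOCI]
        lines.append("\t".join(columns))
    return "\n".join(lines)


# Made-up strains and allele numbers.
my_profiles = {
    "strainA": {"gatB": 1, "coxA": 1, "hcpA": 1, "ftsZ": 1, "fbpA": 1},
    "strainB": {"gatB": 1, "coxA": 2, "hcpA": 1, "ftsZ": 1, "fbpA": 3},
}
print(format_batch_query(my_profiles))
```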
The Results table will appear indicating for each allelic profile one of the following:
- The ST number. This is the case when a perfect match exists between your query allelic profile and an allelic profile already in the database. For each strain take note of the ST number (this information does not go into the Isolates form; it is only a reference for the user).
- 'New'. In this case the query allelic profile does not have a perfect match to any of the allelic profiles present in the database.
Submission of New Allelic Profiles
All new allelic profiles need to be inserted into the Profiles database and assigned an ST number.
In order to do that:
- Download the Allelic profiles form provided and fill it in with all the new allelic profiles found.
- Send the completed Allelic profiles form to the curator (laurab@ucr.edu).
Submission of strain information and privacy
All strain information is entered in the Isolates form. Once you have filled in the form with all the required information, send it to the curator (laurab@ucr.edu). The curator will add it to the Isolates database.
If you have chosen to make your strain information private until publication (by selecting 'yes' in the 'private' field on the Isolates form), your strain information will be submitted to the database but it will not be visible to any users. The information will be released for public view only when requested by the user.
Your sequence data is public as soon as it is submitted to the Profiles database. However, if your strain information is private, the alleles and allelic profiles stored in the Profiles database cannot be associated with any strain. For further information, please read the policy document on the main page.