add ID mapping instructions

This commit is contained in:
hyginn 2017-10-09 17:32:07 -04:00
parent 32a45fc059
commit 0c069a66ca

View File

@ -3,11 +3,12 @@
# Purpose: A Bioinformatics Course:
# R code accompanying the BIN-Storing_data unit
#
# Version: 1.0
# Version: 1.1
#
# Date: 2017 09 23
# Date: 2017 10 08
# Author: Boris Steipe (boris.steipe@utoronto.ca)
#
# V 1.1 Add instructions to retrieve UniProt ID from ID mapping service.
# V 1.0 First live version, complete rebuilt. Now using JSON data sources.
# V 0.1 First code copied from BCH441_A03_makeYFOlist.R
#
@ -28,30 +29,32 @@
#TOC>
#TOC> Section Title Line
#TOC> ------------------------------------------------------------
#TOC> 1 A Relational Datamodel in R: review 55
#TOC> 1.1 Building a sample database structure 95
#TOC> 1.1.1 completing the database 206
#TOC> 1.2 Querying the database 241
#TOC> 1.3 Task: submit for credit (part 1/2) 270
#TOC> 2 Implementing the protein datamodel 282
#TOC> 2.1 JSON formatted source data 308
#TOC> 2.2 "Sanitizing" sequence data 343
#TOC> 2.3 Create a protein table for our data model 363
#TOC> 2.3.1 Initialize the database 365
#TOC> 2.3.2 Add data 377
#TOC> 2.4 Complete the database 397
#TOC> 2.4.1 Examples of navigating the database 424
#TOC> 2.5 Updating the database 456
#TOC> 3 Add your own data 468
#TOC> 3.1 Find a protein 476
#TOC> 3.2 Put the information into JSON files 505
#TOC> 3.3 Create an R script to create the database 522
#TOC> 3.3.1 Check and validate 542
#TOC> 3.4 Task: submit for credit (part 2/2) 583
#TOC> 1 A Relational Datamodel in R: review 58
#TOC> 1.1 Building a sample database structure 98
#TOC> 1.1.1 completing the database 209
#TOC> 1.2 Querying the database 244
#TOC> 1.3 Task: submit for credit (part 1/2) 273
#TOC> 2 Implementing the protein datamodel 285
#TOC> 2.1 JSON formatted source data 311
#TOC> 2.2 "Sanitizing" sequence data 346
#TOC> 2.3 Create a protein table for our data model 366
#TOC> 2.3.1 Initialize the database 368
#TOC> 2.3.2 Add data 380
#TOC> 2.4 Complete the database 400
#TOC> 2.4.1 Examples of navigating the database 427
#TOC> 2.5 Updating the database 459
#TOC> 3 Add your own data 471
#TOC> 3.1 Find a protein 479
#TOC> 3.2 Put the information into JSON files 508
#TOC> 3.3 Create an R script to create the database 531
#TOC> 3.3.1 Check and validate 551
#TOC> 3.4 Task: submit for credit (part 2/2) 592
#TOC>
#TOC> ==========================================================================
# = 1 A Relational Datamodel in R: review =================================
# A disclaimer at first: we are not building an industry-strength database at
@ -512,6 +515,12 @@ myDB$taxonomy$species[sel]
# "name" of your protein. Open the file in the RStudio editor and replace
# all of the MBP1_SACCE data with the corresponding data of your protein.
#
# The UniProt ID may not be discoverable from the NCBI page. To retrieve
# it, navigate to http://www.uniprot.org/mapping/ , paste your RefSeq ID
# into the query field, make sure "RefSeqProtein" is selected for "From"
# and "UniProtKB" is selected for "To", and click "Go". In case this does
# not retrieve a single UniProt ID, contact me.
#
# - Do a similar thing for the MYSPE taxonomy entry. Copy
# "./data/refTaxonomy.json" and make a new file named "MYSPEtaxonomy.json".
# Create a valid JSON file with only one single entry - that of MYSPE.