Creating Datasets

Bioinformation Discovery

Abstract

Data is central to biological knowledge discovery. The data used in discovery is typically specialized to a particular problem in cell and molecular biology, which is generally achieved by creating purpose-built datasets. Here, we discuss the importance of biological datasets in gleaning information and describe procedures for creating specialized datasets. The chapter describes the creation of data subsets for human leukocyte antigen (HLA) peptide binding, HLA–peptide structures, grouping of HLA class I and class II structures with peptides, protein subunit interactions, homodimers, heterodimers, homodimers grouped into folding categories, fusion proteins, and intron-containing and intronless genes in eukaryotes.

Author information

Corresponding author

Correspondence to Pandjassarame Kangueane.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Kangueane, P. (2009). Creating Datasets. In: Bioinformation Discovery. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-0519-2_2