SEER and Atomic Bomb Survivor Data Analysis Tools

Creates SEER (Surveillance, Epidemiology and End Results) and A-bomb data binaries from ASCII sources and provides tools for estimating SEER second cancer risks.



o msd now takes a ybrks vector for showing calendar year trends     

Notes o mkExcel2 is now mkExcelCsd, and mkExcel is now mkExcelTsd, to emphasize that their inputs are the outputs of csd and tsd, respectively. mkExcelTsd and tsd are internal.


o AML/MDS (leukemia 2016), CLL (leukemia research 2016), and CML (AAPSJ 2016) scripts were added to doc/papers      

Notes o Many fields were deleted or renamed in the SAS file of the 2016 SEER data release. Names used in my scipts that changed were reverted to originals in getFields() to avoid downstream problems. Numprims is no longer
there, so use seqnum instead. A few fields were also added to the new SEER data release.


o The function msd (mortality since diagnosis) computes mortality RR since diagnosis of cancer.  

o The function csd (cancer since diagnosis), now replaces tsd in computing 2nd cancer RR after a 1st cancer. 
  An added feature of csd is that it handles different intervals of years and ages at diagnosis. This changed
  the internal list of lists structure of the output, which getDF (instead of mkDF) converts to a data.frame (DF) 
  that ggplot2 can use, and which mkExcel2 (instead of mkExcel) uses to make an Excel file. 

Notes o tsd, mkExcel, and mkDF are now internal functions (in terms of help pages), i.e. they can still be used by old scripts o mkSEER no longer makes age19 in canc and it no longer makes popsga. If you still want 19-age groups use mkSEERold.


o The function tsd now takes a firstS argument, a verctor of first cancers of interest. This doesn't speed things 
  up much but it makes the output of tst easier to read and the output of mkExcel much more focused. 

o The latest National Vital Statistics Report data is now included as a data.frame.  


o simSeerSet simulates a seerSet class object for two cancer types, A and B. 

o mkSEER (mkSEER2->mkSEER and mkSEER>-mkSEERold) uses stdUS 2000 to extend single age popsa to popsae. It also 
  makes a trt column for radiation therapy by default.

Notes o random crashes from the C++ code have been debugged. Variants of fillPYM in straight C and R used to debug it are now in the folder doc/extraFuncs


o added seetSet to make objects of class seerSet for the pipeline seerSet->mk2D->tsd.    

o added mk2D that makes two-dimensional spline fits of incidence versus age and calendar year and plot2D to look at them.    

o added tsd to compute RR over various time since diagnosis intervals
  and mkExcel to look at the results.

o added helper functions: getBinInfo, getPY, post1PYO, getE    

o mapCancs is now called within mkSEER to make a cancer field in those files as well.  

Notes o Month is no longer in the package version number and year now reflects the SEER data release year rather than the SEERaBomb release year.

o published paper scripts now use the cancer field instead of histo2 (which is no longer included in pickfields by default) and  plots now also use quartz() when on a mac. The old courses folder is now called examples; scripts therein were updated
  and some were removed. 


o added US 2000 populations as 19-, 86-, and 100-age group data.frames.    

o created mkSEER2 to create pooled db files directly into e.g. ~\data\SEER\mrgd  


o mkHema87 was removed. Its functionality is now included in mkAbomb


o fixed pickFields() to make country and state of birth strings rather than integers


o changed default seerHome directory to ~\data\SEER to no longer assume admin rights

o updated 2010 to 2011 in mkSEER and in several help pages to match April 15,2014 SEER release

o changed mkSEER SQL default to TRUE 

o added pops single ages (popsa) to binary files and all.db  


o added mkAbomb() to convert RERF file lsshempy.csv into a binary and SQLite database

o added mkHema87() to convert RERF file HEMA87.DAT into a binary with column names

o sq vs ad lung incidence and survival added to doc/courses

o JCO 2013 survival plots added to doc/papers/JCO2013

o activeMS is now doc/papers/radiatEnvironBiophys2013


o old demos are now examples in doc/courses


o vignettes are not possible since SEER and Abomb data must be signed off on. 


o Now handles SEER data release of April 24, 2013

o Package version number now matches latest SEER release date.


o No longer handles SEER data released in 2012.


o The SEER SAS file changed: fields are no longer a continuum.  
  The implication is that getFields is now field name centric, since 
  I can no longer count on all fields being in the SAS file.
o Survival is now already in months in SEER, so mapping it is no longer needed. 
  Indeed, the SAS file no longer includes a field for survival with years and 
  months as two character strings in a field of 4 chars, though this field does
  still remain in the SEER data files. 


o eod13 and eod2 are now set to strings in pickFields to avoid errors when picked 

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.


2017.2 by Tomas Radivoyevitch, 3 months ago

Browse source code at

Authors: Tomas Radivoyevitch

Documentation:   PDF Manual  

GPL (>= 2) license

Imports Rcpp, reshape2, mgcv, LaF, DBI, RSQLite, rgl, XLConnect, scales, plyr, survival

Depends on dplyr, ggplot2

Suggests bbmle, demography

Linking to Rcpp

See at CRAN