[r-cran-mlmrev] 06/11: New upstream version 1.0-6
Andreas Tille
tille at debian.org
Thu Nov 9 15:47:35 UTC 2017
This is an automated email from the git hooks/post-receive script.
tille pushed a commit to branch master
in repository r-cran-mlmrev.
commit a67145227960c0e3dead552f3dc1b39cecb403b3
Author: Andreas Tille <tille at debian.org>
Date: Thu Nov 9 16:38:39 2017 +0100
New upstream version 1.0-6
---
.Rinstignore | 2 +
ChangeLog | 70 +++++
DESCRIPTION | 20 ++
MD5 | 52 ++++
NAMESPACE | 2 +
build/vignette.rds | Bin 0 -> 272 bytes
data/Chem97.rda | Bin 0 -> 130296 bytes
data/Contraception.rda | Bin 0 -> 5500 bytes
data/Early.rda | Bin 0 -> 1493 bytes
data/Exam.rda | Bin 0 -> 17548 bytes
data/Gcsemv.rda | Bin 0 -> 8220 bytes
data/Hsb82.rda | Bin 0 -> 68560 bytes
data/Mmmec.rda | Bin 0 -> 6555 bytes
data/Oxboys.rda | Bin 0 -> 1954 bytes
data/ScotsSec.rda | Bin 0 -> 9164 bytes
data/Socatt.rda | Bin 0 -> 4747 bytes
data/bdf.rda | Bin 0 -> 26684 bytes
data/egsingle.rda | Bin 0 -> 24600 bytes
data/guImmun.rda | Bin 0 -> 10772 bytes
data/guPrenat.rda | Bin 0 -> 13020 bytes
data/s3bbx.rda | Bin 0 -> 12388 bytes
data/s3bby.rda | Bin 0 -> 35379 bytes
data/star.rda | Bin 0 -> 200852 bytes
debian/README.source | 4 -
debian/README.test | 10 -
debian/changelog | 25 --
debian/compat | 1 -
debian/control | 23 --
debian/copyright | 32 ---
debian/docs | 3 -
debian/rules | 4 -
debian/source/format | 1 -
debian/tests/control | 3 -
debian/tests/run-unit-test | 13 -
debian/watch | 2 -
inst/doc/MlmSoftRev.R | 285 +++++++++++++++++++
inst/doc/MlmSoftRev.Rnw | 646 ++++++++++++++++++++++++++++++++++++++++++++
inst/doc/MlmSoftRev.pdf | Bin 0 -> 412362 bytes
inst/doc/StarData.R | 203 ++++++++++++++
inst/doc/StarData.Rnw | 306 +++++++++++++++++++++
inst/doc/StarData.pdf | Bin 0 -> 204841 bytes
inst/original/text-star.zip | Bin 0 -> 360427 bytes
man/Chem97.Rd | 43 +++
man/Contraception.Rd | 39 +++
man/Early.Rd | 34 +++
man/Exam.Rd | 44 +++
man/Gcsemv.Rd | 33 +++
man/Hsb82.Rd | 38 +++
man/Mmmec.Rd | 39 +++
man/Oxboys.Rd | 41 +++
man/ScotsSec.Rd | 41 +++
man/Socatt.Rd | 50 ++++
man/bdf.Rd | 54 ++++
man/egsingle.Rd | 47 ++++
man/guImmun.Rd | 59 ++++
man/guPrenat.Rd | 54 ++++
man/s3bbx.Rd | 35 +++
man/s3bby.Rd | 30 ++
man/star.Rd | 62 +++++
tests/guImmun.R | 39 +++
tests/lmerTest.R | 60 ++++
tests/versions.R | 11 +
vignettes/MlmSoftRev.Rnw | 646 ++++++++++++++++++++++++++++++++++++++++++++
vignettes/MlmSoftRev.bib | 8 +
vignettes/StarData.Rnw | 306 +++++++++++++++++++++
vignettes/myVignette.sty | 101 +++++++
66 files changed, 3500 insertions(+), 121 deletions(-)
diff --git a/.Rinstignore b/.Rinstignore
new file mode 100644
index 0000000..66affa3
--- /dev/null
+++ b/.Rinstignore
@@ -0,0 +1,2 @@
+doc/.*\.sty
+doc/figs
\ No newline at end of file
diff --git a/ChangeLog b/ChangeLog
new file mode 100644
index 0000000..9245bbd
--- /dev/null
+++ b/ChangeLog
@@ -0,0 +1,70 @@
+2011-02-15 Douglas Bates <bates at stat.wisc.edu>
+
+ * DESCRIPTION: Package now depends on R (>= 2.10). New version and date.
+ * data/Chem97.rda, data/Contraception.rda, data/Early.rda,
+ data/Exam.rda, data/Gcsemv.rda, data/Hsb82.rda, data/Mmmec.rda,
+ data/Oxboys.rda, data/ScotsSec.rda, data/Socatt.rda, data/bdf.rda,
+ data/egsingle.rda, data/guImmun.rda, data/guPrenat.rda,
+ data/s3bbx.rda, data/s3bby.rda, data/star.rda,
+ inst/doc/MlmSoftRev.pdf, inst/doc/StarData.pdf: resave the data for
+ higher compression, re-build the vignettes
+
+2009-08-08 Martin Maechler <maechler at stat.math.ethz.ch>
+
+ * [r399] DESCRIPTION, inst/doc/MlmSoftRev.Rnw, man/Contraception.Rd,
+ man/Mmmec.Rd, man/egsingle.Rd, tests/Early.Rout.save,
+ tests/Hsb82.Rout.save, tests/ScotsSec.R, tests/ScotsSec.Rout.save,
+ tests/bdf.Rout.save, tests/egsingle.Rout.save,
+ tests/guImmun.Rout.save, tests/lmerTest.R, tests/lmerTest.Rout.save:
+ update to CRAN`s version 0.99875-1 (from 2008-06 !)
+
+2010-08-04 Douglas Bates <bates at stat.wisc.edu>
+
+ * man/Contraception.Rd: References to lmer should be glmer
+
+2009-07-28 Martin Maechler <maechler at stat.math.ethz.ch>
+
+ * inst/doc/*.Rnw: end w/sessionInfo; Star*: use our "standard" SweaveOpts
+
+2008-06-19 Martin Maechler <maechler at stat.math.ethz.ch>
+
+ * man/Mmmec.Rd: use "offset= ." instead of of offset(.) in formula.
+
+2008-05-21 Martin Maechler <maechler at stat.math.ethz.ch>
+
+ * DESCRIPTION (License): use "correct" licence-syntax;
+ (Version): 0.99875-1: matching CRAN version of lme4.
+
+ * man/bdf.Rd and a few others: use \docType{data} instead of
+ \non_function{}
+
+2006-09-08 Douglas Bates <bates at stat.wisc.edu>
+
+ * DESCRIPTION (Depends): Update to current version of lme4
+
+2005-08-26 Douglas Bates <bates at wisc.edu>
+
+ * man/Mmmec.Rd: Added example and then conditionalized the
+ example using AGQ.
+
+2005-07-12 Douglas Bates <bates at wisc.edu>
+
+ * inst/doc/MlmSoftRev.Rnw: Expanded discussion considerably.
+
+2005-05-19 Douglas Bates <bates at bates1-home>
+
+ * tests/lmerTest.R: Removed defunct tests and updated results of
+ current tests to lme_0.95-8
+
+2005-05-13 Douglas Bates <bates at bates1-home>
+
+ * inst/doc/MlmSoftRev.Rnw (section{Two-level models for binary
+ data}): Added examples of generalized linear mixed model fits
+ using both PQL and the Laplacian approximation.
+
+2005-04-21 Douglas Bates <bates at bates1-home>
+
+ * inst/doc/StarData.pdf: added
+
+ * DESCRIPTION: Updated dependencies
+
diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..1764635
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,20 @@
+Package: mlmRev
+Version: 1.0-6
+Date: 2014-03-11
+Title: Examples from Multilevel Modelling Software Review
+Author: Douglas Bates <bates at stat.wisc.edu>,
+ Martin Maechler <maechler at R-project.org> and
+ Ben Bolker <bolker at mcmaster.ca>
+Contact: LME4 Authors <lme4-authors at lists.r-forge.r-project.org>
+Maintainer: Steve Walker <steve.walker at utoronto.ca>
+Description: Data and examples from a multilevel modelling software review
+ as well as other well-known data sets from the multilevel modelling
+ literature.
+Depends: lme4, R (>= 2.10)
+Suggests: lattice
+LazyData: yes
+License: GPL (>= 2)
+Packaged: 2014-03-11 14:11:54 UTC; stevenwalker
+NeedsCompilation: no
+Repository: CRAN
+Date/Publication: 2014-03-11 16:41:16
diff --git a/MD5 b/MD5
new file mode 100644
index 0000000..01fcd65
--- /dev/null
+++ b/MD5
@@ -0,0 +1,52 @@
+520511891275075282fc5f500c6f07fb *ChangeLog
+3a0bdc43b158eba79c027d0e859e308d *DESCRIPTION
+78e3a88f875665cfbd477496fbecae0b *NAMESPACE
+fe68621355af4092b6acd1aa11601a49 *build/vignette.rds
+2d2625e8f4ff3c4fe3751e4781fc0267 *data/Chem97.rda
+15d2e8360d12b01ca310d6fee4d3634b *data/Contraception.rda
+f4b727b4de16c70be4b2e9e96758db2b *data/Early.rda
+f87f0cae90fad7f5ac7cba477a598134 *data/Exam.rda
+0b1beb8386b3ab210dcc9db6b7cff4ca *data/Gcsemv.rda
+9a08067e4a8e8129dc074cafc1d2ff0f *data/Hsb82.rda
+f714fcc6873ceba2cdf4b2976bd7b57d *data/Mmmec.rda
+b6e06e8eeed56ca58ffdf64cc2d3ba7c *data/Oxboys.rda
+05115b7d463a387df5d0e463f9bc7944 *data/ScotsSec.rda
+44045318326877a5e379e2c778b4fa25 *data/Socatt.rda
+eb80e6642a21bf1ac4668b4a50081923 *data/bdf.rda
+88aa7c5894d744ecc282396948682378 *data/egsingle.rda
+aadd470d0494f24354515b6f419b1725 *data/guImmun.rda
+e0f53953b8d79cf5c377db77d30cc460 *data/guPrenat.rda
+318e6db921abcbcaa19892f5bf2d5825 *data/s3bbx.rda
+863df9544b372951a4a27ef22272c93b *data/s3bby.rda
+27ed5a6f1fbe24b47732046315a8cb56 *data/star.rda
+ee11c5bd87a2ad79b1b2134f18360ae7 *inst/doc/MlmSoftRev.R
+bad05c22fdf345b002b5a4d6365f0f02 *inst/doc/MlmSoftRev.Rnw
+c31ca16434ab1c078c33f035948a9306 *inst/doc/MlmSoftRev.pdf
+3bd43ec5ad8b7c799d50aa631a23ed8b *inst/doc/StarData.R
+56ac13cb9a2977e8858bf3e4d301d029 *inst/doc/StarData.Rnw
+77b6745a538d58b03edbae1efbc3e6dc *inst/doc/StarData.pdf
+0e23fe9cf9c849e398af107ef5c72773 *inst/original/text-star.zip
+9e583f8ff5cc4c1c6bf424e49dd94f49 *man/Chem97.Rd
+2ed8f79f629cb3523518068f5707f784 *man/Contraception.Rd
+920f037d5a7d56743bd352d7f21d5102 *man/Early.Rd
+4a6e95e887d030118a8ff9a2e0bc434f *man/Exam.Rd
+5fc7e8df160705e6a66ad8520bd82aab *man/Gcsemv.Rd
+c5d8d86ea2ae157303642ed85acdd83b *man/Hsb82.Rd
+ec96b97b03051168a353a93c0173b1c3 *man/Mmmec.Rd
+6ab4c11407869faac53ffa74b2d6ce9a *man/Oxboys.Rd
+cf62a126432baf1063246bdc5ce59533 *man/ScotsSec.Rd
+55a03090f679d53a9d066410ee7e03bd *man/Socatt.Rd
+0d3811a7bab1c39fb90c2fe374d1a2b7 *man/bdf.Rd
+4920de8d64ae12550a5ca603628814ac *man/egsingle.Rd
+a1f925844da040e206bcb4fe8246ea64 *man/guImmun.Rd
+af9e29a767582088d14cd278354ce0cc *man/guPrenat.Rd
+7b3aeee1de9d4b34365b92cbad9c3b8b *man/s3bbx.Rd
+3198a6369b2821310aa9671071fa6be6 *man/s3bby.Rd
+4d7d68134e2def1fb2004b0033f2edd9 *man/star.Rd
+9594dfe90c990556f9ea36cd983d891e *tests/guImmun.R
+7d4dc651d7c884b0385ffa4843e4c8d7 *tests/lmerTest.R
+01b5965f5c3e67bf27a8e3b024a3cf1c *tests/versions.R
+bad05c22fdf345b002b5a4d6365f0f02 *vignettes/MlmSoftRev.Rnw
+8c5730e0cecdf2a96f647770924c1ee4 *vignettes/MlmSoftRev.bib
+56ac13cb9a2977e8858bf3e4d301d029 *vignettes/StarData.Rnw
+f41eb3d9fd035d80f016a01baeec764a *vignettes/myVignette.sty
diff --git a/NAMESPACE b/NAMESPACE
new file mode 100644
index 0000000..18ce6ff
--- /dev/null
+++ b/NAMESPACE
@@ -0,0 +1,2 @@
+# Export all names
+exportPattern(".")
diff --git a/build/vignette.rds b/build/vignette.rds
new file mode 100644
index 0000000..7793039
Binary files /dev/null and b/build/vignette.rds differ
diff --git a/data/Chem97.rda b/data/Chem97.rda
new file mode 100644
index 0000000..4beee18
Binary files /dev/null and b/data/Chem97.rda differ
diff --git a/data/Contraception.rda b/data/Contraception.rda
new file mode 100644
index 0000000..f8b0466
Binary files /dev/null and b/data/Contraception.rda differ
diff --git a/data/Early.rda b/data/Early.rda
new file mode 100644
index 0000000..06c9e04
Binary files /dev/null and b/data/Early.rda differ
diff --git a/data/Exam.rda b/data/Exam.rda
new file mode 100644
index 0000000..6b34cd8
Binary files /dev/null and b/data/Exam.rda differ
diff --git a/data/Gcsemv.rda b/data/Gcsemv.rda
new file mode 100644
index 0000000..37d7cd7
Binary files /dev/null and b/data/Gcsemv.rda differ
diff --git a/data/Hsb82.rda b/data/Hsb82.rda
new file mode 100644
index 0000000..c83d3e0
Binary files /dev/null and b/data/Hsb82.rda differ
diff --git a/data/Mmmec.rda b/data/Mmmec.rda
new file mode 100644
index 0000000..f755b4b
Binary files /dev/null and b/data/Mmmec.rda differ
diff --git a/data/Oxboys.rda b/data/Oxboys.rda
new file mode 100644
index 0000000..03f2130
Binary files /dev/null and b/data/Oxboys.rda differ
diff --git a/data/ScotsSec.rda b/data/ScotsSec.rda
new file mode 100644
index 0000000..4a91288
Binary files /dev/null and b/data/ScotsSec.rda differ
diff --git a/data/Socatt.rda b/data/Socatt.rda
new file mode 100644
index 0000000..2fd481a
Binary files /dev/null and b/data/Socatt.rda differ
diff --git a/data/bdf.rda b/data/bdf.rda
new file mode 100644
index 0000000..d941dc1
Binary files /dev/null and b/data/bdf.rda differ
diff --git a/data/egsingle.rda b/data/egsingle.rda
new file mode 100644
index 0000000..31ad735
Binary files /dev/null and b/data/egsingle.rda differ
diff --git a/data/guImmun.rda b/data/guImmun.rda
new file mode 100644
index 0000000..4d75a15
Binary files /dev/null and b/data/guImmun.rda differ
diff --git a/data/guPrenat.rda b/data/guPrenat.rda
new file mode 100644
index 0000000..798c962
Binary files /dev/null and b/data/guPrenat.rda differ
diff --git a/data/s3bbx.rda b/data/s3bbx.rda
new file mode 100644
index 0000000..f642427
Binary files /dev/null and b/data/s3bbx.rda differ
diff --git a/data/s3bby.rda b/data/s3bby.rda
new file mode 100644
index 0000000..9bdeae7
Binary files /dev/null and b/data/s3bby.rda differ
diff --git a/data/star.rda b/data/star.rda
new file mode 100644
index 0000000..23c142a
Binary files /dev/null and b/data/star.rda differ
diff --git a/debian/README.source b/debian/README.source
deleted file mode 100644
index 4039fdb..0000000
--- a/debian/README.source
+++ /dev/null
@@ -1,4 +0,0 @@
-The data/*.rda files contains non-generated datasets. More details can be found
-under the respective man/*.Rd files.
-
- -- Sébastien Villemot <sebastien at debian.org>, Sat, 30 Sep 2017 10:51:37 +0200
diff --git a/debian/README.test b/debian/README.test
deleted file mode 100644
index 88a2b9a..0000000
--- a/debian/README.test
+++ /dev/null
@@ -1,10 +0,0 @@
-R msm for Debian
-----------------
-
-This package can be tested running the unit test suite via
-
- sh run-unit-test
-
-in this directory
-
- -- Andreas Tille <tille at debian.org> Wed, 21 May 2014 13:01:34 +0200
diff --git a/debian/changelog b/debian/changelog
deleted file mode 100644
index 76e1751..0000000
--- a/debian/changelog
+++ /dev/null
@@ -1,25 +0,0 @@
-r-cran-mlmrev (1.0-6-3) unstable; urgency=medium
-
- * Team upload.
- * Rebuild for r-api-3.4 transition.
- * Convert to dh-r.
- * Add README.source documenting *.rda files.
- * Bump Standards-Version to 4.1.1.
- * Remove unneeded Testsuite header.
- * Use canonical CRAN homepage.
- * Use secure URL in Format field of d/copyright.
- * Bump to watch file format version 4.
-
- -- Sébastien Villemot <sebastien at debian.org> Sun, 01 Oct 2017 11:11:18 +0200
-
-r-cran-mlmrev (1.0-6-2) unstable; urgency=medium
-
- * Fix package name in autopkgtest (thanks to Gordon Ball for the patch)
-
- -- Andreas Tille <tille at debian.org> Thu, 23 Jun 2016 08:49:21 +0200
-
-r-cran-mlmrev (1.0-6-1) unstable; urgency=low
-
- * Initial release (Closes: #826850).
-
- -- Andreas Tille <tille at debian.org> Thu, 09 Jun 2016 15:08:48 +0200
diff --git a/debian/compat b/debian/compat
deleted file mode 100644
index ec63514..0000000
--- a/debian/compat
+++ /dev/null
@@ -1 +0,0 @@
-9
diff --git a/debian/control b/debian/control
deleted file mode 100644
index f1e3e84..0000000
--- a/debian/control
+++ /dev/null
@@ -1,23 +0,0 @@
-Source: r-cran-mlmrev
-Maintainer: Debian Science Team <debian-science-maintainers at lists.alioth.debian.org>
-Uploaders: Andreas Tille <tille at debian.org>
-Section: gnu-r
-Priority: optional
-Build-Depends: debhelper (>= 10),
- dh-r,
- r-cran-lme4
-Standards-Version: 4.1.1
-Vcs-Browser: https://anonscm.debian.org/viewvc/debian-science/packages/R/r-cran-mlmrev/trunk/
-Vcs-Svn: svn://anonscm.debian.org/debian-science/packages/R/r-cran-mlmrev/trunk/
-Homepage: https://cran.r-project.org/package=mlmRev
-
-Package: r-cran-mlmrev
-Architecture: all
-Depends: ${misc:Depends},
- ${R:Depends}
-Recommends: ${R:Recommends}
-Suggests: ${R:Suggests}
-Description: GNU R Examples from Multilevel Modelling Software Review
- This GNU R package provides data and examples from a multilevel
- modelling software review as well as other well-known data sets from the
- multilevel modelling literature.
diff --git a/debian/copyright b/debian/copyright
deleted file mode 100644
index 60d16db..0000000
--- a/debian/copyright
+++ /dev/null
@@ -1,32 +0,0 @@
-Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
-Upstream-Name: mlmRev
-Upstream-Contact: Steve Walker <steve.walker at utoronto.ca>
-Source: https://cran.r-project.org/package=mlmRev
-
-Files: *
-Copyright: 2014-2016 Steve Walker <steve.walker at utoronto.ca>,
- Douglas Bates, Martin Maechler and Ben Bolker
-License: GPL-2+
-
-Files: debian/*
-Copyright: 2016 Andreas Tille <tille at debian.org>
-License: GPL-2+
-
-License: GPL-2+
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
- .
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- .
- You should have received a copy of the GNU General Public License along
- with this program; if not, write to the Free Software Foundation, Inc.,
- 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
- .
- Comment: On Debian systems, the complete text of the GNU General Public
- License can be found in `/usr/share/common-licenses/GPL'.
-
diff --git a/debian/docs b/debian/docs
deleted file mode 100644
index 3adf0d6..0000000
--- a/debian/docs
+++ /dev/null
@@ -1,3 +0,0 @@
-debian/README.test
-debian/tests/run-unit-test
-tests
diff --git a/debian/rules b/debian/rules
deleted file mode 100755
index 68d9a36..0000000
--- a/debian/rules
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/make -f
-
-%:
- dh $@ --buildsystem R
diff --git a/debian/source/format b/debian/source/format
deleted file mode 100644
index 163aaf8..0000000
--- a/debian/source/format
+++ /dev/null
@@ -1 +0,0 @@
-3.0 (quilt)
diff --git a/debian/tests/control b/debian/tests/control
deleted file mode 100644
index 905f1ee..0000000
--- a/debian/tests/control
+++ /dev/null
@@ -1,3 +0,0 @@
-Tests: run-unit-test
-Depends: @, @builddeps@, r-cran-numderiv, r-cran-testthat
-Restrictions: allow-stderr
diff --git a/debian/tests/run-unit-test b/debian/tests/run-unit-test
deleted file mode 100644
index b4c6c37..0000000
--- a/debian/tests/run-unit-test
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/bin/sh -e
-
-pkg=r-cran-mlmrev
-
-if [ "$ADTTMP" = "" ] ; then
- ADTTMP=`mktemp -d /tmp/${pkg}-test.XXXXXX`
-fi
-cd $ADTTMP
-cp -a /usr/share/doc/${pkg}/tests/* $ADTTMP
-find . -name "*.gz" -exec gunzip \{\} \;
-for testx in `ls *.R` ; do
- LC_ALL=C.UTF-8 R --no-save < $testx
-done
diff --git a/debian/watch b/debian/watch
deleted file mode 100644
index a14177e..0000000
--- a/debian/watch
+++ /dev/null
@@ -1,2 +0,0 @@
-version=4
-http://cran.r-project.org/src/contrib/mlmRev_([-\d.]*)\.tar\.gz
diff --git a/inst/doc/MlmSoftRev.R b/inst/doc/MlmSoftRev.R
new file mode 100644
index 0000000..c8ab270
--- /dev/null
+++ b/inst/doc/MlmSoftRev.R
@@ -0,0 +1,285 @@
+### R code from vignette source 'MlmSoftRev.Rnw'
+
+###################################################
+### code chunk number 1: preliminaries
+###################################################
+options(width=80, show.signif.stars = FALSE,
+ lattice.theme = function() canonical.theme("pdf", color = FALSE))
+library(mlmRev)
+library(lme4)
+library(lattice)
+set.seed(1234321)
+
+
+###################################################
+### code chunk number 2: Examprep
+###################################################
+lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam)
+
+
+###################################################
+### code chunk number 3: ExamData
+###################################################
+str(Exam)
+summary(Exam)
+
+
+###################################################
+### code chunk number 4: ExamFit
+###################################################
+(Em1 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+
+
+###################################################
+### code chunk number 5: Examtime
+###################################################
+system.time(lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+
+
+###################################################
+### code chunk number 6: ExamRelevel
+###################################################
+Exam$sex <- relevel(Exam$sex, "M")
+Exam$schgend <- relevel(Exam$schgend, "girls")
+(Em2 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+
+
+###################################################
+### code chunk number 7: ExamIds
+###################################################
+Exam <- within(Exam, ids <- factor(school:student))
+str(Exam)
+
+
+###################################################
+### code chunk number 8: dupExamIds
+###################################################
+as.character(Exam$ids[which(duplicated(Exam$ids))])
+
+
+###################################################
+### code chunk number 9: OnlyBoy
+###################################################
+subset(Exam, ids == '43:86')
+xtabs(~ sex + school, Exam, subset = school %in% c(43, 50, 52), drop = TRUE)
+
+
+###################################################
+### code chunk number 10: ExamXtabs
+###################################################
+xtabs(~ sex + school, Exam, subset = type == "Mxd", drop = TRUE)
+
+
+###################################################
+### code chunk number 11: ExamMosaicshow (eval = FALSE)
+###################################################
+## ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+## mosaicplot(~ school + sex, ExamMxd)
+
+
+###################################################
+### code chunk number 12: ExamMosaic
+###################################################
+ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+mosaicplot(~ school + sex, ExamMxd)
+
+
+###################################################
+### code chunk number 13: Examplot1
+###################################################
+print(xyplot(normexam ~ standLRT | sex * type, Exam,
+ type = c("g", "p", "smooth"), layout = c(2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score", aspect = 1.2))
+
+
+###################################################
+### code chunk number 14: Examplot1show (eval = FALSE)
+###################################################
+## xyplot(normexam ~ standLRT | sex * type, Exam, type = c("g", "p", "smooth"))
+
+
+###################################################
+### code chunk number 15: Examplot2
+###################################################
+print(xyplot(normexam ~ standLRT, Exam, groups = sex:type,
+ type = c("g", "smooth"), xlim = c(-3,3), ylim = c(-2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ auto.key = list(space = 'right', points = FALSE, lines = TRUE),
+ aspect = 1))
+
+
+###################################################
+### code chunk number 16: Examplot2show (eval = FALSE)
+###################################################
+## xyplot(normexam ~ standLRT, Exam, groups = sex:type, type = c("g", "smooth"))
+
+
+###################################################
+### code chunk number 17: Examplot3
+###################################################
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl"))
+
+
+###################################################
+### code chunk number 18: Examplot3show
+###################################################
+xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ subset = sex == "F" & type == "Sngl")
+
+
+###################################################
+### code chunk number 19: Examplot4show (eval = FALSE)
+###################################################
+## xyplot(normexam ~ standLRT | school, Exam, type = c("g", "p", "r"),
+## subset = sex == "F" & type == "Sngl" & school != 48,
+## index.cond = function(x, y) coef(lm(y ~ x))[1])
+
+
+###################################################
+### code chunk number 20: Examplot4
+###################################################
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+
+
+###################################################
+### code chunk number 21: Examplot4a
+###################################################
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[2]))
+
+
+###################################################
+### code chunk number 22: ExamLmListFS
+###################################################
+show(ExamFS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "F" & type == "Sngl" & school != 48))
+
+
+###################################################
+### code chunk number 23: Examplot4cshow
+###################################################
+plot(confint(ExamFS, pool = TRUE), order = 1)
+
+
+###################################################
+### code chunk number 24: Examplot4c
+###################################################
+print(plot(confint(ExamFS, pool = TRUE), order = 1))
+
+
+###################################################
+### code chunk number 25: Examplot5
+###################################################
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "M" & type == "Sngl", layout = c(5,2),
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+
+
+###################################################
+### code chunk number 26: ExamLmListMS
+###################################################
+show(ExamMS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "M" & type == "Sngl"))
+
+
+###################################################
+### code chunk number 27: Examplot5b
+###################################################
+print(plot(confint(ExamMS, pool = TRUE), order = 1))
+
+
+###################################################
+### code chunk number 28: Examplot6
+###################################################
+print(xyplot(normexam ~ standLRT | school, Exam, groups = sex,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = !school %in% c(43, 47) & type == "Mxd",
+ index.cond = function(x, y) coef(lm(y ~ x))[1],
+ auto.key = list(space = 'top', lines = TRUE,
+ columns = 2), layout = c(7,5),
+ aspect = 1.2))
+
+
+###################################################
+### code chunk number 29: ExamLmListM
+###################################################
+show(ExamM <- lmList(normexam ~ standLRT + sex| school, Exam,
+ subset = type == "Mxd" & !school %in% c(43,47,54)))
+
+
+###################################################
+### code chunk number 30: Examplot6b
+###################################################
+print(plot(confint(ExamM, pool = TRUE), order = 1))
+
+
+###################################################
+### code chunk number 31: Em3
+###################################################
+(Em3 <- lmer(normexam ~ standLRT + sex + type + (1|school), Exam))
+
+
+###################################################
+### code chunk number 32: Em4
+###################################################
+(Em4 <- lmer(normexam ~ standLRT + sex + type + (standLRT|school), Exam))
+
+
+###################################################
+### code chunk number 33: EmAnova
+###################################################
+anova(Em3, Em4)
+
+
+###################################################
+### code chunk number 34: Em5
+###################################################
+(Em5 <- lmer(normexam ~ standLRT + sex + type + (standLRT + sex|school), Exam))
+
+
+###################################################
+### code chunk number 35: Oxboys
+###################################################
+str(Oxboys)
+system.time(mX1 <- lmer(height ~ age + I(age^2) + I(age^3) + I(age^4) + (age + I(age^2)|Subject),
+ Oxboys))
+summary(mX1)
+system.time(mX2 <- lmer(height ~ poly(age,4) + (age + I(age^2)|Subject), Oxboys))
+summary(mX2)
+
+
+###################################################
+### code chunk number 36: ScotsSec
+###################################################
+str(ScotsSec)
+system.time(mS1 <- lmer(attain ~ sex + (1|primary) + (1|second), ScotsSec))
+summary(mS1)
+
+
+###################################################
+### code chunk number 37: sessionInfo
+###################################################
+toLatex(sessionInfo())
+
+
diff --git a/inst/doc/MlmSoftRev.Rnw b/inst/doc/MlmSoftRev.Rnw
new file mode 100644
index 0000000..af8726a
--- /dev/null
+++ b/inst/doc/MlmSoftRev.Rnw
@@ -0,0 +1,646 @@
+\documentclass[12pt]{article}
+\usepackage{Sweave}
+\usepackage{myVignette}
+\usepackage[authoryear,round]{natbib}
+\bibliographystyle{plainnat}
+\DefineVerbatimEnvironment{Sinput}{Verbatim}
+{formatcom={\vspace{-1.5ex}},fontshape=sl,
+ fontfamily=courier,fontseries=b, fontsize=\scriptsize}
+\DefineVerbatimEnvironment{Soutput}{Verbatim}
+{formatcom={\vspace{-2.5ex}},fontfamily=courier,fontseries=b,%
+ fontsize=\scriptsize}
+%%\VignetteIndexEntry{Examples from Multilevel Software Reviews}
+%%\VignetteDepends{lme4}
+%%\VignetteDepends{lattice}
+\begin{document}
+\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=true,keep.source=TRUE}
+\SweaveOpts{prefix=TRUE,prefix.string=SoftRev,include=TRUE}% ^^^^^^^^^^^^^^^^
+\setkeys{Gin}{width=\textwidth}
+\title{Examples from Multilevel Software Comparative Reviews}
+\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates at R-project.org}}
+\date{February 2005, {\small with updates up to \today}}
+\maketitle
+\begin{abstract}
+ The Center for Multilevel Modelling at the Institute of Education,
+ London maintains a web site of ``Software reviews of multilevel
+ modeling packages''. The data sets discussed in the reviews are
+ available at this web site. We have incorporated these data sets
+ in the \code{mlmRev} package for \RR{} and, in this vignette, provide
+ the results of fitting several models to these data sets.
+\end{abstract}
+<<preliminaries,echo=FALSE,print=FALSE,results=hide>>=
+options(width=80, show.signif.stars = FALSE,
+ lattice.theme = function() canonical.theme("pdf", color = FALSE))
+library(mlmRev)
+library(lme4)
+library(lattice)
+set.seed(1234321)
+@
+
+\section{Introduction}
+\label{sec:Intro}
+
+\RR{} is an Open Source implementation of John Chambers' \Slang{}
+language for data analysis and graphics. \RR{} was initially developed
+by Ross Ihaka and Robert Gentleman of the University of Auckland and
+now is developed and maintained by an international group of
+statistical computing experts.
+
+In addition to being Open Source software, which means that anyone can
+examine the source code to see exactly how the computations are being
+carried out, \RR{} is freely available from a network of archive sites
+on the Internet. There are precompiled versions for installation on
+the Windows operating system, Mac OS X and several distributions of
+the Linux operating system. Because the source code is available
+those interested in doing so can compile their own version if they wish.
+
+\RR{} provides an environment for interactive computing with data and for
+graphical display of data. Users and developers can extend the
+capabilities of \RR{} by writing their own functions in the language
+and by creating packages of functions and data sets. Many such
+packages are available on the archive network called CRAN
+(Comprehensive R Archive Network) for which the parent site is
+\url{http://cran.r-project.org}. One such package called \code{lme4}
+(along with a companion package called \code{Matrix}) provides
+functions to fit and display linear mixed models and generalized
+linear mixed models, which are the statisticians' names for the models
+called multilevel models or hierarchical linear models in other
+disciplines. The \code{lattice} package provides functions to
+generate several high level graphics plots that help with the
+visualization of the types of data to which such models are fit.
+Finally, the \code{mlmRev} package provides the data sets used in the
+``Software Reviews of Multilevel Modeling Packages'' from the
+Multilevel Modeling Group at the Institute of Education in the UK.
+This package also contains several other data sets from the multilevel
+modeling literature.
+
+The software reviews mentioned above were intended to provide
+comparison of the speed and accuracy of many different packages for
+fitting multilevel models. As such, there is a standard set of
+models that fit to each of the data sets in each of the packages that
+were capable of doing the fit. We will fit these models for
+comparative purposes but we will also do some graphical exploration of
+the data and, in some cases, discuss alternative models.
+
+We follow the general outline of the previous reviews, beginning with
+simpler structures and moving to the more complex structures. Because
+the previous reviews were performed on older and considerably slower
+computers than the ones on which this vignette will be compiled, the
+timings produced by the \code{system.time} function and shown in the
+text should not be compared with previous timings given on the web
+site. They are an indication of the times required to fit such models
+to these data sets on recent computers with processors running at
+around 2 GHz or faster.
+
+\section{Two-level normal models}
+\label{sec:TwoLevelNormal}
+
+In the multilevel modeling literature a two-level model is one with
+two levels of random variation; the per-observation noise term and
+random effects which are grouped according to the levels of a factor.
+We call this factor a \textit{grouping factor}. If the response is
+measured on a continuous scale (more or less) our initial models are
+based on a normal distribution for the per-observation noise and for
+the random effects. Thus such a model is called a ``two-level normal
+model'' even though it has only one grouping factor for the random effects.
+
+
+\subsection{The Exam data}
+\label{sec:Examdata}
+<<Examprep,results=hide,echo=FALSE>>=
+lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam)
+@
+
+The data set called \code{Exam} provides the normalized exam scores
+attained by 4,059 students from 65 schools in inner London. Some of
+the covariates available with this exam score are the school the
+student attended, the sex of the student, the school gender (boys,
+girls, or mixed) and the student's result on the Standardised London
+Reading test.
+
+The \RR{} functions \code{str} and \code{summary} can be used to
+examine the structure of a data set (or, in general, any \RR{} object)
+and to provide a summary of an object.
+<<ExamData>>=
+str(Exam)
+summary(Exam)
+@
+
+
+\subsection{Model fits and timings}
+\label{sec:ExamFits}
+
+The first model to fit to the Exam data incorporates fixed-effects
+terms for the pretest score, the student's sex and the school gender.
+The only random-effects term is an additive shift associated with the
+school.
+
+<<ExamFit>>=
+(Em1 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+The \code{system.time} function can be used to time the execution of
+an \RR{} expression. It returns a vector of five numbers giving the
+user time (time spend executing applications code), the system time
+(time spent executing system functions called by the applications
+code), the elapsed time, and the user and system time for any child
+processes. The first number is what is commonly viewed as the time
+required to do the model fit. (The elapsed time is unsuitable because
+it can be affected by other processes running on the computer.) These
+times are in seconds. On modern computers this fit takes only a
+fraction of a second.
+
+<<Examtime>>=
+system.time(lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+
+
+\subsection{Interpreting the fit}
+\label{sec:ExamInterpret}
+
+As can be seen from the output, the default method of fitting a linear
+mixed model is restricted maximum likelihood (REML). The estimates of
+the variance components correspond to those reported by other packages
+as given on the Multilevel Modelling Group's web site. Note that the
+estimates of the variance components are given on the scale of the
+variance and on the scale of the standard deviation. That is, the
+values in the column headed \code{Std.Dev.} are simply the square
+roots of the corresponding entry in the \code{Variance} column. They
+are \textbf{not} standard errors of the estimate of the variance.
+
+The estimates of the fixed-effects are different from those quoted on
+the web site because the terms for \code{sex} and \code{schgend} use a
+different parameterization than in the reviews. Here the reference
+level of \code{sex} is female (\code{F}) and the coefficient labelled
+\code{sexM} represents the difference for males compared to females.
+Similarly the reference level of \code{schgend} is \code{mixed} and
+the two coefficients represent the change from mixed to boys only and
+the change from mixed to girls only. The value of the coefficient
+labelled \code{Intercept} is affected by both these changes as is the
+value of the REML criterion.
+
+To reproduce the results obtained from other packages, we must change
+the reference level for each of these factors.
+
+<<ExamRelevel>>=
+Exam$sex <- relevel(Exam$sex, "M")
+Exam$schgend <- relevel(Exam$schgend, "girls")
+(Em2 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+The coefficients now correspond to those in the tables on the web
+site. It happens that the REML criterion at the optimum in this fit
+is the same as in the previous fit, but you cannot depend on this
+occuring. In general the value of the REML criterion at the optimum
+depends on the parameterization used for the fixed effects.
+
+\subsection{Further exploration}
+\label{sec:ExamExplore}
+
+
+\subsubsection{Checking consistency of the data}
+\label{sec:consistency}
+
+It is important to check the consistency of data before trying to fit
+sophisticated models. One should plot the data in many different ways
+to see if it looks reasonableand also check relationships between
+variables.
+
+For example, each observation in these data is associated with a
+particular student. The variable \code{student} is not a unique
+identifier of the student as it only has 650 unique values. It is
+intended to be a unique identifier within a school but it is not. To
+show this we create a factor that is the interaction of school and
+student then drop unused levels.
+
+<<ExamIds>>=
+Exam <- within(Exam, ids <- factor(school:student))
+str(Exam)
+@
+
+Notice that there are 4059 observations but only 4055 unique levels of
+student within school. We can check the ones that are duplicated
+
+<<dupExamIds>>=
+as.character(Exam$ids[which(duplicated(Exam$ids))])
+@
+
+One of these cases
+<<OnlyBoy>>=
+subset(Exam, ids == '43:86')
+xtabs(~ sex + school, Exam, subset = school %in% c(43, 50, 52), drop = TRUE)
+@
+is particularly interesting. Notice that one of the students
+numbered 86 in school 43 is the only male student out of 61 students
+from this school who took the exam. It is quite likely that this
+student's score was attributed to the wrong school and that the school
+is in fact a girls-only school, not a mixed-sex school.
+
+The causes of the other three cases of duplicate student numbers
+within a school are not as clear. It would be necessary to go back
+to the original data records to check these.
+
+The cross-tabulation of the students by sex and school for the
+mixed-sex schools
+<<ExamXtabs>>=
+xtabs(~ sex + school, Exam, subset = type == "Mxd", drop = TRUE)
+@
+shows another anomaly. School 47 is similar to school 43 in that,
+although it is classified as a mixed-sex school, 81 male students
+and only one female student took the exam. It is likely that the
+school was misrecorded for this one female student and the school is a
+male-only school.
+
+Another school is worth noting. There were only eight students from
+school 54 who took the exam so any within-school estimates from this
+school will be unreliable.
+
+A mosaic plot (Figure~\ref{fig:ExamMosaic}) produced with
+<<ExamMosaicshow, eval = FALSE>>=
+ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+mosaicplot(~ school + sex, ExamMxd)
+@
+helps to detect mixed-sex schools with unusually large or unusually small ratios
+of females to males taking the exam.
+\begin{figure}[tbp]
+ \centering
+<<ExamMosaic,fig=TRUE,echo=FALSE,width=8,height=8>>=
+ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+mosaicplot(~ school + sex, ExamMxd)
+@
+ \caption{A mosaic plot of the sex distribution by school. The areas
+ of the rectangles are proportional to the number of students of
+ that sex from that school who took the exam. Schools with an
+ unusally large or unusually small ratio or females to males are
+ highlighted.}
+ \label{fig:ExamMosaic}
+\end{figure}
+
+\subsubsection{Preliminary graphical displays}
+\label{sec:Graphical}
+
+In addition to the pretest score (\code{standLRT}), the predictor
+variables used in this model are the student's sex and the school
+gender, which is coded as having three levels. There is some
+redundancy in these two variables in that all the students in a
+boys-only school must be male. For graphical exploration we convert
+from \code{schgend} to \code{type}, an indicator of whether the school
+is a mixed-sex school or a single-sex school, and plot the response
+versus the pretest score for each combination of sex and school type.
+\begin{figure}[tbp]
+ \centering
+<<Examplot1,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | sex * type, Exam,
+ type = c("g", "p", "smooth"), layout = c(2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score", aspect = 1.2))
+@
+ \caption{Normalized exam score versus pretest
+ (Standardized London Reading Test) score for 4095 students from 65 schools
+ in inner London. The panels on the left show the male students'
+ scores; those on the right show the females' scores. The top row
+ of panels shows the scores of students in single-sex schools and
+ the bottom row shows the scores of students in mixed-sex
+ schools. A scatterplot smoother line for each panel has been added
+ to help visualize the trend.}
+ \label{fig:Examplot1}
+\end{figure}
+
+This plot is created with the \code{xyplot} from the \code{lattice}
+package as (essentially)
+<<Examplot1show, eval = FALSE>>=
+xyplot(normexam ~ standLRT | sex * type, Exam, type = c("g", "p", "smooth"))
+@
+The formula would be read as ``plot \code{normexam} by \code{standLRT}
+given \code{sex} by (school) \code{type}''. A few other arguments were
+added in the actual call to make the axis annotations more readable.
+
+Figure~\ref{fig:Examplot1} shows the even after accounting for a
+student's sex, pretest score and school type, there is considerable
+variation in the response. We may attribute some of this variation to
+differences in schools but the fitted model indicates that most of the
+variation is unaccounted or ``residual'' variation.
+
+In some ways the high level of residual variation obscures the pattern
+in the data. By removing the data points and overlaying the
+scatterplot smoothers we can concentrate on the relationships between
+the covariates.
+\begin{figure}[tbp]
+ \centering
+<<Examplot2,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(xyplot(normexam ~ standLRT, Exam, groups = sex:type,
+ type = c("g", "smooth"), xlim = c(-3,3), ylim = c(-2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ auto.key = list(space = 'right', points = FALSE, lines = TRUE),
+ aspect = 1))
+@
+ \caption{Overlaid scatterplot smoother lines of the normalized test
+ scores versus the pretest (Standardized London Reading Test) score
+ for female (F) and male (M) students in single-sex (Sngl) and
+ mixed-sex (Mxd) schools.}
+ \label{fig:Examplot2}
+\end{figure}
+The call to \code{xyplot} is essentially
+<<Examplot2show,eval=FALSE>>=
+xyplot(normexam ~ standLRT, Exam, groups = sex:type, type = c("g", "smooth"))
+@
+
+Figure~\ref{fig:Examplot2} is a remarkable plot in that it shows
+nearly a perfect ``main effects'' relationship of the response with
+the three covariates and almost no interaction. It is rare to see
+real data follow a simple theoretical relationship so closely.
+
+To elaborate, we can see that for each of the four groups the smoothed
+relationship between the exam score and the pretest score is close to
+a straight line and that the lines are close to being parallel. The
+only substantial deviation is in the smoothed relationship for the
+males in single-sex schools and this is the group with the fewest
+observations and hence the least precision in the estimated
+relationship. The lack of parallelism for this group is most apparent
+in the region of extremely low pretest scores where there are few
+observations and a single student who had a low pretest score and a
+moderate post-test score can substantially influence the curve. Five
+or six such points can be seen in the upper left panel of
+Figure~\ref{fig:Examplot1}.
+
+We should also notice the ordering of the lines and the spacing
+between the lines. The smoothed relationships for students in
+single-sex schools are consistently above those in the mixed-sex
+schools and, except for the region of low pretest scores described
+above, the relationship for the females in a given type of school is
+consistently above that for the males. Furthermore the distance
+between the female and male lines in the single-sex schools is
+approximately the same as the corresponding distance in the mixed-sex
+schools. We would summarize this by saying that there is a positive
+effect for females versus males and a positive effect for single-sex
+versus mixed-sex and no indication of interaction between these
+factors.
+
+
+\subsubsection{The effect of schools}
+\label{sec:schoolEffects}
+
+We can check for patterns within and between schools by plotting the
+response versus the pretest by school. Because there appear to be
+differences in this relationship for single-sex versus mixed-sex
+schools and for females versus males we consider these separately.
+\begin{figure}[tbp]
+ \centering
+<<Examplot3,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl"))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools.}
+ \label{fig:Examplot3}
+\end{figure}
+
+In Figure~\ref{fig:Examplot3} we plot the normalized exam scores
+versus the pretest score by school for female students in single-sex
+schools. The plot is produced as
+<<Examplot3show>>=
+xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ subset = sex == "F" & type == "Sngl")
+@
+The \code{"r"} in the \code{type} argument adds a simple linear
+regression line to each panel.
+
+The first thing we notice in Figure~\ref{fig:Examplot3} is that
+school 48 is an anomaly because only two students in this school took
+the exam. Because within-school results based on only two students
+are unreliable, we will exclude this school from further plots (but we
+do include these data when fitting comprehensive models).
+
+Although the regression lines on the panels can help us to look for
+variation in the schools, the ordering of the panels is, for our
+purposes, random. We recreate this plot in Figure~\ref{fig:Examplot4} using
+<<Examplot4show,eval=FALSE>>=
+xyplot(normexam ~ standLRT | school, Exam, type = c("g", "p", "r"),
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[1])
+@
+\begin{figure}[tbp]
+ \centering
+<<Examplot4,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools. School 48 where only two students took the exam has been
+ eliminated and the panels have been ordered by increasing
+ intercept (predicted normalized score for a pretest score of 0) of
+ the regression line.}
+ \label{fig:Examplot4}
+\end{figure}
+so that the panels are ordered (from left to right starting at the
+bottom row) by increasing intercept for the regression line (i.e.{} by
+increasing predicted exam score for a student with a pretest score of 0).
+
+Alternatively, we could order the panels by increasing slope of the
+within-school regression lines, as in Figure~\ref{fig:Examplot4a}.
+\begin{figure}[tbp]
+ \centering
+<<Examplot4a,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[2]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools. School 48 has been eliminated and the panels have been
+ ordered by increasing slope of the within-school regression
+ lines.}
+ \label{fig:Examplot4a}
+\end{figure}
+
+Although it is informative to plot the within-school regression lines
+we need to assess the variability in the estimates of the coefficients
+before concluding if there is ``significant'' variability between
+schools. We can obtain the individual regression fits with the
+\code{lmList} function
+<<ExamLmListFS>>=
+show(ExamFS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "F" & type == "Sngl" & school != 48))
+@
+and compare the confidence intervals on these coefficients.
+<<Examplot4cshow>>=
+plot(confint(ExamFS, pool = TRUE), order = 1)
+@
+\begin{figure}[tbp]
+ \centering
+<<Examplot4c,fig=TRUE,echo=FALSE,width=8,height=3>>=
+print(plot(confint(ExamFS, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot4c}
+\end{figure}
+
+
+\begin{figure}[tbp]
+ \centering
+<<Examplot5,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "M" & type == "Sngl", layout = c(5,2),
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for male students in single-sex
+ schools.}
+ \label{fig:Examplot5}
+\end{figure}
+
+<<ExamLmListMS>>=
+show(ExamMS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "M" & type == "Sngl"))
+@
+The corresponding plot of the confidence intervals is shown in
+Figure~\ref{fig:Examplot5b}.
+\begin{figure}[tbp]
+ \centering
+<<Examplot5b,fig=TRUE,echo=FALSE,width=8,height=2.5>>=
+print(plot(confint(ExamMS, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot5b}
+\end{figure}
+
+For the mixed-sex schools we can consider the effect of the pretest
+score and sex in the plot (Figure~\ref{fig:Examplot6}) and in the
+\begin{figure}[tbp]
+ \centering
+<<Examplot6,fig=TRUE,echo=FALSE,width=8,height=9>>=
+print(xyplot(normexam ~ standLRT | school, Exam, groups = sex,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = !school %in% c(43, 47) & type == "Mxd",
+ index.cond = function(x, y) coef(lm(y ~ x))[1],
+ auto.key = list(space = 'top', lines = TRUE,
+ columns = 2), layout = c(7,5),
+ aspect = 1.2))
+@
+ \caption{Normalized exam scores versus pretest
+ score by school and sex for students in mixed-sex
+ schools.}
+ \label{fig:Examplot6}
+\end{figure}
+separate model fits for each school.
+<<ExamLmListM>>=
+show(ExamM <- lmList(normexam ~ standLRT + sex| school, Exam,
+ subset = type == "Mxd" & !school %in% c(43,47,54)))
+@
+The confidence intervals for these separately fitted models,
+\begin{figure}[tbp]
+ \centering
+<<Examplot6b,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(plot(confint(ExamM, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot6b}
+\end{figure}
+shown in Figure~\ref{fig:Examplot6b} indicate differences in the
+intercepts and possibly differences in the slopes with respect to the
+pretest scores. However, there is not a strong indication of
+variation by school in the effect of sex.
+
+
+\subsection{Multilevel models for the exam data}
+\label{sec:ExamModels}
+
+We begin with a model that has a random effects for the intercept by
+school plus additive fixed effects for the pretest score, the
+student's sex and the school type.
+<<Em3>>=
+(Em3 <- lmer(normexam ~ standLRT + sex + type + (1|school), Exam))
+@
+Our data exploration indicated that the slope with respect to the
+pretest score may vary by school. We can fit a model with random
+effects by school for both the slope and the intercept as
+<<Em4>>=
+(Em4 <- lmer(normexam ~ standLRT + sex + type + (standLRT|school), Exam))
+@
+and compare this fit to the previous fit with
+<<EmAnova>>=
+anova(Em3, Em4)
+@
+There is a strong evidence of a significant random effect for the
+slope by school, whether judged by AIC, BIC or the p-value for the
+likelihood ratio test.
+
+The p-value for the likelihood ratio test is based on a $\chi^2$ distribution
+with degrees of freedom calculated as the difference in the number of
+parameters in the two models. Because one of the parameters
+eliminated from the full model in the submodel is at its boundary the
+usual asymptotics for the likelihood ratio test do not apply.
+However, it can be shown that the p-value quoted for the test is
+conservative in the sense that it is an upper bound on the
+p-value that would be calculated say from a parametric bootstrap.
+
+Having an upper bound of $1.9\times 10^{-10}$ on the p-value can be
+regarded as ``highly significant'' evidence of the utility of the
+random effect for the slope by school.
+
+We could also add a random effect for the student's sex by school
+<<Em5>>=
+(Em5 <- lmer(normexam ~ standLRT + sex + type + (standLRT + sex|school), Exam))
+@
+Notice that the estimate of the variance of the \code{sexM} term is
+essentially zero so there is no need to test the significance of this
+variance component. We proceed with the analysis of \code{Em4}.
+
+\section{Growth curve model for repeated measures data}
+\label{sec:GrowthCurve}
+
+<<Oxboys>>=
+str(Oxboys)
+system.time(mX1 <- lmer(height ~ age + I(age^2) + I(age^3) + I(age^4) + (age + I(age^2)|Subject),
+ Oxboys))
+summary(mX1)
+system.time(mX2 <- lmer(height ~ poly(age,4) + (age + I(age^2)|Subject), Oxboys))
+summary(mX2)
+@
+
+\section{Cross-classification model}
+\label{sec:CrossClassified}
+
+<<ScotsSec>>=
+str(ScotsSec)
+system.time(mS1 <- lmer(attain ~ sex + (1|primary) + (1|second), ScotsSec))
+summary(mS1)
+@
+
+\section{Session Info}
+
+<<sessionInfo, results=tex>>=
+toLatex(sessionInfo())
+@
+
+%\bibliography{MlmSoftRev}
+\end{document}
diff --git a/inst/doc/MlmSoftRev.pdf b/inst/doc/MlmSoftRev.pdf
new file mode 100644
index 0000000..ffd4367
Binary files /dev/null and b/inst/doc/MlmSoftRev.pdf differ
diff --git a/inst/doc/StarData.R b/inst/doc/StarData.R
new file mode 100644
index 0000000..0fbcbe8
--- /dev/null
+++ b/inst/doc/StarData.R
@@ -0,0 +1,203 @@
+### R code from vignette source 'StarData.Rnw'
+
+###################################################
+### code chunk number 1: preliminaries
+###################################################
+options(show.signif.stars = FALSE,width=68)
+library(mlmRev)
+library(lme4)
+
+
+###################################################
+### code chunk number 2: datainput
+###################################################
+str(orig <- read.delim(unz(system.file("original/text-star.zip",
+ package = "mlmRev"),
+ "webstar.txt"),
+ colCl = rep(c("factor", "integer"), c(5, 48))))
+
+
+###################################################
+### code chunk number 3: strorig
+###################################################
+str(orig)
+
+
+###################################################
+### code chunk number 4: missval
+###################################################
+mv <- rep("9", 53)
+mv[c(4,17,26,34,45)] <- "99"
+mv[c(19,20,27,28,35,36,39,40,46:53)] <- "999"
+mv[5] <- "9999"
+mv[1] <- "999999"
+for (i in seq(a = orig)) orig[[i]][orig[[i]] == mv[i]] <- NA
+summary(orig[1:5])
+
+
+###################################################
+### code chunk number 5: dropunused
+###################################################
+for (i in seq(a = orig)) orig[[i]] <- orig[[i]][drop = TRUE]
+summary(orig[1:5])
+
+
+###################################################
+### code chunk number 6: lcase
+###################################################
+names(orig) <- tolower(names(orig))
+
+
+###################################################
+### code chunk number 7: factorlevels
+###################################################
+orig$hdegk <- ordered(orig$hdegk, levels = 1:6,
+ labels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D"))
+orig$hdeg1 <- ordered(orig$hdeg1, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg2 <- ordered(orig$hdeg2, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg3 <- ordered(orig$hdeg3, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$cladk <- factor(orig$cladk, levels = c(1:3,5:8),
+ labels = c("1","2","3","APPR","PROB","NOT","PEND"))
+orig$clad1 <- factor(orig$clad1, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad2 <- factor(orig$clad2, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad3 <- factor(orig$clad3, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+
+
+###################################################
+### code chunk number 8: student
+###################################################
+student <- orig[1:5]
+names(student) <- c("id", "sx", "eth", "birthq", "birthy")
+levels(student$sx) <- c("M", "F")
+levels(student$eth) <- c("W", "B", "A", "H", "I", "O")
+student$birthy <- ordered(student$birthy)
+student$birthq <- ordered(paste(student$birthy,student$birthq,sep=":"))
+summary(student)
+
+
+###################################################
+### code chunk number 9: long
+###################################################
+long <- data.frame(id = rep(orig$newid, 4),
+ gr = ordered(rep(c("K", 1:3), each = nrow(orig)),
+ levels = c("K", 1:3)),
+ star = factor(unlist(orig[6:9])),
+ cltype = factor(unlist(orig[10:13])),
+ schtype = factor(unlist(orig[c(14,22,30,38)])),
+ hdeg = ordered(unlist(lapply(orig[c(15,24,32,43)],as.character)),
+ levels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D")),
+ clad = factor(unlist(lapply(orig[c(16,25,33,44)],as.character)),
+ levels = c("NOT","APPR","PROB","PEND","1","2","3")),
+ exp = unlist(orig[c(17,26,34,45)]),
+ trace = factor(unlist(orig[c(18,23,31,42)]), levels=1:6,
+ labels=c("W", "B", "A", "H", "I", "O")),
+ read = unlist(orig[c(19,27,35,39)]),
+ math = unlist(orig[c(20,28,36,40)]),
+ ses = factor(unlist(orig[c(21,29,37,41)]),labels=c("F","N")),
+ sch = factor(unlist(orig[50:53])))
+
+
+###################################################
+### code chunk number 10: summarylong
+###################################################
+summary(long)
+
+
+###################################################
+### code chunk number 11: isnacheck
+###################################################
+with(long, all.equal(is.na(schtype), is.na(sch)))
+with(long, all.equal(is.na(cltype), is.na(sch)))
+
+
+###################################################
+### code chunk number 12: longsubset
+###################################################
+long <- long[!is.na(long$sch),]
+
+
+###################################################
+### code chunk number 13: summlong
+###################################################
+summary(long[1:5])
+
+
+###################################################
+### code chunk number 14: dropstar
+###################################################
+long$star <- NULL
+
+
+###################################################
+### code chunk number 15: rownames
+###################################################
+rownames(long) <- paste(long$id, long$gr, sep = '/')
+
+
+###################################################
+### code chunk number 16: schooltable
+###################################################
+school <- unique(long[, c("sch", "schtype")])
+length(levels(school$sch)) == nrow(school)
+row.names(school) <- school$sch
+school <- school[order(as.integer(as.character(school$sch))),]
+long$schtype <- NULL
+levels(school$schtype) <- c("inner","suburb","rural","urban")
+levels(long$cltype) <- c("small", "reg", "reg+A")
+
+
+###################################################
+### code chunk number 17: merging
+###################################################
+star <- merge(merge(long, school, by = "sch"), student, by = "id")
+star$time <- as.integer(star$gr) - 1
+
+
+###################################################
+### code chunk number 18: teacher
+###################################################
+teacher <- unique(star[, c("cltype", "trace", "exp", "clad", "gr",
+ "hdeg", "sch")])
+teacher <- teacher[with(teacher, order(sch, gr, cltype, exp, hdeg,
+ clad, trace)), ]
+
+
+###################################################
+### code chunk number 19: teacherlabels
+###################################################
+row.names(teacher) <- tch <- teacher$tch <- seq(nrow(teacher))
+names(tch) <- do.call("paste", c(teacher[,1:7], list(sep=":")))
+star$tch <- tch[do.call("paste", c(star[c("cltype", "trace", "exp",
+ "clad", "gr", "hdeg", "sch")],
+ list(sep = ":")))]
+
+
+###################################################
+### code chunk number 20: classsizes
+###################################################
+table(table(star$tch))
+table(table(subset(star, cltype == "small")$tch))
+table(table(subset(star, cltype == "reg")$tch))
+table(table(subset(star, cltype == "reg+A")$tch))
+
+
+###################################################
+### code chunk number 21: initialModel
+###################################################
+library(lme4)
+(mm1 <- lmer(math ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+(rm1 <- lmer(read ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+
+
+###################################################
+### code chunk number 22: sessionInfo
+###################################################
+toLatex(sessionInfo())
+
+
diff --git a/inst/doc/StarData.Rnw b/inst/doc/StarData.Rnw
new file mode 100644
index 0000000..24525f6
--- /dev/null
+++ b/inst/doc/StarData.Rnw
@@ -0,0 +1,306 @@
+\documentclass[letterpaper]{article}
+\usepackage{fullpage}% save trees
+\usepackage{url}
+\title{Creating an R data set from STAR}
+\author{Douglas Bates\\Department of Statistics\\University of Wisconsin -- Madison}
+\date{April 5, 2005}
+\newcommand{\RR}{\textsf{R}}
+\newcommand{\code}[1]{\texttt{\small #1}}
+%%\VignetteIndexEntry{Creating an R data set from STAR}
+%%\VignetteDepends{lme4}
+\begin{document}
+\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3}
+\SweaveOpts{strip.white=true,keep.source=TRUE}
+\SweaveOpts{prefix=TRUE,prefix.string=STAR,include=FALSE}
+\maketitle
+\begin{abstract}
+ A substantial portion of the data from Tennessee's Student Teacher
+ Achievement Ratio (STAR) project, a large-scale, four-year study of
+ reduced class size, has been made available to the public at
+ \url{http://www.heros-inc.org/data.htm}. We describe the creation
+ of an \RR (\url{http:www.r-project.org}) data set from these data.
+\end{abstract}
+<<preliminaries,echo=FALSE,results=hide>>=
+options(show.signif.stars = FALSE,width=68)
+library(mlmRev)
+library(lme4)
+@
+
+\section{Introduction}
+\label{sec:intro}
+
+The data from the STAR project are available in several different
+forms from the web site \url{http://www.heros-inc.org/data.htm}. The
+most convenient form for creation of an \RR data set is the
+tab-delimited text file. Download the archive file
+\url{http://www.heros-inc.org/text-star.zip} containing two files:
+\texttt{readme.txt}, a description of the data, and
+\texttt{webstar.txt}, the data themselves.
+
+
+\section{Reading the data}
+\label{sec:reading}
+
+From the data description file we can see that there are 53 columns in
+the data set and most of these columns are coded values. Such columns
+should be represented as factors in \RR but many of these columns will
+need to be combined before we can work with them. We will convert the
+first 5 columns to factors and leave the remaining 48 columns as integers.
+<<datainput>>=
+str(orig <- read.delim(unz(system.file("original/text-star.zip",
+ package = "mlmRev"),
+ "webstar.txt"),
+ colCl = rep(c("factor", "integer"), c(5, 48))))
+@
+In the call to \code{read.delim} we read directly from the zip archive
+and avoided expanding the much larger text file. The call to
+\code{system.file} determines the name of a file that is part of a
+package. In practice it is often more convenient to use the
+\code{file.choose} function which brings up a file chooser panel.
+
+We also check the form of the original data by calling \code{str} on
+the object
+<<strorig>>=
+str(orig)
+@
+
+\subsection{Missing value codes}
+\label{sec:missval}
+
+All the columns except the first column have missing values present.
+Typically the missing value code is \code{"9"} but \code{"99"},
+\code{"999"} and \code{"9999"} are also used.
+We convert these to \RR's missing value code \code{NA} column by column.
+<<missval>>=
+mv <- rep("9", 53)
+mv[c(4,17,26,34,45)] <- "99"
+mv[c(19,20,27,28,35,36,39,40,46:53)] <- "999"
+mv[5] <- "9999"
+mv[1] <- "999999"
+for (i in seq(a = orig)) orig[[i]][orig[[i]] == mv[i]] <- NA
+summary(orig[1:5])
+@
+
+Notice that level \code{"9"} is still present for the \code{SSEX}
+variable even after all the observations at that level have been
+replaced by the missing value code. To remove these unused levels
+from this and all the other columns, we loop over the columns
+selecting all the values but using the optional argument \code{drop = TRUE}.
+<<dropunused>>=
+for (i in seq(a = orig)) orig[[i]] <- orig[[i]][drop = TRUE]
+summary(orig[1:5])
+@
+
+For convenience we convert the names of the columns to lower case.
+<<lcase>>=
+names(orig) <- tolower(names(orig))
+@
+
+\section{Setting factor levels}
+\label{sec:factorLevels}
+
+In \RR the levels of a factor can be given meaningful labels instead of
+numeric codes and in most cases this eliminates the need for a
+separate codebook. For example storing the labels of \code{sex} as
+\code{"M"} and \code{"F"} makes the coding self-explanatory. When
+used in a model a factor is automatically converted to a set of
+``contrasts'' (there is a technical definition of the term
+``contrast'' in linear models that is not always fulfilled by these
+derived variables) and the corresponding coefficients are given
+meaningful names.
+
+When there is a natural ordering of the levels of a factor it can be
+created as an ordered factor that will preserve this ordering.
+
+The labels can be set after the factor is created or as part of the
+creation of the factor. Below we will create a ``long form'' of the
+data where each row corresponds to a combination of student and
+grade. In doing this we will need to concatenate related columns of
+the original data frame. For example, the columns \code{cltypek},
+\code{cltype1}, \code{cltype2} and \code{cltype3} will be concatenated
+to form a single column \code{cltype}. If the coding is consistent
+across the grades then it is easiest to concatenate the integer codes
+and set the labels on the ``long'' version of the variable.
+
+However there are two groups of variables, \code{hdeg} and
+\code{clad}, that are not coded consistently. In each case the codes
+used for kindergarten teachers are different from those used for
+teachers of grades 1 to 3 classes. The codes for kindergarten
+teachers are a superset of those for the other teachers but the
+numbering is not consistent; a bachelor's degree is coded as 2 for
+kindergarten but 1 for the others. Thus we cannot combine the numeric
+values - we must create the labels for each column and then
+concatenate the labels and convert to a factor.
+
+<<factorlevels>>=
+orig$hdegk <- ordered(orig$hdegk, levels = 1:6,
+ labels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D"))
+orig$hdeg1 <- ordered(orig$hdeg1, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg2 <- ordered(orig$hdeg2, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg3 <- ordered(orig$hdeg3, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$cladk <- factor(orig$cladk, levels = c(1:3,5:8),
+ labels = c("1","2","3","APPR","PROB","NOT","PEND"))
+orig$clad1 <- factor(orig$clad1, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad2 <- factor(orig$clad2, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad3 <- factor(orig$clad3, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+@
+\section{Creating separate data frames}
+\label{sec:separate}
+
+These data are represented in a ``wide'' format where each row
+corresponds to a student. Some of the columns, such as \code{ssex},
+are indeed a property of the student; some, such as \code{hdegk} are
+properties of teachers; some, such as \code{schtypek} are properties
+of schools or classes in schools; and some are unique to a
+student/grade combination. We will create separate frames for each of
+these types.
+
+The first 5 columns are student-level data
+<<student>>=
+student <- orig[1:5]
+names(student) <- c("id", "sx", "eth", "birthq", "birthy")
+levels(student$sx) <- c("M", "F")
+levels(student$eth) <- c("W", "B", "A", "H", "I", "O")
+student$birthy <- ordered(student$birthy)
+student$birthq <- ordered(paste(student$birthy,student$birthq,sep=":"))
+summary(student)
+@
+
+The other columns refer to a combination of the student and grade. We
+first create an expanded or ``long'' version of the table with a row
+for each student/grade combination.
+
+To create the long version of the table we repeat the student ids four
+times and add a column for the grade level. Related groups of
+columns, such as \code{cltypek}, \code{cltype1}, \code{cltype2} and
+\code{cltype3}, are concatenated then converted to a factor. However,
+there are two groups, \code{hdeg} and \code{clad}, for which this
+approach will not work because these groups are not encoded
+consistently.
+<<long>>=
+long <- data.frame(id = rep(orig$newid, 4),
+ gr = ordered(rep(c("K", 1:3), each = nrow(orig)),
+ levels = c("K", 1:3)),
+ star = factor(unlist(orig[6:9])),
+ cltype = factor(unlist(orig[10:13])),
+ schtype = factor(unlist(orig[c(14,22,30,38)])),
+ hdeg = ordered(unlist(lapply(orig[c(15,24,32,43)],as.character)),
+ levels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D")),
+ clad = factor(unlist(lapply(orig[c(16,25,33,44)],as.character)),
+ levels = c("NOT","APPR","PROB","PEND","1","2","3")),
+ exp = unlist(orig[c(17,26,34,45)]),
+ trace = factor(unlist(orig[c(18,23,31,42)]), levels=1:6,
+ labels=c("W", "B", "A", "H", "I", "O")),
+ read = unlist(orig[c(19,27,35,39)]),
+ math = unlist(orig[c(20,28,36,40)]),
+ ses = factor(unlist(orig[c(21,29,37,41)]),labels=c("F","N")),
+ sch = factor(unlist(orig[50:53])))
+@
+
+We can now eliminate the combinations that are completely missing. Checking
+<<summarylong>>=
+summary(long)
+@
+indicates that fewest missing values are in the \code{sch},
+\code{cltype}, and \code{schtype} columns. They are also consistent
+<<isnacheck>>=
+with(long, all.equal(is.na(schtype), is.na(sch)))
+with(long, all.equal(is.na(cltype), is.na(sch)))
+@
+hence we use these to subset the data frame
+<<longsubset>>=
+long <- long[!is.na(long$sch),]
+@
+
+It turns out that we could have used the \code{star} column as this
+simply indicates if the student was in the study that year.
+<<summlong>>=
+summary(long[1:5])
+@
+Because it now contains no information we will drop it.
+<<dropstar>>=
+long$star <- NULL
+@
+
+For convenience we set the row names of this data frame to be a combination of the student id and the grade.
+<<rownames>>=
+rownames(long) <- paste(long$id, long$gr, sep = '/')
+@
+
+We can extract the school-level data from this table.
+<<schooltable>>=
+school <- unique(long[, c("sch", "schtype")])
+length(levels(school$sch)) == nrow(school)
+row.names(school) <- school$sch
+school <- school[order(as.integer(as.character(school$sch))),]
+long$schtype <- NULL
+levels(school$schtype) <- c("inner","suburb","rural","urban")
+levels(long$cltype) <- c("small", "reg", "reg+A")
+@
+
+We can create a merged data set with
+<<merging>>=
+star <- merge(merge(long, school, by = "sch"), student, by = "id")
+star$time <- as.integer(star$gr) - 1
+@
+
+\section{Assigning teacher ids}
+\label{sec:teacher}
+
+There are no teacher id numbers available but we can obtain a
+reasonably accurate surrogate by determining the unique combinations
+of all the variables associated with the teacher.
+<<teacher>>=
+teacher <- unique(star[, c("cltype", "trace", "exp", "clad", "gr",
+ "hdeg", "sch")])
+teacher <- teacher[with(teacher, order(sch, gr, cltype, exp, hdeg,
+ clad, trace)), ]
+@
+
+To generate the correspondence between the observations and the
+teacher we create labels that incorporate the levels of each of the
+variables that defined the unique combinations.
+<<teacherlabels>>=
+row.names(teacher) <- tch <- teacher$tch <- seq(nrow(teacher))
+names(tch) <- do.call("paste", c(teacher[,1:7], list(sep=":")))
+star$tch <- tch[do.call("paste", c(star[c("cltype", "trace", "exp",
+ "clad", "gr", "hdeg", "sch")],
+ list(sep = ":")))]
+@
+
+We can check if this is successful by generating tables of class sizes.
+<<classsizes>>=
+table(table(star$tch))
+table(table(subset(star, cltype == "small")$tch))
+table(table(subset(star, cltype == "reg")$tch))
+table(table(subset(star, cltype == "reg+A")$tch))
+@
+
+We see that there are three classes with sizes greater than 30 and
+that one of these is labelled as a ``small'' class. It is likely that
+each of these represents two or more classes but we do not have enough
+information to distinguish them.
+
+\section{Initial model fits}
+\label{sec:initial}
+
+Some initial model fits are
+<<initialModel>>=
+library(lme4)
+(mm1 <- lmer(math ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+(rm1 <- lmer(read ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+@
+
+\section{Session Info}
+
+<<sessionInfo, results=tex>>=
+toLatex(sessionInfo())
+@
+
+\end{document}
diff --git a/inst/doc/StarData.pdf b/inst/doc/StarData.pdf
new file mode 100644
index 0000000..39f6d0a
Binary files /dev/null and b/inst/doc/StarData.pdf differ
diff --git a/inst/original/text-star.zip b/inst/original/text-star.zip
new file mode 100644
index 0000000..9f9e50e
Binary files /dev/null and b/inst/original/text-star.zip differ
diff --git a/man/Chem97.Rd b/man/Chem97.Rd
new file mode 100644
index 0000000..db36009
--- /dev/null
+++ b/man/Chem97.Rd
@@ -0,0 +1,43 @@
+\name{Chem97}
+\alias{Chem97}
+\docType{data}
+\title{Scores on A-level Chemistry in 1997}
+\description{
+ Scores on the 1997 A-level Chemistry examination in Britain. Students
+ are grouped into schools within local education authories. In
+ addition some demographic and pre-test information is provided.
+}
+\usage{data(Chem97)}
+\format{
+ A data frame with 31022 observations on the following 8 variables.
+ \describe{
+ \item{lea}{Local Education Authority - a factor}
+ \item{school}{School identifier - a factor}
+ \item{student}{Student identifier - a factor}
+ \item{score}{Point score on A-level Chemistry in 1997}
+ \item{gender}{Student's gender}
+ \item{age}{Age in month, centred at 222 months or 18.5 years}
+ \item{gcsescore}{Average GCSE score of individual.}
+ \item{gcsecnt}{Average GCSE score of individual, centered at mean.}
+ }
+}
+\details{
+ This data set is relatively large with 31,022 individuals in 2,280
+ schools. Note that while this is used, illustratively, to fit Normal
+ response models, the distribution of the response is not well
+ described by a Normal distribution.}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/chem97.html}
+
+}
+\references{
+ Yang, M., Fielding, A. and Goldstein, H. (2002). Multilevel ordinal
+ models for examination grades (submitted to \emph{Statistical Modelling}).
+}
+\examples{
+str(Chem97)
+summary(Chem97)
+(fm1 <- lmer(score ~ (1|school) + (1|lea), Chem97))
+(fm2 <- lmer(score ~ gcsecnt + (1|school) + (1|lea), Chem97))
+}
+\keyword{datasets}
diff --git a/man/Contraception.Rd b/man/Contraception.Rd
new file mode 100644
index 0000000..daacdec
--- /dev/null
+++ b/man/Contraception.Rd
@@ -0,0 +1,39 @@
+\name{Contraception}
+\alias{Contraception}
+\docType{data}
+\title{Contraceptive use in Bangladesh}
+\description{
+ These data on the use of contraception by women in urban and rural
+ areas come from the 1988 Bangladesh Fertility Survey.
+}
+\usage{data(Contraception)}
+\format{
+ A data frame with 1934 observations on the following 6 variables.
+ \describe{
+ \item{woman}{Identifying code for each woman - a factor}
+ \item{district}{Identifying code for each district - a factor}
+ \item{use}{Contraceptive use at time of survey}
+ \item{livch}{Number of living children at time of survey - an
+ ordered factor. Levels are \code{0}, \code{1}, \code{2}, \code{3+}}
+ \item{age}{Age of woman at time of survey (in years), centred around
+ mean.}
+ \item{urban}{Type of region of residence - a factor. Levels are
+ \code{urban} and \code{rural}}
+ }
+}
+%\details{}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/bang.html}
+}
+\references{
+ Steele, F., Diamond, I. And Amin, S. (1996). Immunization uptake in
+ rural Bangladesh: a multilevel analysis. \emph{Journal of the Royal
+ Statistical Society, Series A} (159): 289-299.
+}
+\examples{
+str(Contraception)
+summary(Contraception)
+(fm1 <- glmer(use ~ urban+age+livch+(1|district), Contraception, binomial))
+(fm2 <- glmer(use ~ urban+age+livch+(urban|district), Contraception, binomial))
+}
+\keyword{datasets}
diff --git a/man/Early.Rd b/man/Early.Rd
new file mode 100644
index 0000000..de96657
--- /dev/null
+++ b/man/Early.Rd
@@ -0,0 +1,34 @@
+\name{Early}
+\alias{Early}
+\docType{data}
+\title{Early childhood intervention study}
+\description{
+ Cognitive scores of infants in a study of early childhood
+ intervention. The 103 infants from low income African American
+ families were divided into a treatment group (58
+ infants) and a control group (45 infants). Starting at 0.5 years of age
+ the infants in the treatment group were exposed to an enriched environment.
+ Each infant's cognitive score on an age-specific, normalized scale was
+ recorded at ages 1, 1.5, and 2 years.
+}
+\usage{data(Early)}
+\format{
+ This \code{groupedData} object contains the following columns
+ \describe{
+ \item{id}{An ordered factor of the id number for each infant.}
+ \item{cog}{A numeric cognitive score.}
+ \item{age}{The age of the infant at the measurement.}
+ \item{trt}{A factor with two levels, \code{"N"} and \code{"Y"},
+ indicating if the infant is in the early childhood intervention
+ program.}
+ }
+}
+%\source{}
+\references{
+ Singer, Judith D. and Willett, John B. (2003), \emph{Applied
+ Longitudinal Data Analysis}, Oxford University Press. (Ch. 3)
+}
+\examples{
+str(Early)
+}
+\keyword{datasets}
diff --git a/man/Exam.Rd b/man/Exam.Rd
new file mode 100644
index 0000000..f1dadea
--- /dev/null
+++ b/man/Exam.Rd
@@ -0,0 +1,44 @@
+\name{Exam}
+\alias{Exam}
+\docType{data}
+\title{Exam scores from inner London}
+\description{
+ Exam scores of 4,059 students from 65 schools in Inner London.
+}
+\usage{data(Exam)}
+\format{
+ A data frame with 4059 observations on the following 9 variables.
+ \describe{
+ \item{school}{School ID - a factor.}
+ \item{normexam}{Normalized exam score.}
+ \item{schgend}{School gender - a factor. Levels are \code{mixed},
+ \code{boys}, and \code{girls}.}
+ \item{schavg}{School average of intake score.}
+ \item{vr}{Student level Verbal Reasoning (VR) score band at intake -
+ a factor. Levels are \code{bottom 25\%}, \code{mid 50\%}, and
+ \code{top 25\%}.}
+ \item{intake}{Band of student's intake score - a factor.
+ Levels are \code{bottom 25\%}, \code{mid 50\%} and \code{top
+ 25\%}./}
+ \item{standLRT}{Standardised LR test score.}
+ \item{sex}{Sex of the student - levels are \code{F} and \code{M}.}
+ \item{type}{School type - levels are \code{Mxd} and \code{Sngl}.}
+ \item{student}{Student id (within school) - a factor}
+ }
+}
+%\details{}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/exam.html}
+}
+\references{
+ Goldstein, H., Rasbash, J., et al (1993). A multilevel analysis of
+ school examination results. \emph{Oxford Review of Education} 19: 425-433
+}
+\examples{
+str(Exam)
+summary(Exam)
+(fm1 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+(fm2 <- lmer(normexam ~ standLRT*sex + schgend + (1|school), Exam))
+(fm3 <- lmer(normexam ~ standLRT*sex + schgend + (1|school), Exam))
+}
+\keyword{datasets}
diff --git a/man/Gcsemv.Rd b/man/Gcsemv.Rd
new file mode 100644
index 0000000..00b76cf
--- /dev/null
+++ b/man/Gcsemv.Rd
@@ -0,0 +1,33 @@
+\name{Gcsemv}
+\alias{Gcsemv}
+\docType{data}
+\title{GCSE exam score}
+\description{
+ The GCSE exam scores on a science subject. Two components of the exam
+ were chosen as outcome variables: written paper and course work. There
+ are 1,905 students from 73 schools in England.
+}
+\usage{data(Gcsemv)}
+\format{
+ A data frame with 1905 observations on the following 5 variables.
+ \describe{
+ \item{school}{School ID - a factor}
+ \item{student}{Student ID - a factor}
+ \item{gender}{Gender of student}
+ \item{written}{Total score on written paper}
+ \item{course}{Total score on coursework paper}
+ }
+}
+%\details{}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/gcsemv.html}
+}
+\references{
+ Multivariate response models. (2000). In Rasbash, J., et al, \emph{A
+ user's guide to MLwiN}, Institute of Education, University of
+ London.
+}
+\examples{
+str(Gcsemv)
+}
+\keyword{datasets}
diff --git a/man/Hsb82.Rd b/man/Hsb82.Rd
new file mode 100644
index 0000000..05a7a9d
--- /dev/null
+++ b/man/Hsb82.Rd
@@ -0,0 +1,38 @@
+\name{Hsb82}
+\alias{Hsb82}
+\docType{data}
+\title{High School and Beyond - 1982}
+\description{
+ Data from the 1982 study ``High School and Beyond''.
+}
+\usage{data(Hsb82)}
+\format{
+ A data frame with 7185 observations on students including the
+ following 8 variables.
+ \describe{
+ \item{school}{an ordered factor designating the school that the
+ student attends.}
+ \item{minrty}{a factor with levels}
+ \item{sx}{a factor with levels \code{Male} and \code{Female}}
+ \item{ses}{a numeric vector of socio-economic scores}
+ \item{mAch}{a numeric vector of Mathematics achievement scores}
+ \item{meanses}{a numeric vector of mean \code{ses} for the school}
+ \item{sector}{a factor with levels \code{Public} and \code{Catholic}}
+ \item{cses}{a numeric vector of centered \code{ses} values where the
+ centering is with respect to the \code{meanses} for the school.}
+ }
+}
+\details{
+ Each row in this data frame contains the data for one student.
+}
+%\source{}
+\references{
+ Raudenbush, Stephen and Bryk, Anthony (2002), \emph{Hierarchical
+ Linear Models: Applications and Data Analysis Methods}, Sage
+ (chapter 4).
+}
+\examples{
+data(Hsb82)
+summary(Hsb82)
+}
+\keyword{datasets}
diff --git a/man/Mmmec.Rd b/man/Mmmec.Rd
new file mode 100644
index 0000000..dac04d5
--- /dev/null
+++ b/man/Mmmec.Rd
@@ -0,0 +1,39 @@
+\name{Mmmec}
+\alias{Mmmec}
+\docType{data}
+\title{Malignant melanoma deaths in Europe}
+\description{
+ Malignant Melanoma Mortality in the European Community associated with
+ the impact of UV radiation exposure.
+}
+\usage{data(Mmmec)}
+\format{
+ A data frame with 354 observations on the following 6 variables.
+ \describe{
+ \item{nation}{a factor with levels \code{Belgium}, \code{W.Germany},
+ \code{Denmark}, \code{France}, \code{UK}, \code{Italy}, \code{Ireland},
+ \code{Luxembourg}, and \code{Netherlands}}
+ \item{region}{Region ID - a factor.}
+ \item{county}{County ID - a factor.}
+ \item{deaths}{Number of male deaths due to MM during 1971--1980}
+ \item{expected}{Number of expected deaths.}
+ \item{uvb}{Centered measure of the UVB dose reaching the earth's
+ surface in each county.}
+ }
+}
+%\details{}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/mmmec.html}
+}
+\references{
+ Langford, I.H., Bentham, G. and McDonald, A. 1998: Multilevel
+ modelling of geographically aggregated health data: a case study on
+ malignant melanoma mortality and UV exposure in the European
+ community. \emph{Statistics in Medicine} 17: 41-58.
+}
+\examples{
+str(Mmmec)
+summary(Mmmec)
+(fm1 <- glmer(deaths ~ uvb + (1|region), Mmmec, poisson, offset = log(expected)))
+}
+\keyword{datasets}
diff --git a/man/Oxboys.Rd b/man/Oxboys.Rd
new file mode 100644
index 0000000..6225094
--- /dev/null
+++ b/man/Oxboys.Rd
@@ -0,0 +1,41 @@
+\name{Oxboys}
+\alias{Oxboys}
+\docType{data}
+\title{Heights of Boys in Oxford}
+\description{
+ The \code{Oxboys} data frame has 234 rows and 4 columns.
+}
+\format{
+ This data frame contains the following columns:
+ \describe{
+ \item{Subject}{
+ an ordered factor giving a unique identifier for each boy in
+ the experiment
+ }
+ \item{age}{
+ a numeric vector giving the standardized age (dimensionless)
+ }
+ \item{height}{
+ a numeric vector giving the height of the boy (cm)
+ }
+ \item{Occasion}{
+ an ordered factor - the result of converting \code{age} from a
+ continuous variable to a count so these slightly unbalanced
+ data can be analyzed as balanced.
+ }
+ }
+}
+\details{
+ These data are described in Goldstein (1987) as data on the
+ height of a selection of boys from Oxford, England versus a
+ standardized age.
+}
+\source{
+ Pinheiro, J. C. and Bates, D. M. (2000)
+ \emph{Mixed-Effects Models in S and S-PLUS},
+ Springer, New York. (Appendix A.19)
+}
+\examples{
+data(Oxboys)
+}
+\keyword{datasets}
diff --git a/man/ScotsSec.Rd b/man/ScotsSec.Rd
new file mode 100644
index 0000000..713f340
--- /dev/null
+++ b/man/ScotsSec.Rd
@@ -0,0 +1,41 @@
+\name{ScotsSec}
+\alias{ScotsSec}
+\docType{data}
+\title{Scottish secondary school scores}
+\description{
+ Scores attained by 3435 Scottish secondary school students on a
+ standardized test taken at age 16. Both the primary school and the
+ secondary school that the student attended have been recorded.
+}
+\usage{data(ScotsSec)}
+\format{
+ A data frame with 3435 observations on the following 6 variables.
+ \describe{
+ \item{verbal}{The verbal reasoning score on a test taken by the
+ students on entry to secondary school.}
+ \item{attain}{The score attained on the standardized test taken at
+ age 16.}
+ \item{primary}{A factor indicating the primary school that the
+ student attended.}
+ \item{sex}{A factor with levels \code{M} and \code{F}}
+ \item{social}{The student's social class on a numeric scale from low
+ to high social class.}
+ \item{second}{A factor indicating the secondary school that the
+ student attended.}
+ }
+}
+\details{
+ These data are an example of cross-classified grouping factors.
+}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/2lev-xc.html}
+}
+\references{
+ Paterson, L. (1991). Socio economic status and educational attainment:
+ a multidimensional and multilevel study. \emph{Evaluation and Research
+ in Education} \bold{5}: 97-121.
+}
+\examples{
+str(ScotsSec)
+}
+\keyword{datasets}
diff --git a/man/Socatt.Rd b/man/Socatt.Rd
new file mode 100644
index 0000000..252bdb0
--- /dev/null
+++ b/man/Socatt.Rd
@@ -0,0 +1,50 @@
+\name{Socatt}
+\alias{Socatt}
+\docType{data}
+\title{Social Attitudes Survey}
+\description{
+ These data come from the British Social Attitudes (BSA) Survey started
+ in 1983. The eligible persons were all adults aged 18 or over living
+ in private households in Britain. The data consist of completed
+ results of 264 respondents out of 410.
+}
+\usage{data(Socatt)}
+\format{
+ A data frame with 1056 observations on the following 9 variables.
+ \describe{
+ \item{district}{District ID - a factor}
+ \item{respond}{Respondent code (within district) - a factor}
+ \item{year}{A factor with levels \code{1983}, \code{1984},
+ \code{1985}, and \code{1986}}
+ \item{numpos}{An ordered factor giving the number of positive answers to
+ seven questions.}
+ \item{party}{Political party chosen - a factor. Levels are
+ \code{conservative}, \code{labour}, \code{Lib/SDP/Alliance},
+ \code{others}, and \code{none}.}
+ \item{class}{Self assessed social class - a factor. Levels are
+ \code{middle}, \code{upper working}, and \code{lower working}.}
+ \item{gender}{Respondent's sex. (1=male, 2=female)}
+ \item{age}{Age in years}
+ \item{religion}{Religion - a factor. Levels are \code{Roman
+ Catholic}, \code{Protestant/Church of England}, \code{others},
+ and \code{none}.}
+ }
+}
+\details{
+ These data are provided at \url{http://multilevel.ioe.ac.uk/softrev/}
+ as an example of multilevel data with a multinomial response. Results
+ from several multilevel software packages are provided at this site.
+}
+\source{
+ \url{http://multilevel.ioe.ac.uk/softrev/socatt.htm}
+}
+\references{
+ McGrath, K. and Waterton, J. (1986). \emph{British Social Attitudes
+ 1983-1986 panel survey.} London, Social and Community Planning
+ Research.
+}
+\examples{
+str(Socatt)
+summary(Socatt)
+}
+\keyword{datasets}
diff --git a/man/bdf.Rd b/man/bdf.Rd
new file mode 100644
index 0000000..4bdf2e5
--- /dev/null
+++ b/man/bdf.Rd
@@ -0,0 +1,54 @@
+\name{bdf}
+\title{Language Scores of 8-Graders in The Netherlands}
+\alias{bdf}
+\docType{data}
+\usage{data(bdf)}
+\description{
+ The \code{bdf} data frame has 2287 rows and 25 columns of language
+ scores from grade 8 pupils in elementary schools in The Netherlands.
+}
+\format{
+ \describe{
+ \item{schoolNR}{a factor denoting the school.}
+ \item{pupilNR}{a factor denoting the pupil.}
+ \item{IQ.verb}{a numeric vector of verbal IQ scores}
+ \item{IQ.perf}{a numeric vector of IQ scores.}
+ \item{sex}{Sex of the student.}
+ \item{Minority}{a factor indicating if the student is a member of a
+ minority group.}
+ \item{repeatgr}{an ordered factor indicating if one or more grades
+ have been repeated.}
+ \item{aritPRET}{a numeric vector}
+ \item{classNR}{a numeric vector}
+ \item{aritPOST}{a numeric vector}
+ \item{langPRET}{a numeric vector}
+ \item{langPOST}{a numeric vector}
+ \item{ses}{a numeric vector of socioeconomic status indicators.}
+ \item{denomina}{a factor indicating of the school is a public
+ school, a Protestant private school, a Catholic private school, or
+ a non-denominational private school.}
+ \item{schoolSES}{a numeric vector}
+ \item{satiprin}{a numeric vector}
+ \item{natitest}{a factor with levels \code{0} and \code{1}}
+ \item{meetings}{a numeric vector}
+ \item{currmeet}{a numeric vector}
+ \item{mixedgra}{a factor indicating if the class is a mixed-grade class.}
+ \item{percmino}{a numeric vector}
+ \item{aritdiff}{a numeric vector}
+ \item{homework}{a numeric vector}
+ \item{classsiz}{a numeric vector}
+ \item{groupsiz}{a numeric vector}
+ }
+}
+\source{
+ \url{http://stat.gamma.rug.nl/snijders/multilevel.htm}
+}
+\references{
+ Snijders, Tom and Bosker, Roel (1999)
+ \emph{Multilevel Analysis: An Introduction to Basic and Advanced
+ Multilevel Modeling}, Sage.
+}
+\examples{
+summary(bdf)
+}
+\keyword{datasets}
diff --git a/man/egsingle.Rd b/man/egsingle.Rd
new file mode 100644
index 0000000..59c278b
--- /dev/null
+++ b/man/egsingle.Rd
@@ -0,0 +1,47 @@
+\name{egsingle}
+\alias{egsingle}
+\docType{data}
+\title{US Sustaining Effects study}
+\description{
+ A subset of the mathematics scores from the U.S. Sustaining Effects
+ Study. The subset consists of information on 1721 students from 60 schools
+}
+\usage{data(egsingle)}
+\format{
+ A data frame with 7230 observations on the following 12 variables.
+ \describe{
+ \item{schoolid}{a factor of school identifiers}
+ \item{childid}{a factor of student identifiers}
+ \item{year}{a numeric vector indicating the year of the test}
+ \item{grade}{a numeric vector indicating the student's grade}
+ \item{math}{a numeric vector of test scores on the IRT scale score metric}
+ \item{retained}{a factor with levels \code{0} \code{1} indicating if
+ the student has been retained in a grade.}
+ \item{female}{a factor with levels \code{Female} \code{Male}
+ indicating the student's sex}
+ \item{black}{a factor with levels \code{0} \code{1} indicating if
+ the student is Black}
+ \item{hispanic}{a factor with levels \code{0} \code{1} indicating if
+ the student is Hispanic}
+ \item{size}{a numeric vector indicating the number of students
+ enrolled in the school}
+ \item{lowinc}{a numeric vector giving the percentage of low-income
+ students in the school}
+ \item{mobility}{a numeric vector}
+ }
+}
+%\details{}
+\source{
+ These data are distributed with the HLM software package (Bryk,
+ Raudenbush and Congdon, 1996). Conversion to the R format is
+ described in Doran and Lockwood (2004).
+}
+\references{
+ Doran, Harold C. and Lockwood, J.R. (2004), \emph{Fitting value-added
+ models in R}, (submitted).
+}
+\examples{
+str(egsingle)
+(fm1 <- lmer(math~year*size+female+(1|childid)+(1|schoolid), egsingle))
+}
+\keyword{datasets}
diff --git a/man/guImmun.Rd b/man/guImmun.Rd
new file mode 100644
index 0000000..b0f9d62
--- /dev/null
+++ b/man/guImmun.Rd
@@ -0,0 +1,59 @@
+\name{guImmun}
+\encoding{latin1}
+\alias{guImmun}
+\docType{data}
+\title{Immunization in Guatemala}
+\description{
+ Immunizations received by children in Guatemala.
+}
+\usage{data(guImmun)}
+\format{
+ A data frame with 2159 observations on the following 13 variables.
+ \describe{
+ \item{kid}{a factor identifying the child}
+ \item{mom}{a factor identifying the family.}
+ \item{comm}{a factor identifying the community.}
+ \item{immun}{a factor indicating if the child received a
+ complete set of immunizations. All children in this data frame
+ received at least one immunization.}
+ \item{kid2p}{a factor indicating if the child was 2 years or older
+ at the time of the survey.}
+ \item{mom25p}{a factor indicating if the mother was 25 years or older.}
+ \item{ord}{an factor indicating the child's birth's order within the
+ family. Levels are \code{01} - first child, \code{23} - second or
+ third child, \code{46} - fourth to sixth child, \code{7p} -
+ seventh or later child.}
+ \item{ethn}{a factor indicating the mother's ethnicity. Levels are
+ \code{L} - Ladino, \code{N} - indigenous not speaking Spanish, and
+ \code{S} - indigenous speaking Spanish.}
+ \item{momEd}{a factor describing the mother's level of eduation.
+ Levels are \code{N} - not finished primary, \code{P} - finished
+ primary, \code{S} - finished secondary}
+ \item{husEd}{a factor describing the husband's level of education.
+ Levels are the same as for \code{momEd} plus \code{U} - unknown.}
+ \item{momWork}{a factor indicating if the mother had ever
+ worked outside the home.}
+ \item{rural}{a factor indicating if the family's location is
+ considered rural or urban.}
+ \item{pcInd81}{the percentage of indigenous population in the
+ community at the 1981 census.}
+ }
+}
+%\details{}
+\source{
+ These data are available at
+ \url{http://data.princeton.edu/multilevel/guImmun.dat}. Multiple
+ indicator columns in the original data table have been collapsed to
+ factors for this data frame.
+}
+\references{
+ Rodriguez, Germ�n and Goldman, Noreen (1995), "Improved estimation
+ procedures for multilevel models with binary response: a case-study",
+ \emph{Journal of the Royal Statistical Society, Series A}, \bold{164},
+ 339-355.
+}
+\examples{
+data(guImmun)
+summary(guImmun)
+}
+\keyword{datasets}
diff --git a/man/guPrenat.Rd b/man/guPrenat.Rd
new file mode 100644
index 0000000..88af480
--- /dev/null
+++ b/man/guPrenat.Rd
@@ -0,0 +1,54 @@
+\name{guPrenat}
+\encoding{latin1}
+\alias{guPrenat}
+\docType{data}
+\title{Prenatal care in Guatemala}
+\description{
+ Data on the prenatal care received by mothers in Guatemala.
+}
+\usage{data(guPrenat)}
+\format{
+ A data frame with 2449 observations on the following 15 variables.
+ \describe{
+ \item{kid}{a factor identifying the birth}
+ \item{mom}{a factor identifying the mother or family}
+ \item{cluster}{a factor identifying the community}
+ \item{prenat}{a factor indicating if traditional or modern prenatal
+ care was provided for the birth.}
+ \item{childAge}{an ordered factor of the child's age at the time of
+ the survey. }
+ \item{motherAge}{a factor indicating if the mother was older or
+ younger. The cut-off age is 25 years.}
+ \item{birthOrd}{an ordered factor for the birth's order within the family.}
+ \item{indig}{a factor indicating if the mother is Ladino, or
+ indigenous not speaking Spanish, or indigenous speaking Spanish.}
+ \item{momEd}{a factor describing the mother's level of eduation.}
+ \item{husEd}{a factor describing the husband's level of education.}
+ \item{husEmpl}{a factor describing the husband's employment status.}
+ \item{toilet}{a factor indicating if there is a modern toilet in the
+ house.}
+ \item{TV}{a factor indicating if there is a TV in the house and, if
+ so, the frequency with which it is used.}
+ \item{pcInd81}{the percentage of indigenous population in the
+ community at the 1981 census.}
+ \item{ssDist}{distance from the community to the nearest clinic.}
+ }
+}
+%\details{}
+\source{
+ These data are available at
+ \url{http://data.princeton.edu/multilevel/guPrenat.dat}. Multiple
+ indicator columns in the original data table have been collapsed to
+ factors for this data frame.
+}
+\references{
+ Rodriguez, Germ�n and Goldman, Noreen (1995), "Improved estimation
+ procedures for multilevel models with binary response: a case-study",
+ \emph{Journal of the Royal Statistical Society, Series A}, \bold{164},
+ 339-355.
+}
+\examples{
+data(guPrenat)
+summary(guPrenat)
+}
+\keyword{datasets}
diff --git a/man/s3bbx.Rd b/man/s3bbx.Rd
new file mode 100644
index 0000000..d852e67
--- /dev/null
+++ b/man/s3bbx.Rd
@@ -0,0 +1,35 @@
+\name{s3bbx}
+\title{Covariates in the Rodriguez and Goldman simulation}
+\docType{data}
+\alias{s3bbx}
+\encoding{latin1}
+\usage{data(s3bbx)}
+\description{
+ The \code{s3bbx} data frame has 2449 rows and 6 columns of the
+ covariates in the simulation by Rodriguez and Goldman of multilevel
+ dichotomous data.
+}
+\format{
+ This data frame contains the following columns:
+ \describe{
+ \item{child}{a numeric vector identifying the child}
+ \item{family}{a numeric vector identifying the family}
+ \item{community}{a numeric vector identifying the community}
+ \item{chldcov}{a numeric vector of the child-level covariate}
+ \item{famcov}{a numeric vector of the family-level covariate}
+ \item{commcov}{a numeric vector of the community-level covariate}
+ }
+}
+\source{
+ \url{http://data.princeton.edu/multilevel/simul.htm}
+}
+\references{
+ Rodriguez, Germ�n and Goldman, Noreen (1995)
+ An assessment of estimation procedures for multilevel models with
+ binary responses,
+ \emph{Journal of the Royal Statistical Society, Series A} \bold{158}, 73--89.
+}
+\examples{
+str(s3bbx)
+}
+\keyword{datasets}
diff --git a/man/s3bby.Rd b/man/s3bby.Rd
new file mode 100644
index 0000000..5902951
--- /dev/null
+++ b/man/s3bby.Rd
@@ -0,0 +1,30 @@
+\name{s3bby}
+\title{Responses simulated by Rodriguez and Goldman}
+\docType{data}
+\alias{s3bby}
+\encoding{latin1}
+\description{
+ A matrix of the results of 100 simulations of dichotomous multilevel
+ data. The rows correspond to the 2449 births for which the covariates
+ are given in \code{\link{s3bbx}}. The elements of the matrix are all
+ 0, indicating no modern prenatal care, or 1, indicating model prenatal
+ care. These were simulated with "large" variances for both the family
+ and the community random effects.
+}
+\usage{data(s3bby)}
+\format{
+ An integer matrix with 2449 rows and 100 columns.
+}
+\source{
+ \url{http://data.princeton.edu/multilevel/simul.htm}
+}
+\references{
+ Rodriguez, Germ�n and Goldman, Noreen (1995)
+ An assessment of estimation procedures for multilevel models with
+ binary responses,
+ \emph{Journal of the Royal Statistical Society, Series A} \bold{158}, 73--89.
+}
+\examples{
+str(s3bby)
+}
+\keyword{datasets}
diff --git a/man/star.Rd b/man/star.Rd
new file mode 100644
index 0000000..3e4013a
--- /dev/null
+++ b/man/star.Rd
@@ -0,0 +1,62 @@
+\name{star}
+\alias{star}
+\docType{data}
+\title{Student Teacher Achievement Ratio (STAR) project data}
+\description{
+Data from Tennessee's Student Teacher Achievement Ratio (STAR) project
+which was a large-scale, four-year study of the effect of reduced class size.
+}
+\usage{data(star)}
+\format{
+ A data frame with 26796 observations on the following 18 variables.
+ \describe{
+ \item{\code{id}}{a factor - student id number}
+ \item{\code{sch}}{a factor - school id number}
+ \item{\code{gr}}{grade - an ordered factor with levels \code{K} <
+ \code{1} < \code{2} < \code{3}}
+ \item{\code{cltype}}{class type - a factor with levels \code{small},
+ \code{reg} and \code{reg+A}. The last level indicates a regular
+ class size with a teachers aide.}
+ \item{\code{hdeg}}{highest degree obtained by the teacher - an
+ ordered factor with levels \code{ASSOC} < \code{BS/BA} <
+ \code{MS/MA/MEd} < \code{MA+} < \code{Ed.S} < \code{Ed.D/Ph.D}}
+ \item{\code{clad}}{career ladder position of the teacher - a factor
+ with levels \code{NOT} \code{APPR} \code{PROB} \code{PEND}
+ \code{1} \code{2} \code{3}}
+ \item{\code{exp}}{a numeric vector - the total number of years of
+ experience of the teacher}
+ \item{\code{trace}}{teacher's race - a factor with levels \code{W},
+ \code{B}, \code{A}, \code{H}, \code{I} and \code{O} representing
+ white, black, Asian, Hispanic, Indian (Native American) and other}
+ \item{\code{read}}{the student's total reading scaled score}
+ \item{\code{math}}{the student's total math scaled score}
+ \item{\code{ses}}{socioeconomic status - a factor with levels
+ \code{F} and \code{N} representing eligible for free lunches or
+ not eligible}
+ \item{\code{schtype}}{school type - a factor with levels
+ \code{inner}, \code{suburb}, \code{rural} and \code{urban}}
+ \item{\code{sx}}{student's sex - a factor with levels \code{M} \code{F}}
+ \item{\code{eth}}{student's ethnicity - a factor with the same
+ levels as \code{trace}}
+ \item{\code{birthq}}{student's birth quarter - an ordered factor with
+ levels \code{1977:1} < \dots < \code{1982:2}}
+ \item{\code{birthy}}{student's birth year - an ordered factor with levels \code{1977:1982}}
+ \item{\code{yrs}}{number of years of schooling for the student - a
+ numeric version of the grade \code{gr} with Kindergarten
+ represented as 0. This variable was generated from \code{gr} and
+ does not allow for a student being retained.}
+ \item{\code{tch}}{a factor - teacher id number}
+ }
+}
+\details{
+Details of the original data source and the process of conversion to
+this representation are given in the vignette.
+}
+\source{
+ \url{http://www.heros-inc.org/data.htm}
+}
+%\references{}
+\examples{
+str(star)
+}
+\keyword{datasets}
diff --git a/tests/guImmun.R b/tests/guImmun.R
new file mode 100644
index 0000000..ff194a8
--- /dev/null
+++ b/tests/guImmun.R
@@ -0,0 +1,39 @@
+library(mlmRev)
+options(digits=6, useFancyQuotes = FALSE)# signif.stars for once..
+fm <- glmer(immun ~ kid2p + mom25p + ord + ethn + momEd +
+ husEd + momWork + rural + pcInd81 + (1|mom) + (1|comm),
+ data = guImmun, family = binomial)
+print(fm, symbolic.cor = TRUE)
+
+fm.h <- update(fm, ~ . - husEd)
+print(fm.h, corr = FALSE)
+fm.ho <- update(fm.h, ~ . - ord)
+## FIXME: shows 53 outer iterations (+ probably IRLS ones) --
+## but no such info is kept stored
+print(fm.ho, corr = FALSE)
+
+anova(fm, fm.h, fm.ho)
+
+(fm.hoe <- update(fm.ho, ~ . - ethn))
+
+(fm.hoem <- update(fm.hoe, ~ . - mom25p))
+
+(AN <- anova(fm, fm.h, fm.ho, fm.hoe, fm.hoem))
+
+AN[, "logLik"] + 1362 # an inversion in the first two models
+## FIXME: AN doesn't have a deviance column!
+## AN[, "deviance"] - 2711 # deviance scale shows this more clearly
+stopifnot(AN[,"Df"] == c(9,10,12,15,18),
+# all.equal(AN[,"logLik"] + 1362,
+# c(0.6072186497422, 0.6289103306312, 0.8541186984307,
+# 2.725550814599, 6.299084917162), tol = 1e-6),
+# all.equal(fixef(fm.hoem)[-1],
+# c("kid2pY" = 1.2662536, "momEdP"= 0.35116180,
+# "momEdS"= 0.3487824136, "momWorkY"=0.2672759992340,
+# "ruralY"=-0.678846606719, "pcInd81"=-0.9612710104134),
+# tol = 1e-4),
+ TRUE
+ )
+
+
+cat('Time elapsed: ', proc.time(),'\n') # "stats"
diff --git a/tests/lmerTest.R b/tests/lmerTest.R
new file mode 100644
index 0000000..6c12b42
--- /dev/null
+++ b/tests/lmerTest.R
@@ -0,0 +1,60 @@
+#### LMER: Put all the small data set tests into one file
+library(mlmRev)
+options(digits=6, show.signif.stars = FALSE)
+
+## bdf ---------------- Data ---------------------
+(fm01 <- lmer(langPOST ~ IQ.ver.cen + avg.IQ.ver.cen + (1|schoolNR), bdf))
+(fm02 <- lmer(langPOST ~ IQ.ver.cen + avg.IQ.ver.cen +(IQ.ver.cen|schoolNR), bdf))
+##
+anova(fm01, fm02)
+cat('Time elapsed: ', (.pt <- proc.time()),'\n') # "stats"
+
+## egsingle ----------- Data ---------------------
+(fm1 <- lmer(math ~ year + (1|childid) + (1|schoolid), egsingle))
+(fm2 <- lmer(math ~ year + (1|childid) + (year|schoolid), egsingle))
+(fm3 <- lmer(math ~ year + (year|childid) + (1|schoolid), egsingle))
+(fm4 <- lmer(math ~ year + (year|childid) + (year|schoolid), egsingle))
+##
+anova(fm1, fm2, fm3, fm4)
+cat('Time elapsed: ', {.ot <- .pt; (.pt <- proc.time()) - .ot},'\n') # "stats"
+
+## Early -------------- Data ---------------------
+Early$tos <- Early$age - 0.5
+(fm1E <- lmer(cog ~ tos * trt + (tos|id), Early))
+
+## Exam --------------- Data ---------------------
+(fm05 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+
+## Chem97 ------------- Data ---------------------
+(fm06 <- lmer(score ~ gcsecnt + (1|school) + (1|lea), Chem97))
+
+cat('Time elapsed: ', {.ot <- .pt; (.pt <- proc.time()) - .ot},'\n') # "stats"
+
+## Hsb82 -------------- Data ---------------------
+lmer(mAch ~ meanses*cses + sector*cses + (cses|school), Hsb82)
+
+## Oxford ------------- Data ---------------------
+(fm07 <- lmer(height ~ age + I(age^2) + I(age^3) + I(age^4) +
+ (age + I(age^2)|Subject), data = Oxboys))
+(fm08 <- lmer(height ~ poly(age,4) +
+ (age + I(age^2)|Subject), data = Oxboys))
+anova(fm07, fm08)
+stopifnot(all.equal(logLik(fm07, REML=FALSE),
+ logLik(fm08, REML=FALSE), tol=1e-07))
+cat('Time elapsed: ', {.ot <- .pt; (.pt <- proc.time()) - .ot},'\n') # "stats"
+
+## ScotsSec ----------- Data ---------------------
+cntr <- list()
+(fmS1 <- lmer(attain ~ verbal * sex + (1|primary) + (1|second), ScotsSec,
+ verbose = 1))
+#(fmS2 <- lmer(attain ~ verbal * sex + (1|primary) + (1|second), ScotsSec,
+# control = c(cntr, list(niterEM = 40))))
+## fmS1 and fmS2 should be essentially identical when optimizing with nlminb
+## The fits are substantially different when optimizing with optim
+##
+(fmS3 <- lmer(attain ~ verbal + sex + (1|primary) + (1|second), ScotsSec))
+(fmS4 <- lmer(attain ~ verbal + sex + (1|primary) + (sex|second), ScotsSec))
+##
+anova(fmS1, fmS3, fmS4)
+
+cat('Time elapsed: ', {.ot <- .pt; (.pt <- proc.time()) - .ot},'\n') # "stats"
diff --git a/tests/versions.R b/tests/versions.R
new file mode 100644
index 0000000..b469f54
--- /dev/null
+++ b/tests/versions.R
@@ -0,0 +1,11 @@
+#### tests/versions.R
+## Need this only as long as all other *.R test files, having a matching *.Rout
+
+library("mlmRev", verbose=TRUE)
+
+for(p in c("mlmRev", "lme4", "Matrix")) {
+ print(packageDescription(p)); cat(rep.int("-",60), "\n", sep="") }
+
+sessionInfo()
+## host specific :
+structure(Sys.info()[c(4,5,1:3)], class="simple.list")
diff --git a/vignettes/MlmSoftRev.Rnw b/vignettes/MlmSoftRev.Rnw
new file mode 100644
index 0000000..af8726a
--- /dev/null
+++ b/vignettes/MlmSoftRev.Rnw
@@ -0,0 +1,646 @@
+\documentclass[12pt]{article}
+\usepackage{Sweave}
+\usepackage{myVignette}
+\usepackage[authoryear,round]{natbib}
+\bibliographystyle{plainnat}
+\DefineVerbatimEnvironment{Sinput}{Verbatim}
+{formatcom={\vspace{-1.5ex}},fontshape=sl,
+ fontfamily=courier,fontseries=b, fontsize=\scriptsize}
+\DefineVerbatimEnvironment{Soutput}{Verbatim}
+{formatcom={\vspace{-2.5ex}},fontfamily=courier,fontseries=b,%
+ fontsize=\scriptsize}
+%%\VignetteIndexEntry{Examples from Multilevel Software Reviews}
+%%\VignetteDepends{lme4}
+%%\VignetteDepends{lattice}
+\begin{document}
+\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=true,keep.source=TRUE}
+\SweaveOpts{prefix=TRUE,prefix.string=SoftRev,include=TRUE}% ^^^^^^^^^^^^^^^^
+\setkeys{Gin}{width=\textwidth}
+\title{Examples from Multilevel Software Comparative Reviews}
+\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates at R-project.org}}
+\date{February 2005, {\small with updates up to \today}}
+\maketitle
+\begin{abstract}
+ The Center for Multilevel Modelling at the Institute of Education,
+ London maintains a web site of ``Software reviews of multilevel
+ modeling packages''. The data sets discussed in the reviews are
+ available at this web site. We have incorporated these data sets
+ in the \code{mlmRev} package for \RR{} and, in this vignette, provide
+ the results of fitting several models to these data sets.
+\end{abstract}
+<<preliminaries,echo=FALSE,print=FALSE,results=hide>>=
+options(width=80, show.signif.stars = FALSE,
+ lattice.theme = function() canonical.theme("pdf", color = FALSE))
+library(mlmRev)
+library(lme4)
+library(lattice)
+set.seed(1234321)
+@
+
+\section{Introduction}
+\label{sec:Intro}
+
+\RR{} is an Open Source implementation of John Chambers' \Slang{}
+language for data analysis and graphics. \RR{} was initially developed
+by Ross Ihaka and Robert Gentleman of the University of Auckland and
+now is developed and maintained by an international group of
+statistical computing experts.
+
+In addition to being Open Source software, which means that anyone can
+examine the source code to see exactly how the computations are being
+carried out, \RR{} is freely available from a network of archive sites
+on the Internet. There are precompiled versions for installation on
+the Windows operating system, Mac OS X and several distributions of
+the Linux operating system. Because the source code is available
+those interested in doing so can compile their own version if they wish.
+
+\RR{} provides an environment for interactive computing with data and for
+graphical display of data. Users and developers can extend the
+capabilities of \RR{} by writing their own functions in the language
+and by creating packages of functions and data sets. Many such
+packages are available on the archive network called CRAN
+(Comprehensive R Archive Network) for which the parent site is
+\url{http://cran.r-project.org}. One such package called \code{lme4}
+(along with a companion package called \code{Matrix}) provides
+functions to fit and display linear mixed models and generalized
+linear mixed models, which are the statisticians' names for the models
+called multilevel models or hierarchical linear models in other
+disciplines. The \code{lattice} package provides functions to
+generate several high level graphics plots that help with the
+visualization of the types of data to which such models are fit.
+Finally, the \code{mlmRev} package provides the data sets used in the
+``Software Reviews of Multilevel Modeling Packages'' from the
+Multilevel Modeling Group at the Institute of Education in the UK.
+This package also contains several other data sets from the multilevel
+modeling literature.
+
+The software reviews mentioned above were intended to provide
+comparison of the speed and accuracy of many different packages for
+fitting multilevel models. As such, there is a standard set of
+models that fit to each of the data sets in each of the packages that
+were capable of doing the fit. We will fit these models for
+comparative purposes but we will also do some graphical exploration of
+the data and, in some cases, discuss alternative models.
+
+We follow the general outline of the previous reviews, beginning with
+simpler structures and moving to the more complex structures. Because
+the previous reviews were performed on older and considerably slower
+computers than the ones on which this vignette will be compiled, the
+timings produced by the \code{system.time} function and shown in the
+text should not be compared with previous timings given on the web
+site. They are an indication of the times required to fit such models
+to these data sets on recent computers with processors running at
+around 2 GHz or faster.
+
+\section{Two-level normal models}
+\label{sec:TwoLevelNormal}
+
+In the multilevel modeling literature a two-level model is one with
+two levels of random variation; the per-observation noise term and
+random effects which are grouped according to the levels of a factor.
+We call this factor a \textit{grouping factor}. If the response is
+measured on a continuous scale (more or less) our initial models are
+based on a normal distribution for the per-observation noise and for
+the random effects. Thus such a model is called a ``two-level normal
+model'' even though it has only one grouping factor for the random effects.
+
+
+\subsection{The Exam data}
+\label{sec:Examdata}
+<<Examprep,results=hide,echo=FALSE>>=
+lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam)
+@
+
+The data set called \code{Exam} provides the normalized exam scores
+attained by 4,059 students from 65 schools in inner London. Some of
+the covariates available with this exam score are the school the
+student attended, the sex of the student, the school gender (boys,
+girls, or mixed) and the student's result on the Standardised London
+Reading test.
+
+The \RR{} functions \code{str} and \code{summary} can be used to
+examine the structure of a data set (or, in general, any \RR{} object)
+and to provide a summary of an object.
+<<ExamData>>=
+str(Exam)
+summary(Exam)
+@
+
+
+\subsection{Model fits and timings}
+\label{sec:ExamFits}
+
+The first model to fit to the Exam data incorporates fixed-effects
+terms for the pretest score, the student's sex and the school gender.
+The only random-effects term is an additive shift associated with the
+school.
+
+<<ExamFit>>=
+(Em1 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+The \code{system.time} function can be used to time the execution of
+an \RR{} expression. It returns a vector of five numbers giving the
+user time (time spend executing applications code), the system time
+(time spent executing system functions called by the applications
+code), the elapsed time, and the user and system time for any child
+processes. The first number is what is commonly viewed as the time
+required to do the model fit. (The elapsed time is unsuitable because
+it can be affected by other processes running on the computer.) These
+times are in seconds. On modern computers this fit takes only a
+fraction of a second.
+
+<<Examtime>>=
+system.time(lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+
+
+\subsection{Interpreting the fit}
+\label{sec:ExamInterpret}
+
+As can be seen from the output, the default method of fitting a linear
+mixed model is restricted maximum likelihood (REML). The estimates of
+the variance components correspond to those reported by other packages
+as given on the Multilevel Modelling Group's web site. Note that the
+estimates of the variance components are given on the scale of the
+variance and on the scale of the standard deviation. That is, the
+values in the column headed \code{Std.Dev.} are simply the square
+roots of the corresponding entry in the \code{Variance} column. They
+are \textbf{not} standard errors of the estimate of the variance.
+
+The estimates of the fixed-effects are different from those quoted on
+the web site because the terms for \code{sex} and \code{schgend} use a
+different parameterization than in the reviews. Here the reference
+level of \code{sex} is female (\code{F}) and the coefficient labelled
+\code{sexM} represents the difference for males compared to females.
+Similarly the reference level of \code{schgend} is \code{mixed} and
+the two coefficients represent the change from mixed to boys only and
+the change from mixed to girls only. The value of the coefficient
+labelled \code{Intercept} is affected by both these changes as is the
+value of the REML criterion.
+
+To reproduce the results obtained from other packages, we must change
+the reference level for each of these factors.
+
+<<ExamRelevel>>=
+Exam$sex <- relevel(Exam$sex, "M")
+Exam$schgend <- relevel(Exam$schgend, "girls")
+(Em2 <- lmer(normexam ~ standLRT + sex + schgend + (1|school), Exam))
+@
+
+The coefficients now correspond to those in the tables on the web
+site. It happens that the REML criterion at the optimum in this fit
+is the same as in the previous fit, but you cannot depend on this
+occuring. In general the value of the REML criterion at the optimum
+depends on the parameterization used for the fixed effects.
+
+\subsection{Further exploration}
+\label{sec:ExamExplore}
+
+
+\subsubsection{Checking consistency of the data}
+\label{sec:consistency}
+
+It is important to check the consistency of data before trying to fit
+sophisticated models. One should plot the data in many different ways
+to see if it looks reasonableand also check relationships between
+variables.
+
+For example, each observation in these data is associated with a
+particular student. The variable \code{student} is not a unique
+identifier of the student as it only has 650 unique values. It is
+intended to be a unique identifier within a school but it is not. To
+show this we create a factor that is the interaction of school and
+student then drop unused levels.
+
+<<ExamIds>>=
+Exam <- within(Exam, ids <- factor(school:student))
+str(Exam)
+@
+
+Notice that there are 4059 observations but only 4055 unique levels of
+student within school. We can check the ones that are duplicated
+
+<<dupExamIds>>=
+as.character(Exam$ids[which(duplicated(Exam$ids))])
+@
+
+One of these cases
+<<OnlyBoy>>=
+subset(Exam, ids == '43:86')
+xtabs(~ sex + school, Exam, subset = school %in% c(43, 50, 52), drop = TRUE)
+@
+is particularly interesting. Notice that one of the students
+numbered 86 in school 43 is the only male student out of 61 students
+from this school who took the exam. It is quite likely that this
+student's score was attributed to the wrong school and that the school
+is in fact a girls-only school, not a mixed-sex school.
+
+The causes of the other three cases of duplicate student numbers
+within a school are not as clear. It would be necessary to go back
+to the original data records to check these.
+
+The cross-tabulation of the students by sex and school for the
+mixed-sex schools
+<<ExamXtabs>>=
+xtabs(~ sex + school, Exam, subset = type == "Mxd", drop = TRUE)
+@
+shows another anomaly. School 47 is similar to school 43 in that,
+although it is classified as a mixed-sex school, 81 male students
+and only one female student took the exam. It is likely that the
+school was misrecorded for this one female student and the school is a
+male-only school.
+
+Another school is worth noting. There were only eight students from
+school 54 who took the exam so any within-school estimates from this
+school will be unreliable.
+
+A mosaic plot (Figure~\ref{fig:ExamMosaic}) produced with
+<<ExamMosaicshow, eval = FALSE>>=
+ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+mosaicplot(~ school + sex, ExamMxd)
+@
+helps to detect mixed-sex schools with unusually large or unusually small ratios
+of females to males taking the exam.
+\begin{figure}[tbp]
+ \centering
+<<ExamMosaic,fig=TRUE,echo=FALSE,width=8,height=8>>=
+ExamMxd <- within(subset(Exam, type == "Mxd"), school <- factor(school))
+mosaicplot(~ school + sex, ExamMxd)
+@
+ \caption{A mosaic plot of the sex distribution by school. The areas
+ of the rectangles are proportional to the number of students of
+ that sex from that school who took the exam. Schools with an
+ unusally large or unusually small ratio or females to males are
+ highlighted.}
+ \label{fig:ExamMosaic}
+\end{figure}
+
+\subsubsection{Preliminary graphical displays}
+\label{sec:Graphical}
+
+In addition to the pretest score (\code{standLRT}), the predictor
+variables used in this model are the student's sex and the school
+gender, which is coded as having three levels. There is some
+redundancy in these two variables in that all the students in a
+boys-only school must be male. For graphical exploration we convert
+from \code{schgend} to \code{type}, an indicator of whether the school
+is a mixed-sex school or a single-sex school, and plot the response
+versus the pretest score for each combination of sex and school type.
+\begin{figure}[tbp]
+ \centering
+<<Examplot1,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | sex * type, Exam,
+ type = c("g", "p", "smooth"), layout = c(2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score", aspect = 1.2))
+@
+ \caption{Normalized exam score versus pretest
+ (Standardized London Reading Test) score for 4095 students from 65 schools
+ in inner London. The panels on the left show the male students'
+ scores; those on the right show the females' scores. The top row
+ of panels shows the scores of students in single-sex schools and
+ the bottom row shows the scores of students in mixed-sex
+ schools. A scatterplot smoother line for each panel has been added
+ to help visualize the trend.}
+ \label{fig:Examplot1}
+\end{figure}
+
+This plot is created with the \code{xyplot} from the \code{lattice}
+package as (essentially)
+<<Examplot1show, eval = FALSE>>=
+xyplot(normexam ~ standLRT | sex * type, Exam, type = c("g", "p", "smooth"))
+@
+The formula would be read as ``plot \code{normexam} by \code{standLRT}
+given \code{sex} by (school) \code{type}''. A few other arguments were
+added in the actual call to make the axis annotations more readable.
+
+Figure~\ref{fig:Examplot1} shows the even after accounting for a
+student's sex, pretest score and school type, there is considerable
+variation in the response. We may attribute some of this variation to
+differences in schools but the fitted model indicates that most of the
+variation is unaccounted or ``residual'' variation.
+
+In some ways the high level of residual variation obscures the pattern
+in the data. By removing the data points and overlaying the
+scatterplot smoothers we can concentrate on the relationships between
+the covariates.
+\begin{figure}[tbp]
+ \centering
+<<Examplot2,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(xyplot(normexam ~ standLRT, Exam, groups = sex:type,
+ type = c("g", "smooth"), xlim = c(-3,3), ylim = c(-2,2),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ auto.key = list(space = 'right', points = FALSE, lines = TRUE),
+ aspect = 1))
+@
+ \caption{Overlaid scatterplot smoother lines of the normalized test
+ scores versus the pretest (Standardized London Reading Test) score
+ for female (F) and male (M) students in single-sex (Sngl) and
+ mixed-sex (Mxd) schools.}
+ \label{fig:Examplot2}
+\end{figure}
+The call to \code{xyplot} is essentially
+<<Examplot2show,eval=FALSE>>=
+xyplot(normexam ~ standLRT, Exam, groups = sex:type, type = c("g", "smooth"))
+@
+
+Figure~\ref{fig:Examplot2} is a remarkable plot in that it shows
+nearly a perfect ``main effects'' relationship of the response with
+the three covariates and almost no interaction. It is rare to see
+real data follow a simple theoretical relationship so closely.
+
+To elaborate, we can see that for each of the four groups the smoothed
+relationship between the exam score and the pretest score is close to
+a straight line and that the lines are close to being parallel. The
+only substantial deviation is in the smoothed relationship for the
+males in single-sex schools and this is the group with the fewest
+observations and hence the least precision in the estimated
+relationship. The lack of parallelism for this group is most apparent
+in the region of extremely low pretest scores where there are few
+observations and a single student who had a low pretest score and a
+moderate post-test score can substantially influence the curve. Five
+or six such points can be seen in the upper left panel of
+Figure~\ref{fig:Examplot1}.
+
+We should also notice the ordering of the lines and the spacing
+between the lines. The smoothed relationships for students in
+single-sex schools are consistently above those in the mixed-sex
+schools and, except for the region of low pretest scores described
+above, the relationship for the females in a given type of school is
+consistently above that for the males. Furthermore the distance
+between the female and male lines in the single-sex schools is
+approximately the same as the corresponding distance in the mixed-sex
+schools. We would summarize this by saying that there is a positive
+effect for females versus males and a positive effect for single-sex
+versus mixed-sex and no indication of interaction between these
+factors.
+
+
+\subsubsection{The effect of schools}
+\label{sec:schoolEffects}
+
+We can check for patterns within and between schools by plotting the
+response versus the pretest by school. Because there appear to be
+differences in this relationship for single-sex versus mixed-sex
+schools and for females versus males we consider these separately.
+\begin{figure}[tbp]
+ \centering
+<<Examplot3,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl"))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools.}
+ \label{fig:Examplot3}
+\end{figure}
+
+In Figure~\ref{fig:Examplot3} we plot the normalized exam scores
+versus the pretest score by school for female students in single-sex
+schools. The plot is produced as
+<<Examplot3show>>=
+xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ subset = sex == "F" & type == "Sngl")
+@
+The \code{"r"} in the \code{type} argument adds a simple linear
+regression line to each panel.
+
+The first thing we notice in Figure~\ref{fig:Examplot3} is that
+school 48 is an anomaly because only two students in this school took
+the exam. Because within-school results based on only two students
+are unreliable, we will exclude this school from further plots (but we
+do include these data when fitting comprehensive models).
+
+Although the regression lines on the panels can help us to look for
+variation in the schools, the ordering of the panels is, for our
+purposes, random. We recreate this plot in Figure~\ref{fig:Examplot4} using
+<<Examplot4show,eval=FALSE>>=
+xyplot(normexam ~ standLRT | school, Exam, type = c("g", "p", "r"),
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[1])
+@
+\begin{figure}[tbp]
+ \centering
+<<Examplot4,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools. School 48 where only two students took the exam has been
+ eliminated and the panels have been ordered by increasing
+ intercept (predicted normalized score for a pretest score of 0) of
+ the regression line.}
+ \label{fig:Examplot4}
+\end{figure}
+so that the panels are ordered (from left to right starting at the
+bottom row) by increasing intercept for the regression line (i.e.{} by
+increasing predicted exam score for a student with a pretest score of 0).
+
+Alternatively, we could order the panels by increasing slope of the
+within-school regression lines, as in Figure~\ref{fig:Examplot4a}.
+\begin{figure}[tbp]
+ \centering
+<<Examplot4a,fig=TRUE,echo=FALSE,width=8,height=8>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "F" & type == "Sngl" & school != 48,
+ index.cond = function(x, y) coef(lm(y ~ x))[2]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for female students in single-sex
+ schools. School 48 has been eliminated and the panels have been
+ ordered by increasing slope of the within-school regression
+ lines.}
+ \label{fig:Examplot4a}
+\end{figure}
+
+Although it is informative to plot the within-school regression lines
+we need to assess the variability in the estimates of the coefficients
+before concluding if there is ``significant'' variability between
+schools. We can obtain the individual regression fits with the
+\code{lmList} function
+<<ExamLmListFS>>=
+show(ExamFS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "F" & type == "Sngl" & school != 48))
+@
+and compare the confidence intervals on these coefficients.
+<<Examplot4cshow>>=
+plot(confint(ExamFS, pool = TRUE), order = 1)
+@
+\begin{figure}[tbp]
+ \centering
+<<Examplot4c,fig=TRUE,echo=FALSE,width=8,height=3>>=
+print(plot(confint(ExamFS, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot4c}
+\end{figure}
+
+
+\begin{figure}[tbp]
+ \centering
+<<Examplot5,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(xyplot(normexam ~ standLRT | school, Exam,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = sex == "M" & type == "Sngl", layout = c(5,2),
+ index.cond = function(x, y) coef(lm(y ~ x))[1]))
+@
+ \caption{Normalized exam scores versus pretest (Standardized London
+ Reading Test) score by school for male students in single-sex
+ schools.}
+ \label{fig:Examplot5}
+\end{figure}
+
+<<ExamLmListMS>>=
+show(ExamMS <- lmList(normexam ~ standLRT | school, Exam,
+ subset = sex == "M" & type == "Sngl"))
+@
+The corresponding plot of the confidence intervals is shown in
+Figure~\ref{fig:Examplot5b}.
+\begin{figure}[tbp]
+ \centering
+<<Examplot5b,fig=TRUE,echo=FALSE,width=8,height=2.5>>=
+print(plot(confint(ExamMS, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot5b}
+\end{figure}
+
+For the mixed-sex schools we can consider the effect of the pretest
+score and sex in the plot (Figure~\ref{fig:Examplot6}) and in the
+\begin{figure}[tbp]
+ \centering
+<<Examplot6,fig=TRUE,echo=FALSE,width=8,height=9>>=
+print(xyplot(normexam ~ standLRT | school, Exam, groups = sex,
+ type = c("g", "p", "r"),
+ xlab = "Standardized London Reading Test score",
+ ylab = "Normalized exam score",
+ subset = !school %in% c(43, 47) & type == "Mxd",
+ index.cond = function(x, y) coef(lm(y ~ x))[1],
+ auto.key = list(space = 'top', lines = TRUE,
+ columns = 2), layout = c(7,5),
+ aspect = 1.2))
+@
+ \caption{Normalized exam scores versus pretest
+ score by school and sex for students in mixed-sex
+ schools.}
+ \label{fig:Examplot6}
+\end{figure}
+separate model fits for each school.
+<<ExamLmListM>>=
+show(ExamM <- lmList(normexam ~ standLRT + sex| school, Exam,
+ subset = type == "Mxd" & !school %in% c(43,47,54)))
+@
+The confidence intervals for these separately fitted models,
+\begin{figure}[tbp]
+ \centering
+<<Examplot6b,fig=TRUE,echo=FALSE,width=8,height=4>>=
+print(plot(confint(ExamM, pool = TRUE), order = 1))
+@
+ \caption{Confidence intervals on the coefficients of the
+ within-school regression lines for female students in single-sex
+ schools. School 48 has been eliminated and the schools have been
+ ordered by increasing estimated intercept.}
+ \label{fig:Examplot6b}
+\end{figure}
+shown in Figure~\ref{fig:Examplot6b} indicate differences in the
+intercepts and possibly differences in the slopes with respect to the
+pretest scores. However, there is not a strong indication of
+variation by school in the effect of sex.
+
+
+\subsection{Multilevel models for the exam data}
+\label{sec:ExamModels}
+
+We begin with a model that has a random effects for the intercept by
+school plus additive fixed effects for the pretest score, the
+student's sex and the school type.
+<<Em3>>=
+(Em3 <- lmer(normexam ~ standLRT + sex + type + (1|school), Exam))
+@
+Our data exploration indicated that the slope with respect to the
+pretest score may vary by school. We can fit a model with random
+effects by school for both the slope and the intercept as
+<<Em4>>=
+(Em4 <- lmer(normexam ~ standLRT + sex + type + (standLRT|school), Exam))
+@
+and compare this fit to the previous fit with
+<<EmAnova>>=
+anova(Em3, Em4)
+@
+There is a strong evidence of a significant random effect for the
+slope by school, whether judged by AIC, BIC or the p-value for the
+likelihood ratio test.
+
+The p-value for the likelihood ratio test is based on a $\chi^2$ distribution
+with degrees of freedom calculated as the difference in the number of
+parameters in the two models. Because one of the parameters
+eliminated from the full model in the submodel is at its boundary the
+usual asymptotics for the likelihood ratio test do not apply.
+However, it can be shown that the p-value quoted for the test is
+conservative in the sense that it is an upper bound on the
+p-value that would be calculated say from a parametric bootstrap.
+
+Having an upper bound of $1.9\times 10^{-10}$ on the p-value can be
+regarded as ``highly significant'' evidence of the utility of the
+random effect for the slope by school.
+
+We could also add a random effect for the student's sex by school
+<<Em5>>=
+(Em5 <- lmer(normexam ~ standLRT + sex + type + (standLRT + sex|school), Exam))
+@
+Notice that the estimate of the variance of the \code{sexM} term is
+essentially zero so there is no need to test the significance of this
+variance component. We proceed with the analysis of \code{Em4}.
+
+\section{Growth curve model for repeated measures data}
+\label{sec:GrowthCurve}
+
+<<Oxboys>>=
+str(Oxboys)
+system.time(mX1 <- lmer(height ~ age + I(age^2) + I(age^3) + I(age^4) + (age + I(age^2)|Subject),
+ Oxboys))
+summary(mX1)
+system.time(mX2 <- lmer(height ~ poly(age,4) + (age + I(age^2)|Subject), Oxboys))
+summary(mX2)
+@
+
+\section{Cross-classification model}
+\label{sec:CrossClassified}
+
+<<ScotsSec>>=
+str(ScotsSec)
+system.time(mS1 <- lmer(attain ~ sex + (1|primary) + (1|second), ScotsSec))
+summary(mS1)
+@
+
+\section{Session Info}
+
+<<sessionInfo, results=tex>>=
+toLatex(sessionInfo())
+@
+
+%\bibliography{MlmSoftRev}
+\end{document}
diff --git a/vignettes/MlmSoftRev.bib b/vignettes/MlmSoftRev.bib
new file mode 100644
index 0000000..42beffe
--- /dev/null
+++ b/vignettes/MlmSoftRev.bib
@@ -0,0 +1,8 @@
+ at Book{box73:_bayes_infer_statis_analy,
+ author = {Box, G.E.P. and Tiao, G.C.},
+ title = {Bayesian Inference in Statistical Analysis},
+ publisher = {Addison-Wesley},
+ year = 1973,
+ address = {Reading, MA}
+}
+
diff --git a/vignettes/StarData.Rnw b/vignettes/StarData.Rnw
new file mode 100644
index 0000000..24525f6
--- /dev/null
+++ b/vignettes/StarData.Rnw
@@ -0,0 +1,306 @@
+\documentclass[letterpaper]{article}
+\usepackage{fullpage}% save trees
+\usepackage{url}
+\title{Creating an R data set from STAR}
+\author{Douglas Bates\\Department of Statistics\\University of Wisconsin -- Madison}
+\date{April 5, 2005}
+\newcommand{\RR}{\textsf{R}}
+\newcommand{\code}[1]{\texttt{\small #1}}
+%%\VignetteIndexEntry{Creating an R data set from STAR}
+%%\VignetteDepends{lme4}
+\begin{document}
+\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3}
+\SweaveOpts{strip.white=true,keep.source=TRUE}
+\SweaveOpts{prefix=TRUE,prefix.string=STAR,include=FALSE}
+\maketitle
+\begin{abstract}
+ A substantial portion of the data from Tennessee's Student Teacher
+ Achievement Ratio (STAR) project, a large-scale, four-year study of
+ reduced class size, has been made available to the public at
+ \url{http://www.heros-inc.org/data.htm}. We describe the creation
+ of an \RR (\url{http:www.r-project.org}) data set from these data.
+\end{abstract}
+<<preliminaries,echo=FALSE,results=hide>>=
+options(show.signif.stars = FALSE,width=68)
+library(mlmRev)
+library(lme4)
+@
+
+\section{Introduction}
+\label{sec:intro}
+
+The data from the STAR project are available in several different
+forms from the web site \url{http://www.heros-inc.org/data.htm}. The
+most convenient form for creation of an \RR data set is the
+tab-delimited text file. Download the archive file
+\url{http://www.heros-inc.org/text-star.zip} containing two files:
+\texttt{readme.txt}, a description of the data, and
+\texttt{webstar.txt}, the data themselves.
+
+
+\section{Reading the data}
+\label{sec:reading}
+
+From the data description file we can see that there are 53 columns in
+the data set and most of these columns are coded values. Such columns
+should be represented as factors in \RR but many of these columns will
+need to be combined before we can work with them. We will convert the
+first 5 columns to factors and leave the remaining 48 columns as integers.
+<<datainput>>=
+str(orig <- read.delim(unz(system.file("original/text-star.zip",
+ package = "mlmRev"),
+ "webstar.txt"),
+ colCl = rep(c("factor", "integer"), c(5, 48))))
+@
+In the call to \code{read.delim} we read directly from the zip archive
+and avoided expanding the much larger text file. The call to
+\code{system.file} determines the name of a file that is part of a
+package. In practice it is often more convenient to use the
+\code{file.choose} function which brings up a file chooser panel.
+
+We also check the form of the original data by calling \code{str} on
+the object
+<<strorig>>=
+str(orig)
+@
+
+\subsection{Missing value codes}
+\label{sec:missval}
+
+All the columns except the first column have missing values present.
+Typically the missing value code is \code{"9"} but \code{"99"},
+\code{"999"} and \code{"9999"} are also used.
+We convert these to \RR's missing value code \code{NA} column by column.
+<<missval>>=
+mv <- rep("9", 53)
+mv[c(4,17,26,34,45)] <- "99"
+mv[c(19,20,27,28,35,36,39,40,46:53)] <- "999"
+mv[5] <- "9999"
+mv[1] <- "999999"
+for (i in seq(a = orig)) orig[[i]][orig[[i]] == mv[i]] <- NA
+summary(orig[1:5])
+@
+
+Notice that level \code{"9"} is still present for the \code{SSEX}
+variable even after all the observations at that level have been
+replaced by the missing value code. To remove these unused levels
+from this and all the other columns, we loop over the columns
+selecting all the values but using the optional argument \code{drop = TRUE}.
+<<dropunused>>=
+for (i in seq(a = orig)) orig[[i]] <- orig[[i]][drop = TRUE]
+summary(orig[1:5])
+@
+
+For convenience we convert the names of the columns to lower case.
+<<lcase>>=
+names(orig) <- tolower(names(orig))
+@
+
+\section{Setting factor levels}
+\label{sec:factorLevels}
+
+In \RR the levels of a factor can be given meaningful labels instead of
+numeric codes and in most cases this eliminates the need for a
+separate codebook. For example storing the labels of \code{sex} as
+\code{"M"} and \code{"F"} makes the coding self-explanatory. When
+used in a model a factor is automatically converted to a set of
+``contrasts'' (there is a technical definition of the term
+``contrast'' in linear models that is not always fulfilled by these
+derived variables) and the corresponding coefficients are given
+meaningful names.
+
+When there is a natural ordering of the levels of a factor it can be
+created as an ordered factor that will preserve this ordering.
+
+The labels can be set after the factor is created or as part of the
+creation of the factor. Below we will create a ``long form'' of the
+data where each row corresponds to a combination of student and
+grade. In doing this we will need to concatenate related columns of
+the original data frame. For example, the columns \code{cltypek},
+\code{cltype1}, \code{cltype2} and \code{cltype3} will be concatenated
+to form a single column \code{cltype}. If the coding is consistent
+across the grades then it is easiest to concatenate the integer codes
+and set the labels on the ``long'' version of the variable.
+
+However there are two groups of variables, \code{hdeg} and
+\code{clad}, that are not coded consistently. In each case the codes
+used for kindergarten teachers are different from those used for
+teachers of grades 1 to 3 classes. The codes for kindergarten
+teachers are a superset of those for the other teachers but the
+numbering is not consistent; a bachelor's degree is coded as 2 for
+kindergarten but 1 for the others. Thus we cannot combine the numeric
+values - we must create the labels for each column and then
+concatenate the labels and convert to a factor.
+
+<<factorlevels>>=
+orig$hdegk <- ordered(orig$hdegk, levels = 1:6,
+ labels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D"))
+orig$hdeg1 <- ordered(orig$hdeg1, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg2 <- ordered(orig$hdeg2, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$hdeg3 <- ordered(orig$hdeg3, levels = 1:4,
+ labels = c("BS/BA","MS/MA/MEd","Ed.S","Ed.D/Ph.D"))
+orig$cladk <- factor(orig$cladk, levels = c(1:3,5:8),
+ labels = c("1","2","3","APPR","PROB","NOT","PEND"))
+orig$clad1 <- factor(orig$clad1, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad2 <- factor(orig$clad2, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+orig$clad3 <- factor(orig$clad3, levels = 1:6,
+ labels = c("NOT","APPR","PROB","1","2","3"))
+@
+\section{Creating separate data frames}
+\label{sec:separate}
+
+These data are represented in a ``wide'' format where each row
+corresponds to a student. Some of the columns, such as \code{ssex},
+are indeed a property of the student; some, such as \code{hdegk} are
+properties of teachers; some, such as \code{schtypek} are properties
+of schools or classes in schools; and some are unique to a
+student/grade combination. We will create separate frames for each of
+these types.
+
+The first 5 columns are student-level data
+<<student>>=
+student <- orig[1:5]
+names(student) <- c("id", "sx", "eth", "birthq", "birthy")
+levels(student$sx) <- c("M", "F")
+levels(student$eth) <- c("W", "B", "A", "H", "I", "O")
+student$birthy <- ordered(student$birthy)
+student$birthq <- ordered(paste(student$birthy,student$birthq,sep=":"))
+summary(student)
+@
+
+The other columns refer to a combination of the student and grade. We
+first create an expanded or ``long'' version of the table with a row
+for each student/grade combination.
+
+To create the long version of the table we repeat the student ids four
+times and add a column for the grade level. Related groups of
+columns, such as \code{cltypek}, \code{cltype1}, \code{cltype2} and
+\code{cltype3}, are concatenated then converted to a factor. However,
+there are two groups, \code{hdeg} and \code{clad}, for which this
+approach will not work because these groups are not encoded
+consistently.
+<<long>>=
+long <- data.frame(id = rep(orig$newid, 4),
+ gr = ordered(rep(c("K", 1:3), each = nrow(orig)),
+ levels = c("K", 1:3)),
+ star = factor(unlist(orig[6:9])),
+ cltype = factor(unlist(orig[10:13])),
+ schtype = factor(unlist(orig[c(14,22,30,38)])),
+ hdeg = ordered(unlist(lapply(orig[c(15,24,32,43)],as.character)),
+ levels = c("ASSOC","BS/BA","MS/MA/MEd","MA+","Ed.S","Ed.D/Ph.D")),
+ clad = factor(unlist(lapply(orig[c(16,25,33,44)],as.character)),
+ levels = c("NOT","APPR","PROB","PEND","1","2","3")),
+ exp = unlist(orig[c(17,26,34,45)]),
+ trace = factor(unlist(orig[c(18,23,31,42)]), levels=1:6,
+ labels=c("W", "B", "A", "H", "I", "O")),
+ read = unlist(orig[c(19,27,35,39)]),
+ math = unlist(orig[c(20,28,36,40)]),
+ ses = factor(unlist(orig[c(21,29,37,41)]),labels=c("F","N")),
+ sch = factor(unlist(orig[50:53])))
+@
+
+We can now eliminate the combinations that are completely missing. Checking
+<<summarylong>>=
+summary(long)
+@
+indicates that fewest missing values are in the \code{sch},
+\code{cltype}, and \code{schtype} columns. They are also consistent
+<<isnacheck>>=
+with(long, all.equal(is.na(schtype), is.na(sch)))
+with(long, all.equal(is.na(cltype), is.na(sch)))
+@
+hence we use these to subset the data frame
+<<longsubset>>=
+long <- long[!is.na(long$sch),]
+@
+
+It turns out that we could have used the \code{star} column as this
+simply indicates if the student was in the study that year.
+<<summlong>>=
+summary(long[1:5])
+@
+Because it now contains no information we will drop it.
+<<dropstar>>=
+long$star <- NULL
+@
+
+For convenience we set the row names of this data frame to be a combination of the student id and the grade.
+<<rownames>>=
+rownames(long) <- paste(long$id, long$gr, sep = '/')
+@
+
+We can extract the school-level data from this table.
+<<schooltable>>=
+school <- unique(long[, c("sch", "schtype")])
+length(levels(school$sch)) == nrow(school)
+row.names(school) <- school$sch
+school <- school[order(as.integer(as.character(school$sch))),]
+long$schtype <- NULL
+levels(school$schtype) <- c("inner","suburb","rural","urban")
+levels(long$cltype) <- c("small", "reg", "reg+A")
+@
+
+We can create a merged data set with
+<<merging>>=
+star <- merge(merge(long, school, by = "sch"), student, by = "id")
+star$time <- as.integer(star$gr) - 1
+@
+
+\section{Assigning teacher ids}
+\label{sec:teacher}
+
+There are no teacher id numbers available but we can obtain a
+reasonably accurate surrogate by determining the unique combinations
+of all the variables associated with the teacher.
+<<teacher>>=
+teacher <- unique(star[, c("cltype", "trace", "exp", "clad", "gr",
+ "hdeg", "sch")])
+teacher <- teacher[with(teacher, order(sch, gr, cltype, exp, hdeg,
+ clad, trace)), ]
+@
+
+To generate the correspondence between the observations and the
+teacher we create labels that incorporate the levels of each of the
+variables that defined the unique combinations.
+<<teacherlabels>>=
+row.names(teacher) <- tch <- teacher$tch <- seq(nrow(teacher))
+names(tch) <- do.call("paste", c(teacher[,1:7], list(sep=":")))
+star$tch <- tch[do.call("paste", c(star[c("cltype", "trace", "exp",
+ "clad", "gr", "hdeg", "sch")],
+ list(sep = ":")))]
+@
+
+We can check if this is successful by generating tables of class sizes.
+<<classsizes>>=
+table(table(star$tch))
+table(table(subset(star, cltype == "small")$tch))
+table(table(subset(star, cltype == "reg")$tch))
+table(table(subset(star, cltype == "reg+A")$tch))
+@
+
+We see that there are three classes with sizes greater than 30 and
+that one of these is labelled as a ``small'' class. It is likely that
+each of these represents two or more classes but we do not have enough
+information to distinguish them.
+
+\section{Initial model fits}
+\label{sec:initial}
+
+Some initial model fits are
+<<initialModel>>=
+library(lme4)
+(mm1 <- lmer(math ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+(rm1 <- lmer(read ~ gr + sx + eth + cltype + (1|id) + (1|sch), star))
+@
+
+\section{Session Info}
+
+<<sessionInfo, results=tex>>=
+toLatex(sessionInfo())
+@
+
+\end{document}
diff --git a/vignettes/myVignette.sty b/vignettes/myVignette.sty
new file mode 100644
index 0000000..09c834e
--- /dev/null
+++ b/vignettes/myVignette.sty
@@ -0,0 +1,101 @@
+\RequirePackage{hyperref}
+\RequirePackage{url}
+\RequirePackage{amsmath}
+\RequirePackage{bm}
+\newcommand{\Slang}{\textsf{S} language}
+\newcommand{\LXX}{\ensuremath{\bL_\mathit{XX}}}
+\newcommand{\LXZ}{\ensuremath{\bL_\mathit{XZ}}}
+\newcommand{\LZZ}{\ensuremath{\bL_\mathit{ZZ}}}
+\newcommand{\RR}{\textsf{R}}
+\newcommand{\RXX}{\ensuremath{\bR_\mathit{XX}}}
+\newcommand{\RZX}{\ensuremath{\bR_\mathit{ZX}}}
+\newcommand{\RZZ}{\ensuremath{\bR_\mathit{ZZ}}}
+\newcommand{\TZZ}{\ensuremath{\bT_\mathit{ZZ}}}
+\newcommand{\bA}{\ensuremath{\bm{A}}}
+\newcommand{\bB}{\ensuremath{\bm{B}}}
+\newcommand{\bDelta}{\ensuremath{\bm{\Delta}}}
+\newcommand{\bD}{\ensuremath{\bm{D}}}
+\newcommand{\bI}{\ensuremath{\bm{I}}}
+\newcommand{\bJ}{\ensuremath{\bm{J}}}
+\newcommand{\bL}{\ensuremath{\bm{L}}}
+\newcommand{\bM}{\ensuremath{\bm{M}}}
+\newcommand{\bOmega}{\ensuremath{\bm{\Omega}}}
+\newcommand{\bPhi}{\ensuremath{\bm{\Phi}}}
+\newcommand{\bR}{\ensuremath{\bm{R}}}
+\newcommand{\bT}{\ensuremath{\bm{T}}}
+\newcommand{\bX}{\ensuremath{\bm{X}}}
+\newcommand{\bY}{\ensuremath{\bm{Y}}}
+\newcommand{\bZ}{\ensuremath{\bm{Z}}}
+\newcommand{\balpha}{\ensuremath{\bm{\alpha}}}
+\newcommand{\ba}{\ensuremath{\bm{a}}}
+\newcommand{\bbeta}{\ensuremath{\bm{\beta}}}
+\newcommand{\bb}{\ensuremath{\bm{b}}}
+\newcommand{\bc}{\ensuremath{\bm{c}}}
+\newcommand{\bd}{\ensuremath{\bm{d}}}
+\newcommand{\bdelta}{\ensuremath{\bm{\delta}}}
+\newcommand{\bbe}{\ensuremath{\bm{e}}}
+\newcommand{\beps}{\ensuremath{\bm{\epsilon}}}
+\newcommand{\be}{\ensuremath{\bm{e}}}
+\newcommand{\bff}{\ensuremath{\bm{f}}}
+\newcommand{\bg}{\ensuremath{\bm{g}}}
+\newcommand{\blambda}{\ensuremath{\bm{\lambda}}}
+\newcommand{\bmu}{\ensuremath{\bm{\mu}}}
+\newcommand{\boeta}{\ensuremath{\bm{\eta}}}
+\newcommand{\bomega}{\ensuremath{\bm{\omega}}}
+\newcommand{\bphi}{\ensuremath{\bm{\phi}}}
+\newcommand{\br}{\ensuremath{\bm{r}}}
+\newcommand{\btheta}{\ensuremath{\bm{\theta}}}
+\newcommand{\bt}{\ensuremath{\bm{t}}}
+\newcommand{\bu}{\ensuremath{\bm{u}}}
+\newcommand{\bv}{\ensuremath{\bm{v}}}
+\newcommand{\bw}{\ensuremath{\bm{w}}}
+\newcommand{\bx}{\ensuremath{\bm{x}}}
+\newcommand{\by}{\ensuremath{\bm{y}}}
+\newcommand{\bzer}{\ensuremath{\bm{0}}}
+\newcommand{\dX}{\ensuremath{\bm{d}_{\mathit{X}}}}
+\newcommand{\dZ}{\ensuremath{\bm{d}_{\mathit{Z}}}}
+\newcommand{\deri}{{\der_i}}
+\newcommand{\derj}{{\der_j}}
+\newcommand{\der}{\operatorname{\mathsf{D}}}
+\newcommand{\dy}{\ensuremath{d_{\mathit{y}}}}
+\newcommand{\email}[1]{\href{mailto:#1}{\normalfont\texttt{#1}}}
+\newcommand{\invtrans}{\ensuremath{^{-\mathsf{T}}}}
+\newcommand{\inv}{\ensuremath{^{-1}}}
+\newcommand{\ovec}{\operatorname{\mathrm{vec}}}
+\newcommand{\pperp}{{P^{\perp}}}
+\newcommand{\pplus}{\ensuremath{\bPhi^{+}}}
+\newcommand{\ptpi}{\ensuremath{\left(\bPhi\trans\bPhi\right)^{-1}}}
+\newcommand{\rXy}{\ensuremath{\br_\mathit{Xy}}}
+\newcommand{\rZy}{\ensuremath{\br_\mathit{Zy}}}
+\newcommand{\ryy}{\ensuremath{r_\mathit{yy}}}
+\newcommand{\tL}{\ensuremath{\tilde{\bL}}}
+\newcommand{\tRXX}{\ensuremath{\tilde{\bR}_\mathit{XX}}}
+\newcommand{\tRZX}{\ensuremath{\tilde{\bR}_\mathit{ZX}}}
+\newcommand{\tildy}{\ensuremath{\tilde{\by}}}
+\newcommand{\trans}{\ensuremath{^\mathsf{T}}}
+\newcommand{\tr}{\operatorname{tr}}
+\newcommand{\vb}{\ensuremath{\bm{V}_{\bm{b}}}}
+\newcommand\code{\bgroup\@codex}
+\def\@codex#1{{\normalfont\ttfamily\hyphenchar\font=-1 #1}\egroup}
+\newcommand{\kbd}[1]{{\normalfont\texttt{#1}}}
+\newcommand{\key}[1]{{\normalfont\texttt{\uppercase{#1}}}}
+\newcommand\samp{`\bgroup\@noligs\@sampx}
+\def\@sampx#1{{\normalfont\texttt{#1}}\egroup'}
+\newcommand{\var}[1]{{\normalfont\textsl{#1}}}
+\let\env=\code
+\newcommand{\file}[1]{{`\normalfont\textsf{#1}'}}
+\let\command=\code
+\let\option=\samp
+\newcommand{\dfn}[1]{{\normalfont\textsl{#1}}}
+\newcommand{\acronym}[1]{{\normalfont\textsc{\lowercase{#1}}}}
+\newcommand{\strong}[1]{{\normalfont\fontseries{b}\selectfont #1}}
+\let\pkg=\strong
+\RequirePackage{alltt}
+\newenvironment{example}{\begin{alltt}}{\end{alltt}}
+\newenvironment{smallexample}{\begin{alltt}\small}{\end{alltt}}
+\newenvironment{display}{\list{}{}\item\relax}{\endlist}
+\newenvironment{smallverbatim}{\small\verbatim}{\endverbatim}
+\RequirePackage{fancyvrb}
+\DefineVerbatimEnvironment{Sinput}{Verbatim}{fontsize=\small,fontshape=sl}
+\DefineVerbatimEnvironment{Soutput}{Verbatim}{fontsize=\small}
+\DefineVerbatimEnvironment{Scode}{Verbatim}{fontsize=\small,fontshape=sl}
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/r-cran-mlmrev.git
More information about the debian-science-commits
mailing list