- Psychometric Methodologies and Testing
- Advanced Statistical Modeling Techniques
- Disability Education and Employment
- Student Assessment and Feedback
- Educational Assessment and Improvement
- Educational and Psychological Assessments
- Online Learning and Analytics
- School Choice and Performance
- Higher Education Learning Practices
- Collaborative Teaching and Inclusion
- Intelligent Tutoring Systems and Adaptive Learning
- Inclusion and Disability in Education and Sport
- Educational Technology and Assessment
- Motivation and Self-Concept in Sports
- Multi-Criteria Decision Making
- Innovative Teaching and Learning Methods
- Education, Achievement, and Giftedness
- Evaluation and Performance Assessment
- Engineering Education and Curriculum Development
- Psychological Testing and Assessment
- Teaching and Learning Programming
- Mathematics Education and Teaching Techniques
- Grit, Self-Efficacy, and Motivation
- Educational Assessment and Pedagogy
- Imbalanced Data Classification Techniques
University of Kansas
2014-2024
Institute for Student Achievement
2015-2021
Access to Wholistic and Productive Living Institute
2019
University of Bath
2008
Educational Testing Service
1984-1991
An effect size of about .70 (or .40–.70) is often claimed for the efficacy of formative assessment, but is not supported by the existing research base. More than 300 studies that appeared to address the efficacy of formative assessment in grades K-12 were reviewed. Many of the studies had severely flawed research designs yielding uninterpretable results. Only 13 of the studies provided sufficient information to calculate relevant effect sizes. A total of 42 independent effect sizes were available. The median observed effect size was .25. Using a random effects model, a weighted mean effect size of .20 was calculated....
There have been many studies of the comparability of computer-administered and paper-administered tests. Not surprisingly (given the variety of measurement and statistical sampling issues that can affect any one study), the results of such studies have not always been consistent. Moreover, the quality of computer-based test administration systems has changed considerably over recent years, as has the computer experience of students. This study synthesizes the results of 81 studies performed between 1997 and 2007. The estimated effect size across all studies was very small (–.01...
A context effect occurs when examinees' item responding behavior is affected by the location of an item within a test. Recent advances in testing practice, most notably adaptive testing and certain innovative equating schemes, require items to be more invariant across intended usages than earlier methods. In this paper, context effects are identified as a form of multidimensionality, and examples of situations where context effects are important are described. Then, the susceptibility of 10 item types from the Graduate Record Examination General Test...
Research suggests that self-determination skills are positively correlated with factors that have been shown to improve academic achievement, but the direct relationship among self-determination, self-concept, and academic achievement is not fully understood. This study offers an empirical explanation of how self-determination and self-concept affect academic achievement for adolescents with learning disabilities after taking into consideration the covariates of gender, income, and urbanicity. In a nationally representative sample (N = 560), the proposed model...
One of the major assumptions of item response theory (IRT) models is that performance on a set of items is unidimensional, that is, the probability of successful performance by examinees on a set of items can be modeled by a mathematical model that has only one ability parameter. In practice, this strong assumption is likely to be violated. An important pragmatic question to consider is: What are the consequences of these violations? In this research, evidence is provided of violations of unidimensionality on the verbal scale of the GRE Aptitude Test, and the impact of these violations on IRT equating is examined. Previous...
The use of item-ability regressions (the comparison of the regression of the observed proportion of people answering an item correctly on estimated θ with the item response function) to investigate the psychometric properties of particular item types in a given population was explored using data from four administrations of 10 item types (a total of 806 items) from the Graduate Record Examinations General Test. Although the method does not allow the absolute determination of fit for a latent trait model (in this case, the three-parameter logistic model), it does show...
The No Child Left Behind Act (2001) and the Individuals with Disabilities Education Improvement Act (2004) emphasize accountability to improve student academic achievement. Promoting self-determination has been proposed as a means of achieving this outcome. Elementary teachers in 30 states were surveyed to measure (a) their perceived importance of self-determination, (b) to what extent they teach it, and (c) the barriers that inhibit them from teaching it. Both general and special educators assigned considerable...
Coefficient alpha (α) has been described as a lower bound for test reliability. However, previous research indicates that when certain assumptions are violated, α can either overestimate or underestimate reliability. Raykov (1997a) has shown how structural equation modeling (SEM) can be used to estimate reliability. This study introduced method factors into the SEM model in order to avoid a potential limitation of the SEM approach. Monte Carlo simulation shows that both (α and SEM) show substantial bias, though most extreme circumstances bias estimates...
Background: Despite polling data that suggests teachers are well respected by the general public, criticism of teacher preparation by various organizations and interest groups is common, often highlighting a perceived need for increasing their rigor and performance. A number of studies and reports have critiqued teacher preparation, and high-profile leaders like Secretary of Education Arne Duncan have called for substantive changes. At the same time, the field has been embracing change with the idea of accountability based on student performance. Indeed,...
Promoting the self-determination of students with disabilities as a means to access the general curriculum has been the subject of research in recent years, as has the importance of efforts to promote self-determination during the elementary years. To examine the status of such efforts in the field, 203 special educators were surveyed in 23 states to determine how (a) classroom instructional practices or strategies, (b) ecological setting variables, and (c) self-reported barriers to promoting self-determination affected their perceptions of the importance of teaching self-determination and the frequency with which they did so. Results...
The research described in this paper deals solely with the effect of the position of an item within a test on examinees' responding behavior at the item level. For simplicity's sake, this effect will be referred to as a practice effect when the result is improved examinee performance and as a fatigue effect when the result is poorer examinee performance. Item response theory item statistics were used to assess position effects because, unlike traditional item statistics, they are sample invariant. In addition, the use of item response theory statistics allows one to make a reasonable adjustment for speededness, which is important...
This study examined the validity of test accommodation in third- through eighth-graders using differential item functioning (DIF) and mixture IRT models. Two data sets were used for these analyses. With the first data set (N = 51,591), we examined whether item type (i.e., story, explanation, straightforward) or item features were associated with item difficulty, discrimination, and DIF. The second data set (N = 3,452) was used to investigate whether observed DIF was related to students' accommodation status, gender, race, disability, and latent academic ability. Item types significantly...
This study examined the learner characteristics and performance scores of students in the 2009 alternate assessment-modified achievement standard for one Midwestern state. Comparing differences by disability category in each content area from students' 2008 test type assignments facilitated examining the appropriateness of test type assignment. The results raise concerns because some students with disabilities seemed to have been inappropriately assigned to the test type. Students with intellectual disabilities had the lowest performances across grade level...
The continual supply of new items is crucial to maintaining quality for many tests. Automatic item generation (AIG) has the potential to rapidly increase the number of items that are available. However, the efficiency of AIG will be mitigated if the generated items must be submitted to traditional, time‐consuming review processes. In two studies, generated mathematics achievement items were subjected to multiple stages of qualitative review for measuring the intended skills, followed by empirical tryout in operational testing. High rates of success were found....
The feasibility of using item response theory as a psychometric model for the GRE Aptitude Test was addressed by assessing the reasonableness of the assumptions of item response theory for GRE item types and examinee populations. Items from four forms and four administrations of the GRE Aptitude Test were calibrated using the three‐parameter logistic item response model (one form was given at two administrations and one administration used two forms; the exact relationships between forms and administrations are given in the Forms and Populations section of this report). The unidimensionality assumption was addressed in a variety of ways. Previous factor analytic research on the GRE Aptitude Test was reviewed to...
The Dynamic Learning Maps™ Alternate Assessment is based on a different set of guiding principles than other assessments. In this article we describe its characteristics and look at the history of alternate assessment and the problems in implementing useful assessment programs for students with significant cognitive disabilities.
A multivariate longitudinal DCM is developed that is the composite of two components: the log-linear cognitive diagnostic model (LCDM) as the measurement model component, which evaluates the mastery status of attributes at each measurement occasion, and a generalized multivariate growth curve model that describes the growth of each attribute over time. The proposed model represents an improvement in the current longitudinal DCMs given its ability to incorporate both balanced and unbalanced data and to measure the growth of a single attribute directly without assuming that attributes grow in the same pattern. One simulation study was conducted to evaluate the proposed model in terms...
A necessary prerequisite to the operational use of item response theory (IRT) in any testing program is the investigation of the feasibility of such an approach. This report presents the results of such research for the Graduate Management Admission Test (GMAT). Despite the fact that GMAT data appear to violate a basic assumption of the three‐parameter logistic model, local independence, the model was able to replicate accurately the observed item responses. IRT‐based equating was consistent across two randomly selected samples and four...
The purpose of this case study was to determine teachers’ rationales for assigning students with mild disabilities to the alternate assessment based on alternate achievement standards (AA-AAS). In interviews, special educators stated that their primary considerations in making the assignments were low academic performance, student use of extended standard modifications, and the inflexible 1% cap. None of the teachers provided grade-level content or appropriate modifications. Some students were competent in reading, but were assigned 2010...
Despite much theoretical support, meta-analysis of the efficacy of formative assessment has not provided empirical evidence commensurate with expectations. This study suggests that teachers need a better organizing structure to allow the formative assessment process to live up to its promise. We propose that the use of learning map systems can provide that structure, and we describe aspects of using learning map systems to support mathematics instruction in two projects: the Dynamic Learning Maps® alternate assessment (DLM) and the Use of Learning Maps as an Organizing Structure for Formative...
Background: The importance of reading motivation has led to the development of a large number of self‐report measures; however, there is still a need for a usable measure of adolescent reading motivation that captures theoretically and empirically distinct constructs. Methods: The current paper details the development and validation of a computer adapted measure of reading motivation, the Adaptive Reading Motivation Measure (ARMM), which assesses the constructs of curiosity, involvement, interest, value, challenge, grades, recognition, competition, avoidance, self‐efficacy,...
Assessment practices for measuring adverse life events (ALEs) are often characterized by considerable variability, which is associated with inconsistency and reproducibility issues when conducting research on children with ALE exposure. One aspect of assessment variability for caregiver report of children’s ALE history that has received minimal attention is assessment format. To address this issue, the current study evaluated the concordance between two main assessment formats: interviews and questionnaires. This involved examining overall...