Enrollment Analytics
This document is under construction and is actively being updated.
We recommend that the university identify and standardize a core set of enrollment data that accommodates the varied nature of an institution as complex as Tufts University while also providing real-time, actionable data that clarifies fiscal realities and supports institutional goals.
We support two repositories for the data: the Admissions system of record (Slate, AMP, etc.) for current, operational data and immediate on-site reporting, and the Data Warehouse (DW) for long-term storage. We strongly recommend that operational systems have robust, externally enforced data retention standards, and that all comparison reports and year-over-year analyses extending beyond a three-year window be created from DW source data.
https://tufts.box.com/s/yllatvzi0jyqcioddgh3wt8f7q1lgdmj
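The three-year rule above can be expressed as a small routing check. This is only a sketch under stated assumptions: the window is assumed to be counted in cycle years, inclusive of both endpoints, and the names `Source` and `choose_source` are hypothetical and do not correspond to anything in Slate or the DW.

```python
from enum import Enum


class Source(Enum):
    OPERATIONAL = "admissions system of record"
    DATA_WAREHOUSE = "Data Warehouse"


# Assumption: the "three-year window" counts cycle years inclusively,
# so 2022 through 2024 is a three-year window.
MAX_OPERATIONAL_WINDOW_YEARS = 3


def choose_source(first_cycle_year: int, last_cycle_year: int) -> Source:
    """Pick the repository a comparison report should be built from."""
    if last_cycle_year < first_cycle_year:
        raise ValueError("last_cycle_year must not precede first_cycle_year")
    window = last_cycle_year - first_cycle_year + 1
    if window > MAX_OPERATIONAL_WINDOW_YEARS:
        # Beyond three years: the recommendation requires DW source data.
        return Source.DATA_WAREHOUSE
    # Within three years: operational data may be used.
    return Source.OPERATIONAL
```

For example, `choose_source(2022, 2024)` stays on the operational system, while `choose_source(2020, 2024)` is routed to the DW.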
The model for providing ongoing enrollment analytics data/insights should be:
flexible: can integrate new requirements rapidly without requiring wholesale changes
robust: can withstand uncertainty, constant change, varied dataset sizes, and missing data without compromising overall integrity or validity
comprehensive: all schools, programs, centers and units with programs or courses that ultimately generate a transcript are included
inclusive: core data reflects the multiple, disparate identities that every prospect, applicant, and student brings with them
Challenges
Accommodating all kinds of schools and programs, especially “non-traditional cycle” graduate & professional programs
multiple annual entry terms
spring or summer start dates
“unaligned” start dates (e.g. late July starts in Medical and Dental)
alternate application stages, including interviewing and traffic rules
small application populations
programs with external admissions vendors
Accommodating a variety of dataset sizes; not all programs will have enough applicants to produce statistically significant or otherwise meaningful numbers
Managing external admissions vendor (e.g. CAS/Liaison) data to align with our standards
our policies vs. external reporting organization requirements/policies (e.g., what APTA considers complete vs. what TUSMGP PT considers complete; race/ethnicity categories used on CAS applications versus what is used in Slate-hosted applications and what is recorded in SIS)
Different prompts in different systems, need for consistent classifications
race
gender (e.g. is X an option?)
gender identity
ethnicity (East Asian vs. West Asian, Hispanic country of origin, etc.)
veteran status
Variations in usage of basic decision codes within and across instances
Although the export codes are parallel, which makes it appear as though the data is aligned, how and why each decision code is assigned is not consistent
Withdrawal from the admissions process versus Cancellation of Intent to Enroll
Final decision on Incomplete Applications
Deny vs. Do Not Interview decisions
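One way to approach the "different prompts in different systems" challenge above is a crosswalk that maps each system's raw value to a single standard label and reports anything it cannot map, rather than guessing. The sketch below uses veteran status as the example. Every raw value, system key, and standard label in it is a hypothetical placeholder, not an actual Slate, CAS, or SIS code.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Crosswalk = Dict[Tuple[str, str], str]

# Hypothetical entries: (source system, raw value) -> standard label.
VETERAN_STATUS_CROSSWALK: Crosswalk = {
    ("slate", "y"): "veteran",
    ("slate", "n"): "not a veteran",
    ("cas", "veteran"): "veteran",
    ("cas", "not a veteran"): "not a veteran",
    ("sis", "vet"): "veteran",
}


def standardize(system: str, raw: Optional[str], crosswalk: Crosswalk) -> Optional[str]:
    """Return the standard label, or None when the value is missing or unmapped."""
    if raw is None or not raw.strip():
        return None
    key = (system.strip().lower(), raw.strip().lower())
    return crosswalk.get(key)


def unmapped_values(
    records: Iterable[Tuple[str, Optional[str]]], crosswalk: Crosswalk
) -> List[Tuple[str, str]]:
    """List the (system, raw value) pairs that have a value but no crosswalk entry."""
    missing = set()
    for system, raw in records:
        if raw is None or not raw.strip():
            continue  # missing data is not a crosswalk gap
        if standardize(system, raw, crosswalk) is None:
            missing.add((system.strip().lower(), raw.strip().lower()))
    return sorted(missing)
```

Keeping unmapped values visible gives each program a concrete list to resolve when a vendor or instance introduces a new response option. A crosswalk of this kind only aligns labels; it does not address the separate problem that the same decision code may be assigned for different reasons in different instances.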
Points of ambiguity
Which programs are included?
anything that ‘admits’?
anything that generates a transcript?
anything that generates an enrollment record?
corner cases include programs where students enroll in their coursework through University College, but are admitted into a program (GSBS: PREP)
What counts as a “cycle”?
Application changes within a cycle (e.g., an MPH applicant changes concentration or moves their start term from spring to fall)
When do Admissions systems cease to be the “system of record” (e.g., if a student withdraws during add/drop, does admissions update their decision stack?)
Under what circumstances should the manual SIS quick admit process be used, if ever?
SIS history of admit decline? Program Action codes?
What counts as a complete application?
Process for deferrals
At what point do we do a data freeze? What sort of auditing reports/processes would we need for each program to comply? What does a clean data set look like, anyway?
everyone has a decision (bonus points if it is the correct one)
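The "everyone has a decision" criterion lends itself to a simple pre-freeze audit. The following is a minimal sketch, assuming each application is available as a mapping with the hypothetical keys `application_id`, `program`, and `decision`; these are illustrative names, not actual export fields.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def applications_missing_decision(
    applications: Iterable[Mapping[str, Any]]
) -> Dict[str, List[str]]:
    """Group the IDs of applications with no recorded decision by program."""
    missing: Dict[str, List[str]] = defaultdict(list)
    for app in applications:
        decision = app.get("decision")
        if decision is None or not str(decision).strip():
            missing[app["program"]].append(app["application_id"])
    return dict(missing)


def is_ready_to_freeze(applications: Iterable[Mapping[str, Any]]) -> bool:
    """True only when every application carries some decision."""
    return not applications_missing_decision(applications)
```

A check like this can confirm that a decision exists, but not that it is the correct one; that part would still need review by each program.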
Initial Data points [for MS and PhD programs]
Biodemo
gender identity
legal sex
race/ethnicity
veteran status
state of residence
US citizenship status
other country or countries of citizenship
year of birth
Programmatic
school
program
concentration
degree
start term/year
application deadline(s)
target enrollment number
Application cycle dates for each person
application started
application submitted
application
invited to interview
interview
decision
committed
cancelled
deferred
enrolled
Sources
Paid marketing
[needs to accommodate different ways of attributing sources?]
Recruitment Activities
Campus Visit: students physically come to us
External Event: we physically go to them (tabling at a conference, school visit or fair, etc.)
Online event (online recruitment fair, information session, or interview)
Mailing campaign
Scholarship(s)
awarded
accepted
declined
Decline Data?!
Which school are they attending instead?
Why?
Yield Activities: which of these are our responsibility?
[dataset of programs; then dataset scoped entity for cycle open date. CONSIDER USING EXPORT DATA ON DATASET FOR FEED THROUGH KAFKA]
Application open and close
Program start/open semester/term
Program history (e.g., DPT started Jan 2021)
Open House/key recruitment dates (interview timing?)
Key dates for cycle based on program type (e.g., MD has plan to enroll, commit to enroll “traffic rules”)
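To make the initial data points more concrete, the biodemo, programmatic, and cycle-date items listed in this section could be sketched as a single per-application record. This is an illustration only: the class name, field names, and types are assumptions rather than Slate or SIS field definitions, and the bare "application" milestone is left out because its meaning is not specified above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

# Cycle milestones in the order listed above (hypothetical identifiers).
MILESTONES = [
    "application_started", "application_submitted", "invited_to_interview",
    "interview", "decision", "committed", "cancelled", "deferred", "enrolled",
]


@dataclass
class ApplicationRecord:
    # Programmatic
    school: str
    program: str
    degree: str
    start_term: str
    concentration: Optional[str] = None
    # Biodemo: every item is optional so missing data stays representable.
    gender_identity: Optional[str] = None
    legal_sex: Optional[str] = None
    race_ethnicity: List[str] = field(default_factory=list)  # multi-valued
    veteran_status: Optional[str] = None
    state_of_residence: Optional[str] = None
    us_citizenship_status: Optional[str] = None
    other_citizenships: List[str] = field(default_factory=list)
    birth_year: Optional[int] = None
    # Milestone name -> date reached; absent keys mean "not reached".
    milestone_dates: Dict[str, date] = field(default_factory=dict)

    def record_milestone(self, name: str, when: date) -> None:
        if name not in MILESTONES:
            raise ValueError(f"unknown milestone: {name}")
        self.milestone_dates[name] = when

    def days_between(self, earlier: str, later: str) -> Optional[int]:
        """Days from one milestone to another, or None if either is missing."""
        start = self.milestone_dates.get(earlier)
        end = self.milestone_dates.get(later)
        if start is None or end is None:
            return None
        return (end - start).days
```

Storing milestones as optional dates rather than a single status lets programs with alternate stages (for example, no interview step) share the same structure without placeholder values.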
Talk with Tristan: we are in the process of identifying core fields that we want to align across all instances, and we want to make sure that these fields are the ones being used across
Phase I
Generate list of fields; review data needs with stakeholders
OIR--Christina Butler
Marketing
Digital Archives
EADs
Admissions directors
Consultants?
explore Denodo and Tableau
Identify sources of data (Slate, Hubspot, etc.)
Milestone: report of key pain points/problems to be solved with this project? Roadblocks?
Phase II
Identify core data needed for reporting:
Develop data dictionary/common definitions/crosswalk
Identify which data need to be restructured(?) and which cannot be; accommodate these differences across instances/schools/programs
Plan for how to implement any updates/new fields in each Slate instance
timed to application cycles?
auditing and correcting old data?
Develop a model for operational querying and reporting out of local Slate instances, and differentiate it from analytics and dashboards based in Denodo/Tableau. Provide appropriate access, and support this distinction with robust retention policies in individual Slate instances.
current versus year-over-year
single school versus multiple
admission operations versus marketing and communications
Develop initial model/templates for enrollment analytics reports/dashboards
identify which data is required for each and how it needs to be structured
Who gets access to which dashboards?
Build out connections between systems (Denodo, Slate, DW, etc.)
Phase III
Launch initial dashboards
Process for data auditing and maintenance
Process for requesting/accommodating new data (in Slate or for Tableau reporting)
Advanced/custom reporting and dashboards (by school/program?)