Extraneous Entity Row Cleanup
Slate Instance | TUP |
Requestor/Reporter | Helen Williams |
Date | 6/24/2024 |
Status | pending |
Bug Description | Dataset imports are creating multiple extra entity rows that should be deleted. Create retention policy/overnight process to delete empty or extraneous rows. |
Issue Description
In some cases, particularly where data was imported from a CAS, multiple inquiry details rows were created for the same person. When entity data is added via a source format, Slate can only match an existing row on a unique identifier. If one is not used, a new row is created each time data is run through the source format, even when the field values are identical across all entity fields.
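The behavior described above can be illustrated with a minimal sketch. This is not Slate code: `import_row`, the field names, and the sample program ID are hypothetical and only model the "match on a unique identifier, otherwise create a new row" logic.

```python
from typing import Optional


def import_row(rows: list[dict], incoming: dict, match_key: Optional[str] = None) -> None:
    """Update the row whose match_key value equals the incoming one, else append."""
    if match_key is not None:
        for row in rows:
            if row.get(match_key) == incoming.get(match_key):
                row.update(incoming)  # matched: update the existing row in place
                return
    rows.append(dict(incoming))  # no key mapped, or no match: a new row is created


# Hypothetical inquiry details values, identical on every run.
record = {"program_interest_id": "12345", "planned_entry_term": "Fall 2025"}

no_key_rows: list[dict] = []
for _ in range(3):  # same data run through the source format three times
    import_row(no_key_rows, record)

keyed_rows: list[dict] = []
for _ in range(3):
    import_row(keyed_rows, record, match_key="program_interest_id")

print(len(no_key_rows), len(keyed_rows))  # 3 1
```

Without a match key, three identical runs leave three rows; with one, they collapse to a single updated row.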
Troubleshooting/Research
Query created to identify number of inquiry details rows per source per person record: https://tusmgp.admissions.tufts.edu/manage/query/build?id=d8a223ae-e92a-4367-bf94-e6be8b6f90f8
We believe inquiry details rows were first used around Sept. 2021.
Source Formats w/Inquiry Row mappings:
Name | Source Value Used | Notes |
---|---|---|
CASPA Application (CAS) | CAS Application Import | |
CASPA In Progress (CAS) | CAS Application Import | |
PTCAS Application (CAS) | CAS Application Import | |
PTCAS In Progress (CAS) | CAS Application Import | |
SOPHAS Application (CAS) | CAS Application Import | |
SOPHAS In Progress (CAS) | CAS Application Import | |
SOPHAS In Progress 2022 | Data Import | |
SOPHAS Submitted 2022 | Data Import | |
PTCAS In Progress 2023 | NO SOURCE | Was matching on Program Interest ID |
PTCAS In Progress 2022 | NO SOURCE | |
Inquiry Details Row matching:
Concentration Interest
Planned Entry Term
Program Interest
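As a rough sketch of how extraneous rows might be flagged for an overnight cleanup, the snippet below treats two rows as duplicates when they belong to the same person and agree on all three fields listed above. That definition of "duplicate", the rule of keeping the first row seen, and every name here (`find_extraneous`, `row_id`, `person_id`, the sample values) are assumptions for illustration, not an existing Slate process.

```python
# Hypothetical field names standing in for the three inquiry details fields.
MATCH_FIELDS = ("concentration_interest", "planned_entry_term", "program_interest")


def find_extraneous(rows: list[dict]) -> list[int]:
    """Return row_ids of rows that repeat an earlier row for the same person."""
    seen: set[tuple] = set()
    extraneous: list[int] = []
    for row in rows:
        key = (row["person_id"],) + tuple(row.get(f) for f in MATCH_FIELDS)
        if key in seen:
            extraneous.append(row["row_id"])  # same person, same values: candidate for deletion
        else:
            seen.add(key)  # first occurrence is kept (an assumption)
    return extraneous


base = {
    "concentration_interest": "Epidemiology",
    "planned_entry_term": "Fall 2024",
    "program_interest": "MPH",
}
sample_rows = [
    {"row_id": 1, "person_id": "A", **base},
    {"row_id": 2, "person_id": "A", **base},  # exact repeat of row 1
    {"row_id": 3, "person_id": "A", **base, "planned_entry_term": "Spring 2025"},
    {"row_id": 4, "person_id": "B", **base},  # different person, not a duplicate
]

print(find_extraneous(sample_rows))  # [2]
```

Any real cleanup would need to confirm which copy to keep (for example, the oldest or the most recently updated) before deleting anything.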
Resolution Steps
For currently running CAS API source formats (as of 7/1/2024):
Source formats have been updated so that the inquiry details row matches on the Program ID provided by the CAS; this is unique to the designation, so new rows are only created if the applicant selects a different location (DPT) or a different start term/concentration/format (MPH). Otherwise, it will match and only update the row.
CASPA In Progress/Application, SOPHAS In Progress/Application:
progMate.progSele0.programID = Program Interest ID
PTCAS In Progress/Application (already used all 3 mappings available for Program ID so had to do a field fusion):
progMate.progSele0.programID + '2025' = Program Interest ID
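The two mappings above can be sketched as a small key-building function. The function name and the sample program IDs are hypothetical; the only details taken from the mappings are that the CASPA/SOPHAS key is the CAS program ID as-is and that the PTCAS key is the program ID fused with the literal `2025`.

```python
def program_interest_id(program_id: str, suffix: str = "") -> str:
    """Build the match key: the CAS program ID, optionally fused with a literal suffix."""
    return f"{program_id}{suffix}"


# CASPA / SOPHAS: the program ID is used directly as the match key.
caspa_key = program_interest_id("11111")

# PTCAS: field fusion of the program ID and the literal '2025'.
ptcas_key = program_interest_id("22222", "2025")

# The same designation always yields the same key, so the existing row is updated;
# a different designation yields a different key, so a new row is created.
same_designation = program_interest_id("22222", "2025") == ptcas_key
other_designation = program_interest_id("33333", "2025") == ptcas_key

print(caspa_key, ptcas_key, same_designation, other_designation)
# 11111 222222025 True False
```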