Case Proposal 008: PhD Candidate (Opioid-Related Risk Research)

Date: 14/11/2025

A PhD candidate working on opioid-related risk approached me for assistance in preparing her dataset for advanced analysis.

She collects data based on medical visits, where each subject may have multiple visit records. After attending a statistics course, she decided that logistic regression would be suitable for her research. However, she encountered a structural problem: her dataset was arranged by visit rather than by subject, and she needed a clean "one subject per row" file before modelling.

Following an initial review, I proposed a structured consultation process to map out the variables she required and to understand how she was currently recording her data.

I recommended maintaining two data sheets, mirroring the CDISC concepts of ADSL and BDS:
1. a subject-level sheet (one row per subject), and
2. a visit-level sheet (multiple records per subject).

Using programming, we processed more than 10,000 visit-level records. A derived flag was created to indicate whether each subject ever experienced the symptom of interest. We then derived an analysis flag to identify the first occurrence of that symptom. The flagged information was merged back into the subject-level dataset to support her logistic regression and downstream analysis.

Outcome:
Her data cleaning timeline was reduced from four months to one month, allowing her to focus her efforts on analysis and interpretation rather than manual restructuring.
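
The flag derivation and merge described in this case can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code used in the consultation: the field names (subject_id, visit_date, symptom, age), the sample records, and both helper functions are hypothetical, chosen only to show the visit-level to subject-level logic.

```python
from datetime import date

# Hypothetical visit-level records (BDS-like): several rows per subject.
visits = [
    {"subject_id": "S01", "visit_date": date(2024, 1, 10), "symptom": False},
    {"subject_id": "S01", "visit_date": date(2024, 3, 5), "symptom": True},
    {"subject_id": "S01", "visit_date": date(2024, 6, 20), "symptom": True},
    {"subject_id": "S02", "visit_date": date(2024, 2, 1), "symptom": False},
    {"subject_id": "S02", "visit_date": date(2024, 4, 15), "symptom": False},
]

# Hypothetical subject-level records (ADSL-like): one row per subject.
subjects = [
    {"subject_id": "S01", "age": 34},
    {"subject_id": "S02", "age": 51},
]


def flag_first_occurrence(visit_rows):
    """Return visit rows, sorted by subject and date, with a first_occurrence flag.

    The flag is True only on the earliest visit at which the subject
    had the symptom; every other row gets False.
    """
    already_flagged = set()
    flagged_rows = []
    ordered = sorted(visit_rows, key=lambda r: (r["subject_id"], r["visit_date"]))
    for row in ordered:
        is_first = bool(row["symptom"]) and row["subject_id"] not in already_flagged
        if is_first:
            already_flagged.add(row["subject_id"])
        flagged_rows.append({**row, "first_occurrence": is_first})
    return flagged_rows


def merge_to_subject_level(subject_rows, flagged_rows):
    """Attach ever_symptom and first_symptom_date to each subject row."""
    first_dates = {
        row["subject_id"]: row["visit_date"]
        for row in flagged_rows
        if row["first_occurrence"]
    }
    return [
        {
            **subject,
            "ever_symptom": subject["subject_id"] in first_dates,
            "first_symptom_date": first_dates.get(subject["subject_id"]),
        }
        for subject in subject_rows
    ]


flagged = flag_first_occurrence(visits)
analysis = merge_to_subject_level(subjects, flagged)
for row in analysis:
    print(row)
```

The result keeps one row per subject, so ever_symptom can serve directly as a binary outcome in a logistic regression, while the visit-level rows remain available for any time-based analysis.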