IISE Data Analytics and Information Systems (DAIS) Division Student Data Analytics Competition

The Data Analytics Competition is an annual student competition organized by the Data Analytics and Information Systems (DAIS) Division of the Institute of Industrial and Systems Engineers (IISE). This year's competition is the Nursing Home Time-Series Data Prediction Triathlon. The main objective of the competition is to give students the opportunity to learn, showcase, and enhance their data analysis and visualization skills by working on real-world problems with realistically sized data sets.

Problem Description - Nursing Home Staffing Hour Prediction Triathlon

Nursing homes in the United States provide the most comprehensive set of professional care a person can receive outside a hospital. The care includes a range of coordinated medical, personal, and social services to meet the needs of residents who are chronically ill or disabled. Adequate, cost-effective staffing is vital to nursing home management in the United States. Since staffing cannot be adjusted in real time, it is critical for each nursing home facility to make accurate predictions, half a month to one month ahead, of the staffing level from its existing staff, which can help it make informed hiring and HR decisions.

The Centers for Medicare & Medicaid Services (CMS) provides public data sets, known as Payroll Based Journal Public Use Files (PBJ PUFs), containing daily nursing home staffing levels (often measured in staffing hours) and resident census data. Nursing homes must submit accurate staffing information, including agency and contract staff, through the PBJ system, based on verifiable payroll data, in a format specified by CMS. Facilities report the number of hours each staff member is paid to work each day. The quarterly PBJ data files are available beginning with data from the first calendar quarter of 2017, and new data files are uploaded to data.cms.gov on a quarterly basis. The public use files report staffing hours for each day in the quarter. The staffing data in the PBJ PUFs is aggregated to the facility-day level, meaning that every included facility has one record (or row of data) for each day in a quarterly file. More information is available at data.cms.gov.
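
As a rough illustration of the facility-day structure described above, the following Python sketch checks that each facility has exactly one record per day of a quarter. The field names facility_id and work_date and the toy records are assumptions made for this example only; they are not the actual PBJ PUF column names, and the sketch uses only the Python standard library.

```python
from collections import Counter
from datetime import date, timedelta


def quarter_days(year: int, quarter: int) -> list[date]:
    """Return every calendar day in the given calendar quarter."""
    start = date(year, 3 * (quarter - 1) + 1, 1)
    # First day of the following quarter (January of the next year after Q4).
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    return [start + timedelta(days=i) for i in range((end - start).days)]


def facility_day_gaps(rows, year, quarter):
    """Map each facility to the days that are missing or duplicated in its records."""
    expected = quarter_days(year, quarter)
    counts = Counter((r["facility_id"], r["work_date"]) for r in rows)
    problems = {}
    for fac in sorted({r["facility_id"] for r in rows}):
        bad = [d for d in expected if counts[(fac, d)] != 1]
        if bad:
            problems[fac] = bad
    return problems


# Toy data (hypothetical): facility "A" is complete, facility "B" lacks one day.
days = quarter_days(2024, 2)
rows = [{"facility_id": "A", "work_date": d} for d in days]
rows += [{"facility_id": "B", "work_date": d} for d in days if d != date(2024, 5, 15)]

gaps = facility_day_gaps(rows, 2024, 2)
print(len(days))  # 91 days in Q2 2024
print(gaps)       # {'B': [datetime.date(2024, 5, 15)]}
```

A check like this may be worth running before fitting any time-series model, since a missing or duplicated day would shift every later observation for that facility.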

Competition Rules

  • Training data set. Nursing staff (e.g., CNA, LPN, RN) data files for 100 nursing homes in Indiana and Florida, US (the states where the institutions of the two competition steering committee chairs are located) will be released to the participating teams by Jan. 24, 2025. These files will contain data collected in the second quarter of 2024 (i.e., Q2 2024), the latest period with data available at the time of this announcement. The names of the selected nursing homes will not be released.

  • Prediction events. Three staffing hour quantities that can be derived from the data files will be announced as the three prediction targets (i.e., the “individual races in a triathlon”), together with several prediction performance metrics for each target. This is meant to help teams decide whether they want to participate in the competition, and perhaps to give them a head start as well.

  • Two-Phase Competition. The competition will have two phases of judging based on the prediction results submitted. In Phase 1, we will select four finalists. In Phase 2, we will determine from among the four finalists a first-prize winner, a second-prize winner, and two honorable mentions.

Evaluation Process

Phase 1

  • Scoring: All teams will be ranked based on the performance metrics. The team with the worst prediction will be given 1 point. Each team ranked one spot higher will be given 1 more point than the team ranked directly below it, except for the top 3 teams, each of which will be given 2 more points than the team ranked one spot lower.
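
The Phase 1 point rule can be sketched in Python as follows. This is one reading of the rule, not an official scoring script: it assumes the teams have already been ordered from worst to best for a given metric, and that there are no ties. The function name phase1_points and the team labels are hypothetical.

```python
def phase1_points(ranked_worst_to_best, top_n=3, top_step=2):
    """Assign points to teams listed from worst to best prediction performance."""
    n = len(ranked_worst_to_best)
    points = {}
    current = 0
    for position, team in enumerate(ranked_worst_to_best):
        if position == 0:
            current = 1  # the worst team always receives 1 point
        elif position >= n - top_n:
            current += top_step  # top teams gain 2 points over the team below
        else:
            current += 1  # every other team gains 1 point over the team below
        points[team] = current
    return points


# Six hypothetical teams, listed from worst ("F") to best ("A").
scores = phase1_points(["F", "E", "D", "C", "B", "A"])
print(scores)  # {'F': 1, 'E': 2, 'D': 3, 'C': 5, 'B': 7, 'A': 9}
```

The extra point per rank widens the gap among the three best teams, so small differences near the top of the ranking matter more than differences near the bottom.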

Phase 2

  • Judging: We assume a new data set will be released after the four finalists make their final submission together with the nursing home names. The two steering committee chairs will schedule a meeting with each finalist team at the conference to run the team's model/algorithm, trained on the same 100 nursing homes, on the newly available data. Phase 2 evaluates the temporal transferability of the prediction tool.

  • Scoring: The four finalist teams will be ranked based on the performance metrics. In each category of performance metrics, the four teams will be given 10, 7, 4, and 1 points according to their ranking. The total score of each team will be the sum of its scores over all categories. The ranking based on the total score will carry 60% of the weight. The remaining weight will go to the final report (20%) and the final presentation (20%).
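
A minimal Python sketch of the Phase 2 category points follows. The category names, team labels, and rankings are invented for illustration, and ties are assumed not to occur. The announcement does not say how the resulting ranking is combined numerically with the report and presentation scores, so the 60/20/20 weighting is left out of the sketch.

```python
RANK_POINTS = (10, 7, 4, 1)  # points for 1st through 4th place in each category


def phase2_totals(category_rankings):
    """Sum per-category rank points; each ranking lists teams from best to worst."""
    totals = {}
    for ranking in category_rankings.values():
        for team, pts in zip(ranking, RANK_POINTS):
            totals[team] = totals.get(team, 0) + pts
    return totals


# Hypothetical rankings of four finalist teams in three metric categories.
rankings = {
    "target_1_metric": ["T1", "T2", "T3", "T4"],
    "target_2_metric": ["T2", "T1", "T4", "T3"],
    "target_3_metric": ["T1", "T3", "T2", "T4"],
}

totals = phase2_totals(rankings)
order = sorted(totals, key=totals.get, reverse=True)
print(totals)  # {'T1': 27, 'T2': 21, 'T3': 12, 'T4': 6}
print(order)   # ['T1', 'T2', 'T3', 'T4']
```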

Eligibility

  • Individuals or teams of up to four members (postdocs and faculty are not allowed).
  • Student members must be either undergraduate or graduate students at higher education institutions in the field of Industrial & Systems Engineering or related fields.
  • Student members should be enrolled at the time of the submission of the proposal.
  • At least one of the team members must be an active member of the Data Analytics and Information Systems (DAIS) Division of IISE.
  • A team must submit a notice of intent to be eligible to participate in the competition.

Competition Process

Notice of Intent

A team must submit a notice of intent to participate in the competition via email to the chairs of the competition steering committee by Friday, January 17, 2025. The notice of intent needs to include:

  1. The list of names of team members, their affiliations, and contact information (email and phone).

  2. The name of the team member designated as the main contact.

The competition steering committee chairs will share the initial data sets of 100 nursing homes through email to the main contact provided by each team on Friday, January 24, 2025.

Submission and Judging of the Results

Phase 1

For Phase 1, participating teams or individuals are required to submit their source code and scripts via email to the chairs of the competition steering committee by Friday, Feb. 28, 2025. The competition steering committee will evaluate the submissions, and up to four top teams will be selected as finalists. More details about the submission guidelines and review criteria will be released along with the initial data set. The committee chairs will judge each team in the first week of March. The finalist teams will be notified by Friday, March 7, 2025.

Phase 2

For Phase 2, the finalist teams will be given the full list of the 100 nursing homes on Monday, March 10, 2025. This will give them more flexibility to adapt and refine their time-series prediction tool for demand forecasting. The finalist teams or individuals are required to submit 1) source code and scripts and 2) a final report via email to the committee chairs by Friday, May 2, 2025 (this date may change once we have an update on the release date of the data by CMS).

Final Presentation

The selected finalist teams and individuals will present the model motivation, model approach, and results at the 2025 IISE Annual Conference & Expo, May 31 – June 3, 2025, Atlanta, USA. More information on the final report and presentation contents will be provided to the finalist teams.

Approval of the Notice of Intent

The competition steering committee will review and approve or reject the submitted notices of intent to participate based on the eligibility criteria. Approval emails will be sent together with the dataset by the chair of the committee by Friday, January 24, 2025.

Selection of Top Finalist Teams

The competition steering committee will select a maximum of four finalist teams. Finalist teams will be notified by email by the chair of the committee by Friday, March 7, 2025.

Selection of Winners

The selected finalist teams will run their code, produce their results, and present their work at the 2025 IISE Annual Conference & Expo, May 31 – June 3, 2025, in Atlanta, USA. A blind vote cast by invited judges will decide the winner after the presentations.

Important Dates (Deadlines)

  • Notice of Intent: January 17, 2025
  • Competition challenge datasets are made available: January 24, 2025
  • Deadline for submission for Phase 1 judging: February 28, 2025
  • Notification to the finalist teams: March 7, 2025
  • Competition challenge dataset for Phase 2 made available to the finalist teams: March 10, 2025
  • Deadline for submission for Phase 2 judging: May 2, 2025
  • IISE Annual Conference & Expo: May 31 – June 3, 2025

Recognition

  • Recognition at the DAIS Town Hall Meeting
  • Certificate provided by IISE (either mailed or given at the town hall meeting) for the 1st prize winners.
  • 2nd and 3rd prize winners will receive a digital certificate.
  • Recognition in ISE magazine
  • Recognition on DAIS webpage and in the newsletter

Competition Chairs

Nan Kong, Purdue University
Mingyang Li, University of South Florida

Conflict of Interest

  • Societies and divisions must follow standard conflict of interest guidelines. Those guidelines include, but are not limited to:
  • Officers and Board members of the society or division (S/D) should be ineligible for awards during the period of their service, without approval by the Senior VP for Technical Operations (SVP). Exceptions may only be made by the SVP when awards are time-sensitive (e.g., a Student Best Paper Award for a Student Board Member who is graduating), and the impacted board member(s) or officer(s) must recuse themselves from the award process. For awards that are not time-sensitive, the nominee should wait until their Officer or Board service is complete to be nominated.
  • The awards committee (or judging committee) should not include members who have either a personal or professional relationship with the nominees. For example, a faculty member should not be judging a paper competition where a student from the same university is a nominee for best student paper award.
  • The awards committee (or judging committee) should actively change its membership on a rotating basis from year to year to ensure fairness, equity, and diversity. That is, some members of the judging committee should roll off the committee and new members should roll on.

2024 Winner

1st Place
Hairong Wang, Lingchao Mao, Zihan Zhang
Georgia Institute of Technology

2023 Winner

1st Place
Chengyu Tao, Xuanming Cao, Peng Ye, Juan Du (advisor)
The Hong Kong University of Science and Technology

For questions, contact Amy Straub at IISE.
